1
Grazioli G, Tao A, Bhatia I, Regan P. Genetic Algorithm for Automated Parameterization of Network Hamiltonian Models of Amyloid Fibril Formation. J Phys Chem B 2024; 128:1854-1865. [PMID: 38359362 PMCID: PMC10910512 DOI: 10.1021/acs.jpcb.3c07322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 01/07/2024] [Accepted: 02/05/2024] [Indexed: 02/17/2024]
Abstract
The time scales of long-time atomistic molecular dynamics simulations are typically reported in microseconds, while the time scales for experiments studying the kinetics of amyloid fibril formation are typically reported in minutes or hours. This time scale deficit of roughly 9 orders of magnitude presents a major challenge in the design of computer simulation methods for studying protein aggregation events. Coarse-grained molecular simulations offer a computationally tractable path forward for exploring the molecular mechanism driving the formation of these structures, which are implicated in diseases such as Alzheimer's, Parkinson's, and type-II diabetes. Network Hamiltonian models of aggregation are centered around a Hamiltonian function that returns the total energy of a system of aggregating proteins, given the graph structure of the system as an input. In the graph, or network, representation of the system, each protein molecule is represented as a node, and noncovalent bonds between proteins are represented as edges. The parameters, i.e., a set of coefficients that determine the degree to which each topological degree of freedom is favored or disfavored, must be determined for each network Hamiltonian model, and doing so is a well-known technical challenge. Here, a genetic algorithm is introduced for automating this parametrization. The methodology is first demonstrated by beginning with an initial set of randomly parametrized models of low fibril fraction (<5% fibrillar), and evolving to subsequent generations of models, ultimately leading to high fibril fraction models (>70% fibrillar). The methodology is also demonstrated by applying it to optimizing previously published network Hamiltonian models for the 5 key amyloid fibril topologies that have been reported in the Protein Data Bank (PDB). The models generated by the genetic algorithm produced fibril fractions that surpass previously published fibril fractions in 3 of 5 cases, including the most naturally abundant amyloid fibril topology, the 1,2 2-ribbon, which features a steric zipper.
The authors also aim to encourage more widespread use of the network Hamiltonian methodology for fitting a wide variety of self-assembling systems by releasing a free open-source implementation of the genetic algorithm introduced here.
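As a rough illustration of the idea in the abstract above, the following sketch treats the Hamiltonian as a weighted sum of graph statistics and applies one select-and-mutate generation of a genetic algorithm to the weights. The choice of statistics (edge count and number of degree-2 nodes), the fitness function passed in by the caller, and all function names are assumptions made for illustration only; they are not taken from the paper or from its released implementation.

```python
import random

def graph_stats(n_nodes, edges):
    """Return (edge count, number of nodes with exactly two neighbours)."""
    degree = [0] * n_nodes
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return (len(edges), sum(1 for d in degree if d == 2))

def hamiltonian(theta, n_nodes, edges):
    """Energy as a weighted sum of graph statistics (hypothetical form)."""
    return sum(t * s for t, s in zip(theta, graph_stats(n_nodes, edges)))

def evolve(population, fitness, rng, sigma=0.1):
    """One generation: keep the fitter half, refill with mutated copies.

    Assumes an even population size so the total stays constant.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: len(ranked) // 2]
    children = [tuple(t + rng.gauss(0.0, sigma) for t in p) for p in parents]
    return parents + children

# A four-protein linear chain: 3 edges, 2 nodes of degree two.
chain = [(0, 1), (1, 2), (2, 3)]
print(hamiltonian((-1.0, -0.5), 4, chain))  # -1.0*3 + -0.5*2 = -4.0
```

In a real workflow the fitness of a parameter set would presumably come from simulating the model and measuring something like the fibril fraction; here it is left as a caller-supplied function.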
Affiliation(s)
- Gianmarc Grazioli
- Department of Chemistry, San José State University, San Jose, California 95192, United States
- Andy Tao
- Department of Chemistry, San José State University, San Jose, California 95192, United States
- Inika Bhatia
- Department of Chemistry, San José State University, San Jose, California 95192, United States
- Patrick Regan
- Department of Chemistry, San José State University, San Jose, California 95192, United States
2
Appadurai R, Koneru JK, Bonomi M, Robustelli P, Srivastava A. Clustering Heterogeneous Conformational Ensembles of Intrinsically Disordered Proteins with t-Distributed Stochastic Neighbor Embedding. J Chem Theory Comput 2023; 19:4711-4727. [PMID: 37338049 PMCID: PMC11108026 DOI: 10.1021/acs.jctc.3c00224] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/21/2023]
Abstract
Intrinsically disordered proteins (IDPs) populate a range of conformations that are best described by a heterogeneous ensemble. Grouping an IDP ensemble into "structurally similar" clusters for visualization, interpretation, and analysis purposes is a much-desired but formidable task, as the conformational space of IDPs is inherently high-dimensional and reduction techniques often result in ambiguous classifications. Here, we employ the t-distributed stochastic neighbor embedding (t-SNE) technique to generate homogeneous clusters of IDP conformations from the full heterogeneous ensemble. We illustrate the utility of t-SNE by clustering conformations of two disordered proteins, Aβ42, and α-synuclein, in their APO states and when bound to small molecule ligands. Our results shed light on ordered substates within disordered ensembles and provide structural and mechanistic insights into binding modes that confer specificity and affinity in IDP ligand binding. t-SNE projections preserve the local neighborhood information, provide interpretable visualizations of the conformational heterogeneity within each ensemble, and enable the quantification of cluster populations and their relative shifts upon ligand binding. Our approach provides a new framework for detailed investigations of the thermodynamics and kinetics of IDP ligand binding and will aid rational drug design for IDPs.
Affiliation(s)
- Rajeswari Appadurai
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka 560012, India
- Massimiliano Bonomi
- Structural Bioinformatics Unit, Department of Structural Biology and Chemistry. CNRS UMR 3528, C3BI, CNRS USR 3756, Institut Pasteur, Paris, France
- Paul Robustelli
- Dartmouth College, Department of Chemistry, Hanover, NH, 03755, USA
- Anand Srivastava
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka 560012, India
3
Diessner EM, Takahashi GR, Martin RW, Butts CT. Comparative Modeling and Analysis of Extremophilic D-Ala-D-Ala Carboxypeptidases. Biomolecules 2023; 13:biom13020328. [PMID: 36830697 PMCID: PMC9953012 DOI: 10.3390/biom13020328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2022] [Revised: 01/21/2023] [Accepted: 02/02/2023] [Indexed: 02/11/2023]
Abstract
Understanding the molecular adaptations of organisms to extreme environments requires a comparative analysis of protein structure, function, and dynamics across species found in different environmental conditions. Computational studies can be particularly useful in this pursuit, allowing exploratory studies of large numbers of proteins under different thermal and chemical conditions that would be infeasible to carry out experimentally. Here, we perform such a study of the MEROPS family S11, S12, and S13 proteases from psychrophilic, mesophilic, and thermophilic bacteria. Using a combination of protein structure prediction, atomistic molecular dynamics, and trajectory analysis, we examine both conserved features and trends across thermal groups. Our findings suggest a number of hypotheses for experimental investigation.
Affiliation(s)
- Gemma R. Takahashi
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA 92697, USA
- Rachel W. Martin
- Department of Chemistry, University of California, Irvine, CA 92697, USA
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA 92697, USA
- Correspondence: (R.W.M.); (C.T.B.)
- Carter T. Butts
- Departments of Sociology, Statistics, Electrical Engineering and Computer Science, University of California, Irvine, CA 92697, USA
- Correspondence: (R.W.M.); (C.T.B.)
4
Diessner EM, Freites JA, Tobias DJ, Butts CT. Network Hamiltonian Models for Unstructured Protein Aggregates, with Application to γD-Crystallin. J Phys Chem B 2023; 127:685-697. [PMID: 36637342 PMCID: PMC10437096 DOI: 10.1021/acs.jpcb.2c07672] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 01/14/2023]
Abstract
Network Hamiltonian models (NHMs) are a framework for topological coarse-graining of protein-protein interactions, in which each node corresponds to a protein, and edges are drawn between nodes representing proteins that are noncovalently bound. Here, this framework is applied to aggregates of γD-crystallin, a structural protein of the eye lens implicated in cataract disease. The NHMs in this study are generated from atomistic simulations of equilibrium distributions of wild-type and the cataract-causing variant W42R in solution, performed by Wong, E. K.; Prytkova, V.; Freites, J. A.; Butts, C. T.; Tobias, D. J. Molecular Mechanism of Aggregation of the Cataract-Related γD-Crystallin W42R Variant from Multiscale Atomistic Simulations. Biochemistry 2019, 58 (35), 3691-3699. Network models are shown to successfully reproduce the aggregate size and structure observed in the atomistic simulation, and provide information about the transient protein-protein interactions therein. The system size is scaled from the original 375 monomers to a system of 10000 monomers, revealing a lowering of the upper tail of the aggregate size distribution of the W42R variant. Extrapolation to higher and lower concentrations is also performed. These results provide an example of the utility of NHMs for coarse-grained simulation of protein systems, as well as their ability to scale to large system sizes and high concentrations, reducing computational costs while retaining topological information about the system.
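To make the graph representation described in the abstract above concrete, the sketch below computes an aggregate size distribution from a protein graph by treating each connected component as one aggregate. Equating aggregates with connected components, and the function names, are assumptions made for illustration rather than details taken from the paper.

```python
from collections import Counter

def aggregate_sizes(n_proteins, bonds):
    """Map each component's root node to the size of that aggregate."""
    parent = list(range(n_proteins))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in bonds:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two aggregates

    return Counter(find(i) for i in range(n_proteins))

def size_distribution(n_proteins, bonds):
    """Map aggregate size -> number of aggregates of that size."""
    return dict(Counter(aggregate_sizes(n_proteins, bonds).values()))

# Six proteins: one trimer, one dimer, one free monomer.
print(size_distribution(6, [(0, 1), (1, 2), (3, 4)]))
```

Because the union-find structure touches each bond once, the same function should remain practical for graphs with many thousands of nodes.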
Affiliation(s)
- Elizabeth M Diessner
- Department of Chemistry, University of California, Irvine, California 92697, United States
- J Alfredo Freites
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Douglas J Tobias
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Carter T Butts
- Departments of Sociology, Statistics, Computer Science, and EECS, University of California, Irvine, California 92697, United States
5
Montepietra D, Cecconi C, Brancolini G. Combining enhanced sampling and deep learning dimensionality reduction for the study of the heat shock protein B8 and its pathological mutant K141E. RSC Adv 2022; 12:31996-32011. [PMID: 36380940 PMCID: PMC9641792 DOI: 10.1039/d2ra04913a] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/06/2022] [Accepted: 10/28/2022] [Indexed: 11/11/2022]
Abstract
The biological functions of proteins closely depend on their conformational dynamics. This aspect is especially relevant for intrinsically disordered proteins (IDPs), for which structural ensembles often offer more useful representations than individual conformations. Here we employ extensive enhanced sampling temperature replica-exchange atomistic simulations (TREMD) and deep learning dimensionality reduction to study the conformational ensembles of the human heat shock protein B8 and its pathological mutant K141E, for which no experimental 3D structures are available. First, we combined homology modelling with TREMD to generate high-dimensional data sets of 3D structures. Then, we employed a recently developed machine learning based post-processing algorithm, EncoderMap, to project the large conformational data sets into meaningful two-dimensional maps that helped us interpret the data and extract the most significant conformations adopted by both proteins during TREMD. These studies provide the first 3D structural characterization of HSPB8 and reveal the effects of the pathogenic K141E mutation on its conformational ensembles. In particular, this missense mutation appears to increase the compactness of the protein and its structural variability, at the same time rearranging the hydrophobic patches exposed on the protein surface. These results offer the possibility of rationalizing the pathogenic effects of the K141E mutation in terms of conformational changes. The study provides the first 3D structural characterization of HSPB8 and its K141E mutant: extensive TREMD are combined with a deep learning algorithm to rationalize the disordered ensemble of structures adopted by each variant.
Affiliation(s)
- Daniele Montepietra
- Department of Physics, Computer Science and Mathematics, University of Modena and Reggio Emilia, Via Campi 213/A, 41100 Modena, Italy
- Istituto Nanoscienze – CNR-NANO, Center S3, Via G. Campi 213/A, 41100 Modena, Italy
- Ciro Cecconi
- Department of Physics, Computer Science and Mathematics, University of Modena and Reggio Emilia, Via Campi 213/A, 41100 Modena, Italy
- Istituto Nanoscienze – CNR-NANO, Center S3, Via G. Campi 213/A, 41100 Modena, Italy
- Giorgia Brancolini
- Istituto Nanoscienze – CNR-NANO, Center S3, Via G. Campi 213/A, 41100 Modena, Italy
6
Duong VT, Diessner EM, Grazioli G, Martin RW, Butts CT. Neural Upscaling from Residue-Level Protein Structure Networks to Atomistic Structures. Biomolecules 2021; 11:biom11121788. [PMID: 34944432 PMCID: PMC8698800 DOI: 10.3390/biom11121788] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/30/2021] [Revised: 11/11/2021] [Accepted: 11/19/2021] [Indexed: 01/01/2023]
Abstract
Coarse-graining is a powerful tool for extending the reach of dynamic models of proteins and other biological macromolecules. Topological coarse-graining, in which biomolecules or sets thereof are represented via graph structures, is a particularly useful way of obtaining highly compressed representations of molecular structures, and simulations operating via such representations can achieve substantial computational savings. A drawback of coarse-graining, however, is the loss of atomistic detail—an effect that is especially acute for topological representations such as protein structure networks (PSNs). Here, we introduce an approach based on a combination of machine learning and physically-guided refinement for inferring atomic coordinates from PSNs. This “neural upscaling” procedure exploits the constraints implied by PSNs on possible configurations, as well as differences in the likelihood of observing different configurations with the same PSN. Using a 1 μs atomistic molecular dynamics trajectory of Aβ1–40, we show that neural upscaling is able to effectively recapitulate detailed structural information for intrinsically disordered proteins, being particularly successful in recovering features such as transient secondary structure. These results suggest that scalable network-based models for protein structure and dynamics may be used in settings where atomistic detail is desired, with upscaling employed to impute atomic coordinates from PSNs.
Affiliation(s)
- Vy T. Duong
- Department of Chemistry, University of California, Irvine, CA 92697, USA; (V.T.D.); (E.M.D.)
- Elizabeth M. Diessner
- Department of Chemistry, University of California, Irvine, CA 92697, USA; (V.T.D.); (E.M.D.)
- Gianmarc Grazioli
- Department of Chemistry, San Jose State University, San Jose, CA 95192, USA
- Rachel W. Martin
- Department of Chemistry, University of California, Irvine, CA 92697, USA; (V.T.D.); (E.M.D.)
- Department of Molecular Biology & Biochemistry, University of California, Irvine, CA 92697, USA
- Correspondence: (R.W.M.); (C.T.B.)
- Carter T. Butts
- Departments of Sociology, Statistics and Electrical Engineering & Computer Science, University of California, Irvine, CA 92697, USA
- Correspondence: (R.W.M.); (C.T.B.)
7
Stivala A, Lomi A. Testing biological network motif significance with exponential random graph models. Applied Network Science 2021; 6:91. [PMID: 34841042 PMCID: PMC8608783 DOI: 10.1007/s41109-021-00434-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/09/2021] [Accepted: 11/08/2021] [Indexed: 06/13/2023]
Abstract
Analysis of the structure of biological networks often uses statistical tests to establish the over-representation of motifs, which are thought to be important building blocks of such networks, related to their biological functions. However, there is disagreement as to the statistical significance of these motifs, and there are potential problems with standard methods for estimating this significance. Exponential random graph models (ERGMs) are a class of statistical model that can overcome some of the shortcomings of commonly used methods for testing the statistical significance of motifs. ERGMs were first introduced into the bioinformatics literature over 10 years ago but have had limited application to biological networks, possibly due to the practical difficulty of estimating model parameters. Advances in estimation algorithms now afford analysis of much larger networks in practical time. We illustrate the application of ERGMs to both an undirected protein-protein interaction (PPI) network and directed gene regulatory networks. ERGMs indicate over-representation of triangles in the PPI network, and confirm results from previous research as to over-representation of transitive triangles (feed-forward loop) in an E. coli and a yeast regulatory network. We also confirm, using ERGMs, previous research showing that under-representation of the cyclic triangle (feedback loop) can be explained as a consequence of other topological features. Supplementary information: The online version contains supplementary material available at 10.1007/s41109-021-00434-y.
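For readers unfamiliar with ERGMs, the sketch below shows the general form such a model takes for an undirected graph: the probability of a graph is proportional to the exponential of a weighted sum of its statistics. Using only edge and triangle counts, and the function names, are simplifying assumptions for illustration; the models in the paper are estimated with dedicated software and may use different terms. The normalizing constant, which makes estimation hard in practice, is deliberately left out.

```python
import math
from itertools import combinations

def count_edges_and_triangles(edges):
    """Count edges and triangles in a simple undirected graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    # Brute-force check of every node triple; fine for small examples only.
    triangles = sum(
        1
        for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    return n_edges, triangles

def ergm_unnormalized_weight(edges, theta_edges, theta_triangles):
    """exp(theta . s(G)); the normalizing constant is omitted."""
    n_edges, triangles = count_edges_and_triangles(edges)
    return math.exp(theta_edges * n_edges + theta_triangles * triangles)

# A triangle (0, 1, 2) with one pendant edge (2, 3).
g = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(count_edges_and_triangles(g))
```

Under this form, a positive triangle coefficient means graphs with more triangles are favored beyond what the edge term alone would explain, which is the sense in which a motif is over-represented.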
Affiliation(s)
- Alex Stivala
- Institute of Computational Science, Università della Svizzera italiana, Via Giuseppe Buffi 13, 6900 Lugano, Switzerland
- Alessandro Lomi
- Institute of Computational Science, Università della Svizzera italiana, Via Giuseppe Buffi 13, 6900 Lugano, Switzerland
- The University of Exeter Business School, Rennes Drive, Exeter, EX4 4PU UK
8
Ma YW, Lin TY, Tsai MY. Fibril Surface-Dependent Amyloid Precursors Revealed by Coarse-Grained Molecular Dynamics Simulation. Front Mol Biosci 2021; 8:719320. [PMID: 34422910 PMCID: PMC8378332 DOI: 10.3389/fmolb.2021.719320] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2021] [Accepted: 07/26/2021] [Indexed: 01/05/2023]
Abstract
Amyloid peptides are known to self-assemble into larger aggregates that are linked to the pathogenesis of many neurodegenerative disorders. In contrast to primary nucleation, recent experimental and theoretical studies have shown that many toxic oligomeric species are generated through secondary processes on a pre-existing fibrillar surface. Nucleation, for example, can also occur along the surface of a pre-existing fibril (secondary nucleation), as opposed to the primary one. However, explicit pathways are still not clear. In this study, we use molecular dynamics simulation to explore the free energy landscape of a free Abeta monomer binding to an existing fibrillar surface. We specifically look into several potential Abeta structural precursors that might precede some secondary events, including elongation and secondary nucleation. We find that the overall process of surface-dependent events can be described by at least the following three stages: (1) free diffusion, (2) downhill guiding, and (3) dock and lock. We also show that the outcome of adding a new monomer onto a pre-existing fibril is pathway-dependent, which leads to different secondary processes. To understand structural details, we have identified several monomeric amyloid precursors over the fibrillar surfaces and characterized their heterogeneity using a probability contact map analysis. Using the frustration analysis (a bioinformatics tool), we show that surface heterogeneity correlates with the energy frustration of specific local residues that form binding sites on the fibrillar structure. We further investigate the helical twisting of protofilaments of different sizes and observe a length dependence of the filament twisting. This work presents a comprehensive survey of the properties of fibril growth using a combination of several openMM-based platforms, including the GPU-enabled openAWSEM package for coarse-grained modeling, MDTraj for trajectory analysis, and pyEMMA for free energy calculation. This combined approach makes long-timescale simulation for aggregation systems as well as all-in-one analysis feasible. We show that this protocol allows us to explore fibril stability, surface binding affinity/heterogeneity, as well as fibrillar twisting. All these properties are important for understanding the molecular mechanism of surface-catalyzed secondary processes of fibril growth.
Affiliation(s)
- Yuan-Wei Ma
- Department of Chemistry, Tamkang University, New Taipei City, Taiwan
- Tong-You Lin
- Department of Chemistry, Tamkang University, New Taipei City, Taiwan
- Min-Yeh Tsai
- Department of Chemistry, Tamkang University, New Taipei City, Taiwan
9
Lindorff-Larsen K, Kragelund BB. On the potential of machine learning to examine the relationship between sequence, structure, dynamics and function of intrinsically disordered proteins. J Mol Biol 2021; 433:167196. [PMID: 34390736 DOI: 10.1016/j.jmb.2021.167196] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 06/02/2021] [Revised: 08/03/2021] [Accepted: 08/04/2021] [Indexed: 11/29/2022]
Abstract
Intrinsically disordered proteins (IDPs) constitute a broad set of proteins with few uniting and many diverging properties. IDPs, and intrinsically disordered regions (IDRs) interspersed between folded domains, are generally characterized as having no persistent tertiary structure; instead they interconvert between a large number of different and often expanded structures. IDPs and IDRs are involved in an enormously wide range of biological functions and reveal novel mechanisms of interactions, and while they defy the common structure-function paradigm of folded proteins, their structural preferences and dynamics are important for their function. We here discuss open questions in the field of IDPs and IDRs, focusing on areas where machine learning and other computational methods play a role. We discuss computational methods aimed to predict transiently formed local and long-range structure, including methods for integrative structural biology. We discuss the many different ways in which IDPs and IDRs can bind to other molecules, both via short linear motifs, as well as in the formation of larger dynamic complexes such as biomolecular condensates. We discuss how experiments are providing insight into such complexes and may enable more accurate predictions. Finally, we discuss the role of IDPs in disease and how new methods are needed to interpret the mechanistic effects of genomic variants in IDPs.
Affiliation(s)
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen. Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark.
- Birthe B Kragelund
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen. Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark.
10
Foutch D, Pham B, Shen T. Protein conformational switch discerned via network centrality properties. Comput Struct Biotechnol J 2021; 19:3599-3608. [PMID: 34257839 PMCID: PMC8246261 DOI: 10.1016/j.csbj.2021.06.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/02/2021] [Revised: 06/01/2021] [Accepted: 06/02/2021] [Indexed: 11/17/2022]
Abstract
Network analysis has emerged as a powerful tool for examining structural biology systems. The spatial organization of the components of a biomolecular structure has been rendered as a graph representation and analyses have been performed to deduce the biophysical and mechanistic properties of these components. For proteins, the analysis of protein structure networks (PSNs), especially via network centrality measurements and cluster coefficients, has led to identifying amino acid residues that play key functional roles and classifying amino acid residues in general. Whether these network properties examined in various studies are sensitive to subtle (yet biologically significant) conformational changes remained to be addressed. Here, we focused on four types of network centrality properties (betweenness, closeness, degree, and eigenvector centralities) for conformational changes upon ligand binding of a sensor protein (constitutive androstane receptor) and an allosteric enzyme (ribonucleotide reductase). We found that eigenvector centrality is sensitive and can distinguish salient structural features between protein conformational states while other centrality measures, especially closeness centrality, are less sensitive and rather generic with respect to the structural specificity. We also demonstrated that an ensemble-informed, modified PSN with static edges removed (which we term PSN*) has enhanced sensitivity at discerning structural changes.
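Two of the four centrality measures named in the abstract above can be sketched in a few lines. The example below computes degree centrality and approximates eigenvector centrality by power iteration on a small undirected graph given as an adjacency dictionary. The input format, the function names, and the use of the shifted matrix A + I (which has the same leading eigenvector as A but avoids oscillation on bipartite graphs) are choices made here for illustration and are not taken from the paper. A non-empty, connected graph is assumed.

```python
def degree_centrality(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (A + I); scores have unit Euclidean norm."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Each node adds its neighbours' scores to its own.
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in adj}
        norm = sum(x * x for x in new.values()) ** 0.5
        score = {v: x / norm for v, x in new.items()}
    return score

# Star graph: node 0 is bonded to nodes 1, 2, and 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(degree_centrality(star))
print(eigenvector_centrality(star))
```

In a protein structure network the nodes would typically be residues and the edges contacts; the star graph here is only a toy input.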
Affiliation(s)
- David Foutch
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Bill Pham
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Tongye Shen
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- UT-ORNL Center for Molecular Biophysics, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
11
Kalinin SV, Zhang S, Valleti M, Pyles H, Baker D, De Yoreo JJ, Ziatdinov M. Disentangling Rotational Dynamics and Ordering Transitions in a System of Self-Organizing Protein Nanorods via Rotationally Invariant Latent Representations. ACS Nano 2021; 15:6471-6480. [PMID: 33861068 DOI: 10.1021/acsnano.0c08914] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
The dynamics of complex ordering systems with active rotational degrees of freedom exemplified by protein self-assembly is explored using a machine learning workflow that combines deep learning-based semantic segmentation and rotationally invariant variational autoencoder-based analysis of orientation and shape evolution. The latter allows for disentanglement of the particle orientation from other degrees of freedom and compensates for lateral shifts. The disentangled representations in the latent space encode the rich spectrum of local transitions that can now be visualized and explored via continuous variables. The time dependence of ensemble averages allows insight into the time dynamics of the system and, in particular, illustrates the presence of the potential ordering transition. Finally, analysis of the latent variables along the single-particle trajectory allows tracing these parameters on a single-particle level. The proposed approach is expected to be universally applicable for the description of the imaging data in optical, scanning probe, and electron microscopy seeking to understand the dynamics of complex systems where rotations are a significant part of the process.
Affiliation(s)
- Sergei V Kalinin
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Shuai Zhang
- Materials Science and Engineering, University of Washington, Seattle, Washington 98195, United States
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Mani Valleti
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, Tennessee 37996, United States
- Harley Pyles
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
- James J De Yoreo
- Materials Science and Engineering, University of Washington, Seattle, Washington 98195, United States
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Maxim Ziatdinov
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
12
Schweinberger M, Krivitsky PN, Butts CT, Stewart JR. Exponential-Family Models of Random Graphs: Inference in Finite, Super and Infinite Population Scenarios. Stat Sci 2020. [DOI: 10.1214/19-sts743] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 11/19/2022]
13
Cross TJ, Takahashi GR, Diessner EM, Crosby MG, Farahmand V, Zhuang S, Butts CT, Martin RW. Sequence Characterization and Molecular Modeling of Clinically Relevant Variants of the SARS-CoV-2 Main Protease. Biochemistry 2020; 59:3741-3756. [PMID: 32931703 PMCID: PMC7518256 DOI: 10.1021/acs.biochem.0c00462] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 06/02/2020] [Revised: 09/12/2020] [Indexed: 02/08/2023]
Abstract
The SARS-CoV-2 main protease (Mpro) is essential to viral replication and cleaves highly specific substrate sequences, making it an obvious target for inhibitor design. However, as for any virus, SARS-CoV-2 is subject to constant neutral drift and selection pressure, with new Mpro mutations arising over time. Identification and structural characterization of Mpro variants is thus critical for robust inhibitor design. Here we report sequence analysis, structure predictions, and molecular modeling for seventy-nine Mpro variants, constituting all clinically observed mutations in this protein as of April 29, 2020. Residue substitution is widely distributed, with some tendency toward larger and more hydrophobic residues. Modeling and protein structure network analysis suggest differences in cohesion and active site flexibility, revealing patterns in viral evolution that have relevance for drug discovery.
Affiliation(s)
- Thomas J Cross
- Department of Chemistry, University of California, Irvine, California 92697-2025, United States
- Gemma R Takahashi
- Department of Molecular Biology and Biochemistry, University of California, Irvine, California 92697-3900, United States
- Elizabeth M Diessner
- Department of Chemistry, University of California, Irvine, California 92697-2025, United States
- California Institute for Telecommunications and Information Technology, University of California, Irvine, California 92697-3900, United States
- Marquise G Crosby
- Department of Molecular Biology and Biochemistry, University of California, Irvine, California 92697-3900, United States
- Vesta Farahmand
- Department of Chemistry, University of California, Irvine, California 92697-2025, United States
- Shannon Zhuang
- Department of Chemistry, University of California, Irvine, California 92697-2025, United States
- Carter T Butts
- California Institute for Telecommunications and Information Technology, University of California, Irvine, California 92697-3900, United States
- Departments of Sociology, Statistics, Computer Science, and Electrical Engineering and Computer Science, University of California, Irvine, California 92697-3900, United States
- Rachel W Martin
- Department of Chemistry, University of California, Irvine, California 92697-2025, United States
- Department of Molecular Biology and Biochemistry, University of California, Irvine, California 92697-3900, United States
14
Choi H, Lee W, Lee G, Yoon DS, Na S. The Formation Mechanism of Segmented Ring-Shaped Aβ Oligomers and Protofibrils. ACS Chem Neurosci 2019; 10:3830-3838. [PMID: 31313912 DOI: 10.1021/acschemneuro.9b00324] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
A clear understanding of amyloid formation with diverse morphologies is critical to overcoming the fatal disease amyloidosis. Studies have revealed that monomer concentration is a crucial factor for determining amyloid morphologies, such as protofibrils, annular, or spherical oligomers. However, gaining a complete understanding of the mechanism of formation of the various amyloid morphologies has been limited by the lack of experimental devices and insufficient knowledge. In this study, we demonstrate that the monomer concentration is an essential factor in determining the morphology of beta-amyloid (Aβ) oligomers or protofibrils. By computational and experimental approaches, we investigated the strategies for structural stabilization of amyloid protein, the morphological changes, and amyloid aggregation. In particular, we found unprecedented conformations, e.g., single bent oligomers and segmented ring-shaped protofibrils, the formation of which was explained by the computational analysis. Our findings provide insight into the structural features of amyloid molecules formed at low concentrations of monomer, which will help determine the clinical targets (in therapy) to effectively inhibit amyloid formation in the early stages of the amyloid growth phase.
Affiliation(s)
- Wonseok Lee
- Department of Control and Instrumentation Engineering, Korea University, Sejong 30019, Republic of Korea