1
|
Sekmen A, Al Nasr K, Bilgin B, Koku AB, Jones C. Mathematical and Machine Learning Approaches for Classification of Protein Secondary Structure Elements from Cα Coordinates. Biomolecules 2023; 13:923. [PMID: 37371503 DOI: 10.3390/biom13060923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 05/16/2023] [Accepted: 05/16/2023] [Indexed: 06/29/2023]
Abstract
Determining Secondary Structure Elements (SSEs) for any protein is crucial as an intermediate step for experimental tertiary structure determination. SSEs are identified using popular tools such as DSSP and STRIDE. These tools use atomic information to locate hydrogen bonds and thereby identify SSEs. When some spatial atomic details are missing, locating SSEs becomes difficult. To address this problem, three approaches for classifying SSE types using Cα atoms in protein chains were developed: (1) a mathematical approach, (2) a deep learning approach, and (3) an ensemble of five machine learning models. The proposed methods were compared against each other and with a state-of-the-art approach, PCASSO.
Affiliation(s)
- Ali Sekmen
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| | - Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| | - Bahadir Bilgin
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
- Department of Mechanical Engineering, Middle East Technical University, Ankara 06800, Türkiye
| | - Ahmet Bugra Koku
- Department of Mechanical Engineering, Middle East Technical University, Ankara 06800, Türkiye
- Center for Robotics and AI, Middle East Technical University, Ankara 06800, Türkiye
| | - Christopher Jones
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| |
|
2
|
Zhang Y, Liu X, Chen J. Re-Balancing Replica Exchange with Solute Tempering for Sampling Dynamic Protein Conformations. J Chem Theory Comput 2023; 19:1602-1614. [PMID: 36791464 PMCID: PMC10795075 DOI: 10.1021/acs.jctc.2c01139] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 02/17/2023]
Abstract
Replica exchange with solute tempering (REST) is a highly effective variant of replica exchange for enhanced sampling in explicit solvent simulations of biomolecules. By scaling the Hamiltonian for a selected "solute" region of the system, REST effectively applies tempering only to the degrees of freedom of interest but not the rest of the system ("solvent"), allowing fewer replicas for covering the same temperature range. A key consideration of REST is how the solute-solvent interactions are scaled together with the solute-solute interactions. Here, we critically evaluate the performance of the latest REST2 protocol for sampling large-scale conformation fluctuations of intrinsically disordered proteins (IDPs). The results show that REST2 promotes artificial protein conformational collapse at high effective temperatures, which seems to be a designed feature originally to promote the sampling of reversible folding of small proteins. The collapse is particularly severe with larger IDPs, leading to replica segregation in the effective temperature space and hindering effective sampling of large-scale conformational changes. We propose that the scaling of the solute-solvent interactions can be treated as free parameters in REST, which can be tuned to control the solute conformational properties (e.g., chain expansion) at different effective temperatures and achieve more effective sampling. To this end, we derive a new REST3 protocol, where the strengths of the solute-solvent van der Waals interactions are recalibrated to reproduce the levels of protein chain expansion at high effective temperatures. The efficiency of REST3 is examined using two IDPs with nontrivial local and long-range structural features, including the p53 N-terminal domain and the kinase inducible transactivation domain of transcription factor CREB. 
The results suggest that REST3 leads to a much more efficient temperature random walk and improved sampling efficiency, which also further reduces the number of replicas required. Nonetheless, our analysis also reveals significant challenges of relying on tempering alone for sampling large-scale conformational fluctuations of disordered proteins. It is likely that more efficient sampling protocols will require incorporating more sophisticated Hamiltonian replica exchange schemes in addition to tempering.
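The scaling idea described in this abstract can be sketched in a few lines. The sketch below splits a replica's energy into solute-solute, solute-solvent, and solvent-solvent terms and scales each separately. The REST2-style factors (λ for solute-solute, √λ for solute-solvent, with λ = T0/Tm) follow the commonly cited form of that protocol. The `kappa` argument is a hypothetical stand-in for treating the solute-solvent scaling as a free parameter; it is not the calibrated REST3 protocol, which the abstract says recalibrates only the van der Waals part.

```python
import math


def rest_scaled_energy(e_pp, e_pw, e_ww, t0, tm, kappa=1.0):
    """Effective potential energy of a replica at effective temperature tm.

    e_pp, e_pw, e_ww: solute-solute, solute-solvent, and solvent-solvent
    energies from the unscaled force field.
    t0: temperature of the replica of interest; tm: effective temperature.
    kappa: hypothetical extra factor on the solute-solvent term
    (kappa = 1.0 reproduces the REST2-style scaling).
    """
    if t0 <= 0 or tm <= 0:
        raise ValueError("temperatures must be positive")
    lam = t0 / tm  # solute-solute scaling factor
    return lam * e_pp + kappa * math.sqrt(lam) * e_pw + e_ww


# At tm == t0 nothing is scaled; at higher tm the solute terms are weakened.
base = rest_scaled_energy(-100.0, -50.0, -1000.0, t0=300.0, tm=300.0)
hot = rest_scaled_energy(-100.0, -50.0, -1000.0, t0=300.0, tm=1200.0)
```

Raising `kappa` above 1.0 at high effective temperature strengthens solute-solvent attraction relative to the REST2-style form, which is the kind of knob the abstract describes for countering chain collapse.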
Affiliation(s)
- Yumeng Zhang
- Department of Chemistry, University of Massachusetts, Amherst, MA 01003, USA
| | - Xiaorong Liu
- Corresponding Authors: (XL), (JC), Phone: (413) 545-3386 (JC)
| | - Jianhan Chen
- Department of Chemistry, University of Massachusetts, Amherst, MA 01003, USA
| |
|
3
|
Baidya L, Reddy G. pH Induced Switch in the Conformational Ensemble of Intrinsically Disordered Protein Prothymosin-α and Its Implications for Amyloid Fibril Formation. J Phys Chem Lett 2022; 13:9589-9598. [PMID: 36206480 DOI: 10.1021/acs.jpclett.2c01972] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/16/2023]
Abstract
Aggregation of intrinsically disordered proteins (IDPs) can lead to neurodegenerative diseases. Although there is experimental evidence that acidic pH promotes IDP monomer compaction leading to aggregation, the general mechanism is unclear. We studied the pH effect on the conformational ensemble of prothymosin-α (proTα), which is involved in multiple essential functions, and probed its role in aggregation using computer simulations. We show that compaction in the proTα dimension at low pH is due to the protein's collapse in the intermediate region (E41-D80) rich in glutamic acid residues, enhancing its β-sheet content. We observed by performing dimer simulations that the conformations with high β-sheet content could act as aggregation-prone (N*) states and nucleate the aggregation process. The simulations initiated using N* states form dimers within a microsecond time scale, whereas the non-N* states do not form dimers within this time scale. This study contributes to understanding the general principles of pH-induced IDP aggregation.
Affiliation(s)
- Lipika Baidya
- Solid State and Structural Chemistry Unit, Indian Institute of Science, Bengaluru, Karnataka 560012, India
| | - Govardhan Reddy
- Solid State and Structural Chemistry Unit, Indian Institute of Science, Bengaluru, Karnataka 560012, India
| |
|
4
|
Rizuan A, Jovic N, Phan TM, Kim YC, Mittal J. Developing Bonded Potentials for a Coarse-Grained Model of Intrinsically Disordered Proteins. J Chem Inf Model 2022; 62:4474-4485. [PMID: 36066390 PMCID: PMC10165611 DOI: 10.1021/acs.jcim.2c00450] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/29/2022]
Abstract
Recent advances in residue-level coarse-grained (CG) computational models have enabled molecular-level insights into biological condensates of intrinsically disordered proteins (IDPs), shedding light on the sequence determinants of their phase separation. The existing CG models that treat protein chains as flexible molecules connected via harmonic bonds cannot populate common secondary-structure elements. Here, we present a CG dihedral angle potential between four neighboring beads centered at Cα atoms to faithfully capture the transient helical structures of IDPs. In order to parameterize and validate our new model, we propose Cα-based helix assignment rules based on dihedral angles that succeed in reproducing the atomistic helicity results of a polyalanine peptide and folded proteins. We then introduce sequence-dependent dihedral angle potential parameters (εd) and use experimentally available helical propensities of naturally occurring 20 amino acids to find their optimal values. The single-chain helical propensities from the CG simulations for commonly studied prion-like IDPs are in excellent agreement with the NMR-based α-helix fraction, demonstrating that the new HPS-SS model can accurately produce structural features of IDPs. Furthermore, this model can be easily implemented for large-scale assembly simulations due to its simplicity.
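To make the geometry in this abstract concrete, the following sketch computes the dihedral angle defined by four consecutive Cα beads and flags angles that fall inside a helix-like window. The function names and the 30° to 70° default window are illustrative assumptions chosen around the roughly 50° Cα pseudo-dihedral of an ideal α-helix; they are not the assignment rules or the potential used in the HPS-SS model.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ca_dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four consecutive bead positions."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through beads 0, 1, 2
    n2 = _cross(b1, b2)  # normal of the plane through beads 1, 2, 3
    x = _dot(n1, n2)
    y = _dot(b1, _cross(n1, n2)) / math.sqrt(_dot(b1, b1))
    return math.degrees(math.atan2(y, x))


def chain_dihedrals(ca):
    """Dihedral for every window of four consecutive Cα positions."""
    return [ca_dihedral(*ca[i:i + 4]) for i in range(len(ca) - 3)]


def helical_mask(angles, lo=30.0, hi=70.0):
    """True where an angle lies in the (hypothetical) helix-like window."""
    return [lo <= a <= hi for a in angles]
```

With this sign convention a right-handed helix gives positive angles, so a per-residue helicity estimate could be obtained by averaging `helical_mask(chain_dihedrals(ca))` over a trajectory.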
Affiliation(s)
- Azamat Rizuan
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843, United States
| | - Nina Jovic
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843, United States
| | - Tien M Phan
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843, United States
| | - Young C Kim
- Center for Materials Physics and Technology, Naval Research Laboratory, Washington, District of Columbia 20375, United States
| | - Jeetain Mittal
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843, United States
| |
|
5
|
Antony JV, Koya R, Pournami PN, Nair GG, Balakrishnan JP. Protein secondary structure assignment using residual networks. J Mol Model 2022; 28:269. [PMID: 35997827 DOI: 10.1007/s00894-022-05271-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Accepted: 08/12/2022] [Indexed: 11/27/2022]
Abstract
Proteins are constructed from amino acid sequences. Their structural classifications include primary, secondary, tertiary, and quaternary, with tertiary and quaternary structures influencing protein function. Because a protein's structure is inextricably connected to its biological function, machine learning algorithms that can better anticipate the structures have the potential to lead to new scientific discoveries in human health and improve our capacity to develop new treatments. Protein secondary structure assignment enriches the structural and functional understanding of proteins. It helps in protein structure comparison and classification studies, besides facilitating secondary and tertiary structure prediction systems. Several secondary structure assignment methods have been developed since the 1980s, most of which are based on hydrogen bond analysis and atomic coordinate features. However, the assignment process becomes complex when protein data includes missing atoms. Deep neural networks are often referred to as universal function approximators because they can approximate any function to produce the desired output when properly designed and trained. Optimised deep learning architectures have already proven their ability to increase performance in a wide range of problems. Recently, the ResNet architecture has garnered significant interest due to its applicability in various areas, including image classification and protein contact map prediction. The proposed model, which is based on the ResNet architecture, assigns secondary structures using Cα atom coordinates. The model achieved an accuracy of 94% when evaluated against the benchmark and independent test sets. The findings encourage the development of new deep learning-based methods that are more generalised across various protein learning tasks. Furthermore, it allows computational biologists to delve deeper into integrating these techniques with experimental methods. 
The model codes are available at: https://github.com/jisnava/ResNet_for_Structure_Assignments/ .
Affiliation(s)
- Jisna Vellara Antony
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Kattangal, Kerala, 673601, India.
| | - Roosafeed Koya
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Kattangal, Kerala, 673601, India
| | | | - Gopakumar Gopalakrishnan Nair
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Kattangal, Kerala, 673601, India
| | | |
|
6
|
Automated Protein Secondary Structure Assignment from Cα Positions Using Neural Networks. Biomolecules 2022; 12:biom12060841. [PMID: 35740966 PMCID: PMC9220970 DOI: 10.3390/biom12060841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 11/17/2022]
Abstract
The assignment of secondary structure elements in protein conformations is necessary to interpret a protein model that has been established by computational methods. The process essentially involves labeling the amino acid residues with H (Helix), E (Strand), or C (Coil, also known as Loop). When particular atoms are absent from an input protein structure, the procedure becomes more complicated, especially when only the alpha carbon locations are known. Various techniques have been tested and applied to this problem during the last forty years. The application of machine learning techniques is the most recent trend. This contribution presents the HECA classifier, which uses neural networks to assign protein secondary structure types. The technique exclusively employs Cα coordinates. The Keras (TensorFlow) library was used to implement and train the neural network model. The BioShell toolkit was used to calculate the neural network input features from raw coordinates. The study’s findings show that neural network-based methods may be successfully used to take on structure assignment challenges when only Cα trace is available. Thanks to the careful selection of input features, our approach’s accuracy (above 97%) exceeded that of the existing methods.
|
7
|
Antony JV, Madhu P, Balakrishnan JP, Yadav H. Assigning secondary structure in proteins using AI. J Mol Model 2021; 27:252. [PMID: 34402969 DOI: 10.1007/s00894-021-04825-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/22/2021] [Accepted: 06/16/2021] [Indexed: 12/16/2022]
Abstract
Knowledge about protein structure assignment enriches the structural and functional understanding of proteins. Accurate and reliable structure assignment data is crucial for secondary structure prediction systems. Since the 1980s, various methods based on hydrogen bond analysis and atomic coordinate geometry, followed by machine learning, have been employed in protein structure assignment. However, the assignment process becomes challenging when atoms are missing from the protein files. We propose a multi-class classifier program named DLFSA for assigning protein secondary structure elements (SSEs) using convolutional neural networks (CNNs). A fast and efficient GPU-based parallel procedure extracts fragments from protein files. The model implemented in this work is trained with a subset of the protein fragments and achieves 88.1% and 82.5% train and test accuracy, respectively. The model uses only Cα coordinates for secondary structure assignments. The model has also been successfully tested on a few full-length proteins. Results from the fragment-based studies demonstrate the feasibility of applying deep learning solutions for structure assignment problems.
Affiliation(s)
- Jisna Vellara Antony
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Kerala, 673601, India.
| | - Prayagh Madhu
- Computer Science and Engineering Dept., Rajiv Gandhi Institute of Technology, Kottayam, India
| | | | - Hemant Yadav
- Department of Computer Science and Engineering, National Institute of Technology Calicut, Kerala, 673601, India
| |
|
8
|
Wang L, Cao C, Zuo S. Protein secondary structure assignment using pc-polyline and convolutional neural network. Proteins 2021; 89:1017-1029. [PMID: 33780034 DOI: 10.1002/prot.26079] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/13/2020] [Revised: 02/22/2021] [Accepted: 03/15/2021] [Indexed: 01/17/2023]
Abstract
MOTIVATION: The assignment of protein secondary structure elements (SSEs) underpins structural analysis and prediction. The backbone of a protein could be adequately represented using a pc-polyline that passes through the centers of its peptide planes. One salient feature of pc-polyline representation is that the secondary structure of a protein becomes recognizable in a matrix whose elements are the pairwise distances between two peptide plane centers. Thus, a pc-polyline could in turn be used to assign SSEs. RESULTS: Using a convolutional neural network (CNN), we confirm that a pc-polyline indeed contains enough information for the accurate assignment of the six SSE types: α-helix, β-sheet, β-bulge, 3₁₀-helix, turn, and loop. Applications to three large data sets show that the assignments by our CNN-based p2psse program agree very well with those by dssp and stride, and quite well with those by five other programs. The analyses of their SSE assignments raise some general questions about the characterization of protein secondary structure. In particular, the analyses illustrate the difficulty of giving a quantitative and consistent definition for each of the six SSE types, especially for 3₁₀-helix, β-bulge, turn, or loop, in terms of backbone H-bond patterns, backbone dihedral angles, Cα-polyline, or pc-polyline. The difficulty suggests that the SSE space, though dominated by the regions for the six SSE types, is to a certain degree continuous. AVAILABILITY: The program is available at https://github.com/wlincong/p2pSSE.
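The distance-matrix input described in this abstract can be illustrated with a short sketch. It assumes, purely for illustration, that each peptide plane center can be approximated by the midpoint of two consecutive Cα atoms; the paper's own definition of the center may differ, and both function names are hypothetical rather than part of p2pSSE.

```python
import math


def peptide_plane_centers(ca):
    """Approximate each peptide plane center as a Cα(i)/Cα(i+1) midpoint.

    Assumption: this midpoint stands in for the true plane center.
    """
    return [tuple((a + b) / 2.0 for a, b in zip(p, q))
            for p, q in zip(ca, ca[1:])]


def pairwise_distance_matrix(points):
    """Symmetric matrix of Euclidean distances between all point pairs."""
    return [[math.dist(p, q) for q in points] for p in points]


# Four collinear Cα atoms spaced 3.8 Å apart give three centers.
ca_trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
matrix = pairwise_distance_matrix(peptide_plane_centers(ca_trace))
```

A matrix of this kind, or windows cut from it along the diagonal, is the sort of image-like input a CNN could consume; how the authors window or normalize it is not stated in the abstract.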
Affiliation(s)
- Lincong Wang
- The College of Computer Science and Technology, Jilin University, Changchun, China
| | - Chen Cao
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Canada
| | - Shuxue Zuo
- The College of Computer Science and Technology, Jilin University, Changchun, China
| |
|
9
|
Adasme-Carreño F, Caballero J, Ireta J. PSIQUE: Protein Secondary Structure Identification on the Basis of Quaternions and Electronic Structure Calculations. J Chem Inf Model 2021; 61:1789-1800. [PMID: 33769809 DOI: 10.1021/acs.jcim.0c01343] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
The secondary structure is important in protein structure analysis, classification, and modeling. We have developed a novel method for secondary structure assignment, termed PSIQUE, based on the potential energy surface (PES) of polyalanine obtained using an infinitely long chain model and density functional theory calculations. First, uniform protein segments are determined in terms of a difference of quaternions between neighboring amino acids along the protein backbone. Then, the identification of the secondary structure motifs is carried out based on the minima found in the PES. PSIQUE shows good agreement with other secondary structure assignment methods. However, it provides better discrimination of subtle secondary structures (e.g., helix types) and termini and produces more uniform segments while also accounting for local distortions. Overall, PSIQUE provides a precise and reliable assignment of secondary structures, so it should be helpful for the detailed characterization of the protein structure.
Affiliation(s)
- Francisco Adasme-Carreño
- Departamento de Bioinformática, Centro de Bioinformática, Simulación y Modelado (CBSM), Facultad de Ingeniería, Universidad de Talca, Campus Talca, 1 Poniente No. 1141, Casilla 721, Talca 3460000, Chile
| | - Julio Caballero
- Departamento de Bioinformática, Centro de Bioinformática, Simulación y Modelado (CBSM), Facultad de Ingeniería, Universidad de Talca, Campus Talca, 1 Poniente No. 1141, Casilla 721, Talca 3460000, Chile
| | - Joel Ireta
- Departamento de Química, División de Ciencias Básicas e Ingeniería, Universidad Autónoma Metropolitana-Iztapalapa, A.P. 55-534, Ciudad de Mexico 09340, Mexico
| |
|
10
|
Wang L, Zhang Y, Zou S. The characterization of pc-polylines representing protein backbones. Proteins 2019; 88:307-318. [PMID: 31442337 DOI: 10.1002/prot.25803] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/06/2019] [Revised: 08/08/2019] [Accepted: 08/19/2019] [Indexed: 11/10/2022]
Abstract
The backbone of a protein is typically represented as either a Cα-polyline, a three-dimensional (3D) polyline that passes through the Cα atoms, or a tuple of ϕ,ψ pairs, while its fold is usually assigned using the 3D topological arrangement of the secondary structure elements (SSEs). It is tricky to obtain the SSE composition of a protein from the Cα-polyline representation, while its 3D SSE arrangement is not apparent in the two-dimensional (2D) ϕ,ψ representation. In this article, we first represent the backbone of a protein as a pc-polyline that passes through the centers of its peptide planes. We then analyze the pc-polylines for six different sets of proteins with high quality crystal structures. The results show that SSE composition becomes recognizable in the pc-polyline representation and consequently the geometrical property of the pc-polyline of a protein could be used to assign its secondary structure. Furthermore, our analysis finds that for each of the six sets the total length of a pc-polyline increases linearly with the number of peptide planes. Interestingly, a comparison of the six regression lines shows that they have almost identical slopes but different intercepts. Most interestingly, there exist decent linear correlations between the intercepts of the six lines and either the average helix contents or the average sheet contents, and between the intercepts and the average backbone hydrogen bonding energetics. Finally, we discuss the implications of the identified correlations for structure classification and protein folding, and the potential applications of pc-polyline representation to structure prediction and protein design.
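The linear relationship reported in this abstract suggests a simple two-step calculation: measure the total length of each polyline, then fit a straight line of length against the number of peptide planes. The sketch below shows one way this might be done with ordinary least squares; the function names and the toy numbers are illustrative assumptions, not data or code from the study.

```python
import math


def polyline_length(points):
    """Sum of the segment lengths between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy example: lengths that grow exactly linearly with the plane count.
plane_counts = [10, 20, 30]
total_lengths = [25.0, 45.0, 65.0]
slope, intercept = fit_line(plane_counts, total_lengths)
```

Fitting each protein set separately and comparing the resulting intercepts would mirror the comparison of regression lines described in the abstract.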
Affiliation(s)
- Lincong Wang
- The College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
| | - Yao Zhang
- The College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
| | - Shuxue Zou
- The College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
| |
|
11
|
Oldfield CJ, Chen K, Kurgan L. Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences. Methods Mol Biol 2019; 1958:73-100. [PMID: 30945214 DOI: 10.1007/978-1-4939-9161-7_4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
Many new methods for the sequence-based prediction of the secondary and supersecondary structures have been developed over the last several years. These and older sequence-based predictors are widely applied for the characterization and prediction of protein structure and function. These efforts have produced countless accurate predictors, many of which rely on state-of-the-art machine learning models and evolutionary information generated from multiple sequence alignments. We describe and motivate both types of predictions. We introduce concepts related to the annotation and computational prediction of the three-state and eight-state secondary structure as well as several types of supersecondary structures, such as β hairpins, coiled coils, and α-turn-α motifs. We review 34 predictors focusing on recent tools and provide detailed information for a selected set of 14 secondary structure and 3 supersecondary structure predictors. We conclude with several practical notes for the end users of these predictive methods.
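The three-state and eight-state annotations mentioned in this abstract are related by a reduction step, sketched below. The mapping shown (H, G, I to helix; E, B to strand; everything else to coil) is one widely used convention for DSSP-style codes and is an assumption here, since other reductions exist. The per-residue agreement score is the usual three-state accuracy, often called Q3.

```python
# One common reduction of DSSP-style eight-state codes to three states.
# Assumption: other conventions exist (for example, mapping G to coil).
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # helix types
    "E": "E", "B": "E",             # strand and isolated bridge
    "T": "C", "S": "C", "-": "C",   # turn, bend, and unassigned
}


def reduce_to_three_state(ss8):
    """Map an eight-state string to H/E/C; unknown codes become coil."""
    return "".join(EIGHT_TO_THREE.get(code, "C") for code in ss8)


def q3_accuracy(predicted, reference):
    """Fraction of residues whose three-state labels agree."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


three_state = reduce_to_three_state("HGIEBTS-")
score = q3_accuracy("HHEC", "HHCC")
```

Because the reduction is a convention, reported accuracies are only comparable when both the predictor and the reference use the same mapping.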
Affiliation(s)
- Christopher J Oldfield
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA
| | - Ke Chen
- School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, People's Republic of China
| | - Lukasz Kurgan
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA.
| |
|
12
|
Wei S, Ahlstrom LS, Brooks CL. Exploring Protein-Nanoparticle Interactions with Coarse-Grained Protein Folding Models. Small (Weinheim an der Bergstrasse, Germany) 2017; 13:10.1002/smll.201603748. [PMID: 28266786 PMCID: PMC5551056 DOI: 10.1002/smll.201603748] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 11/08/2016] [Revised: 12/08/2016] [Indexed: 05/28/2023]
Abstract
Understanding the fundamental biophysics behind protein-nanoparticle (NP) interactions is essential for the design and engineering of bio-NP systems. The authors describe the development of a coarse-grained protein-NP model that utilizes a structure-centric protein model. A key feature of the protein-NP model is the quantitative inclusion of the hydrophobic character of residues in the protein and their interactions with the NP surface. In addition, the curvature of the NP is taken into account, capturing the protein behavior on NPs of different sizes. The authors evaluate this model by comparison with experimental results for structure and adsorption of a model protein interacting with an NP. It is demonstrated that the simulation results recapitulate the structure of the small α/β protein GB1 on the NP for data from circular dichroism and fluorescence spectroscopy. In addition, the calculated protein adsorption free energy agrees well with the experimental value. The authors predict the dependence of protein folding on the NP size, surface chemistry, and temperature. The model has the potential to guide NP design efforts by predicting protein behavior on NP surfaces with various chemical properties and curvatures.
Affiliation(s)
- Shuai Wei
- Department of Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Logan S Ahlstrom
- Department of Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Charles L Brooks
- Department of Chemistry and Biophysics Program, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
13
|
|
14
|
Zou X, Wei S, Jasensky J, Xiao M, Wang Q, Brooks CL III, Chen Z. Molecular Interactions between Graphene and Biological Molecules. J Am Chem Soc 2017; 139:1928-1936. [PMID: 28092440 DOI: 10.1021/jacs.6b11226] [Citation(s) in RCA: 73] [Impact Index Per Article: 9.1] [Indexed: 11/28/2022]
Abstract
Applications of graphene have extended into areas of nanobio-technology such as nanobio-medicine, nanobio-sensing, as well as nanoelectronics with biomolecules. These applications involve interactions between proteins, peptides, DNA, RNA etc. and graphene, therefore understanding such molecular interactions is essential. For example, many applications based on using graphene and peptides require peptides to interact with (e.g., noncovalently bind to) graphene at one end, while simultaneously exposing the other end to the surrounding medium (e.g., to detect analytes in solution). To control and characterize peptide behavior on a graphene surface in solution is difficult. Here we successfully probed the molecular interactions between two peptides (cecropin P1 and MSI-78(C1)) and graphene in situ and in real-time using sum frequency generation (SFG) vibrational spectroscopy and molecular dynamics (MD) simulation. We demonstrated that the distribution of various planar (including aromatic (Phe, Trp, Tyr, and His)/amide (Asn and Gln)/Guanidine (Arg)) side-chains and charged hydrophilic (such as Lys) side-chains in a peptide sequence determines the orientation of the peptide adsorbed on a graphene surface. It was found that peptide interactions with graphene depend on the competition between both planar and hydrophilic residues in the peptide. Our results indicated that part of cecropin P1 stands up on graphene due to an unbalanced distribution of planar and hydrophilic residues, whereas MSI-78(C1) lies down on graphene due to an even distribution of Phe residues and hydrophilic residues. With such knowledge, we could rationally design peptides with desired residues to manipulate peptide-graphene interactions, which allows peptides to adopt optimized structure and exhibit excellent activity for nanobio-technological applications. 
This research again demonstrates the power to combine SFG vibrational spectroscopy and MD simulation in studying interfacial biological molecules.
Affiliation(s)
- Xingquan Zou
- Department of Chemistry and Department of Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Shuai Wei
- Department of Chemistry and Department of Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Joshua Jasensky
- Department of Chemistry and Department of Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Minyu Xiao
- Department of Chemistry and Department of Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Qiuming Wang
- Department of Chemistry and Department of Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Charles L Brooks III
- Department of Chemistry and Department of Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Zhan Chen
- Department of Chemistry and Department of Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
| |
|
15
|
Frank AT, Law SM, Ahlstrom LS, Brooks CL. Predicting protein backbone chemical shifts from Cα coordinates: extracting high resolution experimental observables from low resolution models. J Chem Theory Comput 2016; 11:325-31. [PMID: 25620895 PMCID: PMC4295808 DOI: 10.1021/ct5009125] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/14/2014] [Indexed: 12/18/2022]
Abstract
Given the demonstrated utility of coarse-grained modeling and simulations approaches in studying protein structure and dynamics, developing methods that allow experimental observables to be directly recovered from coarse-grained models is of great importance. In this work, we develop one such method that enables protein backbone chemical shifts (¹HN, ¹Hα, ¹³Cα, ¹³C, ¹³Cβ, and ¹⁵N) to be predicted from Cα coordinates. We show that our Cα-based method, LARMORCα, predicts backbone chemical shifts with comparable accuracy to some all-atom approaches. More importantly, we demonstrate that LARMORCα predicted chemical shifts are able to resolve native structure from decoy pools that contain both native and non-native models, and so it is sensitive to protein structure. As an application, we use LARMORCα to characterize the transient state of the fast-folding protein gpW using recently published NMR relaxation dispersion derived backbone chemical shifts. The model we obtain is consistent with the previously proposed model based on independent analysis of the chemical shift dispersion pattern of the transient state. We anticipate that LARMORCα will find utility as a tool that enables important protein conformational substates to be identified by “parsing” trajectories and ensembles generated using coarse-grained modeling and simulations.
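The idea of resolving native-like models from decoys by chemical-shift agreement can be illustrated with a minimal sketch. This is not LARMORCα itself: it assumes that predicted shifts for each model are already available from some predictor, and it simply ranks models by mean absolute error against measured shifts. The function names, the `(residue, nucleus)` dictionary layout, and all numbers are hypothetical and chosen only for illustration.

```python
def shift_error(predicted, observed):
    """Mean absolute error (ppm) over the (residue, nucleus) keys present in both inputs."""
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no shared (residue, nucleus) entries to compare")
    return sum(abs(predicted[key] - observed[key]) for key in shared) / len(shared)


def rank_models(predictions, observed):
    """Return model names sorted from best to worst agreement with the observed shifts."""
    return sorted(predictions, key=lambda name: shift_error(predictions[name], observed))


# Toy, made-up shifts in ppm; a real comparison would cover every assigned residue.
OBSERVED = {(1, "CA"): 56.0, (2, "CA"): 58.5, (1, "N"): 120.0}
PREDICTIONS = {
    "decoy_a": {(1, "CA"): 59.0, (2, "CA"): 55.0, (1, "N"): 125.0},
    "native_like": {(1, "CA"): 56.4, (2, "CA"): 58.0, (1, "N"): 121.0},
}

RANKING = rank_models(PREDICTIONS, OBSERVED)
```

In this toy case `RANKING` lists `native_like` before `decoy_a`, because its predicted shifts lie closer to the observed values.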
Affiliation(s)
- Aaron T Frank
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
|
16
|
Salmon L, Ahlstrom LS, Horowitz S, Dickson A, Brooks CL, Bardwell JCA. Capturing a Dynamic Chaperone-Substrate Interaction Using NMR-Informed Molecular Modeling. J Am Chem Soc 2016; 138:9826-39. [PMID: 27415450 DOI: 10.1021/jacs.6b02382] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 02/03/2023]
Abstract
Chaperones maintain a healthy proteome by preventing aggregation and by aiding in protein folding. Precisely how chaperones influence the conformational properties of their substrates, however, remains unclear. To achieve a detailed description of dynamic chaperone-substrate interactions, we fused site-specific NMR information with coarse-grained simulations. Our model system is the binding and folding of a chaperone substrate, immunity protein 7 (Im7), with the chaperone Spy. We first used an automated procedure in which NMR chemical shifts inform the construction of system-specific force fields that describe each partner individually. The models of the two binding partners are then combined to perform simulations on the chaperone-substrate complex. The binding simulations show excellent agreement with experimental data from multiple biophysical measurements. Upon binding, Im7 interacts with a mixture of hydrophobic and hydrophilic residues on Spy's surface, causing conformational exchange within Im7 to slow down as Im7 folds. Meanwhile, the motion of Spy's flexible loop region increases, allowing for better interaction with different substrate conformations, and helping offset losses in Im7 conformational dynamics that occur upon binding and folding. Spy then preferentially releases Im7 into a well-folded state. Our strategy has enabled a residue-level description of a dynamic chaperone-substrate interaction, improving our understanding of how chaperones facilitate substrate folding. More broadly, we validate our approach using two other binding partners, showing that this approach provides a general platform from which to investigate other flexible biomolecular complexes through the integration of NMR data with efficient computational models.
Affiliation(s)
- Loïc Salmon
- Department of Molecular, Cellular and Developmental Biology, and the Howard Hughes Medical Institute, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Logan S Ahlstrom
- Department of Molecular, Cellular and Developmental Biology, and the Howard Hughes Medical Institute, University of Michigan, Ann Arbor, Michigan 48109, United States; Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Scott Horowitz
- Department of Molecular, Cellular and Developmental Biology, and the Howard Hughes Medical Institute, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Alex Dickson
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Charles L Brooks
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States; Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - James C A Bardwell
- Department of Molecular, Cellular and Developmental Biology, and the Howard Hughes Medical Institute, University of Michigan, Ann Arbor, Michigan 48109, United States
|
17
|
A New Secondary Structure Assignment Algorithm Using Cα Backbone Fragments. Int J Mol Sci 2016; 17:333. [PMID: 26978354 PMCID: PMC4813195 DOI: 10.3390/ijms17030333] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 01/27/2016] [Revised: 02/26/2016] [Accepted: 02/29/2016] [Indexed: 11/17/2022] Open
Abstract
The assignment of secondary structure elements in proteins is a key step in the analysis of their structures and functions. We have developed an algorithm, SACF (secondary structure assignment based on Cα fragments), for secondary structure element (SSE) assignment based on the alignment of Cα backbone fragments with central poses derived by clustering known SSE fragments. The assignment algorithm consists of three steps: First, the outlier fragments on known SSEs are detected. Next, the remaining fragments are clustered to obtain the central fragments for each cluster. Finally, the central fragments are used as templates to make assignments. In a large-scale comparison of 11 secondary structure assignment methods, SACF, KAKSI and PROSS are found to have similar agreement with DSSP, while PCASSO agrees with DSSP best. SACF and PCASSO tend to assign fewer residues in the N- and C-cap regions, whereas KAKSI, P-SEA and SEGNO tend to add residues at the termini when the DSSP assignment is taken as the standard. Moreover, our algorithm is able to assign subtle helices (3₁₀-helix, π-helix and left-handed helix) and make uniform assignments, as well as to detect rare SSEs in β-sheets or long helices as outlier fragments from other programs. The structural uniformity should be useful for protein structure classification and prediction, while outlier fragments underlie the structure-function relationship.
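The final, template-matching step can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the SACF implementation: it replaces fragment alignment with a distance-matrix RMSD (which needs no superposition but cannot distinguish mirror images, so left-handed helices would be missed), and it uses two idealized five-residue templates in place of cluster-derived central fragments. The helper names, helix parameters, fragment length, and cutoff are all hypothetical.

```python
import math


def helical_ca(n, radius, rise, turn_deg):
    """Cα positions (Å) on an ideal helix; a stand-in for cluster-derived central fragments."""
    turn = math.radians(turn_deg)
    return [(radius * math.cos(i * turn), radius * math.sin(i * turn), rise * i)
            for i in range(n)]


def drmsd(a, b):
    """RMS difference between the intra-fragment Cα-Cα distance matrices of a and b."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs)
    return math.sqrt(total / len(pairs))


FRAG = 5  # assumed fragment length in residues
TEMPLATES = {
    "H": helical_ca(FRAG, 2.3, 1.5, 100.0),  # approximate alpha-helix geometry
    "E": helical_ca(FRAG, 1.0, 3.3, 180.0),  # approximate extended-strand geometry
}


def assign(coords, cutoff=1.0):
    """Label the central residue of each sliding fragment with its closest template."""
    labels = ["C"] * len(coords)
    for start in range(len(coords) - FRAG + 1):
        fragment = coords[start:start + FRAG]
        label, score = min(((name, drmsd(fragment, template))
                            for name, template in TEMPLATES.items()),
                           key=lambda item: item[1])
        if score <= cutoff:  # fragments far from every template stay coil ("C")
            labels[start + FRAG // 2] = label
    return "".join(labels)
```

Calling `assign` on an ideal eight-residue helix generated with `helical_ca(8, 2.3, 1.5, 100.0)` labels the four central residues `H`; the two residues at each end stay `C` because no full fragment is centered on them, a toy analogue of the cap-region trimming mentioned in the abstract.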
|
18
|
Lee KH, Chen J. Multiscale enhanced sampling of intrinsically disordered protein conformations. J Comput Chem 2015; 37:550-7. [PMID: 26052838 DOI: 10.1002/jcc.23957] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 02/01/2015] [Revised: 05/03/2015] [Accepted: 05/11/2015] [Indexed: 12/24/2022]
Abstract
In a recently developed multiscale enhanced sampling (MSES) technique, topology-based coarse-grained (CG) models are coupled to atomistic force fields to enhance the sampling of atomistic protein conformations. Here, the MSES protocol is refined by designing more sophisticated Hamiltonian/temperature replica exchange schemes that involve additional parameters in the MSES coupling restraint potential, to more carefully control how conformations are coupled between the atomistic and CG models. A specific focus is to derive an optimal MSES protocol for simulating conformational ensembles of intrinsically disordered proteins (IDPs). The efficacy of the refined protocols, referred to as MSES-soft asymptote (SA), was evaluated using two model peptides with various levels of residual helicity. The results show that MSES-SA generates more reversible helix-coil transitions and leads to improved convergence of various ensemble conformational properties. This study further suggests that more detailed CG models are likely necessary for more effective sampling of local conformational transitions of IDPs. © 2015 Wiley Periodicals, Inc.
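For readers unfamiliar with Hamiltonian/temperature replica exchange, the generic Metropolis swap criterion that such schemes build on can be sketched as follows. This is the standard textbook criterion, not the MSES-SA protocol: the coupling restraint potential and its additional parameters are not modeled here, and the function name, argument names, and energy units (kcal/mol) are assumptions made for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def exchange_probability(u_i_xi, u_i_xj, u_j_xi, u_j_xj, temp_i, temp_j):
    """Metropolis acceptance probability for swapping configurations x_i and x_j.

    u_a_xb is the potential energy of configuration x_b evaluated with the
    Hamiltonian of replica a; temp_i and temp_j are replica temperatures in K.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    # Change in the combined reduced energy if the two configurations are swapped.
    delta = beta_i * (u_i_xj - u_i_xi) + beta_j * (u_j_xi - u_j_xj)
    return 1.0 if delta <= 0.0 else math.exp(-delta)
```

When both replicas share one Hamiltonian, the expression reduces to the familiar temperature replica exchange rule, with `delta = (beta_i - beta_j) * (E_j - E_i)`; when they share one temperature, only the Hamiltonian differences contribute.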
Affiliation(s)
- Kuo Hao Lee
- Department of Biochemistry and Molecular Biophysics, Kansas State University, Manhattan, Kansas 66506
| | - Jianhan Chen
- Department of Biochemistry and Molecular Biophysics, Kansas State University, Manhattan, Kansas 66506
|