1
Li Z, Song G, Zhu J, Mu J, Sun Y, Hong X, Choi T, Cui X, Chen HF. Excited-Ground-State Transition of the RNA Strand Slippage Mechanism Captured by the Base-Specific Force Field. J Chem Theory Comput 2024. [PMID: 38980289 DOI: 10.1021/acs.jctc.4c00497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Excited-ground-state transition and strand slippage of RNA play key roles in transcription and translation in the central dogma. Owing to the limitations of current experimental techniques, the dynamic structure ensembles of RNA remain inadequately understood. Molecular dynamics simulations offer a promising complementary approach, whose accuracy depends on the force field. Here, we develop a new version of the RNA base-specific force field (BSFF2) to address the underestimation of base pairing stability and artificial backbone conformations. Extensive evaluations on typical RNA systems have comprehensively confirmed the accuracy of BSFF2. Furthermore, BSFF2 demonstrates exceptional efficiency in de novo folding of tetraloops and in reproducing the base pair reshuffling transition between RNA excited and ground states. We then explored the RNA strand slippage mechanism with BSFF2. We conducted a comprehensive three-dimensional structural investigation into the strand slippage of the most complex r(G4C2)9 repeat element and presented the molecular details of the dynamic transition along with the underlying mechanism. Our results in capturing strand slippage, the excited-ground transition, and de novo folding, together with simulations of various typical RNA motifs, indicate that BSFF2 should be a valuable tool for dynamic conformation research and structure prediction of RNA, and a future contribution to RNA-targeted drug design as well as RNA therapy development.
Affiliation(s)
- Zhengxin Li
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ge Song
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Junjie Zhu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Junxi Mu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yutong Sun
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Xiaokun Hong
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- College of Biological Science and Engineering, Fuzhou University, Fuzhou, Fujian 350116, China
- Taeyoung Choi
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Xiaochen Cui
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Hai-Feng Chen
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
2
Nagy G, Hoffmann SV, Jones NC, Grubmüller H. Reference Data Set for Circular Dichroism Spectroscopy Comprised of Validated Intrinsically Disordered Protein Models. Applied Spectroscopy 2024:37028241239977. [PMID: 38646777 DOI: 10.1177/00037028241239977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Circular dichroism (CD) spectroscopy is an analytical technique that measures the wavelength-dependent differential absorbance of circularly polarized light and is applicable to most biologically important macromolecules, such as proteins, nucleic acids, and carbohydrates. It serves to characterize the secondary structure composition of proteins, including intrinsically disordered proteins, by analyzing their recorded spectra. Several computational tools have been developed to interpret protein CD spectra. These methods have been calibrated and tested mostly on globular proteins with well-defined structures, mainly due to the lack of reliable reference structures for disordered proteins. It is therefore still largely unclear how accurately these computational methods can determine the secondary structure composition of disordered proteins. Here, we provide such a required reference data set consisting of model structural ensembles and matching CD spectra for eight intrinsically disordered proteins. Using this set of data, we have assessed the accuracy of several published CD prediction and secondary structure estimation tools, including our own CD analysis package, SESCA. Our results show that for most of the tested methods, their accuracy for disordered proteins is generally lower than for globular proteins. In contrast, SESCA, which was developed using globular reference proteins, but was designed to be applicable to disordered proteins as well, performs similarly well for both classes of proteins. The new reference data set for disordered proteins should allow for further improvement of all published methods.
Affiliation(s)
- Gabor Nagy
- Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Nykola C Jones
- ISA, Department of Physics and Astronomy, Aarhus University, Aarhus, Denmark
- Helmut Grubmüller
- Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
3
Mu J, Li Z, Zhang B, Zhang Q, Iqbal J, Wadood A, Wei T, Feng Y, Chen HF. Graphormer supervised de novo protein design method and function validation. Brief Bioinform 2024; 25:bbae135. [PMID: 38557677 PMCID: PMC10982952 DOI: 10.1093/bib/bbae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 01/31/2024] [Accepted: 03/12/2024] [Indexed: 04/04/2024] [Open Access]
Abstract
Protein design is central to nearly all protein engineering problems, as it can enable the creation of proteins with new biological functions, such as improving the catalytic efficiency of enzymes. One key facet of protein design, fixed-backbone protein sequence design, seeks to design new sequences that will conform to a prescribed protein backbone structure. Nonetheless, existing sequence design methods have limitations, such as low sequence diversity and shortcomings in experimental validation of the designed functional proteins. These inadequacies obstruct the goal of functional protein design. To address these limitations, we developed the Graphormer-based Protein Design (GPD) model. This model applies the Transformer to a graph-based representation of three-dimensional protein structures and adds Gaussian noise and random sequence masks to node features, thereby enhancing sequence recovery and diversity. The performance of the GPD model was significantly better than that of the state-of-the-art ProteinMPNN model on multiple independent tests, especially for sequence diversity. We employed GPD to design CalB hydrolase and generated nine artificially designed CalB proteins. The results show a 1.7-fold increase in catalytic activity compared to that of the wild-type CalB and strong substrate selectivity on p-nitrophenyl acetate with different carbon chain lengths (C2-C16). Thus, the GPD method could be used for the de novo design of industrial enzymes and protein drugs. The code was released at https://github.com/decodermu/GPD.
Affiliation(s)
- Junxi Mu
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, No.5 Yiheyuan Road, Beijing, 100871, China
- Zhengxin Li
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Bo Zhang
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Qi Zhang
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Jamshed Iqbal
- Centre for Advanced Drug Research, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, 22060, Pakistan
- Abdul Wadood
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan, 23200, Pakistan
- Ting Wei
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Yan Feng
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Hai-Feng Chen
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
4
Viennet T, Yin M, Jayaraj A, Kim W, Sun ZYJ, Fujiwara Y, Zhang K, Seruggia D, Seo HS, Dhe-Paganon S, Orkin SH, Arthanari H. Structural Insights into the DNA-Binding Mechanism of BCL11A: The Integral Role of ZnF6. bioRxiv: the preprint server for biology 2024:2024.01.17.576058. [PMID: 38293057 PMCID: PMC10827156 DOI: 10.1101/2024.01.17.576058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
The transcription factor BCL11A is a critical regulator of the switch from fetal hemoglobin (HbF: α2γ2) to adult hemoglobin (HbA: α2β2) during development. BCL11A binds at a cognate recognition site (TGACCA) in the γ-globin gene promoter and represses its expression. DNA-binding is mediated by a triple zinc finger domain, designated ZnF456. Here, we report comprehensive investigation of ZnF456, leveraging X-ray crystallography and NMR to determine the structures in both the presence and absence of DNA. We delve into the dynamics and mode of interaction with DNA. Moreover, we discovered that the last zinc finger of BCL11A (ZnF6) plays a special role in DNA binding and γ-globin gene repression. Our findings help account for some rare γ-globin gene promoter mutations that perturb BCL11A binding and lead to increased HbF in adults (hereditary persistence of fetal hemoglobin). Comprehending the DNA binding mechanism of BCL11A opens avenues for the strategic, structure-based design of novel therapeutics targeting sickle cell disease and β-thalassemia.
5
Zhu J, Li Z, Tong H, Lu Z, Zhang N, Wei T, Chen HF. Phanto-IDP: compact model for precise intrinsically disordered protein backbone generation and enhanced sampling. Brief Bioinform 2023; 25:bbad429. [PMID: 38018910 PMCID: PMC10783862 DOI: 10.1093/bib/bbad429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/21/2023] [Accepted: 11/05/2023] [Indexed: 11/30/2023] [Open Access]
Abstract
The biological function of proteins is determined not only by their static structures but also by the dynamic properties of their conformational ensembles. Numerous high-accuracy static structure prediction tools have been recently developed based on deep learning; however, there remains a lack of efficient and accurate methods for exploring protein dynamic conformations. Traditionally, studies concerning protein dynamics have relied on molecular dynamics (MD) simulations, which incur significant computational costs for all-atom precision and struggle to adequately sample conformational spaces with high energy barriers. To overcome these limitations, various enhanced sampling techniques have been developed to accelerate sampling in MD. Traditional enhanced sampling approaches like replica exchange molecular dynamics (REMD) and frontier expansion sampling (FEXS) often follow the MD simulation approach and still cost a lot of computational resources and time. Variational autoencoders (VAEs), as a classic deep generative model, are not restricted by potential energy landscapes and can explore conformational spaces more efficiently than traditional methods. However, VAEs often face challenges in generating reasonable conformations for complex proteins, especially intrinsically disordered proteins (IDPs), which limits their application as an enhanced sampling method. In this study, we presented a novel deep learning model (named Phanto-IDP) that utilizes a graph-based encoder to extract protein features and a transformer-based decoder combined with variational sampling to generate highly accurate protein backbones. Ten IDPs and four structured proteins were used to evaluate the sampling ability of Phanto-IDP. 
The results demonstrate that Phanto-IDP has high fidelity and diversity in the generated conformation ensembles, making it a suitable tool for enhancing the efficiency of MD simulation, generating broader protein conformational space and a continuous protein transition path.
Affiliation(s)
- Junjie Zhu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Zhengxin Li
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Haowei Tong
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Zhouyu Lu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Ningjie Zhang
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Ting Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Hai-Feng Chen
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
6
Wang X, Wang Y, Guo M, Wang X, Li Y, Zhang JZH. Assessment of an Electrostatic Energy-Based Charge Model for Modeling the Electrostatic Interactions in Water Solvent. J Chem Theory Comput 2023; 19:6294-6312. [PMID: 37656610 DOI: 10.1021/acs.jctc.3c00467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/03/2023]
Abstract
The protein force field based on the restrained electrostatic potential (RESP) charges has limitations in accurately describing hydrogen bonding interactions in proteins. To address this issue, we propose an alternative approach called the electrostatic energy-based charges (EEC) model, which shows improved performance in describing electrostatic interactions (EIs) of hydrogen bonds in proteins. In this study, we further investigate the performance of the EEC model in modeling EIs in water solvent. Our findings demonstrate that the fixed EEC model can effectively reproduce the quantum mechanics/molecular mechanics (QM/MM)-calculated EIs between a water molecule and various water solvent environments. However, to achieve the same level of computational accuracy, the electrostatic potential (ESP) charge model needs to fluctuate according to the electrostatic environment. Our analysis indicates that the requirement for charge adjustments depends on the specific mathematical and physical representation of EIs as a function of the environment for deriving charges. By comparing with widely used empirical water models calibrated to reproduce experimental properties, we confirm that the performance of the EEC model in reproducing QM/MM EIs is similar to that of general purpose TIP4P-like water models such as TIP4P-Ew and TIP4P/2005. When comparing the computed 10,000 distinct EI values within the range of -40 to 0 kcal/mol with the QM/MM results calculated at the MP2/aug-cc-pVQZ/TIP3P level, we noticed that the mean unsigned error (MUE) for the EEC model is merely 0.487 kcal/mol, which is remarkably similar to the MUE values of the TIP4P-Ew (0.63 kcal/mol) and TIP4P/2005 (0.579 kcal/mol) models. However, both the RESP method and the TIP3P model exhibit a tendency to overestimate the EIs, as evidenced by their higher MUE values of 1.761 and 1.293 kcal/mol, respectively. 
EEC-based molecular dynamics simulations have demonstrated that, when combined with appropriate van der Waals parameters, the EEC model can closely reproduce oxygen-oxygen radial distribution function and density of water, showing a remarkable similarity to the well-established TIP4P-like empirical water models. Our results demonstrate that the EEC model has the potential to build force fields with comparable accuracy to more sophisticated empirical TIP4P-like water models.
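The mean unsigned error (MUE) quoted in this abstract is the average absolute deviation between model and reference energies. The following is a minimal sketch of that metric, not the authors' code; the function names and the sample energies are hypothetical and chosen only for illustration. A mean signed error is included because it shows the direction of a systematic bias, which the MUE alone does not.

```python
def mean_unsigned_error(predicted, reference):
    """Average absolute deviation between two equal-length sequences."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def mean_signed_error(predicted, reference):
    """Average signed deviation; its sign gives the direction of a systematic bias."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - r for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical interaction energies in kcal/mol (illustrative only, not from the paper).
qm_mm = [-12.0, -8.5, -3.0, -20.0]   # reference values
model = [-12.5, -8.0, -4.0, -21.0]   # charge-model values

mue = mean_unsigned_error(model, qm_mm)
mse = mean_signed_error(model, qm_mm)
print(f"MUE = {mue:.3f} kcal/mol, mean signed error = {mse:.3f} kcal/mol")
```

With attractive (negative) energies, a negative mean signed error means the model makes the interactions too strong on average.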
Affiliation(s)
- Xianwei Wang
- College of Science, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
- Yiying Wang
- College of Science, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
- Man Guo
- College of Science, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
- Xuechao Wang
- College of Science, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
- Yang Li
- College of Information Science and Engineering, Shandong Agricultural University, Tai'an, Shandong 271018, China
- John Z H Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
7
Pan Z, Mu J, Chen HF. Balanced Three-Point Water Model OPC3-B for Intrinsically Disordered and Ordered Proteins. J Chem Theory Comput 2023; 19:4837-4850. [PMID: 37452752 DOI: 10.1021/acs.jctc.3c00297] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/18/2023]
Abstract
Intrinsically disordered proteins (IDPs) play a critical role in many biological processes. Due to the inherent structural flexibility of IDPs, sampling their conformational information at the atomic level poses significant challenges for experimental methods. Therefore, molecular dynamics (MD) simulations have emerged as the primary tools for modeling IDPs, and their accuracy depends on the force field and water model. To enhance the accuracy of physical modeling of IDPs, several force fields have been developed. However, current water models lack precision and underestimate the interaction between water molecules and proteins. Here, we used a Monte Carlo re-weighting method to re-parameterize a three-point water model based on OPC3 for IDPs (named OPC3-B). We benchmarked the performance of OPC3-B against nine different water models for 10 IDPs and three ordered proteins. The results indicate that OPC3-B performs better than the other water models for both IDPs and ordered proteins. At the same time, OPC3-B is transferable to other force fields for simulating IDPs. This newly developed water model can be used to gain insight into the sequence-disordered-function paradigm for IDPs.
Affiliation(s)
- Zhengsong Pan
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- 4+4 Medical Doctor Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing 100730, China
- Junxi Mu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Hai-Feng Chen
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Center for Bioinformation Technology, Shanghai 200235, China
8
Zhu JJ, Zhang NJ, Wei T, Chen HF. Enhancing Conformational Sampling for Intrinsically Disordered and Ordered Proteins by Variational Autoencoder. Int J Mol Sci 2023; 24:ijms24086896. [PMID: 37108059 PMCID: PMC10138423 DOI: 10.3390/ijms24086896] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/03/2023] [Revised: 03/26/2023] [Accepted: 03/27/2023] [Indexed: 04/29/2023] [Open Access]
Abstract
Intrinsically disordered proteins (IDPs), which have no fixed three-dimensional structure under physiological conditions, account for more than 50% of the human proteome and are closely associated with tumors, cardiovascular diseases, and neurodegeneration. Because of their conformational diversity, conventional experimental methods of structural biology, such as NMR, X-ray diffraction, and cryo-EM, are unable to capture their conformational ensembles. Molecular dynamics (MD) simulation can sample dynamic conformations at the atomic level and has become an effective method for studying the structure and function of IDPs. However, the high computational cost prevents MD simulations from being widely used for IDP conformational sampling. In recent years, significant progress has been made in artificial intelligence, which makes it possible to solve the conformational reconstruction problem of IDPs with fewer computational resources. Here, based on short MD simulations of different IDP systems, we use variational autoencoders (VAEs) to achieve the generative reconstruction of IDP structures and include a wider range of sampled conformations from longer simulations. Compared with generative autoencoders (AEs), VAEs add an inference layer between the encoder and decoder in the latent space, which can cover the conformational landscape of IDPs more comprehensively and achieve the effect of enhanced sampling. Through experimental verification, the Cα RMSD between VAE-generated and MD-sampled conformations in the 5 IDP test systems was significantly lower than that of the AE, and the Spearman correlation coefficient on the structure was higher than that of the AE. VAEs can also achieve excellent performance for structured proteins. In summary, VAEs can be used to effectively sample protein structures.
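The Cα RMSD used above to compare generated and simulated conformations can be sketched as follows. This is an illustrative, hedged example rather than the authors' pipeline: the function name `ca_rmsd` and the coordinates are hypothetical, and the sketch assumes the two structures have already been superimposed (no alignment step is performed here).

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) C-alpha positions.

    Assumes both structures list the same residues in the same order and
    are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical three-residue fragment, coordinates in angstroms (illustrative only).
md_frame = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
generated = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"C-alpha RMSD = {ca_rmsd(md_frame, generated):.2f} A")
```

In practice an optimal superposition (for example, a Kabsch fit) is usually applied before the deviation is measured; it is left out here to keep the sketch dependency-free.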
Affiliation(s)
- Jun-Jie Zhu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ning-Jie Zhang
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ting Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Hai-Feng Chen
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Center for Bioinformation Technology, Shanghai 200240, China
9
Yu L, Brüschweiler R. Quantitative prediction of ensemble dynamics, shapes and contact propensities of intrinsically disordered proteins. PLoS Comput Biol 2022; 18:e1010036. [PMID: 36084124 PMCID: PMC9491582 DOI: 10.1371/journal.pcbi.1010036] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 09/21/2022] [Accepted: 08/03/2022] [Indexed: 12/29/2022] [Open Access]
Abstract
Intrinsically disordered proteins (IDPs) are highly dynamic systems that play an important role in cell signaling processes, and their malfunction often causes human disease. Proper understanding of IDP function requires realistic characterization not only of their three-dimensional conformational ensembles at atomic-level resolution but also of the time scales of interconversion between their conformational substates. Large sets of experimental data are often used in combination with molecular modeling to restrain or bias models to improve agreement with experiment. It is shown here for the N-terminal transactivation domain of p53 (p53TAD) and Pup, two IDPs that fold upon binding to their targets, how the latest advancements in molecular dynamics (MD) simulation methodology produce native conformational ensembles by combining replica exchange with series of microsecond MD simulations. They closely reproduce experimental data at the global conformational ensemble level, in terms of the distribution properties of the radius of gyration tensor, and at the local level, in terms of NMR properties including 15N spin relaxation, without the need for reweighting. Further inspection revealed that 10-20% of the individual MD trajectories display the formation of secondary structures not observed in the experimental NMR data. The IDP ensembles were analyzed by graph theory to identify dominant inter-residue contact clusters and characteristic amino-acid contact propensities. These findings indicate that modern MD force fields with residue-specific backbone potentials can produce highly realistic IDP ensembles sampling a hierarchy of nano- and picosecond time scales, providing new insights into their biological function.
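The radius of gyration tensor mentioned in this abstract describes the overall size and shape of a conformation. Below is a minimal sketch of one common definition, assuming equal weights for all atoms (no mass weighting); the function names and the test coordinates are hypothetical and are not taken from the paper.

```python
import math


def gyration_tensor(coords):
    """3x3 gyration tensor S[a][b] = mean over atoms of d_a * d_b,
    where d is the displacement from the geometric center (equal weights assumed)."""
    n = len(coords)
    if n == 0:
        raise ValueError("coords must be non-empty")
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for c in coords:
        d = [c[k] - center[k] for k in range(3)]
        for a in range(3):
            for b in range(3):
                tensor[a][b] += d[a] * d[b] / n
    return tensor


def radius_of_gyration(coords):
    """Rg is the square root of the trace of the gyration tensor."""
    t = gyration_tensor(coords)
    return math.sqrt(t[0][0] + t[1][1] + t[2][2])


# Hypothetical two-point "chain" stretched along x (illustrative only).
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(gyration_tensor(points)[0][0], radius_of_gyration(points))
```

The eigenvalues of this tensor give the principal extents of the conformation, so their distribution over an ensemble captures shape as well as size.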
Affiliation(s)
- Lei Yu
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio, United States of America
- Rafael Brüschweiler
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio, United States of America
- Department of Biological Chemistry and Pharmacology, The Ohio State University, Columbus, Ohio, United States of America
|
10
|
Chen J, Liu H, Cui X, Li Z, Chen HF. RNA-Specific Force Field Optimization with CMAP and Reweighting. J Chem Inf Model 2022; 62:372-385. [PMID: 35021622 DOI: 10.1021/acs.jcim.1c01148] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/23/2023]
Abstract
RNA plays a key role in a variety of cell activities. However, it is difficult to capture its structure dynamics by the traditional experimental methods because of the inherent limitations. Molecular dynamics simulation has become a valuable complement to the experimental methods. Previous studies have indicated that the current force fields cannot accurately reproduce the conformations and structural dynamics of RNA. Therefore, an RNA-specific force field was developed to improve the conformation sampling of RNA. The distribution of ζ/α dihedrals of tetranucleotides was optimized by a reweighting method, and the grid-based energy correction map (CMAP) term was first introduced into the Amber RNA force field of ff99bsc0χOL3, named ff99OL3_CMAP1. Extensive validations of tetranucleotides and tetraloops show that ff99OL3_CMAP1 can significantly decrease the population of an incorrect structure, increase the consistency between the simulation results and experimental values for tetranucleotides, and improve the stability of tetraloops. ff99OL3_CMAP1 can also precisely reproduce the conformation of a duplex and riboswitches. These findings confirm that the newly developed force field ff99OL3_CMAP1 can improve the conformer sampling of RNA.
Affiliation(s)
- Jun Chen
- State Key Laboratory of Microbial Metabolism, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240 Shanghai, China
- Hao Liu
- Institute of Natural Sciences, Shanghai Jiao Tong University, 200240 Shanghai, China
- Xiaochen Cui
- State Key Laboratory of Microbial Metabolism, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240 Shanghai, China
- Zhengxin Li
- State Key Laboratory of Microbial Metabolism, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240 Shanghai, China
- Hai-Feng Chen
- State Key Laboratory of Microbial Metabolism, Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240 Shanghai, China
- Shanghai Center for Bioinformation Technology, 200240 Shanghai, China
|