1. Wang Q, Miao Z, Xiao X, Zhang X, Yang D, Jiang B, Liu M. Prediction of order parameters based on protein NMR structure ensemble and machine learning. Journal of Biomolecular NMR 2024; 78:87-94. [PMID: 38530516 DOI: 10.1007/s10858-024-00435-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Accepted: 01/31/2024] [Indexed: 03/28/2024]
Abstract
The fast motions of proteins on the picosecond-to-nanosecond timescale, known as fast dynamics, are closely related to protein conformational entropy and rearrangement, which in turn affect catalysis, ligand binding, and protein allosteric effects. The most widely used NMR approach for studying fast protein dynamics is the model-free method, which uses the order parameter S² to describe the amplitude of the internal motion of a local group. However, obtaining order parameters through NMR experiments is complex and time-consuming. In this paper, we present a machine learning approach for predicting backbone ¹H-¹⁵N order parameters from protein NMR structure ensembles. A random forest model is used to learn the relationship between order parameters and structural features. On a test dataset of 10 proteins, the method predicts backbone ¹H-¹⁵N order parameters with a Pearson correlation coefficient of 0.817 and a root-mean-square error of 0.131.
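As a rough illustration of the two metrics quoted in this abstract, the Pearson correlation coefficient and the root-mean-square error can be computed from paired measured and predicted S² values as sketched here. This is a minimal sketch, not the authors' code: the function names `pearson_r` and `rmse` and the sample values are made up for illustration.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Deviations from the respective means.
    dt = [t - mean_t for t in y_true]
    dp = [p - mean_p for p in y_pred]
    covariance = sum(a * b for a, b in zip(dt, dp))
    norm = math.sqrt(sum(a * a for a in dt) * sum(b * b for b in dp))
    return covariance / norm


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical per-residue order parameters (illustrative values only).
s2_measured = [0.85, 0.80, 0.60, 0.90, 0.75]
s2_predicted = [0.80, 0.82, 0.70, 0.88, 0.72]

print("Pearson r:", pearson_r(s2_measured, s2_predicted))
print("RMSE:", rmse(s2_measured, s2_predicted))
```

Both functions assume the two sequences are aligned residue by residue and that neither is constant, since a constant sequence makes the correlation undefined.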
Affiliation(s)
- Qianqian Wang
  - Wuhan National Laboratory for Optoelectronics, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Huazhong University of Science and Technology, Wuhan, 430074, China
- Zhiwei Miao
  - Wuhan National Laboratory for Optoelectronics, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Huazhong University of Science and Technology, Wuhan, 430074, China
- Xiongjie Xiao
  - Wuhan National Laboratory for Optoelectronics, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Huazhong University of Science and Technology, Wuhan, 430074, China
- Xu Zhang
  - Wuhan National Laboratory for Optoelectronics, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Huazhong University of Science and Technology, Wuhan, 430074, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - Optics Valley Laboratory, Wuhan, 430074, China
- Daiwen Yang
  - Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Bin Jiang
  - Wuhan National Laboratory for Optoelectronics, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Huazhong University of Science and Technology, Wuhan, 430074, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - Optics Valley Laboratory, Wuhan, 430074, China
- Maili Liu
  - Wuhan National Laboratory for Optoelectronics, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Huazhong University of Science and Technology, Wuhan, 430074, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - Optics Valley Laboratory, Wuhan, 430074, China
2. Wang W, Su X, Liu D, Zhang H, Wang X, Zhou Y. Predicting DNA-binding protein and coronavirus protein flexibility using protein dihedral angle and sequence feature. Proteins 2023; 91:497-507. [PMID: 36321218 PMCID: PMC9877568 DOI: 10.1002/prot.26443] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/02/2022] [Revised: 09/07/2022] [Accepted: 10/20/2022] [Indexed: 11/07/2022]
Abstract
The flexibility of protein structure is related to various biological processes, such as molecular recognition, allosteric regulation, catalytic activity, and protein stability. At the molecular level, protein dynamics and flexibility are important for understanding protein function. DNA-binding proteins and coronavirus proteins are of great interest and are relatively distinctive classes of proteins. However, exploring their flexibility through experiments or calculations is difficult. Because protein dihedral rotational motion can be used to predict protein structural changes, it provides key information about local protein conformation. This paper therefore introduces DihProFle, a method that improves the accuracy of protein flexibility prediction for DNA-binding proteins and coronavirus proteins by incorporating calculated dihedral angle information. Based on protein dihedral angle information, protein evolutionary information, and the physicochemical properties of amino acids, DihProFle predicts flexibility for both DNA-binding proteins and coronavirus proteins and assigns a flexibility class to each position in the protein sequence. Compared with prediction using only sequence evolutionary information and amino acid physicochemical properties, adding dihedral angle information improved prediction accuracy by 2.2% and 3.1% under the nonstrict and strict conditions, respectively. DihProFle also achieves better performance than previous methods for protein flexibility analysis. In addition, we analyzed the correlation of amino acid properties and protein dihedral angles with residue flexibility. The results show that charged hydrophilic residues make up a higher proportion of flexible regions, and that rigidity and flexibility are associated with particular dihedral angle ranges (for example, residues whose ψ angle lies in the range 91°-120° are more often flexible than rigid). These results indicate that hydrophilic residues and protein dihedral angle information play an important role in protein flexibility.
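To make the notion of a dihedral angle concrete, the sketch below computes the signed torsion angle defined by four atom positions using the standard atan2 formulation. It is an illustrative sketch only, not DihProFle's implementation; the function name `dihedral_deg`, the helper names, and the toy coordinates are assumptions. For a backbone ψ angle, the four atoms would be N, Cα, and C of one residue and N of the next residue.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four consecutive points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)  # central bond, the rotation axis
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (hypothetical, not taken from any real structure).
angle = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
print("dihedral angle in degrees:", angle)
```

Using atan2 rather than an arccosine of the normalized dot product keeps the sign of the angle and avoids numerical problems near 0° and 180°.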
Affiliation(s)
- Wei Wang
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
  - Key Laboratory of Artificial Intelligence and Personalized Learning in Education of Henan Province, Xinxiang, China
- Xili Su
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Dong Liu
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Hongjun Zhang
  - School of Computer Science and Technology, Anyang University, Anyang, China
- Xianfang Wang
  - College of Computer Science and Technology Engineering, Henan Institute of Technology, Xinxiang, China
- Yun Zhou
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
3. Vander Meersche Y, Cretin G, de Brevern AG, Gelly JC, Galochkina T. MEDUSA: Prediction of Protein Flexibility from Sequence. J Mol Biol 2021; 433:166882. [PMID: 33972018 DOI: 10.1016/j.jmb.2021.166882] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 09/28/2020] [Revised: 02/12/2021] [Accepted: 02/13/2021] [Indexed: 12/11/2022]
Abstract
Information on protein flexibility is essential for understanding crucial molecular mechanisms such as protein stability, interactions with other molecules, and protein function in general. The B-factor obtained from X-ray crystallography experiments is the most common flexibility descriptor and is available for the majority of resolved protein structures. Because the gap between the number of resolved protein structures and the number of available protein sequences is continuously growing, computational tools that predict protein flexibility from amino acid sequence are needed. In the current study, we report MEDUSA (https://www.dsimb.inserm.fr/MEDUSA), a deep-learning-based tool for protein flexibility prediction. MEDUSA uses evolutionary information extracted from homologous protein sequences, together with amino acid physicochemical properties, as input to a convolutional neural network that assigns a flexibility class to each protein sequence position. Trained on a non-redundant dataset of X-ray structures, MEDUSA provides flexibility predictions in two, three, and five classes. MEDUSA is freely available as a web server that provides a clear visualization of the prediction results, as well as a standalone utility (https://github.com/DSIMB/medusa). Analysis of the MEDUSA output allows a user to identify potentially highly deformable protein regions and the general dynamic properties of the protein.
Affiliation(s)
- Yann Vander Meersche
  - Université de Paris, Inserm UMR_S 1134 - BIGR, INTS, 6 rue Alexandre Cabanel, 75015 Paris, France; Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Gabriel Cretin
  - Université de Paris, Inserm UMR_S 1134 - BIGR, INTS, 6 rue Alexandre Cabanel, 75015 Paris, France; Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Alexandre G de Brevern
  - Université de Paris, Inserm UMR_S 1134 - BIGR, INTS, 6 rue Alexandre Cabanel, 75015 Paris, France; Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Jean-Christophe Gelly
  - Université de Paris, Inserm UMR_S 1134 - BIGR, INTS, 6 rue Alexandre Cabanel, 75015 Paris, France; Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Tatiana Galochkina
  - Université de Paris, Inserm UMR_S 1134 - BIGR, INTS, 6 rue Alexandre Cabanel, 75015 Paris, France; Laboratoire d'Excellence GR-Ex, 75015 Paris, France