1
Liu B, Jiang Y, Yang Y, Chen JX. OmeDDG: Improved Protein Mutation Stability Prediction Based on Predicted 3D Structures. J Phys Chem B 2024; 128:67-76. [PMID: 38130113 DOI: 10.1021/acs.jpcb.3c05601] [Indexed: 12/23/2023]
Abstract
Determining changes in the protein's thermal stability following mutations is critical in protein engineering and understanding pathogenic missense mutations. Despite the development of various computational methods to predict the effects of single-point mutations, their accuracy remains limited. In this study, we propose a new computational method, OmeDDG, that more accurately predicts mutation-induced Gibbs free energy changes in protein folding (ΔΔG). OmeDDG takes the sequences of wild-type and mutant proteins as input, utilizes OmegaFold to obtain the 3D structure, employs a convolutional neural network to extract structural features, and combines them with protein mutation features and pretraining features to predict the stability of single-point mutations in proteins. We performed a comprehensive comparison between OmeDDG and other available prediction methods on four blind test datasets, confirming that OmeDDG can effectively enhance protein mutation prediction performance. Notably, on the antisymmetric dataset Ssym, OmeDDG achieves the best performance, demonstrating favorable antisymmetry with PCC = 0.79 and RMSE = 0.96 for forward mutations and PCC = 0.77 and RMSE = 0.97 for reverse mutant types.
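The two metrics quoted in the abstract (PCC and RMSE) and the antisymmetry property can be illustrated with a minimal sketch. It relies on the thermodynamic fact that the reverse mutation has the opposite free energy change, ΔΔG(reverse) = -ΔΔG(forward). The function names and the bias summary are illustrative assumptions, not taken from the OmeDDG paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient (PCC) between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(pred, true):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))

def antisymmetry_bias(forward_pred, reverse_pred):
    """Mean of (ddG_forward + ddG_reverse) / 2 over paired mutations.

    Assumption: this is one common way to summarise antisymmetry; it is
    zero for a predictor that is perfectly antisymmetric.
    """
    total = sum(f + r for f, r in zip(forward_pred, reverse_pred))
    return total / (2 * len(forward_pred))

# Hypothetical toy values, for illustration only.
exp_forward = [1.2, -0.5, 2.0, 0.3]
exp_reverse = [-v for v in exp_forward]   # antisymmetry of the experimental values
pred_forward = [1.0, -0.2, 1.5, 0.6]
pred_reverse = [-0.9, 0.3, -1.6, -0.4]

print(pearson(pred_forward, exp_forward), rmse(pred_forward, exp_forward))
print(pearson(pred_reverse, exp_reverse), rmse(pred_reverse, exp_reverse))
print(antisymmetry_bias(pred_forward, pred_reverse))
```

Reporting the metrics separately for the forward and reverse sets, as the abstract does for Ssym, shows whether a predictor is biased toward one direction.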
Affiliation(s)
- Baoying Liu
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
- Yongquan Jiang
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
  - Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
- Yan Yang
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
  - Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
- Jim X Chen
  - Department of Computer Science, George Mason University, Fairfax, Virginia 22030-4444, United States
2
Wang S, Tang H, Shan P, Wu Z, Zuo L. ProS-GNN: Predicting effects of mutations on protein stability using graph neural networks. Comput Biol Chem 2023; 107:107952. [PMID: 37643501 DOI: 10.1016/j.compbiolchem.2023.107952] [Received: 12/13/2022] [Revised: 08/18/2023] [Accepted: 08/25/2023] [Indexed: 08/31/2023]
Abstract
Predicting protein stability change upon variation through a computational approach is a valuable tool to unveil the mechanisms of mutation-induced drug failure and develop immunotherapy strategies. Some previous machine learning-based techniques exhibit anti-symmetric bias toward destabilizing situations, whereas others struggle with generalization to unseen examples. To address these issues, we propose a gated graph neural network-based approach to predict changes in protein stability upon mutation. The model uses message passing to encode the links between the molecular structure and property after eliminating the non-mutant structure and creating input feature vectors. While doing so, it also incorporates the coordinates of the raw atoms to provide spatial insights into the chemical systems. We test the model on the Ssym, Myoglobin, Broom, and p53 datasets to demonstrate the generalization performance. Compared to existing approaches, our proposed method achieves improved linearity with symmetry in less time. The code for this study is available at: https://github.com/HongzhouTang/Pros-GNN.
Affiliation(s)
- Shuyu Wang
  - Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China
- Hongzhou Tang
  - Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China
- Peng Shan
  - Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China
- Zhaoxia Wu
  - Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China
- Lei Zuo
  - Department of Marine Engineering, University of Michigan, Ann Arbor 48109, USA
3
Chen J, Woldring DR, Huang F, Huang X, Wei GW. Topological deep learning based deep mutational scanning. Comput Biol Med 2023; 164:107258. [PMID: 37506452 PMCID: PMC10528359 DOI: 10.1016/j.compbiomed.2023.107258] [Received: 05/09/2023] [Revised: 06/28/2023] [Accepted: 07/08/2023] [Indexed: 07/30/2023]
Abstract
High-throughput deep mutational scanning (DMS) experiments have significantly impacted protein engineering, drug discovery, immunology, cancer biology, and evolutionary biology by enabling the systematic understanding of protein functions. However, the mutational space associated with proteins is astronomically large, making it overwhelming for current experimental capabilities. Therefore, alternative methods for DMS are imperative. We propose a topological deep learning (TDL) paradigm to facilitate in silico DMS. We utilize a new topological data analysis (TDA) technique based on the persistent spectral theory, also known as persistent Laplacian, to capture both topological invariants and the homotopic shape evolution of data. To validate our TDL-DMS model, we use SARS-CoV-2 datasets and show excellent accuracy and reliability for binding interface mutations. This finding is significant for SARS-CoV-2 variant forecasting and designing effective antibodies and vaccines. Our proposed model is expected to have a significant impact on drug discovery, vaccine design, precision medicine, and protein engineering.
Affiliation(s)
- Jiahui Chen
  - Department of Mathematical Sciences, University of Arkansas, Fayetteville, AR 72701, USA
- Daniel R Woldring
  - Department of Chemical Engineering, Michigan State University, East Lansing, MI 48824, USA
- Faqing Huang
  - Department of Chemistry and Biochemistry, University of Southern Mississippi, Hattiesburg, MS 39406, USA
- Xuefei Huang
  - Department of Chemistry, Michigan State University, MI 48824, USA
  - Department of Biomedical Engineering, Michigan State University, East Lansing, MI 48824, USA
  - The Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
4
Pandey P, Panday SK, Rimal P, Ancona N, Alexov E. Predicting the Effect of Single Mutations on Protein Stability and Binding with Respect to Types of Mutations. Int J Mol Sci 2023; 24:12073. [PMID: 37569449 PMCID: PMC10418460 DOI: 10.3390/ijms241512073] [Received: 07/11/2023] [Revised: 07/24/2023] [Accepted: 07/26/2023] [Indexed: 08/13/2023]
Abstract
The development of methods and algorithms to predict the effect of mutations on protein stability, protein-protein interaction, and protein-DNA/RNA binding is necessitated by the needs of protein engineering and for understanding the molecular mechanism of disease-causing variants. The vast majority of the leading methods require a database of experimentally measured folding and binding free energy changes for training. These databases are collections of experimental data taken from scientific investigations typically aimed at probing the role of particular residues on the above-mentioned thermodynamic characteristics, i.e., the mutations are not introduced at random and do not necessarily represent mutations originating from single nucleotide variants (SNV). Thus, the reported performance of the leading algorithms assessed on these databases or other limited cases may not be applicable for predicting the effect of SNVs seen in the human population. Indeed, we demonstrate that the SNVs and non-SNVs are not equally presented in the corresponding databases, and the distribution of the free energy changes is not the same. It is shown that the Pearson correlation coefficients (PCCs) of folding and binding free energy changes obtained in cases involving SNVs are smaller than for non-SNVs, indicating that caution should be used in applying them to reveal the effect of human SNVs. Furthermore, it is demonstrated that some methods are sensitive to the chemical nature of the mutations, resulting in PCCs that differ by a factor of four across chemically different mutations. All methods are found to underestimate the energy changes by roughly a factor of 2.
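The SNV versus non-SNV distinction in this abstract can be made concrete with a small sketch: an amino acid substitution is treated as SNV-accessible if some codon of the wild-type residue differs from some codon of the mutant residue by exactly one nucleotide, under the standard genetic code. The function names are hypothetical, and this is only one plausible way to draw the line, not necessarily the procedure used by the authors.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (the third position varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons_for(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def is_snv_accessible(wild, mutant):
    """True if a single nucleotide change can turn `wild` into `mutant`."""
    return any(
        hamming(cw, cm) == 1
        for cw in codons_for(wild)
        for cm in codons_for(mutant)
    )

print(is_snv_accessible("K", "Q"))  # Lys -> Gln, e.g. AAA -> CAA
print(is_snv_accessible("W", "D"))  # Trp -> Asp requires three nucleotide changes
```

Splitting a benchmark database with a test like this is enough to compare PCCs on the SNV and non-SNV subsets separately.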
Affiliation(s)
- Preeti Pandey
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Shailesh Kumar Panday
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Prawin Rimal
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Nicolas Ancona
  - Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
5
Kan Y, Paung Y, Kim Y, Seeliger MA, Miller WT. Biochemical Studies of Systemic Lupus Erythematosus-Associated Mutations in Nonreceptor Tyrosine Kinases Ack1 and Brk. Biochemistry 2023; 62:1124-1137. [PMID: 36854171 PMCID: PMC10052838 DOI: 10.1021/acs.biochem.2c00685] [Indexed: 03/02/2023]
Abstract
Tyrosine kinases (TKs) play essential roles in signaling processes that regulate cell survival, migration, and proliferation. Dysregulation of tyrosine kinases underlies many disorders, including cancer, cardiovascular and developmental diseases, as well as pathologies of the immune system. Ack1 and Brk are nonreceptor tyrosine kinases (NRTKs) best known for their roles in cancer. Here, we have biochemically characterized novel Ack1 and Brk mutations identified in patients with systemic lupus erythematosus (SLE). These mutations are the first SLE-linked polymorphisms found among NRTKs. We show that two of the mutants are catalytically inactive, while the other three have reduced activity. To understand the structural changes associated with the loss-of-function phenotype, we solved the crystal structure of one of the Ack1 kinase mutants, K161Q. Furthermore, two of the mutated residues (Ack1 A156 and K161) critical for catalytic activity are highly conserved among other TKs, and their substitution in other members of the kinase family could have implications in cancer. In contrast to canonical gain-of-function mutations in TKs observed in many cancers, we report loss-of-function mutations in Ack1 and Brk, highlighting the complexity of TK involvement in human diseases.
Affiliation(s)
- Yagmur Kan
  - Department of Physiology and Biophysics, School of Medicine, Stony Brook University, Stony Brook, New York 11794-8661, United States
- YiTing Paung
  - Department of Pharmacology, School of Medicine, Stony Brook University, Stony Brook, New York 11794-8661, United States
- Yunyoung Kim
  - Department of Physiology and Biophysics, School of Medicine, Stony Brook University, Stony Brook, New York 11794-8661, United States
- Markus A Seeliger
  - Department of Pharmacology, School of Medicine, Stony Brook University, Stony Brook, New York 11794-8661, United States
- W Todd Miller
  - Department of Physiology and Biophysics, School of Medicine, Stony Brook University, Stony Brook, New York 11794-8661, United States
  - Department of Veterans Affairs Medical Center, Northport, New York 11768, United States
6
Tu H, Han Y, Wang Z, Li J. Clustered tree regression to learn protein energy change with mutated amino acid. Brief Bioinform 2022; 23:6702668. [PMID: 36124753 DOI: 10.1093/bib/bbac374] [Received: 06/24/2022] [Revised: 07/31/2022] [Accepted: 08/08/2022] [Indexed: 12/14/2022]
Abstract
Accurate and effective prediction of mutation-induced protein energy change remains a great challenge and of great interest in computational biology. However, high resource consumption and insufficient structural information of proteins severely limit the experimental techniques and structure-based prediction methods. Here, we design a structure-independent protocol to accurately and effectively predict the mutation-induced protein folding free energy change with only sequence, physicochemical and evolutionary features. The proposed clustered tree regression protocol is capable of effectively exploiting the inherent data patterns by integrating unsupervised feature clustering by K-means and supervised tree regression using XGBoost, and thus enabling fast and accurate protein predictions with different mutations, with an average Pearson correlation coefficient of 0.83 and an average root-mean-square error of 0.94kcal/mol. The proposed sequence-based method not only eliminates the dependence on protein structures, but also has potential applications in protein predictions with rare structural information.
Affiliation(s)
- Hongwei Tu
  - Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yanqiang Han
  - Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Zhilong Wang
  - Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Jinjin Li
  - Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
7
Yang ZY, Ye ZF, Xiao YJ, Hsieh CY, Zhang SY. SPLDExtraTrees: robust machine learning approach for predicting kinase inhibitor resistance. Brief Bioinform 2022; 23:6543900. [PMID: 35262669 DOI: 10.1093/bib/bbac050] [Received: 10/18/2021] [Revised: 01/17/2022] [Accepted: 01/31/2022] [Indexed: 12/25/2022]
Abstract
Drug resistance is a major threat to the global health and a significant concern throughout the clinical treatment of diseases and drug development. The mutation in proteins that is related to drug binding is a common cause for adaptive drug resistance. Therefore, quantitative estimations of how mutations would affect the interaction between a drug and the target protein would be of vital significance for the drug development and the clinical practice. Computational methods that rely on molecular dynamics simulations, Rosetta protocols, as well as machine learning methods have been proven to be capable of predicting ligand affinity changes upon protein mutation. However, the severely limited sample size and heavy noise induced overfitting and generalization issues have impeded wide adoption of machine learning for studying drug resistance. In this paper, we propose a robust machine learning method, termed SPLDExtraTrees, which can accurately predict ligand binding affinity changes upon protein mutation and identify resistance-causing mutations. Especially, the proposed method ranks training data following a specific scheme that starts with easy-to-learn samples and gradually incorporates harder and diverse samples into the training, and then iterates between sample weight recalculations and model updates. In addition, we calculate additional physics-based structural features to provide the machine learning model with the valuable domain knowledge on proteins for these data-limited predictive tasks. The experiments substantiate the capability of the proposed method for predicting kinase inhibitor resistance under three scenarios and achieve predictive accuracy comparable with that of molecular dynamics and Rosetta methods with much less computational costs.
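The training scheme described in this abstract (start from easy samples, add harder ones gradually, alternate between recomputing sample weights and refitting) follows the general pattern of self-paced learning. The sketch below shows that loop in its simplest form. It makes several simplifying assumptions that are not from the paper: a weighted straight-line fit stands in for the Extra Trees model, weights are hard 0/1 values chosen by loss rank, there is no diversity term, and all names and schedule parameters are hypothetical.

```python
def fit_weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = slope * x + intercept."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def self_paced_fit(xs, ys, start_frac=0.5, end_frac=0.9, rounds=5):
    """Alternate between ranking samples by loss and refitting on the easiest ones."""
    n = len(xs)
    weights = [1.0] * n
    slope, intercept = fit_weighted_line(xs, ys, weights)  # initial fit on all samples
    for r in range(rounds):
        # The fraction of samples admitted grows linearly from start_frac to end_frac.
        frac = start_frac + (end_frac - start_frac) * r / max(rounds - 1, 1)
        keep_n = max(2, round(frac * n))
        # Weight update: rank samples by squared error under the current model.
        losses = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
        easiest = set(sorted(range(n), key=losses.__getitem__)[:keep_n])
        weights = [1.0 if i in easiest else 0.0 for i in range(n)]
        # Model update: refit using only the currently admitted samples.
        slope, intercept = fit_weighted_line(xs, ys, weights)
    return slope, intercept, weights

# Hypothetical toy data: y = 2x + 1 with one grossly noisy label at x = 9.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[9] += 20.0

plain_slope, plain_intercept = fit_weighted_line(xs, ys, [1.0] * len(xs))
spl_slope, spl_intercept, spl_weights = self_paced_fit(xs, ys)
print(plain_slope, spl_slope, spl_weights)
```

On this toy set the plain fit is pulled toward the noisy point, while the self-paced loop never admits it because `end_frac` stops short of the full training set. How far to raise the admitted fraction is a design choice that trades robustness to noise against use of the full data.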
Affiliation(s)
- Zi-Yi Yang
  - Tencent Quantum Laboratory, Shenzhen, 518057, Guangdong, China
- Zhao-Feng Ye
  - Tencent Quantum Laboratory, Shenzhen, 518057, Guangdong, China
- Yi-Jia Xiao
  - Tencent Quantum Laboratory, Shenzhen, 518057, Guangdong, China
  - Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
- Chang-Yu Hsieh
  - Tencent Quantum Laboratory, Shenzhen, 518057, Guangdong, China
- Sheng-Yu Zhang
  - Tencent Quantum Laboratory, Shenzhen, 518057, Guangdong, China
8
Lai J, Yang J, Gamsiz Uzun ED, Rubenstein BM, Sarkar IN. LYRUS: a machine learning model for predicting the pathogenicity of missense variants. Bioinformatics Advances 2021; 2:vbab045. [PMID: 35036922 PMCID: PMC8754197 DOI: 10.1093/bioadv/vbab045] [Received: 07/08/2021] [Revised: 12/08/2021] [Accepted: 12/21/2021] [Indexed: 01/27/2023]
Abstract
SUMMARY: Single amino acid variations (SAVs) are a primary contributor to variations in the human genome. Identifying pathogenic SAVs can provide insights to the genetic architecture of complex diseases. Most approaches for predicting the functional effects or pathogenicity of SAVs rely on either sequence or structural information. This study presents 〈Lai Yang Rubenstein Uzun Sarkar〉 (LYRUS), a machine learning method that uses an XGBoost classifier to predict the pathogenicity of SAVs. LYRUS incorporates five sequence-based, six structure-based and four dynamics-based features. Uniquely, LYRUS includes a newly proposed sequence co-evolution feature called the variation number. LYRUS was trained using a dataset that contains 4363 protein structures corresponding to 22 639 SAVs from the ClinVar database, and tested using the VariBench testing dataset. Performance analysis showed that LYRUS achieved comparable performance to current variant effect predictors. LYRUS's performance was also benchmarked against six Deep Mutational Scanning datasets for PTEN and TP53. AVAILABILITY AND IMPLEMENTATION: LYRUS is freely available and the source code can be found at https://github.com/jiaying2508/LYRUS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Jiaying Lai
  - Center for Biomedical Informatics, Brown University, Providence, RI 02903, USA
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02906, USA
- Jordan Yang
  - Department of Chemistry, Brown University, Providence, RI 02906, USA
- Ece D Gamsiz Uzun
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02906, USA
  - Department of Pathology and Laboratory Medicine, Brown University Alpert Medical School, Providence, RI 02903, USA
  - Department of Pathology, Rhode Island Hospital, Providence, RI 02903, USA
- Brenda M Rubenstein
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02906, USA
  - Department of Chemistry, Brown University, Providence, RI 02906, USA
- Indra Neil Sarkar
  - Center for Biomedical Informatics, Brown University, Providence, RI 02903, USA
  - Rhode Island Quality Institute, Providence, RI 02908, USA
9
Sun T, Chen Y, Wen Y, Zhu Z, Li M. PremPLI: a machine learning model for predicting the effects of missense mutations on protein-ligand interactions. Commun Biol 2021; 4:1311. [PMID: 34799678 PMCID: PMC8604987 DOI: 10.1038/s42003-021-02826-3] [Received: 04/12/2021] [Accepted: 10/26/2021] [Indexed: 02/07/2023]
Abstract
Resistance to small-molecule drugs is the main cause of the failure of therapeutic drugs in clinical practice. Missense mutations altering the binding of ligands to proteins are one of the critical mechanisms that result in genetic disease and drug resistance. Computational methods have made considerable progress in predicting binding affinity changes and identifying resistance mutations, but their prediction accuracy and speed are still unsatisfactory and need further improvement. To address these issues, we introduce a structure-based machine learning method, PremPLI, for quantitatively estimating the effects of single mutations on ligand binding affinity. A comprehensive comparison of the predictive performance of PremPLI with other available methods on two benchmark datasets confirms that our approach performs robustly and achieves similar or even higher predictive accuracy than approaches relying on first-principle statistical mechanics and mixed physics- and knowledge-based potentials, while requiring far fewer computational resources. PremPLI can be used for guiding the design of ligand-binding proteins, identifying and understanding disease driver mutations, and finding potential resistance mutations for different drugs. PremPLI is freely available at https://lilab.jysw.suda.edu.cn/research/PremPLI/ and supports large-scale mutational scanning. Sun et al. present PremPLI, a machine learning approach and web tool to predict how missense mutations in a drug’s target protein will affect the drug’s binding affinity. PremPLI can be applied to identify drug resistance mechanisms in cancer cells and microorganisms and develop drugs to combat drug resistance.
Affiliation(s)
- Tingting Sun
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
- Yuting Chen
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
- Yuhao Wen
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
- Zefeng Zhu
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
- Minghui Li
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
10
Koirala M, Shashikala HBM, Jeffries J, Wu B, Loftus SK, Zippin JH, Alexov E. Computational Investigation of the pH Dependence of Stability of Melanosome Proteins: Implication for Melanosome Formation and Disease. Int J Mol Sci 2021; 22:8273. [PMID: 34361043 PMCID: PMC8347052 DOI: 10.3390/ijms22158273] [Received: 07/07/2021] [Revised: 07/27/2021] [Accepted: 07/29/2021] [Indexed: 11/16/2022]
Abstract
Intravesicular pH plays a crucial role in melanosome maturation and function. Melanosomal pH changes during maturation from very acidic in the early stages to neutral in late stages. Neutral pH is critical for providing optimal conditions for the rate-limiting, pH-sensitive melanin-synthesizing enzyme tyrosinase (TYR). This dramatic change in pH is thought to result from the activity of several proteins that control melanosomal pH. Here, we computationally investigated the pH-dependent stability of several melanosomal membrane proteins and compared them to the pH dependence of the stability of TYR. We confirmed that the pH optimum of TYR is neutral, and we also found that proteins that are negative regulators of melanosomal pH are predicted to function optimally at neutral pH. In contrast, positive pH regulators were predicted to have an acidic pH optimum. We propose a competitive mechanism among positive and negative regulators that results in pH equilibrium. Our findings are consistent with previous work that demonstrated a correlation between the pH optima of stability and activity, and they are consistent with the expected activity of positive and negative regulators of melanosomal pH. Furthermore, our data suggest that disease-causing variants impact the pH dependence of melanosomal proteins; this is particularly prominent for the OCA2 protein. In conclusion, melanosomal pH appears to affect the activity of multiple melanosomal proteins.
Affiliation(s)
- Mahesh Koirala
  - Department of Physics, Clemson University, Clemson, SC 29634, USA
- H. B. Mihiri Shashikala
  - Department of Physics, Clemson University, Clemson, SC 29634, USA
- Jacob Jeffries
  - Department of Physics, Clemson University, Clemson, SC 29634, USA
- Bohua Wu
  - Department of Physics, Clemson University, Clemson, SC 29634, USA
- Stacie K. Loftus
  - Genetic Disease Research Branch, National Human Genome Research Branch, National Institutes of Health, Bethesda, MD 22066, USA
- Jonathan H. Zippin
  - Department of Dermatology, Weill Cornell Medical College, New York, NY 10021, USA
- Emil Alexov
  - Department of Physics, Clemson University, Clemson, SC 29634, USA
11
Li G, Pahari S, Murthy AK, Liang S, Fragoza R, Yu H, Alexov E. SAAMBE-SEQ: a sequence-based method for predicting mutation effect on protein-protein binding affinity. Bioinformatics 2021; 37:992-999. [PMID: 32866236 PMCID: PMC8128451 DOI: 10.1093/bioinformatics/btaa761] [Received: 06/27/2020] [Revised: 08/17/2020] [Accepted: 08/24/2020] [Indexed: 01/04/2023]
Abstract
MOTIVATION: The vast majority of human genetic disorders are associated with mutations that affect protein-protein interactions by altering wild-type binding affinity. Therefore, it is extremely important to assess the effect of mutations on protein-protein binding free energy to assist the development of therapeutic solutions. Currently, the most popular approaches use structural information to deliver the predictions, which precludes their application to genome-scale investigations. Indeed, with the progress of genomic sequencing, researchers frequently need to assess the effect of mutations for which no structure is available. RESULTS: Here, we report a Gradient Boosting Decision Tree machine learning algorithm, SAAMBE-SEQ, which is completely sequence-based and does not require structural information at all. SAAMBE-SEQ utilizes 80 features representing evolutionary information, sequence-based features and the change of physical properties upon mutation at the mutation site. The approach is shown to achieve a Pearson correlation coefficient (PCC) of 0.83 in 5-fold cross validation in a benchmarking test against experimentally determined binding free energy change (ΔΔG). Further, a blind test (no-STRUC) is compiled by collecting experimental ΔΔG upon mutation for protein complexes for which no structure is available, and is used to benchmark SAAMBE-SEQ, resulting in PCC in the range of 0.37-0.46. The accuracy of the SAAMBE-SEQ method is found to be either better than or comparable to that of the most advanced structure-based methods. SAAMBE-SEQ is very fast, is available as a webserver and stand-alone code, and utilizes only sequence information; thus it is applicable to genome-scale investigations of the effect of mutations on protein-protein interactions. AVAILABILITY AND IMPLEMENTATION: SAAMBE-SEQ is available at http://compbio.clemson.edu/saambe_webserver/indexSEQ.php#started. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gen Li
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Swagata Pahari
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Siqi Liang
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
- Robert Fragoza
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
- Haiyuan Yu
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
12
Hayes RL, Brooks CL. A strategy for proline and glycine mutations to proteins with alchemical free energy calculations. J Comput Chem 2021; 42:1088-1094. [PMID: 33844328 DOI: 10.1002/jcc.26525] [Received: 09/27/2020] [Revised: 03/03/2021] [Accepted: 03/05/2021] [Indexed: 11/07/2022]
Abstract
Computation of the thermodynamic consequences of protein mutations holds great promise in protein biophysics and design. Alchemical free energy methods can give improved estimates of mutational free energies, and are already widely used in calculations of relative and absolute binding free energies in small molecule design problems. In principle, alchemical methods can address any amino acid mutation with an appropriate alchemical pathway, but identifying a strategy that produces such a path for proline and glycine mutations is an ongoing challenge. Most current strategies perturb only side chain atoms, while proline and glycine mutations also alter the backbone parameters and backbone ring topology. Some strategies also perturb backbone parameters and enable glycine mutations. This work presents a strategy that enables both proline and glycine mutations and comprises two key elements: a dual backbone with restraints and scaling of bonded terms, facilitating backbone parameter changes, and a soft bond in the proline ring, enabling ring topology changes in proline mutations. These elements also have utility for core hopping and macrocycle studies in computer-aided drug design. This new strategy shows slight improvements over an alternative side chain perturbation strategy for a set of T4 lysozyme mutations lacking proline and glycine, and yields good agreement with experiment for a set of T4 lysozyme proline and glycine mutations not previously studied. To our knowledge, this is the first report comparing alchemical predictions of proline mutations with experiment. With this strategy in hand, alchemical methods now have access to the full palette of amino acid mutations.
Affiliation(s)
- Ryan L Hayes
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan, USA
| | - Charles L Brooks
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan, USA.,Biophysics Program, University of Michigan, Ann Arbor, Michigan, USA
| |
|
13
|
Bahia MS, Khazanov N, Zhou Q, Yang Z, Wang C, Hong JS, Rab A, Sorscher EJ, Brouillette CG, Hunt JF, Senderowitz H. Stability Prediction for Mutations in the Cytosolic Domains of Cystic Fibrosis Transmembrane Conductance Regulator. J Chem Inf Model 2021; 61:1762-1777. [PMID: 33720715 DOI: 10.1021/acs.jcim.0c01207] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
Cystic Fibrosis (CF) is caused by mutations to the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) chloride channel. CFTR is composed of two membrane spanning domains, two cytosolic nucleotide-binding domains (NBD1 and NBD2) and a largely unstructured R-domain. Multiple CF-causing mutations reside in the NBDs and some are known to compromise the stability of these domains. The ability to predict the effect of mutations on the stability of the cytosolic domains of CFTR and to shed light on the mechanisms by which they exert their effect is therefore important in CF research. With this in mind, we have predicted the effect on domain stability of 59 mutations in NBD1 and NBD2 using 15 different algorithms and evaluated their performances via comparison to experimental data using several metrics including the correct classification rate (CCR), the squared Pearson correlation (R²), and Spearman's correlation (ρ) calculated between the experimental ΔTm values and the computationally predicted ΔΔG values. Overall, the best results were obtained with FoldX and Rosetta. For NBD1 (35 mutations), FoldX provided R² and ρ values of 0.64 and -0.71, respectively, with an 86% CCR. For NBD2 (24 mutations), FoldX R², ρ, and CCR were 0.51, -0.73, and 75%, respectively. Application of the Rosetta high-resolution protocol (Rosetta_hrp) to NBD1 yielded R², ρ, and CCR of 0.64, -0.75, and 69%, respectively, and for NBD2 yielded R², ρ, and CCR of 0.29, -0.27, and 50%, respectively. The corresponding numbers for Rosetta's low-resolution protocol (Rosetta_lrp) were R² = 0.47, ρ = -0.69, and CCR = 69% for NBD1 and R² = 0.27, ρ = -0.24, and CCR = 63% for NBD2. For NBD1, both algorithms suggest that destabilizing mutations suffer from destabilizing vdW clashes, whereas stabilizing mutations benefit from favorable H-bond interactions.
Two triple consensus approaches based on FoldX, Rosetta_lrp, and Rosetta_hrp were attempted using either "majority-voting" or "all-voting". The all-voting consensus outperformed the individual predictors, albeit on a smaller data set. In summary, our results suggest that the effect of mutations on the stability of CFTR's NBDs could be largely predicted. Since NBDs are common to all ABC transporters, these results may find use in predicting the effect and mechanism of the action of multiple disease-causing mutations in other proteins.
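The two consensus schemes named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the two class labels, and the toy predictions are all hypothetical. It shows why an all-voting consensus is scored on a smaller data set: mutations on which the three predictors disagree receive no call and drop out of the correct classification rate (CCR).

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by most predictors, or None on a tie."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def all_vote(labels):
    """Return the shared label only when every predictor agrees, else None."""
    return labels[0] if len(set(labels)) == 1 else None


def correct_classification_rate(predicted, observed):
    """Return (CCR, number of called mutations); None means no consensus call."""
    called = [(p, o) for p, o in zip(predicted, observed) if p is not None]
    if not called:
        return 0.0, 0
    hits = sum(p == o for p, o in called)
    return hits / len(called), len(called)


# Hypothetical toy calls: one row per mutation, one column per predictor.
calls = [
    ("destabilizing", "destabilizing", "destabilizing"),
    ("destabilizing", "stabilizing", "destabilizing"),
    ("stabilizing", "stabilizing", "stabilizing"),
    ("stabilizing", "destabilizing", "destabilizing"),
]
observed = ["destabilizing", "stabilizing", "stabilizing", "destabilizing"]

majority = [majority_vote(row) for row in calls]
unanimous = [all_vote(row) for row in calls]

majority_ccr, majority_n = correct_classification_rate(majority, observed)
all_ccr, all_n = correct_classification_rate(unanimous, observed)
print(f"majority-voting: CCR={majority_ccr:.2f} on {majority_n} mutations")
print(f"all-voting:      CCR={all_ccr:.2f} on {all_n} mutations")
```

In this toy example the all-voting scheme only scores the two mutations on which all three predictors agree, which mirrors the trade-off between accuracy and coverage that the abstract notes.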
Affiliation(s)
| | - Netaly Khazanov
- Department of Chemistry, Bar-Ilan University, Ramat-Gan, 5290002, Israel
| | - Qingxian Zhou
- School of Medicine, Division of Pulmonary, Allergy and Critical Care Medicine, University of Alabama at Birmingham, Birmingham, Alabama 35294, United States
| | - Zhengrong Yang
- School of Medicine, Division of Hematology & Oncology, University of Alabama at Birmingham, Birmingham, Alabama 35294, United States
| | - Chi Wang
- 702 Fairchild Center, MC3423, Department of Biological Sciences, Columbia University, New York, New York 10027, United States
| | - Jeong S Hong
- Department of Paediatrics, Emory University School of Medicine, Atlanta, Georgia 30303, United States
| | - Andras Rab
- Department of Paediatrics, Emory University School of Medicine, Atlanta, Georgia 30303, United States
| | - Eric J Sorscher
- Department of Paediatrics, Emory University School of Medicine, Atlanta, Georgia 30303, United States
| | - Christie G Brouillette
- Department of Biochemistry & Molecular Genetics, University of Alabama at Birmingham, Birmingham, Alabama 35294, United States
| | - John F Hunt
- 702 Fairchild Center, MC3423, Department of Biological Sciences, Columbia University, New York, New York 10027, United States
| | - Hanoch Senderowitz
- Department of Chemistry, Bar-Ilan University, Ramat-Gan, 5290002, Israel
| |
|
14
|
Blake S, Hemming I, Heng JIT, Agostino M. Structure-Based Approaches to Classify the Functional Impact of ZBTB18 Missense Variants in Health and Disease. ACS Chem Neurosci 2021; 12:979-989. [PMID: 33621064 DOI: 10.1021/acschemneuro.0c00758] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/18/2023]
Abstract
The Cys2His2 type zinc finger is a motif found in many eukaryotic transcription factor proteins that facilitates binding to genomic DNA so as to influence cellular gene expression. One such transcription factor is ZBTB18, characterized as a repressor that orchestrates the development of mammalian tissues including skeletal muscle and brain during embryogenesis. In humans, it has been recognized that disease-associated ZBTB18 missense variants mapping to the coding sequence of the zinc finger domain influence sequence-specific DNA binding, disrupt transcriptional regulation, and impair neural circuit formation in the brain. Furthermore, general population ZBTB18 missense variants that influence DNA binding and transcriptional regulation have also been documented within this domain; however, the molecular traits that explain why some variants cause disease while others do not are poorly understood. Here, we have applied five structure-based approaches to evaluate their ability to discriminate between disease-associated and general population ZBTB18 missense variants. We found that thermodynamic integration and Residue Scanning in the Schrödinger Biologics Suite were the best approaches for distinguishing disease-associated variants from general population variants. Our results demonstrate the effectiveness of structure-based approaches for the functional characterization of missense alleles of genes encoding DNA-binding zinc finger transcription factors that underlie human health and disease.
Affiliation(s)
- Steven Blake
- Curtin Health Innovation Research Institute, Curtin University, Bentley, Western Australia 6102, Australia
- Ralph and Patricia Sarich Neuroscience Research Institute, Nedlands, Western Australia 6009, Australia
- School of Pharmacy and Biomedical Sciences, Curtin University, Bentley, Western Australia 6845, Australia
| | - Isabel Hemming
- Curtin Health Innovation Research Institute, Curtin University, Bentley, Western Australia 6102, Australia
- Ralph and Patricia Sarich Neuroscience Research Institute, Nedlands, Western Australia 6009, Australia
- The Faculty of Health and Medical Sciences, Medical School, The University of Western Australia, Crawley, Western Australia 6009, Australia
| | - Julian Ik-Tsen Heng
- Curtin Health Innovation Research Institute, Curtin University, Bentley, Western Australia 6102, Australia
- Ralph and Patricia Sarich Neuroscience Research Institute, Nedlands, Western Australia 6009, Australia
| | - Mark Agostino
- Curtin Health Innovation Research Institute, Curtin University, Bentley, Western Australia 6102, Australia
- School of Pharmacy and Biomedical Sciences, Curtin University, Bentley, Western Australia 6845, Australia
- Curtin Institute for Computation, Curtin University, Bentley, Western Australia, Australia
| |
|
15
|
SAAFEC-SEQ: A Sequence-Based Method for Predicting the Effect of Single Point Mutations on Protein Thermodynamic Stability. Int J Mol Sci 2021; 22:ijms22020606. [PMID: 33435356 PMCID: PMC7827184 DOI: 10.3390/ijms22020606] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 12/05/2020] [Revised: 12/23/2020] [Accepted: 01/06/2021] [Indexed: 01/04/2023]
Abstract
Modeling the effect of mutations on protein thermodynamic stability is useful for protein engineering and understanding molecular mechanisms of disease-causing variants. Here, we report a new development of the SAAFEC method, SAAFEC-SEQ, which is a gradient boosting decision tree machine learning method to predict the change of the folding free energy caused by amino acid substitutions. The method does not require the 3D structure of the corresponding protein, but only its sequence and, thus, can be applied to genome-scale investigations where structural information is very sparse. SAAFEC-SEQ uses physicochemical properties, sequence features, and evolutionary information features to make the predictions. It is shown to consistently outperform all existing state-of-the-art sequence-based methods in both the Pearson correlation coefficient and root-mean-squared-error parameters as benchmarked on several independent datasets. SAAFEC-SEQ has been implemented into a web server and is available as stand-alone code that can be downloaded and embedded into other researchers’ code.
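The two benchmark metrics named in this abstract, the Pearson correlation coefficient (PCC) and the root-mean-squared error (RMSE), can be computed as in the sketch below. It uses only the standard definitions; the function names and the ΔΔG values are hypothetical toy data, not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root-mean-squared error, in the same units as the inputs."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ddG values in kcal/mol, for illustration only.
experimental = [-1.0, 0.0, 1.0, 2.0]
predicted = [-0.5, 0.5, 0.5, 1.5]

pcc_value = pearson(predicted, experimental)
rmse_value = rmse(predicted, experimental)
print(f"PCC = {pcc_value:.3f}, RMSE = {rmse_value:.3f} kcal/mol")
```

The two metrics are complementary: PCC measures how well predictions track the ranking and spread of experimental values, while RMSE measures absolute error, so a predictor can score well on one and poorly on the other.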
|
16
|
Chen Y, Lu H, Zhang N, Zhu Z, Wang S, Li M. PremPS: Predicting the impact of missense mutations on protein stability. PLoS Comput Biol 2020; 16:e1008543. [PMID: 33378330 PMCID: PMC7802934 DOI: 10.1371/journal.pcbi.1008543] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 07/23/2020] [Revised: 01/12/2021] [Accepted: 11/16/2020] [Indexed: 12/12/2022]
Abstract
Computational methods that predict protein stability changes induced by missense mutations have made a lot of progress over the past decades. Most of the available methods, however, have very limited accuracy in predicting stabilizing mutations because existing experimental sets are dominated by mutations reducing protein stability. Moreover, few approaches could consistently perform well across different test cases. To address these issues, we developed a new computational method, PremPS, to more accurately evaluate the effects of missense mutations on protein stability. The PremPS method is composed of only ten evolutionary- and structure-based features and parameterized on a balanced dataset with an equal number of stabilizing and destabilizing mutations. A comprehensive comparison of the predictive performance of PremPS with other available methods on nine benchmark datasets confirms that our approach consistently outperforms other methods and shows considerable improvement in estimating the impacts of stabilizing mutations. A protein could have multiple structures available, and if another structure of the same protein is used, the predicted change in stability for structure-based methods might be different. Thus, we further estimated the impact of using different structures on prediction accuracy, and demonstrate that our method performs well across different types of structures except for low-resolution structures and models built based on templates with low sequence identity. PremPS can be used for finding functionally important variants, revealing the molecular mechanisms of functional influences and protein design. PremPS is freely available at https://lilab.jysw.suda.edu.cn/research/PremPS/, which supports large-scale mutational scanning; it takes about four minutes to perform calculations for a single mutation in a protein with ~300 residues and requires ~0.4 seconds for each additional mutation.
The development of computational methods to accurately predict the impacts of amino acid substitutions on protein stability is of paramount importance for the field of protein design and understanding the roles of missense mutations in disease. However, most of the available methods have very limited predictive accuracy for mutations increasing stability and few could consistently perform well across different test cases. Here we present a new computational approach PremPS, which is capable of predicting the effects of single point mutations on protein stability. PremPS employs only ten evolutionary- and structure-based features and is trained on a symmetrical dataset consisting of the same number of cases of stabilizing and destabilizing mutations. Our method was tested against numerous blind datasets and shows a considerable improvement especially in evaluating the effects of stabilizing mutations, outperforming previously developed methods. PremPS is freely available as a user-friendly web server at http://lilab.jysw.suda.edu.cn/research/PremPS/, which is fast enough to handle the large number of cases.
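The timing figures quoted in the abstract (about four minutes for the first mutation in a protein of roughly 300 residues, then about 0.4 seconds per additional mutation) imply a simple linear cost model. The sketch below works through it for a full mutational scan; the function name and the assumption that the cost stays linear over thousands of mutations are illustrative and are not stated by the authors.

```python
def estimated_runtime_seconds(n_mutations,
                              first_mutation_s=240.0,
                              additional_mutation_s=0.4):
    """Linear runtime estimate: fixed cost for the first mutation plus a
    small increment for each additional mutation in the same protein."""
    if n_mutations < 1:
        raise ValueError("n_mutations must be at least 1")
    return first_mutation_s + additional_mutation_s * (n_mutations - 1)


# A full single-site scan replaces each residue with the 19 other amino acids.
protein_length = 300
substitutions_per_site = 19
full_scan = protein_length * substitutions_per_site

full_scan_minutes = estimated_runtime_seconds(full_scan) / 60
print(f"{full_scan} mutations: about {full_scan_minutes:.0f} minutes")
```

Under these assumptions the fixed per-protein cost dominates for small jobs, while a complete scan of a 300-residue protein is dominated by the per-mutation increment.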
Affiliation(s)
- Yuting Chen
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Haoyu Lu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Ning Zhang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Zefeng Zhu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Shuqin Wang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Minghui Li
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| |
|
17
|
Wang R, Chen J, Hozumi Y, Yin C, Wei GW. Decoding Asymptomatic COVID-19 Infection and Transmission. J Phys Chem Lett 2020; 11:10007-10015. [PMID: 33179934 PMCID: PMC8150094 DOI: 10.1021/acs.jpclett.0c02765] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Indexed: 05/24/2023]
Abstract
One of the major challenges in controlling the coronavirus disease 2019 (COVID-19) outbreak is its asymptomatic transmission. The pathogenicity and virulence of asymptomatic COVID-19 remain mysterious. On the basis of the genotyping of 75775 SARS-CoV-2 genome isolates, we reveal that asymptomatic infection is linked to the SARS-CoV-2 11083G>T mutation (i.e., L37F in nonstructural protein 6 (NSP6)). By analyzing the distribution of 11083G>T in various countries, we find that 11083G>T may correlate with the hypotoxicity of SARS-CoV-2. Moreover, we show a global decaying tendency of the 11083G>T mutation ratio, indicating that 11083G>T hinders the SARS-CoV-2 transmission capacity. Artificial intelligence, sequence alignment, and network analysis are applied to show that NSP6 mutation L37F may have compromised the virus's ability to undermine the innate cellular defense against viral infection via autophagy regulation. This assessment is in good agreement with our genotyping of the SARS-CoV-2 evolution and transmission across various countries and regions over the past few months.
Affiliation(s)
| | | | | | - Changchuan Yin
- Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607, United States
| | | |
|
18
|
Sarkar A, Yang Y, Vihinen M. Variation benchmark datasets: update, criteria, quality and applications. Database: The Journal of Biological Databases and Curation 2020; 2020:5710862. [PMID: 32016318 PMCID: PMC6997940 DOI: 10.1093/database/baz117] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/12/2019] [Revised: 06/03/2019] [Accepted: 07/01/2019] [Indexed: 02/07/2023]
Abstract
Development of new computational methods and testing their performance has to be carried out using experimental data. Only in comparison to existing knowledge can method performance be assessed. For that purpose, benchmark datasets with known and verified outcome are needed. High-quality benchmark datasets are valuable and may be difficult, laborious and time consuming to generate. VariBench and VariSNP are the two existing databases for sharing variation benchmark datasets used mainly for variation interpretation. They have been used for training and benchmarking predictors for various types of variations and their effects. VariBench was updated with 419 new datasets from 109 papers containing altogether 329 014 152 variants; however, there is plenty of redundancy between the datasets. VariBench is freely available at http://structure.bmc.lu.se/VariBench/. The contents of the datasets vary depending on information in the original source. The available datasets have been categorized into 20 groups and subgroups. There are datasets for insertions and deletions, substitutions in coding and non-coding region, structure mapped, synonymous and benign variants. Effect-specific datasets include DNA regulatory elements, RNA splicing, and protein property for aggregation, binding free energy, disorder and stability. Then there are several datasets for molecule-specific and disease-specific applications, as well as one dataset for variation phenotype effects. Variants are often described at three molecular levels (DNA, RNA and protein) and sometimes also at the protein structural level including relevant cross references and variant descriptions. The updated VariBench facilitates development and testing of new methods and comparison of obtained performances to previously published methods. 
We compared the performance of the pathogenicity/tolerance predictor PON-P2 to several benchmark studies, and show that such comparisons are feasible and useful; however, there may be limitations due to a lack of provided details and shared data. Database URL: http://structure.bmc.lu.se/VariBench
Affiliation(s)
- Anasua Sarkar
- Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184 Lund, Sweden
| | - Yang Yang
- School of Computer Science and Technology, Soochow University, No1. Shizi Street, Suzhou, 215006 Jiangsu, China.,Provincial Key Laboratory for Computer Information Processing Technology, No1. Shizi Street, Soochow University, Suzhou, 215006 Jiangsu, China
| | - Mauno Vihinen
- Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184 Lund, Sweden
| |
|
19
|
Heydari A, Abolnezhadian F, Sadeghi-Shabestari M, Saberi A, Shamsizadeh A, Ghadiri AA, Ghandil P. Identification of Cytochrome b-245, beta-chain gene mutations, and clinical presentations in Iranian patients with X-linked chronic granulomatous disease. J Clin Lab Anal 2020; 35:e23637. [PMID: 33098164 PMCID: PMC7891530 DOI: 10.1002/jcla.23637] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/05/2020] [Revised: 10/06/2020] [Accepted: 10/08/2020] [Indexed: 01/25/2023]
Abstract
Background: X‐linked chronic granulomatous disease (X‐CGD) is an immunodeficiency disorder caused by defects in the gp91phox subunit that leads to life‐threatening infections. We aimed to identify CYBB gene mutations and study clinical phenotypes in Iranian patients with probable X‐CGD. Methods: We studied four unrelated Iranian patients with probable X‐CGD and their families, recruited over several years. We isolated genomic DNA from whole blood and performed Sanger sequencing of the CYBB gene's coding and flanking regions. We also performed pathogenicity predictions using in silico tools. Results: We detected four different mutations in the patients, including a novel insertion mutation in exon 5 (p.Ile117AsnfsX6). Bioinformatics analysis confirmed the pathogenic effect of this mutation. Protein modeling indicated the loss of functional domains. The patient with the insertion mutation presented with pneumonia and acute sinusitis during his life. We also detected three other known nonsense mutations (p.Arg157Ter, p.Arg226Ter, and p.Trp424Ter) in the CYBB gene. The patient with p.Arg157Ter developed lymphadenitis and pneumonia. Moreover, the patient with inflammatory bowel disease carried p.Arg226Ter and the patient with tuberculosis carried p.Trp424Ter. We detected different clinical features in the patients compared to other Iranian patients with the same mutations. Conclusion: Our results expand the genetic database of patients with X‐CGD from Iran and make it much easier and faster to identify patients with X‐CGD. Our results also help to detect carriers and enable prenatal diagnosis in high‐risk families as a cost‐effective strategy.
Affiliation(s)
- Atefeh Heydari
- Cellular and Molecular Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.,Department of Medical Genetics, School of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| | - Farhad Abolnezhadian
- Department of Pediatrics, Abuzar Children's Hospital, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| | - Mahnaz Sadeghi-Shabestari
- Immunology research center of Tabriz-TB and lung research center of Tabriz-children hospital, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Alihossein Saberi
- Department of Medical Genetics, School of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| | - Ahmad Shamsizadeh
- Infectious and Tropical Diseases Research Center, Health Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| | - Ata A Ghadiri
- Department of Immunology, Cellular and Molecular Research Center, School of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| | - Pegah Ghandil
- Cellular and Molecular Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.,Department of Medical Genetics, School of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.,Diabetes Research Center, Health Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| |
|
20
|
Mohamadian M, Ghandil P, Naseri M, Bahrami A, Momen AA. A novel homozygous variant in an Iranian pedigree with cerebellar ataxia, mental retardation, and dysequilibrium syndrome type 4. J Clin Lab Anal 2020; 34:e23484. [PMID: 33079427 PMCID: PMC7676196 DOI: 10.1002/jcla.23484] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/27/2020] [Revised: 06/12/2020] [Accepted: 06/26/2020] [Indexed: 01/20/2023]
Abstract
BACKGROUND: Cerebellar ataxia, mental retardation, and dysequilibrium (CAMRQ) syndrome is a rare and early-onset neurodevelopmental disorder. Four subtypes of this syndrome have been identified, which are clinically and genetically different. To date, altogether 32 patients have been described with ATP8A2 mutations and phenotypic features assigned to CAMRQ type 4. Herein, three additional patients in an Iranian consanguineous family with non-progressive cerebellar ataxia, severe hypotonia, intellectual disability, dysarthria, and cerebellar atrophy have been identified. METHODS: Following a thorough clinical examination, sequential testing including chromosome karyotyping, chromosomal microarray analysis, and whole exome sequencing (WES) was performed on the proband. The sequence variants derived from WES were interpreted using a standard bioinformatics pipeline. Pathogenicity assessment of the candidate variant was done by in silico analysis. Familial cosegregation of the WES finding was assessed by PCR-based Sanger sequencing. RESULTS: A novel homozygous missense variant (c.1339G>A, p.Gly447Arg) in the ATP8A2 gene was identified and completely segregated with the phenotype in the family. In silico analysis and structural modeling revealed that the p.G447R substitution is deleterious and induced undesired effects on the protein stability and residue distribution in the ligand-binding pocket. The novel sequence variant occurred within an extremely conserved subregion of the ATP-binding domain. CONCLUSION: Our findings expand the spectrum of ATP8A2 mutations and confirm the reported genotype-phenotype correlation. These results could improve genetic counseling and prenatal diagnosis in families with clinical presentations related to CAMRQ4 syndrome.
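The residue number affected by the reported coding-sequence change can be checked with simple arithmetic, since each codon spans three consecutive coding positions. The sketch below assumes standard 1-based coding DNA numbering starting at the A of the start codon; the function names are hypothetical.

```python
def codon_number(cds_position):
    """1-based codon (residue) index for a 1-based coding-sequence position."""
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position):
    """Position within the codon (1, 2, or 3) of a 1-based coding position."""
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_position - 1) % 3 + 1


variant_position = 1339  # from c.1339G>A
residue = codon_number(variant_position)
base_in_codon = position_in_codon(variant_position)
print(f"c.{variant_position} falls in codon {residue}, base {base_in_codon}")
```

Position 1339 is the first base of codon 447. A G-to-A change at the first base of a glycine codon (GGN) gives AGN, which encodes arginine when the third base is A or G, so the arithmetic agrees with a glycine-to-arginine substitution at residue 447.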
Affiliation(s)
- Malihe Mohamadian
- Department of Molecular Medicine, Birjand University of Medical Sciences, Birjand, Iran
| | - Pegah Ghandil
- Diabetes Research Center, Health Research Institute, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.,Department of Medical Genetics, School of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| | - Mohsen Naseri
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran
| | - Afsane Bahrami
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran
| | - Ali Akbar Momen
- Department of Paediatric Neurology, Golestan Medical, Educational, and Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
| |
|
21
|
Structural and Molecular Interaction Studies on Familial Hypercholesterolemia Causative PCSK9 Functional Domain Mutations Reveals Binding Affinity Alterations with LDLR. Int J Pept Res Ther 2020. [DOI: 10.1007/s10989-020-10121-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
22
|
Enzyme dysfunction at atomic resolution: Disease-associated variants of human phosphoglucomutase-1. Biochimie 2020; 183:44-48. [PMID: 32898648 DOI: 10.1016/j.biochi.2020.08.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/08/2020] [Revised: 08/26/2020] [Accepted: 08/30/2020] [Indexed: 11/20/2022]
Abstract
Once experimentally prohibitive, structural studies of individual missense variants in proteins are increasingly feasible, and can provide a new level of insight into human genetic disease. One example of this is the recently identified inborn error of metabolism known as phosphoglucomutase-1 (PGM1) deficiency. Just as different variants of a protein can produce different patient phenotypes, they may also produce distinct biochemical phenotypes, affecting properties such as catalytic activity, protein stability, or 3D structure/dynamics. Experimental studies of missense variants, and particularly structural characterization, can reveal details of the underlying biochemical pathomechanisms of missense variants. Here, we review four examples of enzyme dysfunction observed in disease-related variants of PGM1. These studies are based on 11 crystal structures of wild-type (WT) and mutant enzymes, and multiple biochemical assays. Lessons learned include the value of comparing mutant and WT structures, synergy between structural and biochemical studies, and the rich understanding of molecular pathomechanism provided by experimental characterization relative to the use of predictive algorithms. We further note functional insights into the WT enzyme that can be gained from the study of pathogenic variants.
|
23
|
Mazurenko S. Predicting protein stability and solubility changes upon mutations: data perspective. ChemCatChem 2020. [DOI: 10.1002/cctc.202000933] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/18/2022]
Affiliation(s)
- Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Zerotinovo nam. 617/9, 601 77 Brno, Czech Republic
| |
|
24
|
Insight into the structural and functional analysis of the impact of missense mutation on cytochrome P450 oxidoreductase. J Mol Graph Model 2020; 100:107708. [PMID: 32805558 DOI: 10.1016/j.jmgm.2020.107708] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/12/2020] [Revised: 07/15/2020] [Accepted: 07/15/2020] [Indexed: 01/26/2023]
Abstract
Cytochrome P450 oxidoreductase (POR) is a steroidogenic and drug-metabolizing enzyme. It helps in the NADPH-dependent transfer of electrons to cytochrome P450 (CYP) enzymes for their biological activity. In this study, we employed integrative computational approaches to decipher the impact of the proline-to-leucine missense mutation at position 384 (P384L) in the connecting/hinge domain region, which is essential for the catalytic activity of POR. Analysis of protein stability using the DUET, MUpro, CUPSAT, I-Mutant2.0, iStable and SAAFEC servers predicted that the mutation might alter the structural stability of POR. The significant conformational changes induced by the mutation in the POR structure were analyzed by long-range molecular dynamics simulation. The results revealed that the missense mutation decreased the conformational stability of POR as compared to wild type (WT). The PCA-based free energy landscape (FEL) analysis described the mutant-specific conformational alterations and dominant motions essential for the biological activity of POR. The LIGPLOT interaction analysis showed the different binding architecture of FMN, FAD, and NADPH as a result of the mutation. The increased number of hydrogen bonds in the FEL conformation of WT indicated stronger binding of cofactors in the binding pocket as compared to the mutant. The porcupine plot analysis associated with cross-correlation analysis depicted the high-intensity flexible motion exhibited by functionally important FAD and NADPH binding domain regions. The computational findings unravel the impact of the mutation at the structural level, which could be helpful in understanding the molecular mechanism of drug metabolism.
|
25
|
Mahase V, Sobitan A, Johnson C, Cooper F, Xie Y, Li L, Teng S. Computational analysis of hereditary spastic paraplegia mutations in the kinesin motor domains of KIF1A and KIF5A. Journal of Theoretical & Computational Chemistry 2020. [DOI: 10.1142/s0219633620410035] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
Hereditary spastic paraplegias (HSPs) are a genetically heterogeneous collection of neurodegenerative disorders categorized by progressive lower-limb spasticity and frailty. The complex HSP forms are characterized by various neurological features including progressive spastic weakness, urinary sphincter dysfunction, extrapyramidal signs and intellectual disability (ID). The kinesin superfamily proteins (KIFs) are microtubule-dependent molecular motors involved in intracellular transport. Kinesins directionally transport membrane vesicles, protein complexes, and mRNAs along neurites, thus playing important roles in neuronal development and function. Recent genetic studies have identified kinesin mutations in patients with HSPs. In this study, we used computational approaches to investigate the 40 missense mutations associated with HSP and ID in KIF1A and KIF5A. We performed homology modeling to construct the structures of the kinesin–microtubule binding domain and the kinesin–tubulin complex. We applied structure-based energy calculation methods to determine the effects of missense mutations on protein stability and protein–protein interaction. The results revealed that most of the disease-causing mutations could change the folding free energy of the kinesin motor domain and the binding free energy of the kinesin–tubulin complex. We found that E253K, associated with ID in KIF1A, decreases the protein stability of kinesin motor domains. We showed that the HSP mutations located in the kinesin–tubulin complex interface, such as K253N and R280C in KIF5A, can destabilize the kinesin–tubulin complex. The computational analysis provides useful information for understanding the roles of kinesin mutations in the development of ID and HSPs.
Affiliation(s)
- Vidhyanand Mahase
  - Department of Biology, Howard University, Washington, D.C., 20059 USA
- Adebiyi Sobitan
  - Department of Biology, Howard University, Washington, D.C., 20059 USA
- Christina Johnson
  - Department of Biology, Howard University, Washington, D.C., 20059 USA
- Farion Cooper
  - Department of Biology, Howard University, Washington, D.C., 20059 USA
- Yixin Xie
  - Computational Science Program, University of Texas at El Paso, El Paso, Texas 79902, USA
- Lin Li
  - Computational Science Program, University of Texas at El Paso, El Paso, Texas 79902, USA
  - Department of Physics, University of Texas at El Paso, El Paso, Texas 79902, USA
- Shaolei Teng
  - Department of Biology, Howard University, Washington, D.C., 20059 USA
26
Zhang N, Lu H, Chen Y, Zhu Z, Yang Q, Wang S, Li M. PremPRI: Predicting the Effects of Missense Mutations on Protein-RNA Interactions. Int J Mol Sci 2020; 21:ijms21155560. [PMID: 32756481 PMCID: PMC7432928 DOI: 10.3390/ijms21155560] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/29/2020] [Revised: 07/28/2020] [Accepted: 07/30/2020] [Indexed: 12/23/2022] Open
Abstract
Protein–RNA interactions are crucial for many cellular processes, such as protein synthesis and regulation of gene expression. Missense mutations that alter protein–RNA interaction may contribute to the pathogenesis of many diseases. Here, we introduce a new computational method, PremPRI, which predicts the effects of single mutations occurring in RNA binding proteins on the protein–RNA interactions by calculating the binding affinity changes quantitatively. The multiple linear regression scoring function of PremPRI is composed of three sequence- and eight structure-based features, and is parameterized on 248 mutations from 50 protein–RNA complexes. Our model shows good agreement between calculated and experimental values of binding affinity changes, with a Pearson correlation coefficient of 0.72 and a corresponding root-mean-square error of 0.76 kcal·mol⁻¹, outperforming three other available methods. PremPRI can be used for finding functionally important variants, understanding the molecular mechanisms, and designing new protein–RNA interaction inhibitors.
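The two agreement statistics quoted in this abstract, the Pearson correlation coefficient and the root-mean-square error, can be computed as in the minimal sketch below. The ΔΔG values are made-up illustrative numbers, not PremPRI data, and the helper names `pearson` and `rmse` are hypothetical; they are not part of any PremPRI code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rmse(predicted, experimental):
    """Root-mean-square error, in the same units as the inputs."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical binding affinity changes in kcal/mol (illustration only).
experimental_ddg = [0.5, 1.2, -0.3, 2.0, 0.8]
predicted_ddg = [0.7, 1.0, 0.1, 1.6, 0.9]

print("PCC :", round(pearson(predicted_ddg, experimental_ddg), 2))
print("RMSE:", round(rmse(predicted_ddg, experimental_ddg), 2))
```

A high correlation alone does not imply a small error, which is presumably why such abstracts tend to report both numbers together.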
27
Sanavia T, Birolo G, Montanucci L, Turina P, Capriotti E, Fariselli P. Limitations and challenges in protein stability prediction upon genome variations: towards future applications in precision medicine. Comput Struct Biotechnol J 2020; 18:1968-1979. [PMID: 32774791 PMCID: PMC7397395 DOI: 10.1016/j.csbj.2020.07.011] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 03/15/2020] [Revised: 07/10/2020] [Accepted: 07/14/2020] [Indexed: 12/13/2022] Open
Abstract
Protein stability predictions are becoming essential in medicine to develop novel immunotherapeutic agents and for drug discovery. Despite the large number of computational approaches for predicting the protein stability upon mutation, there are still critical unsolved problems: 1) the limited number of thermodynamic measurements for proteins provided by current databases; 2) the large intrinsic variability of ΔΔG values due to different experimental conditions; 3) biases in the development of predictive methods caused by ignoring the anti-symmetry of ΔΔG values between mutant and native protein forms; 4) over-optimistic prediction performance, due to sequence similarity between proteins used in training and test datasets. Here, we review these issues, highlighting new challenges required to improve current tools and to achieve more reliable predictions. In addition, we provide a perspective of how these methods will be beneficial for designing novel precision medicine approaches for several genetic disorders caused by mutations, such as cancer and neurodegenerative diseases.
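The anti-symmetry mentioned in point 3 is the thermodynamic requirement that reversing a mutation flips the sign of its effect, ΔΔG(mutant → native) = −ΔΔG(native → mutant). The sketch below illustrates one simple way to quantify how far a predictor departs from that property and how a training set could be augmented with reverse mutations. The function names, the record layout, and the numbers are assumptions made for illustration; they are not taken from the review.

```python
def antisymmetry_bias(forward, reverse):
    """Mean of (forward + reverse) / 2 over paired predictions.

    For a perfectly anti-symmetric predictor each pair sums to zero,
    so the bias is 0. Inputs are paired lists of predicted values.
    """
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse must be paired one-to-one")
    halves = [(f + r) / 2 for f, r in zip(forward, reverse)]
    return sum(halves) / len(halves)


def add_reverse_mutations(records):
    """Append the reverse of every (native, mutant, ddg) record.

    The reverse record swaps the two forms and negates the value,
    which is what anti-symmetry prescribes.
    """
    reversed_records = [(mut, nat, -ddg) for nat, mut, ddg in records]
    return records + reversed_records


# Hypothetical predictions for three mutations and their reverses.
forward_pred = [1.4, -0.2, 0.9]
reverse_pred = [-0.6, 0.1, -0.3]
print("bias:", round(antisymmetry_bias(forward_pred, reverse_pred), 3))

# Hypothetical training records: (native residue, mutant residue, value).
print(add_reverse_mutations([("A", "G", 1.5)]))
```

A bias far from zero in such a check would be one symptom of the training-set imbalance the abstract describes.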
Affiliation(s)
- Tiziana Sanavia
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Giovanni Birolo
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Ludovica Montanucci
  - Department of Comparative Biomedicine and Food Science (BCA), University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy
- Paola Turina
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
- Emidio Capriotti
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
28
Mutations in FAM50A suggest that Armfield XLID syndrome is a spliceosomopathy. Nat Commun 2020; 11:3698. [PMID: 32703943 PMCID: PMC7378245 DOI: 10.1038/s41467-020-17452-6] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 05/30/2019] [Accepted: 06/17/2020] [Indexed: 02/06/2023] Open
Abstract
Intellectual disability (ID) is a heterogeneous clinical entity and includes an excess of males who harbor variants on the X-chromosome (XLID). We report rare FAM50A missense variants in the original Armfield XLID syndrome family localized in Xq28 and four additional unrelated males with overlapping features. Our fam50a knockout (KO) zebrafish model exhibits abnormal neurogenesis and craniofacial patterning, and in vivo complementation assays indicate that the patient-derived variants are hypomorphic. RNA sequencing analysis from fam50a KO zebrafish shows dysregulation of the transcriptome, with augmented spliceosome mRNAs and depletion of transcripts involved in neurodevelopment. Zebrafish RNA-seq datasets show a preponderance of 3′ alternative splicing events in fam50a KO, suggesting a role in the spliceosome C complex. These data are supported with transcriptomic signatures from cell lines derived from affected individuals and FAM50A protein-protein interaction data. In sum, Armfield XLID syndrome is a spliceosomopathy associated with aberrant mRNA processing during development. Armfield X-linked disability (XLID) disorder has previously been linked to a locus in Xq28. Here, the authors report rare missense variants in FAM50A at Xq28, show that FAM50A interacts with the spliceosome, and show that mis-splicing is enriched in knockout zebrafish, suggesting it is a spliceosomopathy.
29
Ganakammal SR, Koirala M, Wu B, Alexov E. In-silico analysis to identify the role of MEN1 missense mutations in breast cancer. Journal of Theoretical & Computational Chemistry 2020. [DOI: 10.1142/s0219633620410023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
Background: The multiple endocrine neoplasia type 1 (MEN1) gene located on chromosome 11q13 encodes the menin protein. Previously reported mutations were thought to result in loss of function of the menin protein and to be associated with multiple endocrine neoplasia 1 disorder. However, recently menin has also been characterized as an oncosuppressor protein and it was suggested that mutations in it are associated with various other tumors. Studies indicate that the menin protein stimulates the estrogen receptor (ER), which in turn increases the predisposition for inherited breast cancer. Methods: Here, we used our supervised in-house combinatory in-silico predictor method to investigate the impact of unclassified missense mutations in the MEN1 gene found in breast cancer tissue. We also examined the biophysical and biochemical properties to predict the effects of these missense variants on the menin protein stability and interactions. The results are compared with the effects of known pathogenic mutations in menin causing neoplasia. Results: Our analysis indicates that some of the variants found in breast cancer tissue show a similar pattern of destabilizing the menin protein and its interactions as the pathogenic variants associated with neoplasia. Taken together with the results of our in-silico consensus predictor, we classify missense mutations in the menin protein found in breast cancer tissue into pathogenic and benign, thus suggesting them as an indicator for early detection of elevated breast cancer risk.
Affiliation(s)
- Mahesh Koirala
  - Department of Physics, Clemson University, Clemson SC, USA
- Bohua Wu
  - Department of Physics, Clemson University, Clemson SC, USA
- Emil Alexov
  - Department of Healthcare Genetics, School of Nursing, Clemson University, Clemson SC, USA
  - Department of Physics, Clemson University, Clemson SC, USA
30
Lu Y, Villoutreix BO, Biswas I, Ding Q, Wang X, Rezaie AR. Thr90Ser Mutation in Antithrombin is Associated with Recurrent Thrombosis in a Heterozygous Carrier. Thromb Haemost 2020; 120:1045-1055. [PMID: 32422680 DOI: 10.1055/s-0040-1710590] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]
Abstract
Antithrombin (AT) is a serine protease inhibitor that regulates the activity of coagulation proteases of both intrinsic and extrinsic pathways. We identified an AT-deficient patient with a heterozygous Thr90Ser (T90S) mutation who experiences recurrent venous thrombosis. To understand the molecular basis of the clotting defect, we expressed AT-T90S in mammalian cells, purified it to homogeneity, and characterized its properties in established kinetics, binding, and coagulation assays. The possible effect of mutation on the AT structure was also evaluated by molecular modeling. Results demonstrate the inhibitory activity of AT-T90S toward thrombin and factor Xa has been impaired three- to fivefold in both the absence and presence of heparin. The affinity of heparin for AT-T90S has been decreased by four- to fivefold. Kinetic analysis revealed the stoichiometry of AT-T90S inhibition of both thrombin and factor Xa has been elevated by three- to fourfold in both the absence and presence of heparin, suggesting that the reactivity of coagulation proteases with AT-T90S has been elevated in the substrate pathway. The anticoagulant activity of AT-T90S has been significantly impaired as analyzed in the AT-deficient plasma supplemented with AT-T90S. The anti-inflammatory effect of AT-T90S was also decreased. Structural analysis predicts the shorter side-chain of Ser in AT-T90S has a destabilizing effect on the structure of AT and/or the AT-protease complex, possibly increasing the size of an internal cavity and altering a hydrogen-bonding network that modulates conformations of the allosterically linked heparin-binding site and reactive center loop of the serpin. This mutational effect increases the reactivity of AT-T90S with coagulation proteases in the substrate pathway.
Affiliation(s)
- Yeling Lu
  - Department of Laboratory Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Cardiovascular Biology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States
- Bruno O Villoutreix
  - Drugs and Molecules for Living Systems, Inserm, Institut Pasteur de Lille, University of Lille, Lille, France
- Indranil Biswas
  - Cardiovascular Biology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States
- Qiulan Ding
  - Department of Laboratory Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xuefeng Wang
  - Department of Laboratory Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Alireza R Rezaie
  - Cardiovascular Biology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States
  - Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma, United States
31
Banerjee A, Mitra P. Estimating the Effect of Single-Point Mutations on Protein Thermodynamic Stability and Analyzing the Mutation Landscape of the p53 Protein. J Chem Inf Model 2020; 60:3315-3323. [PMID: 32401507 DOI: 10.1021/acs.jcim.0c00256] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/09/2023]
Affiliation(s)
- Anupam Banerjee
  - Advanced Technology Development Centre, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Pralay Mitra
  - Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
32
Effects of Single and Double Mutants in Human Glucose-6-Phosphate Dehydrogenase Variants Present in the Mexican Population: Biochemical and Structural Analysis. Int J Mol Sci 2020; 21:ijms21082732. [PMID: 32326520 PMCID: PMC7215812 DOI: 10.3390/ijms21082732] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/24/2020] [Revised: 04/12/2020] [Accepted: 04/13/2020] [Indexed: 11/16/2022] Open
Abstract
Glucose-6-phosphate dehydrogenase (G6PD) deficiency is the most frequent human enzymopathy, affecting over 400 million people globally. Worldwide, 217 mutations have been reported at the genetic level, and only 19 have been found in Mexico. The objective of this work was to contribute to the knowledge of the function and structure of three single natural variants (G6PD A+, G6PD San Luis Potosi, and G6PD Guadalajara) and a double mutant (G6PD Mount Sinai), each localized in a different region of the three-dimensional (3D) structure. In the functional characterization of the mutants, we observed a decrease in specific activity, protein expression and purification, catalytic efficiency, and substrate affinity in comparison with wild-type (WT) G6PD. Moreover, the analysis of the effect of all mutations on structural stability showed that their presence increases denaturation and lability with temperature and makes the enzyme more sensitive to trypsin digestion and guanidine hydrochloride compared with WT G6PD. This could be explained by accelerated degradation of the variant enzymes due to reduced stability of the protein, as is shown in patients with G6PD deficiency.
33
Funk CR, Huey ES, May MM, Peng Y, Michonova E, Best RG, Schwartz CE, Blenda AV. Rare missense variant p.Ala505Ser in the ZAK protein observed in a patient with split-hand/foot malformation from a non-consanguineous pedigree. J Int Med Res 2020; 48:300060519879293. [PMID: 32266845 PMCID: PMC7144677 DOI: 10.1177/0300060519879293] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022] Open
Abstract
Objective: Split-hand/foot malformation (SHFM) is a rare, often debilitating, congenital limb malformation. A single nucleotide polymorphism within the leucine zipper containing kinase AZK (ZAK) gene was recently associated with SHFM in two consanguineous Pakistani pedigrees. We hypothesized that additional unrelated patients with the phenotype may carry a pathogenic mutation in ZAK. Methods: DNA samples were collected from 38 patients with SHFM and associated hearing loss for Sanger DNA sequencing and in silico analysis. Results: Two missense mutations within ZAK were detected in 11 patients, but only one missense variant, p.Ala505Ser, occurred with a presumed rare allele frequency. In silico modeling of the ZAK protein with the p.Ala505Ser substitution indicated a negative binding free energy change (mean ΔΔG = −0.9), representing destabilization of the ZAK tertiary structure. Additional laboratory analysis demonstrated a chromosome region 7q21.3-q22.1 deletion. This locus contains the SHFM-1 causative genes SHFM1, DLX5, and DLX6 (distal-less homeobox-5 and -6). Conclusions: We report a novel and rare missense variant, ZAK p.Ala505Ser, in one patient with SHFM from a non-consanguineous pedigree. This variant mildly destabilizes the ZAK tertiary structure. Although this mutation involved a deletion at the SHFM1 locus (7q21.3-q22.1), ZAK signaling destabilization may have contributed to the phenotype, which included hearing loss.
MESH Headings
- Alleles
- Amino Acid Substitution
- Animals
- Chromosome Deletion
- Chromosomes, Human, Pair 7
- DNA Mutational Analysis
- Disease Models, Animal
- Evolution, Molecular
- Genetic Association Studies
- Genetic Predisposition to Disease
- Humans
- Limb Deformities, Congenital/diagnosis
- Limb Deformities, Congenital/genetics
- Limb Deformities, Congenital/metabolism
- MAP Kinase Kinase Kinases/chemistry
- MAP Kinase Kinase Kinases/genetics
- MAP Kinase Kinase Kinases/metabolism
- Mice
- Mice, Knockout
- Models, Molecular
- Mutation, Missense
- Polymorphism, Single Nucleotide
- Protein Conformation
- Signal Transduction
- Structure-Activity Relationship
Affiliation(s)
- Christopher Ronald Funk
  - J.C. Self Research Institute, Greenwood Genetic Center, Greenwood, SC, United States
  - Emory University School of Medicine, Atlanta, GA, United States
- Elizabeth S. Huey
  - Department of Biomedical Sciences, University of South Carolina School of Medicine Greenville, Greenville, SC, United States
- Melanie M. May
  - J.C. Self Research Institute, Greenwood Genetic Center, Greenwood, SC, United States
- Yunhui Peng
  - Computational Biophysics and Bioinformatics Laboratory, Department of Physics and Astronomy, Clemson University, Clemson, SC, United States
- Ekaterina Michonova
  - Department of Chemistry and Physics, Erskine College, Due West, SC, United States
- Robert G. Best
  - Department of Biomedical Sciences, University of South Carolina School of Medicine Greenville, Greenville, SC, United States
- Charles E. Schwartz
  - J.C. Self Research Institute, Greenwood Genetic Center, Greenwood, SC, United States
- Anna V. Blenda
  - Department of Biomedical Sciences, University of South Carolina School of Medicine Greenville, Greenville, SC, United States
  - Anna V. Blenda, Department of Biomedical Sciences, University of South Carolina School of Medicine Greenville, 701 Grove Rd, Greenville, SC 29605, United States.
34
Gyulkhandanyan A, Rezaie AR, Roumenina L, Lagarde N, Fremeaux-Bacchi V, Miteva MA, Villoutreix BO. Analysis of protein missense alterations by combining sequence- and structure-based methods. Mol Genet Genomic Med 2020; 8:e1166. [PMID: 32096919 PMCID: PMC7196459 DOI: 10.1002/mgg3.1166] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/26/2019] [Revised: 01/20/2020] [Accepted: 01/27/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: Different types of in silico approaches can be used to predict the phenotypic consequence of missense variants. Such algorithms are often categorized as sequence based or structure based, when they necessitate 3D structural information. In addition, many other in silico tools, not dedicated to the analysis of variants, can be used to gain additional insights about the possible mechanisms at play. METHODS: Here we applied different computational approaches to a set of 20 known missense variants present on different proteins (CYP, complement factor B, antithrombin and blood coagulation factor VIII). The tools that were used include fast computational approaches and web servers such as PolyPhen-2, PopMusic, DUET, MaestroWeb, SAAFEC, Missense3D, VarSite, FlexPred, PredyFlexy, Clustal Omega, meta-PPISP, FTMap, ClusPro, pyDock, PPM, RING, Cytoscape, and ChannelsDB. RESULTS: We observe some conflicting results among the methods but, most of the time, the combination of several engines helped to clarify the potential impacts of the amino acid substitutions. CONCLUSION: Combining different computational approaches, including some that were not developed to investigate missense variants, helps to predict the possible impact of the amino acid substitutions. Yet, when the modified residues are involved in a salt-bridge, the tools tend to fail, even when the analysis is performed in 3D. Thus, interactive structural analysis with molecular graphics packages such as Chimera, PyMol, or others is still needed to clarify automatic prediction.
Affiliation(s)
- Aram Gyulkhandanyan
  - INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
  - Laboratory SABNP, University of Evry, INSERM U1204, Université Paris-Saclay, Evry, France
- Alireza R Rezaie
  - Cardiovascular Biology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
  - Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Lubka Roumenina
  - INSERM, UMR_S 1138, Centre de Recherche des Cordeliers, Paris, France
  - Sorbonne Universités, Paris, France
  - Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Nathalie Lagarde
  - INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
  - Laboratoire GBCM, EA7528, Conservatoire national des arts et métiers, Hesam Université, Paris, France
- Veronique Fremeaux-Bacchi
  - INSERM, UMR_S 1138, Centre de Recherche des Cordeliers, Paris, France
  - Sorbonne Universités, Paris, France
  - Université Paris Descartes, Sorbonne Paris Cité, Paris, France
  - Assistance Publique-Hôpitaux de Paris, Service d'Immunologie Biologique, Hôpital Européen Georges Pompidou, Paris, France
- Maria A Miteva
  - INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
  - Inserm U1268 MCTR, CNRS UMR 8038 CiTCoM, Faculté de Pharmacie de Paris, Univ. De Paris, Paris, France
- Bruno O Villoutreix
  - INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
  - INSERM, Institut Pasteur de Lille, U1177-Drugs and Molecules for Living Systems, Université de Lille, Lille, France
35
Medina-Ortiz D, Contreras S, Quiroz C, Olivera-Nappa Á. Development of Supervised Learning Predictive Models for Highly Non-linear Biological, Biomedical, and General Datasets. Front Mol Biosci 2020; 7:13. [PMID: 32118039 PMCID: PMC7031350 DOI: 10.3389/fmolb.2020.00013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/09/2019] [Accepted: 01/22/2020] [Indexed: 11/13/2022] Open
Abstract
In highly non-linear datasets, attributes or features do not allow readily finding visual patterns for identifying common underlying behaviors. Therefore, it is not possible to achieve classification or regression using linear or mildly non-linear hyperspace partition functions. Hence, supervised learning models based on the application of most existing algorithms are limited, and their performance metrics are low. Linear transformations of variables, such as principal components analysis, cannot avoid the problem, and even models based on artificial neural networks and deep learning are unable to improve the metrics. Sometimes, even when features allow classification or regression in reported cases, performance metrics of supervised learning algorithms remain unsatisfyingly low. This problem is recurrent in many areas of study, such as the clinical, biotechnological, and protein engineering areas, where many of the attributes are correlated in an unknown and very non-linear fashion or are categorical and difficult to relate to a target response variable. In such areas, being able to create predictive models would dramatically impact the quality of their outcomes, generating an immediate added value for both the scientific community and the general public. In this manuscript, we present RV-Clustering, a library of unsupervised learning algorithms, and a new methodology designed to find optimum partitions within highly non-linear datasets that allow deconvoluting variables and notably improving performance metrics in supervised learning classification or regression models. The partitions obtained are statistically cross-validated, ensuring correct representativity and no over-fitting. We have successfully tested RV-Clustering on several highly non-linear datasets with different origins. The approach herein proposed has generated classification and regression models with high-performance metrics, which further supports its ability to generate predictive models for highly non-linear datasets. Advantageously, the method does not require significant human input, which guarantees a higher usability in the biological, biomedical, and protein engineering community with no specific knowledge in the machine learning area.
Affiliation(s)
- David Medina-Ortiz
  - Departamento de Ingeniería Química, Biotecnología y Materiales, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile
  - Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
- Sebastián Contreras
  - Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
- Cristofer Quiroz
  - Facultad de Ingeniería, Universidad Autónoma de Chile, Talca, Chile
- Álvaro Olivera-Nappa
  - Departamento de Ingeniería Química, Biotecnología y Materiales, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile
  - Centre for Biotechnology and Bioengineering, Universidad de Chile, Santiago, Chile
36
Goswami AM. Computational analyses prioritize and reveal the deleterious nsSNPs in human angiotensinogen gene. Comput Biol Chem 2020; 84:107199. [PMID: 31931433 DOI: 10.1016/j.compbiolchem.2019.107199] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/01/2019] [Revised: 12/26/2019] [Accepted: 12/30/2019] [Indexed: 12/27/2022]
Abstract
Angiotensinogen (AGT) is a key component of the renin-angiotensin-aldosterone system (RAAS), which plays a central role in blood pressure homeostasis. The association of AGT polymorphisms has been investigated in different ethnic populations in a variety of cardiovascular and non-cardiovascular conditions. In this study, 354 non-synonymous SNPs (nsSNPs) of AGT were evaluated to predict damaging and structurally important variants. The majority of the deleterious nsSNPs occurred in the evolutionarily conserved regions. Several of these nsSNPs were found to affect post-translational modifications such as methylation, glycosylation, phosphorylation, and ubiquitination. Structural evaluations predicted 19 variants as destabilizing, and some of them were also predicted to destabilize the renin-AGT interaction. Therefore, the present computational investigation predicted pathogenic and functionally important variants of the human AGT gene. The study has also shown, using microarray gene expression profiles, that AGT deregulation is associated with survival outcome in patients with gastric and breast cancer. Furthermore, the computationally screened nsSNPs can be analyzed in population-based genotyping studies and may help future drug development in the area of AGT pharmacogenomics.
Affiliation(s)
- Achintya Mohan Goswami
  - Department of Physiology, Krishnagar Govt. College, Krishnagar, Nadia, West Bengal, 741101, India.
37
Li C, Jia Z, Chakravorty A, Pahari S, Peng Y, Basu S, Koirala M, Panday SK, Petukh M, Li L, Alexov E. DelPhi Suite: New Developments and Review of Functionalities. J Comput Chem 2019; 40:2502-2508. [PMID: 31237360 PMCID: PMC6771749 DOI: 10.1002/jcc.26006] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 04/03/2019] [Revised: 05/07/2019] [Accepted: 06/09/2019] [Indexed: 12/25/2022]
Abstract
Electrostatic potential, energies, and forces affect virtually any process in molecular biology; however, computing these quantities is a difficult task due to irregularly shaped macromolecules and the presence of water. Here, we report a new edition of the popular software package DelPhi and describe its functionalities. The new DelPhi is a C++ object-oriented package supporting various levels of multiprocessing and memory distribution. It is demonstrated that multiprocessing results in significant improvement of computational time. Furthermore, for computations requiring large grid size (large macromolecular assemblages), the approach of memory distribution is shown to reduce the RAM requirement, thus permitting large-scale modeling to be done on Linux clusters with moderate architecture. The new release comes with new features, whose functionalities and applications are described as well. © 2019 The Authors. Journal of Computational Chemistry published by Wiley Periodicals, Inc.
Affiliation(s)
- Chuan Li
  - Department of Mathematics, West Chester University of Pennsylvania, West Chester, Pennsylvania 19383
- Zhe Jia
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Arghya Chakravorty
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Swagata Pahari
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Yunhui Peng
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Sankar Basu
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Mahesh Koirala
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Marharyta Petukh
  - Department of Biology, Presbyterian College, Clinton, South Carolina 29325
- Lin Li
  - Department of Physics, University of Texas at El Paso, Texas 79968
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
38
Koirala M, Alexov E. Computational chemistry methods to investigate the effects caused by DNA variants linked with disease. Journal of Theoretical & Computational Chemistry 2019. [DOI: 10.1142/s0219633619300015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/06/2023]
Abstract
Computational chemistry offers a variety of tools to study properties of biological macromolecules. These tools vary in level of detail, from quantum mechanical treatment to numerous macroscopic approaches. Here, we provide a review of computational chemistry algorithms and tools for modeling the effects of genetic variations and their association with diseases. Particular emphasis is placed on modeling the effects of missense mutations on stability, conformational dynamics, binding, hydrogen bond network, salt bridges, and pH-dependent properties of the corresponding macromolecules. It is outlined that the disease may be caused by alteration of one or several of the above-mentioned biophysical characteristics, and a successful prediction of pathogenicity requires detailed analysis of how the alterations affect the function of the macromolecules involved. The review also provides a short list of the most commonly used algorithms to predict the molecular effects of mutations.
Affiliation(s)
- Mahesh Koirala
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29630, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29630, USA
|
39
|
Novel Genetic Markers for Early Detection of Elevated Breast Cancer Risk in Women. Int J Mol Sci 2019; 20:ijms20194828. [PMID: 31569399 PMCID: PMC6801521 DOI: 10.3390/ijms20194828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2019] [Revised: 09/20/2019] [Accepted: 09/25/2019] [Indexed: 12/25/2022] Open
Abstract
This study suggests that two newly discovered variants in the MSH2 gene, which codes for a DNA mismatch repair (MMR) protein, can be associated with a high risk of breast cancer. While variants in the MSH2 gene are known to be linked with an elevated cancer risk, the MSH2 gene is not a part of the standard kit for testing patients for elevated breast cancer risk. Here we used the results of genetic testing of women who were diagnosed with breast cancer but did not have variants in the BRCA1 and BRCA2 genes. Instead, the test identified four variants of unknown significance (VUS) in the MSH2 gene. We carried out an in silico analysis to develop a classifier that can distinguish pathogenic from benign mutations in the MSH2 gene taken from ClinVar. The classifier was then used to classify the VUS in the MSH2 gene, and two of them, p.Ala272Val and p.Met592Val, were predicted to be pathogenic mutations. These two mutations were found in women with breast cancer who did not have mutations in the BRCA1 and BRCA2 genes, and thus they are suggested as new biomarkers for the early detection of elevated breast cancer risk. However, before this is done, an in vitro validation of mutation pathogenicity is needed and, moreover, the presence of these mutations should be demonstrated in a larger number of patients or in families with a history of breast cancer.
|
40
|
Tajielyato N, Alexov E. Modeling pKas of unfolded proteins to probe structural models of unfolded state. Journal of Theoretical & Computational Chemistry 2019. [DOI: 10.1142/s0219633619500202] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Modeling unfolded states of proteins has implications for protein folding and stability. Since proteins in the unfolded state adopt multiple conformations, any experimentally measured quantity is ensemble averaged, and therefore the computed quantity should be ensemble averaged as well. Here, we investigate the possibility of modeling an unfolded state ensemble with the coil model approach, using an algorithm such as “flexible-meccano” [Ozenne V et al., Flexible-meccano: A tool for the generation of explicit ensemble descriptions of intrinsically disordered proteins and their associated experimental observables, Bioinformatics 28:1463–1470, 2012], which was developed to generate structures for intrinsically disordered proteins. We probe this possibility by using the generated structures to calculate pKas of titratable groups and comparing them with experimental data. It is demonstrated that even with a small number of representative structures of the unfolded state, the average calculated pKas are in very good agreement with experimentally measured pKas. Predictions are also made for titratable groups for which no experimental data are available. This suggests that the coil model approach is suitable for generating 3D structures of the unfolded state of proteins. To make the approach suitable for large-scale modeling, which requires a limited number of structures, we ranked the structures according to their solvent accessible surface area (SASA). It is shown that in the majority of cases, the top structures with the smallest SASA are enough to represent the unfolded state.
Affiliation(s)
- Nayere Tajielyato
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29630, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29630, USA
|
41
|
Kumar V, Pandey P, Idrees D, Prakash A, Lynn A. Delineating the effect of mutations on the conformational dynamics of N-terminal domain of TDP-43. Biophys Chem 2019; 250:106174. [DOI: 10.1016/j.bpc.2019.106174] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/28/2018] [Revised: 03/06/2019] [Accepted: 04/21/2019] [Indexed: 12/12/2022]
|
42
|
Spellicy CJ, Peng Y, Olewiler L, Cathey SS, Rogers RC, Bartholomew D, Johnson J, Alexov E, Lee JA, Friez MJ, Jones JR. Three additional patients with EED-associated overgrowth: potential mutation hotspots identified? J Hum Genet 2019; 64:561-572. [PMID: 30858506 DOI: 10.1038/s10038-019-0585-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/20/2018] [Revised: 02/12/2019] [Accepted: 02/13/2019] [Indexed: 12/25/2022]
Abstract
Variants have been identified in the embryonic ectoderm development (EED) gene in seven patients with syndromic overgrowth similar to that observed in Weaver syndrome. Here, we present three additional patients with missense variants in the EED gene. All the missense variants reported to date (including the three presented here) have localized to one of seven WD40 domains of the EED protein, which are necessary for interaction with enhancer of zeste 2 polycomb repressive complex 2 subunit (EZH2). In addition, among the seven patients reported in the literature and the three new patients presented here, all of the reported pathogenic variants except one occurred at one of four amino acid residues in the EED protein. The recurrence of pathogenic variation at these loci suggests that these residues are functionally important (mutation hotspots). In silico modeling and calculations of the free energy changes resulting from these variants suggested that they not only destabilize the EED protein structure but also adversely affect interactions between EED, EZH2, and/or H3K27me3. These cases help demonstrate the mechanism(s) by which apparently deleterious variants in the EED gene might cause overgrowth and lend further support that amino acid residues in the WD40 domain region may be mutation hotspots.
Affiliation(s)
- Yunhui Peng
  - Computational Biophysics and Bioinformatics laboratory, Clemson University, Clemson, SC, 29634, USA
- Leah Olewiler
  - Genetics, Nationwide Children's Hospital, Columbus, OH, 43205, USA
- Sara S Cathey
  - Greenwood Genetic Center, Greenwood, SC, 29646, USA
  - Clinical Genetics, Greenwood Genetic Center, Greenwood, SC, 29646, USA
- R Curtis Rogers
  - Greenwood Genetic Center, Greenwood, SC, 29646, USA
  - Clinical Genetics, Greenwood Genetic Center, Greenwood, SC, 29646, USA
- Emil Alexov
  - Computational Biophysics and Bioinformatics laboratory, Clemson University, Clemson, SC, 29634, USA
- Julie R Jones
  - Greenwood Genetic Center, Greenwood, SC, 29646, USA
|
43
|
Chakravorty A, Gallicchio E, Alexov E. A grid-based algorithm in conjunction with a gaussian-based model of atoms for describing molecular geometry. J Comput Chem 2019; 40:1290-1304. [PMID: 30698861 PMCID: PMC6506848 DOI: 10.1002/jcc.25786] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/02/2018] [Revised: 12/12/2018] [Accepted: 01/06/2019] [Indexed: 11/06/2022]
Abstract
A novel grid-based method is presented which, in conjunction with a smooth Gaussian-based model of atoms, is used to compute molecular volume (MV) and molecular surface area (MSA). The MV and MSA are essential for computing the nonpolar component of free energies. The objective of our grid-based approach is to identify solute atom pairs that share overlapping volumes in space. Once these pairs are identified, the information is used to construct a rooted tree using a depth-first method to yield the final volume and surface area from the formulations of the Gaussian model described by Grant and Pickup (J. Phys Chem, 1995, 99, 3503). The method is designed to work seamlessly with the grid-based finite-difference method implemented in DelPhi, a popular open-source package for solving the Poisson-Boltzmann equation (PBE). We demonstrate the time efficiency of the method and validate its performance with respect to grid resolution, positioning of the solute within the grid map, and accuracy in identifying overlapping atom pairs. We also explore and discuss different aspects of the Gaussian model, with key emphasis on its physical meaningfulness. This development and its future release with the DelPhi package are intended to provide a physically meaningful, fast, robust and comprehensive tool for MM/PBSA based free energy calculations. © 2019 Wiley Periodicals, Inc.
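The pair-identification step described in this abstract can be illustrated with a generic uniform-grid (cell-list) search. The sketch below is not the authors' algorithm or DelPhi code: the function name `overlapping_pairs`, the hard-sphere overlap criterion (distance smaller than the sum of the radii), and the choice of cell size are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import product


def overlapping_pairs(centers, radii):
    """Return sorted index pairs (i, j), i < j, whose spheres overlap.

    Hypothetical helper: atoms are binned into cubic cells whose edge is
    the largest possible contact distance, so overlapping atoms always
    sit in the same cell or in directly neighbouring cells.
    """
    cell = 2.0 * max(radii)  # edge length >= r_i + r_j for every pair
    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(centers):
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        grid[key].append(i)

    pairs = set()
    for (cx, cy, cz), members in grid.items():
        # Visit the cell itself and its 26 neighbours.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and math.dist(centers[i], centers[j]) < radii[i] + radii[j]:
                        pairs.add((i, j))
    return sorted(pairs)


# Two touching atoms and one distant atom (made-up coordinates).
print(overlapping_pairs([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (10.0, 0.0, 0.0)],
                        [1.0, 1.0, 1.0]))  # [(0, 1)]
```

Binning keeps the search close to linear in the number of atoms for roughly uniform densities, which is the usual motivation for grid-based neighbour searches over an all-pairs loop.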
Affiliation(s)
- Arghya Chakravorty
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
|
44
|
Peng Y, Alexov E, Basu S. Structural Perspective on Revealing and Altering Molecular Functions of Genetic Variants Linked with Diseases. Int J Mol Sci 2019; 20:ijms20030548. [PMID: 30696058 PMCID: PMC6386852 DOI: 10.3390/ijms20030548] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/22/2018] [Revised: 01/25/2019] [Accepted: 01/26/2019] [Indexed: 12/25/2022] Open
Abstract
Structural information of biological macromolecules is crucial and necessary to deliver predictions about the effects of mutations, whether polymorphic or deleterious (i.e., disease causing), wherein thermodynamic parameters, namely folding and binding free energies, potentially serve as effective biomarkers. It may be emphasized that the effect of a mutation depends on various factors, including the type of protein (globular, membrane or intrinsically disordered protein) and the structural context in which it occurs. Such information may positively aid drug design. Furthermore, due to the intrinsic plasticity of proteins, even mutations involving radical change of the structural and physico-chemical properties of the amino acids (native vs. mutant) can still have minimal effects on protein thermodynamics. However, if a mutation causes significant perturbation of either folding or binding free energies, it is quite likely to be deleterious. Mitigating such effects is a promising alternative to the traditional approaches of designing inhibitors. This can be done by structure-based in silico screening of small molecules for which binding to the dysfunctional protein restores its wild type thermodynamics. In this review we emphasize the effects of mutations on two important biophysical properties, stability and binding affinity, and how structures can be used for structure-based drug design to mitigate the effects of disease-causing variants on the above biophysical properties.
Affiliation(s)
- Yunhui Peng
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Sankar Basu
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
|
45
|
Qi R, Luo R. Robustness and Efficiency of Poisson-Boltzmann Modeling on Graphics Processing Units. J Chem Inf Model 2018; 59:409-420. [PMID: 30550277 DOI: 10.1021/acs.jcim.8b00761] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/15/2022]
Abstract
Poisson-Boltzmann equation (PBE) based continuum electrostatics models have been widely used in modeling electrostatic interactions in biochemical processes, particularly in estimating protein-ligand binding affinities. Fast convergence of PBE solvers is crucial in binding affinity computations because numerous snapshots need to be processed. Efforts have been reported to develop PBE solvers on graphics processing units (GPUs) for efficient modeling of biomolecules, though only the relatively simple successive over-relaxation and conjugate gradient methods were implemented. However, neither the convergence nor the scaling properties of these two methods are optimal for large biomolecules. On the other hand, geometric multigrid (MG) has been shown to be an optimal solver on CPUs, though no MG implementation has been reported for biomolecular applications on GPUs. This is not a surprise, as MG is a more complex method and depends on simpler but limited iterative methods such as Gauss-Seidel in its core relaxation procedure. The robustness and efficiency of MG on GPUs are also unclear. Here we present an implementation and a thorough analysis of MG on GPUs. Our analysis shows that robustness is a more pronounced issue than efficiency for both MG and the other tested solvers when single precision is used for complex biomolecules. We further show how to balance robustness and efficiency by utilizing MG's overall efficiency and conjugate gradient's robustness, pointing to a hybrid GPU solver with a good balance of efficiency and accuracy. The new PBE solver will significantly improve the computational throughput for a range of biomolecular applications on GPU platforms.
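As a reminder of what the "relatively simple" relaxation methods named in this abstract look like, the following is a minimal successive over-relaxation (SOR) sketch for a one-dimensional Poisson problem. It is a toy CPU example under stated assumptions (uniform grid, zero Dirichlet boundaries, no dielectric or ionic terms) and is unrelated to the authors' GPU code; the function name `sor_poisson_1d` and its parameters are hypothetical. Setting `omega=1.0` reduces the update to plain Gauss-Seidel.

```python
def sor_poisson_1d(f, h, omega=1.5, tol=1e-10, max_iter=10_000):
    """Solve -u'' = f on a uniform grid with u = 0 at both ends by SOR.

    f holds the right-hand side at the interior points and h is the grid
    spacing. Returns (u, iterations), where u includes both boundary points.
    """
    n = len(f)
    u = [0.0] * (n + 2)
    for iteration in range(1, max_iter + 1):
        largest_change = 0.0
        for i in range(1, n + 1):
            # Gauss-Seidel value from the three-point finite-difference stencil.
            gs = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i - 1])
            # Over-relax: move past the Gauss-Seidel value by a factor omega.
            new = (1.0 - omega) * u[i] + omega * gs
            largest_change = max(largest_change, abs(new - u[i]))
            u[i] = new
        if largest_change < tol:
            return u, iteration
    return u, max_iter


# -u'' = 2 on [0, 1] has the exact solution u(x) = x (1 - x).
u, iterations = sor_poisson_1d([2.0] * 9, h=0.1)
print(u[5], iterations)
```

Both variants update one grid point at a time from its neighbours, so low-frequency error decays slowly on fine grids; that limitation is what multigrid addresses by relaxing on a hierarchy of coarser grids.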
|
46
|
PremPDI estimates and interprets the effects of missense mutations on protein-DNA interactions. PLoS Comput Biol 2018; 14:e1006615. [PMID: 30533007 PMCID: PMC6303081 DOI: 10.1371/journal.pcbi.1006615] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 09/11/2018] [Revised: 12/21/2018] [Accepted: 11/01/2018] [Indexed: 01/01/2023] Open
Abstract
Protein-DNA interactions play important roles in the regulation of many vital cellular processes, including transcription, translation, DNA replication and recombination. Sequence variants in DNA-binding proteins that alter protein-DNA interactions may cause significant perturbation or complete loss of function, potentially leading to disease. Developing a mechanistic understanding of the impact of variants on protein-DNA interactions is therefore a persistent need. To address this need we introduce a new computational method, PremPDI, that predicts the effect of a single missense mutation in the protein on the protein-DNA interaction and calculates the quantitative change in binding affinity. The PremPDI method is based on molecular mechanics force fields and fast side-chain optimization algorithms, with parameters optimized on an experimental set of 219 mutations from 49 protein-DNA complexes. PremPDI yields very good agreement between predicted and experimental values, with a Pearson correlation coefficient of 0.71 and a root-mean-square error of 0.86 kcal mol⁻¹. The PremPDI server can map mutations onto a structural protein-DNA complex, calculate the associated changes in binding affinity, determine the deleterious effect of a mutation, and produce a mutant structural model for download. PremPDI can be applied to many tasks, such as identifying potentially damaging mutations in cancer and other diseases. PremPDI is available at http://lilab.jysw.suda.edu.cn/research/PremPDI/.
Developing methods for accurate prediction of the effects of amino acid substitutions on protein-DNA interactions is important for a wide range of biomedical applications, such as understanding the disease-causing mechanisms of missense mutations and guiding protein engineering. Very few methods have been developed for predicting the effects of mutations on protein-DNA binding affinity. Here we report a new computational method, PRedicts the Effects of single Mutations on Protein-DNA Interactions (PremPDI). The core of the PremPDI method is based on molecular mechanics force fields and fast side-chain optimization algorithms that make the PremPDI algorithm efficient and fast enough to handle a large number of cases. The performance of the PremPDI protocol was tested against experimentally determined binding free energy changes of 219 mutations from 49 protein-DNA complexes and yields a very good correlation coefficient. The PremPDI webserver is available to the community at http://lilab.jysw.suda.edu.cn/research/PremPDI/.
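The two agreement metrics quoted in this abstract, the Pearson correlation coefficient and the root-mean-square error, can be computed as in the sketch below. The function names and the sample values are illustrative assumptions; the numbers are made up and are not data from the PremPDI study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, experimental):
    """Root-mean-square error, in the same units as the inputs."""
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))


# Made-up binding free energy changes in kcal/mol, for illustration only.
experimental = [0.4, 1.2, -0.3, 2.1, 0.9]
predicted = [0.6, 0.9, 0.1, 1.7, 1.3]
print(pearson_r(predicted, experimental), rmse(predicted, experimental))
```

The two numbers answer different questions: the correlation measures whether predictions rank and scale with experiment, while the RMSE measures the typical absolute deviation, so a method can score well on one and poorly on the other.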
|
47
|
Valdebenito-Maturana B, Reyes-Suarez JA, Henriquez J, Holmes DS, Quatrini R, Pohl E, Arenas-Salinas M. Mutantelec: An In Silico mutation simulation platform for comparative electrostatic potential profiling of proteins. J Comput Chem 2018; 38:467-474. [PMID: 28114729 DOI: 10.1002/jcc.24712] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/06/2016] [Revised: 12/06/2016] [Accepted: 12/07/2016] [Indexed: 11/07/2022]
Abstract
The electrostatic potential plays a key role in many biological processes, such as determining the affinity of a ligand for a given protein target, and it is responsible for the catalytic activity of many enzymes. Understanding the effect that amino acid mutations have on the electrostatic potential of a protein allows a thorough understanding of which residues are the most important in that protein. MutantElec is a user-friendly web application for in silico site-directed mutagenesis of proteins and for comparing the electrostatic potential of the wild type protein and the mutant(s), based on the three-dimensional structure of the protein. The effect of the mutation is evaluated using an approach different from the traditional surface map. MutantElec provides a graphical display of the results that allows the visualization of changes occurring close to the mutation and thus uncovers the local and global impact of a specific change. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Braulio Valdebenito-Maturana
  - Centro de Bioinformática y Simulación Molecular, Facultad de Ingeniería, Universidad de Talca, Talca, 346 5548, Chile
- Jose Antonio Reyes-Suarez
  - Centro de Bioinformática y Simulación Molecular, Facultad de Ingeniería, Universidad de Talca, Talca, 346 5548, Chile
- Jaime Henriquez
  - Centro de Bioinformática y Simulación Molecular, Facultad de Ingeniería, Universidad de Talca, Talca, 346 5548, Chile
- David S Holmes
  - Fundación Ciencia & Vida, Santiago, 778 0272, Chile; Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
- Ehmke Pohl
  - Department of Chemistry, Durham University, Durham, DH1 3LE, United Kingdom; Department of Biosciences, Durham University, Durham, DH1 3LE, United Kingdom; Biophysical Sciences Institute, Durham University, Durham, DH1 3LE, United Kingdom
- Mauricio Arenas-Salinas
  - Centro de Bioinformática y Simulación Molecular, Facultad de Ingeniería, Universidad de Talca, Talca, 346 5548, Chile
|
48
|
Zhou Y, Fujikura K, Mkrtchian S, Lauschke VM. Computational Methods for the Pharmacogenetic Interpretation of Next Generation Sequencing Data. Front Pharmacol 2018; 9:1437. [PMID: 30564131 PMCID: PMC6288784 DOI: 10.3389/fphar.2018.01437] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 08/03/2018] [Accepted: 11/20/2018] [Indexed: 12/21/2022] Open
Abstract
Up to half of all patients do not respond to pharmacological treatment as intended. A substantial fraction of these inter-individual differences is due to heritable factors, and a growing number of associations between genetic variations and drug response phenotypes have been identified. Importantly, the rapid progress in Next Generation Sequencing technologies in recent years has unveiled the true complexity of the genetic landscape in pharmacogenes, with tens of thousands of rare genetic variants. As each individual was found to harbor numerous such rare variants, they are anticipated to be important contributors to the genetically encoded inter-individual variability in drug effects. The fundamental challenge, however, is their functional interpretation, because the sheer scale of the problem renders systematic experimental characterization of these variants currently unfeasible. Here, we review concepts and important progress in the development of computational prediction methods that allow evaluation of the effect of amino acid sequence alterations in drug metabolizing enzymes and transporters. In addition, we discuss recent advances in the interpretation of functional effects of non-coding variants, such as variations in splice sites, regulatory regions and miRNA binding sites. We anticipate that these methodologies will provide a useful toolkit to facilitate the integration of the vast extent of rare genetic variability into drug response predictions in a precision medicine framework.
Affiliation(s)
- Yitian Zhou
  - Section of Pharmacogenetics, Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
- Kohei Fujikura
  - Department of Diagnostic Pathology, Kobe University Graduate School of Medicine, Kobe, Japan
- Souren Mkrtchian
  - Section of Pharmacogenetics, Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
- Volker M. Lauschke
  - Section of Pharmacogenetics, Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
|
49
|
Srinivasan E, Rajasekaran R. Quantum chemical and molecular mechanics studies on the assessment of interactions between resveratrol and mutant SOD1 (G93A) protein. J Comput Aided Mol Des 2018; 32:1347-1361. [PMID: 30368622 DOI: 10.1007/s10822-018-0175-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/19/2018] [Accepted: 10/24/2018] [Indexed: 12/29/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease that has been associated with mutations in the metalloenzyme superoxide dismutase (SOD1) that cause structural destabilization and aggregation of the protein. However, the mechanism of action and a cure for the disease remain obscure. Herein, we first studied the conformational preferences of SOD1 protein structures upon substitution of Ala at Gly93 in comparison with the wild type. Our results corroborated previous experimental studies on the aggregation and destabilizing activity of the mutant SOD1 protein G93A. From a therapeutic point of view, we computationally analyzed the influence of resveratrol, a natural polyphenol widely found in red wine, on mutant SOD1 relative to the wild type, using molecular docking studies. Further, FMO calculations were performed using GAMESS to study the pairwise residue interactions in the wild type and mutant complex systems. Resveratrol showed greater interaction with the mutant than with the wild type. Subsequently, we evaluated the conformational preferences of the wild type and mutant complex systems and found that the conformational stability previously lost in the mutant was regained upon binding with resveratrol. A similar trend was found in the 2-D free energy landscapes of both the wild type and mutant systems. Hence, the combined biophysical and quantum chemical analyses in our study supported the results of previous experimental studies, indicating an action of resveratrol on mutant SOD1 and paving the way for the design of highly potent inhibitors against fALS.
Affiliation(s)
- E Srinivasan
  - Bioinformatics Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to be University), Vellore, Tamil Nadu, 632014, India
- R Rajasekaran
  - Bioinformatics Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to be University), Vellore, Tamil Nadu, 632014, India
|
50
|
Peng Y, Sun L, Jia Z, Li L, Alexov E. Predicting protein-DNA binding free energy change upon missense mutations using modified MM/PBSA approach: SAMPDI webserver. Bioinformatics 2018; 34:779-786. [PMID: 29091991 DOI: 10.1093/bioinformatics/btx698] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 07/14/2017] [Accepted: 10/27/2017] [Indexed: 12/28/2022] Open
Abstract
Motivation: Protein-DNA interactions are essential for regulating many cellular processes, such as transcription, replication, recombination and translation. Amino acid mutations occurring in DNA-binding proteins have profound effects on protein-DNA binding and are linked with many diseases. Hence, accurate and fast predictions of the effects of mutations on protein-DNA binding affinity are essential for understanding disease-causing mechanisms and guiding plausible treatments. Results: Here we report a new method, Single Amino acid Mutation binding free energy change of Protein-DNA Interaction (SAMPDI). The method utilizes a modified Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) approach along with an additional set of knowledge-based terms derived from investigations of the physicochemical properties of protein-DNA complexes. The method is benchmarked against experimentally determined binding free energy changes caused by 105 mutations in 13 proteins (compiled from the ProNIT database and from recent references) and achieves a correlation coefficient of 0.72. Availability and implementation: http://compbio.clemson.edu/SAMPDI. Contact: ealexov@clemson.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yunhui Peng
  - Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Lexuan Sun
  - Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Zhe Jia
  - Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Lin Li
  - Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
|