1
Xu W, Li A, Zhao Y, Peng Y. Decoding the effects of mutation on protein interactions using machine learning. BIOPHYSICS REVIEWS 2025; 6:011307. [PMID: 40013003 PMCID: PMC11857871 DOI: 10.1063/5.0249920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Accepted: 01/14/2025] [Indexed: 02/28/2025]
Abstract
Accurately predicting mutation-induced binding free energy changes (ΔΔGs) in protein interactions is crucial for understanding how genetic variations affect interactions between proteins and other biomolecules, such as proteins, DNA/RNA, and ligands, which are vital for regulating numerous biological processes. Developing computational approaches with high accuracy and efficiency is critical for elucidating the mechanisms underlying various diseases, identifying potential biomarkers for early diagnosis, and developing targeted therapies. This review provides a comprehensive overview of recent advancements in predicting the impact of mutations on protein interactions across different interaction types, which are central to understanding biological processes and disease mechanisms, including cancer. We summarize recent progress in predictive approaches, including physicochemical-based, machine learning, and deep learning methods, evaluating the strengths and limitations of each. Additionally, we discuss the challenges related to the limitations of mutational data, including biases, data quality, and dataset size, and explore the difficulties in developing accurate prediction tools for mutation-induced effects on protein interactions. Finally, we discuss future directions for advancing these computational tools, highlighting the capabilities of advancing technologies, such as artificial intelligence, to drive significant improvements in mutational effects prediction.
Affiliation(s)
- Wang Xu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Anbang Li
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunhui Peng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
2
Gromiha MM, Pandey M, Kulandaisamy A, Sharma D, Ridha F. Progress on the development of prediction tools for detecting disease causing mutations in proteins. Comput Biol Med 2025; 185:109510. [PMID: 39637461 DOI: 10.1016/j.compbiomed.2024.109510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2024] [Revised: 11/27/2024] [Accepted: 11/29/2024] [Indexed: 12/07/2024]
Abstract
Proteins are involved in a variety of functions in living organisms. The mutation of amino acid residues in a protein alters its structure, stability, binding, and function, with some mutations leading to diseases. Understanding the influence of mutations on protein structure and function helps to gain deep insights into the molecular mechanisms of diseases and to devise therapeutic strategies. Hence, several generic and disease-specific methods have been proposed to reveal the pathogenic effects of mutations. In this review, we focus on the development of prediction methods for identifying disease-causing mutations in proteins. We briefly outline the existing databases for disease-causing mutations, followed by a discussion of the sequence- and structure-based features used for prediction. Further, we discuss computational tools based on machine learning, deep learning and large language models for detecting disease-causing mutations. Specifically, we emphasize the advances in predicting hotspots and mutations for targets involved in cancer, neurodegenerative and infectious diseases as well as in membrane proteins. The computational resources, including databases and algorithms for understanding/predicting the effect of mutations, will be listed. Moreover, limitations of existing methods and possible improvements will be discussed.
Affiliation(s)
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Medha Pandey
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- A Kulandaisamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Divya Sharma
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Fathima Ridha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
3
Faraggi E, Jernigan RL, Kloczkowski A. Rapid discrimination between deleterious and benign missense mutations in the CAGI 6 experiment. Hum Genomics 2024; 18:89. [PMID: 39192324 PMCID: PMC11350969 DOI: 10.1186/s40246-024-00655-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 08/08/2024] [Indexed: 08/29/2024] Open
Abstract
We describe the machine learning tool that we applied in the CAGI 6 experiment to predict whether single residue mutations in proteins are deleterious or benign. This tool was trained using only single sequences, i.e., without multiple sequence alignments or structural information. Instead, we used global characterizations of the protein sequence. Training and testing data for human gene mutations was obtained from ClinVar (ncbi.nlm.nih.gov/pub/ClinVar/), and for non-human gene mutations from Uniprot (www.uniprot.org). Testing was done on post-training data from ClinVar. This testing yielded high AUC and Matthews correlation coefficient (MCC) for well trained examples but low generalizability. For genes with either sparse or unbalanced training data, the prediction accuracy is poor. The resulting prediction server is available online at http://www.mamiris.com/Shoni.cagi6.
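To illustrate the Matthews correlation coefficient (MCC) mentioned in the abstract above, here is a minimal Python sketch that computes it from a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from the paper; the sketch only shows why MCC can be more informative than raw accuracy when a gene's training data are unbalanced.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC for binary labels (e.g. deleterious = positive, benign = negative).

    Returns 0.0 when any marginal count is zero, which is a common
    convention for the undefined case (an assumption of this sketch).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical unbalanced gene: 10 deleterious and 90 benign variants.
# A predictor that finds only 1 of the 10 deleterious variants still
# reaches 91% accuracy, but its MCC is much lower.
tp, tn, fp, fn = 1, 90, 0, 9
accuracy = (tp + tn) / (tp + tn + fp + fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy = {accuracy:.2f}, MCC = {mcc:.2f}")
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), so it penalizes the predictor above for missing most of the minority class.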
Affiliation(s)
- Eshel Faraggi
  - Research and Information Systems, LLC, 1620 E. 72nd ST., Indianapolis, IN, 46240, USA
  - Physics Department, Indiana University Purdue University Indianapolis, Indianapolis, IN, 46202, USA
- Robert L Jernigan
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA, 50011, USA
- Andrzej Kloczkowski
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Columbus, OH, 43205, USA
  - Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, 43205, USA
4
Pandey P, Alexov E. Most Monogenic Disorders Are Caused by Mutations Altering Protein Folding Free Energy. Int J Mol Sci 2024; 25:1963. [PMID: 38396641 PMCID: PMC10888012 DOI: 10.3390/ijms25041963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 01/31/2024] [Accepted: 02/02/2024] [Indexed: 02/25/2024] Open
Abstract
Revealing the molecular effect that pathogenic missense mutations have on the corresponding protein is crucial for developing therapeutic solutions. This is especially important for monogenic diseases since, for most of them, there is no treatment available, while typically, the treatment should be provided in the early development stages. This requires fast targeted drug development at a low cost. Here, we report an updated database of monogenic disorders (MOGEDO), which includes 768 proteins and the corresponding 2559 pathogenic and 1763 benign mutations, along with the functional classification of the corresponding proteins. Using the database and various computational tools that predict folding free energy change (ΔΔG), we demonstrate that, on average, 70% of pathogenic cases result in decreased protein stability. Such a large fraction indicates that one should aim at in silico screening for small molecules stabilizing the structure of the mutant protein. We emphasize that knowledge of ΔΔG is essential because one wants to develop stabilizers that compensate for ΔΔG, but do not make protein over-stable, since over-stable protein may be dysfunctional. We demonstrate that, by using ΔΔG and predicted solvent exposure of the mutation site, one can develop a predictive method that distinguishes pathogenic from benign mutations with a success rate even better than some of the leading pathogenicity predictors. Furthermore, hydrophobic-hydrophobic mutations have stronger correlations between folding free energy change and pathogenicity compared with others. Also, mutations involving Cys, Gly, Arg, Trp, and Tyr amino acids being replaced by any other amino acid are more likely to be pathogenic. To facilitate further detection of pathogenic mutations, the wild type of amino acids in the 768 proteins mentioned above was mutated to other 19 residues (14,847,817 mutations), the ΔΔG was calculated with SAAFEC-SEQ, and 5,506,051 mutations were predicted to be pathogenic.
Affiliation(s)
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
5
Mohammadnejadi E, Razzaghi-Asl N. In silico target specific design of potential quinazoline-based anti-NSCLC agents. J Biomol Struct Dyn 2023; 41:10725-10736. [PMID: 36826424 DOI: 10.1080/07391102.2023.2183029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 12/07/2022] [Indexed: 02/25/2023]
Abstract
Non-small cell lung cancer (NSCLC) accounts for 85% of all lung cancers. In spite of great advances, treatment of the disease remains a medical challenge. Epidermal growth factor receptor (EGFR) has been taken as a promising cell surface target to develop anti-NSCLC therapies. The main bottleneck to attaining clinical efficacy with current EGFR tyrosine kinase inhibitors (EGFR-TKIs) is the rapid spread of oncogenic mutations. Numerous efforts have been made to synthesize diverse EGFR-TKIs against resistance-conferring mutations. One of the best strategies to design potent agents would be to explore existing anti-NSCLC drugs at the nonclinical development stage and prioritize privileged structural patterns. In the current study, the conformational stability of clinically frequent EGFR mutants (G719S, T790M, L858R and the double mutant L858R/T790M) was validated via the DynaMut and Missense3D computational servers. Subsequently, structure-activity relationship (SAR) and scaffold similarity inquiry were used to rationally propose a few erlotinib analogues. The intended molecules were subjected to molecular docking, and top-scored binders were further analyzed through 50-ns all-atom molecular dynamics (MD) simulations to infer their dynamic behavior. The aim was to offer potential binders to overcome clinically frequent EGFR-TK mutants. The linear interaction energy (LIE) method was applied to compute the binding free energies between EGFR and the intended ligands. For this purpose, MD-based conformational sampling of ligand-enzyme complexes and ligand-water associations was used to acquire thermodynamic energy averages. Though mechanistic details are to be explored, the results of the current study identify synthetically accessible quinazoline small molecules with potential affinity toward frequent EGFR-TK mutants. Communicated by Ramaswamy H. Sarma.
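As a rough illustration of the linear interaction energy (LIE) idea named in the abstract above, the sketch below applies the commonly cited form ΔG_bind ≈ α·Δ⟨V_vdW⟩ + β·Δ⟨V_el⟩ + γ, where each Δ is the difference between the MD-averaged ligand-surroundings energy in the bound (ligand-enzyme) and free (ligand-water) simulations. The function name, the coefficient defaults (α = 0.18, β = 0.5, γ = 0) and all energy values are assumptions for illustration only; the abstract does not state which parameters the authors used.

```python
from statistics import fmean


def lie_binding_free_energy(vdw_bound, vdw_free, el_bound, el_free,
                            alpha=0.18, beta=0.5, gamma=0.0):
    """Estimate a binding free energy (kcal/mol) with an LIE-style formula.

    Each argument is a sequence of per-frame ligand-surroundings
    interaction energies sampled from an MD trajectory. The coefficient
    defaults are illustrative placeholders, not values from the study.
    """
    delta_vdw = fmean(vdw_bound) - fmean(vdw_free)  # van der Waals term
    delta_el = fmean(el_bound) - fmean(el_free)     # electrostatic term
    return alpha * delta_vdw + beta * delta_el + gamma


# Hypothetical per-frame energies (kcal/mol) for one ligand.
dg = lie_binding_free_energy(
    vdw_bound=[-40.0, -42.0], vdw_free=[-20.0, -22.0],
    el_bound=[-30.0, -30.0], el_free=[-25.0, -25.0],
)
print(f"Estimated binding free energy: {dg:.1f} kcal/mol")
```

In practice the averages would come from many trajectory frames, and the coefficients are usually calibrated for the system under study.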
Affiliation(s)
- Elaheh Mohammadnejadi
  - Students Research Committee, School of Pharmacy, Ardabil University of Medical Sciences, Ardabil, Iran
- Nima Razzaghi-Asl
  - Department of Medicinal Chemistry, School of Pharmacy, Ardabil University of Medical Sciences, Ardabil, Iran
6
Pandey P, Alexov E. Most monogenic disorders are caused by mutations altering protein folding free energy. RESEARCH SQUARE 2023:rs.3.rs-3442589. [PMID: 37886551 PMCID: PMC10602188 DOI: 10.21203/rs.3.rs-3442589/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/28/2023]
Abstract
Revealing the molecular effect that pathogenic missense mutations have on the corresponding protein is crucial for developing therapeutic solutions. This is especially important for monogenic diseases since, for most of them, there is no treatment available, while typically, the treatment should be provided in the early development stages. This requires fast, targeted drug development at a low cost. Here, we report a database of monogenic disorders (MOGEDO), which includes 768 proteins and the corresponding 2559 pathogenic and 1763 benign mutations, along with the functional classification of the corresponding proteins. Using the database and various computational tools that predict folding free energy change (ΔΔG), we demonstrate that, on average, 70% of pathogenic cases result in decreased protein stability. Such a large fraction indicates that one should aim at in silico screening for small molecules stabilizing the structure of the mutant protein. We emphasize that knowledge of ΔΔG is essential because one wants to develop stabilizers that compensate for ΔΔG but do not make the protein over-stable, since an over-stable protein may be dysfunctional. We demonstrate that, by using ΔΔG and the predicted solvent exposure of the mutation site, one can develop a predictive method that distinguishes pathogenic from benign mutations with a success rate even better than some of the leading pathogenicity predictors. Furthermore, hydrophobic-hydrophobic mutations have stronger correlations between folding free energy change and pathogenicity compared with others. Also, mutations involving Cys, Gly, Arg, Trp and Tyr amino acids being replaced by any other amino acid are more likely to be pathogenic. To facilitate further detection of pathogenic mutations, the wild-type amino acids in the 768 proteins mentioned above were mutated to the other 19 residues (14,847,817 mutations), the ΔΔG was calculated with SAAFEC-SEQ, and 5,506,051 mutations were predicted to be pathogenic.
7
Berber I, Erten C, Kazan H. Predator: Predicting the Impact of Cancer Somatic Mutations on Protein-Protein Interactions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:3163-3172. [PMID: 37030791 DOI: 10.1109/tcbb.2023.3262119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Since many biological processes are governed by protein-protein interactions, understanding which mutations lead to a disruption in these interactions is profoundly important for cancer research. Most of the existing methods focus on the stability of the protein without considering the specific effects of a mutation on its interactions with other proteins. Here, we focus on somatic mutations that appear in the interface regions of the protein and predict the interactions that would be affected by a mutation of interest. We build an ensemble model, Predator, that classifies the interface mutations as disruptive or nondisruptive based on the predicted effects of mutations on specific protein-protein interactions. We show that Predator outperforms existing approaches in the literature in terms of prediction accuracy. We then apply Predator to various TCGA cancer cohorts and perform a comprehensive analysis at the cohort, patient, and gene levels to determine the genes whose interface mutations tend to disrupt their interactions. The predictions obtained by Predator shed light on interesting patterns in several genes for each cohort regarding their potential as cancer drivers. Our analyses further reveal that the identified genes and their frequently disrupted partners exhibit patterns of mutual exclusivity across the cancer cohorts under study.
8
Pandey P, Panday SK, Rimal P, Ancona N, Alexov E. Predicting the Effect of Single Mutations on Protein Stability and Binding with Respect to Types of Mutations. Int J Mol Sci 2023; 24:12073. [PMID: 37569449 PMCID: PMC10418460 DOI: 10.3390/ijms241512073] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/11/2023] [Revised: 07/24/2023] [Accepted: 07/26/2023] [Indexed: 08/13/2023] Open
Abstract
The development of methods and algorithms to predict the effect of mutations on protein stability, protein-protein interaction, and protein-DNA/RNA binding is necessitated by the needs of protein engineering and for understanding the molecular mechanism of disease-causing variants. The vast majority of the leading methods require a database of experimentally measured folding and binding free energy changes for training. These databases are collections of experimental data taken from scientific investigations typically aimed at probing the role of particular residues on the above-mentioned thermodynamic characteristics, i.e., the mutations are not introduced at random and do not necessarily represent mutations originating from single nucleotide variants (SNV). Thus, the reported performance of the leading algorithms assessed on these databases or other limited cases may not be applicable for predicting the effect of SNVs seen in the human population. Indeed, we demonstrate that the SNVs and non-SNVs are not equally presented in the corresponding databases, and the distribution of the free energy changes is not the same. It is shown that the Pearson correlation coefficients (PCCs) of folding and binding free energy changes obtained in cases involving SNVs are smaller than for non-SNVs, indicating that caution should be used in applying them to reveal the effect of human SNVs. Furthermore, it is demonstrated that some methods are sensitive to the chemical nature of the mutations, resulting in PCCs that differ by a factor of four across chemically different mutations. All methods are found to underestimate the energy changes by roughly a factor of 2.
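The abstract above reports two separate observations: correlation (PCC) between predicted and experimental free energy changes, and a systematic underestimation of their magnitude. The minimal Python sketch below, using made-up ΔΔG values, shows that these are independent quantities: a predictor can correlate perfectly with experiment and still underestimate every value by a factor of 2. The function names and data are hypothetical, and using the regression slope as the measure of underestimation is an assumption of this sketch, not necessarily the authors' procedure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical ΔΔG values in kcal/mol; predictions are exactly half
# of the experimental values.
experimental = [1.0, 2.0, 3.0, 4.0]
predicted = [0.5, 1.0, 1.5, 2.0]

pcc = pearson(experimental, predicted)
scale = slope(experimental, predicted)
print(f"PCC = {pcc:.2f}, slope = {scale:.2f}")
```

Here the PCC is 1 while the slope is 0.5, which is why a benchmark that reports only correlation can hide a consistent bias in magnitude.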
Affiliation(s)
- Preeti Pandey
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Shailesh Kumar Panday
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Prawin Rimal
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Nicolas Ancona
  - Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
9
Pandey P, Ghimire S, Wu B, Alexov E. On the linkage of thermodynamics and pathogenicity. Curr Opin Struct Biol 2023; 80:102572. [PMID: 36965249 PMCID: PMC10239362 DOI: 10.1016/j.sbi.2023.102572] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/25/2023] [Revised: 02/16/2023] [Accepted: 02/21/2023] [Indexed: 03/27/2023]
Abstract
This review outlines the effect of disease-causing mutations on proteins' thermodynamics. Two major thermodynamic quantities that are essential for structural integrity, the folding and binding free energy changes caused by missense mutations, are considered. It is emphasized that disease effects in the case of complex diseases may originate from several mutations over several genes, while monogenic diseases are caused by a mutation in a single gene. Nevertheless, in both cases it is shown that pathogenic mutations cause larger perturbations of the above-mentioned thermodynamic quantities than benign mutations do. Recent works demonstrating the effect of pathogenic mutations on the above-mentioned thermodynamic quantities, as well as on structural dynamics and allosteric pathways, are reviewed.
Affiliation(s)
- Preeti Pandey
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Sanjeev Ghimire
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Bohua Wu
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Emil Alexov
  - Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
10
Gao Y, Wang B, Hu S, Zhu T, Zhang JZH. An efficient method to predict protein thermostability in alanine mutation. Phys Chem Chem Phys 2022; 24:29629-29639. [PMID: 36449314 DOI: 10.1039/d2cp04236c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
The relationship between protein sequence and its thermodynamic stability is a critical aspect of computational protein design. In this work, we present a new theoretical method to calculate the free energy change (ΔΔG) resulting from a single-point amino acid mutation to alanine in a protein sequence. The method is derived based on physical interactions and is very efficient in estimating the free energy changes caused by a series of alanine mutations from just a single molecular dynamics (MD) trajectory. Numerical calculations are carried out on a total of 547 alanine mutations in 19 diverse proteins whose experimental results are available. The comparison between the experimental ΔΔGexp and the calculated values shows a generally good correlation with a correlation coefficient of 0.67. Both the advantages and limitations of this method are discussed. This method provides an efficient and valuable tool for protein design and engineering.
Affiliation(s)
- Ya Gao
  - School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Bo Wang
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- Shiyu Hu
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Tong Zhu
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- John Z H Zhang
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
11
Electrostatics in Computational Biophysics and Its Implications for Disease Effects. Int J Mol Sci 2022; 23:ijms231810347. [PMID: 36142260 PMCID: PMC9499338 DOI: 10.3390/ijms231810347] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/30/2022] [Revised: 08/31/2022] [Accepted: 09/02/2022] [Indexed: 12/25/2022] Open
Abstract
This review outlines the role of electrostatics in computational molecular biophysics and its implication in altering wild-type characteristics of biological macromolecules, and thus the contribution of electrostatics to disease mechanisms. The work is not intended to review existing computational approaches or to propose further developments. Instead, it summarizes the outcomes of relevant studies and provides a generalized classification of major mechanisms that involve electrostatic effects in both wild-type and mutant biological macromolecules. It emphasizes the complex role of electrostatics in molecular biophysics, such that the long range of electrostatic interactions causes them to dominate all other forces at distances larger than several Angstroms, while at the same time, the alteration of short-range wild-type electrostatic pairwise interactions can have pronounced effects as well. Because of this dual nature of electrostatic interactions, being dominant at long-range and being very specific at short-range, their implications for wild-type structure and function are quite pronounced. Therefore, any disruption of the complex electrostatic network of interactions may abolish wild-type functionality and could be the dominant factor contributing to pathogenicity. However, we also outline that due to the plasticity of biological macromolecules, the effect of amino acid mutation may be reduced, and thus a charge deletion or insertion may not necessarily be deleterious.
12
Xiong D, Lee D, Li L, Zhao Q, Yu H. Implications of disease-related mutations at protein-protein interfaces. Curr Opin Struct Biol 2022; 72:219-225. [PMID: 34959033 PMCID: PMC8863207 DOI: 10.1016/j.sbi.2021.11.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/02/2021] [Revised: 11/01/2021] [Accepted: 11/18/2021] [Indexed: 02/03/2023]
Abstract
Protein-protein interfaces have been attracting great attention owing to their critical roles in protein-protein interactions and the fact that human disease-related mutations are generally enriched in them. Recently, substantial research progress has been made in this field, which has significantly promoted the understanding and treatment of various human diseases. For example, many studies have discovered the properties of disease-related mutations. Besides, as more large-scale experimental data become available, various computational approaches have been proposed to advance our understanding of disease mutations from the data. Here, we overview recent advances in characteristics of disease-related mutations at protein-protein interfaces, mutation effects on protein interactions, and investigation of mutations on specific diseases.
Affiliation(s)
- Dapeng Xiong
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Dongjin Lee
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Le Li
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Qiuye Zhao
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Haiyuan Yu
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
13
Lai J, Yang J, Gamsiz Uzun ED, Rubenstein BM, Sarkar IN. LYRUS: a machine learning model for predicting the pathogenicity of missense variants. BIOINFORMATICS ADVANCES 2021; 2:vbab045. [PMID: 35036922 PMCID: PMC8754197 DOI: 10.1093/bioadv/vbab045] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/08/2021] [Revised: 12/08/2021] [Accepted: 12/21/2021] [Indexed: 01/27/2023]
Abstract
Summary: Single amino acid variations (SAVs) are a primary contributor to variations in the human genome. Identifying pathogenic SAVs can provide insights into the genetic architecture of complex diseases. Most approaches for predicting the functional effects or pathogenicity of SAVs rely on either sequence or structural information. This study presents 〈Lai Yang Rubenstein Uzun Sarkar〉 (LYRUS), a machine learning method that uses an XGBoost classifier to predict the pathogenicity of SAVs. LYRUS incorporates five sequence-based, six structure-based and four dynamics-based features. Uniquely, LYRUS includes a newly proposed sequence co-evolution feature called the variation number. LYRUS was trained using a dataset that contains 4363 protein structures corresponding to 22 639 SAVs from the ClinVar database, and tested using the VariBench testing dataset. Performance analysis showed that LYRUS achieved comparable performance to current variant effect predictors. LYRUS's performance was also benchmarked against six Deep Mutational Scanning datasets for PTEN and TP53. Availability and implementation: LYRUS is freely available and the source code can be found at https://github.com/jiaying2508/LYRUS. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Jiaying Lai
  - Center for Biomedical Informatics, Brown University, Providence, RI 02903, USA
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02906, USA
- Jordan Yang
  - Department of Chemistry, Brown University, Providence, RI 02906, USA
- Ece D Gamsiz Uzun
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02906, USA
  - Department of Pathology and Laboratory Medicine, Brown University Alpert Medical School, Providence, RI 02903, USA
  - Department of Pathology, Rhode Island Hospital, Providence, RI 02903, USA
- Brenda M Rubenstein (corresponding author)
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02906, USA
  - Department of Chemistry, Brown University, Providence, RI 02906, USA
- Indra Neil Sarkar (corresponding author)
  - Center for Biomedical Informatics, Brown University, Providence, RI 02903, USA
  - Rhode Island Quality Institute, Providence, RI 02908, USA
14
Identification of discriminative gene-level and protein-level features associated with pathogenic gain-of-function and loss-of-function variants. Am J Hum Genet 2021; 108:2301-2318. [PMID: 34762822 DOI: 10.1016/j.ajhg.2021.10.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/08/2021] [Accepted: 10/19/2021] [Indexed: 12/13/2022] Open
Abstract
Identifying whether a given genetic mutation results in a gene product with increased (gain-of-function; GOF) or diminished (loss-of-function; LOF) activity is an important step toward understanding disease mechanisms because they may result in markedly different clinical phenotypes. Here, we generated an extensive database of documented germline GOF and LOF pathogenic variants by employing natural language processing (NLP) on the available abstracts in the Human Gene Mutation Database. We then investigated various gene- and protein-level features of GOF and LOF variants and applied machine learning and statistical analyses to identify discriminative features. We found that GOF variants were enriched in essential genes, for autosomal-dominant inheritance, and in protein binding and interaction domains, whereas LOF variants were enriched in singleton genes, for protein-truncating variants, and in protein core regions. We developed a user-friendly web-based interface that enables the extraction of selected subsets from the GOF/LOF database by a broad set of annotated features and downloading of up-to-date versions. These results improve our understanding of how variants affect gene/protein function and may ultimately guide future treatment options.
|
15
|
Singh AN, Sharma N. In-silico identification of frequently mutated genes and their co-enriched metabolic pathways associated with Prostate cancer progression. Andrologia 2021; 53:e14236. [PMID: 34468989 DOI: 10.1111/and.14236] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2021] [Revised: 08/04/2021] [Accepted: 08/15/2021] [Indexed: 11/27/2022] Open
Abstract
Prostate cancer (PCa) has emerged as a significant health burden in men globally. Several genetic anomalies such as mutations and also epigenetic aberrations are responsible for the heterogeneity of this disease. This study identified the 20 most frequently mutated genes reported in PCa based on a literature and database survey. Further gene ontology and functional enrichment analysis were conducted to determine their co-modulated molecular and biological pathways. A protein-protein interaction network was used for the identification of hub genes. These hub genes were then subjected to survival analysis. The prognostic values of these identified genes were investigated using GEPIA and HPA. Gene Ontology analysis of the identified genes depicted that these genes significantly contributed to the cell cycle, apoptosis, angiogenesis and TGF-β receptor signalling. Further, the research showed that high expressions of identified mutated genes led to a reduction in the long-term survival of PCa patients, which was supported by immunohistochemical and mRNA expression level data. Our results suggest that the identified panel of mutated genes, viz. CTNNB1, TP53, ATM, AR and KMT2D, plays crucial roles in the onset and progression of PCa, thereby providing candidate diagnostic markers for PCa for individualised treatment in the future.
Affiliation(s)
- Anshika N Singh
- School of Engineering, Ajeenkya DY Patil University (ADYPU), Pune, India
| | - Neeti Sharma
- School of Engineering, Ajeenkya DY Patil University (ADYPU), Pune, India
| |
|
16
|
Pei J, Grishin NV. The DBSAV Database: Predicting Deleteriousness of Single Amino Acid Variations in the Human Proteome. J Mol Biol 2021; 433:166915. [PMID: 33676930 DOI: 10.1016/j.jmb.2021.166915] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2020] [Revised: 02/28/2021] [Accepted: 03/01/2021] [Indexed: 12/22/2022]
Abstract
Deleterious single amino acid variation (SAV) is one of the leading causes of human diseases. Evaluating the functional impact of SAVs is crucial for diagnosis of genetic disorders. We previously developed a deep convolutional neural network predictor, DeepSAV, to evaluate the deleterious effects of SAVs on protein function based on various sequence, structural, and functional properties. DeepSAV scores of rare SAVs observed in the human population are aggregated into a gene-level score called GTS (Gene Tolerance of rare SAVs) that reflects a gene's tolerance to deleterious missense mutations and serves as a useful tool to study gene-disease associations. In this study, we aim to enhance the performance of DeepSAV by using expanded datasets of pathogenic and benign variants, more features, and neural network optimization. We found that multiple sequence alignments built from vertebrate-level orthologs yield better prediction results compared to those built from mammalian-level orthologs. For multiple sequence alignments built from BLAST searches, optimal performance was achieved with a sequence identity cutoff of 50% to remove distant homologs. The new version of DeepSAV exhibits the best performance among standalone predictors of deleterious effects of SAVs. We developed the DBSAV database (http://prodata.swmed.edu/DBSAV) that reports GTS scores of human genes and DeepSAV scores of SAVs in the human proteome, including pathogenic and benign SAVs, population-level SAVs, and all possible SAVs by single nucleotide variations. This database serves as a useful resource for research of human SAVs and their relationships with protein functions and human diseases.
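The 50% identity cutoff mentioned in the abstract can be illustrated with a small filter over pre-aligned sequences. This is only a sketch: the abstract does not say how identity is computed, so the definition below (matches over columns where neither sequence has a gap) and the names `percent_identity` and `filter_homologs` are assumptions for illustration, not the DeepSAV pipeline.

```python
def percent_identity(query, hit):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both strings are assumed to be rows of the same alignment (equal length).
    """
    pairs = [(q, h) for q, h in zip(query, hit) if q != "-" and h != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for q, h in pairs if q == h)
    return 100.0 * matches / len(pairs)


def filter_homologs(query, hits, cutoff=50.0):
    """Keep only hits whose identity to the query is at least `cutoff` percent."""
    return [h for h in hits if percent_identity(query, h) >= cutoff]


# Toy aligned sequences (made up for illustration).
query = "ACDE"
hits = ["ACDF", "WWWW", "AC--"]
print(filter_homologs(query, hits))
```

Other identity definitions (for example, normalizing by query length) would move some borderline hits across the cutoff, so the choice matters when reproducing a published threshold.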
Affiliation(s)
- Jimin Pei
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Nick V Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
| |
|
17
|
SAAFEC-SEQ: A Sequence-Based Method for Predicting the Effect of Single Point Mutations on Protein Thermodynamic Stability. Int J Mol Sci 2021; 22:ijms22020606. [PMID: 33435356 PMCID: PMC7827184 DOI: 10.3390/ijms22020606] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2020] [Revised: 12/23/2020] [Accepted: 01/06/2021] [Indexed: 01/04/2023] Open
Abstract
Modeling the effect of mutations on protein thermodynamic stability is useful for protein engineering and understanding molecular mechanisms of disease-causing variants. Here, we report a new development of the SAAFEC method, SAAFEC-SEQ, which is a gradient boosting decision tree machine learning method to predict the change of the folding free energy caused by amino acid substitutions. The method does not require the 3D structure of the corresponding protein, but only its sequence and, thus, can be applied to genome-scale investigations where structural information is very sparse. SAAFEC-SEQ uses physicochemical properties, sequence features, and evolutionary information features to make the predictions. It is shown to consistently outperform all existing state-of-the-art sequence-based methods in both the Pearson correlation coefficient and root-mean-squared-error parameters as benchmarked on several independent datasets. SAAFEC-SEQ has been implemented into a web server and is available as stand-alone code that can be downloaded and embedded into other researchers’ code.
|
18
|
Chen Y, Lu H, Zhang N, Zhu Z, Wang S, Li M. PremPS: Predicting the impact of missense mutations on protein stability. PLoS Comput Biol 2020; 16:e1008543. [PMID: 33378330 PMCID: PMC7802934 DOI: 10.1371/journal.pcbi.1008543] [Citation(s) in RCA: 130] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2020] [Revised: 01/12/2021] [Accepted: 11/16/2020] [Indexed: 12/12/2022] Open
Abstract
Computational methods that predict protein stability changes induced by missense mutations have made a lot of progress over the past decades. Most of the available methods, however, have very limited accuracy in predicting stabilizing mutations because existing experimental sets are dominated by mutations reducing protein stability. Moreover, few approaches could consistently perform well across different test cases. To address these issues, we developed a new computational method, PremPS, to more accurately evaluate the effects of missense mutations on protein stability. The PremPS method is composed of only ten evolutionary- and structure-based features and parameterized on a balanced dataset with an equal number of stabilizing and destabilizing mutations. A comprehensive comparison of the predictive performance of PremPS with other available methods on nine benchmark datasets confirms that our approach consistently outperforms other methods and shows considerable improvement in estimating the impacts of stabilizing mutations. A protein could have multiple structures available, and if another structure of the same protein is used, the predicted change in stability for structure-based methods might be different. Thus, we further estimated the impact of using different structures on prediction accuracy, and demonstrate that our method performs well across different types of structures except for low-resolution structures and models built based on templates with low sequence identity. PremPS can be used for finding functionally important variants, revealing the molecular mechanisms of functional influences and protein design. PremPS is freely available at https://lilab.jysw.suda.edu.cn/research/PremPS/; the server supports large-scale mutational scanning, taking about four minutes for a single mutation in a protein of ~300 residues and ~0.4 seconds for each additional mutation.
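The timing figures quoted in the abstract imply a simple linear cost model for a mutational scan. The sketch below assumes the two reported numbers (about four minutes for the first mutation on a ~300-residue protein, about 0.4 s for each further one) behave as constants. The function name and the saturation-scan example are illustrative and not part of PremPS.

```python
def estimated_runtime_seconds(n_mutations, first_cost=240.0, extra_cost=0.4):
    """Rough wall-clock estimate for scanning n_mutations on one protein.

    first_cost: seconds for the first mutation (about four minutes, per the abstract).
    extra_cost: seconds for each additional mutation (about 0.4 s, per the abstract).
    """
    if n_mutations < 1:
        return 0.0
    return first_cost + extra_cost * (n_mutations - 1)


# Hypothetical full single-site scan: 19 substitutions at each of 300 positions.
full_scan = 300 * 19
minutes = estimated_runtime_seconds(full_scan) / 60
print(f"{full_scan} mutations: about {minutes:.0f} min")
```

Under these assumptions the fixed per-protein cost dominates small jobs, while a full saturation scan is driven almost entirely by the per-mutation term.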
Affiliation(s)
- Yuting Chen
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Haoyu Lu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Ning Zhang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Zefeng Zhu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Shuqin Wang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Minghui Li
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| |
|
19
|
Qiu J, Nechaev D, Rost B. Protein-protein and protein-nucleic acid binding residues important for common and rare sequence variants in human. BMC Bioinformatics 2020; 21:452. [PMID: 33050876 PMCID: PMC7557062 DOI: 10.1186/s12859-020-03759-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2020] [Accepted: 09/16/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Any two unrelated people differ by about 20,000 missense mutations (also referred to as SAVs: Single Amino acid Variants or missense SNV). Many SAVs have been predicted to strongly affect molecular protein function. Common SAVs (> 5% of population) were predicted to have, on average, more effect on molecular protein function than rare SAVs (< 1% of population). We hypothesized that the prevalence of effect in common over rare SAVs might partially be caused by common SAVs more often occurring at interfaces of proteins with other proteins, DNA, or RNA, thereby creating subgroup-specific phenotypes. We analyzed SAVs from 60,706 people through the lens of two prediction methods, one (SNAP2) predicting the effects of SAVs on molecular protein function, the other (ProNA2020) predicting residues in DNA-, RNA- and protein-binding interfaces. RESULTS Three results stood out. Firstly, SAVs predicted to occur at binding interfaces were predicted to more likely affect molecular function than those predicted as not binding (p value < 2.2 × 10⁻¹⁶). Secondly, for SAVs predicted to occur at binding interfaces, common SAVs were predicted more strongly with effect on protein function than rare SAVs (p value < 2.2 × 10⁻¹⁶). Restriction to SAVs with experimental annotations confirmed all results, although the resulting subsets were too small to establish statistical significance for any result. Thirdly, the fraction of SAVs predicted at binding interfaces differed significantly between tissues, e.g. urinary bladder tissue was found abundant in SAVs predicted at protein-binding interfaces, and reproductive tissues (ovary, testis, vagina, seminal vesicle and endometrium) in SAVs predicted at DNA-binding interfaces. CONCLUSIONS Overall, the results suggested that residues at protein-, DNA-, and RNA-binding interfaces contributed toward predicting that common SAVs more likely affect molecular function than rare SAVs.
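The common/rare split used in the abstract can be written as a small classifier over the fraction of the population carrying a variant. This is a sketch of the stated thresholds only. The function name and the "intermediate" label for the 1% to 5% range are assumptions, since the abstract does not name that band.

```python
def frequency_class(population_fraction):
    """Classify a SAV by the fraction of the population carrying it.

    Thresholds follow the abstract: common is > 5%, rare is < 1%.
    The label for the band in between is a placeholder of this sketch.
    """
    if population_fraction > 0.05:
        return "common"
    if population_fraction < 0.01:
        return "rare"
    return "intermediate"


# Made-up fractions for illustration.
for fraction in (0.10, 0.03, 0.005):
    print(fraction, frequency_class(fraction))
```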
Affiliation(s)
- Jiajun Qiu
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany.,TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), 85748, Garching, Germany.,Biobank of Ninth People's Hospital, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200125, China.
| | - Dmitrii Nechaev
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany.,TUM Graduate School, Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), 85748, Garching, Germany
| | - Burkhard Rost
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748, Garching, Munich, Germany.,Institute of Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching, Munich, Germany.,Institute for Food and Plant Sciences (WZW) Weihenstephan, Alte Akademie 8, 85354, Freising, Germany
| |
|
20
|
Huang X, Zheng W, Pearce R, Zhang Y. SSIPe: accurately estimating protein-protein binding affinity change upon mutations using evolutionary profiles in combination with an optimized physical energy function. Bioinformatics 2020; 36:2429-2437. [PMID: 31830252 DOI: 10.1093/bioinformatics/btz926] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2019] [Revised: 11/08/2019] [Accepted: 12/09/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Most proteins perform their biological functions through interactions with other proteins in cells. Amino acid mutations, especially those occurring at protein interfaces, can change the stability of protein-protein interactions (PPIs) and impact their functions, which may cause various human diseases. Quantitative estimation of the binding affinity changes (ΔΔGbind) caused by mutations can provide critical information for protein function annotation and genetic disease diagnoses. RESULTS We present SSIPe, which combines protein interface profiles, collected from structural and sequence homology searches, with a physics-based energy function for accurate ΔΔGbind estimation. To offset the statistical limits of the PPI structure and sequence databases, amino acid-specific pseudocounts were introduced to enhance the profile accuracy. SSIPe was evaluated on large-scale experimental data containing 2204 mutations from 177 proteins, where training and test datasets were stringently separated with the sequence identity between proteins from the two datasets below 30%. The Pearson correlation coefficient between estimated and experimental ΔΔGbind was 0.61 with a root-mean-square-error of 1.93 kcal/mol, which was significantly better than the other methods. Detailed data analyses revealed that the major advantage of SSIPe over other traditional approaches lies in the novel combination of the physical energy function with the new knowledge-based interface profile. SSIPe also considerably outperformed a former profile-based method (BindProfX) due to the newly introduced sequence profiles and optimized pseudocount technique that allows for consideration of amino acid-specific prior mutation probabilities. AVAILABILITY AND IMPLEMENTATION Web-server/standalone program, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/SSIPe and https://github.com/tommyhuangthu/SSIPe. 
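The two headline metrics in the abstract, the Pearson correlation coefficient and the root-mean-square error between estimated and experimental ΔΔGbind, can be computed as follows. This is a generic sketch of the standard definitions, not SSIPe code, and the ΔΔG values are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def rmse(predicted, experimental):
    """Root-mean-square error, in the same units as the inputs (here kcal/mol)."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ddG values in kcal/mol.
experimental = [0.5, 1.2, -0.3, 2.1, 0.0]
predicted = [0.7, 0.9, 0.1, 1.5, 0.4]
print(f"r = {pearson_r(predicted, experimental):.2f}, "
      f"RMSE = {rmse(predicted, experimental):.2f} kcal/mol")
```

The two numbers are complementary: correlation measures how well the ranking and trend are captured, while RMSE measures the absolute size of the errors, so a method can score well on one and poorly on the other.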
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Wei Zheng
- Department of Computational Medicine and Bioinformatics
| | - Robin Pearce
- Department of Computational Medicine and Bioinformatics
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics.,Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
21
|
An Ensemble Approach to Predict the Pathogenicity of Synonymous Variants. Genes (Basel) 2020; 11:genes11091102. [PMID: 32967157 PMCID: PMC7565489 DOI: 10.3390/genes11091102] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2020] [Revised: 09/08/2020] [Accepted: 09/17/2020] [Indexed: 12/18/2022] Open
Abstract
Single-nucleotide variants (SNVs) are a major form of genetic variation in the human genome that contribute to various disorders. There are two types of SNVs, namely non-synonymous (missense) variants (nsSNVs) and synonymous variants (sSNVs), predominantly involved in RNA processing or gene regulation. sSNVs, unlike missense or nsSNVs, do not alter the amino acid sequences, thereby making them challenging candidates for downstream functional studies. Numerous computational methods have been developed to evaluate the clinical impact of nsSNVs, but very few methods are available for understanding the effects of sSNVs. For this analysis, we have downloaded sSNVs from the ClinVar database with various features such as conservation, DNA-RNA, and splicing properties. We performed feature selection and implemented an ensemble random forest (RF) classification algorithm to build a classifier to predict the pathogenicity of the sSNVs. We demonstrate that the ensemble predictor with selected features (20 features) enhances the classification of sSNVs into two categories, pathogenic and benign, with high accuracy (87%), precision (79%), and recall (91%). Furthermore, we used this prediction model to reclassify sSNVs with unknown clinical significance. Finally, the method is very robust and can be used to predict the effect of other unknown sSNVs.
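The accuracy, precision, and recall figures reported in the abstract follow from the confusion-matrix counts of a binary classifier. The sketch below shows the standard definitions with "pathogenic" as the positive class. The labels and toy predictions are invented for illustration and do not come from the paper.

```python
def classification_metrics(y_true, y_pred, positive="pathogenic"):
    """Accuracy, precision, and recall for a binary labeling task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels (made up): three pathogenic and two benign variants.
y_true = ["pathogenic", "pathogenic", "pathogenic", "benign", "benign"]
y_pred = ["pathogenic", "pathogenic", "benign", "pathogenic", "benign"]
print(classification_metrics(y_true, y_pred))
```

A recall higher than precision, as in the reported 91% versus 79%, means the classifier misses few pathogenic variants at the cost of flagging some benign ones.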
|
22
|
Long QT Syndrome Type 2: Emerging Strategies for Correcting Class 2 KCNH2 (hERG) Mutations and Identifying New Patients. Biomolecules 2020. [PMID: 32759882 DOI: 10.3390/biom10081144s] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/25/2023] Open
Abstract
Significant advances in our understanding of the molecular mechanisms that cause congenital long QT syndrome (LQTS) have been made. A wide variety of experimental approaches, including heterologous expression of mutant ion channel proteins and the use of inducible pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) from LQTS patients offer insights into etiology and new therapeutic strategies. This review briefly discusses the major molecular mechanisms underlying LQTS type 2 (LQT2), which is caused by loss-of-function (LOF) mutations in the KCNH2 gene (also known as the human ether-à-go-go-related gene or hERG). Almost half of suspected LQT2-causing mutations are missense mutations, and functional studies suggest that about 90% of these mutations disrupt the intracellular transport, or trafficking, of the KCNH2-encoded Kv11.1 channel protein to the cell surface membrane. In this review, we discuss emerging strategies that improve the trafficking and functional expression of trafficking-deficient LQT2 Kv11.1 channel proteins to the cell surface membrane and how new insights into the structure of the Kv11.1 channel protein will lead to computational approaches that identify which KCNH2 missense variants confer a high-risk for LQT2.
|
23
|
Ono M, Burgess DE, Schroder EA, Elayi CS, Anderson CL, January CT, Sun B, Immadisetty K, Kekenes-Huskey PM, Delisle BP. Long QT Syndrome Type 2: Emerging Strategies for Correcting Class 2 KCNH2 (hERG) Mutations and Identifying New Patients. Biomolecules 2020; 10:E1144. [PMID: 32759882 PMCID: PMC7464307 DOI: 10.3390/biom10081144] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2020] [Revised: 07/25/2020] [Accepted: 07/27/2020] [Indexed: 12/15/2022] Open
Affiliation(s)
- Makoto Ono
- Department of Physiology, Cardiovascular Research Center, Center for Muscle Biology, University of Kentucky, Lexington, KY 40536, USA; (M.O.); (D.E.B.); (E.A.S.)
| | - Don E. Burgess
- Department of Physiology, Cardiovascular Research Center, Center for Muscle Biology, University of Kentucky, Lexington, KY 40536, USA; (M.O.); (D.E.B.); (E.A.S.)
| | - Elizabeth A. Schroder
- Department of Physiology, Cardiovascular Research Center, Center for Muscle Biology, University of Kentucky, Lexington, KY 40536, USA; (M.O.); (D.E.B.); (E.A.S.)
| | | | - Corey L. Anderson
- Cellular and Molecular Arrhythmia Research Program, University of Wisconsin, Madison, WI 53706, USA; (C.L.A.); (C.T.J.)
| | - Craig T. January
- Cellular and Molecular Arrhythmia Research Program, University of Wisconsin, Madison, WI 53706, USA; (C.L.A.); (C.T.J.)
| | - Bin Sun
- Department of Cellular & Molecular Physiology, Loyola University Chicago, Chicago, IL 60153, USA; (B.S.); (K.I.); (P.M.K.-H.)
| | - Kalyan Immadisetty
- Department of Cellular & Molecular Physiology, Loyola University Chicago, Chicago, IL 60153, USA; (B.S.); (K.I.); (P.M.K.-H.)
| | - Peter M. Kekenes-Huskey
- Department of Cellular & Molecular Physiology, Loyola University Chicago, Chicago, IL 60153, USA; (B.S.); (K.I.); (P.M.K.-H.)
| | - Brian P. Delisle
- Department of Physiology, Cardiovascular Research Center, Center for Muscle Biology, University of Kentucky, Lexington, KY 40536, USA; (M.O.); (D.E.B.); (E.A.S.)
| |
|
24
|
Koirala M, Alexov E. Ab-initio binding of barnase–barstar with DelPhiForce steered Molecular Dynamics (DFMD) approach. JOURNAL OF THEORETICAL & COMPUTATIONAL CHEMISTRY 2020. [DOI: 10.1142/s0219633620500169] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Receptor–ligand interactions are involved in various biological processes; therefore, understanding the binding mechanism and the ability to predict the binding mode are essential for many biological investigations. While many computational methods exist to predict the 3D structure of the corresponding complex provided the knowledge of the monomers, here we use the newly developed DelPhiForce steered Molecular Dynamics (DFMD) approach to model the binding of barstar to barnase to demonstrate that first-principles methods are also capable of modeling the binding. An essential component of the DFMD approach is enhancing the role of long-range electrostatic interactions to provide a guiding force that steers the monomers toward their correct binding orientation and position. Thus, it is demonstrated that DFMD can successfully dock barstar to barnase even if the initial positions and orientations of both are completely different from the correct ones. Thus, the electrostatics provides orientational guidance along with a pulling force to deliver the ligand in close proximity to the receptor.
Affiliation(s)
- Mahesh Koirala
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| |
|
25
|
Shringari SR, Giannakoulias S, Ferrie JJ, Petersson EJ. Rosetta custom score functions accurately predict ΔΔG of mutations at protein-protein interfaces using machine learning. Chem Commun (Camb) 2020; 56:6774-6777. [PMID: 32441721 DOI: 10.1039/d0cc01959c] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Protein-protein interfaces play essential roles in a variety of biological processes and many therapeutic molecules are targeted at these interfaces. However, accurate predictions of the effects of interfacial mutations to identify "hotspots" have remained elusive despite the myriad of modeling and machine learning methods tested. Here, for the first time, we demonstrate that nonlinear reweighting of energy terms from Rosetta, through the use of machine learning, exhibits improved predictability of ΔΔG values associated with interfacial mutations.
Affiliation(s)
- Sumant R Shringari
- Department of Chemistry, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104, USA.
|
26
|
Pei J, Kinch LN, Otwinowski Z, Grishin NV. Mutation severity spectrum of rare alleles in the human genome is predictive of disease type. PLoS Comput Biol 2020; 16:e1007775. [PMID: 32413045 PMCID: PMC7255613 DOI: 10.1371/journal.pcbi.1007775] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2019] [Revised: 05/28/2020] [Accepted: 03/06/2020] [Indexed: 12/19/2022] Open
Abstract
The human genome harbors a variety of genetic variations. Single-nucleotide changes that alter amino acids in protein-coding regions are one of the major causes of human phenotypic variation and diseases. These single-amino acid variations (SAVs) are routinely found in whole genome and exome sequencing. Evaluating the functional impact of such genomic alterations is crucial for diagnosis of genetic disorders. We developed DeepSAV, a deep-learning convolutional neural network to differentiate disease-causing and benign SAVs based on a variety of protein sequence, structural and functional properties. Our method outperforms most stand-alone programs, and the version incorporating population and gene-level information (DeepSAV+PG) has similar predictive power as some of the best available. We transformed DeepSAV scores of rare SAVs in the human population into a quantity termed "mutation severity measure" for each human protein-coding gene. It reflects a gene's tolerance to deleterious missense mutations and serves as a useful tool to study gene-disease associations. Genes implicated in cancer, autism, and viral interaction are found by this measure as intolerant to mutations, while genes associated with a number of other diseases are scored as tolerant. Among known disease-associated genes, those that are mutation-intolerant are likely to function in development and signal transduction pathways, while those that are mutation-tolerant tend to encode metabolic and mitochondrial proteins.
Affiliation(s)
- Jimin Pei
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
| | - Lisa N. Kinch
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
| | - Zbyszek Otwinowski
- Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
| | - Nick V. Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
| |
|
27
|
Pahari S, Li G, Murthy AK, Liang S, Fragoza R, Yu H, Alexov E. SAAMBE-3D: Predicting Effect of Mutations on Protein-Protein Interactions. Int J Mol Sci 2020; 21:E2563. [PMID: 32272725 PMCID: PMC7177817 DOI: 10.3390/ijms21072563] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2020] [Revised: 04/04/2020] [Accepted: 04/05/2020] [Indexed: 12/26/2022] Open
Abstract
Maintaining wild-type protein-protein interactions is essential for the normal function of the cell, and any mutation that alters their characteristics can cause disease. Therefore, the ability to correctly and quickly predict the effect of amino acid mutations is crucial for understanding disease effects and for carrying out genome-wide studies. Here, we report a new development of the SAAMBE method, SAAMBE-3D, a machine learning-based approach that is both accurate and extremely fast. It achieves a Pearson correlation coefficient ranging from 0.78 to 0.82, depending on the training protocol, in five-fold validation benchmarking against the SKEMPI v2.0 database and outperforms currently existing algorithms on various blind tests. Furthermore, optimized and tested via five-fold cross-validation on the Cornell University dataset, SAAMBE-3D achieves AUCs of 1.0 and 0.96 on homo- and hetero-dimer test datasets, respectively. Another important feature of SAAMBE-3D is its speed: a prediction completes in a fraction of a second. SAAMBE-3D is available as a web server and as stand-alone code, the latter allowing other researchers to download the code and run it on their local computers. Taken together, SAAMBE-3D is accurate and fast software applicable to genome-wide studies assessing the effect of amino acid mutations on protein-protein interactions. The web server and the stand-alone codes (SAAMBE-3D for predicting the change of binding free energy and SAAMBE-3D-DN for predicting if the mutation is disruptive or non-disruptive) are available.
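The five-fold validation mentioned in the abstract partitions the dataset into five folds and holds each one out in turn. The sketch below shows a plain random split over sample indices. It is an assumption for illustration: the abstract notes that results depend on the training protocol, and the actual SAAMBE-3D splits (for example, whether mutations in the same complex are kept together) are not described here.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, test


# Each of 10 hypothetical mutations is held out exactly once across the 5 folds.
for fold_number, (train, test) in enumerate(k_fold_indices(10), start=1):
    print(fold_number, sorted(test))
```

A random split by mutation tends to give more optimistic correlations than a split that keeps all mutations of a protein complex in the same fold, which is one reason reported values vary with the protocol.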
Affiliation(s)
- Swagata Pahari
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| | - Gen Li
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| | - Adithya Krishna Murthy
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA; (S.L.); (R.F.); (H.Y.)
| | - Robert Fragoza
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA; (S.L.); (R.F.); (H.Y.)
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA; (S.L.); (R.F.); (H.Y.)
| | - Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| |
Collapse
|
28
|
Gyulkhandanyan A, Rezaie AR, Roumenina L, Lagarde N, Fremeaux-Bacchi V, Miteva MA, Villoutreix BO. Analysis of protein missense alterations by combining sequence- and structure-based methods. Mol Genet Genomic Med 2020; 8:e1166. [PMID: 32096919 PMCID: PMC7196459 DOI: 10.1002/mgg3.1166] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2019] [Revised: 01/20/2020] [Accepted: 01/27/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Different types of in silico approaches can be used to predict the phenotypic consequence of missense variants. Such algorithms are often categorized as sequence based or, when they require 3D structural information, structure based. In addition, many other in silico tools, not dedicated to the analysis of variants, can be used to gain additional insights about the possible mechanisms at play. METHODS Here we applied different computational approaches to a set of 20 known missense variants present on different proteins (CYP, complement factor B, antithrombin and blood coagulation factor VIII). The tools that were used include fast computational approaches and web servers such as PolyPhen-2, PopMusic, DUET, MaestroWeb, SAAFEC, Missense3D, VarSite, FlexPred, PredyFlexy, Clustal Omega, meta-PPISP, FTMap, ClusPro, pyDock, PPM, RING, Cytoscape, and ChannelsDB. RESULTS We observe some conflicting results among the methods but, most of the time, the combination of several engines helped to clarify the potential impacts of the amino acid substitutions. CONCLUSION Combining different computational approaches, including some that were not developed to investigate missense variants, helps to predict the possible impact of the amino acid substitutions. Yet, when the modified residues are involved in a salt bridge, the tools tend to fail, even when the analysis is performed in 3D. Thus, interactive structural analysis with molecular graphics packages such as Chimera or PyMol is still needed to clarify automatic predictions.
Affiliation(s)
- Aram Gyulkhandanyan
- INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
- Laboratory SABNP, University of Evry, INSERM U1204, Université Paris-Saclay, Evry, France
- Alireza R Rezaie
- Cardiovascular Biology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
- Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Lubka Roumenina
- INSERM, UMR_S 1138, Centre de Recherche des Cordeliers, Paris, France
- Sorbonne Universités, Paris, France
- Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Nathalie Lagarde
- INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
- Laboratoire GBCM, EA7528, Conservatoire national des arts et métiers, Hesam Université, Paris, France
- Veronique Fremeaux-Bacchi
- INSERM, UMR_S 1138, Centre de Recherche des Cordeliers, Paris, France
- Sorbonne Universités, Paris, France
- Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Assistance Publique-Hôpitaux de Paris, Service d'Immunologie Biologique, Hôpital Européen Georges Pompidou, Paris, France
- Maria A Miteva
- INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
- Inserm U1268 MCTR, CNRS UMR 8038 CiTCoM, Faculté de Pharmacie de Paris, Univ. De Paris, Paris, France
- Bruno O Villoutreix
- INSERM U973, Laboratory MTi, University Paris Diderot, Paris, France
- INSERM, Institut Pasteur de Lille, U1177-Drugs and Molecules for Living Systems, Université de Lille, Lille, France

29
Ganakammal SR, Alexov E. Evaluation of performance of leading algorithms for variant pathogenicity predictions and designing a combinatory predictor method: application to Rett syndrome variants. PeerJ 2019; 7:e8106. [PMID: 31799076 PMCID: PMC6884988 DOI: 10.7717/peerj.8106] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/24/2019] [Accepted: 10/27/2019] [Indexed: 12/13/2022]
Abstract
Background Genomic diagnostic tests are performed for a wide spectrum of complex genetic conditions such as autism and cancer. Advances in technology have aided in successfully decoding the genetic variants that cause or trigger these disorders. However, interpretation of these variants is not a trivial task, even at the level of distinguishing pathogenic from benign variants. Methods We used the clinically significant variants from the ClinVar database to evaluate the performance of the 14 most popular in silico predictors using supervised learning methods. We implemented a feature selection and random forest classification algorithm to identify the best combination of predictors for evaluating the pathogenicity of a variant. Finally, we used this combination of predictors to reclassify the variants of unknown significance (VUS) in the MECP2 gene that are associated with Rett syndrome. Results The analysis yields an optimized selection of prediction algorithms and a combinatory predictor method. Our combinatory approach of using both the best-performing independent and ensemble predictors reduces algorithm biases in variant characterization. The reclassification of variants (such as VUS) in the MECP2 gene associated with Rett syndrome suggests that the combinatory in silico predictor approach had a higher success rate in categorizing their pathogenicity.
Affiliation(s)
- Emil Alexov
- Department of Physics, Clemson University, Clemson, SC, USA

30
Jankauskaite J, Jiménez-García B, Dapkunas J, Fernández-Recio J, Moal IH. SKEMPI 2.0: an updated benchmark of changes in protein-protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics 2019; 35:462-469. [PMID: 30020414 PMCID: PMC6361233 DOI: 10.1093/bioinformatics/bty635] [Citation(s) in RCA: 174] [Impact Index Per Article: 29.0] [Received: 05/16/2018] [Accepted: 07/17/2018] [Indexed: 11/18/2022]
Abstract
Motivation Understanding the relationship between the sequence, structure, binding energy, binding kinetics and binding thermodynamics of protein–protein interactions is crucial to understanding cellular signaling, the assembly and regulation of molecular complexes, the mechanisms through which mutations lead to disease, and protein engineering. Results We present SKEMPI 2.0, a major update to our database of binding free energy changes upon mutation for structurally resolved protein–protein interactions. This version now contains manually curated binding data for 7085 mutations, an increase of 133%, including changes in kinetics for 1844 mutations, enthalpy and entropy changes for 443 mutations, and 440 mutations, which abolish detectable binding. Availability and implementation The database is available as supplementary data and at https://life.bsc.es/pid/skempi2/. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Justina Jankauskaite
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Brian Jiménez-García
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, Utrecht, the Netherlands
- Justas Dapkunas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Juan Fernández-Recio
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; Institut de Biologia Molecular de Barcelona (IBMB), CSIC, Barcelona, Spain
- Iain H Moal
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK

31
Li C, Jia Z, Chakravorty A, Pahari S, Peng Y, Basu S, Koirala M, Panday SK, Petukh M, Li L, Alexov E. DelPhi Suite: New Developments and Review of Functionalities. J Comput Chem 2019; 40:2502-2508. [PMID: 31237360 PMCID: PMC6771749 DOI: 10.1002/jcc.26006] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 04/03/2019] [Revised: 05/07/2019] [Accepted: 06/09/2019] [Indexed: 12/25/2022]
Abstract
Electrostatic potential, energies, and forces affect virtually any process in molecular biology; however, computing these quantities is a difficult task due to irregularly shaped macromolecules and the presence of water. Here, we report a new edition of the popular software package DelPhi and describe its functionalities. The new DelPhi is a C++ object-oriented package supporting various levels of multiprocessing and memory distribution. It is demonstrated that multiprocessing results in a significant improvement in computational time. Furthermore, for computations requiring a large grid size (large macromolecular assemblages), the approach of memory distribution is shown to reduce the RAM requirement, thus permitting large-scale modeling to be done on Linux clusters with moderate architecture. The new release comes with new features, whose functionalities and applications are described as well. © 2019 The Authors. Journal of Computational Chemistry published by Wiley Periodicals, Inc.
Affiliation(s)
- Chuan Li
- Department of Mathematics, West Chester University of Pennsylvania, West Chester, Pennsylvania 19383
- Zhe Jia
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Arghya Chakravorty
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Swagata Pahari
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Yunhui Peng
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Sankar Basu
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Mahesh Koirala
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634
- Marharyta Petukh
- Department of Biology, Presbyterian College, Clinton, South Carolina 29325
- Lin Li
- Department of Physics, University of Texas at El Paso, Texas 79968
- Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634

32
Ozdemir ES, Gursoy A, Keskin O. Analysis of single amino acid variations in singlet hot spots of protein-protein interfaces. Bioinformatics 2019; 34:i795-i801. [PMID: 30423104 DOI: 10.1093/bioinformatics/bty569] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 11/13/2022]
Abstract
Motivation Single amino acid variations (SAVs) in protein-protein interaction (PPI) sites play critical roles in diseases. PPI sites (interfaces) have a small subset of residues called hot spots that contribute significantly to the binding energy, and they may form clusters called hot regions. Singlet hot spots are the single amino acid hot spots outside of the hot regions. The distribution of SAVs on the interface residues may be related to their disease association. Results We performed statistical and structural analyses of SAVs with literature curated experimental thermodynamics data, and demonstrated that SAVs which destabilize PPIs are more likely to be found in singlet hot spots rather than hot regions and energetically less important interface residues. In contrast, non-hot spot residues are significantly enriched in neutral SAVs, which do not affect PPI stability. Surprisingly, we observed that singlet hot spots tend to be enriched in disease-causing SAVs, while benign SAVs significantly occur in non-hot spot residues. Our work demonstrates that SAVs in singlet hot spot residues have significant effect on protein stability and function. Availability and implementation The dataset used in this paper is available as Supplementary Material. The data can be found at http://prism.ccbb.ku.edu.tr/data/sav/ as well. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- E Sila Ozdemir
- Department of Chemical and Biological Engineering, Koc University, Istanbul, Turkey
- Attila Gursoy
- Department of Computer Engineering, Koc University, Istanbul, Turkey; Research Center for Translational Medicine (KUTTAM), Koc University, Istanbul, Turkey
- Ozlem Keskin
- Department of Chemical and Biological Engineering, Koc University, Istanbul, Turkey; Research Center for Translational Medicine (KUTTAM), Koc University, Istanbul, Turkey

33
Functional and Structural Features of Disease-Related Protein Variants. Int J Mol Sci 2019; 20:ijms20071530. [PMID: 30934684 PMCID: PMC6479756 DOI: 10.3390/ijms20071530] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/08/2019] [Revised: 03/22/2019] [Accepted: 03/22/2019] [Indexed: 12/28/2022]
Abstract
Modern sequencing technologies provide an unprecedented amount of data of single-nucleotide variations occurring in coding regions and leading to changes in the expressed protein sequences. A significant fraction of these single-residue variations is linked to disease onset and collected in public databases. In recent years, many scientific studies have been focusing on the dissection of salient features of disease-related variations from different perspectives. In this work, we complement previous analyses by updating a dataset of disease-related variations occurring in proteins with 3D structure. Within this dataset, we describe functional and structural features that can be of interest for characterizing disease-related variations, including major chemico-physical properties, the strength of association to disease of variation types, their effect on protein stability, their location on the protein structure, and their distribution in Pfam structural/functional protein models. Our results support previous findings obtained in different data sets and introduce Pfam models as possible fingerprints of patterns of disease related single-nucleotide variations.

34
Peng Y, Alexov E, Basu S. Structural Perspective on Revealing and Altering Molecular Functions of Genetic Variants Linked with Diseases. Int J Mol Sci 2019; 20:ijms20030548. [PMID: 30696058 PMCID: PMC6386852 DOI: 10.3390/ijms20030548] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/22/2018] [Revised: 01/25/2019] [Accepted: 01/26/2019] [Indexed: 12/25/2022]
Abstract
Structural information on biological macromolecules is necessary to predict the effects of mutations, whether polymorphic or deleterious (i.e., disease causing), and thermodynamic parameters, namely folding and binding free energies, potentially serve as effective biomarkers. The effect of a mutation depends on various factors, including the type of protein (globular, membrane or intrinsically disordered protein) and the structural context in which it occurs. Such information may aid drug design. Furthermore, due to the intrinsic plasticity of proteins, even mutations involving a radical change in the structural and physico-chemical properties of the amino acids (native vs. mutant) can still have minimal effects on protein thermodynamics. However, if a mutation significantly perturbs either the folding or the binding free energy, it is quite likely to be deleterious. Mitigating such effects is a promising alternative to the traditional approaches of designing inhibitors. This can be done by structure-based in silico screening of small molecules whose binding to the dysfunctional protein restores its wild-type thermodynamics. In this review we emphasize the effects of mutations on two important biophysical properties, stability and binding affinity, and how structures can be used for structure-based drug design to mitigate the effects of disease-causing variants on these biophysical properties.
Affiliation(s)
- Yunhui Peng
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Sankar Basu
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA

35
Peng Y, Sun L, Jia Z, Li L, Alexov E. Predicting protein-DNA binding free energy change upon missense mutations using modified MM/PBSA approach: SAMPDI webserver. Bioinformatics 2018; 34:779-786. [PMID: 29091991 DOI: 10.1093/bioinformatics/btx698] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 07/14/2017] [Accepted: 10/27/2017] [Indexed: 12/28/2022]
Abstract
Motivation Protein-DNA interactions are essential for regulating many cellular processes, such as transcription, replication, recombination and translation. Amino acid mutations occurring in DNA-binding proteins have profound effects on protein-DNA binding and are linked with many diseases. Hence, accurate and fast predictions of the effects of mutations on protein-DNA binding affinity are essential for understanding disease-causing mechanisms and guiding plausible treatments. Results Here we report a new method, Single Amino acid Mutation binding free energy change of Protein-DNA Interaction (SAMPDI). The method utilizes a modified Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) approach along with an additional set of knowledge-based terms derived from investigations of the physicochemical properties of protein-DNA complexes. The method is benchmarked against experimentally determined binding free energy changes caused by 105 mutations in 13 proteins (compiled from the ProNIT database and data from recent references), and yields a correlation coefficient of 0.72. Availability and implementation http://compbio.clemson.edu/SAMPDI. Contact ealexov@clemson.edu. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yunhui Peng
- Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Lexuan Sun
- Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Zhe Jia
- Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Lin Li
- Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA

36
Peng Y, Michonova E. Long-range effect of a single mutation in spermine synthase. Journal of Theoretical & Computational Chemistry 2018. [DOI: 10.1142/s021963361850030x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Spermine synthase (SpmSyn) is an enzyme critical for maintaining the balance of spermine/spermidine in the cell. The amino acid sequence of SpmSyn is highly conserved among the species. Most of the mutations found in the human population are shown to be causing Snyder–Robinson syndrome, a severe mental disorder, while not so many are neutral. This is intriguing since SpmSyn is a relatively large protein and less than 10% of its amino acids are directly involved in the catalysis. Here, we demonstrated that a mutation (G191S) at a site far away from the active pocket affects the active site dynamics and thus the functionality of SpmSyn. This suggests that SpmSyn functionality is regulated by networks of interacting residues and thus expands the functional and structural importance beyond the amino acids directly involved in the catalysis. Comparing the calculated effects of G191S and a nine-residue deletion shown to decrease SpmSyn activity [Wu H, Min J, Zeng H, McCloskey DE, Ikeguchi Y, Loppnau P, Michael AJ, Pegg AE, Plotnikov AN, Crystal structure of human spermine synthase: Implications of substrate binding and catalytic mechanism, J Biol Chem 283:16135–16146, 2008], we predict that G191S mutation also decreases SpmSyn activity and may be causing disease.
Affiliation(s)
- Yunhui Peng
- Department of Physics and Astronomy, Clemson University, Clemson SC 29634, USA
- Ekaterina Michonova
- Department of Chemistry and Physics, Erskine College, Due West SC 29639, USA

37
Computational Approaches to Prioritize Cancer Driver Missense Mutations. Int J Mol Sci 2018; 19:ijms19072113. [PMID: 30037003 PMCID: PMC6073793 DOI: 10.3390/ijms19072113] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 05/23/2018] [Revised: 07/02/2018] [Accepted: 07/05/2018] [Indexed: 12/31/2022]
Abstract
Cancer is a complex disease that is driven by genetic alterations. There has been a rapid development of genome-wide techniques during the last decade along with a significant lowering of the cost of gene sequencing, which has generated widely available cancer genomic data. However, the interpretation of genomic data and the prediction of the association of genetic variations with cancer and disease phenotypes still require significant improvement. Missense mutations, which can render proteins non-functional and provide a selective growth advantage to cancer cells, are frequently detected in cancer. Effects caused by missense mutations can be pinpointed by in silico modeling, which makes it more feasible to find a treatment and reverse the effect. Specific human phenotypes are largely determined by stability, activity, and interactions between proteins and other biomolecules that work together to execute specific cellular functions. Therefore, analysis of missense mutations’ effects on proteins and their complexes would provide important clues for identifying functionally important missense mutations, understanding the molecular mechanisms of cancer progression and facilitating treatment and prevention. Herein, we summarize the major computational approaches and tools that provide not only the classification of missense mutations as cancer drivers or passengers but also the molecular mechanisms induced by driver mutations. This review focuses on the discussion of annotation and prediction methods based on structural and biophysical data, analysis of somatic cancer missense mutations in 3D structures of proteins and their complexes, predictions of the effects of missense mutations on protein stability, protein-protein and protein-nucleic acid interactions, and assessment of conformational changes in proteins induced by mutations.

38
Peng Y, Myers R, Zhang W, Alexov E. Computational Investigation of the Missense Mutations in DHCR7 Gene Associated with Smith-Lemli-Opitz Syndrome. Int J Mol Sci 2018; 19:E141. [PMID: 29300326 PMCID: PMC5796090 DOI: 10.3390/ijms19010141] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/16/2017] [Revised: 12/29/2017] [Accepted: 12/30/2017] [Indexed: 12/25/2022]
Abstract
Smith-Lemli-Opitz syndrome (SLOS) is a cholesterol synthesis disorder characterized by physical, mental, and behavioral symptoms. It is caused by mutations in the 7-dehydrocholesterol reductase gene (DHCR7) encoding the DHCR7 protein, which is the rate-limiting enzyme in the cholesterol synthesis pathway. Here we demonstrate that pathogenic mutations in the DHCR7 protein are located either within the transmembrane region or near the ligand-binding site, and are highly conserved among species. In contrast, non-pathogenic mutations observed in the general population are located outside the transmembrane region and have different effects on the conformational dynamics of DHCR7. Altogether, these observations suggest that the non-classified mutation R228Q is pathogenic. Our analyses indicate that pathogenic mutations may affect protein stability and dynamics and alter the binding affinity and flexibility of the binding site.
Affiliation(s)
- Yunhui Peng
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29630, USA
- Rebecca Myers
- Department of Healthcare Genetics, Clemson University, Clemson, SC 29630, USA
- Wenxing Zhang
- Department of Chemistry, Clemson University, Clemson, SC 29630, USA
- Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29630, USA

39
Steinbrecher T, Zhu C, Wang L, Abel R, Negron C, Pearlman D, Feyfant E, Duan J, Sherman W. Predicting the Effect of Amino Acid Single-Point Mutations on Protein Stability: Large-Scale Validation of MD-Based Relative Free Energy Calculations. J Mol Biol 2017; 429:948-963. [DOI: 10.1016/j.jmb.2016.12.007] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Received: 08/09/2016] [Revised: 12/02/2016] [Accepted: 12/02/2016] [Indexed: 12/22/2022]

40
Tan Z, Nie S, McDermott SP, Wicha MS, Lubman DM. Single Amino Acid Variant Profiles of Subpopulations in the MCF-7 Breast Cancer Cell Line. J Proteome Res 2017; 16:842-851. [PMID: 28076950 DOI: 10.1021/acs.jproteome.6b00824] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 01/07/2023]
Abstract
Cancers are initiated and developed from a small population of stem-like cells termed cancer stem cells (CSCs). There is heterogeneity among this CSC population that leads to multiple subpopulations with their own distinct biological features and protein expression. The protein expression and function may be impacted by amino acid variants that can occur largely due to single nucleotide changes. We have thus performed proteomic analysis of breast CSC subpopulations by mass spectrometry to study the presence of single amino acid variants (SAAVs) and their relation to breast cancer. We have used CSC markers to isolate pure breast CSC subpopulation fractions (ALDH+ and CD44+/CD24- cell populations) and the mature luminal cells (CD49f-EpCAM+) from the MCF-7 breast cancer cell line. By searching the Swiss-CanSAAVs database, 374 unique SAAVs were identified in total, where 27 are cancer-related SAAVs. 135 unique SAAVs were found in the CSC population compared with the mature luminal cells. The distribution of SAAVs detected in MCF-7 cells was compared with those predicted from the Swiss-CanSAAVs database, where we found distinct differences in the numbers of SAAVs detected relative to that expected from the Swiss-CanSAAVs database for several of the amino acids.
Affiliation(s)
- Zhijing Tan
- Department of Surgery, University of Michigan, Ann Arbor, Michigan 48109, United States
- Song Nie
- Department of Surgery, University of Michigan, Ann Arbor, Michigan 48109, United States; Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Sean P McDermott
- Department of Internal Medicine, Division of Hematology/Oncology, University of Michigan, Ann Arbor, Michigan 48109, United States; Comprehensive Cancer Center, University of Michigan, Ann Arbor, Michigan 48109, United States
- Max S Wicha
- Department of Internal Medicine, Division of Hematology/Oncology, University of Michigan, Ann Arbor, Michigan 48109, United States; Comprehensive Cancer Center, University of Michigan, Ann Arbor, Michigan 48109, United States
- David M Lubman
- Department of Surgery, University of Michigan, Ann Arbor, Michigan 48109, United States

41
McCafferty CL, Sergeev YV. In silico Mapping of Protein Unfolding Mutations for Inherited Disease. Sci Rep 2016; 6:37298. [PMID: 27905547 PMCID: PMC5131339 DOI: 10.1038/srep37298] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 05/13/2016] [Accepted: 10/27/2016] [Indexed: 01/09/2023]
Abstract
The effect of disease-causing missense mutations on protein folding is difficult to evaluate. To understand this relationship, we developed the unfolding mutation screen (UMS) for in silico evaluation of the severity of genetic perturbations at the atomic level of protein structure. The program takes into account the protein-unfolding curve and generates propensities using calculated free energy changes for every possible missense mutation at once. These results are presented in a series of unfolding heat maps and a colored protein 3D structure to show the residues critical to the protein folding and are available for quick reference. UMS was tested with 16 crystal structures to evaluate the unfolding for 1391 mutations from the ProTherm database. Our results showed that the computational accuracy of the unfolding calculations was similar to the accuracy of previously published free energy changes but provided a better scale. Our residue identity control helps to improve protein homology models. The unfolding predictions for proteins involved in age-related macular degeneration, retinitis pigmentosa, and Leber's congenital amaurosis matched well with data from previous studies. These results suggest that UMS could be a useful tool in the analysis of genotype-to-phenotype associations and next-generation sequencing data for inherited diseases.
Affiliation(s)
- Caitlyn L. McCafferty
- Ophthalmic Genetics and Visual Function Branch, National Eye Institute, NIH, Bethesda, Maryland 20892, USA
- Yuri V. Sergeev
- Ophthalmic Genetics and Visual Function Branch, National Eye Institute, NIH, Bethesda, Maryland 20892, USA

42
Martelli PL, Fariselli P, Savojardo C, Babbi G, Aggazio F, Casadio R. Large scale analysis of protein stability in OMIM disease related human protein variants. BMC Genomics 2016; 17 Suppl 2:397. [PMID: 27356511 PMCID: PMC4928156 DOI: 10.1186/s12864-016-2726-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 01/05/2023]
Abstract
Background: Modern genomic techniques make it possible to associate several Mendelian human diseases with single-residue variations in different proteins. The molecular mechanisms explaining the relationship between genotype and phenotype are still under debate. The change in protein stability upon variation appears particularly relevant for annotating whether a single-residue substitution can be associated with a given disease. Thermodynamic data for human proteins and their disease-related variants are lacking. In the present work, we take advantage of the available three-dimensional structures of human proteins to predict the effect of disease-related variations on protein stability.

Results: We develop INPS3D, a new structure-based predictor of the effect of single-residue variations on protein stability (ΔΔG), which scores at the state of the art (Pearson's correlation of 0.72 with a mean standard error of 1.15 kcal/mol on a blind test set comprising 351 variations in 60 proteins). We then select 368 OMIM disease-related proteins with known atomic-resolution structures (where the three-dimensional structure covers at least 70% of the sequence), carrying 4717 disease-related single-residue variations and 685 polymorphisms without clinical consequence. We find that disease-related variations affect protein stability more than polymorphisms do: taking |ΔΔG| = 1 kcal/mol as the threshold between perturbing and non-perturbing variations, about 44% of disease-related variations and 20% of polymorphisms are predicted with |ΔΔG| > 1 kcal/mol. A substantial fraction of OMIM disease-related variations is nevertheless predicted with |ΔΔG| ≤ 1 kcal/mol, and we focus here on detecting features that can be associated with this thermodynamic property of the protein variant.
Our analysis reveals that some 47% of disease-related variations with |ΔΔG| ≤ 1 kcal/mol are located at solvent-exposed sites of the protein structure. We also find that the fraction of variations predicted with |ΔΔG| ≤ 1 kcal/mol in a protein is partially related to the increasing number of the protein's interacting partners, corroborating the notion that disease-related, non-perturbing variations are likely to impair protein-protein interaction (70% of the disease-causing variations with high accessible surface are indeed predicted in interacting sites). The OMIM surface-accessible variations with |ΔΔG| ≤ 1 kcal/mol that are located in interaction sites account for 23% of the total, in 161 proteins. Among these, 43 proteins with some 327 disease-causing variations are involved in signalling, structural biological processes, development and differentiation.

Conclusions: We compute the effect of disease-causing variations on protein stability with INPS3D, a new state-of-the-art tool for predicting the ΔΔG value associated with single-residue substitutions in protein structures. The analysis indicates that OMIM disease-related variations have a much larger effect on protein stability than polymorphisms not associated with disease. Disease-related variations with a slight effect on protein stability (|ΔΔG| < 1 kcal/mol) frequently occur at the accessible protein surface, suggesting that they are located in protein-protein interaction patches in putative human functional networks. The hypothesis is corroborated by showing that proteins with many disease-related variations that slightly perturb protein stability are on average more connected in the human physical interactome (IntAct) than proteins with variations predicted with |ΔΔG| > 1 kcal/mol.
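The thresholding described in this abstract can be illustrated with a minimal Python sketch. It is not the INPS3D implementation: the function name `fraction_perturbing` and the ΔΔG values are made up for illustration, and only the 1 kcal/mol cutoff on |ΔΔG| comes from the abstract.

```python
def fraction_perturbing(ddg_values, threshold=1.0):
    """Return the fraction of variations with |ddG| strictly above threshold (kcal/mol)."""
    if not ddg_values:
        raise ValueError("ddg_values must not be empty")
    # A variation counts as perturbing only when |ddG| > threshold,
    # so values exactly at the threshold are non-perturbing.
    perturbing = sum(1 for ddg in ddg_values if abs(ddg) > threshold)
    return perturbing / len(ddg_values)


# Hypothetical predicted ddG values (kcal/mol); not data from the paper.
disease_ddg = [-2.3, 0.4, -1.6, 0.9, 1.2]
polymorphism_ddg = [0.2, -0.5, 1.4, -0.1, 0.3]

print(f"disease-related: {fraction_perturbing(disease_ddg):.0%} perturbing")
print(f"polymorphisms:   {fraction_perturbing(polymorphism_ddg):.0%} perturbing")
```

The sign of ΔΔG is ignored on purpose, since the abstract classifies on the absolute value and treats stabilizing and destabilizing changes alike.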
Affiliation(s)
- Pier Luigi Martelli
- Biocomputing Group, University of Bologna, Via San Giacomo 9/2, 40126, Bologna, Italy; Department BiGeA, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
- Piero Fariselli
- Biocomputing Group, University of Bologna, Via San Giacomo 9/2, 40126, Bologna, Italy; Department BCA, University of Padova, Viale Università 16, 35020, Legnaro (PD), Italy
- Castrense Savojardo
- Biocomputing Group, University of Bologna, Via San Giacomo 9/2, 40126, Bologna, Italy; Department BiGeA, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
- Giulia Babbi
- Biocomputing Group, University of Bologna, Via San Giacomo 9/2, 40126, Bologna, Italy; Department BiGeA, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
- Francesco Aggazio
- Biocomputing Group, University of Bologna, Via San Giacomo 9/2, 40126, Bologna, Italy; Department BiGeA, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
- Rita Casadio
- Biocomputing Group, University of Bologna, Via San Giacomo 9/2, 40126, Bologna, Italy; Department BiGeA, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
|
43
|
Niroula A, Vihinen M. Variation Interpretation Predictors: Principles, Types, Performance, and Choice. Hum Mutat 2016; 37:579-97. [DOI: 10.1002/humu.22987] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Received: 11/16/2015] [Accepted: 03/07/2016] [Indexed: 12/18/2022]
Affiliation(s)
- Abhishek Niroula
- Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
- Mauno Vihinen
- Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
|
44
|
Petukh M, Dai L, Alexov E. SAAMBE: Webserver to Predict the Charge of Binding Free Energy Caused by Amino Acids Mutations. Int J Mol Sci 2016; 17:547. [PMID: 27077847 PMCID: PMC4849003 DOI: 10.3390/ijms17040547] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 02/12/2016] [Revised: 04/05/2016] [Accepted: 04/07/2016] [Indexed: 12/01/2022]
Abstract
Predicting the effect of amino acid substitutions on protein–protein affinity (typically evaluated via the change in protein binding free energy) is important both for understanding the disease-causing mechanism of missense mutations and for guiding protein engineering. In addition, researchers are interested in understanding which energy components are most affected by the mutation and how the mutation affects the overall structure of the corresponding protein. Here we report a webserver, the Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) webserver, which addresses the demand for tools for predicting the change in protein binding free energy. SAAMBE is an easy-to-use webserver that requires only a coordinate file as input and offers the user various, easy-to-navigate options. The user specifies the mutation position, the wild-type residue and the type of mutation to be made. The server predicts the binding free energy change and the changes in the corresponding energy components, and provides the energy-minimized 3D structures of the wild-type and mutant proteins for download. The performance of the SAAMBE protocol was tested by benchmarking the predictions against over 1300 experimentally determined changes in binding free energy, and a Pearson correlation coefficient of 0.62 was obtained. How the predictions can be used for discriminating disease-causing from harmless mutations is discussed. The webserver can be accessed via http://compbio.clemson.edu/saambe_webserver/.
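The benchmark metric quoted in this abstract, the Pearson correlation coefficient between predicted and experimentally determined binding free energy changes, can be computed as in the following minimal sketch. The function name `pearson_r` and the sample values are assumptions made for illustration; they are not part of SAAMBE and do not reproduce its benchmark data.

```python
import math


def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equally long sequences of numbers."""
    if len(predicted) != len(experimental) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    # Sums of products of deviations from the means.
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_e)


# Hypothetical binding free energy changes (kcal/mol); not benchmark data.
predicted_ddg = [0.8, 1.9, -0.3, 2.4, 0.1]
experimental_ddg = [1.1, 1.5, 0.2, 3.0, -0.4]

print(f"Pearson r = {pearson_r(predicted_ddg, experimental_ddg):.2f}")
```

A coefficient near 1 means the predictions rank and scale with the measurements; it says nothing about systematic offset, so it is usually reported alongside an error measure.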
Affiliation(s)
- Marharyta Petukh
- Computational Biophysics and Bioinformatics, Physics Department, Clemson University, Clemson, SC 29634, USA
- Luogeng Dai
- Computational Biophysics and Bioinformatics, Physics Department, Clemson University, Clemson, SC 29634, USA
- Department of Computer Sciences, Clemson University, Clemson, SC 29634, USA
- Emil Alexov
- Computational Biophysics and Bioinformatics, Physics Department, Clemson University, Clemson, SC 29634, USA
|