1
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors. Hum Genomics 2024; 18:90. [PMID: 39198917 PMCID: PMC11360829 DOI: 10.1186/s40246-024-00663-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Accepted: 08/19/2024] [Indexed: 09/01/2024]
Abstract
BACKGROUND Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past three decades, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 190 VIPs, resulting in a total of 407 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS VIPdb version 2 summarizes 407 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. VIPdb is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- Arul S Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
  - Illumina, Foster City, CA, 94404, USA
- Steven E Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
2
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. bioRxiv 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). Results The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. Conclusions VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. Availability VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Arul S. Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Currently at: Illumina, Foster City, California 94404, USA
- Steven E. Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
3
Grasso D, Galderisi S, Santucci A, Bernini A. Pharmacological Chaperones and Protein Conformational Diseases: Approaches of Computational Structural Biology. Int J Mol Sci 2023; 24:ijms24065819. [PMID: 36982893 PMCID: PMC10054308 DOI: 10.3390/ijms24065819] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/05/2023] [Revised: 03/09/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023]
Abstract
Whenever a protein fails to fold into its native structure, a profound detrimental effect is likely to occur, and disease often develops. Protein conformational disorders arise when proteins adopt abnormal conformations due to a pathological gene variant that results in gain or loss of function or in improper localization or degradation. Pharmacological chaperones are small molecules that restore the correct folding of a protein, making them suitable for treating conformational diseases. Such small molecules bind poorly folded proteins similarly to physiological chaperones, bridging non-covalent interactions (hydrogen bonds, electrostatic interactions, and van der Waals contacts) loosened or lost due to mutations. Pharmacological chaperone development involves, among other things, structural biology investigation of the target protein and its misfolding and refolding. Such research can take advantage of computational methods at many stages. Here, we present an up-to-date review of the computational structural biology tools and approaches regarding protein stability evaluation, binding pocket discovery and druggability, drug repurposing, and virtual ligand screening. The tools are organized into an ideal workflow oriented toward the rational design of pharmacological chaperones, with the treatment of rare diseases also in mind.
Affiliation(s)
- Daniela Grasso
  - Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
- Silvia Galderisi
  - Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
- Annalisa Santucci
  - Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
- Andrea Bernini
  - Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
4
Scafuri B, Verdino A, D'Arminio N, Marabotti A. Computational methods to assist in the discovery of pharmacological chaperones for rare diseases. Brief Bioinform 2022; 23:6590149. [PMID: 35595532 DOI: 10.1093/bib/bbac198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/22/2022] [Revised: 04/13/2022] [Accepted: 04/28/2022] [Indexed: 12/21/2022]
Abstract
Pharmacological chaperones are chemical compounds able to bind proteins and stabilize them against denaturation and subsequent degradation. Some pharmacological chaperones have been approved, or are under investigation, for the treatment of rare inborn errors of metabolism caused by genetic mutations that can often destabilize the structure of the wild-type proteins expressed by that gene. Given the general lack of pharmacological treatments for rare diseases, expectations for this type of compound are high. However, their discovery is not straightforward. In this review, we focus on the computational methods that can assist and accelerate the search for these compounds, and also show examples in which these methods were successfully applied to discover promising molecules belonging to this new category of pharmacologically active compounds.
Affiliation(s)
- Bernardina Scafuri
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Anna Verdino
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Nancy D'Arminio
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Anna Marabotti
  - Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
5
McGreig JE, Uri H, Antczak M, Sternberg MJE, Michaelis M, Wass MN. 3DLigandSite: structure-based prediction of protein-ligand binding sites. Nucleic Acids Res 2022; 50:W13-W20. [PMID: 35412635 PMCID: PMC9252821 DOI: 10.1093/nar/gkac250] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 02/09/2022] [Revised: 03/13/2022] [Accepted: 04/03/2022] [Indexed: 01/13/2023]
Abstract
3DLigandSite is a web tool for the prediction of ligand-binding sites in proteins. Here, we report a significant update since the first release of 3DLigandSite in 2010. The overall methodology remains the same, with candidate binding sites in proteins inferred using known binding sites in related protein structures as templates. However, the initial structural modelling step now uses the newly available structures from the AlphaFold database or alternatively Phyre2 when AlphaFold structures are not available. Further, a sequence-based search using HHSearch has been introduced to identify template structures with bound ligands that are used to infer the ligand-binding residues in the query protein. Finally, we introduced a machine learning element as the final prediction step, which improves the accuracy of predictions and provides a confidence score for each residue predicted to be part of a binding site. Validation of 3DLigandSite on a set of 6416 binding sites obtained 92% recall at 75% precision for non-metal binding sites and 52% recall at 75% precision for metal binding sites. 3DLigandSite is available at https://www.wass-michaelislab.org/3dligandsite. Users submit either a protein sequence or structure. Results are displayed in multiple formats including an interactive Mol* molecular visualization of the protein and the predicted binding sites.
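The validation figures quoted above (recall at a fixed precision) can be made concrete with a small sketch. The function name and the residue numbers below are hypothetical and are not taken from 3DLigandSite; the code only illustrates how precision and recall are computed for a set of predicted binding-site residues.

```python
def precision_recall(predicted: set[int], actual: set[int]) -> tuple[float, float]:
    """Return (precision, recall) for predicted vs. true binding-site residues.

    precision = correctly predicted residues / all predicted residues
    recall    = correctly predicted residues / all true binding residues
    Empty inputs yield 0.0 rather than raising a division error.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical residue indices, for illustration only.
predicted_site = {10, 11, 12, 45}
true_site = {10, 11, 12, 13}
precision, recall = precision_recall(predicted_site, true_site)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Reporting recall at a fixed precision, as the abstract does, means choosing the score threshold at which precision reaches the stated value and then reading off the recall obtained at that threshold.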
Affiliation(s)
- Jake E McGreig
  - School of Biosciences, Division of Natural Sciences, University of Kent, Canterbury, Kent CT2 7NJ, UK
- Hannah Uri
  - School of Biosciences, Division of Natural Sciences, University of Kent, Canterbury, Kent CT2 7NJ, UK
- Magdalena Antczak
  - School of Biosciences, Division of Natural Sciences, University of Kent, Canterbury, Kent CT2 7NJ, UK
- Michael J E Sternberg
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
- Martin Michaelis
  - School of Biosciences, Division of Natural Sciences, University of Kent, Canterbury, Kent CT2 7NJ, UK
- Mark N Wass
  - School of Biosciences, Division of Natural Sciences, University of Kent, Canterbury, Kent CT2 7NJ, UK
6
Pereira GRC, Tavares GDB, de Freitas MC, De Mesquita JF. In silico analysis of the tryptophan hydroxylase 2 (TPH2) protein variants related to psychiatric disorders. PLoS One 2020; 15:e0229730. [PMID: 32119710 PMCID: PMC7051086 DOI: 10.1371/journal.pone.0229730] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/20/2019] [Accepted: 02/12/2020] [Indexed: 11/19/2022]
Abstract
The tryptophan hydroxylase 2 (TPH2) enzyme catalyzes the first step of serotonin biosynthesis. Serotonin is known for its role in several homeostatic systems related to sleep, mood, and food intake. As the reaction catalyzed by TPH2 is the rate-limiting step of serotonin biosynthesis, mutations in TPH2 have been associated with several psychiatric disorders (PD). This work undertakes an in silico analysis of the effects of genetic mutations in the human TPH2 protein. Ten algorithms were used to predict the functional and stability effects of the TPH2 mutations. ConSurf was used to estimate the evolutionary conservation of TPH2 amino acids. GROMACS was used to perform molecular dynamics (MD) simulations of TPH2 WT and of the variants P206S, R303W, and R441H, which had already been associated with the development of PD. Forty-six TPH2 variants were compiled from the literature. Among the analyzed variants, those occurring at the catalytic domain were shown to be more damaging to protein structure and function. The ConSurf analysis indicated that the positions affected by mutations in the catalytic domain were also more conserved throughout evolution. The variants S364K and S383F were predicted to be deleterious by all the functional algorithms used and occurred at conserved positions, suggesting that they might be deleterious. The MD analyses indicate that the mutations P206S, R303W, and R441H affect TPH2 flexibility and essential mobility at the catalytic and oligomerization domains. The variants P206S, R303W, and R441H also exhibited alterations in dimer binding affinity and stability throughout the simulations. Thus, these mutations may impair TPH2 functional interactions and, consequently, its function, leading to the development of PD. Furthermore, we developed a database, SNPMOL (http://www.snpmol.org/), containing the results presented in this paper.
Understanding the effects of TPH2 mutations on protein structure and function may lead to improvements in existing treatments for PD and facilitate the design of further experiments.
Affiliation(s)
- Gabriel Rodrigues Coutinho Pereira
  - Bioinformatics and Computational Biology Laboratory, Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Gustavo Duarte Bocayuva Tavares
  - Bioinformatics and Computational Biology Laboratory, Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Marta Costa de Freitas
  - Bioinformatics and Computational Biology Laboratory, Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Joelma Freire De Mesquita
  - Bioinformatics and Computational Biology Laboratory, Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
7
Galano-Frutos JJ, García-Cebollada H, Sancho J. Molecular dynamics simulations for genetic interpretation in protein coding regions: where we are, where to go and when. Brief Bioinform 2019; 22:3-19. [PMID: 31813950 DOI: 10.1093/bib/bbz146] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/23/2019] [Revised: 09/22/2019] [Accepted: 10/25/2019] [Indexed: 12/18/2022]
Abstract
The increasing ease with which massive genetic information can be obtained from patients or healthy individuals has stimulated the development of interpretive bioinformatics tools as aids in clinical practice. Most such tools analyze evolutionary information and simple physical-chemical properties to predict whether replacement of one amino acid residue with another will be tolerated or cause disease. Those approaches achieve up to 80-85% accuracy as binary classifiers (neutral/pathogenic). Because such accuracy is insufficient as a basis for medical decisions, and does not appear to be increasing, more precise methods, such as full-atom molecular dynamics (MD) simulations in explicit solvent, are also discussed. Then, to describe the goal of interpreting human genetic variations at large scale through MD simulations, we restrictively refer to all possible protein variants carrying single-amino-acid substitutions arising from single-nucleotide variations as the human variome. We calculate its size and develop a simple model for calculating the simulation time needed to have a 0.99 probability of observing unfolding events of any unstable variant. Knowing that time enables a binary classification of the variants (stable-potentially neutral/unstable-pathogenic). Our model indicates that the human variome cannot be simulated with present computing capabilities. However, if they continue to increase as per Moore's law, it could be simulated (at 65°C) in only 3 years if we started in 2031. The simulation of individual protein variomes is achievable in short times starting at present. International coordination seems appropriate to embark upon massive MD simulations of protein variants.
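The abstract does not give the model itself. As a rough illustration only, if unfolding is assumed to be a first-order process with rate constant k (an assumption made here, not a statement of the authors' model), the probability of seeing at least one unfolding event within simulation time t is 1 - exp(-k t), and the time needed to reach a target probability follows directly. The function name and the example rate below are hypothetical.

```python
import math


def time_to_observe(probability: float, rate: float) -> float:
    """Simulation time t at which P(unfolding observed by t) equals `probability`.

    Assumes exponentially distributed waiting times (first-order kinetics):
        probability = 1 - exp(-rate * t)  =>  t = -ln(1 - probability) / rate
    `rate` is in events per unit time; the result is in the same time unit.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    return -math.log(1.0 - probability) / rate


# Hypothetical unfolding rate of 0.01 events per nanosecond, for illustration only.
required_ns = time_to_observe(0.99, rate=0.01)
print(f"simulate for about {required_ns:.0f} ns")
```

Under this assumption, a 0.99 probability corresponds to roughly 4.6 mean unfolding times, whatever the rate.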
Affiliation(s)
- Juan J Galano-Frutos
  - Protein Folding and Molecular Design (ProtMol) group at BIFI, University of Zaragoza
- Javier Sancho
  - Protein Folding and Molecular Design (ProtMol) group at BIFI, University of Zaragoza
8
Hu Z, Yu C, Furutsuki M, Andreoletti G, Ly M, Hoskins R, Adhikari AN, Brenner SE. VIPdb, a genetic Variant Impact Predictor Database. Hum Mutat 2019; 40:1202-1214. [PMID: 31283070 PMCID: PMC7288905 DOI: 10.1002/humu.23858] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 06/17/2019] [Accepted: 06/27/2019] [Indexed: 12/30/2022]
Abstract
Genome sequencing identifies a vast number of genetic variants. Predicting these variants' molecular and clinical effects is one of the preeminent challenges in human genetics. Accurate prediction of the impact of genetic variants improves our understanding of how genetic information is conveyed to molecular and cellular functions, and is an essential step towards precision medicine. Over one hundred tools/resources have been developed specifically for this purpose. We summarize these tools, as well as their characteristics, in the genetic Variant Impact Predictor Database (VIPdb). This database will help researchers and clinicians explore appropriate tools, and inform the development of improved methods. VIPdb can be browsed and downloaded at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Changhua Yu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Department of Bioengineering, University of California, Berkeley, California 94720, USA
- Mabel Furutsuki
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720, USA
- Gaia Andreoletti
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Melissa Ly
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Division of Data Sciences, University of California, Berkeley, California 94720, USA
- Roger Hoskins
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Aashish N. Adhikari
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Steven E. Brenner
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
9
De Oliveira CCS, Pereira GRC, De Alcantara JYS, Antunes D, Caffarena ER, De Mesquita JF. In silico analysis of the V66M variant of human BDNF in psychiatric disorders: An approach to precision medicine. PLoS One 2019; 14:e0215508. [PMID: 30998730 PMCID: PMC6472887 DOI: 10.1371/journal.pone.0215508] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/17/2018] [Accepted: 04/04/2019] [Indexed: 11/19/2022]
Abstract
Brain-derived neurotrophic factor (BDNF) plays an important role in neurogenesis and synapse formation. V66M is the most prevalent BDNF mutation in humans and impairs the function and distribution of BDNF. This mutation is related to several psychiatric disorders. The pro-region of BDNF, particularly position 66 and its adjacent residues, is a determinant of the intracellular sorting and activity-dependent secretion of BDNF. However, it has not yet been fully elucidated. The present study aims to analyze the effects of the V66M mutation on BDNF structure and function. Here, we applied nine algorithms, including SIFT and PolyPhen-2, for functional and stability prediction of the V66M mutation. The complete theoretical model of BDNF was generated by Rosetta and validated by the PROCHECK, RAMPAGE, ProSa, QMEAN and Verify-3D algorithms. Structural alignment was performed using TM-align. Phylogenetic analysis was performed using the ConSurf server. Molecular dynamics (MD) simulations were performed and analyzed using the GROMACS 2018.2 package. The V66M mutation was predicted as deleterious by PolyPhen-2 and SIFT, in addition to being predicted as destabilizing by I-Mutant. According to SNPeffect, the V66M mutation does not affect protein aggregation, amyloid propensity, or chaperone binding. The complete theoretical structure of BDNF proved to be a reliable model. Phylogenetic analysis indicated that the V66M mutation of BDNF occurs at a non-conserved position of the protein. MD analyses indicated that the V66M mutation does not affect BDNF flexibility or surface-to-volume ratio, but does affect BDNF essential motions, hydrogen bonding and secondary structure, particularly at its pre- and pro-domains, which are crucial for its activity and distribution.
Thus, considering that these parameters are determinants of protein interactions and, consequently, protein function, the alterations observed throughout the MD analyses may be related to the functional impairment of BDNF upon V66M mutation, as well as its involvement in psychiatric disorders.
Affiliation(s)
- Clara Carolina Silva De Oliveira
  - Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Gabriel Rodrigues Coutinho Pereira
  - Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Jamile Yvis Santos De Alcantara
  - Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Deborah Antunes
  - Computational Biophysics and Molecular Modeling Group, Scientific Computing Program (PROCC), Fundação Oswaldo Cruz, Manguinhos, Rio de Janeiro, Brazil
- Ernesto Raul Caffarena
  - Computational Biophysics and Molecular Modeling Group, Scientific Computing Program (PROCC), Fundação Oswaldo Cruz, Manguinhos, Rio de Janeiro, Brazil
- Joelma Freire De Mesquita
  - Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
10
Hassan MS, Shaalan AA, Dessouky MI, Abdelnaiem AE, ElHefnawi M. A review study: Computational techniques for expecting the impact of non-synonymous single nucleotide variants in human diseases. Gene 2018; 680:20-33. [PMID: 30240882 DOI: 10.1016/j.gene.2018.09.028] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 08/17/2018] [Accepted: 09/14/2018] [Indexed: 01/18/2023]
Abstract
Non-Synonymous Single-Nucleotide Variants (nsSNVs) and mutations can have diverse effects on proteins, changing genotype and phenotype and disrupting protein stability. Alterations in protein stability may cause diseases such as cancer. Discovering nsSNVs and mutations can be useful for diagnosing disease at an early stage. Many studies have introduced individual and consensus prediction tools based on different Machine Learning Techniques (MLTs) and diverse datasets. We therefore present a comprehensive review of the most popular and recent individual tools that predict pathogenic variations, and of the meta-tools that merge some of them to enhance their predictive power. We also survey the state-of-the-art computational techniques and methods for predicting the effects of both coding and noncoding variants. We then present the protein stability predictors. We give details of the most common benchmark database for variations, including the main predictive features used by the different methods. Finally, we address the most common fundamental criteria for assessing the performance of predictive tools. This review is aimed at bioinformaticians interested in the characterization of regulatory variants; at geneticists and molecular biologists interested in understanding more about the nature and functional role of such variants; and at clinicians who may hope to learn about human variants associated with a specific disease and to find out what to do next to uncover how they affect the underlying mechanisms.
Affiliation(s)
- Marwa S Hassan
  - Systems and Information Department and Biomedical Informatics Group, Engineering Research Division, National Research Center, Giza, Egypt; Patent Office of Scientific Research Academy, Egypt
- A A Shaalan
  - Electronics and Communication Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt
- M I Dessouky
  - Electronics and Electrical Communications Department, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt
- Abdelaziz E Abdelnaiem
  - Electronics and Communication Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt
- Mahmoud ElHefnawi
  - Systems and Information Department and Biomedical Informatics Group, Engineering Research Division, National Research Center, Giza, Egypt; Center for Informatics, Nile University, Giza, Egypt
11
Xie H, Zeng D, Chen X, Huo D, Liu L, Zhang D, Jin Q, Ke K, Hu M. Prediction on the risk population of idiosyncratic adverse reactions based on molecular docking with mutant proteins. Oncotarget 2017; 8:95568-95576. [PMID: 29221149 PMCID: PMC5707043 DOI: 10.18632/oncotarget.21509] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/06/2017] [Accepted: 09/20/2017] [Indexed: 01/11/2023]
Abstract
Idiosyncratic adverse drug reactions are drug reactions that occur rarely and unpredictably among the population. These reactions often occur after a drug is marketed, which means that they are strongly related to the genotype of the population. The prediction of such adverse reactions is a major challenge because of the lack of appropriate test models during the drug development process. In this study, we chose withdrawn drugs because the reasons why they were withdrawn, and the countries or regions from which they were withdrawn, are easily obtained. We selected Dilevalol and its chiral drug (Labetalol) as the investigatory drugs, as they have been withdrawn from a European market (Britain) because of serious hepatotoxicity. First, we searched the Comparative Toxicogenomics Database (CTD) and obtained the protein related to Dilevalol-induced liver injury, multidrug resistance protein 1 (MDR1). Then, we searched the dbSNP database and extracted 477 non-synonymous single nucleotide polymorphisms (nsSNPs) on MDR1. Second, we used the VarMod tool to predict the functional changes of MDR1 induced by these nsSNPs, from which we extracted the nsSNPs that significantly change the functions of this protein. Third, we built the three-dimensional structures of those variant proteins and used AutoDock to perform a docking study, choosing the best model to determine the sites of nsSNPs. Finally, we used the data from the 1000 Genomes Project to verify the dominant population distribution of the risk SNP. We applied the same strategy to the post-marketing drug-induced liver injury drugs to further test the feasibility of our method.
Affiliation(s)
- Hongbo Xie
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Diheng Zeng
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Xiujie Chen
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Diwei Huo
- The 2nd Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Lei Liu
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Denan Zhang
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Qing Jin
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Kehui Ke
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| | - Ming Hu
- Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, PR China
| |
|
12
|
Martell HJ, Wong KA, Martin JF, Kassam Z, Thomas K, Wass MN. Associating mutations causing cystinuria with disease severity with the aim of providing precision medicine. BMC Genomics 2017; 18:550. [PMID: 28812535 PMCID: PMC5558187 DOI: 10.1186/s12864-017-3913-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/10/2022] Open
Abstract
Background Cystinuria is an inherited disease that results in the formation of cystine stones in the kidney, which can have serious health complications. Two genes (SLC7A9 and SLC3A1) that form an amino acid transporter are known to be responsible for the disease. Variants that cause the disease disrupt amino acid transport across the cell membrane, leading to the build-up of relatively insoluble cystine, resulting in formation of stones. Assessing the effects of each mutation is critical in order to provide tailored treatment options for patients. We used various computational methods to assess the effects of cystinuria associated mutations, utilising information on protein function, evolutionary conservation and natural population variation of the two genes. We also analysed the ability of some methods to predict the phenotypes of individuals with cystinuria, based on their genotypes, and compared this to clinical data. Results Using a literature search, we collated a set of 94 SLC3A1 and 58 SLC7A9 point mutations known to be associated with cystinuria. There are differences in sequence location, evolutionary conservation, allele frequency, and predicted effect on protein function between these mutations and other genetic variants of the same genes that occur in a large population. Structural analysis considered how these mutations might lead to cystinuria. For SLC7A9, many mutations swap hydrophobic amino acids for charged amino acids or vice versa, while others affect known functional sites. For SLC3A1, functional information is currently insufficient to make confident predictions but mutations often result in the loss of hydrogen bonds and largely appear to affect protein stability. Finally, we showed that computational predictions of mutation severity were significantly correlated with the disease phenotypes of patients from a clinical study, despite different methods disagreeing for some of their predictions. 
Conclusions The results of this study are promising and highlight the areas of research which must now be pursued to better understand how mutations in SLC3A1 and SLC7A9 cause cystinuria. The application of our approach to a larger data set is essential, but we have shown that computational methods could play an important role in designing more effective personalised treatment options for patients with cystinuria.
Affiliation(s)
- Henry J Martell
- School of Biosciences, University of Kent, Canterbury, Kent, CT2 7NJ, UK
| | - Kathie A Wong
- Urology Centre, Guy's and St. Thomas' NHS Foundation Trust, London, SE1 9RT, UK
| | - Juan F Martin
- School of Biosciences, University of Kent, Canterbury, Kent, CT2 7NJ, UK
| | - Ziyan Kassam
- Urology Centre, Guy's and St. Thomas' NHS Foundation Trust, London, SE1 9RT, UK
| | - Kay Thomas
- Urology Centre, Guy's and St. Thomas' NHS Foundation Trust, London, SE1 9RT, UK
| | - Mark N Wass
- School of Biosciences, University of Kent, Canterbury, Kent, CT2 7NJ, UK
| |
|
13
|
Ferradini V, Cassone M, Nuovo S, Bagni I, D'Apice MR, Botta A, Novelli G, Sangiuolo F. Targeted Next Generation Sequencing in patients with Myotonia Congenita. Clin Chim Acta 2017; 470:1-7. [PMID: 28427807 DOI: 10.1016/j.cca.2017.04.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 12/20/2016] [Revised: 04/13/2017] [Accepted: 04/15/2017] [Indexed: 12/28/2022]
Abstract
INTRODUCTION Myotonia Congenita (MC) is a nondystrophic skeletal muscle disease characterized by muscle stiffness, weakness, delayed skeletal relaxation and hypertrophic muscle. The disease can be inherited as dominant or recessive. More than 130 mutations in the CLCN1 gene have been identified. MATERIALS AND METHODS We analyzed the entire coding region and exon-intron boundaries of the CLCN1 gene in 40 MC patients. Samples already Sanger-sequenced were subsequently evaluated by Next Generation Sequencing (NGS) on Ion Torrent PGM. Moreover, an additional 15 patients were sequenced directly by NGS. RESULTS NGS allowed us to identify all CLCN1 mutations except those located within exon 3, demonstrating a sensitivity of 96%. Due to primer design, one SNP (specifically rs7794560) also failed to be detected. Our results enlarge the spectrum of CLCN1 mutations and show a novel approach for molecular analysis of MC.
Affiliation(s)
- Valentina Ferradini
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| | - Marco Cassone
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| | - Sara Nuovo
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| | - Ilaria Bagni
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| | - Maria Rosaria D'Apice
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| | - Annalisa Botta
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| | - Giuseppe Novelli
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| | - Federica Sangiuolo
- Dept of Biomedicine and Prevention, Tor Vergata University, via Montpellier, 1, 00133 Rome, Italy
| |
|
14
|
Johnston SB, Raines RT. PTENpred: A Designer Protein Impact Predictor for PTEN-related Disorders. J Comput Biol 2016; 23:969-975. [PMID: 27310656 DOI: 10.1089/cmb.2016.0058] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022] Open
Abstract
Connecting a genotype with a phenotype can provide immediate advantages in the context of modern medicine. Especially useful would be an algorithm for predicting the impact of nonsynonymous single-nucleotide polymorphisms in the gene for PTEN, a protein that is implicated in most human cancers and connected to germline disorders that include autism. We have developed a protein impact predictor, PTENpred, that integrates data from multiple analyses using a support vector machine algorithm. PTENpred can predict phenotypes related to a human PTEN mutation with high accuracy. The output of PTENpred is designed for use by biologists, clinicians, and laymen, and features an interactive display of the three-dimensional structure of PTEN. Using knowledge about the structure of proteins, in general, and the PTEN protein, in particular, enables the prediction of consequences from damage to the human PTEN gene. This algorithm, which can be accessed online, could facilitate the implementation of effective therapeutic regimens for cancer and other diseases.
Affiliation(s)
- Sean B Johnston
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin
| | - Ronald T Raines
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin
| |
|
15
|
Lu HC, Herrera Braga J, Fraternali F. PinSnps: structural and functional analysis of SNPs in the context of protein interaction networks. Bioinformatics 2016; 32:2534-6. [PMID: 27153707 PMCID: PMC4978923 DOI: 10.1093/bioinformatics/btw153] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/30/2015] [Accepted: 03/15/2016] [Indexed: 12/24/2022]
Abstract
Summary: We present a practical computational pipeline to readily perform data analyses of protein–protein interaction networks by using genetic and functional information mapped onto protein structures. We provide a 3D representation of the available protein structure and its regions (surface, interface, core and disordered) for the selected genetic variants and/or SNPs, and a prediction of the mutants’ impact on the protein as measured by a range of methods. We have mapped in total 2587 genetic disorder-related SNPs from OMIM, 587 873 cancer-related variants from COSMIC, and 1 484 045 SNPs from dbSNP. All result data can be downloaded by the user together with an R-script to compute the enrichment of SNPs/variants in selected structural regions. Availability and Implementation: PinSnps is available as an open-access service at http://fraternalilab.kcl.ac.uk/PinSnps/ Contact: franca.fraternali@kcl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hui-Chun Lu
- Randall Division of Cell and Molecular Biophysics, King's College London, London SE1 1UL, UK
| | - Julián Herrera Braga
- Randall Division of Cell and Molecular Biophysics, King's College London, London SE1 1UL, UK
| | - Franca Fraternali
- Randall Division of Cell and Molecular Biophysics, King's College London, London SE1 1UL, UK
| |
|
16
|
Cardoso JGR, Andersen MR, Herrgård MJ, Sonnenschein N. Analysis of genetic variation and potential applications in genome-scale metabolic modeling. Front Bioeng Biotechnol 2015; 3:13. [PMID: 25763369 PMCID: PMC4329917 DOI: 10.3389/fbioe.2015.00013] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 11/25/2014] [Accepted: 01/22/2015] [Indexed: 11/13/2022] Open
Abstract
Genetic variation is the motor of evolution and allows organisms to overcome the environmental challenges they encounter. It can be both beneficial and harmful in the process of engineering cell factories for the production of proteins and chemicals. Throughout the history of biotechnology, there have been efforts to exploit genetic variation in our favor to create strains with favorable phenotypes. Genetic variation can either be present in natural populations or it can be artificially created by mutagenesis and selection or adaptive laboratory evolution. On the other hand, unintended genetic variation during a long term production process may lead to significant economic losses and it is important to understand how to control this type of variation. With the emergence of next-generation sequencing technologies, genetic variation in microbial strains can now be determined on an unprecedented scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function, and discuss approaches for interfacing existing bioinformatics approaches with genome-scale models of cellular processes in order to predict effects of sequence variation on cellular phenotypes.
Affiliation(s)
- João G. R. Cardoso
- The Novo Nordisk Foundation Center of Biosustainability, Technical University of Denmark, Hørsholm, Denmark
| | | | - Markus J. Herrgård
- The Novo Nordisk Foundation Center of Biosustainability, Technical University of Denmark, Hørsholm, Denmark
| | - Nikolaus Sonnenschein
- The Novo Nordisk Foundation Center of Biosustainability, Technical University of Denmark, Hørsholm, Denmark
| |
|
17
|
Katsonis P, Koire A, Wilson SJ, Hsu TK, Lua RC, Wilkins AD, Lichtarge O. Single nucleotide variations: biological impact and theoretical interpretation. Protein Sci 2014; 23:1650-66. [PMID: 25234433 PMCID: PMC4253807 DOI: 10.1002/pro.2552] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Received: 08/06/2014] [Revised: 09/12/2014] [Accepted: 09/15/2014] [Indexed: 12/27/2022]
Abstract
Genome-wide association studies (GWAS) and whole-exome sequencing (WES) generate massive amounts of genomic variant information, and a major challenge is to identify which variations drive disease or contribute to phenotypic traits. Because the majority of known disease-causing mutations are exonic non-synonymous single nucleotide variations (nsSNVs), most studies focus on whether these nsSNVs affect protein function. Computational studies show that the impact of nsSNVs on protein function reflects sequence homology and structural information and predict the impact through statistical methods, machine learning techniques, or models of protein evolution. Here, we review impact prediction methods and discuss their underlying principles, their advantages and limitations, and how they compare to and complement one another. Finally, we present current applications and future directions for these methods in biological research and medical genetics.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Amanda Koire
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
| | - Stephen Joseph Wilson
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
| | - Teng-Kuei Hsu
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
| | - Rhonald C Lua
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Angela Dawn Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
- Department of Pharmacology, Baylor College of Medicine, Houston, Texas
| |
|
18
|
Kumar A, Bhandari A, Goswami C. Surveying genetic variants and molecular phylogeny of cerebral cavernous malformation gene, CCM3/PDCD10. Biochem Biophys Res Commun 2014; 455:98-106. [DOI: 10.1016/j.bbrc.2014.10.105] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 10/15/2014] [Accepted: 10/21/2014] [Indexed: 11/29/2022]
|