Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Schaefer C, Bromberg Y, Achten D, Rost B. Disease-related mutations predicted to impact protein function. BMC Genomics 2012;13 Suppl 4:S11. [PMID: 22759649 PMCID: PMC3394413 DOI: 10.1186/1471-2164-13-s4-s11] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open

For:	Schaefer C, Bromberg Y, Achten D, Rost B. Disease-related mutations predicted to impact protein function. BMC Genomics 2012;13 Suppl 4:S11. [PMID: 22759649 PMCID: PMC3394413 DOI: 10.1186/1471-2164-13-s4-s11] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open

Number

Cited by Other Article(s)

Qiu J, Nechaev D, Rost B. Protein-protein and protein-nucleic acid binding residues important for common and rare sequence variants in human. BMC Bioinformatics 2020;21:452. [PMID: 33050876 PMCID: PMC7557062 DOI: 10.1186/s12859-020-03759-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2020] [Accepted: 09/16/2020] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Any two unrelated people differ by about 20,000 missense mutations (also referred to as SAVs: Single Amino acid Variants or missense SNV). Many SAVs have been predicted to strongly affect molecular protein function. Common SAVs (> 5% of population) were predicted to have, on average, more effect on molecular protein function than rare SAVs (< 1% of population). We hypothesized that the prevalence of effect in common over rare SAVs might partially be caused by common SAVs more often occurring at interfaces of proteins with other proteins, DNA, or RNA, thereby creating subgroup-specific phenotypes. We analyzed SAVs from 60,706 people through the lens of two prediction methods, one (SNAP2) predicting the effects of SAVs on molecular protein function, the other (ProNA2020) predicting residues in DNA-, RNA- and protein-binding interfaces.

RESULTS

Three results stood out. Firstly, SAVs predicted to occur at binding interfaces were predicted to more likely affect molecular function than those predicted as not binding (p value < 2.2 × 10^-16). Secondly, for SAVs predicted to occur at binding interfaces, common SAVs were predicted more strongly with effect on protein function than rare SAVs (p value < 2.2 × 10^-16). Restriction to SAVs with experimental annotations confirmed all results, although the resulting subsets were too small to establish statistical significance for any result. Thirdly, the fraction of SAVs predicted at binding interfaces differed significantly between tissues, e.g. urinary bladder tissue was found abundant in SAVs predicted at protein-binding interfaces, and reproductive tissues (ovary, testis, vagina, seminal vesicle and endometrium) in SAVs predicted at DNA-binding interfaces.

CONCLUSIONS

Overall, the results suggested that residues at protein-, DNA-, and RNA-binding interfaces contributed toward predicting that common SAVs more likely affect molecular function than rare SAVs.

Collapse

Pejaver V, Babbi G, Casadio R, Folkman L, Katsonis P, Kundu K, Lichtarge O, Martelli PL, Miller M, Moult J, Pal LR, Savojardo C, Yin Y, Zhou Y, Radivojac P, Bromberg Y. Assessment of methods for predicting the effects of PTEN and TPMT protein variants. Hum Mutat 2019;40:1495-1506. [PMID: 31184403 PMCID: PMC6744362 DOI: 10.1002/humu.23838] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2019] [Revised: 05/27/2019] [Accepted: 06/06/2019] [Indexed: 01/16/2023]

Affiliation(s)

Vikas Pejaver Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington The eScience Institute, University of Washington, Seattle, Washington
Giulia Babbi Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
Rita Casadio Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
Lukas Folkman School of Information and Communication Technology, Griffith University, Southport, Australia
Panagiotis Katsonis Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
Kunal Kundu Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland
Olivier Lichtarge Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, Texas Department of Pharmacology, Baylor College of Medicine, Houston, Texas Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
Pier Luigi Martelli Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
Maximilian Miller Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey
John Moult Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
Lipika R Pal Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
Castrense Savojardo Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
Yizhou Yin Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
Yaoqi Zhou Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Australia
Predrag Radivojac Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts
Yana Bromberg Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey Department of Genetics, Human Genetics Institute, Rutgers University, Piscataway, New Jersey Institute for Advanced Study at Technische Universität München (TUM-IAS), Garching/Munich, Germany

Collapse

Gorlov IP, Pikielny CW, Frost HR, Her SC, Cole MD, Strohbehn SD, Wallace-Bradley D, Kimmel M, Gorlova OY, Amos CI. Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples. BMC Bioinformatics 2018;19:430. [PMID: 30453881 PMCID: PMC6245819 DOI: 10.1186/s12859-018-2455-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2018] [Accepted: 10/31/2018] [Indexed: 12/18/2022] Open

Abstract

BACKGROUND

Because driver mutations provide selective advantage to the mutant clone, they tend to occur at a higher frequency in tumor samples compared to selectively neutral (passenger) mutations. However, mutation frequency alone is insufficient to identify cancer genes because mutability is influenced by many gene characteristics, such as size, nucleotide composition, etc. The goal of this study was to identify gene characteristics associated with the frequency of somatic mutations in the gene in tumor samples.

RESULTS

We used data on somatic mutations detected by genome wide screens from the Catalog of Somatic Mutations in Cancer (COSMIC). Gene size, nucleotide composition, expression level of the gene, relative replication time in the cell cycle, level of evolutionary conservation and other gene characteristics (totaling 11) were used as predictors of the number of somatic mutations. We applied stepwise multiple linear regression to predict the number of mutations per gene. Because missense, nonsense, and frameshift mutations are associated with different sets of gene characteristics, they were modeled separately. Gene characteristics explain 88% of the variation in the number of missense, 40% of nonsense, and 23% of frameshift mutations. Comparisons of the observed and expected numbers of mutations identified genes with a higher than expected number of mutations- positive outliers. Many of these are known driver genes. A number of novel candidate driver genes was also identified.

CONCLUSIONS

By comparing the observed and predicted number of mutations in a gene, we have identified known cancer-associated genes as well as 111 novel cancer associated genes. We also showed that adding the number of silent mutations per gene reported by genome/exome wide screens across all cancer type (COSMIC data) as a predictor substantially exceeds predicting accuracy of the most popular cancer gene predicting tool - MutsigCV.

Collapse

Pejaver V, Mooney SD, Radivojac P. Missense variant pathogenicity predictors generalize well across a range of function-specific prediction challenges. Hum Mutat 2017;38:1092-1108. [PMID: 28508593 PMCID: PMC5561458 DOI: 10.1002/humu.23258] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2016] [Revised: 03/16/2017] [Accepted: 03/26/2017] [Indexed: 11/08/2022]

Golestan Hashemi FS, Razi Ismail M, Rafii Yusop M, Golestan Hashemi MS, Nadimi Shahraki MH, Rastegari H, Miah G, Aslani F. Intelligent mining of large-scale bio-data: Bioinformatics applications. BIOTECHNOL BIOTEC EQ 2017. [DOI: 10.1080/13102818.2017.1364977] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023] Open

Common sequence variants affect molecular function more than rare variants? Sci Rep 2017;7:1608. [PMID: 28487536 PMCID: PMC5431670 DOI: 10.1038/s41598-017-01054-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2016] [Accepted: 02/28/2017] [Indexed: 12/29/2022] Open

Rost B, Radivojac P, Bromberg Y. Protein function in precision medicine: deep understanding with machine learning. FEBS Lett 2016;590:2327-41. [PMID: 27423136 PMCID: PMC5937700 DOI: 10.1002/1873-3468.12307] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2016] [Revised: 07/12/2016] [Accepted: 07/12/2016] [Indexed: 12/21/2022]

Peng Y, Alexov E. Investigating the linkage between disease-causing amino acid variants and their effect on protein stability and binding. Proteins 2016;84:232-9. [PMID: 26650512 DOI: 10.1002/prot.24968] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2015] [Accepted: 11/30/2015] [Indexed: 12/12/2022]

Mutations in the KDM5C ARID Domain and Their Plausible Association with Syndromic Claes-Jensen-Type Disease. Int J Mol Sci 2015;16:27270-87. [PMID: 26580603 PMCID: PMC4661880 DOI: 10.3390/ijms161126022] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2015] [Revised: 11/01/2015] [Accepted: 11/04/2015] [Indexed: 11/30/2022] Open

Hecht M, Bromberg Y, Rost B. Better prediction of functional effects for sequence variants. BMC Genomics 2015;16 Suppl 8:S1. [PMID: 26110438 PMCID: PMC4480835 DOI: 10.1186/1471-2164-16-s8-s1] [Citation(s) in RCA: 364] [Impact Index Per Article: 40.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open

Neutral and weakly nonneutral sequence variants may define individuality. Proc Natl Acad Sci U S A 2013;110:14255-60. [PMID: 23940345 DOI: 10.1073/pnas.1216613110] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open

Hecht M, Bromberg Y, Rost B. News from the protein mutability landscape. J Mol Biol 2013;425:3937-48. [PMID: 23896297 DOI: 10.1016/j.jmb.2013.07.028] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2013] [Revised: 07/08/2013] [Accepted: 07/19/2013] [Indexed: 12/16/2022]

Capriotti E, Altman RB, Bromberg Y. Collective judgment predicts disease-associated single nucleotide variants. BMC Genomics 2013;14 Suppl 3:S2. [PMID: 23819846 PMCID: PMC3839641 DOI: 10.1186/1471-2164-14-s3-s2] [Citation(s) in RCA: 176] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open

Abstract

Background

In recent years the number of human genetic variants deposited into the publicly available databases has been increasing exponentially. The latest version of dbSNP, for example, contains ~50 million validated Single Nucleotide Variants (SNVs). SNVs make up most of human variation and are often the primary causes of disease. The non-synonymous SNVs (nsSNVs) result in single amino acid substitutions and may affect protein function, often causing disease. Although several methods for the detection of nsSNV effects have already been developed, the consistent increase in annotated data is offering the opportunity to improve prediction accuracy.

Results

Here we present a new approach for the detection of disease-associated nsSNVs (Meta-SNP) that integrates four existing methods: PANTHER, PhD-SNP, SIFT and SNAP. We first tested the accuracy of each method using a dataset of 35,766 disease-annotated mutations from 8,667 proteins extracted from the SwissVar database. The four methods reached overall accuracies of 64%-76% with a Matthew's correlation coefficient (MCC) of 0.38-0.53. We then used the outputs of these methods to develop a machine learning based approach that discriminates between disease-associated and polymorphic variants (Meta-SNP). In testing, the combined method reached 79% overall accuracy and 0.59 MCC, ~3% higher accuracy and ~0.05 higher correlation with respect to the best-performing method. Moreover, for the hardest-to-define subset of nsSNVs, i.e. variants for which half of the predictors disagreed with the other half, Meta-SNP attained 8% higher accuracy than the best predictor.

Conclusions

Here we find that the Meta-SNP algorithm achieves better performance than the best single predictor. This result suggests that the methods used for the prediction of variant-disease associations are orthogonal, encoding different biologically relevant relationships. Careful combination of predictions from various resources is therefore a good strategy for the selection of high reliability predictions. Indeed, for the subset of nsSNVs where all predictors were in agreement (46% of all nsSNVs in the set), our method reached 87% overall accuracy and 0.73 MCC. Meta-SNP server is freely accessible at http://snps.biofold.org/meta-snp.

Collapse

Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem J 2013;449:581-94. [DOI: 10.1042/bj20121221] [Citation(s) in RCA: 131] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Bromberg Y, Capriotti E. SNP-SIG Meeting 2011: identification and annotation of SNPs in the context of structure, function, and disease. BMC Genomics 2012;13 Suppl 4:S1. [PMID: 22759647 PMCID: PMC3395891 DOI: 10.1186/1471-2164-13-s4-s1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open