1
Sun Y, Men W, Kennerknecht I, Fang W, Zheng HF, Zhang W, Rao Y. Human genetics of face recognition: discovery of MCTP2 mutations in humans with face blindness (congenital prosopagnosia). Genetics 2024; 227:iyae047. [PMID: 38547502 DOI: 10.1093/genetics/iyae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2023] [Accepted: 03/19/2024] [Indexed: 06/06/2024] Open
Abstract
Face recognition is important for both visual and social cognition. While prosopagnosia or face blindness has been known for seven decades and face-specific neurons for half a century, the molecular genetic mechanism is not clear. Here we report results after 17 years of research with classic genetics and modern genomics. From a large family with 18 congenital prosopagnosia (CP) members with obvious difficulties in face recognition in daily life, we uncovered a fully cosegregating private mutation in the MCTP2 gene which encodes a calcium binding transmembrane protein expressed in the brain. After screening through cohorts of 6589, we found more CPs and their families, allowing detection of more CP associated mutations in MCTP2. Face recognition differences were detected between 14 carriers with the frameshift mutation S80fs in MCTP2 and 19 noncarrying volunteers. Six families including one with 10 members showed the S80fs-CP correlation. Functional magnetic resonance imaging found association of impaired recognition of individual faces by MCTP2 mutant CPs with reduced repetition suppression to repeated facial identities in the right fusiform face area. Our results have revealed genetic predisposition of MCTP2 mutations in CP, 76 years after the initial report of prosopagnosia and 47 years after the report of the first CP. This is the first time a gene required for a higher form of visual social cognition was found in humans.
Affiliation(s)
- Yun Sun
  - Chinese Institutes for Medical Research, Capital Medical University, Beijing 100069, China
  - Chinese Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, PKU-IDG/McGovern Institute for Brain Research, School of Life Sciences, Peking University, Beijing 100871, China
  - Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen 518107, China
- Weiwei Men
  - Center for MRI Research, Academy for Advanced Interdisciplinary Studies, Beijing Key Lab for Medical Physics and Engineering, Institute of Heavy Ion Physics, School of Physics, Peking University, Beijing 100871, China
- Ingo Kennerknecht
  - Institute of Human Genetics, Westfälische Wilhelms-Universität, Münster 48149, Germany
- Wan Fang
  - Chinese Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, PKU-IDG/McGovern Institute for Brain Research, School of Life Sciences, Peking University, Beijing 100871, China
- Hou-Feng Zheng
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Wenxia Zhang
  - Chinese Institutes for Medical Research, Capital Medical University, Beijing 100069, China
  - Chinese Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, PKU-IDG/McGovern Institute for Brain Research, School of Life Sciences, Peking University, Beijing 100871, China
- Yi Rao
  - Chinese Institutes for Medical Research, Capital Medical University, Beijing 100069, China
  - Chinese Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, PKU-IDG/McGovern Institute for Brain Research, School of Life Sciences, Peking University, Beijing 100871, China
  - Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen 518107, China
2
Tahara S, Tsuchiya T, Matsumoto H, Ozaki H. Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans. BMC Genomics 2023; 24:597. [PMID: 37805453 PMCID: PMC10560430 DOI: 10.1186/s12864-023-09692-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 09/21/2023] [Indexed: 10/09/2023] Open
Abstract
BACKGROUND Transcription factors (TFs) exhibit heterogeneous DNA-binding specificities in individual cells and whole organisms under natural conditions, and de novo motif discovery usually provides multiple motifs, even from a single chromatin immunoprecipitation-sequencing (ChIP-seq) sample. Despite the accumulation of ChIP-seq data and ChIP-seq-derived motifs, the diversity of DNA-binding specificities across different TFs and cell types remains largely unexplored. RESULTS Here, we applied MOCCS2, our k-mer-based motif discovery method, to a collection of human TF ChIP-seq samples across diverse TFs and cell types, and systematically computed profiles of TF-binding specificity scores for all k-mers. After quality control, we compiled a set of TF-binding specificity score profiles for 2,976 high-quality ChIP-seq samples, comprising 473 TFs and 398 cell types. Using these high-quality samples, we confirmed that the k-mer-based TF-binding specificity profiles reflected TF- or TF-family dependent DNA-binding specificities. We then compared the binding specificity scores of ChIP-seq samples with the same TFs but with different cell type classes and found that half of the analyzed TFs exhibited differences in DNA-binding specificities across cell type classes. Additionally, we devised a method to detect differentially bound k-mers between two ChIP-seq samples and detected k-mers exhibiting statistically significant differences in binding specificity scores. Moreover, we demonstrated that differences in the binding specificity scores between k-mers on the reference and alternative alleles could be used to predict the effect of variants on TF binding, as validated by in vitro and in vivo assay datasets. Finally, we demonstrated that binding specificity score differences can be used to interpret disease-associated non-coding single-nucleotide polymorphisms (SNPs) as TF-affecting SNPs and provide candidates responsible for TFs and cell types. 
CONCLUSIONS Our study provides a basis for investigating the regulation of gene expression in a TF-, TF family-, or cell-type-dependent manner. Furthermore, our differential analysis of binding-specificity scores highlights noncoding disease-associated variants in humans.
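As a rough illustration of the allele-comparison idea in the abstract above, the sketch below sums per-k-mer binding specificity scores over every k-mer that overlaps a SNP, once for the reference allele and once for the alternative allele, and reports the difference. The score table, the function names, and the summation rule are assumptions made for illustration only; they are not taken from MOCCS2 or from the paper.

```python
def kmers_overlapping(seq, pos, k):
    """Return every k-mer of seq that covers the 0-based index pos."""
    start_min = max(0, pos - k + 1)
    start_max = min(pos, len(seq) - k)
    return [seq[s:s + k] for s in range(start_min, start_max + 1)]


def delta_score(ref_seq, pos, alt_base, scores, k):
    """Summed k-mer score on the alternative allele minus the reference allele.

    scores is a hypothetical dict mapping k-mer -> binding specificity score;
    k-mers missing from the table are counted as 0.
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_total = sum(scores.get(km, 0.0) for km in kmers_overlapping(ref_seq, pos, k))
    alt_total = sum(scores.get(km, 0.0) for km in kmers_overlapping(alt_seq, pos, k))
    return alt_total - ref_total


# Toy example with made-up scores for k = 3 (not real data).
toy_scores = {"ACG": 2.0, "CGT": 1.5, "ATG": 0.5}
result = delta_score("AACGTT", pos=2, alt_base="T", scores=toy_scores, k=3)
```

Under this convention a negative difference would suggest that the alternative allele weakens binding, and a positive one that it strengthens binding; a real analysis would also need a significance criterion, which is omitted here.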
Affiliation(s)
- Saeko Tahara
  - Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - School of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- Takaho Tsuchiya
  - Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - Center for Artificial Intelligence Research, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- Hirotaka Matsumoto
  - School of Information and Data Sciences, Nagasaki University, 1-14, Bunkyo-Machi, Nagasaki City, Nagasaki, 852-8521, Japan
  - Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics, Wako, Saitama, 351-0198, Japan
- Haruka Ozaki
  - Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - Center for Artificial Intelligence Research, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics, Wako, Saitama, 351-0198, Japan
3
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022] Open
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis in which a neutral missense mutation cancels the effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning (DMS) is a relatively new technique that enables experimental studies of functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
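The definition of intragenic compensation above can be made concrete with a small sketch over single- and double-mutant scores of the kind a DMS experiment reports. It assumes scores are expressed relative to wild type (wild type = 0, negative = worse) and that single-mutant effects are expected to add; the function names and both thresholds are hypothetical and not drawn from the review.

```python
def epistasis(f_a, f_b, f_ab):
    """Observed double-mutant score minus the additive expectation.

    All scores are relative to wild type (0). A positive value means the
    double mutant is fitter than the two single mutants would predict.
    """
    return f_ab - (f_a + f_b)


def is_compensated(f_del, f_other, f_double, neutral_tol=0.1, deleterious_cut=-0.5):
    """Flag a pair as compensatory under illustrative thresholds:
    one mutation is deleterious alone, the other is near-neutral alone,
    and the double mutant is back near wild type.
    """
    return (
        f_del <= deleterious_cut
        and abs(f_other) <= neutral_tol
        and abs(f_double) <= neutral_tol
    )


# Made-up scores: a deleterious mutation rescued by a neutral partner.
rescued = is_compensated(f_del=-1.0, f_other=0.0, f_double=0.0)
not_rescued = is_compensated(f_del=-1.0, f_other=0.0, f_double=-0.9)
```

The additive expectation is only one possible null model; measurements on a multiplicative scale would need a log transform before the same subtraction applies.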
Affiliation(s)
- Nadezhda Azbukina
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- Anastasia Zharikova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
  - National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
- Vasily Ramensky
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
  - National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
4
Kuru N, Dereli O, Akkoyun E, Bircan A, Tastan O, Adebali O. PHACT: Phylogeny-aware computing of tolerance for missense mutations. Mol Biol Evol 2022; 39:6593375. [PMID: 35639618 PMCID: PMC9178230 DOI: 10.1093/molbev/msac114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Evolutionary conservation is a fundamental resource for predicting the substitutability of amino acids and loss of function in proteins. The use of multiple sequence alignment alone, without considering the evolutionary relationships among sequences, results in the redundant counting of evolutionarily related alteration events as if they were independent. Here we propose a new method, PHACT, that predicts the pathogenicity of missense mutations directly from the phylogenetic tree of proteins. PHACT travels through the nodes of the phylogenetic tree and evaluates the deleteriousness of a substitution based on the probability differences of ancestral amino acids between neighboring nodes in the tree. Moreover, PHACT assigns weights to each node in the tree based on their distance to the query organism. For each potential amino acid substitution, the algorithm generates a score that is used to calculate the effect of substitution on protein function. To analyze the predictive performance of PHACT, we performed various experiments over the subsets of two datasets that include 3023 proteins and 61662 variants in total. The experiments demonstrated that our method outperformed the widely used pathogenicity prediction tools (i.e., SIFT and PolyPhen-2) and achieved better predictive performance than did other conventional statistical approaches presented in dbNSFP. The PHACT source code is available at https://github.com/CompGenomeLab/PHACT.
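The tree traversal described above can be pictured with a deliberately simplified sketch. It is not the PHACT algorithm: the inverse-distance weight, the rule of counting only positive probability gains, and every name below are assumptions chosen to show the general shape of a phylogeny-aware tolerance score at a single alignment position.

```python
def tolerance_score(edges, node_probs, node_dist, amino_acid):
    """Toy phylogeny-aware tolerance score for one amino acid at one position.

    edges:      list of (parent, child) node names in the tree.
    node_probs: node -> {amino acid: probability} (ancestral reconstruction
                for internal nodes, observed residue for leaves).
    node_dist:  node -> distance from the query sequence.
    Each edge where the amino acid gains probability counts as evidence that
    the substitution was tolerated; nearer nodes weigh more (assumed 1/(1+d)).
    """
    score = 0.0
    for parent, child in edges:
        gain = node_probs[child].get(amino_acid, 0.0) - node_probs[parent].get(amino_acid, 0.0)
        if gain > 0:
            score += gain / (1.0 + node_dist[child])
    return score


# Hypothetical three-node path; probabilities and distances are made up.
toy_edges = [("root", "n1"), ("n1", "leafA")]
toy_probs = {
    "root": {"A": 1.0},
    "n1": {"A": 0.5, "V": 0.5},
    "leafA": {"V": 1.0},
}
toy_dist = {"root": 2.0, "n1": 1.0, "leafA": 3.0}

score_v = tolerance_score(toy_edges, toy_probs, toy_dist, "V")  # seen in the tree
score_w = tolerance_score(toy_edges, toy_probs, toy_dist, "W")  # never seen
```

Because gains are tallied per edge instead of per aligned sequence, a substitution inherited by many descendants of one ancestor is counted once, which is the redundancy problem the abstract attributes to alignment-only approaches.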
Affiliation(s)
- Nurdan Kuru
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Onur Dereli
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Emrah Akkoyun
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Aylin Bircan
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Oznur Tastan
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Ogun Adebali
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
5
Yang Z, Cao J, Song Y, Li S, Jiao Z, Ren S, Gao X, Zhang S, Liu J, Chen Y. Whole-exome sequencing identified novel variants in three Chinese Leigh syndrome pedigrees. Am J Med Genet A 2022; 188:1214-1225. [PMID: 35014173 DOI: 10.1002/ajmg.a.62641] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/20/2021] [Revised: 12/12/2021] [Accepted: 12/18/2021] [Indexed: 11/08/2022]
Abstract
Leigh syndrome (LS), the most common mitochondrial disease in early childhood, usually manifests variable neurodegenerative symptoms and typical brain magnetic resonance imaging (MRI) lesions. To date, pathogenic variants in more than 80 genes have been identified. However, there are still many cases without molecular diagnoses, and thus more disease-causing variants need to be unveiled. Here, we presented three clinically suspected LS patients manifesting neurological symptoms including developmental delay, hypotonia, and epilepsy during the first year of age, along with symmetric brain lesions on MRI. We explored disease-associated variants in patients and their nonconsanguineous parents by whole-exome sequencing and subsequent Sanger sequencing verification. Sequencing data revealed three pairs of disease-associated compound heterozygous variants: c.1A>G (p.Met1?) and 409G>C (p.Asp137His) in SDHA, c.1253G>A (p.Arg418His) and 1300C>T (p.Leu434Phe) in NARS2, and c.5C>T (p.Ala2Val) and 773T>G (p.Leu258Trp) in ECHS1. Among them, the likely pathogenic variants c.409G>C (p.Asp137His) in SDHA, c.1300C>T (p.Leu434Phe) in NARS2, and c.773T>G (p.Leu258Trp) in ECHS1 were newly identified. Segregation analysis indicated the possible disease-causing nature of the novel variants. In silico prediction and three-dimensional protein modeling further suggested the potential pathogenicity of these variants. Our discovery of novel variants expands the gene variant spectrum of LS and provides novel evidence for genetic counseling.
Affiliation(s)
- Zhihua Yang
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Jun Cao
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Yucen Song
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Suyi Li
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Zhihui Jiao
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Shumin Ren
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Xu Gao
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Suqin Zhang
  - Department of Pediatrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Jingjing Liu
  - Department of MR Imaging, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
- Yibing Chen
  - Genetic and Prenatal Diagnosis Center, Department of Gynecology and Obstetrics, First Affiliated Hospital, Zhengzhou University, Zhengzhou, China
6
Zahedi T, Colagar AH, Mahmoodzadeh H, Raoof JB. Missense mutations involvement in COX-2 structure, and protein-substrate binding affinity: in-silico study. Nucleosides Nucleotides & Nucleic Acids 2021; 40:1125-1143. [PMID: 34632961 DOI: 10.1080/15257770.2021.1983826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
Cyclooxygenase-2 (COX-2) is an inducible inflammatory enzyme that produces prostanoids from arachidonic acid. COX-2 overexpression and over-activity can cause inflammation, tumorigenesis, and angiogenesis. Prostanoids are the main drivers of the inflammation and increased mitogenesis caused by COX-2. Any change, such as a mutation, that leads to COX-2 over-activity could therefore promote tumor development, with increased prostanoid production being one route. The aim of this study was to assess the effect of 166 missense mutations of COX-2 on protein features that can affect COX-2 activity, such as protein stability, fluctuation, 2D structure, and binding affinity for the substrate, using in silico methods, network modeling, and docking calculations; 44 of the mutations were shown to be deleterious. Among them, the S124I and S474F mutations can increase the stability of the protein. Of the deleterious nsSNPs, 11.36% were part of the substrate-binding region, among which M508T, H337R, and V511G have the potential to affect the protein by altering its 2D structure. V511G can improve binding affinity, and H337R showed a small decrease in the overall deformation energy, which can represent a decrease in the stability of COX-2. In addition, L517S showed a significant decrease in COX-2/substrate binding strength, but according to anisotropic network modeling this mutation has a dual effect on COX-2 stability. These nsSNPs/mutations have the potential to increase or decrease tumorigenesis, because increased COX-2 stability and binding affinity can alter its activity.
Affiliation(s)
- Tahereh Zahedi
  - Department of Molecular and Cell Biology, Faculty of Basic Science, University of Mazandaran, Babolsar, Mazandaran, Iran
- Abasalt Hosseinzadeh Colagar
  - Department of Molecular and Cell Biology, Faculty of Basic Science, University of Mazandaran, Babolsar, Mazandaran, Iran
- Habibollah Mahmoodzadeh
  - Department of Surgical Oncology, Cancer Institute, Imam Khomeini Hospital Complex, Tehran University of Medical Sciences, Tehran, Iran
- Jahan-Bakhsh Raoof
  - Department Analytical Chemistry, Faculty of Chemistry, University of Mazandaran, Babolsar, Mazandaran, Iran
7
Baas FS, Rishi G, Swinkels DW, Subramaniam VN. Genetic Diagnosis in Hereditary Hemochromatosis: Discovering and Understanding the Biological Relevance of Variants. Clin Chem 2021; 67:1324-1341. [PMID: 34402502 DOI: 10.1093/clinchem/hvab130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2021] [Accepted: 06/23/2021] [Indexed: 11/13/2022]
Abstract
BACKGROUND Hereditary hemochromatosis (HH) is a genetic disease, leading to iron accumulation and possible organ damage. Patients are usually homozygous for p.Cys282Tyr in the homeostatic iron regulator gene but may have mutations in other genes involved in the regulation of iron. Next-generation sequencing is increasingly being utilized for the diagnosis of patients, leading to the discovery of novel genetic variants. The clinical significance of these variants is often unknown. CONTENT Determining the pathogenicity of such variants of unknown significance is important for diagnostics and genetic counseling. Predictions can be made using in silico computational tools and population data, but additional evidence is required for a conclusive pathogenicity classification. Genetic disease models, such as in vitro models using cellular overexpression, induced pluripotent stem cells or organoids, and in vivo models using mice or zebrafish all have their own challenges and opportunities when used to model HH and other iron disorders. Recent developments in gene-editing technologies are transforming the field of genetic disease modeling. SUMMARY In summary, this review addresses methods and developments regarding the discovery and classification of genetic variants, from in silico tools to in vitro and in vivo models, and presents them in the context of HH. It also explores recent gene-editing developments and how they can be applied to the discussed models of genetic disease.
Affiliation(s)
- Floor S Baas
  - Translational Metabolic Laboratory (TML 831), Radboudumc, Nijmegen, the Netherlands
  - Hepatogenomics Research Group, School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD, Australia
- Gautam Rishi
  - Hepatogenomics Research Group, School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD, Australia
- Dorine W Swinkels
  - Translational Metabolic Laboratory (TML 831), Radboudumc, Nijmegen, the Netherlands
- V Nathan Subramaniam
  - Hepatogenomics Research Group, School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD, Australia
8
Nguyen-Dumont T, Stewart J, Winship I, Southey MC. Rare genetic variants: making the connection with breast cancer susceptibility. AIMS Genetics 2021. [DOI: 10.3934/genet.2015.4.281] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/24/2022]
Abstract
The practice of clinical genetics in the context of breast cancer predisposition has reached another critical point in its evolution. For the past two decades, genetic testing offered to women attending clinics has been limited to BRCA1 and BRCA2 unless other syndromic indicators have been evident (e.g. PTEN and TP53 for Cowden and Li-Fraumeni syndrome, respectively). Women (and their families) who are concerned about their personal and/or family history of breast and ovarian cancer have enthusiastically engaged with clinical genetics services, anticipating that a genetic cause for their cancer predisposition will be identified and that they will receive clinical guidance for their risk management and treatment options. Genetic testing laboratories have demonstrated similar enthusiasm for transitioning from single gene to gene panel testing that now provide opportunities for the large number of women found not to carry mutations in BRCA1 and BRCA2, enabling them to undergo additional genetic testing. However, these panel tests have limited clinical utility until more is understood about the cancer risks (if any) associated with the genetic variation observed in the genes included on these panels. New data is urgently needed to improve the interpretation of the genetic variation data that is already reported from these panels and to inform the selection of genes included in gene panel tests in the future. To address this issue, large internationally coordinated research studies are required to provide the evidence-base from which clinical genetics for breast cancer susceptibility can be practiced in the era of gene panel testing and oncogenetic practice. Two significant steps associated with this process include i) validating the genes on these panels (and those likely to be added in the future) as bona fide breast cancer predisposition genes and ii) interpreting the variation, on a variant-by-variant basis in terms of their likely “pathogenicity”, a process commonly referred to as “variant classification” that will enable this new genetic information to be used at an individual level in clinical genetics services. Neither of these fundamental steps has been achieved for the majority of genes included on the panels. We are thus at a critical point for translational research in breast cancer clinical genetics: how can rare genetic variants be interpreted such that they can be used in clinical genetics services and oncogenetic practice to identify and to inform the management of families that carry these variants?
Affiliation(s)
- Tú Nguyen-Dumont
  - Genetic Epidemiology Laboratory, Department of Pathology, The University of Melbourne, Victoria, 3010, Australia and The Royal Melbourne Hospital, Parkville, Victoria, 3050, Australia
- Jenna Stewart
  - Genetic Epidemiology Laboratory, Department of Pathology, The University of Melbourne, Victoria, 3010, Australia and The Royal Melbourne Hospital, Parkville, Victoria, 3050, Australia
- Ingrid Winship
  - Department of Medicine, The University of Melbourne, Victoria, 3010, Australia and The Royal Melbourne Hospital, Parkville, Victoria, 3050, Australia
- Melissa C. Southey
  - Genetic Epidemiology Laboratory, Department of Pathology, The University of Melbourne, Victoria, 3010, Australia and The Royal Melbourne Hospital, Parkville, Victoria, 3050, Australia
9
Yaman Y, Aymaz R, Keleş M, Bay V, Ün C, Heaton MP. Association of TLR2 haplotypes encoding Q650 with reduced susceptibility to ovine Johne's disease in Turkish sheep. Sci Rep 2021; 11:7088. [PMID: 33782507 PMCID: PMC8007707 DOI: 10.1038/s41598-021-86605-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/12/2021] [Accepted: 03/18/2021] [Indexed: 02/06/2023] Open
Abstract
Ovine Johne’s disease (OJD) is caused by Mycobacterium avium subsp. paratuberculosis (MAP) and carries a potential zoonotic risk for humans. Selective breeding strategies for reduced OJD susceptibility would be welcome tools in disease eradication efforts, if available. The Toll-like receptor 2 gene (TLR2) plays an important signaling role in immune response to MAP, and missense variants are associated with mycobacterial infections in mammals. Our aim was to identify and evaluate ovine TLR2 missense variants for association with OJD in Turkish sheep. Eleven TLR2 missense variants and 17 haplotype configurations were identified in genomic sequences of 221 sheep from 61 globally-distributed breeds. The five most frequent haplotypes were tested for OJD association in 102 matched pairs of infected and uninfected ewes identified in 2257 Turkish sheep. Ewes with one or two copies of TLR2 haplotypes encoding glutamine (Q) at position 650 (Q650) in the Tir domain were 6.6-fold more likely to be uninfected compared to ewes with arginine (R650) at that position (CI95 = 2.6 to 16.9, p-value = 3.7 × 10⁻⁶). The protective TLR2 Q650 allele was present in at least 25% of breeds tested and thus may facilitate selective breeding for sheep with reduced susceptibility to OJD.
Affiliation(s)
- Yalçın Yaman
  - Department of Biometry and Genetics, Sheep Breeding and Research Institute, 10200, Bandırma, Balıkesir, Turkey
- Ramazan Aymaz
  - Department of Biometry and Genetics, Sheep Breeding and Research Institute, 10200, Bandırma, Balıkesir, Turkey
- Murat Keleş
  - Department of Biometry and Genetics, Sheep Breeding and Research Institute, 10200, Bandırma, Balıkesir, Turkey
- Veysel Bay
  - Department of Biometry and Genetics, Sheep Breeding and Research Institute, 10200, Bandırma, Balıkesir, Turkey
- Cemal Ün
  - Department of Biology, Faculty of Science, Ege University, 35000, İzmir, Turkey
- Michael P Heaton
  - USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, 68933, USA
10
11
Katsonis P, Lichtarge O. CAGI5: Objective performance assessments of predictions based on the Evolutionary Action equation. Hum Mutat 2019; 40:1436-1454. [PMID: 31317604 PMCID: PMC6900054 DOI: 10.1002/humu.23873] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 02/21/2019] [Revised: 07/02/2019] [Accepted: 07/11/2019] [Indexed: 12/14/2022]
Abstract
Many computational approaches estimate the effect of coding variants, but their predictions often disagree with each other. These contradictions confound users and raise questions regarding reliability. Performance assessments can indicate the expected accuracy for each method and highlight advantages and limitations. The Critical Assessment of Genome Interpretation (CAGI) community aims to organize objective and systematic assessments: They challenge predictors on unpublished experimental and clinical data and assign independent assessors to evaluate the submissions. We participated in CAGI experiments as predictors, using the Evolutionary Action (EA) method to estimate the fitness effect of coding mutations. EA is untrained, uses homology information, and relies on a formal equation: The fitness effect equals the functional sensitivity to residue changes multiplied by the magnitude of the substitution. In previous CAGI experiments (between 2011 and 2016), our submissions aimed to predict the protein activity of single mutants. In 2018 (CAGI5), we also submitted predictions regarding clinical associations, folding stability, and matching genomic data with phenotype. For all these diverse challenges, we used EA to predict the fitness effect of variants, adjusted to specifically address each question. Our submissions had consistently good performance, suggesting that EA predicts reliably the effects of genetic variants.
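The verbal statement of the equation in the abstract above (fitness effect equals functional sensitivity to residue changes multiplied by the magnitude of the substitution) can be written compactly as a first-order relation. The symbols below are chosen here as a reading aid and are assumptions: $f$ is a fitness function of the sequence, $r_i$ is the residue at position $i$, $X \to Y$ is the amino acid substitution, and $\Delta\varphi$ is the resulting fitness effect.

```latex
\Delta\varphi \;\approx\; \frac{\partial f}{\partial r_i}\,\Delta r_{i,\,X \to Y}
```

Read this way, the first factor depends only on the position and the second only on which amino acid replaces which, so a large substitution at an insensitive position and a small one at a sensitive position can yield comparable predicted effects.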
Affiliation(s)
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
  - Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, Texas
  - Department of Pharmacology, Baylor College of Medicine, Houston, Texas
  - Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
12
Kheiri S, Safarzad M, Shariati M, Sohrabi H. Prioritization of Deleterious Variations in the Human Hypoxanthine-Guanine Phosphoribosyltransferase Gene. Medical Laboratory Journal 2018. [DOI: 10.29252/mlj.12.5.29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022] Open
13
Heaton MP, Smith TPL, Freking BA, Workman AM, Bennett GL, Carnahan JK, Kalbfleisch TS. Using sheep genomes from diverse U.S. breeds to identify missense variants in genes affecting fecundity. F1000Res 2017; 6:1303. [PMID: 28928950 PMCID: PMC5590088 DOI: 10.12688/f1000research.12216.1] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Accepted: 07/28/2017] [Indexed: 11/20/2022] Open
Abstract
Background: Access to sheep genome sequences significantly improves the chances of identifying genes that may influence the health, welfare, and productivity of these animals. Methods: A public, searchable DNA sequence resource for U.S. sheep was created with whole genome sequence (WGS) of 96 rams. The animals shared minimal pedigree relationships and represent nine popular U.S. breeds and a composite line. The genomes are viewable online with the user-friendly Integrated Genome Viewer environment, and may be used to identify and decode gene variants present in U.S. sheep. Results: The genomes had a combined average read depth of 16, and an average WGS genotype scoring rate and accuracy exceeding 99%. The utility of this resource was illustrated by characterizing three genes with 14 known coding variants affecting litter size in global sheep populations: growth and differentiation factor 9 (GDF9), bone morphogenetic protein 15 (BMP15), and bone morphogenetic protein receptor 1B (BMPR1B). In the 96 U.S. rams, nine missense variants encoding 11 protein variants were identified. However, only one was previously reported to affect litter size (GDF9 V371M, Finnsheep). Two missense variants in BMP15 were identified that had not previously been reported: R67Q in Dorset, and L252P in Dorper and White Dorper breeds. Also, two novel missense variants were identified in BMPR1B: M64I in Katahdin, and T345N in Romanov and Finnsheep breeds. Based on the strict conservation of amino acid residues across placental mammals, the four variants encoded by BMP15 and BMPR1B are predicted to interfere with their function. However, preliminary analyses of litter sizes in small samples did not reveal a correlation with variants in BMP15 and BMPR1B with daughters of these rams. Conclusions: Collectively, this report describes a new resource for discovering protein variants in silico and identifies alleles for further testing of their effects on litter size in U.S. breeds.
Affiliation(s)
- Michael P Heaton
- U.S. Meat Animal Research Center (USMARC), Clay Center, NE, 68933, USA
- Timothy P L Smith
- U.S. Meat Animal Research Center (USMARC), Clay Center, NE, 68933, USA
- Bradley A Freking
- U.S. Meat Animal Research Center (USMARC), Clay Center, NE, 68933, USA
- Aspen M Workman
- U.S. Meat Animal Research Center (USMARC), Clay Center, NE, 68933, USA
- Gary L Bennett
- U.S. Meat Animal Research Center (USMARC), Clay Center, NE, 68933, USA
- Jacky K Carnahan
- U.S. Meat Animal Research Center (USMARC), Clay Center, NE, 68933, USA
- Theodore S Kalbfleisch
- Department of Biochemistry and Molecular Biology, School of Medicine, University of Louisville, Louisville, KY, 40202, USA
|
14
|
Katsonis P, Lichtarge O. Objective assessment of the evolutionary action equation for the fitness effect of missense mutations across CAGI-blinded contests. Hum Mutat 2017; 38:1072-1084. [PMID: 28544059 DOI: 10.1002/humu.23266] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 08/26/2016] [Revised: 03/13/2017] [Accepted: 05/17/2017] [Indexed: 01/09/2023]
Abstract
A major challenge in genome interpretation is to estimate the fitness effect of coding variants of unknown significance (VUS). Labor, limited understanding of protein functions, and lack of assays generally limit direct experimental assessment of VUS, and make robust and accurate computational approaches a necessity. Often, however, algorithms that predict mutational effect disagree among themselves and with experimental data, slowing their adoption for clinical diagnostics. To objectively assess such methods, the Critical Assessment of Genome Interpretation (CAGI) community organizes contests to predict unpublished experimental data, available only to CAGI assessors. We review here the CAGI performance of evolutionary action (EA) predictions of mutational impact. EA models the fitness effect of coding mutations analytically, as a product of the gradient of the fitness landscape times the perturbation size. In practice, these terms are computed from phylogenetic considerations as the functional sensitivity of the mutated site and as the magnitude of amino acid substitution, respectively, and yield the percentage loss of wild-type activity. In five CAGI challenges, EA consistently performed on par or better than sophisticated machine learning approaches. This objective assessment suggests that a simple differential model of evolution can interpret the fitness effect of coding variations, opening diverse clinical applications.
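The product form described in this abstract (sensitivity of the mutated site times magnitude of the substitution, read as a percentage loss of activity) can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the function name, the normalization of both inputs to the range 0 to 1, and the example values are hypothetical and are not taken from the authors' implementation.

```python
def evolutionary_action(site_sensitivity: float, substitution_magnitude: float) -> float:
    """Toy first-order estimate: impact ~ gradient * perturbation, as a percentage.

    Both arguments are ASSUMED to be pre-normalized to [0, 1]; how the real
    method derives them from phylogenetic data is not reproduced here.
    """
    if not (0.0 <= site_sensitivity <= 1.0 and 0.0 <= substitution_magnitude <= 1.0):
        raise ValueError("both inputs are assumed to be normalized to [0, 1]")
    return 100.0 * site_sensitivity * substitution_magnitude


# Hypothetical values: a moderately sensitive site and a moderate substitution.
print(evolutionary_action(0.5, 0.5))
```

The multiplicative form means that a drastic substitution at an insensitive site, or a conservative substitution at a sensitive site, both score low; only the combination scores high.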
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas; Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, Texas; Department of Pharmacology, Baylor College of Medicine, Houston, Texas; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
|
15
|
McFarland CD, Yaglom JA, Wojtkowiak JW, Scott JG, Morse DL, Sherman MY, Mirny LA. The Damaging Effect of Passenger Mutations on Cancer Progression. Cancer Res 2017; 77:4763-4772. [PMID: 28536279 DOI: 10.1158/0008-5472.can-15-3283-t] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 12/15/2015] [Revised: 02/02/2017] [Accepted: 05/16/2017] [Indexed: 01/29/2023]
Abstract
Genomic instability and high mutation rates cause cancer to acquire numerous mutations and chromosomal alterations during its somatic evolution; most are termed passengers because they do not confer cancer phenotypes. Evolutionary simulations and cancer genomic studies suggest that mildly deleterious passengers accumulate and can collectively slow cancer progression. Clinical data also suggest an association between passenger load and response to therapeutics, yet no causal link between the effects of passengers and cancer progression has been established. To assess this, we introduced increasing passenger loads into human cell lines and immunocompromised mouse models. We found that passengers dramatically reduced proliferative fitness (∼3% per Mb), slowed tumor growth, and reduced metastatic progression. We developed new genomic measures of damaging passenger load that can accurately predict the fitness costs of passengers in cell lines and in human breast cancers. We conclude that genomic instability and an elevated load of DNA alterations in cancer is a double-edged sword: it accelerates the accumulation of adaptive drivers, but incurs a harmful passenger load that can outweigh driver benefit. The effects of passenger alterations on cancer fitness were unrelated to enhanced immunity, as our tests were performed either in cell culture or in immunocompromised animals. Our findings refute traditional paradigms of passengers as neutral events, suggesting that passenger load reduces the fitness of cancer cells and slows or prevents progression of both primary and metastatic disease. The antitumor effects of chemotherapies can in part be due to the induction of genomic instability and increased passenger load. Cancer Res; 77(18); 4763-72. ©2017 AACR.
Affiliation(s)
- Julia A Yaglom
- Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts
- Jonathan W Wojtkowiak
- Department of Cancer Imaging and Metabolism, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida
- Jacob G Scott
- Translational Hematology and Oncology Research, and Radiation Oncology, Cleveland Clinic, Cleveland, Ohio
- David L Morse
- Department of Cancer Imaging and Metabolism, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida
- Michael Y Sherman
- Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts
- Leonid A Mirny
- Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts; Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts
|
16
|
Genetics of the human placenta: implications for toxicokinetics. Arch Toxicol 2016; 90:2563-2581. [DOI: 10.1007/s00204-016-1816-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 05/31/2016] [Accepted: 08/04/2016] [Indexed: 10/21/2022]
|
17
|
Heaton MP, Smith TPL, Carnahan JK, Basnayake V, Qiu J, Simpson B, Kalbfleisch TS. Using diverse U.S. beef cattle genomes to identify missense mutations in EPAS1, a gene associated with pulmonary hypertension. F1000Res 2016; 5:2003. [PMID: 27746904 PMCID: PMC5040160 DOI: 10.12688/f1000research.9254.2] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Accepted: 10/04/2016] [Indexed: 01/08/2023] Open
Abstract
The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, existing bovine WGS databases do not show data in a form conducive to protein variant analysis, and tend to under represent the breadth of genetic diversity in global beef cattle. Thus, our first aim was to use 96 beef sires, sharing minimal pedigree relationships, to create a searchable and publicly viewable set of mapped genomes relevant for 19 popular breeds of U.S. cattle. Our second aim was to identify protein variants encoded by the bovine endothelial PAS domain-containing protein 1 gene (EPAS1), a gene associated with pulmonary hypertension in Angus cattle. The identity and quality of genomic sequences were verified by comparing WGS genotypes to those derived from other methods. The average read depth, genotype scoring rate, and genotype accuracy exceeded 14, 99%, and 99%, respectively. The 96 genomes were used to discover four amino acid variants encoded by EPAS1 (E270Q, P362L, A671G, and L701F) and confirm two variants previously associated with disease (A606T and G610S). The six EPAS1 missense mutations were verified with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry assays, and their frequencies were estimated in a separate collection of 1154 U.S. cattle representing 46 breeds. A rooted phylogenetic tree of eight polypeptide sequences provided a framework for evaluating the likely order of mutations and potential impact of EPAS1 alleles on the adaptive response to chronic hypoxia in U.S. cattle. This public, whole genome resource facilitates in silico identification of protein variants in diverse types of U.S. beef cattle, and provides a means of translating WGS data into a practical biological and evolutionary context for generating and testing hypotheses.
Affiliation(s)
- Theodore S Kalbfleisch
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, USA
|
18
|
Heaton MP, Smith TP, Carnahan JK, Basnayake V, Qiu J, Simpson B, Kalbfleisch TS. Using diverse U.S. beef cattle genomes to identify missense mutations in EPAS1, a gene associated with pulmonary hypertension. F1000Res 2016; 5:2003. [PMID: 27746904 PMCID: PMC5040160 DOI: 10.12688/f1000research.9254.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Accepted: 10/04/2016] [Indexed: 08/24/2023] Open
Abstract
The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, existing bovine WGS databases do not show data in a form conducive to protein variant analysis, and tend to under represent the breadth of genetic diversity in global beef cattle. Thus, our first aim was to use 96 beef sires, sharing minimal pedigree relationships, to create a searchable and publicly viewable set of mapped genomes relevant for 19 popular breeds of U.S. cattle. Our second aim was to identify protein variants encoded by the bovine endothelial PAS domain-containing protein 1 gene (EPAS1), a gene associated with pulmonary hypertension in Angus cattle. The identity and quality of genomic sequences were verified by comparing WGS genotypes to those derived from other methods. The average read depth, genotype scoring rate, and genotype accuracy exceeded 14, 99%, and 99%, respectively. The 96 genomes were used to discover four amino acid variants encoded by EPAS1 (E270Q, P362L, A671G, and L701F) and confirm two variants previously associated with disease (A606T and G610S). The six EPAS1 missense mutations were verified with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry assays, and their frequencies were estimated in a separate collection of 1154 U.S. cattle representing 46 breeds. A rooted phylogenetic tree of eight polypeptide sequences provided a framework for evaluating the likely order of mutations and potential impact of EPAS1 alleles on the adaptive response to chronic hypoxia in U.S. cattle. This public, whole genome resource facilitates in silico identification of protein variants in diverse types of U.S. beef cattle, and provides a means of translating WGS data into a practical biological and evolutionary context for generating and testing hypotheses.
Affiliation(s)
- Theodore S. Kalbfleisch
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, USA
|
19
|
Thirumal Kumar D, George Priya Doss C, Sneha P, Tayubi IA, Siva R, Chakraborty C, Magesh R. Influence of V54M mutation in giant muscle protein titin: a computational screening and molecular dynamics approach. J Biomol Struct Dyn 2016; 35:917-928. [PMID: 27125723 DOI: 10.1080/07391102.2016.1166456] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 01/26/2023]
Abstract
Recent genetic studies have revealed the impact of mutations in associated genes for cardiac sarcomere components leading to dilated cardiomyopathy (DCM). The cardiac sarcomere is composed of thick and thin filaments and a giant muscle protein known as titin or connectin. Titin interacts with T-cap/telethonin in the Z-line region and plays a vital role in regulating sarcomere assembly. Initially, we screened all the variants associated with giant protein titin and analyzed their impact with the aid of pathogenicity and stability prediction methods. V54M mutation found in the hydrophobic core region of the protein associated with abnormal clinical phenotype leads to DCM was selected for further analysis. To address this issue, we mapped the deleterious mutant V54M, modeled the mutant protein complex, and deciphered the impact of mutation on binding with its partner telethonin in the titin crystal structure of PDB ID: 1YA5 with the aid of docking analysis. Furthermore, two run molecular dynamics simulation was initiated to understand the mechanistic action of V54M mutation in altering the protein structure, dynamics, and stability. According to the results obtained from the repeated 50 ns trajectory files, the overall effect of V54M mutation was destabilizing and transition of bend to coil in the secondary structure was observed. Furthermore, MMPBSA elucidated that V54M found in the Z-line region of titin decreases the binding affinity of titin to Z-line proteins T-cap/telethonin thereby hindering the protein-protein interaction.
Affiliation(s)
- D Thirumal Kumar
- School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu 632014, India
- C George Priya Doss
- School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu 632014, India
- P Sneha
- School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu 632014, India
- Iftikhar Aslam Tayubi
- School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu 632014, India; Faculty of Computing and Information Technology, King Abdulaziz University, Rabigh 21911, Saudi Arabia
- R Siva
- School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu 632014, India
- Chiranjib Chakraborty
- Department of Bio-informatics, School of Computer and Information Sciences, Galgotias University, Greater Noida, Uttar Pradesh 201306, India
- R Magesh
- Faculty of Biomedical Sciences, Technology & Research, Department of Biotechnology, Sri Ramachandra University, Chennai, Tamil Nadu 600116, India
|
20
|
Masica DL, Karchin R. Towards Increasing the Clinical Relevance of In Silico Methods to Predict Pathogenic Missense Variants. PLoS Comput Biol 2016; 12:e1004725. [PMID: 27171182 PMCID: PMC4865359 DOI: 10.1371/journal.pcbi.1004725] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 01/01/2023] Open
Affiliation(s)
- David L. Masica
- Department of Biomedical Engineering and The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Rachel Karchin
- Department of Biomedical Engineering and The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Oncology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
|
21
|
Adebali O, Reznik AO, Ory DS, Zhulin IB. Establishing the precise evolutionary history of a gene improves prediction of disease-causing missense mutations. Genet Med 2016; 18:1029-36. [PMID: 26890452 PMCID: PMC4990510 DOI: 10.1038/gim.2015.208] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 10/21/2015] [Accepted: 12/09/2015] [Indexed: 11/29/2022] Open
Abstract
Purpose: Predicting the phenotypic effects of mutations has become an important application in clinical genetic diagnostics. Computational tools evaluate the behavior of the variant over evolutionary time and assume that variations seen during the course of evolution are probably benign in humans. However, current tools do not take into account orthologous/paralogous relationships. Paralogs have dramatically different roles in Mendelian diseases. For example, whereas inactivating mutations in the NPC1 gene cause the neurodegenerative disorder Niemann-Pick C, inactivating mutations in its paralog NPC1L1 are not disease-causing and, moreover, are implicated in protection from coronary heart disease. Methods: We identified major events in NPC1 evolution and revealed and compared orthologs and paralogs of the human NPC1 gene through phylogenetic and protein sequence analyses. We predicted whether an amino acid substitution affects protein function by reducing the organism’s fitness. Results: Removing the paralogs and distant homologs improved the overall performance of categorizing disease-causing and benign amino acid substitutions. Conclusion: The results show that a thorough evolutionary analysis followed by identification of orthologs improves the accuracy in predicting disease-causing missense mutations. We anticipate that this approach will be used as a reference in the interpretation of variants in other genetic diseases as well. Genet Med 18(10), 1029–1036.
Affiliation(s)
- Ogun Adebali
- Graduate School of Genome Science and Technology, University of Tennessee-Oak Ridge National Laboratory, Knoxville, Tennessee, USA; Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA; Department of Microbiology, University of Tennessee, Knoxville, Tennessee, USA
- Alexander O Reznik
- Department of Microbiology, University of Tennessee, Knoxville, Tennessee, USA; Present address: Center for Bioinformatics, Pavlov First Saint Petersburg State Medical University, Saint Petersburg, Russia
- Daniel S Ory
- Diabetes Cardiovascular Disease Center, Department of Medicine, Washington University School of Medicine, St. Louis, Missouri, USA
- Igor B Zhulin
- Graduate School of Genome Science and Technology, University of Tennessee-Oak Ridge National Laboratory, Knoxville, Tennessee, USA; Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA; Department of Microbiology, University of Tennessee, Knoxville, Tennessee, USA
|
22
|
Gromiha MM, Anoosha P, Huang LT. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants. Methods Mol Biol 2016; 1415:71-89. [PMID: 27115628 DOI: 10.1007/978-1-4939-3572-7_4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 05/20/2023]
Abstract
Protein stability is the free energy difference between unfolded and folded states of a protein, which lies in the range of 5-25 kcal/mol. Experimentally, protein stability is measured with circular dichroism, differential scanning calorimetry, and fluorescence spectroscopy using thermal and denaturant denaturation methods. These experimental data have been accumulated in the form of a database, ProTherm, thermodynamic database for proteins and mutants. It also contains sequence and structure information of a protein, experimental methods and conditions, and literature information. Different features such as search, display, and sorting options and visualization tools have been incorporated in the database. ProTherm is a valuable resource for understanding/predicting the stability of proteins and it can be accessed at http://www.abren.net/protherm/ . ProTherm has been effectively used to examine the relationship among thermodynamics, structure, and function of proteins. We describe the recent progress on the development of methods for understanding/predicting protein stability, such as (1) general trends on mutational effects on stability, (2) relationship between the stability of protein mutants and amino acid properties, (3) applications of protein three-dimensional structures for predicting their stability upon point mutations, (4) prediction of protein stability upon single mutations from amino acid sequence, and (5) prediction methods for addressing double mutants. A list of online resources for predicting has also been provided.
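The stability definition in this abstract lends itself to a small worked example of how a mutation's effect is usually summarized as a change in folding free energy. A minimal sketch follows; the sign convention (mutant minus wild type, so negative values mean destabilizing), the 0.5 kcal/mol neutrality tolerance, and the example numbers are assumptions chosen for illustration, not values taken from the database.

```python
def ddg(dg_wild: float, dg_mutant: float) -> float:
    """Change in stability on mutation, in kcal/mol.

    ASSUMED convention: ddG = dG(mutant) - dG(wild type), where dG is the
    unfolding free energy, so a negative ddG means the mutant is less stable.
    """
    return dg_mutant - dg_wild


def classify(ddg_value: float, tolerance: float = 0.5) -> str:
    """Label a ddG value; the tolerance band is a hypothetical cutoff."""
    if ddg_value < -tolerance:
        return "destabilizing"
    if ddg_value > tolerance:
        return "stabilizing"
    return "neutral"


# Hypothetical measurements inside the 5-25 kcal/mol range quoted above.
change = ddg(dg_wild=10.0, dg_mutant=8.5)
print(change, classify(change))
```

Other sources define the difference with the opposite sign, so the convention should be checked before comparing values across data sets.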
Affiliation(s)
- M Michael Gromiha
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
- P Anoosha
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
- Liang-Tsung Huang
- Department of Medical Informatics, Tzu Chi University, Hualien, 970, Taiwan
|
23
|
Rodrigues C, Santos-Silva A, Costa E, Bronze-da-Rocha E. Performance of In Silico Tools for the Evaluation of UGT1A1 Missense Variants. Hum Mutat 2015; 36:1215-25. [PMID: 26377032 DOI: 10.1002/humu.22903] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 11/03/2014] [Accepted: 08/31/2015] [Indexed: 01/17/2023]
Abstract
Variations in the gene encoding uridine diphosphate glucuronosyltransferase 1A1 (UGT1A1) are particularly important because they have been associated with hyperbilirubinemia in Gilbert's and Crigler-Najjar syndromes as well as with changes in drug metabolism. Several variants associated with these phenotypes are nonsynonymous single-nucleotide polymorphisms (nsSNPs). Bioinformatics approaches have gained increasing importance in predicting the functional significance of these variants. This study was focused on the predictive ability of bioinformatics approaches to determine the pathogenicity of human UGT1A1 nsSNPs, which were previously characterized at the protein level by in vivo and in vitro studies. Using 16 Web algorithms, we evaluated 48 nsSNPs described in the literature and databases. Eight of these algorithms reached or exceeded 90% sensitivity and six presented a Matthews correlation coefficient above 0.46. The best-performing method was MutPred, followed by Sorting Intolerant from Tolerant (SIFT). The prediction measures varied significantly when predictors such as SIFT, PolyPhen-2, and Prediction of Pathological Mutations on Proteins were run with their native alignment generated by the tool, or with an input alignment that was strictly built with UGT1A1 orthologs and manually curated. Our results showed that the prediction performance of some methods based on sequence conservation analysis can be negatively affected when nsSNPs are positioned at the hypervariable or constant regions of UGT1A1 ortholog sequences.
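The two performance measures quoted in this abstract, sensitivity and the Matthews correlation coefficient (MCC), are both computed from a predictor's confusion counts. The sketch below uses the standard formulas; the function names and the example counts are hypothetical and are not data from the study.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly pathogenic variants that the tool flags as pathogenic."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC in [-1, 1]; returns 0.0 when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion counts for one predictor (NOT taken from the paper).
tp, tn, fp, fn = 27, 12, 6, 3
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"MCC = {matthews_corrcoef(tp, tn, fp, fn):.2f}")
```

A tool can reach high sensitivity simply by calling most variants pathogenic, which is why a balanced measure such as MCC is reported alongside it.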
Affiliation(s)
- Carina Rodrigues
- UCIBIO/REQUIMTE, Laboratório de Bioquímica, Departamento de Ciências Biológicas, Faculdade de Farmácia, Universidade do Porto, Porto, Portugal; Escola Superior de Saúde, Instituto Politécnico de Bragança, Bragança, Portugal
- Alice Santos-Silva
- UCIBIO/REQUIMTE, Laboratório de Bioquímica, Departamento de Ciências Biológicas, Faculdade de Farmácia, Universidade do Porto, Porto, Portugal
- Elísio Costa
- UCIBIO/REQUIMTE, Laboratório de Bioquímica, Departamento de Ciências Biológicas, Faculdade de Farmácia, Universidade do Porto, Porto, Portugal
- Elsa Bronze-da-Rocha
- UCIBIO/REQUIMTE, Laboratório de Bioquímica, Departamento de Ciências Biológicas, Faculdade de Farmácia, Universidade do Porto, Porto, Portugal
|
24
|
Tetreault M, Fahiminiya S, Antonicka H, Mitchell GA, Geraghty MT, Lines M, Boycott KM, Shoubridge EA, Mitchell JJ, Michaud JL, Majewski J. Whole-exome sequencing identifies novel ECHS1 mutations in Leigh syndrome. Hum Genet 2015; 134:981-91. [PMID: 26099313 DOI: 10.1007/s00439-015-1577-y] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 02/10/2015] [Accepted: 06/03/2015] [Indexed: 12/28/2022]
Abstract
Leigh syndrome (LS) is a rare heterogeneous progressive neurodegenerative disorder usually presenting in infancy or early childhood. Clinical presentation is variable and includes psychomotor delay or regression, acute neurological or acidotic episodes, hypotonia, ataxia, spasticity, movement disorders, and corresponding anomalies of the basal ganglia and brain stem on magnetic resonance imaging. To date, 35 genes have been associated with LS, mostly involved in mitochondrial respiratory chain function and encoded in either nuclear or mitochondrial DNA. We used whole-exome sequencing to identify disease-causing variants in four patients with basal ganglia abnormalities and clinical presentations consistent with LS. Compound heterozygote variants in ECHS1, encoding the enzyme enoyl-CoA hydratase were identified. One missense variant (p.Thr180Ala) was common to all four patients and the haplotype surrounding this variant was also shared, suggesting a common ancestor of French-Canadian origin. Rare mutations in ECHS1 as well as in HIBCH, the enzyme downstream in the valine degradation pathway, have been associated with LS or LS-like disorders. A clear clinical overlap is observed between our patients and the reported cases with ECHS1 or HIBCH deficiency. The main clinical features observed in our cohort are T2-hyperintense signal in the globus pallidus and putamen, failure to thrive, developmental delay or regression, and nystagmus. Respiratory chain studies are not strikingly abnormal in our patients: one patient had a mild reduction of complex I and III and another of complex IV. The identification of four additional patients with mutations in ECHS1 highlights the emerging importance of this pathway in LS.
Affiliation(s)
- Martine Tetreault
- Department of Human Genetics, McGill University, Montreal, QC, H3A 1B1, Canada
|
25
|
Butler BM, Gerek ZN, Kumar S, Ozkan SB. Conformational dynamics of nonsynonymous variants at protein interfaces reveals disease association. Proteins 2015; 83:428-35. [PMID: 25546381 DOI: 10.1002/prot.24748] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 09/17/2014] [Revised: 11/20/2014] [Accepted: 12/10/2014] [Indexed: 12/12/2022]
Abstract
Recent studies have shown that the protein interface sites between individual monomeric units in biological assemblies are enriched in disease-associated non-synonymous single nucleotide variants (nsSNVs). To elucidate the mechanistic underpinning of this observation, we investigated the conformational dynamic properties of protein interface sites through a site-specific structural dynamic flexibility metric (dfi) for 333 multimeric protein assemblies. dfi measures the dynamic resilience of a single residue to perturbations that occurred in the rest of the protein structure and identifies sites contributing the most to functionally critical dynamics. Analysis of dfi profiles of over a thousand positions harboring variation revealed that amino acid residues at interfaces have lower average dfi (31%) than those present at non-interfaces (50%), which means that protein interfaces have less dynamic flexibility. Interestingly, interface sites with disease-associated nsSNVs have significantly lower average dfi (23%) as compared to those of neutral nsSNVs (42%), which directly relates structural dynamics to functional importance. We found that less conserved interface positions show much lower dfi for disease nsSNVs as compared to neutral nsSNVs. In this case, dfi is better as compared to the accessible surface area metric, which is based on the static protein structure. Overall, our proteome-wide conformational dynamic analysis indicates that certain interface sites play a critical role in functionally related dynamics (i.e., those with low dfi values), therefore mutations at those sites are more likely to be associated with disease.
|
26
|
Pucci F, Bernaerts K, Teheux F, Gilis D, Rooman M. Symmetry Principles in Optimization Problems: an application to Protein Stability Prediction. IFAC-PapersOnLine 2015. [DOI: 10.1016/j.ifacol.2015.05.068] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/15/2022]
|
27
|
Walters-Sen LC, Hashimoto S, Thrush DL, Reshmi S, Gastier-Foster JM, Astbury C, Pyatt RE. Variability in pathogenicity prediction programs: impact on clinical diagnostics. Mol Genet Genomic Med 2014; 3:99-110. [PMID: 25802880 PMCID: PMC4367082 DOI: 10.1002/mgg3.116] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 07/18/2014] [Revised: 09/11/2014] [Accepted: 09/16/2014] [Indexed: 11/19/2022] Open
Abstract
Current practice by clinical diagnostic laboratories is to utilize online prediction programs to help determine the significance of novel variants in a given gene sequence. However, these programs vary widely in their methods and ability to correctly predict the pathogenicity of a given sequence change. The performance of 17 publicly available pathogenicity prediction programs was assayed using a dataset consisting of 122 credibly pathogenic and benign variants in genes associated with the RASopathy family of disorders and limb-girdle muscular dystrophy. Performance metrics were compared between the programs to determine the most accurate program for loss-of-function and gain-of-function mechanisms. No one program correctly predicted the pathogenicity of all variants analyzed. A major hindrance to the analysis was the lack of output from a significant portion of the programs. The best performer was MutPred, which had a weighted accuracy of 82.6% in the full dataset. Surprisingly, combining the results of the top three programs did not increase the ability to predict pathogenicity over the top performer alone. As the increasing number of sequence changes in larger datasets will require interpretation, the current study demonstrates that extreme caution must be taken when reporting pathogenicity based on statistical online protein prediction programs in the absence of functional studies.
Affiliation(s)
- Sayaka Hashimoto
- Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital Columbus, Ohio
- Devon Lamb Thrush
- Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital Columbus, Ohio ; Department of Pediatrics, The Ohio State University College of Medicine Columbus, Ohio
- Shalini Reshmi
- Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital Columbus, Ohio ; Department of Pathology, The Ohio State University College of Medicine Columbus, Ohio
- Julie M Gastier-Foster
- Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital Columbus, Ohio ; Department of Pediatrics, The Ohio State University College of Medicine Columbus, Ohio ; Department of Pathology, The Ohio State University College of Medicine Columbus, Ohio
- Caroline Astbury
- Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital Columbus, Ohio ; Department of Pathology, The Ohio State University College of Medicine Columbus, Ohio
- Robert E Pyatt
- Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital Columbus, Ohio ; Department of Pathology, The Ohio State University College of Medicine Columbus, Ohio
|
28
|
Tug-of-war between driver and passenger mutations in cancer and other adaptive processes. Proc Natl Acad Sci U S A 2014; 111:15138-43. [PMID: 25277973 DOI: 10.1073/pnas.1404341111] [Citation(s) in RCA: 95] [Impact Index Per Article: 9.5] [Indexed: 11/18/2022] Open
Abstract
Cancer progression is an example of a rapid adaptive process where evolving new traits is essential for survival and requires a high mutation rate. Precancerous cells acquire a few key mutations that drive rapid population growth and carcinogenesis. Cancer genomics demonstrates that these few driver mutations occur alongside thousands of random passenger mutations--a natural consequence of cancer's elevated mutation rate. Some passengers are deleterious to cancer cells, yet have been largely ignored in cancer research. In population genetics, however, the accumulation of mildly deleterious mutations has been shown to cause population meltdown. Here we develop a stochastic population model where beneficial drivers engage in a tug-of-war with frequent mildly deleterious passengers. These passengers present a barrier to cancer progression describable by a critical population size, below which most lesions fail to progress, and a critical mutation rate, above which cancers melt down. We find support for this model in cancer age-incidence and cancer genomics data that also allow us to estimate the fitness advantage of drivers and fitness costs of passengers. We identify two regimes of adaptive evolutionary dynamics and use these regimes to understand successes and failures of different treatment strategies. A tumor's load of deleterious passengers can explain previously paradoxical treatment outcomes and suggest that it could potentially serve as a biomarker of response to mutagenic therapies. The collective deleterious effect of passengers is currently an unexploited therapeutic target. We discuss how their effects might be exacerbated by current and future therapies.
|
29
|
Jia M, Yang B, Li Z, Shen H, Song X, Gu W. Computational analysis of functional single nucleotide polymorphisms associated with the CYP11B2 gene. PLoS One 2014; 9:e104311. [PMID: 25102047 PMCID: PMC4125216 DOI: 10.1371/journal.pone.0104311] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 02/27/2014] [Accepted: 07/07/2014] [Indexed: 12/17/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) are the most common type of genetic variations in humans and play a major role in the genetics of human phenotype variation and the genetic basis of human complex diseases. Recently, there is considerable interest in understanding the possible role of the CYP11B2 gene with corticosterone methyl oxidase deficiency, primary aldosteronism, and cardio-cerebro-vascular diseases. Hence, the elucidation of the function and molecular dynamic behavior of CYP11B2 mutations is crucial in current genomics. In this study, we investigated the pathogenic effect of 51 nsSNPs and 26 UTR SNPs in the CYP11B2 gene through computational platforms. Using a combination of SIFT, PolyPhen, I-Mutant Suite, and ConSurf server, four nsSNPs (F487V, V129M, T498A, and V403E) were identified to potentially affect the structure, function, and activity of the CYP11B2 protein. Furthermore, molecular dynamics simulation and structure analyses also confirmed the impact of these nsSNPs on the stability and secondary properties of the CYP11B2 protein. Additionally, utilizing the UTRscan, MirSNP, PolymiRTS and miRNASNP, three SNPs in the 3'UTR region were predicted to exhibit a pattern change in the upstream open reading frames (uORF), and eight microRNA binding sites were found to be highly affected due to 3'UTR SNPs. This cataloguing of deleterious SNPs is essential for narrowing down the number of CYP11B2 mutations to be screened in genetic association studies and for a better understanding of the functional and structural aspects of the CYP11B2 protein.
Affiliation(s)
- Minyue Jia
- Department of Endocrinology and Metabolism, the Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Boyun Yang
- Department of Endocrinology and Metabolism, the Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Zhongyi Li
- Department of Urology, the Second Affiliated Hospital (Binjiang Branch) Zhejiang University School of Medicine, Hangzhou Binjiang Hospital, Hangzhou, China
- Huiling Shen
- Department of Endocrinology and Metabolism, the Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Xiaoxiao Song
- Department of Endocrinology and Metabolism, the Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- Wei Gu
- Department of Endocrinology and Metabolism, the Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
|
30
|
Abstract
Proteins are macromolecules that serve a cell’s myriad processes and functions in all living organisms via dynamic interactions with other proteins, small molecules and cellular components. Genetic variations in the protein-encoding regions of the human genome account for >85% of all known Mendelian diseases, and play an influential role in shaping complex polygenic diseases. Proteins also serve as the predominant target class for the design of small molecule drugs to modulate their activity. Knowledge of the shape and form of proteins, by means of their three-dimensional structures, is therefore instrumental to understanding their roles in disease and their potentials for drug development. In this chapter we outline, with the wide readership of non-structural biologists in mind, the various experimental and computational methods available for protein structure determination. We summarize how the wealth of structure information, contributed to a large extent by the technological advances in structure determination to date, serves as a useful tool to decipher the molecular basis of genetic variations for disease characterization and diagnosis, particularly in the emerging era of genomic medicine, and becomes an integral component in the modern day approach towards rational drug development.
Affiliation(s)
- Nelson L.S. Tang
- Dept. of Chemical Pathology and Lab. of Genetics of Disease Suscept., The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Terence Poon
- Department of Paediatrics and Proteomics Laboratory, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
|
31
|
Gfeller D, Ernst A, Jarvik N, Sidhu SS, Bader GD. Prediction and experimental characterization of nsSNPs altering human PDZ-binding motifs. PLoS One 2014; 9:e94507. [PMID: 24722214 PMCID: PMC3983204 DOI: 10.1371/journal.pone.0094507] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 10/24/2013] [Accepted: 03/17/2014] [Indexed: 01/03/2023] Open
Abstract
Single nucleotide polymorphisms (SNPs) are a major contributor to genetic and phenotypic variation within populations. Non-synonymous SNPs (nsSNPs) modify the sequence of proteins and can affect their folding or binding properties. Experimental analysis of all nsSNPs is currently unfeasible and therefore computational predictions of the molecular effect of nsSNPs are helpful to guide experimental investigations. While some nsSNPs can be accurately characterized, for instance if they fall into strongly conserved or well annotated regions, the molecular consequences of many others are more challenging to predict. In particular, nsSNPs affecting less structured, and often less conserved regions, are difficult to characterize. Binding sites that mediate protein-protein or other protein interactions are an important class of functional sites on proteins and can be used to help interpret nsSNPs. Binding sites targeted by the PDZ modular peptide recognition domain have recently been characterized. Here we use this data to show that it is possible to computationally identify nsSNPs in PDZ binding motifs that modify or prevent binding to the proteins containing the motifs. We confirm these predictions by experimentally validating a selected subset with ELISA. Our work also highlights the importance of better characterizing linear motifs in proteins as many of these can be affected by genetic variations.
Affiliation(s)
- David Gfeller
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, Lausanne, Switzerland
- Andreas Ernst
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Nick Jarvik
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Sachdev S. Sidhu
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Gary D. Bader
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
|
32
|
Nievergelt CM, Wineinger NE, Libiger O, Pham P, Zhang G, Baker DG, Schork NJ. Chip-based direct genotyping of coding variants in genome wide association studies: utility, issues and prospects. Gene 2014; 540:104-9. [PMID: 24521671 DOI: 10.1016/j.gene.2014.01.069] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/14/2013] [Revised: 01/20/2014] [Accepted: 01/23/2014] [Indexed: 11/19/2022]
Abstract
There is considerable debate about the most efficient way to interrogate rare coding variants in association studies. The options include direct genotyping of specific known coding variants in genes or, alternatively, sequencing across the entire exome to capture known as well as novel variants. Each strategy has advantages and disadvantages, but the availability of cost-efficient exome arrays has made the former appealing. Here we consider the utility of a direct genotyping chip, the Illumina HumanExome array (HE), by evaluating its content based on: 1. functionality; and 2. amenability to imputation. We explored these issues by genotyping a large, ethnically diverse cohort on the HumanOmniExpressExome array (HOEE) which combines the HE with content from the GWAS array (HOE). We find that the use of the HE is likely to be a cost-effective way of expanding GWAS, but does have some drawbacks that deserve consideration when planning studies.
Affiliation(s)
- Caroline M Nievergelt
- Department of Psychiatry, University of California, San Diego; VA Center of Excellence for Stress and Mental Health, VA San Diego.
- Nathan E Wineinger
- Scripps Genomic Medicine, Scripps Health; The Scripps Translational Science Institute, The Scripps Research Institute
- Ondrej Libiger
- The Scripps Translational Science Institute, The Scripps Research Institute
- Dewleen G Baker
- Department of Psychiatry, University of California, San Diego; VA Center of Excellence for Stress and Mental Health, VA San Diego
|
33
|
Carter H, Hofree M, Ideker T. Genotype to phenotype via network analysis. Curr Opin Genet Dev 2013; 23:611-21. [PMID: 24238873 PMCID: PMC3866044 DOI: 10.1016/j.gde.2013.10.003] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Received: 06/07/2013] [Revised: 10/04/2013] [Accepted: 10/09/2013] [Indexed: 02/06/2023]
Abstract
A prime objective of genomic medicine is the identification of disease-causing mutations and the mechanisms by which such events result in disease. As most disease phenotypes arise not from single genes and proteins but from a complex network of molecular interactions, a priori knowledge about the molecular network serves as a framework for biological inference and data mining. Here we review recent developments at the interface of biological networks and mutation analysis. We examine how mutations may be treated as a perturbation of the molecular interaction network and what insights may be gained from taking this perspective. We review work that aims to transform static networks into rich context-dependent networks and recent attempts to integrate non-coding RNAs into such analysis. Finally, we conclude with an overview of the many challenges and opportunities that lie ahead.
Affiliation(s)
- Hannah Carter
- Institute for Genomic Medicine and Department of Medicine, University of California, San Diego, 9500 Gillman Drive, La Jolla, CA 92093, United States
|
34
|
Gemovic B, Perovic V, Glisic S, Veljkovic N. Feature-based classification of amino acid substitutions outside conserved functional protein domains. ScientificWorldJournal 2013; 2013:948617. [PMID: 24348198 PMCID: PMC3855963 DOI: 10.1155/2013/948617] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 08/30/2013] [Accepted: 09/24/2013] [Indexed: 01/01/2023] Open
Abstract
There are more than 500 amino acid substitutions in each human genome, and bioinformatics tools irreplaceably contribute to determination of their functional effects. We have developed feature-based algorithm for the detection of mutations outside conserved functional domains (CFDs) and compared its classification efficacy with the most commonly used phylogeny-based tools, PolyPhen-2 and SIFT. The new algorithm is based on the informational spectrum method (ISM), a feature-based technique, and statistical analysis. Our dataset contained neutral polymorphisms and mutations associated with myeloid malignancies from epigenetic regulators ASXL1, DNMT3A, EZH2, and TET2. PolyPhen-2 and SIFT had significantly lower accuracies in predicting the effects of amino acid substitutions outside CFDs than expected, with especially low sensitivity. On the other hand, only ISM algorithm showed statistically significant classification of these sequences. It outperformed PolyPhen-2 and SIFT by 15% and 13%, respectively. These results suggest that feature-based methods, like ISM, are more suitable for the classification of amino acid substitutions outside CFDs than phylogeny-based tools.
Affiliation(s)
- Branislava Gemovic
- Centre for Multidisciplinary Research and Engineering, Vinca Institute of Nuclear Sciences, University of Belgrade, 12-14 Mihajla Petrovica Alasa, 11001 Belgrade, Serbia
- Vladimir Perovic
- Centre for Multidisciplinary Research and Engineering, Vinca Institute of Nuclear Sciences, University of Belgrade, 12-14 Mihajla Petrovica Alasa, 11001 Belgrade, Serbia
- Sanja Glisic
- Centre for Multidisciplinary Research and Engineering, Vinca Institute of Nuclear Sciences, University of Belgrade, 12-14 Mihajla Petrovica Alasa, 11001 Belgrade, Serbia
- Nevena Veljkovic
- Centre for Multidisciplinary Research and Engineering, Vinca Institute of Nuclear Sciences, University of Belgrade, 12-14 Mihajla Petrovica Alasa, 11001 Belgrade, Serbia
|
35
|
In Silico survey of functional coding variants in human AEG-1 gene. Egyptian Journal of Medical Human Genetics 2013. [DOI: 10.1016/j.ejmhg.2013.08.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022] Open
|
36
|
C GPD, Rajith B, Chakraborty C. Predicting the impact of deleterious mutations in the protein kinase domain of FGFR2 in the context of function, structure, and pathogenesis--a bioinformatics approach. Appl Biochem Biotechnol 2013; 170:1853-70. [PMID: 23754559 DOI: 10.1007/s12010-013-0315-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/14/2013] [Accepted: 05/27/2013] [Indexed: 11/26/2022]
Abstract
Fibroblast growth factor receptor 2 (FGFR2) controls a wide range of biological functions by regulating the cellular proliferation, survival, migration and differentiation. A growing body of preclinical data demonstrated that deregulation of the FGFR signalling through genetic modification was observed in various types of cancers. However, the extent to which genetic modifications interfere with gene regulation and their involvement in cancer susceptibility remains largely unknown. In this work, we performed in silico profiling of harmful non-synonymous single nucleotide polymorphisms (SNPs) in the protein kinase domain of FGFR2. Tolerance index, position-specific independent count score, change in free energy score (ΔΔG), Eris and FoldX indicated that seven mutations were found to be deleterious and may alter the protein function and structure. Furthermore, based on physico-chemical properties, two mutations K659N and R747H were found to be most deleterious in protein kinase domain and taken for further structural analysis. Docking study showed a complete loss of binding affinity followed by interference in hydrogen bonding and surrounding residues due to K659N and R747H mutations. In order to elucidate the mechanism behind the impact of mutation that can generate a ripple effect throughout the protein structure and ultimately affect the function, in-depth molecular dynamics simulation and principal component analysis were performed. The obtained results indicate that K659N and R747H mutations have a distinct effect on the dynamic behaviour of FGFR2 protein. Our strategy may be helpful for understanding SNP effects on proteins with function and their role in human genetic diseases and for the development of novel pharmacological strategies.
Affiliation(s)
- George Priya Doss C
- Medical Biotechnology Division, School of Biosciences and Technology, VIT University, Vellore, 632014, Tamil Nadu, India.
|
37
|
Goldstein DB, Allen A, Keebler J, Margulies EH, Petrou S, Petrovski S, Sunyaev S. Sequencing studies in human genetics: design and interpretation. Nat Rev Genet 2013; 14:460-70. [PMID: 23752795 DOI: 10.1038/nrg3455] [Citation(s) in RCA: 185] [Impact Index Per Article: 16.8] [Indexed: 12/11/2022]
Abstract
Next-generation sequencing is becoming the primary discovery tool in human genetics. There have been many clear successes in identifying genes that are responsible for Mendelian diseases, and sequencing approaches are now poised to identify the mutations that cause undiagnosed childhood genetic diseases and those that predispose individuals to more common complex diseases. There are, however, growing concerns that the complexity and magnitude of complete sequence data could lead to an explosion of weakly justified claims of association between genetic variants and disease. Here, we provide an overview of the basic workflow in next-generation sequencing studies and emphasize, where possible, measures and considerations that facilitate accurate inferences from human sequencing studies.
Affiliation(s)
- David B Goldstein
- Center for Human Genome Variation, Duke University School of Medicine, 308 Research Drive, Box 91009, LSRC B Wing, Room 330, Durham, North Carolina 27708, USA.
|
38
|
Gene dosage effects: nonlinearities, genetic interactions, and dosage compensation. Trends Genet 2013; 29:385-93. [PMID: 23684842 DOI: 10.1016/j.tig.2013.04.004] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Received: 01/23/2013] [Revised: 03/23/2013] [Accepted: 04/15/2013] [Indexed: 11/20/2022]
Abstract
High-throughput genomic analyses have shown that many mutations, including loss-of-function (LOF) mutations, are present in diseased as well as in healthy individuals. Gene dosage effects due to deletions, duplications, and LOF mutations provide avenues to explore oligo- and multigenic inheritance. Here, we focus on several mechanisms that mediate gene dosage effects and analyze biochemical interactions among multiple gene products that are sources of nonlinear relations connecting genotypes and phenotypes. We also explore potential mechanisms that compensate for gene dosage effects. Understanding these issues is critical to understanding why an individual bearing a few damaging mutations can be severely diseased, whereas others harboring tens of potentially deleterious mutations can appear quite healthy.
|
39
|
El-Yazbi AF, Loppnow GR. Chimeric RNA–DNA Molecular Beacons for Quantification of Nucleic Acids, Single Nucleotide Polymophisms, and Nucleic Acid Damage. Anal Chem 2013; 85:4321-7. [DOI: 10.1021/ac301669y] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
Affiliation(s)
- Amira F. El-Yazbi
- Department of Chemistry, University of Alberta, Edmonton, AB T6G 2G2 Canada
- Glen R. Loppnow
- Department of Chemistry, University of Alberta, Edmonton, AB T6G 2G2 Canada
|
40
|
Nevin Gerek Z, Kumar S, Banu Ozkan S. Structural dynamics flexibility informs function and evolution at a proteome scale. Evol Appl 2013; 6:423-33. [PMID: 23745135 PMCID: PMC3673471 DOI: 10.1111/eva.12052] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Received: 10/23/2012] [Accepted: 01/13/2013] [Indexed: 01/04/2023] Open
Abstract
Protein structures are dynamic entities with a myriad of atomic fluctuations, side-chain rotations, and collective domain movements. Although the importance of these dynamics to proper functioning of proteins is emerging in the studies of many protein families, there is a lack of broad evidence for the critical role of protein dynamics in shaping the biological functions of a substantial fraction of residues for a large number of proteins in the human proteome. Here, we propose a novel dynamic flexibility index (dfi) to quantify the dynamic properties of individual residues in any protein and use it to assess the importance of protein dynamics in 100 human proteins. Our analyses involving functionally critical positions, disease-associated and putatively neutral population variations, and the rate of interspecific substitutions per residue produce concordant patterns at a proteome scale. They establish that the preservation of dynamic properties of residues in a protein structure is critical for maintaining the protein/biological function. Therefore, structural dynamics needs to become a major component of the analysis of protein function and evolution. Such analyses will be facilitated by the dfi, which will also enable the integrative use of structural dynamics with evolutionary conservation in genomic medicine as well as functional genomics investigations.
Affiliation(s)
- Zeynep Nevin Gerek
- Center for Evolutionary Medicine and Informatics, Biodesign Institute, Arizona State University Tempe, AZ, USA ; Department of Physics, Center for Biological Physics, Bateman Physical Sciences F-Wing, Arizona State University Tempe, AZ, USA
|
41
|
Reimand J, Bader GD. Systematic analysis of somatic mutations in phosphorylation signaling predicts novel cancer drivers. Mol Syst Biol 2013; 9:637. [PMID: 23340843 PMCID: PMC3564258 DOI: 10.1038/msb.2012.68] [Citation(s) in RCA: 197] [Impact Index Per Article: 17.9] [Received: 05/04/2012] [Accepted: 12/06/2012] [Indexed: 12/20/2022] Open
Abstract
Large-scale cancer genome sequencing has uncovered thousands of gene mutations, but distinguishing tumor driver genes from functionally neutral passenger mutations is a major challenge. We analyzed 800 cancer genomes of eight types to find single-nucleotide variants (SNVs) that precisely target phosphorylation machinery, important in cancer development and drug targeting. Assuming that cancer-related biological systems involve unexpectedly frequent mutations, we used novel algorithms to identify genes with significant phosphorylation-associated SNVs (pSNVs), phospho-mutated pathways, kinase networks, drug targets, and clinically correlated signaling modules. We highlight increased survival of patients with TP53 pSNVs, hierarchically organized cancer kinase modules, a novel pSNV in EGFR, and an immune-related network of pSNVs that correlates with prolonged survival in ovarian cancer. Our findings include multiple actionable cancer gene candidates (FLNB, GRM1, POU2F1), protein complexes (HCF1, ASF1), and kinases (PRKCZ). This study demonstrates new ways of interpreting cancer genomes and presents new leads for cancer research.
Affiliation(s)
- Jüri Reimand
- The Donnelly Centre, University of Toronto, Toronto, Canada
- Gary D Bader
- The Donnelly Centre, University of Toronto, Toronto, Canada
|
42
|
Disease-associated mutations disrupt functionally important regions of intrinsic protein disorder. PLoS Comput Biol 2012; 8:e1002709. [PMID: 23055912 PMCID: PMC3464192 DOI: 10.1371/journal.pcbi.1002709] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Received: 01/31/2012] [Accepted: 08/14/2012] [Indexed: 01/01/2023] Open
Abstract
The effects of disease mutations on protein structure and function have been extensively investigated, and many predictors of the functional impact of single amino acid substitutions are publicly available. The majority of these predictors are based on protein structure and evolutionary conservation, following the assumption that disease mutations predominantly affect folded and conserved protein regions. However, the prevalence of the intrinsically disordered proteins (IDPs) and regions (IDRs) in the human proteome together with their lack of fixed structure and low sequence conservation raise a question about the impact of disease mutations in IDRs. Here, we investigate annotated missense disease mutations and show that 21.7% of them are located within such intrinsically disordered regions. We further demonstrate that 20% of disease mutations in IDRs cause local disorder-to-order transitions, which represents a 1.7–2.7 fold increase compared to annotated polymorphisms and neutral evolutionary substitutions, respectively. Secondary structure predictions show elevated rates of transition from helices and strands into loops and vice versa in the disease mutations dataset. Disease disorder-to-order mutations also influence predicted molecular recognition features (MoRFs) more often than the control mutations. The repertoire of disorder-to-order transition mutations is limited, with five most frequent mutations (R→W, R→C, E→K, R→H, R→Q) collectively accounting for 44% of all deleterious disorder-to-order transitions. As a proof of concept, we performed accelerated molecular dynamics simulations on a deleterious disorder-to-order transition mutation of tumor protein p63 and, in agreement with our predictions, observed an increased α-helical propensity of the region harboring the mutation. Our findings highlight the importance of mutations in IDRs and refine the traditional structure-centric view of disease mutations. 
The results of this study offer a new perspective on the role of mutations in disease, with implications for improving predictors of the functional impact of missense mutations. Intrinsically unstructured or disordered proteins have been implicated in the etiology of a wide spectrum of diseases. However, the molecular mechanisms that relate mutations in intrinsically disordered regions (IDRs) to disease pathogenesis have not been investigated. Disordered proteins do not conform to the prevailing view of deleterious mutations which equates function, structure and evolutionary conservation – intrinsically disordered regions are functional, but lack a fixed three-dimensional structure and in general have low sequence conservation. Here we demonstrate that >20% of disease-associated missense mutations affect IDRs and interfere with their functions. We further show that 20% of deleterious mutations in IDRs induce predicted disorder-to-order transitions. Our predictions are supported by accelerated molecular dynamics simulations that show an increase in helical propensity of the region harboring a disease disorder-to-order transition mutation of tumor protein p63. Our results refine the traditional structure-centric view of disease mutations and offer a new perspective on the role of non-synonymous mutations in disease. Our findings have broad implications for improving predictors of the functional impact of missense mutations, and for interpretation of novel variants identified in large genome sequencing projects that aim to provide a better understanding of human genetic variation and its relevance to common diseases.
|
43
|
Structure-based mutant stability predictions on proteins of unknown structure. J Biotechnol 2012; 161:287-93. [DOI: 10.1016/j.jbiotec.2012.06.020] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 03/21/2012] [Revised: 06/19/2012] [Accepted: 06/22/2012] [Indexed: 11/23/2022]
|
44
|
Sunyaev SR. Inferring causality and functional significance of human coding DNA variants. Hum Mol Genet 2012; 21:R10-7. [PMID: 22990389 DOI: 10.1093/hmg/dds385] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 12/22/2022] Open
Abstract
Sequencing technology enables the complete characterization of human genetic variation. Statistical genetics studies identify numerous loci linked to or associated with phenotypes of direct medical interest. The major remaining challenge is to characterize functionally significant alleles that are causally implicated in the genetic basis of human traits. Here, I review three sources of evidence for the functional significance of human DNA variants in protein-coding genes. These include (i) statistical genetics considerations such as co-segregation with the phenotype, allele frequency in unaffected controls and recurrence; (ii) in vitro functional assays and model organism experiments; and (iii) computational methods for predicting the functional effect of amino acid substitutions. In spite of many successes of recent studies, functional characterization of human allelic variants remains problematic.
Affiliation(s)
- Shamil R Sunyaev
- Genetics Division, Brigham and Women's Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.
|
45
|
Tennant MR, Edwards M, Miyamoto MM. Redesigning a library-based genetics class research project through instructional theory and authentic experience. J Med Libr Assoc 2012; 100:90-7. [PMID: 22514504 DOI: 10.3163/1536-5050.100.2.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/03/2022] Open
Abstract
QUESTION: How can the library-based research project of a genetics course be reinvigorated and made sustainable without sacrificing educational integrity? SETTING: The University of Florida's Health Science Center Library provides the case study. METHODS: Since 1996, the librarian has codeveloped, supported, and graded all components of the project. In 2009, the project evolved from a single-authored paper to a group-work poster, with graded presentations hosted by the library. In 2010, students were surveyed regarding class enhancements. RESULTS: Responses indicated a preference for collaborative work and the poster format and suggested the changes facilitated learning. Instructors reported that the poster format more clearly documented students' understanding of genetics. CONCLUSION: Results suggest project enhancements contributed to greater appreciation, understanding, and application of classroom material and offered a unique and authentic learning experience, without compromising educational integrity. The library benefitted through increased visibility as a partner in the educational mission and development of a sustainable instructional collaboration.
Affiliation(s)
- Michele R Tennant
- Biomedical and Health Information Services, Health Science Center Libraries, University of Florida, Gainesville, FL 32610-0206, USA.
|
46
|
Computational exploration of polymorphisms in 5-Hydroxytryptamine 5-HT1A and 5-HT2A receptors associated with psychiatric disease. Gene 2012; 502:16-26. [DOI: 10.1016/j.gene.2012.04.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/30/2011] [Revised: 02/13/2012] [Accepted: 04/05/2012] [Indexed: 01/12/2023]
|
47
|
|
48
|
Masica DL, Sosnay PR, Cutting GR, Karchin R. Phenotype-optimized sequence ensembles substantially improve prediction of disease-causing mutation in cystic fibrosis. Hum Mutat 2012; 33:1267-74. [PMID: 22573477 DOI: 10.1002/humu.22110] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 12/14/2011] [Accepted: 04/12/2012] [Indexed: 12/20/2022]
Abstract
Cystic fibrosis transmembrane conductance regulator (CFTR) mutation is associated with a phenotypic spectrum that includes cystic fibrosis (CF). The disease liability of some common CFTR mutations is known, but rare mutations are seen in too few patients to categorize unequivocally, making genetic diagnosis difficult. Computational methods can predict the impact of mutation, but prediction specificity is often below that required for clinical utility. Here, we present a novel supervised learning approach for predicting CF from CFTR missense mutation. The algorithm begins by constructing custom multiple sequence alignments called phenotype-optimized sequence ensembles (POSEs). POSEs are constructed iteratively, by selecting sequences that optimize predictive performance on a training set of CFTR mutations of known clinical significance. Next, we predict CF disease liability from a different set of CFTR mutations (test-set mutations). This approach achieves improved prediction performance relative to popular methods recently assessed using the same test-set mutations. Of clinical significance, our method achieves 94% prediction specificity. Because databases such as HGMD and locus-specific mutation databases are growing rapidly, methods that automatically tailor their predictions for a specific phenotype may be of immediate utility. If the performance achieved here generalizes to other systems, the approach could be an excellent tool to help establish genetic diagnoses.
Affiliation(s)
- David L Masica
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, USA
|
49
|
Peterson TA, Nehrt NL, Park D, Kann MG. Incorporating molecular and functional context into the analysis and prioritization of human variants associated with cancer. J Am Med Inform Assoc 2012; 19:275-83. [PMID: 22319177 DOI: 10.1136/amiajnl-2011-000655] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/04/2022] Open
Abstract
BACKGROUND AND OBJECTIVE: With recent breakthroughs in high-throughput sequencing, identifying deleterious mutations is one of the key challenges for personalized medicine. At the gene and protein level, it has proven difficult to determine the impact of previously unknown variants. A statistical method has been developed to assess the significance of disease mutation clusters on protein domains by incorporating domain functional annotations to assist in the functional characterization of novel variants. METHODS: Disease mutations aggregated from multiple databases were mapped to domains, and were classified as either cancer- or non-cancer-related. The statistical method for identifying significantly disease-associated domain positions was applied to both sets of mutations and to randomly generated mutation sets for comparison. To leverage the known function of protein domain regions, the method optionally distributes significant scores to associated functional feature positions. RESULTS: Most disease mutations are localized within protein domains and display a tendency to cluster at individual domain positions. The method identified significant disease mutation hotspots in both the cancer and non-cancer datasets. The domain significance scores (DS-scores) for cancer form a bimodal distribution with hotspots in oncogenes forming a second peak at higher DS-scores than non-cancer, and hotspots in tumor suppressors have scores more similar to non-cancers. In addition, on an independent mutation benchmarking set, the DS-score method identified mutations known to alter protein function with very high precision. CONCLUSION: By aggregating mutations with known disease association at the domain level, the method was able to discover domain positions enriched with multiple occurrences of deleterious mutations while incorporating relevant functional annotations. The method can be incorporated into translational bioinformatics tools to characterize rare and novel variants within large-scale sequencing studies.
Affiliation(s)
- Thomas A Peterson
- University of Maryland, Baltimore County, Baltimore, Maryland 21250, USA
|
50
|
Luu TD, Rusu AM, Walter V, Ripp R, Moulinier L, Muller J, Toursel T, Thompson JD, Poch O, Nguyen H. MSV3d: database of human MisSense Variants mapped to 3D protein structure. Database: The Journal of Biological Databases and Curation 2012; 2012:bas018. [PMID: 22491796 PMCID: PMC3317913 DOI: 10.1093/database/bas018] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 01/03/2023]
Abstract
The elucidation of the complex relationships linking genotypic and phenotypic variations to protein structure is a major challenge in the post-genomic era. We present MSV3d (Database of human MisSense Variants mapped to 3D protein structure), a new database that contains detailed annotation of missense variants of all human proteins (20 199 proteins). The multi-level characterization includes details of the physico-chemical changes induced by amino acid modification, as well as information related to the conservation of the mutated residue and its position relative to functional features in the available or predicted 3D model. Major releases of the database are automatically generated and updated regularly in line with the dbSNP (database of Single Nucleotide Polymorphism) and SwissVar releases, by exploiting the extensive Décrypthon computational grid resources. The database (http://decrypthon.igbmc.fr/msv3d) is easily accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content is completely or partially downloadable in XML or flat file formats. Database URL: http://decrypthon.igbmc.fr/msv3d
Affiliation(s)
- Tien-Dao Luu
- Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire (UMR7104), 67404 Illkirch
|