1
Quantifying negative selection on synonymous variants. HGG ADVANCES 2024; 5:100262. [PMID: 38192100 PMCID: PMC10835449 DOI: 10.1016/j.xhgg.2024.100262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 01/01/2024] [Accepted: 01/01/2024] [Indexed: 01/10/2024] Open
Abstract
Widespread adoption of DNA sequencing has resulted in large numbers of genetic variants, whose contribution to disease is not easily determined. Although many types of variation are known to disrupt cellular processes in predictable ways, for some categories of variants, the effects may not be directly detectable. A particular example is synonymous variants, that is, those single-nucleotide variants that create a codon substitution, such that the produced amino acid sequence is unaffected. Contrary to the original theory suggesting that synonymous variants are benign, there is a growing volume of research showing that, despite their "silent" mechanism of action, some synonymous variation may be deleterious. Here, we studied the extent of the negative selective pressure acting on different classes of synonymous variants by analyzing the relative enrichment of synonymous singleton variants in the human exomes provided by gnomAD. Using a modification of the mutability-adjusted proportion of singletons (MAPS) metric as a measure of purifying selection, we found that some classes of synonymous variants are subject to stronger negative selection than others. For instance, variants that reduce codon optimality undergo stronger selection than optimality-increasing variants. Selection also acts on synonymous variants implicated in splice-site-loss or splice-site-gain events. To understand what drives this negative selection, we tested a number of predictors with the aim of explaining the variability in the selection scores. Our findings provide insights into the effects of synonymous variants at the population level, highlighting the specifics of the role that these variants play in health and disease.
2
Broken silence: 22,841 predicted deleterious synonymous variants identified in the human exome through computational analysis. Genet Mol Biol 2024; 46:e20230125. [PMID: 38259032 PMCID: PMC10804382 DOI: 10.1590/1678-4685-gmb-2023-0125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 12/10/2023] [Indexed: 01/24/2024] Open
Abstract
Synonymous single nucleotide variants (sSNVs) do not alter the primary structure of a protein; thus, it was previously accepted that they were neutral. Recently, several studies demonstrated their significance to a range of diseases. Still, variant prioritization strategies lack focus on sSNVs. Here, we identified 22,841 deleterious synonymous variants in 125,748 human exomes using two in silico predictors (SilVA and CADD). While 98.2% of synonymous variants are classified as neutral, 1.8% are predicted to be deleterious, yielding an average of 9.82 neutral and 0.18 deleterious sSNVs per exome. Further investigation of prediction features via Heterogeneous Ensemble Feature Selection revealed that impact on amino acid sequence and conservation carry the most weight for a deleterious prediction. Thirty-nine detrimental sSNVs are not rare and are located in disease-associated genes. Ten distinct putatively non-deleterious sSNVs are likely to be under positive selection in the North-Western European and East Asian populations. Taken together, our analysis gives voice to the so-called silent mutations as we propose a robust framework for evaluating the deleteriousness of sSNVs in variant prioritization studies.
3
Synonymous variants in the ATP6AP2 gene may lead to developmental and epileptic encephalopathy. Front Neurol 2024; 14:1320514. [PMID: 38274877 PMCID: PMC10808393 DOI: 10.3389/fneur.2023.1320514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Accepted: 12/22/2023] [Indexed: 01/27/2024] Open
Abstract
Objective: According to the literature, variants in the ATP6AP2 gene may cause abnormal nervous system development and associated neurological symptoms. Methods: We report a patient with developmental and epileptic encephalopathy (DEE) carrying an ATP6AP2 c.858G>A (p.Ala286=) synonymous variant. In addition, previously reported patients with the same variant were collected and summarized for comparison with our findings. Results: The patient started experiencing tonic seizures at 3.5 months of age, and magnetic resonance imaging (MRI) indicated impaired brain white matter development and reduced left hippocampal volume. Furthermore, electroencephalography showed multifocal interictal epileptiform discharges. Treatment with various anti-seizure medications yielded unsatisfactory results, and the disorder eventually developed into epileptic spasms. An in vitro splicing assay for the ATP6AP2 gene mRNA revealed that the variant caused a deletion in exon 8 and a corresponding protein truncation. A review of previously reported ATP6AP2-related DEE patients found that synonymous variants in the ATP6AP2 gene can cause early DEE onset, progressive changes in early-life MRI, and exon skipping in all ATP6AP2-related DEE patients. Significance: We found that synonymous variants in ATP6AP2 may have significant pathogenicity and are highly correlated with DEE. Due to increased isoform production, ATP6AP2 synonymous variants may cause nervous system developmental disorders by competitively reducing the generation of full-length transcripts, resulting in defects in ATP6AP2-related physiological processes.
4
Implementing computational methods in tandem with synonymous gene recoding for therapeutic development. Trends Pharmacol Sci 2023; 44:73-84. [PMID: 36307252 DOI: 10.1016/j.tips.2022.09.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/21/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022] [Indexed: 12/24/2022]
Abstract
Synonymous gene recoding, the substitution of synonymous variants into the genetic sequence, has been used to overcome many production limitations in therapeutic development. However, the safety and efficacy of recoded therapeutics can be difficult to evaluate because synonymous codon substitutions can result in subtle, yet impactful changes in protein features and require sensitive methods for detection. Given that computational approaches have made significant leaps in recent years, we propose that machine-learning (ML) tools may be leveraged to assess gene-recoded therapeutics and foresee an opportunity to adapt codon contexts to enhance some powerful existing tools. Here, we examine how synonymous gene recoding has been used to address challenges in therapeutic development, explain the biological mechanisms underlying its effects, and explore the application of computational platforms to improve the surveillance of functional variants in therapeutic design.
5
Inferring Potential Cancer Driving Synonymous Variants. Genes (Basel) 2022; 13:778. [PMID: 35627162 PMCID: PMC9140830 DOI: 10.3390/genes13050778] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2022] [Revised: 04/25/2022] [Accepted: 04/26/2022] [Indexed: 02/01/2023] Open
Abstract
Synonymous single nucleotide variants (sSNVs) are often considered functionally silent, but a few cases of cancer-causing sSNVs have been reported. From available databases, we collected four categories of sSNVs: germline, somatic in normal tissues, somatic in cancerous tissues, and putative cancer drivers. We found that screening sSNVs for recurrence among patients, conservation of the affected genomic position, and synVep prediction (synVep is a machine learning-based sSNV effect predictor) recovers cancer driver variants (termed proposed drivers) and previously unknown putative cancer genes. Of the 2.9 million somatic sSNVs found in the COSMIC database, we identified 2111 proposed cancer driver sSNVs. Of these, 326 sSNVs could be further tagged for possible RNA splicing effects, RNA structural changes, and affected RBP motifs. This list of proposed cancer driver sSNVs provides computational guidance in prioritizing the experimental evaluation of synonymous mutations found in cancers. Furthermore, our list of novel potential cancer genes, galvanized by synonymous mutations, may highlight yet unexplored cancer mechanisms.
6
Characterization of Synonymous BRCA1:c.132C>T as a Pathogenic Variant. Front Oncol 2022; 11:812656. [PMID: 35087763 PMCID: PMC8789006 DOI: 10.3389/fonc.2021.812656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Accepted: 12/08/2021] [Indexed: 11/26/2022] Open
Abstract
Breast cancer gene 1 (BRCA1) and BRCA2 are tumor suppressors involved in DNA damage response and repair. Carriers of germline pathogenic or likely pathogenic variants in BRCA1 or BRCA2 have significantly increased lifetime risks of breast cancer, ovarian cancer, and other cancer types; this phenomenon is known as hereditary breast and ovarian cancer (HBOC) syndrome. Accurate interpretation of BRCA1 and BRCA2 variants is important not only for disease management in patients, but also for determining preventative measures for their families. BRCA1:c.132C>T (p.Cys44=) is a synonymous variant recorded in the ClinVar database with “conflicting interpretations of its pathogenicity”. Here, we report our clinical tests in which we identified this variant in two unrelated patients, both of whom developed breast cancer at an early age with ovarian presentation a few years later and had a family history of relevant cancers. A minigene assay showed that this change caused a four-nucleotide loss at the end of exon 3, resulting in a truncated p.Cys44Tyrfs*5 protein. Reverse transcription-polymerase chain reaction identified two fragments (123 and 119 bp) using RNA isolated from patient blood samples, consistent with the results of the minigene assay. Collectively, we classified BRCA1:c.132C>T (p.Cys44=) as a pathogenic variant, as evidenced by functional studies, RNA analysis, and the patients’ family histories. By analyzing variants recorded in the BRCA Exchange database, we found that synonymous changes at the ends of exons could potentially influence splicing; meanwhile, current in silico tools could not predict splicing changes efficiently if the variants were in the middle of an exon or in the deep intron region. Future studies should attempt to identify variants that influence gene expression and post-transcription modifications to improve our understanding of BRCA1 and BRCA2, as well as their related cancers.
7
New approaches to predict the effect of co-occurring variants on protein characteristics. Am J Hum Genet 2021; 108:1502-1511. [PMID: 34256028 DOI: 10.1016/j.ajhg.2021.06.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/02/2021] [Accepted: 06/14/2021] [Indexed: 12/14/2022] Open
Abstract
Predicting the effect of a mutated gene before the onset of symptoms of genetic diseases would greatly facilitate diagnosis and potentiate early intervention. There have been myriad attempts to predict the effects of single-nucleotide variants. However, the applicability of these efforts does not scale to co-occurring variants. Furthermore, an increasing number of protein therapeutics contain co-occurring nucleotide variations, adding uncertainty during development to the safety and efficiency of these drugs. Co-occurring nucleotide variants may often have synergistic, additive, or antagonistic effects on protein attributes, further complicating the task of outcome prediction. We tested four models based on the cooperative and antagonistic effects of co-occurring variants to predict pathogenicity and effectiveness of protein therapeutics. A total of 30 attributes, including amino acid and nucleotide features, as well as existing single-variant effect prediction tools, were considered on the basis of previous studies on single-nucleotide variants. Importantly, the effects of synonymous variants, often seen in protein therapeutics, were also included in our models. We used 12 datasets of people with monogenic diseases and controls with co-occurring genetic variants to evaluate the accuracy of our models, accomplishing a degree of accuracy comparable to that of prediction tools for single-nucleotide variants. More importantly, our framework is generalizable to new, well-curated datasets of monogenic diseases and new variant scoring tools. This approach successfully assists in addressing the challenging task of predicting the effect of co-occurring variants on pathogenicity and protein effectiveness and is applicable for a wide range of protein therapeutics and genetic diseases.
8
Synonymous variants in holoprosencephaly alter codon usage and impact the Sonic Hedgehog protein. Brain 2020; 143:2027-2038. [PMID: 32542401 DOI: 10.1093/brain/awaa152] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/25/2019] [Revised: 03/04/2020] [Accepted: 03/21/2020] [Indexed: 11/13/2022] Open
Abstract
Synonymous single nucleotide variants (sSNVs) have been implicated in various genetic disorders through alterations of pre-mRNA splicing, mRNA structure and miRNA regulation. However, their impact on synonymous codon usage and protein translation remains to be elucidated in clinical context. Here, we explore the functional impact of sSNVs in the Sonic Hedgehog (SHH) gene, identified in patients affected by holoprosencephaly, a congenital brain defect resulting from incomplete forebrain cleavage. We identified eight sSNVs in SHH, selectively enriched in holoprosencephaly patients as compared to healthy individuals, and systematically assessed their effect at both transcriptional and translational levels using a series of in silico and in vitro approaches. Although no evidence of impact of these sSNVs on splicing, mRNA structure or miRNA regulation was found, five sSNVs introduced significant changes in codon usage and were predicted to impact protein translation. Cell assays demonstrated that these five sSNVs are associated with a significantly reduced amount of the resulting protein, ranging from 5% to 23%. Inhibition of the proteasome rescued the protein levels for four out of five sSNVs, confirming their impact on protein stability and folding. Remarkably, we found a significant correlation between experimental values of protein reduction and computational measures of codon usage, indicating the relevance of in silico models in predicting the impact of sSNVs on translation. Considering the critical role of SHH in brain development, our findings highlight the clinical relevance of sSNVs in holoprosencephaly and underline the importance of investigating their impact on translation in human pathologies.
9
Rare-variant pathogenicity triage and inclusion of synonymous variants improves analysis of disease associations of orphan G protein-coupled receptors. J Biol Chem 2019; 294:18109-18121. [PMID: 31628190 DOI: 10.1074/jbc.ra119.009253] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/07/2019] [Revised: 10/08/2019] [Indexed: 02/02/2023] Open
Abstract
The pace of deorphanization of G protein-coupled receptors (GPCRs) has slowed, and new approaches are required. Small molecule targeting of orphan GPCRs can potentially be of clinical benefit even if the endogenous receptor ligand has not been identified. Many GPCRs lack common variants that lead to reproducible genome-wide disease associations, and rare-variant approaches have emerged as a viable alternative to identify disease associations for such genes. Therefore, our goal was to prioritize orphan GPCRs by determining their associations with human diseases in a large clinical population. We used sequence kernel association tests to assess the disease associations of 85 orphan or understudied GPCRs in an unselected cohort of 51,289 individuals. Using rare loss-of-function variants, missense variants predicted to be pathogenic or likely pathogenic, and a subset of rare synonymous variants that cause large changes in local codon bias as independent data sets, we found strong, phenome-wide disease associations shared by two or more variant categories for 39% of the GPCRs. To validate the bioinformatics and sequence kernel association test analyses, we functionally characterized rare missense and synonymous variants of GPR39, a family A GPCR, revealing altered expression or Zn2+-mediated signaling for members of both variant classes. These results support the utility of rare variant analyses for identifying disease associations for GPCRs that lack impactful common variants. We highlight the importance of rare synonymous variants in human physiology and argue for their routine inclusion in any comprehensive analysis of genomic variants as potential causes of disease.
10
Predicting Functional Effects of Synonymous Variants: A Systematic Review and Perspectives. Front Genet 2019; 10:914. [PMID: 31649718 PMCID: PMC6791167 DOI: 10.3389/fgene.2019.00914] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 06/28/2019] [Accepted: 08/29/2019] [Indexed: 12/13/2022] Open
Abstract
Recent advances in high-throughput experimentation have put the exploration of genome sequences at the forefront of precision medicine. In an effort to interpret the sequencing data, numerous computational methods have been developed for evaluating the effects of genome variants. Interestingly, despite the fact that every person has as many synonymous (sSNV) as non-synonymous single nucleotide variants, our ability to predict their effects is limited. The paucity of experimentally tested sSNV effects appears to be the limiting factor in development of such methods. Here, we summarize the details and evaluate the performance of nine existing computational methods capable of predicting sSNV effects. We used a set of observed and artificially generated variants to approximate large scale performance expectations of these tools. We note that the distribution of these variants across amino acid and codon types suggests purifying evolutionary selection retaining generated variants out of the observed set; i.e., we expect the generated set to be enriched for deleterious variants. Closer inspection of the relationship between the observed variant frequencies and the associated prediction scores identifies predictor-specific scoring thresholds of reliable effect predictions. Notably, across all predictors, the variants scoring above these thresholds were significantly more often generated than observed, which confirms our assumption that the generated set is enriched for deleterious variants. Finally, we find that while the methods differ in their ability to identify severe sSNV effects, no predictor appears capable of definitively recognizing subtle effects of such variants on a large scale.
11
Splicing dysregulation contributes to the pathogenicity of several F9 exonic point variants. Mol Genet Genomic Med 2019; 7:e840. [PMID: 31257730 PMCID: PMC6687662 DOI: 10.1002/mgg3.840] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/10/2019] [Accepted: 06/10/2019] [Indexed: 12/27/2022] Open
Abstract
Background: Pre‐mRNA splicing is a complex process requiring the identification of donor site, acceptor site, and branch point site with an adjacent polypyrimidine tract sequence. Splicing is regulated by splicing regulatory elements (SREs) with both enhancer and suppressor functions. Variants located in exonic regions can impact splicing through dysregulation of native splice sites, SREs, and cryptic splice site activation. While splicing dysregulation is considered the primary disease‐inducing mechanism of synonymous variants, its contribution toward the disease phenotype of non‐synonymous variants is underappreciated. Methods: In this study, we analyzed 415 disease‐causing and 120 neutral F9 exonic point variants, both synonymous and non‐synonymous, for their effect on splicing using a series of in silico splice site prediction tools, SRE prediction tools, and in vitro minigene assays. Results: Using splice site and SRE prediction tools in tandem provided better predictions, but these were not always in agreement with the minigene assays. The net effect of splicing dysregulation caused by variants was context dependent. Minigene assays revealed that perturbed splicing can be found. Conclusion: Synonymous variants primarily cause disease phenotype via splicing dysregulation, while additional mechanisms such as translation rate also play an important role. Splicing dysregulation is likely to contribute to the disease phenotype of several non‐synonymous variants.
12
Where are the missing gene defects in inherited retinal disorders? Intronic and synonymous variants contribute at least to 4% of CACNA1F-mediated inherited retinal disorders. Hum Mutat 2019; 40:765-787. [PMID: 30825406 DOI: 10.1002/humu.23735] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/09/2019] [Revised: 02/15/2019] [Accepted: 02/26/2019] [Indexed: 12/27/2022]
Abstract
Inherited retinal disorders (IRD) represent clinically and genetically heterogeneous diseases. To date, pathogenic variants have been identified in ~260 genes. Although many genes are implicated in IRD, for 30-50% of the cases, the gene defect is unknown. These cases may be explained by novel gene defects, by overlooked structural variants, by variants in intronic, promoter or more distant regulatory regions, or by synonymous variants of known genes that contribute to the dysfunction of the respective proteins. Patients with one subgroup of IRD, namely incomplete congenital stationary night blindness (icCSNB), show a very specific phenotype. The major cause of this condition is the presence of a hemizygous pathogenic variant in CACNA1F. A comprehensive study applying direct Sanger sequencing of the gene-coding regions, exome sequencing, and genome sequencing to a large cohort of patients with a clinical diagnosis of icCSNB revealed that seven of the 189 CACNA1F-related cases have intronic and synonymous disease-causing variants leading to missplicing, as validated by minigene approaches. These findings highlight that gene-locus sequencing may be a very efficient method for detecting disease-causing variants in clinically well-characterized patients with a diagnosis of IRD, such as icCSNB.
13
Synonymous Somatic Variants in Human Cancer Are Not Infamous: A Plea for Full Disclosure in Databases and Publications. Hum Mutat 2017; 38:339-342. [PMID: 28026089 DOI: 10.1002/humu.23163] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 09/08/2016] [Revised: 11/28/2016] [Accepted: 12/11/2016] [Indexed: 12/12/2022]
Abstract
Single-nucleotide variants (SNVs) are the most frequent genetic changes found in human cancer. Most driver alterations are missense and nonsense variants localized in the coding region of cancer genes. Unbiased cancer genome sequencing shows that synonymous SNVs (sSNVs) can be found clustered in the coding regions of several cancer oncogenes or tumor suppressor genes suggesting purifying selection. sSNVs are currently underestimated, as they are usually discarded during analysis. Furthermore, several public databases do not display sSNVs, which can lead to analytical bias and the false assumption that this mutational event is uncommon. Recent progress in our understanding of the deleterious consequences of these sSNVs for RNA stability and protein translation shows that they can act as strong drivers of cancer, as demonstrated for several cancer genes such as TP53 or BCL2L12. It is therefore essential that sSNVs be properly reported and analyzed in order to provide an accurate picture of the genetic landscape of the cancer genome.