1
Brock DC, Wang M, Hussain HMJ, Rauch DE, Marra M, Pennesi ME, Yang P, Everett L, Ajlan RS, Colbert J, Porto FBO, Matynia A, Gorin MB, Koenekoop RK, Lopez I, Sui R, Zou G, Li Y, Chen R. Comparative analysis of in-silico tools in identifying pathogenic variants in dominant inherited retinal diseases. Hum Mol Genet 2024; 33:945-957. [PMID: 38453143 PMCID: PMC11102593 DOI: 10.1093/hmg/ddae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 02/16/2024] [Accepted: 02/19/2024] [Indexed: 03/09/2024]
Abstract
Inherited retinal diseases (IRDs) are a group of rare genetic eye conditions that cause blindness. Despite progress in identifying genes associated with IRDs, improvements are necessary for classifying rare autosomal dominant (AD) disorders. AD diseases are highly heterogeneous, with causal variants being restricted to specific amino acid changes within certain protein domains, making AD conditions difficult to classify. Here, we aim to determine the top-performing in-silico tools for predicting the pathogenicity of AD IRD variants. We annotated variants from ClinVar and benchmarked 39 variant classifier tools on IRD genes, split by inheritance pattern. Using area-under-the-curve (AUC) analysis, we determined the top-performing tools and defined thresholds for variant pathogenicity. Top-performing tools were assessed using genome sequencing on a cohort of participants with IRDs of unknown etiology. MutScore achieved the highest accuracy within AD genes, yielding an AUC of 0.969. When filtering for AD gain-of-function and dominant negative variants, BayesDel had the highest accuracy with an AUC of 0.997. Five participants with variants in NR2E3, RHO, GUCA1A, and GUCY2D were confirmed to have dominantly inherited disease based on pedigree, phenotype, and segregation analysis. We identified two uncharacterized variants in GUCA1A (c.428T>A, p.Ile143Thr) and RHO (c.631C>G, p.His211Asp) in three participants. Our findings support using a multi-classifier approach composed of new missense classifier tools to identify pathogenic variants in participants with AD IRDs. Our results provide a foundation for improved genetic diagnosis for people with IRDs.
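As a rough illustration of the AUC analysis mentioned in this abstract, the sketch below computes the area under the ROC curve directly from tool scores and labels, using the pairwise-ranking (Mann-Whitney) formulation. The function name `roc_auc` and the toy scores are assumptions made for illustration; they are not taken from the paper, and the authors' actual pipeline may differ.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (pathogenic, benign) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney U formulation).
    labels: 1 = pathogenic, 0 = benign.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six variants (not real data).
toy_scores = [0.91, 0.80, 0.35, 0.62, 0.20, 0.10]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

In this toy set, eight of the nine pathogenic/benign pairs are ranked correctly, so the AUC is 8/9. The quadratic pair loop is chosen for clarity; a sort-based implementation would be preferable for large benchmarks.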
Affiliation(s)
- Daniel C Brock
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Medical Scientist Training Program, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Meng Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Hafiz Muhammad Jafar Hussain
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - David E Rauch
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Molly Marra
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Mark E Pennesi
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Paul Yang
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Lesley Everett
- Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
| | - Radwan S Ajlan
- Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
| | - Jason Colbert
- Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
| | - Fernanda Belga Ottoni Porto
- INRET Clínica e Centro de Pesquisa, Rua dos Otoni, 735/507 - Santa Efigênia, Belo Horizonte, MG 30150270, Brazil
- Department of Ophthalmology, Santa Casa de Misericórdia de Belo Horizonte, Av. Francisco Sales, 1111 - Santa Efigênia, Belo Horizonte, MG 30150221, Brazil
- Centro Oftalmológico de Minas Gerais, R. Santa Catarina, 941 - Lourdes, Belo Horizonte, MG 30180070, Brazil
| | - Anna Matynia
- College of Optometry, University of Houston, 4401 Martin Luther King Boulevard, Houston, TX 77004, United States
| | - Michael B Gorin
- Jules Stein Eye Institute, University of California Los Angeles, 100 Stein Plaza, Los Angeles, CA 90095, United States
- Department of Ophthalmology, University of California Los Angeles David Geffen School of Medicine, 10833 Le Conte Ave, Los Angeles, CA 90095, United States
| | - Robert K Koenekoop
- McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
| | - Irma Lopez
- McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
| | - Ruifang Sui
- Department of Ophthalmology, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, WC67+HW Dongcheng, Beijing 100005, China
| | - Gang Zou
- Department of Ophthalmology, Ningxia Eye Hospital, People's Hospital of Ningxia Hui Autonomous Region, First Affiliated Hospital of Northwest University for Nationalities, Ningxia Clinical Research Center on Diseases of Blindness in Eye, F4RJ+43 Xixia District, Yinchuan, Ningxia, China
| | - Yumei Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
| | - Rui Chen
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
2
Lin W, Wells J, Wang Z, Orengo C, Martin ACR. Enhancing missense variant pathogenicity prediction with protein language models using VariPred. Sci Rep 2024; 14:8136. [PMID: 38584172 PMCID: PMC10999449 DOI: 10.1038/s41598-024-51489-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 01/05/2024] [Indexed: 04/09/2024]
Abstract
Computational approaches for predicting the pathogenicity of genetic variants have advanced in recent years. These methods enable researchers to determine the possible clinical impact of rare and novel variants. Historically, these prediction methods used hand-crafted features based on structural, evolutionary, or physicochemical properties of the variant. In this study, we propose a novel framework that leverages the power of pre-trained protein language models to predict variant pathogenicity. We show that our approach, VariPred (Variant impact Predictor), outperforms current state-of-the-art methods by using an end-to-end model that only requires the protein sequence as input. Using one of the best-performing protein language models (ESM-1b), we establish a robust classifier that requires no calculation of structural features or multiple sequence alignments. We compare the performance of VariPred with other representative models including 3Cnet, PolyPhen-2, REVEL, MetaLR, FATHMM and ESM variant. VariPred performs as well as, or in most cases better than, these other predictors on six variant impact prediction benchmarks, despite requiring only sequence data and no pre-processing of the data.
Affiliation(s)
- Weining Lin
- Division of Biosciences, Institute of Structural and Molecular Biology, University College London, London, UK
| | - Jude Wells
- Department of Computer Science, University College London, London, UK
| | - Zeyuan Wang
- College of Computer Science and Technology, Zhejiang University, Zhejiang, China
| | - Christine Orengo
- Division of Biosciences, Institute of Structural and Molecular Biology, University College London, London, UK.
| | - Andrew C R Martin
- Division of Biosciences, Institute of Structural and Molecular Biology, University College London, London, UK.
3
Arani AA, Sehhati M, Tabatabaiefar MA. Predicting deleterious missense genetic variants via integrative supervised nonnegative matrix tri-factorization. Sci Rep 2021; 11:23747. [PMID: 34887492 PMCID: PMC8660898 DOI: 10.1038/s41598-021-03230-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2021] [Accepted: 11/30/2021] [Indexed: 11/21/2022]
Abstract
Among the many types of genetic variation, missense variants are a major class, and a small subset of them may disrupt protein function and ultimately cause human disease. Various machine learning methods have been proposed to differentiate deleterious from benign missense variants using a large number of features, including structure, sequence, interaction networks, gene-disease associations and phenotypes. However, a reliable and accurate algorithm for merging heterogeneous information is still needed, as it could capture the complex interactions in the networks in which genes participate. In this study, we propose a new method based on non-negative matrix tri-factorization clustering. We outline two versions of the proposed method: a two-source and a three-source algorithm. The two-source algorithm aggregates individual deleteriousness prediction methods and the PPI network, and the three-source algorithm additionally incorporates gene-disease associations. Four benchmark datasets were used for internal and external validation of both algorithms. The results on all datasets confirmed that our method outperforms most state-of-the-art variant prediction tools. Two key features of our variant effect prediction method are worth mentioning. First, although incorporating gene-disease information in the three-source algorithm can improve prediction performance compared with the two-source algorithm, our method is not hindered by type 2 circularity error, unlike some recent ensemble-based prediction methods. Type 2 circularity error occurs when a predictor annotates variants on the basis of the genes in which they are located. Second, our predictor outperforms other ensemble-based methods for variants in genes for which little pathogenicity information is available.
Affiliation(s)
- Asieh Amousoltani Arani
- Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Student Research Committee, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Mohammadreza Sehhati
- Department of Bioinformatics, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.
- Deputy of Research and Technology, GTaC Corp, Isfahan University of Medical Sciences, Isfahan, Iran.
| | - Mohammad Amin Tabatabaiefar
- Deputy of Research and Technology, GTaC Corp, Isfahan University of Medical Sciences, Isfahan, Iran
- Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
4
Yang X, Breuss MW, Xu X, Antaki D, James KN, Stanley V, Ball LL, George RD, Wirth SA, Cao B, Nguyen A, McEvoy-Venneri J, Chai G, Nahas S, Van Der Kraan L, Ding Y, Sebat J, Gleeson JG. Developmental and temporal characteristics of clonal sperm mosaicism. Cell 2021; 184:4772-4783.e15. [PMID: 34388390 PMCID: PMC8496133 DOI: 10.1016/j.cell.2021.07.024] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 03/01/2021] [Revised: 05/12/2021] [Accepted: 07/14/2021] [Indexed: 01/07/2023]
Abstract
Throughout development and aging, human cells accumulate mutations resulting in genomic mosaicism and genetic diversity at the cellular level. Mosaic mutations present in the gonads can affect both the individual and the offspring and subsequent generations. Here, we explore patterns and temporal stability of clonal mosaic mutations in male gonads by sequencing ejaculated sperm. Through 300× whole-genome sequencing of blood and sperm from healthy men, we find each ejaculate carries on average 33.3 ± 12.1 (mean ± SD) clonal mosaic variants, nearly all of which are detected in serial sampling, with the majority absent from sampled somal tissues. Their temporal stability and mutational signature suggest origins during embryonic development from a largely immutable stem cell niche. Clonal mosaicism likely contributes a transmissible, predicted pathogenic exonic variant for 1 in 15 men, representing a life-long threat of transmission for these individuals and a significant burden on human population health.
Affiliation(s)
- Xiaoxu Yang
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Martin W Breuss
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Xin Xu
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Danny Antaki
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Kiely N James
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Valentina Stanley
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Laurel L Ball
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Renee D George
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Sara A Wirth
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Beibei Cao
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - An Nguyen
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Jennifer McEvoy-Venneri
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Guoliang Chai
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Shareef Nahas
- Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | | | - Yan Ding
- Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA
| | - Jonathan Sebat
- Beyster Center for Genomics of Psychiatric Diseases, University of California, San Diego, La Jolla, CA 92093, USA; Department of Psychiatry, University of California, San Diego, La Jolla, CA 92093, USA; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
| | - Joseph G Gleeson
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093, USA; Rady Children's Institute for Genomic Medicine, San Diego, CA 92123, USA.
5
Sallah SR, Ellingford JM, Sergouniotis PI, Ramsden SC, Lench N, Lovell SC, Black GC. Improving the clinical interpretation of missense variants in X linked genes using structural analysis. J Med Genet 2021; 59:385-392. [PMID: 33766936 PMCID: PMC8961765 DOI: 10.1136/jmedgenet-2020-107404] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/19/2020] [Revised: 01/18/2021] [Accepted: 01/21/2021] [Indexed: 11/04/2022]
Abstract
BACKGROUND Improving the clinical interpretation of missense variants can increase the diagnostic yield of genomic testing and lead to personalised management strategies. Currently, due to the imprecision of bioinformatic tools that aim to predict variant pathogenicity, their role in clinical guidelines remains limited. There is a clear need for more accurate prediction algorithms and this study aims to improve performance by harnessing structural biology insights. The focus of this work is missense variants in a subset of genes associated with X linked disorders. METHODS We have developed a protein-specific variant interpreter (ProSper) that combines genetic and protein structural data. This algorithm predicts missense variant pathogenicity by applying machine learning approaches to the sequence and structural characteristics of variants. RESULTS ProSper outperformed seven previously described tools, including meta-predictors, in correctly evaluating whether or not variants are pathogenic; this was the case for 11 of the 21 genes associated with X linked disorders that met the inclusion criteria for this study. We also determined gene-specific pathogenicity thresholds that improved the performance of VEST4, REVEL and ClinPred, the three best-performing tools out of the seven that were evaluated; this was the case in 11, 11 and 12 different genes, respectively. CONCLUSION ProSper can form the basis of a molecule-specific prediction tool that can be implemented into diagnostic strategies. It can allow the accurate prioritisation of missense variants associated with X linked disorders, aiding precise and timely diagnosis. In addition, we demonstrate that gene-specific pathogenicity thresholds for a range of missense prioritisation tools can lead to an increase in prediction accuracy.
Affiliation(s)
- Shalaw Rassul Sallah
- Division of Evolution and Genomic Sciences, The University of Manchester Faculty of Biology, Medicine and Health, Manchester, UK.,Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester Academic Health Sciences Centre, Manchester, UK
| | - Jamie M Ellingford
- Division of Evolution and Genomic Sciences, The University of Manchester Faculty of Biology, Medicine and Health, Manchester, UK.,Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester Academic Health Sciences Centre, Manchester, UK
| | - Panagiotis I Sergouniotis
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester Academic Health Sciences Centre, Manchester, UK
| | - Simon C Ramsden
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester Academic Health Sciences Centre, Manchester, UK
| | - Nicholas Lench
- Congenica Ltd, Biodata Innovation Centre, Wellcome Genome Campus, Hinxton, London, UK
| | - Simon C Lovell
- Division of Evolution and Genomic Sciences, The University of Manchester Faculty of Biology, Medicine and Health, Manchester, UK
| | - Graeme C Black
- Division of Evolution and Genomic Sciences, The University of Manchester Faculty of Biology, Medicine and Health, Manchester, UK .,Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester Academic Health Sciences Centre, Manchester, UK
6
Zhou JB, Xiong Y, An K, Ye ZQ, Wu YD. IDRMutPred: predicting disease-associated germline nonsynonymous single nucleotide variants (nsSNVs) in intrinsically disordered regions. Bioinformatics 2021; 36:4977-4983. [PMID: 32756939 PMCID: PMC7755418 DOI: 10.1093/bioinformatics/btaa618] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/18/2019] [Revised: 06/28/2020] [Accepted: 07/01/2020] [Indexed: 01/09/2023]
Abstract
Motivation: Despite the lack of folded structure, intrinsically disordered regions (IDRs) of proteins play versatile roles in various biological processes, and many nonsynonymous single nucleotide variants (nsSNVs) in IDRs are associated with human diseases. The continuous accumulation of nsSNVs resulting from the wide application of NGS has driven the development of disease-association prediction methods for decades. However, their performance on nsSNVs in IDRs remains inferior, possibly due to the domination of nsSNVs from structured regions in training data. Therefore, a better-performing disease-association predictor built specifically for nsSNVs in IDRs is in high demand. Results: We present IDRMutPred, a machine learning-based tool specifically for predicting disease-associated germline nsSNVs in IDRs. Based on 17 selected optimal features that are extracted from sequence alignments, protein annotations, hydrophobicity indices and disorder scores, IDRMutPred was trained using three ensemble learning algorithms on a training dataset containing only IDR nsSNVs. The evaluation on the two testing datasets shows that all three prediction models significantly outperform 17 other popular general predictors, achieving ACC between 0.856 and 0.868 and MCC between 0.713 and 0.737. IDRMutPred will prioritize disease-associated IDR germline nsSNVs more reliably than general predictors. Availability and implementation: The software is freely available at http://www.wdspdb.com/IDRMutPred. Supplementary information: Supplementary data are available at Bioinformatics online.
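For readers unfamiliar with the two metrics quoted in this abstract, the following minimal sketch shows how accuracy (ACC) and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The function name `acc_and_mcc` and the counts are hypothetical and are not drawn from the IDRMutPred evaluation.

```python
import math


def acc_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, Matthews correlation coefficient) for binary counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# Hypothetical counts: 45 true positives, 40 true negatives,
# 10 false positives, 5 false negatives.
toy_acc, toy_mcc = acc_and_mcc(tp=45, tn=40, fp=10, fn=5)
```

Unlike ACC, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the disease-associated and neutral classes are imbalanced.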
Affiliation(s)
- Jing-Bo Zhou
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
| | - Yao Xiong
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
| | - Ke An
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
| | - Zhi-Qiang Ye
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China.,Shenzhen Bay Laboratory, Shenzhen 518055, China
| | - Yun-Dong Wu
- Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China.,Shenzhen Bay Laboratory, Shenzhen 518055, China.,College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
7
MosaicBase: A Knowledgebase of Postzygotic Mosaic Variants in Noncancer Disease-related and Healthy Human Individuals. Genomics Proteomics Bioinformatics 2020; 18:140-149. [PMID: 32911083 PMCID: PMC7646124 DOI: 10.1016/j.gpb.2020.05.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/09/2019] [Revised: 03/18/2020] [Accepted: 05/31/2020] [Indexed: 12/14/2022]
Abstract
Mosaic variants resulting from postzygotic mutations are prevalent in the human genome and play important roles in human diseases. However, except for cancer-related variants, there is no collection of postzygotic mosaic variants in noncancer disease-related and healthy individuals. Here, we present MosaicBase, a comprehensive database that includes 6698 mosaic variants related to 266 noncancer diseases and 27,991 mosaic variants identified in 422 healthy individuals. Genomic and phenotypic information of each variant was manually extracted and curated from 383 publications. MosaicBase supports the query of variants with Online Mendelian Inheritance in Man (OMIM) entries, genomic coordinates, gene symbols, or Entrez IDs. We also provide an integrated genome browser for users to easily access mosaic variants and their related annotations for any genomic region. By analyzing the variants collected in MosaicBase, we find that mosaic variants that directly contribute to disease phenotype show features distinct from those of variants in individuals with mild or no phenotypes, in terms of their genomic distribution, mutation signatures, and fraction of mutant cells. MosaicBase will not only assist clinicians in genetic counseling and diagnosis but also provide a useful resource to understand the genomic baseline of postzygotic mutations in the general human population. MosaicBase is publicly available at http://mosaicbase.com/ or http://49.4.21.8:8000.
8
Tian Y, Pesaran T, Chamberlin A, Fenwick RB, Li S, Gau CL, Chao EC, Lu HM, Black MH, Qian D. REVEL and BayesDel outperform other in silico meta-predictors for clinical variant classification. Sci Rep 2019; 9:12752. [PMID: 31484976 PMCID: PMC6726608 DOI: 10.1038/s41598-019-49224-8] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 01/10/2019] [Accepted: 08/09/2019] [Indexed: 01/07/2023]
Abstract
Many in silico predictors of genetic variant pathogenicity have been previously developed, but there is currently no standard application of these algorithms for variant assessment. Using 4,094 ClinVar-curated missense variants in clinically actionable genes, we evaluated the accuracy and yield of benign and deleterious evidence in 5 in silico meta-predictors, as well as agreement of SIFT and PolyPhen2, and report the derived thresholds for the best performing predictor(s). REVEL and BayesDel outperformed all other meta-predictors (CADD, MetaSVM, Eigen), with higher positive predictive value, comparable negative predictive value, higher yield, and greater overall prediction performance. Agreement of SIFT and PolyPhen2 resulted in slightly higher yield but lower overall prediction performance than REVEL or BayesDel. Our results support the use of gene-level rather than generalized thresholds, when gene-level thresholds can be estimated. Our results also support the use of 2-sided thresholds, which allow for uncertainty, rather than a single, binary cut-point for assigning benign and deleterious evidence. The gene-level 2-sided thresholds we derived for REVEL or BayesDel can be used to assess in silico evidence for missense variants in accordance with current classification guidelines.
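The 2-sided thresholds described in this abstract can be sketched as a three-way decision rule: scores at or below a benign cut-off yield benign evidence, scores at or above a deleterious cut-off yield deleterious evidence, and anything in between is left uncertain. The function name and the cut-off values 0.3 and 0.7 below are placeholders chosen for illustration only; they are not the gene-level thresholds derived in the paper.

```python
def in_silico_evidence(score, benign_max, deleterious_min):
    """Map a meta-predictor score to evidence using a 2-sided threshold.

    benign_max and deleterious_min are assumed, illustrative cut-offs.
    """
    if benign_max >= deleterious_min:
        raise ValueError("benign_max must be below deleterious_min")
    if score <= benign_max:
        return "benign"
    if score >= deleterious_min:
        return "deleterious"
    return "uncertain"


# Placeholder cut-offs, not values from the study.
calls = [in_silico_evidence(s, benign_max=0.3, deleterious_min=0.7)
         for s in (0.12, 0.55, 0.93)]
```

Compared with a single binary cut-point, the middle band trades some yield for fewer confident but wrong calls, which matches the motivation given in the abstract.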
Affiliation(s)
- Yuan Tian
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA
| | - Tina Pesaran
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA
| | | | - R Bryn Fenwick
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA
| | - Shuwei Li
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA
| | - Chia-Ling Gau
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA
| | - Elizabeth C Chao
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA.,Division of Genetics and Genomics, University of California, Irvine, CA, 92697, USA
| | - Hsiao-Mei Lu
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA
| | | | - Dajun Qian
- Ambry Genetics, 15 Argonaut, Aliso Viejo, CA, 92656, USA.
9
Clark WT, Kasak L, Bakolitsa C, Hu Z, Andreoletti G, Babbi G, Bromberg Y, Casadio R, Dunbrack R, Folkman L, Ford CT, Jones D, Katsonis P, Kundu K, Lichtarge O, Martelli PL, Mooney SD, Nodzak C, Pal LR, Radivojac P, Savojardo C, Shi X, Zhou Y, Uppal A, Xu Q, Yin Y, Pejaver V, Wang M, Wei L, Moult J, Yu GK, Brenner SE, LeBowitz JH. Assessment of predicted enzymatic activity of α-N-acetylglucosaminidase variants of unknown significance for CAGI 2016. Hum Mutat 2019; 40:1519-1529. [PMID: 31342580 PMCID: PMC7156275 DOI: 10.1002/humu.23875] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/22/2019] [Revised: 06/27/2019] [Accepted: 07/15/2019] [Indexed: 12/25/2022]
Abstract
The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's correlation coefficients of up to .61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population-scale analysis of disease epidemiology and rare variant association analysis.
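The assessment metric named in this abstract, Pearson's correlation coefficient between predicted and measured enzymatic activity, can be sketched as follows. The function name `pearson_r` and the activity values are invented for illustration and do not come from the CAGI4 data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted vs. measured relative activity (fractions of wild type).
predicted = [0.10, 0.40, 0.55, 0.90]
measured = [0.05, 0.60, 0.35, 1.00]
r = pearson_r(predicted, measured)
```

The same function applied to two sets of predictions, rather than predictions and measurements, gives the method-to-method correlation that the abstract compares against.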
Affiliation(s)
- Laura Kasak
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia
- Constantina Bakolitsa
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Gaia Andreoletti
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Giulia Babbi
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA
- Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Lukas Folkman
- School of Information and Communication Technology, Griffith University, Southport, Australia
- Colby T. Ford
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- David Jones
- Bioinformatics Group, Department of Computer Science, University College London, UK
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Kunal Kundu
- University of Maryland, College Park, MD, USA
- Olivier Lichtarge
- Departments of Molecular and Human Genetics, Biochemistry & Molecular Biology, Pharmacology, and Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX, USA
- Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Conor Nodzak
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- Predrag Radivojac
- Department of Computer Science, Indiana University, Bloomington, IN, USA
- Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Xinghua Shi
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Australia
- Aneeta Uppal
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- Qifang Xu
- Fox Chase Cancer Center, Philadelphia, PA, USA
- Yizhou Yin
- University of Maryland, College Park, MD, USA
- Vikas Pejaver
- Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
- Meng Wang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, P.R. China
- Liping Wei
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, P.R. China
- John Moult
- University of Maryland, College Park, MD, USA
- G. Karen Yu
- BioMarin Pharmaceutical, San Rafael, California, USA
- Steven E. Brenner
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
|
10
|
Evaluation of computational techniques for predicting non-synonymous single nucleotide variants pathogenicity. Genomics 2019; 111:869-882. [DOI: 10.1016/j.ygeno.2018.05.013] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 03/20/2018] [Revised: 04/15/2018] [Accepted: 05/16/2018] [Indexed: 11/18/2022]
|
11
|
Yang X, Yang X, Chen J, Li S, Zeng Q, Huang AY, Ye AY, Yu Z, Wang S, Jiang Y, Wu X, Wu Q, Wei L, Zhang Y. ATP1A3 mosaicism in families with alternating hemiplegia of childhood. Clin Genet 2019; 96:43-52. [PMID: 30891744 PMCID: PMC6850116 DOI: 10.1111/cge.13539] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/22/2018] [Revised: 03/10/2019] [Accepted: 03/15/2019] [Indexed: 01/17/2023]
Abstract
Alternating hemiplegia of childhood (AHC) is a rare and severe neurodevelopmental disorder characterized by recurrent hemiplegic episodes. Most AHC cases are sporadic and caused by de novo ATP1A3 pathogenic variants. In this study, the aim was to identify the origin of ATP1A3 pathogenic variants in a Chinese cohort. In 105 probands including 101 sporadic and 4 familial cases, 98 patients with ATP1A3 pathogenic variants were identified, and 96.8% were confirmed as de novo. Micro-droplet digital polymerase chain reaction was applied to detect ATP1A3 mosaicism in 80 available families. In blood samples, four asymptomatic parents (two fathers and two mothers) and one proband with a milder phenotype were identified as mosaic. Six (7.5%) cases of parental mosaicism were identified in multiple tissues, including the four previously identified in blood and two additional cases identified from paternal sperm. Mosaicism was identified in multiple tissues with varied mutant allele fractions (MAFs, 0.03%-33.03%). The results suggested that the MAF of mosaicism may be related to phenotype severity. This is the first systematic report of ATP1A3 mosaicism in AHC and shows that mosaicism is an unrecognized source of AHC previously considered "de novo". Identifying ATP1A3 mosaicism provides more evidence for estimating recurrence risk and has implications for genetic counseling of AHC.
Affiliation(s)
- Xiaoling Yang
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Xiaoxu Yang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Jiaoyang Chen
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Shupin Li
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Qi Zeng
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- August Y Huang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Adam Y Ye
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China; Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Zhe Yu
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Sheng Wang
- Dr Liping Wei's lab, National Institute of Biological Sciences, Beijing, China; College of Biological Sciences, China Agricultural University, Beijing, China
- Yuwu Jiang
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Xiru Wu
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Qixi Wu
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China; Human Genetic Resources Core Facility, School of Life Sciences, Peking University, Beijing, China
- Liping Wei
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Yuehua Zhang
- Department of Pediatrics, Peking University First Hospital, Beijing, China
|
12
|
Hassan MS, Shaalan AA, Dessouky MI, Abdelnaiem AE, ElHefnawi M. A review study: Computational techniques for expecting the impact of non-synonymous single nucleotide variants in human diseases. Gene 2018; 680:20-33. [PMID: 30240882 DOI: 10.1016/j.gene.2018.09.028] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 08/17/2018] [Accepted: 09/14/2018] [Indexed: 01/18/2023]
Abstract
Non-synonymous single-nucleotide variants (nsSNVs) and mutations can have diverse effects on proteins, altering genotype and phenotype and disrupting protein stability. Alterations in protein stability may cause diseases such as cancer. Discovering nsSNVs and mutations can be useful for diagnosing disease at an early stage. Many studies have introduced individual and consensus prediction tools based on different machine learning techniques (MLTs) and diverse datasets. We therefore present a comprehensive review of the most popular and recent individual tools that predict pathogenic variations, and of the meta-tools that merge some of them to enhance predictive power. We also survey state-of-the-art computational techniques and methods for predicting the effects of both coding and noncoding variants. We then describe the protein stability predictors. We detail the most common benchmark databases for variations, including the main predictive features used by the different methods. Finally, we address the most common fundamental criteria for assessing the performance of predictive tools. This review is aimed at bioinformaticians interested in characterizing regulatory variants; geneticists and molecular biologists interested in understanding the nature and functional role of such variants; and clinicians who wish to learn about human variants associated with a specific disease and how to investigate their impact on the underlying mechanisms.
Affiliation(s)
- Marwa S Hassan
- Systems and Information Department and Biomedical Informatics Group, Engineering Research Division, National Research Center, Giza, Egypt; Patent Office of Scientific Research Academy, Egypt
- A A Shaalan
- Electronics and Communication Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt
- M I Dessouky
- Electronics and Electrical Communications Department, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt
- Abdelaziz E Abdelnaiem
- Electronics and Communication Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt
- Mahmoud ElHefnawi
- Systems and Information Department and Biomedical Informatics Group, Engineering Research Division, National Research Center, Giza, Egypt; Center for Informatics, Nile University, Giza, Egypt
|
13
|
A Bayesian framework for efficient and accurate variant prediction. PLoS One 2018; 13:e0203553. [PMID: 30212499 PMCID: PMC6136750 DOI: 10.1371/journal.pone.0203553] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/08/2017] [Accepted: 08/22/2018] [Indexed: 12/04/2022] Open
Abstract
There is a growing need to develop variant prediction tools capable of assessing a wide spectrum of evidence. We present a Bayesian framework that involves aggregating pathogenicity data across multiple in silico scores on a gene-by-gene basis and multiple evidence statistics in both quantitative and qualitative forms, and performs 5-tiered variant classification based on the resulting probability credible interval. When evaluated in 1,161 missense variants, our gene-specific in silico model-based meta-predictor yielded an area under the curve (AUC) of 96.0% and outperformed all other in silico predictors. Multifactorial model analysis incorporating all available evidence yielded 99.7% AUC, with 22.8% predicted as variants of uncertain significance (VUS). Use of only 3 auto-computed evidence statistics yielded 98.6% AUC with 56.0% predicted as VUS, which represented sufficient accuracy to rapidly assign a significant portion of VUS to clinically meaningful classifications. Collectively, our findings support the use of this framework to conduct large-scale variant prioritization using in silico predictors followed by variant prediction and classification with a high degree of predictive accuracy.
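The AUC figures quoted in this abstract summarize how well a score separates pathogenic from benign variants. As a minimal sketch of the metric itself (not the authors' Bayesian framework), the AUC can be computed as the probability that a randomly chosen pathogenic variant scores higher than a randomly chosen benign one, with ties counted as one half. The function name `roc_auc` and the sample scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    labels: 1 for pathogenic, 0 for benign.
    Returns the fraction of (pathogenic, benign) pairs in which the
    pathogenic variant has the higher score; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictor scores for illustration only.
example_scores = [0.9, 0.2, 0.8, 0.1]
example_labels = [1, 1, 0, 0]
example_auc = roc_auc(example_scores, example_labels)  # 3 of 4 pairs ordered correctly
```

The pairwise form is quadratic in the number of variants; it is chosen here for clarity rather than speed.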
|
14
|
Giacopuzzi E, Laffranchi M, Berardelli R, Ravasio V, Ferrarotti I, Gooptu B, Borsani G, Fra A. Real-world clinical applicability of pathogenicity predictors assessed on SERPINA1 mutations in alpha-1-antitrypsin deficiency. Hum Mutat 2018; 39:1203-1213. [DOI: 10.1002/humu.23562] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 02/28/2018] [Revised: 05/20/2018] [Accepted: 06/05/2018] [Indexed: 01/08/2023]
Affiliation(s)
- Edoardo Giacopuzzi
- Division of Biology and Genetics; Department of Molecular and Translational Medicine; University of Brescia; Brescia Italy
- Mattia Laffranchi
- Experimental Oncology and Immunology; Department of Molecular and Translational Medicine; University of Brescia; Brescia Italy
- Romina Berardelli
- Experimental Oncology and Immunology; Department of Molecular and Translational Medicine; University of Brescia; Brescia Italy
- Viola Ravasio
- Division of Biology and Genetics; Department of Molecular and Translational Medicine; University of Brescia; Brescia Italy
- Ilaria Ferrarotti
- Centre for Diagnosis of Inherited Alpha-1 Antitrypsin Deficiency; Department of Internal Medicine and Therapeutics; University of Pavia; Pavia Italy
- Bibek Gooptu
- Leicester Institute of Structural and Chemical Biology / NIHR Leicester BRC - Respiratory; University of Leicester; Leicester UK
- Giuseppe Borsani
- Division of Biology and Genetics; Department of Molecular and Translational Medicine; University of Brescia; Brescia Italy
- Annamaria Fra
- Experimental Oncology and Immunology; Department of Molecular and Translational Medicine; University of Brescia; Brescia Italy
|
15
|
Wang M, Tai C, E W, Wei L. DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants. Nucleic Acids Res 2018; 46:e69. [PMID: 29617928 PMCID: PMC6009584 DOI: 10.1093/nar/gky215] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 07/03/2017] [Revised: 03/12/2018] [Accepted: 03/14/2018] [Indexed: 01/19/2023] Open
Abstract
The complex system of gene expression is regulated by the cell type-specific binding of transcription factors (TFs) to regulatory elements. Identifying variants that disrupt TF binding and lead to human diseases remains a great challenge. To address this, we implement sequence-based deep learning models that accurately predict the TF binding intensities to given DNA sequences. In addition to accurately classifying TF-DNA binding versus non-binding, our models are capable of accurately predicting real-valued TF binding intensities by leveraging large-scale TF ChIP-seq data. The changes in the TF binding intensities between the altered sequence and the reference sequence reflect the degree of functional impact for the variant. This enables us to develop the tool DeFine (Deep learning based Functional impact of non-coding variants evaluator, http://define.cbi.pku.edu.cn) with improved performance for assessing the functional impact of non-coding variants including SNPs and indels. DeFine accurately identifies the causal functional non-coding variants from disease-associated variants in GWAS. DeFine is an effective and easy-to-use tool that facilitates systematic prioritization of functional non-coding variants.
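The scoring idea stated in this abstract, that a variant's impact is reflected by the change in predicted binding intensity between the altered and reference sequences, can be sketched as follows. This is an illustration under loud assumptions: `toy_binding_intensity` is a hypothetical motif-matching stand-in for a trained model, and the motif and sequences are made up. None of it is DeFine's code or API.

```python
def toy_binding_intensity(seq, motif="TGACTCA"):
    """Hypothetical stand-in for a trained binding-intensity model.

    Returns the best fraction of motif positions matched by any window
    of the sequence (1.0 means an exact motif occurrence).
    """
    best = 0
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        matches = sum(a == b for a, b in zip(window, motif))
        best = max(best, matches)
    return best / len(motif)


def variant_impact(ref_seq, alt_seq, predict=toy_binding_intensity):
    """Impact score: predicted intensity of the altered sequence minus
    that of the reference sequence. Negative values mean predicted loss
    of binding; positive values mean predicted gain."""
    return predict(alt_seq) - predict(ref_seq)


# Made-up sequences for illustration only.
ref = "ACGTTGACTCAGGCT"   # contains the illustrative motif exactly
alt = "ACGTTGACGCAGGCT"   # single-base substitution inside the motif
delta = variant_impact(ref, alt)  # negative: one motif position is lost
```

Passing the predictor as a parameter keeps the difference calculation independent of the model, so any sequence-to-intensity function could be substituted for the toy one.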
Affiliation(s)
- Meng Wang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, 100871, P.R. China
- Cheng Tai
- Center for Data Science, Peking University, Beijing, 100871, P.R. China
- Beijing Institute of Big Data Research, Beijing, 100871, P.R. China
- Weinan E
- Center for Data Science, Peking University, Beijing, 100871, P.R. China
- Beijing Institute of Big Data Research, Beijing, 100871, P.R. China
- Department of Mathematics and PACM, Princeton University, Princeton, NJ, 08544, USA
- Liping Wei
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, 100871, P.R. China
|
16
|
Yang C, Li J, Wu Q, Yang X, Huang AY, Zhang J, Ye AY, Dou Y, Yan L, Zhou WZ, Kong L, Wang M, Ai C, Yang D, Wei L. AutismKB 2.0: a knowledgebase for the genetic evidence of autism spectrum disorder. Database: The Journal of Biological Databases and Curation 2018; 2018:5134097. [PMID: 30339214 PMCID: PMC6193446 DOI: 10.1093/database/bay106] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 02/10/2018] [Accepted: 09/18/2018] [Indexed: 01/15/2023]
Abstract
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with strong genetic contributions. To provide a comprehensive resource for the genetic evidence of ASD, we have updated the Autism KnowledgeBase (AutismKB) to version 2.0. AutismKB 2.0 integrates multiscale genetic data on 1379 genes, 5420 copy number variations and structural variations, 11 669 single-nucleotide variations or small insertions/deletions (SNVs/indels) and 172 linkage regions. In particular, AutismKB 2.0 highlights 5669 de novo SNVs/indels due to their significant contribution to ASD genetics and includes 789 mosaic variants due to their recently discovered contributions to ASD pathogenesis. The genes and variants are annotated extensively with genetic evidence and clinical evidence. To help users fully understand the functional consequences of SNVs and small indels, we provide comprehensive predictions of pathogenicity with iFish, SIFT, PolyPhen, etc. To improve the user experience, the new version incorporates multiple query methods, including simple query, advanced query and batch query. It also functionally integrates two analytical tools to help users perform downstream analyses: a gene ranking tool and an enrichment analysis tool, KOBAS. AutismKB 2.0 is freely available and can be a valuable resource for researchers.
Affiliation(s)
- Changhong Yang
- College of Life Sciences, Beijing Normal University, Beijing, China; National Institute of Biological Sciences, Beijing, China
- Jiarui Li
- Institute of Infectious Diseases, Beijing Key Laboratory of Emerging Infectious Diseases, Beijing Ditan Hospital Capital Medical University, Beijing, China
- Qixi Wu
- Peking-Tsinghua Center for Life Sciences, Beijing, China; School of Life Sciences, Peking University, Beijing, China
- Xiaoxu Yang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- August Yue Huang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Jie Zhang
- National Institute of Biological Sciences, Beijing, China; Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Adam Yongxin Ye
- Peking-Tsinghua Center for Life Sciences, Beijing, China; Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China; Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Yanmei Dou
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Linlin Yan
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Wei-Zhen Zhou
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China; State Key Laboratory of Cardiovascular Disease, Beijing Key Laboratory for Molecular Diagnostics of Cardiovascular Diseases, Diagnostic Laboratory Service, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Lei Kong
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Meng Wang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Chen Ai
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Dechang Yang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Liping Wei
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
|
17
|
Genomic mosaicism in paternal sperm and multiple parental tissues in a Dravet syndrome cohort. Sci Rep 2017; 7:15677. [PMID: 29142202 PMCID: PMC5688122 DOI: 10.1038/s41598-017-15814-7] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 05/18/2017] [Accepted: 11/02/2017] [Indexed: 12/21/2022] Open
Abstract
Genomic mosaicism in parental gametes and peripheral tissues is an important consideration for genetic counseling. We studied a Chinese cohort affected by a severe epileptic disorder, Dravet syndrome (DS). There were 56 fathers who donated semen and 15 parents who donated multiple peripheral tissue samples. We used an ultra-sensitive quantification method, micro-droplet digital PCR (mDDPCR), to detect parental mosaicism of the proband’s pathogenic mutation in SCN1A, the causal gene of DS in 112 families. Ten of the 56 paternal sperm samples were found to exhibit mosaicism of the proband’s mutations, with mutant allelic fractions (MAFs) ranging from 0.03% to 39.04%. MAFs in the mosaic fathers’ sperm were significantly higher than those in their blood (p = 0.00098), even after conditional probability correction (p’ = 0.033). In three mosaic fathers, ultra-low fractions of mosaicism (MAF < 1%) were detected in the sperm samples. In 44 of 45 cases, mosaicism was also observed in other parental peripheral tissues. Hierarchical clustering showed that MAFs measured in the paternal sperm, hair follicles and urine samples were clustered closest together. Milder epileptic phenotypes were more likely to be observed in mosaic parents (p = 3.006e-06). Our study provides new insights for genetic counseling.
|