3
Cheng H, Capponi S, Wakeling E, Marchi E, Li Q, Zhao M, Weng C, Piatek SG, Ahlfors H, Kleyner R, Rope A, Lumaka A, Lukusa P, Devriendt K, Vermeesch J, Posey JE, Palmer EE, Murray L, Leon E, Diaz J, Worgan L, Mallawaarachchi A, Vogt J, de Munnik SA, Dreyer L, Baynam G, Ewans L, Stark Z, Lunke S, Gonçalves AR, Soares G, Oliveira J, Fassi E, Willing M, Waugh JL, Faivre L, Riviere JB, Moutton S, Mohammed S, Payne K, Walsh L, Begtrup A, Sacoto MJG, Douglas G, Alexander N, Buckley MF, Mark PR, Adès LC, Sandaradura SA, Lupski JR, Roscioli T, Agrawal PB, Kline AD, Wang K, Timmers HTM, Lyon GJ. Missense variants in TAF1 and developmental phenotypes: challenges of determining pathogenicity. Hum Mutat 2019; 41:10.1002/humu.23936. [PMID: 31646703 PMCID: PMC7187541 DOI: 10.1002/humu.23936] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/19/2019] [Accepted: 10/16/2019] [Indexed: 12/26/2022]
Abstract
We recently described a new neurodevelopmental syndrome (TAF1/MRXS33 intellectual disability syndrome) (MIM# 300966) caused by pathogenic variants involving the X-linked gene TAF1, which participates in RNA polymerase II transcription. The initial study reported eleven families, and the syndrome was defined as presenting early in life with hypotonia, facial dysmorphia, and developmental delay that evolved into intellectual disability (ID) and/or autism spectrum disorder (ASD). We have now identified an additional 27 families through a genotype-first approach. Familial segregation analysis, clinical phenotyping, and bioinformatics were used to assess potential variant pathogenicity, and molecular modelling was performed for those variants falling within structurally characterized domains of TAF1. A novel phenotypic clustering approach was also applied, in which the phenotypes of affected individuals were classified using 51 standardized Human Phenotype Ontology (HPO) terms. Phenotypes associated with TAF1 variants show considerable pleiotropy and clinical variability, but prominent among previously unreported effects were brain morphological abnormalities, seizures, hearing loss, and heart malformations. Our allelic series broadens the phenotypic spectrum of TAF1/MRXS33 intellectual disability syndrome and the range of TAF1 molecular defects in humans. It also illustrates the challenges for determining the pathogenicity of inherited missense variants, particularly for genes mapping to chromosome X.
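The abstract does not say how the phenotypic clustering was computed. As a rough illustration only, one common way to compare individuals annotated with a fixed set of ontology terms is the Jaccard distance between their term sets. In the sketch below the individual labels (`P1`, `P2`, `P3`) and the plain-text term names standing in for HPO identifiers are hypothetical, and the choice of Jaccard distance is an assumption, not the authors' method.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Tuple


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(
    phenotypes: Dict[str, FrozenSet[str]]
) -> Dict[Tuple[str, str], float]:
    """Distance for every unordered pair of individuals, keyed in sorted order."""
    return {
        (x, y): jaccard_distance(phenotypes[x], phenotypes[y])
        for x, y in combinations(sorted(phenotypes), 2)
    }


# Hypothetical annotations; real data would use HPO term identifiers.
example = {
    "P1": frozenset({"hypotonia", "seizures"}),
    "P2": frozenset({"hypotonia", "hearing loss"}),
    "P3": frozenset({"hypotonia", "seizures"}),
}
distances = pairwise_distances(example)
```

A distance matrix of this kind could then be passed to any standard hierarchical clustering routine; which one suits a given cohort depends on its size and on how sparsely the terms are annotated.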
Affiliation(s)
- Hanyin Cheng
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Simona Capponi
  - German Cancer Consortium (DKTK), Partner Site Freiburg, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Department of Urology, Medical Faculty-University of Freiburg, Freiburg, Germany
- Emma Wakeling
  - North West Thames Regional Genetics Service, London North West University Healthcare NHS Trust, Harrow, UK
- Elaine Marchi
  - Institute for Basic Research in Developmental Disabilities (IBR), Staten Island, New York
- Quan Li
  - Princess Margaret Cancer Centre, University Health Network, University of Toronto, Toronto, Ontario, Canada
- Mengge Zhao
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Medical Center, New York, New York
- Stefan G. Piatek
  - North East Thames Regional Genetics Laboratory, Great Ormond Street Hospital, London, UK
- Helena Ahlfors
  - North East Thames Regional Genetics Laboratory, Great Ormond Street Hospital, London, UK
- Robert Kleyner
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
- Alan Rope
  - Kaiser Permanente Center for Health Research, Portland, Oregon
  - Genome Medical, South San Francisco, California
- Aimé Lumaka
  - Department of Biomedical and Preclinical Sciences, GIGA-R, Laboratory of Human Genetics, University of Liège, Liège, Belgium
  - Institut National de Recherche Biomédicale, Kinshasa, DR Congo
  - Centre for Human Genetics, Faculty of Medicine, University of Kinshasa, Kinshasa, DR Congo
- Prosper Lukusa
  - Institut National de Recherche Biomédicale, Kinshasa, DR Congo
  - Centre for Human Genetics, Faculty of Medicine, University of Kinshasa, Kinshasa, DR Congo
  - Centre for Human Genetics, University Hospital, University of Leuven, Leuven, Belgium
- Koenraad Devriendt
  - Centre for Human Genetics, University Hospital, University of Leuven, Leuven, Belgium
- Joris Vermeesch
  - Centre for Human Genetics, University Hospital, University of Leuven, Leuven, Belgium
- Jennifer E. Posey
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Elizabeth E. Palmer
  - Genetics of Learning Disability Service, Newcastle, New South Wales, Australia
  - School of Women’s and Children’s Health, University of New South Wales, Randwick, New South Wales, Australia
- Lucinda Murray
  - Genetics of Learning Disability Service, Newcastle, New South Wales, Australia
- Eyby Leon
  - Rare Disease Institute, Children’s National Health System, Washington, District of Columbia
- Jullianne Diaz
  - Rare Disease Institute, Children’s National Health System, Washington, District of Columbia
- Lisa Worgan
  - Department of Clinical Genetics, Liverpool Hospital, Sydney, New South Wales, Australia
- Amali Mallawaarachchi
  - Department of Clinical Genetics, Liverpool Hospital, Sydney, New South Wales, Australia
- Julie Vogt
  - West Midlands Regional Clinical Genetics Service and Birmingham Health Partners, Birmingham Women’s and Children’s Hospitals NHS Foundation Trust, Birmingham, UK
- Sonja A. de Munnik
  - Department of Human Genetics, Institute for Genetic and Metabolic Disease, Radboud University Medical Center, Nijmegen, The Netherlands
- Lauren Dreyer
  - Genetic Services of Western Australia, Undiagnosed Diseases Program, Perth, Western Australia, Australia
- Gareth Baynam
  - Genetic Services of Western Australia, Undiagnosed Diseases Program, Perth, Western Australia, Australia
  - Western Australian Register of Developmental Anomalies, Perth, Western Australia, Australia
  - Institute for Immunology and Infectious Diseases, Murdoch University, Perth, Western Australia, Australia
  - Telethon Kids Institute, Perth, Western Australia, Australia
  - Division of Paediatrics, School of Medicine, University of Western Australia, Perth, Western Australia, Australia
- Lisa Ewans
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
- Zornitza Stark
  - Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
  - Department of Paediatrics, University of Melbourne, Melbourne, Victoria, Australia
  - Australian Genomics Health Alliance, Melbourne, Victoria, Australia
- Sebastian Lunke
  - Victorian Clinical Genetics Services, Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
  - Department of Paediatrics, University of Melbourne, Melbourne, Victoria, Australia
  - Australian Genomics Health Alliance, Melbourne, Victoria, Australia
- Ana R. Gonçalves
  - Center for Medical Genetics Dr. Jacinto de Magalhães, Hospital and University Center of Porto, Porto, Portugal
- Gabriela Soares
  - Center for Medical Genetics Dr. Jacinto de Magalhães, Hospital and University Center of Porto, Porto, Portugal
- Jorge Oliveira
  - Center for Medical Genetics Dr. Jacinto de Magalhães, Hospital and University Center of Porto, Porto, Portugal
  - unIGENe, and Center for Predictive and Preventive Genetics (CGPP), Institute for Molecular and Cell Biology (IBMC), Institute of Health Research and Innovation (i3S), University of Porto, Porto, Portugal
- Emily Fassi
  - Department of Pediatrics, Division of Genetics and Genomic Medicine, Washington University School of Medicine, St. Louis, Missouri
- Marcia Willing
  - Department of Pediatrics, Division of Genetics and Genomic Medicine, Washington University School of Medicine, St. Louis, Missouri
- Jeff L. Waugh
  - Department of Neurology, Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts
  - Department of Pediatrics, Division of Pediatric Neurology, University of Texas Southwestern, Dallas, Texas
- Laurence Faivre
  - INSERM U1231, LNC UMR1231 GAD, Burgundy University, Dijon, France
- Sebastien Moutton
  - INSERM U1231, LNC UMR1231 GAD, Burgundy University, Dijon, France
  - Department of Medical Genetics, Reference Center for Developmental Anomalies, Bordeaux University Hospital, Bordeaux, France
- Katelyn Payne
  - Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana
- Laurence Walsh
  - Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana
- Michael F. Buckley
  - New South Wales Health Pathology Genomic Laboratory, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Paul R. Mark
  - Spectrum Health Division of Medical and Molecular Genetics, Grand Rapids, Michigan
- Lesley C. Adès
  - Department of Paediatrics and Child Health, University of Sydney, Sydney, New South Wales, Australia
  - Department of Genetics, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia
- Sarah A. Sandaradura
  - Department of Paediatrics and Child Health, University of Sydney, Sydney, New South Wales, Australia
  - Department of Genetics, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia
- James R. Lupski
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas
  - Department of Pediatrics, Texas Children’s Hospital, Houston, Texas
- Tony Roscioli
  - New South Wales Health Pathology Genomic Laboratory, Prince of Wales Hospital, Randwick, New South Wales, Australia
  - Centre for Clinical Genetics, Sydney Children’s Hospital, Randwick, New South Wales, Australia
  - Neuroscience Research Australia, University of New South Wales, Sydney, New South Wales, Australia
- Pankaj B. Agrawal
  - Divisions of Newborn Medicine and Genetics and Genomics, Manton Center for Orphan Disease Research, Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts
- Antonie D. Kline
  - Harvey Institute for Human Genetics, Greater Baltimore Medical Center, Baltimore, Maryland
- Kai Wang
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania
- H. T. Marc Timmers
  - German Cancer Consortium (DKTK), Partner Site Freiburg, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Department of Urology, Medical Faculty-University of Freiburg, Freiburg, Germany
- Gholson J. Lyon
  - Institute for Basic Research in Developmental Disabilities (IBR), Staten Island, New York
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
  - The Graduate Center, The City University of New York, New York, New York
6
Sun X, Gao J, Jin P, Eng C, Burchard EG, Beaty TH, Ruczinski I, Mathias RA, Barnes K, Wang F, Qin ZS. Optimized distributed systems achieve significant performance improvement on sorted merging of massive VCF files. Gigascience 2018; 7:4995263. [PMID: 29762754 PMCID: PMC6007233 DOI: 10.1093/gigascience/giy052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/12/2017] [Revised: 02/06/2018] [Accepted: 05/05/2018] [Indexed: 12/24/2022] [Open access]
Abstract
Background: Sorted merging of genomic data is a common data operation necessary in many sequencing-based studies. It involves sorting and merging genomic data from different subjects by their genomic locations. In particular, merging a large number of variant call format (VCF) files is frequently required in large-scale whole-genome sequencing or whole-exome sequencing projects. Traditional single-machine-based methods become increasingly inefficient when processing large numbers of files due to the excessive computation time and input/output bottleneck. Distributed systems and more recent cloud-based systems offer an attractive solution. However, carefully designed and optimized workflow patterns and execution plans (schemas) are required to take full advantage of the increased computing power while overcoming bottlenecks to achieve high performance.
Findings: In this study, we custom-design optimized schemas for three Apache big data platforms, Hadoop (MapReduce), HBase, and Spark, to perform sorted merging of a large number of VCF files. These schemas all adopt the divide-and-conquer strategy to split the merging job into sequential phases/stages consisting of subtasks that are conquered in an ordered, parallel, and bottleneck-free way. In two illustrating examples, we test the performance of our schemas on merging multiple VCF files into either a single TPED or a single VCF file, which are benchmarked against the traditional single/parallel multiway-merge methods, a message passing interface (MPI)-based high-performance computing (HPC) implementation, and the popular VCFTools.
Conclusions: Our experiments suggest all three schemas either deliver a significant improvement in efficiency or render much better strong and weak scalabilities over traditional methods. Our findings provide generalized scalable schemas for performing sorted merging on genetics and genomics data using these Apache distributed systems.
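To make "sorted merging by genomic location" concrete, here is a minimal single-machine sketch of a k-way (multiway) merge, the kind of traditional baseline the abstract refers to. It is not the authors' Hadoop, HBase, or Spark schema. The record shape `(chromosome, position, sample_id, genotype)`, the `CHROM_ORDER` table, and the function names are assumptions made for illustration; real VCF files carry far more fields and define contig order in their headers.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record shape: (chromosome, position, sample_id, genotype).
Record = Tuple[str, int, str, str]

# Assumed chromosome ordering for this sketch only.
CHROM_ORDER: Dict[str, int] = {str(i): i for i in range(1, 23)}
CHROM_ORDER.update({"X": 23, "Y": 24, "MT": 25})


def sort_key(rec: Record) -> Tuple[int, int]:
    """Order records by genomic location: chromosome rank, then position."""
    chrom, pos, _sample, _genotype = rec
    return (CHROM_ORDER[chrom], pos)


def merge_sorted(streams: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """K-way merge of per-sample streams that are each already sorted."""
    return heapq.merge(*streams, key=sort_key)


def group_by_site(merged: Iterable[Record]) -> List[Tuple[str, int, Dict[str, str]]]:
    """Collapse consecutive records at the same site into one row per location."""
    rows: List[Tuple[str, int, Dict[str, str]]] = []
    for chrom, pos, sample, genotype in merged:
        if rows and rows[-1][0] == chrom and rows[-1][1] == pos:
            rows[-1][2][sample] = genotype
        else:
            rows.append((chrom, pos, {sample: genotype}))
    return rows


# Two small, already-sorted per-sample inputs.
sample_a = [("1", 100, "A", "0/1"), ("2", 50, "A", "1/1")]
sample_b = [("1", 100, "B", "0/0"), ("1", 200, "B", "0/1"), ("X", 5, "B", "1/1")]
rows = group_by_site(merge_sorted([sample_a, sample_b]))
```

Because `heapq.merge` holds only one pending record per input stream, memory stays proportional to the number of files rather than to their total size. The work is still sequential on one machine, which is the limitation the abstract says distributed schemas are meant to address.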
Affiliation(s)
- Xiaobo Sun
  - Department of Computer Sciences, Emory University, Atlanta, GA 30322, USA
- Jingjing Gao
  - Department of Medical Informatics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Peng Jin
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Celeste Eng
  - Department of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
- Esteban G Burchard
  - Department of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
- Terri H Beaty
  - Department of Epidemiology, Bloomberg School of Public Health, JHU, Baltimore, MD 21205, USA
- Ingo Ruczinski
  - Department of Biostatistics, Bloomberg School of Public Health, JHU, Baltimore, MD 21205, USA
- Rasika A Mathias
  - Department of Medicine, Johns Hopkins University, Baltimore, MD 21224, USA
- Kathleen Barnes
  - Department of Medicine, University of Colorado, Denver, Aurora, CO 80045, USA
- Fusheng Wang
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY 11794, USA
- Zhaohui S Qin
  - Department of Medical Informatics, Emory University School of Medicine, Atlanta, GA 30322, USA
  - Department of Biostatistics, Emory University, Atlanta, GA 30322, USA
7
Jin ZB, Li Z, Liu Z, Jiang Y, Cai XB, Wu J. Identification of de novo germline mutations and causal genes for sporadic diseases using trio-based whole-exome/genome sequencing. Biol Rev Camb Philos Soc 2017; 93:1014-1031. [PMID: 29154454 DOI: 10.1111/brv.12383] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/16/2017] [Revised: 09/28/2017] [Accepted: 10/10/2017] [Indexed: 12/14/2022]
Abstract
Whole-genome or whole-exome sequencing (WGS/WES) of the affected proband together with normal parents (trio) is commonly adopted to identify de novo germline mutations (DNMs) underlying sporadic cases of various genetic disorders. However, our current knowledge of the occurrence and functional effects of DNMs remains limited and accurately identifying the disease-causing DNM from a group of irrelevant DNMs is complicated. Herein, we provide a general-purpose discussion of important issues related to pathogenic gene identification based on trio-based WGS/WES data. Specifically, the relevance of DNMs to human sporadic diseases, current knowledge of DNM biogenesis mechanisms, and common strategies or software tools used for DNM detection are reviewed, followed by a discussion of pathogenic gene prioritization. In addition, several key factors that may affect DNM identification accuracy and causal gene prioritization are reviewed. Based on recent major advances, this review both sheds light on how trio-based WGS/WES technologies can play a significant role in the identification of DNMs and causal genes for sporadic diseases, and also discusses existing challenges.
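The core idea behind trio-based DNM detection can be shown with a deliberately naive filter: keep sites where the child carries a heterozygous alternate allele and both parents are homozygous reference. This is only a sketch for autosomal sites under stated assumptions. The input layout, the `child`/`father`/`mother` keys, and the function name are hypothetical, and the detection tools the review surveys also weigh read depth, genotype quality, and error models that are omitted here.

```python
from typing import Dict, List, Tuple

# Hypothetical input: calls keyed by (chrom, pos, ref, alt), each mapping
# a trio member to a VCF-style genotype string.
Site = Tuple[str, int, str, str]
Calls = Dict[Site, Dict[str, str]]

HOM_REF = {"0/0", "0|0"}
HET = {"0/1", "1/0", "0|1", "1|0"}


def candidate_de_novo(calls: Calls) -> List[Site]:
    """Return sites where the child is heterozygous and both parents are hom-ref.

    Sites with a missing genotype for any trio member are skipped.
    """
    candidates = []
    for site, genotypes in calls.items():
        child = genotypes.get("child")
        father = genotypes.get("father")
        mother = genotypes.get("mother")
        if child in HET and father in HOM_REF and mother in HOM_REF:
            candidates.append(site)
    return sorted(candidates)


# Made-up genotypes for illustration only.
trio_calls: Calls = {
    ("1", 1000, "A", "G"): {"child": "0/1", "father": "0/0", "mother": "0/0"},
    ("1", 2000, "C", "T"): {"child": "0/1", "father": "0/1", "mother": "0/0"},
    ("2", 3000, "G", "A"): {"child": "0/1", "father": "0/0"},
}
hits = candidate_de_novo(trio_calls)
```

A candidate list produced this way would typically contain many false positives from sequencing and calling errors, which is one reason the review treats identification accuracy and causal gene prioritization as separate problems.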
Affiliation(s)
- Zi-Bing Jin
  - Division of Ophthalmic Genetics, The Eye Hospital, School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou, 325027, China
  - State Key Laboratory of Ophthalmology Optometry and Vision Science, Wenzhou Medical University, Wenzhou, 325027, China
- Zhongshan Li
  - Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, 325000, China
- Zhenwei Liu
  - Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, 325000, China
- Yi Jiang
  - Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, 325000, China
- Xue-Bi Cai
  - Division of Ophthalmic Genetics, The Eye Hospital, School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou, 325027, China
  - State Key Laboratory of Ophthalmology Optometry and Vision Science, Wenzhou Medical University, Wenzhou, 325027, China
- Jinyu Wu
  - Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, 325000, China
8
Kleyner R, Malcolmson J, Tegay D, Ward K, Maughan A, Maughan G, Nelson L, Wang K, Robison R, Lyon GJ. KBG syndrome involving a single-nucleotide duplication in ANKRD11. Cold Spring Harb Mol Case Stud 2017; 2:a001131. [PMID: 27900361 PMCID: PMC5111005 DOI: 10.1101/mcs.a001131] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/11/2022] [Open access]
Abstract
KBG syndrome is a rare autosomal dominant genetic condition characterized by neurological involvement and distinct facial, hand, and skeletal features. More than 70 cases have been reported; however, it is likely that KBG syndrome is underdiagnosed because of lack of comprehensive characterization of the heterogeneous phenotypic features. We describe the clinical manifestations in a male currently 13 years of age, who exhibited symptoms including epilepsy, severe developmental delay, distinct facial features, and hand anomalies, without a positive genetic diagnosis. Subsequent exome sequencing identified a novel de novo heterozygous single base pair duplication (c.6015dupA) in ANKRD11, which was validated by Sanger sequencing. This single-nucleotide duplication is predicted to lead to a premature stop codon and loss of function in ANKRD11, thereby implicating it as contributing to the proband's symptoms and yielding a molecular diagnosis of KBG syndrome. Before molecular diagnosis, this syndrome was not recognized in the proband, as several key features of the disorder were mild and were not recognized by clinicians, further supporting the concept of variable expressivity in many disorders. Although a diagnosis of cerebral folate deficiency has also been given, its significance for the proband's condition remains uncertain.
Affiliation(s)
- Robert Kleyner
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Janet Malcolmson
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
  - Genetic Counseling Graduate Program, Long Island University (LIU), Brookville, New York 11548, USA
- David Tegay
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Kenneth Ward
  - Affiliated Genetics, Inc., Salt Lake City, Utah 84109, USA
- Glenn Maughan
  - KBG Syndrome Foundation, West Jordan, Utah 84088, USA
- Lesa Nelson
  - Affiliated Genetics, Inc., Salt Lake City, Utah 84109, USA
- Kai Wang
  - Zilkha Neurogenetic Institute, University of Southern California, Los Angeles, California 90089, USA
  - Department of Psychiatry & Behavioral Sciences, Keck School of Medicine, University of Southern California, Los Angeles, California 90033, USA
  - Utah Foundation for Biomedical Research, Salt Lake City, Utah 84107, USA
- Reid Robison
  - Utah Foundation for Biomedical Research, Salt Lake City, Utah 84107, USA
- Gholson J Lyon
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
  - Utah Foundation for Biomedical Research, Salt Lake City, Utah 84107, USA