1
Peng X, Xue DJ. Phenotypic Screening of Molecular Docking Enriched Chemical Libraries from Targets Identified in Ischemic Stroke Genome Data by Network-Based Method. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:9999340. [PMID: 34820079 PMCID: PMC8608496 DOI: 10.1155/2021/9999340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2021] [Accepted: 10/19/2021] [Indexed: 11/26/2022]
Abstract
Cerebral ischemia (IS) is one of the main cardiovascular diseases threatening life and causing disability. Like most cardiovascular events, the disease progression of IS affects a variety of signaling pathways and changes multiple overexpressed genes in the body. The use of new therapeutic agents to interfere with the disease progression of cardiovascular diseases (such as IS) can be achieved by selectively regulating small molecules of the target set of different signal pathways, also known as selective multipharmacology. Phenotypic screening can be an effective method to solve this problem, but the lack of targeted methods for ischemic stroke limits its impact. Here, we aim to identify IS-specific targets from RNA sequencing data with a network-based approach. A molecular docking approach was applied to screen over 210,000 molecules from the SPECS compound library. Screening of this enriched library resulted in 605 candidates that led to several potent active hits. The novelty analysis suggested that the structural scaffolds of the compounds were dissimilar to existing IKKB inhibitors, and further biological test results confirmed that two identified compounds represented novel IKKB inhibitors. Further, docking exploration with IKKB (PDB id: 4KIK) showed that the three selective compounds were stable inside the binding pocket of IKKB, sharing homologous compound-protein interactions with the bioactive inhibitor CHEMBL1762621. Our screening method is expected to produce selective multidrug lead compounds for the development of treatments for complex diseases, such as ischemic stroke.
Affiliation(s)
- Xiaojiang Peng
- Zhaoqing Medical College, No. 6 Xijiangnan Road, Duanzhou District, Zhaoqing 526020, Guangdong, China
- Dao-jin Xue
- The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangdong Provincial Hospital of Chinese Medicine, 111 Dade Road, Yuexiu District 510120, Guangzhou, China
2
Gonzalez-Calderon G, Liu R, Carvajal R, Teer JK. A negative storage model for precise but compact storage of genetic variation data. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021; 2020:5820061. [PMID: 32293013 PMCID: PMC7157186 DOI: 10.1093/database/baz158] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/27/2019] [Revised: 11/21/2019] [Accepted: 12/09/2020] [Indexed: 12/30/2022]
Abstract
Falling sequencing costs and large initiatives are resulting in increasing amounts of data available for investigator use. However, there are informatics challenges in being able to access genomic data. Performance and storage are well-appreciated issues, but precision is critical for meaningful analysis and interpretation of genomic data. There is an inherent accuracy vs. performance trade-off with existing solutions. The most common approach (Variant-only Storage Model, VOSM) stores only variant data. Systems must therefore assume that everything not variant is reference, sacrificing precision and potentially accuracy. A more complete model (Full Storage Model, FSM) would store the state of every base (variant, reference and missing) in the genome thereby sacrificing performance. A compressed variation of the FSM can store the state of contiguous regions of the genome as blocks (Block Storage Model, BLSM), much like the file-based gVCF model. We propose a novel approach by which this state is encoded such that both performance and accuracy are maintained. The Negative Storage Model (NSM) can store and retrieve precise genomic state from different sequencing sources, including clinical and whole exome sequencing panels. Reduced storage requirements are achieved by storing only the variant and missing states and inferring the reference state. We evaluate the performance characteristics of FSM, BLSM and NSM and demonstrate dramatic improvements in storage and performance using the NSM approach.
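The core idea of the NSM, storing only variant and missing states and inferring reference for everything else, can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the class name `NegativeStore`, its methods, and the in-memory dictionary and list layout are all hypothetical, and the abstract does not describe the actual encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NegativeStore:
    """Toy per-sample store (hypothetical): keeps variant and missing states only."""

    # (chromosome, position) -> alternate allele
    variants: Dict[Tuple[str, int], str] = field(default_factory=dict)
    # (chromosome, start, end) intervals with no usable coverage, inclusive
    missing: List[Tuple[str, int, int]] = field(default_factory=list)

    def add_variant(self, chrom: str, pos: int, alt: str) -> None:
        self.variants[(chrom, pos)] = alt

    def add_missing(self, chrom: str, start: int, end: int) -> None:
        self.missing.append((chrom, start, end))

    def state(self, chrom: str, pos: int) -> str:
        """Return the state of one base; reference is inferred, never stored."""
        if (chrom, pos) in self.variants:
            return "variant:" + self.variants[(chrom, pos)]
        for c, start, end in self.missing:
            if c == chrom and start <= pos <= end:
                return "missing"
        return "reference"


store = NegativeStore()
store.add_variant("chr1", 100, "T")
store.add_missing("chr1", 200, 299)
for position in (50, 100, 250):
    print(position, store.state("chr1", position))
```

Unlike a variant-only model, a query inside a missing interval is reported as missing rather than silently assumed to be reference, while the number of stored records stays proportional to the variant and missing regions instead of the genome length.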
Affiliation(s)
- Guillermo Gonzalez-Calderon
- Biostatistics and Bioinformatics Shared Resource, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL, 33912, USA
- Ruizheng Liu
- Biostatistics and Bioinformatics Shared Resource, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL, 33912, USA
- Rodrigo Carvajal
- Biostatistics and Bioinformatics Shared Resource, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL, 33912, USA
- Jamie K Teer
- Biostatistics and Bioinformatics Shared Resource, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL, 33912, USA; Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL, 33912, USA
3
Noncoding deletions reveal a gene that is critical for intestinal function. Nature 2019; 571:107-111. [PMID: 31217582 PMCID: PMC7061489 DOI: 10.1038/s41586-019-1312-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/19/2014] [Accepted: 05/16/2019] [Indexed: 01/10/2023]
4
Alzu'bi AA, Zhou L, Watzlaf VJM. Genetic Variations and Precision Medicine. PERSPECTIVES IN HEALTH INFORMATION MANAGEMENT 2019; 16:1a. [PMID: 31019429 PMCID: PMC6462879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The time and costs associated with the sequencing of a human genome have decreased significantly in recent years. Many people have chosen to have their genomes sequenced to receive genomics-based personalized healthcare services. To reach the goal of genomics-based precision medicine, health information management (HIM) professionals need to manage and analyze patients' genomic data. Two important pieces of information from the genome sequence are the risk of genetic diseases and the specific medication or pharmacogenomic results for the individual patient, both of which are linked to a patient's genetic variations. In this review article, we introduce genetic variations, including their data types, relevant databases, and some currently available analysis methods and systems. HIM professionals can choose to use these databases, methods, and systems in the management and analysis of patients' genomic data.
Affiliation(s)
- Amal Adel Alzu'bi
- The Department of Computer Information Systems at Jordan University of Science and Technology in Irbid, Jordan
- Leming Zhou
- The Department of Health Information Management at the University of Pittsburgh in Pittsburgh, PA
- Valerie J M Watzlaf
- The Department of Health Information Management at the University of Pittsburgh in Pittsburgh, PA
5
Delgado-Vega AM, Martínez-Bueno M, Oparina NY, López Herráez D, Kristjansdottir H, Steinsson K, Kozyrev SV, Alarcón-Riquelme ME. Whole Exome Sequencing of Patients from Multicase Families with Systemic Lupus Erythematosus Identifies Multiple Rare Variants. Sci Rep 2018; 8:8775. [PMID: 29884787 PMCID: PMC5993790 DOI: 10.1038/s41598-018-26274-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 08/24/2017] [Accepted: 05/03/2018] [Indexed: 01/30/2023]
Abstract
In an effort to identify rare alleles associated with SLE, we have performed whole exome sequencing of the most distantly related affected individuals from two large Icelandic multicase SLE families followed by targeted genotyping of additional relatives. We identified multiple rare likely pathogenic variants in nineteen genes co-segregating with the disease through multiple generations. Gene co-expression and protein-protein interaction analysis identified a network of highly connected genes comprising several loci previously implicated in autoimmune diseases. These genes were significantly enriched for immune system development, lymphocyte activation, DNA repair, and V(D)J gene recombination GO-categories. Furthermore, we found evidence of aggregate association and enrichment of rare variants at the FAM71E1/EMC10 locus in an independent set of 4,254 European SLE-cases and 4,349 controls. Our study presents evidence supporting that multiple rare likely pathogenic variants, in newly identified genes involved in known disease pathogenic pathways, segregate with SLE at the familial and population level.
Affiliation(s)
- Angélica M Delgado-Vega
- Department of Immunology, Genetics and Pathology, Uppsala University, The Rudbeck Laboratory, Uppsala, Sweden
- Manuel Martínez-Bueno
- Pfizer/University of Granada/Andalusian Government Centre for Genomics and Oncological Research (GENYO), Granada, Spain
- Nina Y Oparina
- Institute for Environmental Medicine, Karolinska Institutet, Solna, Sweden; Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- David López Herráez
- Department Effect-Directed Analysis, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Sergey V Kozyrev
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Marta E Alarcón-Riquelme
- Pfizer/University of Granada/Andalusian Government Centre for Genomics and Oncological Research (GENYO), Granada, Spain; Institute for Environmental Medicine, Karolinska Institutet, Solna, Sweden
6
Granata I, Sangiovanni M, Maiorano F, Miele M, Guarracino MR. Var2GO: a web-based tool for gene variants selection. BMC Bioinformatics 2016; 17:376. [PMID: 28185576 PMCID: PMC5123234 DOI: 10.1186/s12859-016-1197-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
Background: One of the most challenging issues in the variant calling process is handling the resulting data and filtering the genes, retaining only the ones strictly related to the topic of interest. Several tools make it possible to gather annotations at different levels of complexity for the detected genes and to group them according to the pathways and/or processes they belong to. However, this can be a time-consuming and frustrating task. This is partly due to the size of the file, which might contain many thousands of genes, and to the search for associated variants, which requires a gene-by-gene investigation and annotation approach. As a consequence, the initial gene list is often reduced by exploiting knowledge of variant effect, novelty and genotype, with the potential risk of losing meaningful pieces of information. Results: Here we present Var2GO, a new web-based tool to support the annotation and filtering of variants and genes coming from variant calling of high-throughput sequencing data. Var2GO accepts either the unprocessed Variant Call Format (VCF) file or a table containing the annotated variants. The raw data undergo a preliminary step of variant annotation, using the SnpEff tool, and are converted to a table format. The table is then uploaded into a database generated on the fly. Genes associated with the variants are automatically annotated with the corresponding Gene Ontology terms covering the three GO domains. Using the web interface, it is then possible to filter and extract, from the whole list, genes having annotations in the domain of interest, by simply specifying filtering parameters and one or more keywords. The relevance of this tool is demonstrated on exome sequencing data. Conclusions: Var2GO is a novel tool that implements a topic-based approach, expressly designed to help biologists narrow the search for relevant genes coming from variant calling analysis. Its main purpose is to support non-bioinformaticians in handling and processing raw variant calling data through an intuitive web interface. Furthermore, Var2GO offers a complete pipeline that, starting from the raw VCF file, allows users to annotate both variants and associated genes and supports the extraction of relevant biological knowledge.
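The keyword-based filtering step described in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming that gene annotations are already available as (GO domain, GO term) pairs; the function `filter_genes`, the gene names, and the example annotations are hypothetical and are not taken from Var2GO itself.

```python
from typing import Dict, List, Tuple

# One annotation is a (GO domain, GO term name) pair.
GoAnnotation = Tuple[str, str]

# Hypothetical toy annotations; real data would come from an annotated variant table.
EXAMPLE_ANNOTATIONS: Dict[str, List[GoAnnotation]] = {
    "GENE_A": [("biological_process", "DNA repair")],
    "GENE_B": [("molecular_function", "DNA binding")],
    "GENE_C": [("biological_process", "lymphocyte activation")],
}


def filter_genes(
    annotations: Dict[str, List[GoAnnotation]],
    domain: str,
    keywords: List[str],
) -> List[str]:
    """Return genes with at least one term in `domain` matching any keyword."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for gene, terms in annotations.items():
        for term_domain, term_name in terms:
            if term_domain == domain and any(k in term_name.lower() for k in wanted):
                hits.append(gene)
                break  # one matching term is enough for this gene
    return sorted(hits)


print(filter_genes(EXAMPLE_ANNOTATIONS, "biological_process", ["dna repair"]))
```

Matching is a case-insensitive substring test here; a real tool might instead match on GO identifiers or walk the ontology hierarchy.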
Affiliation(s)
- Ilaria Granata
- High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy.
- Mara Sangiovanni
- High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy
- Francesco Maiorano
- High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy
- Marco Miele
- High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy
- Mario Rosario Guarracino
- High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy
7
Thangam M, Gopal RK. CRCDA--Comprehensive resources for cancer NGS data analysis. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav092. [PMID: 26450948 PMCID: PMC4597977 DOI: 10.1093/database/bav092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/27/2015] [Accepted: 08/31/2015] [Indexed: 12/24/2022]
Abstract
Next generation sequencing (NGS) innovations have set a compelling landmark in the life sciences and changed the direction of research in clinical oncology through their capacity to diagnose and treat cancer. The aim of our portal, comprehensive resources for cancer NGS data analysis (CRCDA), is to provide a collection of different NGS tools and pipelines under diverse classes, together with cancer pathways and databases and, furthermore, literature information from PubMed. The literature data were constrained to the 18 most common cancer types, such as breast cancer, colon cancer and other cancers that occur in the worldwide population. For convenience, NGS-cancer tools have been categorized into cancer genomics, cancer transcriptomics, cancer epigenomics, quality control and visualization. Pipelines for variant detection, quality control and data analysis were listed to provide an out-of-the-box solution for NGS data analysis, which may help researchers to overcome challenges in selecting and configuring individual tools for analysing exome, whole genome and transcriptome data. An extensive search page was developed that can be queried by using (i) type of data [literature, gene data and sequence read archive (SRA) data] and (ii) type of cancer (selected based on global incidence and accessibility of data). For each category of analysis, a variety of tools is available, and the biggest challenge is in searching for and using the right tool for the right application. The objective of the work is to collect the tools in each category available at various places and arrange the tools and other data in a simple and user-friendly manner so that biologists and oncologists can find information more easily. To the best of our knowledge, we have collected and presented a comprehensive package of most of the resources available in cancer for NGS data analysis. Given these factors, we believe that this website will be a useful resource to the NGS research community working on cancer.
Database URL: http://bioinfo.au-kbc.org.in/ngs/ngshome.html.
Affiliation(s)
- Manonanthini Thangam
- AU-KBC Research Centre, MIT Campus of Anna University, Chromepet, Chennai, India
- Ramesh Kumar Gopal
- AU-KBC Research Centre, MIT Campus of Anna University, Chromepet, Chennai, India
8
Pavlopoulos GA, Malliarakis D, Papanikolaou N, Theodosiou T, Enright AJ, Iliopoulos I. Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future. Gigascience 2015; 4:38. [PMID: 26309733 PMCID: PMC4548842 DOI: 10.1186/s13742-015-0077-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 05/26/2015] [Accepted: 08/03/2015] [Indexed: 01/31/2023]
Abstract
"Α picture is worth a thousand words." This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? Life sciences is one of the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that would enable for enhancing visualization further.
Affiliation(s)
- Georgios A Pavlopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
- Nikolas Papanikolaou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
- Theodosis Theodosiou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
- Anton J Enright
- EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SD UK
- Ioannis Iliopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
9
Mousallem T, Urban TJ, McSweeney KM, Kleinstein SE, Zhu M, Adeli M, Parrott RE, Roberts JL, Krueger B, Buckley RH, Goldstein DB. Clinical application of whole-genome sequencing in patients with primary immunodeficiency. J Allergy Clin Immunol 2015; 136:476-9.e6. [PMID: 25981738 DOI: 10.1016/j.jaci.2015.02.040] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/21/2014] [Revised: 12/19/2014] [Accepted: 02/03/2015] [Indexed: 11/17/2022]
Affiliation(s)
- Talal Mousallem
- Departments of Internal Medicine and Pediatrics, Wake Forest School of Medicine, Winston-Salem, NC; Department of Pediatrics, Duke University Medical Center, Durham, NC.
- Thomas J Urban
- Center for Human Genome Variation, Duke University School of Medicine, Durham, NC; Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC
- K Melodi McSweeney
- Center for Human Genome Variation, Duke University School of Medicine, Durham, NC; Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC; Institute for Genomic Medicine, Columbia University Medical Center, New York, NY
- Sarah E Kleinstein
- Center for Human Genome Variation, Duke University School of Medicine, Durham, NC; Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC; Institute for Genomic Medicine, Columbia University Medical Center, New York, NY
- Mingfu Zhu
- Center for Human Genome Variation, Duke University School of Medicine, Durham, NC
- Roberta E Parrott
- Department of Pediatrics, Duke University Medical Center, Durham, NC
- Joseph L Roberts
- Department of Pediatrics, Duke University Medical Center, Durham, NC
- Brian Krueger
- Center for Human Genome Variation, Duke University School of Medicine, Durham, NC; Institute for Genomic Medicine, Columbia University Medical Center, New York, NY
- Rebecca H Buckley
- Department of Pediatrics, Duke University Medical Center, Durham, NC; Department of Immunology, Duke University School of Medicine, Durham, NC.
- David B Goldstein
- Center for Human Genome Variation, Duke University School of Medicine, Durham, NC; Institute for Genomic Medicine, Columbia University Medical Center, New York, NY
10
Pham PH, Shipman WJ, Erikson GA, Schork NJ, Torkamani A. Scripps Genome ADVISER: Annotation and Distributed Variant Interpretation SERver. PLoS One 2015; 10:e0116815. [PMID: 25706643 PMCID: PMC4338027 DOI: 10.1371/journal.pone.0116815] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 05/13/2014] [Accepted: 12/01/2014] [Indexed: 12/31/2022]
Abstract
Interpretation of human genomes is a major challenge. We present the Scripps Genome ADVISER (SG-ADVISER) suite, which aims to fill the gap between data generation and genome interpretation by performing holistic, in-depth, annotations and functional predictions on all variant types and effects. The SG-ADVISER suite includes a de-identification tool, a variant annotation web-server, and a user interface for inheritance and annotation-based filtration. SG-ADVISER allows users with no bioinformatics expertise to manipulate large volumes of variant data with ease--without the need to download large reference databases, install software, or use a command line interface. SG-ADVISER is freely available at genomics.scripps.edu/ADVISER.
Affiliation(s)
- Phillip H. Pham
- Cypher Genomics, Inc., La Jolla, CA 92037, United States of America
- William J. Shipman
- Scripps Health, La Jolla, CA 92037, United States of America
- The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
- Galina A. Erikson
- Scripps Health, La Jolla, CA 92037, United States of America
- The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
- Nicholas J. Schork
- Scripps Health, La Jolla, CA 92037, United States of America
- The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
- The Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, United States of America
- Cypher Genomics, Inc., La Jolla, CA 92037, United States of America
- Ali Torkamani
- Scripps Health, La Jolla, CA 92037, United States of America
- The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
- The Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States of America
- Cypher Genomics, Inc., La Jolla, CA 92037, United States of America
11
Li MJ, Wang J. Current trend of annotating single nucleotide variation in humans--A case study on SNVrap. Methods 2014; 79-80:32-40. [PMID: 25308971 DOI: 10.1016/j.ymeth.2014.10.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/21/2014] [Revised: 09/25/2014] [Accepted: 10/02/2014] [Indexed: 12/16/2022]
Abstract
As high throughput methods, such as whole genome genotyping arrays, whole exome sequencing (WES) and whole genome sequencing (WGS), have detected huge numbers of genetic variants associated with human diseases, functional annotation of these variants is an indispensable step in understanding disease etiology. Large-scale functional genomics projects, such as The ENCODE Project and Roadmap Epigenomics Project, provide genome-wide profiling of functional elements across different human cell types and tissues. With the urgent need to identify disease-causal variants, a comprehensive and easy-to-use annotation tool is in high demand. Here we review and discuss current progress and trends in the variant annotation field. Furthermore, we introduce a comprehensive web portal for annotating human genetic variants. We use gene-based features and the latest functional genomics datasets to annotate single nucleotide variations (SNVs) in humans at whole genome scale. We further apply several function prediction algorithms to annotate SNVs that might affect different biological processes, including transcriptional gene regulation, alternative splicing, post-transcriptional regulation, translation and post-translational modifications. The SNVrap web portal is freely available at http://jjwanglab.org/snvrap.
Affiliation(s)
- Mulin Jun Li
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, China; Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, China; Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China
- Junwen Wang
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, China; Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, China; Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China.
12
ARYANA: Aligning Reads by Yet Another Approach. BMC Bioinformatics 2014; 15 Suppl 9:S12. [PMID: 25252881 PMCID: PMC4168712 DOI: 10.1186/1471-2105-15-s9-s12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Motivation: Although there are many different algorithms and software tools for aligning sequencing reads, fast gapped sequence search is far from solved. Strong interest in fast alignment is best reflected in the $10^6 prize for the Innocentive competition on aligning a collection of reads to a given database of reference genomes. In addition, de novo assembly of next-generation sequencing long reads requires fast overlap-layout-consensus algorithms, which depend on fast and accurate alignment. Contribution: We introduce ARYANA, a fast gapped read aligner built on the BWA indexing infrastructure with a completely new alignment engine that makes it significantly faster than three other aligners (Bowtie2, BWA and SeqAlto), with comparable generality and accuracy. Instead of time-consuming backtracking procedures for handling mismatches, ARYANA uses a seed-and-extend algorithmic framework and achieves significantly improved efficiency by integrating novel algorithmic techniques, including dynamic seed selection, bidirectional seed extension, reset-free hash tables, and gap-filling dynamic programming. As the read length increases, ARYANA's superiority in terms of speed and alignment rate becomes more evident. This is in perfect harmony with the read length trend as sequencing technologies evolve. The algorithmic platform of ARYANA makes it easy to develop mission-specific aligners for other applications using the ARYANA engine. Availability: ARYANA with complete source code can be obtained from http://github.com/aryana-aligner
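The general seed-and-extend idea mentioned in this abstract can be illustrated with a deliberately simplified sketch. It is not ARYANA's engine: it assumes a plain k-mer hash index instead of the BWA index, fixed-length seeds instead of dynamic seed selection, and an ungapped mismatch count instead of gap-filling dynamic programming. The function names `build_index` and `align` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def build_index(ref: str, k: int) -> Dict[str, List[int]]:
    """Map every k-mer of the reference to the positions where it starts."""
    index: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def align(
    read: str,
    ref: str,
    index: Dict[str, List[int]],
    k: int,
    max_mismatches: int = 2,
) -> Optional[Tuple[int, int]]:
    """Return (reference start, mismatches) of the best ungapped hit, or None."""
    best: Optional[Tuple[int, int]] = None
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        # Seed step: exact k-mer lookup proposes candidate placements.
        for hit in index.get(seed, ()):
            start = hit - offset
            if start < 0 or start + len(read) > len(ref):
                continue
            # Extend step: compare the whole read at the candidate placement.
            window = ref[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
                best = (start, mismatches)
    return best


reference = "ACGTACGTTAGCCGATAGGCTA"
kmer_index = build_index(reference, 4)
print(align("TAGCCTATAG", reference, kmer_index, 4))
```

Only placements proposed by an exact seed match are ever verified, which is what lets seed-and-extend avoid scanning the whole reference for each read.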
13
Rare hereditary COL4A3/COL4A4 variants may be mistaken for familial focal segmental glomerulosclerosis. Kidney Int 2014; 86:1253-9. [PMID: 25229338 PMCID: PMC4245465 DOI: 10.1038/ki.2014.305] [Citation(s) in RCA: 158] [Impact Index Per Article: 15.8] [Received: 05/20/2014] [Revised: 07/09/2014] [Accepted: 07/17/2014] [Indexed: 12/12/2022]
Abstract
Focal segmental glomerulosclerosis (FSGS) is a histological lesion with many causes including inherited genetic defects with significant proteinuria being the predominant clinical finding at presentation. Mutations in COL4A3 and COL4A4 are known to cause Alport syndrome, thin basement membrane nephropathy, and to result in pathognomonic glomerular basement membrane findings. Secondary FSGS is known to develop in classic Alport Syndrome at later stages of the disease. Here, we present seven families with rare or novel variants in COL4A3 or COL4A4 (six with single and one with two heterozygous variants) from a cohort of 70 families with a diagnosis of hereditary FSGS. The predominant clinical findings at diagnosis were proteinuria associated with hematuria. In all seven families, there were individuals with nephrotic range proteinuria with histologic features of FSGS by light microscopy. In one family, electron microscopy showed thin glomerular basement membrane, but four other families had variable findings inconsistent with classical Alport nephritis. There was no recurrence of disease after kidney transplantation. Families with COL4A3 and COL4A4 variants that segregated with disease represent 10% of our cohort. Thus, COL4A3 and COL4A4 variants should be considered in the interpretation of next-generation sequencing data from such patients. Furthermore, this study illustrates the power of molecular genetic diagnostics in the clarification of renal phenotypes.
14
Gómez-Herreros F, Schuurs-Hoeijmakers JHM, McCormack M, Greally MT, Rulten S, Romero-Granados R, Counihan TJ, Chaila E, Conroy J, Ennis S, Delanty N, Cortés-Ledesma F, de Brouwer APM, Cavalleri GL, El-Khamisy SF, de Vries BBA, Caldecott KW. TDP2 protects transcription from abortive topoisomerase activity and is required for normal neural function. Nat Genet 2014; 46:516-21. [PMID: 24658003 DOI: 10.1038/ng.2929] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Received: 09/01/2013] [Accepted: 02/28/2014] [Indexed: 12/12/2022]
Abstract
Topoisomerase II (TOP2) removes torsional stress from DNA and facilitates gene transcription by introducing transient DNA double-strand breaks (DSBs). Such DSBs are normally rejoined by TOP2 but on occasion can become abortive and remain unsealed. Here we identify homozygous mutations in the TDP2 gene encoding tyrosyl DNA phosphodiesterase-2, an enzyme that repairs 'abortive' TOP2-induced DSBs, in individuals with intellectual disability, seizures and ataxia. We show that cells from affected individuals are hypersensitive to TOP2-induced DSBs and that loss of TDP2 inhibits TOP2-dependent gene transcription in cultured human cells and in mouse post-mitotic neurons following abortive TOP2 activity. Notably, TDP2 is also required for normal levels of many gene transcripts in developing mouse brain, including numerous gene transcripts associated with neurological function and/or disease, and for normal interneuron density in mouse cerebellum. Collectively, these data implicate chromosome breakage by TOP2 as an endogenous threat to gene transcription and to normal neuronal development and maintenance.
Affiliation(s)
- Fernando Gómez-Herreros
- [1] Genome Damage and Stability Centre, School of Biological Sciences, University of Sussex, Sussex, UK. [2]
- Janneke H M Schuurs-Hoeijmakers
- [1] Department of Human Genetics, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands. [2] Department of Cognitive Neurosciences, Donders Institute for Brain Cognition and Behaviour, Radboud University Medical Centre, Nijmegen, The Netherlands. [3]
- Mark McCormack
- [1] Molecular and Cellular Therapeutics, The Royal College of Surgeons in Ireland, Dublin, Ireland. [2]
- Marie T Greally
- National Centre for Medical Genetics, Our Lady's Children's Hospital, Crumlin, Dublin, Ireland
- Stuart Rulten
- Genome Damage and Stability Centre, School of Biological Sciences, University of Sussex, Sussex, UK
- Rocío Romero-Granados
- Centro Andaluz de Biología Molecular y Medicina Regenerativa (CABIMER), Departamento de Genética, CSIC (Centro Superior de Investigaciones Científicas)-Universidad de Sevilla, Sevilla, Spain
- Elijah Chaila
- Division of Neurology, Beaumont Hospital, Dublin, Ireland
- Judith Conroy
- School of Medicine and Medical Science, University College Dublin, Dublin, Ireland
- Sean Ennis
- School of Medicine and Medical Science, University College Dublin, Dublin, Ireland
- Norman Delanty
- [1] Molecular and Cellular Therapeutics, The Royal College of Surgeons in Ireland, Dublin, Ireland. [2] Division of Neurology, Beaumont Hospital, Dublin, Ireland
- Felipe Cortés-Ledesma
- Centro Andaluz de Biología Molecular y Medicina Regenerativa (CABIMER), Departamento de Genética, CSIC (Centro Superior de Investigaciones Científicas)-Universidad de Sevilla, Sevilla, Spain
- Arjan P M de Brouwer
- [1] Department of Human Genetics, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands. [2] Department of Cognitive Neurosciences, Donders Institute for Brain Cognition and Behaviour, Radboud University Medical Centre, Nijmegen, The Netherlands
- Gianpiero L Cavalleri
- Molecular and Cellular Therapeutics, The Royal College of Surgeons in Ireland, Dublin, Ireland
- Sherif F El-Khamisy
- [1] Kreb's Institute, Department of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK. [2] Center of Genomics, Helmy Institute, Zewail City of Science and Technology, Giza, Egypt
- Bert B A de Vries
- [1] Department of Human Genetics, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands. [2] Department of Cognitive Neurosciences, Donders Institute for Brain Cognition and Behaviour, Radboud University Medical Centre, Nijmegen, The Netherlands
- Keith W Caldecott
- Genome Damage and Stability Centre, School of Biological Sciences, University of Sussex, Sussex, UK
|
15
|
Torri F, Dinov ID, Zamanyan A, Hobel S, Genco A, Petrosyan P, Clark AP, Liu Z, Eggert P, Pierce J, Knowles JA, Ames J, Kesselman C, Toga AW, Potkin SG, Vawter MP, Macciardi F. Next generation sequence analysis and computational genomics using graphical pipeline workflows. Genes (Basel) 2014; 3:545-75. [PMID: 23139896 PMCID: PMC3490498 DOI: 10.3390/genes3030545] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 12/30/2022] Open
Abstract
Whole-genome and exome sequencing have already proven to be essential and powerful methods to identify genes responsible for simple Mendelian inherited disorders. These methods can be applied to complex disorders as well, and have been adopted as one of the current mainstream approaches in population genetics. These achievements have been made possible by next generation sequencing (NGS) technologies, which require substantial bioinformatics resources to analyze the dense and complex sequence data. The huge analytical burden of data from genome sequencing might be seen as a bottleneck slowing the publication of NGS papers at this time, especially in psychiatric genetics. We review the existing methods for processing NGS data, to place into context the rationale for the design of a computational resource. We describe our method, the Graphical Pipeline for Computational Genomics (GPCG), to perform the computational steps required to analyze NGS data. The GPCG implements flexible workflows for basic sequence alignment, sequence data quality control, single nucleotide polymorphism analysis, copy number variant identification, annotation, and visualization of results. These workflows cover all the analytical steps required for NGS data, from processing the raw reads to variant calling and annotation. The current version of the pipeline is freely available at http://pipeline.loni.ucla.edu. These applications of NGS analysis may gain clinical utility in the near future (e.g., identifying miRNA signatures in diseases) when the bioinformatics approach is made feasible. Taken together, the annotation tools and strategies that have been developed to retrieve information and test hypotheses about the functional role of variants present in the human genome will help to pinpoint the genetic risk factors for psychiatric disorders.
Affiliation(s)
- Federica Torri
- Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA; E-Mails: (F.T.); (S.G.P.)
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Ivo D. Dinov
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Alen Zamanyan
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Sam Hobel
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Alex Genco
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Petros Petrosyan
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Andrew P. Clark
- Zilkha Neurogenetic Institute, USC Keck School of Medicine, Los Angeles, CA 90033, USA; E-Mails: (A.P.C.); (J.A.K.)
- Zhizhong Liu
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Paul Eggert
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Department of Computer Science, University of California, Los Angeles, CA 90095, USA
- Jonathan Pierce
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- James A. Knowles
- Zilkha Neurogenetic Institute, USC Keck School of Medicine, Los Angeles, CA 90033, USA; E-Mails: (A.P.C.); (J.A.K.)
- Joseph Ames
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Carl Kesselman
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Arthur W. Toga
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Steven G. Potkin
- Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA; E-Mails: (F.T.); (S.G.P.)
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Marquis P. Vawter
- Functional Genomics Laboratory, Department of Psychiatry And Human Behavior, School of Medicine, University of California, Irvine, CA 92697, USA; E-Mail:
- Fabio Macciardi
- Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA; E-Mails: (F.T.); (S.G.P.)
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Author to whom correspondence should be addressed; E-Mail: ; Tel.: +1-949-824-4559; Fax: +1-949-824-2072
|
16
|
Gbadegesin RA, Hall G, Adeyemo A, Hanke N, Tossidou I, Burchette J, Wu G, Homstad A, Sparks MA, Gomez J, Jiang R, Alonso A, Lavin P, Conlon P, Korstanje R, Stander MC, Shamsan G, Barua M, Spurney R, Singhal PC, Kopp JB, Haller H, Howell D, Pollak MR, Shaw AS, Schiffer M, Winn MP. Mutations in the gene that encodes the F-actin binding protein anillin cause FSGS. J Am Soc Nephrol 2014; 25:1991-2002. [PMID: 24676636 DOI: 10.1681/asn.2013090976] [Citation(s) in RCA: 108] [Impact Index Per Article: 10.8] [Indexed: 01/12/2023] Open
Abstract
FSGS is characterized by segmental scarring of the glomerulus and is a leading cause of kidney failure. Identification of genes causing FSGS has improved our understanding of disease mechanisms and points to defects in the glomerular epithelial cell, the podocyte, as a major factor in disease pathogenesis. Using a combination of genome-wide linkage studies and whole-exome sequencing in a kindred with familial FSGS, we identified a missense mutation R431C in anillin (ANLN), an F-actin binding cell cycle gene, as a cause of FSGS. We screened 250 additional families with FSGS and found another variant, G618C, that segregates with disease in a second family with FSGS. We demonstrate upregulation of anillin in podocytes in kidney biopsy specimens from individuals with FSGS and kidney samples from a murine model of HIV-1-associated nephropathy. Overexpression of R431C mutant ANLN in immortalized human podocytes results in enhanced podocyte motility. The mutant anillin displays reduced binding to the slit diaphragm-associated scaffold protein CD2AP. Knockdown of the ANLN gene in zebrafish morphants caused a loss of glomerular filtration barrier integrity, podocyte foot process effacement, and an edematous phenotype. Collectively, these findings suggest that anillin is important in maintaining the integrity of the podocyte actin cytoskeleton.
Affiliation(s)
- Rasheed A Gbadegesin
- Departments of Pediatrics, Center for Human Genetics, Duke University Medical Center, Durham, North Carolina;
- Gentzon Hall
- Center for Human Genetics, Duke University Medical Center, Durham, North Carolina; Medicine, and
- Adebowale Adeyemo
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Nils Hanke
- Department of Nephrology, Hannover Medical School, Hannover, Germany; Mount Desert Island Biological Laboratory, Salisbury Cove, Maine
- Irini Tossidou
- Department of Nephrology, Hannover Medical School, Hannover, Germany
- Guanghong Wu
- Center for Human Genetics, Duke University Medical Center, Durham, North Carolina; Medicine, and
- Alison Homstad
- Departments of Pediatrics, Center for Human Genetics, Duke University Medical Center, Durham, North Carolina
- Ruiji Jiang
- Departments of Pediatrics, Center for Human Genetics, Duke University Medical Center, Durham, North Carolina
- Andrea Alonso
- Departments of Pediatrics, Center for Human Genetics, Duke University Medical Center, Durham, North Carolina
- Peter Lavin
- Center for Human Genetics, Duke University Medical Center, Durham, North Carolina; Medicine, and Trinity Health Kidney Centre, Tallaght Hospital, Trinity College, Dublin, Ireland
- Peter Conlon
- Department of Nephrology, Beaumont Hospital, Dublin, Ireland
- Ron Korstanje
- Mount Desert Island Biological Laboratory, Salisbury Cove, Maine; The Jackson Laboratory, Bar Harbor, Maine
- M Christine Stander
- Howard Hughes Medical Institute, Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
- Ghaidan Shamsan
- Howard Hughes Medical Institute, Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
- Moumita Barua
- Division of Nephrology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Pravin C Singhal
- Feinstein Institute for Medical Research, North Shore-LIJ Health System, Manhasset, New York; and
- Jeffrey B Kopp
- Kidney Disease Section, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland
- Hermann Haller
- Department of Nephrology, Hannover Medical School, Hannover, Germany; Mount Desert Island Biological Laboratory, Salisbury Cove, Maine
- Martin R Pollak
- Division of Nephrology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Andrey S Shaw
- Howard Hughes Medical Institute, Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
- Mario Schiffer
- Department of Nephrology, Hannover Medical School, Hannover, Germany; Mount Desert Island Biological Laboratory, Salisbury Cove, Maine
- Michelle P Winn
- Center for Human Genetics, Duke University Medical Center, Durham, North Carolina; Medicine, and
|
17
|
Lee IH, Lee K, Hsing M, Choe Y, Park JH, Kim SH, Bohn JM, Neu MB, Hwang KB, Green RC, Kohane IS, Kong SW. Prioritizing disease-linked variants, genes, and pathways with an interactive whole-genome analysis pipeline. Hum Mutat 2014; 35:537-47. [PMID: 24478219 DOI: 10.1002/humu.22520] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 10/04/2013] [Accepted: 01/23/2014] [Indexed: 01/02/2023]
Abstract
Whole-genome sequencing (WGS) studies are uncovering disease-associated variants in both rare and nonrare diseases. Using next-generation sequencing for WGS requires a series of computational methods for alignment, variant detection, and annotation, and the accuracy and reproducibility of annotation results are essential for clinical implementation. However, annotating WGS with up-to-date genomic information is still challenging for biomedical researchers. Here, we present one of the fastest and most highly scalable annotation, filtering, and analysis pipelines, gNOME, to prioritize phenotype-associated variants while minimizing false-positive findings. The intuitive graphical user interface of gNOME facilitates the selection of phenotype-associated variants, and the result summaries are provided at variant, gene, and genome levels. Moreover, the enrichment results of specific variants, genes, and gene sets between two groups or compared with population-scale WGS datasets that are already integrated in the pipeline can aid interpretation. We found a small number of discordant results between annotation software tools, in part due to different reporting strategies for variants with complex impacts. Using two published whole-exome datasets of uveal melanoma and bladder cancer, we demonstrated gNOME's accuracy of variant annotation and the enrichment of loss-of-function variants in known cancer pathways. The gNOME Web server and source code are freely available to the academic community (http://gnome.tchlab.org).
Affiliation(s)
- In-Hee Lee
- Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, Department of Medicine, Boston Children's Hospital, Boston, Massachusetts, 02115
|
18
|
Ruzzo EK, Capo-Chichi JM, Ben-Zeev B, Chitayat D, Mao H, Pappas AL, Hitomi Y, Lu YF, Yao X, Hamdan FF, Pelak K, Reznik-Wolf H, Bar-Joseph I, Oz-Levi D, Lev D, Lerman-Sagie T, Leshinsky-Silver E, Anikster Y, Ben-Asher E, Olender T, Colleaux L, Décarie JC, Blaser S, Banwell B, Joshi RB, He XP, Patry L, Silver RJ, Dobrzeniecka S, Islam MS, Hasnat A, Samuels ME, Aryal DK, Rodriguiz RM, Jiang YH, Wetsel WC, McNamara JO, Rouleau GA, Silver DL, Lancet D, Pras E, Mitchell GA, Michaud JL, Goldstein DB. Deficiency of asparagine synthetase causes congenital microcephaly and a progressive form of encephalopathy. Neuron 2014; 80:429-41. [PMID: 24139043 DOI: 10.1016/j.neuron.2013.08.013] [Citation(s) in RCA: 110] [Impact Index Per Article: 11.0] [Accepted: 08/15/2013] [Indexed: 12/30/2022]
Abstract
We analyzed four families that presented with a similar condition characterized by congenital microcephaly, intellectual disability, progressive cerebral atrophy, and intractable seizures. We show that recessive mutations in the ASNS gene are responsible for this syndrome. Two of the identified missense mutations dramatically reduce ASNS protein abundance, suggesting that the mutations cause loss of function. Hypomorphic Asns mutant mice have structural brain abnormalities, including enlarged ventricles and reduced cortical thickness, and show deficits in learning and memory mimicking aspects of the patient phenotype. ASNS encodes asparagine synthetase, which catalyzes the synthesis of asparagine from glutamine and aspartate. The neurological impairment resulting from ASNS deficiency may be explained by asparagine depletion in the brain or by accumulation of aspartate/glutamate leading to enhanced excitability and neuronal damage. Our study thus indicates that asparagine synthesis is essential for the development and function of the brain but not for that of other organs.
Affiliation(s)
- Elizabeth K Ruzzo
- Center for Human Genome Variation, Duke University School of Medicine, Durham, NC 27708, USA
|
19
|
Roy S, Durso MB, Wald A, Nikiforov YE, Nikiforova MN. SeqReporter: automating next-generation sequencing result interpretation and reporting workflow in a clinical laboratory. J Mol Diagn 2013; 16:11-22. [PMID: 24220144 DOI: 10.1016/j.jmoldx.2013.08.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 06/06/2013] [Revised: 08/20/2013] [Accepted: 08/27/2013] [Indexed: 01/15/2023] Open
Abstract
A wide repertoire of bioinformatics applications exists for next-generation sequencing data analysis; however, certain requirements of the clinical molecular laboratory limit their use: i) comprehensive report generation, ii) compatibility with existing laboratory information systems and computer operating system, iii) knowledgebase development, iv) quality management, and v) data security. SeqReporter is a web-based application developed using ASP.NET framework version 4.0. The client side was designed using HTML5, CSS3, and JavaScript. The server-side processing (VB.NET) relied on interaction with a customized SQL Server 2008 R2 database. Overall, 104 cases (1062 variant calls) were analyzed by SeqReporter. Each variant call was classified into one of five report levels: i) known clinical significance, ii) uncertain clinical significance, iii) pending pathologists' review, iv) synonymous and deep intronic, and v) platform and panel-specific sequence errors. SeqReporter correctly annotated and classified 99.9% (859 of 860) of sequence variants, including 68.7% synonymous single-nucleotide variants, 28.3% nonsynonymous single-nucleotide variants, 1.7% insertions, and 1.3% deletions. One variant of potential clinical significance was re-classified after pathologist review. Laboratory information system-compatible clinical reports were generated automatically. SeqReporter also facilitated quality management activities. SeqReporter is an example of a customized and well-designed informatics solution to optimize and automate the downstream analysis of clinical next-generation sequencing data. We propose it as a model that may envisage the development of a comprehensive clinical informatics solution.
Affiliation(s)
- Somak Roy
- Division of Molecular and Genomic Pathology, Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania.
- Mary Beth Durso
- Division of Molecular and Genomic Pathology, Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Abigail Wald
- Division of Molecular and Genomic Pathology, Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Yuri E Nikiforov
- Division of Molecular and Genomic Pathology, Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Marina N Nikiforova
- Division of Molecular and Genomic Pathology, Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania.
|
20
|
Robinson PN, Köhler S, Oellrich A, Wang K, Mungall CJ, Lewis SE, Washington N, Bauer S, Seelow D, Krawitz P, Gilissen C, Haendel M, Smedley D. Improved exome prioritization of disease genes through cross-species phenotype comparison. Genome Res 2013; 24:340-8. [PMID: 24162188 PMCID: PMC3912424 DOI: 10.1101/gr.160325.113] [Citation(s) in RCA: 245] [Impact Index Per Article: 22.3] [Indexed: 12/29/2022]
Abstract
Numerous new disease-gene associations have been identified by whole-exome sequencing studies in the last few years. However, many cases remain unsolved due to the sheer number of candidate variants remaining after common filtering strategies such as removing low quality and common variants and those deemed unlikely to be pathogenic. The observation that each of our genomes contains about 100 genuine loss-of-function variants makes identification of the causative mutation problematic when using these strategies alone. We propose using the wealth of genotype to phenotype data that already exists from model organism studies to assess the potential impact of these exome variants. Here, we introduce PHenotypic Interpretation of Variants in Exomes (PHIVE), an algorithm that integrates the calculation of phenotype similarity between human diseases and genetically modified mouse models with evaluation of the variants according to allele frequency, pathogenicity, and mode of inheritance approaches in our Exomiser tool. Large-scale validation of PHIVE analysis using 100,000 exomes containing known mutations demonstrated a substantial improvement (up to 54.1-fold) over purely variant-based (frequency and pathogenicity) methods with the correct gene recalled as the top hit in up to 83% of samples, corresponding to an area under the ROC curve of >95%. We conclude that incorporation of phenotype data can play a vital role in translational bioinformatics and propose that exome sequencing projects should systematically capture clinical phenotypes to take advantage of the strategy presented here.
Affiliation(s)
- Peter N Robinson
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
|
21
|
Wang T, Liu J, Shen L, Tonti-Filippini J, Zhu Y, Jia H, Lister R, Whitaker JW, Ecker JR, Millar AH, Ren B, Wang W. STAR: an integrated solution to management and visualization of sequencing data. Bioinformatics 2013; 29:3204-10. [PMID: 24078702 DOI: 10.1093/bioinformatics/btt558] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 12/21/2022] Open
Abstract
MOTIVATION: Easy visualization of complex data features is a necessary step in conducting studies on next-generation sequencing (NGS) data. We developed STAR, an integrated web application that enables online management, visualization and track-based analysis of NGS data. RESULTS: STAR is a multilayer web service system. On the client side, STAR leverages JavaScript, HTML5 Canvas and asynchronous communications to deliver a smoothly scrolling desktop-like graphical user interface with a suite of in-browser analysis tools that range from providing simple track configuration controls to sophisticated feature detection within datasets. On the server side, STAR supports private session state retention via an account management system and provides data management modules that enable collection, visualization and analysis of third-party sequencing data from the public domain, with thousands of tracks hosted to date. Overall, STAR represents a next-generation data exploration solution to match the requirements of NGS data, enabling both intuitive visualization and dynamic analysis of data. AVAILABILITY AND IMPLEMENTATION: The STAR browser system is freely available on the web at http://wanglab.ucsd.edu/star/browser and https://github.com/angell1117/STAR-genome-browser.
Affiliation(s)
- Tao Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093, USA, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA, The ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley, Western Australia 6009, Australia, Key Laboratory for Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China, Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA, Department of Cellular and Molecular Medicine, University of California, San Diego, CA 92093, USA and Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA
|
22
|
Hitomi Y, Heinzen EL, Donatello S, Dahl HH, Damiano JA, McMahon JM, Berkovic SF, Scheffer IE, Legros B, Rai M, Weckhuysen S, Suls A, De Jonghe P, Pandolfo M, Goldstein DB, Van Bogaert P, Depondt C. Mutations in TNK2 in severe autosomal recessive infantile onset epilepsy. Ann Neurol 2013; 74:496-501. [PMID: 23686771 DOI: 10.1002/ana.23934] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 10/26/2012] [Revised: 04/30/2013] [Accepted: 05/01/2013] [Indexed: 01/27/2023]
Abstract
We identified a small family with autosomal recessive, infantile onset epilepsy and intellectual disability. Exome sequencing identified a homozygous missense variant in the gene TNK2, encoding a brain-expressed tyrosine kinase. Sequencing of the coding region of TNK2 in 110 patients with a similar phenotype failed to detect further homozygote or compound heterozygote mutations. Pathogenicity of the variant is supported by the results of our functional studies, which demonstrated that the variant abolishes NEDD4 binding to TNK2, preventing its degradation after epidermal growth factor stimulation. Definitive proof of pathogenicity will require confirmation in unrelated patients.
Affiliation(s)
- Yuki Hitomi
- Duke Center for Human Genome Variation, Duke University School of Medicine, Durham, NC
|
23
|
Bromberg Y. Building a genome analysis pipeline to predict disease risk and prevent disease. J Mol Biol 2013; 425:3993-4005. [PMID: 23928561 DOI: 10.1016/j.jmb.2013.07.038] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 06/03/2013] [Revised: 07/26/2013] [Accepted: 07/28/2013] [Indexed: 12/24/2022]
Abstract
Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, the pipelines taking advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice.
Affiliation(s)
- Y Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08873, USA.
|
24
|
Pavlopoulos GA, Oulas A, Iacucci E, Sifrim A, Moreau Y, Schneider R, Aerts J, Iliopoulos I. Unraveling genomic variation from next generation sequencing data. BioData Min 2013; 6:13. [PMID: 23885890 PMCID: PMC3726446 DOI: 10.1186/1756-0381-6-13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 03/22/2013] [Accepted: 07/18/2013] [Indexed: 12/29/2022] Open
Abstract
Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of these areas, which are being shaped by the new technological advances, are evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to ease knowledge extraction are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment to serve these analyses and we describe the data formats of the files which get produced by them. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses and we discuss how future applications could further develop in this field.
Collapse
Affiliation(s)
- Georgios A Pavlopoulos
- Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece.
|
25
|
Goldstein DB, Allen A, Keebler J, Margulies EH, Petrou S, Petrovski S, Sunyaev S. Sequencing studies in human genetics: design and interpretation. Nat Rev Genet 2013; 14:460-70. [PMID: 23752795 DOI: 10.1038/nrg3455] [Citation(s) in RCA: 185] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
Next-generation sequencing is becoming the primary discovery tool in human genetics. There have been many clear successes in identifying genes that are responsible for Mendelian diseases, and sequencing approaches are now poised to identify the mutations that cause undiagnosed childhood genetic diseases and those that predispose individuals to more common complex diseases. There are, however, growing concerns that the complexity and magnitude of complete sequence data could lead to an explosion of weakly justified claims of association between genetic variants and disease. Here, we provide an overview of the basic workflow in next-generation sequencing studies and emphasize, where possible, measures and considerations that facilitate accurate inferences from human sequencing studies.
Affiliation(s)
- David B Goldstein
- Center for Human Genome Variation, Duke University School of Medicine, 308 Research Drive, Box 91009, LSRC B Wing, Room 330, Durham, North Carolina 27708, USA.
|
26
|
Gbadegesin RA, Brophy PD, Adeyemo A, Hall G, Gupta IR, Hains D, Bartkowiak B, Rabinovich CE, Chandrasekharappa S, Homstad A, Westreich K, Wu G, Liu Y, Holanda D, Clarke J, Lavin P, Selim A, Miller S, Wiener JS, Ross SS, Foreman J, Rotimi C, Winn MP. TNXB mutations can cause vesicoureteral reflux. J Am Soc Nephrol 2013; 24:1313-22. [PMID: 23620400 DOI: 10.1681/asn.2012121148] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open
Abstract
Primary vesicoureteral reflux (VUR) is the most common congenital anomaly of the kidney and the urinary tract, and it is a major risk factor for pyelonephritic scarring and CKD in children. Although twin studies support the heritability of VUR, specific genetic causes remain elusive. We performed a sequential genome-wide linkage study and whole-exome sequencing in a family with hereditary VUR. We obtained a significant multipoint parametric logarithm of odds score of 3.3 on chromosome 6p, and whole-exome sequencing identified a deleterious heterozygous mutation (T3257I) in the gene encoding tenascin XB (TNXB in 6p21.3). This mutation segregated with disease in the affected family as well as with a pathogenic G1331R change in another family. Fibroblast cell lines carrying the T3257I mutation exhibited a reduction in both cell motility and phosphorylated focal adhesion kinase expression, suggesting a defect in the focal adhesions that link the cell cytoplasm to the extracellular matrix. Immunohistochemical studies revealed that the human uroepithelial lining of the ureterovesical junction expresses TNXB, suggesting that TNXB may be important for generating tensile forces that close the ureterovesical junction during voiding. Taken together, these results suggest that mutations in TNXB can cause hereditary VUR.
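For readers less familiar with linkage statistics, the LOD score quoted in this abstract can be unpacked with the standard definition (general background, not a formula taken from the paper). Here $L_{\text{linked}}$ and $L_{\text{unlinked}}$ are illustrative symbols for the likelihood of the pedigree data with and without linkage at the tested position:

```latex
\mathrm{LOD} = \log_{10}\frac{L_{\text{linked}}}{L_{\text{unlinked}}},
\qquad
\mathrm{LOD} = 3.3 \;\Rightarrow\; \frac{L_{\text{linked}}}{L_{\text{unlinked}}} = 10^{3.3} \approx 2.0 \times 10^{3}
```

In other words, a score of 3.3 means the observed data are roughly 2,000 times more likely under linkage than under no linkage.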
|
27
|
Gonzalez MA, Lebrigio RFA, Van Booven D, Ulloa RH, Powell E, Speziani F, Tekin M, Schüle R, Züchner S. GEnomes Management Application (GEM.app): a new software tool for large-scale collaborative genome analysis. Hum Mutat 2013; 34:842-6. [PMID: 23463597 DOI: 10.1002/humu.22305] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2013] [Accepted: 02/17/2013] [Indexed: 12/13/2022]
Abstract
Novel genes are now identified at a rapid pace for many Mendelian disorders, and increasingly, for genetically complex phenotypes. However, new challenges have also become evident: (1) effectively managing larger exome and/or genome datasets, especially for smaller labs; (2) direct hands-on analysis and contextual interpretation of variant data in large genomic datasets; and (3) many small and medium-sized clinical and research-based investigative teams around the world are generating data that, if combined and shared, will significantly increase the opportunities for the entire community to identify new genes. To address these challenges, we have developed GEnomes Management Application (GEM.app), a software tool to annotate, manage, visualize, and analyze large genomic datasets (https://genomics.med.miami.edu/). GEM.app currently contains ∼1,600 whole exomes from 50 different phenotypes studied by 40 principal investigators from 15 different countries. The focus of GEM.app is on user-friendly analysis for nonbioinformaticians to make next-generation sequencing data directly accessible. Yet, GEM.app provides powerful and flexible filter options, including single family filtering, across family/phenotype queries, nested filtering, and evaluation of segregation in families. In addition, the system is fast, obtaining results within 4 sec across ∼1,200 exomes. We believe that this system will further enhance identification of genetic causes of human disease.
Affiliation(s)
- Michael A Gonzalez
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, USA
|
28
|
Pabinger S, Dander A, Fischer M, Snajder R, Sperk M, Efremova M, Krabichler B, Speicher MR, Zschocke J, Trajanoski Z. A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform 2013; 15:256-78. [PMID: 23341494 PMCID: PMC3956068 DOI: 10.1093/bib/bbs086] [Citation(s) in RCA: 335] [Impact Index Per Article: 30.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023] Open
Abstract
Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and has led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers.
Affiliation(s)
- Stephan Pabinger
- Division for Bioinformatics, Innsbruck Medical University, Innrain 80, 6020 Innsbruck, Austria. Tel.: +43-512-9003-71401; Fax: +43-512-9003-73100
|
29
|
Li MX, Kwan JSH, Bao SY, Yang W, Ho SL, Song YQ, Sham PC. Predicting mendelian disease-causing non-synonymous single nucleotide variants in exome sequencing studies. PLoS Genet 2013; 9:e1003143. [PMID: 23341771 PMCID: PMC3547823 DOI: 10.1371/journal.pgen.1003143] [Citation(s) in RCA: 116] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2012] [Accepted: 10/20/2012] [Indexed: 12/19/2022] Open
Abstract
Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). A minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase prediction accuracy. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) to distinguish pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some of the original prediction methods outperforms the other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that around 5% of an individual's rare nsSNVs are pathogenic and that an individual carries at least ~22 pathogenic derived alleles, which, if made homozygous by consanguineous marriages, may lead to recessive diseases.
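As a rough illustration of the logit combination this abstract describes, the sketch below merges several per-variant prediction scores into a single probability with a logistic function. The intercept, weights and score scaling are invented for illustration; they are assumptions, not the coefficients fitted by the authors, and each score is assumed to be rescaled so that higher means more damaging.

```python
import math

# Hypothetical coefficients: the fitted values are not given in the abstract.
INTERCEPT = -3.0
WEIGHTS = {"sift": 1.5, "polyphen2": 2.0, "condel": 1.0}


def combined_probability(scores):
    """Combine per-method scores (assumed rescaled to 0..1, higher = more
    damaging) into one probability with a logistic (logit) model."""
    z = INTERCEPT + sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


# Two made-up variants: one scored as benign-like, one as damaging-like.
benign_like = combined_probability({"sift": 0.1, "polyphen2": 0.05, "condel": 0.2})
damaging_like = combined_probability({"sift": 0.9, "polyphen2": 0.95, "condel": 0.8})
print(round(benign_like, 3), round(damaging_like, 3))
```

In practice the coefficients would be estimated from labelled pathogenic and neutral variants rather than set by hand.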
Affiliation(s)
- Miao-Xin Li
  - Department of Psychiatry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - Centre for Reproduction, Development and Growth, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - Centre for Genomic Sciences, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - * E-mail: (M-XL); (PCS)
- Johnny S. H. Kwan
  - Department of Psychiatry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - Department of Medicine, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
- Su-Ying Bao
  - Department of Biochemistry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
- Wanling Yang
  - Department of Paediatrics and Adolescent Medicine, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
- Shu-Leong Ho
  - Department of Medicine, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
- Yong-Qiang Song
  - Department of Biochemistry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
- Pak C. Sham
  - Department of Psychiatry, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - Centre for Reproduction, Development and Growth, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - Centre for Genomic Sciences, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - State Key Laboratory for Cognitive and Brain Sciences, University of Hong Kong, Pokfulam, Hong Kong, Special Administrative Region, People's Republic of China
  - * E-mail: (M-XL); (PCS)
|
30
|
Oz-Levi D, Ben-Zeev B, Ruzzo EK, Hitomi Y, Gelman A, Pelak K, Anikster Y, Reznik-Wolf H, Bar-Joseph I, Olender T, Alkelai A, Weiss M, Ben-Asher E, Ge D, Shianna KV, Elazar Z, Goldstein DB, Pras E, Lancet D. Mutation in TECPR2 reveals a role for autophagy in hereditary spastic paraparesis. Am J Hum Genet 2012. [PMID: 23176824 DOI: 10.1016/j.ajhg.2012.09.015] [Citation(s) in RCA: 117] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open
Abstract
We studied five individuals from three Jewish Bukharian families affected by an apparently autosomal-recessive form of hereditary spastic paraparesis accompanied by severe intellectual disability, fluctuating central hypoventilation, gastroesophageal reflux disease, wake apnea, areflexia, and unique dysmorphic features. Exome sequencing identified one homozygous variant shared among all affected individuals and absent in controls: a 1 bp frameshift TECPR2 deletion leading to a premature stop codon and predicting significant degradation of the protein. TECPR2 has been reported as a positive regulator of autophagy. We thus examined the autophagy-related fate of two key autophagic proteins, SQSTM1 (p62) and MAP1LC3B (LC3), in skin fibroblasts of an affected individual, as compared to a healthy control, and found that both protein levels were decreased and that there was a more pronounced decrease in the lipidated form of LC3 (LC3II). siRNA knockdown of TECPR2 showed similar changes, consistent with aberrant autophagy. Our results are strengthened by the fact that autophagy dysfunction has been implicated in a number of other neurodegenerative diseases. The discovered TECPR2 mutation implicates autophagy, a central intracellular mechanism, in spastic paraparesis.
Affiliation(s)
- Danit Oz-Levi
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel.
|
31
|
Hu Q, Wang D, Yan L, Zhao H, Liu S. VPA: an R tool for analyzing sequencing variants with user-specified frequency pattern. BMC Res Notes 2012; 5:31. [PMID: 22243673 PMCID: PMC3293055 DOI: 10.1186/1756-0500-5-31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2011] [Accepted: 01/14/2012] [Indexed: 11/26/2022] Open
Abstract
Background The massive amounts of genetic variants generated by next generation sequencing systems demand the development of effective computational tools for variant prioritization. Findings VPA (Variant Pattern Analyzer) is an R tool for prioritizing variants with a specified frequency pattern from multiple study subjects in next-generation sequencing studies. The tool starts from individual files of variant and sequence calls and extracts variants with a user-specified frequency pattern across the study subjects of interest. Several position-level quality criteria can be incorporated into the variant extraction. It can be used in studies with a matched-pair design as well as studies with multiple groups of subjects. Conclusions VPA can be used as an automatic pipeline to prioritize variants for further functional exploration and hypothesis generation. The package is implemented in the R language and is freely available from http://vpa.r-forge.r-project.org.
|
32
|
González-Pérez P, Cirulli ET, Drory VE, Dabby R, Nisipeanu P, Carasso RL, Sadeh M, Fox A, Festoff BW, Sapp PC, McKenna-Yasek D, Goldstein DB, Brown RH, Blumen SC. Novel mutation in VCP gene causes atypical amyotrophic lateral sclerosis. Neurology 2012; 79:2201-8. [PMID: 23152587 DOI: 10.1212/wnl.0b013e318275963b] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023] Open
Abstract
OBJECTIVE To identify the genetic variant that causes autosomal dominantly inherited motor neuron disease in a 4-generation Israeli-Arab family using genetic linkage and whole exome sequencing. METHODS Genetic linkage analysis was performed in this family using Illumina single nucleotide polymorphism chips. Whole exome sequencing was then undertaken on DNA samples from 2 affected family members using an Illumina 2000 HiSeq platform in pursuit of potentially pathogenic genetic variants that comigrate with the disease in this pedigree. Variants meeting these criteria were then screened in all affected individuals. RESULTS A novel mutation (p.R191G) in the valosin-containing protein (VCP) gene was identified in the index family. Direct sequencing of the VCP gene in a panel of DNA from 274 unrelated individuals with familial amyotrophic lateral sclerosis (FALS) revealed 5 additional mutations. Among them, 2 were previously identified in pedigrees with a constellation of inclusion body myopathy with Paget disease of the bone and frontotemporal dementia (IBMPFD) and in FALS, and 2 other mutations (p.R159C and p.R155C) in IBMPFD alone. We did not detect VCP gene mutations in DNA from 178 cases of sporadic amyotrophic lateral sclerosis. CONCLUSIONS We report a novel VCP mutation identified in an amyotrophic lateral sclerosis family (p.R191G) with atypical clinical features. In our experience, VCP mutations arise in approximately 1.5% of FALS cases. Our study supports the view that motor neuron disease is part of the clinical spectrum of VCP-associated disease.
Affiliation(s)
- Paloma González-Pérez
- Department of Neurology, University of Massachusetts Medical School, Worcester, MA, USA
|
33
|
Love C, Sun Z, Jima D, Li G, Zhang J, Miles R, Richards KL, Dunphy CH, Choi WWL, Srivastava G, Lugar PL, Rizzieri DA, Lagoo AS, Bernal-Mizrachi L, Mann KP, Flowers CR, Naresh KN, Evens AM, Chadburn A, Gordon LI, Czader MB, Gill JI, Hsi ED, Greenough A, Moffitt AB, McKinney M, Banerjee A, Grubor V, Levy S, Dunson DB, Dave SS. The genetic landscape of mutations in Burkitt lymphoma. Nat Genet 2012; 44:1321-5. [PMID: 23143597 DOI: 10.1038/ng.2468] [Citation(s) in RCA: 440] [Impact Index Per Article: 36.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2012] [Accepted: 10/17/2012] [Indexed: 12/13/2022]
Abstract
Burkitt lymphoma is characterized by deregulation of MYC, but the contribution of other genetic mutations to the disease is largely unknown. Here, we describe the first completely sequenced genome from a Burkitt lymphoma tumor and germline DNA from the same affected individual. We further sequenced the exomes of 59 Burkitt lymphoma tumors and compared them to sequenced exomes from 94 diffuse large B-cell lymphoma (DLBCL) tumors. We identified 70 genes that were recurrently mutated in Burkitt lymphomas, including ID3, GNA13, RET, PIK3R1 and the SWI/SNF genes ARID1A and SMARCA4. Our data implicate a number of genes in cancer for the first time, including CCT6B, SALL3, FTCD and PC. ID3 mutations occurred in 34% of Burkitt lymphomas and not in DLBCLs. We show experimentally that ID3 mutations promote cell cycle progression and proliferation. Our work thus elucidates commonly occurring gene-coding mutations in Burkitt lymphoma and implicates ID3 as a new tumor suppressor gene.
Affiliation(s)
- Cassandra Love
- Duke Institute for Genome Sciences and Policy, Duke University, Durham, NC, USA
|
34
|
Vergara IA, Frech C, Chen N. CooVar: co-occurring variant analyzer. BMC Res Notes 2012; 5:615. [PMID: 23116482 PMCID: PMC3532326 DOI: 10.1186/1756-0500-5-615] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2012] [Accepted: 10/26/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Evaluating the impact of genomic variations (GV) on protein-coding transcripts is an important step in identifying variants of functional significance. Currently available programs for variant annotation depend on external databases or annotate multiple variants affecting the same transcript independently, which limits program use to organisms available in these databases or results in potentially incorrect or incomplete annotations. FINDINGS We have developed CooVar (Co-occurring Variant Analyzer), a database-independent program for assessing the impact of GVs on protein-coding transcripts. CooVar takes GVs, reference genome sequence, and protein-coding exons as input and provides annotated GVs and transcripts as output. Unlike similar programs, CooVar considers the combined impact of all GVs affecting the same transcript, generating biologically more accurate annotations. CooVar is operated from the command line and supports the standard file formats VCF, GFF/GTF, and GVF, which makes it easy to integrate into existing computational pipelines. We have extensively tested CooVar on worm and human data sets and demonstrate that it generates correct annotations in only a short amount of time. CONCLUSIONS CooVar is an easy-to-use and lightweight variant annotation tool that considers the combined impact of GVs on protein-coding transcripts. CooVar is freely available at http://genome.sfu.ca/projects/coovar/.
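To make the point about independent versus combined annotation concrete, here is a toy sketch in which two single-nucleotide variants hit the same codon. It is not CooVar's code or interface; the function name and the reduced codon table are assumptions made for this example, and only the four codons needed here are listed.

```python
# Subset of the standard genetic code, covering only the codons used below.
CODON_TABLE = {"AGA": "R", "TGA": "*", "ACA": "T", "TCA": "S"}


def apply_variants(codon, variants):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in variants:
        bases[offset] = alt
    return "".join(bases)


ref_codon = "AGA"                # arginine in the reference transcript
variants = [(0, "T"), (1, "C")]  # two SNVs falling in the same codon

# Annotating each variant on its own against the reference codon.
independent = [CODON_TABLE[apply_variants(ref_codon, [v])] for v in variants]
# Annotating both variants together, as they co-occur on the transcript.
combined = CODON_TABLE[apply_variants(ref_codon, variants)]

print(independent, combined)
```

Taken separately, the first variant looks like a premature stop (`*`) and the second like an arginine-to-threonine change, whereas the two together encode serine, so an independent annotation would report a stop codon that the transcript does not actually contain.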
Affiliation(s)
- Ismael A Vergara
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, B.C., V5A 1S6, Canada
|
35
|
Sifrim A, Van Houdt JKJ, Tranchevent LC, Nowakowska B, Sakai R, Pavlopoulos GA, Devriendt K, Vermeesch JR, Moreau Y, Aerts J. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease. Genome Med 2012; 4:73. [PMID: 23013645 PMCID: PMC3580443 DOI: 10.1186/gm374] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2012] [Revised: 09/14/2012] [Accepted: 09/26/2012] [Indexed: 12/18/2022] Open
Abstract
The increasing size and complexity of exome/genome sequencing data require new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and failure to consider existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.
Affiliation(s)
- Alejandro Sifrim
  - KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
  - IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Jeroen KJ Van Houdt
  - KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Leon-Charles Tranchevent
  - KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
  - IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Beata Nowakowska
  - KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Ryo Sakai
  - KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
  - IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Georgios A Pavlopoulos
  - KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
  - IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Koen Devriendt
  - KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Joris R Vermeesch
  - KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Yves Moreau
  - KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
  - IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Jan Aerts
  - KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
  - IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
|
36
|
Coutant S, Cabot C, Lefebvre A, Léonard M, Prieur-Gaston E, Campion D, Lecroq T, Dauchel H. EVA: Exome Variation Analyzer, an efficient and versatile tool for filtering strategies in medical genomics. BMC Bioinformatics 2012; 13 Suppl 14:S9. [PMID: 23095660 PMCID: PMC3439720 DOI: 10.1186/1471-2105-13-s14-s9] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023] Open
Abstract
Background Whole exome sequencing (WES) has become the strategy of choice to identify a coding allelic variant for a rare human monogenic disorder. This approach is a revolution in the history of medical genetics, impacting both fundamental research and diagnostic methods leading to personalized medicine. A plethora of efficient algorithms has been developed for variant discovery. They generally yield ~20,000 variations that have to be narrowed down to find the potential pathogenic allelic variant(s) and the affected gene(s). For this purpose, commonly adopted procedures involving various filtering strategies have emerged: exclusion of common variations, type of allelic variant, pathogenicity effect prediction, mode of inheritance and comparison of exomes from multiple individuals. To support the expansion of WES in individual medical genomics laboratories, new user-friendly and versatile software tools have to implement these filtering steps. Non-programmer biologists need to be able to combine different filtering criteria by themselves and conduct a personal strategy depending on their assumptions and study design. Results We describe EVA (Exome Variation Analyzer), user-friendly web-interfaced software dedicated to filtering strategies for medical WES. Through its different modules, EVA (i) integrates and stores annotated exome variation data as strictly confidential to the project owner, (ii) allows the main filters dealing with common variations, molecular types, inheritance mode and multiple samples to be combined, (iii) offers browsing of annotated data and filtered results in various interactive tables, graphical visualizations and statistical charts, and (iv) offers export files and cross-links to useful external databases and software for further prioritization of the small subset of sorted candidate variations and genes. We report a demonstrative case study that identified a new candidate gene related to a rare form of Alzheimer disease. Conclusions EVA was developed to be user-friendly, versatile and efficient filtering-assistance software for WES. It constitutes a platform for data storage and for drastic screening of clinically relevant genetic variations by non-programmer geneticists. It thereby responds to new needs in the expanding era of medical genomics investigated by WES for both fundamental research and clinical diagnostics.
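The combination of filters listed in this abstract (common variations, molecular type, inheritance mode, multiple samples) can be sketched as a simple sequential filter. This is only an illustration under stated assumptions: the `Variant` fields, the 1% frequency cutoff, the sample IDs and the "shared by all affected, absent from unaffected" rule are hypothetical simplifications, not EVA's data model or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    population_freq: float  # allele frequency in a reference population
    effect: str             # molecular type, e.g. "missense"
    carriers: frozenset     # IDs of sequenced samples carrying the variant


def filter_candidates(variants, affected, unaffected, max_freq=0.01,
                      effects=frozenset({"missense", "nonsense", "frameshift"})):
    """Apply four filters in sequence and return the surviving variants."""
    kept = []
    for v in variants:
        if v.population_freq > max_freq:   # exclude common variations
            continue
        if v.effect not in effects:        # keep selected molecular types only
            continue
        if not affected <= v.carriers:     # must be shared by all affected samples
            continue
        if v.carriers & unaffected:        # must be absent from unaffected samples
            continue
        kept.append(v)
    return kept


affected = frozenset({"P1", "P2"})
unaffected = frozenset({"C1"})
variants = [
    Variant("GENE_A", 0.20, "missense", frozenset({"P1", "P2"})),         # too common
    Variant("GENE_B", 0.001, "synonymous", frozenset({"P1", "P2"})),      # wrong type
    Variant("GENE_C", 0.001, "nonsense", frozenset({"P1"})),              # not shared
    Variant("GENE_D", 0.002, "missense", frozenset({"P1", "P2", "C1"})),  # in control
    Variant("GENE_E", 0.0005, "frameshift", frozenset({"P1", "P2"})),     # passes
]
candidates = [v.gene for v in filter_candidates(variants, affected, unaffected)]
print(candidates)
```

Each filter removes one of the made-up variants, which mirrors how roughly 20,000 raw variations are narrowed to a small candidate set.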
Affiliation(s)
- Sophie Coutant
- University of Rouen, INSERM U1079 Molecular genetics of cancer and neuropsychiatric diseases, 76183 Rouen cedex, France
|
37
|
Altmann A, Weber P, Bader D, Preuß M, Binder EB, Müller-Myhsok B. A beginners guide to SNP calling from high-throughput DNA-sequencing data. Hum Genet 2012; 131:1541-54. [DOI: 10.1007/s00439-012-1213-z] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2012] [Accepted: 07/31/2012] [Indexed: 01/02/2023]
|
38
|
Heinzen E, Depondt C, Cavalleri G, Ruzzo E, Walley N, Need A, Ge D, He M, Cirulli E, Zhao Q, Cronin K, Gumbs C, Campbell C, Hong L, Maia J, Shianna K, McCormack M, Radtke R, O'Conner G, Mikati M, Gallentine W, Husain A, Sinha S, Chinthapalli K, Puranam R, McNamara J, Ottman R, Sisodiya S, Delanty N, Goldstein D. Exome sequencing followed by large-scale genotyping fails to identify single rare variants of large effect in idiopathic generalized epilepsy. Am J Hum Genet 2012; 91:293-302. [PMID: 22863189 DOI: 10.1016/j.ajhg.2012.06.016] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2012] [Revised: 03/30/2012] [Accepted: 06/26/2012] [Indexed: 10/28/2022] Open
Abstract
Idiopathic generalized epilepsy (IGE) is a complex disease with high heritability, but little is known about its genetic architecture. Rare copy-number variants have been found to explain nearly 3% of individuals with IGE; however, it remains unclear whether variants with moderate effect size and frequencies below what are reliably detected with genome-wide association studies contribute significantly to disease risk. In this study, we compare the exome sequences of 118 individuals with IGE and 242 controls of European ancestry by using next-generation sequencing. The exome-sequenced epilepsy cases include study subjects with two forms of IGE, including juvenile myoclonic epilepsy (n = 93) and absence epilepsy (n = 25). However, our discovery strategy did not assume common genetic control between the subtypes of IGE considered. In the sequence data, as expected, no variants were significantly associated with the IGE phenotype or more specific IGE diagnoses. We then selected 3,897 candidate epilepsy-susceptibility variants from the sequence data and genotyped them in a larger set of 878 individuals with IGE and 1,830 controls. Again, no variant achieved statistical significance. However, 1,935 variants were observed exclusively in cases either as heterozygous or homozygous genotypes. It is likely that this set of variants includes real risk factors. The lack of significant association evidence of single variants with disease in this two-stage approach emphasizes the high genetic heterogeneity of epilepsy disorders, suggests that the impact of any individual single-nucleotide variant in this disease is small, and indicates that gene-based approaches might be more successful for future sequencing studies of epilepsy predisposition.
|
39
|
Next generation sequencing for molecular diagnosis of neuromuscular diseases. Acta Neuropathol 2012; 124:273-83. [PMID: 22526018 PMCID: PMC3400754 DOI: 10.1007/s00401-012-0982-8] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2012] [Revised: 04/05/2012] [Accepted: 04/05/2012] [Indexed: 12/21/2022]
Abstract
Inherited neuromuscular disorders (NMD) are chronic genetic diseases posing a significant burden on patients and the health care system. Despite tremendous research and clinical efforts, the molecular causes remain unknown for nearly half of the patients, due to genetic heterogeneity and conventional molecular diagnosis based on a gene-by-gene approach. We aimed to test next generation sequencing (NGS) as an efficient and cost-effective strategy to accelerate patient diagnosis. We designed a capture library to target the coding and splice site sequences of all known NMD genes and used NGS and DNA multiplexing to retrieve the pathogenic mutations in patients with heterogeneous NMD with or without known mutations. We retrieved all known mutations, including point mutations and small indels, intronic and exonic mutations, and a large deletion in a patient with Duchenne muscular dystrophy, validating the sensitivity and reproducibility of this strategy on a heterogeneous subset of NMD with different genetic inheritance. Most pathogenic mutations were ranked at the top in our blind bioinformatic pipeline. Following the same strategy, we characterized probable TTN, RYR1 and COL6A3 mutations in several patients without previous molecular diagnosis. The cost was less than conventional testing for a single large gene. With appropriate adaptations, this strategy could be implemented into a routine genetic diagnosis set-up as a first screening approach to detect most kinds of mutations, potentially before the need for more invasive and specific clinical investigations. An earlier genetic diagnosis should provide improved disease management and higher quality genetic counseling, and ease access to therapy or inclusion into therapeutic trials.
|
40
|
Fischer M, Snajder R, Pabinger S, Dander A, Schossig A, Zschocke J, Trajanoski Z, Stocker G. SIMPLEX: cloud-enabled pipeline for the comprehensive analysis of exome sequencing data. PLoS One 2012; 7:e41948. [PMID: 22870267 PMCID: PMC3411592 DOI: 10.1371/journal.pone.0041948] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/23/2012] [Accepted: 06/28/2012] [Indexed: 01/24/2023] Open
Abstract
In recent studies, exome sequencing has proven to be a successful screening tool for the identification of candidate genes causing rare genetic diseases. Although underlying targeted sequencing methods are well established, necessary data handling and focused, structured analysis still remain demanding tasks. Here, we present a cloud-enabled autonomous analysis pipeline, which comprises the complete exome analysis workflow. The pipeline combines several in-house developed and published applications to perform the following steps: (a) initial quality control, (b) intelligent data filtering and pre-processing, (c) sequence alignment to a reference genome, (d) SNP and DIP detection, (e) functional annotation of variants using different approaches, and (f) detailed report generation during various stages of the workflow. The pipeline connects the selected analysis steps, exposes all available parameters for customized usage, performs required data handling, and distributes computationally expensive tasks either on a dedicated high-performance computing infrastructure or on the Amazon cloud environment (EC2). The presented application has already been used in several research projects including studies to elucidate the role of rare genetic diseases. The pipeline is continuously tested and is publicly available under the GPL as a VirtualBox or Cloud image at http://simplex.i-med.ac.at; additional supplementary data is provided at http://www.icbi.at/exome.
Affiliation(s)
- Maria Fischer
- Division for Bioinformatics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
- Rene Snajder
- Division for Bioinformatics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
- Oncotyrol, Center for Personalized Cancer Medicine, Innsbruck, Austria
- Stephan Pabinger
- Division for Bioinformatics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
- Andreas Dander
- Division for Bioinformatics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
- Oncotyrol, Center for Personalized Cancer Medicine, Innsbruck, Austria
- Anna Schossig
- Division of Human Genetics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
- Johannes Zschocke
- Division of Human Genetics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
- Zlatko Trajanoski
- Division for Bioinformatics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
- Gernot Stocker
- Division for Bioinformatics, Biocenter, Innsbruck Medical University, Innsbruck, Austria
|
41
|
De novo mutations in ATP1A3 cause alternating hemiplegia of childhood. Nat Genet 2012; 44:1030-4. [PMID: 22842232 PMCID: PMC3442240 DOI: 10.1038/ng.2358] [Citation(s) in RCA: 290] [Impact Index Per Article: 24.2] [Received: 04/16/2012] [Accepted: 06/28/2012] [Indexed: 11/08/2022]
Abstract
Alternating hemiplegia of childhood (AHC) is a rare, severe neurodevelopmental syndrome characterized by recurrent hemiplegic episodes and distinct neurologic manifestations. AHC is usually a sporadic disorder with unknown etiology. Using exome sequencing of seven patients with AHC, and their unaffected parents, we identified de novo nonsynonymous mutations in ATP1A3 in all seven AHC patients. Subsequent sequence analysis of ATP1A3 in 98 additional patients revealed that 78% of AHC cases have a likely causal ATP1A3 mutation, including one inherited mutation in a familial case of AHC. Remarkably, six ATP1A3 mutations explain the majority of patients, including one observed in 36 patients. Unlike ATP1A3 mutations that cause rapid-onset-dystonia-parkinsonism, AHC-causing mutations revealed consistent reductions in ATPase activity without effects on protein expression. This work identifies de novo ATP1A3 mutations as the primary cause of AHC, and offers insight into disease pathophysiology by expanding the spectrum of phenotypes associated with mutations in this gene.
|
42
|
Trakadis YJ. Patient-controlled encrypted genomic data: an approach to advance clinical genomics. BMC Med Genomics 2012; 5:31. [PMID: 22818218 PMCID: PMC3439266 DOI: 10.1186/1755-8794-5-31] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/15/2011] [Accepted: 06/30/2012] [Indexed: 12/21/2022] Open
Abstract
Background: The revolution in DNA sequencing technologies over the past decade has made it feasible to sequence an individual’s whole genome at a relatively low cost. The potential value of the information generated by genomic technologies for medicine and society is enormous. However, in order for exome sequencing, and eventually whole genome sequencing, to be implemented clinically, a number of major challenges need to be overcome. For instance, obtaining meaningful informed-consent, managing incidental findings and the great volume of data generated (including multiple findings with uncertain clinical significance), re-interpreting the genomic data and providing additional counselling to patients as genetic knowledge evolves are issues that need to be addressed. It appears that medical genetics is shifting from the present “phenotype-first” medical model to a “data-first” model which leads to multiple complexities. Discussion: This manuscript discusses the different challenges associated with integrating genomic technologies into clinical practice and describes a “phenotype-first” approach, namely, “Individualized Mutation-weighed Phenotype Search”, and its benefits. The proposed approach allows for a more efficient prioritization of the genes to be tested in a clinical lab based on both the patient’s phenotype and his/her entire genomic data. It simplifies “informed-consent” for clinical use of genomic technologies and helps to protect the patient’s autonomy and privacy. Overall, this approach could potentially render widespread use of genomic technologies, in the immediate future, practical, ethical and clinically useful. Summary: The “Individualized Mutation-weighed Phenotype Search” approach allows for an incremental integration of genomic technologies into clinical practice. It ensures that we do not over-medicalize genomic data but, rather, continue our current medical model which is based on serving the patient’s concerns. Service should not be solely driven by technology but rather by the medical needs and the extent to which a technology can be safely and effectively utilized.
Affiliation(s)
- Yannis J Trakadis
- Department of Medical Genetics, Montreal Children's Hospital-McGill University Health Centre, 2300 Tupper, Montreal, QC, Canada.
|
43
|
Medina I, De Maria A, Bleda M, Salavert F, Alonso R, Gonzalez CY, Dopazo J. VARIANT: Command Line, Web service and Web interface for fast and accurate functional characterization of variants found by Next-Generation Sequencing. Nucleic Acids Res 2012; 40:W54-8. [PMID: 22693211 PMCID: PMC3394276 DOI: 10.1093/nar/gks572] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 01/08/2023] Open
Abstract
The massive use of Next-Generation Sequencing (NGS) technologies is uncovering an unexpected amount of variability. The functional characterization of such variability, particularly in the most common form of variation found, the Single Nucleotide Variants (SNVs), has become a priority that needs to be addressed in a systematic way. VARIANT (VARIant ANalysis Tool) reports information on the variants found, including consequence type and annotations taken from different databases and repositories (SNPs and variants from dbSNP and 1000 genomes, and disease-related variants from the Genome-Wide Association Study (GWAS) catalog, Online Mendelian Inheritance in Man (OMIM), Catalog of Somatic Mutations in Cancer (COSMIC) mutations, etc.). VARIANT also produces a rich variety of annotations that include information on the regulatory (transcription factor or miRNA-binding sites, etc.) or structural roles, or on the selective pressures on the sites affected by the variation. This information allows extending the conventional reports beyond the coding regions and expands the knowledge on the contribution of non-coding or synonymous variants to the phenotype studied. Unlike other tools, VARIANT uses a remote database and operates through efficient RESTful Web Services that optimize search and transaction operations. In this way, local problems of installation, update or disk size limitations are overcome without the need to sacrifice speed (thousands of variants are processed per minute). VARIANT is available at: http://variant.bioinfo.cipf.es.
Affiliation(s)
- Ignacio Medina
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
|
44
|
Cheng YC, Hsiao FC, Yeh EC, Lin WJ, Tang CYL, Tseng HC, Wu HT, Liu CK, Chen CC, Chen YT, Yao A. VarioWatch: providing large-scale and comprehensive annotations on human genomic variants in the next generation sequencing era. Nucleic Acids Res 2012; 40:W76-81. [PMID: 22618869 PMCID: PMC3394242 DOI: 10.1093/nar/gks397] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/22/2022] Open
Abstract
VarioWatch (http://genepipe.ncgm.sinica.edu.tw/variowatch/) has been vastly improved since its earlier publication as GenoWatch in the 2008 Web Server Issue. It is now at least 10,000 times faster in annotating a variant. This drastic speed increase, achieved through a complete re-design of its working mechanism, makes VarioWatch capable of annotating millions of human genomic variants generated from next generation sequencing in minutes, if not seconds. While using MegaQuery of VarioWatch to quickly annotate variants, users can apply various filters to retrieve a subgroup of variants according to risk levels, regions of interest, etc. that satisfy their requirements. In addition to the performance leap, many new features have been added, such as annotation of novel variants, functional analyses of splice sites and in/dels, detailed variant information in tabulated form, plus a risk level decision tree for the analyzed variant. Up to 1000 target variants can be visualized with our carefully designed Genome View, Gene View, Transcript View and Variation View. Two commonly used reference versions, NCBI build 36.3 and NCBI build 37.2, are supported. VarioWatch is unique in its ability to annotate millions of variants online comprehensively and efficiently, delivering the results in real time and visualizing up to 1000 annotated variants.
Affiliation(s)
- Yu-Chang Cheng
- National Center for Genome Medicine and Institute of Biomedical Sciences, Academia Sinica, Taiwan 11529, R.O.C
|
45
|
Regan K, Wang K, Doughty E, Li H, Li J, Lee Y, Kann MG, Lussier YA. Translating Mendelian and complex inheritance of Alzheimer's disease genes for predicting unique personal genome variants. J Am Med Inform Assoc 2012; 19:306-16. [PMID: 22319180 PMCID: PMC3277633 DOI: 10.1136/amiajnl-2011-000656] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/24/2022] Open
Abstract
Objective: Although trait-associated genes identified as complex versus single-gene inheritance differ substantially in odds ratio, the authors nonetheless posit that their mechanistic concordance can reveal fundamental properties of the genetic architecture, allowing the automated interpretation of unique polymorphisms within a personal genome. Materials and methods: An analytical method, SPADE-gen, spanning three biological scales was developed to demonstrate the mechanistic concordance between Mendelian and complex inheritance of Alzheimer's disease (AD) genes: biological functions (BP), protein interaction modeling, and protein domain implicated in the disease-associated polymorphism. Results: Among Gene Ontology (GO) biological processes (BP) enriched at a false detection rate <5% in 15 AD genes of Mendelian inheritance (Online Mendelian Inheritance in Man) and independently in those of complex inheritance (25 host genes of intragenic AD single-nucleotide polymorphisms confirmed in genome-wide association studies), 16 overlapped (empirical p=0.007) and 45 were similar (empirical p<0.009; information theory). SPAN network modeling extended the canonical pathway of AD (KEGG) with 26 new protein interactions (empirical p<0.0001). Discussion: The study prioritized new AD-associated biological mechanisms and focused the analysis on previously unreported interactions associated with the biological processes of polymorphisms that affect specific protein domains within characterized AD genes and their direct interactors using (1) concordant GO-BP and (2) domain interactions within STRING protein–protein interactions corresponding to the genomic location of the AD polymorphism (eg, EPHA1, APOE, and CD2AP). Conclusion: These results are in line with unique-event polymorphism theory, indicating how disease-associated polymorphisms of Mendelian or complex inheritance relate genetically to those observed as ‘unique personal variants’. They also provide insight for identifying novel targets, for repositioning drugs, and for personal therapeutics.
Affiliation(s)
- Kelly Regan
- Department of Medicine, University of Illinois at Chicago, Chicago, Illinois 60637, USA
|
46
|
Need AC, Shashi V, Hitomi Y, Schoch K, Shianna KV, McDonald MT, Meisler MH, Goldstein DB. Clinical application of exome sequencing in undiagnosed genetic conditions. J Med Genet 2012; 49:353-61. [PMID: 22581936 PMCID: PMC3375064 DOI: 10.1136/jmedgenet-2012-100819] [Citation(s) in RCA: 318] [Impact Index Per Article: 26.5] [Indexed: 01/06/2023]
Abstract
Background: There is considerable interest in the use of next-generation sequencing to help diagnose unidentified genetic conditions, but it is difficult to predict the success rate in a clinical setting that includes patients with a broad range of phenotypic presentations. Methods: The authors present a pilot programme of whole-exome sequencing on 12 patients with unexplained and apparent genetic conditions, along with their unaffected parents. Unlike many previous studies, the authors did not seek patients with similar phenotypes, but rather enrolled any undiagnosed proband with an apparent genetic condition when predetermined criteria were met. Results: This undertaking resulted in a likely genetic diagnosis in 6 of the 12 probands, including the identification of apparently causal mutations in four genes known to cause Mendelian disease (TCF4, EFTUD2, SCN2A and SMAD4) and one gene related to known Mendelian disease genes (NGLY1). Of particular interest is that at the time of this study, EFTUD2 was not yet known as a Mendelian disease gene but was nominated as a likely cause based on the observation of de novo mutations in two unrelated probands. In a seventh case with multiple disparate clinical features, the authors were able to identify homozygous mutations in EFEMP1 as a likely cause for macular degeneration (though likely not for other features). Conclusions: This study provides evidence that next-generation sequencing can have high success rates in a clinical setting, but also highlights key challenges. It further suggests that the presentation of known Mendelian conditions may be considerably broader than currently recognised.
Affiliation(s)
- Anna C Need
- Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, NC 27708, USA.
|
47
|
Makarov V, O'Grady T, Cai G, Lihm J, Buxbaum JD, Yoon S. AnnTools: a comprehensive and versatile annotation toolkit for genomic variants. Bioinformatics 2012; 28:724-5. [PMID: 22257670 DOI: 10.1093/bioinformatics/bts032] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 12/12/2022]
Abstract
AnnTools is a versatile bioinformatics application designed for comprehensive annotation of a full spectrum of human genome variation: novel and known single-nucleotide substitutions (SNP/SNV), short insertions/deletions (INDEL) and structural variants/copy number variation (SV/CNV). The variants are interpreted by interrogating data compiled from 15 constantly updated sources. In addition to detailed functional characterization of the coding variants, AnnTools searches for overlaps with regulatory elements, disease/trait associated loci, known segmental duplications and artifact prone regions, thereby offering an integrated and comprehensive analysis of genomic data. The tool conveniently accepts user-provided tracks for custom annotation and offers flexibility in input data formats. The output is generated in the universal Variant Call Format. High annotation speed makes AnnTools suitable for high-throughput sequencing facilities, while a low-memory footprint and modest CPU requirements allow it to operate on a personal computer. The application is freely available for public use; the package includes installation scripts and a set of helper tools. Availability: http://anntools.sourceforge.net/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Vladimir Makarov
- The Seaver Autism Center for Research and Treatment, Department of Psychiatry, Levy Library, Mount Sinai School of Medicine, New York, NY 10029, USA.
|
48
|
Capriotti E, Nehrt NL, Kann MG, Bromberg Y. Bioinformatics for personal genome interpretation. Brief Bioinform 2012; 13:495-512. [PMID: 22247263 DOI: 10.1093/bib/bbr070] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Indexed: 01/02/2023] Open
Abstract
An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of these data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field: the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.
Affiliation(s)
- Emidio Capriotti
- Department of Mathematics and Computer Science, University of Balearic Islands, ctra. de Valldemossa Km 7.5, Palma de Mallorca, 07122 Spain.
|
49
|
Teer JK, Green ED, Mullikin JC, Biesecker LG. VarSifter: visualizing and analyzing exome-scale sequence variation data on a desktop computer. Bioinformatics 2011; 28:599-600. [PMID: 22210868 DOI: 10.1093/bioinformatics/btr711] [Citation(s) in RCA: 126] [Impact Index Per Article: 9.7] [Indexed: 11/13/2022]
Abstract
VarSifter is a graphical software tool for desktop computers that allows investigators of varying computational skills to easily and quickly sort, filter, and sift through sequence variation data. A variety of filters and a custom query framework allow filtering based on any combination of sample and annotation information. By simplifying visualization and analyses of exome-scale sequence variation data, this program will help bring the power and promise of massively-parallel DNA sequencing to a broader group of researchers. Availability and implementation: VarSifter is written in Java, and is freely available in source and binary versions, along with a User Guide, at http://research.nhgri.nih.gov/software/VarSifter/.
Affiliation(s)
- Jamie K Teer
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
50
|
Cirulli ET, Heinzen EL, Dietrich FS, Shianna KV, Singh A, Maia JM, Goedert JJ, Goldstein DB. A whole-genome analysis of premature termination codons. Genomics 2011; 98:337-42. [PMID: 21803148 PMCID: PMC3282586 DOI: 10.1016/j.ygeno.2011.07.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 04/14/2011] [Revised: 07/02/2011] [Accepted: 07/14/2011] [Indexed: 11/18/2022]
Abstract
We sequenced the genomes of ten unrelated individuals and identified heterozygous stop codon-gain variants in protein-coding genes: we then sequenced their transcriptomes and assessed the expression levels of the stop codon-gain alleles. An ANOVA showed statistically significant differences between their expression levels (p = 4×10^-16). This difference was almost entirely accounted for by whether the stop codon-gain variant had a second, non-protein-truncating function in or near an alternate transcript: stop codon-gains without alternate functions were generally not found in the cDNA (p = 3×10^-5). Additionally, stop codon-gain variants in two intronless genes were not expressed, an unexpected outcome given previous studies. In this study, stop codon-gain variants were either well expressed in all individuals or were never expressed. Our finding that stop codon-gain variants were generally expressed only when they had an alternate function suggests that most naturally occurring stop codon-gain variants in protein-coding genes are either not transcribed or have their transcripts destroyed.
Affiliation(s)
- Elizabeth T. Cirulli
- Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, 27708, USA
- Erin L. Heinzen
- Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, 27708, USA
- Fred S. Dietrich
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Research Drive, Durham, NC 27710, USA
- Kevin V. Shianna
- Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, 27708, USA
- Abanish Singh
- Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, 27708, USA
- Jessica M. Maia
- Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, 27708, USA
- James J. Goedert
- Infections & Immunoepidemiology Branch, Division of Cancer Epidemiology and Genetics, US National Cancer Institutes of Health, 6120 Executive Boulevard, Rockville, 20852, USA
- David B. Goldstein
- Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, 27708, USA
|