1
Ralli S, Vira T, Robles-Espinoza CD, Adams DJ, Brooks-Wilson AR. Variant ranking pipeline for complex familial disorders. Sci Rep 2024; 14:13599. [PMID: 38866901 PMCID: PMC11169219 DOI: 10.1038/s41598-024-64169-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 06/05/2024] [Indexed: 06/14/2024]
Abstract
Identifying genetic susceptibility factors for complex disorders remains a challenging task. To analyze collections of small and large pedigrees where genetic heterogeneity is likely, but biological commonalities are plausible, we have developed a weights-based pipeline to prioritize variants and genes. The Weights-based vAriant Ranking in Pedigrees (WARP) pipeline prioritizes variants using 5 weights: disease incidence rate, number of cases in a family, genome fraction shared amongst cases in a family, allele frequency and variant deleteriousness. Weights, except for the population allele frequency weight, are normalized between 0 and 1. Weights are combined multiplicatively to produce family-specific-variant weights that are then averaged across all families in which the variant is observed to generate a multifamily weight. Sorting multifamily weights in descending order creates a ranked list of variants and genes for further investigation. WARP was validated using familial melanoma sequence data from the European Genome-phenome Archive. The pipeline identified variation in known germline melanoma genes POT1, MITF and BAP1 in 4 out of 13 families (31%). Analysis of the other 9 families identified several interesting genes, some of which might have a role in melanoma. WARP provides an approach to identify disease predisposing genes in studies with small and large pedigrees.
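The weighting scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the WARP implementation: the function names, the tuple layout, and the example weight values are assumptions made for the sketch, and the way each individual weight is derived is not covered by the abstract.

```python
from collections import defaultdict
from statistics import mean


def family_variant_weight(incidence_w, cases_w, sharing_w, freq_w, deleterious_w):
    """Multiply the five weights into one family-specific variant weight."""
    return incidence_w * cases_w * sharing_w * freq_w * deleterious_w


def rank_variants(observations):
    """Rank variants by multifamily weight, highest first.

    observations: iterable of (variant_id, family_id, weights), where weights
    is a 5-tuple in the order used by family_variant_weight (hypothetical layout).
    """
    per_variant = defaultdict(list)
    for variant_id, _family_id, weights in observations:
        per_variant[variant_id].append(family_variant_weight(*weights))
    # Average over the families in which each variant is observed.
    multifamily = {v: mean(ws) for v, ws in per_variant.items()}
    return sorted(multifamily.items(), key=lambda item: item[1], reverse=True)


# Made-up weights for illustration only.
example = [
    ("varA", "fam1", (1.0, 0.5, 0.5, 2.0, 1.0)),
    ("varA", "fam2", (1.0, 1.0, 0.5, 2.0, 0.5)),
    ("varB", "fam1", (1.0, 0.5, 0.5, 1.0, 0.4)),
]
ranking = rank_variants(example)
```

Because the weights are multiplied, a value near zero in any one of them pulls the family-specific weight toward zero, which is one plausible reading of why the abstract describes the combination as multiplicative rather than additive.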
Affiliation(s)
- Sneha Ralli
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 1L3, Canada
  - Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- Tariq Vira
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 1L3, Canada
- David J Adams
  - Experimental Cancer Genetics, Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK
- Angela R Brooks-Wilson
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 1L3, Canada
  - Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
2
Macnamara EF, D’Souza P, Tifft CJ. The undiagnosed diseases program: Approach to diagnosis. Translational Science of Rare Diseases 2020; 4:179-188. [PMID: 32477883 PMCID: PMC7250153 DOI: 10.3233/trd-190045] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/05/2023]
Abstract
Undiagnosed and rare conditions are collectively common and affect millions of people worldwide. The NIH Undiagnosed Diseases Program (UDP) strives to achieve both a comprehensive diagnosis and a better understanding of the mechanisms of disease for many of these individuals. Through the careful review of records, a well-orchestrated inpatient evaluation, genomic sequencing and testing, and with the use of emerging strategies such as matchmaking programs, the UDP succeeds nearly 30 percent of the time for these highly selective cases. Although the UDP process is built on a unique set of resources, case examples demonstrate steps genetic professionals can take, in both clinical and research settings, to arrive at a diagnosis for their most challenging cases.
Affiliation(s)
- Ellen F. Macnamara
  - National Institutes of Health, Undiagnosed Diseases Program, Common Fund, Office of the Director, Bethesda, MD, USA
- Precilla D’Souza
  - National Institutes of Health, Undiagnosed Diseases Program, Common Fund, Office of the Director, Bethesda, MD, USA
- Undiagnosed Diseases Network
  - National Institutes of Health, Undiagnosed Diseases Program, Common Fund, Office of the Director, Bethesda, MD, USA
  - Office of the Clinical Director, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Cynthia J. Tifft
  - National Institutes of Health, Undiagnosed Diseases Program, Common Fund, Office of the Director, Bethesda, MD, USA
  - Office of the Clinical Director, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
3
Schork NJ, Nazor K. Integrated Genomic Medicine: A Paradigm for Rare Diseases and Beyond. Advances in Genetics 2017; 97:81-113. [PMID: 28838357 PMCID: PMC6383766 DOI: 10.1016/bs.adgen.2017.06.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/05/2023]
Abstract
Individualized medicine, or the tailoring of therapeutic interventions to a patient's unique genetic, biochemical, physiological, exposure and behavioral profile, has been enhanced, if not enabled, by modern biomedical technologies such as high-throughput DNA sequencing platforms, induced pluripotent stem cell assays, biomarker discovery protocols, imaging modalities, and wireless monitoring devices. Despite successes in the isolated use of these technologies, however, it is arguable that their combined and integrated use in focused studies of individual patients is the best way to not only tailor interventions for those patients, but also shed light on treatment strategies for patients with similar conditions. This is particularly true for individuals with rare diseases since, by definition, they will require study without recourse to other individuals, or at least without recourse to many other individuals. Such integration and focus will require new biomedical scientific paradigms and infrastructure, including the creation of databases harboring study results, the formation of dedicated multidisciplinary research teams and new training programs. We consider the motivation and potential for such integration, point out areas in need of improvement, and argue for greater emphasis on improving patient health via technological innovations, not merely improving the technologies themselves. We also argue that the paradigm described can, in theory, be extended to the study of individuals with more common diseases.
Affiliation(s)
- Nicholas J. Schork
  - The Translational Genomics Research Institute, 445 North Fifth Street, Phoenix, AZ 85004, 858-794-4054
- Kristopher Nazor
  - MYi Diagnostics and Discovery, 5310 Eastgate Mall, San Diego, CA 92121, 858-458-9305
4
Feng BJ. PERCH: A Unified Framework for Disease Gene Prioritization. Hum Mutat 2017; 38:243-251. [PMID: 27995669 PMCID: PMC5299048 DOI: 10.1002/humu.23158] [Citation(s) in RCA: 96] [Impact Index Per Article: 13.7] [Received: 10/04/2016] [Accepted: 12/12/2016] [Indexed: 12/30/2022]
Abstract
To interpret genetic variants discovered from next-generation sequencing, integration of heterogeneous information is vital for success. This article describes a framework named PERCH (Polymorphism Evaluation, Ranking, and Classification for a Heritable trait), available at http://BJFengLab.org/. It can prioritize disease genes by quantitatively unifying a new deleteriousness measure called BayesDel, an improved assessment of the biological relevance of genes to the disease, a modified linkage analysis, a novel rare-variant association test, and a converted variant call quality score. It supports data that contain various combinations of extended pedigrees, trios, and case-controls, and allows for a reduced penetrance, an elevated phenocopy rate, liability classes, and covariates. BayesDel is more accurate than PolyPhen2, SIFT, FATHMM, LRT, Mutation Taster, Mutation Assessor, PhyloP, GERP++, SiPhy, CADD, MetaLR, and MetaSVM. The overall approach is faster and more powerful than the existing quantitative method pVAAST, as shown by the simulations of challenging situations in finding the missing heritability of a complex disease. This framework can also classify variants of unknown significance (variants of uncertain significance) by quantitatively integrating allele frequencies, deleteriousness, association, and co-segregation. PERCH is a versatile tool for gene prioritization in gene discovery research and variant classification in clinical genetic testing.
Affiliation(s)
- Bing-Jian Feng
  - Department of Dermatology, University of Utah, Salt Lake City, UT 84132, USA
  - Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84132, USA
5
Försti A, Kumar A, Paramasivam N, Schlesner M, Catalano C, Dymerska D, Lubinski J, Eils R, Hemminki K. Pedigree based DNA sequencing pipeline for germline genomes of cancer families. Hered Cancer Clin Pract 2016; 14:16. [PMID: 27508007 PMCID: PMC4977614 DOI: 10.1186/s13053-016-0058-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/14/2016] [Accepted: 07/04/2016] [Indexed: 12/20/2022]
Abstract
BACKGROUND In the course of our whole-genome sequencing efforts, we have developed a pipeline for analyzing germline genomes from Mendelian types of cancer pedigrees (familial cancer variant prioritization pipeline, FCVPP). RESULTS The variant calling step distinguishes two types of genomic variants: single nucleotide variants (SNVs) and indels, which undergo technical quality control. Mendelian types of variants are assumed to be rare, and variants with frequencies higher than 0.1% are screened out using human 1000 Genomes (Phase 3) and non-TCGA ExAC population data. Segregation in the pedigree allows variants to be present in affected family members and not in old, unaffected ones. The effectiveness of variant segregation depends on the number and relatedness of the family members: if over 5 third-degree (or more distant) relatives are available, experience has shown that the number of likely variants is reduced from many hundreds to a few tens. These are then subjected to bioinformatics analysis, starting with the combined annotation dependent depletion (CADD) tool, which predicts the likelihood of the variant being deleterious. Different sets of individual tools are used for further evaluation of the deleteriousness of coding variants, 5' and 3' untranslated regions (UTRs), and intergenic variants. CONCLUSIONS The likelihood of success of the present genomic pipeline in finding novel high- or medium-penetrant genes depends on many steps but first and foremost, the pedigree needs to be reasonably large and the assignments and diagnoses among the members need to be correct.
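The frequency and segregation steps in this abstract could be sketched along the following lines. This is an illustrative filter only, not the FCVPP code: the dictionary layout, the function names, and the rule that every affected member must carry the variant are assumptions of the sketch; only the 0.1% cutoff and the "present in affected, absent from old unaffected" idea come from the abstract.

```python
MAX_POPULATION_FREQUENCY = 0.001  # 0.1%, the cutoff quoted in the abstract


def passes_frequency(variant):
    """Keep a variant only if it is rare in every reference population."""
    frequencies = variant["population_frequencies"].values()
    return all(f <= MAX_POPULATION_FREQUENCY for f in frequencies)


def segregates(variant, affected, old_unaffected):
    """Assumed rule: carried by all affected members, by no old unaffected member."""
    carriers = variant["carriers"]
    return affected <= carriers and not (old_unaffected & carriers)


def filter_variants(variants, affected, old_unaffected):
    """Apply the frequency filter, then the segregation filter."""
    return [
        v for v in variants
        if passes_frequency(v) and segregates(v, affected, old_unaffected)
    ]


# Hypothetical pedigree and variants for illustration only.
affected = {"II-1", "III-2"}
old_unaffected = {"I-2"}
variants = [
    {"id": "v1", "population_frequencies": {"1000G": 0.0002, "ExAC": 0.0},
     "carriers": {"II-1", "III-2"}},
    {"id": "v2", "population_frequencies": {"1000G": 0.02, "ExAC": 0.01},
     "carriers": {"II-1", "III-2"}},
    {"id": "v3", "population_frequencies": {"1000G": 0.0, "ExAC": 0.0},
     "carriers": {"II-1", "III-2", "I-2"}},
]
kept = filter_variants(variants, affected, old_unaffected)
```

In this toy pedigree only `v1` survives: `v2` is too common, and `v3` is carried by an old unaffected relative. Variants that pass would then go on to deleteriousness scoring, which the sketch does not attempt.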
Affiliation(s)
- Asta Försti
  - Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), D69120 Heidelberg, Germany
  - Center for Primary Health Care Research, Lund University, Malmö, Sweden
- Abhishek Kumar
  - Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), D69120 Heidelberg, Germany
- Nagarajan Paramasivam
  - Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), D69120 Heidelberg, Germany
  - Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany
- Matthias Schlesner
  - Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), D69120 Heidelberg, Germany
- Calogerina Catalano
  - Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), D69120 Heidelberg, Germany
- Dagmara Dymerska
  - Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Jan Lubinski
  - Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Roland Eils
  - Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), D69120 Heidelberg, Germany
  - Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology (IPMB) and BioQuant, Heidelberg University, Heidelberg, Germany
- Kari Hemminki
  - Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), D69120 Heidelberg, Germany
  - Center for Primary Health Care Research, Lund University, Malmö, Sweden
6
Chung RH, Tsai WY, Kang CY, Yao PJ, Tsai HJ, Chen CH. FamPipe: An Automatic Analysis Pipeline for Analyzing Sequencing Data in Families for Disease Studies. PLoS Comput Biol 2016; 12:e1004980. [PMID: 27272119 PMCID: PMC4894624 DOI: 10.1371/journal.pcbi.1004980] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/21/2015] [Accepted: 05/12/2016] [Indexed: 11/18/2022]
Abstract
In disease studies, family-based designs have become an attractive approach to analyzing next-generation sequencing (NGS) data for the identification of rare mutations enriched in families. Substantial research effort has been devoted to developing pipelines for automating sequence alignment, variant calling, and annotation. However, fewer pipelines have been designed specifically for disease studies. Most of the current analysis pipelines for family-based disease studies using NGS data focus on a specific function, such as identifying variants with Mendelian inheritance or identifying shared chromosomal regions among affected family members. Consequently, some other useful family-based analysis tools, such as imputation, linkage, and association tools, have yet to be integrated and automated. We developed FamPipe, a comprehensive analysis pipeline, which includes several family-specific analysis modules, including the identification of shared chromosomal regions among affected family members, prioritizing variants assuming a disease model, imputation of untyped variants, and linkage and association tests. We used simulation studies to compare properties of some modules implemented in FamPipe, and based on the results, we provided suggestions for the selection of modules to achieve an optimal analysis strategy. The pipeline is under the GNU GPL License and can be downloaded for free at http://fampipe.sourceforge.net.
Affiliation(s)
- Ren-Hua Chung
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Wei-Yun Tsai
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Chen-Yu Kang
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Po-Ju Yao
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Hui-Ju Tsai
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
  - Department of Public Health, China Medical University, Taichung, Taiwan
  - Department of Pediatrics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Chia-Hsiang Chen
  - Department of Psychiatry, Chang Gung Memorial Hospital-Linkou, Gueishan, Taoyuan, Taiwan
  - Department and Graduate Institute of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
7
Juan L, Liu Y, Wang Y, Teng M, Zang T, Wang Y. Family genome browser: visualizing genomes with pedigree information. Bioinformatics 2015; 31:2262-8. [PMID: 25788626 DOI: 10.1093/bioinformatics/btv151] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/10/2014] [Accepted: 03/11/2015] [Indexed: 02/06/2023]
Abstract
MOTIVATION Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to advances in high-throughput sequencing technologies, family genome sequencing is becoming increasingly prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize family genome variants and their functions with integrated pedigree information remains a critical challenge. RESULTS We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes effectively at both the individual level and the variant level by integrating genome data with pedigree information. Family genome analysis, including determination of the parental origin of variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibrium, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search for de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. AVAILABILITY AND IMPLEMENTATION The FGB is available at http://mlg.hit.edu.cn/FGB/.
Affiliation(s)
- Liran Juan
  - Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Yongzhuang Liu
  - Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Yongtian Wang
  - Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Mingxiang Teng
  - Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Tianyi Zang
  - Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Yadong Wang
  - Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
8
Pham PH, Shipman WJ, Erikson GA, Schork NJ, Torkamani A. Scripps Genome ADVISER: Annotation and Distributed Variant Interpretation SERver. PLoS One 2015; 10:e0116815. [PMID: 25706643 PMCID: PMC4338027 DOI: 10.1371/journal.pone.0116815] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 05/13/2014] [Accepted: 12/01/2014] [Indexed: 12/31/2022]
Abstract
Interpretation of human genomes is a major challenge. We present the Scripps Genome ADVISER (SG-ADVISER) suite, which aims to fill the gap between data generation and genome interpretation by performing holistic, in-depth, annotations and functional predictions on all variant types and effects. The SG-ADVISER suite includes a de-identification tool, a variant annotation web-server, and a user interface for inheritance and annotation-based filtration. SG-ADVISER allows users with no bioinformatics expertise to manipulate large volumes of variant data with ease--without the need to download large reference databases, install software, or use a command line interface. SG-ADVISER is freely available at genomics.scripps.edu/ADVISER.
Affiliation(s)
- Phillip H. Pham
  - Cypher Genomics, Inc., La Jolla, CA 92037, United States of America
- William J. Shipman
  - Scripps Health, La Jolla, CA 92037, United States of America
  - The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
- Galina A. Erikson
  - Scripps Health, La Jolla, CA 92037, United States of America
  - The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
- Nicholas J. Schork
  - Scripps Health, La Jolla, CA 92037, United States of America
  - The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
  - The Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, United States of America
  - Cypher Genomics, Inc., La Jolla, CA 92037, United States of America
- Ali Torkamani
  - Scripps Health, La Jolla, CA 92037, United States of America
  - The Scripps Translational Science Institute, La Jolla, CA 92037, United States of America
  - The Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States of America
  - Cypher Genomics, Inc., La Jolla, CA 92037, United States of America
9
Vandeweyer G, Van Laer L, Loeys B, Van den Bulcke T, Kooy RF. VariantDB: a flexible annotation and filtering portal for next generation sequencing data. Genome Med 2014; 6:74. [PMID: 25352915 PMCID: PMC4210545 DOI: 10.1186/s13073-014-0074-6] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Received: 05/28/2014] [Accepted: 09/15/2014] [Indexed: 12/30/2022]
Abstract
Interpretation of the multitude of variants obtained from next generation sequencing (NGS) is labor intensive and complex. Web-based interfaces such as Galaxy streamline the generation of variant lists but lack flexibility in the downstream annotation and filtering that are necessary to identify causative variants in medical genomics. To this end, we built VariantDB, a web-based interactive annotation and filtering platform that automatically annotates variants with allele frequencies, functional impact, pathogenicity predictions and pathway information. VariantDB allows filtering by all annotations, under dominant, recessive or de novo inheritance models and is freely available at http://www.biomina.be/app/variantdb/.
Affiliation(s)
- Geert Vandeweyer
  - Department of Medical Genetics, University of Antwerp, 2650 Edegem, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp, University and University Hospital of Antwerp, 2650 Edegem, Antwerp, Belgium
- Lut Van Laer
  - Department of Medical Genetics, University of Antwerp, 2650 Edegem, Antwerp, Belgium
  - Department of Medical Genetics, University Hospital of Antwerp, 2650 Edegem, Antwerp, Belgium
- Bart Loeys
  - Department of Medical Genetics, University of Antwerp, 2650 Edegem, Antwerp, Belgium
  - Department of Medical Genetics, University Hospital of Antwerp, 2650 Edegem, Antwerp, Belgium
- Tim Van den Bulcke
  - Biomedical Informatics Research Center Antwerp, University and University Hospital of Antwerp, 2650 Edegem, Antwerp, Belgium
- R Frank Kooy
  - Department of Medical Genetics, University of Antwerp, 2650 Edegem, Antwerp, Belgium
10
Bodian DL, Solomon BD, Khromykh A, Thach DC, Iyer RK, Link K, Baker RL, Baveja R, Vockley JG, Niederhuber JE. Diagnosis of an imprinted-gene syndrome by a novel bioinformatics analysis of whole-genome sequences from a family trio. Mol Genet Genomic Med 2014; 2:530-8. [PMID: 25614875 PMCID: PMC4303223 DOI: 10.1002/mgg3.107] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 06/13/2014] [Revised: 07/02/2014] [Accepted: 07/16/2014] [Indexed: 01/22/2023]
Abstract
Whole-genome sequencing and whole-exome sequencing are becoming more widely applied in clinical medicine to help diagnose rare genetic diseases. Identification of the underlying causative mutations by genome-wide sequencing is greatly facilitated by concurrent analysis of multiple family members, most often the mother-father-proband trio, using bioinformatics pipelines that filter genetic variants by mode of inheritance. However, current pipelines are limited to Mendelian inheritance patterns and do not specifically address disorders caused by mutations in imprinted genes, such as forms of Angelman syndrome and Beckwith-Wiedemann syndrome. Using publicly available tools, we implemented a genetic inheritance search mode to identify imprinted-gene mutations. Application of this search mode to whole-genome sequences from a family trio led to a diagnosis for a proband for whom extensive clinical testing and Mendelian inheritance-based sequence analysis were nondiagnostic. The condition in this patient, IMAGe syndrome, is likely caused by the heterozygous mutation c.832A>G (p.Lys278Glu) in the imprinted gene CDKN1C. The genotypes and disease status of six members of the family are consistent with maternal expression of the gene, and allele-biased expression was confirmed by RNA-Seq for the heterozygotes. This analysis demonstrates that an imprinted-gene search mode is a valuable addition to genome sequence analysis pipelines for identifying disease-causative variants.
Affiliation(s)
- Dale L Bodian
  - Inova Translational Medicine Institute, Inova Health System, Falls Church, Virginia
- Benjamin D Solomon
  - Inova Translational Medicine Institute, Inova Health System, Falls Church, Virginia
- Alina Khromykh
  - Inova Translational Medicine Institute, Inova Health System, Falls Church, Virginia
- Dzung C Thach
  - Inova Translational Medicine Institute, Inova Health System, Falls Church, Virginia
- Ramaswamy K Iyer
  - Inova Translational Medicine Institute, Inova Health System, Falls Church, Virginia
- Kathleen Link
  - Department of Pediatric Endocrinology, Inova Children's Hospital, Falls Church, Virginia
- Robin L Baker
  - Fairfax Neonatal Associates PC, Inova Children's Hospital, Falls Church, Virginia
- Rajiv Baveja
  - Fairfax Neonatal Associates PC, Inova Children's Hospital, Falls Church, Virginia
- Joseph G Vockley
  - Inova Translational Medicine Institute, Inova Health System, Falls Church, Virginia
- John E Niederhuber
  - Inova Translational Medicine Institute, Inova Health System, Falls Church, Virginia
11
Winchester B. Lysosomal diseases: diagnostic update. J Inherit Metab Dis 2014; 37:599-608. [PMID: 24711203 DOI: 10.1007/s10545-014-9710-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 11/28/2013] [Revised: 03/13/2014] [Accepted: 03/17/2014] [Indexed: 12/14/2022]
Abstract
Technological developments in newborn and population screening, biomarker discovery for monitoring treatment and rapid high throughput DNA sequencing are having a great impact on the diagnostic procedure for symptomatic patients with lysosomal storage diseases (LSDs). The use of dried blood spots, initially for newborn screening, has stimulated the introduction of automated, rapid and more sensitive methods for the assay of lysosomal enzymes, including the synthesis of novel substrates. Storage products and secondary metabolites in urine and cells can be identified and measured very accurately and sensitively by high performance liquid chromatography and tandem mass spectrometry. This has enhanced the preliminary metabolite screen for LSDs and facilitated the diagnosis of transport defects. Fast, reliable and affordable high throughput DNA sequencing, such as whole or selected exome sequencing, is helping to make diagnoses in difficult cases, to reveal novel gene defects, to widen the clinical spectrum of diseases and possibly to identify modifying genetic factors. Bioinformatics will be necessary to handle the data generated by these new technologies. Notwithstanding these technical innovations, accurate and reliable diagnosis will still depend on the knowledge and experience of skilled laboratory staff.
Affiliation(s)
- Bryan Winchester
  - Biochemistry Research Group, UCL Institute of Child Health at Great Ormond Street Hospital, University College London, London, UK
12
Koboldt D, Larson D, Sullivan L, Bowne S, Steinberg K, Churchill J, Buhr A, Nutter N, Pierce E, Blanton S, Weinstock G, Wilson R, Daiger S. Exome-based mapping and variant prioritization for inherited Mendelian disorders. Am J Hum Genet 2014; 94:373-84. [PMID: 24560519 DOI: 10.1016/j.ajhg.2014.01.016] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 11/06/2013] [Accepted: 01/30/2014] [Indexed: 02/08/2023]
Abstract
Exome sequencing in families affected by rare genetic disorders has the potential to rapidly identify new disease genes (genes in which mutations cause disease), but the identification of a single causal mutation among thousands of variants remains a significant challenge. We developed a scoring algorithm to prioritize potential causal variants within a family according to segregation with the phenotype, population frequency, predicted effect, and gene expression in the tissue(s) of interest. To narrow the search space in families with multiple affected individuals, we also developed two complementary approaches to exome-based mapping of autosomal-dominant disorders. One approach identifies segments of maximum identity by descent among affected individuals; the other nominates regions on the basis of shared rare variants and the absence of homozygous differences between affected individuals. We showcase our methods by using exome sequence data from families affected by autosomal-dominant retinitis pigmentosa (adRP), a rare disorder characterized by night blindness and progressive vision loss. We performed exome capture and sequencing on 91 samples representing 24 families affected by probable adRP but lacking common disease-causing mutations. Eight of 24 families (33%) were revealed to harbor high-scoring, most likely pathogenic (by clinical assessment) mutations affecting known RP genes. Analysis of the remaining 16 families identified candidate variants in a number of interesting genes, some of which have withstood further segregation testing in extended pedigrees. To empower the search for Mendelian-disease genes in family-based sequencing studies, we implemented our methods in a cross-platform-compatible software package, MendelScan, which is freely available to the research community.
13
Solomon BD. Enhancing the incidental pipeline in genomic sequencing. Mol Syndromol 2014; 5:47-50. [PMID: 24715850 DOI: 10.1159/000357929] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 11/25/2013] [Indexed: 11/19/2022]
Affiliation(s)
- B D Solomon
  - Division of Medical Genomics, Inova Translational Medicine Institute, Inova Health System, and Department of Pediatrics, Inova Children's Hospital, Falls Church, Va., USA
14
Dias C, McDonald A, Sincan M, Rupps R, Markello T, Salvarinova R, Santos RF, Menghrajani K, Ahaghotu C, Sutherland DP, Fortuno ES, Kollmann TR, Demos M, Friedman JM, Speert DP, Gahl WA, Boerkoel CF. Recurrent subacute post-viral onset of ataxia associated with a PRF1 mutation. Eur J Hum Genet 2013; 21:1232-9. [PMID: 23443029 PMCID: PMC3798831 DOI: 10.1038/ejhg.2013.20] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 07/09/2012] [Revised: 11/19/2012] [Accepted: 01/23/2013] [Indexed: 12/14/2022]
Abstract
Inflammation is an important contributor to pediatric and adult neurodegeneration. Understanding the genetic determinants of neuroinflammation provides valuable insight into disease mechanism. We characterize a disorder of recurrent immune-mediated neurodegeneration. We report two sisters who presented with neurodegeneration triggered by infections. The proband, a previously healthy girl, presented at 22.5 months with ataxia and dysarthria following mild gastroenteritis. MRI at onset showed a symmetric signal abnormality of the cerebellar and peritrigonal white matter. Following a progressive course of partial remissions and relapses, she died at 5 years of age. Her older sister had a similar course following varicella infection; she died within 13 months. Both sisters had unremarkable routine laboratory testing, with the exception of a transient mild cytopenia in the proband 19 months after presentation. Exome sequencing identified a biallelic perforin1 mutation (PRF1; p.R225W) previously associated with familial hemophagocytic lymphohistiocytosis (FHL). In contrast to FHL, these girls did not have hematopathology or cytokine overproduction. However, 3 years after disease onset, the proband had markedly deficient interleukin-1 beta (IL-1β) production. These observations extend the spectrum of disease associated with perforin mutations to immune-mediated neurodegeneration triggered by infection and possibly due to primary immunodeficiency.
Affiliation(s)
- Cristina Dias
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Genetics and Health Cluster, Child and Family Research Institute, BC Children's Hospital, Vancouver, British Columbia, Canada
- Allison McDonald
- Centre for Understanding and Preventing Infection in Children, Child and Family Research Institute, Vancouver, British Columbia, Canada
- Division of Infectious and Immunological Diseases, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- Murat Sincan
- NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, MD, USA
- Rosemarie Rupps
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Genetics and Health Cluster, Child and Family Research Institute, BC Children's Hospital, Vancouver, British Columbia, Canada
- Rare Disease Foundation, Vancouver, British Columbia, Canada
- Thomas Markello
- NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, MD, USA
- Ramona Salvarinova
- Division of Biochemical Diseases, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- Rui F Santos
- Department of Radiology, BC Children's Hospital & University of British Columbia, Vancouver, British Columbia, Canada
- Kamal Menghrajani
- NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, MD, USA
- Chidi Ahaghotu
- NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, MD, USA
- Darren P Sutherland
- Centre for Understanding and Preventing Infection in Children, Child and Family Research Institute, Vancouver, British Columbia, Canada
- Division of Infectious and Immunological Diseases, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- Edgardo S Fortuno
- Centre for Understanding and Preventing Infection in Children, Child and Family Research Institute, Vancouver, British Columbia, Canada
- Division of Infectious and Immunological Diseases, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- Tobias R Kollmann
- Centre for Understanding and Preventing Infection in Children, Child and Family Research Institute, Vancouver, British Columbia, Canada
- Division of Infectious and Immunological Diseases, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- Michelle Demos
- Division of Neurology, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- Jan M Friedman
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Genetics and Health Cluster, Child and Family Research Institute, BC Children's Hospital, Vancouver, British Columbia, Canada
- David P Speert
- Centre for Understanding and Preventing Infection in Children, Child and Family Research Institute, Vancouver, British Columbia, Canada
- Division of Infectious and Immunological Diseases, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- William A Gahl
- NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, MD, USA
- Cornelius F Boerkoel
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Genetics and Health Cluster, Child and Family Research Institute, BC Children's Hospital, Vancouver, British Columbia, Canada
- NIH Undiagnosed Diseases Program, NIH Office of Rare Diseases Research and NHGRI, Bethesda, MD, USA
|
15
|
Robinson PN, Köhler S, Oellrich A, Wang K, Mungall CJ, Lewis SE, Washington N, Bauer S, Seelow D, Krawitz P, Gilissen C, Haendel M, Smedley D. Improved exome prioritization of disease genes through cross-species phenotype comparison. Genome Res 2013; 24:340-8. [PMID: 24162188 PMCID: PMC3912424 DOI: 10.1101/gr.160325.113] [Citation(s) in RCA: 239] [Impact Index Per Article: 21.7] [Indexed: 12/29/2022]
Abstract
Numerous new disease-gene associations have been identified by whole-exome sequencing studies in the last few years. However, many cases remain unsolved due to the sheer number of candidate variants remaining after common filtering strategies such as removing low quality and common variants and those deemed unlikely to be pathogenic. The observation that each of our genomes contains about 100 genuine loss-of-function variants makes identification of the causative mutation problematic when using these strategies alone. We propose using the wealth of genotype to phenotype data that already exists from model organism studies to assess the potential impact of these exome variants. Here, we introduce PHenotypic Interpretation of Variants in Exomes (PHIVE), an algorithm that integrates the calculation of phenotype similarity between human diseases and genetically modified mouse models with evaluation of the variants according to allele frequency, pathogenicity, and mode of inheritance approaches in our Exomiser tool. Large-scale validation of PHIVE analysis using 100,000 exomes containing known mutations demonstrated a substantial improvement (up to 54.1-fold) over purely variant-based (frequency and pathogenicity) methods with the correct gene recalled as the top hit in up to 83% of samples, corresponding to an area under the ROC curve of >95%. We conclude that incorporation of phenotype data can play a vital role in translational bioinformatics and propose that exome sequencing projects should systematically capture clinical phenotypes to take advantage of the strategy presented here.
Affiliation(s)
- Peter N Robinson
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
|
16
|
Bromberg Y. Building a genome analysis pipeline to predict disease risk and prevent disease. J Mol Biol 2013; 425:3993-4005. [PMID: 23928561 DOI: 10.1016/j.jmb.2013.07.038] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 06/03/2013] [Revised: 07/26/2013] [Accepted: 07/28/2013] [Indexed: 12/24/2022]
Abstract
Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, the pipelines taking advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice.
Affiliation(s)
- Y Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08873, USA.
|
17
|
Pavlopoulos GA, Oulas A, Iacucci E, Sifrim A, Moreau Y, Schneider R, Aerts J, Iliopoulos I. Unraveling genomic variation from next generation sequencing data. BioData Min 2013; 6:13. [PMID: 23885890 PMCID: PMC3726446 DOI: 10.1186/1756-0381-6-13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 03/22/2013] [Accepted: 07/18/2013] [Indexed: 12/29/2022] Open
Abstract
Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas shaped by these technological advances are the evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment to serve these analyses and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. Finally, we comment on their functionality, strengths and weaknesses and discuss how future applications could further develop in this field.
Affiliation(s)
- Georgios A Pavlopoulos
- Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece.
|
18
|
Computational and bioinformatics frameworks for next-generation whole exome and genome sequencing. ScientificWorldJournal 2013; 2013:730210. [PMID: 23365548 PMCID: PMC3556895 DOI: 10.1155/2013/730210] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 10/28/2012] [Accepted: 11/22/2012] [Indexed: 12/28/2022] Open
Abstract
It has become increasingly apparent that one of the major hurdles in the genomic age will be the bioinformatics challenges of next-generation sequencing. We provide an overview of a general framework of bioinformatics analysis. For each of the three stages of (1) alignment, (2) variant calling, and (3) filtering and annotation, we describe the analysis required and survey the different software packages that are used. Furthermore, we discuss possible future developments as data sources grow and highlight opportunities for new bioinformatics tools to be developed.
|
19
|
Bioinformatic perspectives in the neuronal ceroid lipofuscinoses. Biochim Biophys Acta Mol Basis Dis 2012; 1832:1831-41. [PMID: 23274885 DOI: 10.1016/j.bbadis.2012.12.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/07/2012] [Revised: 12/16/2012] [Accepted: 12/19/2012] [Indexed: 02/06/2023]
Abstract
The neuronal ceroid lipofuscinoses (NCLs) are a group of rare genetic diseases characterised clinically by the progressive deterioration of mental, motor and visual functions and histopathologically by the intracellular accumulation of autofluorescent lipopigment - ceroid - in affected tissues. The NCLs are clinically and genetically heterogeneous and more than 14 genetically distinct NCL subtypes have been described to date (CLN1-CLN14) (Haltia and Goebel, 2012 [1]). In this review we will chronologically summarise work which has led over the years to identification of NCL genes, and outline the potential of novel genomic techniques and related bioinformatic approaches for further genetic dissection and diagnosis of NCLs. This article is part of a Special Issue entitled: The Neuronal Ceroid Lipofuscinoses or Batten Disease.
|
20
|
Exome-assistant: a rapid and easy detection of disease-related genes and genetic variations from exome sequencing. BMC Genomics 2012; 13:692. [PMID: 23231371 PMCID: PMC3539923 DOI: 10.1186/1471-2164-13-692] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/03/2012] [Accepted: 11/22/2012] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Protein-coding regions in human genes harbor 85% of the mutations that are associated with disease-related traits. Compared with whole-genome sequencing of complex samples, exome sequencing serves as an alternative option because of its dramatically reduced cost. In fact, exome sequencing has been successfully applied to identify the cause of several Mendelian disorders, such as Miller syndrome and Schinzel-Giedion syndrome. However, great challenges remain in handling the large volume of data generated by exome sequencing and in identifying potential disease-related genetic variations. RESULTS In this study, Exome-assistant (http://122.228.158.106/exomeassistant), a convenient tool for submitting and annotating single nucleotide polymorphisms (SNPs) and insertion/deletion variations (InDels), was developed to rapidly detect candidate disease-related genetic variations from exome sequencing projects. Exome-assistant provides versatile filter criteria to meet different users' requirements. It consists of four modules: the single case module, the two cases module, the multiple cases module, and the reanalysis module. The two cases and multiple cases modules allow users to identify sample-specific and common variations. The multiple cases module also supports family-based studies and Mendelian filtering. The identified candidate disease-related genetic variations can be annotated according to their sample features. CONCLUSIONS In summary, by exploring exome sequencing data, Exome-assistant can provide researchers with detailed biological insights into genetic variation events and permits the identification of potential genetic causes of human diseases and related traits.
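The distinction between sample-specific and common variations drawn in this abstract can be sketched with plain set operations. This is only an illustration under stated assumptions: the variant key (chromosome, position, reference allele, alternate allele), the sample names and both function names are hypothetical and are not taken from Exome-assistant itself.

```python
# Hypothetical variant keys: (chromosome, position, ref allele, alt allele).
samples = {
    "case1": {("1", 100, "A", "G"), ("2", 200, "C", "T"), ("3", 300, "G", "A")},
    "case2": {("1", 100, "A", "G"), ("2", 200, "C", "T")},
    "case3": {("1", 100, "A", "G"), ("4", 400, "T", "C")},
}


def common_variants(samples):
    """Variants present in every sample (set intersection)."""
    sets = list(samples.values())
    return set.intersection(*sets) if sets else set()


def sample_specific(samples, name):
    """Variants present in one sample and absent from all others."""
    others = set().union(*(v for k, v in samples.items() if k != name))
    return samples[name] - others


shared = common_variants(samples)
only_case1 = sample_specific(samples, "case1")
```

With the toy data above, one variant is shared by all three cases and one is private to `case1`.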
|
21
|
Sifrim A, Van Houdt JKJ, Tranchevent LC, Nowakowska B, Sakai R, Pavlopoulos GA, Devriendt K, Vermeesch JR, Moreau Y, Aerts J. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease. Genome Med 2012; 4:73. [PMID: 23013645 PMCID: PMC3580443 DOI: 10.1186/gm374] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 08/31/2012] [Revised: 09/14/2012] [Accepted: 09/26/2012] [Indexed: 12/18/2022] Open
Abstract
The increasing size and complexity of exome/genome sequencing data requires new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and not considering existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.
Affiliation(s)
- Alejandro Sifrim
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Jeroen KJ Van Houdt
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Leon-Charles Tranchevent
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Beata Nowakowska
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Ryo Sakai
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Georgios A Pavlopoulos
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Koen Devriendt
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Joris R Vermeesch
- KU Leuven, Centre for Human Genetics, University Hospital Gasthuisberg, Herestraat 49, 3000 Leuven, Belgium
- Yves Moreau
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- Jan Aerts
- KU Leuven, Department of Electrical Engineering-ESAT, SCD-SISTA, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
- IBBT Future Health Department, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium
|
22
|
Coutant S, Cabot C, Lefebvre A, Léonard M, Prieur-Gaston E, Campion D, Lecroq T, Dauchel H. EVA: Exome Variation Analyzer, an efficient and versatile tool for filtering strategies in medical genomics. BMC Bioinformatics 2012; 13 Suppl 14:S9. [PMID: 23095660 PMCID: PMC3439720 DOI: 10.1186/1471-2105-13-s14-s9] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 01/24/2023] Open
Abstract
Background Whole exome sequencing (WES) has become the strategy of choice to identify a coding allelic variant for a rare human monogenic disorder. This approach is a revolution in the history of medical genetics, impacting both fundamental research and diagnostic methods leading to personalized medicine. A plethora of efficient algorithms has been developed for variant discovery. They generally yield ~20,000 variations that have to be narrowed down to find the potential pathogenic allelic variant(s) and the affected gene(s). For this purpose, commonly adopted procedures involving various filtering strategies have emerged: exclusion of common variations, type of allelic variant, pathogenicity effect prediction, mode of inheritance and comparison of exomes from multiple individuals. To deal with the expansion of WES in individual medical genomics laboratories, new user-friendly and versatile software tools have to implement these filtering steps. Non-programmer biologists have to be able to combine different filtering criteria themselves and conduct a personal strategy depending on their assumptions and study design. Results We describe EVA (Exome Variation Analyzer), a user-friendly web-interfaced software dedicated to filtering strategies for medical WES. Through its different modules, EVA (i) integrates and stores annotated exome variation data as strictly confidential to the project owner, (ii) allows users to combine the main filters dealing with common variations, molecular types, inheritance mode and multiple samples, (iii) offers browsing of annotated data and filtered results in various interactive tables, graphical visualizations and statistical charts, and (iv) offers export files and cross-links to useful external databases and software for further prioritization of the small subset of sorted candidate variations and genes. We report a demonstrative case study that identified a new candidate gene related to a rare form of Alzheimer disease. Conclusions EVA was developed to be a user-friendly, versatile and efficient filtering-assistance software for WES. It constitutes a platform for data storage and for drastic screening of clinically relevant genetic variations by non-programmer geneticists. It thereby responds to new needs in the expanding era of medical genomics investigated by WES, for both fundamental research and clinical diagnostics.
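The idea of combining several filters into a user-defined strategy, as this abstract describes, can be sketched as a list of predicates applied to each variant record. Everything below is a hypothetical illustration: the record fields (`freq`, `type`, `carriers`), the thresholds and the function names are assumptions and do not reflect EVA's actual data model or interface.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Filter = Callable[[Record], bool]


def rare(max_freq: float) -> Filter:
    """Keep variants at or below a population frequency threshold."""
    return lambda r: r["freq"] <= max_freq


def of_type(allowed: set) -> Filter:
    """Keep variants whose molecular type is in the allowed set."""
    return lambda r: r["type"] in allowed


def seen_in_at_least(n: int) -> Filter:
    """Keep variants carried by at least n of the compared samples."""
    return lambda r: len(r["carriers"]) >= n


def apply_filters(records: List[Record], filters: List[Filter]) -> List[Record]:
    """Keep only records that pass every filter in the strategy."""
    return [r for r in records if all(f(r) for f in filters)]


# Hypothetical annotated variants.
records = [
    {"id": "v1", "freq": 0.001, "type": "missense", "carriers": {"p1", "p2"}},
    {"id": "v2", "freq": 0.15, "type": "missense", "carriers": {"p1", "p2"}},
    {"id": "v3", "freq": 0.0, "type": "synonymous", "carriers": {"p1", "p2"}},
    {"id": "v4", "freq": 0.002, "type": "nonsense", "carriers": {"p1"}},
]
strategy = [rare(0.01), of_type({"missense", "nonsense"}), seen_in_at_least(2)]
kept = [r["id"] for r in apply_filters(records, strategy)]
```

Keeping each criterion as an independent predicate lets a strategy be reordered or extended without changing the filtering loop.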
Affiliation(s)
- Sophie Coutant
- University of Rouen, INSERM U1079 Molecular genetics of cancer and neuropsychiatric diseases, 76183 Rouen cedex, France
|
23
|
Majewski J, Rosenblatt DS. Exome and whole-genome sequencing for gene discovery: The future is now! Hum Mutat 2012; 33:591-2. [DOI: 10.1002/humu.22055] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/17/2022]
|
24
|
Adams DR, Sincan M, Fuentes Fajardo K, Mullikin JC, Pierson TM, Toro C, Boerkoel CF, Tifft CJ, Gahl WA, Markello TC. Analysis of DNA sequence variants detected by high-throughput sequencing. Hum Mutat 2012; 33:599-608. [PMID: 22290882 DOI: 10.1002/humu.22035] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 08/11/2011] [Accepted: 12/02/2011] [Indexed: 12/18/2022]
Abstract
The Undiagnosed Diseases Program at the National Institutes of Health uses high-throughput sequencing (HTS) to diagnose rare and novel diseases. HTS techniques generate large numbers of DNA sequence variants, which must be analyzed and filtered to find candidates for disease causation. Despite the publication of an increasing number of successful exome-based projects, there has been little formal discussion of the analytic steps applied to HTS variant lists. We present the results of our experience with over 30 families for whom HTS was used in an attempt to find clinical diagnoses. For each family, exome sequence was augmented with high-density SNP-array data. We present a discussion of the theory and practical application of each analytic step and provide example data to illustrate our approach. The article is designed to provide an analytic roadmap for variant analysis, thereby enabling a wide range of researchers and clinical genetics practitioners to perform direct analysis of HTS data for their patients and projects.
Affiliation(s)
- David R Adams
- NIH Undiagnosed Diseases Program, NIH, Bethesda, Maryland, USA.
|