1
Dolzhenko E, Deshpande V, Schlesinger F, Krusche P, Petrovski R, Chen S, Emig-Agius D, Gross A, Narzisi G, Bowman B, Scheffler K, van Vugt JJFA, French C, Sanchis-Juan A, Ibáñez K, Tucci A, Lajoie BR, Veldink JH, Raymond FL, Taft RJ, Bentley DR, Eberle MA. ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions. Bioinformatics 2019; 35:4754-4756. [PMID: 31134279 PMCID: PMC6853681 DOI: 10.1093/bioinformatics/btz431] [Citation(s) in RCA: 152] [Impact Index Per Article: 38.0] [Received: 03/10/2019] [Revised: 04/26/2019] [Accepted: 05/23/2019] [Indexed: 12/16/2022] Open
Abstract
SUMMARY: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci. AVAILABILITY AND IMPLEMENTATION: ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Peter Krusche
  - Illumina Cambridge Ltd, Illumina Centre, 19 Granta Park, Great Abington, Cambridge CB21 6DF, UK
- Roman Petrovski
  - Illumina Cambridge Ltd, Illumina Centre, 19 Granta Park, Great Abington, Cambridge CB21 6DF, UK
- Sai Chen
  - Illumina Inc., San Diego, CA 92122, USA
- Giuseppe Narzisi
  - Computational Biology, New York Genome Center, New York, NY 10013, USA
- Joke J F A van Vugt
  - UMC Utrecht Brain Center, Utrecht University, 3508 AB Utrecht, The Netherlands
- Courtney French
  - Department of Medical Genetics, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, UK
- Alba Sanchis-Juan
  - Department of Haematology, University of Cambridge, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, UK
  - NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
- Kristina Ibáñez
  - Genomics England, Queen Mary University London, London EC1M 6BQ, UK
- Arianna Tucci
  - Genomics England, Queen Mary University London, London EC1M 6BQ, UK
- Jan H Veldink
  - UMC Utrecht Brain Center, Utrecht University, 3508 AB Utrecht, The Netherlands
- F Lucy Raymond
  - Department of Medical Genetics, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, UK
- David R Bentley
  - Illumina Cambridge Ltd, Illumina Centre, 19 Granta Park, Great Abington, Cambridge CB21 6DF, UK
2
Chen S, Krusche P, Dolzhenko E, Sherman RM, Petrovski R, Schlesinger F, Kirsche M, Bentley DR, Schatz MC, Sedlazeck FJ, Eberle MA. Paragraph: a graph-based structural variant genotyper for short-read sequence data. Genome Biol 2019; 20:291. [PMID: 31856913 PMCID: PMC6921448 DOI: 10.1186/s13059-019-1909-7] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 05/28/2019] [Accepted: 12/02/2019] [Indexed: 12/30/2022] Open
Abstract
Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies.
Affiliation(s)
- Sai Chen
  - Illumina Inc, 5200 Illumina Way, San Diego, CA USA
- Peter Krusche
  - Illumina Cambridge Ltd, Chesterford Research Park, Little Chesterford, UK
  - Novartis Pharma AG, Basel, Switzerland
- Rachel M. Sherman
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD USA
- Roman Petrovski
  - Illumina Cambridge Ltd, Chesterford Research Park, Little Chesterford, UK
- Melanie Kirsche
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD USA
- David R. Bentley
  - Illumina Cambridge Ltd, Chesterford Research Park, Little Chesterford, UK
- Michael C. Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD USA
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY USA
- Fritz J. Sedlazeck
  - Baylor College of Medicine Human Genome Sequencing Center, Houston, TX USA
3
Dolzhenko E, Deshpande V, Schlesinger F, Krusche P, Petrovski R, Chen S, Emig-Agius D, Gross A, Narzisi G, Bowman B, Scheffler K, van Vugt JJFA, French C, Sanchis-Juan A, Ibáñez K, Tucci A, Lajoie BR, Veldink JH, Raymond FL, Taft RJ, Bentley DR, Eberle MA. ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions. Bioinformatics 2019; 35:4754-4756. [PMID: 31134279 DOI: 10.1101/361162] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/10/2019] [Revised: 04/26/2019] [Accepted: 05/23/2019] [Indexed: 05/25/2023]
4
Raczy C, Petrovski R, Saunders CT, Chorny I, Kruglyak S, Margulies EH, Chuang HY, Källberg M, Kumar SA, Liao A, Little KM, Strömberg MP, Tanner SW. Isaac: ultra-fast whole-genome secondary analysis on Illumina sequencing platforms. Bioinformatics 2013; 29:2041-3. [PMID: 23736529 DOI: 10.1093/bioinformatics/btt314] [Citation(s) in RCA: 219] [Impact Index Per Article: 19.9] [Indexed: 01/12/2023]
Abstract
SUMMARY: An ultrafast DNA sequence aligner (Isaac Genome Alignment Software) that takes advantage of high-memory hardware (>48 GB) and variant caller (Isaac Variant Caller) have been developed. We demonstrate that our combined pipeline (Isaac) is four to five times faster than BWA + GATK on equivalent hardware, with comparable accuracy as measured by trio conflict rates and sensitivity. We further show that Isaac is effective in the detection of disease-causing variants and can easily/economically be run on commodity hardware. AVAILABILITY: Isaac has an open source license and can be obtained at https://github.com/sequencing.
Affiliation(s)
- Come Raczy
  - Illumina United Kingdom, Chesterford Research Park, Little Chesterford, Nr Saffron Walden, Essex, UK.
5
Sukarov P, Marković J, Petrovski R. [Measurement of impedance in otology]. Med Glas 1970; 24:215-9. [PMID: 5205567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2023]