Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Yi M, Zhao Y, Jia L, He M, Kebebew E, Stephens RM. Performance comparison of SNP detection tools with illumina exome sequencing data--an assessment using both family pedigree information and sample-matched SNP array data. Nucleic Acids Res 2014;42:e101. [PMID: 24831545 PMCID: PMC4081058 DOI: 10.1093/nar/gku392] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 12/06/2013] [Revised: 03/27/2014] [Accepted: 04/22/2014] [Indexed: 12/30/2022] Open

Cited by Other Article(s)

Baumann A, Ruckert C, Meier C, Hutschenreiter T, Remy R, Schnur B, Döbel M, Fankep RCN, Skowronek D, Kutz O, Arnold N, Katzke AL, Forster M, Kobiela AL, Thiedig K, Zimmer A, Ritter J, Weber BHF, Honisch E, Hackmann K, Schmidt G, Sturm M, Ernst C. Limitations in next-generation sequencing-based genotyping of breast cancer polygenic risk score loci. Eur J Hum Genet 2024;32:987-997. [PMID: 38907004 PMCID: PMC11291653 DOI: 10.1038/s41431-024-01647-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 05/17/2024] [Accepted: 06/10/2024] [Indexed: 06/23/2024] Open

Affiliation(s)

Alexandra Baumann Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany German Cancer Consortium (DKTK), Dresden, Germany German Cancer Research Center (DKFZ), Heidelberg, Germany Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
Christian Ruckert Department of Medical Genetics, University Hospital Münster, Münster, Germany
Christoph Meier Institute of Human Genetics, University of Regensburg, Regensburg, Germany
Tim Hutschenreiter Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany German Cancer Consortium (DKTK), Dresden, Germany German Cancer Research Center (DKFZ), Heidelberg, Germany Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
Robert Remy Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany
Benedikt Schnur Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
Marvin Döbel Institute of Medical Genetics and Applied Genomics, University Hospital Tübingen, Tübingen, Germany
Rudel Christian Nkouamedjo Fankep Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany
Dariush Skowronek Department of Human Genetics, University Medicine Greifswald and Interfaculty Institute of Genetics and Functional Genomics, University of Greifswald, Greifswald, Germany
Oliver Kutz Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany German Cancer Consortium (DKTK), Dresden, Germany German Cancer Research Center (DKFZ), Heidelberg, Germany Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany Department of Gynecology and Obstetrics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany
Norbert Arnold Department of Gynecology and Obstetrics, Institute of Clinical Chemistry Institute of Clinical Molecular Biology, University Hospital Schleswig-Holstein, Campus Kiel, Kiel, Germany
Anna-Lena Katzke Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
Michael Forster Department of Gynecology and Obstetrics, Institute of Clinical Chemistry Institute of Clinical Molecular Biology, University Hospital Schleswig-Holstein, Campus Kiel, Kiel, Germany
Anna-Lena Kobiela Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany
Katharina Thiedig Division of Gynaecology and Obstetrics, Klinikum rechts der Isar der Technischen Universität München, München, Germany
Andreas Zimmer Institute for Human Genetics, Medical Center University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
Julia Ritter Department of Human Genetics, Labor Berlin - Charité Vivantes GmbH, Berlin, Germany
Bernhard H F Weber Institute of Human Genetics, University of Regensburg, Regensburg, Germany Institute of Clinical Human Genetics, University Hospital Regensburg, Regensburg, Germany
Ellen Honisch Department of Gynaecology and Obstetrics, University Hospital Düsseldorf, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
Karl Hackmann Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany German Cancer Consortium (DKTK), Dresden, Germany German Cancer Research Center (DKFZ), Heidelberg, Germany Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
Gunnar Schmidt Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
Marc Sturm Institute of Medical Genetics and Applied Genomics, University Hospital Tübingen, Tübingen, Germany
Corinna Ernst Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany.

Hofmeister RJ, Ribeiro DM, Rubinacci S, Delaneau O. Accurate rare variant phasing of whole-genome and whole-exome sequencing data in the UK Biobank. Nat Genet 2023:10.1038/s41588-023-01415-w. [PMID: 37386248 DOI: 10.1038/s41588-023-01415-w] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 10/27/2022] [Accepted: 05/04/2023] [Indexed: 07/01/2023]

Lefouili M, Nam K. The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species. Sci Rep 2022;12:11331. [PMID: 35790846 PMCID: PMC9256665 DOI: 10.1038/s41598-022-15563-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/22/2021] [Accepted: 06/27/2022] [Indexed: 11/09/2022] Open

Zhang C, Zheng T, Ma Q, Yang L, Zhang M, Wang J, Teng X, Miao Y, Lin HC, Yang Y, Han D. Logical Analysis of Multiple Single-Nucleotide-Polymorphisms with Programmable DNA Molecular Computation for Clinical Diagnostics. Angew Chem Int Ed Engl 2022;61:e202117658. [PMID: 35137499 DOI: 10.1002/anie.202117658] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/26/2021] [Indexed: 11/07/2022]

Affiliation(s)

Chao Zhang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Tingting Zheng Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Qian Ma Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Linlin Yang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Mingzhi Zhang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Junyan Wang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Xiaoyan Teng Department of Laboratory Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 201306, China
Yanyan Miao Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Hsiao-Chu Lin Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
Yang Yang Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, 200433, China
Da Han Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, State Key Laboratory of Oncogenes and Related Genes, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China

Zhang C, Zheng T, Ma Q, Yang L, Zhang M, Wang J, Teng X, Miao Y, Lin H, Yang Y, Han D. Logical Analysis of Multiple Single‐Nucleotide‐Polymorphisms with Programmable DNA Molecular Computation for Clinical Diagnostics. Angew Chem Int Ed Engl 2022. [DOI: 10.1002/ange.202117658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]

Affiliation(s)

Chao Zhang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Tingting Zheng Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Qian Ma Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Linlin Yang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Mingzhi Zhang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Junyan Wang Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Xiaoyan Teng Department of Laboratory Medicine Shanghai Jiao Tong University Affiliated Sixth People's Hospital Shanghai 201306 China
Yanyan Miao Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Hsiao‐chu Lin Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China
Yang Yang Department of Thoracic Surgery Shanghai Pulmonary Hospital Tongji University School of Medicine Shanghai 200433 China
Da Han Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine State Key Laboratory of Oncogenes and Related Genes Renji Hospital School of Medicine Shanghai Jiao Tong University Shanghai 200127 China

Zanti M, Michailidou K, Loizidou MA, Machattou C, Pirpa P, Christodoulou K, Spyrou GM, Kyriacou K, Hadjisavvas A. Performance evaluation of pipelines for mapping, variant calling and interval padding, for the analysis of NGS germline panels. BMC Bioinformatics 2021;22:218. [PMID: 33910496 PMCID: PMC8080428 DOI: 10.1186/s12859-021-04144-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2021] [Accepted: 04/15/2021] [Indexed: 11/10/2022] Open

Abstract

Background

Next-generation sequencing (NGS) represents a significant advancement in clinical genetics. However, its use creates several technical, data interpretation and management challenges. It is essential to follow a consistent data analysis pipeline to achieve the highest possible accuracy and avoid false variant calls. Herein, we aimed to compare the performance of twenty-eight combinations of NGS data analysis pipeline compartments, including short-read mapping (BWA-MEM, Bowtie2, Stampy), variant calling (GATK-HaplotypeCaller, GATK-UnifiedGenotyper, SAMtools) and interval padding (null, 50 bp, 100 bp) methods, along with a commercially available pipeline (BWA Enrichment, Illumina®). Fourteen germline DNA samples from breast cancer patients were sequenced using a targeted NGS panel approach and subjected to data analysis.

Results

We highlight that interval padding is required for the accurate detection of intronic variants, including spliceogenic pathogenic variants (PVs). In addition, using nearly default parameters, the BWA Enrichment algorithm failed to detect these spliceogenic PVs and a missense PV in the TP53 gene. We also recommend the BWA-MEM algorithm for sequence alignment, whereas variant calling should be performed using a combination of variant calling algorithms: GATK-HaplotypeCaller and SAMtools for the accurate detection of insertions/deletions, and GATK-UnifiedGenotyper for the efficient detection of single nucleotide variants.

Conclusions

These findings have important implications towards the identification of clinically actionable variants through panel testing in a clinical laboratory setting, when dedicated bioinformatics personnel might not always be available. The results also reveal the necessity of improving the existing tools and/or at the same time developing new pipelines to generate more reliable and more consistent data.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-04144-1.
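As an illustration of the interval padding discussed in this abstract, the following minimal Python sketch extends target regions by a fixed number of base pairs on each side and merges any regions that then overlap. It is not the authors' pipeline or any tool's actual implementation; the function name `pad_intervals`, the BED-style half-open coordinates, and the example coordinates are assumptions made for illustration, and chromosome lengths are not checked.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), BED-style half-open.
Interval = Tuple[str, int, int]


def pad_intervals(intervals: List[Interval], padding: int) -> List[Interval]:
    """Extend each interval by `padding` bp on both sides, then merge overlaps."""
    # Pad every interval, clamping the start at 0, and sort by chromosome/start.
    padded = sorted(
        (chrom, max(0, start - padding), end + padding)
        for chrom, start, end in intervals
    )
    merged: List[Interval] = []
    for chrom, start, end in padded:
        # Merge with the previous interval if they overlap or touch.
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    # Made-up exon coordinates; 0, 50 and 100 bp mirror the padding
    # settings named in the abstract (null, 50 bp, 100 bp).
    exons = [("chr17", 1000, 1200), ("chr17", 1250, 1400)]
    for pad in (0, 50, 100):
        print(pad, pad_intervals(exons, pad))
```

With padding, intronic positions just outside the exon boundaries fall inside the analysed regions, which is why padded runs can report splice-site variants that an unpadded run would miss.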

Quinodoz M, Peter VG, Bedoni N, Royer Bertrand B, Cisarova K, Salmaninejad A, Sepahi N, Rodrigues R, Piran M, Mojarrad M, Pasdar A, Ghanbari Asad A, Sousa AB, Coutinho Santos L, Superti-Furga A, Rivolta C. AutoMap is a high performance homozygosity mapping tool using next-generation sequencing data. Nat Commun 2021;12:518. [PMID: 33483490 PMCID: PMC7822856 DOI: 10.1038/s41467-020-20584-4] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Received: 03/17/2020] [Accepted: 12/09/2020] [Indexed: 12/11/2022] Open

Affiliation(s)

Mathieu Quinodoz Institute of Molecular and Clinical Ophthalmology Basel (IOB), Basel, Switzerland.,Department of Ophthalmology, University of Basel, Basel, Switzerland.,Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
Virginie G Peter Institute of Molecular and Clinical Ophthalmology Basel (IOB), Basel, Switzerland.,Department of Ophthalmology, University of Basel, Basel, Switzerland.,Department of Genetics and Genome Biology, University of Leicester, Leicester, UK.,Institute of Experimental Pathology, Lausanne University Hospital (CHUV), Lausanne, Switzerland
Nicola Bedoni Service of Medical Genetics, Lausanne University Hospital (CHUV), Lausanne, Switzerland
Béryl Royer Bertrand Service of Medical Genetics, Lausanne University Hospital (CHUV), Lausanne, Switzerland
Katarina Cisarova Service of Medical Genetics, Lausanne University Hospital (CHUV), Lausanne, Switzerland
Arash Salmaninejad Department of Medical Genetics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
Neda Sepahi Noncommunicable Diseases Research Center, Fasa University of Sciences, Fasa, Iran
Raquel Rodrigues Department of Medical Genetics, Hospital Santa Maria, Centro Hospitalar Universitário Lisboa Norte (CHULN), Lisbon Academic Medical Center (CAML), Lisbon, Portugal
Mehran Piran Noncommunicable Diseases Research Center, Fasa University of Sciences, Fasa, Iran.,Bioinformatics and Computational Biology Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
Majid Mojarrad Department of Medical Genetics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
Alireza Pasdar Department of Medical Genetics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.,Division of Applied Medicine, Medical School, University of Aberdeen, Aberdeen, UK
Ali Ghanbari Asad Noncommunicable Diseases Research Center, Fasa University of Sciences, Fasa, Iran
Ana Berta Sousa Department of Medical Genetics, Hospital Santa Maria, Centro Hospitalar Universitário Lisboa Norte (CHULN), Lisbon Academic Medical Center (CAML), Lisbon, Portugal.,Medical Faculty, Lisbon University, Lisbon, Portugal
Luisa Coutinho Santos Instituto de Oftalmologia Dr Gama Pinto, Lisbon, Portugal
Andrea Superti-Furga Service of Medical Genetics, Lausanne University Hospital (CHUV), Lausanne, Switzerland
Carlo Rivolta Institute of Molecular and Clinical Ophthalmology Basel (IOB), Basel, Switzerland.,Department of Ophthalmology, University of Basel, Basel, Switzerland.,Department of Genetics and Genome Biology, University of Leicester, Leicester, UK

Molina-Mora JA, Solano-Vargas M. Set-theory based benchmarking of three different variant callers for targeted sequencing. BMC Bioinformatics 2021;22:20. [PMID: 33413082 PMCID: PMC7791862 DOI: 10.1186/s12859-020-03926-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2019] [Accepted: 12/09/2020] [Indexed: 12/05/2022] Open

Abstract

Background

Next generation sequencing (NGS) technologies have improved the study of hereditary diseases. Since the evaluation of bioinformatics pipelines is not straightforward, NGS demands effective strategies to analyze data that is of paramount relevance for decision making under a clinical scenario. According to the benchmarking framework of the Global Alliance for Genomics and Health (GA4GH), we implemented a new simple and user-friendly set-theory based method to assess variant callers using a gold standard variant set and high confidence regions. As a model, we used TruSight Cardio kit sequencing data of the reference genome NA12878. This targeted sequencing kit is used to identify variants in key genes related to Inherited Cardiac Conditions (ICCs), a group of cardiovascular diseases with high rates of morbidity and mortality.

Results

We implemented and compared three variant calling pipelines (Isaac, Freebayes, and VarScan). Performance metrics using our set-theory approach showed high-resolution pipelines and revealed: (1) a perfect recall of 1.000 for all three pipelines, (2) very high precision values, i.e. 0.987 for Freebayes, 0.928 for VarScan, and 1.000 for Isaac, when compared with the reference material, and (3) a ROC curve analysis with AUC > 0.94 for all cases. Moreover, significant differences were obtained between the three pipelines. In general, results indicate that the three pipelines were able to recognize the expected variants in the gold standard data set.

Conclusions

Our set-theory approach to calculating metrics showed that the three selected pipelines identified the expected ICC-related variants, but results were completely dependent on the algorithms. We emphasize the importance of assessing pipelines using gold standard materials to achieve the most reliable results for clinical application.
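The set-theory idea behind the precision and recall figures in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' published method: the function name `benchmark_calls`, the `(chrom, pos, ref, alt)` variant key, and the toy variants are assumptions, and both sets are assumed to have already been restricted to the high confidence regions.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def benchmark_calls(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare a call set against a gold standard set using set operations."""
    true_pos = called & truth    # intersection: called and in the gold standard
    false_pos = called - truth   # difference: called but not in the gold standard
    false_neg = truth - called   # difference: in the gold standard but missed
    precision = len(true_pos) / len(called) if called else 0.0
    recall = len(true_pos) / len(truth) if truth else 0.0
    return {
        "tp": len(true_pos),
        "fp": len(false_pos),
        "fn": len(false_neg),
        "precision": precision,
        "recall": recall,
    }


if __name__ == "__main__":
    # Toy data, not taken from the study.
    truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")}
    called = truth | {("chr3", 10, "A", "T")}  # every true variant plus one extra call
    print(benchmark_calls(called, truth))
```

In this toy case every gold standard variant is recovered, so recall is 1.0, while the single extra call lowers precision to 0.8; this is the same pattern as a pipeline with perfect recall but precision below 1.000.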

Alosaimi S, van Biljon N, Awany D, Thami PK, Defo J, Mugo JW, Bope CD, Mazandu GK, Mulder NJ, Chimusa ER. Simulation of African and non-African low and high coverage whole genome sequence data to assess variant calling approaches. Brief Bioinform 2020;22:6042242. [PMID: 33341897 DOI: 10.1093/bib/bbaa366] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/25/2020] [Revised: 11/14/2020] [Accepted: 01/08/2020] [Indexed: 12/15/2022] Open

Affiliation(s)

Shatha Alosaimi Faculty of Health Sciences, Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town, South Africa
Noëlle van Biljon Department of Statistical Sciences, University of Cape Town, Cape Town, South Africa
Denis Awany Faculty of Health Sciences, Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town, South Africa
Prisca K Thami Faculty of Health Sciences, Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town, South Africa
Joel Defo Faculty of Health Sciences, Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town, South Africa
Jacquiline W Mugo Faculty of Health Sciences, Division of Computational Biology, Department of Biomedical Sciences, University of Cape Town, Cape Town, South Africa
Christian D Bope Faculty of Sciences, Department of Mathematics and Computer Science, University of Kinshasa, Kinshasa, DRC
Gaston K Mazandu Faculty of Health Sciences, Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town, South Africa.,Faculty of Health Sciences, Division of Computational Biology, Department of Biomedical Sciences, University of Cape Town, Cape Town, South Africa
Nicola J Mulder Faculty of Health Sciences, Division of Computational Biology, Department of Biomedical Sciences, University of Cape Town, Cape Town, South Africa.,Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Anzio Road, Observatory, Cape Town 7925, South Africa
Emile R Chimusa Faculty of Health Sciences, Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town, South Africa.,Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Anzio Road, Observatory, Cape Town 7925, South Africa

DeepVariant-on-Spark: Small-Scale Genome Analysis Using a Cloud-Based Computing Framework. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2020;2020:7231205. [PMID: 32952600 PMCID: PMC7481958 DOI: 10.1155/2020/7231205] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/30/2020] [Revised: 08/15/2020] [Accepted: 08/21/2020] [Indexed: 12/18/2022]

Kumaran M, Subramanian U, Devarajan B. Performance assessment of variant calling pipelines using human whole exome sequencing and simulated data. BMC Bioinformatics 2019;20:342. [PMID: 31208315 PMCID: PMC6580603 DOI: 10.1186/s12859-019-2928-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 07/14/2018] [Accepted: 05/31/2019] [Indexed: 12/30/2022] Open

Gonda I, Ashrafi H, Lyon DA, Strickler SR, Hulse-Kemp AM, Ma Q, Sun H, Stoffel K, Powell AF, Futrell S, Thannhauser TW, Fei Z, Van Deynze AE, Mueller LA, Giovannoni JJ, Foolad MR. Sequencing-Based Bin Map Construction of a Tomato Mapping Population, Facilitating High-Resolution Quantitative Trait Loci Detection. THE PLANT GENOME 2019;12:180010. [PMID: 30951101 DOI: 10.3835/plantgenome2018.02.0010] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Indexed: 05/21/2023]

Dharanipragada P, Seelam SR, Parekh N. SeqVItA: Sequence Variant Identification and Annotation Platform for Next Generation Sequencing Data. Front Genet 2018;9:537. [PMID: 30487811 PMCID: PMC6247818 DOI: 10.3389/fgene.2018.00537] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/25/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022] Open

Abstract

The current trend in clinical data analysis is to understand how individuals respond to therapies and drug interactions based on their genetic makeup. This has led to a paradigm shift in healthcare; caring for patients is now 99% information and 1% intervention. The falling cost of next generation sequencing (NGS) technologies has made it possible to take genetic profiling to the clinical setting. This requires not just fast and accurate algorithms for variant detection, but also a knowledge-base for variant annotation and prioritization to facilitate tailored therapeutics based on an individual's genetic profile. Here we show that it is possible to provide fast and easy access to all possible information about a variant and its impact on the gene, its protein product, associated pathways and drug-variant interactions by integrating previously reported knowledge from various databases. With this objective, we have developed a pipeline, Sequence Variants Identification and Annotation (SeqVItA), that provides an end-to-end solution for small sequence variant detection, annotation and prioritization on a single platform. With a parallelized variant detection step and numerous resources incorporated to infer functional impact, clinical relevance and drug-variant associations, SeqVItA will benefit the clinical and research communities alike. Its open-source platform and modular framework allow for easy customization of the workflow depending on the data type (single, paired, or pooled samples), variant type (germline and somatic), and variant annotation and prioritization. Performance comparison of SeqVItA on simulated data and detection, interpretation and analysis of somatic variants on real data (24 liver cancer patients) is carried out. We demonstrate the efficacy of the annotation module in facilitating personalized medicine based on a patient's mutational landscape. SeqVItA is freely available at https://bioinf.iiit.ac.in/seqvita.

Zhang C, Liu X, Yao Y, Liu K, Hui W, Zhu J, Dou Y, Hua K, Peng M, Wang Z, Vermorken AJM, Cui Y. Genotyping of Multiple Clinical Samples with a Combined Direct PCR and Magnetic Lateral Flow Assay. iScience 2018;7:170-179. [PMID: 30245369 PMCID: PMC6153416 DOI: 10.1016/j.isci.2018.09.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/17/2018] [Revised: 08/19/2018] [Accepted: 09/05/2018] [Indexed: 02/09/2023] Open

Smith SD, Kawash JK, Grigoriev A. Lightning-fast genome variant detection with GROM. Gigascience 2018;6:1-7. [PMID: 29048532 PMCID: PMC5737730 DOI: 10.1093/gigascience/gix091] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/07/2017] [Accepted: 09/13/2017] [Indexed: 12/30/2022] Open

Tuzov N. A framework for the estimation of the proportion of true discoveries in single nucleotide variant detection studies for human data. PLoS One 2018;13:e0196058. [PMID: 29694377 PMCID: PMC5918994 DOI: 10.1371/journal.pone.0196058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/23/2017] [Accepted: 04/05/2018] [Indexed: 12/30/2022] Open

Takamatsu T, Baslam M, Inomata T, Oikawa K, Itoh K, Ohnishi T, Kinoshita T, Mitsui T. Optimized Method of Extracting Rice Chloroplast DNA for High-Quality Plastome Resequencing and de Novo Assembly. FRONTIERS IN PLANT SCIENCE 2018;9:266. [PMID: 29541088 PMCID: PMC5835797 DOI: 10.3389/fpls.2018.00266] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 05/16/2023]

Ma T, Zhang A. Omics Informatics: From Scattered Individual Software Tools to Integrated Workflow Management Systems. IEEE/ACM Trans Comput Biol Bioinform 2017;14:926-946. [PMID: 26930689 DOI: 10.1109/tcbb.2016.2535251] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Perez M, Juniper SK. Is the trophosome of Ridgeia piscesae monoclonal? Symbiosis 2017. [DOI: 10.1007/s13199-017-0490-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Pantazatos SP, Huang YY, Rosoklija GB, Dwork AJ, Arango V, Mann JJ. Whole-transcriptome brain expression and exon-usage profiling in major depression and suicide: evidence for altered glial, endothelial and ATPase activity. Mol Psychiatry 2017;22:760-773. [PMID: 27528462 PMCID: PMC5313378 DOI: 10.1038/mp.2016.130] [Citation(s) in RCA: 142] [Impact Index Per Article: 20.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/11/2015] [Revised: 06/04/2016] [Accepted: 06/07/2016] [Indexed: 12/30/2022]

Abstract

Brain gene expression profiling studies of suicide and depression using oligonucleotide microarrays have often failed to distinguish these two phenotypes. Moreover, next generation sequencing approaches are more accurate in quantifying gene expression and can detect alternative splicing. Using RNA-seq, we examined whole-exome gene and exon expression in non-psychiatric controls (CON, N=29), DSM-IV major depressive disorder suicides (MDD-S, N=21) and MDD non-suicides (MDD, N=9) in the dorsal lateral prefrontal cortex (Brodmann Area 9) of sudden death medication-free individuals post mortem. Using small RNA-seq, we also examined miRNA expression (nine samples per group). DESeq2 identified 35 genes differentially expressed between groups and surviving adjustment for false discovery rate (adjusted P<0.1). In depression, altered genes include humanin-like-8 (MTRNRL8), interleukin-8 (IL8), and serpin peptidase inhibitor, clade H (SERPINH1) and chemokine ligand 4 (CCL4), while exploratory gene ontology (GO) analyses revealed lower expression of immune-related pathways such as chemokine receptor activity, chemotaxis and cytokine biosynthesis, and angiogenesis and vascular development in (adjusted P<0.1). Hypothesis-driven GO analysis suggests lower expression of genes involved in oligodendrocyte differentiation, regulation of glutamatergic neurotransmission, and oxytocin receptor expression in both suicide and depression, and provisional evidence for altered DNA-dependent ATPase expression in suicide only. DEXSeq analysis identified differential exon usage in ATPase, class II, type 9B (adjusted P<0.1) in depression. Differences in miRNA expression or structural gene variants were not detected. Results lend further support for models in which deficits in microglial, endothelial (blood-brain barrier), ATPase activity and astrocytic cell functions contribute to MDD and suicide, and identify putative pathways and mechanisms for further study in these disorders.


Levano S, Gonzalez A, Singer M, Demougin P, Rüffert H, Urwyler A, Girard T. Resequencing array for gene variant detection in malignant hyperthermia and butyrylcholinestherase deficiency. Neuromuscul Disord 2017;27:492-499. [PMID: 28259615 DOI: 10.1016/j.nmd.2017.02.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2016] [Revised: 12/20/2016] [Accepted: 02/15/2017] [Indexed: 11/30/2022]

PEMapper and PECaller provide a simplified approach to whole-genome sequencing. Proc Natl Acad Sci U S A 2017;114:E1923-E1932. [PMID: 28223510 DOI: 10.1073/pnas.1618065114] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open

Brumme CJ, Poon AFY. Promises and pitfalls of Illumina sequencing for HIV resistance genotyping. Virus Res 2016;239:97-105. [PMID: 27993623 DOI: 10.1016/j.virusres.2016.12.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2016] [Revised: 12/15/2016] [Accepted: 12/15/2016] [Indexed: 12/13/2022]

Rudewicz J, Soueidan H, Uricaru R, Bonnefoi H, Iggo R, Bergh J, Nikolski M. MICADo - Looking for Mutations in Targeted PacBio Cancer Data: An Alignment-Free Method. Front Genet 2016;7:214. [PMID: 28008336 PMCID: PMC5143680 DOI: 10.3389/fgene.2016.00214] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2016] [Accepted: 11/23/2016] [Indexed: 12/11/2022] Open

Tian S, Yan H, Neuhauser C, Slager SL. An analytical workflow for accurate variant discovery in highly divergent regions. BMC Genomics 2016;17:703. [PMID: 27590916 PMCID: PMC5010666 DOI: 10.1186/s12864-016-3045-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2016] [Accepted: 08/25/2016] [Indexed: 02/07/2023] Open

Abstract

Background

Current variant discovery methods often start with the mapping of short reads to a reference genome; yet, their performance deteriorates in genomic regions where the reads are highly divergent from the reference sequence. This is particularly problematic for the human leukocyte antigen (HLA) region on chromosome 6p21.3. This region is associated with over 100 diseases, but variant calling is hindered by the extreme divergence across different haplotypes.

Results

We simulated reads from chromosome 6 exonic regions over a wide range of sequence divergence and coverage depth. We systematically assessed combinations between five mappers and five callers for their performance on simulated data and exome-seq data from NA12878, a well-studied individual in which multiple public call sets have been generated. Among those combinations, the number of known SNPs differed by about 5 % in the non-HLA regions of chromosome 6 but over 20 % in the HLA region. Notably, GSNAP mapping combined with GATK UnifiedGenotyper calling identified about 20 % more known SNPs than most existing methods without a noticeable loss of specificity, with 100 % sensitivity in three highly polymorphic HLA genes examined. Much larger differences were observed among these combinations in INDEL calling from both non-HLA and HLA regions. We obtained similar results with our internal exome-seq data from a cohort of chronic lymphocytic leukemia patients.

Conclusions

We have established a workflow enabling variant detection, with high sensitivity and specificity, over the full spectrum of divergence seen in the human genome. Comparing to public call sets from NA12878 has highlighted the overall superiority of GATK UnifiedGenotyper, followed by GATK HaplotypeCaller and SAMtools, in SNP calling, and of GATK HaplotypeCaller and Platypus in INDEL calling, particularly in regions of high sequence divergence such as the HLA region. GSNAP and Novoalign are the ideal mappers in combination with the above callers. We expect that the proposed workflow should be applicable to variant discovery in other highly divergent regions.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-3045-z) contains supplementary material, which is available to authorized users.


Humble E, Thorne MAS, Forcada J, Hoffman JI. Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success. BMC Res Notes 2016;9:418. [PMID: 27562535 PMCID: PMC5000416 DOI: 10.1186/s13104-016-2209-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2016] [Accepted: 08/06/2016] [Indexed: 01/26/2023] Open

Abstract

BACKGROUND

Single nucleotide polymorphism (SNP) discovery is an important goal of many studies. However, the number of 'putative' SNPs discovered from a sequence resource may not provide a reliable indication of the number that will successfully validate with a given genotyping technology. For this it may be necessary to account for factors such as the method used for SNP discovery and the type of sequence data from which it originates, suitability of the SNP flanking sequences for probe design, and genomic context. To explore the relative importance of these and other factors, we used Illumina sequencing to augment an existing Roche 454 transcriptome assembly for the Antarctic fur seal (Arctocephalus gazella). We then mapped the raw Illumina reads to the new hybrid transcriptome using BWA and BOWTIE2 before calling SNPs with GATK. The resulting markers were pooled with two existing sets of SNPs called from the original 454 assembly using NEWBLER and SWAP454. Finally, we explored the extent to which SNPs discovered using these four methods overlapped and predicted the corresponding validation outcomes for both Illumina Infinium iSelect HD and Affymetrix Axiom arrays.

RESULTS

Collating markers across all discovery methods resulted in a global list of 34,718 SNPs. However, concordance between the methods was surprisingly poor, with only 51.0 % of SNPs being discovered by more than one method and 13.5 % being called from both the 454 and Illumina datasets. Using a predictive modeling approach, we could also show that SNPs called from the Illumina data were on average more likely to successfully validate, as were SNPs called by more than one method. Above and beyond this pattern, predicted validation outcomes were also consistently better for Affymetrix Axiom arrays.

CONCLUSIONS

Our results suggest that focusing on SNPs called by more than one method could potentially improve validation outcomes. They also highlight possible differences between alternative genotyping technologies that could be explored in future studies of non-model organisms.


Menon R, Patel AB, Joshi C. Comparative analysis of SNP candidates in disparate milk yielding river buffaloes using targeted sequencing. PeerJ 2016;4:e2147. [PMID: 27441113 PMCID: PMC4941740 DOI: 10.7717/peerj.2147] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2016] [Accepted: 05/27/2016] [Indexed: 12/17/2022] Open

Damiati E, Borsani G, Giacopuzzi E. Amplicon-based semiconductor sequencing of human exomes: performance evaluation and optimization strategies. Hum Genet 2016;135:499-511. [PMID: 27003585 PMCID: PMC4835520 DOI: 10.1007/s00439-016-1656-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2015] [Accepted: 03/12/2016] [Indexed: 02/02/2023]

Li J, Batcha AMN, Grüning B, Mansmann UR. An NGS Workflow Blueprint for DNA Sequencing Data and Its Application in Individualized Molecular Oncology. Cancer Inform 2016;14:87-107. [PMID: 27081306 PMCID: PMC4827795 DOI: 10.4137/cin.s30793] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2015] [Revised: 03/02/2016] [Accepted: 03/17/2016] [Indexed: 12/23/2022] Open

Chua EW, Cree SL, Ton KNT, Lehnert K, Shepherd P, Helsby N, Kennedy MA. Cross-Comparison of Exome Analysis, Next-Generation Sequencing of Amplicons, and the iPLEX(®) ADME PGx Panel for Pharmacogenomic Profiling. Front Pharmacol 2016;7:1. [PMID: 26858644 PMCID: PMC4726781 DOI: 10.3389/fphar.2016.00001] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2015] [Accepted: 01/06/2016] [Indexed: 12/30/2022] Open

Field MA, Cho V, Andrews TD, Goodnow CC. Reliably Detecting Clinically Important Variants Requires Both Combined Variant Calls and Optimized Filtering Strategies. PLoS One 2015;10:e0143199. [PMID: 26600436 PMCID: PMC4658170 DOI: 10.1371/journal.pone.0143199] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2015] [Accepted: 11/02/2015] [Indexed: 12/21/2022] Open

Abstract

A diversity of tools is available for identification of variants from genome sequence data. Given the current complexity of incorporating external software into a genome analysis infrastructure, a tendency exists to rely on the results from a single tool alone. The quality of the output variant calls is highly variable, however, depending on factors such as sequence library quality as well as the choice of short-read aligner, variant caller, and variant caller filtering strategy. Here we present a two-part study, first using the high-quality 'genome in a bottle' reference set to demonstrate the significant impact that the choice of aligner, variant caller, and variant caller filtering strategy has on overall variant call quality, and further how certain variant callers outperform others with increased sample contamination, an important consideration when analyzing sequenced cancer samples. This analysis confirms previous work showing that combining the variant calls of multiple tools results in the best-quality variant set, for either specificity or sensitivity, depending on whether the intersection or union of all variant calls is used, respectively. Second, we analyze a melanoma cell line derived from a control lymphocyte sample to determine whether software choices affect the detection of clinically important melanoma risk-factor variants, finding that only one of the three such variants is unanimously detected under all conditions. Finally, we describe a cogent strategy for implementing a clinical variant detection pipeline; a strategy that requires careful software selection, optimized variant caller filtering, and combined variant calls in order to effectively minimize false negative variants. While implementing such features represents an increase in complexity and computation, the results offer indisputable improvements in data quality.
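
The intersection-versus-union trade-off described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `combine_calls`, the `(chrom, pos, ref, alt)` tuple representation, and the three toy call sets are all hypothetical and chosen only for illustration.

```python
def combine_calls(call_sets):
    """Return (intersection, union) of several variant call sets.

    Each call set is an iterable of hashable variant keys; here a
    hypothetical (chrom, pos, ref, alt) tuple is assumed.
    """
    sets = [set(calls) for calls in call_sets]
    if not sets:
        return set(), set()
    # Intersection: variants every caller agrees on (favours specificity).
    # Union: variants reported by at least one caller (favours sensitivity).
    return set.intersection(*sets), set.union(*sets)


# Toy call sets from three hypothetical callers (made-up data).
caller_a = {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"), ("chr2", 500, "G", "A")}
caller_b = {("chr1", 1000, "A", "G"), ("chr2", 500, "G", "A"), ("chr3", 42, "T", "C")}
caller_c = {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"), ("chr3", 42, "T", "C")}

consensus, combined = combine_calls([caller_a, caller_b, caller_c])
print(len(consensus), len(combined))
```

In this toy data only one variant is reported by all three callers, while four distinct variants appear in at least one call set, which mirrors the smaller, stricter intersection and the larger, more permissive union.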


Sboner A, Elemento O. A primer on precision medicine informatics. Brief Bioinform 2015;17:145-53. [PMID: 26048401 DOI: 10.1093/bib/bbv032] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2015] [Indexed: 12/30/2022] Open

Vandeweyer G, Van Laer L, Loeys B, Van den Bulcke T, Kooy RF. VariantDB: a flexible annotation and filtering portal for next generation sequencing data. Genome Med 2014;6:74. [PMID: 25352915 PMCID: PMC4210545 DOI: 10.1186/s13073-014-0074-6] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2014] [Accepted: 09/15/2014] [Indexed: 12/30/2022] Open

Warden CD, Adamson AW, Neuhausen SL, Wu X. Detailed comparison of two popular variant calling packages for exome and targeted exon studies. PeerJ 2014;2:e600. [PMID: 25289185 PMCID: PMC4184249 DOI: 10.7717/peerj.600] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2014] [Accepted: 09/09/2014] [Indexed: 12/22/2022] Open

Abstract

The Genome Analysis Toolkit (GATK) is commonly used for variant calling of single nucleotide polymorphisms (SNPs) and small insertions and deletions (indels) from short-read sequencing data aligned against a reference genome. There have been a number of variant calling comparisons against GATK, but an equally comprehensive comparison for VarScan has not yet been performed. More specifically, we compare (1) the effects of different pre-processing steps prior to variant calling with both GATK and VarScan, (2) VarScan variants called with increasingly conservative parameters, and (3) filtered and unfiltered GATK variant calls (for both the UnifiedGenotyper and the HaplotypeCaller). Variant calling was performed on three datasets (1 targeted exon dataset and 2 exome datasets), each with approximately a dozen subjects. In most cases, pre-processing steps (e.g., indel realignment and quality score base recalibration using GATK) had only a modest impact on the variant calls, but the importance of the pre-processing steps varied between datasets and variant callers. Based upon concordance statistics presented in this study, we recommend GATK users focus on “high-quality” GATK variants by filtering out variants flagged as low-quality. We also found that running VarScan with a conservative set of parameters (referred to as “VarScan-Cons”) resulted in a reproducible list of variants, with high concordance (>97%) to high-quality variants called by the GATK UnifiedGenotyper and HaplotypeCaller. These conservative parameters result in decreased sensitivity, but the VarScan-Cons variant list could still recover 84–88% of the high-quality GATK SNPs in the exome datasets. This study also provides limited evidence that VarScan-Cons has a decreased false positive rate among novel variants (relative to high-quality GATK SNPs) and that the GATK HaplotypeCaller has an increased false positive rate for indels (relative to VarScan-Cons and high-quality GATK UnifiedGenotyper indels).
More broadly, we believe the metrics used for comparison in this study can be useful in assessing the quality of variant calls in the context of a specific experimental design. As an example, a limited number of variant calling comparisons are also performed on two additional variant callers.
