Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Cleary JG, Braithwaite R, Gaastra K, Hilbush BS, Inglis S, Irvine SA, Jackson A, Littin R, Nohzadeh-Malakshah S, Rathod M, Ware D, Trigg L, De La Vega FM. Joint variant and de novo mutation identification on pedigrees from high-throughput sequencing data. J Comput Biol 2015;21:405-19. [PMID: 24874280 DOI: 10.1089/cmb.2014.0029] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Indexed: 12/30/2022]

Cited by Other Article(s)

Nguyen HTL, Kohl E, Bade J, Eng SE, Tosevska A, Al Shihabi A, Tebon PJ, Hong JJ, Dry S, Boutros PC, Panossian A, Gosline SJC, Soragni A. A platform for rapid patient-derived cutaneous neurofibroma organoid establishment and screening. Cell Reports Methods 2024;4:100772. [PMID: 38744290 PMCID: PMC11133839 DOI: 10.1016/j.crmeth.2024.100772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 02/10/2024] [Accepted: 04/19/2024] [Indexed: 05/16/2024]

Affiliation(s)

Huyen Thi Lam Nguyen Department of Orthopaedic Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
Emily Kohl Department of Orthopaedic Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
Jessica Bade Pacific Northwest National Laboratories, Seattle, WA, USA
Stefan E Eng Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA; Institute for Precision Health, University of California, Los Angeles, Los Angeles, CA, USA; Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, USA
Anela Tosevska Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
Ahmad Al Shihabi Department of Orthopaedic Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Pathology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
Peyton J Tebon Department of Orthopaedic Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
Jenny J Hong Division of Hematology-Oncology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
Sarah Dry Department of Pathology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
Paul C Boutros Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA; Institute for Precision Health, University of California, Los Angeles, Los Angeles, CA, USA; Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, USA; Department of Urology, University of California, Los Angeles, Los Angeles, CA, USA; Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, USA
Andre Panossian Andre Panossian MD, Plastic Surgery, Pasadena, CA, USA
Sara J C Gosline Pacific Northwest National Laboratories, Seattle, WA, USA; Department of Biomedical Engineering, Oregon Health and Sciences University, Portland, OR, USA.
Alice Soragni Department of Orthopaedic Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, USA; Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, USA.

Kalleberg J, Rissman J, Schnabel RD. Overcoming Limitations to Deep Learning in Domesticated Animals with TrioTrain. bioRxiv 2024:2024.04.15.589602. [PMID: 38659907 PMCID: PMC11042298 DOI: 10.1101/2024.04.15.589602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]

Höjer P, Frick T, Siga H, Pourbozorgi P, Aghelpasand H, Martin M, Ahmadian A. BLR: a flexible pipeline for haplotype analysis of multiple linked-read technologies. Nucleic Acids Res 2023;51:e114. [PMID: 37941142 PMCID: PMC10711428 DOI: 10.1093/nar/gkad1010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Revised: 10/04/2023] [Accepted: 10/18/2023] [Indexed: 11/10/2023]

Godazandeh K, Van Olmen L, Van Oudenhove L, Lefever S, Bogaert C, Fant B. Methods behind neoantigen prediction for personalized anticancer vaccines. Methods Cell Biol 2023;183:161-186. [PMID: 38548411 DOI: 10.1016/bs.mcb.2023.05.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]

Prodanov T, Bansal V. A multilocus approach for accurate variant calling in low-copy repeats using whole-genome sequencing. Bioinformatics 2023;39:i279-i287. [PMID: 37387146 PMCID: PMC10311303 DOI: 10.1093/bioinformatics/btad268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]

Abstract

MOTIVATION

Low-copy repeats (LCRs) or segmental duplications are long segments of duplicated DNA that cover > 5% of the human genome. Existing tools for variant calling using short reads exhibit low accuracy in LCRs due to ambiguity in read mapping and extensive copy number variation. Variants in more than 150 genes overlapping LCRs are associated with risk for human diseases.

METHODS

We describe a short-read variant calling method, ParascopyVC, that performs variant calling jointly across all repeat copies and utilizes reads independent of mapping quality in LCRs. To identify candidate variants, ParascopyVC aggregates reads mapped to different repeat copies and performs polyploid variant calling. Subsequently, paralogous sequence variants that can differentiate repeat copies are identified using population data and used for estimating the genotype of variants for each repeat copy.

RESULTS

On simulated whole-genome sequence data, ParascopyVC achieved higher precision (0.997) and recall (0.807) than three state-of-the-art variant callers (best precision = 0.956 for DeepVariant and best recall = 0.738 for GATK) in 167 LCR regions. Benchmarking of ParascopyVC against the Genome in a Bottle high-confidence variant calls for the HG002 genome showed that it achieved a very high precision of 0.991 and a high recall of 0.909 across LCR regions, significantly better than FreeBayes (precision = 0.954 and recall = 0.822), GATK (precision = 0.888 and recall = 0.873) and DeepVariant (precision = 0.983 and recall = 0.861). ParascopyVC demonstrated consistently higher accuracy (mean F1 = 0.947) than other callers (best F1 = 0.908) across seven human genomes.
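The abstract above reports precision and recall per caller for HG002 but gives F1 only as a mean across seven genomes. As a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall, the HG002-only F1 values can be derived from the quoted figures. The function name `f1_score` and the dictionary layout are illustrative choices, not part of ParascopyVC, and the results are not directly comparable to the seven-genome means quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs for HG002 across LCR regions, as quoted in the abstract.
hg002 = {
    "ParascopyVC": (0.991, 0.909),
    "FreeBayes": (0.954, 0.822),
    "GATK": (0.888, 0.873),
    "DeepVariant": (0.983, 0.861),
}

# Derived F1 per caller (computed here, not reported in the abstract).
hg002_f1 = {caller: f1_score(p, r) for caller, (p, r) in hg002.items()}

# Print callers from highest to lowest derived F1.
for caller, f1 in sorted(hg002_f1.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{caller:12s} F1 = {f1:.3f}")
```

Because the inputs are rounded to three decimals, the derived values carry rounding error in the third decimal place.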

AVAILABILITY AND IMPLEMENTATION

ParascopyVC is implemented in Python and is freely available at https://github.com/tprodanov/ParascopyVC.

McConnell SC, Hernandez KM, Andrade J, de Jong JLO. Immune gene variation associated with chromosome-scale differences among individual zebrafish genomes. Sci Rep 2023;13:7777. [PMID: 37179373 PMCID: PMC10183018 DOI: 10.1038/s41598-023-34467-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 04/30/2023] [Indexed: 05/15/2023]

Olson ND, Wagner J, Dwarshuis N, Miga KH, Sedlazeck FJ, Salit M, Zook JM. Variant calling and benchmarking in an era of complete human genome sequences. Nat Rev Genet 2023:10.1038/s41576-023-00590-0. [PMID: 37059810 DOI: 10.1038/s41576-023-00590-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Accepted: 02/22/2023] [Indexed: 04/16/2023]

Ding Y, Owen M, Le J, Batalov S, Chau K, Kwon YH, Van Der Kraan L, Bezares-Orin Z, Zhu Z, Veeraraghavan N, Nahas S, Bainbridge M, Gleeson J, Baer RJ, Bandoli G, Chambers C, Kingsmore SF. Scalable, high quality, whole genome sequencing from archived, newborn, dried blood spots. NPJ Genom Med 2023;8:5. [PMID: 36788231 PMCID: PMC9929090 DOI: 10.1038/s41525-023-00349-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2021] [Accepted: 01/05/2023] [Indexed: 02/16/2023]

Affiliation(s)

Yan Ding Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Mallory Owen Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Jennie Le Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Sergey Batalov Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Kevin Chau Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Yong Hyun Kwon Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Lucita Van Der Kraan Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Zaira Bezares-Orin Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Zhanyang Zhu Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Narayanan Veeraraghavan Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Shareef Nahas Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Matthew Bainbridge Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA
Joe Gleeson Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA; Department of Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
Rebecca J. Baer Department of Pediatrics, University of California San Diego, La Jolla, CA 92093, USA; California Preterm Birth Initiative, University of California San Francisco, San Francisco, CA, USA
Gretchen Bandoli Department of Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
Christina Chambers Department of Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
Stephen F. Kingsmore Rady Children’s Institute for Genomic Medicine, Rady Children’s Hospital, San Diego, CA 92123, USA; Keck Graduate Institute, Claremont, CA 91711, USA

Prodanov T, Bansal V. Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing. Nat Commun 2022;13:3221. [PMID: 35680869 PMCID: PMC9184528 DOI: 10.1038/s41467-022-30930-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 05/20/2022] [Indexed: 11/09/2022]

Yang H, Gu F, Zhang L, Hua XS. Using generative adversarial networks for genome variant calling from low depth ONT sequencing data. Sci Rep 2022;12:8725. [PMID: 35637238 PMCID: PMC9151722 DOI: 10.1038/s41598-022-12346-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/25/2021] [Accepted: 05/10/2022] [Indexed: 11/21/2022]

Olson ND, Wagner J, McDaniel J, Stephens SH, Westreich ST, Prasanna AG, Johanson E, Boja E, Maier EJ, Serang O, Jáspez D, Lorenzo-Salazar JM, Muñoz-Barrera A, Rubio-Rodríguez LA, Flores C, Kyriakidis K, Malousi A, Shafin K, Pesout T, Jain M, Paten B, Chang PC, Kolesnikov A, Nattestad M, Baid G, Goel S, Yang H, Carroll A, Eveleigh R, Bourgey M, Bourque G, Li G, Ma C, Tang L, Du Y, Zhang S, Morata J, Tonda R, Parra G, Trotta JR, Brueffer C, Demirkaya-Budak S, Kabakci-Zorlu D, Turgut D, Kalay Ö, Budak G, Narcı K, Arslan E, Brown R, Johnson IJ, Dolgoborodov A, Semenyuk V, Jain A, Tetikol HS, Jain V, Ruehle M, Lajoie B, Roddey C, Catreux S, Mehio R, Ahsan MU, Liu Q, Wang K, Ebrahim Sahraeian SM, Fang LT, Mohiyuddin M, Hung C, Jain C, Feng H, Li Z, Chen L, Sedlazeck FJ, Zook JM. PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions. Cell Genomics 2022;2:S2666-979X(22)00058-1. [PMID: 35720974 PMCID: PMC9205427 DOI: 10.1016/j.xgen.2022.100129] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 12/07/2020] [Revised: 11/01/2021] [Accepted: 04/08/2022] [Indexed: 11/19/2022]

Affiliation(s)

Nathan D. Olson Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
Justin Wagner Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
Jennifer McDaniel Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
Sarah H. Stephens Booz Allen Hamilton, 8283 Greensboro Drive, Mclean, VA 22102, USA
Samuel T. Westreich DNAnexus, Inc., 1975 W El Camino Real #204, Mountain View, CA 94040, USA
Anish G. Prasanna Booz Allen Hamilton, 8283 Greensboro Drive, Mclean, VA 22102, USA
Elaine Johanson Office of Health Informatics, Office of the Chief Scientist, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, USA
Emily Boja Office of Health Informatics, Office of the Chief Scientist, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, USA
Ezekiel J. Maier Booz Allen Hamilton, 8283 Greensboro Drive, Mclean, VA 22102, USA
Omar Serang DNAnexus, Inc., 1975 W El Camino Real #204, Mountain View, CA 94040, USA
David Jáspez Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
José M. Lorenzo-Salazar Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
Adrián Muñoz-Barrera Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
Luis A. Rubio-Rodríguez Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
Carlos Flores Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain; CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain; Research Unit, Hospital Universitario N.S. de Candelaria, Santa Cruz de Tenerife, Spain; Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain
Konstantinos Kyriakidis School of Pharmacy, Aristotle University of Thessaloniki (AUTH), 541 24 Thessaloniki, Greece; Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation, 570 01 Thessaloniki, Greece
Andigoni Malousi Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation, 570 01 Thessaloniki, Greece; Laboratory of Biological Chemistry, School of Medicine, Aristotle University of Thessaloniki (AUTH), 541 24 Thessaloniki, Greece
Kishwar Shafin UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
Trevor Pesout UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
Miten Jain UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
Benedict Paten UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
Pi-Chuan Chang Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
Alexey Kolesnikov Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
Maria Nattestad Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
Gunjan Baid Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
Sidharth Goel Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
Howard Yang Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
Andrew Carroll Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
Robert Eveleigh The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
Mathieu Bourgey The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
Guillaume Bourque The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
Gen Li HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
ChouXian Ma HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
LinQi Tang HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
YuanPing Du HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
ShaoWei Zhang HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
Jordi Morata CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
Raúl Tonda CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
Genís Parra CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
Jean-Rémi Trotta CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
Christian Brueffer Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden
Sinem Demirkaya-Budak Seven Bridges Genomics, Inc, Charlestown, MA, USA
Duygu Kabakci-Zorlu Seven Bridges Genomics, Inc, Charlestown, MA, USA
Deniz Turgut Seven Bridges Genomics, Inc, Charlestown, MA, USA
Özem Kalay Seven Bridges Genomics, Inc, Charlestown, MA, USA
Gungor Budak Seven Bridges Genomics, Inc, Charlestown, MA, USA
Kübra Narcı Seven Bridges Genomics, Inc, Charlestown, MA, USA
Elif Arslan Seven Bridges Genomics, Inc, Charlestown, MA, USA
Richard Brown Seven Bridges Genomics, Inc, Charlestown, MA, USA
Ivan J. Johnson Seven Bridges Genomics, Inc, Charlestown, MA, USA
Alexey Dolgoborodov Seven Bridges Genomics, Inc, Charlestown, MA, USA
Vladimir Semenyuk Seven Bridges Genomics, Inc, Charlestown, MA, USA
Amit Jain Seven Bridges Genomics, Inc, Charlestown, MA, USA
H. Serhat Tetikol Seven Bridges Genomics, Inc, Charlestown, MA, USA
Varun Jain Illumina, Inc., San Diego, CA, USA
Mike Ruehle Illumina, Inc., San Diego, CA, USA
Bryan Lajoie Illumina, Inc., San Diego, CA, USA
Cooper Roddey Illumina, Inc., San Diego, CA, USA
Severine Catreux Illumina, Inc., San Diego, CA, USA
Rami Mehio Illumina, Inc., San Diego, CA, USA
Mian Umair Ahsan Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
Qian Liu Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
Kai Wang Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Sayed Mohammad Ebrahim Sahraeian Roche Sequencing Solutions, Santa Clara, CA 95050, USA
Li Tai Fang Roche Sequencing Solutions, Santa Clara, CA 95050, USA
Marghoob Mohiyuddin Roche Sequencing Solutions, Santa Clara, CA 95050, USA
Calvin Hung WASAI Technology, Taipei, Taiwan
Chirag Jain National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Hanying Feng Sentieon Inc., San Jose, CA, USA
Zhipan Li Sentieon Inc., San Jose, CA, USA
Luoqi Chen Sentieon Inc., San Jose, CA, USA
Fritz J. Sedlazeck Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
Justin M. Zook Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA

Salatino A, Sookoian S, Pirola CJ. Computational Pipeline for Next-Generation Sequencing (NGS) Studies in Genetics of NASH. Methods Mol Biol 2022;2455:203-222. [PMID: 35212996 DOI: 10.1007/978-1-0716-2128-8_16] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]

Yan B, Wang D, Vaisvila R, Sun Z, Ettwiller L. Methyl-SNP-seq reveals dual readouts of methylome and variome at molecule resolution while enabling target enrichment. Genome Res 2022;32:2079-2091. [PMID: 36332968 PMCID: PMC9808626 DOI: 10.1101/gr.277080.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 10/31/2022] [Indexed: 11/06/2022]

Shafin K, Pesout T, Chang PC, Nattestad M, Kolesnikov A, Goel S, Baid G, Kolmogorov M, Eizenga JM, Miga KH, Carnevali P, Jain M, Carroll A, Paten B. Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads. Nat Methods 2021;18:1322-1332. [PMID: 34725481 PMCID: PMC8571015 DOI: 10.1038/s41592-021-01299-w] [Citation(s) in RCA: 114] [Impact Index Per Article: 38.0] [Received: 03/08/2021] [Accepted: 09/06/2021] [Indexed: 01/15/2023]

Lindner M, Gawehns F, Te Molder S, Visser ME, van Oers K, Laine VN. Performance of methods to detect genetic variants from bisulphite sequencing data in a non-model species. Mol Ecol Resour 2021;22:834-846. [PMID: 34435438 PMCID: PMC9290141 DOI: 10.1111/1755-0998.13493] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2021] [Revised: 08/10/2021] [Accepted: 08/20/2021] [Indexed: 12/17/2022]

Kovaka S, Fan Y, Ni B, Timp W, Schatz MC. Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED. Nat Biotechnol 2021;39:431-441. [PMID: 33257863 PMCID: PMC8567335 DOI: 10.1038/s41587-020-0731-9] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Received: 02/07/2020] [Accepted: 10/07/2020] [Indexed: 02/07/2023]

Kim JE, Choi J, Sung CO, Hong YS, Kim SY, Lee H, Kim TW, Kim JI. High prevalence of TP53 loss and whole-genome doubling in early-onset colorectal cancer. Exp Mol Med 2021;53:446-456. [PMID: 33753878 PMCID: PMC8080557 DOI: 10.1038/s12276-021-00583-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/28/2020] [Revised: 12/10/2020] [Accepted: 12/22/2020] [Indexed: 02/01/2023]

Nachmanson D, Steward J, Yao H, Officer A, Jeong E, O'Keefe TJ, Hasteh F, Jepsen K, Hirst GL, Esserman LJ, Borowsky AD, Harismendy O. Mutational profiling of micro-dissected pre-malignant lesions from archived specimens. BMC Med Genomics 2020;13:173. [PMID: 33208147 PMCID: PMC7672910 DOI: 10.1186/s12920-020-00820-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/13/2020] [Accepted: 11/09/2020] [Indexed: 12/20/2022]

Abstract

BACKGROUND

Systematic cancer screening has led to the increased detection of pre-malignant lesions (PMLs). The absence of reliable prognostic markers has led mostly to overtreatment, resulting in potentially unnecessary stress, or to insufficient treatment and avoidable progression. Importantly, most mutational profiling studies have relied on PMLs synchronous with invasive cancer, or were performed in patients without outcome information, limiting their utility for biomarker discovery. The limitations in comprehensive mutational profiling of PMLs are in large part due to significant technical and methodological challenges: most PML specimens are small, formalin-fixed and paraffin-embedded (FFPE), and lack matching normal DNA.

METHODS

Using test DNA from a highly degraded FFPE specimen, multiple targeted sequencing approaches were evaluated, varying DNA input amount (3-200 ng), library preparation strategy (BE: Blunt-End, SS: Single-Strand, AT: A-Tailing) and target size (whole exome vs. cancer gene panel). Variants in high-input DNA from FFPE and mirrored frozen specimens were used for PML-specific variant calling training and testing, respectively. The resulting approach was applied to profile and compare multiple micro-dissected regions (mean area 5 mm²) from three breast ductal carcinomas in situ (DCIS).

RESULTS

Using low-input FFPE DNA, BE and SS libraries resulted in 4.9- and 3.7-fold increases over AT libraries in the fraction of whole exome covered at 20x (BE: 87%, SS: 63%, AT: 17%). Compared to high-confidence somatic mutations from frozen specimens, PML-specific variant filtering increased recall (BE: 85%, SS: 80%, AT: 75%) and precision (BE: 93%, SS: 91%, AT: 84%) to levels expected from sampling variation. Copy number alterations were consistent across all tested approaches and were affected only by the design of the capture probe set. Applied to DNA extracted from 9 micro-dissected regions (8 PML, 1 normal epithelium), the approach achieved comparable performance and showed that the data are adequate to identify candidate driver events (GATA3 mutations, ERBB2 or FGFR1 gains, TP53 loss) and to measure intra-lesion genetic heterogeneity.

CONCLUSION

Alternate experimental and analytical strategies increased the accuracy of DNA sequencing from archived micro-dissected PML regions, supporting the deeper molecular characterization of early cancer lesions and achieving a critical milestone in the development of biology-informed prognostic markers and precision chemo-prevention strategies.

Recurrent inversion toggling and great ape genome evolution. Nat Genet 2020;52:849-858. [PMID: 32541924 PMCID: PMC7415573 DOI: 10.1038/s41588-020-0646-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2019] [Accepted: 05/15/2020] [Indexed: 01/14/2023]

Luo R, Wong CL, Wong YS, Tang CI, Liu CM, Leung CM, Lam TW. Exploring the limit of using a deep neural network on pileup data for germline variant calling. NAT MACH INTELL 2020. [DOI: 10.1038/s42256-020-0167-4] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]

Mohanty AK, Vuzman D, Francioli L, Cassa C, Toth-Petroczy A, Sunyaev S. novoCaller: a Bayesian network approach for de novo variant calling from pedigree and population sequence data. Bioinformatics 2020;35:1174-1180. [PMID: 30169785 PMCID: PMC6449753 DOI: 10.1093/bioinformatics/bty749] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2018] [Revised: 06/19/2018] [Accepted: 08/29/2018] [Indexed: 12/12/2022] Open

Edge P, Bansal V. Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing. Nat Commun 2019;10:4660. [PMID: 31604920 PMCID: PMC6788989 DOI: 10.1038/s41467-019-12493-y] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2019] [Accepted: 09/10/2019] [Indexed: 12/30/2022] Open

Statistical Binning for Barcoded Reads Improves Downstream Analyses. Cell Syst 2019;7:219-226.e5. [PMID: 30138581 PMCID: PMC6214366 DOI: 10.1016/j.cels.2018.07.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2018] [Revised: 05/03/2018] [Accepted: 07/10/2018] [Indexed: 12/30/2022]

Chromosome Y-encoded antigens associate with acute graft-versus-host disease in sex-mismatched stem cell transplant. Blood Adv 2019;2:2419-2429. [PMID: 30262602 DOI: 10.1182/bloodadvances.2018019513] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2018] [Accepted: 08/21/2018] [Indexed: 12/22/2022] Open

Zook JM, McDaniel J, Olson ND, Wagner J, Parikh H, Heaton H, Irvine SA, Trigg L, Truty R, McLean CY, De La Vega FM, Xiao C, Sherry S, Salit M. An open resource for accurately benchmarking small variant and reference calls. Nat Biotechnol 2019;37:561-566. [PMID: 30936564 PMCID: PMC6500473 DOI: 10.1038/s41587-019-0074-6] [Citation(s) in RCA: 187] [Impact Index Per Article: 37.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2018] [Accepted: 02/19/2019] [Indexed: 12/30/2022]

Iacoangeli A, Al Khleifat A, Sproviero W, Shatunov A, Jones AR, Morgan SL, Pittman A, Dobson RJ, Newhouse SJ, Al-Chalabi A. DNAscan: personal computer compatible NGS analysis, annotation and visualisation. BMC Bioinformatics 2019;20:213. [PMID: 31029080 PMCID: PMC6487045 DOI: 10.1186/s12859-019-2791-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/02/2018] [Accepted: 04/02/2019] [Indexed: 12/13/2022]

Abstract

BACKGROUND

Next Generation Sequencing (NGS) is a commonly used technology for studying the genetic basis of biological processes and it underpins the aspirations of precision medicine. However, there are significant challenges when dealing with NGS data. Firstly, a huge number of bioinformatics tools exist for a wide range of uses, so designing an analysis pipeline is challenging. Secondly, NGS analysis is computationally intensive and requires expensive infrastructure; many medical and research centres do not have adequate high performance computing facilities, and cloud computing is not always an option due to privacy and ownership issues. Finally, the interpretation of the results is not trivial, and most available pipelines lack the utilities to support this crucial step.

RESULTS

We have therefore developed a fast and efficient bioinformatics pipeline that allows for the analysis of DNA sequencing data, while requiring little computational effort and memory usage. DNAscan can analyse a whole exome sequencing sample in 1 h and a 40x whole genome sequencing sample in 13 h, on a midrange computer. The pipeline can look for single nucleotide variants, small indels, structural variants, repeat expansions and genetic material from viruses (or any other organism). Its results are annotated using a customisable variety of databases and are available for on-the-fly visualisation with a local deployment of the gene.iobio platform. DNAscan is implemented in Python. Its code and documentation are available on GitHub: https://github.com/KHP-Informatics/DNAscan. Instructions for an easy and fast deployment with Docker and Singularity are also provided on GitHub.

CONCLUSIONS

DNAscan is an extremely fast and computationally efficient pipeline for analysis, visualization and interpretation of NGS data. It is designed to provide a powerful and easy-to-use tool for applications in biomedical research and diagnostic medicine, at minimal computational cost. Its comprehensive approach will maximise the potential audience of users, bringing such analyses within the reach of non-specialist laboratories, and those from centres with limited funding available.

Affiliation(s)

A Iacoangeli Department of Biostatistics and Health Informatics, King's College London, London, UK Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, King's College London, London, UK
A Al Khleifat Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, King's College London, London, UK
W Sproviero Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, King's College London, London, UK
A Shatunov Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, King's College London, London, UK
A R Jones Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, King's College London, London, UK
S L Morgan Department of Molecular Neuroscience, UCL, Institute of Neurology, London, UK
A Pittman Department of Molecular Neuroscience, UCL, Institute of Neurology, London, UK
R J Dobson Department of Biostatistics and Health Informatics, King's College London, London, UK Farr Institute of Health Informatics Research, UCL Institute of Health Informatics, University College London, London, UK National Institute for Health Research (NIHR) Biomedical Research Centre and Dementia Unit at South London and Maudsley NHS Foundation Trust and King's College London, London, UK
S J Newhouse Department of Biostatistics and Health Informatics, King's College London, London, UK Farr Institute of Health Informatics Research, UCL Institute of Health Informatics, University College London, London, UK National Institute for Health Research (NIHR) Biomedical Research Centre and Dementia Unit at South London and Maudsley NHS Foundation Trust and King's College London, London, UK
A Al-Chalabi Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, King's College London, London, UK King's College Hospital, Bessemer Road, London, SE5 9RS, UK

Liang Y, He L, Zhao Y, Hao Y, Zhou Y, Li M, Li C, Pu X, Wen Z. Comparative Analysis for the Performance of Variant Calling Pipelines on Detecting the de novo Mutations in Humans. Front Pharmacol 2019;10:358. [PMID: 31105557 PMCID: PMC6499170 DOI: 10.3389/fphar.2019.00358] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/11/2018] [Accepted: 03/21/2019] [Indexed: 01/22/2023]

Luo R, Sedlazeck FJ, Lam TW, Schatz MC. A multi-task convolutional deep neural network for variant calling in single molecule sequencing. Nat Commun 2019;10:998. [PMID: 30824707 PMCID: PMC6397153 DOI: 10.1038/s41467-019-09025-z] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 04/29/2018] [Accepted: 02/15/2019] [Indexed: 12/22/2022]

The ketogenic diet influences taxonomic and functional composition of the gut microbiota in children with severe epilepsy. NPJ Biofilms Microbiomes 2019;5:5. [PMID: 30701077 PMCID: PMC6344533 DOI: 10.1038/s41522-018-0073-2] [Citation(s) in RCA: 147] [Impact Index Per Article: 29.4] [Received: 07/29/2018] [Accepted: 12/11/2018] [Indexed: 02/06/2023]

Abstract

The gut microbiota has been linked to various neurological disorders via the gut–brain axis. Diet influences the composition of the gut microbiota. The ketogenic diet (KD) is a high-fat, adequate-protein, low-carbohydrate diet established for treatment of therapy-resistant epilepsy in children. Its efficacy in reducing seizures has been confirmed, but the mechanisms remain elusive. The diet has also shown positive effects in a wide range of other diseases, including Alzheimer’s, depression, autism, cancer, and type 2 diabetes. We collected fecal samples from 12 children with therapy-resistant epilepsy before starting KD and after 3 months on the diet. Parents did not start KD and served as diet controls. Applying shotgun metagenomic DNA sequencing, both taxonomic and functional profiles were established. Here we report that alpha diversity is not changed significantly during the diet, but differences in both taxonomic and functional composition are detected. Relative abundance of bifidobacteria as well as E. rectale and Dialister is significantly diminished during the intervention. An increase in relative abundance of E. coli is observed on KD. Functional analysis revealed changes in 29 SEED subsystems including the reduction of seven pathways involved in carbohydrate metabolism. Decomposition of these shifts indicates that bifidobacteria and Escherichia are important contributors to the observed functional shifts. As the relative abundance of health-promoting, fiber-consuming bacteria decreases during KD, we raise concern about the effects of the diet on the gut microbiota and overall health. Further studies need to investigate whether these changes are necessary for the therapeutic effect of KD.

The ketogenic diet changes both the relative abundance of gut microbiota and their metabolic activities. The diet forces a shift from carbohydrates to ketones as a primary energy source and has demonstrated efficacy in reducing epileptic seizures in children. After animal models implicated gut microbiota in this amelioration, Stefanie Prast-Nielsen, of Sweden’s Karolinska Institutet, and her team sequenced microbiotic DNA of fecal samples from 12 children with epilepsy before and after 3 months on a ketogenic diet. Changes included reductions in the numbers of Bifidobacterium and an increase in Escherichia coli. Carbohydrate metabolism significantly changed after 3 months on the diet. Some reductions raise questions about the diet’s potential impact on gut and overall health. More studies are also needed to discern the mechanistic impact of these changes on seizure activity.

Cornejo OE, Yee MC, Dominguez V, Andrews M, Sockell A, Strandberg E, Livingstone D, Stack C, Romero A, Umaharan P, Royaert S, Tawari NR, Ng P, Gutierrez O, Phillips W, Mockaitis K, Bustamante CD, Motamayor JC. Population genomic analyses of the chocolate tree, Theobroma cacao L., provide insights into its domestication process. Commun Biol 2018;1:167. [PMID: 30345393 PMCID: PMC6191438 DOI: 10.1038/s42003-018-0168-6] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 01/24/2018] [Accepted: 09/14/2018] [Indexed: 01/24/2023]

Affiliation(s)

Omar E Cornejo School of Biological Sciences, Washington State University, PO Box 644236, Heald Hall 429B, Pullman, Washington, 99164, USA Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, CA, 94305, USA
Muh-Ching Yee Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, CA, 94305, USA Stanford Functional Genomics Facility, Stanford, CA, 94305, USA
Victor Dominguez Department of Biology, Indiana University, 915 E. Third St, Bloomington, IN, 47405, USA
Mary Andrews Department of Biology, Indiana University, 915 E. Third St, Bloomington, IN, 47405, USA
Alexandra Sockell Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, CA, 94305, USA
Erika Strandberg Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, CA, 94305, USA Biomedical Informatics Training Program, 1265 Welch Road, MSOB, X-215, MC 5479, Stanford, CA, 94305-5479, USA
Donald Livingstone Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA United States Department of Agriculture-Agriculture Research Service, Subtropical Horticulture Research Station, 13601 Old Cutler Rd, Miami, FL, 33158, USA
Conrad Stack Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
Alberto Romero Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
Pathmanathan Umaharan Cocoa Research Centre, The University of the West Indies, St. Augustine, Trinidad and Tobago
Stefan Royaert Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
Nilesh R Tawari Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Genome, #02-01, Singapore, 138672, Singapore
Pauline Ng Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Genome, #02-01, Singapore, 138672, Singapore
Osman Gutierrez SHRS, USDA-ARS, 13601 Old Cutler Road, Miami, FL, 33158, USA
Wilbert Phillips Programa de Mejoramiento de Cacao, CATIE, 7170, Turrialba, Costa Rica
Keithanne Mockaitis Department of Biology, Indiana University, 915 E. Third St, Bloomington, IN, 47405, USA Pervasive Technology Institute, Indiana University, 2709 E. 10th St., Bloomington, IN, 47408, USA
Carlos D Bustamante Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, CA, 94305, USA
Juan C Motamayor Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA

Danecek P, McCarthy SA. BCFtools/csq: haplotype-aware variant consequences. Bioinformatics 2018;33:2037-2039. [PMID: 28205675 PMCID: PMC5870570 DOI: 10.1093/bioinformatics/btx100] [Citation(s) in RCA: 208] [Impact Index Per Article: 34.7] [Received: 12/01/2016] [Accepted: 02/14/2017] [Indexed: 02/06/2023]

Forbes TA, Howden SE, Lawlor K, Phipson B, Maksimovic J, Hale L, Wilson S, Quinlan C, Ho G, Holman K, Bennetts B, Crawford J, Trnka P, Oshlack A, Patel C, Mallett A, Simons C, Little MH. Patient-iPSC-Derived Kidney Organoids Show Functional Validation of a Ciliopathic Renal Phenotype and Reveal Underlying Pathogenetic Mechanisms. Am J Hum Genet 2018;102:816-831. [PMID: 29706353 DOI: 10.1016/j.ajhg.2018.03.014] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 12/21/2017] [Accepted: 03/05/2018] [Indexed: 02/07/2023]

Pizzino A, Whitehead M, Sabet Rasekh P, Murphy J, Helman G, Bloom M, Evans SH, Murnick JG, Conry J, Taft RJ, Simons C, Vanderver A, Adang LA. Mutations in SZT2 result in early-onset epileptic encephalopathy and leukoencephalopathy. Am J Med Genet A 2018;176:1443-1448. [PMID: 29696782 DOI: 10.1002/ajmg.a.38717] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/16/2017] [Revised: 02/13/2018] [Accepted: 03/28/2018] [Indexed: 11/06/2022]

Choi Y, Chan AP, Kirkness E, Telenti A, Schork NJ. Comparison of phasing strategies for whole human genomes. PLoS Genet 2018;14:e1007308. [PMID: 29621242 PMCID: PMC5903673 DOI: 10.1371/journal.pgen.1007308] [Citation(s) in RCA: 81] [Impact Index Per Article: 13.5] [Received: 06/07/2017] [Revised: 04/17/2018] [Accepted: 03/13/2018] [Indexed: 12/17/2022]

Abstract

Humans are a diploid species that inherit one set of chromosomes paternally and one homologous set of chromosomes maternally. Unfortunately, most human sequencing initiatives ignore this fact in that they do not directly delineate the nucleotide content of the maternal and paternal copies of the 23 chromosomes individuals possess (i.e., they do not 'phase' the genome) often because of the costs and complexities of doing so. We compared 11 different widely-used approaches to phasing human genomes using the publicly available 'Genome-In-A-Bottle' (GIAB) phased version of the NA12878 genome as a gold standard. The phasing strategies we compared included laboratory-based assays that prepare DNA in unique ways to facilitate phasing as well as purely computational approaches that seek to reconstruct phase information from general sequencing reads and constructs or population-level haplotype frequency information obtained through a reference panel of haplotypes. To assess the performance of the 11 approaches, we used metrics that included, among others, switch error rates, haplotype block lengths, the proportion of fully phase-resolved genes, phasing accuracy and yield between pairs of SNVs. Our comparisons suggest that a hybrid or combined approach that leverages: 1. population-based phasing using the SHAPEIT software suite, 2. either genome-wide sequencing read data or parental genotypes, and 3. a large reference panel of variant and haplotype frequencies, provides a fast and efficient way to produce highly accurate phase-resolved individual human genomes. We found that for population-based approaches, phasing performance is enhanced with the addition of genome-wide read data; e.g., whole genome shotgun and/or RNA sequencing reads. Further, we found that the inclusion of parental genotype data within a population-based phasing strategy can provide as much as a ten-fold reduction in phasing errors. 
We also considered a majority voting scheme for the construction of a consensus haplotype combining multiple predictions for enhanced performance and site coverage. Finally, we also identified DNA sequence signatures associated with the genomic regions harboring phasing switch errors, which included regions of low polymorphism or SNV density.

Shringarpure SS, Mathias RA, Hernandez RD, O'Connor TD, Szpiech ZA, Torres R, De La Vega FM, Bustamante CD, Barnes KC, Taub MA. Using genotype array data to compare multi- and single-sample variant calls and improve variant call sets from deep coverage whole-genome sequencing data. Bioinformatics 2018;33:1147-1153. [PMID: 28035032 PMCID: PMC5408850 DOI: 10.1093/bioinformatics/btw786] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/15/2016] [Accepted: 12/07/2016] [Indexed: 12/30/2022]

Abstract

Motivation

Variant calling from next-generation sequencing (NGS) data is susceptible to false positive calls due to sequencing, mapping and other errors. To better distinguish true from false positive calls, we present a method that uses genotype array data from the sequenced samples, rather than public data such as HapMap or dbSNP, to train an accurate classifier using Random Forests. We demonstrate our method on a set of variant calls obtained from 642 African-ancestry genomes from the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA), sequenced to high depth (30X).

Results

We have applied our classifier to compare call sets generated with different calling methods, including both single-sample and multi-sample callers. At a False Positive Rate of 5%, our method determines true positive rates of 97.5%, 95% and 99% on variant calls obtained using Illumina's single-sample caller CASAVA, the Real Time Genomics multi-sample variant caller, and the GATK UnifiedGenotyper, respectively. Since NGS sequencing data may be accompanied by genotype data for the same samples, either collected concurrently with sequencing or from a previous study, our method can be trained on each dataset to provide a more accurate computational validation of site calls compared to generic methods. Moreover, our method allows for adjustment based on allele frequency (e.g. a different set of criteria to determine quality for rare versus common variants) and thereby provides insight into sequencing characteristics that indicate call quality for variants of different frequencies.

Availability and Implementation

Code is available on GitHub at: https://github.com/suyashss/variant_validation.

Contacts

suyashs@stanford.edu or mtaub@jhsph.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.

A robust targeted sequencing approach for low input and variable quality DNA from clinical samples. NPJ Genom Med 2018;3:2. [PMID: 29354287 PMCID: PMC5768874 DOI: 10.1038/s41525-017-0041-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/17/2017] [Revised: 11/27/2017] [Accepted: 12/05/2017] [Indexed: 02/07/2023]

Abstract

Next-generation deep sequencing of gene panels is being adopted as a diagnostic test to identify actionable mutations in cancer patient samples. However, clinical samples, such as formalin-fixed, paraffin-embedded specimens, frequently provide low quantities of degraded, poor quality DNA. To overcome these issues, many sequencing assays rely on extensive PCR amplification leading to an accumulation of bias and artifacts. Thus, there is a need for a targeted sequencing assay that performs well with DNA of low quality and quantity without relying on extensive PCR amplification. We evaluate the performance of a targeted sequencing assay based on Oligonucleotide Selective Sequencing, which permits the enrichment of genes and regions of interest and the identification of sequence variants from low amounts of damaged DNA. This assay utilizes a repair process adapted to clinical FFPE samples, followed by adaptor ligation to single stranded DNA and a primer-based capture technique. Our approach generates sequence libraries of high fidelity with reduced reliance on extensive PCR amplification—this facilitates the accurate assessment of copy number alterations in addition to delivering accurate single nucleotide variant and insertion/deletion detection. We apply this method to capture and sequence the exons of a panel of 130 cancer-related genes, from which we obtain high read coverage uniformity across the targeted regions at starting input DNA amounts as low as 10 ng per sample. We demonstrate the performance using a series of reference DNA samples, and by identifying sequence variants in DNA from matched clinical samples originating from different tissue types.

A new DNA sequencing technology enables comprehensive genetic analyses of poor-quality tumor samples. Hanlee Ji from Stanford University in California, USA, together with colleagues from a company he cofounded called TOMA Biosciences, tested the performance of a targeted sequencing assay known as oligonucleotide-selective sequencing (OS-Seq). They used the “in-solution” version of OS-Seq, which involves a pre-processing step to remove any damaged DNA and then sequences target regions of the genome to look for duplications, insertions or deletions of DNA segments. Using archival specimens (which often contain low quantities of degraded DNA) from patients with lung and colorectal cancer, the researchers showed they could detect sequence variants in a panel of 130 cancer-related genes. The findings suggest the OS-Seq assay could help inform treatment decisions for cancer patients, even with clinical specimens of low quality.

A primer to clinical genome sequencing. Curr Opin Pediatr 2017;29:513-519. [PMID: 28786837 PMCID: PMC5590671 DOI: 10.1097/mop.0000000000000532] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/13/2022]

Shum BO, Henner I, Belluoccio D, Hinchcliffe MJ. Utility of NIST Whole-Genome Reference Materials for the Technical Validation of a Multigene Next-Generation Sequencing Test. J Mol Diagn 2017;19:602-612. [DOI: 10.1016/j.jmoldx.2017.04.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/13/2017] [Revised: 04/10/2017] [Accepted: 04/11/2017] [Indexed: 01/04/2023]

Nafisinia M, Riley LG, Gold WA, Bhattacharya K, Broderick CR, Thorburn DR, Simons C, Christodoulou J. Compound heterozygous mutations in glycyl-tRNA synthetase (GARS) cause mitochondrial respiratory chain dysfunction. PLoS One 2017;12:e0178125. [PMID: 28594869 PMCID: PMC5464557 DOI: 10.1371/journal.pone.0178125] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/11/2016] [Accepted: 05/07/2017] [Indexed: 01/13/2023]

Affiliation(s)

Michael Nafisinia Genetic Metabolic Disorders Research Unit, Western Sydney Genetics Program, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia Discipline of Child & Adolescent Health, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia
Lisa G. Riley Genetic Metabolic Disorders Research Unit, Western Sydney Genetics Program, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia Discipline of Child & Adolescent Health, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia
Wendy A. Gold Genetic Metabolic Disorders Research Unit, Western Sydney Genetics Program, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia Discipline of Child & Adolescent Health, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia
Kaustuv Bhattacharya Discipline of Child & Adolescent Health, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia Discipline of Genetic Medicine, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia Genetic Metabolic Disorders Service, Western Sydney Genetics Program, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia
Carolyn R. Broderick Children’s Hospital Institute of Sports Medicine, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia School of Medical Sciences, UNSW, Sydney, New South Wales, Australia
David R. Thorburn Murdoch Childrens Research Institute and Victorian Clinical Genetics Services, Royal Children’s Hospital, and Department of Paediatrics, University of Melbourne, Melbourne, Victoria, Australia
Cas Simons Institute for Molecular Bioscience, The University of Queensland, St Lucia, Queensland, Australia
John Christodoulou Genetic Metabolic Disorders Research Unit, Western Sydney Genetics Program, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia Discipline of Child & Adolescent Health, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia Discipline of Genetic Medicine, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia Genetic Metabolic Disorders Service, Western Sydney Genetics Program, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia Murdoch Childrens Research Institute and Victorian Clinical Genetics Services, Royal Children’s Hospital, and Department of Paediatrics, University of Melbourne, Melbourne, Victoria, Australia

Huang AY, Zhang Z, Ye AY, Dou Y, Yan L, Yang X, Zhang Y, Wei L. MosaicHunter: accurate detection of postzygotic single-nucleotide mosaicism through next-generation sequencing of unpaired, trio, and paired samples. Nucleic Acids Res 2017;45:e76. [PMID: 28132024 PMCID: PMC5449543 DOI: 10.1093/nar/gkx024] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 08/18/2016] [Revised: 12/24/2016] [Accepted: 01/26/2017] [Indexed: 02/07/2023]

Affiliation(s)

August Yue Huang Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, People's Republic of China National Institute of Biological Sciences, Beijing 102206, People's Republic of China
Zheng Zhang Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, People's Republic of China School of Life Sciences, Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing 100084, People's Republic of China
Adam Yongxin Ye Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, People's Republic of China Peking-Tsinghua Center for Life Sciences, Beijing, People's Republic of China Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, People's Republic of China
Yanmei Dou Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, People's Republic of China National Institute of Biological Sciences, Beijing 102206, People's Republic of China
Linlin Yan Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, People's Republic of China
Xiaoxu Yang Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, People's Republic of China
Yuehua Zhang Peking University First Hospital, Peking University, Beijing 100034, People's Republic of China
Liping Wei Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, People's Republic of China

Huang G, Wang S, Wang X, You N. An empirical Bayes method for genotyping and SNP detection using multi-sample next-generation sequencing data. Bioinformatics 2016;32:3240-3245. [DOI: 10.1093/bioinformatics/btw409] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/04/2016] [Accepted: 06/20/2016] [Indexed: 12/30/2022]

Vanderver A, Simons C, Helman G, Crawford J, Wolf NI, Bernard G, Pizzino A, Schmidt JL, Takanohashi A, Miller D, Khouzam A, Rajan V, Ramos E, Chowdhury S, Hambuch T, Ru K, Baillie GJ, Grimmond SM, Caldovic L, Devaney J, Bloom M, Evans SH, Murphy JLP, McNeill N, Fogel BL, Schiffmann R, van der Knaap MS, Taft RJ. Whole exome sequencing in patients with white matter abnormalities. Ann Neurol 2016;79:1031-1037. [PMID: 27159321 DOI: 10.1002/ana.24650] [Citation(s) in RCA: 106] [Impact Index Per Article: 13.3] [Received: 10/19/2015] [Revised: 03/27/2016] [Accepted: 03/28/2016] [Indexed: 01/25/2023]

Affiliation(s)

Adeline Vanderver Department of Neurology, Children's National Medical Center, Washington, DC; Center for Genetic Medicine Research, Children's National Medical Center, Washington, DC; School of Medicine and Health Sciences, George Washington University, Washington, DC
Cas Simons Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia
Guy Helman Department of Neurology, Children's National Medical Center, Washington, DC; Center for Genetic Medicine Research, Children's National Medical Center, Washington, DC
Joanna Crawford Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia
Nicole I Wolf Department of Child Neurology, VU University Medical Center and Neuroscience Campus Amsterdam, Amsterdam, the Netherlands
Geneviève Bernard Departments of Pediatrics, Neurology, and Neurosurgery, Montreal Children's Hospital, McGill University Health Center, Montreal, Quebec, Canada
Amy Pizzino Department of Neurology, Children's National Medical Center, Washington, DC
Johanna L Schmidt Department of Neurology, Children's National Medical Center, Washington, DC; Center for Genetic Medicine Research, Children's National Medical Center, Washington, DC
Asako Takanohashi Center for Genetic Medicine Research, Children's National Medical Center, Washington, DC
David Miller Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia; University of Melbourne Centre for Cancer Research, University of Melbourne, Parkville, Victoria, Australia
Amirah Khouzam Illumina Inc, San Diego, CA
Vani Rajan Illumina Inc, San Diego, CA
Erica Ramos Illumina Inc, San Diego, CA
Shimul Chowdhury Illumina Inc, San Diego, CA
Tina Hambuch Illumina Inc, San Diego, CA
Kelin Ru Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia
Gregory J Baillie Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia
Sean M Grimmond Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia; University of Melbourne Centre for Cancer Research, University of Melbourne, Parkville, Victoria, Australia
Ljubica Caldovic Center for Genetic Medicine Research, Children's National Medical Center, Washington, DC
Joseph Devaney Center for Genetic Medicine Research, Children's National Medical Center, Washington, DC
Miriam Bloom Department of Pediatrics, Children's National Medical Center, Washington, DC
Sarah H Evans Department of Physical Medicine and Rehabilitation, Children's National Medical Center, Washington, DC
Jennifer L P Murphy Department of Neurology, Children's National Medical Center, Washington, DC
Nathan McNeill Institute for Metabolic Disease, Baylor Research Institute, Dallas, TX
Brent L Fogel Department of Neurology, Program in Neurogenetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA
Raphael Schiffmann Institute for Metabolic Disease, Baylor Research Institute, Dallas, TX
Marjo S van der Knaap Department of Child Neurology, VU University Medical Center and Neuroscience Campus Amsterdam, Amsterdam, the Netherlands; Department of Functional Genomics, VU University, Amsterdam, the Netherlands
Ryan J Taft School of Medicine and Health Sciences, George Washington University, Washington, DC; Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia; Illumina Inc, San Diego, CA

Sequence-based Association Analysis Reveals an MGST1 eQTL with Pleiotropic Effects on Bovine Milk Composition. Sci Rep 2016;6:25376. [PMID: 27146958 PMCID: PMC4857175 DOI: 10.1038/srep25376] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Received: 01/29/2016] [Accepted: 04/15/2016] [Indexed: 11/08/2022] Open

Gyarmati P, Kjellander C, Aust C, Song Y, Öhrmalm L, Giske CG. Metagenomic analysis of bloodstream infections in patients with acute leukemia and therapy-induced neutropenia. Sci Rep 2016;6:23532. [PMID: 26996149 PMCID: PMC4800731 DOI: 10.1038/srep23532] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Received: 01/07/2016] [Accepted: 03/08/2016] [Indexed: 01/05/2023] Open

Narasimhan VM, Hunt KA, Mason D, Baker CL, Karczewski KJ, Barnes MR, Barnett AH, Bates C, Bellary S, Bockett NA, Giorda K, Griffiths CJ, Hemingway H, Jia Z, Kelly MA, Khawaja HA, Lek M, McCarthy S, McEachan R, O'Donnell-Luria A, Paigen K, Parisinos CA, Sheridan E, Southgate L, Tee L, Thomas M, Xue Y, Schnall-Levin M, Petkov PM, Tyler-Smith C, Maher ER, Trembath RC, MacArthur DG, Wright J, Durbin R, van Heel DA. Health and population effects of rare gene knockouts in adult humans with related parents. Science 2016;352:474-7. [PMID: 26940866 DOI: 10.1126/science.aac8624] [Citation(s) in RCA: 202] [Impact Index Per Article: 25.3] [Received: 11/12/2015] [Accepted: 02/18/2016] [Indexed: 12/13/2022]

Affiliation(s)

Vagheesh M Narasimhan Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
Karen A Hunt Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
Dan Mason Bradford Institute for Health Research, Bradford Teaching Hospitals National Health Service (NHS) Foundation Trust, Bradford BD9 6RJ, UK
Christopher L Baker Center for Genome Dynamics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
Konrad J Karczewski Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Michael R Barnes William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
Anthony H Barnett Diabetes and Endocrine Centre, Heart of England NHS Foundation Trust and University of Birmingham, Birmingham B9 5SS, UK
Chris Bates TPP, Mill House, Troy Road, Leeds LS18 5TN, UK
Srikanth Bellary Aston Research Centre for Healthy Ageing, Aston University, Birmingham B4 7ET, UK
Nicholas A Bockett Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
Kristina Giorda 10X Genomics, 7068 Koll Center Parkway, Suite 415, Pleasanton, CA 94566, USA
Christopher J Griffiths Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
Harry Hemingway Farr Institute of Health Informatics Research, London NW1 2DA, UK. Institute of Health Informatics, University College London, London NW1 2DA, UK
Zhilong Jia William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
M Ann Kelly School of Clinical and Experimental Medicine, University of Birmingham, Birmingham B15 2TT, UK
Hajrah A Khawaja William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
Monkol Lek Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Shane McCarthy Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
Rosie McEachan Bradford Institute for Health Research, Bradford Teaching Hospitals National Health Service (NHS) Foundation Trust, Bradford BD9 6RJ, UK
Anne O'Donnell-Luria Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Kenneth Paigen Center for Genome Dynamics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
Constantinos A Parisinos Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
Eamonn Sheridan Bradford Institute for Health Research, Bradford Teaching Hospitals National Health Service (NHS) Foundation Trust, Bradford BD9 6RJ, UK
Laura Southgate Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
Louise Tee School of Clinical and Experimental Medicine, University of Birmingham, Birmingham B15 2TT, UK
Mark Thomas Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
Yali Xue Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
Michael Schnall-Levin 10X Genomics, 7068 Koll Center Parkway, Suite 415, Pleasanton, CA 94566, USA
Petko M Petkov Center for Genome Dynamics, The Jackson Laboratory, Bar Harbor, ME 04609, USA
Chris Tyler-Smith Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
Eamonn R Maher Department of Medical Genetics, University of Cambridge and National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre, Box 238, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK. Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
Richard C Trembath Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK. Faculty of Life Sciences and Medicine, King's College London, London SE1 1UL, UK
Daniel G MacArthur Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
John Wright Bradford Institute for Health Research, Bradford Teaching Hospitals National Health Service (NHS) Foundation Trust, Bradford BD9 6RJ, UK
Richard Durbin Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
David A van Heel Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK

Goldfeder RL, Priest JR, Zook JM, Grove ME, Waggott D, Wheeler MT, Salit M, Ashley EA. Medical implications of technical accuracy in genome sequencing. Genome Med 2016;8:24. [PMID: 26932475 PMCID: PMC4774017 DOI: 10.1186/s13073-016-0269-0] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 08/21/2015] [Accepted: 01/21/2016] [Indexed: 12/31/2022] Open

Abstract

Background

As whole exome sequencing (WES) and whole genome sequencing (WGS) transition from research tools to clinical diagnostic tests, it is increasingly critical for sequencing methods and analysis pipelines to be technically accurate. The Genome in a Bottle Consortium has recently published a set of benchmark SNV, indel, and homozygous reference genotypes for the pilot whole genome NIST Reference Material based on the NA12878 genome.

Methods

We examine the relationship between human genome complexity and genes/variants reported to be associated with human disease. Specifically, we map regions of medical relevance to benchmark regions of high or low confidence. We use benchmark data to assess the sensitivity and positive predictive value of two representative sequencing pipelines for specific classes of variation.

Results

We observe that the accuracy of a variant call depends on the genomic region, variant type, and read depth, and varies by analytical pipeline. We find that most false negative WGS calls result from filtering while most false negative WES variants relate to poor coverage. We find that only 74.6 % of the exonic bases in ClinVar and OMIM genes and 82.1 % of the exonic bases in ACMG-reportable genes are found in high-confidence regions. Only 990 genes in the genome are found entirely within high-confidence regions while 593 of 3,300 ClinVar/OMIM genes have less than 50 % of their total exonic base pairs in high-confidence regions. We find greater than 77 % of the pathogenic or likely pathogenic SNVs currently in ClinVar fall within high-confidence regions. We identify sites that are prone to sequencing errors, including thousands present in publicly available variant databases. Finally, we examine the clinical impact of mandatory reporting of secondary findings, highlighting a false positive variant found in BRCA2.

Conclusions

Together, these data illustrate the importance of appropriate use and continued improvement of technical benchmarks to ensure accurate and judicious interpretation of next-generation DNA sequencing results in the clinical setting.

Electronic supplementary material

The online version of this article (doi:10.1186/s13073-016-0269-0) contains supplementary material, which is available to authorized users.

Haplotyping germline and cancer genomes with high-throughput linked-read sequencing. Nat Biotechnol 2016;34:303-11. [PMID: 26829319 PMCID: PMC4786454 DOI: 10.1038/nbt.3432] [Citation(s) in RCA: 438] [Impact Index Per Article: 54.8] [Received: 05/16/2015] [Accepted: 11/12/2015] [Indexed: 01/13/2023]

Konopka T, Nijman SMB. Comparison of genetic variants in matched samples using thesaurus annotation. Bioinformatics 2015;32:657-63. [PMID: 26545822 PMCID: PMC4795618 DOI: 10.1093/bioinformatics/btv654] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/23/2015] [Accepted: 10/30/2015] [Indexed: 12/21/2022] Open

Cunha MLR, Meijers JCM, Middeldorp S. Introduction to the analysis of next generation sequencing data and its application to venous thromboembolism. Thromb Haemost 2015;114:920-32. [PMID: 26446408 DOI: 10.1160/th15-05-0411] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 05/15/2015] [Accepted: 08/26/2015] [Indexed: 12/13/2022]

Chiang C, Layer RM, Faust GG, Lindberg MR, Rose DB, Garrison EP, Marth GT, Quinlan AR, Hall IM. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Methods 2015;12:966-8. [PMID: 26258291 PMCID: PMC4589466 DOI: 10.1038/nmeth.3505] [Citation(s) in RCA: 344] [Impact Index Per Article: 38.2] [Received: 11/12/2014] [Accepted: 05/28/2015] [Indexed: 12/11/2022]