Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: English AC, Salerno WJ, Hampton OA, Gonzaga-Jauregui C, Ambreth S, Ritter DI, Beck CR, Davis CF, Dahdouli M, Ma S, Carroll A, Veeraraghavan N, Bruestle J, Drees B, Hastie A, Lam ET, White S, Mishra P, Wang M, Han Y, Zhang F, Stankiewicz P, Wheeler DA, Reid JG, Muzny DM, Rogers J, Sabo A, Worley KC, Lupski JR, Boerwinkle E, Gibbs RA. Assessing structural variation in a personal genome-towards a human reference diploid genome. BMC Genomics 2015;16:286. [PMID: 25886820 PMCID: PMC4490614 DOI: 10.1186/s12864-015-1479-3] [Citation(s) in RCA: 105] [Impact Index Per Article: 11.7] [Received: 09/24/2014] [Accepted: 03/23/2015] [Indexed: 01/19/2023] Open

Cited by Other Article(s)

Beck CR, Carvalho CMB, Akdemir ZC, Sedlazeck FJ, Song X, Meng Q, Hu J, Doddapaneni H, Chong Z, Chen ES, Thornton PC, Liu P, Yuan B, Withers M, Jhangiani SN, Kalra D, Walker K, English AC, Han Y, Chen K, Muzny DM, Ira G, Shaw CA, Gibbs RA, Hastings PJ, Lupski JR. Megabase Length Hypermutation Accompanies Human Structural Variation at 17p11.2. Cell 2019;176:1310-1324.e10. [PMID: 30827684 PMCID: PMC6438178 DOI: 10.1016/j.cell.2019.01.045] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 06/18/2018] [Revised: 11/06/2018] [Accepted: 01/25/2019] [Indexed: 01/16/2023]

Affiliation(s)

Christine R Beck Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Claudia M B Carvalho Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Zeynep C Akdemir Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Fritz J Sedlazeck Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Xiaofei Song Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Qingchang Meng Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Jianhong Hu Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Harsha Doddapaneni Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Zechen Chong Department of Genetics and the Informatics Institute, the University of Alabama at Birmingham, Birmingham, AL 35294, USA
Edward S Chen Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Philip C Thornton Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Pengfei Liu Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Bo Yuan Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Marjorie Withers Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Shalini N Jhangiani Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Divya Kalra Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Kimberly Walker Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Adam C English Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Yi Han Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Ken Chen Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
Donna M Muzny Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
Grzegorz Ira Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Chad A Shaw Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
Richard A Gibbs Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA; Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
P J Hastings Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA; Dan L. Duncan Comprehensive Cancer Center, BCM, Houston, TX 77030, USA.
James R Lupski Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA; Human Genome Sequencing Center, BCM, Houston, TX 77030, USA; Department of Pediatrics, BCM, Houston, TX 77030, USA; Texas Children's Hospital, Houston, TX 77030, USA; Dan L. Duncan Comprehensive Cancer Center, BCM, Houston, TX 77030, USA.

Gabur I, Chawla HS, Snowdon RJ, Parkin IAP. Connecting genome structural variation with complex traits in crop plants. Theor Appl Genet 2019;132:733-750. [PMID: 30448864 DOI: 10.1007/s00122-018-3233-0] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Received: 08/15/2018] [Accepted: 11/07/2018] [Indexed: 05/05/2023]

Comprehensive structural variation genome map of individuals carrying complex chromosomal rearrangements. PLoS Genet 2019;15:e1007858. [PMID: 30735495 PMCID: PMC6368290 DOI: 10.1371/journal.pgen.1007858] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 04/27/2018] [Accepted: 11/28/2018] [Indexed: 11/19/2022] Open

Massonnet M, Morales-Cruz A, Minio A, Figueroa-Balderas R, Lawrence DP, Travadon R, Rolshausen PE, Baumgartner K, Cantu D. Whole-Genome Resequencing and Pan-Transcriptome Reconstruction Highlight the Impact of Genomic Structural Variation on Secondary Metabolite Gene Clusters in the Grapevine Esca Pathogen Phaeoacremonium minimum. Front Microbiol 2018;9:1784. [PMID: 30150972 PMCID: PMC6099105 DOI: 10.3389/fmicb.2018.01784] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 01/23/2018] [Accepted: 07/16/2018] [Indexed: 12/30/2022] Open

Xia LC, Ai D, Lee H, Andor N, Li C, Zhang NR, Ji HP. SVEngine: an efficient and versatile simulator of genome structural variations with features of cancer clonal evolution. Gigascience 2018;7:5049476. [PMID: 29982625 PMCID: PMC6057526 DOI: 10.1093/gigascience/giy081] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/22/2018] [Revised: 05/22/2018] [Accepted: 06/26/2018] [Indexed: 11/29/2022] Open

Abstract

Background

Simulating genome sequence data with variant features facilitates the development and benchmarking of structural variant analysis programs. However, there are only a few data simulators that provide structural variants in silico and even fewer that provide variants with different allelic fraction and haplotypes.

Findings

We developed SVEngine, an open-source tool to address this need. SVEngine simulates next-generation sequencing data with embedded structural variations. As input, SVEngine takes template haploid sequences (FASTA) together with an external variant file, a variant distribution file, and/or a clonal phylogeny tree file (NEWICK). It then simulates and outputs sequence contigs (FASTAs), sequence reads (FASTQs), and/or post-alignment files (BAMs). All of the files contain the desired variants, along with BED files containing the ground truth. SVEngine's flexible design process enables one to specify size, position, and allelic fraction for deletions, insertions, duplications, inversions, and translocations. Finally, SVEngine simulates sequence data that replicate the characteristics of a sequencing library with mixed sizes of DNA insert molecules. SVEngine is highly parallelized to reduce simulation time.

Conclusions

We demonstrated the versatile features of SVEngine and its improved runtime compared with other available simulators. SVEngine's features include the simulation of locus-specific variant frequency designed to mimic the phylogeny of cancer clonal evolution. We validated SVEngine's accuracy by simulating genome-wide structural variants of NA12878 and a heterogeneous cancer genome. Our evaluation included checking various sequencing mapping features such as coverage change, read clipping, insert size shift, and neighboring hanging read pairs for representative variant types. Structural variant callers Lumpy and Manta and tumor heterogeneity estimator THetA2 were able to perform realistically on the simulated data. SVEngine is implemented as a standard Python package and is freely available for academic use.

Barseghyan H, Délot EC, Vilain E. New technologies to uncover the molecular basis of disorders of sex development. Mol Cell Endocrinol 2018;468:60-69. [PMID: 29655603 PMCID: PMC7249677 DOI: 10.1016/j.mce.2018.04.003] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 01/10/2018] [Revised: 04/06/2018] [Accepted: 04/06/2018] [Indexed: 02/04/2023]

Zhu T, Hu Z, Rodriguez JC, Deal KR, Dvorak J, Vogel JP, Liu Z, Luo MC. Analysis of Brachypodium genomes with genome-wide optical maps. Genome 2018;61:559-565. [PMID: 29883550 DOI: 10.1139/gen-2018-0013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/22/2022]

Fang L, Hu J, Wang D, Wang K. NextSV: a meta-caller for structural variants from low-coverage long-read sequencing data. BMC Bioinformatics 2018;19:180. [PMID: 29792160 PMCID: PMC5966861 DOI: 10.1186/s12859-018-2207-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 08/08/2017] [Accepted: 05/15/2018] [Indexed: 01/23/2023] Open

Lan T, Lin H, Zhu W, Laurent TCAM, Yang M, Liu X, Wang J, Wang J, Yang H, Xu X, Guo X. Deep whole-genome sequencing of 90 Han Chinese genomes. Gigascience 2018;6:1-7. [PMID: 28938720 PMCID: PMC5603764 DOI: 10.1093/gigascience/gix067] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 10/08/2016] [Accepted: 07/20/2017] [Indexed: 12/30/2022] Open

Abstract

Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. 
Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000 Genomes Project, as well as to other human genome projects.
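
The novel-variant breakdown quoted in this abstract can be checked with simple arithmetic. The following is a minimal sketch that only re-adds the counts reported above; the variable names are illustrative and are not taken from the article or its data files.

```python
# Counts quoted in the abstract (written there with spaces as thousands separators).
novel_snps = 5_813_503
novel_indels = 1_169_199
novel_svs = 17_927
reported_novel_total = 7_000_629

# Sum of the three reported categories of novel low-frequency variants.
computed_total = novel_snps + novel_indels + novel_svs

# Percentage share of each category within the computed total.
shares = {
    "SNPs": 100 * novel_snps / computed_total,
    "InDels": 100 * novel_indels / computed_total,
    "SVs": 100 * novel_svs / computed_total,
}

if __name__ == "__main__":
    print("computed total:", computed_total)
    print("matches reported total:", computed_total == reported_novel_total)
    for name, pct in shares.items():
        print(f"{name}: {pct:.2f}%")
```

The three categories sum exactly to the reported total of 7 000 629 novel variants, so the figures in the abstract are internally consistent.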

Affiliation(s)

Tianming Lan BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
Haoxiang Lin BGI Genomics, BGI-Shenzhen, Building NO. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen, 518083, China
Wenjuan Zhu BGI Genomics, BGI-Shenzhen, Building NO. 7, BGI Park, No. 21 Hongan 3rd Street, Yantian District, Shenzhen, 518083, China
Tellier Christian Asker Melchior Laurent BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen, Nørregade 10, PO Box 2177 1017 Copenhagen, Denmark
Mengcheng Yang BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
Xin Liu BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
Jun Wang BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen, Nørregade 10, PO Box 2177 1017 Copenhagen, Denmark
Jian Wang BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; James D. Watson Institute of Genome Sciences, 866 Yuhangtang Road, Hangzhou, Zhejiang Province, 310058, P. R. China
Huanming Yang BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; James D. Watson Institute of Genome Sciences, 866 Yuhangtang Road, Hangzhou, Zhejiang Province, 310058, P. R. China
Xun Xu BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
Xiaosen Guo BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; Department of Biology, University of Copenhagen, Nørregade 10, PO Box 2177 1017 Copenhagen, Denmark; Shenzhen Key Laboratory of Neurogenomics, BGI-Shenzhen, Build 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China

Telenti A, Lippert C, Chang PC, DePristo M. Deep learning of genomic variation and regulatory network data. Hum Mol Genet 2018;27:R63-R71. [PMID: 29648622 PMCID: PMC6499235 DOI: 10.1093/hmg/ddy115] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 01/09/2018] [Revised: 03/26/2018] [Accepted: 03/27/2018] [Indexed: 02/07/2023] Open

Whole Genome Sequencing of Greater Amberjack (Seriola dumerili) for SNP Identification on Aligned Scaffolds and Genome Structural Variation Analysis Using Parallel Resequencing. Int J Genomics 2018;2018:7984292. [PMID: 29785397 PMCID: PMC5896239 DOI: 10.1155/2018/7984292] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/23/2017] [Revised: 01/04/2018] [Accepted: 01/14/2018] [Indexed: 01/30/2023] Open

Next-Generation Sequencing and Mutational Analysis: Implications for Genes Encoding LINC Complex Proteins. Methods Mol Biol 2018;1840:321-336. [PMID: 30141054 DOI: 10.1007/978-1-4939-8691-0_22] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/30/2022]

Enhancer adoption caused by genomic insertion elicits interdigital Shh expression and syndactyly in mouse. Proc Natl Acad Sci U S A 2017;115:1021-1026. [PMID: 29255029 PMCID: PMC5798340 DOI: 10.1073/pnas.1713339115] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/16/2022] Open

Abstract

In this study, we reexamined an old mouse mutant named Hammer toe (Hm), which arose spontaneously almost a half century ago and exhibits a limb phenotype with webbing. We revealed that a 150-kb noncoding genomic fragment that was originally located in chromosome 14 has been inserted into a genomic region proximal to Sonic hedgehog (Shh), located in chromosome 5. This inserted fragment possesses enhancer activity to induce Shh expression in the interdigital regions in Hm, which in turn down-regulates bone morphogenetic protein signaling and eventually results in syndactyly and web formation. Since the donor fragment residing in chromosome 14 has enhancer activity to induce interdigital gene expression, the Hm mutation appears to be an archetypal case of enhancer adoption.

Acquisition of new cis-regulatory elements (CREs) can cause alteration of developmental gene regulation and may introduce morphological novelty in evolution. Although structural variation in the genome generated by chromosomal rearrangement is one possible source of new CREs, only a few examples are known, except for cases of retrotransposition. In this study, we show the acquisition of novel regulatory sequences as a result of large genomic insertion in the spontaneous mouse mutation Hammer toe (Hm). Hm mice exhibit syndactyly with webbing, due to suppression of interdigital cell death in limb development. We reveal that, in the Hm genome, a 150-kb noncoding DNA fragment from chromosome 14 is inserted into the region upstream of the Sonic hedgehog (Shh) promoter in chromosome 5. Phenotyping of mouse embryos with a series of CRISPR/Cas9-aided partial deletion of the 150-kb insert clearly indicated that two different regions are necessary for the syndactyly phenotype of Hm. We found that each of the two regions contains at least one enhancer for interdigital regulation. These results show that a set of enhancers brought by the large genomic insertion elicits the interdigital Shh expression and the Hm phenotype. Transcriptome analysis indicates that ectopic expression of Shh up-regulates Chordin (Chrd) that antagonizes bone morphogenetic protein signaling in the interdigital region. Indeed, Chrd-overexpressing transgenic mice recapitulated syndactyly with webbing. Thus, the Hm mutation provides an insight into enhancer acquisition as a source of creation of novel gene regulation.

Li L, Leung AKY, Kwok TP, Lai YYY, Pang IK, Chung GTY, Mak ACY, Poon A, Chu C, Li M, Wu JJK, Lam ET, Cao H, Lin C, Sibert J, Yiu SM, Xiao M, Lo KW, Kwok PY, Chan TF, Yip KY. OMSV enables accurate and comprehensive identification of large structural variations from nanochannel-based single-molecule optical maps. Genome Biol 2017;18:230. [PMID: 29195502 PMCID: PMC5709945 DOI: 10.1186/s13059-017-1356-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 06/14/2017] [Accepted: 11/03/2017] [Indexed: 12/20/2022] Open

Affiliation(s)

Le Li Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Alden King-Yung Leung School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Tsz-Piu Kwok Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Yvonne Y Y Lai Cardiovascular Research Institute, University of California San Francisco, San Francisco, California, USA
Iris K Pang School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Grace Tin-Yun Chung Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Angel C Y Mak Cardiovascular Research Institute, University of California San Francisco, San Francisco, California, USA
Annie Poon Cardiovascular Research Institute, University of California San Francisco, San Francisco, California, USA
Catherine Chu Cardiovascular Research Institute, University of California San Francisco, San Francisco, California, USA
Menglu Li Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong
Jacob J K Wu Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong
Ernest T Lam BioNano Genomics, San Diego, California, USA
Han Cao BioNano Genomics, San Diego, California, USA
Chin Lin Cardiovascular Research Institute, University of California San Francisco, San Francisco, California, USA
Justin Sibert School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, Pennsylvania, USA
Siu-Ming Yiu Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong
Ming Xiao School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, Pennsylvania, USA
Kwok-Wai Lo Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Pui-Yan Kwok Cardiovascular Research Institute, University of California San Francisco, San Francisco, California, USA; Institute for Human Genetics, University of California San Francisco, San Francisco, California, USA
Ting-Fung Chan School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Kevin Y Yip Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong

Cretu Stancu M, van Roosmalen MJ, Renkens I, Nieboer MM, Middelkamp S, de Ligt J, Pregno G, Giachino D, Mandrile G, Espejo Valle-Inclan J, Korzelius J, de Bruijn E, Cuppen E, Talkowski ME, Marschall T, de Ridder J, Kloosterman WP. Mapping and phasing of structural variation in patient genomes using nanopore sequencing. Nat Commun 2017;8:1326. [PMID: 29109544 PMCID: PMC5673902 DOI: 10.1038/s41467-017-01343-4] [Citation(s) in RCA: 233] [Impact Index Per Article: 33.3] [Received: 06/30/2017] [Accepted: 09/07/2017] [Indexed: 01/08/2023] Open

Affiliation(s)

Mircea Cretu Stancu Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Markus J van Roosmalen Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Ivo Renkens Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Marleen M Nieboer Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Sjors Middelkamp Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Joep de Ligt Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Giulia Pregno Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Orbassano, 10043, Italy
Daniela Giachino Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Orbassano, 10043, Italy
Giorgia Mandrile Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Orbassano, 10043, Italy
Jose Espejo Valle-Inclan Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Jerome Korzelius Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Ewart de Bruijn Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Edwin Cuppen Department of Genetics and Cancer Genomics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Michael E Talkowski Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA; Department of Neurology, Harvard Medical School, Boston, MA, 02115, USA; Program in Population and Medical Genetics and Stanley Center for Psychiatric Research, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
Tobias Marschall Center for Bioinformatics, Saarland University, 66123, Saarbrücken, Germany; Max Planck Institute for Informatics, 66123, Saarbrücken, Germany
Jeroen de Ridder Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands
Wigard P Kloosterman Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG, Utrecht, The Netherlands.

Barseghyan H, Tang W, Wang RT, Almalvez M, Segura E, Bramble MS, Lipson A, Douine ED, Lee H, Délot EC, Nelson SF, Vilain E. Next-generation mapping: a novel approach for detection of pathogenic structural variants with a potential utility in clinical diagnosis. Genome Med 2017;9:90. [PMID: 29070057 PMCID: PMC5655859 DOI: 10.1186/s13073-017-0479-0] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 03/21/2017] [Accepted: 10/10/2017] [Indexed: 11/13/2022] Open

Abstract

Background

Massively parallel DNA sequencing, such as exome sequencing, has become a routine clinical procedure to identify pathogenic variants responsible for a patient’s phenotype. Exome sequencing has the capability of reliably identifying inherited and de novo single-nucleotide variants, small insertions, and deletions. However, due to the use of 100–300-bp fragment reads, this platform is not well powered to sensitively identify moderate to large structural variants (SV), such as insertions, deletions, inversions, and translocations.

Methods

To overcome these limitations, we used next-generation mapping (NGM) to image high molecular weight double-stranded DNA molecules (megabase size) with fluorescent tags in nanochannel arrays for de novo genome assembly. We investigated the capacity of this NGM platform to identify pathogenic SV in a series of patients diagnosed with Duchenne muscular dystrophy (DMD), due to large deletions, insertion, and inversion involving the DMD gene.

Results

We identified deletion, duplication, and inversion breakpoints within DMD. The sizes of deletions were in the range of 45–250 Kbp, whereas the one identified insertion was approximately 13 Kbp in size. This method refined the location of the break points within introns for cases with deletions compared to current polymerase chain reaction (PCR)-based clinical techniques. Heterozygous SV were detected in the known carrier mothers of the DMD patients, demonstrating the ability of the method to ascertain carrier status for large SV. The method was also able to identify a 5.1-Mbp inversion involving the DMD gene, previously identified by RNA sequencing.

Conclusions

We showed the ability of NGM technology to detect pathogenic structural variants otherwise missed by PCR-based techniques or chromosomal microarrays. NGM is poised to become a new tool in the clinical genetic diagnostic strategy and research due to its ability to sensitively identify large genomic variations.

Electronic supplementary material

The online version of this article (doi:10.1186/s13073-017-0479-0) contains supplementary material, which is available to authorized users.

Affiliation(s)

Hayk Barseghyan Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Center for Genetic Medicine Research, Children's National Health System, Children's Research Institute, Washington, DC, 20010, USA
Wilson Tang Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
Richard T Wang Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
Miguel Almalvez Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Center for Genetic Medicine Research, Children's National Health System, Children's Research Institute, Washington, DC, 20010, USA
Eva Segura Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
Matthew S Bramble Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Center for Genetic Medicine Research, Children's National Health System, Children's Research Institute, Washington, DC, 20010, USA
Allen Lipson Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
Emilie D Douine Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
Hane Lee Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
Emmanuèle C Délot Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Center for Genetic Medicine Research, Children's National Health System, Children's Research Institute, Washington, DC, 20010, USA
Stanley F Nelson Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
Eric Vilain Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA; Center for Genetic Medicine Research, Children's National Health System, Children's Research Institute, Washington, DC, 20010, USA

Hampton OA, English AC, Wang M, Salerno WJ, Liu Y, Muzny DM, Han Y, Wheeler DA, Worley KC, Lupski JR, Gibbs RA. SVachra: a tool to identify genomic structural variation in mate pair sequencing data containing inward and outward facing reads. BMC Genomics 2017;18:691. [PMID: 28984202 PMCID: PMC5629590 DOI: 10.1186/s12864-017-4021-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/03/2022]

Affiliation(s)

Oliver A Hampton Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
Adam C English Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
Mark Wang Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
William J Salerno Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
Yue Liu Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
Donna M Muzny Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
Yi Han Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
David A Wheeler Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
Kim C Worley Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
James R Lupski Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Pediatrics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Texas Children's Hospital, 6621 Fannin Street, Houston, TX, 77030, USA
Richard A Gibbs Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA

Sedlazeck FJ, Dhroso A, Bodian DL, Paschall J, Hermes F, Zook JM. Tools for annotation and comparison of structural variation. F1000Res 2017;6:1795. [PMID: 29123647 PMCID: PMC5668921 DOI: 10.12688/f1000research.12516.1] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Accepted: 11/02/2017] [Indexed: 11/20/2022]

Jensen JM, Villesen P, Friborg RM, Mailund T, Besenbacher S, Schierup MH. Assembly and analysis of 100 full MHC haplotypes from the Danish population. Genome Res 2017;27:1597-1607. [PMID: 28774965 PMCID: PMC5580718 DOI: 10.1101/gr.218891.116] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 11/25/2016] [Accepted: 07/21/2017] [Indexed: 01/05/2023]

Ghurye J, Pop M, Koren S, Bickhart D, Chin CS. Scaffolding of long read assemblies using long range contact information. BMC Genomics 2017;18:527. [PMID: 28701198 PMCID: PMC5508778 DOI: 10.1186/s12864-017-3879-z] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Received: 01/28/2017] [Accepted: 06/20/2017] [Indexed: 11/28/2022]

Long-read genome sequencing identifies causal structural variation in a Mendelian disease. Genet Med 2017. [PMID: 28640241 DOI: 10.1038/gim.2017.86] [Citation(s) in RCA: 143] [Impact Index Per Article: 20.4] [Indexed: 12/11/2022]

Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet 2017;17:333-51. [PMID: 27184599 PMCID: PMC10373632 DOI: 10.1038/nrg.2016.49] [Citation(s) in RCA: 2147] [Impact Index Per Article: 306.7] [Indexed: 12/11/2022]

Smith M. DNA Sequence Analysis in Clinical Medicine, Proceeding Cautiously. Front Mol Biosci 2017;4:24. [PMID: 28516087 PMCID: PMC5413496 DOI: 10.3389/fmolb.2017.00024] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/08/2017] [Accepted: 04/07/2017] [Indexed: 12/03/2022]

Abstract

Delineation of underlying genomic and genetic factors in a specific disease may be valuable in establishing a definitive diagnosis and may guide patient management and counseling. In addition, genetic information may be useful in identification of at-risk family members. Gene mapping and initial genome sequencing data enabled the development of microarrays to analyze genomic variants. The goal of this review is to consider different generations of sequencing techniques and their application to exome sequencing and whole genome sequencing and their clinical applications. In recent decades, exome sequencing has primarily been used in patient studies. Discussed in some detail are important measures that have been developed to standardize variant calling and to assess pathogenicity of variants. Examples of cases where exome sequencing has facilitated diagnosis and led to improved medical management are presented. Whole genome sequencing and its clinical relevance are presented, particularly in the context of analysis of nucleotide and structural genomic variants in large population studies and in certain patient cohorts. Applications involving analysis of cell-free DNA in maternal blood for prenatal diagnosis of specific autosomal trisomies are reviewed. Applications of DNA sequencing to diagnosis and therapeutics of cancer are presented. Also discussed are important recent diagnostic applications of DNA sequencing in cancer, including analysis of tumor-derived cell-free DNA and exosomes that are present in body fluids. Insights gained into underlying pathogenetic mechanisms of certain complex common diseases, including schizophrenia, macular degeneration, and neurodegenerative disease, are presented. The relevance of different types of variants (rare, uncommon, and common) to disease pathogenesis, and the continuum of causality, are addressed. Pharmacogenetic variants detected by DNA sequence analysis are gaining in importance and are particularly relevant to personalized and precision medicine.

Couldrey C, Keehan M, Johnson T, Tiplady K, Winkelman A, Littlejohn MD, Scott A, Kemper KE, Hayes B, Davis SR, Spelman RJ. Detection and assessment of copy number variation using PacBio long-read and Illumina sequencing in New Zealand dairy cattle. J Dairy Sci 2017;100:5472-5478. [PMID: 28456410 DOI: 10.3168/jds.2016-12199] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 10/24/2016] [Accepted: 03/12/2017] [Indexed: 11/19/2022]

Abstract

Single nucleotide polymorphisms have been the DNA variant of choice for genomic prediction, largely because of the ease of single nucleotide polymorphism genotype collection. In contrast, structural variants (SV), which include copy number variants (CNV), translocations, insertions, and inversions, have eluded easy detection and characterization, particularly in nonhuman species. However, evidence increasingly shows that SV not only contribute a substantial proportion of genetic variation but also have significant influence on phenotypes. Here we present the discovery of CNV in a prominent New Zealand dairy bull using long-read PacBio (Pacific Biosciences, Menlo Park, CA) sequencing technology and the Sniffles SV discovery tool (version 0.0.1; https://github.com/fritzsedlazeck/Sniffles). The CNV identified from long reads were compared with CNV discovered in the same bull from Illumina sequencing using CNVnator (read depth-based tool; Illumina Inc., San Diego, CA) as a means of validation. Subsequently, further validation was undertaken using whole-genome Illumina sequencing of 556 cattle representing the wider New Zealand dairy cattle population. Very limited overlap was observed in CNV discovered from the 2 sequencing platforms, in part because of the differences in size of CNV detected. Only a few CNV were therefore able to be validated using this approach. However, the ability to use CNVnator to genotype the 557 cattle for copy number across all regions identified as putative CNV allowed a genome-wide assessment of transmission level of copy number based on pedigree. The more highly transmissible a putative CNV region was observed to be, the more likely the distribution of copy number was multimodal across the 557 sequenced animals. Furthermore, visual assessment of highly transmissible CNV regions provided evidence supporting the presence of CNV across the sequenced animals. This transmission-based approach was able to confirm a subset of CNV that segregates in the New Zealand dairy cattle population. Genome-wide identification and validation of CNV is an important step toward their inclusion in genomic selection strategies.

Sekizuka T, Kawanishi M, Ohnishi M, Shima A, Kato K, Yamashita A, Matsui M, Suzuki S, Kuroda M. Elucidation of quantitative structural diversity of remarkable rearrangement regions, shufflons, in IncI2 plasmids. Sci Rep 2017;7:928. [PMID: 28424528 PMCID: PMC5430464 DOI: 10.1038/s41598-017-01082-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 11/21/2016] [Accepted: 03/20/2017] [Indexed: 12/30/2022]

Chakravorty S, Hegde M. Gene and Variant Annotation for Mendelian Disorders in the Era of Advanced Sequencing Technologies. Annu Rev Genomics Hum Genet 2017;18:229-256. [PMID: 28415856 DOI: 10.1146/annurev-genom-083115-022545] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Indexed: 11/09/2022]

Schneider VA, Graves-Lindsay T, Howe K, Bouk N, Chen HC, Kitts PA, Murphy TD, Pruitt KD, Thibaud-Nissen F, Albracht D, Fulton RS, Kremitzki M, Magrini V, Markovic C, McGrath S, Steinberg KM, Auger K, Chow W, Collins J, Harden G, Hubbard T, Pelan S, Simpson JT, Threadgold G, Torrance J, Wood JM, Clarke L, Koren S, Boitano M, Peluso P, Li H, Chin CS, Phillippy AM, Durbin R, Wilson RK, Flicek P, Eichler EE, Church DM. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res 2017;27:849-864. [PMID: 28396521 PMCID: PMC5411779 DOI: 10.1101/gr.213611.116] [Citation(s) in RCA: 533] [Impact Index Per Article: 76.1] [Received: 07/29/2016] [Accepted: 03/14/2017] [Indexed: 11/24/2022]

Affiliation(s)

Valerie A Schneider National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Tina Graves-Lindsay McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Kerstin Howe Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Nathan Bouk National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Hsiu-Chuan Chen National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Paul A Kitts National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Terence D Murphy National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Kim D Pruitt National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Françoise Thibaud-Nissen National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
Derek Albracht McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Robert S Fulton McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Milinn Kremitzki McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Vincent Magrini McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Chris Markovic McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Sean McGrath McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Karyn Meltz Steinberg McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Kate Auger Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
William Chow Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Joanna Collins Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Glenn Harden Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Timothy Hubbard Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Sarah Pelan Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Jared T Simpson Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Glen Threadgold Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
James Torrance Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Jonathan M Wood Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Laura Clarke European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
Sergey Koren National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
Matthew Boitano Pacific Biosciences, Menlo Park, California 94025, USA
Paul Peluso Pacific Biosciences, Menlo Park, California 94025, USA
Heng Li Broad Institute, Cambridge, Massachusetts 02142, USA
Chen-Shan Chin Pacific Biosciences, Menlo Park, California 94025, USA
Adam M Phillippy National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
Richard Durbin Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
Richard K Wilson McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
Paul Flicek European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
Evan E Eichler Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
Deanna M Church National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA

Jain A, Dorfman KD. Simulations of knotting of DNA during genome mapping. Biomicrofluidics 2017;11:024117. [PMID: 28798853 PMCID: PMC5533507 DOI: 10.1063/1.4979605] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/03/2017] [Accepted: 03/21/2017] [Indexed: 05/28/2023]

An Incomplete Understanding of Human Genetic Variation. Genetics 2017;202:1251-4. [PMID: 27053122 DOI: 10.1534/genetics.115.180539] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Indexed: 02/07/2023]

Pausch H, MacLeod IM, Fries R, Emmerling R, Bowman PJ, Daetwyler HD, Goddard ME. Evaluation of the accuracy of imputed sequence variant genotypes and their utility for causal variant detection in cattle. Genet Sel Evol 2017;49:24. [PMID: 28222685 PMCID: PMC5320806 DOI: 10.1186/s12711-017-0301-x] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 11/14/2016] [Accepted: 02/14/2017] [Indexed: 12/11/2022]

Abstract

Background

The availability of dense genotypes and whole-genome sequence variants from various sources offers the opportunity to compile large datasets consisting of tens of thousands of individuals with genotypes at millions of polymorphic sites that may enhance the power of genomic analyses. The imputation of missing genotypes ensures that all individuals have genotypes for a shared set of variants.

Results

We evaluated the accuracy of imputation from dense genotypes to whole-genome sequence variants in 249 Fleckvieh and 450 Holstein cattle using Minimac and FImpute. The sequence variants of a subset of the animals were reduced to the variants that were included on the Illumina BovineHD genotyping array and subsequently inferred in silico using either within- or multi-breed reference populations. The accuracy of imputation varied considerably across chromosomes and dropped at regions where the bovine genome contains segmental duplications. Depending on the imputation strategy, the correlation between imputed and true genotypes ranged from 0.898 to 0.952. The accuracy of imputation was higher with Minimac than FImpute particularly for variants with a low minor allele frequency. Using a multi-breed reference population increased the accuracy of imputation, particularly when FImpute was used to infer genotypes. When the sequence variants were imputed using Minimac, the true genotypes were more correlated to predicted allele dosages than best-guess genotypes. The computing costs to impute 23,256,743 sequence variants in 6958 animals were ten-fold higher with Minimac than FImpute. Association studies with imputed sequence variants revealed seven quantitative trait loci (QTL) for milk fat percentage. Two causal mutations in the DGAT1 and GHR genes were the most significantly associated variants at two QTL on chromosomes 14 and 20 when Minimac was used to infer genotypes.

Conclusions

The population-based imputation of millions of sequence variants in large cohorts is computationally feasible and provides accurate genotypes. However, the accuracy of imputation is low in regions where the genome contains large segmental duplications or the coverage with array-derived single nucleotide polymorphisms is poor. Using a reference population that includes individuals from many breeds increases the accuracy of imputation particularly at low-frequency variants. Considering allele dosages rather than best-guess genotypes as explanatory variables is advantageous to detect causal mutations in association studies with imputed sequence variants.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-017-0301-x) contains supplementary material, which is available to authorized users.

Huddleston J, Chaisson MJP, Steinberg KM, Warren W, Hoekzema K, Gordon D, Graves-Lindsay TA, Munson KM, Kronenberg ZN, Vives L, Peluso P, Boitano M, Chin CS, Korlach J, Wilson RK, Eichler EE. Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res 2016;27:677-685. [PMID: 27895111 PMCID: PMC5411763 DOI: 10.1101/gr.214007.116] [Citation(s) in RCA: 215] [Impact Index Per Article: 26.9] [Received: 08/01/2016] [Accepted: 11/15/2016] [Indexed: 01/07/2023]

Affiliation(s)

John Huddleston Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
Mark J P Chaisson Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
Karyn Meltz Steinberg McDonnell Genome Institute, Department of Medicine, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA
Wes Warren McDonnell Genome Institute, Department of Medicine, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA
Kendra Hoekzema Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
David Gordon Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
Tina A Graves-Lindsay McDonnell Genome Institute, Department of Medicine, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA
Katherine M Munson Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
Zev N Kronenberg Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
Laura Vives Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
Paul Peluso Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
Matthew Boitano Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
Chen-Shan Chin Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
Jonas Korlach Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
Richard K Wilson Department of Pathology, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA
Evan E Eichler Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA

Deep sequencing of 10,000 human genomes. Proc Natl Acad Sci U S A 2016;113:11901-11906. [PMID: 27702888 DOI: 10.1073/pnas.1613365113] [Citation(s) in RCA: 260] [Impact Index Per Article: 32.5] [Indexed: 12/21/2022]

Hoban S, Kelley JL, Lotterhos KE, Antolin MF, Bradburd G, Lowry DB, Poss ML, Reed LK, Storfer A, Whitlock MC. Finding the Genomic Basis of Local Adaptation: Pitfalls, Practical Solutions, and Future Directions. Am Nat 2016;188:379-97. [PMID: 27622873 PMCID: PMC5457800 DOI: 10.1086/688018] [Citation(s) in RCA: 431] [Impact Index Per Article: 53.9] [Indexed: 12/15/2022]

Du C, Pusey BN, Adams CJ, Lau CC, Bone WP, Gahl WA, Markello TC, Adams DR. Explorations to improve the completeness of exome sequencing. BMC Med Genomics 2016;9:56. [PMID: 27568008 PMCID: PMC5002202 DOI: 10.1186/s12920-016-0216-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 11/30/2015] [Accepted: 08/05/2016] [Indexed: 12/30/2022]

Fawcett GL, Karina Eterovic A. Identification of Genomic Somatic Variants in Cancer: From Discovery to Actionability. Adv Clin Chem 2016;78:123-162. [PMID: 28057186 DOI: 10.1016/bs.acc.2016.07.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/07/2023]

Yuen RKC, Merico D, Cao H, Pellecchia G, Alipanahi B, Thiruvahindrapuram B, Tong X, Sun Y, Cao D, Zhang T, Wu X, Jin X, Zhou Z, Liu X, Nalpathamkalam T, Walker S, Howe JL, Wang Z, MacDonald JR, Chan A, D'Abate L, Deneault E, Siu MT, Tammimies K, Uddin M, Zarrei M, Wang M, Li Y, Wang J, Wang J, Yang H, Bookman M, Bingham J, Gross SS, Loy D, Pletcher M, Marshall CR, Anagnostou E, Zwaigenbaum L, Weksberg R, Fernandez BA, Roberts W, Szatmari P, Glazer D, Frey BJ, Ring RH, Xu X, Scherer SW. Genome-wide characteristics of de novo mutations in autism. NPJ Genom Med 2016;1:160271-1602710. [PMID: 27525107 PMCID: PMC4980121 DOI: 10.1038/npjgenmed.2016.27] [Citation(s) in RCA: 150] [Impact Index Per Article: 18.8] [Indexed: 12/22/2022]

Affiliation(s)

Ryan K C Yuen The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Daniele Merico The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Hongzhi Cao BGI-Shenzhen, Yantian, Shenzhen, China
Giovanna Pellecchia The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Babak Alipanahi Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada
Bhooma Thiruvahindrapuram The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Xin Tong BGI-Shenzhen, Yantian, Shenzhen, China
Yuhui Sun BGI-Shenzhen, Yantian, Shenzhen, China
Dandan Cao BGI-Shenzhen, Yantian, Shenzhen, China
Tao Zhang BGI-Shenzhen, Yantian, Shenzhen, China
Xueli Wu BGI-Shenzhen, Yantian, Shenzhen, China
Xin Jin BGI-Shenzhen, Yantian, Shenzhen, China
Ze Zhou BGI-Shenzhen, Yantian, Shenzhen, China
Xiaomin Liu BGI-Shenzhen, Yantian, Shenzhen, China
Thomas Nalpathamkalam The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Susan Walker The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Jennifer L Howe The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Zhuozhi Wang The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Jeffrey R MacDonald The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Ada Chan The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Lia D'Abate The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Eric Deneault The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Michelle T Siu Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Kristiina Tammimies Center of Neurodevelopmental Disorders (KIND), Pediatric Neuropsychiatry Unit, Karolinska Institutet, Stockholm, Sweden
Mohammed Uddin The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Mehdi Zarrei The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Mingbang Wang BGI-Shenzhen, Yantian, Shenzhen, China
Yingrui Li BGI-Shenzhen, Yantian, Shenzhen, China
Jun Wang BGI-Shenzhen, Yantian, Shenzhen, China
Jian Wang BGI-Shenzhen, Yantian, Shenzhen, China
Huanming Yang BGI-Shenzhen, Yantian, Shenzhen, China
Matt Bookman Google, Mountain View, California, USA
Jonathan Bingham Google, Mountain View, California, USA
Samuel S Gross Google, Mountain View, California, USA
Dion Loy Google, Mountain View, California, USA
Mathew Pletcher Autism Speaks, Princeton, New Jersey, USA
Christian R Marshall The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada; Department of Molecular Genetics, Paediatric Laboratory Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
Evdokia Anagnostou Bloorview Research Institute, University of Toronto, Toronto, Ontario, Canada
Lonnie Zwaigenbaum Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
Rosanna Weksberg Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Bridget A Fernandez Disciplines of Genetics and Medicine, Memorial University of Newfoundland, St. John's, Newfoundland, Canada; Provincial Medical Genetic Program, Eastern Health, St. John's, Newfoundland, Canada
Wendy Roberts Autism Research Unit, The Hospital for Sick Children, Toronto, Ontario, Canada
Peter Szatmari Autism Research Unit, The Hospital for Sick Children, Toronto, Ontario, Canada; Child Youth and Family Services, Centre for Addiction and Mental Health, Toronto, Ontario, Canada; Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
David Glazer Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
Brendan J Frey Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada; Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Robert H Ring Autism Speaks, Princeton, New Jersey, USA
Xun Xu BGI-Shenzhen, Yantian, Shenzhen, China
Stephen W Scherer The Centre for Applied Genomics, Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; McLaughlin Centre, University of Toronto, Toronto, Ontario, Canada

Yuan B, Neira J, Gu S, Harel T, Liu P, Briceño I, Elsea SH, Gómez A, Potocki L, Lupski JR. Nonrecurrent PMP22-RAI1 contiguous gene deletions arise from replication-based mechanisms and result in Smith-Magenis syndrome with evident peripheral neuropathy. Hum Genet 2016;135:1161-74. [PMID: 27386852 DOI: 10.1007/s00439-016-1703-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/25/2016] [Accepted: 06/21/2016] [Indexed: 11/29/2022]

Abstract

Hereditary neuropathy with liability to pressure palsies (HNPP) and Smith-Magenis syndrome (SMS) are genomic disorders associated with deletion copy number variants involving chromosome 17p12 and 17p11.2, respectively. Nonallelic homologous recombination (NAHR)-mediated recurrent deletions are responsible for the majority of HNPP and SMS cases; the rearrangement products encompass the key dosage-sensitive genes PMP22 and RAI1, respectively, and result in haploinsufficiency for these genes. Less frequently, nonrecurrent genomic rearrangements occur at this locus. Contiguous gene duplications encompassing both PMP22 and RAI1, i.e., PMP22-RAI1 duplications, have been investigated, and replication-based mechanisms rather than NAHR have been proposed for these rearrangements. In the current study, we report molecular and clinical characterizations of six subjects with the reciprocal phenomenon of deletions spanning both genes, i.e., PMP22-RAI1 deletions. Molecular studies utilizing high-resolution array comparative genomic hybridization and breakpoint junction sequencing identified mutational signatures that were suggestive of replication-based mechanisms. Systematic clinical studies revealed features consistent with SMS, including features of intellectual disability, speech and gross motor delays, behavioral problems and ocular abnormalities. Five out of six subjects presented clinical signs and/or objective electrophysiologic studies of peripheral neuropathy. Clinical profiling may improve the clinical management of this unique group of subjects, as the peripheral neuropathy can be more severe or of earlier onset as compared to SMS patients having the common recurrent deletion. Moreover, the current study, in combination with the previous report of PMP22-RAI1 duplications, contributes to the understanding of rare complex phenotypes involving multiple dosage-sensitive genes from a genetic mechanistic standpoint.

Vembar SS, Seetin M, Lambert C, Nattestad M, Schatz MC, Baybayan P, Scherf A, Smith ML. Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing. DNA Res 2016;23:339-51. [PMID: 27345719 PMCID: PMC4991835 DOI: 10.1093/dnares/dsw022] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 11/24/2015] [Accepted: 05/10/2016] [Indexed: 01/03/2023] Open

Xia LC, Sakshuwong S, Hopmans ES, Bell JM, Grimes SM, Siegmund DO, Ji HP, Zhang NR. A genome-wide approach for detecting novel insertion-deletion variants of mid-range size. Nucleic Acids Res 2016;44:e126. [PMID: 27325742 PMCID: PMC5009736 DOI: 10.1093/nar/gkw481] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 12/12/2015] [Accepted: 05/15/2016] [Indexed: 11/14/2022] Open

Mason-Suares H, Landry L, Lebo MS. Detecting Copy Number Variation via Next Generation Technology. Curr Genet Med Rep 2016. [DOI: 10.1007/s40142-016-0091-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023]

Lupski JR. Clinical genomics: from a truly personal genome viewpoint. Hum Genet 2016;135:591-601. [PMID: 27221143 DOI: 10.1007/s00439-016-1682-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/03/2016] [Accepted: 05/11/2016] [Indexed: 12/23/2022]

Friedrich SM, Zec HC, Wang TH. Analysis of single nucleic acid molecules in micro- and nano-fluidics. Lab Chip 2016;16:790-811. [PMID: 26818700 PMCID: PMC4767527 DOI: 10.1039/c5lc01294e] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 05/24/2023]

Guan P, Sung WK. Structural variation detection using next-generation sequencing data: A comparative technical review. Methods 2016;102:36-49. [PMID: 26845461 DOI: 10.1016/j.ymeth.2016.01.020] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Received: 10/18/2015] [Revised: 01/09/2016] [Accepted: 01/31/2016] [Indexed: 12/11/2022] Open

Abstract

Structural variations (SVs) are mutations in the genome of size at least fifty nucleotides. They contribute to the phenotypic differences among healthy individuals, cause severe diseases and even cancers by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed due to the significant cost reduction of NGS experiments and their ability to unbiasedly detect SVs to the base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. Besides, the noises in the data (both platform-specific sequencing error and artificial chimeric reads) impede the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly impact the reads mapping and in turn affect the SV calling accuracy. Calling of complex SVs requires specialized SV callers. Apart from accuracy, processing speed of SV caller is another factor deciding its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study are essential for biologists and bioinformaticians to make informed decisions. This paper describes different components in the SV calling pipeline and reviews the techniques used by existing SV callers. Through simulation study, we also demonstrate that library properties, especially insert size, greatly impact the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies.

Norris AL, Workman RE, Fan Y, Eshleman JR, Timp W. Nanopore sequencing detects structural variants in cancer. Cancer Biol Ther 2016;17:246-53. [PMID: 26787508 PMCID: PMC4848001 DOI: 10.1080/15384047.2016.1139236] [Citation(s) in RCA: 96] [Impact Index Per Article: 12.0] [Received: 08/12/2015] [Revised: 12/08/2015] [Accepted: 01/01/2016] [Indexed: 11/21/2022] Open

Parikh H, Mohiyuddin M, Lam HYK, Iyer H, Chen D, Pratt M, Bartha G, Spies N, Losert W, Zook JM, Salit M. svclassify: a method to establish benchmark structural variant calls. BMC Genomics 2016;17:64. [PMID: 26772178 PMCID: PMC4715349 DOI: 10.1186/s12864-016-2366-2] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 07/18/2015] [Accepted: 01/05/2016] [Indexed: 01/24/2023] Open

Abstract

Background

The human genome contains variants ranging in size from small single nucleotide polymorphisms (SNPs) to large structural variants (SVs). High-quality benchmark small variant calls for the pilot National Institute of Standards and Technology (NIST) Reference Material (NA12878) have been developed by the Genome in a Bottle Consortium, but no similar high-quality benchmark SV calls exist for this genome. Since SV callers output highly discordant results, we developed methods to combine multiple forms of evidence from multiple sequencing technologies to classify candidate SVs into likely true or false positives. Our method (svclassify) calculates annotations from one or more aligned bam files from many high-throughput sequencing technologies, and then builds a one-class model using these annotations to classify candidate SVs as likely true or false positives.

Results

We first used pedigree analysis to develop a set of high-confidence breakpoint-resolved large deletions. We then used svclassify to cluster and classify these deletions as well as a set of high-confidence deletions from the 1000 Genomes Project and a set of breakpoint-resolved complex insertions from Spiral Genetics. We find that likely SVs cluster separately from likely non-SVs based on our annotations, and that the SVs cluster into different types of deletions. We then developed a supervised one-class classification method that uses a training set of random non-SV regions to determine whether candidate SVs have abnormal annotations different from most of the genome. To test this classification method, we use our pedigree-based breakpoint-resolved SVs, SVs validated by the 1000 Genomes Project, and assembly-based breakpoint-resolved insertions, along with semi-automated visualization using svviz.

Conclusions

We find that candidate SVs with high scores from multiple technologies have high concordance with PCR validation and an orthogonal consensus method MetaSV (99.7 % concordant), and candidate SVs with low scores are questionable. We distribute a set of 2676 high-confidence deletions and 68 high-confidence insertions with high svclassify scores from these call sets for benchmarking SV callers. We expect these methods to be particularly useful for establishing high-confidence SV calls for benchmark samples that have been characterized by multiple technologies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-2366-2) contains supplementary material, which is available to authorized users.

Davey JW, Chouteau M, Barker SL, Maroja L, Baxter SW, Simpson F, Merrill RM, Joron M, Mallet J, Dasmahapatra KK, Jiggins CD. Major Improvements to the Heliconius melpomene Genome Assembly Used to Confirm 10 Chromosome Fusion Events in 6 Million Years of Butterfly Evolution. G3 (Bethesda) 2016;6:695-708. [PMID: 26772750 PMCID: PMC4777131 DOI: 10.1534/g3.115.023655] [Citation(s) in RCA: 95] [Impact Index Per Article: 11.9] [Received: 10/14/2015] [Accepted: 01/06/2016] [Indexed: 12/30/2022]

Yuan B, Harel T, Gu S, Liu P, Burglen L, Chantot-Bastaraud S, Gelowani V, Beck C, Carvalho C, Cheung S, Coe A, Malan V, Munnich A, Magoulas P, Potocki L, Lupski J. Nonrecurrent 17p11.2p12 Rearrangement Events that Result in Two Concomitant Genomic Disorders: The PMP22-RAI1 Contiguous Gene Duplication Syndrome. Am J Hum Genet 2015;97:691-707. [PMID: 26544804 DOI: 10.1016/j.ajhg.2015.10.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 06/30/2015] [Accepted: 10/05/2015] [Indexed: 12/31/2022] Open

Abstract

The genomic duplication associated with Potocki-Lupski syndrome (PTLS) maps in close proximity to the duplication associated with Charcot-Marie-Tooth disease type 1A (CMT1A). PTLS is characterized by hypotonia, failure to thrive, reduced body weight, intellectual disability, and autistic features. CMT1A is a common autosomal dominant distal symmetric peripheral polyneuropathy. The key dosage-sensitive genes RAI1 and PMP22 are respectively associated with PTLS and CMT1A. Recurrent duplications accounting for the majority of subjects with these conditions are mediated by nonallelic homologous recombination between distinct low-copy repeat (LCR) substrates. The LCRs flanking a contiguous genomic interval encompassing both RAI1 and PMP22 do not share extensive homology; thus, duplications encompassing both loci are rare and potentially generated by a different mutational mechanism. We characterized genomic rearrangements that simultaneously duplicate PMP22 and RAI1, including nine potential complex genomic rearrangements, in 23 subjects by high-resolution array comparative genomic hybridization and breakpoint junction sequencing. Insertions and microhomologies were found at the breakpoint junctions, suggesting potential replicative mechanisms for rearrangement formation. At the breakpoint junctions of these nonrecurrent rearrangements, enrichment of repetitive DNA sequences was observed, indicating that they might predispose to genomic instability and rearrangement. Clinical evaluation revealed blended PTLS and CMT1A phenotypes with a potential earlier onset of neuropathy. Moreover, additional clinical findings might be observed due to the extra duplicated material included in the rearrangements. Our genomic analysis suggests replicative mechanisms as a predominant mechanism underlying PMP22-RAI1 contiguous gene duplications and provides further evidence supporting the role of complex genomic architecture in genomic instability.

Rhoads A, Au KF. PacBio Sequencing and Its Applications. Genomics Proteomics Bioinformatics 2015;13:278-89. [PMID: 26542840 PMCID: PMC4678779 DOI: 10.1016/j.gpb.2015.08.002] [Citation(s) in RCA: 1162] [Impact Index Per Article: 129.1] [Received: 05/28/2015] [Revised: 08/06/2015] [Accepted: 08/11/2015] [Indexed: 12/15/2022]

Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays. Genetics 2015;202:351-62. [PMID: 26510793 PMCID: PMC4701098 DOI: 10.1534/genetics.115.183483] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Received: 10/05/2014] [Accepted: 10/28/2015] [Indexed: 01/06/2023] Open

Mu JC, Tootoonchi Afshar P, Mohiyuddin M, Chen X, Li J, Bani Asadi N, Gerstein MB, Wong WH, Lam HYK. Leveraging long read sequencing from a single individual to provide a comprehensive resource for benchmarking variant calling methods. Sci Rep 2015;5:14493. [PMID: 26412485 PMCID: PMC4585973 DOI: 10.1038/srep14493] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/19/2015] [Accepted: 08/28/2015] [Indexed: 11/09/2022] Open