51
Campbell CD, Eichler EE. Properties and rates of germline mutations in humans. Trends Genet 2013; 29:575-84. [PMID: 23684843 PMCID: PMC3785239 DOI: 10.1016/j.tig.2013.04.005] [Citation(s) in RCA: 188] [Impact Index Per Article: 17.1] [Received: 01/28/2013] [Revised: 04/05/2013] [Accepted: 04/18/2013] [Indexed: 11/25/2022]
Abstract
All genetic variation arises via new mutations; therefore, determining the rate and biases for different classes of mutation is essential for understanding the genetics of human disease and evolution. Decades of mutation rate analyses have focused on a relatively small number of loci because of technical limitations. However, advances in sequencing technology have allowed for empirical assessments of genome-wide rates of mutation. Recent studies have shown that 76% of new mutations originate in the paternal lineage and provide unequivocal evidence for an increase in mutation with paternal age. Although most analyses have focused on single nucleotide variants (SNVs), studies have begun to provide insight into the mutation rate for other classes of variation, including copy number variants (CNVs), microsatellites, and mobile element insertions (MEIs). Here, we review the genome-wide analyses for the mutation rate of several types of variants and suggest areas for future research.
Affiliation(s)
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195
- Howard Hughes Medical Institute, Seattle, WA 98195
52
Rapid and accurate large-scale genotyping of duplicated genes and discovery of interlocus gene conversions. Nat Methods 2013; 10:903-9. [PMID: 23892896 PMCID: PMC3985568 DOI: 10.1038/nmeth.2572] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 03/04/2013] [Accepted: 06/06/2013] [Indexed: 01/17/2023]
Abstract
Over 900 genes have been annotated within duplicated regions of the human genome, yet their functions and potential roles in disease remain largely unknown. One major obstacle has been the inability to accurately and comprehensively assay genetic variation for these genes in a high-throughput manner. We developed a sequencing-based method for rapid and high-throughput genotyping of duplicated genes using molecular inversion probes designed to target unique paralogous sequence variants. We applied this method to genotype all members of two gene families, SRGAP2 and RH, among a diversity panel of 1,056 humans. The approach could accurately distinguish copy number in paralogs having up to ∼99.6% sequence identity, identify small gene-disruptive deletions, detect single-nucleotide variants, define breakpoints of unequal crossover and discover regions of interlocus gene conversion. The ability to rapidly and accurately genotype multiple gene families in thousands of individuals at low cost enables the development of genome-wide gene conversion maps and 'unlocks' many previously inaccessible duplicated genes for association with human traits.
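The idea of reading paralog-specific copy number from reads that carry paralogous sequence variants (PSVs) can be illustrated with a small sketch. This is not the authors' genotyping model: the function name, the input format, the uniform error term, and the use of one paralog with a known copy number as an anchor are all assumptions made for illustration. The sketch enumerates candidate copy-number states and keeps the one whose expected read fractions best explain the observed PSV read counts.

```python
from itertools import product
from math import log


def paralog_copy_numbers(psv_counts, anchor, anchor_copies=2,
                         max_copies=4, error=0.01):
    """Pick the most likely copy number for each paralog.

    psv_counts: dict mapping paralog name -> reads carrying that paralog's
                PSV allele (hypothetical input format).
    anchor:     paralog assumed to have a known copy number (an assumption
                needed to turn read ratios into absolute copy numbers).
    """
    others = sorted(name for name in psv_counts if name != anchor)
    n_paralogs = len(psv_counts)
    best_state, best_loglik = None, float("-inf")
    for state in product(range(max_copies + 1), repeat=len(others)):
        copies = dict(zip(others, state))
        copies[anchor] = anchor_copies
        total = sum(copies.values())
        loglik = 0.0
        for name, count in psv_counts.items():
            # Expected read fraction is proportional to copy number; the
            # small error term keeps the log finite for zero-copy paralogs.
            frac = (copies[name] / total) * (1 - error) + error / n_paralogs
            loglik += count * log(frac)
        if loglik > best_loglik:
            best_state, best_loglik = copies, loglik
    return best_state
```

With hypothetical counts such as `{"A": 200, "B": 100, "C": 0}` and `A` anchored at two copies, the sketch assigns one copy to `B` and none to `C`. A real assay would also need to model probe-specific capture efficiency, which is ignored here.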
53
Abstract
Copy number variation (CNV) contributes to disease and has restructured the genomes of great apes. The diversity and rate of this process, however, have not been extensively explored among great ape lineages. We analyzed 97 deeply sequenced great ape and human genomes and estimate 16% (469 Mb) of the hominid genome has been affected by recent CNV. We identify a comprehensive set of fixed gene deletions (n = 340) and duplications (n = 405) as well as >13.5 Mb of sequence that has been specifically lost on the human lineage. We compared the diversity and rates of copy number and single nucleotide variation across the hominid phylogeny. We find that CNV diversity partially correlates with single nucleotide diversity (r² = 0.5) and recapitulates the phylogeny of apes with few exceptions. Duplications significantly outpace deletions (2.8-fold). The load of segregating duplications remains significantly higher in bonobos, Western chimpanzees, and Sumatran orangutans, populations that have experienced recent genetic bottlenecks (P = 0.0014, 0.02, and 0.0088, respectively). The rate of fixed deletion has been more clocklike with the exception of the chimpanzee lineage, where we observe a twofold increase in the chimpanzee–bonobo ancestor (P = 4.79 × 10⁻⁹) and increased deletion load among Western chimpanzees (P = 0.002). The latter includes the first genomic disorder in a chimpanzee with features resembling Smith-Magenis syndrome mediated by a chimpanzee-specific increase in segmental duplication complexity. We hypothesize that demographic effects, such as bottlenecks, have contributed to larger and more gene-rich segments being deleted in the chimpanzee lineage and that this effect, more generally, may account for episodic bursts in CNV during hominid evolution.
54
D'Alessandro LCA, Werner P, Xie HM, Hakonarson H, White PS, Goldmuntz E. The prevalence of 16p12.1 microdeletion in patients with left-sided cardiac lesions. Congenit Heart Dis 2013; 9:83-6. [PMID: 23682798 DOI: 10.1111/chd.12097] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 04/11/2013] [Indexed: 11/28/2022]
Abstract
SETTING: Left-sided cardiac lesions have a birth prevalence of approximately 1 in 1000 and have been shown to be heritable in pedigree studies. A large microdeletion at 16p12.1 is associated with childhood developmental delay, and initial studies describing this deletion identified left-sided lesions as an enriched phenotype compared with a control population. OBJECTIVE: The aim of this study is to determine whether patients with left-sided cardiac lesions have an increased frequency of 16p12.1 microdeletions as compared with control populations. DESIGN: A cohort of 262 probands with left-sided lesions, including 53 with isolated aortic stenosis/bicuspid aortic valve, 83 with coarctation of the aorta with or without aortic stenosis/bicuspid aortic valve, and 126 with hypoplastic left heart syndrome, was assessed for copy number variation at 16p12.1. The control cohort included 595 patients with conotruncal defects as a cardiac control and 971 healthy children. RESULTS: We detected one patient in the left-sided lesion cohort with a large duplication partially overlapping the reported 16p12.1 microdeletion, along with one patient each in the conotruncal and control cohorts with a deletion in the same region. None of these patients had dysmorphic features, extracardiac malformations, or developmental delay. CONCLUSION: In our cohort, structural variation at 16p12.1 was not identified with increased frequency in patients with left-sided lesions as compared with controls.
Affiliation(s)
- Lisa C A D'Alessandro
- The Division of Cardiology, The Children's Hospital of Philadelphia, Philadelphia, Pa, USA; Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pa, USA
55
Refinement and discovery of new hotspots of copy-number variation associated with autism spectrum disorder. Am J Hum Genet 2013; 92:221-37. [PMID: 23375656 DOI: 10.1016/j.ajhg.2012.12.016] [Citation(s) in RCA: 223] [Impact Index Per Article: 20.3] [Received: 09/11/2012] [Revised: 10/26/2012] [Accepted: 12/20/2012] [Indexed: 11/24/2022]
Abstract
Rare copy-number variants (CNVs) have been implicated in autism and intellectual disability. These variants are large and affect many genes but lack clear specificity toward autism as opposed to developmental-delay phenotypes. We exploited the repeat architecture of the genome to target segmental duplication-mediated rearrangement hotspots (n = 120, median size 1.78 Mbp, range 240 kbp to 13 Mbp) and smaller hotspots flanked by repetitive sequence (n = 1,247, median size 79 kbp, range 3-96 kbp) in 2,588 autistic individuals from simplex and multiplex families and in 580 controls. Our analysis identified several recurrent large hotspot events, including association with 1q21 duplications, which are more likely to be identified in individuals with autism than in those with developmental delay (p = 0.01; OR = 2.7). Within larger hotspots, we also identified smaller atypical CNVs that implicated CHD1L and ACACA for the 1q21 and 17q12 deletions, respectively. Our analysis, however, suggested no overall increase in the burden of smaller hotspots in autistic individuals as compared to controls. By focusing on gene-disruptive events, we identified recurrent CNVs, including DPP10, PLCB1, TRPM1, NRXN1, FHIT, and HYDIN, that are enriched in autism. We found that as the size of deletions increases, nonverbal IQ significantly decreases, but there is no impact on autism severity; and as the size of duplications increases, autism severity significantly increases but nonverbal IQ is not affected. The absence of an increased burden of smaller CNVs in individuals with autism and the failure of most large hotspots to refine to single genes is consistent with a model where imbalance of multiple genes contributes to a disease state.
56
Xin H, Lee D, Hormozdiari F, Yedkar S, Mutlu O, Alkan C. Accelerating read mapping with FastHASH. BMC Genomics 2013; 14 Suppl 1:S13. [PMID: 23369189 PMCID: PMC3549798 DOI: 10.1186/1471-2164-14-s1-s13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 11/10/2022]
Abstract
With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, the current read mapping algorithms have difficulties in coping with the massive amounts of data generated by NGS. We propose a new algorithm, FastHASH, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering and Cheap K-mer Selection. We implemented FastHASH and merged it into the codebase of the popular read mapping program, mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness.
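The two techniques named in the abstract can be illustrated with a simplified sketch. This is not the mrFAST or FastHASH code: the function names are hypothetical, the index is a plain in-memory dictionary, and only substitution errors are handled (the published method works with edit distance, so real adjacency checks must tolerate small position shifts from indels). Cheap k-mer selection seeds with the read k-mers that have the fewest reference locations, and adjacency filtering rejects a candidate location when too many of the read's other k-mers are absent from the positions they would have to occupy.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer to the set of its start positions in the reference."""
    index = defaultdict(set)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].add(pos)
    return index


def candidate_locations(read, index, k, max_errors):
    """Return read start positions that survive both filters.

    Assumes substitutions only and a read with at least max_errors + 1
    non-overlapping k-mers, so that one seed is guaranteed error-free.
    """
    # Split the read into non-overlapping k-mers, remembering each offset.
    segments = [(off, read[off:off + k])
                for off in range(0, len(read) - k + 1, k)]
    # Cheap k-mer selection: prefer seeds with the fewest reference hits,
    # because every hit is a location that has to be checked.
    by_cost = sorted(segments, key=lambda seg: len(index.get(seg[1], ())))
    seeds = by_cost[:max_errors + 1]
    survivors = set()
    for seed_off, seed in seeds:
        for pos in index.get(seed, ()):
            start = pos - seed_off
            if start < 0:
                continue
            # Adjacency filtering: count read k-mers that are not found at
            # the reference position implied by this candidate start.
            missing = sum(1 for off, kmer in segments
                          if (start + off) not in index.get(kmer, ()))
            if missing <= max_errors:
                survivors.add(start)
    return sorted(survivors)
```

Only the surviving locations would be passed on to the expensive verification (the "extend" step), which is where a filter of this kind saves time.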
Affiliation(s)
- Hongyi Xin
- Depts. of Computer Science and Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
57
Mueller M, Barros P, Witherden A, Roberts A, Zhang Z, Schaschl H, Yu CY, Hurles M, Schaffner C, Floto R, Game L, Steinberg K, Wilson R, Graves T, Eichler E, Cook H, Vyse T, Aitman T. Genomic pathology of SLE-associated copy-number variation at the FCGR2C/FCGR3B/FCGR2B locus. Am J Hum Genet 2013; 92:28-40. [PMID: 23261299 PMCID: PMC3542466 DOI: 10.1016/j.ajhg.2012.11.013] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Received: 08/03/2012] [Revised: 09/12/2012] [Accepted: 11/26/2012] [Indexed: 01/18/2023]
Abstract
Reduced FCGR3B copy number is associated with increased risk of systemic lupus erythematosus (SLE). The five FCGR2/FCGR3 genes are arranged across two highly paralogous genomic segments on chromosome 1q23. Previous studies have suggested mechanisms for structural rearrangements at the FCGR2/FCGR3 locus and have proposed mechanisms whereby altered FCGR3B copy number predisposes to autoimmunity, but the high degree of sequence similarity between paralogous segments has prevented precise definition of the molecular events and their functional consequences. To pursue the genomic pathology associated with FCGR3B copy-number variation, we integrated sequencing data from fosmid and bacterial artificial chromosome clones and sequence-captured DNA from FCGR3B-deleted genomes to establish a detailed map of allelic and paralogous sequence variation across the FCGR2/FCGR3 locus. This analysis identified two highly paralogous 24.5 kb blocks within the FCGR2C/FCGR3B/FCGR2B locus that are devoid of nonpolymorphic paralogous sequence variations and that define the limits of the genomic regions in which nonallelic homologous recombination leads to FCGR2C/FCGR3B copy-number variation. Further, the data showed evidence of swapping of haplotype blocks between these highly paralogous blocks that most likely arose from sequential ancestral recombination events across the region. Functionally, we found by flow cytometry, immunoblotting and cDNA sequencing that individuals with FCGR3B-deleted alleles show ectopic presence of FcγRIIb on natural killer (NK) cells. We conclude that FCGR3B deletion juxtaposes the 5'-regulatory sequences of FCGR2C with the coding sequence of FCGR2B, creating a chimeric gene that results in an ectopic accumulation of FcγRIIb on NK cells and provides an explanation for SLE risk associated with reduced FCGR3B gene copy number.
Affiliation(s)
- Michael Mueller
- Physiological Genomics and Medicine Group, MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, London W12 0NN, UK
- Paula Barros
- Department of Medical and Molecular Genetics, King’s College London, Guy’s Hospital, London SE1 9RT, UK
- Abigail S. Witherden
- Department of Medical and Molecular Genetics, King’s College London, Guy’s Hospital, London SE1 9RT, UK
- Amy L. Roberts
- Department of Medical and Molecular Genetics, King’s College London, Guy’s Hospital, London SE1 9RT, UK
- Zhou Zhang
- Physiological Genomics and Medicine Group, MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, London W12 0NN, UK
- Helmut Schaschl
- Department of Medical and Molecular Genetics, King’s College London, Guy’s Hospital, London SE1 9RT, UK
- Chack-Yung Yu
- Center for Molecular and Human Genetics, Nationwide Children’s Hospital and Department of Pediatrics, The Ohio State University, Columbus, OH 43205, USA
- Matthew E. Hurles
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SA, UK
- Catherine Schaffner
- Department of Medicine, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, UK
- R. Andres Floto
- Department of Medicine, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, UK
- Laurence Game
- Genomics Core Laboratory, MRC Clinical Sciences Centre, London W12 0NN, UK
- Karyn Meltz Steinberg
- Department of Genome Sciences, University of Washington School of Medicine and the Howard Hughes Medical Institute, Seattle, WA 98195, USA
- Richard K. Wilson
- The Genome Institute at Washington University, Washington University School of Medicine, St. Louis, MO 63110, USA
- Tina A. Graves
- The Genome Institute at Washington University, Washington University School of Medicine, St. Louis, MO 63110, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine and the Howard Hughes Medical Institute, Seattle, WA 98195, USA
- H. Terence Cook
- Centre for Complement and Inflammation Research, Department of Medicine, Imperial College London, London W12 0NN, UK
- Timothy J. Vyse
- Department of Medical and Molecular Genetics, King’s College London, Guy’s Hospital, London SE1 9RT, UK
- Timothy J. Aitman
- Physiological Genomics and Medicine Group, MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, London W12 0NN, UK
58
High-resolution FISH on DNA fibers for low-copy repeats genome architecture studies. Genomics 2012; 100:380-6. [PMID: 22954586 PMCID: PMC3778886 DOI: 10.1016/j.ygeno.2012.08.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 05/18/2012] [Revised: 08/10/2012] [Accepted: 08/22/2012] [Indexed: 11/22/2022]
Abstract
Low-copy repeats (LCRs) constitute 5% of the human genome. LCRs act as substrates for non-allelic homologous recombination (NAHR) leading to genomic structural variation. The aim of this study was to assess the potential of Fiber-FISH for direct visualization of LCRs to support investigations of genome architecture within these challenging genomic regions. We describe a set of Fiber-FISH experiments designed for the study of LCR22-2. This LCR is involved in recurrent reorganizations causing different genomic disorders. Four fosmid clones covering the entire length of LCR22-2 and two single-copy BAC clones, delimiting LCR22-2 proximally and distally, were selected. The probes were hybridized in different multiple-color combinations on DNA fibers from two karyotypically normal cell lines. We were able to identify three distinct structural haplotypes characterized by differences in copy number and arrangement of the LCR22-2 genes and pseudogenes. Our results show that multicolor Fiber-FISH is a viable methodological approach for the analysis of genome organization within complex LCR regions.
59
Steinberg KM, Antonacci F, Sudmant PH, Kidd JM, Campbell CD, Vives L, Malig M, Scheinfeldt L, Beggs W, Ibrahim M, Lema G, Nyambo TB, Omar SA, Bodo JM, Froment A, Donnelly MP, Kidd KK, Tishkoff SA, Eichler EE. Structural diversity and African origin of the 17q21.31 inversion polymorphism. Nat Genet 2012; 44:872-80. [PMID: 22751100 PMCID: PMC3408829 DOI: 10.1038/ng.2335] [Citation(s) in RCA: 99] [Impact Index Per Article: 8.3] [Received: 10/04/2011] [Accepted: 06/01/2012] [Indexed: 12/12/2022]
Abstract
The 17q21.31 inversion polymorphism exists either as direct (H1) or inverted (H2) haplotypes with differential predispositions to disease and selection. We investigated its genetic diversity in 2,700 individuals, with an emphasis on African populations. We characterize eight structural haplotypes due to complex rearrangements that vary in size from 1.08-1.49 Mb and provide evidence for a 30-kb H1-H2 double recombination event. We show that recurrent partial duplications of the KANSL1 gene have occurred on both the H1 and H2 haplotypes and have risen to high frequency in European populations. We identify a likely ancestral H2 haplotype (H2') lacking these duplications that is enriched among African hunter-gatherer groups yet essentially absent from West African populations. Whereas H1 and H2 segmental duplications arose independently and before human migration out of Africa, they have reached high frequencies recently among Europeans, either because of extraordinary genetic drift or selective sweeps.
60
Dennis MY, Nuttle X, Sudmant PH, Antonacci F, Graves TA, Nefedov M, Rosenfeld JA, Sajjadian S, Malig M, Kotkiewicz H, Curry CJ, Shafer S, Shaffer LG, de Jong PJ, Wilson RK, Eichler EE. Evolution of human-specific neural SRGAP2 genes by incomplete segmental duplication. Cell 2012; 149:912-22. [PMID: 22559943 DOI: 10.1016/j.cell.2012.03.033] [Citation(s) in RCA: 266] [Impact Index Per Article: 22.2] [Received: 12/08/2011] [Revised: 02/17/2012] [Accepted: 03/01/2012] [Indexed: 10/28/2022]
Abstract
Gene duplication is an important source of phenotypic change and adaptive evolution. We leverage a haploid hydatidiform mole to identify highly identical sequences missing from the reference genome, confirming that the cortical development gene Slit-Robo Rho GTPase-activating protein 2 (SRGAP2) duplicated three times exclusively in humans. We show that the promoter and first nine exons of SRGAP2 duplicated from 1q32.1 (SRGAP2A) to 1q21.1 (SRGAP2B) ∼3.4 million years ago (mya). Two larger duplications later copied SRGAP2B to chromosome 1p12 (SRGAP2C) and to proximal 1q21.1 (SRGAP2D) ∼2.4 and ∼1 mya, respectively. Sequence and expression analyses show that SRGAP2C is the most likely duplicate to encode a functional protein and is among the most fixed human-specific duplicate genes. Our data suggest a mechanism where incomplete duplication created a novel gene function (antagonizing parental SRGAP2 function) immediately "at birth" 2-3 mya, which is a time corresponding to the transition from Australopithecus to Homo and the beginning of neocortex expansion.
Affiliation(s)
- Megan Y Dennis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, 98195, USA
61
Sarkar D, Goldstein S, Schwartz DC, Newton MA. Statistical significance of optical map alignments. J Comput Biol 2012; 19:478-92. [PMID: 22506568 DOI: 10.1089/cmb.2011.0221] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
The Optical Mapping System constructs ordered restriction maps spanning entire genomes through the assembly and analysis of large datasets comprising individually analyzed genomic DNA molecules. Such restriction maps uniquely reveal mammalian genome structure and variation, but also raise computational and statistical questions beyond those that have been solved in the analysis of smaller, microbial genomes. We address the problem of how to filter maps that align poorly to a reference genome. We obtain map-specific thresholds that control errors and improve iterative assembly. We also show how an optimal self-alignment score provides an accurate approximation to the probability of alignment, which is useful in applications seeking to identify structural genomic abnormalities.
Affiliation(s)
- Deepayan Sarkar
- Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, New Delhi, India
62
Itsara A, Vissers L, Steinberg K, Meyer K, Zody M, Koolen D, de Ligt J, Cuppen E, Baker C, Lee C, Graves TA, Wilson R, Jenkins R, Veltman J, Eichler E. Resolving the breakpoints of the 17q21.31 microdeletion syndrome with next-generation sequencing. Am J Hum Genet 2012; 90:599-613. [PMID: 22482802 DOI: 10.1016/j.ajhg.2012.02.013] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 11/30/2011] [Revised: 01/23/2012] [Accepted: 02/16/2012] [Indexed: 01/22/2023]
Abstract
Recurrent deletions have been associated with numerous diseases and genomic disorders. Few, however, have been resolved at the molecular level because their breakpoints often occur in highly copy-number-polymorphic duplicated sequences. We present an approach that uses a combination of somatic cell hybrids, array comparative genomic hybridization, and the specificity of next-generation sequencing to determine breakpoints that occur within segmental duplications. Applying our technique to the 17q21.31 microdeletion syndrome, we used genome sequencing to determine copy-number-variant breakpoints in three deletion-bearing individuals with molecular resolution. For two cases, we observed breakpoints consistent with nonallelic homologous recombination involving only H2 chromosomal haplotypes, as expected. Molecular resolution revealed that the breakpoints occurred at different locations within a 145 kbp segment of >99% identity and disrupt KANSL1 (previously known as KIAA1267). In the remaining case, we found that unequal crossover occurred interchromosomally between the H1 and H2 haplotypes and that this event was mediated by a homologous sequence that was once again missing from the human reference. Interestingly, the breakpoints mapped preferentially to gaps in the current reference genome assembly, which we resolved in this study. Our method provides a strategy for the identification of breakpoints within complex regions of the genome harboring high-identity and copy-number-polymorphic segmental duplication. The approach should become particularly useful as high-quality alternate reference sequences become available and genome sequencing of individuals' DNA becomes more routine.
63
Arlt MF, Wilson TE, Glover TW. Replication stress and mechanisms of CNV formation. Curr Opin Genet Dev 2012; 22:204-10. [PMID: 22365495 DOI: 10.1016/j.gde.2012.01.009] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Received: 12/02/2011] [Revised: 01/24/2012] [Accepted: 01/25/2012] [Indexed: 12/11/2022]
Abstract
Copy number variants (CNVs) are widely distributed throughout the human genome, where they contribute to genetic variation and phenotypic diversity. De novo CNVs are also a major cause of numerous genetic and developmental disorders. However, unlike many other types of mutations, little is known about the genetic and environmental risk factors for new and deleterious CNVs. DNA replication errors have been implicated in the generation of a major class of CNVs, the nonrecurrent CNVs. We have found that agents that perturb normal replication and create conditions of replication stress, including hydroxyurea and aphidicolin, are potent inducers of nonrecurrent CNVs in cultured human cells. These findings have broad implications for identifying CNV risk factors and for hydroxyurea-related therapies in humans.
Affiliation(s)
- Martin F Arlt
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, United States
64
Uddin M, Sturge M, Peddle L, O'Rielly DD, Rahman P. Genome-wide signatures of 'rearrangement hotspots' within segmental duplications in humans. PLoS One 2011; 6:e28853. [PMID: 22194928 PMCID: PMC3237539 DOI: 10.1371/journal.pone.0028853] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 06/19/2011] [Accepted: 11/16/2011] [Indexed: 11/19/2022]
Abstract
The primary objective of this study was to create a genome-wide high-resolution map (i.e., >100 bp) of ‘rearrangement hotspots’ which can facilitate the identification of regions capable of mediating de novo deletions or duplications in humans. A hierarchical method was employed to fragment segmental duplications (SDs) into multiple smaller SD units. Combining an end-space-free pairwise alignment algorithm with a ‘seed and extend’ approach, we have exhaustively searched 409 million alignments to detect complex structural rearrangements within the reference-guided assembly of the NA18507 human genome (18× coverage), including the previously identified novel 4.8 Mb sequence from de novo assembly within this genome. We have identified 1,963 rearrangement hotspots within SDs which encompass 166 genes and display an enrichment of duplicated gene nucleotide variants (DNVs). These regions are correlated with increased non-allelic homologous recombination (NAHR) event frequency which presumably represents the origin of copy number variations (CNVs) and pathogenic duplications/deletions. Analysis revealed that 20% of the detected hotspots are clustered within the proximal and distal SD breakpoints flanked by the pathogenic deletions/duplications that have been mapped for 24 NAHR-mediated genomic disorders. FISH validation of selected complex regions revealed 94% concordance with in silico localization of the highly homologous derivatives. Other results from this study indicate that intra-chromosomal recombination is enhanced in genic compared with agenic duplicated regions, and that gene desert regions comprising SDs may represent reservoirs for creation of novel genes. The generation of genome-wide signatures of ‘rearrangement hotspots’, which likely serve as templates for NAHR, may provide a powerful approach towards understanding the underlying mutational mechanism(s) for development of constitutional and acquired diseases.
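For readers unfamiliar with the term, an end-space-free alignment is a global-style alignment in which gaps at either end of either sequence cost nothing, so it rewards overlaps and containments rather than full-length matches. The sketch below is a generic textbook dynamic program, not the authors' implementation; the function name and the scoring values (match 1, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def end_space_free_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best alignment score of a and b when end gaps are free."""
    cols = len(b) + 1
    # Leading gaps are free, so row 0 and column 0 of the table stay at 0.
    prev = [0] * cols
    best_last_col = 0  # best score seen in the last column so far
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        best_last_col = max(best_last_col, curr[-1])
        prev = curr
    # Trailing gaps are free too: the alignment may end anywhere in the
    # last row (all of a consumed) or the last column (all of b consumed).
    return max(max(prev), best_last_col)
```

Under these assumed scores, `end_space_free_score("GGGACGT", "ACGTTTT")` returns 4, reflecting the four-base overlap between the end of the first sequence and the start of the second. In a seed-and-extend setting, a score of this kind would be computed only around locations where a short exact seed already matches.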
Affiliation(s)
- Mohammed Uddin
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Mitch Sturge
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Lynette Peddle
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Darren D. O'Rielly
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Proton Rahman
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
65
Unique and atypical deletions in Prader-Willi syndrome reveal distinct phenotypes. Eur J Hum Genet 2011; 20:283-90. [PMID: 22045295 DOI: 10.1038/ejhg.2011.187] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.2] [Indexed: 11/08/2022]
Abstract
Prader-Willi syndrome (PWS) is a multisystem, contiguous gene disorder caused by an absence of paternally expressed genes within the 15q11.2-q13 region via one of three main genetic mechanisms: deletion of the paternally inherited 15q11.2-q13 region, maternal uniparental disomy, or an imprinting defect. The deletion class is typically subdivided into Type 1 and Type 2 based on their proximal breakpoints (BP1-BP3 and BP2-BP3, respectively). Despite PWS being a well-characterized genetic disorder, the roles of the specific genes contributing to various aspects of the phenotype are not well understood. Methylation-specific multiplex ligation-dependent probe amplification (MS-MLPA) is a recently developed technique that detects copy number changes and aberrant DNA methylation. In this study, we initially applied MS-MLPA to elucidate the deletion subtypes of 88 subjects. In our cohort, 32 had a Type 1 and 49 had a Type 2 deletion. The remaining seven subjects had unique or atypical deletions that were either smaller (n=5) or larger (n=2) than typically described and were further characterized by array-based comparative genome hybridization. In two subjects both the PWS region (15q11.2) and the newly described 15q13.3 microdeletion syndrome region were deleted. The subjects with a unique or an atypical deletion revealed distinct phenotypic features. In conclusion, unique or atypical deletions were found in ∼8% of the deletion subjects with PWS in our cohort. These novel deletions provide further insight into the potential role of several of the genes within the 15q11.2 and the 15q13.3 regions.
66
Cooper DN, Bacolla A, Férec C, Vasquez KM, Kehrer-Sawatzki H, Chen JM. On the sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease. Hum Mutat 2011; 32:1075-99. [PMID: 21853507 PMCID: PMC3177966 DOI: 10.1002/humu.21557] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.9] [Received: 03/27/2011] [Accepted: 06/17/2011] [Indexed: 12/21/2022]
Abstract
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The human genome is now recognized to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here, we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of noncanonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair and may serve to increase mutation frequencies in generalized fashion (i.e., both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom.
|
67
|
Abstract
Copy number variants (CNVs) play an important role in human disease and population diversity. Advancements in technology have allowed for the analysis of CNVs in thousands of individuals with disease in addition to thousands of controls. These studies have identified rare CNVs associated with neuropsychiatric diseases such as autism, schizophrenia, and intellectual disability. In addition, copy number polymorphisms (CNPs) are present at higher frequencies in the population, show high diversity in copy number, sequence, and structure, and have been associated with multiple phenotypes, primarily related to immune or environmental response. However, the landscape of copy number variation still remains largely unexplored, especially for smaller CNVs and those embedded within complex regions of the human genome. An integrated approach including characterization of single nucleotide variants and CNVs in a large number of individuals with disease and normal genomes holds the promise of thoroughly elucidating the genetic basis of human disease and diversity.
Affiliation(s)
- Santhosh Girirajan
- Department of Genome Sciences and Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA.
|
68
|
Church DM, Schneider VA, Graves T, Auger K, Cunningham F, Bouk N, Chen HC, Agarwala R, McLaren WM, Ritchie GRS, Albracht D, Kremitzki M, Rock S, Kotkiewicz H, Kremitzki C, Wollam A, Trani L, Fulton L, Fulton R, Matthews L, Whitehead S, Chow W, Torrance J, Dunn M, Harden G, Threadgold G, Wood J, Collins J, Heath P, Griffiths G, Pelan S, Grafham D, Eichler EE, Weinstock G, Mardis ER, Wilson RK, Howe K, Flicek P, Hubbard T. Modernizing reference genome assemblies. PLoS Biol 2011; 9:e1001091. [PMID: 21750661 PMCID: PMC3130012 DOI: 10.1371/journal.pbio.1001091] [Citation(s) in RCA: 327] [Impact Index Per Article: 25.2] [Indexed: 12/20/2022]
Affiliation(s)
- Deanna M Church
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America.
|
69
|
Alkan C, Coe BP, Eichler EE. Genome structural variation discovery and genotyping. Nat Rev Genet 2011; 12:363-76. [PMID: 21358748 DOI: 10.1038/nrg2958] [Citation(s) in RCA: 963] [Impact Index Per Article: 74.1] [Indexed: 12/13/2022]
Abstract
Comparisons of human genomes show that more base pairs are altered as a result of structural variation - including copy number variation - than as a result of point mutations. Here we review advances and challenges in the discovery and genotyping of structural variation. The recent application of massively parallel sequencing methods has complemented microarray-based methods and has led to an exponential increase in the discovery of smaller structural-variation events. Some global discovery biases remain, but the integration of experimental and computational approaches is proving fruitful for accurate characterization of the copy, content and structure of variable regions. We argue that the long-term goal should be routine, cost-effective and high quality de novo assembly of human genomes to comprehensively assess all classes of structural variation.
Affiliation(s)
- Can Alkan
- Department of Genome Sciences, University of Washington School of Medicine, Foege S413C, 3720 15th Ave NE, Seattle, Washington, USA
|
70
|
Microdeletion/microduplication of proximal 15q11.2 between BP1 and BP2: a susceptibility region for neurological dysfunction including developmental and language delay. Hum Genet 2011; 130:517-28. [PMID: 21359847 DOI: 10.1007/s00439-011-0970-4] [Citation(s) in RCA: 191] [Impact Index Per Article: 14.7] [Received: 11/10/2010] [Accepted: 02/10/2011] [Indexed: 10/18/2022]
Abstract
The proximal long arm of chromosome 15 has segmental duplications located at breakpoints BP1-BP5 that mediate the generation of NAHR-related microdeletions and microduplications. The classical Prader-Willi/Angelman syndrome deletion is flanked by either of the proximal BP1 or BP2 breakpoints and the distal BP3 breakpoint. The larger Type I deletions are flanked by BP1 and BP3 in both Prader-Willi and Angelman syndrome subjects. Those with this deletion are reported to have a more severe phenotype than individuals with either Type II deletions (BP2-BP3) or uniparental disomy 15. The BP1-BP2 region spans approximately 500 kb and contains four evolutionarily conserved genes that are not imprinted. Reports of mutations or disturbed expression of these genes appear to impact behavioral and neurological function in affected individuals. Recently, reports of deletions and duplications flanked by BP1 and BP2 suggest an association with speech and motor delays, behavioral problems, seizures, and autism. We present a large cohort of subjects with copy number alteration of BP1 to BP2 with common phenotypic features. These include autism, developmental delay, motor and language delays, and behavioral problems, which were present in both cytogenetic groups. Parental studies demonstrated phenotypically normal carriers in several instances, and mildly affected carriers in others, complicating phenotypic association and/or causality. Possible explanations for these results include reduced penetrance, altered gene dosage on a particular genetic background, or a susceptibility region as reported for other areas of the genome implicated in autism and behavior disturbances.
|
71
|
Kloosterman WP, Guryev V, van Roosmalen M, Duran KJ, de Bruijn E, Bakker SCM, Letteboer T, van Nesselrooij B, Hochstenbach R, Poot M, Cuppen E. Chromothripsis as a mechanism driving complex de novo structural rearrangements in the germline. Hum Mol Genet 2011; 20:1916-24. [PMID: 21349919 DOI: 10.1093/hmg/ddr073] [Citation(s) in RCA: 236] [Impact Index Per Article: 18.2] [Indexed: 01/17/2023]
Abstract
A variety of mutational mechanisms shape the dynamic architecture of human genomes and occasionally result in congenital defects and disease. Here, we used genome-wide long mate-pair sequencing to systematically screen for inherited and de novo structural variation in a trio including a child with severe congenital abnormalities. We identified 4321 inherited structural variants and 17 de novo rearrangements. We characterized the de novo structural changes to the base-pair level revealing a complex series of balanced inter- and intra-chromosomal rearrangements consisting of 12 breakpoints involving chromosomes 1, 4 and 10. Detailed inspection of breakpoint regions indicated that a series of simultaneous double-stranded DNA breaks caused local shattering of chromosomes. Fusion of the resulting chromosomal fragments involved non-homologous end joining, since junction points displayed limited or no homology and small insertions and deletions. The pattern of random joining of chromosomal fragments that we observe here strongly resembles the somatic rearrangement patterns--termed chromothripsis--that have recently been described in deranged cancer cells. We conclude that a similar mechanism may also drive the formation of de novo structural variation in the germline.
Affiliation(s)
- Wigard P Kloosterman
- Department of Medical Genetics, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG Utrecht, The Netherlands
|
72
|
McGuire MM, Bowden W, Engel NJ, Ahn HW, Kovanci E, Rajkovic A. Genomic analysis using high-resolution single-nucleotide polymorphism arrays reveals novel microdeletions associated with premature ovarian failure. Fertil Steril 2011; 95:1595-600. [PMID: 21256485 DOI: 10.1016/j.fertnstert.2010.12.052] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Received: 09/09/2010] [Revised: 11/12/2010] [Accepted: 12/22/2010] [Indexed: 01/12/2023]
Abstract
OBJECTIVE: To analyze DNA from women with premature ovarian failure (POF) for genome-wide copy-number variations (CNVs), focusing on novel autosomal microdeletions. DESIGN: Case-control genetic association study. SETTING: Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, Texas. PATIENT(S): Of 89 POF patients, eight experienced primary amenorrhea and 81 exhibited secondary amenorrhea before age 40 years. INTERVENTION(S): Genomic DNA from peripheral blood samples was analyzed for CNVs using high-resolution single-nucleotide polymorphism (SNP) arrays. MAIN OUTCOME MEASURE(S): Identification of novel CNVs in 89 POF cases, using the Database of Genomic Variants as a control population. RESULT(S): A total of 198 autosomal CNVs were detected by SNP arrays, ranging in size from 0.1 Mb to 3.4 Mb. These CNVs (>0.1 Mb) included 17 novel microduplications and seven novel microdeletions, six of which contained the coding regions 8q24.13, 10p15-p14, 10q23.31, 10q26.3, 15q25.2, and 18q21.32. Most of the novel CNVs were derived from autosomes rather than the X chromosome. CONCLUSION(S): The present pilot study revealed novel microdeletions/microduplications in women with POF. Two novel microdeletions caused haploinsufficiency for SYCE1 and CPEB1, genes known to cause ovarian failure in knockout mouse models. Chromosomal microarrays may be a useful adjunct to conventional karyotyping when evaluating genomic imbalances in women with POF.
Affiliation(s)
- Megan M McGuire
- Department of Obstetrics, Gynecology, and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA
|
73
|
A human genome structural variation sequencing resource reveals insights into mutational mechanisms. Cell 2010; 143:837-47. [PMID: 21111241 DOI: 10.1016/j.cell.2010.10.027] [Citation(s) in RCA: 210] [Impact Index Per Article: 15.0] [Received: 07/06/2010] [Revised: 09/15/2010] [Accepted: 10/15/2010] [Indexed: 12/31/2022]
Abstract
Understanding the prevailing mutational mechanisms responsible for human genome structural variation requires uniformity in the discovery of allelic variants and precision in terms of breakpoint delineation. We develop a resource based on capillary end sequencing of 13.8 million fosmid clones from 17 human genomes and characterize the complete sequence of 1054 large structural variants corresponding to 589 deletions, 384 insertions, and 81 inversions. We analyze the 2081 breakpoint junctions and infer potential mechanism of origin. Three mechanisms account for the bulk of germline structural variation: microhomology-mediated processes involving short (2-20 bp) stretches of sequence (28%), nonallelic homologous recombination (22%), and L1 retrotransposition (19%). The high quality and long-range continuity of the sequence reveals more complex mutational mechanisms, including repeat-mediated inversions and gene conversion, that are most often missed by other methods, such as comparative genomic hybridization, single nucleotide polymorphism microarrays, and next-generation sequencing.
|
74
|
Girirajan S, Eichler EE. Phenotypic variability and genetic susceptibility to genomic disorders. Hum Mol Genet 2010; 19:R176-87. [PMID: 20807775 PMCID: PMC2953748 DOI: 10.1093/hmg/ddq366] [Citation(s) in RCA: 201] [Impact Index Per Article: 14.4] [Received: 07/28/2010] [Revised: 07/28/2010] [Accepted: 08/24/2010] [Indexed: 11/13/2022]
Abstract
The duplication architecture of the human genome predisposes our species to recurrent copy number variation and disease. Emerging data suggest that this mechanism of mutation contributes to both common and rare diseases. Two features regarding this form of mutation have emerged. First, common structural polymorphisms create susceptible and protective chromosomal architectures. These structural polymorphisms occur at varying frequencies in populations, leading to different susceptibility and ethnic predilection. Second, a subset of rearrangements shows extreme variability in expressivity. We propose that two types of genomic disorders may be distinguished: syndromic forms where the phenotypic features are largely invariant and those where the same molecular lesion associates with a diverse set of diagnoses including epilepsy, schizophrenia, autism, intellectual disability and congenital malformations. Copy number variation analyses of patient genomes reveal that disease type and severity may be explained by the occurrence of additional rare events and their inheritance within families. We propose that the overall burden of copy number variants creates differing sensitized backgrounds during development leading to different thresholds and disease outcomes. We suggest that the accumulation of multiple high-penetrant alleles of low frequency may serve as a more general model for complex genetic diseases, posing a significant challenge for diagnostics and disease management.
Affiliation(s)
- Evan E. Eichler
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington School of Medicine, PO Box 355065, Foege S413C, 3720 15th Avenue NE, Seattle, WA 98195, USA
|