51
Hedges DJ, Belancio VP. Restless genomes: humans as a model organism for understanding host-retrotransposable element dynamics. Advances in Genetics 2011; 73:219-62. [PMID: 21310298 DOI: 10.1016/b978-0-12-380860-8.00006-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/25/2022]
Abstract
Since their initial discovery in maize, there have been various attempts to categorize the relationship between transposable elements (TEs) and their host organisms. These have ranged from TEs being selfish parasites to their role as essential, functional components of organismal biology. Research over the past several decades has, in many respects, only served to complicate the issue even further. On the one hand, investigators have amassed substantial evidence concerning the negative effects that TE-mutagenic activity can have on host genomes and organismal fitness. On the other hand, we find an increasing number of examples, across several taxa, of TEs being incorporated into functional biological roles for their host organism. Some 45% of our own genomes are comprised of TE copies. While many of these copies are dormant, having lost their ability to mobilize, several lineages continue to actively proliferate in modern human populations. With its complement of ancestral and active TEs, the human genome exhibits key aspects of the host-TE dynamic that has played out since early on in organismal evolution. In this review, we examine what insights the particularly well-characterized human system can provide regarding the nature of the host-TE interaction.
Affiliation(s)
- Dale J Hedges
- Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, Florida, USA
52
Hu TT, Pattyn P, Bakker EG, Cao J, Cheng JF, Clark RM, Fahlgren N, Fawcett JA, Grimwood J, Gundlach H, Haberer G, Hollister JD, Ossowski S, Ottilar RP, Salamov AA, Schneeberger K, Spannagl M, Wang X, Yang L, Nasrallah ME, Bergelson J, Carrington JC, Gaut BS, Schmutz J, Mayer KFX, Van de Peer Y, Grigoriev IV, Nordborg M, Weigel D, Guo YL. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet 2011; 43:476-81. [PMID: 21478890 PMCID: PMC3083492 DOI: 10.1038/ng.807] [Citation(s) in RCA: 596] [Impact Index Per Article: 45.8] [Received: 09/13/2010] [Accepted: 03/18/2011] [Indexed: 12/19/2022]
Abstract
We report the 207-Mb genome sequence of the North American Arabidopsis lyrata strain MN47 based on 8.3× dideoxy sequence coverage. We predict 32,670 genes in this outcrossing species compared to the 27,025 genes in the selfing species Arabidopsis thaliana. The much smaller 125-Mb genome of A. thaliana, which diverged from A. lyrata 10 million years ago, likely constitutes the derived state for the family. We found evidence for DNA loss from large-scale rearrangements, but most of the difference in genome size can be attributed to hundreds of thousands of small deletions, mostly in noncoding DNA and transposons. Analysis of deletions and insertions still segregating in A. thaliana indicates that the process of DNA loss is ongoing, suggesting pervasive selection for a smaller genome. The high-quality reference genome sequence for A. lyrata will be an important resource for functional, evolutionary and ecological studies in the genus Arabidopsis.
Affiliation(s)
- Tina T Hu
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, USA
53
Nam GH, Ahn K, Bae JH, Han K, Lee CE, Park KD, Lee SH, Cho BW, Kim HS. Genomic structure and expression analyses of the PYGM gene in the thoroughbred horse. Zoolog Sci 2011; 28:276-80. [PMID: 21466345 DOI: 10.2108/zsj.28.276] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
Abstract
Muscle glycogen phosphorylase (PYGM) has been shown to catalyze the degradation of glycogen to glucose-1-phosphate. The PYGM gene can contribute to providing energy to the body by disassembling the glycogen in muscle. Here, we analyzed the genomic structure and expression of the PYGM gene in the thoroughbred horse. The PYGM gene, containing several transposable elements (MIRs, LINEs, and MERs), was highly conserved in mammalian genomes. In order to understand the expression of the horse PYGM gene, we performed quantitative RT-PCR using 11 thoroughbred horse tissue samples. The horse PYGM gene was broadly expressed in all tissues tested. In particular, the highest expression of the horse PYGM gene was observed in skeletal muscle tissue relative to the other tissues. Interestingly, the horse PYGM gene contains fewer mobile elements than its human ortholog, resulting in an increase in the structural stability of the PYGM gene sequence. This study provides insights into the genomic structure of the horse PYGM gene that may be useful in future studies of its association with exercise capability.
Affiliation(s)
- Gyu-Hwi Nam
- Department of Biological Sciences, College of Natural Sciences, Pusan National University, Busan 609-735, Republic of Korea
54
Harris EE. Nonadaptive processes in primate and human evolution. American Journal of Physical Anthropology 2011; 143 Suppl 51:13-45. [PMID: 21086525 DOI: 10.1002/ajpa.21439] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 01/12/2023]
Abstract
Evolutionary biology has tended to focus on adaptive evolution by positive selection as the primum mobile of evolutionary trajectories in species while underestimating the importance of nonadaptive evolutionary processes. In this review, I describe evidence that suggests that primate and human evolution has been strongly influenced by nonadaptive processes, particularly random genetic drift and mutation. This is evidenced by three fundamental effects: a relative relaxation of selective constraints (i.e., purifying selection), a relative increase in the fixation of slightly deleterious mutations, and a general reduction in the efficacy of positive selection. These effects are observed in protein-coding, regulatory regions, and in gene expression data, as well as in an augmentation of fixation of large-scale mutations, including duplicated genes, mobile genetic elements, and nuclear mitochondrial DNA. The evidence suggests a general population-level explanation such as a reduction in effective population size (N(e)). This would have tipped the balance between the evolutionary forces of natural selection and random genetic drift toward genetic drift for variants having small selective effects. After describing these proximate effects, I describe the potential consequences of these effects for primate and human evolution. For example, an increase in the fixation of slightly deleterious mutations could potentially have led to an increase in the fixation rate of compensatory mutations that act to suppress the effects of slightly deleterious substitutions. The potential consequences of compensatory evolution for the evolution of novel gene functions and in potentially confounding the detection of positively selected genes are explored. 
The consequences of the passive accumulation of large-scale genomic mutations by genetic drift are unclear, though evidence suggests that new gene copies as well as insertions of transposable elements into genes can potentially lead to adaptive phenotypes. Finally, because a decrease in selective constraint at the genetic level is expected to have effects at the morphological level, I review studies that compare rates of morphological change in various mammalian and island populations where N(e) is reduced. Furthermore, I discuss evidence that suggests that craniofacial morphology in the Homo lineage has shifted from an evolutionary rate constrained by purifying selection toward a neutral evolutionary rate.
Affiliation(s)
- Eugene E Harris
- Department of Biological Sciences and Geology, Queensborough Community College, City University of New York, Bayside, NY 10364, USA.
55
Al-Shahrour F, Minguez P, Marqués-Bonet T, Gazave E, Navarro A, Dopazo J. Selection upon genome architecture: conservation of functional neighborhoods with changing genes. PLoS Comput Biol 2010; 6:e1000953. [PMID: 20949098 PMCID: PMC2951340 DOI: 10.1371/journal.pcbi.1000953] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Received: 12/06/2009] [Accepted: 09/08/2010] [Indexed: 11/19/2022]
Abstract
An increasing number of evidences show that genes are not distributed randomly across eukaryotic chromosomes, but rather in functional neighborhoods. Nevertheless, the driving force that originated and maintains such neighborhoods is still a matter of controversy. We present the first detailed multispecies cartography of genome regions enriched in genes with related functions and study the evolutionary implications of such clustering. Our results indicate that the chromosomes of higher eukaryotic genomes contain up to 12% of genes arranged in functional neighborhoods, with a high level of gene co-expression, which are consistently distributed in phylogenies. Unexpectedly, neighborhoods with homologous functions are formed by different (non-orthologous) genes in different species. Actually, instead of being conserved, functional neighborhoods present a higher degree of synteny breaks than the genome average. This scenario is compatible with the existence of selective pressures optimizing the coordinated transcription of blocks of functionally related genes. If these neighborhoods were broken by chromosomal rearrangements, selection would favor further rearrangements reconstructing other neighborhoods of similar function. The picture arising from this study is a dynamic genomic landscape with a high level of functional organization. We describe here the most extensive functional cartography of the genomes of multiple species carried out to date. Our study shows, for the first time, how neighborhoods of functionally related genes arise and how they are maintained through evolution following a pattern that is fully consistent with the evolutionary trees of the analyzed species. Contrary to what would be expected, such neighborhoods are not composed of the same genes in different species but rather by genes unrelated, annotated, however, with the same function. 
Our analysis also reveals that such neighborhoods are dynamically rebuilt in a way that, while the particular genes often change, it is the function of the genes present in the neighborhood, as the ultimate target of selection, that is preserved.
Affiliation(s)
- Fátima Al-Shahrour
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Pablo Minguez
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Tomás Marqués-Bonet
- Institut de Biologia Evolutiva, Universitat Pompeu Fabra (UPF) and Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
- Elodie Gazave
- Institut de Biologia Evolutiva, Universitat Pompeu Fabra (UPF) and Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain
- Arcadi Navarro
- Institut de Biologia Evolutiva, Universitat Pompeu Fabra (UPF) and Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain
- Population Genomics Node (National Institute for Bioinformatics, INB), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Joaquín Dopazo
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- CIBER de Enfermedades Raras (CIBERER), Valencia, Spain
- Functional Genomics Node (National Institute for Bioinformatics, INB), CIPF, Valencia, Spain
56
Lin JY, Stupar RM, Hans C, Hyten DL, Jackson SA. Structural and functional divergence of a 1-Mb duplicated region in the soybean (Glycine max) genome and comparison to an orthologous region from Phaseolus vulgaris. The Plant Cell 2010; 22:2545-61. [PMID: 20729383 PMCID: PMC2947175 DOI: 10.1105/tpc.110.074229] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 01/22/2010] [Revised: 07/21/2010] [Accepted: 07/30/2010] [Indexed: 05/03/2023]
Abstract
Soybean (Glycine max) has undergone at least two rounds of polyploidization, resulting in a paleopolyploid genome that is a mosaic of homoeologous regions. To determine the structural and functional impact of these duplications, we sequenced two ~1-Mb homoeologous regions of soybean, Gm8 and Gm15, derived from the most recent ~13 million year duplication event and the orthologous region from common bean (Phaseolus vulgaris), Pv5. We observed inversions leading to major structural variation and a bias between the two chromosome segments as Gm15 experienced more gene movement (gene retention rate of 81% in Gm15 versus 91% in Gm8) and a nearly twofold increase in the deletion of long terminal repeat (LTR) retrotransposons via solo LTR formation. Functional analyses of Gm15 and Gm8 revealed decreases in gene expression and synonymous substitution rates for Gm15, for instance, a 38% increase in transcript levels from Gm8 relative to Gm15. Transcriptional divergence of homoeologs was found based on expression patterns among seven tissues and developmental stages. Our results indicate asymmetric evolution between homoeologous regions of soybean as evidenced by structural changes and expression variances of homoeologous genes.
Affiliation(s)
- Jer-Young Lin
- Molecular and Evolutionary Genetics, Purdue University, West Lafayette, Indiana 47907
- Robert M. Stupar
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
- Christian Hans
- Molecular and Evolutionary Genetics, Purdue University, West Lafayette, Indiana 47907
- David L. Hyten
- Soybean Genomics and Improvement Lab, U.S. Department of Agriculture–Agricultural Research Service, Beltsville, Maryland 20705
- Scott A. Jackson
- Molecular and Evolutionary Genetics, Purdue University, West Lafayette, Indiana 47907
57
Konkel MK, Batzer MA. A mobile threat to genome stability: The impact of non-LTR retrotransposons upon the human genome. Semin Cancer Biol 2010; 20:211-21. [PMID: 20307669 DOI: 10.1016/j.semcancer.2010.03.001] [Citation(s) in RCA: 130] [Impact Index Per Article: 9.3] [Received: 01/15/2010] [Revised: 03/04/2010] [Accepted: 03/16/2010] [Indexed: 02/06/2023]
Abstract
It is now commonly agreed that the human genome is not the stable entity originally presumed. Deletions, duplications, inversions, and insertions are common, and contribute significantly to genomic structural variations (SVs). Their collective impact generates much of the inter-individual genomic diversity observed among humans. Not only do these variations change the structure of the genome; they may also have functional implications, e.g. altered gene expression. Some SVs have been identified as the cause of genetic disorders, including cancer predisposition. Cancer cells are notorious for their genomic instability, and often show genomic rearrangements at the microscopic and submicroscopic level to which transposable elements (TEs) contribute. Here, we review the role of TEs in genome instability, with particular focus on non-LTR retrotransposons. Currently, three non-LTR retrotransposon families - long interspersed element 1 (L1), SVA (short interspersed element (SINE-R), variable number of tandem repeats (VNTR), and Alu), and Alu (a SINE) elements - mobilize in the human genome, and cause genomic instability through both insertion- and post-insertion-based mutagenesis. Due to the abundance and high sequence identity of TEs, they frequently mislead the homologous recombination repair pathway into non-allelic homologous recombination, causing deletions, duplications, and inversions. While less comprehensively studied, non-LTR retrotransposon insertions and TE-mediated rearrangements are probably more common in cancer cells than in healthy tissue. This may be at least partially attributed to the commonly seen global hypomethylation as well as general epigenetic dysfunction of cancer cells. Where possible, we provide examples that impact cancer predisposition and/or development.
Affiliation(s)
- Miriam K Konkel
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, Baton Rouge, LA 70803, USA
58
Meyer TJ, Srikanta D, Conlin EM, Batzer MA. Heads or tails: L1 insertion-associated 5' homopolymeric sequences. Mob DNA 2010; 1:7. [PMID: 20226075 PMCID: PMC2837659 DOI: 10.1186/1759-8753-1-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/18/2009] [Accepted: 02/01/2010] [Indexed: 12/01/2022]
Abstract
Background: L1s are one of the most successful autonomous mobile elements in primate genomes. These elements comprise as much as 17% of primate genomes with the majority of insertions occurring via target primed reverse transcription (TPRT). Twin priming, a variant of TPRT, can result in unusual DNA sequence architecture. These insertions appear to be inverted, truncated L1s flanked by target site duplications. Results: We report on loci with sequence architecture consistent with variants of the twin priming mechanism and introduce dual priming, a mechanism that could generate similar sequence characteristics. These insertions take the form of truncated L1s with hallmarks of classical TPRT insertions but having a poly(T) simple repeat at the 5' end of the insertion. We identified loci using computational analyses of the human, chimpanzee, orangutan, rhesus macaque and marmoset genomes. Insertion site characteristics for all putative loci were experimentally verified. Conclusions: The 39 loci that passed our computational and experimental screens probably represent inversion-deletion events which resulted in a 5' inverted poly(A) tail. Based on our observations of these loci and their local sequence properties, we conclude that they most probably represent twin priming events with unusually short non-inverted portions. We postulate that dual priming could, theoretically, produce the same patterns. The resulting homopolymeric stretches associated with these insertion events may promote genomic instability and create potential target sites for future retrotransposition events.
Affiliation(s)
- Thomas J Meyer
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
- Deepa Srikanta
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
- Erin M Conlin
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
- Mark A Batzer
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
59
Delprat A, Negre B, Puig M, Ruiz A. The transposon Galileo generates natural chromosomal inversions in Drosophila by ectopic recombination. PLoS One 2009; 4:e7883. [PMID: 19936241 PMCID: PMC2775673 DOI: 10.1371/journal.pone.0007883] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Received: 06/18/2009] [Accepted: 10/01/2009] [Indexed: 11/25/2022]
Abstract
Background: Transposable elements (TEs) are responsible for the generation of chromosomal inversions in several groups of organisms. However, in Drosophila and other Dipterans, where inversions are abundant both as intraspecific polymorphisms and interspecific fixed differences, the evidence for a role of TEs is scarce. Previous work revealed that the transposon Galileo was involved in the generation of two polymorphic inversions of Drosophila buzzatii. Methodology/Principal Findings: To assess the impact of TEs in Drosophila chromosomal evolution and shed light on the mechanism involved, we isolated and sequenced the two breakpoints of another widespread polymorphic inversion from D. buzzatii, 2z3. In the non-inverted chromosome, the 2z3 distal breakpoint was located between genes CG2046 and CG10326 whereas the proximal breakpoint lies between two novel genes that we have named Dlh and Mdp. In the inverted chromosome, the analysis of the breakpoint sequences revealed relatively large insertions (2,870-bp and 4,786-bp long) including two copies of the transposon Galileo (subfamily Newton), one at each breakpoint, plus several other TEs. The two Galileo copies: (i) are inserted in opposite orientation; (ii) present exchanged target site duplications; and (iii) are both chimeric. Conclusions/Significance: Our observations provide the best evidence gathered so far for the role of TEs in the generation of Drosophila inversions. In addition, they show unequivocally that ectopic recombination is the causative mechanism. The fact that the three polymorphic D. buzzatii inversions investigated so far were generated by the same transposon family is remarkable and is conceivably due to Galileo's unusual structure and current (or recent) transpositional activity.
Affiliation(s)
- Alejandra Delprat
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain
60
Transposable elements in gene regulation and in the evolution of vertebrate genomes. Curr Opin Genet Dev 2009; 19:607-12. [PMID: 19914058 DOI: 10.1016/j.gde.2009.10.013] [Citation(s) in RCA: 112] [Impact Index Per Article: 7.5] [Received: 09/16/2009] [Revised: 10/20/2009] [Accepted: 10/26/2009] [Indexed: 01/30/2023]
Abstract
Repetitive DNA and in particular transposable elements have been intimately linked to eukaryotic genomes for millions of years. Once overlooked for being only a collection of selfish debris and a nuisance for sequence assembly, genomic repeats are now being recognized as a key driving force in genome evolution. Indeed, by changing the DNA landscape of genomes, transposable elements have been a rich source of innovation in genes, regulatory elements and genome structures. In this review, I will focus on recent advances that demonstrate that genomic repeats have had a global impact on vertebrate gene regulatory networks. I will also summarize results that show how transposable elements have been a major catalyst of structural rearrangements throughout evolution.
61
Cordaux R, Batzer MA. The impact of retrotransposons on human genome evolution. Nat Rev Genet 2009; 10:691-703. [PMID: 19763152 DOI: 10.1038/nrg2640] [Citation(s) in RCA: 1104] [Impact Index Per Article: 73.6] [Indexed: 02/07/2023]
Abstract
Their ability to move within genomes gives transposable elements an intrinsic propensity to affect genome evolution. Non-long terminal repeat (LTR) retrotransposons, including LINE-1, Alu and SVA elements, have proliferated over the past 80 million years of primate evolution and now account for approximately one-third of the human genome. In this Review, we focus on this major class of elements and discuss the many ways that they affect the human genome: from generating insertion mutations and genomic instability to altering gene expression and contributing to genetic innovation. Increasingly detailed analyses of human and other primate genomes are revealing the scale and complexity of the past and current contributions of non-LTR retrotransposons to genomic change in the human lineage.
Affiliation(s)
- Richard Cordaux
- CNRS UMR 6556 Ecologie, Evolution, Symbiose, Université de Poitiers, 40 Avenue du Recteur Pineau, Poitiers, France
62
Cruciform-forming inverted repeats appear to have mediated many of the microinversions that distinguish the human and chimpanzee genomes. Chromosome Res 2009; 17:469-83. [PMID: 19475482 DOI: 10.1007/s10577-009-9039-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 01/15/2009] [Revised: 04/08/2009] [Accepted: 04/08/2009] [Indexed: 10/20/2022]
Abstract
Submicroscopic inversions have contributed significantly to the genomic divergence between humans and chimpanzees over evolutionary time. Those microinversions which are flanked by segmental duplications (SDs) are presumed to have originated via non-allelic homologous recombination between SDs arranged in inverted orientation. However, the nature of the mechanisms underlying those inversions which are not flanked by SDs remains unclear. We have investigated 35 such inversions, ranging in size from 51-nt to 22056-nt, with the goal of characterizing the DNA sequences in the breakpoint-flanking regions. Using the macaque genome as an outgroup, we determined the lineage specificity of these inversions and noted that the majority (N = 31; 89%) were associated with deletions (of length between 1-nt and 6754-nt) immediately adjacent to one or both inversion breakpoints. Overrepresentations of both direct and inverted repeats, ≥6-nt in length and capable of non-B DNA structure formation, were noted in the vicinity of breakpoint junctions suggesting that these repeats could have contributed to double strand breakage. Inverted repeats capable of cruciform structure formation were also found to be a common feature of the inversion breakpoint-flanking regions, consistent with these inversions having originated through the resolution of Holliday junction-like cruciforms. Sequences capable of non-B DNA structure formation have previously been implicated in promoting gross deletions and translocations causing human genetic disease. We conclude that non-B DNA forming sequences may also have promoted the occurrence of mutations in an evolutionary context, giving rise to at least some of the inversion/deletions which now serve to distinguish the human and chimpanzee genomes.
63
Xing J, Zhang Y, Han K, Salem AH, Sen SK, Huff CD, Zhou Q, Kirkness EF, Levy S, Batzer MA, Jorde LB. Mobile elements create structural variation: analysis of a complete human genome. Genome Res 2009; 19:1516-26. [PMID: 19439515 DOI: 10.1101/gr.091827.109] [Citation(s) in RCA: 220] [Impact Index Per Article: 14.7] [Indexed: 12/15/2022]
Abstract
Structural variants (SVs) are common in the human genome. Because approximately half of the human genome consists of repetitive, transposable DNA sequences, it is plausible that these elements play an important role in generating SVs in humans. Sequencing of the diploid genome of one individual human (HuRef) affords us the opportunity to assess, for the first time, the impact of mobile elements on SVs in an individual in a thorough and unbiased fashion. In this study, we systematically evaluated more than 8000 SVs to identify mobile element-associated SVs as small as 100 bp and specific to the HuRef genome. Combining computational and experimental analyses, we identified and validated 706 mobile element insertion events (including Alu, L1, SVA elements, and nonclassical insertions), which added more than 305 kb of new DNA sequence to the HuRef genome compared with the Human Genome Project (HGP) reference sequence (hg18). We also identified 140 mobile element-associated deletions, which removed approximately 126 kb of sequence from the HuRef genome. Overall, approximately 10% of the HuRef-specific indels larger than 100 bp are caused by mobile element-associated events. More than one-third of the insertion/deletion events occurred in genic regions, and new Alu insertions occurred in exons of three human genes. Based on the number of insertions and the estimated time to the most recent common ancestor of HuRef and the HGP reference genome, we estimated the Alu, L1, and SVA retrotransposition rates to be one in 21 births, 212 births, and 916 births, respectively. This study presents the first comprehensive analysis of mobile element-related structural variants in the complete DNA sequence of an individual and demonstrates that mobile elements play an important role in generating inter-individual structural variation.
Affiliation(s)
- Jinchuan Xing
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84109, USA