1
Gene Conversion amongst Alu SINE Elements. Genes (Basel) 2021; 12:genes12060905. [PMID: 34208107 PMCID: PMC8230782 DOI: 10.3390/genes12060905] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/09/2021] [Revised: 05/30/2021] [Accepted: 06/08/2021] [Indexed: 11/17/2022] Open
Abstract
The process of non-allelic gene conversion acts on homologous sequences during recombination, replacing parts of one with the other to make them uniform. Such concerted evolution is best described as paralogous ribosomal RNA gene unification that serves to preserve the essential house-keeping functions of the converted genes. Transposed elements (TE), especially Alu short interspersed elements (SINE) that have more than a million copies in primate genomes, are a significant source of homologous units and a verified target of gene conversion. The consequences of such a recombination-based process are diverse, including multiplications of functional TE internal binding domains and, for evolutionists, confusing divergent annotations of orthologous transposable elements in related species. We systematically extracted and compared 68,097 Alu insertions in various primates looking for potential events of TE gene conversion and discovered 98 clear cases of Alu-Alu gene conversion, including 64 cases for which the direction of conversion was identified (e.g., AluS conversion to AluY). Gene conversion also does not necessarily affect the entire homologous sequence, and we detected 69 cases of partial gene conversion that resulted in virtual hybrids of two elements. Phylogenetic screening of gene-converted Alus revealed three clear hotspots of the process in the ancestors of Catarrhini, Hominoidea, and gibbons. In general, our systematic screening of orthologous primate loci for gene-converted TEs provides a new strategy and view of a post-integrative process that changes the identities of such elements.
2
Choe SH, Park SJ, Cho HM, Park HR, Lee JR, Kim YH, Huh JW. A single mutation in the ACTR8 gene associated with lineage-specific expression in primates. BMC Evol Biol 2020; 20:66. [PMID: 32503430 PMCID: PMC7275561 DOI: 10.1186/s12862-020-01620-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/04/2020] [Accepted: 04/29/2020] [Indexed: 12/17/2022] Open
Abstract
Background: Alternative splicing (AS) generates various transcripts from a single gene and thus plays a significant role in transcriptomic diversity and proteomic complexity. Alu elements are primate-specific transposable elements (TEs) and can provide a donor or acceptor site for AS. In a study on TE-mediated AS, we recently identified a novel AluSz6-exonized ACTR8 transcript of the crab-eating monkey (Macaca fascicularis). In the present study, we sought to determine the molecular mechanism of AluSz6 exonization of the ACTR8 gene and investigate its evolutionary and functional consequences in the crab-eating monkey.
Results: We performed RT-PCR and genomic PCR to analyze AluSz6 exonization in the ACTR8 gene and the expression of the AluSz6-exonized transcript in nine primate samples, including prosimians, New World monkeys, Old World monkeys, and hominoids. AluSz6 integration was estimated to have occurred before the divergence of simians and prosimians. The Alu-exonized transcript obtained by AS was lineage-specific and expressed only in Old World monkeys, apes, and humans. This lineage-specific expression was caused by a single G duplication in AluSz6, which provides a new canonical 5′ splicing site. We further identified other alternative transcripts that were unaffected by the AluSz6 insertion. Finally, we observed that the alternative transcripts were transcribed into new isoforms with C-terminus deletion, and in silico analysis showed that these isoforms do not have a destructive function.
Conclusions: The single G duplication in the TE sequence is the source of TE exonization and AS, and this mutation may have led to different fates of ACTR8 gene expression during primate evolution.
Affiliation(s)
- Se-Hee Choe: National Primate Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, 28116, Korea; Department of Functional Genomics, KRIBB School of Bioscience, Korea University of Science & Technology (UST), Daejeon, 34113, Korea
- Sang-Je Park: National Primate Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, 28116, Korea
- Hyeon-Mu Cho: National Primate Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, 28116, Korea; Department of Functional Genomics, KRIBB School of Bioscience, Korea University of Science & Technology (UST), Daejeon, 34113, Korea
- Hye-Ri Park: National Primate Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, 28116, Korea; Department of Functional Genomics, KRIBB School of Bioscience, Korea University of Science & Technology (UST), Daejeon, 34113, Korea
- Ja-Rang Lee: Primate Resource Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Jeongeup, 56216, Korea
- Young-Hyun Kim: National Primate Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, 28116, Korea; Department of Functional Genomics, KRIBB School of Bioscience, Korea University of Science & Technology (UST), Daejeon, 34113, Korea
- Jae-Won Huh: National Primate Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, 28116, Korea; Department of Functional Genomics, KRIBB School of Bioscience, Korea University of Science & Technology (UST), Daejeon, 34113, Korea
3
Fawcett JA, Innan H. The Role of Gene Conversion between Transposable Elements in Rewiring Regulatory Networks. Genome Biol Evol 2019; 11:1723-1729. [PMID: 31209488 PMCID: PMC6598467 DOI: 10.1093/gbe/evz124] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Accepted: 06/11/2019] [Indexed: 12/23/2022] Open
Abstract
Nature has found many ways to utilize transposable elements (TEs) throughout evolution. Many molecular and cellular processes depend on DNA-binding proteins recognizing hundreds or thousands of similar DNA motifs dispersed throughout the genome that are often provided by TEs. It has been suggested that TEs play an important role in the evolution of such systems, in particular, the rewiring of gene regulatory networks. One mechanism that can further enhance the rewiring of regulatory networks is nonallelic gene conversion between copies of TEs. Here, we will first review evidence for nonallelic gene conversion in TEs. Then, we will illustrate the benefits nonallelic gene conversion provides in rewiring regulatory networks. For instance, nonallelic gene conversion between TE copies offers an alternative mechanism to spread beneficial mutations that improve the network; it allows multiple mutations to be combined and transferred together; and it allows natural selection to work efficiently in spreading beneficial mutations and removing disadvantageous mutations. Future studies examining the role of nonallelic gene conversion in the evolution of TEs should help us to better understand how TEs have contributed to evolution.
4
Structural variation and its potential impact on genome instability: Novel discoveries in the EGFR landscape by long-read sequencing. PLoS One 2020; 15:e0226340. [PMID: 31940362 PMCID: PMC6961855 DOI: 10.1371/journal.pone.0226340] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/16/2019] [Accepted: 11/25/2019] [Indexed: 12/29/2022] Open
Abstract
Structural variation (SV) is typically defined as variation within the human genome that exceeds 50 base pairs (bp). SV may be copy number neutral or it may involve duplications, deletions, and complex rearrangements. Recent studies have shown SV to be associated with many human diseases. However, studies of SV have been challenging due to technological constraints. With the advent of third generation (long-read) sequencing technology, exploration of longer stretches of DNA not easily examined previously has been made possible. In the present study, we utilized third generation (long-read) sequencing techniques to examine SV in the EGFR landscape of four haplotypes derived from two human samples. We analyzed the EGFR gene and its landscape (+/- 500,000 base pairs) using this approach and were able to identify a region of non-coding DNA with over 90% similarity to the most common activating EGFR mutation in non-small cell lung cancer. Based on previously published Alu-element genome instability algorithms, we propose a molecular mechanism to explain how this non-coding region of DNA may be interacting with and impacting the stability of the EGFR gene and potentially generating this cancer-driver gene. By these techniques, we were also able to identify previously hidden structural variation in the four haplotypes and in the human reference genome (hg38). We applied previously published algorithms to compare the relative stabilities of these five different EGFR gene landscape haplotypes to estimate their relative potentials to generate the EGFR exon 19, 15 bp canonical deletion. To our knowledge, the present study is the first to use the differences in genomic architecture between targeted cancer-linked phased haplotypes to estimate their relative potentials to form a common cancer-linked driver mutation.
5
Mussotter T, Bengesser K, Högel J, Cooper DN, Kehrer-Sawatzki H. Population-specific differences in gene conversion patterns between human SUZ12 and SUZ12P are indicative of the dynamic nature of interparalog gene conversion. Hum Genet 2014; 133:383-401. [PMID: 24385046 DOI: 10.1007/s00439-013-1410-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/04/2013] [Accepted: 12/08/2013] [Indexed: 11/29/2022]
Abstract
Nonallelic homologous gene conversion (NAHGC) resulting from interparalog recombination without crossover represents an important influence on the evolution of duplicated sequences in the human genome. In 17q11.2, different paralogous sequences mediate large NF1 deletions by nonallelic homologous recombination with crossover (NAHR). Among these paralogs are SUZ12 and its pseudogene SUZ12P which harbour the breakpoints of type-2 (1.2-Mb) NF1 deletions. Such deletions are caused predominantly by mitotic NAHR since somatic mosaicism with normal cells is evident in most patients. Investigating whether SUZ12 and SUZ12P have also been involved in NAHGC, we observed gene conversion tracts between these paralogs in both Africans (AFR) and Europeans (EUR). Since germline type-2 NF1 deletions resulting from meiotic NAHR are very rare, the vast majority of the gene conversion tracts in SUZ12 and SUZ12P are likely to have resulted from mitotic recombination during premeiotic cell divisions of germ cells. A higher number of gene conversion tracts were noted within SUZ12 and SUZ12P in AFR as compared to EUR. Further, the distinctive signature of NAHGC (a high number of SNPs per paralog and a high number of shared SNPs between paralogs), a characteristic of many actively recombining paralogs, was observed in both SUZ12 and SUZ12P but only in AFR and not in EUR. A novel polymorphic 2.3-kb deletion in SUZ12P was identified which exhibited a high allele frequency in EUR. We postulate that this interparalog structural difference, together with low allelic recombination rates, could have caused a reduction in NAHGC between SUZ12 and SUZ12P during human evolution.
Affiliation(s)
- Tanja Mussotter: Institute of Human Genetics, University of Ulm, Albert-Einstein-Allee 11, 89081, Ulm, Germany
6
Cook GW, Konkel MK, Walker JA, Bourgeois MG, Fullerton ML, Fussell JT, Herbold HD, Batzer MA. A comparison of 100 human genes using an Alu element-based instability model. PLoS One 2013; 8:e65188. [PMID: 23755193 PMCID: PMC3670932 DOI: 10.1371/journal.pone.0065188] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 01/24/2013] [Accepted: 04/23/2013] [Indexed: 02/07/2023] Open
Abstract
The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.
Affiliation(s)
- George W. Cook: Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Miriam K. Konkel: Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Jerilyn A. Walker: Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Matthew G. Bourgeois: Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Mitchell L. Fullerton: Department of Bioengineering, Clemson University, Clemson, South Carolina, United States of America
- John T. Fussell: Electrochemical Materials, Louisiana Business and Technology Center, Baton Rouge, Louisiana, United States of America
- Heath D. Herbold: Albemarle Corporation, Pasadena, Texas, United States of America
- Mark A. Batzer: Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
7
Künstner A, Nabholz B, Ellegren H. Significant selective constraint at 4-fold degenerate sites in the avian genome and its consequence for detection of positive selection. Genome Biol Evol 2011; 3:1381-9. [PMID: 22042333 PMCID: PMC3242499 DOI: 10.1093/gbe/evr112] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Accepted: 10/24/2011] [Indexed: 12/15/2022] Open
Abstract
A major conclusion from comparative genomics is that many sequences that do not code for proteins are conserved beyond neutral expectations, indicating that they evolve under the influence of purifying selection and are likely to have functional roles. Due to the degeneracy of the genetic code, synonymous sites within protein-coding genes have previously been seen as "silent" with respect to function and thereby invisible to selection. However, there are indications that synonymous sites of vertebrate genomes are also subject to selection and this is not necessarily because of potential codon bias. We used divergence in ancestral repeats as a neutral reference to estimate the constraint on 4-fold degenerate sites of avian genes in a whole-genome approach. In the pairwise comparison of chicken and zebra finch, constraint was estimated at 24-32%. Based on three-species alignments of chicken, turkey, and zebra finch, lineage-specific estimates of constraint were 43%, 29%, and 24%, respectively. The finding of significant constraint at 4-fold degenerate sites from data on interspecific divergence was replicated in an analysis of intraspecific diversity in the chicken genome. These observations corroborate recent data from mammalian genomes and call for a reappraisal of the use of synonymous substitution rates as neutral standards in molecular evolutionary analysis, for example, in the use of the well-known d(N)/d(S) ratio and in inferences on positive selection. We show by simulations that the rate of false positives in the detection of positively selected genes and sites increases several-fold at the levels of constraint at 4-fold degenerate sites found in this study.
Affiliation(s)
- Hans Ellegren: Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
8
Castoe TA, Hall KT, Guibotsy Mboulas ML, Gu W, de Koning APJ, Fox SE, Poole AW, Vemulapalli V, Daza JM, Mockler T, Smith EN, Feschotte C, Pollock DD. Discovery of highly divergent repeat landscapes in snake genomes using high-throughput sequencing. Genome Biol Evol 2011; 3:641-53. [PMID: 21572095 PMCID: PMC3157835 DOI: 10.1093/gbe/evr043] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Indexed: 01/19/2023] Open
Abstract
We conducted a comprehensive assessment of genomic repeat content in two snake genomes, the venomous copperhead (Agkistrodon contortrix) and the Burmese python (Python molurus bivittatus). These two genomes are both relatively small (∼1.4 Gb) but have surprisingly extensive differences in the abundance and expansion histories of their repeat elements. In the python, the readily identifiable repeat element content is low (21%), similar to bird genomes, whereas that of the copperhead is higher (45%), similar to mammalian genomes. The copperhead's greater repeat content arises from the recent expansion of many different microsatellites and transposable element (TE) families, and the copperhead had 23-fold greater levels of TE-related transcripts than the python. This suggests the possibility that greater TE activity in the copperhead is ongoing. Expansion of CR1 LINEs in the copperhead genome has resulted in TE-mediated microsatellite expansion ("microsatellite seeding") at a scale several orders of magnitude greater than previously observed in vertebrates. Snakes also appear to be prone to horizontal transfer of TEs, particularly in the copperhead lineage. The reason that the copperhead has such a small genome in the face of so much recent expansion of repeat elements remains an open question, although selective pressure related to extreme metabolic performance is an obvious candidate. TE activity can affect gene regulation as well as rates of recombination and gene duplication, and it is therefore possible that TE activity played a role in the evolution of major adaptations in snakes; some evidence suggests this may include the evolution of venom repertoires.
Affiliation(s)
- Todd A Castoe: Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, USA