1
Ahmad A, Zhang W. Genomic exploration of retrocopies in Insect pests of plants and their role in the expansion of heat shock proteins superfamily as evolutionary targets. BMC Genomics 2024; 25:1116. [PMID: 39567882 PMCID: PMC11577761 DOI: 10.1186/s12864-024-11056-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2024] [Accepted: 11/15/2024] [Indexed: 11/22/2024] Open
Abstract
BACKGROUND: Gene duplication is a dominant mechanism for the evolution of genomes and plays a key role in genome expansion. Gene duplication via retroposition produces RNA-mediated intron-less copies called retrocopies, which may gain regulatory sequences and biological function to become retrogenes. Retrocopy dynamics have been reported in several model insect species, but there is still a large knowledge gap about retrocopy dynamics in most insects and their role in adaptation. RESULTS: In this study, we report retrocopy dynamics in 40 species of insect pests of plants belonging to six insect orders. We identified a total of 9,930 retrocopies, which is so far the largest set of retrocopies identified in insects. The identified retrocopies were further grouped into 2,599 retrogenes, 4,578 chimeras, 1,241 intact retrocopies, and 1,512 pseudogenes. We also analyzed all the identified retrogenes, which were annotated into 506 gene families. The largest number of annotated retrogenes belong to the heat shock protein superfamily and are present across all 40 species from the six orders. We found a significant expansion of the heat shock protein superfamily in the studied species. Almost all the retrogenes, including those belonging to heat shock proteins, are under purifying selection. In summary, we report retrocopy and retrogene dynamics in a large set of insect pests of plants and the expansion of the heat shock protein family due to retroposition. CONCLUSION: This study unveils retrocopy dynamics in insect pests of plants and highlights the evolution of new genes due to retroposition and their role in the expansion of important gene families.
Affiliation(s)
- Aftab Ahmad
- Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, Shaanxi, China
- Wenyu Zhang
- Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, Shaanxi, China.
- Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen, 518063, China.
2
Complex Analysis of Retroposed Genes' Contribution to Human Genome, Proteome and Transcriptome. Genes (Basel) 2020; 11:genes11050542. [PMID: 32408516 PMCID: PMC7290577 DOI: 10.3390/genes11050542] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/07/2020] [Revised: 05/06/2020] [Accepted: 05/08/2020] [Indexed: 02/07/2023] Open
Abstract
Gene duplication is a major driver of organismal evolution. One of the main mechanisms of gene duplications is retroposition, a process in which mRNA is first transcribed into DNA and then reintegrated into the genome. Most gene retrocopies are depleted of the regulatory regions. Nevertheless, examples of functional retrogenes are rapidly increasing. These functions come from the gain of new spatio-temporal expression patterns, imposed by the content of the genomic sequence surrounding inserted cDNA and/or by selectively advantageous mutations, which may lead to the switch from protein coding to regulatory RNA. As recent studies have shown, these genes may lead to new protein domain formation through fusion with other genes, new regulatory RNAs or other regulatory elements. We utilized existing data from high-throughput technologies to create a complex description of retrogenes functionality. Our analysis led to the identification of human retroposed genes that substantially contributed to transcriptome and proteome. These retrocopies demonstrated the potential to encode proteins or short peptides, act as cis- and trans- Natural Antisense Transcripts (NATs), regulate their progenitors’ expression by competing for the same microRNAs, and provide a sequence to lncRNA and novel exons to existing protein-coding genes. Our study also revealed that retrocopies, similarly to retrotransposons, may act as recombination hot spots. To our best knowledge this is the first complex analysis of these functions of retrocopies.
3
Zhou Y, Zhang C. Evolutionary patterns of chimeric retrogenes in Oryza species. Sci Rep 2019; 9:17733. [PMID: 31776387 PMCID: PMC6881317 DOI: 10.1038/s41598-019-54085-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/15/2019] [Accepted: 10/30/2019] [Indexed: 11/23/2022] Open
Abstract
Chimeric retroposition is a process by which RNA is reverse transcribed and the resulting cDNA is integrated into the genome along with flanking sequences. This process plays essential roles and drives genome evolution. Although the origination rates of chimeric retrogenes are high in plant genomes, the evolutionary patterns of the retrogenes and their parental genes are relatively uncharacterised in the rice genome. In this study, we evaluated the substitution ratio of 24 retrogenes and their parental genes to clarify their evolutionary patterns. The results indicated that seven gene pairs were under positive selection. Additionally, soon after new chimeric retrogenes were formed, they rapidly evolved. However, an unexpected pattern was also revealed. Specifically, after an undefined period following the formation of new chimeric retrogenes, the parental genes, rather than the new chimeric retrogenes, rapidly evolved under positive selection. We also observed that one retro chimeric gene (RCG3) was highly expressed in infected calli, whereas its parental gene was not. Finally, a comparison of our Ka/Ks analysis with that of other species indicated that the proportion of genes under positive selection is greater for chimeric retrogenes than for non-chimeric retrogenes in the rice genome.
Affiliation(s)
- Yanli Zhou
- The Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, No. 132 Lanhei Road, Kunming, 650201, Yunnan, China
- Chengjun Zhang
- The Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, No. 132 Lanhei Road, Kunming, 650201, Yunnan, China.
- Haiyan Engineering & Technology Center, Kunming Institute of Botany, Chinese Academy of Science, Jiaxing, 314300, Zhejiang, China.
4
Casola C, Betrán E. The Genomic Impact of Gene Retrocopies: What Have We Learned from Comparative Genomics, Population Genomics, and Transcriptomic Analyses? Genome Biol Evol 2017; 9:1351-1373. [PMID: 28605529 PMCID: PMC5470649 DOI: 10.1093/gbe/evx081] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Accepted: 05/18/2017] [Indexed: 02/07/2023] Open
Abstract
Gene duplication is a major driver of organismal evolution. Gene retroposition is a mechanism of gene duplication whereby a gene's transcript is used as a template to generate retroposed gene copies, or retrocopies. Intriguingly, the formation of retrocopies depends upon the enzymatic machinery encoded by retrotransposable elements, genomic parasites occurring in the majority of eukaryotes. Most retrocopies are depleted of the regulatory regions found upstream of their parental genes; therefore, they were initially considered transcriptionally incompetent gene copies, or retropseudogenes. However, examples of functional retrocopies, or retrogenes, have accumulated since the 1980s. Here, we review what we have learned about retrocopies in animals, plants and other eukaryotic organisms, with a particular emphasis on comparative and population genomic analyses complemented with transcriptomic datasets. In addition, these data have provided information about the dynamics of the different "life cycle" stages of retrocopies (i.e., polymorphic retrocopy number variants, fixed retropseudogenes and retrogenes) and have provided key insights into the retroduplication mechanisms, the patterns and evolutionary forces at work during the fixation process and the biological function of retrogenes. Functional genomic and transcriptomic data have also revealed that many retropseudogenes are transcriptionally active and a biological role has been experimentally determined for many. Finally, we have learned that not only non-long terminal repeat retroelements but also long terminal repeat retroelements play a role in the emergence of retrocopies across eukaryotes. This body of work has shown that mRNA-mediated duplication represents a widespread phenomenon that produces an array of new genes that contribute to organismal diversity and adaptation.
Affiliation(s)
- Claudio Casola
- Department of Ecosystem Science and Management, Texas A&M University, TX
- Esther Betrán
- Department of Biology, University of Texas at Arlington, Arlington, TX
5
Du K, He S. Evolutionary fate and implications of retrocopies in the African coelacanth genome. BMC Genomics 2015; 16:915. [PMID: 26555943 PMCID: PMC4641402 DOI: 10.1186/s12864-015-2178-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/30/2015] [Accepted: 10/31/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The coelacanth is known as a "living fossil" because of its morphological resemblance to its fossil ancestors. Thus, it serves as a useful model that provides insight into the fish that first walked on land. Retrocopies are a type of novel genetic element that are likely to contribute to genome or phenotype innovations. Thus, investigating retrocopies in the coelacanth genome can determine the role of retrocopies in coelacanth genome innovations and perhaps even water-to-land adaptations. RESULTS: We determined the dS values, dN/dS ratios, expression patterns, and enrichment of functional categories for 472 retrocopies in the African coelacanth genome. Of the retrocopies, 85-355 were shown to be potentially functional (i.e., retrogenes). The distribution of retrocopies based on their dS values revealed a burst pattern of young retrocopies in the genome. The retrocopy birth pattern was shown to be more similar to that in tetrapods than ray-finned fish, which indicates a genomic transformation that accompanied vertebrate evolution from water to land. Among these retrocopies, retrogenes were more prevalent in old than young retrocopies, which indicates that most retrocopies may have been eliminated during evolution, even though some retrocopies survived, attained biological function as retrogenes, and became old. Transcriptome data revealed that many retrocopies showed a biased expression pattern in the testis, although the expression was not specifically associated with a particular retrocopy age range. We identified 225 Ensembl genes that overlapped with the coelacanth genome retrocopies. GO enrichment analysis revealed different overrepresented GO (gene ontology) terms between these "retrocopy-overlapped genes" and the retrocopy parent genes, which indicates potential genomic functional organization produced by retrotranspositions. Among the 225 retrocopy-overlapped genes, we also identified 46 that were coelacanth-specific, which could represent a potential molecular basis for coelacanth evolution. CONCLUSIONS: Our study identified 472 retrocopies in the coelacanth genome. Sequence analysis of these retrocopies and their parent genes, transcriptome data, and GO annotation information revealed novel insight about the potential role of genomic retrocopies in coelacanth evolution and vertebrate adaptations during the evolutionary transition from water to land.
Affiliation(s)
- Kang Du
- Key Laboratory of Aquatic Biodiversity and Conservation of the Chinese Academy of Sciences, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, 430072, China.
- University of Chinese Academy of Sciences, Beijing, 100049, China.
- Shunping He
- Key Laboratory of Aquatic Biodiversity and Conservation of the Chinese Academy of Sciences, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, 430072, China.
6
Zhang C, Wang J, Marowsky NC, Long M, Wing RA, Fan C. High occurrence of functional new chimeric genes in survey of rice chromosome 3 short arm genome sequences. Genome Biol Evol 2013; 5:1038-48. [PMID: 23651622 PMCID: PMC3673630 DOI: 10.1093/gbe/evt071] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/16/2022] Open
Abstract
In an effort to identify newly evolved genes in rice, we searched the genomes of Asian-cultivated rice Oryza sativa ssp. japonica and its wild progenitors, looking for lineage-specific genes. Using genome pairwise comparison of approximately 20-Mb DNA sequences from the chromosome 3 short arm (Chr3s) in six rice species, O. sativa, O. nivara, O. rufipogon, O. glaberrima, O. barthii, and O. punctata, combined with synonymous substitution rate tests and other evidence, we were able to identify potential recently duplicated genes, which evolved within the last 1 Myr. We identified 28 functional O. sativa genes, which likely originated after O. sativa diverged from O. glaberrima. These genes account for around 1% (28/3,176) of all annotated genes on O. sativa's Chr3s. Among the 28 new genes, two recently duplicated segments contained eight genes. Fourteen of the 28 new genes consist of chimeric gene structure derived from one or multiple parental genes and flanking targeting sequences. Although the majority of these 28 new genes were formed by single or segmental DNA-based gene duplication and recombination, we found two genes that were likely originated partially through exon shuffling. Sequence divergence tests between new genes and their putative progenitors indicated that new genes were most likely evolving under natural selection. We showed all 28 new genes appeared to be functional, as suggested by Ka/Ks analysis and the presence of RNA-seq, cDNA, expressed sequence tag, massively parallel signature sequencing, and/or small RNA data. The high rate of new gene origination and of chimeric gene formation in rice may demonstrate rice's broad diversification, domestication, its environmental adaptation, and the role of new genes in rice speciation.
Affiliation(s)
- Chengjun Zhang
- Department of Ecology and Evolution, University of Chicago, USA
7
Abstract
Genes are perpetually added to and deleted from genomes during evolution. Thus, it is important to understand how new genes are formed and how they evolve to be critical components of the genetic systems that determine the biological diversity of life. Two decades of effort have shed light on the process of new gene origination and have contributed to an emerging comprehensive picture of how new genes are added to genomes, ranging from the mechanisms that generate new gene structures to the presence of new genes in different organisms to the rates and patterns of new gene origination and the roles of new genes in phenotypic evolution. We review each of these aspects of new gene evolution, summarizing the main evidence for the origination and importance of new genes in evolution. We highlight findings showing that new genes rapidly change existing genetic systems that govern various molecular, cellular, and phenotypic functions.
Affiliation(s)
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois 60637
8
Ciomborowska J, Rosikiewicz W, Szklarczyk D, Makałowski W, Makałowska I. "Orphan" retrogenes in the human genome. Mol Biol Evol 2012; 30:384-96. [PMID: 23066043 PMCID: PMC3548309 DOI: 10.1093/molbev/mss235] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Indexed: 12/16/2022] Open
Abstract
Gene duplicates generated via retroposition were long thought to be pseudogenized and consequently decayed. However, a significant number of these genes escaped their evolutionary destiny and evolved into functional genes. Despite multiple studies, the number of functional retrogenes in human and other genomes remains unclear. We performed a comparative analysis of human, chicken, and worm genomes to identify “orphan” retrogenes, that is, retrogenes that have replaced their progenitors. We located 25 such candidates in the human genome. All of these genes were previously known, and the majority has been intensively studied. Despite this, they have never been recognized as retrogenes. Analysis revealed that the phenomenon of replacing parental genes with their retrocopies has been taking place over the entire span of animal evolution. This process was often species specific and contributed to interspecies differences. Surprisingly, these retrogenes, which should evolve in a more relaxed mode, are subject to a very strong purifying selection, which is, on average, two and a half times stronger than other human genes. Also, for retrogenes, they do not show a typical overall tendency for a testis-specific expression. Notably, seven of them are associated with human diseases. Recognizing them as “orphan” retrocopies, which have different regulatory machinery than their parents, is important for any disease studies in model organisms, especially when discoveries made in one species are transferred to humans.
Affiliation(s)
- Joanna Ciomborowska
- Laboratory of Bioinformatics, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
9
Meisel RP. Evolutionary dynamics of recently duplicated genes: Selective constraints on diverging paralogs in the Drosophila pseudoobscura genome. J Mol Evol 2009; 69:81-93. [PMID: 19536449 DOI: 10.1007/s00239-009-9254-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 12/14/2008] [Revised: 05/06/2009] [Accepted: 05/26/2009] [Indexed: 01/12/2023]
Abstract
Duplicated genes produce genetic variation that can influence the evolution of genomes and phenotypes. In most cases, for a duplicated gene to contribute to evolutionary novelty it must survive the early stages of divergence from its paralog without becoming a pseudogene. I examined the evolutionary dynamics of recently duplicated genes in the Drosophila pseudoobscura genome to understand the factors affecting these early stages of evolution. Paralogs located in closer proximity have higher sequence identity. This suggests that gene conversion occurs more often between duplications in close proximity or that there is more genetic independence between distant paralogs. Partially duplicated genes have a higher likelihood of pseudogenization than completely duplicated genes, but no single factor significantly contributes to the selective constraints on a completely duplicated gene. However, DNA-based duplications and duplications within chromosome arms tend to produce longer duplication tracts than retroposed and inter-arm duplications, and longer duplication tracts are more likely to contain a completely duplicated gene. Therefore, the relative position of paralogs and the mechanism of duplication indirectly affect whether a duplicated gene is retained or pseudogenized.
Affiliation(s)
- Richard P Meisel
- Department of Biology, The Pennsylvania State University, University Park, 16802, USA.
10
Patterns of amino acid evolution in the Drosophila ananassae chimeric gene, siren, parallel those of other Adh-derived chimeras. Genetics 2008; 180:1261-3. [PMID: 18780749 DOI: 10.1534/genetics.108.090068] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022] Open
Abstract
siren1 and siren2 are novel alcohol dehydrogenase (Adh)-derived chimeric genes in the Drosophila bipectinata complex. D. ananassae, however, harbors a single homolog of these genes. Like other Adh-derived chimeric genes, siren evolved adaptively shortly after it was formed. These changes likely shifted the catalytic activity of siren.
11
De Grassi A, Lanave C, Saccone C. Genome duplication and gene-family evolution: the case of three OXPHOS gene families. Gene 2008; 421:1-6. [PMID: 18573316 DOI: 10.1016/j.gene.2008.05.011] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 07/19/2007] [Revised: 05/15/2008] [Accepted: 05/21/2008] [Indexed: 10/22/2022]
Abstract
DNA duplication is one of the main forces acting on the evolution of organisms because it creates the raw genetic material that natural selection can subsequently modify. Duplicated regions are mainly due to "errors" in different phases of meiosis, but DNA transposable elements and reverse transcription also contribute to amplify and move the genomic material to different genomic locations. As a result, redundancy affects genomes to variable degrees: from the single gene to the whole genome (WGD). Gene families are clusters of genes created by duplication and their size reflects the number of duplicated genes, called paralogs, in each species. The aim of this review is to describe the state of the art in the identification and analysis of gene families in eukaryotes, with specific attention to those generated by ancient large scale events in vertebrates (WGD or large segmental duplications). As a case study, we report our work on the evolution of gene families encoding subunits of the five OXPHOS (oxidative phosphorylation) complexes, fundamental and highly conserved in all respiring cells. Although OXPHOS gene families are smaller than the general trend in nuclear gene families, some exceptions are observed, such as three gene families with at least two paralogs in vertebrates. These gene families encode cytochrome c (Cyt c, the electron shuttle protein between complex III and IV), Lipid Binding Protein (LBP, the channel protein of complex V which transfers protons through the inner mitochondrial membrane) and the MLRQ subunit (MLRQ, a supernumerary subunit of the large complex I, with unknown function). We provide a two-step approach, based on structural genomic data, to demonstrate that these gene families should have arisen through WGD (or large segmental duplication) events at the origin of vertebrates and, only afterwards, underwent species-specific events of further gene duplications and loss. 
In summary, this review reflects the need to apply genome comparative approaches, deriving from both "classical" molecular phylogenetic analysis and "new" genome map analysis, to successfully define the complex evolutionary relations between gene family members which, in turn, are essential to obtain any other comparative phylogenetic or functional results.
Affiliation(s)
- Anna De Grassi
- Istituto di Tecnologie Biomediche, Sede di Bari, CNR, Bari, Italy
12
Chen ST, Cheng HC, Barbash DA, Yang HP. Evolution of hydra, a recently evolved testis-expressed gene with nine alternative first exons in Drosophila melanogaster. PLoS Genet 2007; 3:e107. [PMID: 17616977 PMCID: PMC1904467 DOI: 10.1371/journal.pgen.0030107] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Received: 01/10/2007] [Accepted: 05/15/2007] [Indexed: 12/26/2022] Open
Abstract
We describe here the Drosophila gene hydra that appears to have originated de novo in the melanogaster subgroup and subsequently evolved in both structure and expression level in Drosophila melanogaster and its sibling species. D. melanogaster hydra encodes a predicted protein of ~300 amino acids with no apparent similarity to any previously known proteins. The syntenic region flanking hydra on both sides is found in both D. ananassae and D. pseudoobscura, but hydra is found only in melanogaster subgroup species, suggesting that it originated less than ~13 million y ago. Exon 1 of hydra has undergone recurrent duplications, leading to the formation of nine tandem alternative exon 1s in D. melanogaster. Seven of these alternative exons are flanked on their 3′ side by the transposon DINE-1 (Drosophila interspersed element-1). We demonstrate that at least four of the nine duplicated exon 1s can function as alternative transcription start sites. The entire hydra locus has also duplicated in D. simulans and D. sechellia. D. melanogaster hydra is expressed most intensely in the proximal testis, suggesting a role in late-stage spermatogenesis. The coding region of hydra has a relatively high Ka/Ks ratio between species, but the ratio is less than 1 in all comparisons, suggesting that hydra is subject to functional constraint. Analysis of sequence polymorphism and divergence of hydra shows that it has evolved under positive selection in the lineage leading to D. melanogaster. The dramatic structural changes surrounding the first exons do not affect the tissue specificity of gene expression: hydra is expressed predominantly in the testes in D. melanogaster, D. simulans, and D. yakuba. However, we have found that expression level changed dramatically (~ >20-fold) between D. melanogaster and D. simulans. 
While hydra initially evolved in the absence of nearby transposable element insertions, we suggest that the subsequent accumulation of repetitive sequences in the hydra region may have contributed to structural and expression-level evolution by inducing rearrangements and causing local heterochromatinization. Our analysis further shows that recurrent evolution of both gene structure and expression level may be characteristics of newly evolved genes. We also suggest that late-stage spermatogenesis is the functional target for newly evolved and rapidly evolving male-specific genes. Similar groups of animals have similar numbers of genes, but not all of these genes are the same. While some genes are highly conserved and can be easily and uniquely identified in species ranging from yeast to plants to humans, other genes are sometimes found in only a small number or even in a single species. Such newly evolved genes may help produce traits that make species unique. We describe here a newly evolved gene called hydra that occurs only in a small subgroup of Drosophila species. hydra is expressed in the testes, suggesting that it may have a function in male fertility. hydra has evolved significantly in its structure and protein-coding sequence among species. The authors named the gene hydra after the nine-headed monster slain by Hercules because in one species, Drosophila melanogaster, hydra has nine potential alternative first exons. Perhaps because of this or other structural changes, the level of RNA made by hydra differs significantly between one pair of species. This analysis reveals that newly created genes may evolve rapidly in sequence, structure, and expression level.
Affiliation(s)
- Shou-Tao Chen
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan, Republic of China
- Hsin-Chien Cheng
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan, Republic of China
- Daniel A Barbash
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Hsiao-Pei Yang
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan, Republic of China
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
13
Bhutkar A, Russo SM, Smith TF, Gelbart WM. Genome-scale analysis of positionally relocated genes. Genome Res 2007; 17:1880-7. [PMID: 17989252 DOI: 10.1101/gr.7062307] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022]
Abstract
During evolution, genome reorganization includes large-scale events such as inversions, translocations, and segmental or even whole-genome duplications, as well as fine-scale events such as the relocation of individual genes. This latter category, which we will refer to as positionally relocated genes (PRGs), is the subject of this report. Assessment of the magnitude of such PRGs and of possible contributing mechanisms is aided by a comparative analysis of related genomes, where conserved chromosomal organization can aid in identifying genes that have acquired a new location in a lineage of these genomes. Here we utilize two methods to comprehensively identify relocated protein-coding genes in the recently sequenced genomes of 12 species of genus Drosophila. We use exceptions to the general rule of maintenance of chromosome arm (Muller element) association for most Drosophila genes to identify one major class of PRGs. We also identify a partially overlapping set of PRGs among "embedded genes," located within the extents of other surrounding genes. We provide evidence that PRG movements have at least two different origins: Some events occur via retrotransposition of processed RNAs and others via a DNA-based transposition mechanism. Overall, we identify several hundred PRGs that arose within a lineage of the genus Drosophila phylogeny and provide suggestive evidence that a few thousand such events have occurred within the radiation of the insect order Diptera, thereby illustrating the magnitude of the contribution of PRG movement to chromosomal reorganization during evolution.
Affiliation(s)
- Arjun Bhutkar
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA.
14
Bai Y, Casola C, Feschotte C, Betrán E. Comparative genomics reveals a constant rate of origination and convergent acquisition of functional retrogenes in Drosophila. Genome Biol 2007; 8:R11. [PMID: 17233920 PMCID: PMC1839131 DOI: 10.1186/gb-2007-8-1-r11] [Citation(s) in RCA: 137] [Impact Index Per Article: 7.6] [Received: 09/07/2006] [Revised: 11/13/2006] [Accepted: 01/18/2007] [Indexed: 12/23/2022] Open
Abstract
Genome comparisons between 12 Drosophila species elucidate the origins of retroposition events that have led to the emergence of candidate functional genes. Background: Processed copies of genes (retrogenes) are duplicate genes that originated through the reverse-transcription of a host transcript and insertion in the genome. This type of gene duplication, as any other, could be a source of new genes and functions. Using whole genome sequence data for 12 Drosophila species, we dated the origin of 94 retroposition events that gave rise to candidate functional genes in D. melanogaster. Results: Based on this analysis, we infer that functional retrogenes have emerged at a fairly constant rate of 0.5 genes per million years per lineage over the last approximately 63 million years of Drosophila evolution. The number of functional retrogenes and the rate at which they are recruited in the D. melanogaster lineage are of the same order of magnitude as those estimated in the human lineage, despite the higher deletion bias in the Drosophila genome. However, unlike primates, the rate of retroposition in Drosophila seems to be fairly constant and no burst of retroposition can be inferred from our analyses. In addition, our data also support an important role for retrogenes as a source of lineage-specific male functions, in agreement with previous hypotheses. Finally, we identified three cases of functional retrogenes in D. melanogaster that have been independently retroposed and recruited in parallel as new genes in other Drosophila lineages. Conclusion: Together, these results indicate that retroposition is a persistent mechanism and a recurrent pathway for the emergence of new genes in Drosophila.
Affiliation(s)
- Yongsheng Bai
- Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
- Claudio Casola
- Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
- Cédric Feschotte
- Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
- Esther Betrán
- Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
15
Wang W, Zheng H, Fan C, Li J, Shi J, Cai Z, Zhang G, Liu D, Zhang J, Vang S, Lu Z, Wong GKS, Long M, Wang J. High rate of chimeric gene origination by retroposition in plant genomes. Plant Cell 2006; 18:1791-802. [PMID: 16829590 PMCID: PMC1533979 DOI: 10.1105/tpc.106.041905] [Citation(s) in RCA: 147] [Impact Index Per Article: 7.7] [Received: 02/13/2006] [Revised: 04/15/2006] [Accepted: 06/08/2006] [Indexed: 05/10/2023]
Abstract
Retroposition is widely found to play essential roles in origination of new mammalian and other animal genes. However, the scarcity of retrogenes in plants has led to the assumption that plant genomes rarely evolve new gene duplicates by retroposition, despite abundant retrotransposons in plants and a reported long terminal repeat (LTR) retrotransposon-mediated mechanism of retroposing cellular genes in maize (Zea mays). We show extensive retropositions in the rice (Oryza sativa) genome, with 1235 identified primary retrogenes. We identified 27 of these primary retrogenes within LTR retrotransposons, confirming a previously observed role of retroelements in generating plant retrogenes. Substitution analyses revealed that the vast majority are subject to negative selection, suggesting, along with expression data and evidence of age, that they are likely functional retrogenes. In addition, 42% of these retrosequences have recruited new exons from flanking regions, generating a large number of chimerical genes. We also identified young chimerical genes, suggesting that gene origination through retroposition is ongoing, with a rate an order of magnitude higher than the rate in primates. Finally, we observed that retropositions have followed an unexpected spatial pattern in which functional retrogenes avoid centromeric regions, while retropseudogenes are randomly distributed. These observations suggest that retroposition is an important mechanism that governs gene evolution in rice and other grass species.
Affiliation(s)
- Wen Wang
- CAS-Max-Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China.
16
Nozawa M, Kumagai M, Aotsuka T, Tamura K. Proceedings of the SMBE Tri-National Young Investigators' Workshop 2005. Unusual evolution of interspersed repeat sequences in the Drosophila ananassae subgroup. Mol Biol Evol 2006; 23:981-7. [PMID: 16467489 DOI: 10.1093/molbev/msj105] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
Abstract
New repeat sequences were found in the Drosophila ananassae genome sequence. They accounted for approximately 1.2% of the D. ananassae genome and were estimated to be more abundant in genomes of its closely related species belonging to the Drosophila bipectinata complex, whereas they were entirely absent in the Drosophila melanogaster genome. They were interspersed throughout euchromatic regions of the genome, usually as short tandem arrays of unit sequences, which were mostly 175-200 bp long with two distinct peaks at 180 and 189 bp in the length distribution. The nucleotide differences among unit sequences within the same array (locus) were much smaller than those between separate loci, suggesting within-locus concerted evolution. The phylogenetic tree of the repeat sequences from different loci showed that divergences between sequences from different chromosome arms occurred only at earlier stages of evolution, while those within the same chromosome arm occurred thereafter, resulting in the increase in copy number. We found RNA polymerase III promoter sequences (A box and B box), which play a critical role in retroposition of short interspersed elements. We also found conserved stem-loop structures, which are possibly associated with certain DNA rearrangements responsible for the increase in copy number within a chromosome arm. Such an atypical combination of characteristics (i.e., wide dispersal and tandem repetition) may have been generated by these different transposition mechanisms during the course of evolution.
Affiliation(s)
- Masafumi Nozawa
- Department of Biological Sciences, Graduate School of Science, Tokyo Metropolitan University, Tokyo, Japan