1
Zhao H, Qin L, Deng X, Wang Z, Jiang R, Reitz SR, Wu S, He Z. Nucleotide and dinucleotide preference of segmented viruses are shaped more by segment: In case study of tomato spotted wilt virus. Infection, Genetics and Evolution: Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases 2024; 122:105608. [PMID: 38796047 DOI: 10.1016/j.meegid.2024.105608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 05/16/2024] [Accepted: 05/21/2024] [Indexed: 05/28/2024]
Abstract
Several studies have shown that the nucleotide and dinucleotide composition of viruses may be shaped by host species or protein coding region. Nevertheless, the influence of viral segment on viral nucleotide and dinucleotide composition is still unknown. Here, we explored this question in tomato spotted wilt virus (TSWV), a segmented virus that seriously threatens tomato production worldwide. Through nucleotide composition analysis, we found the same over-representation of A across all viral segments at the first and second codon positions, but the preference differed among segments at the third codon position. Interestingly, protein coding regions encoded by the same or different segments exhibited clearly distinct nucleotide preferences. We then found that the dinucleotides UpG and CpU were overrepresented and the dinucleotides UpA, CpG and GpU were underrepresented, not only in the complete genomic sequences but also in different segments, protein coding regions and host species. Notably, 100% of the data investigated here were assigned to the correct viral segment and protein coding region, whereas only 67% were assigned to the correct viral host species. In conclusion, in this case study of TSWV, the nucleotide composition and dinucleotide preference of segmented viruses depend more strongly on segment and protein coding region than on host species. This research provides a novel perspective on the molecular evolutionary mechanisms of TSWV and a reference for future research on the genetic diversity of segmented viruses.
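Dinucleotide over- and under-representation of the kind reported in this abstract is commonly quantified with an observed/expected odds ratio, rho(XpY) = f(XY) / (f(X) * f(Y)). The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the 1.23 / 0.78 cut-offs are the conventional thresholds often used with this statistic, assumed here rather than confirmed as the authors' choice.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds_ratios(seq, alphabet="ACGU"):
    """Return observed/expected ratios for each dinucleotide XpY.

    rho(XY) = f(XY) / (f(X) * f(Y)). DNA input is converted to RNA
    notation (T -> U). Pairs whose expected frequency is zero are omitted.
    """
    seq = seq.upper().replace("T", "U")
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    n = len(seq)
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    ratios = {}
    for x, y in product(alphabet, repeat=2):
        expected = (mono[x] / n) * (mono[y] / n)
        if expected > 0:
            ratios[x + y] = (di[x + y] / (n - 1)) / expected
    return ratios


def classify(ratio, over=1.23, under=0.78):
    """Label a ratio using assumed conventional cut-offs (not from the paper)."""
    if ratio >= over:
        return "over"
    if ratio <= under:
        return "under"
    return "neutral"
```

Applied per segment or per coding region, the resulting 16-value profiles could then be compared across groups; how the authors grouped and classified their sequences is not specified in the abstract.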
Affiliation(s)
- Haiting Zhao
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Lang Qin
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Xiaolong Deng
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Zhilei Wang
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Runzhou Jiang
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Stuart R Reitz
  - Malheur Experiment Station, Oregon State University, Ontario, OR, USA
- Shengyong Wu
  - State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, China.
- Zhen He
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China; Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou 225009, China.
2
Kim J, Lee C, Ko BJ, Yoo DA, Won S, Phillippy AM, Fedrigo O, Zhang G, Howe K, Wood J, Durbin R, Formenti G, Brown S, Cantin L, Mello CV, Cho S, Rhie A, Kim H, Jarvis ED. False gene and chromosome losses in genome assemblies caused by GC content variation and repeats. Genome Biol 2022; 23:204. [PMID: 36167554 PMCID: PMC9516821 DOI: 10.1186/s13059-022-02765-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/16/2021] [Accepted: 09/02/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Many short-read genome assemblies have been found to be incomplete and contain mis-assemblies. The Vertebrate Genomes Project has been producing new reference genome assemblies with an emphasis on being as complete and error-free as possible, which requires utilizing long reads, long-range scaffolding data, new assembly algorithms, and manual curation. A more thorough evaluation of the recent references relative to prior assemblies can provide a detailed overview of the types and magnitude of improvements. RESULTS Here we evaluate new vertebrate genome references relative to the previous assemblies for the same species and, in two cases, the same individuals, including a mammal (platypus), two birds (zebra finch, Anna's hummingbird), and a fish (climbing perch). We find that up to 11% of genomic sequence is entirely missing in the previous assemblies. In the Vertebrate Genomes Project zebra finch assembly, we identify eight new GC- and repeat-rich micro-chromosomes with high gene density. The impact of missing sequences is biased towards GC-rich 5'-proximal promoters and 5' exon regions of protein-coding genes and long non-coding RNAs. Between 26 and 60% of genes include structural or sequence errors that could lead to misunderstanding of their function when using the previous genome assemblies. CONCLUSIONS Our findings reveal novel regulatory landscapes and protein coding sequences that have been greatly underestimated in previous assemblies and are now present in the Vertebrate Genomes Project reference genomes.
Affiliation(s)
- Juwan Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Chul Lee
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Byung June Ko
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- Dong Ahn Yoo
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Sohyoung Won
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Olivier Fedrigo
  - Vertebrate Genome Lab, The Rockefeller University, New York City, USA
- Guojie Zhang
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen, Denmark
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
- Richard Durbin
  - Wellcome Sanger Institute, Cambridge, UK
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Giulio Formenti
  - Vertebrate Genome Lab, The Rockefeller University, New York City, USA
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
- Samara Brown
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
- Lindsey Cantin
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
- Claudio V Mello
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, 97239, USA
- Seoae Cho
  - eGnome, Inc, Seoul, Republic of Korea
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Heebal Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea.
  - eGnome, Inc, Seoul, Republic of Korea.
- Erich D Jarvis
  - Vertebrate Genome Lab, The Rockefeller University, New York City, USA.
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA.
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA.
3
Mordstein C, Savisaar R, Young RS, Bazile J, Talmane L, Luft J, Liss M, Taylor MS, Hurst LD, Kudla G. Codon Usage and Splicing Jointly Influence mRNA Localization. Cell Syst 2020; 10:351-362.e8. [PMID: 32275854 PMCID: PMC7181179 DOI: 10.1016/j.cels.2020.03.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 02/04/2019] [Revised: 12/19/2019] [Accepted: 03/05/2020] [Indexed: 12/11/2022]
Abstract
In the human genome, most genes undergo splicing, and patterns of codon usage are splicing dependent: guanine and cytosine (GC) content is the highest within single-exon genes and within first exons of multi-exon genes. However, the effects of codon usage on gene expression are typically characterized in unspliced model genes. Here, we measured the effects of splicing on expression in a panel of synonymous reporter genes that varied in nucleotide composition. We found that high GC content increased protein yield, mRNA yield, cytoplasmic mRNA localization, and translation of unspliced reporters. Splicing did not affect the expression of GC-rich variants. However, splicing promoted the expression of AT-rich variants by increasing their steady-state protein and mRNA levels, in part through promoting cytoplasmic localization of mRNA. We propose that splicing promotes the nuclear export of AU-rich mRNAs and that codon- and splicing-dependent effects on expression are under evolutionary pressure in the human genome.
Affiliation(s)
- Christine Mordstein
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Rosina Savisaar
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK; Instituto de Medicina Molecular, João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Robert S Young
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; Centre for Global Health Research, Usher Institute, The University of Edinburgh, Edinburgh, UK
- Jeanne Bazile
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Lana Talmane
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Juliet Luft
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Michael Liss
  - Thermo Fisher Scientific, GENEART GmbH, Regensburg, Germany
- Martin S Taylor
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Laurence D Hurst
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK.
4
Bagley JC, Uribe-Convers S, Carlsen MM, Muchhala N. Utility of targeted sequence capture for phylogenomics in rapid, recent angiosperm radiations: Neotropical Burmeistera bellflowers as a case study. Mol Phylogenet Evol 2020; 152:106769. [PMID: 32081762 DOI: 10.1016/j.ympev.2020.106769] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 12/09/2019] [Revised: 02/10/2020] [Accepted: 02/12/2020] [Indexed: 02/06/2023]
Abstract
Targeted sequence capture is a promising approach for large-scale phylogenomics. However, rapid evolutionary radiations pose significant challenges for phylogenetic inference (e.g. incomplete lineage sorting (ILS), phylogenetic noise), and the ability of targeted nuclear loci to resolve species trees despite such issues remains poorly studied. We test the utility of targeted sequence capture for inferring phylogenetic relationships in rapid, recent angiosperm radiations, focusing on Burmeistera bellflowers (Campanulaceae), which diversified into ~130 species over less than 3 million years. We compared phylogenies estimated from supercontig (exons plus flanking sequences), exon-only, and flanking-only datasets with 506-546 loci (~4.7 million bases) for 46 Burmeistera species/lineages and 10 outgroup taxa. Nuclear loci resolved backbone nodes and many congruent internal relationships with high support in concatenation and coalescent-based species tree analyses, and inferences were largely robust to effects of missing taxa and base composition biases. Nevertheless, species trees were incongruent between datasets, and gene trees exhibited remarkably high levels of conflict (~4-60% congruence, ~40-99% conflict) not simply driven by poor gene tree resolution. Higher gene tree heterogeneity at shorter branches suggests an important role of ILS, as expected for rapid radiations. Phylogenetic informativeness analyses also suggest this incongruence has resulted from low resolving power at short internal branches, consistent with ILS, and homoplasy at deeper nodes, with exons exhibiting much greater risk of incorrect topologies due to homoplasy than other datasets. Our findings suggest that targeted sequence capture is feasible for resolving rapid, recent angiosperm radiations, and that results based on supercontig alignments containing nuclear exons and flanking sequences have higher phylogenetic utility and accuracy than either alone. We use our results to make practical recommendations for future target capture-based studies of Burmeistera and other rapid angiosperm radiations, including that such studies should analyze supercontigs to maximize the phylogenetic information content of loci.
Affiliation(s)
- Justin C Bagley
  - Department of Biology, University of Missouri-St. Louis, St. Louis, MO 63121, USA; Department of Biology, Virginia Commonwealth University, Richmond, VA 23284, USA.
- Simon Uribe-Convers
  - Department of Biology, University of Missouri-St. Louis, St. Louis, MO 63121, USA
- Mónica M Carlsen
  - Research Department, Science and Conservation Division, Missouri Botanical Garden, St. Louis, MO 63110, USA
- Nathan Muchhala
  - Department of Biology, University of Missouri-St. Louis, St. Louis, MO 63121, USA
5
Paul P, Malakar AK, Chakraborty S. Codon usage vis-a-vis start and stop codon context analysis of three dicot species. J Genet 2018; 97:97-107. [PMID: 29666329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
To understand the variation in genomic composition and its effect on codon usage, we performed a comparative analysis of codon usage and nucleotide usage in the genes of three dicots, Glycine max, Arabidopsis thaliana and Medicago truncatula. The dicot genes were found to be A/T rich and to have predominantly A-ending and/or T-ending codons. GC3s directly mimics the usage pattern of global GC content. Relative synonymous codon usage analysis suggests that the high usage frequency of A/T over G/C mononucleotide-containing codons in AT-rich dicot genomes is due to compositional constraint as a factor of codon usage bias. Odds ratio analysis identified the dinucleotides TpG, TpC, GpA, CpA and CpT as over-represented, whereas CpG and TpA were under-represented. The (NcExp-NcObs)/NcExp plot suggests that selection pressure other than mutation played a significant role in influencing the pattern of codon usage in these dicots. PR2 analysis revealed the significant role of selection pressure on codon usage. Analysis of variance on codon usage at start and stop sites showed variation in codon selection at these sites. This study provides evidence that the dicot genes were subjected to compositional selection pressure.
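Relative synonymous codon usage (RSCU), mentioned in this abstract, is usually defined as a codon's observed count divided by the mean count of all synonymous codons for the same amino acid. The following is a minimal sketch under that standard definition, assuming the standard genetic code; the function and variable names are hypothetical and the code is not drawn from the paper.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def rscu(cds):
    """Compute RSCU for an in-frame coding sequence.

    RSCU = count(codon) * degeneracy / total count of the synonymous family.
    Stop codons and single-codon amino acids (Met, Trp) are skipped, as are
    amino acids that never occur in the sequence.
    """
    cds = cds.upper().replace("U", "T")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values
```

A value above 1 indicates a codon used more often than expected under equal synonymous usage, and a value below 1 indicates the opposite.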
Affiliation(s)
- Prosenjit Paul
  - Department of Biotechnology, Assam University, Silchar 788 011, India.
6
Paul P, Malakar AK, Chakraborty S. Codon usage vis-a-vis start and stop codon context analysis of three dicot species. J Genet 2018. [DOI: 10.1007/s12041-018-0892-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/23/2022]
7
Mazumdar P, Binti Othman R, Mebus K, Ramakrishnan N, Ann Harikrishna J. Codon usage and codon pair patterns in non-grass monocot genomes. Annals of Botany 2017; 120:893-909. [PMID: 29155926 PMCID: PMC5710610 DOI: 10.1093/aob/mcx112] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 01/17/2017] [Accepted: 09/19/2017] [Indexed: 05/19/2023]
Abstract
BACKGROUND AND AIMS Studies on codon usage in monocots have focused on grasses, and observed patterns of this taxon were generalized to all monocot species. Here, non-grass monocot species were analysed to investigate the differences between grass and non-grass monocots. METHODS First, studies of codon usage in monocots were reviewed. The current information was then extended regarding codon usage, as well as codon-pair context bias, using four completely sequenced non-grass monocot genomes (Musa acuminata, Musa balbisiana, Phoenix dactylifera and Spirodela polyrhiza) for which comparable transcriptome datasets are available. Measurements were taken regarding relative synonymous codon usage, effective number of codons, derived optimal codon and GC content and then the relationships investigated to infer the underlying evolutionary forces. KEY RESULTS The research identified optimal codons, rare codons and preferred codon-pair context in the non-grass monocot species studied. In contrast to the bimodal distribution of GC3 (GC content in third codon position) in grasses, non-grass monocots showed a unimodal distribution. Disproportionate use of G and C (and of A and T) in two- and four-codon amino acids detected in the analysis rules out the mutational bias hypothesis as an explanation of genomic variation in GC content. There was found to be a positive relationship between CAI (codon adaptation index; predicts the level of expression of a gene) and GC3. In addition, a strong correlation was observed between coding and genomic GC content and negative correlation of GC3 with gene length, indicating a strong impact of GC-biased gene conversion (gBGC) in shaping codon usage and nucleotide composition in non-grass monocots. CONCLUSION Optimal codons in these non-grass monocots show a preference for G/C in the third codon position. These results support the concept that codon usage and nucleotide composition in non-grass monocots are mainly driven by gBGC.
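GC3, defined in this abstract as the GC content at the third codon position, can be computed directly from an in-frame coding sequence. The sketch below is a simplified illustration with a hypothetical function name: it counts every codon, whereas some analyses use a GC3s variant restricted to synonymous third positions (excluding Met, Trp and stop codons). Which variant the authors used is not stated in the abstract.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS.

    Trailing bases that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third = cds[2:usable:3]
    if not third:
        raise ValueError("sequence must contain at least one complete codon")
    return sum(base in "GC" for base in third) / len(third)
```

Computing this value per gene and plotting its distribution is one way to see the unimodal versus bimodal contrast the abstract describes.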
Affiliation(s)
- Purabi Mazumdar
  - Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- Rofina Yasmin Binti Othman
  - Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- Katharina Mebus
  - Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- N Ramakrishnan
  - Electrical and Computer System Engineering, School of Engineering, Monash University Malaysia, Bandar Sunway, Malaysia
- Jennifer Ann Harikrishna
  - Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
  - For correspondence. E-mail:
8
Abstract
Mistranslation errors compromise fitness by wasting resources on nonfunctional proteins. In order to reduce the cost of mistranslations, natural selection chooses the most accurately translated codons at sites that are particularly important for protein structure and function. We investigated the determinants underlying selection for translational accuracy in several species of plants belonging to three clades: Brassicaceae, Fabidae, and Poaceae. Although signatures of translational selection were found in genes from a wide range of species, the underlying factors varied in nature and intensity. Indeed, the degree of synonymous codon bias at evolutionarily conserved sites varied among plant clades while remaining uniform within each clade. This is unlikely to solely reflect the diversity of tRNA pools because there is little correlation between synonymous codon bias and tRNA abundance, so other factors must affect codon choice and translational accuracy in plant genes. Accordingly, synonymous codon choice at a given site was affected not only by the selection pressure at that site, but also its participation in protein domains or mRNA secondary structures. Although these effects were detected in all the species we analyzed, their impact on translation accuracy was distinct in evolutionarily distant plant clades. The domain effect was found to enhance translational accuracy in dicot and monocot genes with a high GC content, but to oppose the selection of more accurate codons in monocot genes with a low GC content.
9
Evolutionary forces affecting synonymous variations in plant genomes. PLoS Genet 2017; 13:e1006799. [PMID: 28531201 PMCID: PMC5460877 DOI: 10.1371/journal.pgen.1006799] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 12/08/2016] [Revised: 06/06/2017] [Accepted: 05/04/2017] [Indexed: 01/04/2023] Open
Abstract
Base composition is highly variable among and within plant genomes, especially at third codon positions, ranging from GC-poor and homogeneous species to GC-rich and highly heterogeneous ones (particularly Monocots). Consequently, synonymous codon usage is biased in most species, even when base composition is relatively homogeneous. The causes of these variations are still under debate, with three main forces being possibly involved: mutational bias, selection and GC-biased gene conversion (gBGC). So far, both selection and gBGC have been detected in some species but how their relative strength varies among and within species remains unclear. Population genetics approaches allow the intensity of selection, gBGC and mutational bias to be estimated jointly. We extended a recently developed method and applied it to a large population genomic dataset based on transcriptome sequencing of 11 angiosperm species spread across the phylogeny. We found that at synonymous positions, base composition is far from mutation-drift equilibrium in most genomes and that gBGC is a widespread and stronger process than selection. gBGC could strongly contribute to base composition variation among plant species, implying that it should be taken into account in plant genome analyses, especially for GC-rich ones.
In protein coding genes, base composition strongly varies within and among plant genomes, especially at positions where changes do not alter the coded protein (synonymous variations). Some species, such as the model plant Arabidopsis thaliana, are relatively GC-poor and homogeneous while others, such as grasses, are highly heterogeneous and GC-rich. The causes of these variations are still debated: are they mainly due to selective or neutral processes? Answering this question is important to correctly infer whether variations in base composition may have functional roles or not. We extended a population genetics method to jointly estimate the different forces that may affect synonymous variations and applied it to genomic datasets in 11 flowering plant species. We found that GC-biased gene conversion, a neutral process associated with recombination that mimics selection by favouring G and C bases, is a widespread and stronger process than selection and that it could explain the large variation in base composition observed in plant genomes. Our results bear implications for analysing plant genomes and for correctly interpreting what could be functional or not.
10
Analysis of Ribosome-Associated mRNAs in Rice Reveals the Importance of Transcript Size and GC Content in Translation. G3-Genes Genomes Genetics 2017; 7:203-219. [PMID: 27852012 PMCID: PMC5217110 DOI: 10.1534/g3.116.036020] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 01/25/2023]
Abstract
Gene expression is controlled at transcriptional and post-transcriptional levels including decoding of messenger RNA (mRNA) into polypeptides via ribosome-mediated translation. Translational regulation has been intensively studied in the model dicot plant Arabidopsis thaliana, and in this study, we assessed the translational status [proportion of steady-state mRNA associated with ribosomes] of mRNAs by Translating Ribosome Affinity Purification followed by mRNA-sequencing (TRAP-seq) in rice (Oryza sativa), a model monocot plant and the most important food crop. A survey of three tissues found that most transcribed rice genes are translated whereas few transposable elements are associated with ribosomes. Genes with short and GC-rich coding regions are overrepresented in ribosome-associated mRNAs, suggesting that the GC-richness characteristic of coding sequences in grasses may be an adaptation that favors efficient translation. Transcripts with retained introns and extended 5′ untranslated regions are underrepresented on ribosomes, and rice genes belonging to different evolutionary lineages exhibited differential enrichment on the ribosomes that was associated with GC content. Genes involved in photosynthesis and stress responses are preferentially associated with ribosomes, whereas genes in epigenetic regulation pathways are the least enriched on ribosomes. Such variation is more dramatic in rice than that in Arabidopsis and is correlated with the wide variation of GC content of transcripts in rice. Taken together, variation in the translation status of individual transcripts reflects important mechanisms of gene regulation, which may have a role in evolution and diversification.
11
Behringer MG, Hall DW. Selection on Position of Nonsense Codons in Introns. Genetics 2016; 204:1239-1248. [PMID: 27630196 PMCID: PMC5105854 DOI: 10.1534/genetics.116.189894] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/30/2016] [Accepted: 09/09/2016] [Indexed: 02/04/2023] Open
Abstract
Introns occasionally remain in mature messenger RNAs (mRNAs) due to splicing errors, and the translated, aberrant proteins that result represent a metabolic cost and may have other deleterious consequences. The nonsense-mediated decay (NMD) pathway degrades aberrant mRNAs, which it recognizes by the presence of an in-frame premature termination codon (PTC). We investigated whether selection has shaped the location of PTCs in introns to reduce waste and facilitate NMD. We found, across seven model organisms, that in both first and last introns, PTCs occur earlier in introns than expected by chance, suggesting that selection favors earlier position. This pattern is more pronounced in species with larger effective population sizes. The pattern does not hold for last introns in the two mammal species, however, perhaps because in these species NMD is not initiated from 3'-terminal introns. We conclude that there is compelling evidence that the location of PTCs is shaped by selection for reduced waste and efficient degradation of aberrant mRNAs.
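The quantity at the centre of this abstract, the position of the first in-frame premature termination codon in a retained intron, can be located with a simple frame-aware scan. The sketch below is a hypothetical illustration, not the authors' pipeline: `phase` is assumed to be the number of bases of an incomplete codon carried over from the upstream exon, and stop codons that straddle the exon-intron boundary are deliberately ignored for simplicity.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_inframe_stop(intron, phase=0):
    """Return the 0-based index of the first in-frame stop codon, or None.

    phase: bases of a partial codon inherited from the upstream exon (0, 1, 2).
    The first complete in-frame codon inside the intron therefore starts at
    offset (3 - phase) % 3.
    """
    if phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1 or 2")
    intron = intron.upper().replace("U", "T")
    offset = (3 - phase) % 3
    for i in range(offset, len(intron) - 2, 3):
        if intron[i:i + 3] in STOP_CODONS:
            return i
    return None


def relative_stop_position(intron, phase=0):
    """Stop position as a fraction of intron length, or None if no stop."""
    index = first_inframe_stop(intron, phase)
    return None if index is None else index / len(intron)
```

Comparing observed relative positions against those from shuffled or simulated intron sequences would be one way to ask whether stops occur earlier than expected by chance; the abstract does not describe the null model the authors used.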
Affiliation(s)
- Megan G Behringer
  - Department of Genetics, University of Georgia, Athens, Georgia 30602
- David W Hall
  - Department of Genetics, University of Georgia, Athens, Georgia 30602