51
Ahmad T, Sablok G, Tatarinova TV, Xu Q, Deng XX, Guo WW. Evaluation of codon biology in citrus and Poncirus trifoliata based on genomic features and frame corrected expressed sequence tags. DNA Res 2013; 20:135-50. [PMID: 23315666 PMCID: PMC3628444 DOI: 10.1093/dnares/dss039] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionary conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid.
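As a rough illustration of the relative synonymous codon usage (RSCU) measure named in the abstract above, the following Python sketch computes RSCU for a single coding sequence. It assumes the common definition (a codon's observed count divided by the mean count over that amino acid's synonymous codons) and the standard genetic code; the function name `rscu` and the toy sequence are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in `cds` (hypothetical helper)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families, one family per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        for c in codons:
            # observed count / expected count under uniform synonymous usage
            values[c] = counts[c] * len(codons) / total
    return values


# Toy alanine-only sequence: GCT x3, GCC x1 (illustrative input, not real data).
example = rscu("GCTGCTGCTGCC")
print(example["GCT"], example["GCC"], example["GCA"])
```

Values above 1 mark codons used more often than expected under uniform synonymous usage; values below 1 mark under-used codons.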
Affiliation(s)
- Touqeer Ahmad
- Key Laboratory of Horticultural Plant Biology MOE, Huazhong Agricultural University, Wuhan 430070, China
52
Krishnan NM, Pattnaik S, Jain P, Gaur P, Choudhary R, Vaidyanathan S, Deepak S, Hariharan AK, Krishna PB, Nair J, Varghese L, Valivarthi NK, Dhas K, Ramaswamy K, Panda B. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica. BMC Genomics 2012; 13:464. [PMID: 22958331 PMCID: PMC3507787 DOI: 10.1186/1471-2164-13-464] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.3] [Received: 07/24/2012] [Accepted: 09/03/2012] [Indexed: 12/05/2022]
Abstract
BACKGROUND The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using next generation sequencing approach. RESULTS The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. CONCLUSIONS This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides.
Affiliation(s)
- Neeraja M Krishnan
- Ganit Labs, Bio-IT Centre, Institute of Bioinformatics and Applied Biotechnology, Biotech Park, Electronic City Phase I, Bangalore 560100, India
53
Woody JL, Beavis W, Shoemaker RC. Large homogeneous genome regions (isochores) in soybean [Glycine max (L.) Merr.]. Front Genet 2012; 3:98. [PMID: 22934101 PMCID: PMC3365285 DOI: 10.3389/fgene.2012.00098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2012] [Accepted: 05/14/2012] [Indexed: 11/13/2022]
Abstract
The landscape of plant genomes, while slowly being characterized and defined, is still composed primarily of regions of undefined function. Many eukaryotic genomes contain isochore regions, mosaics of homogeneous GC content that can abruptly change from one neighboring isochore to the next. Isochores are broken into families that are characterized by their GC levels. We identified 4,339 compositionally distinct domains and 331 of these were identified as long homogeneous genome regions (LHGRs). We assigned these to four families based on finite mixture models of GC content. We then characterized each family with respect to exon length, gene content, and transposable elements. The LHGR pattern of soybeans is unique in that while the majority of the genes within LHGRs are found within a single LHGR family with a narrow GC range (Family B), that family is not the highest in GC content as seen in vertebrates and invertebrates. Instead Family B has a mean GC content of 35%. The range of GC content for all LHGRs is 16–59% GC which is a larger range than what is typical of vertebrates. This is the first study in which LHGRs have been identified in soybeans and the functions of the genes within the LHGRs have been analyzed.
Affiliation(s)
- J L Woody
- Interdepartmental Genetics Program, Iowa State University Ames, IA, USA
54
O'Connell MJ, Doyle AM, Juenger TE, Donoghue MTA, Keshavaiah C, Tuteja R, Spillane C. In Arabidopsis thaliana codon volatility scores reflect GC3 composition rather than selective pressure. BMC Res Notes 2012; 5:359. [PMID: 22805311 PMCID: PMC3502101 DOI: 10.1186/1756-0500-5-359] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/13/2012] [Accepted: 07/17/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Synonymous codon usage bias has typically been correlated with, and attributed to translational efficiency. However, there are other pressures on genomic sequence composition that can affect codon usage patterns such as mutational biases. This study provides an analysis of the codon usage patterns in Arabidopsis thaliana in relation to gene expression levels, codon volatility, mutational biases and selective pressures. RESULTS We have performed synonymous codon usage and codon volatility analyses for all genes in the A. thaliana genome. In contrast to reports for species from other kingdoms, we find that neither codon usage nor volatility are correlated with selection pressure (as measured by dN/dS), nor with gene expression levels on a genome wide level. Our results show that codon volatility and usage are not synonymous, rather that they are correlated with the abundance of G and C at the third codon position (GC3). CONCLUSIONS Our results indicate that while the A. thaliana genome shows evidence for synonymous codon usage bias, this is not related to the expression levels of its constituent genes. Neither codon volatility nor codon usage are correlated with expression levels or selective pressures but, because they are directly related to the composition of G and C at the third codon position, they are the result of mutational bias. Therefore, in A. thaliana codon volatility and usage do not result from selection for translation efficiency or protein functional shift as measured by positive selection.
Affiliation(s)
- Mary J O'Connell
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Dublin 9, Ireland
55
Tao X, Gu YH, Wang HY, Zheng W, Li X, Zhao CW, Zhang YZ. Digital gene expression analysis based on integrated de novo transcriptome assembly of sweet potato [Ipomoea batatas (L.) Lam]. PLoS One 2012; 7:e36234. [PMID: 22558397 PMCID: PMC3338685 DOI: 10.1371/journal.pone.0036234] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.0] [Received: 09/06/2011] [Accepted: 03/29/2012] [Indexed: 01/16/2023]
Abstract
BACKGROUND Sweet potato (Ipomoea batatas L. [Lam.]) ranks among the top six most important food crops in the world. It is widely grown throughout the world with high and stable yield, strong adaptability, rich nutrient content, and multiple uses. However, little is known about the molecular biology of this important non-model organism due to lack of genomic resources. Hence, studies based on high-throughput sequencing technologies are needed to get a comprehensive and integrated genomic resource and better understanding of gene expression patterns in different tissues and at various developmental stages. METHODOLOGY/PRINCIPAL FINDINGS Illumina paired-end (PE) RNA-Sequencing was performed, and generated 48.7 million of 75 bp PE reads. These reads were de novo assembled into 128,052 transcripts (≥ 100 bp), which correspond to 41.1 million base pairs, by using a combined assembly strategy. Transcripts were annotated by Blast2GO and 51,763 transcripts got BLASTX hits, in which 39,677 transcripts have GO terms and 14,117 have ECs that are associated with 147 KEGG pathways. Furthermore, transcriptome differences of seven tissues were analyzed by using Illumina digital gene expression (DGE) tag profiling and numerous differentially and specifically expressed transcripts were identified. Moreover, the expression characteristics of genes involved in viral genomes, starch metabolism and potential stress tolerance and insect resistance were also identified. CONCLUSIONS/SIGNIFICANCE The combined de novo transcriptome assembly strategy can be applied to other organisms whose reference genomes are not available. The data provided here represent the most comprehensive and integrated genomic resources for cloning and identifying genes of interest in sweet potato. 
Characterization of sweet potato transcriptome provides an effective tool for better understanding the molecular mechanisms of cellular processes including development of leaves and storage roots, tissue-specific gene expression, potential biotic and abiotic stress response in sweet potato.
Affiliation(s)
- Yi-Zheng Zhang
- Key Laboratory of Bio-resources and Eco-environment, Ministry of Education, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Center for Functional Genomics and Bioinformatics, College of Life Sciences, Sichuan University, Chengdu, Sichuan, People's Republic of China
56
Serres-Giardi L, Belkhir K, David J, Glémin S. Patterns and evolution of nucleotide landscapes in seed plants. Plant Cell 2012; 24:1379-97. [PMID: 22492812 PMCID: PMC3398553 DOI: 10.1105/tpc.111.093674] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 05/13/2023]
Abstract
Nucleotide landscapes, which are the way base composition is distributed along a genome, strongly vary among species. The underlying causes of these variations have been much debated. Though mutational bias and selection were initially invoked, GC-biased gene conversion (gBGC), a recombination-associated process favoring the G and C over A and T bases, is increasingly recognized as a major factor. As opposed to vertebrates, evolution of GC content is less well known in plants. Most studies have focused on the GC-poor and homogeneous Arabidopsis thaliana genome and the much more GC-rich and heterogeneous rice (Oryza sativa) genome and have often been generalized as a dicot/monocot dichotomy. This vision is clearly phylogenetically biased and does not allow understanding the mechanisms involved in GC content evolution in plants. To tackle these issues, we used EST data from more than 200 species and provided the most comprehensive description of gene GC content across the seed plant phylogeny so far available. As opposed to the classically assumed dicot/monocot dichotomy, we found continuous variations in GC content from the probably ancestral GC-poor and homogeneous genomes to the more derived GC-rich and highly heterogeneous ones, with several independent enrichment episodes. Our results suggest that gBGC could play a significant role in the evolution of GC content in plant genomes.
Affiliation(s)
- Laurana Serres-Giardi
- Institut des Sciences de l’Evolution de Montpellier, Unité Mixte de Recherche 5554, Centre National de la Recherche Scientifique, Université Montpellier 2, F-34095 Montpellier, France
- Montpellier SupAgro, Unité Mixte de Recherche 1334, Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales, F-34398 Montpellier, France
- Khalid Belkhir
- Institut des Sciences de l’Evolution de Montpellier, Unité Mixte de Recherche 5554, Centre National de la Recherche Scientifique, Université Montpellier 2, F-34095 Montpellier, France
- Jacques David
- Montpellier SupAgro, Unité Mixte de Recherche 1334, Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales, F-34398 Montpellier, France
- Sylvain Glémin
- Institut des Sciences de l’Evolution de Montpellier, Unité Mixte de Recherche 5554, Centre National de la Recherche Scientifique, Université Montpellier 2, F-34095 Montpellier, France
- Address correspondence to
57
Liu H, Huang Y, Du X, Chen Z, Zeng X, Chen Y, Zhang H. Patterns of synonymous codon usage bias in the model grass Brachypodium distachyon. Genet Mol Res 2012; 11:4695-706. [DOI: 10.4238/2012.october.17.3] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/03/2022]
58
Triwitayakorn K, Chatkulkawin P, Kanjanawattanawong S, Sraphet S, Yoocha T, Sangsrakru D, Chanprasert J, Ngamphiw C, Jomchai N, Therawattanasuk K, Tangphatsornruang S. Transcriptome sequencing of Hevea brasiliensis for development of microsatellite markers and construction of a genetic linkage map. DNA Res 2011; 18:471-82. [PMID: 22086998 PMCID: PMC3223080 DOI: 10.1093/dnares/dsr034] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.3] [Indexed: 12/02/2022]
Abstract
To obtain more information on the Hevea brasiliensis genome, we sequenced the transcriptome from the vegetative shoot apex yielding 2 311 497 reads. Clustering and assembly of the reads produced a total of 113 313 unique sequences, comprising 28 387 isotigs and 84 926 singletons. Also, 17 819 expressed sequence tag (EST)-simple sequence repeats (SSRs) were identified from the data set. To demonstrate the use of this EST resource for marker development, primers were designed for 430 of the EST-SSRs. Three hundred and twenty-three primer pairs were amplifiable in H. brasiliensis clones. Polymorphic information content values of selected 47 SSRs among 20 H. brasiliensis clones ranged from 0.13 to 0.71, with an average of 0.51. A dendrogram of genetic similarities between the 20 H. brasiliensis clones using these 47 EST-SSRs suggested two distinct groups that correlated well with clone pedigree. These novel EST-SSRs together with the published SSRs were used for the construction of an integrated parental linkage map of H. brasiliensis based on 81 lines of an F1 mapping population. The map consisted of 97 loci, consisting of 37 novel EST-SSRs and 60 published SSRs, distributed on 23 linkage groups and covered 842.9 cM with a mean interval of 11.9 cM and ∼4 loci per linkage group. Although the numbers of linkage groups exceed the haploid number (18), but with several common markers between homologous linkage groups with the previous map indicated that the F1 map in this study is appropriate for further study in marker-assisted selection.
59
Muyle A, Serres-Giardi L, Ressayre A, Escobar J, Glémin S. GC-biased gene conversion and selection affect GC content in the Oryza genus (rice). Mol Biol Evol 2011; 28:2695-706. [PMID: 21504892 DOI: 10.1093/molbev/msr104] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Indexed: 11/13/2022]
Abstract
Base composition varies among and within eukaryote genomes. Although mutational bias and selection have initially been invoked, more recently GC-biased gene conversion (gBGC) has been proposed to play a central role in shaping nucleotide landscapes, especially in yeast, mammals, and birds. gBGC is a kind of meiotic drive in favor of G and C alleles, associated with recombination. Previous studies have also suggested that gBGC could be at work in grass genomes. However, these studies were carried on third codon positions that can undergo selection on codon usage. As most preferred codons end in G or C in grasses, gBGC and selection can be confounded. Here we investigated further the forces that might drive GC content evolution in the rice genus using both coding and noncoding sequences. We found that recombination rates correlate positively with equilibrium GC content and that selfing species (Oryza sativa and O. glaberrima) have significantly lower equilibrium GC content compared with more outcrossing species. As recombination is less efficient in selfing species, these results suggest that recombination drives GC content. We also detected a positive relationship between expression levels and GC content in third codon positions, suggesting that selection favors codons ending with G or C bases. However, the correlation between GC content and recombination cannot be explained by selection on codon usage alone as it was also observed in noncoding positions. Finally, analyses of polymorphism data ruled out the hypothesis that genomic variation in GC content is due to mutational processes. Our results suggest that both gBGC and selection on codon usage affect GC content in the Oryza genus and likely in other grass species.
Affiliation(s)
- Aline Muyle
- Institut des Sciences de l'Evolution, UMR 5554 CNRS, Université Montpellier II, France
60
Synonymous Codon Usage, GC3, and Evolutionary Patterns Across Plastomes of Three Pooid Model Species: Emerging Grass Genome Models for Monocots. Mol Biotechnol 2011; 49:116-28. [DOI: 10.1007/s12033-011-9383-9] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Indexed: 11/25/2022]
61
Abe H, Ito S, Inoue-Murayama M. Polymorphisms in the extracellular region of dopamine receptor D4 within and among avian orders. J Mol Evol 2011; 72:253-64. [PMID: 21286696 DOI: 10.1007/s00239-011-9432-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 05/29/2010] [Accepted: 01/10/2011] [Indexed: 01/19/2023]
Abstract
Polymorphisms in the dopamine receptor D4 gene (DRD4) have been widely investigated to assess their correlation with variations in animal behavior. We precisely examined polymorphisms in the extracellular region of DRD4 in 75 avian species belonging to 16 orders and detected high degrees of polymorphism at inter- and intraordinal levels. The existence of a variable number of proline repeats (2 to 12 times) in the extracellular region was a common feature in all Neognathae, and a strong codon bias at synonymous sites was found among Passeriformes, Galliformes, and other non-passerine Neoaves. Furthermore, significantly higher values of the pairwise disparity index were detected in Passeriformes, suggesting either a substantial difference in the evolutionary processes or a higher level of mutation rate in the passerine clade. The differences in both codon bias and other genetic parameters among avian taxa would be explained by different levels of selective pressure on the extracellular region of DRD4. Our study suggested that different conformations determined in a sequence-dependent manner at the extracellular region could be one of the key factors affecting the efficiency and accuracy of DRD4 expression. Our findings further imply a possibility that behavioral diversity, which would be important during the processes of adaptive radiation, may be enhanced by the selection acting on indels or single-nucleotide substitutions in the extracellular region of DRD4.
Affiliation(s)
- Hideaki Abe
- Wildlife Research Center of Kyoto University, 2-24 Tanaka-Sekiden-cho, Sakyo-ku, Kyoto 606-8203, Japan
62
Pack-Mutator-like transposable elements (Pack-MULEs) induce directional modification of genes through biased insertion and DNA acquisition. Proc Natl Acad Sci U S A 2011; 108:1537-42. [PMID: 21220310 DOI: 10.1073/pnas.1010814108] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Indexed: 01/13/2023]
Abstract
In monocots, many genes demonstrate a significant negative GC gradient, meaning that the GC content declines along the orientation of transcription. Such a gradient is not observed in the genes of the dicot plant Arabidopsis. In addition, a lack of homology is often observed when comparing the 5' end of the coding region of orthologous genes in rice and Arabidopsis. The reasons for these differences have been enigmatic. The presence of GC-rich sequences at the 5' end of genes may influence the conformation of chromatin, the expression level of genes, as well as the recombination rate. Here we show that Pack-Mutator-like transposable elements (Pack-MULEs) that carry gene fragments specifically acquire GC-rich fragments and preferentially insert into the 5' end of genes. The resulting Pack-MULEs form independent, GC-rich transcripts with a negative GC gradient. Alternatively, the Pack-MULEs evolve into additional exons at the 5' end of existing genes, thus altering the GC content in those regions. We demonstrate that Pack-MULEs modify the 5' end of genes and are at least partially responsible for the negative GC gradient of genes in grasses. Such a unique and global impact on gene composition and gene structure has not been observed for any other transposable elements.
63
Davis JJ, Olsen GJ. Characterizing the native codon usages of a genome: an axis projection approach. Mol Biol Evol 2010; 28:211-21. [PMID: 20679093 PMCID: PMC3002238 DOI: 10.1093/molbev/msq185] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 02/05/2023]
Abstract
Codon usage can provide insights into the nature of the genes in a genome. Genes that are “native” to a genome (have not been recently acquired by horizontal transfer) range in codon usage from a low-bias “typical” usage to a more biased “high-expression” usage characteristic of genes encoding abundant proteins. Genes that differ from these native codon usages are candidates for foreign genes that have been recently acquired by horizontal gene transfer. In this study, we present a method for characterizing the codon usages of native genes—both typical and highly expressed—within a genome. Each gene is evaluated relative to a half line (or axis) in a 59D space of codon usage. The axis begins at the modal codon usage, the usage that matches the largest number of genes in the genome, and it passes through a point representing the codon usage of a set of genes with expression-related bias. A gene whose codon usage matches (does not significantly differ from) a point on this axis is a candidate native gene, and the location of its projection onto the axis provides a general estimate of its expression level. A gene that differs significantly from all points on the axis is a candidate foreign gene. This automated approach offers significant improvements over existing methods. We illustrate this by analyzing the genomes of Pseudomonas aeruginosa PAO1 and Bacillus anthracis A0248, which can be difficult to analyze with commonly used methods due to their biased base compositions. Finally, we use this approach to measure the proportion of candidate foreign genes in 923 bacterial and archaeal genomes. The organisms with the most homogeneous genomes (containing the fewest candidate foreign genes) are mostly endosymbionts and parasites, though with exceptions that include Pelagibacter ubique and Beutenbergia cavernae. 
The organisms with the most heterogeneous genomes (containing the most candidate foreign genes) include members of the genera Bacteroides, Corynebacterium, Desulfotalea, Neisseria, Xylella, and Thermobaculum.
Affiliation(s)
- James J Davis
- Department of Microbiology, University of Illinois at Urbana-Champaign
64
Peng Z, Lu T, Li L, Liu X, Gao Z, Hu T, Yang X, Feng Q, Guan J, Weng Q, Fan D, Zhu C, Lu Y, Han B, Jiang Z. Genome-wide characterization of the biggest grass, bamboo, based on 10,608 putative full-length cDNA sequences. BMC Plant Biol 2010; 10:116. [PMID: 20565830 PMCID: PMC3017805 DOI: 10.1186/1471-2229-10-116] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 09/30/2009] [Accepted: 06/18/2010] [Indexed: 05/04/2023]
Abstract
BACKGROUND With the availability of rice and sorghum genome sequences and ongoing efforts to sequence genomes of other cereal and energy crops, the grass family (Poaceae) has become a model system for comparative genomics and for better understanding gene and genome evolution that underlies phenotypic and ecological divergence of plants. While the genomic resources have accumulated rapidly for almost all major lineages of grasses, bamboo remains the only large subfamily of Poaceae with little genomic information available in databases, which seriously hampers our ability to take a full advantage of the wealth of grass genomic data for effective comparative studies. RESULTS Here we report the cloning and sequencing of 10,608 putative full length cDNAs (FL-cDNAs) primarily from Moso bamboo, Phyllostachys heterocycla cv. pubescens, a large woody bamboo with the highest ecological and economic values of all bamboos. This represents the third largest FL-cDNA collection to date of all plant species, and provides the first insight into the gene and genome structures of bamboos. We developed a Moso bamboo genomic resource database that so far contained the sequences of 10,608 putative FL-cDNAs and nearly 38,000 expressed sequence tags (ESTs) generated in this study. CONCLUSION Analysis of FL-cDNA sequences show that bamboo diverged from its close relatives such as rice, wheat, and barley through an adaptive radiation. A comparative analysis of the lignin biosynthesis pathway between bamboo and rice suggested that genes encoding caffeoyl-CoA O-methyltransferase may serve as targets for genetic manipulation of lignin content to reduce pollutants generated from bamboo pulping.
Affiliation(s)
- Zhenhua Peng
- Chinese Academy of Forestry, Wanshou Shan, Beijing 100091, PR China
- International Network for Bamboo and Rattan, 8 Fu Tong Dong Da Jie, Chaoyang District, Beijing 100102, PR China
- Tingting Lu
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Lubin Li
- Chinese Academy of Forestry, Wanshou Shan, Beijing 100091, PR China
- Xiaohui Liu
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Zhimin Gao
- International Network for Bamboo and Rattan, 8 Fu Tong Dong Da Jie, Chaoyang District, Beijing 100102, PR China
- Tao Hu
- Chinese Academy of Forestry, Wanshou Shan, Beijing 100091, PR China
- Xuewen Yang
- International Network for Bamboo and Rattan, 8 Fu Tong Dong Da Jie, Chaoyang District, Beijing 100102, PR China
- Qi Feng
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Jianping Guan
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Qijun Weng
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Danlin Fan
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Chuanrang Zhu
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Ying Lu
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Bin Han
- National Center for Gene Research & Shanghai Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, PR China
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, PR China
- Zehui Jiang
- Chinese Academy of Forestry, Wanshou Shan, Beijing 100091, PR China
- International Network for Bamboo and Rattan, 8 Fu Tong Dong Da Jie, Chaoyang District, Beijing 100102, PR China
65
Tatarinova TV, Alexandrov NN, Bouck JB, Feldmann KA. GC3 biology in corn, rice, sorghum and other grasses. BMC Genomics 2010; 11:308. [PMID: 20470436 PMCID: PMC2895627 DOI: 10.1186/1471-2164-11-308] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.5] [Received: 09/16/2009] [Accepted: 05/16/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND The third, or wobble, position in a codon provides a high degree of possible degeneracy and is an elegant fault-tolerance mechanism. Nucleotide biases between organisms at the wobble position have been documented and correlated with the abundances of the complementary tRNAs. We and others have noticed a bias for cytosine and guanine at the third position in a subset of transcripts within a single organism. The bias is present in some plant species and warm-blooded vertebrates but not in all plants, or in invertebrates or cold-blooded vertebrates. RESULTS Here we demonstrate that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess upstream TATA boxes, (4) are predominant in certain classes of genes (e.g., stress responsive genes) and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses. CONCLUSIONS Our findings suggest that high levels of GC3 typify a class of genes whose expression is regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion. We discuss the three most probable explanations for GC3 bimodality: biased gene conversion, transcriptional and translational advantage and gene methylation.
Affiliation(s)
- Tatiana V Tatarinova
- Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA.
|
66
|
Wu X, Wu S, Li D, Zhang J, Hou L, Ma J, Liu W, Ren D, Zhu Y, He F. Computational identification of rare codons of Escherichia coli based on codon pairs preference. BMC Bioinformatics 2010; 11:61. [PMID: 20109184 PMCID: PMC2828438 DOI: 10.1186/1471-2105-11-61] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 06/16/2009] [Accepted: 01/28/2010] [Indexed: 12/04/2022] Open
Abstract
Background Codon bias is believed to play an important role in the control of gene expression. In Escherichia coli, some rare codons, which can limit the expression level of exogenous proteins, have been defined by genetic engineering operations. Previous studies have confirmed the existence of codon pair preference in many genomes, but the underlying cause of this bias has not been well established. Here we focus on the patterns of rarely used synonymous codons. A novel method was introduced to identify the rare codons solely from codon pair bias in Escherichia coli. Results In Escherichia coli, we defined the "rare codon pairs" by calculating the frequency of occurrence of all codon pairs in coding sequences. Rare codons, which are avoided in genes, contribute strongly to the formation of rare codon pairs. Our investigation also showed that many of these rare codon pairs contain termination codons and the recognition sites of restriction enzymes. Furthermore, a new index (Frare) was developed. Through comparison with the classical indices we found a significant negative correlation between Frare and the indices which depend on reference datasets. Conclusions Our approach suggests that we can identify rare codons by studying the context in which a codon lies. Also, the frequency of rare codons (Frare) could be a useful index of codon bias even when expression abundance information is lacking.
Affiliation(s)
- Xianming Wu
- School of Biological Science and Technology, Shenyang Agricultural University, Shenyang, PR China
|
67
|
Tangphatsornruang S, Somta P, Uthaipaisanwong P, Chanprasert J, Sangsrakru D, Seehalak W, Sommanas W, Tragoonrung S, Srinives P. Characterization of microsatellites and gene contents from genome shotgun sequences of mungbean (Vigna radiata (L.) Wilczek). BMC Plant Biol 2009; 9:137. [PMID: 19930676 PMCID: PMC2788553 DOI: 10.1186/1471-2229-9-137] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Received: 05/21/2009] [Accepted: 11/24/2009] [Indexed: 05/18/2023]
Abstract
BACKGROUND Mungbean is an economically important crop in Asia. However, genomic research has lagged behind other crop species due to the lack of polymorphic DNA markers found in this crop. The objective of this work is to develop and characterize microsatellite or simple sequence repeat (SSR) markers from genome shotgun sequencing of mungbean. RESULT We have generated and characterized a total of 470,024 genome shotgun sequences covering 100.5 Mb of the mungbean (Vigna radiata (L.) Wilczek) genome using 454 sequencing technology. We identified 1,493 SSR motifs that could be used as potential molecular markers. Among 192 tested primer pairs in 17 mungbean accessions, 60 loci revealed polymorphism with polymorphic information content (PIC) values ranging from 0.0555 to 0.6907 with an average of 0.2594. The majority of microsatellite markers were transferable in Vigna species, whereas transferability rates were only 22.90% and 24.43% in Phaseolus vulgaris and Glycine max, respectively. We also used 16 SSR loci to evaluate the phylogenetic relationship of 35 genotypes of the Asian Vigna group. The genome survey sequences were further analyzed to search for gene content. The evidence suggested that 1,542 gene fragments have been sequence tagged, which fell within intersected existing gene models and shared sequence homology with other proteins in the database. Furthermore, potential microRNAs that could regulate developmental stages and environmental responses were discovered from this dataset. CONCLUSION In this report, we provided evidence of generating remarkable levels of diverse microsatellite markers and gene content from high throughput genome shotgun sequencing of the mungbean genomic DNA. The markers could be used in germplasm analysis, assessing genetic diversity and linkage mapping of mungbean.
Affiliation(s)
- Sithichoke Tangphatsornruang
- National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani 12120, Thailand
| | - Prakit Somta
- Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen Campus, Nakhon Pathom 73140, Thailand
| | - Pichahpuk Uthaipaisanwong
- National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani 12120, Thailand
| | - Juntima Chanprasert
- National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani 12120, Thailand
| | - Duangjai Sangsrakru
- National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani 12120, Thailand
| | - Worapa Seehalak
- Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen Campus, Nakhon Pathom 73140, Thailand
| | - Warunee Sommanas
- Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen Campus, Nakhon Pathom 73140, Thailand
| | - Somvong Tragoonrung
- National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani 12120, Thailand
| | - Peerasak Srinives
- Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Kamphaeng Saen Campus, Nakhon Pathom 73140, Thailand
|
68
|
Zhou M, Li X. Analysis of synonymous codon usage patterns in different plant mitochondrial genomes. Mol Biol Rep 2009; 36:2039-46. [PMID: 19005776 DOI: 10.1007/s11033-008-9414-1] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 06/20/2008] [Accepted: 10/27/2008] [Indexed: 10/21/2022]
Abstract
Codon usage in the mitochondrial genomes of six different plants was analyzed to find general patterns of codon usage in plant mitochondrial genomes. The neutrality analysis indicated that the codon usage patterns of mitochondrial genes were more conserved in GC content, with no correlation between GC12 and GC3. T- and A-ending codons were detected as the preferred codons in plant mitochondrial genomes. The Parity Rule 2 plot analysis showed that T was used more frequently than A. The EN(C)-plot showed that although a majority of the points with low EN(C) values lay below the expected curve, a few genes lay on the expected curve. Correspondence analysis of relative synonymous codon usage yielded a first axis that explained only part of the variation in codon usage. These findings suggest that natural selection is likely to play a large role in codon usage bias in plant mitochondrial genomes, but that several other factors are also likely to be involved in determining the selective constraints on codon bias. In addition, 1 codon (P. patens), 6 codons (Z. mays), 9 codons (T. aestivum), 15 codons (A. thaliana), 15 codons (M. polymorpha) and 15 codons (N. tabacum) were defined as the preferred codons of the six plant mitochondrial genomes.
Affiliation(s)
- Meng Zhou
- Department of Bioinformatics, Harbin Medical University, Harbin 150086, China.
|
69
|
Khalturin K, Hemmrich G, Fraune S, Augustin R, Bosch TCG. More than just orphans: are taxonomically-restricted genes important in evolution? Trends Genet 2009; 25:404-13. [PMID: 19716618 DOI: 10.1016/j.tig.2009.07.006] [Citation(s) in RCA: 300] [Impact Index Per Article: 20.0] [Received: 05/08/2009] [Revised: 07/13/2009] [Accepted: 07/13/2009] [Indexed: 10/20/2022]
Abstract
Comparative genome analyses indicate that every taxonomic group so far studied contains 10-20% of genes that lack recognizable homologs in other species. Do such 'orphan' or 'taxonomically-restricted' genes comprise spurious, non-functional ORFs, or does their presence reflect important evolutionary processes? Recent studies in basal metazoans such as Nematostella, Acropora and Hydra have shed light on the function of these genes, and now indicate that they are involved in important species-specific adaptive processes. Here we focus on evidence from Hydra suggesting that taxonomically-restricted genes play a role in the creation of phylum-specific novelties such as cnidocytes, in the generation of morphological diversity, and in the innate defence system. We propose that taxon-specific genes drive morphological specification, enabling organisms to adapt to changing conditions.
Affiliation(s)
- Konstantin Khalturin
- Zoological Institute, Christian-Albrechts-University Kiel, Olshausenstrasse 40, 24098 Kiel, Germany
|
70
|
Nguyen MN, Ma J, Fogel GB, Rajapakse JC. Di-codon usage for classification of genes. Biosystems 2009; 98:1-6. [PMID: 19577612 DOI: 10.1016/j.biosystems.2009.06.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/26/2009] [Revised: 06/11/2009] [Accepted: 06/14/2009] [Indexed: 11/17/2022]
Abstract
Genes are often classified into biologically related groups so that inferences on their functions can be made. This paper demonstrates that di-codon usage is a useful feature for gene classification and gives better classification accuracy than codon usage. Our experiments with different classifiers show that support vector machines perform better than other classifiers in classifying genes using di-codon usage as features. The method is illustrated on 1841 HLA sequences, which are classified into two major classes, HLA-I and HLA-II, and further classified into the subclasses of the major classes. By using both codon and di-codon features, we show near perfect accuracies in the classification of HLA molecules into major classes and their subclasses.
|
71
|
Li Z, Zhang H, Ge S, Gu X, Gao G, Luo J. Expression pattern divergence of duplicated genes in rice. BMC Bioinformatics 2009; 10 Suppl 6:S8. [PMID: 19534757 PMCID: PMC2697655 DOI: 10.1186/1471-2105-10-s6-s8] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 12/25/2022] Open
Abstract
Background Genome-wide duplication is ubiquitous during diversification of the angiosperms, and gene duplication is one of the most important mechanisms for evolutionary novelties. As an indicator of functional evolution, the divergence of expression patterns following duplication events has drawn great attention in recent years. Using large-scale whole-genome microarray data, we systematically analyzed expression divergence patterns of rice genes from block, tandem and dispersed duplications. Results We found a significant difference in expression divergence patterns for the three types of duplicated gene pairs. Expression correlation is significantly higher for gene pairs from block and tandem duplications than those from dispersed duplications. Furthermore, a significant correlation was observed between the expression divergence and the synonymous substitution rate which is an approximate proxy of divergence time. Thus, both duplication types and divergence time influence the difference in expression divergence. Using a linear model, we investigated the influence of these two variables and found that the difference in expression divergence between block and dispersed duplicates is attributed largely to their different divergence time. In addition, the difference in expression divergence between tandem and the other two types of duplicates is attributed to both divergence time and duplication type. Conclusion Consistent with previous studies on Arabidopsis, our results revealed a significant difference in expression divergence between the types of duplicated genes and a significant correlation between expression divergence and synonymous substitution rate. We found that the attribution of duplication mode to the expression divergence implies a different evolutionary course of duplicated genes.
Affiliation(s)
- Zhe Li
- College of Life Sciences, National Laboratory of Plant Genetic Engineering and Protein Engineering, Center for Bioinformatics, Peking University, Beijing 100871, PR China.
|
72
|
Liu H, He R, Zhang H, Huang Y, Tian M, Zhang J. Analysis of synonymous codon usage in Zea mays. Mol Biol Rep 2009; 37:677-84. [DOI: 10.1007/s11033-009-9521-7] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Received: 12/13/2008] [Accepted: 03/17/2009] [Indexed: 11/29/2022]
|
73
|
Kondou Y, Higuchi M, Takahashi S, Sakurai T, Ichikawa T, Kuroda H, Yoshizumi T, Tsumoto Y, Horii Y, Kawashima M, Hasegawa Y, Kuriyama T, Matsui K, Kusano M, Albinsky D, Takahashi H, Nakamura Y, Suzuki M, Sakakibara H, Kojima M, Akiyama K, Kurotani A, Seki M, Fujita M, Enju A, Yokotani N, Saitou T, Ashidate K, Fujimoto N, Ishikawa Y, Mori Y, Nanba R, Takata K, Uno K, Sugano S, Natsuki J, Dubouzet JG, Maeda S, Ohtake M, Mori M, Oda K, Takatsuji H, Hirochika H, Matsui M. Systematic approaches to using the FOX hunting system to identify useful rice genes. Plant J 2009; 57:883-94. [PMID: 18980645 DOI: 10.1111/j.1365-313x.2008.03733.x] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 05/07/2023]
Abstract
Ectopic gene expression, or the gain-of-function approach, has the advantage that once the function of a gene is known the gene can be transferred to many different plants by transformation. We previously reported a method, called FOX hunting, that involves ectopic expression of Arabidopsis full-length cDNAs in Arabidopsis to systematically generate gain-of-function mutants. This technology is most beneficial for generating a heterologous gene resource for analysis of useful plant gene functions. As an initial model we generated more than 23,000 independent Arabidopsis transgenic lines that expressed rice fl-cDNAs (Rice FOX Arabidopsis lines). The short generation time and rapid and efficient transformation frequency of Arabidopsis enabled the functions of the rice genes to be analyzed rapidly. We screened rice FOX Arabidopsis lines for alterations in morphology, photosynthesis, element accumulation, pigment accumulation, hormone profiles, secondary metabolites, pathogen resistance, salt tolerance, UV signaling, high light tolerance, and heat stress tolerance. Some of the mutant phenotypes displayed by rice FOX Arabidopsis lines resulted from the expression of rice genes that had no homologs in Arabidopsis. This result demonstrated that rice fl-cDNAs could be used to introduce new gene functions in Arabidopsis. Furthermore, these findings showed that rice gene function could be analyzed by employing Arabidopsis as a heterologous host. This technology provides a framework for the analysis of plant gene function in a heterologous host and of plant improvement by using heterologous gene resources.
|
74
|
Alexandrov NN, Brover VV, Freidin S, Troukhan ME, Tatarinova TV, Zhang H, Swaller TJ, Lu YP, Bouck J, Flavell RB, Feldmann KA. Insights into corn genes derived from large-scale cDNA sequencing. Plant Mol Biol 2009; 69:179-94. [PMID: 18937034 PMCID: PMC2709227 DOI: 10.1007/s11103-008-9415-4] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.9] [Received: 08/15/2008] [Accepted: 10/01/2008] [Indexed: 05/19/2023]
Abstract
We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST).
|
75
|
An extensive analysis on the global codon usage pattern of baculoviruses. Arch Virol 2008; 153:2273-82. [DOI: 10.1007/s00705-008-0260-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 09/03/2008] [Accepted: 10/27/2008] [Indexed: 12/18/2022]
|
76
|
Khalturin K, Anton-Erxleben F, Sassmann S, Wittlieb J, Hemmrich G, Bosch TCG. A novel gene family controls species-specific morphological traits in Hydra. PLoS Biol 2008; 6:e278. [PMID: 19018660 PMCID: PMC2586386 DOI: 10.1371/journal.pbio.0060278] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.3] [Received: 12/05/2007] [Accepted: 10/02/2008] [Indexed: 12/02/2022] Open
Abstract
Understanding the molecular events that underlie the evolution of morphological diversity is a major challenge in biology. Here, to identify genes whose expression correlates with species-specific morphologies, we compared transcriptomes of two closely related Hydra species. We find that species-specific differences in tentacle formation correlate with expression of a taxonomically restricted gene encoding a small secreted protein. We show that gain of function induces changes in morphology that mirror the phenotypic differences observed between species. These results suggest that "novel" genes may be involved in the generation of species-specific morphological traits.
Affiliation(s)
- Konstantin Khalturin
- Zoological Institute, Christian-Albrechts-University, Am Botanischen Garten 1-9, 24118 Kiel, Germany
| | | | - Sylvia Sassmann
- Zoological Institute, Christian-Albrechts-University, Am Botanischen Garten 1-9, 24118 Kiel, Germany
| | - Jörg Wittlieb
- Zoological Institute, Christian-Albrechts-University, Am Botanischen Garten 1-9, 24118 Kiel, Germany
| | - Georg Hemmrich
- Zoological Institute, Christian-Albrechts-University, Am Botanischen Garten 1-9, 24118 Kiel, Germany
| | - Thomas C. G. Bosch
- Zoological Institute, Christian-Albrechts-University, Am Botanischen Garten 1-9, 24118 Kiel, Germany
|
77
|
Suzuki H, Brown CJ, Forney LJ, Top EM. Comparison of correspondence analysis methods for synonymous codon usage in bacteria. DNA Res 2008; 15:357-65. [PMID: 18940873 PMCID: PMC2608848 DOI: 10.1093/dnares/dsn028] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Indexed: 12/02/2022] Open
Abstract
Synonymous codon usage varies both between organisms and among genes within a genome, and arises due to differences in G + C content, replication strand skew, or gene expression levels. Correspondence analysis (CA) is widely used to identify major sources of variation in synonymous codon usage among genes and provides a way to identify horizontally transferred or highly expressed genes. Four methods of CA have been developed based on three kinds of input data: absolute codon frequency, relative codon frequency, and relative synonymous codon usage (RSCU) as well as within-group CA (WCA). Although different CA methods have been used in the past, no comprehensive comparative study has been performed to evaluate their effectiveness. Here, the four CA methods were evaluated by applying them to 241 bacterial genome sequences. The results indicate that WCA is more effective than the other three methods in generating axes that reflect variations in synonymous codon usage. Furthermore, WCA reveals sources that were previously unnoticed in some genomes; e.g. synonymous codon usage related to replication strand skew was detected in Rickettsia prowazekii. Though CA based on RSCU is widely used, our evaluation indicates that this method does not perform as well as WCA.
Affiliation(s)
- Haruo Suzuki
- Department of Biological Sciences and Initiative for Bioinformatics and Evolutionary Studies, University of Idaho, PO Box 443051, Moscow, Idaho 83844-3051, USA.
|
78
|
Mukhopadhyay P, Basak S, Ghosh TC. Differential selective constraints shaping codon usage pattern of housekeeping and tissue-specific homologous genes of rice and arabidopsis. DNA Res 2008; 15:347-56. [PMID: 18827062 PMCID: PMC2608846 DOI: 10.1093/dnares/dsn023] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022] Open
Abstract
Intra-genomic variation between housekeeping and tissue-specific genes has long been a subject of interest in higher eukaryotes. To date, however, no such investigation has been done in plants. The availability of whole-genome expression data for both rice and Arabidopsis has made it possible to examine the evolutionary forces shaping codon usage patterns in both housekeeping and tissue-specific genes in plants. In the present work, we have taken 4065 rice-Arabidopsis homologous gene pairs to study the evolutionary forces responsible for codon usage divergence between housekeeping and tissue-specific genes. In both rice and Arabidopsis, it is mutational bias that regulates error minimization in highly expressed genes of both housekeeping and tissue-specific genes. Our results show that, in comparison to tissue-specific genes, housekeeping genes are under strong selective constraint in plants. However, in tissue-specific genes, lowly expressed genes are under stronger selective constraint compared with highly expressed genes. We demonstrated that constraint acting on mRNA secondary structure is responsible for modulating codon usage variations in rice tissue-specific genes. Thus, different evolutionary forces must underlie the evolution of synonymous codon usage of highly expressed genes of housekeeping and tissue-specific genes in rice and Arabidopsis.
Affiliation(s)
- Pamela Mukhopadhyay
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
|
79
|
Evidence for GC preference by monocot Dicer-like proteins. Biochem Biophys Res Commun 2008; 368:433-7. [DOI: 10.1016/j.bbrc.2008.01.110] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 01/21/2008] [Accepted: 01/23/2008] [Indexed: 01/07/2023]
|
80
|
Mukhopadhyay P, Basak S, Ghosh TC. Nature of selective constraints on synonymous codon usage of rice differs in GC-poor and GC-rich genes. Gene 2007; 400:71-81. [PMID: 17629420 DOI: 10.1016/j.gene.2007.05.027] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 02/22/2007] [Revised: 04/28/2007] [Accepted: 05/31/2007] [Indexed: 10/23/2022]
Abstract
Synonymous codon usage and cellular tRNA abundance are thought to have co-evolved to optimize translational efficiency in highly expressed genes. Here, taking advantage of publicly available gene expression data for rice and Arabidopsis, we demonstrate that tRNA gene copy number is not the only driving force favoring translational selection in all highly expressed genes of rice. We found that forces favoring translational selection differ between GC-rich and GC-poor classes of genes. In support of these results, we also showed that in highly expressed genes of the GC-poor class there is a perfect correspondence between the majority of preferred codons and tRNA gene copy number, which confers translational efficiency on this group of genes. However, tRNA gene copy number is not fully consistent with models of translational selection in the GC-rich group of genes, where constraints on mRNA secondary structure play a role in optimizing codon usage in highly expressed genes.
Affiliation(s)
- Pamela Mukhopadhyay
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata-700 054, India
|
81
|
Philippe H, Blanchette M. Proceedings of the First International Conference on Phylogenomics. March 15-19, 2006. Quebec, Canada. BMC Evol Biol 2007; 7 Suppl 1:S1-16. [PMID: 17288567 PMCID: PMC1796603 DOI: 10.1186/1471-2148-7-s1-s1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
The First Phylogenomics Conference was held in Ste-Adèle (Québec, Canada) in March 2006. Selected papers appear in this special issue of BMC Evolutionary Biology. Here, we give an introduction to the field and provide an overview of the articles presented in this issue.
Affiliation(s)
- Hervé Philippe
- Canadian Institute for Advanced Research, Centre Robert Cedergren, Département de Biochimie, Université de Montréal, 2900 Boulevard Édouard-Montpetit, Montréal, Québec, H3T 1J4, Canada
| | - Mathieu Blanchette
- McGill Centre for Bioinformatics, McGill University, 3775 University Street, Montréal, Québec, H3A 2B4, Canada
|