1
Li M, Wang J, Dai R, Smagghe G, Wang X, You S. Comparative analysis of codon usage patterns and phylogenetic implications of five mitochondrial genomes of the genus Japanagallia Ishihara, 1955 (Hemiptera, Cicadellidae, Megophthalminae). PeerJ 2023; 11:e16058. [PMID: 37780390 PMCID: PMC10538298 DOI: 10.7717/peerj.16058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 08/17/2023] [Indexed: 10/03/2023]
Abstract
Japanagallia is a genus of leafhoppers (Cicadomorpha: Cicadellidae), plant piercing-sucking insects whose species are difficult to distinguish by morphological characteristics. So far, only one complete mitochondrial genome has been reported for the genus Japanagallia. Therefore, to better understand this group, we assembled and annotated the complete mitochondrial genomes of five Japanagallia species and analyzed their codon usage patterns. Nucleotide composition analysis showed that AT content was higher than GC content, and the protein-coding sequences preferred to end with A/T at the third codon position. Relative synonymous codon usage analysis revealed that most over-represented codons end with A or T. Parity plot analysis revealed that the codon usage bias of mitochondrial genes was influenced by both natural selection and mutation pressure. In the neutrality plot, the slopes of the regression lines were < 0.5, suggesting that natural selection played a major role while mutation pressure was of minor importance. The effective number of codons showed that the codon usage bias between genes and genomes was low. Correspondence analysis revealed that the codon usage pattern differed among the 13 protein-coding genes. Phylogenetic analyses based on three datasets using two methods (maximum likelihood and Bayesian inference) recovered the monophyly of Megophthalminae with high support values (bootstrap support (BS) = 100, Bayesian posterior probability (PP) = 1). In the obtained topology, the seven Japanagallia species clustered into a monophyletic group and formed a sister group with Durgade. In conclusion, our study provides a reference for future research on the evolution, identification, and phylogenetic relationships of Japanagallia species.
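The neutrality-plot criterion mentioned in this abstract can be illustrated with a minimal sketch: GC12 (mean GC content at the first and second codon positions) is regressed on GC3 across genes, and the slope is inspected. The function name `neutrality_slope` and the sample values are hypothetical and are not taken from the paper; ordinary least squares is assumed as the fitting method.

```python
def neutrality_slope(gc3, gc12):
    """Fit gc12 = slope * gc3 + intercept by ordinary least squares.

    gc3 and gc12 are equal-length sequences of per-gene GC percentages.
    Returns (slope, intercept).
    """
    n = len(gc3)
    if n != len(gc12) or n < 2:
        raise ValueError("need two equal-length series with at least 2 genes")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical per-gene values, for illustration only.
slope, intercept = neutrality_slope([10.0, 20.0, 30.0], [32.0, 34.0, 36.0])
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Under the usual reading of a neutrality plot, a slope close to 1 points toward mutation pressure, while a slope well below 0.5 is taken as a sign that selection dominates, which is the interpretation the abstract applies.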
Affiliation(s)
- Min Li
- Institute of Entomology, Guizhou University, The Provincial Key Laboratory for Agricultural Pest Management Mountainous Region, Guiyang, Guizhou, China
| | - Jiajia Wang
- College of Biology and Food Engineering, Chuzhou University, Chuzhou, Anhui, China
| | - Renhuai Dai
- Institute of Entomology, Guizhou University, The Provincial Key Laboratory for Agricultural Pest Management Mountainous Region, Guiyang, Guizhou, China
| | - Guy Smagghe
- Institute of Entomology, Guizhou University, The Provincial Key Laboratory for Agricultural Pest Management Mountainous Region, Guiyang, Guizhou, China
- Cellular and Molecular Life Sciences, Department of Biology, Brussels, Belgium
- Laboratory of Agrozoology, Dep. of Crop Protection, Ghent University, Ghent, Belgium
| | - Xianyi Wang
- Engineering Research Center of Medical Biotechnology, School of Biology and Engineering, Guizhou Medical University, Guiyang, Guizhou, China
| | - Siying You
- Institute of Entomology, Guizhou University, The Provincial Key Laboratory for Agricultural Pest Management Mountainous Region, Guiyang, Guizhou, China
2
Nalabothu RL, Fisher KJ, LaBella AL, Meyer TA, Opulente DA, Wolters JF, Rokas A, Hittinger CT. Codon Optimization Improves the Prediction of Xylose Metabolism from Gene Content in Budding Yeasts. Mol Biol Evol 2023; 40:msad111. [PMID: 37154525 PMCID: PMC10263009 DOI: 10.1093/molbev/msad111] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/08/2022] [Revised: 02/28/2023] [Accepted: 05/04/2023] [Indexed: 05/10/2023]
Abstract
Xylose is the second most abundant monomeric sugar in plant biomass. Consequently, xylose catabolism is an ecologically important trait for saprotrophic organisms, as well as a fundamentally important trait for industries that hope to convert plant mass to renewable fuels and other bioproducts using microbial metabolism. Although common across fungi, xylose catabolism is rare within Saccharomycotina, the subphylum that contains most industrially relevant fermentative yeast species. The genomes of several yeasts unable to consume xylose have been previously reported to contain the full set of genes in the XYL pathway, suggesting the absence of a gene-trait correlation for xylose metabolism. Here, we measured growth on xylose and systematically identified XYL pathway orthologs across the genomes of 332 budding yeast species. Although the XYL pathway coevolved with xylose metabolism, we found that pathway presence only predicted xylose catabolism about half of the time, demonstrating that a complete XYL pathway is necessary, but not sufficient, for xylose catabolism. We also found that XYL1 copy number was positively correlated, after phylogenetic correction, with xylose utilization. We then quantified codon usage bias of XYL genes and found that XYL3 codon optimization was significantly higher, after phylogenetic correction, in species able to consume xylose. Finally, we showed that codon optimization of XYL2 was positively correlated, after phylogenetic correction, with growth rates in xylose medium. We conclude that gene content alone is a weak predictor of xylose metabolism and that using codon optimization enhances the prediction of xylose metabolism from yeast genome sequence data.
Affiliation(s)
- Rishitha L Nalabothu
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
| | - Kaitlin J Fisher
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI
- Department of Biological Sciences, State University of New York at Oswego, Oswego, NY
| | - Abigail Leavitt LaBella
- Department of Biological Sciences, Vanderbilt University, Nashville, TN
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC
| | - Taylor A Meyer
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
| | - Dana A Opulente
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
- Department of Biology, Villanova University, Villanova, PA
| | - John F Wolters
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
| | - Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, TN
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN
| | - Chris Todd Hittinger
- Laboratory of Genetics, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
3
Huang X, Jiao Y, Guo J, Wang Y, Chu G, Wang M. Analysis of codon usage patterns in Haloxylon ammodendron based on genomic and transcriptomic data. Gene X 2022; 845:146842. [PMID: 36038027 DOI: 10.1016/j.gene.2022.146842] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/11/2022] [Revised: 08/17/2022] [Accepted: 08/23/2022] [Indexed: 11/28/2022]
Abstract
Haloxylon ammodendron, a xero-halophytic shrub of Chenopodiaceae, is a dominant species in deserts that has strong drought and salt tolerance and plays an important role in sand fixation. However, the codon usage bias (CUB) in H. ammodendron is still unclear. In this study, the codon usage patterns of 38,657 coding sequences (CDSs) in the newly released whole-genome sequence data of H. ammodendron and 3,948 CDSs in the previously obtained transcriptome sequencing data were compared and analyzed. The results showed that CDSs with a total guanine and cytosine (GC) content in the range of 40% to 45% were the most numerous in both the genome and the transcriptome. The GC1, GC2, and GC3 contents of genomic CDSs were 50.83%, 40.56%, and 40.23%, respectively, and those of CDSs in the transcriptome were 47.16%, 39.02%, and 39.59%, respectively. Therefore, the bases in H. ammodendron were rich in adenine and thymine, and the overall codon usage was biased toward A- and U-ending codons. The analysis of the neutrality plot, effective number of codons (ENC) plot, and parity rule 2 (PR2) bias plot showed that both natural selection and mutation pressure had great influences on the CUB of H. ammodendron, but natural selection was the most important determinant. In addition, gene expression level and the function and protein length of some specific genes also influenced the codon usage pattern. Finally, a total of 25 common optimal codons were found in the genomic and transcriptomic data, and the ratio of AU-ending to GC-ending codons was 24:1. It should be noted that the salt-tolerant unigenes had similar codon usage, and the highly expressed genes had a higher usage frequency of optimal codons and lower GC content than the lowly expressed genes. In addition, there was no difference in the ENC values of salt-tolerant unigenes in H. ammodendron, and the expression level of the genes had no correlation with CAI. This study will help to elucidate the formation mechanism of H. ammodendron codon usage bias, and contribute to the identification of new genes and to genetic engineering studies on H. ammodendron.
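The GC1, GC2, and GC3 values quoted in this abstract are position-specific GC percentages. A minimal sketch of how such values could be computed for a single in-frame coding sequence follows; the function name `gc_by_position` and the toy sequence are hypothetical, and the study's own pipeline is not described in the abstract.

```python
def gc_by_position(cds):
    """Return (GC1, GC2, GC3) as percentages for an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence contains no complete codon")
    n_codons = usable // 3
    percentages = []
    for offset in range(3):
        # Every third base starting at this codon position.
        bases = seq[offset:usable:3]
        gc_count = sum(base in "GC" for base in bases)
        percentages.append(100.0 * gc_count / n_codons)
    return tuple(percentages)


# Toy sequence (ATG GCC TAA), for illustration only.
gc1, gc2, gc3 = gc_by_position("ATGGCCTAA")
print(f"GC1={gc1:.2f}% GC2={gc2:.2f}% GC3={gc3:.2f}%")
```

Genome-wide figures like those in the abstract would be averages of these per-sequence values over all CDSs, and some analyses exclude start and stop codons first; that choice is left open here.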
Affiliation(s)
- Xiang Huang
- College of Agriculture, Shihezi University, Shihezi Xinjiang 832003, P.R. China
| | - Yalin Jiao
- College of Agriculture, Shihezi University, Shihezi Xinjiang 832003, P.R. China
| | - Jiaxing Guo
- College of Agriculture, Shihezi University, Shihezi Xinjiang 832003, P.R. China
| | - Ying Wang
- College of Agriculture, Shihezi University, Shihezi Xinjiang 832003, P.R. China
| | - Guangming Chu
- College of Agriculture, Shihezi University, Shihezi Xinjiang 832003, P.R. China
| | - Mei Wang
- College of Agriculture, Shihezi University, Shihezi Xinjiang 832003, P.R. China.
4
Derbyshire MC, Newman TE, Khentry Y, Owolabi Taiwo A. The evolutionary and molecular features of the broad-host-range plant pathogen Sclerotinia sclerotiorum. MOLECULAR PLANT PATHOLOGY 2022; 23:1075-1090. [PMID: 35411696 PMCID: PMC9276942 DOI: 10.1111/mpp.13221] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 11/09/2021] [Revised: 03/09/2022] [Accepted: 03/25/2022] [Indexed: 05/21/2023]
Abstract
Sclerotinia sclerotiorum is a pathogenic fungus that infects hundreds of plant species, including many of the world's most important crops. Key features of S. sclerotiorum include its extraordinary host range, preference for dicotyledonous plants, relatively slow evolution, and production of protein effectors that are active in multiple host species. Plant resistance to this pathogen is highly complex, typically involving numerous polymorphisms with infinitesimally small effects, which makes resistance breeding a major challenge. Due to its economic significance, S. sclerotiorum has been subjected to a large amount of molecular and evolutionary research. In this updated pathogen profile, we review the evolutionary and molecular features of S. sclerotiorum and discuss avenues for future research into this important species.
Affiliation(s)
- Mark C. Derbyshire
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Toby E. Newman
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Yuphin Khentry
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Akeem Owolabi Taiwo
- Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
5
Abdoli R, Mazumder TH, Nematollahian S, Zanjani RS, Mesbah RA, Uddin A. Gaining insights into the compositional constraints and molecular phylogeny of five silkworms mitochondrial genome. Int J Biol Macromol 2022; 206:543-552. [PMID: 35245576 DOI: 10.1016/j.ijbiomac.2022.02.135] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/29/2021] [Revised: 12/08/2021] [Accepted: 02/22/2022] [Indexed: 11/28/2022]
Abstract
This study was performed to examine codon usage bias (CUB), genetic similarity and phylogeny using the complete mitochondrial genomes, along with separate sequences of the 13 protein-coding genes per genome, of five types of silkworm: Bombyx mori, Bombyx mandarina, Samia cynthia ricini, Antheraea pernyi and Antheraea assama. Nucleotide composition analysis suggested that AT content was higher than GC content, and t-test analysis revealed a significant difference (p < 0.01) between AT and GC content. Relative synonymous codon usage analysis revealed that most over-represented codons end with A or T. Parity plot analysis revealed that both natural selection and mutation pressure influenced the CUB of mitochondrial genes, while neutrality plot analysis suggested that the role of natural selection was greater than that of mutation pressure. The effective number of codons (ENC) revealed that the CUB was low among genes and genomes. In the phylogenetic analysis of complete mitochondrial genomes, B. mori fell in the same cluster as Bombyx mandarina and showed the greatest similarity to it (96.7%). Among the protein-coding genes, COX1, COX2 and COX3 showed the most obvious differences. In conclusion, comparative analysis of mitochondrial genomes could be used to identify differences in gene organization and to support accurate phylogenetic analysis and clustering of different types of silkworms.
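Relative synonymous codon usage (RSCU), used in this abstract, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is illustrative only: the function name `rscu` is hypothetical, and it assumes the standard genetic code, whereas a mitochondrial analysis would need the appropriate mitochondrial translation table instead.

```python
from collections import Counter
from itertools import product

# Standard genetic code (an assumption; mitochondrial codes differ).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in the sequence."""
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # observed / (total / family size)
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy sequence of four alanine codons, for illustration only.
for codon, value in sorted(rscu("GCTGCTGCAGCC").items()):
    print(codon, round(value, 2))
```

Values above 1 mark codons used more often than expected under equal usage; a threshold of 1.6 is often used for "over-represented", though the abstract does not state which cutoff the authors applied.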
Affiliation(s)
- Ramin Abdoli
- Iran Silk Research Center, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran.
| | | | - Shahla Nematollahian
- Iran Silk Research Center, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran
| | - Reza Sourati Zanjani
- Iran Silk Research Center, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran
| | - Rahim Abdollahi Mesbah
- Iran Silk Research Center, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran
| | - Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India.
6
Zhang Y, Shen Z, Meng X, Zhang L, Liu Z, Liu M, Zhang F, Zhao J. Codon usage patterns across seven Rosales species. BMC PLANT BIOLOGY 2022; 22:65. [PMID: 35123393 PMCID: PMC8817548 DOI: 10.1186/s12870-022-03450-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/24/2021] [Accepted: 01/31/2022] [Indexed: 05/03/2023]
Abstract
BACKGROUND: Codon usage bias (CUB) analysis is an effective method for studying specificity, evolutionary relationships, and mRNA translation and discovering new genes among various species. In general, CUB analysis is mainly performed within one species or between closely related species and no such study has been applied among species with distant genetic relationships. Here, seven Rosales species with high economic value were selected to conduct CUB analysis. RESULTS: The results showed that the average GC1, GC2 and GC3 contents were 51.08, 40.52 and 43.12%, respectively, indicating that the A/T content is more abundant and the Rosales species prefer A/T at the last codon position. Neutrality plot and ENc plot analysis revealed that natural selection was the main factor leading to CUB during the evolution of Rosales species. All 7 Rosales species contained three high-frequency codons, AGA, GTT and TTG, encoding Arg, Val and Leu, respectively. The 7 Rosales species differed in high-frequency codon pairs and the distribution of GC3, though the usage patterns of closely related species were more consistent. The results of the biclustering heat map among 7 Rosales species and 20 other species were basically consistent with the results of genome data, suggesting that CUB analysis is an effective method for revealing evolutionary relationships among species at the family or order level. In addition, chlorophytes prefer G/C-ending codons, while monocotyledonous and dicotyledonous plants prefer A/T-ending codons. CONCLUSIONS: The CUB pattern among Rosales species was mainly affected by natural selection. This work is the first to highlight the CUB patterns and characteristics of Rosales species and provides a new perspective for studying genetic relationships across a wide range of species.
Affiliation(s)
- Yao Zhang
- College of Life Science, Hebei Agricultural University, Baoding, China
- Hebei Key Laboratory of Plant Physiology and Molecular Pathology, Hebei Agricultural University, Baoding, China
| | - Zenan Shen
- High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190 China
| | - Xiangrui Meng
- College of Life Science, Hebei Agricultural University, Baoding, China
- Hebei Key Laboratory of Plant Physiology and Molecular Pathology, Hebei Agricultural University, Baoding, China
| | - Liman Zhang
- College of Life Science, Hebei Agricultural University, Baoding, China
- Hebei Key Laboratory of Plant Physiology and Molecular Pathology, Hebei Agricultural University, Baoding, China
| | - Zhiguo Liu
- Research Center of Chinese Jujube, Hebei Agricultural University, Baoding, China
| | - Mengjun Liu
- Research Center of Chinese Jujube, Hebei Agricultural University, Baoding, China
| | - Fa Zhang
- High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190 China
| | - Jin Zhao
- College of Life Science, Hebei Agricultural University, Baoding, China
- Hebei Key Laboratory of Plant Physiology and Molecular Pathology, Hebei Agricultural University, Baoding, China
7
Andargie M, Congyi Z. Genome-wide analysis of codon usage in sesame (Sesamum indicum L.). Heliyon 2022; 8:e08687. [PMID: 35106386 PMCID: PMC8789531 DOI: 10.1016/j.heliyon.2021.e08687] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/15/2021] [Revised: 11/20/2021] [Accepted: 12/24/2021] [Indexed: 10/28/2022]
Abstract
Sesamum indicum is an ancient oil crop grown in tropical and subtropical areas of the world. We analyzed 23,538 coding sequences (CDS) of S. indicum to understand the factors shaping codon usage in this important oil crop plant. We identified eleven highly preferred codons in S. indicum that have AT-endings. The slope of the neutrality plot was less than one, while the effective number of codons (ENC) plot showed points distributed above and below the standard curve. There is a significant relationship between protein length and relative synonymous codon usage (RSCU) at the primary axis, while there is a weak correlation between protein length and Nc values. Correspondence analysis conducted on RSCU values differentiated CDS based on their GC content and their characteristic feature and showed a discrete distribution. Moreover, by determining codon usage, we found that the majority of the lignan biosynthesis related genes showed a weaker codon usage bias. These results provide insights into understanding codon evolution in sesame.
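The "standard curve" of an ENC plot is commonly taken from Wright's approximation for the effective number of codons expected when codon usage is shaped by GC3 composition alone: Nc = 2 + s + 29 / (s² + (1 − s)²), where s is the GC3 fraction. The sketch below assumes that this is the curve meant in the abstract; the function name `expected_enc` is hypothetical.

```python
def expected_enc(gc3s):
    """Expected effective number of codons for a GC3 fraction in [0, 1].

    Uses Wright's approximation: Nc = 2 + s + 29 / (s^2 + (1 - s)^2).
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# Tabulate the curve at a few GC3 fractions.
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"GC3s={s:.2f}  expected ENC={expected_enc(s):.2f}")
```

Genes whose observed ENC falls well below this curve are usually read as being under influences other than composition, such as selection, which is how points "below the standard curve" are typically interpreted.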
Affiliation(s)
- Mebeaselassie Andargie
- University of Goettingen, Molecular Phytopathology and Mycotoxin Research, Grisebachstrasse 6, 37077 Goettingen, Germany
| | - Zhu Congyi
- Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization (MOA), Guangdong Province Key Laboratory of Tropical and Subtropical Fruit Tree Research, Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Guangzhou, China
8
Chakraborty S, Basumatary P, Nath D, Paul S, Uddin A. Compositional features and pattern of codon usage for mitochondrial CO genes among reptiles. Mitochondrion 2021; 62:111-121. [PMID: 34793987 DOI: 10.1016/j.mito.2021.11.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2021] [Revised: 11/02/2021] [Accepted: 11/10/2021] [Indexed: 11/27/2022]
Abstract
The phenomenon of non-random occurrence of synonymous nucleotide triplets (codons) in the coding sequences of genes is the codon usage bias (CUB). In this study, we used a bioinformatic toolkit to analyze the compositional pattern and CUB of the mitogenes COI, COII and COIII across different orders of reptiles. Estimation of overall base composition in the protein-coding sequences of COI, COII and COIII genes of the reptilian orders revealed an uneven usage of nucleotides. The overall count of A nucleotide was found to be the highest while the overall count of G nucleotide was the lowest. The CO genes across the three reptilian orders were prominently AT biased. Comparison of the GC proportion at each codon position showed that the GC1 percentage ranked the highest in all three CO genes of the reptilian orders. SCUO values indicated weaker CUB, while considerable variation of SCUO values existed in the three CO genes across the studied reptiles. Relative synonymous codon usage (RSCU) values indicated that mostly the A-ending codons were preferred. Based on the neutrality plot, mutational responsive index and translational selection, we could conclude that natural selection was the major evolutionary force in COI, COII and COIII genes in the studied reptilian orders. However, correspondence analysis, parity plot and correlation studies indicated the existence of mutation pressure as well on the CO genes.
Affiliation(s)
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India.
| | | | - Durbba Nath
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
| | - Sunanda Paul
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
| | - Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India.
9
Analysis of Codon Usage Patterns in Giardia duodenalis Based on Transcriptome Data from GiardiaDB. Genes (Basel) 2021; 12:genes12081169. [PMID: 34440343 PMCID: PMC8393687 DOI: 10.3390/genes12081169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/29/2021] [Revised: 07/24/2021] [Accepted: 07/27/2021] [Indexed: 12/03/2022]
Abstract
Giardia duodenalis, a flagellated parasitic protozoan, is the most common cause of parasite-induced diarrheal diseases worldwide. Codon usage bias (CUB) is an important evolutionary character in most species. However, G. duodenalis CUB remains unclear. Thus, this study analyzes codon usage patterns to assess the restriction factors and obtain useful information on the shaping of G. duodenalis CUB. The neutrality analysis result indicates that G. duodenalis has a wide GC3 distribution, which significantly correlates with GC12. The ENC-plot result shows that most genes were close to the expected curve, with only a few points straying away from it. This indicates that mutational pressure and natural selection played an important role in the development of CUB. The Parity Rule 2 plot (PR2) result demonstrates that the usage of GC and AT was out of proportion. Interestingly, we identified 26 optimal codons in the G. duodenalis genome, ending with G or C. In addition, GC content, gene expression, and protein size also influence G. duodenalis CUB formation. This study systematically analyzes the G. duodenalis codon usage pattern and clarifies the mechanisms of G. duodenalis CUB. These results will be very useful for identifying new genes, for molecular genetic manipulation, and for studying G. duodenalis evolution.
10
Uddin A, Chakraborty S. Analysis of mitochondrial protein-coding genes of Antheraea assamensis: Muga silkworm of Assam. ARCHIVES OF INSECT BIOCHEMISTRY AND PHYSIOLOGY 2021; 106:e21750. [PMID: 33075174 DOI: 10.1002/arch.21750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2020] [Revised: 09/18/2020] [Accepted: 09/30/2020] [Indexed: 06/11/2023]
Abstract
To understand the synonymous codon usage pattern in the mitochondrial genome of Antheraea assamensis, we analyzed the 13 mitochondrial protein-coding genes of this species using a bioinformatic approach, as no such work had been reported yet. The nucleotide composition analysis suggested that the percentages of A, T, G, and C were 33.73, 46.39, 9.7 and 10.17, respectively, and that the overall GC content was 19.86%, that is, lower than 50%, so the genes were AT rich. The mean effective number of codons of mitochondrial protein-coding genes was 36.30, which indicated low codon usage bias (CUB). Relative synonymous codon usage analysis suggested overrepresented and underrepresented codons in each gene, and the pattern of codon usage was different among genes. Neutrality plot analysis revealed a narrow range of distribution for GC content at the third codon position and some points were diagonally distributed, suggesting that both mutation pressure and natural selection influenced the CUB.
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Assam, India
11
Barbhuiya RI, Uddin A, Chakraborty S. Codon usage pattern and its influencing factors for mitochondrial CO genes among different classes of Arthropoda. Mitochondrial DNA A DNA Mapp Seq Anal 2020; 31:313-326. [PMID: 32755341 DOI: 10.1080/24701394.2020.1800661] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Abstract
Analysis of codon usage bias (CUB) is important for understanding molecular biology, discovering new genes, designing transgenes and studying gene evolution. In this study, we analyzed compositional features and codon usage of MT-CO (COI, COII and COIII) genes among the classes of Arthropoda to explore the pattern of CUB, as no such research had been reported yet. Nucleotide composition analysis in CO genes suggested that the genes were AT-rich in all four classes of Arthropoda. CUB was low in all the classes of Arthropoda for MT-CO genes, as revealed by a high effective number of codons (ENC). We also found that the evolutionary forces, namely mutation pressure and natural selection, were the key influencing factors in CUB among MT-CO genes, as revealed by correlation analysis between overall nucleotide composition and nucleotide composition at the 3rd codon position. Correspondence analysis suggested that the pattern of CUB was different among the classes of Arthropoda. Further, the neutrality plot revealed that natural selection had a dominant role while mutation pressure had a minor role in structuring the pattern of codon usage in all the classes of Arthropoda across COI, COII and COIII genes.
Affiliation(s)
| | - Arif Uddin
- Department of Zoology, M. H. C. M. Science College, Hailakandi, India
12
The whale shark genome reveals how genomic and physiological properties scale with body size. Proc Natl Acad Sci U S A 2020; 117:20662-20671. [PMID: 32753383 DOI: 10.1073/pnas.1922576117] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 01/18/2023]
Abstract
The endangered whale shark (Rhincodon typus) is the largest fish on Earth and a long-lived member of the ancient Elasmobranchii clade. To characterize the relationship between genome features and biological traits, we sequenced and assembled the genome of the whale shark and compared its genomic and physiological features to those of 83 animals and yeast. We examined the scaling relationships between body size, temperature, metabolic rates, and genomic features and found both general correlations across the animal kingdom and features specific to the whale shark genome. Among animals, increased lifespan is positively correlated to body size and metabolic rate. Several genomic traits also significantly correlated with body size, including intron and gene length. Our large-scale comparative genomic analysis uncovered general features of metazoan genome architecture: Guanine and cytosine (GC) content and codon adaptation index are negatively correlated, and neural connectivity genes are longer than average genes in most genomes. Focusing on the whale shark genome, we identified multiple features that significantly correlate with lifespan. Among these were very long gene length, due to introns being highly enriched in repetitive elements such as CR1-like long interspersed nuclear elements, and considerably longer neural genes of several types, including connectivity, activity, and neurodegeneration genes. The whale shark genome also has the second slowest evolutionary rate observed in vertebrates to date. Our comparative genomics approach uncovered multiple genetic features associated with body size, metabolic rate, and lifespan and showed that the whale shark is a promising model for studies of neural architecture and lifespan.
13
Pal A, Saha BK, Saha J. Comparative in silico analysis of ftsZ gene from different bacteria reveals the preference for core set of codons in coding sequence structuring and secondary structural elements determination. PLoS One 2019; 14:e0219231. [PMID: 31841523 PMCID: PMC6913975 DOI: 10.1371/journal.pone.0219231] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/17/2019] [Accepted: 11/28/2019] [Indexed: 11/19/2022]
Abstract
The recent deluge of sequence information provides an excellent opportunity to compare organisms on a large genomic scale. In this study we have tried to decipher the variation in the gene organization and structuring of a vital bacterial gene called ftsZ, which codes for an integral component of bacterial cell division, the FtsZ protein. FtsZ is homologous to the tubulin protein and has been found to be ubiquitous in eubacteria. FtsZ is showing increasing promise as a target for antibacterial drug discovery. Our study of ftsZ protein from 143 different bacterial species spanning a wide range of morphological and physiological types demonstrates that the ftsZ gene of about ninety-three percent of the organisms shows a relatively biased codon usage profile and significant GC deviation from the genomic GC content. Comparative codon usage analysis of ftsZ and a core housekeeping gene, rpoB, demonstrated that the codon usage pattern of the ftsZ CDS is shaped by natural selection to a large extent and mimics that of a housekeeping gene. We have also detected a tendency among the different organisms to utilize a core set of codons in structuring the ftsZ coding sequence. We observed that the compositional frequency of the amino acid serine in the FtsZ protein appears to be an indicator of the bacterial lifestyle. Our meticulous analysis of the ftsZ gene linked with the corresponding FtsZ protein shows that there is a bias towards the use of specific synonymous codons, particularly in the helix and strand regions of the multi-domain FtsZ protein. Overall, our findings suggest that in an indispensable and vital protein such as FtsZ, there is an inherent tendency to maintain form for optimized performance in spite of the extrinsic variability in coding features.
Affiliation(s)
- Ayon Pal
- Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
- Barnan Kumar Saha
- Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
- Jayanti Saha
- Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
|
14
|
Barbhuiya RI, Uddin A, Chakraborty S. Compositional properties and codon usage pattern of mitochondrial ATP gene in different classes of Arthropoda. Genetica 2019; 147:231-248. [PMID: 31152294 DOI: 10.1007/s10709-019-00067-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/17/2018] [Accepted: 05/22/2019] [Indexed: 12/17/2022]
Abstract
Codon usage bias (CUB) is defined as the unequal usage of synonymous codons for an amino acid in a gene transcript. It is influenced by both mutation pressure and natural selection and is a species-specific property. In our current study, we used bioinformatic methods to investigate the coding sequences of the mitochondrial adenosine triphosphate gene (MT-ATP) in different classes of Arthropoda to determine the codon usage pattern of the gene, as no such work had been described earlier. The analysis of compositional properties suggested that the gene is AT rich. The effective number of codons revealed that the CUB of both the ATP6 and ATP8 genes was moderate. A heat map showed that the codons ending with AT were negatively associated with GC3 while the codons ending with GC were positively associated with GC3 in all the classes of Arthropoda. Correspondence analysis revealed that the pattern of codon usage of the ATP6 and ATP8 genes differed across classes. The neutrality plot suggested that the codon usage bias of these two genes in the phylum Arthropoda was influenced by both mutation pressure and natural selection.
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Science College, Algapur, Hailakandi, Assam, 788150, India
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar, Assam, 788011, India.
|
15
|
Bhattacharyya D, Uddin A, Das S, Chakraborty S. Mutation pressure and natural selection on codon usage in chloroplast genes of two species in Pisum L. (Fabaceae: Faboideae). Mitochondrial DNA A DNA Mapp Seq Anal 2019; 30:664-673. [DOI: 10.1080/24701394.2019.1616701] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/18/2023]
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, India
- Sudipa Das
- Department of Life Science and Bioinformatics, Assam University, Silchar, India
|
16
|
Biswas R, Panja AS, Bandopadhyay R. In Silico Analyses of Burial Codon Bias Among the Species of Dipterocarpaceae Through Molecular and Phylogenetic Data. Evol Bioinform Online 2019; 15:1176934319834888. [PMID: 31223230 PMCID: PMC6563522 DOI: 10.1177/1176934319834888] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/24/2019] [Accepted: 02/07/2019] [Indexed: 11/15/2022] Open
Abstract
Introduction: DNA barcode, a molecular marker, is used to distinguish among closely related species, and it can be applied across a broad range of taxa to understand ecology and evolution. MaturaseK gene (matK) and rubisco bisphosphate carboxylase/oxygenase form I gene (rbcL) of the chloroplast are highly conserved in plant systems and are used as core barcodes. The present work entails a comprehensive examination of threatened plant species based on the success of DNA barcode discrimination under selection pressure. Result: The family Dipterocarpaceae, comprising 15 genera, is under threat due to factors including deforestation, habitat alteration, poor seed, pollen dispersal, etc. Species of this family were grouped into 6 clusters for matK and 5 clusters and 2 sub-clusters for rbcL in the phylogenetic tree by using the neighbor-joining method. Cluster I to cluster VI of matK and cluster I to cluster V of rbcL genes were analyzed by various codon and substitution bias tools. Mutational pressure guided the codon bias, which was favored by the avoidance of higher GC content and a significant negative correlation between GC12 and GC3 (in sub-cluster I of cluster I [0.03 < P], cluster I [0.00001 < P], and cluster II [0.01 < P] of rbcL, and cluster IV [0.013 < P] of matK). After refining the results, it could be speculated that the lower null expectation values (R = 0.5 or <0.5) were less divergent from the evolutionary perspective. Apart from that, the higher null expectation values (R > 0.85) also showed the same result, which possibly could be due to the negative impact of a very high or very low transition rate relative to transversion. Conclusion: Through the analysis of inter-generic, inter/intra-specific variation and phylogenetic data, it was found that both selection and mutation played an important role in synonymous codon choice in these genes, but they acted inconsistently on both matK and rbcL. In vitro stable proteins of both matK and rbcL were selected through natural selection rather than mutational selection. The matK gene had higher individual discrimination and barcode success compared with rbcL. These discriminatory approaches may describe the problem related to the extinction of plant species. Hence, it becomes imperative to identify and detect threatened plant species in advance.
Affiliation(s)
- Raju Biswas
- UGC-Center of Advanced Study, Department of Botany, The University of Burdwan, Bardhaman, India
- Anindya Sundar Panja
- Department of Biotechnology, Oriental Institute of Science and Technology, Vidyasagar University, Midnapore, India
- Rajib Bandopadhyay
- UGC-Center of Advanced Study, Department of Botany, The University of Burdwan, Bardhaman, India
|
17
|
Uddin A, Paul N, Chakraborty S. The codon usage pattern of genes involved in ovarian cancer. Ann N Y Acad Sci 2019; 1440:67-78. [PMID: 30843242 DOI: 10.1111/nyas.14019] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/10/2018] [Revised: 01/04/2019] [Accepted: 01/14/2019] [Indexed: 12/20/2022]
Abstract
In this study, we analyzed the compositional dynamics and codon usage pattern of genes involved in ovarian cancer (OC) using a computational method. Mutations in specific genes are associated with OC, and some genes are risk factors for progression of OC, but no work has been reported yet on the codon usage pattern of genes involved in OC. Nucleotide composition analysis of OC-related genes suggested that the overall GC content was higher than AT content; that is, the genes were GC rich. The improved effective number of codons indicated that the overall extent of codon usage bias of genes involved in OC was low. The codons AGC, CTG, ATC, ACC, GTG, and GCC were overrepresented, while the codons TCG, TTA, CTA, CCG, CAA, CGT, ATA, ACG, GTA, GTT, GCG, and GGT were underrepresented in the genes. Correspondence analysis suggested that the codon usage pattern was different in different genes. A highly significant correlation was observed between GC12 and GC3 (r = 0.587, P < 0.01) of genes, suggesting that directional mutation affected the three codon positions. Our report on the codon usage pattern of genes involved in OC includes a new perspective for elucidating the mechanisms of biased usage of synonymous codons, as well as providing useful clues for molecular genetic engineering.
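The GC12 versus GC3 correlation reported in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `gc_by_position` and `pearson_r` are hypothetical, and the sketch assumes in-frame coding sequences with GC12 taken as the mean of the GC fractions at the first and second codon positions.

```python
from math import sqrt


def gc_by_position(cds):
    """Return (GC12, GC3) for one in-frame coding sequence.

    GC12 is the mean GC fraction of codon positions 1 and 2;
    GC3 is the GC fraction of codon position 3.
    Trailing bases that do not fill a codon are ignored.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")

    def gc_at(pos):
        return sum(codon[pos] in "GC" for codon in codons) / len(codons)

    return (gc_at(0) + gc_at(1)) / 2, gc_at(2)


def pearson_r(xs, ys):
    """Pearson correlation coefficient; fails if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy sequences for illustration only, not data from the study.
genes = ["GCAGCTGCA", "ATGAAATTT", "GGCGCCGGG"]
gc12, gc3 = zip(*(gc_by_position(g) for g in genes))
r = pearson_r(gc3, gc12)
```

In a neutrality plot, each gene contributes one (GC3, GC12) point, and the correlation or regression slope across genes is what gets interpreted.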
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Assam, India
- Nirmal Paul
- Department of Biotechnology, Assam University, Assam, India
|
18
|
Compositional dynamics and codon usage pattern of BRCA1 gene across nine mammalian species. Genomics 2019; 111:167-176. [DOI: 10.1016/j.ygeno.2018.01.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/01/2017] [Revised: 12/22/2017] [Accepted: 01/22/2018] [Indexed: 11/19/2022]
|
19
|
Yang D, Xu A, Shen P, Gao C, Zang J, Qiu C, Ouyang H, Jiang Y, He F. A two-level model for the role of complex and young genes in the formation of organism complexity and new insights into the relationship between evolution and development. EvoDevo 2018; 9:22. [PMID: 30455862 PMCID: PMC6231269 DOI: 10.1186/s13227-018-0111-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/21/2017] [Accepted: 10/25/2018] [Indexed: 11/14/2022] Open
Abstract
Background How genome complexity affects organismal phenotypic complexity is a fundamental question in evolutionary developmental biology. Previous studies proposed various contributing factors of genome complexity and tried to find the connection between genomic complexity and organism complexity. However, a general model to answer this question is lacking. Here, we introduce a ‘two-level’ model for the realization of genome complexity at phenotypic level. Results Five representative species across Protostomia and Deuterostomia were involved in this study. The intrinsic gene properties contributing to genome complexity were classified into two generalized groups: the complexity and age degree of both protein-coding and noncoding genes. We found that young genes tend to be simpler; however, the mid-age genes, rather than the oldest genes, show the highest proportion of high complexity. Complex genes tend to be utilized preferentially in each stage of embryonic development, with maximum representation during the late stage of organogenesis. This trend is mainly attributed to mid-age complex genes. In contrast, young genes tend to be expressed in specific spatiotemporal states. An obvious correlation between the time point of the change in over- and under-representation and the order of gene age was observed, which supports the funnel-like model of the conservation pattern of development. In addition, we found some probable causes for the seemingly contradictory ‘funnel-like’ or ‘hourglass’ model. Conclusions These results indicate that complex and young genes contribute to organismal complexity at two different levels: Complex genes contribute to the complexity of individual proteomes in certain states, whereas young genes contribute to the diversity of proteomes in different spatiotemporal states. This conclusion is valid across the five species investigated, indicating it is a conserved model across Protostomia and Deuterostomia. 
The results in this study also support the ‘funnel-like model’ from a new viewpoint and explain why there are different evo–devo relation models.
Affiliation(s)
- Dong Yang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
- Aishi Xu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
- Pan Shen
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
- Chao Gao
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
- Jiayin Zang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
- Chen Qiu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
- Hongsheng Ouyang
- Animal Sciences College of Jilin University, Changchun, 130062 The People's Republic of China
- Ying Jiang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
- Fuchu He
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206 The People's Republic of China
|
20
|
Uddin A, Mazumder TH, Chakraborty S. Understanding molecular biology of codon usage in mitochondrial complex IV genes of electron transport system: Relevance to mitochondrial diseases. J Cell Physiol 2018; 234:6397-6413. [DOI: 10.1002/jcp.27375] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/14/2018] [Accepted: 08/17/2018] [Indexed: 12/17/2022]
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Hailakandi, Assam, India
|
21
|
Barbhuiya PA, Uddin A, Chakraborty S. Compositional properties and codon usage of TP73 gene family. Gene 2018; 683:159-168. [PMID: 30316927 DOI: 10.1016/j.gene.2018.10.030] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/06/2018] [Revised: 10/03/2018] [Accepted: 10/11/2018] [Indexed: 12/19/2022]
Abstract
The TP73 gene is considered a member of the TP53 gene family and shows much homology to the p53 gene. The TP73 gene plays a pivotal role in cancer studies in addition to other biological functions. Codon usage bias (CUB) is the phenomenon of unequal usage of synonymous codons for an amino acid, wherein some codons are more frequently used than others, and it reveals the evolutionary relationship of a gene. Here, we report the pattern of codon usage in the TP73 gene using various bioinformatic tools, as no such work has been reported yet. Nucleotide composition analysis suggested that the mean content of nucleobase C was the highest, followed by G, and that the gene was GC rich. Correlation analysis between codon usage and GC3 suggested that most of the GC-ending codons showed positive correlation while most of the AT-ending codons showed negative correlation with GC3 in the coding sequences of TP73 gene variants in human. The CUB is moderate in the human TP73 gene, as evident from intrinsic codon deviation index (ICDI) analysis. Nature selected against two codons, namely ATA (isoleucine) and AGA (arginine), in the coding sequences of the TP73 gene during the course of evolution. A significant correlation (p < 0.05) was found between overall nucleotide composition and its composition at the 3rd codon position, indicating that both mutation pressure and natural selection might influence the CUB. The correlation analysis between ICDI and biochemical properties of protein suggested that variation of CUB was associated with degree of hydrophobicity and length of protein.
Affiliation(s)
- Parvin A Barbhuiya
- Departments of Biotechnology, Assam University, Silchar 788011, Assam, India
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India
- Supriyo Chakraborty
- Departments of Biotechnology, Assam University, Silchar 788011, Assam, India.
|
22
|
Chakraborty S, Uddin A, Mazumder TH, Choudhury MN, Malakar AK, Paul P, Halder B, Deka H, Mazumder GA, Barbhuiya RA, Barbhuiya MA, Devi WJ. Codon usage and expression level of human mitochondrial 13 protein coding genes across six continents. Mitochondrion 2018; 42:64-76. [DOI: 10.1016/j.mito.2017.11.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/14/2016] [Revised: 10/09/2017] [Accepted: 11/27/2017] [Indexed: 02/03/2023]
|
23
|
Moyers BA, Zhang J. Toward Reducing Phylostratigraphic Errors and Biases. Genome Biol Evol 2018; 10:2037-2048. [PMID: 30060201 PMCID: PMC6105108 DOI: 10.1093/gbe/evy161] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Accepted: 07/28/2018] [Indexed: 01/03/2023] Open
Abstract
Phylostratigraphy is a method for estimating gene age, usually applied to large numbers of genes in order to detect nonrandom age-distributions of gene properties that could shed light on mechanisms of gene origination and evolution. However, phylostratigraphy underestimates gene age with a nonnegligible probability. The underestimation is severer for genes with certain properties, creating spurious age distributions of these properties and those correlated with these properties. Here we explore three strategies to reduce phylostratigraphic error/bias. First, we test several alternative homology detection methods (PSIBLAST, HMMER, PHMMER, OMA, and GLAM2Scan) in phylostratigraphy, but fail to find any that noticeably outperforms the commonly used BLASTP. Second, using machine learning, we look for predictors of error-prone genes to exclude from phylostratigraphy, but cannot identify reliable predictors. Finally, we remove from phylostratigraphic analysis genes exhibiting errors in simulation, which by definition minimizes error/bias if the simulation is sufficiently realistic. Using this last approach, we show that some previously reported phylostratigraphic trends (e.g., younger proteins tend to evolve more rapidly and be shorter) disappear or even reverse, reconfirming the necessity of controlling phylostratigraphic error/bias. Taken together, our analyses demonstrate that phylostratigraphic errors/biases are refractory to several potential solutions but can be controlled at least partially by the exclusion of error-prone genes identified via realistic simulations. These results are expected to stimulate the judicious use of error-aware phylostratigraphy and reevaluation of previous phylostratigraphic findings.
Affiliation(s)
- Bryan A Moyers
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan
|
24
|
Ma XX, Ma P, Chang QY, Liu ZB, Zhang D, Zhou XK, Ma ZR, Cao X. Adaptation of Borrelia burgdorferi to its natural hosts by synonymous codon and amino acid usage. J Basic Microbiol 2018. [DOI: 10.1002/jobm.201700652] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Affiliation(s)
- Xiao-Xia Ma
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
- Peng Ma
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
- Qiu-Yan Chang
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
- Zhen-Bin Liu
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
- Derong Zhang
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
- Xiao-Kai Zhou
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
- Zhong-Ren Ma
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
- Xin Cao
- Engineering and Technology Research Center for Animal Cell, Gansu; College of Life Science and Engineering; Northwest Minzu University; Gansu P.R. China
|
25
|
Paul P, Malakar AK, Chakraborty S. Codon usage vis-a-vis start and stop codon context analysis of three dicot species. J Genet 2018; 97:97-107. [PMID: 29666329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
To understand the variation in genomic composition and its effect on codon usage, we performed a comparative analysis of codon usage and nucleotide usage in the genes of three dicots, Glycine max, Arabidopsis thaliana and Medicago truncatula. The dicot genes were found to be A/T rich and have predominantly A-ending and/or T-ending codons. GC3s directly mimic the usage pattern of global GC content. Relative synonymous codon usage analysis suggests that the high usage frequency of A/T over G/C mononucleotide containing codons in AT-rich dicot genome is due to compositional constraint as a factor of codon usage bias. Odds ratio analysis identified the dinucleotides TpG, TpC, GpA, CpA and CpT as over-represented, whereas CpG and TpA were under-represented. The results of the (NcExp-NcObs)/NcExp plot suggest that selection pressure other than mutation played a significant role in influencing the pattern of codon usage in these dicots. PR2 analysis revealed the significant role of selection pressure on codon usage. Analysis of variance on codon usage at start and stop sites showed variation in codon selection at these sites. This study provides evidence that the dicot genes were subjected to compositional selection pressure.
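Relative synonymous codon usage (RSCU), named in this abstract, is commonly computed as a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below is an illustration under stated assumptions rather than the authors' code: it assumes the standard nuclear genetic code and an in-frame coding sequence, and the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (an assumption; organellar genomes use other translation tables).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs.

    RSCU = observed count / (family total / family size), so a value
    above 1 means the codon is used more often than under equal usage.
    """
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TO_AA.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*":
            continue  # stop codons are excluded
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(codons)
        for codon in codons:
            values[codon] = counts[codon] / expected
    return values


# Toy sequence: leucine encoded three times by CTG and once by CTA.
leu = rscu("CTGCTGCTGCTA")
```

Single-codon families (methionine, tryptophan) always score 1.0 when present, which is why they are usually left out of RSCU comparisons.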
Affiliation(s)
- Prosenjit Paul
- Department of Biotechnology, Assam University, Silchar 788 011, India.
|
26
|
Paul P, Malakar AK, Chakraborty S. Codon usage vis-a-vis start and stop codon context analysis of three dicot species. J Genet 2018. [DOI: 10.1007/s12041-018-0892-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/23/2022]
|
27
|
Sharma AK, Ahmed N, O'Brien EP. Determinants of translation speed are randomly distributed across transcripts resulting in a universal scaling of protein synthesis times. Phys Rev E 2018; 97:022409. [PMID: 29548178 DOI: 10.1103/physreve.97.022409] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/06/2017] [Indexed: 06/08/2023]
Abstract
Ribosome profiling experiments have found greater than 100-fold variation in ribosome density along mRNA transcripts, indicating that individual codon elongation rates can vary to a similar degree. This wide range of elongation times, coupled with differences in codon usage between transcripts, suggests that the average codon translation-rate per gene can vary widely. Yet, ribosome run-off experiments have found that the average codon translation rate for different groups of transcripts in mouse stem cells is constant at 5.6 AA/s. How these seemingly contradictory results can be reconciled is the focus of this study. Here, we combine knowledge of the molecular factors shown to influence translation speed with genomic information from Escherichia coli, Saccharomyces cerevisiae and Homo sapiens to simulate the synthesis of cytosolic proteins in these organisms. The model recapitulates a near constant average translation rate, which we demonstrate arises because the molecular determinants of translation speed are distributed nearly randomly amongst most of the transcripts. Consequently, codon translation rates are also randomly distributed and fast-translating segments of a transcript are likely to be offset by equally probable slow-translating segments, resulting in similar average elongation rates for most transcripts. We also show that the codon usage bias does not significantly affect the near random distribution of codon translation rates because only about 10% of the total transcripts in an organism have high codon usage bias while the rest have little to no bias. Analysis of Ribo-Seq data and an in vivo fluorescent assay supports these conclusions.
Affiliation(s)
- Ajeet K Sharma
- Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Nabeel Ahmed
- Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Bioinformatics and Genomics Graduate Program, The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Edward P O'Brien
- Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Bioinformatics and Genomics Graduate Program, The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, USA
|
28
|
Cao X, Jiang H. An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta. BMC Genomics 2017; 18:796. [PMID: 29041902 PMCID: PMC5645894 DOI: 10.1186/s12864-017-4147-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 03/13/2017] [Accepted: 10/02/2017] [Indexed: 12/16/2022] Open
Abstract
Background Manduca sexta is a large lepidopteran insect widely used as a model to study biochemistry of insect physiological processes. As a part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data were used to study genes related to cuticle structure, chitin metabolism and immunity, a vast amount of the information has not yet been mined for understanding the basic molecular biology of this model insect. In fact, the basic features of these data, such as composition of the RNA-seq reads and lists of library-correlated genes, are unclear. From an extended view of all insects, clear-cut tempospatial expression data are rarely seen in the largest group of animals including Drosophila and mosquitoes, mainly due to their small sizes. Results We obtained the transcriptome data, analyzed the raw reads in relation to the assembled genome, and generated heatmaps for clustered genes. Library characteristics (tissues, stages), number of mapped bases, and sequencing methods affected the observed percentages of genome transcription. While up to 40% of the reads were not mapped to the genome in the initial Cufflinks gene modeling, we identified the causes for the mapping failure and reduced the number of non-mappable reads to <8%. Similarities between libraries, measured based on library-correlated genes, clearly identified differences among tissues or life stages. We calculated gene expression levels, analyzed the most abundantly expressed genes in the libraries. Furthermore, we analyzed tissue-specific gene expression and identified 18 groups of genes with distinct expression patterns. Conclusion We performed a thorough analysis of the 67 RNA-seq datasets to characterize new genomic features of M. sexta. Integrated knowledge of gene functions and expression features will facilitate future functional studies in this biochemical model insect. 
Affiliation(s)
- Xiaolong Cao
- Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, 74078, USA
- Department of Entomology and Plant Pathology, Oklahoma State University, Stillwater, OK, 74078, USA
- Haobo Jiang
- Department of Entomology and Plant Pathology, Oklahoma State University, Stillwater, OK, 74078, USA.
|
29
|
More evolution underground: Accelerated mitochondrial substitution rate in Australian burrowing freshwater crayfishes (Decapoda: Parastacidae). Mol Phylogenet Evol 2017; 118:88-98. [PMID: 28966124 DOI: 10.1016/j.ympev.2017.09.022] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/12/2017] [Revised: 08/18/2017] [Accepted: 09/26/2017] [Indexed: 12/11/2022]
Abstract
To further understand the evolutionary history and mitogenomic features of Australia's highly distinctive freshwater crayfish fauna, we utilized a recently described rapid mitogenome sequencing pipeline to generate 24 new crayfish mitogenomes including a diversity of burrowing crayfish species and the first for Astacopsis gouldi, the world's largest freshwater invertebrate. Whole mitogenome-based phylogeny estimates using both Bayesian and Maximum Likelihood methods substantially strengthen existing hypotheses for systematic relationships among Australian freshwater crayfish with evidence of pervasive diversifying selection and accelerated mitochondrial substitution rate among the members of the clade representing strongly burrowing crayfish that may reflect selection pressures for increased energy requirement for adaptation to terrestrial environment and a burrowing lifestyle. Further, gene rearrangements are prevalent in the burrowing crayfish mitogenomes involving both tRNA and protein coding genes. In addition, duplicated control regions were observed in two closely related Engaeus species, together with evidence for concerted evolution. This study significantly adds to the understanding of Australian freshwater crayfish evolutionary relationships and suggests a link between mitogenome evolution and adaptation to terrestrial environments and a burrowing lifestyle in freshwater crayfish.
30
Athey J, Alexaki A, Osipova E, Rostovtsev A, Santana-Quintero LV, Katneni U, Simonyan V, Kimchi-Sarfaty C. A new and updated resource for codon usage tables. BMC Bioinformatics 2017; 18:391. [PMID: 28865429 PMCID: PMC5581930 DOI: 10.1186/s12859-017-1793-7] [Citation(s) in RCA: 149] [Impact Index Per Article: 21.3] [Received: 06/19/2017] [Accepted: 08/15/2017] [Indexed: 01/24/2023] Open
Abstract
Background Due to the degeneracy of the genetic code, most amino acids can be encoded by multiple synonymous codons. Synonymous codons naturally occur with different frequencies in different organisms. The choice of codons may affect protein expression, structure, and function. Recombinant gene technologies commonly take advantage of the former effect by implementing a technique termed codon optimization, in which codons are replaced with synonymous ones in order to increase protein expression. This technique relies on the accurate knowledge of codon usage frequencies. Accurately quantifying codon usage bias for different organisms is useful not only for codon optimization, but also for evolutionary and translation studies: phylogenetic relations of organisms, and host-pathogen co-evolution relationships, may be explored through their codon usage similarities. Furthermore, codon usage has been shown to affect protein structure and function through interfering with translation kinetics, and cotranslational protein folding. Results Despite the obvious need for accurate codon usage tables, currently available resources are either limited in scope, encompassing only organisms from specific domains of life, or greatly outdated. Taking advantage of the exponential growth of GenBank and the creation of NCBI’s RefSeq database, we have developed a new database, the High-performance Integrated Virtual Environment-Codon Usage Tables (HIVE-CUTs), to present and analyse codon usage tables for every organism with publicly available sequencing data. Compared to existing databases, this new database is more comprehensive, addresses concerns that limited the accuracy of earlier databases, and provides several new functionalities, such as the ability to view and compare codon usage between individual organisms and across taxonomical clades, through graphical representation or through commonly used indices. 
In addition, it is being routinely updated to keep up with the continuous flow of new data in GenBank and RefSeq. Conclusion Given the impact of codon usage bias on recombinant gene technologies, this database will facilitate effective development and review of recombinant drug products and will be instrumental in a wide area of biological research. The database is available at hive.biochemistry.gwu.edu/review/codon. Electronic supplementary material The online version of this article (doi:10.1186/s12859-017-1793-7) contains supplementary material, which is available to authorized users.
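As a rough illustration of what a codon usage table records, the following Python sketch counts in-frame codons across a set of coding sequences and reports each codon's frequency per 1000 codons. It is not the HIVE-CUTs implementation; the function name `codon_usage_table` and the toy sequences are assumptions made for this example.

```python
from collections import Counter


def codon_usage_table(cds_list):
    """Return {codon: (count, frequency per 1000 codons)} for in-frame CDSs.

    Hypothetical helper for illustration only; real resources also handle
    genetic-code variants, annotation errors and partial sequences.
    """
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        # Walk the sequence codon by codon, ignoring a trailing partial codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set("ACGT"):  # skip codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {c: (n, 1000.0 * n / total) for c, n in sorted(counts.items())}


# Toy input (assumed): two short coding sequences.
table = codon_usage_table(["ATGGCTGCTTAA", "ATGGCCTAA"])
for codon, (count, per_thousand) in table.items():
    print(codon, count, round(per_thousand, 1))
```

Frequencies per 1000 codons make tables comparable across organisms with different amounts of sequence data, which is the kind of comparison the abstract describes.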
Affiliation(s)
- John Athey
- Division of Plasma Protein Therapeutics, Office of Tissue and Advanced Therapies, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA
- Aikaterini Alexaki
- Division of Plasma Protein Therapeutics, Office of Tissue and Advanced Therapies, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA
- Ekaterina Osipova
- High Performance Integrated Environment, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA
- Alexandre Rostovtsev
- High Performance Integrated Environment, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA
- Luis V Santana-Quintero
- High Performance Integrated Environment, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA
- Upendra Katneni
- Division of Plasma Protein Therapeutics, Office of Tissue and Advanced Therapies, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA
- Vahan Simonyan
- High Performance Integrated Environment, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA
- Chava Kimchi-Sarfaty
- Division of Plasma Protein Therapeutics, Office of Tissue and Advanced Therapies, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA.
31
Uddin A, Choudhury MN, Chakraborty S. Factors influencing codon usage of mitochondrial ND1 gene in pisces, aves and mammals. Mitochondrion 2017; 37:17-26. [PMID: 28668667 DOI: 10.1016/j.mito.2017.06.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/17/2016] [Revised: 05/19/2017] [Accepted: 06/26/2017] [Indexed: 01/05/2023]
Abstract
The animal mitochondrial genome harbours 13 protein-coding genes which regulate the process of respiration. The mitochondrial NADH dehydrogenase 1 (MT-ND1) gene, one of the 13 protein-coding genes, encodes the NADH dehydrogenase 1 enzyme of the respiratory chain. Analysis of codon usage bias (CUB) is important for a better understanding of molecular biology, new gene discovery, design of transgenes and gene evolution. The MT-ND1 gene seems to be a good candidate for analyzing codon usage pattern, since no work has yet been reported. Moreover, it is still not clear which factors significantly influence the codon usage pattern. In the present study, we compared the codon usage pattern, expression level and influencing factors for the MT-ND1 gene from 100 different species each of pisces, aves and mammals. Our results suggest that the gene is AT rich in pisces, aves and mammals and that most of the nucleotides differ significantly among them, as revealed by t-test. CUB was not remarkable, as reflected by the high value of the effective number of codons, and it also differs significantly among pisces, aves and mammals. Correlation and correspondence analysis suggested that CUB of the MT-ND1 gene is influenced by both natural selection and mutation pressure, and the neutrality plot further revealed that natural selection played a major role and mutation pressure a minor role in the codon usage pattern. Additionally, t-test analysis showed that the MT-ND1 gene has a wide, significant discrepancy in codon choices in pisces, aves and mammals. This study has contributed to our understanding of the mechanism of distribution of the codons and the factors that may influence the evolution of the MT-ND1 gene.
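The neutrality plot mentioned in this abstract regresses GC12 (mean GC fraction at first and second codon positions) on GC3 (GC fraction at third positions), one point per gene. A slope near 0 is usually read as selection dominating, and a slope near 1 as mutation pressure dominating. The sketch below shows one way to compute that slope; the function names and toy sequences are assumptions for illustration and are not taken from the paper.

```python
def gc_at_positions(cds):
    """Return (GC12, GC3) as fractions for one in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence shorter than one codon")
    gc = [0, 0, 0]  # G/C counts at codon positions 1, 2 and 3
    for i in range(usable):
        if seq[i] in "GC":
            gc[i % 3] += 1
    gc12 = (gc[0] + gc[1]) / (2 * n_codons)
    gc3 = gc[2] / n_codons
    return gc12, gc3


def neutrality_slope(cds_list):
    """Least-squares slope and intercept of GC12 (y) regressed on GC3 (x)."""
    points = [gc_at_positions(cds) for cds in cds_list]
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 does not vary; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy genes (assumed) whose (GC3, GC12) points are (0, 0), (1, 0.5), (0.5, 0.25).
slope, intercept = neutrality_slope(["ATAATA", "GTGGTG", "GTGATA"])
print(slope, intercept)
```

In practice the regression is run over many genes or species, and the significance of the GC12 and GC3 correlation is reported alongside the slope.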
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India.
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India.
32
Huang X, Xu J, Chen L, Wang Y, Gu X, Peng X, Yang G. Analysis of transcriptome data reveals multifactor constraint on codon usage in Taenia multiceps. BMC Genomics 2017; 18:308. [PMID: 28427327 PMCID: PMC5397707 DOI: 10.1186/s12864-017-3704-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 09/27/2016] [Accepted: 04/12/2017] [Indexed: 12/04/2022] Open
Abstract
Background Codon usage bias (CUB) is an important evolutionary feature in genomes that has been widely observed in many organisms. However, the synonymous codon usage pattern in the genome of T. multiceps remains to be clarified. In this study, we analyzed the codon usage of T. multiceps based on the transcriptome data to reveal the constraint factors and to gain an improved understanding of the mechanisms that shape synonymous CUB. Results Analysis of a total of 8,620 annotated mRNA sequences from T. multiceps indicated only a weak codon bias, with mean GC and GC3 content values of 49.29% and 51.43%, respectively. Our analysis indicated that nucleotide composition, mutational pressure, natural selection, gene expression level, amino acids with grand average of hydropathicity (GRAVY) and aromaticity (Aromo) and the effective selection of amino-acids all contributed to the codon usage in T. multiceps. Among these factors, natural selection was implicated as the major factor affecting the codon usage variation in T. multiceps. The codon usage of ribosome genes was affected mainly by mutations, while the essential genes were affected mainly by selection. In addition, 21 codons were identified as “optimal codons”. Overall, the optimal codons were GC-rich (GC:AU, 41:22), and ended with G or C (except CGU). Furthermore, different degrees of variation in codon usage were found between T. multiceps and Escherichia coli, yeast and Homo sapiens. However, little difference was found between T. multiceps and Taenia pisiformis. Conclusions In this study, the codon usage pattern of T. multiceps was analyzed systematically and factors affecting CUB were also identified. This is the first study of codon biology in T. multiceps. Understanding the codon usage pattern in T. multiceps can be helpful for the discovery of new genes, molecular genetic engineering and evolutionary studies. 
Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3704-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Xing Huang
- Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China; Chengdu Agricultural College, Chengdu, 611130, China
- Jing Xu
- Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
- Lin Chen
- Meat-processing Application Key Laboratory of Sichuan Province, College of Pharmacy and Biological Engineering, Chengdu University, Chengdu, 610106, China
- Yu Wang
- Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
- Xiaobin Gu
- Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
- Xuerong Peng
- College of Science, Sichuan Agricultural University, Ya'an, 625014, China
- Guangyou Yang
- Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China.
33
Tan MH, Gan HM, Lee YP, Poore GC, Austin CM. Digging deeper: new gene order rearrangements and distinct patterns of codons usage in mitochondrial genomes among shrimps from the Axiidea, Gebiidea and Caridea (Crustacea: Decapoda). PeerJ 2017; 5:e2982. [PMID: 28265498 PMCID: PMC5335691 DOI: 10.7717/peerj.2982] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 08/11/2016] [Accepted: 01/12/2017] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Whole mitochondrial DNA is being increasingly utilized for comparative genomic and phylogenetic studies at deep and shallow evolutionary levels for a range of taxonomic groups. Although mitogenome sequences are deposited at an increasing rate into public databases, their taxonomic representation is unequal across major taxonomic groups. In the case of decapod crustaceans, several infraorders, including Axiidea (ghost shrimps, sponge shrimps, and mud lobsters) and Caridea (true shrimps) are still under-represented, limiting comprehensive phylogenetic studies that utilize mitogenomic information. METHODS Sequence reads from partial genome scans were generated using the Illumina MiSeq platform and mitogenome sequences were assembled from these low coverage reads. In addition to examining phylogenetic relationships within the three infraorders, Axiidea, Gebiidea, and Caridea, we also investigated the diversity and frequency of codon usage bias and mitogenome gene order rearrangements. RESULTS We present new mitogenome sequences for five shrimp species from Australia that includes two ghost shrimps, Callianassa ceramica and Trypaea australiensis, along with three caridean shrimps, Macrobrachium bullatum, Alpheus lobidens, and Caridina cf. nilotica. Strong differences in codon usage were discovered among the three infraorders and significant gene order rearrangements were observed. While the gene order rearrangements are congruent with the inferred phylogenetic relationships and consistent with taxonomic classification, they are unevenly distributed within and among the three infraorders. DISCUSSION Our findings suggest potential for mitogenome rearrangements to be useful phylogenetic markers for decapod crustaceans and at the same time raise important questions concerning the drivers of mitogenome evolution in different decapod crustacean lineages.
Affiliation(s)
- Mun Hua Tan
- School of Science, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- Han Ming Gan
- School of Science, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- Yin Peng Lee
- School of Science, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- Christopher M. Austin
- School of Science, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- Genomics Facility, Tropical Medicine and Biology Platform, Monash University Malaysia, Bandar Sunway, Selangor, Malaysia
- School of Life and Environmental Sciences, Deakin University, Burwood, VIC, Australia
34
Ali SS, Shao J, Lary DJ, Kronmiller BA, Shen D, Strem MD, Amoako-Attah I, Akrofi AY, Begoude BD, ten Hoopen GM, Coulibaly K, Kebe BI, Melnick RL, Guiltinan MJ, Tyler BM, Meinhardt LW, Bailey BA. Phytophthora megakarya and P. palmivora, closely related causal agents of cacao black pod rot, underwent increases in genome sizes and gene numbers by different mechanisms. Genome Biol Evol 2017; 9:2982378. [PMID: 28186564 PMCID: PMC5381587 DOI: 10.1093/gbe/evx021] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 09/01/2016] [Revised: 12/21/2016] [Accepted: 02/04/2017] [Indexed: 12/13/2022] Open
Abstract
Phytophthora megakarya (Pmeg) and Phytophthora palmivora (Ppal) are closely related species causing cacao black pod rot. Although Ppal is a cosmopolitan pathogen, cacao is the only known host of economic importance for Pmeg. Pmeg is more virulent on cacao than Ppal. We sequenced and compared the Pmeg and Ppal genomes and identified virulence-related putative gene models (PGeneM) that may be responsible for their differences in host specificities and virulence. Pmeg and Ppal have estimated genome sizes of 126.88 and 151.23 Mb and PGeneM numbers of 42,036 and 44,327, respectively. The evolutionary histories of Pmeg and Ppal appear quite different. Postspeciation, Ppal underwent whole-genome duplication whereas Pmeg has undergone selective increases in PGeneM numbers, likely through accelerated transposable element-driven duplications. Many PGeneMs in both species failed to match transcripts and may represent pseudogenes or cryptic genetic reservoirs. Pmeg appears to have amplified specific gene families, some of which are virulence-related. Analysis of mycelium, zoospore, and in planta transcriptome expression profiles using neural network self-organizing map analysis generated 24 multivariate and nonlinear self-organizing map classes. Many members of the RxLR, necrosis-inducing phytophthora protein, and pectinase genes families were specifically induced in planta . Pmeg displays a diverse virulence-related gene complement similar in size to and potentially of greater diversity than Ppal but it remains likely that the specific functions of the genes determine each species’ unique characteristics as pathogens.
Affiliation(s)
- Shahin S. Ali
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, USDA/ARS, Beltsville Agricultural Research Center-West, Beltsville, Maryland
- Jonathan Shao
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, USDA/ARS, Beltsville Agricultural Research Center-West, Beltsville, Maryland
- Danyu Shen
- College of Plant Protection, Nanjing Agricultural University, China
- Mary D. Strem
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, USDA/ARS, Beltsville Agricultural Research Center-West, Beltsville, Maryland
- B.A. Didier Begoude
- Regional Laboratory for Biological and Applied Microbiology (IRAD), Yaoundé, Cameroon
- G. Martijn ten Hoopen
- Regional Laboratory for Biological and Applied Microbiology (IRAD), Yaoundé, Cameroon
- CIRAD, UPR 106 Bioagresseurs, Montpellier, France
- Rachel L. Melnick
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, USDA/ARS, Beltsville Agricultural Research Center-West, Beltsville, Maryland
- Brett M. Tyler
- Center for Genome Research and Biocomputing, Oregon State University
- Department of Botany and Plant Pathology, Oregon State University
- Lyndel W. Meinhardt
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, USDA/ARS, Beltsville Agricultural Research Center-West, Beltsville, Maryland
- Bryan A. Bailey
- Sustainable Perennial Crops Laboratory, Plant Sciences Institute, USDA/ARS, Beltsville Agricultural Research Center-West, Beltsville, Maryland
35
Badet T, Peyraud R, Mbengue M, Navaud O, Derbyshire M, Oliver RP, Barbacci A, Raffaele S. Codon optimization underpins generalist parasitism in fungi. eLife 2017; 6:e22472. [PMID: 28157073 PMCID: PMC5315462 DOI: 10.7554/elife.22472] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 10/18/2016] [Accepted: 01/28/2017] [Indexed: 01/04/2023] Open
Abstract
The range of hosts that parasites can infect is a key determinant of the emergence and spread of disease. Yet, the impact of host range variation on the evolution of parasite genomes remains unknown. Here, we show that codon optimization underlies genome adaptation in broad host range parasites. We found that the longer proteins encoded by broad host range fungi likely increase natural selection on codon optimization in these species. Accordingly, codon optimization correlates with host range across the fungal kingdom. At the species level, biased patterns of synonymous substitutions underpin increased codon optimization in a generalist but not a specialist fungal pathogen. Virulence genes were consistently enriched in highly codon-optimized genes of generalist but not specialist species. We conclude that codon optimization is related to the capacity of parasites to colonize multiple hosts. Our results link genome evolution and translational regulation to the long-term persistence of generalist parasitism.
Affiliation(s)
- Thomas Badet
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Remi Peyraud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Malick Mbengue
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Olivier Navaud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Mark Derbyshire
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
- Richard P Oliver
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
- Adelin Barbacci
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Sylvain Raffaele
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
36
Chakraborty S, Nag D, Mazumder TH, Uddin A. Codon usage pattern and prediction of gene expression level in Bungarus species. Gene 2016; 604:48-60. [PMID: 27845207 DOI: 10.1016/j.gene.2016.11.023] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/25/2016] [Revised: 10/18/2016] [Accepted: 11/10/2016] [Indexed: 10/20/2022]
Abstract
Codon bias study in an organism gains significance in understanding the molecular mechanism as well as the functional conservation of gene expression during the course of evolution. The prime focus in this study is to compare the codon usage patterns among the four species belonging to the genus Bungarus (B. multicinctus, B. fasciatus, B. candidus and B. flaviceps) using several codon bias parameters. Our results suggested that relatively low codon bias exists in the coding sequences of the selected species. The compositional constraints together with gene expression level might influence the patterns of codon usage among the genes of Bungarus species. Both natural selection and mutation pressure affect the codon usage pattern in Bungarus species as evident from correspondence analysis. Neutrality plot indicates that natural selection played a major role while mutation pressure played a minor role in codon usage pattern of the genes in Bungarus species.
Affiliation(s)
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar, Assam 788011, India.
- Debojyoti Nag
- Department of Biotechnology, Assam University, Silchar, Assam 788011, India
- Arif Uddin
- Department of Biotechnology, Assam University, Silchar, Assam 788011, India; Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi, Assam 788150, India
37
Lopes KDP, Campos-Laborie FJ, Vialle RA, Ortega JM, De Las Rivas J. Evolutionary hallmarks of the human proteome: chasing the age and coregulation of protein-coding genes. BMC Genomics 2016; 17:725. [PMID: 27801289 PMCID: PMC5088522 DOI: 10.1186/s12864-016-3062-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/19/2023] Open
Abstract
Background The development of large-scale technologies for quantitative transcriptomics has enabled comprehensive analysis of the gene expression profiles in complete genomes. RNA-Seq allows the measurement of gene expression levels in a manner far more precise and global than previous methods. Studies using this technology are altering our view about the extent and complexity of the eukaryotic transcriptomes. In this respect, multiple efforts have been done to determine and analyse the gene expression patterns of human cell types in different conditions, either in normal or pathological states. However, until recently, little has been reported about the evolutionary marks present in human protein-coding genes, particularly from the combined perspective of gene expression and protein evolution. Results We present a combined analysis of human protein-coding gene expression profiling and time-scale ancestry mapping, that places the genes in taxonomy clades and reveals eight evolutionary major steps (“hallmarks”), that include clusters of functionally coherent proteins. The human expressed genes are analysed using a RNA-Seq dataset of 116 samples from 32 tissues. The evolutionary analysis of the human proteins is performed combining the information from: (i) a database of orthologous proteins (OMA), (ii) the taxonomy mapping of genes to lineage clades (from NCBI Taxonomy) and (iii) the evolution time-scale mapping provided by TimeTree (Timescale of Life). The human protein-coding genes are also placed in a relational context based in the construction of a robust gene coexpression network, that reveals tighter links between age-related protein-coding genes and finds functionally coherent gene modules. Conclusions Understanding the relational landscape of the human protein-coding genes is essential for interpreting the functional elements and modules of our active genome. 
Moreover, decoding the evolutionary history of the human genes can provide very valuable information to reveal or uncover their origin and function. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-3062-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Katia de Paiva Lopes
- Bioinformatics and Functional Genomics Group, Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Cientificas (CSIC), Salamanca, Spain; Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brasil
- Francisco José Campos-Laborie
- Bioinformatics and Functional Genomics Group, Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Cientificas (CSIC), Salamanca, Spain
- Ricardo Assunção Vialle
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brasil
- José Miguel Ortega
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brasil
- Javier De Las Rivas
- Bioinformatics and Functional Genomics Group, Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Cientificas (CSIC), Salamanca, Spain.
38
Zhao Y, Zheng H, Xu A, Yan D, Jiang Z, Qi Q, Sun J. Analysis of codon usage bias of envelope glycoprotein genes in nuclear polyhedrosis virus (NPV) and its relation to evolution. BMC Genomics 2016; 17:677. [PMID: 27558469 PMCID: PMC4997668 DOI: 10.1186/s12864-016-3021-7] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 06/02/2016] [Accepted: 08/16/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Analysis of codon usage bias is an extremely versatile method used to further understanding of the genetic and evolutionary paths of species. Codon usage bias of envelope glycoprotein genes in nuclear polyhedrosis virus (NPV) has remained largely unexplored at present. Hence, the codon usage bias of NPV envelope glycoprotein was analyzed here to reveal the genetic and evolutionary relationships between different viral species in the baculovirus genus. RESULTS A total of 9236 codons from 18 different species of NPV of the baculovirus genera were used to perform this analysis. Glycoprotein of NPV exhibits weaker codon usage bias. Neutrality plot analysis and correlation analysis of effective number of codons (ENC) values indicate that natural selection is the main factor influencing codon usage bias, and that the impact of mutation pressure is relatively smaller. Cluster analysis further shows that the kinship or evolutionary relationships of these viral species can be divided into two broad categories, although all 18 species are from the same baculovirus genus. CONCLUSIONS There are many elements that can affect codon bias, such as the composition of amino acids, mutation pressure, natural selection, gene expression level, etc. In the meantime, cluster analysis also illustrates that codon usage bias of virus envelope glycoprotein can serve as an effective means of evolutionary classification in the baculovirus genus.
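ENC values are often interpreted against the value expected when codon usage is shaped by GC3s composition alone, using Wright's formula ENC = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the GC fraction at synonymous third positions. The abstract does not state that this curve was used, so the following is only a hedged sketch of that common comparison; the function names and the example values are assumptions.

```python
def expected_enc(gc3s):
    """Expected ENC under compositional constraint alone (Wright's formula).

    gc3s is the G+C fraction at synonymous third codon positions, in [0, 1].
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def enc_shortfall(observed_enc, gc3s):
    """Relative shortfall of an observed ENC below the expected value.

    Values well above 0 are commonly taken to suggest that factors other
    than base composition (for example selection) shape codon usage.
    """
    expected = expected_enc(gc3s)
    return (expected - observed_enc) / expected


# Hypothetical gene with an observed ENC of 48.4 at GC3s = 0.5.
print(expected_enc(0.5), enc_shortfall(48.4, 0.5))
```

ENC ranges from 20 (one codon used per amino acid) to 61 (all synonymous codons used equally), so genes lying far below the expected curve show stronger bias than composition alone would predict.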
Affiliation(s)
- Yongchao Zhao
- Subtropical Sericulture and Mulberry Resources Protection and Safety Engineering Research Center, Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, 510642, People's Republic of China
- Hao Zheng
- Subtropical Sericulture and Mulberry Resources Protection and Safety Engineering Research Center, Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, 510642, People's Republic of China
- Anying Xu
- Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang Jiangsu, 212018, People's Republic of China
- Donghua Yan
- Subtropical Sericulture and Mulberry Resources Protection and Safety Engineering Research Center, Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, 510642, People's Republic of China
- Zijian Jiang
- Subtropical Sericulture and Mulberry Resources Protection and Safety Engineering Research Center, Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, 510642, People's Republic of China
- Qi Qi
- Subtropical Sericulture and Mulberry Resources Protection and Safety Engineering Research Center, Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, 510642, People's Republic of China
- Jingchen Sun
- Subtropical Sericulture and Mulberry Resources Protection and Safety Engineering Research Center, Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, 510642, People's Republic of China.
39
Mazumder TH, Uddin A, Chakraborty S. Transcription factor gene GATA2: Association of leukemia and nonsynonymous to the synonymous substitution rate across five mammals. Genomics 2016; 107:155-61. [PMID: 26850985 DOI: 10.1016/j.ygeno.2016.02.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/07/2015] [Revised: 01/15/2016] [Accepted: 02/01/2016] [Indexed: 11/29/2022]
Abstract
GATA2 gene encodes a member of the GATA family of zinc-finger transcription factors that play a pivotal role during the transition of primitive blood forming cells into white blood cells. Mutation in GATA2 results in the loss of function or even gain of function, including abnormal proliferation of white blood cells that may predispose to acute myeloid leukemia. Our results showed that the codon usage in GATA2 has been influenced by GC mutation bias where nature has highly favored fourteen most over represented codons but disfavored the ATA codon across five mammals. Purifying natural selection has affected GATA2 gene in human and other mammals to maintain its protein function during the period of evolution. Our findings report an insight into the codon usage patterns in gaining the clues for codon optimization to alter the translational efficiency as well as for the functional conservation of gene expression and the significance of nucleotide composition in GATA2 gene within mammals.
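Over- and under-representation of codons, as in the fourteen favoured codons and the disfavoured ATA codon reported here, is commonly quantified with relative synonymous codon usage (RSCU): a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The abstract does not name the metric, so treating it as RSCU is an assumption. The sketch below uses the standard genetic code, and the function name and toy sequence are illustrative only.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} for amino acids encoded by two or more codons."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1
    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if len(codons) < 2 or total == 0:
            continue  # Met and Trp have no synonyms; unused families are skipped
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Toy sequence (assumed): four isoleucine codons, ATT twice, ATC and ATA once.
values = rscu(["ATTATTATCATA"])
print(values["ATT"], values["ATC"], values["ATA"])
```

An RSCU of 1 means no bias for that codon. One convention used in the codon-usage literature treats values above 1.6 as over-represented and values below 0.6 as under-represented, although thresholds vary between studies.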
Affiliation(s)
- Arif Uddin
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India.
|
40
|
Yang X, Luo X, Cai X. Analysis of codon usage pattern in Taenia saginata based on a transcriptome dataset. Parasit Vectors 2014; 7:527. [PMID: 25440955 PMCID: PMC4268816 DOI: 10.1186/s13071-014-0527-1] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 04/11/2014] [Accepted: 11/06/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Codon usage bias is an important evolutionary feature in a genome and has been widely documented in many genomes. Analysis of codon usage bias has significance for mRNA translation, design of transgenes, new gene discovery, and studies of molecular biology and evolution. However, the synonymous codon usage pattern of the T. saginata genome remains unclear. T. saginata is a food-borne zoonotic cestode which infects approximately 50 million humans worldwide, and causes significant health problems to the host and considerable socio-economic losses as a consequence. In this study, synonymous codon usage in T. saginata was examined. METHODS Total RNA was isolated from T. saginata cysticerci and 91,487 unigenes were generated using Illumina sequencing technology. After filtering, the final sequence collection containing 11,399 CDSs was used for our analysis. RESULTS Neutrality analysis showed that T. saginata had a wide GC3 distribution and a significant correlation was observed between GC12 and GC3. The NC-plot showed most genes on or close to the expected curve, with only a few low-ENC points below it, suggesting that mutational bias plays a major role in shaping codon usage. The Parity Rule 2 plot (PR2) analysis showed that GC and AT were not used proportionally. We also identified twenty-three optimal codons in the T. saginata genome, all of which ended with a G or C residue. These results suggest that mutational and selection forces are probably driving factors of codon usage bias in the T. saginata genome. Meanwhile, other factors such as protein length, gene expression, GC content of genes, and the hydropathicity of each protein also influence codon usage. CONCLUSIONS Here, we systematically analyzed the codon usage pattern and identified factors shaping codon usage bias in T. saginata. Currently, no complete nuclear genome is available for codon usage analysis at the genome level in T. saginata. This is the first report to investigate codon biology in T. saginata. Such information not only brings a new perspective for understanding the mechanisms of biased usage of synonymous codons but also provides useful clues for molecular genetic engineering and evolutionary studies.
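The neutrality analysis mentioned in this abstract compares GC content at the first two codon positions (GC12) with GC content at the third position (GC3) across coding sequences. The following is a minimal sketch of that calculation, not the authors' pipeline: the function names `gc_by_position` and `neutrality_slope` are hypothetical, GC3 is computed over all codons rather than synonymous positions only, and the input is assumed to be an in-frame CDS.

```python
from statistics import mean


def gc_by_position(cds: str) -> tuple[float, float]:
    """Return (GC12, GC3) as fractions for one in-frame coding sequence.

    Hypothetical helper: GC12 is the mean of the GC fractions at codon
    positions 1 and 2; GC3 is the GC fraction at codon position 3.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    gc1, gc2, gc3 = (count / n_codons for count in gc)
    return (gc1 + gc2) / 2, gc3


def neutrality_slope(points: list[tuple[float, float]]) -> float:
    """Least-squares slope of GC12 (y) regressed on GC3 (x).

    `points` is a list of (GC12, GC3) pairs, one per gene.
    """
    xs = [gc3 for _, gc3 in points]
    ys = [gc12 for gc12, _ in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx
```

In the usual reading of a neutrality plot, a slope close to 1 is taken to indicate that mutation pressure dominates, while a slope close to 0 points toward selection; the thresholds applied in any given study are a matter of interpretation.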
Affiliation(s)
- Xing Yang
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, 730046, PR China; College of Veterinary Medicine, Jilin University, Changchun, 130000, PR China.
- Xuenong Luo
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, 730046, PR China.
- Xuepeng Cai
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, 730046, PR China; College of Veterinary Medicine, Jilin University, Changchun, 130000, PR China.
|
41
|
Moyers BA, Zhang J. Phylostratigraphic bias creates spurious patterns of genome evolution. Mol Biol Evol 2014; 32:258-67. [PMID: 25312911 DOI: 10.1093/molbev/msu286] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Indexed: 12/21/2022] Open
Abstract
Phylostratigraphy is a method for dating the evolutionary emergence of a gene or gene family by identifying its homologs across the tree of life, typically by using BLAST searches. Applying this method to all genes in a species, or genomic phylostratigraphy, allows investigation of genome-wide patterns in new gene origination at different evolutionary times and thus has been extensively used. However, gene age estimation depends on the challenging task of detecting distant homologs via sequence similarity, which is expected to have differential accuracies for different genes. Here, we evaluate the accuracy of phylostratigraphy by realistic computer simulation with parameters estimated from genomic data, and investigate the impact of its error on findings of genome evolution. We show that 1) phylostratigraphy substantially underestimates gene age for a considerable fraction of genes, 2) the error is especially serious when the protein evolves rapidly, is short, and/or its most conserved block of sites is small, and 3) these errors create spurious nonuniform distributions of various gene properties among age groups, many of which cannot be predicted a priori. Given the high likelihood that conclusions about gene age are faulty, we advocate the use of realistic simulation to determine if observations from phylostratigraphy are explainable, at least qualitatively, by a null model of biased measurement, and in all cases, critical evaluation of results.
Affiliation(s)
- Bryan A Moyers
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
|
42
|
Ma L, Cui P, Zhu J, Zhang Z, Zhang Z. Translational selection in human: more pronounced in housekeeping genes. Biol Direct 2014; 9:17. [PMID: 25011537 PMCID: PMC4100034 DOI: 10.1186/1745-6150-9-17] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 03/21/2014] [Accepted: 07/02/2014] [Indexed: 02/17/2023] Open
Abstract
BACKGROUND Translational selection is a ubiquitous and significant mechanism to regulate protein expression in prokaryotes and unicellular eukaryotes. Recent evidence has shown that translational selection is weakly operative in highly expressed genes in human and other vertebrates. However, it remains unclear whether translational selection acts differentially on human genes depending on their expression patterns. RESULTS Here we report that human housekeeping (HK) genes, strictly defined as genes that are expressed ubiquitously and consistently in most or all tissues, are under stronger translational selection. CONCLUSIONS These observations clearly show that translational selection is also closely associated with expression pattern. Our results suggest that human HK genes are more efficiently and/or accurately translated into proteins, which will inevitably open up a new understanding of HK genes and the regulation of gene expression. REVIEWERS This article was reviewed by Yuan Yuan, Baylor College of Medicine; Han Liang, University of Texas MD Anderson Cancer Center (nominated by Dr Laura Landweber); Eugene Koonin, NCBI, NLM, NIH, United States of America; and Sandor Pongor, International Centre for Genetic Engineering and Biotechnology (ICGEB), Italy.
Affiliation(s)
- Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No. 1 Beichen West Road, Chaoyang District, Beijing 100101, China.
|
43
|
Mehta SL, Dharap A, Vemuganti R. Expression of transcribed ultraconserved regions of genome in rat cerebral cortex. Neurochem Int 2014; 77:86-93. [PMID: 24953281 DOI: 10.1016/j.neuint.2014.06.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/11/2014] [Revised: 06/09/2014] [Accepted: 06/10/2014] [Indexed: 11/29/2022]
Abstract
Emerging evidence indicates that 481 regions of the genome (>200 bp) that actively transcribe noncoding RNAs show 100% homology between humans, rats and mice. These transcribed ultraconserved regions (T-UCRs) are thought to control the essential regulatory functions basic for life in rodents and mammals. Using microarray analysis, we presently show that 107 T-UCRs are actively expressed in adult rat cerebral cortex. They are grouped into intragenic (61) and intergenic (46) based on their genic location. Interestingly, 10 T-UCRs are expressed at unusually high levels in cerebral cortex. Additionally, many T-UCRs also showed cogenic expression. We further analyzed the correlation of intragenic T-UCRs with their host protein-coding genes. Surprisingly, most of the expressed intragenic T-UCRs (54 out of 61) displayed a negative correlation with their host gene expression. T-UCRs are thought to control the splicing and transcription of the protein-coding genes that host them and flank them. Bioinformatics analysis indicated that the protein products of the majority of these genes are nuclear in localization, share protein domains and are involved in the regulation of diverse biological and molecular functions including metabolism, development, cell cycle, binding and transcription factor regulation. In conclusion, this is the first study to show that many T-UCRs are expressed in rodent brain and that they might play a role in physiological brain functions.
Affiliation(s)
- Suresh L Mehta
- Department of Neurological Surgery, University of Wisconsin, Madison, WI, USA
- Ashutosh Dharap
- Department of Neurological Surgery, University of Wisconsin, Madison, WI, USA; Theoretical Biology and Biophysics (T-6), Los Alamos National Laboratory, Los Alamos, NM, USA
- Raghu Vemuganti
- Department of Neurological Surgery, University of Wisconsin, Madison, WI, USA.
|
44
|
Speed controls in translating secretory proteins in eukaryotes--an evolutionary perspective. PLoS Comput Biol 2014; 10:e1003294. [PMID: 24391480 PMCID: PMC3879104 DOI: 10.1371/journal.pcbi.1003294] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 04/05/2013] [Accepted: 09/04/2013] [Indexed: 11/19/2022] Open
Abstract
Protein translation is the most expensive operation in dividing cells from bacteria to humans. Therefore, managing the speed and allocation of resources is subject to tight control. From bacteria to humans, clusters of relatively rare tRNA codons at the N'-terminal of mRNAs have been implicated in attenuating the process of ribosome allocation, and consequently the translation rate in a broad range of organisms. The current interpretation of "slow" tRNA codons does not distinguish between protein translations mediated by free- or endoplasmic reticulum (ER)-bound ribosomes. We demonstrate that proteins translated by free- or ER-bound ribosomes exhibit different overall properties in terms of their translation efficiency and speed in yeast, fly, plant, worm, bovine and human. We note that only secreted or membranous proteins with a signal peptide (SP) are specified by segments of "slow" tRNA at the N'-terminal, followed by abundant codons that are considered "fast." Such profiles apply to 3100 proteins of the human proteome that are composed of secreted and SP-assisted membranous proteins. Remarkably, the bulk of the proteins (12,000), or membranous proteins lacking SP (3400), do not have such a pattern. Alternation of "fast" and "slow" codons was also found in proteins that translocate to mitochondria through transit peptides (TP). The differential clustering of tRNA-adapted codons is not restricted to the N'-terminal of transcripts. Specifically, glycosylphosphatidylinositol (GPI)-anchored proteins are unified by clusters of low-adapted tRNA codons at the C'-termini. Furthermore, selection of amino acid types and specific codons was shown to be the driving force that establishes the translation demands for the secretory proteome. We postulate that "hard-coded" signals within the secretory proteome assist the steps of protein maturation and folding. Specifically, "speed control" signals for delaying the translation of a nascent protein fulfill the co- and post-translational stages such as membrane translocation, protein processing and folding.
|
45
|
Abstract
Novel protein-coding genes can arise either through re-organization of pre-existing genes or de novo [1,2]. Processes involving re-organization of pre-existing genes, notably following gene duplication, have been extensively described [1,2]. In contrast, de novo gene birth remains poorly understood, mainly because translation of sequences devoid of genes, or "non-genic" sequences, is expected to produce insignificant polypeptides rather than proteins with specific biological functions [1,3-6]. Here, we formalize an evolutionary model according to which functional genes evolve de novo through transitory proto-genes [4] generated by widespread translational activity in non-genic sequences. Testing this model at genome-scale in Saccharomyces cerevisiae, we detect translation of hundreds of short species-specific open reading frames (ORFs) located in non-genic sequences. These translation events appear to provide adaptive potential [7], as suggested by their differential regulation upon stress and by signatures of retention by natural selection. In line with our model, we establish that S. cerevisiae ORFs can be placed within an evolutionary continuum ranging from non-genic sequences to genes. We identify ~1,900 candidate proto-genes among S. cerevisiae ORFs and find that de novo gene birth from such a reservoir may be more prevalent than sporadic gene duplication. Our work illustrates that evolution exploits seemingly dispensable sequences to generate adaptive functional innovation.
|
46
|
Behura SK, Severson DW. Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes. Biol Rev Camb Philos Soc 2012; 88:49-61. [PMID: 22889422 DOI: 10.1111/j.1469-185x.2012.00242.x] [Citation(s) in RCA: 126] [Impact Index Per Article: 10.5] [Indexed: 01/27/2023]
Abstract
Codon usage bias refers to the phenomenon where specific codons are used more often than other synonymous codons during translation of genes, the extent of which varies within and among species. Molecular evolutionary investigations suggest that codon bias is manifested as a result of balance between mutational and translational selection of such genes and that this phenomenon is widespread across species and may contribute to genome evolution in a significant manner. With the advent of whole-genome sequencing of numerous species, both prokaryotes and eukaryotes, genome-wide patterns of codon bias are emerging in different organisms. Various factors such as expression level, GC content, recombination rates, RNA stability, codon position, gene length and others (including environmental stress and population size) can influence codon usage bias within and among species. Moreover, there has been a continuous quest towards developing new concepts and tools to measure the extent of codon usage bias of genes. In this review, we outline the fundamental concepts of evolution of the genetic code, discuss various factors that may influence biased usage of synonymous codons and then outline different principles and methods of measurement of codon usage bias. Finally, we discuss selected studies performed using whole-genome sequences of different insect species to show how codon bias patterns vary within and among genomes. We conclude with generalized remarks on specific emerging aspects of codon bias studies and highlight the recent explosion of genome-sequencing efforts on arthropods (such as twelve Drosophila species, species of ants, honeybee, Nasonia and Anopheles mosquitoes as well as the recent launch of a genome-sequencing project involving 5000 insects and other arthropods) that may help us to understand better the evolution of codon bias and its biological significance.
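The definition of codon usage bias given in this abstract, that some codons are used more often than their synonymous alternatives, is commonly quantified with relative synonymous codon usage (RSCU). The abstract itself does not name a specific measure, so the following is only an illustrative sketch: the function name `rscu` is hypothetical, and the standard nuclear genetic code is assumed (mitochondrial or other alternative codes would need a different table).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def rscu(cds: str) -> dict[str, float]:
    """Relative synonymous codon usage for one in-frame coding sequence.

    RSCU = observed count of a codon divided by the count expected if all
    synonymous codons for that amino acid were used equally. Stop codons,
    single-codon amino acids (Met, Trp) and absent amino acids are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values
```

An RSCU of 1 means a codon is used exactly as often as expected under equal usage; values above 1 indicate over-representation and values below 1 under-representation.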
Affiliation(s)
- Susanta K Behura
- Department of Biological Sciences, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA.
|
47
|
Tirosh Y, Morpurgo N, Cohen M, Linial M, Bloch G. Raalin, a transcript enriched in the honey bee brain, is a remnant of genomic rearrangement in Hymenoptera. Insect Molecular Biology 2012; 21:305-318. [PMID: 22404450 DOI: 10.1111/j.1365-2583.2012.01138.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract
We identified a predicted compact cysteine-rich sequence in the honey bee genome that we called 'Raalin'. Raalin transcripts are enriched in the brain of adult honey bee workers and drones, with only minimum expression in other tissues or in pre-adult stages. Open-reading frame (ORF) homologues of Raalin were identified in the transcriptomes of fruit flies, mosquitoes and moths. The Raalin-like gene from Drosophila melanogaster encodes for a short secreted protein that is maximally expressed in the adult brain with negligible expression in other tissues or pre-imaginal stages. Raalin-like sequences have also been found in the recently sequenced genomes of six ant species, but not in the jewel wasp Nasonia vitripennis. As in the honey bee, the Raalin-like sequences of ants do not have an ORF. A comparison of the genome region containing Raalin in the genomes of bees, ants and the wasp provides evolutionary support for an extensive genome rearrangement in this sequence. Our analyses identify a new family of ancient cysteine-rich short sequences in insects in which insertions and genome rearrangements may have disrupted this locus in the branch leading to the Hymenoptera. The regulated expression of this transcript suggests that it has a brain-specific function.
Affiliation(s)
- Y Tirosh
- Department of Biological Chemistry, The Hebrew University of Jerusalem, Jerusalem, Israel
|
48
|
Mahlab S, Tuller T, Linial M. Conservation of the relative tRNA composition in healthy and cancerous tissues. RNA (New York, N.Y.) 2012; 18:640-52. [PMID: 22357911 PMCID: PMC3312552 DOI: 10.1261/rna.030775.111] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 05/03/2023]
Abstract
Elongation in protein translation is strongly dependent on the availability of mature transfer RNAs (tRNAs). The relative concentrations of the tRNA isoacceptors determine the translation efficiency in unicellular organisms. However, the degree of correspondence of codons and the relevant tRNA isoacceptors serves as an estimator for translation efficiency in all organisms. In this study, we focus on the translational capacity of the human proteome. We show that the correspondence between the codon usage and tRNAs can be improved by combining experimental measurements with the genomic copy number of isoacceptor groups. We show that there are technologies of tRNA measurements that are useful for our analysis. However, fragments of tRNAs do not agree with translational capacity. It was shown that there is a significant increase in the absolute levels of tRNA genes in cancerous cells in comparison to healthy cells. However, we find that the relative composition of tRNA isoacceptors in healthy, cancerous, or transformed cells remains almost identical. This result may indicate that maintaining the relative tRNA composition in cancerous cells is advantageous via its stabilizing of the effectiveness of translation.
Affiliation(s)
- Shelly Mahlab
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Tamir Tuller
- Iby and Aladar Fleischman Faculty of Engineering, Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 69978, Israel
- Michal Linial
- Department of Biological Chemistry, Institute of Life Sciences, Sudarsky Center for Computational Biology, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
|
49
|
Zhu E, Sambath S. Characterization of Synonymous Codon Usage in the Newly Identified Duck Plague Virus UL16 Gene. Advances in Intelligent and Soft Computing 2012. [PMCID: PMC7122970 DOI: 10.1007/978-3-642-27537-1_89] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/03/2022]
Abstract
A comparative analysis of codon usage bias in the newly identified UL16 gene (GenBank accession no. EU195095) of DPV and the UL16 genes of 22 reference herpesviruses was performed. In this study, the synonymous codon usage bias of the UL16 gene in the 23 herpesviruses was analyzed, and the results showed obvious differences in CAI, RSCU, ENC and GC3s. The results revealed that synonymous codons with A and T at the third codon position are widely used in the UL16 gene of DPV. The ENC-GC3s plot revealed that the genetic heterogeneity in the UL16 gene of herpesviruses was constrained by G+C content at the third codon position. The phylogenetic analysis suggested that DPV was evolutionarily closer to herpesviruses that further clustered into Alphaherpesvirinae. Furthermore, the ORF of the DPV UL16 gene contains consecutive rare codons. There were 21 codons showing distinct usage differences between DPV and Escherichia coli, 19 between DPV and yeast, and 20 between DPV and human. Therefore, the Escherichia coli, yeast and human expression systems would be suitable for the expression of the DPV UL16 gene if some codons were optimized.
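An ENC-GC3s plot of the kind mentioned in this abstract is usually read against a reference curve giving the effective number of codons expected when codon choice is determined by GC3s composition alone. A minimal sketch of that reference curve follows, using Wright's (1990) approximation; the function name `expected_enc` is hypothetical, and this is not taken from the paper's own methods.

```python
def expected_enc(gc3s: float) -> float:
    """Expected effective number of codons for a given GC3s fraction.

    Wright's approximation: ENC = 2 + s + 29 / (s^2 + (1 - s)^2),
    where s is the G+C fraction at synonymous third codon positions.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# Points for drawing the reference curve from GC3s = 0.00 to 1.00.
curve = [(s / 100, expected_enc(s / 100)) for s in range(101)]
```

Genes whose observed ENC lies on or near this curve are commonly interpreted as shaped mainly by compositional constraint, while genes lying well below it are taken to be subject to additional influences such as selection.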
Affiliation(s)
- Egui Zhu
- South China Normal University, Guangzhou, 510631, People's Republic of China
- Sabo Sambath
- South China Normal University, Guangzhou, 510631, People's Republic of China
|