1
Wang N, Zheng X, Leptihn S, Li Y, Cai H, Zhang P, Wu W, Yu Y, Hua X. Characteristics and phylogenetic distribution of megaplasmids and prediction of a putative chromid in Pseudomonas aeruginosa. Comput Struct Biotechnol J 2024; 23:1418-1428. [PMID: 38616963 PMCID: PMC11015739 DOI: 10.1016/j.csbj.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Revised: 04/01/2024] [Accepted: 04/01/2024] [Indexed: 04/16/2024] [Open Access]
Abstract
Research on megaplasmids that contribute to the spread of antimicrobial resistance (AMR) in Pseudomonas aeruginosa strains has grown in recent years due to the now widely used technologies allowing long-read sequencing. Here, we systematically analyzed distinct and consistent genetic characteristics of megaplasmids found in P. aeruginosa. Our data provide information on their phylogenetic distribution and hypotheses tracing the potential evolutionary paths of megaplasmids. Most of the megaplasmids we found belong to the IncP-2-type, with conserved and syntenic genetic backbones carrying modules of genes associated with chemotaxis apparatus, tellurite resistance and plasmid replication, segregation, and transmission. Extensively variable regions harbor abundant AMR genes, especially those encoding β-lactamases such as VIM-2, IMP-45, and KPC variants, which are high-risk elements in nosocomial infection. IncP-2 megaplasmids act as effective vehicles transmitting AMR genes to diverse regions. One evolutionary model of the origin of megaplasmids claims that chromids can develop from megaplasmids. These chromids have been characterized as an intermediate between a megaplasmid and a chromosome, also containing core genes that can be found on the chromosome but not on the megaplasmid. Using in silico prediction, we identified the "PABCH45 unnamed replicon" as a putative chromid in P. aeruginosa, which shows a much higher similarity and closer phylogenetic relationship to chromosomes than to megaplasmids while also encoding plasmid-like partition genes. We propose that such a chromid could facilitate genome expansion, allowing for more rapid adaptations to novel ecological niches or selective conditions, in comparison to megaplasmids.
Affiliation(s)
- Nanfei Wang
  - Department of Infectious Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, China
  - Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Xuan Zheng
  - Department of Nephrology, Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Sebastian Leptihn
  - HMU Health and Medical University, Am Anger 64/73 – 99084, Erfurt, Germany
  - Deutsches Zentrum für Infektionsforschung (DZIF) Translational Phage-Network, Inhoffenstraße 7 – 38124, Braunschweig, Germany
  - University of Southern Denmark, Department of Biochemistry and Molecular Biology, Campusvej 55 – 5230, Odense, Denmark
- Yue Li
  - Department of Infectious Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, China
  - Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Heng Cai
  - Department of Infectious Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, China
  - Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Piaopiao Zhang
  - Department of Infectious Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, China
  - Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Wenhao Wu
  - Department of Infectious Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, China
  - Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Yunsong Yu
  - Department of Infectious Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, China
  - Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Xiaoting Hua
  - Department of Infectious Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, China
  - Regional Medical Center for National Institute of Respiratory Diseases, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
2
Qin L, Ding S, He Z. Compositional biases and evolution of the largest plant RNA virus order Patatavirales. Int J Biol Macromol 2023; 240:124403. [PMID: 37076075 DOI: 10.1016/j.ijbiomac.2023.124403] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/24/2022] [Revised: 03/13/2023] [Accepted: 03/25/2023] [Indexed: 04/21/2023]
Abstract
Patatavirales is the largest order of plant RNA viruses and exclusively contains the family Potyviridae, accounting for 30 % of all known plant viruses. The composition bias of animal RNA viruses and several plant RNA viruses has been determined. However, the comprehensive nucleic acid composition, codon pair usage patterns, dinucleotide preference and codon pair preference of plant RNA viruses have not been investigated to date. In this study, integrated analysis and discussion of the nucleic acid composition, codon usage patterns, dinucleotide composition and codon pair bias of potyvirids were performed using 3732 complete genome coding sequences. The nucleic acid composition of potyvirids was significantly enriched in A/U. Interestingly, the A/U-rich nucleotide composition of Patatavirales is essential for determining the preferred A-ended and U-ended codons and the overexpression of UpG and CpA dinucleotides. The codon usage patterns and codon pair bias of potyvirids were significantly correlated with their nucleic acid composition. Additionally, the codon usage pattern, dinucleotide composition and codon-pair bias of potyvirids are more dependent on the classification of the virus compared with their hosts. Our analysis provides a better understanding of future research on the origin and evolution patterns of the order Patatavirales.
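As a rough illustration of the dinucleotide analysis this abstract refers to, the sketch below computes the observed/expected odds ratio ρ_xy = f_xy / (f_x · f_y) for each of the 16 dinucleotides in a single sequence. It is a minimal sketch and not the authors' pipeline: the function name `dinucleotide_odds_ratios` and the decision to skip pairs containing ambiguous bases are assumptions made for this example.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for an RNA or DNA sequence.

    T is converted to U so DNA-form coding sequences can be passed directly.
    Pairs that contain a non-ACGU symbol are skipped (an assumption of this sketch).
    """
    seq = seq.upper().replace("T", "U")
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_di == 0:
        raise ValueError("sequence has no usable dinucleotides")

    ratios = {}
    for x, y in product(BASES, repeat=2):
        f_x = mono[x] / n_mono
        f_y = mono[y] / n_mono
        f_xy = di[x + y] / n_di
        # The ratio is undefined when one of the bases never occurs.
        ratios[x + y] = f_xy / (f_x * f_y) if f_x and f_y else float("nan")
    return ratios
```

A ratio above 1 means the dinucleotide occurs more often than base composition alone would predict, and a ratio below 1 means it is depleted.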
Affiliation(s)
- Lang Qin
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, Jiangsu Province, PR China
- Shiwen Ding
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, Jiangsu Province, PR China
- Zhen He
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, Jiangsu Province, PR China
3
Benisty H, Hernandez-Alias X, Weber M, Anglada-Girotto M, Mantica F, Radusky L, Senger G, Calvet F, Weghorn D, Irimia M, Schaefer MH, Serrano L. Genes enriched in A/T-ending codons are co-regulated and conserved across mammals. Cell Syst 2023; 14:312-323.e3. [PMID: 36889307 DOI: 10.1016/j.cels.2023.02.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/25/2022] [Revised: 07/11/2022] [Accepted: 02/09/2023] [Indexed: 03/09/2023]
Abstract
Codon usage influences gene expression distinctly depending on the cell context. Yet, the importance of codon bias in the simultaneous turnover of specific groups of protein-coding genes remains to be investigated. Here, we find that genes enriched in A/T-ending codons are expressed more coordinately in general and across tissues and development than those enriched in G/C-ending codons. tRNA abundance measurements indicate that this coordination is linked to the expression changes of tRNA isoacceptors reading A/T-ending codons. Genes with similar codon composition are more likely to be part of the same protein complex, especially for genes with A/T-ending codons. The codon preferences of genes with A/T-ending codons are conserved among mammals and other vertebrates. We suggest that this orchestration contributes to tissue-specific and ontogenetic-specific expression, which can facilitate, for instance, timely protein complex formation.
Affiliation(s)
- Hannah Benisty
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Xavier Hernandez-Alias
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Marc Weber
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Miquel Anglada-Girotto
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Federica Mantica
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Leandro Radusky
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Gökçe Senger
  - Department of Experimental Oncology, European Institute of Oncology (IEO) IRCCS, Via Adamello 16, Milan 20139, Italy
- Ferriol Calvet
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Donate Weghorn
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Manuel Irimia
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
  - ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
- Martin H Schaefer
  - Department of Experimental Oncology, European Institute of Oncology (IEO) IRCCS, Via Adamello 16, Milan 20139, Italy
- Luis Serrano
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
  - ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
4
He Z, Ding S, Guo J, Qin L, Xu X. Synonymous Codon Usage Analysis of Three Narcissus Potyviruses. Viruses 2022; 14:v14050846. [PMID: 35632588 PMCID: PMC9143068 DOI: 10.3390/v14050846] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/18/2022] [Revised: 04/13/2022] [Accepted: 04/18/2022] [Indexed: 02/04/2023] [Open Access]
Abstract
Narcissus degeneration virus (NDV), narcissus late season yellows virus (NLSYV) and narcissus yellow stripe virus (NYSV), which belong to the genus Potyvirus of the family Potyviridae, cause significant losses in the ornamental value and quality of narcissus. Several previous studies have explored the genetic diversity and evolution rate of narcissus viruses, but the analysis of the synonymous codons of the narcissus viruses is still unclear. Herein, the coat protein (CP) of three viruses is used to analyze the viruses’ phylogeny and codon usage pattern. Phylogenetic analysis showed that NYSV, NDV and NLSYV isolates were divided into five, three and five clusters, respectively, and these clusters seemed to reflect the geographic distribution. The effective number of codon (ENC) values indicated a weak codon usage bias in the CP coding region of the three narcissus viruses. ENC-plot and neutrality analysis showed that the codon usage bias of the three narcissus viruses is all mainly influenced by natural selection compared with the mutation pressure. The three narcissus viruses shared the same best optimal codon (CCA) and the synonymous codon prefers to use codons ending with A/U, compared to C/G. Our study shows the codon analysis of different viruses on the same host for the first time, which indicates the importance of the evolutionary-based design to control these viruses.
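The ENC-plot mentioned in this abstract compares each gene's observed ENC with the value expected if codon usage were shaped by base composition alone. The sketch below evaluates the standard expectation curve from Wright (1990), ENC_exp = 2 + s + 29 / (s² + (1 − s)²), where s is the GC content at synonymous third positions (GC3s). The function names and the relative-deviation helper are illustrative assumptions and are not taken from the paper.

```python
def expected_enc(gc3s):
    """Expected ENC under compositional constraint only (Wright's curve).

    gc3s: GC fraction at synonymous third codon positions, in [0, 1].
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must lie between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def enc_deviation(observed_enc, gc3s):
    """Relative gap between expected and observed ENC (hypothetical helper).

    A clearly positive value places the gene below the curve, which is
    usually read as a sign that forces other than mutational bias,
    such as selection, also shape codon usage.
    """
    exp = expected_enc(gc3s)
    return (exp - observed_enc) / exp
```

For example, `expected_enc(0.5)` gives 60.5, close to the theoretical maximum of 61 sense codons used equally.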
Affiliation(s)
- Zhen He
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou 225009, China
- Shiwen Ding
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
- Jiyuan Guo
  - Department of Resources and Environment, Moutai Institute, Zunyi 564507, China
- Lang Qin
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
- Xiaowei Xu
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
5
Maldonado LL, Bertelli AM, Kamenetzky L. Molecular features similarities between SARS-CoV-2, SARS, MERS and key human genes could favour the viral infections and trigger collateral effects. Sci Rep 2021; 11:4108. [PMID: 33602998 PMCID: PMC7893037 DOI: 10.1038/s41598-021-83595-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/17/2020] [Accepted: 01/26/2021] [Indexed: 01/31/2023] [Open Access]
Abstract
In December 2019, rising pneumonia cases caused by a novel β-coronavirus (SARS-CoV-2) occurred in Wuhan, China, which has rapidly spread worldwide, causing thousands of deaths. The WHO declared the SARS-CoV-2 outbreak as a public health emergency of international concern, since then several scientists are dedicated to its study. It has been observed that many human viruses have codon usage biases that match highly expressed proteins in the tissues they infect and depend on the host cell machinery for the replication and co-evolution. In this work, we analysed 91 molecular features and codon usage patterns for 339 viral genes and 463 human genes that consisted of 677,873 codon positions. Hereby, we selected the highly expressed genes from human lung tissue to perform computational studies that permit to compare their molecular features with those of SARS, SARS-CoV-2 and MERS genes. The integrated analysis of all the features revealed that certain viral genes and overexpressed human genes have similar codon usage patterns. The main pattern was the A/T bias that together with other features could propitiate the viral infection, enhanced by a host dependant specialization of the translation machinery of only some of the overexpressed genes. The envelope protein E, the membrane glycoprotein M and ORF7 could be further benefited. This could be the key for a facilitated translation and viral replication conducting to different comorbidities depending on the genetic variability of population due to the host translation machinery. This is the first codon usage approach that reveals which human genes could be potentially deregulated due to the codon usage similarities between the host and the viral genes when the virus is already inside the human cells of the lung tissues. 
Our work led to the identification of additional highly expressed human genes which are not the usual suspects but might play a role in the viral infection, and it lays the basis for further research in the field of human genetics associated with new viral infections. Identifying the genes that could be deregulated under a viral infection is important for predicting the collateral effects and determining which individuals would be more susceptible based on their genetic features and associated comorbidities.
Affiliation(s)
- Lucas L Maldonado
  - IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
- Laura Kamenetzky
  - IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
  - iB3 | Instituto de Biociencias, Biotecnología y Biología traslacional, Departamento de Fisiologia y Biologia Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
6
Maldonado LL, Stegmayer G, Milone DH, Oliveira G, Rosenzvit M, Kamenetzky L. Whole genome analysis of codon usage in Echinococcus. Mol Biochem Parasitol 2018; 225:54-66. [DOI: 10.1016/j.molbiopara.2018.08.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/16/2018] [Revised: 07/20/2018] [Accepted: 08/01/2018] [Indexed: 01/15/2023]
7
Arakawa K, Tomita M. The GC Skew Index: A Measure of Genomic Compositional Asymmetry and the Degree of Replicational Selection. Evol Bioinform Online 2017. [DOI: 10.1177/117693430700300006] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022] [Open Access]
Abstract
Circular bacterial chromosomes have highly polarized nucleotide composition in the two replichores, and this genomic strand asymmetry can be visualized using GC skew graphs. Here we propose and discuss the GC skew index (GCSI) for the quantification of genomic compositional skew, which combines a normalized measure of fast Fourier transform to capture the shape of the skew graph and Euclidean distance between the two vertices in a cumulative skew graph to represent the degree of skew. We calculated GCSI for all available bacterial genomes, and GCSI correlated well with the visibility of GC skew. This novel index is useful for estimating confidence levels for the prediction of replication origin and terminus by methods based on GC skew and for measuring the strength of replicational selection in a genome.
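To make the GC skew graph and the cumulative skew graph concrete, the sketch below computes windowed GC skew, (G − C) / (G + C), and its running sum, then reports the windows where the cumulative curve reaches its minimum and maximum. This is only an illustration of the two graphs the abstract mentions; it does not implement the GCSI itself, and the function names and the default window size are assumptions.

```python
from itertools import accumulate


def gc_skew_windows(seq, window=1000):
    """GC skew (G - C) / (G + C) in consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # A window without G or C contributes zero skew.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew(skews):
    """Running sum of the windowed skew values."""
    return list(accumulate(skews))


def skew_extrema(cumulative):
    """Indices of the first minimum and first maximum of the cumulative curve.

    In many bacterial chromosomes the minimum lies near the replication
    origin and the maximum near the terminus. Raises ValueError on empty input.
    """
    idx = range(len(cumulative))
    return min(idx, key=cumulative.__getitem__), max(idx, key=cumulative.__getitem__)
```

Multiplying a returned index by the window size gives an approximate genome coordinate for the corresponding extremum.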
Affiliation(s)
- Kazuharu Arakawa
  - Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa 252-8520, Japan
- Masaru Tomita
  - Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa 252-8520, Japan
8
Huang X, Xu J, Chen L, Wang Y, Gu X, Peng X, Yang G. Analysis of transcriptome data reveals multifactor constraint on codon usage in Taenia multiceps. BMC Genomics 2017; 18:308. [PMID: 28427327 PMCID: PMC5397707 DOI: 10.1186/s12864-017-3704-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 09/27/2016] [Accepted: 04/12/2017] [Indexed: 12/04/2022] [Open Access]
Abstract
Background: Codon usage bias (CUB) is an important evolutionary feature in genomes that has been widely observed in many organisms. However, the synonymous codon usage pattern in the genome of T. multiceps remains to be clarified. In this study, we analyzed the codon usage of T. multiceps based on the transcriptome data to reveal the constraint factors and to gain an improved understanding of the mechanisms that shape synonymous CUB.
Results: Analysis of a total of 8,620 annotated mRNA sequences from T. multiceps indicated only a weak codon bias, with mean GC and GC3 content values of 49.29% and 51.43%, respectively. Our analysis indicated that nucleotide composition, mutational pressure, natural selection, gene expression level, amino acids with grand average of hydropathicity (GRAVY) and aromaticity (Aromo) and the effective selection of amino-acids all contributed to the codon usage in T. multiceps. Among these factors, natural selection was implicated as the major factor affecting the codon usage variation in T. multiceps. The codon usage of ribosome genes was affected mainly by mutations, while the essential genes were affected mainly by selection. In addition, 21 codons were identified as “optimal codons”. Overall, the optimal codons were GC-rich (GC:AU, 41:22), and ended with G or C (except CGU). Furthermore, different degrees of variation in codon usage were found between T. multiceps and Escherichia coli, yeast, and Homo sapiens. However, little difference was found between T. multiceps and Taenia pisiformis.
Conclusions: In this study, the codon usage pattern of T. multiceps was analyzed systematically and factors affecting CUB were also identified. This is the first study of codon biology in T. multiceps. Understanding the codon usage pattern in T. multiceps can be helpful for the discovery of new genes, molecular genetic engineering and evolutionary studies.
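The mean GC and GC3 values quoted in this abstract are simple per-sequence statistics. The sketch below shows one way to compute them for a single coding sequence. It is an assumption-laden illustration: the function name is hypothetical, every codon is counted (some tools report GC3s, which excludes Met, Trp and stop codons), and codons containing ambiguous bases are dropped.

```python
def gc_and_gc3(cds):
    """Return (GC, GC3) fractions for one in-frame coding sequence.

    GC  : fraction of G or C over all counted codon positions.
    GC3 : fraction of G or C at the third position of each counted codon.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    codons = [c for c in codons if set(c) <= set("ACGT")]
    if not codons:
        raise ValueError("no complete, unambiguous codons found")
    gc = sum(base in "GC" for codon in codons for base in codon) / (3 * len(codons))
    gc3 = sum(codon[2] in "GC" for codon in codons) / len(codons)
    return gc, gc3
```

Averaging these per-sequence values over all transcripts would give dataset-level means of the kind reported above.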
Affiliation(s)
- Xing Huang
  - Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
  - Chengdu Agricultural College, Chengdu, 611130, China
- Jing Xu
  - Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
- Lin Chen
  - Meat-processing Application Key Laboratory of Sichuan Province, College of Pharmacy and Biological Engineering, Chengdu University, Chengdu, 610106, China
- Yu Wang
  - Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
- Xiaobin Gu
  - Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
- Xuerong Peng
  - College of Science, Sichuan Agricultural University, Ya'an, 625014, China
- Guangyou Yang
  - Department of Parasitology, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, 611130, China
9
Xu W, Xing T, Zhao M, Yin X, Xia G, Wang M. Synonymous codon usage bias in plant mitochondrial genes is associated with intron number and mirrors species evolution. PLoS One 2015; 10:e0131508. [PMID: 26110418 PMCID: PMC4481540 DOI: 10.1371/journal.pone.0131508] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/31/2015] [Accepted: 06/03/2015] [Indexed: 11/21/2022] [Open Access]
Abstract
Synonymous codon usage bias (SCUB), the non-uniform usage of synonymous codons, occurs in nearly all organisms. We previously found that SCUB is correlated with both intron number and exon position in the plant nuclear genome but not in the plastid genome; SCUB in both the nuclear and plastid genomes can mirror evolutionary specialization. However, whether these rules hold in the mitochondrial genome has not been addressed. Here, we present an analysis of SCUB in the mitochondrial genome, based on 24 plant species ranging from algae to land plants. The frequencies of NNA and NNT (A- and T-ending codons) are higher than those of NNG and NNC, with the strongest preference in bryophytes and the weakest in land plants, suggesting an association between SCUB and plant evolution. The preference for NNA and NNT is more evident in genes harboring a greater number of introns in land plants, but the bias to NNA and NNT exhibits even among exons. The pattern of SCUB in the mitochondrial genome differs in some respects from that present in both the nuclear and plastid genomes.
Affiliation(s)
- Wenjing Xu
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Tian Xing
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Mingming Zhao
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Xunhao Yin
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Guangmin Xia
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
- Mengcheng Wang
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100, China
10
Qi Y, Xu W, Xing T, Zhao M, Li N, Yan L, Xia G, Wang M. Synonymous Codon Usage Bias in the Plastid Genome is Unrelated to Gene Structure and Shows Evolutionary Heterogeneity. Evol Bioinform Online 2015; 11:65-77. [PMID: 25922569 PMCID: PMC4395140 DOI: 10.4137/ebo.s22566] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 12/08/2014] [Revised: 02/22/2015] [Accepted: 02/22/2015] [Indexed: 01/26/2023] [Open Access]
Abstract
Synonymous codon usage bias (SCUB) is the nonuniform usage of codons, occurring often in nearly all organisms. Our previous study found that SCUB is correlated with intron number, is unequal among exons in the plant nuclear genome, and mirrors evolutionary specialization. However, whether this rule exists in the plastid genome has not been addressed. Here, we present an analysis of SCUB in the plastid genomes of 25 species from lower to higher plants (algae, bryophytes, pteridophytes, gymnosperms, and spermatophytes). We found NNA and NNT (A- and T-ending codons) are preferential in the plastid genomes of all plants. Interestingly, this preference is heterogeneous among taxonomies of plants, with the strongest preference in bryophytes and the weakest in pteridophytes, suggesting an association between SCUB and plant evolution. In addition, SCUB frequencies are consistent among genes with varied introns and among exons, indicating that the bias of NNA and NNT is unrelated to either intron number or exon position. Further, SCUB is associated with DNA methylation–induced conversion of cytosine to thymine in the vascular plants but not in algae or bryophytes. These data demonstrate that these SCUB profiles in the plastid genome are distinctly different compared with the nuclear genome.
Affiliation(s)
- Yueying Qi
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, Jinan 250100, Shandong, China
- Wenjing Xu
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, Jinan 250100, Shandong, China
- Tian Xing
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, Jinan 250100, Shandong, China
- Mingming Zhao
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, Jinan 250100, Shandong, China
- Nana Li
  - Shandong Center of Crop Germplasm Resources, Jinan 250100, Shandong, China
- Li Yan
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, Jinan 250100, Shandong, China
- Guangmin Xia
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, Jinan 250100, Shandong, China
- Mengcheng Wang
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, Jinan 250100, Shandong, China
11
Evolution of tryptophan biosynthetic pathway in microbial genomes: a comparative genetic study. Systems and Synthetic Biology 2013; 8:59-72. [PMID: 24592292 DOI: 10.1007/s11693-013-9127-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 05/22/2013] [Revised: 10/05/2013] [Accepted: 10/08/2013] [Indexed: 10/26/2022]
Abstract
Biosynthetic pathway evolution needs to consider the evolution of a group of genes that code for enzymes catalysing the multiple chemical reaction steps leading to the final end product. Tryptophan biosynthetic pathway has five chemical reaction steps that are highly conserved in diverse microbial genomes, though the genes of the pathway enzymes show considerable variations in arrangements, operon structure (gene fusion and splitting) and regulation. We use a combined bioinformatic and statistical analyses approach to address the question if the pathway genes from different microbial genomes, belonging to a wide range of groups, show similar evolutionary relationships within and between them. Our analyses involved detailed study of gene organization (fusion/splitting events), base composition, relative synonymous codon usage pattern of the genes, gene expressivity, amino acid usage, etc. to assess inter- and intra-genic variations, between and within the pathway genes, in diverse group of microorganisms. We describe these genetic and genomic variations in the tryptophan pathway genes in different microorganisms to show the similarities across organisms, and compare the same genes across different organisms to find the possible variability arising possibly due to horizontal gene transfers. Such studies form the basis for moving from single gene evolution to pathway evolutionary studies that are important steps towards understanding the systems biology of intracellular pathways.
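Relative synonymous codon usage (RSCU), one of the measures listed in this abstract, divides each codon's observed count by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for one coding sequence under the standard genetic code. The function name is hypothetical, and skipping single-codon amino acids (Met, Trp) and stop codons is an assumption of the example, not a detail taken from the study.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in the sequence.

    RSCU = observed count * (number of synonymous codons) / (family total).
    Values above 1 mark codons used more often than under equal usage.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families[amino_acid].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if len(codons) == 1 or total == 0:
            continue  # Met/Trp carry no synonymous choice; absent families are skipped
        for codon in codons:
            values[codon] = counts[codon] * len(codons) / total
    return values
```

Within each amino acid family the RSCU values sum to the number of synonymous codons, which makes profiles comparable across genes of different length and amino acid composition.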
12
Chen W, Xie T, Shao Y, Chen F. Genomic characteristics comparisons of 12 food-related filamentous fungi in tRNA gene set, codon usage and amino acid composition. Gene 2012; 497:116-24. [PMID: 22305983 DOI: 10.1016/j.gene.2012.01.016] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 09/09/2011] [Revised: 12/26/2011] [Accepted: 01/17/2012] [Indexed: 11/19/2022]
Abstract
Filamentous fungi are widely exploited in food industry due to their abilities to secrete large amounts of enzymes and metabolites. The recent availability of fungal genome sequences has provided an opportunity to explore the genomic characteristics of these food-related filamentous fungi. In this paper, we selected 12 representative filamentous fungi in the areas of food processing and safety, which were Aspergillus clavatus, A. flavus, A. fumigatus, A. nidulans, A. niger, A. oryzae, A. terreus, Monascus ruber, Neurospora crassa, Penicillium chrysogenum, Rhizopus oryzae and Trichoderma reesei, and did the comparative studies of their genomic characteristics of tRNA gene distribution, codon usage pattern and amino acid composition. The results showed that the copy numbers greatly differed among isoaccepting tRNA genes and the distribution seemed to be related with translation process. The results also revealed that genome compositional variation probably constrained the base choice at the third codon, and affected the overall amino acid composition but seemed to have little effect on the integrated physicochemical characteristics of overall amino acids. The further analysis suggested that the wobble pairing and base modification were the important mechanisms in codon-anticodon interaction. In the scope of authors' knowledge, it is the first report about the genomic characteristics analysis of food-related filamentous fungi, which would be informative for the analysis of filamentous fungal genome evolution and their practical application in food industry.
Affiliation(s)
- Wanping Chen
  - College of Food Science and Technology, Huazhong Agricultural University, Wuhan, Hubei Province, 430070, PR China
|
13
|
Pandit A, Sinha S. Differential trends in the codon usage patterns in HIV-1 genes. PLoS One 2011; 6:e28889. [PMID: 22216135 PMCID: PMC3245234 DOI: 10.1371/journal.pone.0028889] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 07/15/2011] [Accepted: 11/16/2011] [Indexed: 12/27/2022] Open
Abstract
Host-pathogen interactions underlie one of the most complex evolutionary phenomena, resulting in continual adaptive genetic changes in which pathogens exploit the host's molecular resources for growth and survival while hosts try to eliminate the pathogen. Deciphering the molecular basis of host-pathogen interactions is useful in understanding the factors governing pathogen evolution and disease propagation. In the host-pathogen context, a balance between mutation, selection, and genetic drift is known to maintain codon bias in both organisms. Studies revealing the determinants of this bias and its dynamics are central to the understanding of host-pathogen evolution. We considered the Human Immunodeficiency Virus (HIV) type 1 and its human host to search for evolutionary signatures in the viral genome. Positive selection is known to dominate intra-host evolution of HIV-1, whereas high genetic variability underlies the belief that neutral processes drive inter-host differences. In this study, we analyze the codon usage patterns of HIV-1 genomes across all subtypes and clades sequenced over a period of 23 years. We show the presence of unique temporal correlations in the codon bias of three HIV-1 genes, illustrating differential adaptation of the HIV-1 genes towards the host-preferred codons. Our results point towards gene-specific translational selection as an important force driving the evolution of HIV-1 at the population level.
Affiliation(s)
- Aridaman Pandit
  - Mathematical Modeling and Computational Biology Group, Centre for Cellular & Molecular Biology (CSIR), Hyderabad, Andhra Pradesh, India
- Somdatta Sinha
  - Mathematical Modeling and Computational Biology Group, Centre for Cellular & Molecular Biology (CSIR), Hyderabad, Andhra Pradesh, India
  - Indian Institute of Science Education and Research Mohali, Mohali, Punjab, India
|
14
|
Aoi MC, Rourke BC. Interspecific and intragenic differences in codon usage bias among vertebrate myosin heavy-chain genes. J Mol Evol 2011; 73:74-93. [PMID: 21915654 DOI: 10.1007/s00239-011-9457-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/24/2010] [Accepted: 08/19/2011] [Indexed: 01/13/2023]
Abstract
Synonymous codon usage bias is a broadly observed phenomenon in bacteria, plants, and invertebrates and may result from selection. However, the role of selective pressures in shaping codon bias is still controversial in vertebrates, particularly for mammals. The myosin heavy-chain (MyHC) gene family comprises multiple isoforms of the major force-producing contractile protein in cardiac and skeletal muscles. Slow and fast genes are tandemly arrayed on separate chromosomes, and have distinct patterns of functionality and expression in muscle. We analyze both full-length MyHC genes (~5400 bp) and a larger collection of partial sequences at the 3' end (~500 bp). The MyHC isoforms are an interesting system in which to study codon usage bias because of their length, expression, and critical importance to organismal mobility. Codon bias and GC content differ among MyHC genes with regard to functional type, isoform, and position within the gene. Codon bias even varies by isoform within a species. We find evidence in favor of both chromosomal influences on nucleotide composition and selection against nonsense errors (SANE) acting on codon usage in MyHC genes. Intragenic variation in codon bias and elongation rate is significant, with a strong trend for increasing codon bias and elongation rate towards the 3' end of the gene, although the trend depends on the degeneracy class of the codons. Therefore, patterns of codon usage in MyHC genes are consistent with models supporting SANE as a major force shaping codon usage.
Affiliation(s)
- Mikio C Aoi
  - Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA
|
15
|
Wang B, Liu J, Jin L, Feng XY, Chen JQ. Complex mutation and weak selection together determined the codon usage bias in bryophyte mitochondrial genomes. J Integr Plant Biol 2010; 52:1100-1108. [PMID: 21106008 DOI: 10.1111/j.1744-7909.2010.00998.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 05/30/2023]
Abstract
Mutation and selection are two major forces causing codon usage biases. How these two forces influence codon usage in green plant mitochondrial genomes has not been well investigated. In the present study, we surveyed five bryophyte mitochondrial genomes to reveal their codon usage patterns as well as the determining forces. Three interesting findings were made. First, compared with Chara vulgaris, an algal species sister to all extant land plants, bryophytes use more G- or C-ending codons in their mitochondrial genes. This is consistent with the generally higher genomic GC content in bryophyte mitochondria, suggesting an increased mutational pressure toward GC. Second, as indicated by Wright's Nc-GC3s plot, mutation, not selection, is the major force affecting codon usage in bryophyte mitochondrial genes. However, the real mutational dynamics seem very complex. Context-dependent analysis indicated that the nucleotide at the 2nd codon position slightly affects synonymous codon choice. Finally, in bryophyte mitochondria, tRNA genes apply a weak selection force that fine-tunes the synonymous codon frequencies, as revealed by data from the Ser4-Pro-Thr-Val families. In summary, complex mutation and weak selection together determined codon usage in bryophyte mitochondrial genomes.
Affiliation(s)
- Bin Wang
  - State Key Laboratory of Pharmaceutical Biotechnology, Department of Biology, Nanjing University, Nanjing 210093, China
|
16
|
Tse H, Cai JJ, Tsoi HW, Lam EP, Yuen KY. Natural selection retains overrepresented out-of-frame stop codons against frameshift peptides in prokaryotes. BMC Genomics 2010; 11:491. [PMID: 20828396 PMCID: PMC2996987 DOI: 10.1186/1471-2164-11-491] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 03/13/2010] [Accepted: 09/09/2010] [Indexed: 12/03/2022] Open
Abstract
Background: Out-of-frame stop codons (OSCs) occur naturally in coding sequences of all organisms, providing a mechanism for early termination of translation in an incorrect reading frame so that the metabolic cost associated with frameshift events can be reduced. Given this functional significance, we expect OSCs to be statistically overrepresented in coding sequences as a result of widespread selection. Accordingly, we examined available prokaryotic genomes to look for evidence of this selection.
Results: The complete genome sequences of 990 prokaryotes were obtained from NCBI GenBank. We found that coding sequences with low G+C content contain significantly more OSCs and that G+C content at specific codon positions was the principal determinant of OSC usage bias in the different reading frames. To investigate whether OSCs are overrepresented, we modeled the trinucleotide and hexanucleotide biases of the coding sequences using Markov models, and calculated the expected OSC frequencies for each organism using a Monte Carlo approach. More than 93% of 342 phylogenetically representative prokaryotic genomes contain excess OSCs. Interestingly, the degree of OSC overrepresentation correlates positively with G+C content, which may represent a compensatory mechanism for the negative correlation of OSC frequency with G+C content. We extended the analysis using additional compositional bias models and showed that lower-order biases such as codon usage and dipeptide bias could not explain the OSC overrepresentation. The degree of OSC overrepresentation was found to correlate negatively with the optimal growth temperature of the organism after correcting for the G+C% and AT skew of the coding sequence.
Conclusions: The present study uses statistically rigorous approaches to show that OSC overrepresentation is a widespread phenomenon among prokaryotes. Our results support the hypothesis that OSCs carry functional significance and have been selected in the course of genome evolution to act against unintended frameshift occurrences. Some results also hint that OSC overrepresentation is a compensatory mechanism that makes up for the decrease in OSCs in high G+C organisms, thus revealing the interplay between two different determinants of OSC frequency.
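As a rough illustration of the quantity discussed in this abstract, the following Python sketch counts stop codons that appear in the +1 and +2 reading frames of a coding sequence. It only produces observed counts; the Markov-model and Monte Carlo steps the authors used to derive expected frequencies are not reproduced. The function name and the toy sequence are hypothetical, and the standard stop codons (TAA, TAG, TGA) are assumed.

```python
# Minimal sketch: observed out-of-frame stop codon (OSC) counts only.
# Assumes the standard stop codons; names and example are illustrative.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_out_of_frame_stops(cds):
    """Return {frame_shift: count} of stop codons in the +1 and +2 frames."""
    seq = cds.upper()
    counts = {}
    for shift in (1, 2):
        # Step through complete triplets starting at the shifted position.
        codons = (seq[i:i + 3] for i in range(shift, len(seq) - 2, 3))
        counts[shift] = sum(codon in STOP_CODONS for codon in codons)
    return counts


# Toy in-frame sequence: ATG ATA GCT GAA TAA
print(count_out_of_frame_stops("ATGATAGCTGAATAA"))  # {1: 2, 2: 1}
```

Comparing such observed counts against a null model of sequence composition would be the next step in an analysis of this kind.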
Affiliation(s)
- Herman Tse
  - Carol Yu Centre for Infection, Department of Microbiology, The University of Hong Kong, Hong Kong, China
|
17
|
Variation in the correlation of G + C composition with synonymous codon usage bias among bacteria. EURASIP J Bioinform Syst Biol 2010:61374. [PMID: 18350114 DOI: 10.1155/2007/61374] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/31/2007] [Accepted: 06/04/2007] [Indexed: 11/17/2022]
Abstract
G + C composition at the third codon position (GC3) is widely reported to be correlated with synonymous codon usage bias. However, no quantitative attempt has been made to compare the extent of this correlation among different genomes. Here, we applied Shannon entropy from information theory to measure the degree of GC3 bias and that of synonymous codon usage bias of each gene. The strength of the correlation of GC3 with synonymous codon usage bias, quantified by a correlation coefficient, varied widely among bacterial genomes, ranging from -0.07 to 0.95. Previous analyses suggesting that the relationship between GC3 and synonymous codon usage bias is independent of species are thus inconsistent with the more detailed analyses obtained here for individual species.
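The abstract does not give the exact entropy formulation, so the following Python sketch is only one plausible reading: it computes the Shannon entropy of the G/C versus A/T split at third codon positions, and the entropy of codon choice within one synonymous family. All function names, the alanine family constant, and the toy sequences are assumptions made for illustration.

```python
# Hedged sketch of entropy-based bias measures; not the authors' exact method.
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (bits) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return sum(p * math.log2(1 / p) for p in probs)


def gc3_entropy(cds):
    """Entropy of G/C versus A/T at third codon positions (0 = fully biased)."""
    seq = cds.upper()
    third = [seq[i + 2] for i in range(0, len(seq) - 2, 3)]
    gc = sum(base in "GC" for base in third)
    return shannon_entropy([gc, len(third) - gc])


def synonymous_entropy(cds, family):
    """Entropy of codon choice within one synonymous codon family."""
    seq = cds.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in family)
    return shannon_entropy(list(counts.values()))


ALANINE = {"GCT", "GCC", "GCA", "GCG"}  # four-fold degenerate family
print(gc3_entropy("ATGGCCAAAGGT"))                  # 1.0 (no GC3 bias)
print(synonymous_entropy("GCTGCCGCAGCG", ALANINE))  # 2.0 (uniform usage)
```

Lower entropy indicates stronger bias; correlating per-gene values of the two measures across a genome would give a coefficient of the kind reported above.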
|
18
|
Supek F, Škunca N, Repar J, Vlahoviček K, Šmuc T. Translational selection is ubiquitous in prokaryotes. PLoS Genet 2010; 6:e1001004. [PMID: 20585573 PMCID: PMC2891978 DOI: 10.1371/journal.pgen.1001004] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Received: 12/11/2009] [Accepted: 05/26/2010] [Indexed: 11/29/2022] Open
Abstract
Codon usage bias in prokaryotic genomes is largely a consequence of background substitution patterns in DNA, but highly expressed genes may show a preference towards codons that enable more efficient and/or accurate translation. We introduce a novel approach based on supervised machine learning that detects effects of translational selection on genes, while controlling for local variation in nucleotide substitution patterns represented as sequence composition of intergenic DNA. A cornerstone of our method is a Random Forest classifier that outperformed previous distance measure-based approaches, such as the codon adaptation index, in the task of discerning the (highly expressed) ribosomal protein genes by their codon frequencies. Unlike previous reports, we show evidence that translational selection in prokaryotes is practically universal: in 460 of 461 examined microbial genomes, we find that a subset of genes shows a higher codon usage similarity to the ribosomal proteins than would be expected from the local sequence composition. These genes constitute a substantial part of the genome (between 5% and 33%, depending on genome size), while also exhibiting higher experimentally measured mRNA abundances and tending toward codons that match tRNA anticodons by canonical base pairing. Certain gene functional categories are generally enriched with, or depleted of, codon-optimized genes, with the trends of enrichment/depletion being conserved between Archaea and Bacteria. Prominent exceptions from these trends might indicate genes with alternative physiological roles; we speculate on specific examples related to detoxication of oxygen radicals and ammonia and to possible misannotations of asparaginyl-tRNA synthetases. Since the presence of codon optimizations on genes is a valid proxy for expression levels in fully sequenced genomes, we provide an example of an “adaptome” by highlighting gene functions with expression levels elevated specifically in thermophilic Bacteria and Archaea.

Synonymous codons are not equally common in genomes. The main causes of unequal codon usage are varying nucleotide substitution patterns, as manifested in the wide range of genomic nucleotide compositions. However, since the first E. coli and yeast genes were sequenced, it has been evident that there is also a bias towards codons that can be translated to protein faster and more accurately. This bias was stronger in highly expressed genes, and its driving force was termed translational selection. Researchers searched for effects of translational selection in microbial genomes as they became available, employing a flurry of mathematical approaches that sometimes led to contradictory conclusions. We introduce a sensitive and accurate machine learning-based methodology and find that highly expressed genes have a recognizable codon usage pattern in almost every bacterial and archaeal genome analyzed, even after accounting for large differences in background nucleotide composition. We also show that the gene functional category has a great bearing on whether that gene is subject to translational selection. Since the presence of codon optimizations can be used as a purely sequence-derived proxy for expression levels, we can delineate “adaptomes” by relating predicted gene activity to organisms' phenotypes, which we demonstrate on genomes of temperature-resistant Bacteria and Archaea.
Affiliation(s)
- Fran Supek
  - Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
- Nives Škunca
  - Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
- Jelena Repar
  - Division of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
- Kristian Vlahoviček
  - Division of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
  - Department of Informatics, University of Oslo, Oslo, Norway
- Tomislav Šmuc
  - Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
|
19
|
Harrison PW, Lower RPJ, Kim NKD, Young JPW. Introducing the bacterial 'chromid': not a chromosome, not a plasmid. Trends Microbiol 2010; 18:141-8. [PMID: 20080407 DOI: 10.1016/j.tim.2009.12.010] [Citation(s) in RCA: 246] [Impact Index Per Article: 17.6] [Received: 08/11/2009] [Revised: 12/02/2009] [Accepted: 12/04/2009] [Indexed: 10/20/2022]
Abstract
In addition to the main chromosome, approximately one in ten bacterial genomes have a 'second chromosome' or 'megaplasmid'. Here, we propose that these represent a single class of elements that have a distinct and consistent set of properties, and suggest the term 'chromid' to distinguish them from both chromosomes and plasmids. Chromids carry some core genes, and their nucleotide composition and codon usage are very similar to those of the chromosomes they are associated with. By contrast, they have plasmid replication and partitioning systems and the majority of their genes confer accessory functions. Chromids seem particularly rich in genus-specific genes and appear to be 'reinvented' at the origin of a new genus.
Affiliation(s)
- Peter W Harrison
  - Department of Biology, University of York, PO Box 373, York YO10 5YW, UK.
|
20
|
Suzuki H, Saito R, Tomita M. Measure of synonymous codon usage diversity among genes in bacteria. BMC Bioinformatics 2009; 10:167. [PMID: 19480720 PMCID: PMC2697163 DOI: 10.1186/1471-2105-10-167] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 10/27/2008] [Accepted: 06/01/2009] [Indexed: 11/10/2022] Open
Abstract
Background: In many bacteria, intragenomic diversity in synonymous codon usage among genes has been reported. However, no quantitative attempt has been made to compare the diversity levels among different genomes. Here, we introduce a mean dissimilarity-based index (Dmean) for quantifying the level of diversity in synonymous codon usage among all genes within a genome.
Results: The application of Dmean to 268 bacterial genomes shows that in bacteria with extremely biased genomic G+C compositions there is little diversity in synonymous codon usage among genes. Furthermore, our findings contradict previous reports. For example, a low level of diversity in codon usage among genes has been reported for Helicobacter pylori, but based on Dmean, the diversity level of this species is higher than that of more than half of the bacteria tested here. The discrepancies between our findings and previous reports are probably due to differences in the methods used for measuring codon usage diversity.
Conclusion: We recommend that Dmean be used to measure the diversity level of codon usage among genes. This measure can be applied to other compositional features such as amino acid usage and dinucleotide relative abundance as a genomic signature.
Affiliation(s)
- Haruo Suzuki
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, 997-0017, Japan.
|
21
|
Suzuki H, Brown CJ, Forney LJ, Top EM. Comparison of correspondence analysis methods for synonymous codon usage in bacteria. DNA Res 2008; 15:357-65. [PMID: 18940873 PMCID: PMC2608848 DOI: 10.1093/dnares/dsn028] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Indexed: 12/02/2022] Open
Abstract
Synonymous codon usage varies both between organisms and among genes within a genome, and arises due to differences in G + C content, replication strand skew, or gene expression levels. Correspondence analysis (CA) is widely used to identify major sources of variation in synonymous codon usage among genes and provides a way to identify horizontally transferred or highly expressed genes. Four methods of CA have been developed based on three kinds of input data: absolute codon frequency, relative codon frequency, and relative synonymous codon usage (RSCU) as well as within-group CA (WCA). Although different CA methods have been used in the past, no comprehensive comparative study has been performed to evaluate their effectiveness. Here, the four CA methods were evaluated by applying them to 241 bacterial genome sequences. The results indicate that WCA is more effective than the other three methods in generating axes that reflect variations in synonymous codon usage. Furthermore, WCA reveals sources that were previously unnoticed in some genomes; e.g. synonymous codon usage related to replication strand skew was detected in Rickettsia prowazekii. Though CA based on RSCU is widely used, our evaluation indicates that this method does not perform as well as WCA.
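Relative synonymous codon usage (RSCU), one of the input types named in this abstract, is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below computes it for a single coding sequence under the standard genetic code; the function name and toy sequence are hypothetical, and the correspondence analysis itself is not shown.

```python
# Minimal RSCU sketch under the standard genetic code (an assumption here).
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Map each codon (TTT, TTC, ... GGG) to its one-letter amino acid or '*'.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids observed in the sequence."""
    seq = cds.upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU undefined, so skip it
        for c in codons:
            # observed / (family total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


result = rscu("GCTGCTGCCAAA")  # Ala, Ala, Ala, Lys
print(round(result["GCT"], 3), result["AAA"])  # 2.667 2.0
```

A value of 1 means a codon is used exactly as often as expected under equal usage; per-gene RSCU vectors of this kind are what an RSCU-based correspondence analysis would take as rows.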
Affiliation(s)
- Haruo Suzuki
  - Department of Biological Sciences and Initiative for Bioinformatics and Evolutionary Studies, University of Idaho, PO Box 443051, Moscow, Idaho 83844-3051, USA.
|
22
|
Kloster M, Tang C. SCUMBLE: a method for systematic and accurate detection of codon usage bias by maximum likelihood estimation. Nucleic Acids Res 2008; 36:3819-27. [PMID: 18495752 PMCID: PMC2441815 DOI: 10.1093/nar/gkn288] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 11/27/2007] [Revised: 04/22/2008] [Accepted: 04/25/2008] [Indexed: 11/23/2022] Open
Abstract
The genetic code is degenerate: most amino acids can be encoded by two to six different codons. The synonymous codons are not used with equal frequency: not only are some codons favored over others, but their usage can also vary significantly from species to species and between different genes in the same organism. Known causes of codon bias include differences in mutation rates as well as selection pressure related to the expression level of a gene, but the standard analysis methods can account for only a fraction of the observed codon usage variation. Here we introduce an explicit model of codon usage bias, inspired by statistical physics. Combining this model with a maximum likelihood approach, we are able to clearly identify different sources of bias in various genomes. We have applied the algorithm to Saccharomyces cerevisiae as well as 325 prokaryote genomes, and in most cases our model explains essentially all observed variance.
Affiliation(s)
- Morten Kloster
  - Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California 94158, USA and Center for Theoretical Biology, Peking University, Beijing 100871, China
- Chao Tang
  - Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California 94158, USA and Center for Theoretical Biology, Peking University, Beijing 100871, China
|
23
|
Ishii K, Washio T, Uechi T, Yoshihama M, Kenmochi N, Tomita M. Characteristics and clustering of human ribosomal protein genes. BMC Genomics 2006; 7:37. [PMID: 16504170 PMCID: PMC1459141 DOI: 10.1186/1471-2164-7-37] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Received: 07/31/2005] [Accepted: 02/28/2006] [Indexed: 11/20/2022] Open
Abstract
Background: The ribosome is a central player in the translation system, which in mammals consists of four RNA species and 79 ribosomal proteins (RPs). The control mechanisms of gene expression and the functions of RPs are believed to be identical. Most RP genes have common promoters and were therefore assumed to have a unified gene expression control mechanism.
Results: We systematically analyzed the homogeneity and heterogeneity of RP genes on the basis of their expression profiles, promoter structures, encoded amino acid compositions, and codon compositions. The results revealed that (1) most RP genes are coordinately expressed at the mRNA level, with higher signals in the spleen, lymph node dissection (LND), and fetal brain. However, 17 genes, including the P protein genes (RPLP0, RPLP1, RPLP2), are expressed in a tissue-specific manner. (2) Most promoters have GC boxes and possible binding sites for nuclear respiratory factor 2, Yin and Yang 1, and/or activator protein 1. However, they do not have canonical TATA boxes. (3) Analysis of the amino acid composition of the encoded proteins indicated a high lysine and arginine content. (4) The major RP genes exhibit a characteristic synonymous codon composition with high rates of G or C in the third-codon position and a high content of AAG, CAG, ATC, GAG, CAC, and CTG.
Conclusion: Eleven of the RP genes are still identified as being unique and did not exhibit at least some of the above characteristics, indicating that they may have unknown functions not present in other RP genes. Furthermore, we found sequences conserved between human and mouse genes around the transcription start sites and in the intronic regions. This study suggests certain overall trends and characteristic features of human RP genes.
Affiliation(s)
- Kyota Ishii
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0035, Japan
  - Graduate School of Media and Governance, Keio University, Fujisawa, Kanagawa 252-8520, Japan
- Takanori Washio
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0035, Japan
  - Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
- Tamayo Uechi
  - Frontier Science Research Center, University of Miyazaki, Kiyotake, Miyazaki 889-1692, Japan
- Maki Yoshihama
  - Frontier Science Research Center, University of Miyazaki, Kiyotake, Miyazaki 889-1692, Japan
- Naoya Kenmochi
  - Frontier Science Research Center, University of Miyazaki, Kiyotake, Miyazaki 889-1692, Japan
- Masaru Tomita
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0035, Japan
  - Department of Environmental Information, Keio University, Fujisawa, Kanagawa 252-8520, Japan
|