1
Lamolle G, Iriarte A, Simón D, Musto H. Amino acid usage and protein expression levels in the flatworm Schistosoma mansoni. Mol Biochem Parasitol 2023; 255:111581. [PMID: 37478919 DOI: 10.1016/j.molbiopara.2023.111581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 07/10/2023] [Accepted: 07/17/2023] [Indexed: 07/23/2023]
Abstract
Schistosoma mansoni is a parasitic flatworm that causes a human disease called schistosomiasis, or bilharzia. At the genomic level, S. mansoni is AT-rich, but has some compositional heterogeneity. Indeed, some regions of its genome are GC-rich, mainly in the regions located near the extreme ends of the chromosomes. Recently, we showed that, despite the strong bias towards A/T ending codons, highly expressed genes tend to use GC-rich codons. Here, we address the following question: are highly expressed sequences biased in their amino acid frequencies? Our analyses show that these sequences in S. mansoni, as in species ranging from bacteria to human, are strongly biased in nucleotide composition. Highly expressed genes tend to use GC-rich codons (in the first and second codon positions), which code for the energetically cheapest amino acids. Therefore, we conclude that amino acid usage, at least in highly expressed genes, is strongly shaped by natural selection to avoid energetically expensive residues. Whether this is an adaptation to the parasitic way of life of S. mansoni is unclear, since the same pattern occurs in free-living species.
Affiliation(s)
- Guillermo Lamolle
- Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay
- Andrés Iriarte
- Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay; Laboratorio de Biología Computacional, Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Avenida A. Navarro 3051, 11600 Montevideo, Uruguay
- Diego Simón
- Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay; Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Universidad de la República, Mataojo 2055, 11400 Montevideo, Uruguay; Laboratorio de Evolución Experimental de Virus, Institut Pasteur de Montevideo, Mataojo 2020, 11400 Montevideo, Uruguay
- Héctor Musto
- Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay.
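The abstract above refers to GC content at the first and second codon positions. As a rough illustration of that quantity only, here is a minimal Python sketch. It is not taken from the paper: the function name `gc_by_codon_position` and the toy sequence are hypothetical, and the sketch assumes an in-frame coding sequence with unambiguous bases.

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (hypothetical helper,
    not the authors' code). A trailing partial codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete final codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")

    counts = [0, 0, 0]  # G or C counts at positions 1, 2, 3
    for i in range(usable):
        if seq[i] in "GC":
            counts[i % 3] += 1

    gc1, gc2, gc3 = (c / n_codons for c in counts)
    # GC12 is the mean of the two positions that mostly determine the amino acid
    return {"GC1": gc1, "GC2": gc2, "GC3": gc3, "GC12": (gc1 + gc2) / 2}


# Toy example: codons ATG GCC AAA TAA
result = gc_by_codon_position("ATGGCCAAATAA")
print(result)
```

In a comparison like the one described, such values would be computed per gene and then contrasted between highly and lowly expressed genes; how expression is measured is outside the scope of this sketch.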
2
Lamolle G, Simón D, Iriarte A, Musto H. Main Factors Shaping Amino Acid Usage Across Evolution. J Mol Evol 2023:10.1007/s00239-023-10120-5. [PMID: 37264211 DOI: 10.1007/s00239-023-10120-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 05/17/2023] [Indexed: 06/03/2023]
Abstract
The standard genetic code determines that in most species, including viruses, there are 20 amino acids that are coded by 61 codons, while the other three codons are stop triplets. Considering the whole proteome, each species features its own amino acid frequencies; given the slow rate of change, closely related species display similar GC content and amino acid usage. In contrast, distantly related species display different amino acid frequencies. Furthermore, within certain multicellular species, such as mammals, intragenomic differences in the usage of amino acids are evident. In this communication, we shall summarize some of the most prominent and well-established factors that determine the differences found in amino acid usage, both across evolution and intragenomically.
Affiliation(s)
- Guillermo Lamolle
- Laboratorio de Genómica Evolutiva, Facultad de Ciencias, Universidad de La República, Montevideo, Uruguay
- Diego Simón
- Laboratorio de Genómica Evolutiva, Facultad de Ciencias, Universidad de La República, Montevideo, Uruguay
- Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de La República, Montevideo, Uruguay
- Laboratorio de Evolución Experimental de Virus, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Andrés Iriarte
- Laboratorio de Genómica Evolutiva, Facultad de Ciencias, Universidad de La República, Montevideo, Uruguay
- Laboratorio de Biología Computacional, Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de La República, Montevideo, Uruguay
- Héctor Musto
- Laboratorio de Genómica Evolutiva, Facultad de Ciencias, Universidad de La República, Montevideo, Uruguay.
3
Füssy Z, Vinopalová M, Treitli SC, Pánek T, Smejkalová P, Čepička I, Doležal P, Hampl V. Retortamonads from vertebrate hosts share features of anaerobic metabolism and pre-adaptations to parasitism with diplomonads. Parasitol Int 2021; 82:102308. [PMID: 33626397 PMCID: PMC7985675 DOI: 10.1016/j.parint.2021.102308] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/30/2020] [Revised: 01/26/2021] [Accepted: 02/11/2021] [Indexed: 12/17/2022]
Abstract
Although the mitochondria of extant eukaryotes share a single origin, functionally these organelles diversified to a great extent, reflecting lifestyles of the organisms that host them. In anaerobic protists of the group Metamonada, mitochondria are present in reduced forms (also termed hydrogenosomes or mitosomes) and a complete loss of mitochondrion in Monocercomonoides exilis (Metamonada:Preaxostyla) has also been reported. Within metamonads, retortamonads from the gastrointestinal tract of vertebrates form a sister group to parasitic diplomonads (e.g. Giardia and Spironucleus) and have also been hypothesized to completely lack mitochondria. We obtained transcriptomic data from Retortamonas dobelli and R. caviae and searched for enzymes of the core metabolism as well as mitochondrion- and parasitism-related proteins. Our results indicate that retortamonads have a streamlined metabolism lacking pathways for metabolites they are probably capable of obtaining from prey bacteria or their environment, reminiscent of the biochemical arrangement in other metamonads. Retortamonads were surprisingly found to encode homologs of components of Giardia's remarkable ventral disk, as well as homologs of regulatory NEK kinases and secreted lytic enzymes known for involvement in host colonization by Giardia. These can be considered pre-adaptations of these intestinal microorganisms to parasitism. Furthermore, we found traces of the mitochondrial metabolism represented by iron‑sulfur cluster assembly subunits, subunits of mitochondrial translocation and chaperone machinery and, importantly, [FeFe]‑hydrogenases and hydrogenase maturases (HydE, HydF and HydG). Altogether, our results strongly suggest that a remnant mitochondrion is still present.
Affiliation(s)
- Zoltán Füssy
- Charles University, Faculty of Science, Department of Parasitology, BIOCEV, Vestec, Czech Republic.
- Martina Vinopalová
- Charles University, Faculty of Science, Department of Parasitology, BIOCEV, Vestec, Czech Republic
- Tomáš Pánek
- Charles University, Faculty of Science, Department of Zoology, Prague, Czech Republic
- Pavla Smejkalová
- Charles University, Faculty of Science, Department of Parasitology, BIOCEV, Vestec, Czech Republic; Charles University, Faculty of Science, Department of Parasitology, Prague, Czech Republic
- Ivan Čepička
- Charles University, Faculty of Science, Department of Zoology, Prague, Czech Republic
- Pavel Doležal
- Charles University, Faculty of Science, Department of Parasitology, BIOCEV, Vestec, Czech Republic
- Vladimír Hampl
- Charles University, Faculty of Science, Department of Parasitology, BIOCEV, Vestec, Czech Republic.
4
Abstract
Darwin's theory of evolution emphasized that positive selection of functional proficiency provides the fitness that ultimately determines the structure of life, a view that has dominated biochemical thinking of enzymes as perfectly optimized for their specific functions. The 20th-century modern synthesis, structural biology, and the central dogma explained the machinery of evolution, and nearly neutral theory explained how selection competes with random fixation dynamics that produce molecular clocks essential e.g. for dating evolutionary histories. However, quantitative proteomics revealed that selection pressures not relating to optimal function play much larger roles than previously thought, acting perhaps most importantly via protein expression levels. This paper first summarizes recent progress in the 21st century toward recovering this universal selection pressure. Then, the paper argues that proteome cost minimization is the dominant, underlying 'non-function' selection pressure controlling most of the evolution of already functionally adapted living systems. A theory of proteome cost minimization is described and argued to have consequences for understanding evolutionary trade-offs, aging, cancer, and neurodegenerative protein-misfolding diseases.
5
Dohra H, Fujishima M, Suzuki H. Analysis of amino acid and codon usage in Paramecium bursaria. FEBS Lett 2015; 589:3113-8. [PMID: 26341535 DOI: 10.1016/j.febslet.2015.08.033] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/29/2015] [Revised: 08/20/2015] [Accepted: 08/21/2015] [Indexed: 01/28/2023]
Abstract
The ciliate Paramecium bursaria harbors symbionts of the green alga Chlorella. We reassembled the P. bursaria transcriptome to minimize falsely fused transcripts, and investigated amino acid and codon usage using the transcriptome data. Surface proteins preferentially use smaller amino acid residues like cysteine. Unusual synonymous codon and amino acid usage in highly expressed genes can reflect a balance between translational selection and other factors. A correlation of gene expression level with synonymous codon or amino acid usage is emphasized in genes down-regulated in symbiont-bearing cells compared to symbiont-free cells. Our results imply that the selection is associated with P. bursaria-Chlorella symbiosis.
Affiliation(s)
- Hideo Dohra
- Instrumental Research Support Office, Research Institute of Green Science and Technology, Shizuoka University, 836 Ohya, Suruga-ku, Shizuoka 422-8529, Japan; Department of Biological Science, Graduate School of Science, Shizuoka University, 836 Ohya, Suruga-ku, Shizuoka 422-8529, Japan
- Masahiro Fujishima
- Department of Environmental Science and Engineering, Graduate School of Science and Engineering, Yamaguchi University, 1677-1 Yoshida, Yamaguchi 753-8512, Japan; National Bio-Resource Project of Japan Agency for Medical Research and Development, Japan
- Haruo Suzuki
- Department of Environmental Science and Engineering, Graduate School of Science and Engineering, Yamaguchi University, 1677-1 Yoshida, Yamaguchi 753-8512, Japan.
6
Kepp KP, Dasmeh P. A model of proteostatic energy cost and its use in analysis of proteome trends and sequence evolution. PLoS One 2014; 9:e90504. [PMID: 24587382 PMCID: PMC3938754 DOI: 10.1371/journal.pone.0090504] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 11/08/2013] [Accepted: 02/03/2014] [Indexed: 12/25/2022]
Abstract
A model of proteome-associated chemical energetic costs of cells is derived from protein-turnover kinetics and protein folding. Minimization of the proteostatic maintenance cost can explain a range of trends of proteomes and combines both protein function, stability, size, proteostatic cost, temperature, resource availability, and turnover rates in one simple framework. We then explore the ansatz that the chemical energy remaining after proteostatic maintenance is available for reproduction (or cell division) and thus, proportional to organism fitness. Selection for lower proteostatic costs is then shown to be significant vs. typical effective population sizes of yeast. The model explains and quantifies evolutionary conservation of highly abundant proteins as arising both from functional mutations and from changes in other properties such as stability, cost, or turnover rates. We show that typical hypomorphic mutations can be selected against due to increased cost of compensatory protein expression (both in the mutated gene and in related genes, i.e. epistasis) rather than compromised function itself, although this compensation depends on the protein's importance. Such mutations exhibit larger selective disadvantage in abundant, large, synthetically costly, and/or short-lived proteins. Selection against increased turnover costs of less stable proteins rather than misfolding toxicity per se can explain equilibrium protein stability distributions, in agreement with recent findings in E. coli. The proteostatic selection pressure is stronger at low metabolic rates (i.e. scarce environments) and in hot habitats, explaining proteome adaptations towards rough environments as a question of energy. The model may also explain several trade-offs observed in protein evolution and suggests how protein properties can coevolve to maintain low proteostatic cost.
Affiliation(s)
- Kasper P. Kepp
- Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
- Pouria Dasmeh
- Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
7
Iriarte A, Baraibar JD, Diana L, Castro-Sowinski S, Romero H, Musto H. Trends in amino acid usage across the class Mollicutes. J Biomol Struct Dyn 2014; 32:65-74. [DOI: 10.1080/07391102.2012.748636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
8
Raiford DW, Heizer EM, Miller RV, Doom TE, Raymer ML, Krane DE. Metabolic and translational efficiency in microbial organisms. J Mol Evol 2012; 74:206-16. [PMID: 22538926 DOI: 10.1007/s00239-012-9500-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 07/29/2011] [Accepted: 04/05/2012] [Indexed: 11/25/2022]
Abstract
Metabolic efficiency, as a selective force shaping proteomes, has been shown to exist in Escherichia coli and Bacillus subtilis and in a small number of organisms with photoautotrophic and thermophilic lifestyles. Earlier attempts at larger-scale analyses have utilized proxies (such as molecular weight) for biosynthetic cost, and did not consider lifestyle or auxotrophy. This study extends the analysis to all currently sequenced microbial organisms that are amenable to these analyses while utilizing lifestyle specific amino acid biosynthesis pathways (where possible) to determine protein production costs and compensating for auxotrophy. The tendency for highly expressed proteins (with adherence to codon usage bias as a proxy for expressivity) to utilize less biosynthetically expensive amino acids is taken as evidence of cost selection. A comprehensive analysis of sequenced genomes to identify those that exhibit strong translational efficiency bias (389 out of 1,700 sequenced organisms) is also presented.
Affiliation(s)
- Douglas W Raiford
- Department of Computer Science, University of Montana, Missoula, MT, USA.
9
Synonymous codon usage analysis of thirty two mycobacteriophage genomes. Adv Bioinformatics 2010:316936. [PMID: 20150956 PMCID: PMC2817497 DOI: 10.1155/2009/316936] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 07/30/2009] [Accepted: 10/27/2009] [Indexed: 11/17/2022]
Abstract
Synonymous codon usage of protein coding genes of thirty two completely sequenced mycobacteriophage genomes was studied using multivariate statistical analysis. One of the major factors influencing codon usage is identified to be compositional bias. Codons ending with either C or G are preferred in highly expressed genes, among which C ending codons are highly preferred over G ending codons. A strong negative correlation between effective number of codons (Nc) and GC3s content was also observed, showing that the codon usage was affected by gene nucleotide composition. Translational selection is also identified to play a role in shaping the codon usage, operative at the level of translational accuracy. A high level of heterogeneity is seen among and between the genomes. Length of genes is also identified to influence the codon usage in 11 out of 32 phage genomes. Mycobacteriophage Cooper is identified to be the highly biased genome with better translation efficiency comparing well with the host specific tRNA genes.
10
Analysis of synonymous codon usage in the UL24 gene of duck enteritis virus. Virus Genes 2008; 38:96-103. [PMID: 18958612 DOI: 10.1007/s11262-008-0295-0] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Received: 05/26/2008] [Accepted: 10/09/2008] [Indexed: 10/21/2022]
Abstract
Analysis of the codon usage bias of the UL24 gene of duck enteritis virus (DEV) may improve our understanding of the evolution and pathogenesis of DEV and provide a basis for understanding the relevant mechanism for biased usage of synonymous codons and for selecting appropriate expression systems to improve the expression of target genes. The codon usage bias of the UL24 genes of DEV and 27 reference herpesviruses was analyzed. The results showed that codon usage in the UL24 gene of DEV was strongly biased toward synonymous codons with A and T at the third codon position. A high level of diversity in codon usage bias existed, and the effective number of codons used in a gene plot revealed that the genetic heterogeneity in the UL24 gene of herpesviruses was constrained by the G + C content. The phylogenetic analysis suggested that DEV was evolutionarily closer to Alphaherpesvirinae and that there was no significant deviation in codon usage in different virus strains. There were 20 codons showing distinct usage differences between DEV and Escherichia coli, 23 between DEV and Homo sapiens, but only 16 codons between DEV and yeast. Therefore the yeast expression system may be more suitable for the expression of DEV genes.
11
Sabbía V, Piovani R, Naya H, Rodríguez-Maseda H, Romero H, Musto H. Trends of amino acid usage in the proteins from the human genome. J Biomol Struct Dyn 2007; 25:55-9. [PMID: 17676938 DOI: 10.1080/07391102.2007.10507155] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 10/28/2022]
Abstract
Correspondence analysis of amino acid usage was applied to 14,815 complete proteins from the human genome. We found that three major factors influence the variability of amino acid composition of these proteins, explaining, respectively, 20.4%, 14.7%, and 9.9% of the total variability. The first trend is strongly correlated with the GC content of first and second codon positions and is also significantly correlated with the GC level of the corresponding flanking regions and introns. Therefore, the main force shaping amino acid usage among human proteins is the compositional constraints determined by the isochore in which each gene is embedded. The second trend correlates with the hydropathy of each protein and with the frequency of beta-strands. Finally, the third trend is strongly associated with the usage of Cys and the frequency of alpha-helices.
Affiliation(s)
- Víctor Sabbía
- Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Iguá 4225, Montevideo 11400, Uruguay
12
Swire J. Selection on synthesis cost affects interprotein amino acid usage in all three domains of life. J Mol Evol 2007; 64:558-71. [PMID: 17476453 DOI: 10.1007/s00239-006-0206-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Received: 09/09/2006] [Accepted: 01/02/2007] [Indexed: 11/27/2022]
Abstract
Most investigations of the forces shaping protein evolution have focused on protein function. However, cells are typically 50%-75% protein by dry weight, with protein expression levels distributed over five orders of magnitude. Cells may, therefore, be under considerable selection pressure to incorporate amino acids that are cheap to synthesize into proteins that are highly expressed. Such selection pressure has been demonstrated to alter amino acid usage in a few organisms, but whether "cost selection" is a general phenomenon remains unknown. One reason for this is that reliable protein expression level data is not available for most organisms. Accordingly, I have developed a new method for detecting cost selection. This method depends solely on interprotein gradients in amino acid usage. Applying it to an analysis of 43 whole genomes from all three domains of life, I show that selection on the synthesis cost of amino acids is a pervasive force in shaping the composition of proteins. Moreover, some amino acids have different price tags for different organisms--the cost of amino acids is changed for organisms living in hydrothermal vents compared with those living at the sea surface or for organisms that have difficulty acquiring elements such as nitrogen compared with those that do not--so I also investigated whether differences between organisms in amino acid usage might reflect differences in synthesis or acquisition costs. The results suggest that organisms evolve to alter amino acid usage in response to environmental conditions.
Affiliation(s)
- Jonathan Swire
- Centre for Bioinformatics, Division of Molecular Biosciences, Faculty of Life Sciences, Imperial College London, London, SW7 2AZ, UK.
13
Sau K, Gupta SK, Sau S, Mandal SC, Ghosh TC. Factors influencing synonymous codon and amino acid usage biases in Mimivirus. Biosystems 2006; 85:107-13. [PMID: 16442213 DOI: 10.1016/j.biosystems.2005.12.004] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 08/18/2005] [Revised: 12/05/2005] [Accepted: 12/17/2005] [Indexed: 10/25/2022]
Abstract
Synonymous codon and amino acid usage biases have been investigated in 903 Mimivirus protein-coding genes in order to understand the architecture and evolution of the Mimivirus genome. As expected for an AT-rich genome, third codon positions of the synonymous codons of Mimivirus carry mostly A or T bases. It was found that codon usage bias in Mimivirus genes is dictated both by mutational pressure and translational selection. The evidence shows that four factors, namely mean molecular weight (MMW), hydropathy, aromaticity and cysteine content, are mostly responsible for the variation of amino acid usage in Mimivirus proteins. Based on our observations, we suggest that genes involved in translation, DNA repair, protein folding, etc., were laterally transferred to Mimivirus long ago from living organisms and that, with time, these genes acquired the codon usage pattern of other Mimivirus genes under selection pressure.
Affiliation(s)
- K Sau
- Department of Biotechnology, Haldia Institute of Technology, Haldia, India
14
Banerjee T, Ghosh TC. Gene expression level shapes the amino acid usages in Prochlorococcus marinus MED4. J Biomol Struct Dyn 2006; 23:547-54. [PMID: 16494504 DOI: 10.1080/07391102.2006.10507079] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/28/2022]
Abstract
Prochlorococcus species are the first example of free-living bacteria with a reduced genome. Codon and amino acid usage bias of Prochlorococcus marinus MED4 was investigated using all protein coding genes having length greater than or equal to 100 amino acids. Correspondence analysis on relative synonymous codon usage (RSCU) values shows that there is no influence of translational selection in shaping the codon usage variation among the genes in this organism. However, amino acid usage was markedly different between the highly and lowly expressed genes in this organism and, in particular, GC-rich amino acids were found to occur significantly more often in highly expressed genes than in lowly expressed genes. Comparative analysis of the homologous genes of Synechococcus sp. WH8102 and Prochlorococcus marinus MED4 shows that amino acid conservation in highly expressed genes is significantly higher than in lowly expressed genes. Based on our results we concluded that conservation of GC-rich amino acids in the highly expressed genes relative to the ancestor is the major source of variation in amino acid usage in this organism.
Affiliation(s)
- T Banerjee
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
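The abstract above relies on relative synonymous codon usage (RSCU) values. As an illustration of the standard definition only (observed count of a codon divided by the mean count of all synonymous codons for the same amino acid), here is a minimal Python sketch. It is not the authors' code: the function name `rscu` and the toy sequence are hypothetical, and the sketch assumes the standard genetic code and an in-frame coding sequence.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1 layout)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """Return RSCU values for codons of amino acids observed in `cds`.

    Hypothetical helper for illustration. Stop codons and single-codon
    amino acids (Met, Trp) are skipped, as are amino acids that never
    occur in the sequence.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete final codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by amino acid
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for c in family:
            # observed count divided by the family's mean count
            values[c] = counts[c] * len(family) / total
    return values


# Toy example: four alanine codons (GCT, GCT, GCC, GCA)
print(rscu("GCTGCTGCCGCA"))
```

A value of 1.0 means a codon is used as often as expected under equal usage of its synonyms; in a study like the one described, a matrix of such per-gene values would be the input to correspondence analysis.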
15
Das S, Pan A, Paul S, Dutta C. Comparative Analyses of Codon and Amino Acid Usage in Symbiotic Island and Core Genome in Nitrogen-Fixing Symbiotic Bacterium Bradyrhizobium japonicum. J Biomol Struct Dyn 2005; 23:221-32. [PMID: 16060695 DOI: 10.1080/07391102.2005.10507061] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 10/28/2022]
Abstract
Genes involved in the symbiotic interactions between the nitrogen-fixing endosymbiont Bradyrhizobium japonicum, and its leguminous host are mostly clustered in a symbiotic island (SI), acquired by the bacterium through a process of horizontal transfer. A comparative analysis of the codon and amino acid usage in core and SI genes/proteins of B. japonicum has been carried out in the present study. The mutational bias, translational selection, and gene length are found to be the major sources of variation in synonymous codon usage in the core genome as well as in SI, the strength of translational selection being higher in core genes than in SI. In core proteins, hydrophobicity is the main source of variation in amino acid usage, expressivity and aromaticity being the second and third important sources. But in SI proteins, aromaticity is the chief source of variation, followed by expressivity and hydrophobicity. In SI proteins, both the mean molecular weight and mean aromaticity of individual proteins exhibit significant positive correlation with gene expressivity, which violate the cost-minimization hypothesis. Investigation of nucleotide substitution patterns in B. japonicum and Mesorhizobium loti orthologous genes reveals that both synonymous and non-synonymous sites of highly expressed genes are more conserved than their lowly expressed counterparts and this conservation is more pronounced in the genes present in core genome than in SI.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Centre, Indian Institute of Chemical Biology, 4 Raja SC Mullick Road, Kolkata 700 032, India
16
Naya H, Gianola D, Romero H, Urioste JI, Musto H. Inferring Parameters Shaping Amino Acid Usage in Prokaryotic Genomes via Bayesian MCMC Methods. Mol Biol Evol 2005; 23:203-11. [PMID: 16162860 DOI: 10.1093/molbev/msj023] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
Molar content of guanine plus cytosine (G + C) and optimal growth temperature (OGT) are main factors characterizing the frequency distribution of amino acids in prokaryotes. Previous work, using multivariate exploratory methods, has emphasized ascertainment of biological factors underlying variability between genomes, but the strength of each identified factor on amino acid content has not been quantified. We combine the flexibility of the phylogenetic mixed model (PMM) with the power of Bayesian inference via Markov Chain Monte Carlo (MCMC) methods, to obtain a novel evolutionary picture of amino acid usage in prokaryotic genomes. We implement a Bayesian PMM which incorporates the feature that evolutionary history makes observed data interdependent. As in previous studies with PMM, we present a variance partition; however, attention is also given to the posterior distribution of "systematic effects" that may shed light about the relative importance of and relationships between evolutionary forces acting at the genomic level. In particular, we analyzed influences of G + C, OGT, and respiratory metabolism. Estimates of G + C effects were significant for amino acids coded by G + C or molar content of adenine plus thymine (A + T) in first and second bases. OGT had an important effect on 12 amino acids, probably reflecting complex patterns of protein modifications, to cope with varying environments. The effect of respiratory metabolism was less clear, probably due to the already reported association of G + C with aerobic metabolism. A "heritability" parameter was always high and significant, reinforcing the importance of accommodating phylogenetic relationships in these analyses. "Heritable" component correlations displayed a pattern that tended to cluster "pure" G + C (A + T) in first and second codon positions, suggesting an inherited departure from linear regression on G + C.
Affiliation(s)
- Hugo Naya
- Laboratorio de Organización y Evolución del Genoma, Departamento de Biología Celular y Molecular, Facultad de Ciencias, Montevideo, Uruguay.
17
Sau K, Sau S, Mandal SC, Ghosh TC. Factors influencing the synonymous codon and amino acid usage bias in AT-rich Pseudomonas aeruginosa phage PhiKZ. Acta Biochim Biophys Sin (Shanghai) 2005; 37:625-33. [PMID: 16143818 PMCID: PMC7109957 DOI: 10.1111/j.1745-7270.2005.00089.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
To reveal how the AT-rich genome of bacteriophage PhiKZ has been shaped in order to carry out its growth in the GC-rich host Pseudomonas aeruginosa, synonymous codon and amino acid usage bias of PhiKZ was investigated and the data were compared with those of P. aeruginosa. It was found that synonymous codon and amino acid usage of PhiKZ was distinct from that of P. aeruginosa. In contrast to P. aeruginosa, the third codon position of the synonymous codons of PhiKZ carries mostly an A or T base; codon usage bias in PhiKZ is dictated mainly by mutational bias and, to a lesser extent, by translational selection. A cluster analysis of the relative synonymous codon usage values of 16 myoviruses including PhiKZ shows that PhiKZ is evolutionarily much closer to Escherichia coli phage T4. Further analysis reveals that the three factors of mean molecular weight, aromaticity and cysteine content are mostly responsible for the variation of amino acid usage in PhiKZ proteins, whereas amino acid usage of P. aeruginosa proteins is mainly governed by grand average of hydropathicity, aromaticity and cysteine content. Based on these observations, we suggest that codons of the phage-like PhiKZ have evolved to preferentially incorporate the smaller amino acid residues into their proteins during translation, thereby economizing the cost of its development in GC-rich P. aeruginosa.
Affiliation(s)
- K. Sau
- Department of Mathematics, Jadavpur University, Calcutta 700 032, India
- S. Sau
- Department of Biochemistry, Bose Institute, P1/12-CIT Scheme VII M, Calcutta 700 054, India
- S. C. Mandal
- Department of Mathematics, Jadavpur University, Calcutta 700 032, India
- Corresponding authors: S. C. MANDAL: E-mail,
- T. C. Ghosh
- Bioinformatics Centre, Bose Institute, P1/12-CIT Scheme VII M, Calcutta 700 054, India
- T. C. GHOSH: Tel, +91-33-2334 6626; Fax, +91-33-2334 3886; E-mail,
|
18
|
Chanda I, Pan A, Dutta C. Proteome composition in Plasmodium falciparum: higher usage of GC-rich nonsynonymous codons in highly expressed genes. J Mol Evol 2005; 61:513-23. [PMID: 16044241 DOI: 10.1007/s00239-005-0023-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 01/27/2005] [Accepted: 04/19/2005] [Indexed: 10/25/2022]
Abstract
The parasite Plasmodium falciparum, responsible for the most deadly form of human malaria, has one of the most AT-rich genomes sequenced so far and is known to possess many atypical characteristics. Using multivariate statistical approaches, the present study analyzes the amino acid usage pattern in 5038 annotated protein-coding sequences in P. falciparum clone 3D7. The amino acid composition of individual proteins, though dominated by the directional mutational pressure, exhibits wide variation across the proteome. The Asn content, expression level, mean molecular weight, hydropathy, and aromaticity are found to be the major sources of variation in amino acid usage. At all stages of development, frequencies of residues encoded by GC-rich codons such as Gly, Ala, Arg, and Pro increase significantly in the products of the highly expressed genes. Investigation of nucleotide substitution patterns in P. falciparum and other Plasmodium species reveals that the nonsynonymous sites of highly expressed genes are more conserved than those of the lowly expressed ones, though for synonymous sites, the reverse is true. The highly expressed genes are, therefore, expected to be closer to their putative ancestral state in amino acid composition, and a plausible reason for their sequences being GC-rich at nonsynonymous codon positions could be that their ancestral state was less AT-biased. Negative correlation of the expression level of proteins with respective molecular weights supports the notion that P. falciparum, in spite of its intracellular parasitic lifestyle, follows the principle of cost minimization.
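Two of the per-protein quantities named in this abstract can be sketched in a few lines: the fraction of residues encoded by GC-rich codons (Gly, Ala, Arg, Pro, as listed above) and aromaticity, taken here as the fraction of Phe, Trp and Tyr, which is the common definition and is assumed rather than confirmed to be the one the authors used. The helper name `residue_fraction` and the toy sequences are hypothetical.

```python
# One-letter codes; both residue sets are assumptions stated in the lead-in.
GC_RICH_RESIDUES = frozenset("GARP")  # Gly, Ala, Arg, Pro
AROMATIC_RESIDUES = frozenset("FWY")  # Phe, Trp, Tyr


def residue_fraction(protein, residues):
    """Fraction of a protein's residues that belong to `residues`.

    Non-letter characters (for example a trailing '*') are ignored.
    Returns 0.0 for an empty sequence.
    """
    seq = [aa for aa in protein.upper() if aa.isalpha()]
    if not seq:
        return 0.0
    return sum(aa in residues for aa in seq) / len(seq)


gc_rich_share = residue_fraction("GARPNNNN", GC_RICH_RESIDUES)
aromaticity = residue_fraction("MFWY*", AROMATIC_RESIDUES)
```

Comparing such fractions between highly and lowly expressed genes would additionally require expression data and a statistical test, neither of which is shown here.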
Affiliation(s)
- Ipsita Chanda
- Human Genetics & Genomics Group, Indian Institute of Chemical Biology, Kolkata 700032, India
|
19
|
Naya H, Zavala A, Romero H, Rodríguez-Maseda H, Musto H. Correspondence analysis of amino acid usage within the family Bacillaceae. Biochem Biophys Res Commun 2005; 325:1252-7. [PMID: 15555561 DOI: 10.1016/j.bbrc.2004.10.170] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 10/25/2004] [Indexed: 11/30/2022]
Abstract
When the amino acid usage of all completely sequenced prokaryotes is studied by multivariate analysis (MVA), the genomic molar content of guanine plus cytosine (GC) and the optimal growth temperature (Topt) are known to have a dominant effect. Furthermore, these two factors are associated with the first two axes of different MVAs and are thus nearly independent of each other. However, it was recently shown that for several families of prokaryotes there are significant and positive correlations between GC and Topt. This trend is particularly clear within Bacillaceae, where there are species displaying a broad range of variation for these two factors. In this paper we report that (a) Topt and genomic GC are the main factors shaping amino acid usage but are not independent of each other, (b) the usage of cysteine is the second source of variability, and finally (c) the global hydrophobicity of the encoded proteins of each species is the third main factor.
Affiliation(s)
- Hugo Naya
- Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Iguá 4225, Montevideo 11400, Uruguay
|
20
|
Abstract
The primary structures of peptides may be adapted for efficient synthesis as well as proper function. Here, the Saccharomyces cerevisiae genome sequence, DNA microarray expression data, tRNA gene numbers, and functional categorizations of proteins are employed to determine whether the amino acid composition of peptides reflects natural selection to optimize the speed and accuracy of translation. Strong relationships between synonymous codon usage bias and estimates of transcript abundance suggest that DNA array data serve as adequate predictors of translation rates. Amino acid usage also shows striking relationships with expression levels. Stronger correlations between tRNA concentrations and amino acid abundances among highly expressed proteins than among less abundant proteins support adaptation of both tRNA abundances and amino acid usage to enhance the speed and accuracy of protein synthesis. Natural selection for efficient synthesis appears to also favor shorter proteins as a function of their expression levels. Comparisons restricted to proteins within functional classes are employed to control for differences in amino acid composition and protein size that reflect differences in the functional requirements of proteins expressed at different levels.
Affiliation(s)
- Hiroshi Akashi
- Institute of Molecular Evolutionary Genetics and Department of Biology, 208 Mueller Laboratory, Pennsylvania State University, University Park, PA 16802, USA.
|