1
Felipe Benites L, Stephens TG, Van Etten J, James T, Christian WC, Barry K, Grigoriev IV, McDermott TR, Bhattacharya D. Hot springs viruses at Yellowstone National Park have ancient origins and are adapted to thermophilic hosts. Commun Biol 2024; 7:312. [PMID: 38594478 PMCID: PMC11003980 DOI: 10.1038/s42003-024-05931-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 02/16/2024] [Indexed: 04/11/2024] Open
Abstract
Geothermal springs house unicellular red algae in the class Cyanidiophyceae that dominate the microbial biomass at these sites. Little is known about host-virus interactions in these environments. We analyzed the virus community associated with red algal mats in three neighboring habitats (creek, endolithic, soil) at Lemonade Creek, Yellowstone National Park (YNP), USA. We find that despite proximity, each habitat houses a unique collection of viruses, with the giant viruses, Megaviricetes, dominant in all three. The early branching phylogenetic position of genes encoded on metagenome assembled virus genomes (vMAGs) suggests that the YNP lineages are of ancient origin and not due to multiple invasions from mesophilic habitats. The existence of genomic footprints of adaptation to thermophily in the vMAGs is consistent with this idea. The Cyanidiophyceae at geothermal sites originated ca. 1.5 Bya and are therefore relevant to understanding biotic interactions on the early Earth.
Affiliation(s)
- L Felipe Benites
  - Department of Biochemistry and Microbiology, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
- Timothy G Stephens
  - Department of Biochemistry and Microbiology, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
- Julia Van Etten
  - Department of Biochemistry and Microbiology, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
  - Graduate Program in Ecology and Evolution, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
- Timeeka James
  - Department of Biochemistry and Microbiology, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
- William C Christian
  - Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, Montana, USA
- Kerrie Barry
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Igor V Grigoriev
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, 94720, USA
- Timothy R McDermott
  - Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana, USA
- Debashish Bhattacharya
  - Department of Biochemistry and Microbiology, Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
2
Hu EZ, Lan XR, Liu ZL, Gao J, Niu DK. A positive correlation between GC content and growth temperature in prokaryotes. BMC Genomics 2022; 23:110. [PMID: 35139824 PMCID: PMC8827189 DOI: 10.1186/s12864-022-08353-7] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 08/03/2021] [Accepted: 01/31/2022] [Indexed: 01/27/2023] Open
Abstract
Background: GC pairs are generally more stable than AT pairs, so GC-rich genomes were proposed to be better adapted to high temperatures than AT-rich genomes. Previous studies consistently showed positive correlations between growth temperature and the GC contents of structural RNA genes. However, for whole-genome sequences and the silent sites of codons in protein-coding genes, the relationship between GC content and growth temperature has long been debated. Results: With a dataset much larger than those of previous studies (681 bacteria and 155 archaea with completely assembled genomes), our phylogenetic comparative analyses showed positive correlations between optimal growth temperature (Topt) and GC content both in bacterial and archaeal structural RNA genes and in bacterial whole genome sequences, chromosomal sequences, plasmid sequences, core genes, and accessory genes. However, in the 155 archaea, we did not observe a significant positive correlation of Topt with whole-genome GC content (GCw) or GC content at four-fold degenerate sites. We randomly drew 155 samples from the 681 bacteria for 1000 rounds. In most cases (> 95%), the positive correlations between Topt and genomic GC contents became statistically nonsignificant (P > 0.05). This result suggested that small sample sizes might account for the lack of positive correlations between growth temperature and genomic GC content in the 155 archaea and the bacterial samples of previous studies. Comparing the GC content among four categories (psychrophiles/psychrotrophiles, mesophiles, thermophiles, and hyperthermophiles) also revealed a positive correlation between GCw and growth temperature in bacteria. By including the GCw of incompletely assembled genomes, we expanded the sample size of archaea to 303. Positive correlations between GCw and Topt appear especially after excluding the halophilic archaea, whose GC contents might be strongly shaped by intense UV radiation.
Conclusions: This study explains the previous contradictory observations and ends a long debate. Prokaryotes growing at high temperatures have higher GC contents. Thermal adaptation is one possible explanation for the positive association. Meanwhile, we propose that the elevated efficiency of DNA repair in response to heat mutagenesis might have the by-product of increasing GC content, as happens in intracellular symbionts and marine bacterioplankton.
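The two composition measures named in this abstract, whole-genome GC content (GCw) and GC content at four-fold degenerate sites, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, the standard genetic code is assumed, and the coding sequence is assumed to be in frame from its first base.

```python
# Codon prefixes whose third position is four-fold degenerate in the
# standard genetic code (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN,
# Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc_content(seq: str) -> float:
    """Return the G+C fraction of a DNA sequence, ignoring ambiguous bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


def gc_fourfold(cds: str) -> float:
    """Return the G+C fraction at four-fold degenerate third codon positions.

    Assumes `cds` is an in-frame coding sequence (hypothetical input).
    """
    cds = cds.upper()
    thirds = [
        cds[i + 2]
        for i in range(0, len(cds) - 2, 3)
        if cds[i:i + 2] in FOURFOLD_PREFIXES and cds[i + 2] in "ACGT"
    ]
    if not thirds:
        raise ValueError("no four-fold degenerate codons found")
    return sum(base in "GC" for base in thirds) / len(thirds)
```

Correlating either value with Topt across species would additionally need a phylogenetic correction, which this sketch does not attempt.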
Affiliation(s)
- En-Ze Hu
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Xin-Ran Lan
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Zhi-Ling Liu
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Jie Gao
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Deng-Ke Niu
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
3
Neutralism versus selectionism: Chargaff's second parity rule, revisited. Genetica 2021; 149:81-88. [PMID: 33880685 PMCID: PMC8057000 DOI: 10.1007/s10709-021-00119-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/04/2021] [Accepted: 04/09/2021] [Indexed: 11/03/2022]
Abstract
Of Chargaff's four "rules" on DNA base frequencies, the functional interpretation of his second parity rule (PR2) is the most contentious. Thermophile base compositions (GC%) were taken by Galtier and Lobry (1997) as favoring Sueoka's neutral PR2 hypothesis over Forsdyke's selective PR2 hypothesis, namely that mutations improving local within-species recombination efficiency had generated a genome-wide potential for the strands of duplex DNA to separate and initiate recombination through the "kissing" of the tips of stem-loops. However, following Chargaff's GC rule, base composition mainly reflects a species-specific, genome-wide, evolutionary pressure. GC% could not have consistently followed the dictates of temperature, since it plays fundamental roles in both sustaining species integrity and, through primarily neutral genome-wide mutation, fostering speciation. Evidence for a local within-species recombination-initiating role of base order was obtained with a novel technology that masked the contribution of base composition to nucleic acid folding energy. Forsdyke's results were consistent with his PR2 hypothesis, appeared to resolve some root problems in biology and provided a theoretical underpinning for alignment-free taxonomic analyses using relative oligonucleotide frequencies (k-mer analysis). Moreover, consistent with Chargaff's cluster rule, discovery of the thermoadaptive role of the "purine-loading" of open reading frames made less tenable the Galtier-Lobry anti-selectionist arguments.
4
Bize A, Midoux C, Mariadassou M, Schbath S, Forterre P, Da Cunha V. Exploring short k-mer profiles in cells and mobile elements from Archaea highlights the major influence of both the ecological niche and evolutionary history. BMC Genomics 2021; 22:186. [PMID: 33726663 PMCID: PMC7962313 DOI: 10.1186/s12864-021-07471-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/13/2020] [Accepted: 02/24/2021] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND: K-mer-based methods have greatly advanced in recent years, largely driven by the realization of their biological significance and by the advent of next-generation sequencing. Their speed and their independence from the annotation process are major advantages. Their utility in the study of the mobilome has recently emerged and they seem a priori adapted to the patchy gene distribution and the lack of universal marker genes of viruses and plasmids. To provide a framework for the interpretation of results from k-mer based methods applied to archaea or their mobilome, we analyzed the 5-mer DNA profiles of close to 600 archaeal cells, viruses and plasmids. Archaea is one of the three domains of life. Archaea seem enriched in extremophiles and are associated with a high diversity of viral and plasmid families, many of which are specific to this domain. We explored the dataset structure by multivariate and statistical analyses, seeking to identify the underlying factors. RESULTS: For cells, the 5-mer profiles were inconsistent with the phylogeny of archaea. At a finer taxonomic level, the influence of the taxonomy and the environmental constraints on 5-mer profiles was very strong. These two factors were interdependent to a significant extent, and the respective weights of their contributions varied according to the clade. A convergent adaptation was observed for the class Halobacteria, for which a strong 5-mer signature was identified. For mobile elements, coevolution with the host had a clear influence on their 5-mer profile. This enabled us to identify one previously known and one new case of recent host transfer based on the atypical composition of the mobile elements involved. Beyond the effect of coevolution, extrachromosomal elements strikingly retain the specific imprint of their own viral or plasmid taxonomic family in their 5-mer profile.
CONCLUSION: This specific imprint confirms that the evolution of extrachromosomal elements is driven by multiple parameters and is not restricted to host adaptation. In addition, we detected only recent host transfer events, suggesting the fast evolution of short k-mer profiles. This calls for caution when using k-mers for host prediction, metagenomic binning or phylogenetic reconstruction.
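As a rough illustration of what a 5-mer DNA profile is, the sketch below counts overlapping k-mers on one strand and normalizes them to relative frequencies. It rests on assumptions not stated in the abstract: the function names are hypothetical, only the given strand is counted (the study may have handled strands differently), and Euclidean distance stands in for the multivariate analyses the authors describe.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 5) -> dict:
    """Return relative frequencies of all 4**k k-mers on the given strand.

    Windows containing ambiguous bases (anything other than A, C, G, T)
    are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is too short or has no unambiguous k-mers")
    return {
        "".join(p): counts["".join(p)] / total
        for p in product("ACGT", repeat=k)
    }


def profile_distance(p: dict, q: dict) -> float:
    """Euclidean distance between two profiles built with the same k."""
    return sum((p[m] - q[m]) ** 2 for m in p) ** 0.5
```

Comparing a virus or plasmid profile with that of its putative host using `profile_distance` gives a simple, if crude, view of the compositional similarity the abstract attributes to coevolution.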
Affiliation(s)
- Ariane Bize
  - Université Paris-Saclay, INRAE, PROSE, F-92761, Antony, France
- Cédric Midoux
  - Université Paris-Saclay, INRAE, PROSE, F-92761, Antony, France
  - Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France
  - Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
- Mahendra Mariadassou
  - Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France
  - Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
- Sophie Schbath
  - Université Paris-Saclay, INRAE, MaIAGE, F-78350, Jouy-en-Josas, France
  - Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, F-78350, Jouy-en-Josas, France
- Patrick Forterre
  - Institut Pasteur, Unité de Virologie des Archées, Département de Microbiologie, 25 Rue du Docteur Roux, 75015, Paris, France
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Violette Da Cunha
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
5
Forsdyke DR. Success of alignment-free oligonucleotide (k-mer) analysis confirms relative importance of genomes not genes in speciation and phylogeny. Biol J Linn Soc Lond 2019. [DOI: 10.1093/biolinnean/blz096] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]
Abstract
The utility of DNA sequence substrings (k-mers) in alignment-free phylogenetic classification, including that of bacteria and viruses, is increasingly recognized. However, its biological basis eludes many 21st century practitioners. A path from the 19th century recognition of the informational basis of heredity to the modern era can be discerned. Crick’s DNA ‘unpairing postulate’ predicted that recombinational pairing of homologous DNAs during meiosis would be mediated by short k-mers in the loops of stem-loop structures extruded from classical duplex helices. The complementary ‘kissing’ duplex loops – like tRNA anticodon–codon k-mer duplexes – would seed a more extensive pairing that would then extend until limited by lack of homology or other factors. Indeed, this became the principle behind alignment-based methods that assessed similarity by degree of DNA–DNA reassociation in vitro. These are now seen as less sensitive than alignment-free methods that are closely consistent, both theoretically and mechanistically, with chromosomal anti-recombination models for the initiation of divergence into new species. The analytical power of k-mer differences supports the theses that evolutionary advance sometimes serves the needs of nucleic acids (genomes) rather than proteins (genes), and that such differences can play a role in early speciation events.
Affiliation(s)
- Donald R Forsdyke
  - Department of Biomedical and Molecular Sciences, Queen’s University, Kingston, Ontario, Canada
6
Peculiarities and biotechnological potential of environmental adaptation by Geobacillus species. Appl Microbiol Biotechnol 2018; 102:10425-10437. [PMID: 30310966 DOI: 10.1007/s00253-018-9422-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 08/28/2018] [Revised: 09/25/2018] [Accepted: 09/26/2018] [Indexed: 12/21/2022]
Abstract
The genus Geobacillus comprises thermophilic bacilli capable of endospore formation. The members of this genus provide thermostable proteins and can be used in whole cell applications at elevated temperatures; therefore, these organisms are of biotechnological importance. While these applications have been described in previous reviews, the present paper highlights the environmental adaptations and genome diversifications of Geobacillus spp. and their applications in evolutionary-protein engineering. Despite their obligate thermophilic properties, Geobacillus spp. are widely distributed in nature. Because several isolates demonstrate remarkable properties for cell reproduction in their respective niches, they seem to exist not only as endospores but also as vegetative cells in diverse environments. This suggests their excellence in environmental adaptation via genome diversification; in fact, evidence suggests that Geobacillus spp. were derived from Bacillus spp. while diversifying their genomes via horizontal gene transfer. Moreover, when subjected to an environmental stressor, Geobacillus spp. diversify their genomes using inductive mutations and transposable elements to produce derivative cells that are adaptive to the stressor. Notably, inductive mutations in Geobacillus spp. occur more rapidly and frequently than the stress-induced mutagenesis observed in other microorganisms. Owing to this, Geobacillus spp. can efficiently generate mutant genes coding for thermostable enzyme variants from the thermolabile enzyme genes under appropriate selection pressures. This phenomenon provides a new approach to generate thermostable enzymes, termed as thermoadaptation-directed enzyme evolution, thereby expanding the biotechnological potentials of Geobacillus spp. In this review, we have discussed this approach using successful examples and major challenges yet to be addressed.
7
Jegousse C, Yang Y, Zhan J, Wang J, Zhou Y. Structural signatures of thermal adaptation of bacterial ribosomal RNA, transfer RNA, and messenger RNA. PLoS One 2017; 12:e0184722. [PMID: 28910383 PMCID: PMC5598986 DOI: 10.1371/journal.pone.0184722] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/23/2017] [Accepted: 08/29/2017] [Indexed: 12/02/2022] Open
Abstract
Temperature adaptation of bacterial RNAs is a subject of both fundamental and practical interest because it will allow a better understanding of molecular mechanism of RNA folding with potential industrial application of functional thermophilic or psychrophilic RNAs. Here, we performed a comprehensive study of rRNA, tRNA, and mRNA of more than 200 bacterial species with optimal growth temperatures (OGT) ranging from 4°C to 95°C. We investigated temperature adaptation at primary, secondary and tertiary structure levels. We showed that unlike mRNA, tRNA and rRNA were optimized for their structures at compositional levels with significant tertiary structural features even for their corresponding randomly permutated sequences. tRNA and rRNA are more exposed to solvent but remain structured for hyperthermophiles with nearly OGT-independent fluctuation of solvent accessible surface area within a single RNA chain. mRNA in hyperthermophiles is essentially the same as random sequences without tertiary structures although many mRNA in mesophiles and psychrophiles have well-defined tertiary structures based on their low overall solvent exposure with clear separation of deeply buried from partly exposed bases as in tRNA and rRNA. These results provide new insight into temperature adaptation of different RNAs.
MESH Headings
- Bacteria/genetics
- Databases, Genetic
- Models, Molecular
- Nucleic Acid Conformation
- RNA Folding/drug effects
- RNA, Bacterial/chemistry
- RNA, Bacterial/drug effects
- RNA, Messenger/chemistry
- RNA, Messenger/drug effects
- RNA, Ribosomal/chemistry
- RNA, Ribosomal/drug effects
- RNA, Transfer/chemistry
- RNA, Transfer/drug effects
- Solvents/pharmacology
- Temperature
Affiliation(s)
- Clara Jegousse
  - UFR Sciences et Techniques, Université de Nantes, 2 rue de la Houssinière, Nantes, France
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
- Yuedong Yang
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
- Jian Zhan
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
- Jihua Wang
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- Yaoqi Zhou
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
8
Seward EA, Kelly S. Dietary nitrogen alters codon bias and genome composition in parasitic microorganisms. Genome Biol 2016; 17:226. [PMID: 27842572 PMCID: PMC5109750 DOI: 10.1186/s13059-016-1087-9] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 07/21/2016] [Accepted: 10/12/2016] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND: Genomes are composed of long strings of nucleotide monomers (A, C, G and T) that are either scavenged from the organism's environment or built from metabolic precursors. The biosynthesis of each nucleotide differs in atomic requirements with different nucleotides requiring different quantities of nitrogen atoms. However, the impact of the relative availability of dietary nitrogen on genome composition and codon bias is poorly understood. RESULTS: Here we show that differential nitrogen availability, due to differences in environment and dietary inputs, is a major determinant of genome nucleotide composition and synonymous codon use in both bacterial and eukaryotic microorganisms. Specifically, low nitrogen availability species use nucleotides that require fewer nitrogen atoms to encode the same genes compared to high nitrogen availability species. Furthermore, we provide a novel selection-mutation framework for the evaluation of the impact of metabolism on gene sequence evolution and show that it is possible to predict the metabolic inputs of related organisms from an analysis of the raw nucleotide sequence of their genes. CONCLUSIONS: Taken together, these results reveal a previously hidden relationship between cellular metabolism and genome evolution and provide new insight into how genome sequence evolution can be influenced by adaptation to different diets and environments.
Affiliation(s)
- Emily A Seward
  - Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK
- Steven Kelly
  - Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK
9
Forsdyke DR. Homostability. Evol Bioinform Online 2016. [DOI: 10.1007/978-3-319-28755-3_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022] Open
10
Brbić M, Warnecke T, Kriško A, Supek F. Global Shifts in Genome and Proteome Composition Are Very Tightly Coupled. Genome Biol Evol 2015; 7:1519-32. [PMID: 25971281 PMCID: PMC4494046 DOI: 10.1093/gbe/evv088] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Accepted: 05/09/2015] [Indexed: 02/05/2023] Open
Abstract
The amino acid composition (AAC) of proteomes differs greatly between microorganisms and is associated with the environmental niche they inhabit, suggesting that these changes may be adaptive. Similarly, the oligonucleotide composition of genomes varies and may confer advantages at the DNA/RNA level. These influences overlap in protein-coding sequences, making it difficult to gauge their relative contributions. We disentangle these effects by systematically evaluating the correspondence between intergenic nucleotide composition, where protein-level selection is absent, the AAC, and ecological parameters of 909 prokaryotes. We find that G + C content, the most frequently used measure of genomic composition, cannot capture diversity in AAC and across ecological contexts. However, di-/trinucleotide composition in intergenic DNA predicts amino acid frequencies of proteomes to the point where very little cross-species variability remains unexplained (91% of variance accounted for). Qualitatively similar results were obtained for 49 fungal genomes, where 80% of the variability in AAC could be explained by the composition of introns and intergenic regions. Upon factoring out oligonucleotide composition and phylogenetic inertia, the residual AAC is poorly predictive of the microbes' ecological preferences, in stark contrast with the original AAC. Moreover, highly expressed genes do not exhibit more prominent environment-related AAC signatures than lowly expressed genes, despite contributing more to the effective proteome. Thus, evolutionary shifts in overall AAC appear to occur almost exclusively through factors shaping the global oligonucleotide content of the genome. We discuss these results in light of contravening evidence from biophysical data and further reading frame-specific analyses that suggest that adaptation takes place at the protein level.
Affiliation(s)
- Maria Brbić
  - Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
  - Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia
- Tobias Warnecke
  - MRC Clinical Sciences Centre, Imperial College, Hammersmith Campus, London, United Kingdom
- Anita Kriško
  - Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia
- Fran Supek
  - Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
  - EMBL/CRG Systems Biology Unit, Centre for Genomic Regulation, Barcelona, Spain
11
Overlapping genes: a new strategy of thermophilic stress tolerance in prokaryotes. Extremophiles 2014; 19:345-53. [PMID: 25503326 DOI: 10.1007/s00792-014-0720-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/09/2014] [Accepted: 12/01/2014] [Indexed: 12/29/2022]
Abstract
Overlapping genes (OGs) have drawn much recent research attention. However, the significance of OGs in prokaryotic genomes remains unexplored. As an adaptation to high temperature, thermophiles were shown to eliminate their intergenic regions. Therefore, prokaryotes might increase their OG content to adapt to high temperature. To test this hypothesis, we carried out a comparative study of OG frequency in 256 prokaryotic genomes comprising both thermophiles and non-thermophiles. Thermophiles exhibit a higher frequency of overlapping genes than non-thermophiles. Moreover, overlap frequency correlates with optimal growth temperature (OGT) in prokaryotes. Long overlap frequency correlates positively with OGT, resulting in an abundance of long overlaps in thermophiles compared to non-thermophiles. On the other hand, short overlap (1-4 nucleotides) frequency (SOF) did not show any direct correlation with OGT. However, the correlations of SOF with CAIavg (extent of variation of codon usage bias measured as the mean of codon adaptation index of all genes in a given genome) and IG% (proportion of intergenic regions) indicate that short overlaps might upregulate the aforementioned factors (CAIavg and IG%), which are already known to be vital forces for thermophilic adaptation. From this evidence, we propose that OG content bears a strong link to thermophily. Long overlaps are important for genome compaction and short overlaps are important for upholding high CAIavg. Our findings should help in better understanding the significance of overlapping gene content in prokaryotic genomes.
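The split between short (1-4 nucleotide) and long overlaps can be sketched in Python as follows. This is a simplified illustration rather than the authors' method. It assumes genes are given as 1-based inclusive `(start, end)` coordinates on one replicon, compares only genes that are adjacent after sorting by start, ignores strand, and expresses frequency as overlaps per gene, a normalization the abstract does not specify.

```python
def classify_overlaps(genes, short_max=4):
    """Count short and long overlaps between neighbouring genes.

    `genes` is a list of (start, end) tuples, 1-based and inclusive
    (hypothetical input format). Overlaps of 1..short_max nucleotides
    count as short; anything longer counts as long.
    """
    if not genes:
        raise ValueError("no genes supplied")
    ordered = sorted(genes)
    short = long_ = 0
    for (start1, end1), (start2, end2) in zip(ordered, ordered[1:]):
        # Shared nucleotides between two neighbouring genes.
        overlap = min(end1, end2) - start2 + 1
        if overlap <= 0:
            continue  # genes are separated by an intergenic region
        if overlap <= short_max:
            short += 1
        else:
            long_ += 1
    n = len(ordered)
    return {
        "short": short,
        "long": long_,
        "short_freq": short / n,  # assumed normalization: per gene
        "long_freq": long_ / n,
    }
```

For example, `classify_overlaps([(1, 100), (98, 200), (150, 300), (400, 500)])` finds one short overlap (3 nucleotides) and one long overlap (51 nucleotides).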
12
Sandle T, Skinner K. Study of psychrophilic and psychrotolerant micro-organisms isolated in cold rooms used for pharmaceutical processing. J Appl Microbiol 2012; 114:1166-74. [DOI: 10.1111/jam.12101] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 10/07/2012] [Revised: 11/13/2012] [Accepted: 12/04/2012] [Indexed: 11/28/2022]
Affiliation(s)
- T. Sandle
  - Bio Products Laboratory Ltd, Elstree, UK
13
Dutta C, Paul S. Microbial lifestyle and genome signatures. Curr Genomics 2012; 13:153-62. [PMID: 23024607 PMCID: PMC3308326 DOI: 10.2174/138920212799860698] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 08/18/2011] [Revised: 09/13/2011] [Accepted: 09/28/2011] [Indexed: 12/29/2022] Open
Abstract
Microbes are known for their unique ability to adapt to varying lifestyle and environment, even to the extreme or adverse ones. The genomic architecture of a microbe may bear the signatures not only of its phylogenetic position, but also of the kind of lifestyle to which it is adapted. The present review aims to provide an account of the specific genome signatures observed in microbes acclimatized to distinct lifestyles or ecological niches. Niche-specific signatures identified at different levels of microbial genome organization like base composition, GC-skew, purine-pyrimidine ratio, dinucleotide abundance, codon bias, oligonucleotide composition etc. have been discussed. Among the specific cases highlighted in the review are the phenomena of genome shrinkage in obligatory host-restricted microbes, genome expansion in strictly intra-amoebal pathogens, strand-specific codon usage in intracellular species, acquisition of genome islands in pathogenic or symbiotic organisms, discriminatory genomic traits of marine microbes with distinct trophic strategies, and conspicuous sequence features of certain extremophiles like those adapted to high temperature or high salinity.
Affiliation(s)
- Chitra Dutta
  - Structural Biology & Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India
14
Mahale KN, Kempraj V, Dasgupta D. Does the growth temperature of a prokaryote influence the purine content of its mRNAs? Gene 2012; 497:83-9. [PMID: 22305982 DOI: 10.1016/j.gene.2012.01.040] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/22/2011] [Accepted: 01/19/2012] [Indexed: 11/20/2022]
Abstract
The formation and breaking of hydrogen bonds between nucleic acid bases are dependent on temperature. The high G+C content of organisms was surmised to be an adaptation for high temperature survival because of the thermal stability of G:C pairs. However, a survey of genomic GC% and optimum growth temperature (OGT) of several prokaryotes ruled out any direct relation between them. Significantly high purine (R=A or G) content in mRNAs is also seen as a selective response for survival among thermophiles. Nevertheless, the biological relevance of thermophiles loading their unstable mRNAs with excess purines (purine-loading or R-loading) is not persuasive. Here, we analysed the mRNA sequences from the genomes of 168 prokaryotes (as obtained from NCBI Genome database) with their OGTs ranging from -5 °C to 100 °C to verify the relation between R-loading and OGT. Our analysis fails to demonstrate any correlation between R-loading of the mRNA pool and OGT of a prokaryote. The percentage of purine-loaded mRNAs in prokaryotes is found to be in a rough negative correlation with the genomic GC% (r²=0.655, slope=-1.478, P<000.1). We conclude that genomic GC% and bias against certain combinations of nucleotides drive the mRNA-synonymous (sense) strands of DNA towards variations in R-loading.
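A minimal Python sketch of how R-loading might be quantified is given below. The abstract does not give its exact definition, so two assumptions are made here: purine excess is expressed as purines minus pyrimidines per kilobase, and an mRNA counts as purine-loaded when that excess is positive. The function names are hypothetical.

```python
def purine_excess_per_kb(seq: str) -> float:
    """Return (purines - pyrimidines) per 1000 unambiguous bases.

    Accepts DNA (sense strand) or RNA; T and U are both pyrimidines.
    """
    seq = seq.upper()
    purines = seq.count("A") + seq.count("G")
    pyrimidines = seq.count("C") + seq.count("T") + seq.count("U")
    total = purines + pyrimidines
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 1000 * (purines - pyrimidines) / total


def fraction_purine_loaded(mrnas) -> float:
    """Fraction of sequences whose purine excess is positive (assumed cutoff)."""
    values = [purine_excess_per_kb(s) for s in mrnas]
    if not values:
        raise ValueError("no sequences supplied")
    return sum(v > 0 for v in values) / len(values)
```

Computing `fraction_purine_loaded` for each genome and plotting it against genomic GC% would mirror the kind of comparison the abstract reports.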
|
15
|
Nakashima H, Kuroda Y. Differences in dinucleotide frequencies of thermophilic genes encoding water soluble and membrane proteins. J Zhejiang Univ Sci B 2011; 12:419-27. [PMID: 21634034 DOI: 10.1631/jzus.b1000331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
The occurrence frequencies of the dinucleotides of genes of three thermophilic and three mesophilic species from both archaea and eubacteria were investigated in this study. The genes encoding water soluble proteins were rich in the dinucleotides of purine dimers, whereas the genes encoding membrane proteins were rich in pyrimidine dimers. The dinucleotides of purine dimers are the counterparts of pyrimidine dimers in a double-stranded DNA. The purine/pyrimidine dimers were favored in the thermophiles but not in the mesophiles, based on comparisons of observed and expected frequencies. This finding is in agreement with our previous study which showed that purine/pyrimidine dimers are positive factors that increase the thermal stability of DNA. The dinucleotides AA, AG, and GA are components of the codons of charged residues of Glu, Asp, Lys, and Arg, and the dinucleotides TT, CT, and TC are components of the codons of hydrophobic residues of Leu, Ile, and Phe. This is consistent with the suitabilities of the different amino acid residues for water soluble and membrane proteins. Our analysis provides a picture of how thermophilic species produce water soluble and membrane proteins with distinctive characters: the genes encoding water soluble proteins use DNA sequences rich in purine dimers, and the genes encoding membrane proteins use DNA sequences rich in pyrimidine dimers on the opposite strand.
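A comparison of observed and expected dinucleotide frequencies can be sketched in Python as follows. The abstract does not say how expected frequencies were derived, so this sketch assumes the common choice of multiplying the two mononucleotide frequencies; the function name is hypothetical.

```python
from collections import Counter

BASES = set("ACGT")


def dinucleotide_odds_ratios(seq: str) -> dict[str, float]:
    """Observed/expected ratio for every dinucleotide present in seq.

    Expected frequency is assumed to be f(X) * f(Y), the product of the
    mononucleotide frequencies; windows with ambiguous bases are skipped.
    """
    seq = seq.upper()
    mono = Counter(base for base in seq if base in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    mono_total = sum(mono.values())
    di_total = sum(di.values())
    if di_total == 0:
        raise ValueError("sequence has no unambiguous dinucleotides")
    ratios = {}
    for pair, count in di.items():
        expected = (mono[pair[0]] / mono_total) * (mono[pair[1]] / mono_total)
        ratios[pair] = (count / di_total) / expected
    return ratios
```

A ratio above 1 for purine dimers (AA, AG, GA, GG) or pyrimidine dimers (TT, TC, CT, CC) would indicate that the dimer is favored relative to base composition alone.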
Affiliation(s)
- Hiroshi Nakashima
- Department of Clinical Laboratory Science, Graduate Course of Medical Science and Technology, School of Health Sciences, Kanazawa University, 5-11-80 Kodatsuno, Kanazawa 920-0942, Japan.
|
16
|
Amlacher S, Sarges P, Flemming D, van Noort V, Kunze R, Devos DP, Arumugam M, Bork P, Hurt E. Insight into structure and assembly of the nuclear pore complex by utilizing the genome of a eukaryotic thermophile. Cell 2011; 146:277-89. [PMID: 21784248 DOI: 10.1016/j.cell.2011.06.039] [Citation(s) in RCA: 180] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2010] [Revised: 04/15/2011] [Accepted: 06/24/2011] [Indexed: 01/25/2023]
Abstract
Despite decades of research, the structure and assembly of the nuclear pore complex (NPC), which is composed of ∼30 nucleoporins (Nups), remain elusive. Here, we report the genome of the thermophilic fungus Chaetomium thermophilum (ct) and identify the complete repertoire of Nups therein. The thermophilic proteins show improved properties for structural and biochemical studies compared to their mesophilic counterparts, and purified ctNups enabled the reconstitution of the inner pore ring module that spans the width of the NPC from the anchoring membrane to the central transport channel. This module is composed of two large Nups, Nup192 and Nup170, which are flexibly bridged by short linear motifs made up of linker Nups, Nic96 and Nup53. This assembly illustrates how Nup interactions can generate structural plasticity within the NPC scaffold. Our findings therefore demonstrate the utility of the genome of a thermophilic eukaryote for studying complex molecular machines.
Affiliation(s)
- Stefan Amlacher
- Biochemie-Zentrum der Universität Heidelberg, Im Neuenheimer Feld 328, Heidelberg D-69120, Germany
|
17
|
Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria. Archaea 2009; 2:159-67. [PMID: 19054742 DOI: 10.1155/2008/829730] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results.
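The genomic signature used as input to such an analysis is a vector of 256 tetranucleotide frequencies. A minimal Python sketch of that feature extraction is given below; it assumes single-strand counting with overlapping windows, which the abstract does not specify, and it leaves the CART step itself to whichever tree library is preferred.

```python
from collections import Counter
from itertools import product

# All 256 tetramers in a fixed order, so every genome yields a comparable vector.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each tetramer over overlapping windows.

    Windows containing ambiguous bases (anything other than A, C, G, T)
    are ignored. Counting one strand only is an assumption of this sketch.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        raise ValueError("sequence has no unambiguous tetramers")
    return {t: counts[t] / total for t in TETRAMERS}
```

The values for GAGA and AGGA in the returned dictionary correspond to the discriminating tetramers named in the abstract.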
|
18
|
Affiliation(s)
- Claire Torchet
- Institut Jacques-Monod, Biochimie de l'Evolution et Adaptabilité Moléculaire, Université Paris VI, Tour 43, 2 place Jussieu, 75251 Paris Cedex 05, France
|
19
|
Forsdyke DR. Calculation of folding energies of single-stranded nucleic acid sequences: conceptual issues. J Theor Biol 2007; 248:745-53. [PMID: 17698086 DOI: 10.1016/j.jtbi.2007.07.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2007] [Revised: 07/05/2007] [Accepted: 07/09/2007] [Indexed: 12/16/2022]
Abstract
The stability of a folded single-stranded nucleic acid depends on the composition and order of its constituent bases and may be assessed by taking into account the pairing energies of its constituent dinucleotides. To assess the possible biological significance of a computed structure, Maizel and coworkers in the 1980s compared the energy of folding of a natural single-stranded RNA sequence with the energies of several versions of the same sequence produced by shuffling base order. However, in the 2000s many took as self-evident the view that shuffling at the mononucleotide level (single bases) was conceptually wrong and should be replaced by shuffling at the level of dinucleotides (retaining pairs of adjacent bases). Folding energies then became indistinguishable from those of corresponding shuffled sequences and doubt was cast on the importance of secondary structures. Nevertheless, some continued productively to employ the single base shuffling approach, the justification for which is the topic of this paper. Because dinucleotide pairing energies are needed to calculate structure, it does not follow that shuffling should not disrupt dinucleotides. Base shuffling allows determination of the relative contributions of base composition and base order to total folding energy. The potential for secondary structure arises from pressures acting at both DNA and RNA levels, and is abundant throughout genomes, with a probable primary role in recombination. Within a gene the potential can often be accommodated, and base order and composition work together (values have the same negative sign) in contributing to total folding energy. But sometimes protein-coding pressure on base order conflicts with the pressure for secondary structure and the values have opposite signs. Total folding energy can be deemed of potential biological significance when the average of several readings is significantly less than zero.
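The single-base shuffling comparison described in this abstract can be sketched in Python. The folding-energy calculation itself is outside the scope of the sketch: `energy_fn` is a hypothetical placeholder for whatever folding program is used, and the split into a composition part (mean energy of shuffled sequences) and an order part (natural minus that mean) is one reading of the abstract, not a statement of the author's exact procedure.

```python
import random
from typing import Callable


def shuffle_bases(seq: str, rng: random.Random) -> str:
    """Shuffle at the mononucleotide level: base composition is kept, order is not."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def decompose_energy(
    seq: str,
    energy_fn: Callable[[str], float],  # hypothetical folding-energy function
    n_shuffles: int = 10,
    seed: int = 0,
) -> tuple[float, float, float]:
    """Return (natural, composition_part, order_part) for one sequence.

    composition_part is the mean energy over shuffled versions;
    order_part is the natural energy minus that mean.
    """
    rng = random.Random(seed)
    natural = energy_fn(seq)
    shuffled = [energy_fn(shuffle_bases(seq, rng)) for _ in range(n_shuffles)]
    composition_part = sum(shuffled) / n_shuffles
    order_part = natural - composition_part
    return natural, composition_part, order_part
```

When both parts are negative, base order and composition work together in the sense used above; opposite signs would point to a conflict between them.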
Affiliation(s)
- Donald R Forsdyke
- Department of Biochemistry, Queen's University, Kingston, Ontario, Canada K7L3N6.
|
20
|
Thorvaldsen S, Hjerde E, Fenton C, Willassen NP. Molecular characterization of cold adaptation based on ortholog protein sequences from Vibrionaceae species. Extremophiles 2007; 11:719-32. [PMID: 17576517 DOI: 10.1007/s00792-007-0093-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2006] [Accepted: 05/15/2007] [Indexed: 10/23/2022]
Abstract
A set of 298 protein families from psychrophilic Vibrio salmonicida was compiled to identify genotypic characteristics that distinguish it from orthologous sequences from the mesophilic Vibrio/Photobacterium branch of the gamma-Proteobacteria (Vibrionaceae family). In our comparative exploration we employed alignment-based bioinformatic and statistical methods. Interesting information was found in the substitution matrices, and the pattern of asymmetries in the amino acid substitution process. Together with the compositional difference, they identified the amino acids Ile, Asn, Ala and Gln as those having the most psychrophilic involvement. Ile and Asn are enhanced whereas Gln and Ala are suppressed. The inflexible Pro residue is also suppressed in loop regions, as expected in a flexible structure. The dataset was also classified and analysed according to the predicted subcellular location, and we made an additional study of 183 intracellular and 65 membrane proteins. Our results revealed that the psychrophilic proteins have similar hydrophobic and charge contributions in the core of the protein as mesophilic proteins, while the solvent-exposed surface area is significantly more hydrophobic. In addition, the psychrophilic intracellular (but not the membrane) proteins are significantly more negatively charged at the surface. Our analysis supports the hypothesis of preference for more flexible amino acids at the molecular surface. Life in a cold climate seems to be achieved through many minor structural modifications rather than through specific amino acid substitutions.
Affiliation(s)
- Steinar Thorvaldsen
- Department of Mathematics and Statistics, Faculty of Science, University of Tromsø, 9037, Tromsø, Norway.
|
21
|
Zeldovich KB, Berezovsky IN, Shakhnovich EI. Protein and DNA sequence determinants of thermophilic adaptation. PLoS Comput Biol 2007; 3:e5. [PMID: 17222055 PMCID: PMC1769408 DOI: 10.1371/journal.pcbi.0030005] [Citation(s) in RCA: 203] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2006] [Accepted: 11/29/2006] [Indexed: 11/19/2022] Open
Abstract
There have been considerable attempts in the past to relate a phenotypic trait, the habitat temperature of organisms, to their genotypes, most importantly the compositions of their genomes and proteomes. However, despite accumulation of anecdotal evidence, an exact and conclusive relationship between the former and the latter has been elusive. We present an exhaustive study of the relationship between amino acid composition of proteomes, nucleotide composition of DNA, and optimal growth temperature (OGT) of prokaryotes. Based on 204 complete proteomes of archaea and bacteria spanning the temperature range from -10 °C to 110 °C, we performed an exhaustive enumeration of all possible sets of amino acids and found a set of amino acids whose total fraction in a proteome is correlated, to a remarkable extent, with the OGT. The universal set is Ile, Val, Tyr, Trp, Arg, Glu, Leu (IVYWREL), and the correlation coefficient is as high as 0.93. We also found that the G + C content in 204 complete genomes does not exhibit a significant correlation with OGT (R = -0.10). On the other hand, the fraction of A + G in coding DNA is correlated with temperature, to a considerable extent, due to codon patterns of IVYWREL amino acids. Further, we found strong and independent correlation between OGT and the frequency with which pairs of A and G nucleotides appear as nearest neighbors in genome sequences. This adaptation is achieved via codon bias. These findings present a direct link between principles of protein structure and stability and evolutionary mechanisms of thermophilic adaptation. On the nucleotide level, the analysis provides an example of how nature utilizes codon bias for evolutionary adaptation to extreme conditions. Together these results provide a complete picture of how compositions of proteomes and genomes in prokaryotes adjust to the extreme conditions of the environment.
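The IVYWREL quantity in this abstract is the combined fraction of seven amino acids in a proteome. A minimal Python sketch follows; the function name is hypothetical, and the sketch stops at the fraction itself because the abstract reports only the correlation with OGT, not a fitted equation.

```python
IVYWREL = frozenset("IVYWREL")  # Ile, Val, Tyr, Trp, Arg, Glu, Leu


def ivywrel_fraction(proteins: list[str]) -> float:
    """Fraction of IVYWREL residues over all residues of a proteome.

    proteins is a list of one-letter amino acid sequences; non-letter
    characters such as '*' or '-' are ignored.
    """
    total = 0
    hits = 0
    for protein in proteins:
        for residue in protein.upper():
            if residue.isalpha():
                total += 1
                if residue in IVYWREL:
                    hits += 1
    if total == 0:
        raise ValueError("proteome contains no residues")
    return hits / total
```

Computing this value for each complete proteome and comparing it with the organism's OGT would reproduce the type of comparison the abstract describes.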
Affiliation(s)
- Konstantin B Zeldovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Igor N Berezovsky
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, United States of America
|
22
|
Lin FH, Forsdyke DR. Prokaryotes that grow optimally in acid have purine-poor codons in long open reading frames. Extremophiles 2006; 11:9-18. [PMID: 16957882 DOI: 10.1007/s00792-006-0005-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2006] [Accepted: 03/29/2006] [Indexed: 10/24/2022]
Abstract
In nucleic acids the N-glycosyl bonds between purines and their ribose sugar moieties are broken under acid conditions. If one strand of a duplex DNA segment were more vulnerable to mutation than the other, then the archaeon Picrophilus torridus, with an optimum growth pH near zero, could have adapted by decreasing the purine content of that strand. Yet, P. torridus has an optimum growth temperature near 60 °C, and thermophiles prefer purine-rich codons. We found that, as in other thermophiles, high growth temperature correlates with the use of purine-rich codons. The extra purines are often in third, non-amino acid determining, codon positions. However, as in other acidophiles, as open reading frame lengths increase, there is increased use of purine-poor codons, particularly those without purines in second, amino acid-determining, codon positions. Thus, P. torridus can be seen as adapting (a) to temperature by increasing its purines in all open reading frames without greatly impacting protein amino acid compositions, and (b) to pH by decreasing purines in longer open reading frames, thereby potentially impacting protein amino acid compositions. It is proposed that longer open reading frames, being larger mutational targets, have become less vulnerable to depurination by virtue of pyrimidine for purine substitutions.
Affiliation(s)
- Feng-Hsu Lin
- Department of Biochemistry, Queen's University, K7L3N6, Kingston, ON, Canada
|
23
|
Synonymous codon usage and its potential link with optimal growth temperature in prokaryotes. Gene 2006; 385:128-36. [PMID: 16989961 DOI: 10.1016/j.gene.2006.05.033] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2006] [Accepted: 05/29/2006] [Indexed: 12/01/2022]
Abstract
The relationship between codon usage in prokaryotes and their ability to grow at extreme temperatures has been given much attention over the past years. Previous studies have suggested that the difference in synonymous codon usage between (hyper)thermophiles and mesophiles is a consequence of a selective pressure linked to growth temperature. Here, we performed an updated analysis of the variation in synonymous codon usage with growth temperature; our study includes a large number of species from a wide taxonomic and growth temperature range. The presence of psychrophilic species in our study allowed us to test whether the same selective pressure acts on synonymous codon usage at very low growth temperature. Our results show that the synonymous codon usage for Arg (through the AGG, AGA and CGT codons) is the most discriminating factor between (hyper)thermophilic and non-thermophilic species, thus confirming previous studies. We report the unusual clustering of an Archaeal psychrophile with the thermophilic and hyperthermophilic species on the synonymous codon usage factorial map; the other psychrophiles in our study cluster with the mesophilic species. Our conclusion is that the difference in synonymous codon usage between (hyper)thermophilic and non-thermophilic species cannot be clearly attributed to a selective pressure linked to growth at high temperatures.
|
24
|
Das S, Paul S, Bag SK, Dutta C. Analysis of Nanoarchaeum equitans genome and proteome composition: indications for hyperthermophilic and parasitic adaptation. BMC Genomics 2006; 7:186. [PMID: 16869956 PMCID: PMC1574309 DOI: 10.1186/1471-2164-7-186] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2006] [Accepted: 07/25/2006] [Indexed: 11/24/2022] Open
Abstract
Background: Nanoarchaeum equitans, the only known hyperthermophilic archaeon exhibiting a parasitic lifestyle, has raised some new questions about the evolution of the Archaea and provided a model of choice to study the genome landmarks correlated with thermo-parasitic adaptation. In this context, we have analyzed the genome and proteome composition of N. equitans and compared them with those of other mesophiles, hyperthermophiles and obligatory host-associated organisms.
Results: Analysis of nucleotide, codon and amino acid usage patterns in N. equitans indicates the presence of distinct selective constraints, probably due to its adaptation to a thermo-parasitic lifestyle. Among the conspicuous characteristics featuring its hyperthermophilic adaptation are overrepresentation of purine bases in protein coding sequences, higher GC-content in tRNA/rRNA sequences, distinct synonymous codon usage, enhanced usage of aromatic and positively charged residues, and decreased frequencies of polar uncharged residues, as compared to those in mesophilic organisms. Positively charged amino acid residues are relatively abundant in the encoded gene-products of N. equitans and other hyperthermophiles, which is reflected in their isoelectric point distribution. Pairwise comparison of 105 orthologous protein sequences shows a strong bias towards replacement of uncharged polar residues of mesophilic proteins by Lys/Arg, Tyr and some hydrophobic residues in their Nanoarchaeal orthologs. The traits potentially attributable to the symbiotic/parasitic lifestyle of the organism include the presence of apparently weak translational selection in synonymous codon usage and a marked heterogeneity in membrane-associated proteins, which may be important for N. equitans to interact with the host and hence may help the organism to adapt to the strictly host-associated lifestyle. Despite being strictly host-dependent, N. equitans follows the cost-minimization hypothesis.
Conclusion: The present study reveals that the genome and proteome composition of N. equitans are marked with the signatures of dual adaptation: one to high temperature and the other to obligatory parasitism. While the analysis of nucleotide/amino acid preferences in N. equitans offers an insight into the molecular strategies taken by the archaeon for thermo-parasitic adaptation, the comparative study of the compositional characteristics of mesophiles, hyperthermophiles and obligatory host-associated organisms demonstrates the generality of such strategies in the microbial world.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Sandip Paul
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Sumit K Bag
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Chitra Dutta
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Human Genetics & Genomics Division, Indian Institute of Chemical Biology, Kolkata–700032, India
|
25
|
Bragg JG, Thomas D, Baudouin-Cornu P. Variation among species in proteomic sulphur content is related to environmental conditions. Proc Biol Sci 2006; 273:1293-300. [PMID: 16720405 PMCID: PMC1560280 DOI: 10.1098/rspb.2005.3441] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2005] [Accepted: 12/04/2005] [Indexed: 11/12/2022] Open
Abstract
The elemental composition of proteins influences the quantities of different elements required by organisms. Here, we considered variation in the sulphur content of whole proteomes among 19 Archaea, 122 Eubacteria and 10 eukaryotes whose genomes have been fully sequenced. We found that different species vary greatly in the sulphur content of their proteins, and that average sulphur content of proteomes and genome base composition are related. Forces contributing to variation in proteomic sulphur content appear to operate quite uniformly across the proteins of different species. In particular, the sulphur content of orthologous proteins was frequently correlated with mean proteomic sulphur contents. Among prokaryotes, proteomic sulphur content tended to be greater in anaerobes, relative to non-anaerobes. Thermophiles tended to have lower proteomic sulphur content than non-thermophiles, consistent with the thermolability of cysteine and methionine residues. This work suggests that persistent environmental growth conditions can influence the evolution of elemental composition of whole proteomes in a manner that may have important implications for the amount of sulphur used by living organisms to build proteins. It extends previous studies that demonstrated links between transient changes in environmental conditions and the elemental composition of subsets of proteins expressed under these conditions.
Affiliation(s)
- Jason G Bragg
- Department of Biology, University of New Mexico, MSC03 2020, Albuquerque, NM 87131-0001, USA
- Dominique Thomas
- Centre de Génétique Moléculaire, Centre National de la Recherche Scientifique, 91198 Gif-sur-Yvette, France
- Cytomics Systems SA, Bâtiment 5, 1 avenue de la Terrasse, 91190 Gif sur Yvette, France
- Peggy Baudouin-Cornu
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON M5G 1X5, Canada
- LPG, SBGM/DBJC, bât 144, CEA Saclay, F-91191 Gif-sur-Yvette Cedex, France
|
26
|
Wang HC, Susko E, Roger AJ. On the correlation between genomic G+C content and optimal growth temperature in prokaryotes: data quality and confounding factors. Biochem Biophys Res Commun 2006; 342:681-4. [PMID: 16499870 DOI: 10.1016/j.bbrc.2006.02.037] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2006] [Accepted: 02/08/2006] [Indexed: 11/30/2022]
Abstract
The correlation between genomic G+C content and optimal growth temperature in prokaryotes has gained renewed interest after Musto et al. [H. Musto, H. Naya, A. Zavala, H. Romero, F. Alvarez-Valin, G. Bernardi, Correlations between genomic GC levels and optimal growth temperatures in prokaryotes, FEBS Lett. 573 (2004) 73-77] reported that positive correlations exist in 15 families studied. We have reanalyzed their data and found that when genome size and data quality were adjusted for, there was no significant evidence of a relationship between optimal temperature and GC content for two of the families that had previously shown strongly significant correlations. Using updated temperature optima for Halobacteriaceae species, we found the correlation is insignificant in this family. For the family Enterobacteriaceae, when genome size and optimal temperature are included in a multiple linear regression, only genome size is significant as a predictor of GC content. We showed that statistical methods more rigorous than simple two-factor correlation analysis should be used for analyzing the complex intrinsic and extrinsic factors that affect genomic GC content. We further found that a positive correlation between temperature and genomic GC is only evident in free-living species of low optimal growth temperatures.
Affiliation(s)
- Huai-Chun Wang
- Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada B3H 3J5.
|
27
|
Chargaff’s GC rule. Evol Bioinform Online 2006. [DOI: 10.1007/978-0-387-33419-6_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open
|
28
|
Lee SJ, Mortimer JR, Forsdyke DR. Genomic conflict settled in favour of the species rather than the gene at extreme GC percentage values. Appl Bioinformatics 2005; 3:219-28. [PMID: 15702952 DOI: 10.2165/00822942-200403040-00003] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
Abstract
Wada and colleagues have shown that, whether prokaryotic or eukaryotic, each gene has a "homostabilising propensity" to adopt a relatively uniform GC percentage (GC%). Accordingly, each gene can be viewed as a "microisochore" occupying a discrete GC% niche of relatively uniform base composition amongst its fellow genes. Although first, second and third codon positions usually differ in GC%, each position tends to maintain a uniform, gene-specific GC% value. Thus, within a genome, genic GC% values can cover a wide range. This is most evident at third codon positions, which are least constrained by amino acid encoding needs. In 1991, Wada and colleagues further noted that, within a phylogenetic group, genomic GC% values can also cover a wide range. This is again most evident at third codon positions. Thus, the dispersion of GC% values among genes within a genome matches the dispersion of GC% values among genomes within a phylogenetic group. Wada described the context-independence of plots of different codon position GC% values against total GC% as a "universal" characteristic. Several studies relate this to recombination. We have confirmed that third codon positions usually relate more to the genes that contain them than to the species. However, in genomes with extreme GC% values (low or high), third codon positions tend to maintain a constant GC%, thus relating more to the species than to the genes that contain them. Genes in an extreme-GC% genome collectively span a smaller GC% range, and mainly rely on first and second codon positions for differentiation as "microisochores". Our results are consistent with the view that differences in GC% serve to recombinationally isolate both genome sectors (facilitating gene duplication) and genomes (facilitating genome duplication, e.g. speciation). In intermediate-GC% genomes, conflict between the needs of the species and the needs of individual genes within that species is minimal. 
However, in extreme-GC% genomes there is a conflict, which is settled in favour of the species (i.e. group selection) rather than in favour of the gene (genic selection).
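The codon-position GC% values that this abstract compares across genes and genomes can be computed with a short Python sketch. It assumes the input is an in-frame open reading frame starting at the first codon position; the function name is hypothetical.

```python
def gc_by_codon_position(orf: str) -> tuple[float, float, float]:
    """GC% at first, second and third codon positions of an in-frame ORF.

    Trailing bases that do not complete a codon are dropped, and ambiguous
    bases are ignored. A position with no countable bases yields NaN.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # keep whole codons only
    values = []
    for position in range(3):
        bases = [b for b in orf[position:usable:3] if b in "ACGT"]
        if bases:
            values.append(100.0 * sum(b in "GC" for b in bases) / len(bases))
        else:
            values.append(float("nan"))
    return values[0], values[1], values[2]
```

Plotting the third-position value of each gene against the genome-wide GC% would show whether third positions track the gene or the species, which is the contrast drawn above.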
Affiliation(s)
- Shang-Jung Lee
- Genetics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
|
29
|
Rayment JH, Forsdyke DR. Amino acids as placeholders: base-composition pressures on protein length in malaria parasites and prokaryotes. Appl Bioinformatics 2005; 4:117-30. [PMID: 16128613 DOI: 10.2165/00822942-200504020-00005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
Abstract
BACKGROUND The composition and sequence of amino acids in a protein may serve the underlying needs of the nucleic acids that encode the protein (the genome phenotype). In extreme form, amino acids become mere placeholders inserted between functional segments or domains, and--apart from increasing protein length--playing no role in the specific function or structure of a protein (the conventional phenotype). METHODS We studied the genomes of two malarial parasites and 521 prokaryotes (144 complete) that differ widely in GC% and optimum growth temperature, comparing the base compositions of the protein coding regions and corresponding lengths (kilobases). RESULTS Malarial parasites show distinctive responses to base-compositional pressures that increase as protein lengths increase. A low-GC% species (Plasmodium falciparum) is likely to have more placeholder amino acids than an intermediate-GC% species (P. vivax), so that homologous proteins are longer. In prokaryotes, GC% is generally greater and AG% is generally less in open reading frames (ORFs) encoding long proteins. The increased GC% in long ORFs increases as species' GC% increases, and decreases as species' AG% increases. In low- and intermediate-GC% prokaryotic species, increases in ORF GC% as encoded proteins increase in length are largely accounted for by the base compositions of first and second (amino acid-determining) codon positions. In high-GC% prokaryotic species, first and third (non-amino acid-determining) codon positions play this role. CONCLUSION In low- and intermediate-GC% prokaryotes, placeholder amino acids are likely to be well defined, corresponding to codons enriched in G and/or C at first and second positions. In high-GC% prokaryotes, placeholder amino acids are likely to be less well defined. 
Increases in ORF GC% as encoded proteins increase in length are greater in mesophiles than in thermophiles, which are constrained from increasing protein lengths in response to base-composition pressures.
Affiliation(s)
- Jonathan H Rayment
- Department of Biochemistry, Queen's University, Kingston, Ontario, Canada
|
30
|
Das S, Ghosh S, Pan A, Dutta C. Compositional variation in bacterial genes and proteins with potential expression level. FEBS Lett 2005; 579:5205-10. [PMID: 16165133 DOI: 10.1016/j.febslet.2005.08.042] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2005] [Accepted: 08/22/2005] [Indexed: 11/22/2022]
Abstract
Usage of guanine and cytosine at the three codon sites in eubacterial genes varies distinctly with potential expressivity, as predicted by the Codon Adaptation Index (CAI). In bacteria with moderate/high GC-content, G(3) follows a biphasic relationship, while C(3) increases with CAI. In AT-rich bacteria, correlation of CAI is negative with G(3), but non-specific with C(3). Correlations of CAI with residues encoded by G-starting codons are positive, while those with residues encoded by C-starting codons are usually negative/random. Average Size/Complexity Score and aromaticity of gene-products decrease with CAI, confirming the general validity of the cost-minimization principle in free-living eubacteria. Alcoholicity of bacterial gene-products usually decreases with expressivity.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Center, Indian Institute of Chemical Biology, 4, Raja S.C. Mullick Road, Kolkata 700 032, India
|
31
|
Khachane AN, Timmis KN, dos Santos VAPM. Uracil content of 16S rRNA of thermophilic and psychrophilic prokaryotes correlates inversely with their optimal growth temperatures. Nucleic Acids Res 2005; 33:4016-22. [PMID: 16030352 PMCID: PMC1179731 DOI: 10.1093/nar/gki714] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
We report here the finding of a highly significant inverse correlation of the uracil content of 16S rRNA and the optimum growth temperature (Topt) of cultured thermophilic and psychrophilic prokaryotes. This correlation was significantly different from the weaker correlations between the contents of other nucleotides and Topt. Analysis of the 16S rRNA secondary structure regions revealed a fall in the A:U base-pair content in step with the increase in Topt that was much steeper than that of mismatched base-pairs, which are thermodynamically less stable. These findings indicate that the 16S rRNA sequences of thermophiles and psychrophiles are under a strong thermo-adaptive pressure, and that structure–function constraints play a crucial role in determining their 16S rRNA nucleotide composition. The derived relationship between uracil content and Topt was used to develop an algorithm to predict the Topt values of uncultured prokaryotes lacking cultured close relatives and belonging to the phyla predominantly containing thermophiles. This algorithm may be useful in guiding the design of cultivation conditions for hitherto uncultured microbes.
Affiliation(s)
- Vítor A. P. Martins dos Santos
- To whom correspondence should be addressed at Division of Microbiology, GBF—German Research Centre for Biotechnology, Mascheroder Weg 1, D-38124 Braunschweig, Germany. Tel: +49(0) 531 6181 422; Fax: +49(0) 531 6181 411
32
Friedman R, Drake JW, Hughes AL. Genome-wide patterns of nucleotide substitution reveal stringent functional constraints on the protein sequences of thermophiles. Genetics 2005; 167:1507-12. [PMID: 15280258 PMCID: PMC1470942 DOI: 10.1534/genetics.104.026344] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.1] [Indexed: 11/18/2022]
Abstract
To test the hypothesis that the proteins of thermophilic prokaryotes are subject to unusually stringent functional constraints, we estimated the numbers of synonymous and nonsynonymous nucleotide substitutions per site between 17,957 pairs of orthologous genes from 22 pairs of closely related species of Archaea and Bacteria. The average ratio of nonsynonymous to synonymous substitutions was significantly lower in thermophiles than in nonthermophiles, and this effect was observed in both Archaea and Bacteria. There was no evidence that this difference could be explained by factors such as nucleotide content bias. Rather, the results support the hypothesis that proteins of thermophiles are subject to unusually strong purifying selection, leading to a reduced overall level of amino acid evolution per mutational event. The results show that genome-wide patterns of sequence evolution can be influenced by natural selection exerted by a species' environment and shed light on a previous observation that relatively few of the mutations arising in a thermophilic archaeon were nucleotide substitutions in contrast to indels.
Affiliation(s)
- Robert Friedman
- Department of Biological Sciences, University of South Carolina, Columbia, South Carolina 29208, USA
33
Abstract
Most positively selected mutations cause changes in metabolism, resulting in a better-adapted phenotype. But as well as acting on the information content of genes, natural selection may also act directly on nucleic acid and protein molecules. We review the evidence for direct temperature-dependent natural selection acting on genomes, transcriptomes and proteomes.
Affiliation(s)
- Donal A Hickey
- Department of Biology, Concordia University, 7141 Sherbrooke Street, Montreal, Quebec, H4B 1R6, Canada.
34
Paz A, Mester D, Baca I, Nevo E, Korol A. Adaptive role of increased frequency of polypurine tracts in mRNA sequences of thermophilic prokaryotes. Proc Natl Acad Sci U S A 2004; 101:2951-6. [PMID: 14973185 PMCID: PMC365726 DOI: 10.1073/pnas.0308594100] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.2] [Indexed: 11/18/2022]
Abstract
The mechanism of an organism's adaptation to high temperatures has been investigated intensively in recent years. It was suggested that the macromolecules of thermophilic microorganisms (especially proteins) have structural features that enhance their thermostability. We compared mRNA sequences of 72 fully sequenced prokaryotic proteomes (14 thermophilic and 58 mesophilic species). Although the differences between the percentage of adenine plus guanine content of whole mRNAs of different prokaryotic species are much lower than those of guanine plus cytosine content, the thermophile purine-pyrimidine (R/Y) ratio within their mRNAs is significantly higher than that of the mesophiles. The first and third codon positions of both thermophiles and mesophiles are purine-biased, with the bias more pronounced by the thermophiles. Thermophile mRNAs that display the highest R/Y ratio (1.43-1.69) are those of the ribosomal proteins, histone-like proteins, DNA-dependent RNA polymerase subunits, and heat-shock proteins. Within mesophilic prokaryotes and five eukaryotic species, the R/Y ratio of the mRNAs of heat-shock proteins is higher than their average over coding part of the genome. Polypurine tracts (R)n (with n ≥ 5) are much more abundant within the thermophile mRNAs compared with mesophiles. Between two sequential pure-purinic codons of thermophile mRNAs, there is a rather strong tendency for the occurrence of adenine but not guanine tracts. The data suggest that mixed adenine·guanine and polyadenine tracts in mRNAs increase the thermostability beyond the contribution of amino acids encoded by purine tracts, which highlights the importance of ecological stress in the evolution of genome architecture.
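The two sequence statistics named in this abstract, the purine-pyrimidine (R/Y) ratio and polypurine tracts of at least five bases, can be sketched for a single mRNA as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, ambiguous bases are simply ignored, and tracts are taken as maximal non-overlapping runs of A/G, which may differ from the counting scheme used in the paper.

```python
import re


def purine_pyrimidine_ratio(mrna: str) -> float:
    """Return the R/Y ratio: (A + G) / (C + U) over the whole sequence."""
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    purines = seq.count("A") + seq.count("G")
    pyrimidines = seq.count("C") + seq.count("U")
    if pyrimidines == 0:
        raise ValueError("sequence contains no pyrimidines")
    return purines / pyrimidines


def polypurine_tracts(mrna: str, min_len: int = 5) -> list[str]:
    """Return maximal runs of purines (A or G) that are at least `min_len` long."""
    seq = mrna.upper().replace("T", "U")
    return re.findall(r"[AG]{%d,}" % min_len, seq)


if __name__ == "__main__":
    example = "AGGAAGCUAGAGGACC"  # made-up sequence for illustration only
    print(purine_pyrimidine_ratio(example))
    print(polypurine_tracts(example))
```

Comparing thermophiles with mesophiles would mean applying these functions to every annotated mRNA in each genome and then comparing the resulting distributions.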
Affiliation(s)
- Arnon Paz
- Institute of Evolution, Haifa University, Mount Carmel, Haifa 31905, Israel