1
Ali F. Patterns of Change in Nucleotide Diversity Over Gene Length. Genome Biol Evol 2024; 16:evae078. [PMID: 38608148 PMCID: PMC11040516 DOI: 10.1093/gbe/evae078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 03/26/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024] Open
Abstract
Nucleotide diversity at a site is influenced by the relative strengths of neutral and selective population genetic processes. Therefore, attempts to estimate effective population size based on the diversity of synonymous sites demand a better understanding of their selective constraints. The nucleotide diversity of a gene was previously found to correlate with its length. In this work, I measure nucleotide diversity at synonymous sites and uncover a pattern of low diversity towards the translation initiation site of a gene. The degree of reduction in diversity at the translation initiation site and the length of this region of reduced diversity can be quantified as "Effect Size" and "Effect Length" respectively, using parameters of an asymptotic regression model. Estimates of Effect Length across bacteria covaried with recombination rates as well as with a multitude of translation-associated traits such as the avoidance of mRNA secondary structure around the translation initiation site, the number of rRNAs, and relative codon usage of ribosomal genes. Evolutionary simulations under purifying selection reproduce the observed patterns and diversity-length correlation and highlight that selective constraints on the 5'-region of a gene may be more extensive than previously believed. These results have implications for the estimation of effective population size and relative mutation rates, and for genome scans of genes under positive selection based on "silent-site" diversity.
Affiliation(s)
- Farhan Ali
- Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85281, USA
2
Ali F. Patterns of change in nucleotide diversity over gene length. bioRxiv 2023:2023.07.13.548940. [PMID: 37503020 PMCID: PMC10369989 DOI: 10.1101/2023.07.13.548940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Nucleotide diversity at a site is influenced by the relative strengths of neutral and selective population genetic processes. Therefore, attempts to identify sites under positive selection require an understanding of the diversity expected in its absence. The nucleotide diversity of a gene was previously found to correlate with its length. In this work, I measure nucleotide diversity at synonymous sites and uncover a pattern of low diversity towards the translation initiation site (TIS) of a gene. The degree of reduction in diversity at the TIS and the length of this region of reduced diversity can be quantified as "Effect Size" and "Effect Length" respectively, using parameters of an asymptotic regression model. Estimates of Effect Length across bacteria covaried with recombination rates as well as with a multitude of fast-growth adaptations such as the avoidance of mRNA secondary structure around the TIS, the number of rRNAs, and relative codon usage of ribosomal genes. Thus, the dependence of nucleotide diversity on gene length is governed by a combination of selective and non-selective processes. These results have implications for the estimation of effective population size and relative mutation rates based on "silent-site" diversity, and for pN/pS-based prediction of genes under selection.
Affiliation(s)
- Farhan Ali
- Biodesign Institute, Arizona State University, Tempe, Arizona
3
Nielsen FD, Møller-Jensen J, Jørgensen MG. Adding context to the pneumococcal core genes using bioinformatic analysis of the intergenic pangenome of Streptococcus pneumoniae. Frontiers in Bioinformatics 2023; 3:1074212. [PMID: 36844929 PMCID: PMC9944727 DOI: 10.3389/fbinf.2023.1074212] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/19/2022] [Accepted: 01/24/2023] [Indexed: 02/10/2023] Open
Abstract
Introduction: Whole genome sequencing offers great opportunities for linking genotypes to phenotypes, aiding our understanding of human disease and bacterial pathogenicity. However, these analyses often overlook non-coding intergenic regions (IGRs). By disregarding the IGRs, crucial information is lost, as genes have little biological function without expression. Methods/Results: In this study, we present the first complete pangenome of the important human pathogen Streptococcus pneumoniae (pneumococcus), spanning both the genes and IGRs. We show that the pneumococcus species retains a small core genome of IGRs that are present across all isolates. Gene expression is highly dependent on these core IGRs, and often several copies of these core IGRs are found across each genome. Core genes and core IGRs show a clear linkage, as 81% of core genes are associated with core IGRs. Additionally, we identify a single IGR within the core genome that is always occupied by one of two highly distinct sequences, scattered across the phylogenetic tree. Discussion: Their distribution indicates that this IGR is transferred between isolates through horizontal regulatory transfer independent of the flanking genes and that each type likely serves different regulatory roles depending on its genetic context.
Affiliation(s)
- Flemming Damgaard Nielsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark; Department of Clinical Microbiology, Odense University Hospital, Odense, Denmark
- Jakob Møller-Jensen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Mikkel Girke Jørgensen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark; *Correspondence: Mikkel Girke Jørgensen
4
Radi MS, Munro LJ, Salcedo-Sora JE, Kim SH, Feist AM, Kell DB. Understanding Functional Redundancy and Promiscuity of Multidrug Transporters in E. coli under Lipophilic Cation Stress. Membranes 2022; 12:1264. [PMID: 36557171 PMCID: PMC9783932 DOI: 10.3390/membranes12121264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 11/27/2022] [Accepted: 12/03/2022] [Indexed: 06/17/2023]
Abstract
Multidrug transporters (MDTs) are major contributors to microbial drug resistance and are further utilized for improving host phenotypes in biotechnological applications. Therefore, the identification of these MDTs and the understanding of their mechanisms of action in vivo are of great importance. However, their promiscuity and functional redundancy represent a major challenge towards their identification. Here, a multistep tolerance adaptive laboratory evolution (TALE) approach was leveraged to achieve this goal. Specifically, a wild-type E. coli K-12-MG1655 and its cognate knockout individual mutants ΔemrE, ΔtolC, and ΔacrB were evolved separately under increasing concentrations of two lipophilic cations, tetraphenylphosphonium (TPP+), and methyltriphenylphosphonium (MTPP+). The evolved strains showed a significant increase in MIC values of both cations and an apparent cross-cation resistance. Sequencing of all evolved mutants highlighted diverse mutational mechanisms that affect the activity of nine MDTs including acrB, mdtK, mdfA, acrE, emrD, tolC, acrA, mdtL, and mdtP. Besides regulatory mutations, several structural mutations were recognized in the proximal binding domain of acrB and the permeation pathways of both mdtK and mdfA. These details can aid in the rational design of MDT inhibitors to efficiently combat efflux-based drug resistance. Additionally, the TALE approach can be scaled to different microbes and molecules of medical and biotechnological relevance.
Affiliation(s)
- Mohammad S. Radi
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kongens Lyngby, Denmark
- Lachlan J. Munro
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kongens Lyngby, Denmark
- Jesus E. Salcedo-Sora
- GeneMill, Shared Research Facilities, Faculty of Health and Life Sciences, University of Liverpool, Crown St., Liverpool L69 7ZB, UK
- Se Hyeuk Kim
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kongens Lyngby, Denmark
- Adam M. Feist
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kongens Lyngby, Denmark
- Department of Bioengineering, University of California, 9500 Gilman Drive, La Jolla, San Diego, CA 92093, USA
- Douglas B. Kell
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kongens Lyngby, Denmark
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St., Liverpool L69 7ZB, UK
5
White H, Vos M, Sheppard SK, Pascoe B, Raymond B. Signatures of selection in core and accessory genomes indicate different ecological drivers of diversification among Bacillus cereus clades. Mol Ecol 2022; 31:3584-3597. [PMID: 35510788 PMCID: PMC9324797 DOI: 10.1111/mec.16490] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/26/2021] [Revised: 03/31/2022] [Accepted: 04/12/2022] [Indexed: 11/30/2022]
Abstract
Bacterial clades are often ecologically distinct, despite extensive horizontal gene transfer (HGT). How selection works on different parts of bacterial pan-genomes to drive and maintain the emergence of clades is unclear. Focusing on the three largest clades in the diverse and well-studied Bacillus cereus sensu lato group, we identified clade-specific core genes (present in all clade members) and then used clade-specific allelic diversity to identify genes under purifying and diversifying selection. Clade-specific accessory genes (present in a subset of strains within a clade) were characterized as being under selection using presence/absence in specific clades. Gene ontology analyses of genes under selection revealed that different gene functions were enriched in different clades. Furthermore, some gene functions were enriched only amongst clade-specific core or accessory genomes. Genes under purifying selection were often clade-specific, while genes under diversifying selection showed signs of frequent HGT. These patterns are consistent with different selection pressures acting on both the core and the accessory genomes of different clades and can lead to ecological divergence in both cases. Examining variation in allelic diversity allows us to uncover genes under clade-specific selection, allowing ready identification of strains and their ecological niche.
Affiliation(s)
- Hugh White
- Centre for Ecology and Conservation, University of Exeter, Penryn, UK
- Michiel Vos
- European Centre for Environment and Human Health, University of Exeter Medical School, Environment and Sustainability Institute, Penryn Campus, UK
- Samuel K. Sheppard
- Milner Centre for Evolution, Department of Biology & Biotechnology, University of Bath, Bath, UK
- Ben Pascoe
- Milner Centre for Evolution, Department of Biology & Biotechnology, University of Bath, Bath, UK
- Ben Raymond
- Centre for Ecology and Conservation, University of Exeter, Penryn, UK
6
Bohr LL, Youngblom MA, Eldholm V, Pepperell CS. Genome reorganization during emergence of host-associated Mycobacterium abscessus. Microb Genom 2021; 7. [PMID: 34874249 PMCID: PMC8767326 DOI: 10.1099/mgen.0.000706] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023] Open
Abstract
Mycobacterium abscessus is a rapid growing, free-living species of bacterium that also causes lung infections in humans. Human infections are usually acquired from the environment; however, dominant circulating clones (DCCs) have emerged recently in both M. abscessus subsp. massiliense and subsp. abscessus that appear to be transmitted among humans and are now globally distributed. These recently emerged clones are potentially informative about the ecological and evolutionary mechanisms of pathogen emergence and host adaptation. The geographical distribution of DCCs has been reported, but the genomic processes underlying their transition from environmental bacterium to human pathogen are not well characterized. To address this knowledge gap, we delineated the structure of M. abscessus subspecies abscessus and massiliense using genomic data from 200 clinical isolates of M. abscessus from seven geographical regions. We identified differences in overall patterns of lateral gene transfer (LGT) and barriers to LGT between subspecies and between environmental and host-adapted bacteria. We further characterized genome reorganization that accompanied bacterial host adaptation, inferring selection pressures acting at both genic and intergenic loci. We found that both subspecies encode an expansive pangenome with many genes at rare frequencies. Recombination appears more frequent in M. abscessus subsp. massiliense than in subsp. abscessus, consistent with prior reports. We found evidence suggesting that phage are exchanged between subspecies, despite genetic barriers evident elsewhere throughout the genome. Patterns of LGT differed according to niche, with less LGT observed among host-adapted DCCs versus environmental bacteria. We also found evidence suggesting that DCCs are under distinct selection pressures at both genic and intergenic sites. Our results indicate that host adaptation of M. abscessus was accompanied by major changes in genome evolution, including shifts in the apparent frequency of LGT and impacts of selection. Differences were evident among the DCCs as well, which varied in the degree of gene content remodelling, suggesting they were placed differently along the evolutionary trajectory toward host adaptation. These results provide insight into the evolutionary forces that reshape bacterial genomes as they emerge into the pathogenic niche.
Affiliation(s)
- Lindsey L Bohr
- Department of Medical Microbiology and Immunology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
- Madison A Youngblom
- Department of Medical Microbiology and Immunology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
- Caitlin S Pepperell
- Department of Medical Microbiology and Immunology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA; Department of Medicine, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
7
A Translation-Aborting Small Open Reading Frame in the Intergenic Region Promotes Translation of a Mg2+ Transporter in Salmonella Typhimurium. mBio 2021; 12:mBio.03376-20. [PMID: 33849981 PMCID: PMC8092293 DOI: 10.1128/mbio.03376-20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/30/2023] Open
Abstract
Translation initiation regions in mRNAs that include the ribosome-binding site (RBS) and the start codon are often sequestered within a secondary structure. Therefore, to initiate protein synthesis, the mRNA secondary structure must be unfolded to allow the RBS to be accessible to the ribosome. Bacterial mRNAs often harbor upstream open reading frames (uORFs) in the 5′ untranslated regions (UTRs). Translation of the uORF usually affects downstream gene expression at the levels of transcription and/or translation initiation. Unlike other uORFs mostly located in the 5′ UTR, we discovered an 8-amino-acid ORF, designated mgtQ, in the intergenic region between the mgtC virulence gene and the mgtB Mg2+ transporter gene in the Salmonella mgtCBRU operon. Translation of mgtQ promotes downstream mgtB Mg2+ transporter expression at the level of translation by releasing the ribosome-binding sequence of the mgtB gene that is sequestered in a translation-inhibitory stem-loop structure. Interestingly, mgtQ Asp2 and Glu5 codons that induce ribosome destabilization are required for mgtQ-mediated mgtB translation. Moreover, the mgtQ Asp and Glu codons-mediated mgtB translation is counteracted by the ribosomal subunit L31 that stabilizes ribosome. Substitution of the Asp2 and Glu5 codons in mgtQ decreases MgtB Mg2+ transporter production and thus attenuates Salmonella virulence in mice, likely by limiting Mg2+ acquisition during infection.
8
Wang L, Wang M, Shi X, Yang J, Qian C, Liu Q, Zong L, Liu X, Zhu Z, Tang D, Zhang X. Investigation into archaeal extremophilic lifestyles through comparative proteogenomic analysis. J Biomol Struct Dyn 2020; 39:7080-7092. [PMID: 32820705 DOI: 10.1080/07391102.2020.1808531] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/03/2023]
Abstract
Archaea are a group of primary life forms on Earth and can thrive in many unique environments. Their successful colonization of extreme niches requires corresponding adaptations at the proteogenomic level in order to maintain stable cellular structures and active physiological functions. Although some studies have already investigated the extremophilic lifestyles of archaeal species based on genomic features and protein structures, there is a lack of comparative proteogenomic analysis on a large scale. In this study, we explored 686 high-quality archaeal genomes (proteomes) sourced from the Pathosystems Resource Integration Center (PATRIC) database. General patterns of genomic features such as genome size, coding capacity (coding genes and non-coding regions), and G + C contents were re-confirmed. Protein domain distribution patterns were then identified across archaeal species. Domains with unknown functions (DUFs) and mini proteins were investigated in terms of their distributions due to their importance in archaeal physiological functions. In addition, physicochemical properties of protein sequences, such as stability, hydrophobicity, isoelectric point, aromaticity and amino acid compositions in corresponding archaeal groups were compared. Unique features associated with extremophilic lifestyles were observed, which suggested that evolutionary adaptations to different extreme environments had intrinsic impacts on archaeal protein features. Taken together, this systematic study facilitates a better understanding of the mechanisms behind the extremophilic lifestyles of archaeal species, which will further contribute to the evolutionary explorations of archaeal adaptations both experimentally and theoretically in future studies. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Liang Wang
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China; Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu, China; Jiangsu Key Lab of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Mengmeng Wang
- Jiangsu Key Lab of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu, China; Department of Pharmaceutical Analysis, School of Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Xinyi Shi
- School of Life Science, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Jianye Yang
- School of Life Science, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Chenlu Qian
- School of Life Science, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Qinghua Liu
- Jiangsu Key Lab of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu, China; Department of Pharmaceutical Analysis, School of Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Lixin Zong
- School of Life Science, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Xin Liu
- Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Zuobin Zhu
- School of Life Science, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Daoquan Tang
- Jiangsu Key Lab of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu, China; Department of Pharmaceutical Analysis, School of Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Xiao Zhang
- Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu, China; Department of Computer Science, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu, China
9
Menendez-Gil P, Caballero CJ, Catalan-Moreno A, Irurzun N, Barrio-Hernandez I, Caldelari I, Toledo-Arana A. Differential evolution in 3'UTRs leads to specific gene expression in Staphylococcus. Nucleic Acids Res 2020; 48:2544-2563. [PMID: 32016395 PMCID: PMC7049690 DOI: 10.1093/nar/gkaa047] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 05/10/2019] [Revised: 12/05/2019] [Accepted: 01/16/2020] [Indexed: 12/16/2022] Open
Abstract
The evolution of gene expression regulation has contributed to species differentiation. The 3' untranslated regions (3'UTRs) of mRNAs include regulatory elements that modulate gene expression; however, our knowledge of their implications in the divergence of bacterial species is currently limited. In this study, we performed genome-wide comparative analyses of mRNAs encoding orthologous proteins from the genus Staphylococcus and found that mRNA conservation was lost mostly downstream of the coding sequence (CDS), indicating the presence of high sequence diversity in the 3'UTRs of orthologous genes. Transcriptomic mapping of different staphylococcal species confirmed that 3'UTRs were also variable in length. We constructed chimeric mRNAs carrying the 3'UTR of orthologous genes and demonstrated that 3'UTR sequence variations affect protein production. This suggested that species-specific functional 3'UTRs might be specifically selected during evolution. 3'UTR variations may occur through different processes, including gene rearrangements, local nucleotide changes, and the transposition of insertion sequences. By extending the conservation analyses to specific 3'UTRs, as well as the entire set of Escherichia coli and Bacillus subtilis mRNAs, we showed that 3'UTR variability is widespread in bacteria. In summary, our work unveils an evolutionary bias within 3'UTRs that results in species-specific non-coding sequences that may contribute to bacterial diversity.
Affiliation(s)
- Pilar Menendez-Gil
- Instituto de Agrobiotecnología (IdAB), CSIC-UPNA-Gobierno de Navarra, 31192-Mutilva, Navarra, Spain
- Carlos J Caballero
- Instituto de Agrobiotecnología (IdAB), CSIC-UPNA-Gobierno de Navarra, 31192-Mutilva, Navarra, Spain
- Arancha Catalan-Moreno
- Instituto de Agrobiotecnología (IdAB), CSIC-UPNA-Gobierno de Navarra, 31192-Mutilva, Navarra, Spain
- Naiara Irurzun
- Instituto de Agrobiotecnología (IdAB), CSIC-UPNA-Gobierno de Navarra, 31192-Mutilva, Navarra, Spain
- Inigo Barrio-Hernandez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Isabelle Caldelari
- Université de Strasbourg, CNRS, Architecture et Réactivité de l’ARN, UPR9002, F-67000-Strasbourg, France
- Alejandro Toledo-Arana
- Instituto de Agrobiotecnología (IdAB), CSIC-UPNA-Gobierno de Navarra, 31192-Mutilva, Navarra, Spain
10
Wang L, Luo Y, Zhao Y, Gao GF, Bi Y, Qiu HJ. Comparative genomic analysis reveals an 'open' pan-genome of African swine fever virus. Transbound Emerg Dis 2020; 67:1553-1562. [PMID: 31965706 DOI: 10.1111/tbed.13489] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/31/2019] [Revised: 01/02/2020] [Accepted: 01/16/2020] [Indexed: 12/22/2022]
Abstract
The worldwide transmission of African swine fever virus (ASFV) drastically affects the pig industry and global trade. Development of vaccines is hindered by the lack of knowledge of the genomic characteristics of ASFV. In this study, we developed a pipeline for the de novo assembly of ASFV genome without virus isolation and purification. We then used a comparative genomics approach to systematically study 46 genomes of ASFVs to reveal the genomic characteristics. The analysis revealed that ASFV has an 'open' pan-genome based on both protein-coding genes and intergenic regions. Of the 151-174 genes found in the ASFV strains, only 86 were identified as core genes; the remainder were flexible accessory genes. Notably, 44 of the 86 core genes and 155 of the 324 accessory genes have been functionally annotated according to the known proteins. Interestingly, a dynamic number of taxis-related genes were identified in the accessory genes, and two potential virulence genes were identified in all ASFV isolates. The 'open' pan-genome of ASFV based on gene and intergenic regions reveals its pronounced natural diversity concerning genomic composition and regulation.
Affiliation(s)
- Liang Wang
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Center for Influenza Research and Early-warning (CASCIRE), Chinese Academy of Sciences, Beijing, China
- Yuzi Luo
- State Key Laboratory of Veterinary Biotechnology, Harbin Veterinary Research Institute of the Chinese Academy of Agricultural Sciences, Harbin, China
- Yuhui Zhao
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Center for Influenza Research and Early-warning (CASCIRE), Chinese Academy of Sciences, Beijing, China
- George F Gao
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Center for Influenza Research and Early-warning (CASCIRE), Chinese Academy of Sciences, Beijing, China; Shenzhen Key Laboratory of Pathogen and Immunity, Guangdong Key Laboratory for Diagnosis and Treatment of Emerging Infectious Diseases, State Key Discipline of Infectious Disease, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen Third People's Hospital, Shenzhen, China; National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention (China CDC), Beijing, China; Savaid Medical School, University of Chinese Academy of Sciences, Beijing, China
- Yuhai Bi
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Center for Influenza Research and Early-warning (CASCIRE), Chinese Academy of Sciences, Beijing, China; Shenzhen Key Laboratory of Pathogen and Immunity, Guangdong Key Laboratory for Diagnosis and Treatment of Emerging Infectious Diseases, State Key Discipline of Infectious Disease, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen Third People's Hospital, Shenzhen, China
- Hua-Ji Qiu
- State Key Laboratory of Veterinary Biotechnology, Harbin Veterinary Research Institute of the Chinese Academy of Agricultural Sciences, Harbin, China
11
Khademi SMH, Sazinas P, Jelsbak L. Within-Host Adaptation Mediated by Intergenic Evolution in Pseudomonas aeruginosa. Genome Biol Evol 2019; 11:1385-1397. [PMID: 30980662 PMCID: PMC6505451 DOI: 10.1093/gbe/evz083] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Accepted: 04/09/2019] [Indexed: 12/21/2022] Open
Abstract
Bacterial pathogens evolve during the course of infection as they adapt to the selective pressures that confront them inside the host. Identification of adaptive mutations and their contributions to pathogen fitness remains a central challenge. Although mutations can target either intergenic or coding regions in the pathogen genome, studies of host adaptation have focused predominantly on molecular evolution within coding regions, whereas the role of intergenic mutations remains unclear. Here, we address this issue and investigate the extent to which intergenic mutations contribute to the evolutionary response of a clinically important bacterial pathogen, Pseudomonas aeruginosa, to the host environment, and whether intergenic mutations have distinct roles in host adaptation. We characterize intergenic evolution in 44 clonal lineages of P. aeruginosa and identify 77 intergenic regions in which parallel evolution occurs. At the genetic level, we find that mutations in regions under selection are located primarily within regulatory elements upstream of transcriptional start sites. At the functional level, we show that some of these mutations either increase or decrease transcription of genes and are directly responsible for evolution of important pathogenic phenotypes including antibiotic sensitivity. Importantly, we find that intergenic mutations enable essential genes to become targets of evolution. In summary, our results highlight the evolutionary significance of intergenic mutations in creating host-adapted strains, and show that intergenic and coding regions make qualitatively different contributions to this process.
Affiliation(s)
- S M Hossein Khademi
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Lyngby, Denmark; Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden
- Pavelas Sazinas
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Lyngby, Denmark
12
Jiang X, Hall AB, Arthur TD, Plichta DR, Covington CT, Poyet M, Crothers J, Moses PL, Tolonen AC, Vlamakis H, Alm EJ, Xavier RJ. Invertible promoters mediate bacterial phase variation, antibiotic resistance, and host adaptation in the gut. Science 2019; 363:181-187. [PMID: 30630933 DOI: 10.1126/science.aau5238] [Citation(s) in RCA: 86] [Impact Index Per Article: 14.3] [Received: 06/19/2018] [Accepted: 12/03/2018] [Indexed: 12/20/2022]
Abstract
Phase variation, the reversible alternation between genetic states, enables infection by pathogens and colonization by commensals. However, the diversity of phase variation remains underexplored. We developed the PhaseFinder algorithm to quantify DNA inversion-mediated phase variation. A systematic search of 54,875 bacterial genomes identified 4686 intergenic invertible DNA regions (invertons), revealing an enrichment in host-associated bacteria. Invertons containing promoters often regulate extracellular products, underscoring the importance of surface diversity for gut colonization. We found invertons containing promoters regulating antibiotic resistance genes that shift to the ON orientation after antibiotic treatment in human metagenomic data and in vitro, thereby mitigating the cost of antibiotic resistance. We observed that the orientations of some invertons diverge after fecal microbiota transplant, potentially as a result of individual-specific selective forces.
Affiliation(s)
- Xiaofang Jiang
- Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- A Brantley Hall
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Damian R Plichta
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Christian T Covington
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Mathilde Poyet
- Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Jessica Crothers
- Department of Pathology and Laboratory Medicine, University of Vermont Medical Center, Burlington, VT 05401, USA
- Peter L Moses
- Division of Gastroenterology and Hepatology, University of Vermont, Burlington, VT 05401, USA
- Andrew C Tolonen
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Hera Vlamakis
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Eric J Alm
- Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Ramnik J Xavier
- Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Gastrointestinal Unit and Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
13
Shelyakin PV, Bochkareva OO, Karan AA, Gelfand MS. Micro-evolution of three Streptococcus species: selection, antigenic variation, and horizontal gene inflow. BMC Evol Biol 2019; 19:83. [PMID: 30917781 PMCID: PMC6437910 DOI: 10.1186/s12862-019-1403-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/22/2017] [Accepted: 02/25/2019] [Indexed: 02/07/2023] Open
Abstract
Background The genus Streptococcus comprises pathogens that strongly influence the health of humans and animals. Genome sequencing of multiple Streptococcus strains demonstrated high variability in gene content and order even in closely related strains of the same species and created a newly emerged object for genomic analysis, the pan-genome. Here we analysed the genome evolution of 25 strains of Streptococcus suis, 50 strains of Streptococcus pyogenes and 28 strains of Streptococcus pneumoniae. Results Fractions of the pan-genome, unique, periphery, and universal genes differ in size, functional composition, the level of nucleotide substitutions, and predisposition to horizontal gene transfer and genomic rearrangements. The density of substitutions in intergenic regions appears to be correlated with selection acting on adjacent genes, implying that more conserved genes tend to have more conserved regulatory regions. The total pan-genome of the genus is open, but only due to strain-specific genes, whereas other pan-genome fractions reach saturation. We have identified the set of genes with phylogenies inconsistent with species and non-conserved location in the chromosome; these genes are rare in at least one species and have likely experienced recent horizontal transfer between species. The strain-specific fraction is enriched with mobile elements and hypothetical proteins, but also contains a number of candidate virulence-related genes, so it may have a strong impact on adaptability and pathogenicity. Mapping the rearrangements to the phylogenetic tree revealed large parallel inversions in all species. A parallel inversion of length 15 kB with breakpoints formed by genes encoding surface antigen proteins PhtD and PhtB in S. pneumoniae leads to replacement of gene fragments that likely indicates the action of an antigen variation mechanism. 
Conclusions Members of the genus Streptococcus have a highly dynamic, open pan-genome that potentially gives them the ability to adapt to changing environmental conditions, e.g. antibiotic resistance or transmission between different hosts. Hence, integrated analysis of all aspects of genome evolution is important for the identification of potential pathogens and design of drugs and vaccines.
Affiliation(s)
- Pavel V Shelyakin
- Vavilov Institute of General Genetics Russian Academy of Sciences, Gubkina str. 3, Moscow, 119991, Russia; Kharkevich Institute for Information Transmission Problems, 19, Bolshoy Karetny per., Moscow, 127051, Russia; Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Russia
- Olga O Bochkareva
- Kharkevich Institute for Information Transmission Problems, 19, Bolshoy Karetny per., Moscow, 127051, Russia; Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Russia
- Anna A Karan
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Mikhail S Gelfand
- Kharkevich Institute for Information Transmission Problems, 19, Bolshoy Karetny per., Moscow, 127051, Russia; Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Russia; Faculty of Computer Science, Higher School of Economics, Moscow, Russia
14
Purifying and positive selection in the evolution of stop codons. Sci Rep 2018; 8:9260. [PMID: 29915293 PMCID: PMC6006363 DOI: 10.1038/s41598-018-27570-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 01/11/2018] [Accepted: 05/18/2018] [Indexed: 12/13/2022] Open
Abstract
Modes of evolution of stop codons in protein-coding genes, especially the conservation of UAA, have been debated for many years. We reconstructed the evolution of stop codons in 40 groups of closely related prokaryotic and eukaryotic genomes. The results indicate that the UAA codons are maintained by purifying selection in all domains of life. In contrast, positive selection appears to drive switches from UAG to other stop codons in prokaryotes but not in eukaryotes. Changes in stop codons are significantly associated with increased substitution frequency immediately downstream of the stop. These positions are otherwise more strongly conserved in evolution compared to sites farther downstream, suggesting that such substitutions are compensatory. Although GC content has a major impact on stop codon frequencies, its contribution to the decreased frequency of UAA differs between bacteria and archaea, presumably, due to differences in their translation termination mechanisms.
15
Thorpe HA, Bayliss SC, Sheppard SK, Feil EJ. Piggy: a rapid, large-scale pan-genome analysis tool for intergenic regions in bacteria. Gigascience 2018; 7:1-11. [PMID: 29635296 PMCID: PMC5890482 DOI: 10.1093/gigascience/giy015] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 09/24/2017] [Revised: 01/09/2018] [Accepted: 02/16/2018] [Indexed: 12/31/2022] Open
Abstract
Background The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences. Findings To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli. Conclusions For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).
Affiliation(s)
- Harry A Thorpe
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY
- Sion C Bayliss
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY
- Samuel K Sheppard
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY
- Edward J Feil
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY
16
Comparative Analyses of Selection Operating on Nontranslated Intergenic Regions of Diverse Bacterial Species. Genetics 2017; 206:363-376. [PMID: 28280056 DOI: 10.1534/genetics.116.195784] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 09/09/2016] [Accepted: 02/26/2017] [Indexed: 12/31/2022] Open
Abstract
Nontranslated intergenic regions (IGRs) compose 10-15% of bacterial genomes, and contain many regulatory elements with key functions. Despite this, there are few systematic studies on the strength and direction of selection operating on IGRs in bacteria using whole-genome sequence data sets. Here we exploit representative whole-genome data sets from six diverse bacterial species: Staphylococcus aureus, Streptococcus pneumoniae, Mycobacterium tuberculosis, Salmonella enterica, Klebsiella pneumoniae, and Escherichia coli. We compare patterns of selection operating on IGRs using two independent methods: the proportion of singleton mutations and the dI/dS ratio, where dI is the number of intergenic SNPs per intergenic site. We find that the strength of purifying selection operating over all intergenic sites is consistently intermediate between that operating on synonymous and nonsynonymous sites. Ribosome binding sites and noncoding RNAs tend to be under stronger selective constraint than promoters and Rho-independent terminators. Strikingly, a clear signal of purifying selection remains even when all these major categories of regulatory elements are excluded, and this constraint is highest immediately upstream of genes. While a paucity of variation means that the data for M. tuberculosis are more equivocal than for the other species, we find strong evidence for positive selection within promoters of this species. This points to a key adaptive role for regulatory changes in this important pathogen. Our study underlines the feasibility and utility of gauging the selective forces operating on bacterial IGRs from whole-genome sequence data, and suggests that our current understanding of the functionality of these sequences is far from complete.
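The two statistics named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the example counts are hypothetical, and dS is assumed here to be defined by analogy with dI, as the number of synonymous SNPs per synonymous site.

```python
from typing import Sequence


def per_site_rate(snp_count: int, site_count: int) -> float:
    """SNPs per site for one class of sites (intergenic or synonymous)."""
    if site_count <= 0:
        raise ValueError("site_count must be positive")
    return snp_count / site_count


def di_ds(intergenic_snps: int, intergenic_sites: int,
          synonymous_snps: int, synonymous_sites: int) -> float:
    """dI/dS ratio. dI follows the abstract's definition; dS is an
    assumed analogue (synonymous SNPs per synonymous site)."""
    d_i = per_site_rate(intergenic_snps, intergenic_sites)
    d_s = per_site_rate(synonymous_snps, synonymous_sites)
    if d_s == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_i / d_s


def singleton_proportion(minor_allele_counts: Sequence[int]) -> float:
    """Fraction of SNPs whose minor allele occurs in exactly one genome."""
    if not minor_allele_counts:
        raise ValueError("no SNPs supplied")
    singletons = sum(1 for count in minor_allele_counts if count == 1)
    return singletons / len(minor_allele_counts)


# Hypothetical counts, for illustration only.
ratio = di_ds(120, 40_000, 300, 60_000)
share = singleton_proportion([1, 1, 3, 5])
```

Under these assumed definitions, a ratio below 1 would be read as intergenic sites carrying fewer SNPs per site than synonymous sites, and a higher singleton share in one site class as more variants being kept at low frequency.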
17
Martin O, Krzywicki A, Zagorski M. Drivers of structural features in gene regulatory networks: From biophysical constraints to biological function. Phys Life Rev 2016; 17:124-58. [DOI: 10.1016/j.plrev.2016.06.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 12/21/2015] [Revised: 03/25/2016] [Accepted: 04/20/2016] [Indexed: 12/23/2022]
18
Matelska D, Kurkowska M, Purta E, Bujnicki JM, Dunin-Horkawicz S. Loss of Conserved Noncoding RNAs in Genomes of Bacterial Endosymbionts. Genome Biol Evol 2016; 8:426-38. [PMID: 26782934 PMCID: PMC4779614 DOI: 10.1093/gbe/evw007] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 01/18/2023] Open
Abstract
The genomes of intracellular symbiotic or pathogenic bacteria, such as of Buchnera, Mycoplasma, and Rickettsia, are typically smaller compared with their free-living counterparts. Here we showed that noncoding RNA (ncRNA) families, which are conserved in free-living bacteria, frequently could not be detected by computational methods in the small genomes. Statistical tests demonstrated that their absence is not an artifact of low GC content or small deletions in these small genomes, and thus it was indicative of an independent loss of ncRNAs in different endosymbiotic lineages. By analyzing the synteny (conservation of gene order) between the reduced and nonreduced genomes, we revealed instances of protein-coding genes that were preserved in the reduced genomes but lost cis-regulatory elements. We found that the loss of cis-regulatory ncRNA sequences, which regulate the expression of cognate protein-coding genes, is characterized by the reduction of secondary structure formation propensity, GC content, and length of the corresponding genomic regions.
Affiliation(s)
- Dorota Matelska
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
- Malgorzata Kurkowska
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
- Elzbieta Purta
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland; Laboratory of Structural Bioinformatics, Institute of Molecular Biology and Biotechnology, Adam Mickiewicz University, Poznan, Poland
- Stanislaw Dunin-Horkawicz
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
19
Galán-Vásquez E, Sánchez-Osorio I, Martínez-Antonio A. Transcription Factors Exhibit Differential Conservation in Bacteria with Reduced Genomes. PLoS One 2016; 11:e0146901. [PMID: 26766575 PMCID: PMC4713081 DOI: 10.1371/journal.pone.0146901] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 08/11/2015] [Accepted: 12/23/2015] [Indexed: 11/18/2022] Open
Abstract
The description of transcriptional regulatory networks has been pivotal in the understanding of operating principles under which organisms respond and adapt to varying conditions. While the study of the topology and dynamics of these networks has been the subject of considerable work, the investigation of the evolution of their topology, as a result of the adaptation of organisms to different environmental conditions, has received little attention. In this work, we study the evolution of transcriptional regulatory networks in bacteria from a genome reduction perspective, which manifests itself as the loss of genes at different degrees. We used the transcriptional regulatory network of Escherichia coli as a reference to compare 113 smaller, phylogenetically-related γ-proteobacteria, including 19 genomes of symbionts. We found that the type of regulatory action exerted by transcription factors, as genomes get progressively smaller, correlates well with their degree of conservation, with dual regulators being more conserved than repressors and activators in conditions of extreme reduction. In addition, we found that the preponderant conservation of dual regulators might be due to their role as both global regulators and nucleoid-associated proteins. We summarize our results in a conceptual model of how each TF type is gradually lost as genomes become smaller and give a rationale for the order in which this phenomenon occurs.
Affiliation(s)
- Edgardo Galán-Vásquez
- Center for Research and Advanced Studies of the National Polytechnic Institute, Campus Irapuato, Genetic Engineering Department, Cinvestav, Km. 9.6 Libramiento Norte Carr. Irapuato-León 36821, Irapuato Gto, México
- Ismael Sánchez-Osorio
- Center for Research and Advanced Studies of the National Polytechnic Institute, Campus Irapuato, Genetic Engineering Department, Cinvestav, Km. 9.6 Libramiento Norte Carr. Irapuato-León 36821, Irapuato Gto, México
- Agustino Martínez-Antonio
- Center for Research and Advanced Studies of the National Polytechnic Institute, Campus Irapuato, Genetic Engineering Department, Cinvestav, Km. 9.6 Libramiento Norte Carr. Irapuato-León 36821, Irapuato Gto, México
20
DNA Methylation Assessed by SMRT Sequencing Is Linked to Mutations in Neisseria meningitidis Isolates. PLoS One 2015; 10:e0144612. [PMID: 26656597 PMCID: PMC4676702 DOI: 10.1371/journal.pone.0144612] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 08/07/2015] [Accepted: 11/20/2015] [Indexed: 11/20/2022] Open
Abstract
The Gram-negative bacterium Neisseria meningitidis features extensive genetic variability. To date, proposed virulence genotypes are also detected in isolates from asymptomatic carriers, indicating more complex mechanisms underlying variable colonization modes of N. meningitidis. We applied the Single Molecule, Real-Time (SMRT) sequencing method from Pacific Biosciences to assess the genome-wide DNA modification profiles of two genetically related N. meningitidis strains, both of serogroup A. The resulting DNA methylomes revealed clear divergences, represented by the detection of shared and of strain-specific DNA methylation target motifs. The positional distribution of these methylated target sites within the genomic sequences displayed clear biases, which suggest a functional role of DNA methylation related to the regulation of genes. DNA methylation in N. meningitidis has a likely underestimated potential for variability, as evidenced by a careful analysis of the ORF status of a panel of confirmed and predicted DNA methyltransferase genes in an extended collection of N. meningitidis strains of serogroup A. Based on high coverage short sequence reads, we find phase variability as a major contributor to the variability in DNA methylation. Taking into account the phase variable loci, the inferred functional status of DNA methyltransferase genes matched the observed methylation profiles. Towards an elucidation of presently incompletely characterized functional consequences of DNA methylation in N. meningitidis, we reveal a prominent colocalization of methylated bases with Single Nucleotide Polymorphisms (SNPs) detected within our genomic sequence collection. As a novel observation we report increased mutability also at 6mA methylated nucleotides, complementing mutational hotspots previously described at 5mC methylated nucleotides. These findings suggest a more diverse role of DNA methylation and Restriction-Modification (RM) systems in the evolution of prokaryotic genomes.
21
Abstract
Regulation of gene expression ensures an organism responds to stimuli and undergoes proper development. Although the regulatory networks in bacteria have been investigated in model microorganisms, nearly nothing is known about the evolution and plasticity of these networks in obligate, intracellular bacteria. The phylum Chlamydiae contains a vast array of host-associated microbes, including several human pathogens. The Chlamydiae are unique among obligate, intracellular bacteria as they undergo a complex biphasic developmental cycle in which large swaths of genes are temporally regulated. Coupled with the low number of transcription factors, these organisms offer a model to study the evolution of regulatory networks in intracellular organisms. We provide the first comprehensive analysis exploring the diversity and evolution of regulatory networks across the phylum. We utilized a comparative genomics approach to construct predicted coregulatory networks, which unveiled genus- and family-specific regulatory motifs and architectures, most notably those of virulence-associated genes. Surprisingly, our analysis suggests that few regulatory components are conserved across the phylum, and those that are conserved are involved in the exploitation of the intracellular niche. Our study thus lends insight into a component of chlamydial evolution that has otherwise remained largely unexplored.
Affiliation(s)
- D Domman
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria
- M Horn
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria
22
Brbić M, Warnecke T, Kriško A, Supek F. Global Shifts in Genome and Proteome Composition Are Very Tightly Coupled. Genome Biol Evol 2015; 7:1519-32. [PMID: 25971281 PMCID: PMC4494046 DOI: 10.1093/gbe/evv088] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Accepted: 05/09/2015] [Indexed: 02/05/2023] Open
Abstract
The amino acid composition (AAC) of proteomes differs greatly between microorganisms and is associated with the environmental niche they inhabit, suggesting that these changes may be adaptive. Similarly, the oligonucleotide composition of genomes varies and may confer advantages at the DNA/RNA level. These influences overlap in protein-coding sequences, making it difficult to gauge their relative contributions. We disentangle these effects by systematically evaluating the correspondence between intergenic nucleotide composition, where protein-level selection is absent, the AAC, and ecological parameters of 909 prokaryotes. We find that G + C content, the most frequently used measure of genomic composition, cannot capture diversity in AAC and across ecological contexts. However, di-/trinucleotide composition in intergenic DNA predicts amino acid frequencies of proteomes to the point where very little cross-species variability remains unexplained (91% of variance accounted for). Qualitatively similar results were obtained for 49 fungal genomes, where 80% of the variability in AAC could be explained by the composition of introns and intergenic regions. Upon factoring out oligonucleotide composition and phylogenetic inertia, the residual AAC is poorly predictive of the microbes' ecological preferences, in stark contrast with the original AAC. Moreover, highly expressed genes do not exhibit more prominent environment-related AAC signatures than lowly expressed genes, despite contributing more to the effective proteome. Thus, evolutionary shifts in overall AAC appear to occur almost exclusively through factors shaping the global oligonucleotide content of the genome. We discuss these results in light of contravening evidence from biophysical data and further reading frame-specific analyses that suggest that adaptation takes place at the protein level.
Affiliation(s)
- Maria Brbić
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia; Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia
- Tobias Warnecke
- MRC Clinical Sciences Centre, Imperial College, Hammersmith Campus, London, United Kingdom
- Anita Kriško
- Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia
- Fran Supek
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia; EMBL/CRG Systems Biology Unit, Centre for Genomic Regulation, Barcelona, Spain
23
Mehmood T, Bohlin J, Snipen L. A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:560-567. [PMID: 26357267 DOI: 10.1109/tcbb.2014.2366146] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
The upstream region of coding genes is important for several reasons, for instance for locating transcription factor binding sites and initiation start sites in genomic DNA. Motivated by a recently conducted study, in which a multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for discriminating true prokaryotic upstream sequences from background upstream sequences. The analysis considered the upstream sequences of coding genes conserved across genomes, where conserved coding genes were identified using the pan-genome concept for each prokaryotic species considered. PLS uses a position-specific scoring matrix (PSSM) to characterize the upstream region. Results obtained by the PLS-based method were compared with the Gini importance of random forest (RF) and with support vector machine (SVM), a widely used method for sequence classification. The upstream sequence classification performance was evaluated using cross-validation, and the suggested approach identifies prokaryotic upstream regions significantly better than RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
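As a rough illustration of the PSSM representation mentioned in this abstract, the Python sketch below builds a log-odds matrix from aligned, equal-length upstream sequences and scores a candidate against it. It covers only the PSSM step, not the PLS classifier, and it is not the authors' procedure: the function names, the pseudocount, the uniform background frequency, and the toy sequences are all assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

ALPHABET = "ACGT"


def build_pssm(seqs: Sequence[str], pseudocount: float = 1.0,
               background: float = 0.25) -> List[Dict[str, float]]:
    """Log2-odds PSSM from aligned sequences of equal length.

    Assumes a uniform background and a simple additive pseudocount.
    """
    if not seqs:
        raise ValueError("at least one sequence is required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + pseudocount * len(ALPHABET)
    pssm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        column = {
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in ALPHABET
        }
        pssm.append(column)
    return pssm


def score(pssm: List[Dict[str, float]], seq: str) -> float:
    """Sum of per-position log-odds for a sequence of matching length."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the PSSM")
    return sum(column[base] for column, base in zip(pssm, seq))


# Toy aligned upstream fragments (hypothetical data).
pssm = build_pssm(["AGGA", "AGGA", "AGGT"])
matching = score(pssm, "AGGA")
non_matching = score(pssm, "CTTC")
```

In a classification setting, per-position scores like these could serve as input features for a downstream model; which features the published method actually feeds into PLS is not specified in the abstract.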
24
Abstract
The concept of the minimal cell has fascinated scientists for a long time, from both fundamental and applied points of view. This broad concept encompasses extreme reductions of genomes, the last universal common ancestor (LUCA), the creation of semiartificial cells, and the design of protocells and chassis cells. Here we review these different areas of research and identify common and complementary aspects of each one. We focus on systems biology, a discipline that is greatly facilitating the classical top-down and bottom-up approaches toward minimal cells. In addition, we also review the so-called middle-out approach and its contributions to the field with mathematical and computational models. Owing to the advances in genomics technologies, much of the work in this area has been centered on minimal genomes, or rather minimal gene sets, required to sustain life. Nevertheless, a fundamental expansion has been taking place in the last few years wherein the minimal gene set is viewed as a backbone of a more complex system. Complementing genomics, progress is being made in understanding the system-wide properties at the levels of the transcriptome, proteome, and metabolome. Network modeling approaches are enabling the integration of these different omics data sets toward an understanding of the complex molecular pathways connecting genotype to phenotype. We review key concepts central to the mapping and modeling of this complexity, which is at the heart of research on minimal cells. Finally, we discuss the distinction between minimizing the number of cellular components and minimizing cellular complexity, toward an improved understanding and utilization of minimal and simpler cells.
25
López-Leal G, Tabche ML, Castillo-Ramírez S, Mendoza-Vargas A, Ramírez-Romero MA, Dávila G. RNA-Seq analysis of the multipartite genome of Rhizobium etli CE3 shows different replicon contributions under heat and saline shock. BMC Genomics 2014; 15:770. [PMID: 25201548 PMCID: PMC4167512 DOI: 10.1186/1471-2164-15-770] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 04/28/2014] [Accepted: 09/03/2014] [Indexed: 12/23/2022] Open
Abstract
Background Regulation of transcription is essential for any organism and Rhizobium etli (a multi-replicon, nitrogen-fixing symbiotic bacterium) is no exception. This bacterium is commonly found in the rhizosphere (free-living) or inside of root-nodules of the common bean (Phaseolus vulgaris) in a symbiotic relationship. Abiotic stresses, such as high soil temperatures and salinity, compromise the genetic stability of R. etli and therefore its symbiotic interaction with P. vulgaris. However, it is still unclear which genes are up- or down-regulated to cope with these stress conditions. The aim of this study was to identify the genes and non-coding RNAs (ncRNAs) that are differentially expressed under heat and saline shock, as well as the promoter regions of the up-regulated loci. Results Analysing the heat and saline shock responses of R. etli CE3 through RNA-Seq, we identified 756 and 392 differentially expressed genes, respectively, and 106 were up-regulated under both conditions. Notably, the set of genes over-expressed under either condition was preferentially encoded on plasmids, although this observation was more significant for the heat shock response. In contrast, during either saline shock or heat shock, the down-regulated genes were principally chromosomally encoded. Our functional analysis shows that genes encoding chaperone proteins were up-regulated during the heat shock response, whereas genes involved in the metabolism of compatible solutes were up-regulated following saline shock. Furthermore, we identified thirteen and nine ncRNAs that were differentially expressed under heat and saline shock, respectively, as well as eleven ncRNAs that had not been previously identified. Finally, using an in silico analysis, we studied the promoter motifs in all of the non-coding regions associated with the genes and ncRNAs up-regulated under both conditions. 
Conclusions Our data suggest that the replicon contribution is different for different stress responses and that the heat shock response is more complex than the saline shock response. In general, this work exemplifies how strategies that not only consider differentially regulated genes but also regulatory elements of the stress response provide a more comprehensive view of bacterial gene regulation.
Affiliation(s)
- Gamaliel López-Leal
- Programa de Genómica Evolutiva, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Apartado Postal 565-A, Cuernavaca, Morelos C.P. 62210, México.
|
26
|
Martínez-Núñez MA, Poot-Hernandez AC, Rodríguez-Vázquez K, Perez-Rueda E. Increments and duplication events of enzymes and transcription factors influence metabolic and regulatory diversity in prokaryotes. PLoS One 2013; 8:e69707. [PMID: 23922780 PMCID: PMC3726781 DOI: 10.1371/journal.pone.0069707] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 01/03/2013] [Accepted: 06/13/2013] [Indexed: 11/18/2022] Open
Abstract
In this work, the content of enzymes and DNA-binding transcription factors (TFs) in 794 non-redundant prokaryotic genomes was evaluated. The identification of enzymes was based on annotations deposited in the KEGG database as well as in databases of functional domains (COG and PFAM) and structural domains (Superfamily). To identify the TFs, hidden Markov profiles were constructed based on well-known transcriptional regulatory families. From these analyses, we obtained diverse results, such as the negative rate of incremental changes in the number of detected enzymes with respect to genome size. In contrast, for TFs the rate increased as genome complexity increased. This inverse relationship shapes the diversity of metabolic and regulatory networks and impacts the availability of enzymes and TFs. Furthermore, the intersection of the derivatives between enzymes and TFs was identified at 9,659 genes; beyond this point, regulatory complexity grows faster than metabolic complexity. In addition, TFs have a low number of duplications, in contrast to the apparently high number of duplications associated with enzymes. Despite the greater number of duplicated enzymes versus TFs, the increment by which duplicates appear is higher in TFs. A lower proportion of enzymes among archaeal genomes (22%) than in the bacterial ones (27%) was also found. This low proportion might be compensated by the interconnection between the metabolic pathways in Archaea. A similar proportion was also found for the archaeal TFs, for which the formation of regulatory complexes has been proposed. Finally, an enrichment of multifunctional enzymes in Bacteria, as a mechanism of ecological adaptation, was detected.
Affiliation(s)
- Mario Alberto Martínez-Núñez
- Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, México D.F., México
- * E-mail: (MMN); (EPR)
- Augusto Cesar Poot-Hernandez
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Katya Rodríguez-Vázquez
- Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, México D.F., México
- Ernesto Perez-Rueda
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- * E-mail: (MMN); (EPR)
|
27
|
Bohlin J, Brynildsrud O, Vesth T, Skjerve E, Ussery DW. Amino acid usage is asymmetrically biased in AT- and GC-rich microbial genomes. PLoS One 2013; 8:e69878. [PMID: 23922837 PMCID: PMC3724673 DOI: 10.1371/journal.pone.0069878] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 02/19/2013] [Accepted: 06/14/2013] [Indexed: 11/18/2022] Open
Abstract
INTRODUCTION Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of prokaryotic genomes is not protein coding even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid- and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates. RESULTS We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations as a consequence of relaxed purifying selection than genomes with higher AAUB. CONCLUSION Genomic base composition has a substantial effect on both amino acid- and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT-content was driving amino acid usage in AT-rich genomes. We found the GAMM model to be an excellent tool to analyze the genomic data used in this study.
Affiliation(s)
- Jon Bohlin
- Centre for Epidemiology and Biostatistics, Department of Food Safety and Infection Biology, Norwegian School of Veterinary Science, Oslo, Norway.
|
28
|
The genome organization of Thermotoga maritima reflects its lifestyle. PLoS Genet 2013; 9:e1003485. [PMID: 23637642 PMCID: PMC3636130 DOI: 10.1371/journal.pgen.1003485] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 06/15/2012] [Accepted: 03/13/2013] [Indexed: 01/01/2023] Open
Abstract
The generation of genome-scale data is becoming more routine, yet the subsequent analysis of omics data remains a significant challenge. Here, an approach that integrates multiple omics datasets with bioinformatics tools was developed that produces a detailed annotation of several microbial genomic features. This methodology was used to characterize the genome of Thermotoga maritima, a phylogenetically deep-branching, hyperthermophilic bacterium. Experimental data were generated for whole-genome resequencing, transcription start site (TSS) determination, transcriptome profiling, and proteome profiling. These datasets, analyzed in combination with bioinformatics tools, served as a basis for the improvement of gene annotation, the elucidation of transcription units (TUs), the identification of putative non-coding RNAs (ncRNAs), and the determination of promoters and ribosome binding sites. This revealed many distinctive properties of the T. maritima genome organization relative to other bacteria. This genome has a high number of genes per TU (3.3), a paucity of putative ncRNAs (12), and few TUs with multiple TSSs (3.7%). Quantitative analysis of promoters and ribosome binding sites showed increased sequence conservation relative to other bacteria. The 5′UTRs follow an atypical bimodal length distribution comprised of “Short” 5′UTRs (11–17 nt) and “Common” 5′UTRs (26–32 nt). Transcriptional regulation is limited by a lack of intergenic space for the majority of TUs. Lastly, a high fraction of annotated genes are expressed independent of growth state and a linear correlation of mRNA/protein is observed (Pearson r = 0.63, p < 2.2×10⁻¹⁶, t-test). These distinctive properties are hypothesized to be a reflection of this organism's hyperthermophilic lifestyle and could yield novel insights into the evolutionary trajectory of microbial life on earth.
Genomic studies have greatly benefited from the advent of high-throughput technologies and bioinformatics tools. Here, a methodology integrating genome-scale data and bioinformatics tools is developed to characterize the genome organization of the hyperthermophilic, phylogenetically deep-branching bacterium Thermotoga maritima. This approach elucidates several features of the genome organization and enables comparative analysis of these features across diverse taxa. Our results suggest that the genome of T. maritima is reflective of its hyperthermophilic lifestyle. Ultimately, constraints imposed on the genome have negative impacts on regulatory complexity and phenotypic diversity. Investigating the genome organization of Thermotogae species will help resolve various causal factors contributing to the genome organization such as phylogeny and environment. Applying a similar analysis of the genome organization to numerous taxa will likely provide insights into microbial evolution.
|
29
|
Harwich MD, Serrano MG, Fettweis JM, Alves JMP, Reimers MA, Buck GA, Jefferson KK. Genomic sequence analysis and characterization of Sneathia amnii sp. nov. BMC Genomics 2012; 13 Suppl 8:S4. [PMID: 23281612 PMCID: PMC3535699 DOI: 10.1186/1471-2164-13-s8-s4] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.8] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Bacteria of the genus Sneathia are emerging as potential pathogens of the female reproductive tract. Species of Sneathia, which were formerly grouped with Leptotrichia, can be part of the normal microbiota of the genitourinary tracts of men and women, but they are also associated with a variety of clinical conditions including bacterial vaginosis, preeclampsia, preterm labor, spontaneous abortion, post-partum bacteremia and other invasive infections. Sneathia species also exhibit a significant correlation with sexually transmitted diseases and cervical cancer. Because Sneathia species are fastidious and rarely cultured successfully in vitro, and the genomes of members of the genus had until now not been characterized, very little is known about the physiology or the virulence of these organisms. RESULTS Here, we describe a novel species, Sneathia amnii sp. nov., which closely resembles bacteria previously designated "Leptotrichia amnionii". As part of the Vaginal Human Microbiome Project at VCU, a vaginal isolate of S. amnii sp. nov. was identified, successfully cultured and bacteriologically cloned. The biochemical characteristics and virulence properties of the organism were examined in vitro, and the genome of the organism was sequenced, annotated and analyzed. The analysis revealed a reduced circular genome of ~1.34 Mbp, containing ~1,282 protein-coding genes. Metabolic reconstruction of the bacterium reflected its biochemical phenotype, and several genes potentially associated with pathogenicity were identified. CONCLUSIONS Bacteria with complex growth requirements frequently remain poorly characterized and, as a consequence, their roles in health and disease are unclear. Elucidation of the physiology and identification of genes putatively involved in the metabolism and virulence of S. amnii may lead to a better understanding of the role of this potential pathogen in bacterial vaginosis, preterm birth, and other issues associated with vaginal and reproductive health.
Affiliation(s)
- Michael D Harwich
- Department of Microbiology and Immunology, Virginia Commonwealth University School of Medicine, 1101 E. Marshall Street - PO Box 980678, Richmond, VA 23298-0678, USA
|
30
|
Pachkov M, Balwierz PJ, Arnold P, Ozonov E, van Nimwegen E. SwissRegulon, a database of genome-wide annotations of regulatory sites: recent updates. Nucleic Acids Res 2012. [PMID: 23180783 PMCID: PMC3531101 DOI: 10.1093/nar/gks1145] [Citation(s) in RCA: 102] [Impact Index Per Article: 7.8] [Indexed: 12/31/2022] Open
Abstract
Identification of genomic regulatory elements is essential for understanding the dynamics of cellular processes. This task has been substantially facilitated by the availability of genome sequences for many species and high-throughput data of transcripts and transcription factor (TF) binding. However, rigorous computational methods are necessary to derive accurate genome-wide annotations of regulatory sites from such data. SwissRegulon (http://swissregulon.unibas.ch) is a database containing genome-wide annotations of regulatory motifs, promoters and TF binding sites (TFBSs) in promoter regions across model organisms. Its binding site predictions were obtained with rigorous Bayesian probabilistic methods that operate on orthologous regions from related genomes, and use explicit evolutionary models to assess the evidence of purifying selection on each site. New in the current version of SwissRegulon is a curated collection of 190 mammalian regulatory motifs associated with ∼340 TFs, and TFBS annotations across a curated set of ∼35 000 promoters in both human and mouse. Predictions of TFBSs for Saccharomyces cerevisiae have also been significantly extended and now cover 158 of yeast’s ∼180 TFs. All data are accessible through both an easily navigable genome browser with search functions, and as flat files that can be downloaded for further analysis.
Affiliation(s)
- Mikhail Pachkov
- Biozentrum, University of Basel, and Swiss Institute of Bioinformatics, Klingelbergstrasse 50/70, CH-4056 Basel, Switzerland
|
31
|
Campanaro S, Pascale FD, Telatin A, Schiavon R, Bartlett DH, Valle G. The transcriptional landscape of the deep-sea bacterium Photobacterium profundum in both a toxR mutant and its parental strain. BMC Genomics 2012; 13:567. [PMID: 23107454 PMCID: PMC3505737 DOI: 10.1186/1471-2164-13-567] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 08/21/2012] [Accepted: 10/16/2012] [Indexed: 02/08/2023] Open
Abstract
Background The deep-sea bacterium Photobacterium profundum is an established model for studying high pressure adaptation. In this paper we analyse the parental strain DB110 and the toxR mutant TW30 by massively parallel cDNA sequencing (RNA-seq). ToxR is a transmembrane DNA-binding protein first discovered in Vibrio cholerae, where it regulates a considerable number of genes involved in environmental adaptation and virulence. In P. profundum the abundance and activity of this protein are influenced by hydrostatic pressure and its role is related to the regulation of genes in a pressure-dependent manner. Results To better characterize the ToxR regulon, we compared the expression profiles of wt and toxR strains in response to pressure changes. Our results revealed a complex expression pattern with a group of 22 genes having expression profiles similar to that of OmpH, an outer membrane protein transcribed in response to high hydrostatic pressure. Moreover, RNA-seq allowed a deep characterization of the transcriptional landscape that led to the identification of 460 putative small RNA genes and the detection of 298 previously unknown protein-coding genes. We were also able to perform a genome-wide prediction of operon structure, transcription start and termination sites, revealing an unexpectedly high number of genes (992) with large 5′-UTRs, long enough to harbour cis-regulatory RNA structures, suggesting a correlation between intergenic region size and UTR length. Conclusion This work led to a better understanding of high-pressure response in P. profundum. Furthermore, the high-resolution RNA-seq analysis revealed several unexpected features of the transcriptional landscape and general mechanisms of controlling bacterial gene expression.
Affiliation(s)
- Stefano Campanaro
- Department of Biology and CRIBI Biotechnology Centre, University of Padua, Via Ugo Bassi 58/B, Padova 35131, Italy.
|
32
|
Tsoy OV, Pyatnitskiy MA, Kazanov MD, Gelfand MS. Evolution of transcriptional regulation in closely related bacteria. BMC Evol Biol 2012; 12:200. [PMID: 23039862 PMCID: PMC3735044 DOI: 10.1186/1471-2148-12-200] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 03/05/2012] [Accepted: 09/26/2012] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND The exponential growth of the number of fully sequenced genomes at varying taxonomic closeness allows one to characterize transcriptional regulation using comparative-genomics analysis instead of time-consuming experimental methods. A transcriptional regulatory unit consists of a transcription factor, its binding site and a regulated gene. These units constitute a graph which contains so-called "network motifs", subgraphs of a given structure. Here we consider genomes of closely related Enterobacteriales and estimate the fraction of conserved network motifs and sites as well as positions under selection in various types of non-coding regions. RESULTS Using a newly developed technique, we found that the highest fraction of positions under selection, approximately 50%, was observed in synvergon spacers (between consecutive genes from the same strand), followed by ~45% in divergon spacers (common 5'-regions), and ~10% in convergon spacers (common 3'-regions). The fraction of selected positions in functional regions was higher, 60% in transcription factor-binding sites and ~45% in terminators and promoters. Small, but significant differences were observed between Escherichia coli and Salmonella enterica. This fraction is similar to the one observed in eukaryotes. The conservation of binding sites demonstrated some differences between types of regulatory units. In E. coli strains, the interactions of the type "local transcriptional factor gene" turned out to be more conserved in feed-forward loops (FFLs) compared to non-motif interactions. The coherent FFLs tend to be less conserved than the incoherent FFLs. A natural explanation is that the former imply functional redundancy. CONCLUSIONS A naïve hypothesis that FFLs would be highly conserved turned out to be not entirely true: their conservation depends on their status in the transcriptional network and also on their usage. The fraction of positions under selection in intergenic regions of bacterial genomes is roughly similar to that of eukaryotes. Known regulatory sites explain 20±5% of selected positions.
Affiliation(s)
- Olga V Tsoy
- Institute for Information Transmission Problems, RAS, Bolshoi Karetny per. 19, Moscow 127994, Russia
|
33
|
Arnold P, Erb I, Pachkov M, Molina N, van Nimwegen E. MotEvo: integrated Bayesian probabilistic methods for inferring regulatory sites and motifs on multiple alignments of DNA sequences. Bioinformatics 2012; 28:487-94. [PMID: 22334039 DOI: 10.1093/bioinformatics/btr695] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022]
Abstract
MOTIVATION Probabilistic approaches for inferring transcription factor binding sites (TFBSs) and regulatory motifs from DNA sequences have been developed for over two decades. Previous work has shown that prediction accuracy can be significantly improved by incorporating features such as the competition of multiple transcription factors (TFs) for binding to nearby sites, the tendency of TFBSs for co-regulated TFs to cluster and form cis-regulatory modules and explicit evolutionary modeling of conservation of TFBSs across orthologous sequences. However, currently available tools only incorporate some of these features, and significant methodological hurdles have hampered their synthesis into a single consistent probabilistic framework. RESULTS We present MotEvo, an integrated suite of Bayesian probabilistic methods for the prediction of TFBSs and inference of regulatory motifs from multiple alignments of phylogenetically related DNA sequences, which incorporates all features just mentioned. In addition, MotEvo incorporates a novel model for detecting unknown functional elements that are under evolutionary constraint, and a new robust model for treating gain and loss of TFBSs along a phylogeny. Rigorous benchmarking tests on ChIP-seq datasets show that MotEvo's novel features significantly improve the accuracy of TFBS prediction, motif inference and enhancer prediction. AVAILABILITY Source code, a user manual and files with several example applications are available at www.swissregulon.unibas.ch.
Affiliation(s)
- Phil Arnold
- Biozentrum, University of Basel, Swiss Institute of Bioinformatics, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
|
34
|
Charneski CA, Honti F, Bryant JM, Hurst LD, Feil EJ. Atypical AT skew in Firmicute genomes results from selection and not from mutation. PLoS Genet 2011; 7:e1002283. [PMID: 21935355 PMCID: PMC3174206 DOI: 10.1371/journal.pgen.1002283] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 02/25/2011] [Accepted: 07/12/2011] [Indexed: 11/18/2022] Open
Abstract
The second parity rule states that, if there is no bias in mutation or selection, then within each strand of DNA complementary bases are present at approximately equal frequencies. In bacteria, however, there is commonly an excess of G (over C) and, to a lesser extent, T (over A) in the replicatory leading strand. The low G+C Firmicutes, such as Staphylococcus aureus, are unusual in displaying an excess of A over T on the leading strand. As mutation has been established as a major force in the generation of such skews across various bacterial taxa, this anomaly has been assumed to reflect unusual mutation biases in Firmicute genomes. Here we show that this is not the case and that mutation bias does not explain the atypical AT skew seen in S. aureus. First, recently arisen intergenic SNPs predict the classical replication-derived equilibrium enrichment of T relative to A, contrary to what is observed. Second, sites predicted to be under weak purifying selection display only weak AT skew. Third, AT skew is primarily associated with largely non-synonymous first and second codon sites and is seen with respect to their sense direction, not which replicating strand they lie on. The atypical AT skew we show to be a consequence of the strong bias for genes to be co-oriented with the replicating fork, coupled with the selective avoidance of both stop codons and costly amino acids, which tend to have T-rich codons. That intergenic sequence has more A than T, while at mutational equilibrium a preponderance of T is expected, points to a possible further unresolved selective source of skew. When considering a single strand of DNA, it is not necessarily the case that the frequency of each base should equal its complementary partner, such that A = T and G = C. For the leading strand, it is typically the case that Gs are more common than Cs, and Ts more common than As. This bias is widely thought to arise due to different mutational biases during replication. 
The Firmicutes exhibit an atypical preference for A over T on the leading strand, and here we show that selection, rather than mutation, can explain this exception. For those bases within coding regions, selection acts to inflate the frequency of A over T in order to avoid stop codons and to use metabolically cheap amino acids. Because genes are not orientated randomly, this manifests as an overall enrichment of A on the leading strand. Furthermore, a direct examination of mutational patterns is inconsistent with the observed enrichment of As. Curiously, our data also point to an unresolved source of selection on synonymous and intergenic sites, which are widely assumed to be neutral.
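The skew discussed in this abstract can be illustrated with a short calculation. The sketch below assumes the commonly used convention AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C), computed on a single strand; the abstract itself does not state a formula, and the function names `skew`, `at_skew`, and `gc_skew` are hypothetical, chosen only for this example.

```python
from collections import Counter


def skew(seq: str, first: str, second: str) -> float:
    """Return (first - second) / (first + second) for two bases on one strand.

    Assumed convention, not taken from the abstract. Returns 0.0 when
    neither base occurs, to avoid division by zero.
    """
    counts = Counter(seq.upper())
    total = counts[first] + counts[second]
    if total == 0:
        return 0.0
    return (counts[first] - counts[second]) / total


def at_skew(seq: str) -> float:
    """Positive values mean an excess of A over T on the given strand."""
    return skew(seq, "A", "T")


def gc_skew(seq: str) -> float:
    """Positive values mean an excess of G over C on the given strand."""
    return skew(seq, "G", "C")


if __name__ == "__main__":
    # Toy leading-strand fragment; a real analysis would use whole replichores.
    fragment = "AAATGGGC"
    print(at_skew(fragment), gc_skew(fragment))
```

Under the second parity rule both values would be close to zero; a positive `at_skew` on the leading strand corresponds to the atypical A-over-T excess described for Firmicutes.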
|
35
|
Abstract
We tested whether functionally important sites in bacterial, yeast, and animal promoters are more conserved than their neighbors. We found that substitutions are predominantly seen in less important sites and that those that occurred tended to have less impact on gene expression than possible alternatives. These results suggest that purifying selection operates on promoter sequences.
|
36
|
Mattick JS. The central role of RNA in human development and cognition. FEBS Lett 2011; 585:1600-16. [DOI: 10.1016/j.febslet.2011.05.001] [Citation(s) in RCA: 167] [Impact Index Per Article: 11.9] [Received: 04/29/2011] [Accepted: 05/03/2011] [Indexed: 12/22/2022]
|
37
|
Koonin EV, Wolf YI. Constraints and plasticity in genome and molecular-phenome evolution. Nat Rev Genet 2010; 11:487-98. [PMID: 20548290 DOI: 10.1038/nrg2810] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.6] [Indexed: 02/01/2023]
Abstract
Multiple constraints variously affect different parts of the genomes of diverse life forms. The selective pressures that shape the evolution of viral, archaeal, bacterial and eukaryotic genomes differ markedly, even among relatively closely related animal and bacterial lineages; by contrast, constraints affecting protein evolution seem to be more universal. The constraints that shape the evolution of genomes and phenomes are complemented by the plasticity and robustness of genome architecture, expression and regulation. Taken together, these findings are starting to reveal complex networks of evolutionary processes that must be integrated to attain a new synthesis of evolutionary biology.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
38
|
Rangannan V, Bansal M. High-quality annotation of promoter regions for 913 bacterial genomes. Bioinformatics 2010; 26:3043-50. [PMID: 20956245 DOI: 10.1093/bioinformatics/btq577] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 01/04/2023]
Abstract
MOTIVATION The number of bacterial genomes being sequenced is increasing very rapidly and hence, it is crucial to have procedures for rapid and reliable annotation of their functional elements such as promoter regions, which control the expression of each gene or each transcription unit of the genome. The present work addresses this requirement and presents a generic method applicable across organisms. RESULTS Relative stability of the DNA double helical sequences has been used to discriminate promoter regions from non-promoter regions. Based on the difference in stability between neighboring regions, an algorithm has been implemented to predict promoter regions on a large scale over 913 microbial genome sequences. The average free energy values for the promoter regions as well as their downstream regions are found to differ, depending on their GC content. Threshold values to identify promoter regions have been derived using sequences flanking a subset of translation start sites from all microbial genomes and then used to predict promoters over the complete genome sequences. An average recall value of 72% (which indicates the percentage of protein and RNA coding genes with predicted promoter regions assigned to them) and precision of 56% is achieved over the 913 microbial genome dataset. AVAILABILITY The binary executable for 'PromPredict' algorithm (implemented in PERL and supported on Linux and MS Windows) and the predicted promoter data for all 913 microbial genomes are available at http://nucleix.mbu.iisc.ernet.in/prombase/.
|
39
|
Elyashiv E, Bullaughey K, Sattath S, Rinott Y, Przeworski M, Sella G. Shifts in the intensity of purifying selection: an analysis of genome-wide polymorphism data from two closely related yeast species. Genome Res 2010; 20:1558-73. [PMID: 20817943 DOI: 10.1101/gr.108993.110] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Indexed: 11/25/2022]
Abstract
How much does the intensity of purifying selection vary among populations and species? How uniform are the shifts in selective pressures across the genome? To address these questions, we took advantage of a recent, whole-genome polymorphism data set from two closely related species of yeast, Saccharomyces cerevisiae and S. paradoxus, paying close attention to the population structure within these species. We found that the average intensity of purifying selection on amino acid sites varies markedly among populations and between species. As expected in the presence of extensive weakly deleterious mutations, the effect of purifying selection is substantially weaker on single nucleotide polymorphisms (SNPs) segregating within populations than on SNPs fixed between population samples. Also in accordance with a Nearly Neutral model, the variation in the intensity of purifying selection across populations corresponds almost perfectly to simple measures of their effective size. As a first step toward understanding the processes generating these patterns, we sought to tease apart the relative importance of systematic, genome-wide changes in the efficacy of selection, such as those expected from demographic processes and of gene-specific changes, which may be expected after a shift in selective pressures. For that purpose, we developed a new model for the evolution of purifying selection between populations and inferred its parameters from the genome-wide data using a likelihood approach. We found that most, but not all changes seem to be explained by systematic shifts in the efficacy of selection. One population, the sake-derived strains of S. cerevisiae, however, also shows extensive gene-specific changes, plausibly associated with domestication. These findings have important implications for our understanding of purifying selection as well as for estimates of the rate of molecular adaptation in yeast and in other species.
Affiliation(s)
- Eyal Elyashiv
- Department of Evolution, Systematics, and Ecology, Hebrew University of Jerusalem, Jerusalem 91905, Israel
|
40
|
Supek F, Škunca N, Repar J, Vlahoviček K, Šmuc T. Translational selection is ubiquitous in prokaryotes. PLoS Genet 2010; 6:e1001004. [PMID: 20585573 PMCID: PMC2891978 DOI: 10.1371/journal.pgen.1001004] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Received: 12/11/2009] [Accepted: 05/26/2010] [Indexed: 11/29/2022] Open
Abstract
Codon usage bias in prokaryotic genomes is largely a consequence of background substitution patterns in DNA, but highly expressed genes may show a preference towards codons that enable more efficient and/or accurate translation. We introduce a novel approach based on supervised machine learning that detects effects of translational selection on genes, while controlling for local variation in nucleotide substitution patterns represented as sequence composition of intergenic DNA. A cornerstone of our method is a Random Forest classifier that outperformed previous distance measure-based approaches, such as the codon adaptation index, in the task of discerning the (highly expressed) ribosomal protein genes by their codon frequencies. Unlike previous reports, we show evidence that translational selection in prokaryotes is practically universal: in 460 of 461 examined microbial genomes, we find that a subset of genes shows a higher codon usage similarity to the ribosomal proteins than would be expected from the local sequence composition. These genes constitute a substantial part of the genome—between 5% and 33%, depending on genome size—while also exhibiting higher experimentally measured mRNA abundances and tending toward codons that match tRNA anticodons by canonical base pairing. Certain gene functional categories are generally enriched with, or depleted of codon-optimized genes, the trends of enrichment/depletion being conserved between Archaea and Bacteria. Prominent exceptions from these trends might indicate genes with alternative physiological roles; we speculate on specific examples related to detoxication of oxygen radicals and ammonia and to possible misannotations of asparaginyl–tRNA synthetases. 
Since the presence of codon optimizations on genes is a valid proxy for expression levels in fully sequenced genomes, we provide an example of an “adaptome” by highlighting gene functions with expression levels elevated specifically in thermophilic Bacteria and Archaea.

Synonymous codons are not equally common in genomes. The main causes of unequal codon usage are varying nucleotide substitution patterns, as manifested in the wide range of genomic nucleotide compositions. However, since the first E. coli and yeast genes were sequenced, it became evident that there was also a bias towards codons that can be translated to protein faster and more accurately. This bias was stronger in highly expressed genes, and its driving force was termed translational selection. Researchers sought effects of translational selection in microbial genomes as they became available, employing a flurry of mathematical approaches which sometimes led to contradictory conclusions. We introduce a sensitive and accurate machine learning-based methodology and find that highly expressed genes have a recognizable codon usage pattern in almost every bacterial and archaeal genome analyzed, even after accounting for large differences in background nucleotide composition. We also show that the gene functional category has a great bearing on whether that gene is subject to translational selection. Since the presence of codon optimizations can be used as a purely sequence-derived proxy for expression levels, we can delineate “adaptomes” by relating predicted gene activity to organisms' phenotypes, which we demonstrate on genomes of temperature-resistant Bacteria and Archaea.
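The abstract describes classifying genes by their codon frequencies. As a rough illustration of what such an input could look like, the sketch below turns a coding sequence into a 64-element vector of relative codon frequencies. It is only a minimal sketch: the function name `codon_frequencies` and the fixed codon ordering are assumptions made here, and neither the Random Forest classifier nor the correction for intergenic sequence composition described by the authors is reproduced.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every gene maps to a vector of equal length.
# (The ordering is an arbitrary choice made for this illustration.)
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(cds):
    """Return relative codon frequencies for an in-frame coding sequence."""
    seq = cds.upper()
    # Split into complete in-frame triplets; a trailing partial codon is ignored.
    usable = len(seq) - len(seq) % 3
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip triplets containing ambiguous bases such as N.
    counts = Counter(t for t in triplets if t in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


# Toy sequence, not a real gene: ATG AAA AAA TAA
freqs = codon_frequencies("ATGAAAAAATAA")
```

Vectors of this kind could then be passed to any supervised classifier; which classifier and which background correction to use remain choices that the abstract, not this sketch, specifies.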
Affiliation(s)
- Fran Supek
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
- Nives Škunca
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
- Jelena Repar
- Division of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
- Kristian Vlahoviček
- Division of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Department of Informatics, University of Oslo, Oslo, Norway
- Tomislav Šmuc
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
41
Beisel CL, Storz G. Base pairing small RNAs and their roles in global regulatory networks. FEMS Microbiol Rev 2010; 34:866-82. [PMID: 20662934 DOI: 10.1111/j.1574-6976.2010.00241.x] [Citation(s) in RCA: 220] [Impact Index Per Article: 14.7] [Indexed: 11/28/2022] Open
Abstract
Bacteria use a range of RNA regulators collectively termed small RNAs (sRNAs) to help respond to changes in the environment. Many sRNAs regulate their target mRNAs through limited base-pairing interactions. Ongoing characterization of base-pairing sRNAs in bacteria has started to reveal how these sRNAs participate in global regulatory networks. These networks can be broken down into smaller regulatory circuits that have characteristic behaviors and functions. In this review, we describe the specific regulatory circuits that incorporate base-pairing sRNAs and the importance of each circuit in global regulation. Because most of these circuits were originally identified as network motifs in transcriptional networks, we also discuss why sRNAs may be used over protein transcription factors to help transduce environmental signals.
Affiliation(s)
- Chase L Beisel
- Cell Biology and Metabolism Program, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892-5430, USA
42
Toolbox model of evolution of prokaryotic metabolic networks and their regulation. Proc Natl Acad Sci U S A 2009; 106:9743-8. [PMID: 19482938 DOI: 10.1073/pnas.0903206106] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.9] [Indexed: 01/02/2023] Open
Abstract
It has been reported that the number of transcription factors encoded in prokaryotic genomes scales approximately quadratically with their total number of genes. We propose a conceptual explanation of this finding and illustrate it using a simple model in which metabolic and regulatory networks of prokaryotes are shaped by horizontal gene transfer of coregulated metabolic pathways. Adapting to a new environmental condition monitored by a new transcription factor (e.g., learning to use another nutrient) involves both acquiring new enzymes and reusing some of the enzymes already encoded in the genome. As the repertoire of enzymes of an organism (its toolbox) grows larger, it can reuse its enzyme tools more often and thus needs to get fewer new ones to master each new task. From this observation, it logically follows that the number of functional tasks and their regulators increases faster than linearly with the total number of genes encoding enzymes. Genomes can also shrink, e.g., because of a loss of a nutrient from the environment, followed by deletion of its regulator and all enzymes that become redundant. We propose several simple models of network evolution elaborating on this toolbox argument and reproducing the empirically observed quadratic scaling. The distribution of lengths of pathway branches in our model agrees with that of the real-life metabolic network of Escherichia coli. Thus, our model provides a qualitative explanation for broad distributions of regulon sizes in prokaryotes.
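The step from "fewer new enzymes per new task" to quadratic scaling can be made explicit with a short worked equation. This is only an illustrative sketch under an assumption introduced here, not necessarily the authors' formulation: suppose the number of new enzymes needed for each added regulated task is inversely proportional to the current enzyme count. The symbols $N_E$ (enzyme-encoding genes), $N_R$ (regulators) and the constant $a$ are notation chosen for this sketch.

```latex
% Assumption (illustrative): each new regulated task requires a number of new
% enzymes inversely proportional to the current enzyme count N_E.
\frac{dN_E}{dN_R} = \frac{a}{N_E}
\quad\Longrightarrow\quad
N_E \, dN_E = a \, dN_R
\quad\Longrightarrow\quad
N_R = \frac{N_E^{2}}{2a} + \text{const}
```

Under that assumption the number of regulators grows as the square of the number of enzyme-encoding genes, which matches the quadratic scaling the abstract reports.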
43
Molina N, van Nimwegen E. Scaling laws in functional genome content across prokaryotic clades and lifestyles. Trends Genet 2009; 25:243-7. [PMID: 19457568 DOI: 10.1016/j.tig.2009.04.004] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Received: 02/15/2009] [Revised: 04/29/2009] [Accepted: 04/29/2009] [Indexed: 11/28/2022]
Abstract
For high-level functional categories that are represented in almost all prokaryotic genomes, the numbers of genes in these categories scale as power-laws in the total number of genes. We present a comprehensive analysis of the variation in these scaling laws across prokaryotic clades and lifestyles. For the large majority of functional categories, including transcription regulators, the inferred scaling laws are statistically indistinguishable across clades and lifestyles, supporting the simple hypothesis that these scaling laws are universally shared by all prokaryotes.
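A power-law relation between the number of genes in a category and the total number of genes, y = c * x^alpha, becomes a straight line on log-log axes, so the exponent can be estimated as a slope. The sketch below uses ordinary least squares on log-transformed counts; this is a simple technique swapped in for illustration and is not claimed to be the inference procedure used by the authors. The function name `fit_power_law` and the data values are hypothetical.

```python
import math


def fit_power_law(total_genes, category_genes):
    """Least-squares fit of log(y) = log(c) + alpha * log(x); returns (c, alpha)."""
    xs = [math.log(x) for x in total_genes]
    ys = [math.log(y) for y in category_genes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope of the regression line in log-log space is the scaling exponent.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    # The intercept in log space gives the prefactor after exponentiation.
    c = math.exp(mean_y - alpha * mean_x)
    return c, alpha


# Made-up counts that follow y = 1e-5 * x**2 exactly (not real genome data).
genomes = [1000, 2000, 4000, 8000]
regulators = [1e-5 * g ** 2 for g in genomes]
prefactor, exponent = fit_power_law(genomes, regulators)
```

Comparing exponents fitted separately for different clades or lifestyles is one simple way to ask whether a scaling law is shared, although real data would also need uncertainty estimates on each exponent.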
Affiliation(s)
- Nacho Molina
- Biozentrum, the University of Basel and Swiss Institute of Bioinformatics, Klingelbergstrasse 50/70, 4056-CH, Basel, Switzerland
44
Balleza E, López-Bojorquez LN, Martínez-Antonio A, Resendis-Antonio O, Lozada-Chávez I, Balderas-Martínez YI, Encarnación S, Collado-Vides J. Regulation by transcription factors in bacteria: beyond description. FEMS Microbiol Rev 2009; 33:133-51. [PMID: 19076632 PMCID: PMC2704942 DOI: 10.1111/j.1574-6976.2008.00145.x] [Citation(s) in RCA: 137] [Impact Index Per Article: 8.6] [Indexed: 01/07/2023] Open
Abstract
Transcription is an essential step in gene expression and its understanding has been one of the major interests in molecular and cellular biology. By precisely tuning gene expression, transcriptional regulation determines the molecular machinery for developmental plasticity, homeostasis and adaptation. In this review, we transmit the main ideas or concepts behind regulation by transcription factors and give just enough examples to sustain these main ideas, thus avoiding a classical enumeration of facts. We review recent concepts and developments: cis elements and trans regulatory factors, chromosome organization and structure, transcriptional regulatory networks (TRNs) and transcriptomics. We also summarize new important discoveries that will probably affect the direction of research in gene regulation: epigenetics and stochasticity in transcriptional regulation, synthetic circuits and plasticity and evolution of TRNs. Many of the new discoveries in gene regulation are not extensively tested with wet-lab approaches. Consequently, we review this broad area in Inference of TRNs and Dynamical Models of TRNs. Finally, we have stepped backwards to trace the origins of these modern concepts, synthesizing their history in a timeline schema.
Affiliation(s)
- Enrique Balleza
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
45
Pérez-Rueda E, Janga SC, Martínez-Antonio A. Scaling relationship in the gene content of transcriptional machinery in bacteria. Mol Biosyst 2009; 5:1494-501. [DOI: 10.1039/b907384a] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/21/2022]
46
Koonin EV, Wolf YI. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res 2008; 36:6688-719. [PMID: 18948295 PMCID: PMC2588523 DOI: 10.1093/nar/gkn668] [Citation(s) in RCA: 474] [Impact Index Per Article: 27.9] [Indexed: 01/04/2023] Open
Abstract
The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often distant, organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution.
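The doubling times quoted in the abstract translate directly into growth factors through N(t) = N0 * 2^(t/T). The sketch below is a minimal illustration of that arithmetic; the function name `projected_count` and the starting count of 1 are assumptions made here, and no actual genome counts are implied.

```python
def projected_count(initial_count, months_elapsed, doubling_time_months):
    """Exponential growth with a fixed doubling time: N(t) = N0 * 2**(t / T)."""
    return initial_count * 2 ** (months_elapsed / doubling_time_months)


# Relative growth over five years (60 months), starting from a nominal count of 1.
bacterial_factor = projected_count(1, 60, 20)  # doubling time of about 20 months
archaeal_factor = projected_count(1, 60, 34)   # doubling time of about 34 months
```

With a 20-month doubling time, 60 months correspond to three doublings, that is, an eightfold increase; the 34-month doubling time gives fewer than two doublings over the same period.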
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.