1
Glasner JD, Marquez-Villavicencio M, Kim HS, Jahn CE, Ma B, Biehl BS, Rissman AI, Mole B, Yi X, Yang CH, Dangl JL, Grant SR, Perna NT, Charkowski AO. Niche-specificity and the variable fraction of the Pectobacterium pan-genome. Mol Plant Microbe Interact 2008; 21:1549-1560. [PMID: 18986251 DOI: 10.1094/mpmi-21-12-1549] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Indexed: 05/27/2023]
Abstract
We compare genome sequences of three closely related soft-rot pathogens that vary in host range and geographical distribution to identify genetic differences that could account for lifestyle differences. The isolates compared, Pectobacterium atrosepticum SCRI1043, P. carotovorum WPP14, and P. brasiliensis 1692, represent diverse lineages of the genus. P. carotovorum and P. brasiliensis genome contigs, generated by 454 pyrosequencing and ordered by reference to the previously published complete circular chromosome of P. atrosepticum and to each other, account for 96% of the predicted genome size. Orthologous proteins encoded by P. carotovorum and P. brasiliensis are approximately 95% identical to each other and 92% identical to P. atrosepticum. Multiple alignment using Mauve identified a core genome of 3.9 Mb conserved among these Pectobacterium spp. Each core genome is interrupted at many points by species-specific insertions or deletions (indels) that account for approximately 0.9 to 1.1 Mb. We demonstrate that the presence of a hrpK-like type III secretion system-dependent effector protein in P. carotovorum and P. brasiliensis and its absence from P. atrosepticum is insufficient to explain variability in their response to infection in a plant. Additional genes that vary among these species include those encoding peptide toxin production, enzyme production, secretion proteins, and antibiotic production, as well as differences in more general aspects of gene regulation and metabolism that may be relevant to pathogenicity.
Affiliation(s)
- J D Glasner
- Genome Center of Wisconsin, Madison, WI, USA.
2
Wei J, Goldberg MB, Burland V, Venkatesan MM, Deng W, Fournier G, Mayhew GF, Plunkett G, Rose DJ, Darling A, Mau B, Perna NT, Payne SM, Runyen-Janecky LJ, Zhou S, Schwartz DC, Blattner FR. Complete genome sequence and comparative genomics of Shigella flexneri serotype 2a strain 2457T. Infect Immun 2003; 71:2775-86. [PMID: 12704152 PMCID: PMC153260 DOI: 10.1128/iai.71.5.2775-2786.2003] [Citation(s) in RCA: 303] [Impact Index Per Article: 14.4] [Indexed: 11/20/2022]
Abstract
We determined the complete genome sequence of Shigella flexneri serotype 2a strain 2457T (4,599,354 bp). Shigella species cause >1 million deaths per year from dysentery and diarrhea and have a lifestyle that is markedly different from those of closely related bacteria, including Escherichia coli. The genome exhibits the backbone and island mosaic structure of E. coli pathogens, albeit with much less horizontally transferred DNA and lacking 357 genes present in E. coli. The strain is distinctive in its large complement of insertion sequences, with several genomic rearrangements mediated by insertion sequences, 12 cryptic prophages, 372 pseudogenes, and 195 S. flexneri-specific genes. The 2457T genome was also compared with that of a recently sequenced S. flexneri 2a strain, 301. Our data are consistent with Shigella being phylogenetically indistinguishable from E. coli. The S. flexneri-specific regions contain many genes that could encode proteins with roles in virulence. Analysis of these will reveal the genetic basis for aspects of this pathogenic organism's distinctive lifestyle that have yet to be explained.
Affiliation(s)
- J Wei
- Laboratory of Genetics and Genome Center, University of Wisconsin, Madison, Wisconsin 53706, USA
3
Welch RA, Burland V, Plunkett G, Redford P, Roesch P, Rasko D, Buckles EL, Liou SR, Boutin A, Hackett J, Stroud D, Mayhew GF, Rose DJ, Zhou S, Schwartz DC, Perna NT, Mobley HLT, Donnenberg MS, Blattner FR. Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc Natl Acad Sci U S A 2002; 99:17020-4. [PMID: 12471157 PMCID: PMC139262 DOI: 10.1073/pnas.252529799] [Citation(s) in RCA: 1026] [Impact Index Per Article: 46.6] [Indexed: 11/18/2022]
Abstract
We present the complete genome sequence of uropathogenic Escherichia coli, strain CFT073. A three-way genome comparison of the CFT073, enterohemorrhagic E. coli EDL933, and laboratory strain MG1655 reveals that, amazingly, only 39.2% of their combined (nonredundant) set of proteins actually are common to all three strains. The pathogen genomes are as different from each other as each pathogen is from the benign strain. The difference in disease potential between O157:H7 and CFT073 is reflected in the absence of genes for type III secretion system or phage- and plasmid-encoded toxins found in some classes of diarrheagenic E. coli. The CFT073 genome is particularly rich in genes that encode potential fimbrial adhesins, autotransporters, iron-sequestration systems, and phase-switch recombinases. Striking differences exist between the large pathogenicity islands of CFT073 and two other well-studied uropathogenic E. coli strains, J96 and 536. Comparisons indicate that extraintestinal pathogenic E. coli arose independently from multiple clonal lineages. The different E. coli pathotypes have maintained a remarkable synteny of common, vertically evolved genes, whereas many islands interrupting this common backbone have been acquired by different horizontal transfer events in each strain.
Affiliation(s)
- R A Welch
- Department of Medical Microbiology and Immunology, Laboratory of Genetics, Genome Center of Wisconsin, and Animal Health and Biological Sciences, University of Wisconsin, Madison, WI 53706, USA
4
Lim A, Dimalanta ET, Potamousis KD, Yen G, Apodoca J, Tao C, Lin J, Qi R, Skiadas J, Ramanathan A, Perna NT, Plunkett G, Burland V, Mau B, Hackett J, Blattner FR, Anantharaman TS, Mishra B, Schwartz DC. Shotgun optical maps of the whole Escherichia coli O157:H7 genome. Genome Res 2001; 11:1584-93. [PMID: 11544203 PMCID: PMC311123 DOI: 10.1101/gr.172101] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.0] [Received: 11/27/2000] [Accepted: 06/04/2001] [Indexed: 11/24/2022]
Abstract
We have constructed NheI and XhoI optical maps of Escherichia coli O157:H7 solely from genomic DNA molecules to provide a uniquely valuable scaffold for contig closure and sequence validation. E. coli O157:H7 is a common pathogen found in contaminated food and water. Our approach obviated the need for the analysis of clones, PCR products, and hybridizations, because maps were constructed from ensembles of single DNA molecules. Shotgun sequencing of bacterial genomes remains labor-intensive, despite advances in sequencing technology. This is partly due to manual intervention required during the last stages of finishing. The applicability of optical mapping to this problem was enhanced by advances in machine vision techniques that improved mapping throughput and created a path to full automation of mapping. Comparisons were made between maps and sequence data that characterized sequence gaps and guided nascent assemblies.
Affiliation(s)
- A Lim
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
5
Perna NT, Plunkett G, Burland V, Mau B, Glasner JD, Rose DJ, Mayhew GF, Evans PS, Gregor J, Kirkpatrick HA, Pósfai G, Hackett J, Klink S, Boutin A, Shao Y, Miller L, Grotbeck EJ, Davis NW, Lim A, Dimalanta ET, Potamousis KD, Apodaca J, Anantharaman TS, Lin J, Yen G, Schwartz DC, Welch RA, Blattner FR. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 2001; 409:529-33. [PMID: 11206551 DOI: 10.1038/35054089] [Citation(s) in RCA: 1470] [Impact Index Per Article: 63.9] [Indexed: 11/09/2022]
Abstract
The bacterium Escherichia coli O157:H7 is a worldwide threat to public health and has been implicated in many outbreaks of haemorrhagic colitis, some of which included fatalities caused by haemolytic uraemic syndrome. Close to 75,000 cases of O157:H7 infection are now estimated to occur annually in the United States. The severity of disease, the lack of effective treatment and the potential for large-scale outbreaks from contaminated food supplies have propelled intensive research on the pathogenesis and detection of E. coli O157:H7 (ref. 4). Here we have sequenced the genome of E. coli O157:H7 to identify candidate genes responsible for pathogenesis, to develop better methods of strain detection and to advance our understanding of the evolution of E. coli, through comparison with the genome of the non-pathogenic laboratory strain E. coli K-12 (ref. 5). We find that lateral gene transfer is far more extensive than previously anticipated. In fact, 1,387 new genes encoded in strain-specific clusters of diverse sizes were found in O157:H7. These include candidate virulence factors, alternative metabolic capacities, several prophages and other new functions--all of which could be targets for surveillance.
Affiliation(s)
- N T Perna
- Genome Center of Wisconsin, and Department of Animal Health and Biomedical Sciences, University of Wisconsin, Madison 53706, USA.
6
Ffrench-Constant RH, Waterfield N, Burland V, Perna NT, Daborn PJ, Bowen D, Blattner FR. A genomic sample sequence of the entomopathogenic bacterium Photorhabdus luminescens W14: potential implications for virulence. Appl Environ Microbiol 2000; 66:3310-29. [PMID: 10919786 PMCID: PMC92150 DOI: 10.1128/aem.66.8.3310-3329.2000] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.1] [Indexed: 12/24/2022]
Abstract
Photorhabdus luminescens is a pathogenic bacterium that lives in the guts of insect-pathogenic nematodes. After invasion of an insect host by a nematode, bacteria are released from the nematode gut and help kill the insect, in which both the bacteria and the nematodes subsequently replicate. However, the bacterial virulence factors associated with this "symbiosis of pathogens" remain largely obscure. In order to identify genes encoding potential virulence factors, we performed approximately 2,000 random sequencing reads from a P. luminescens W14 genomic library. We then compared the sequences obtained to sequences in existing gene databases and to the Escherichia coli K-12 genome sequence. Here we describe the different classes of potential virulence factors found. These factors include genes that putatively encode Tc insecticidal toxin complexes, Rtx-like toxins, proteases and lipases, colicin and pyocins, and various antibiotics. They also include a diverse array of secretion (e.g., type III), iron uptake, and lipopolysaccharide production systems. We speculate on the potential functions of each of these gene classes in insect infection and also examine the extent to which the invertebrate pathogen P. luminescens shares potential antivertebrate virulence factors. The implications for understanding both the biology of this insect pathogen and links between the evolution of vertebrate virulence factors and the evolution of invertebrate virulence factors are discussed.
7
Burland V, Shao Y, Perna NT, Plunkett G, Sofia HJ, Blattner FR. The complete DNA sequence and analysis of the large virulence plasmid of Escherichia coli O157:H7. Nucleic Acids Res 1998; 26:4196-204. [PMID: 9722640 PMCID: PMC147824 DOI: 10.1093/nar/26.18.4196] [Citation(s) in RCA: 235] [Impact Index Per Article: 9.0] [Indexed: 11/12/2022]
Abstract
The complete DNA sequence of pO157, the large virulence plasmid of EHEC strain O157:H7 EDL 933, is presented. The 92 kb F-like plasmid is composed of segments of putative virulence genes in a framework of replication and maintenance regions, with seven insertion sequence elements, located mostly at the boundaries of the virulence segments. One hundred open reading frames (ORFs) were identified, of which 19 were previously sequenced potential virulence genes. Forty-two ORFs were sufficiently similar to known proteins for suggested functions to be assigned, and 22 had no convincing similarity with any known proteins. Of the newly identified genes, an unusually large ORF of 3169 amino acids has a putative cytotoxin active site shared with the large clostridial toxin (LCT) family and proteins such as ToxA and B of Clostridium difficile. A conserved motif was detected that links the large ORF and the LCT proteins with the OCH1 family of glycosyltransferases. In the complete sequence, the mosaic form can be observed at the levels of base composition, codon usage and gene organization. Insights were obtained from patterns of DNA composition as well as the pathogenic and 'housekeeping' gene segments. Evolutionary trees built from shared plasmid maintenance genes show that even these genes have heterogeneous origins.
Affiliation(s)
- V Burland
- Laboratory of Genetics, University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA.
8
Perna NT, Mayhew GF, Pósfai G, Elliott S, Donnenberg MS, Kaper JB, Blattner FR. Molecular evolution of a pathogenicity island from enterohemorrhagic Escherichia coli O157:H7. Infect Immun 1998; 66:3810-7. [PMID: 9673266 PMCID: PMC108423 DOI: 10.1128/iai.66.8.3810-3817.1998] [Citation(s) in RCA: 307] [Impact Index Per Article: 11.8] [Received: 12/29/1997] [Accepted: 05/27/1998] [Indexed: 02/08/2023]
Abstract
We report the complete 43,359-bp sequence of the locus of enterocyte effacement (LEE) from EDL933, an enterohemorrhagic Escherichia coli O157:H7 serovar originally isolated from contaminated hamburger implicated in an outbreak of hemorrhagic colitis. The locus was isolated from the EDL933 chromosome with a homologous-recombination-driven targeting vector. Recent completion of the LEE sequence from enteropathogenic E. coli (EPEC) E2348/69 afforded the opportunity for a comparative analysis of the entire pathogenicity island. We have identified a total of 54 open reading frames in the EDL933 LEE. Of these, 13 fall within a putative P4 family prophage designated 933L. The prophage is not present in E2348/69 but is found in a closely related EPEC O55:H7 serovar and other O157:H7 isolates. The remaining 41 genes are shared by the two complete LEEs, and we describe the nature and extent of variation among the two strains for each gene. The rate of divergence is heterogeneous along the locus. Most genes show greater than 95% identity between the two strains, but other genes vary more than expected for clonal divergence among E. coli strains. Several of these highly divergent genes encode proteins that are known to be involved in interactions with the host cell. This pattern suggests recombinational divergence coupled with natural selection and has implications for our understanding of the interaction of both pathogens with their host, for the emergence of O157:H7, and for the evolutionary history of pathogens in general.
Affiliation(s)
- N T Perna
- Laboratory of Genetics, University of Wisconsin-Madison, Madison Wisconsin 53706, USA.
9
Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA, Goeden MA, Rose DJ, Mau B, Shao Y. The complete genome sequence of Escherichia coli K-12. Science 1997; 277:1453-62. [PMID: 9278503 DOI: 10.1126/science.277.5331.1453] [Citation(s) in RCA: 5277] [Impact Index Per Article: 195.4] [Indexed: 02/05/2023]
Abstract
The 4,639,221-base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome as a whole is strikingly organized with respect to the local direction of replication; guanines, oligonucleotides possibly related to replication and recombination, and most genes are so oriented. The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer.
MESH Headings
- Bacterial Proteins/chemistry
- Bacterial Proteins/genetics
- Bacterial Proteins/metabolism
- Bacteriophage lambda/genetics
- Base Composition
- Binding Sites
- Chromosome Mapping
- DNA Replication
- DNA Transposable Elements
- DNA, Bacterial/genetics
- Escherichia coli/genetics
- Genes, Bacterial
- Genome, Bacterial
- Molecular Sequence Data
- Mutation
- Operon
- RNA, Bacterial/genetics
- RNA, Transfer/genetics
- Recombination, Genetic
- Regulatory Sequences, Nucleic Acid
- Repetitive Sequences, Nucleic Acid
- Sequence Analysis, DNA
- Sequence Homology, Amino Acid
Affiliation(s)
- F R Blattner
- Laboratory of Genetics, University of Wisconsin-Madison, 445 Henry Mall, Madison, WI 53706, USA.
10
Abstract
The nuclear genomes of many animals contain non-functional copies of mitochondrial genes that provide new opportunities for evolutionary analysis.
Affiliation(s)
- N T Perna
- Program in Genetics, University of New Hampshire, Durham 03824, USA
11
Abstract
Three statistics (%GC, GC-skew, and AT-skew) can be used to describe the overall patterns of nucleotide composition in DNA sequences. Fourfold degenerate third codon positions from 16 animal mitochondrial genomes were analyzed. The overall composition, as measured by %GC, varies from 3.6 %GC in the honeybee to 47.2 %GC in human mtDNA. Compositional differences between strands of the mitochondrial genome were quantified using the two skew statistics presented in this paper. Strand-specific distribution of bases varies among animal taxa independently of overall %GC. Compositional patterns reflect the substitution process. Description of these patterns may aid in the formation of hypotheses about substitutional mechanisms.
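The three statistics named in this abstract can be illustrated with a short Python sketch. The abstract does not give the formulas, so the definitions used here are an assumption based on the conventional forms: GC-skew = (G - C)/(G + C) and AT-skew = (A - T)/(A + T). The function name `composition_stats` is hypothetical, and the sketch takes a plain nucleotide string as input; it does not extract fourfold degenerate third codon positions, which the study analyzed.

```python
from collections import Counter


def composition_stats(seq: str) -> dict:
    """Compute %GC, GC-skew, and AT-skew for one strand of a DNA sequence.

    Assumed (conventional) definitions, not quoted from the abstract:
      %GC     = 100 * (G + C) / (A + C + G + T)
      GC-skew = (G - C) / (G + C)
      AT-skew = (A - T) / (A + T)
    Characters other than A, C, G, T (e.g. N) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")

    def skew(x: int, y: int) -> float:
        # Return 0.0 when both bases are absent, to avoid dividing by zero.
        return (x - y) / (x + y) if (x + y) else 0.0

    return {
        "percent_gc": 100.0 * (g + c) / total,
        "gc_skew": skew(g, c),
        "at_skew": skew(a, t),
    }


if __name__ == "__main__":
    # Toy sequence for illustration only; not data from the study.
    print(composition_stats("AAATGGGC"))
```

Under these definitions both skews range from -1 to 1 and are computed per strand, so the complementary strand gives values of the same magnitude with opposite sign.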
Affiliation(s)
- N T Perna
- Program in Genetics, University of New Hampshire, Durham 03824, USA
12
Perna NT, Batzer MA, Deininger PL, Stoneking M. Alu insertion polymorphism: a new type of marker for human population studies. Hum Biol 1992; 64:641-8. [PMID: 1328024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/26/2022]
Abstract
A PCR-based method was used to screen 462 individuals from Japan, Papua New Guinea, Indonesia, and Australia for an Alu family insertion polymorphism. The frequency of this Alu insertion shows significant heterogeneity among island subgroups of the Indonesian sample and between the Japanese-Indonesian populations and the Australian-New Guinean populations. The simple, rapid PCR-based screening technique and the significant frequency differences among populations demonstrate that Alu insertion polymorphisms are potentially valuable markers for studies of the evolutionary history and migration patterns of modern humans.
Affiliation(s)
- N T Perna
- Department of Anthropology, Pennsylvania State University, University Park 16802