1
Lauer KP, Llorente I, Blair E, Seto J, Krasnov V, Purkayastha A, Ditty SE, Hadfield TL, Buck C, Tibbetts C, Seto D. Natural variation among human adenoviruses: genome sequence and annotation of human adenovirus serotype 1. J Gen Virol 2004; 85:2615-2625. [PMID: 15302955 DOI: 10.1099/vir.0.80118-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 12/11/2022]
Abstract
The 36,001 base pair DNA sequence of human adenovirus serotype 1 (HAdV-1) has been determined, using a 'leveraged primer sequencing strategy' to generate high quality sequences economically. This annotated genome (GenBank AF534906) confirms anticipated similarity to closely related species C (formerly subgroup), human adenoviruses HAdV-2 and -5, and near identity with earlier reports of sequences representing parts of the HAdV-1 genome. A first round of HAdV-1 sequence data acquisition used PCR amplification and sequencing primers from sequences common to the genomes of HAdV-2 and -5. The subsequent rounds of sequencing used primers derived from the newly generated data. Corroborative re-sequencing with primers selected from this HAdV-1 dataset generated sparsely tiled arrays of high quality sequencing ladders spanning both complementary strands of the HAdV-1 genome. These strategies allow for rapid and accurate low-pass sequencing of genomes. Such rapid genome determinations facilitate the development of specific probes for differentiation of family, serotype, subtype and strain (e.g. pathogen genome signatures). These will be used to monitor epidemic outbreaks of acute respiratory disease in a defined test bed by the Epidemic Outbreak Surveillance (EOS) project.
Affiliation(s)
- Kim P Lauer
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Isabel Llorente
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Eric Blair
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Jason Seto
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Vladimir Krasnov
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Anjan Purkayastha
  - Epidemic Outbreak Surveillance (EOS) Consortium, 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
  - HQ USAF Surgeon General Office, Directorate of Modernization (SGR), 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Susan E Ditty
  - Epidemic Outbreak Surveillance (EOS) Consortium, 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
  - Division of Microbiology, Department of Infectious and Parasitic Diseases Pathology, Armed Forces Institute of Pathology, 5300 Georgia Avenue NW, Washington, DC 20306, USA
- Ted L Hadfield
  - Epidemic Outbreak Surveillance (EOS) Consortium, 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
  - Division of Microbiology, Department of Infectious and Parasitic Diseases Pathology, Armed Forces Institute of Pathology, 5300 Georgia Avenue NW, Washington, DC 20306, USA
- Charles Buck
  - Department of Virology, American Type Culture Collection (ATCC), Manassas, VA 20108, USA
- Clark Tibbetts
  - Epidemic Outbreak Surveillance (EOS) Consortium, 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
  - HQ USAF Surgeon General Office, Directorate of Modernization (SGR), 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
- Donald Seto
  - Epidemic Outbreak Surveillance (EOS) Consortium, 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
  - HQ USAF Surgeon General Office, Directorate of Modernization (SGR), 5201 Leesburg Pike, Suite 1401, Falls Church, VA 22041, USA
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
2
Celamkoti S, Kundeti S, Purkayastha A, Mazumder R, Buck C, Seto D. GeneOrder3.0: software for comparing the order of genes in pairs of small bacterial genomes. BMC Bioinformatics 2004; 5:52. [PMID: 15128433 PMCID: PMC419981 DOI: 10.1186/1471-2105-5-52] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 01/16/2004] [Accepted: 05/05/2004] [Indexed: 11/10/2022]
Abstract
BACKGROUND: An increasing number of whole viral and bacterial genomes are being sequenced and deposited in public databases. In parallel with the mounting interest in whole genomes, the number of whole-genome analysis software tools is also increasing. GeneOrder was originally developed to provide an analysis of genes between two genomes, allowing visualization of gene order and synteny comparisons of any small genomes. It was originally developed for comparing virus, mitochondrion and chloroplast genomes. This is now extended to small bacterial genomes of sizes less than 2 Mb.
RESULTS: GeneOrder3.0 has been developed and validated successfully on several small bacterial genomes (ca. 580 kb to 1.83 Mb) archived in the NCBI GenBank database. It is an updated web-based "on-the-fly" computational tool allowing gene order and synteny comparisons of any two small bacterial genomes. Analyses of several bacterial genomes show that a large amount of gene and genome re-arrangement occurs, as seen with earlier DNA software tools. This can be displayed at the protein level using GeneOrder3.0. Whole genome alignments of genes are presented in both a table and a dot plot. This allows the detection of evolutionarily more distant relationships, since protein sequences are more conserved than DNA sequences.
CONCLUSIONS: GeneOrder3.0 allows researchers to perform comparative analysis of gene order and synteny in genomes of sizes up to 2 Mb "on-the-fly."
AVAILABILITY: http://binf.gmu.edu/genometools.html and http://pasteur.atcc.org:8050/GeneOrder3.0.
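The gene-order comparison described in this abstract can be illustrated with a minimal sketch. It is not GeneOrder3.0's implementation: the function names `dot_plot_points` and `collinear_runs`, the `is_match` callback (a stand-in for a protein-level similarity test), and the toy gene labels are all assumptions made for illustration. The sketch builds the (i, j) points of a dot plot and groups them into forward diagonal runs, which correspond to stretches of conserved gene order.

```python
def dot_plot_points(genome_a, genome_b, is_match):
    """Return (i, j) pairs where gene i of genome A matches gene j of genome B.

    `is_match` is a hypothetical stand-in for a protein-level similarity test.
    """
    return [
        (i, j)
        for i, gene_a in enumerate(genome_a)
        for j, gene_b in enumerate(genome_b)
        if is_match(gene_a, gene_b)
    ]


def collinear_runs(points, min_len=2):
    """Group dot-plot points into forward diagonal runs (conserved gene order).

    Only same-orientation diagonals (i+1, j+1) are followed; inverted
    segments would need a second pass over (i+1, j-1) steps.
    """
    point_set = set(points)
    runs = []
    for i, j in sorted(point_set):
        if (i - 1, j - 1) in point_set:
            continue  # this point continues a run that started earlier
        run = [(i, j)]
        while (run[-1][0] + 1, run[-1][1] + 1) in point_set:
            run.append((run[-1][0] + 1, run[-1][1] + 1))
        if len(run) >= min_len:
            runs.append(run)
    return runs


# Toy example: genome B carries the same four genes with the two halves swapped.
genome_a = ["a", "b", "c", "d"]
genome_b = ["c", "d", "a", "b"]
points = dot_plot_points(genome_a, genome_b, lambda x, y: x == y)
runs = collinear_runs(points)
```

In this toy case the rearrangement appears as two separate diagonal runs rather than one continuous diagonal, which is the kind of pattern a dot plot makes visible.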
Affiliation(s)
- Srikanth Celamkoti
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Sashidhara Kundeti
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Anjan Purkayastha
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
- Raja Mazumder
  - Biochemistry and Molecular Biology Department, Georgetown University School of Medicine, 4000 Reservoir Road, Washington, D.C. 20057, USA
- Charles Buck
  - Virology Program, American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, VA 20110, USA
- Donald Seto
  - Bioinformatics and Computational Biology, School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 5B3, Manassas, VA 20110, USA
4
Zafar N, Mazumder R, Seto D. CoreGenes: a computational tool for identifying and cataloging "core" genes in a set of small genomes. BMC Bioinformatics 2002; 3:12. [PMID: 11972896 PMCID: PMC111185 DOI: 10.1186/1471-2105-3-12] [Citation(s) in RCA: 111] [Impact Index Per Article: 5.0] [Received: 11/27/2001] [Accepted: 04/24/2002] [Indexed: 12/03/2022]
Abstract
BACKGROUND: Improvements in DNA sequencing technology and methodology have led to the rapid expansion of databases comprising DNA sequence, gene and genome data. Lower operational costs and heightened interest resulting from initial intriguing novel discoveries from genomics are also contributing to the accumulation of these data sets. A major challenge is to analyze and to mine data from these databases, especially whole genomes. There is a need for computational tools that look globally at genomes for data mining.
RESULTS: CoreGenes is a global Java-based interactive data mining tool that identifies and catalogs a "core" set of genes from two to five small whole genomes simultaneously. CoreGenes performs hierarchical and iterative BLASTP analyses using one genome as a reference and another as a query. Subsequent query genomes are compared against each newly generated "consensus." These iterations lead to a matrix comprising related genes from this set of genomes, e.g., viruses, mitochondria and chloroplasts. Currently the software is limited to small genomes on the order of 330 kilobases or less.
CONCLUSION: A computational tool, CoreGenes, has been developed to analyze small whole genomes globally. BLAST score-related and putatively essential "core" gene data are displayed as a table with links to GenBank for further data on the genes of interest. This web resource is available at http://pumpkins.ib3.gmu.edu:8080/CoreGenes or http://www.bif.atcc.org/CoreGenes.
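The iterative reference/query scheme described in this abstract can be sketched as follows. This is a simplified illustration, not the CoreGenes code: the function name `core_genes`, the `score` callback (a placeholder for a BLASTP-derived score), the `threshold` parameter, and the toy gene labels are assumptions. As a further simplification, each query genome is compared against the reference member of every surviving consensus row rather than against a rebuilt consensus sequence.

```python
def core_genes(genomes, score, threshold):
    """Return rows of related genes, one gene per genome, shared by all genomes.

    genomes   : list of gene lists; the first genome acts as the reference.
    score     : hypothetical similarity function standing in for a BLASTP score.
    threshold : minimum score for a query gene to count as a match.
    """
    reference, *queries = genomes
    # Each consensus row starts with a single reference gene.
    consensus = [[gene] for gene in reference]
    for query in queries:
        next_consensus = []
        for row in consensus:
            # Best-scoring query gene for this row's reference gene, if any.
            best = max(query, key=lambda q: score(row[0], q), default=None)
            if best is not None and score(row[0], best) >= threshold:
                next_consensus.append(row + [best])
        # Rows without a match are dropped, so the set can only shrink.
        consensus = next_consensus
    return consensus


def same_family(gene_a, gene_b):
    """Toy score: 1 if the labels share the family prefix before '_', else 0."""
    return 1 if gene_a.split("_")[0] == gene_b.split("_")[0] else 0


toy_genomes = [
    ["pol_1", "hex_1", "fib_1"],
    ["hex_2", "pol_2"],
    ["pol_3", "orfX_3"],
]
core = core_genes(toy_genomes, same_family, threshold=1)
```

Each returned row corresponds to one line of the gene matrix mentioned in the abstract. A real analysis would replace `same_family` with scores parsed from BLASTP output.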
Affiliation(s)
- Nikhat Zafar
  - School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 4E3, Manassas, VA 20110, USA
- Raja Mazumder
  - School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 4E3, Manassas, VA 20110, USA
- Donald Seto
  - School of Computational Sciences, George Mason University, 10900 University Boulevard, MSN 4E3, Manassas, VA 20110, USA
  - Center for Biomedical Genomics and Informatics, College of Arts and Sciences, George Mason University, 10900 University Boulevard, MSN 4E3, Manassas, VA 20110, USA