26
|
Quackenbush J. The power of public access: the human genome project and the scientific process. Nat Genet 2001; 29:4-6. [PMID: 11528377 DOI: 10.1038/ng0901-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
The scientific process, and scientific progress, require a critical examination of all published reports. Recent publications detailing errors in the draft human genome sequence are an integral part of our quest to better understand nature and demonstrate the value of free access to scientific data.
|
27
|
Kappe SH, Gardner MJ, Brown SM, Ross J, Matuschewski K, Ribeiro JM, Adams JH, Quackenbush J, Cho J, Carucci DJ, Hoffman SL, Nussenzweig V. Exploring the transcriptome of the malaria sporozoite stage. Proc Natl Acad Sci U S A 2001; 98:9895-900. [PMID: 11493695 PMCID: PMC55549 DOI: 10.1073/pnas.171185198] [Citation(s) in RCA: 105] [Impact Index Per Article: 4.6] [Received: 04/13/2001] [Indexed: 11/18/2022] Open
Abstract
Most studies of gene expression in Plasmodium have been concerned with asexual and/or sexual erythrocytic stages. Identification and cloning of genes expressed in the preerythrocytic stages lag far behind. We have constructed a high quality cDNA library of the Plasmodium sporozoite stage by using the rodent malaria parasite P. yoelii, an important model for malaria vaccine development. The technical obstacles associated with limited amounts of RNA material were overcome by PCR-amplifying the transcriptome before cloning. Contamination with mosquito RNA was negligible. Generation of 1,972 expressed sequence tags (EST) resulted in a total of 1,547 unique sequences, allowing insight into sporozoite gene expression. The circumsporozoite protein (CS) and the sporozoite surface protein 2 (SSP2) are well represented in the data set. A BLASTX search with all tags of the nonredundant protein database gave only 161 unique significant matches (P(N) ≤ 10^-4), whereas 1,386 of the unique sequences represented novel sporozoite-expressed genes. We identified ESTs for three proteins that may be involved in host cell invasion and documented their expression in sporozoites. These data should facilitate our understanding of the preerythrocytic Plasmodium life cycle stages and the development of preerythrocytic vaccines.
|
28
|
Huang J, Qi R, Quackenbush J, Dauway E, Lazaridis E, Yeatman T. Effects of ischemia on gene expression. J Surg Res 2001; 99:222-7. [PMID: 11469890 DOI: 10.1006/jsre.2001.6195] [Citation(s) in RCA: 118] [Impact Index Per Article: 5.1] [Indexed: 11/22/2022]
Abstract
Microarray gene expression technology has recently made it feasible to characterize the RNA expression of thousands of genes across numerous tissue samples. We hypothesized that the warm ischemia commonly associated with the surgical extirpation of human tissue would have significant effects on gene expression profiles. To quantitate the effects of warm ischemia on human tissue, we rapidly dissected normal mucosa from a human colon cancer specimen. The specimen was divided and maintained at room temperature until snap-frozen in liquid nitrogen. Aliquots of tissue were frozen at times 5, 10, 15, 20, 40, and 60 min after extirpation. Spotted microarrays composed of 2400 distinct elements were used to assay mRNA derived from each time point in triplicate. Eisen's hierarchical clustering methodology and Bayesian statistical methods were then used to assay the effects of warm ischemia on gene expression. Application of time-course statistical models suggests that three patterns were induced by ischemia, accounting for 68.2, 17.8, and 13.4% of the evaluable genes, respectively. Pattern I corresponds to an average change of 27% over 60 min from 5 min baseline level of expression and 63.8% of the genes with at least 80% probability of membership in this pattern show average increases in expression over 60 min. The remainder decrease on average. Pattern II genes show the least ischemia-related effects, demonstrating an average change of only 12% over 60 min. In contrast to pattern I, we find that 67.5% of the genes with at least 80% probability of membership in this pattern are decreasing in expression on average over time. The remaining 32.5% in this pattern increase an average of 12% over 60 min. Finally, pattern III genes (13.4% of the sample) show the greatest sensitivity to ischemia, changing an average of 50% over 60 min, with about the same number increasing as are decreasing. Fold changes in RNA over- or under-expression were observed up to greater than 20-fold.
Warm ischemia associated with the surgical extirpation of human tissues has significant effects on gene expression. These data support the careful monitoring of ischemic time for tissues harvested for the purpose of gene profiling.
|
29
|
Gaspard R, Dharap S, Malek J, Qi R, Quackenbush J. Optimized growth conditions for direct amplification of cDNA clone inserts from culture. Biotechniques 2001; 31:35-6. [PMID: 11464517 DOI: 10.2144/01311bm04] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022] Open
|
30
|
Flores-Morales A, Ståhlberg N, Tollet-Egnell P, Lundeberg J, Malek RL, Quackenbush J, Lee NH, Norstedt G. Microarray analysis of the in vivo effects of hypophysectomy and growth hormone treatment on gene expression in the rat. Endocrinology 2001; 142:3163-76. [PMID: 11416039 DOI: 10.1210/endo.142.7.8235] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.7] [Indexed: 11/19/2022]
Abstract
Complementary DNA microarrays containing 3000 different rat genes were used to study the consequences of severe hormonal deficiency (hypophysectomy) on the gene expression patterns in heart, liver, and kidney. Hybridization signals were seen from a majority of the arrayed complementary DNAs; nonetheless, tissue-specific expression patterns could be delineated. Hypophysectomy affected the expression of genes involved in a variety of cellular functions. Between 16-29% of the detected transcripts from each tissue changed expression level as a reaction to this condition. Chronic treatment of hypophysectomized animals with human GH also caused significant changes in gene expression patterns. The study confirms previous knowledge concerning certain gene expression changes in the above-mentioned situations and provides new information regarding hypophysectomy and chronic human GH effects in the rat. Furthermore, we have identified several new genes that respond to GH treatment. Our results represent a first step toward a more global understanding of gene expression changes in states of hormonal deficiency.
|
31
|
Abstract
Microarray experiments are providing unprecedented quantities of genome-wide data on gene-expression patterns. Although this technique has been enthusiastically developed and applied in many biological contexts, the management and analysis of the millions of data points that result from these experiments has received less attention. Sophisticated computational tools are available, but the methods that are used to analyse the data can have a profound influence on the interpretation of the results. A basic understanding of these computational tools is therefore required for optimal experimental design and meaningful data analysis.
|
32
|
Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, Bennett GL, Heaton MP, Laegreid WW, Rohrer GA, Chitko-McKown CG, Pertea G, Holt I, Karamycheva S, Liang F, Quackenbush J, Keele JW. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res 2001; 11:626-30. [PMID: 11282978 PMCID: PMC311058 DOI: 10.1101/gr.170101] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.3] [Indexed: 11/24/2022]
Abstract
An essential component of functional genomics studies is the sequence of DNA expressed in tissues of interest. To provide a resource of bovine-specific expressed sequence data and facilitate this powerful approach in cattle research, four normalized cDNA libraries were produced and arrayed for high-throughput sequencing. The libraries were made with RNA pooled from multiple tissues to increase efficiency of normalization and maximize the number of independent genes for which sequence data were obtained. Target tissues included those with highest likelihood to have impact on production parameters of animal health, growth, reproductive efficiency, and carcass merit. Success of normalization and inter- and intralibrary redundancy were assessed by collecting 6000-23,000 sequences from each of the libraries (68,520 total sequences deposited in GenBank). Sequence comparison and assembly of these sequences was performed in combination with 56,500 other bovine EST sequences present in the GenBank dbEST database to construct a cattle Gene Index (available from The Institute for Genomic Research at http://www.tigr.org/tdb/tgi.shtml). The 124,381 bovine ESTs present in GenBank at the time of the analysis form 16,740 assemblies that are listed and annotated on the Web site. Analysis of individual library sequence data indicates that the pooled-tissue approach was highly effective in preparing libraries for efficient deep sequencing.
|
33
|
Yuan Q, Quackenbush J, Sultana R, Pertea M, Salzberg SL, Buell CR. Rice bioinformatics. analysis of rice sequence data and leveraging the data to other plant species. PLANT PHYSIOLOGY 2001; 125:1166-74. [PMID: 11244096 PMCID: PMC1539370 DOI: 10.1104/pp.125.3.1166] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Indexed: 05/21/2023]
Abstract
Rice (Oryza sativa) is a model species for monocotyledonous plants, especially for members in the grass family. Several attributes such as small genome size, diploid nature, transformability, and establishment of genetic and molecular resources make it a tractable organism for plant biologists. With an estimated genome size of 430 Mb (Arumuganathan and Earle, 1991), it is feasible to obtain the complete genome sequence of rice using current technologies. An international effort has been established and is in the process of sequencing O. sativa spp. japonica var "Nipponbare" using a bacterial artificial chromosome/P1 artificial chromosome shotgun sequencing strategy. Annotation of the rice genome is performed using prediction-based and homology-based searches to identify genes. Annotation tools such as optimized gene prediction programs are being developed for rice to improve the quality of annotation. Resources are also being developed to leverage the rice genome sequence to partial genome projects such as expressed sequence tag projects, thereby maximizing the output from the rice genome project. To provide a low level of annotation for rice genomic sequences, we have aligned all rice bacterial artificial chromosome/P1 artificial chromosome sequences with The Institute of Genomic Research Gene Indices that are a set of nonredundant transcripts that are generated from nine public plant expressed sequence tag projects (rice, wheat, sorghum, maize, barley, Arabidopsis, tomato, potato, and barrel medic). In addition, we have used data from The Institute of Genomic Research Gene Indices and the Arabidopsis and Rice Genome Projects to identify putative orthologues and paralogues among these nine genomes.
|
34
|
Kawai J, Shinagawa A, Shibata K, Yoshino M, Itoh M, Ishii Y, Arakawa T, Hara A, Fukunishi Y, Konno H, Adachi J, Fukuda S, Aizawa K, Izawa M, Nishi K, Kiyosawa H, Kondo S, Yamanaka I, Saito T, Okazaki Y, Gojobori T, Bono H, Kasukawa T, Saito R, Kadota K, Matsuda H, Ashburner M, Batalov S, Casavant T, Fleischmann W, Gaasterland T, Gissi C, King B, Kochiwa H, Kuehl P, Lewis S, Matsuo Y, Nikaido I, Pesole G, Quackenbush J, Schriml LM, Staubli F, Suzuki R, Tomita M, Wagner L, Washio T, Sakai K, Okido T, Furuno M, Aono H, Baldarelli R, Barsh G, Blake J, Boffelli D, Bojunga N, Carninci P, de Bonaldo MF, Brownstein MJ, Bult C, Fletcher C, Fujita M, Gariboldi M, Gustincich S, Hill D, Hofmann M, Hume DA, Kamiya M, Lee NH, Lyons P, Marchionni L, Mashima J, Mazzarelli J, Mombaerts P, Nordone P, Ring B, Ringwald M, Rodriguez I, Sakamoto N, Sasaki H, Sato K, Schönbach C, Seya T, Shibata Y, Storch KF, Suzuki H, Toyo-oka K, Wang KH, Weitz C, Whittaker C, Wilming L, Wynshaw-Boris A, Yoshida K, Hasegawa Y, Kawaji H, Kohtsuki S, Hayashizaki Y. Functional annotation of a full-length mouse cDNA collection. Nature 2001; 409:685-90. [PMID: 11217851 DOI: 10.1038/35055500] [Citation(s) in RCA: 487] [Impact Index Per Article: 21.2] [Indexed: 01/31/2023]
Abstract
The RIKEN Mouse Gene Encyclopaedia Project, a systematic approach to determining the full coding potential of the mouse genome, involves collection and sequencing of full-length complementary DNAs and physical mapping of the corresponding genes to the mouse genome. We organized an international functional annotation meeting (FANTOM) to annotate the first 21,076 cDNAs to be analysed in this project. Here we describe the first RIKEN clone collection, which is one of the largest described for any organism. Analysis of these cDNAs extends known gene families and identifies new ones.
|
35
|
Quackenbush J. Expression Profiler: A suite of web-based tools for the analysis of microarray gene expression data. Brief Bioinform 2001. [DOI: 10.1093/bib/2.4.388] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
|
36
|
Quackenbush J, Cho J, Lee D, Liang F, Holt I, Karamycheva S, Parvizi B, Pertea G, Sultana R, White J. The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species. Nucleic Acids Res 2001; 29:159-64. [PMID: 11125077 PMCID: PMC29813 DOI: 10.1093/nar/29.1.159] [Citation(s) in RCA: 318] [Impact Index Per Article: 13.8] [Indexed: 11/14/2022] Open
Abstract
While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
|
37
|
|
38
|
Yuan Q, Liang F, Hsiao J, Zismann V, Benito MI, Quackenbush J, Wing R, Buell R. Anchoring of rice BAC clones to the rice genetic map in silico. Nucleic Acids Res 2000; 28:3636-41. [PMID: 10982886 PMCID: PMC110739 DOI: 10.1093/nar/28.18.3636] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Open
Abstract
A wealth of molecular resources have been developed for rice genomics, including dense genetic maps, expressed sequence tags (ESTs), yeast artificial chromosome maps, bacterial artificial chromosome (BAC) libraries and BAC end sequence databases. Integration of genetic and physical maps involves labor-intensive empirical experiments. To accelerate the integration of the bacterial clone resources with the genetic map for the International Rice Genome Sequencing Project, we cleaned and filtered the available EST and BAC end sequences for repetitive sequences and then searched all available rice genetic markers with our filtered databases. We identified 418 genetic markers that aligned with at least one BAC end sequence with >95% sequence identity, providing a set of large insert clones with an average separation of 1 Mb that can serve as nucleation points for the sequencing phase of the International Rice Genome Sequencing Project.
|
39
|
Liang F, Holt I, Pertea G, Karamycheva S, Salzberg SL, Quackenbush J. An optimized protocol for analysis of EST sequences. Nucleic Acids Res 2000; 28:3657-65. [PMID: 10982889 PMCID: PMC110731 DOI: 10.1093/nar/28.18.3657] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022] Open
Abstract
The vast body of Expressed Sequence Tag (EST) data in the public databases provide an important resource for comparative and functional genomics studies and an invaluable tool for the annotation of genomic sequences. We have developed a rigorous protocol for reconstructing the sequences of transcribed genes from EST and gene sequence fragments. A key element in developing this protocol has been the evaluation of a number of sequence assembly programs to determine which most faithfully reproduce transcript sequences from EST data. The TIGR Gene Indices constructed using this protocol for human, mouse, rat and a variety of other plant and animal models have demonstrated their utility in a variety of applications and are freely available to the scientific research community.
|
40
|
Hegde P, Qi R, Abernathy K, Gay C, Dharap S, Gaspard R, Hughes JE, Snesrud E, Lee N, Quackenbush J. A concise guide to cDNA microarray analysis. Biotechniques 2000; 29:548-50, 552-4, 556 passim. [PMID: 10997270 DOI: 10.2144/00293bi01] [Citation(s) in RCA: 668] [Impact Index Per Article: 27.8] [Indexed: 11/23/2022] Open
Abstract
Microarray expression analysis has become one of the most widely used functional genomics tools. Efficient application of this technique requires the development of robust and reproducible protocols. We have optimized all aspects of the process, including PCR amplification of target cDNA clones, microarray printing, probe labeling and hybridization, and have developed strategies for data normalization and analysis.
|
41
|
Liang F, Holt I, Pertea G, Karamycheva S, Salzberg SL, Quackenbush J. Gene index analysis of the human genome estimates approximately 120,000 genes. Nat Genet 2000; 25:239-40. [PMID: 10835646 DOI: 10.1038/76126] [Citation(s) in RCA: 195] [Impact Index Per Article: 8.1] [Indexed: 11/08/2022]
Abstract
Although sequencing of the human genome will soon be completed, gene identification and annotation remains a challenge. Early estimates suggested that there might be 60,000-100,000 (ref. 1) human genes, but recent analyses of the available data from EST sequencing projects have estimated as few as 45,000 (ref. 2) or as many as 140,000 (ref. 3) distinct genes. The Chromosome 22 Sequencing Consortium estimated a minimum of 45,000 genes based on their annotation of the complete chromosome, although their data suggests there may be additional genes. The nearly 2,000,000 human ESTs in dbEST provide an important resource for gene identification and genome annotation, but these single-pass sequences must be carefully analysed to remove contaminating sequences, including those from genomic DNA, spurious transcription, and vector and bacterial sequences. We have developed a highly refined and rigorously tested protocol for cleaning, clustering and assembling EST sequences to produce high-fidelity consensus sequences for the represented genes (F.L. et al., manuscript submitted) and used this to create the TIGR Gene Indices, databases of expressed genes for human, mouse, rat and other species (http://www.tigr.org/tdb/tgi.html). Using highly refined and tested algorithms for EST analysis, we have arrived at two independent estimates indicating the human genome contains approximately 120,000 genes.
|
42
|
Abstract
The haploid nuclear genome of the African trypanosome, Trypanosoma brucei, is about 35 Mb and varies in size among different trypanosome isolates by as much as 25%. The nuclear DNA of this diploid organism is distributed among three size classes of chromosomes: the megabase chromosomes of which there are at least 11 pairs ranging from 1 Mb to more than 6 Mb (numbered I-XI from smallest to largest); several intermediate chromosomes of 200-900 kb and uncertain ploidy; and about 100 linear minichromosomes of 50-150 kb. Size differences of as much as four-fold can occur, both between the two homologues of a megabase chromosome pair in a specific trypanosome isolate and among chromosome pairs in different isolates. The genomic DNA sequences determined to date indicated that about 50% of the genome is coding sequence. The chromosomal telomeres possess TTAGGG repeats and many, if not all, of the telomeres of the megabase and intermediate chromosomes are linked to expression sites for genes encoding variant surface glycoproteins (VSGs). The minichromosomes serve as repositories for VSG genes since some but not all of their telomeres are linked to unexpressed VSG genes. A gene discovery program, based on sequencing the ends of cloned genomic DNA fragments, has generated more than 20 Mb of discontinuous single-pass genomic sequence data during the past year, and the complete sequences of chromosomes I and II (about 1 Mb each) in T. brucei GUTat 10.1 are currently being determined. It is anticipated that the entire genomic sequence of this organism will be known in a few years. Analysis of a test microarray of 400 cDNAs and small random genomic DNA fragments probed with RNAs from two developmental stages of T. brucei demonstrates that the microarray technology can be used to identify batteries of genes differentially expressed during the various life cycle stages of this parasite.
|
43
|
Quackenbush J, Liang F, Holt I, Pertea G, Upton J. The TIGR gene indices: reconstruction and representation of expressed gene sequences. Nucleic Acids Res 2000; 28:141-5. [PMID: 10592205 PMCID: PMC102391 DOI: 10.1093/nar/28.1.141] [Citation(s) in RCA: 271] [Impact Index Per Article: 11.3] [Indexed: 11/15/2022] Open
Abstract
Expressed sequence tags (ESTs) have provided a first glimpse of the collection of transcribed sequences in a variety of organisms. However, a careful analysis of this sequence data can provide significant additional functional, structural and evolutionary information. Our analysis of the public EST sequences, available through the TIGR Gene Indices (TGI; http://www.tigr.org/tdb/tdb.html), is an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed for selected organisms by first clustering, then assembling EST and annotated gene sequences from GenBank. This process produces a set of unique, high-fidelity virtual transcripts, or tentative consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, and to provide links between orthologous and paralogous genes.
|
44
|
Doyle DJ, Quackenbush J. Symposium on Genomic Medicine, University of Maryland, Shady Grove Campus, Rockville, Maryland, March 17-18, 1997. MICROBIAL & COMPARATIVE GENOMICS 1998; 2:99-102. [PMID: 9689218 DOI: 10.1089/omi.1.1997.2.99] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
|
45
|
Korenberg JR, Aaltonen J, Brahe C, Cabin D, Creau N, Delabar JM, Doering J, Gardiner K, Hubert RS, Ives J, Kessling A, Kudoh J, Lafrenière R, Murakami Y, Ohira M, Ohki M, Patterson D, Potier MC, Quackenbush J, Reeves RH, Sakaki Y, Shimizu N, Soeda E, Van Broeckhoven C, Yaspo ML. Report and abstracts of the Sixth International Workshop on Human Chromosome 21 Mapping 1996. Cold Spring Harbor, New York, USA. May 6-8, 1996. CYTOGENETICS AND CELL GENETICS 1998; 79:21-52. [PMID: 9533011 DOI: 10.1159/000134681] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
|
46
|
Fraser CM, Casjens S, Huang WM, Sutton GG, Clayton R, Lathigra R, White O, Ketchum KA, Dodson R, Hickey EK, Gwinn M, Dougherty B, Tomb JF, Fleischmann RD, Richardson D, Peterson J, Kerlavage AR, Quackenbush J, Salzberg S, Hanson M, van Vugt R, Palmer N, Adams MD, Gocayne J, Weidman J, Utterback T, Watthey L, McDonald L, Artiach P, Bowman C, Garland S, Fuji C, Cotton MD, Horst K, Roberts K, Hatch B, Smith HO, Venter JC. Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 1997; 390:580-6. [PMID: 9403685 DOI: 10.1038/37551] [Citation(s) in RCA: 1498] [Impact Index Per Article: 55.5] [Indexed: 02/05/2023]
Abstract
The genome of the bacterium Borrelia burgdorferi B31, the aetiologic agent of Lyme disease, contains a linear chromosome of 910,725 base pairs and at least 17 linear and circular plasmids with a combined size of more than 533,000 base pairs. The chromosome contains 853 genes encoding a basic set of proteins for DNA replication, transcription, translation, solute transport and energy metabolism, but, like Mycoplasma genitalium, it contains no genes for cellular biosynthetic reactions. Because B. burgdorferi and M. genitalium are distantly related eubacteria, we suggest that their limited metabolic capacities reflect convergent evolution by gene loss from more metabolically competent progenitors. Of 430 genes on 11 plasmids, most have no known biological function; 39% of plasmid genes are paralogues that form 47 gene families. The biological significance of the multiple plasmid-encoded genes is not clear, although they may be involved in antigenic variation or immune evasion.
|
47
|
Klenk HP, Clayton RA, Tomb JF, White O, Nelson KE, Ketchum KA, Dodson RJ, Gwinn M, Hickey EK, Peterson JD, Richardson DL, Kerlavage AR, Graham DE, Kyrpides NC, Fleischmann RD, Quackenbush J, Lee NH, Sutton GG, Gill S, Kirkness EF, Dougherty BA, McKenney K, Adams MD, Loftus B, Peterson S, Reich CI, McNeil LK, Badger JH, Glodek A, Zhou L, Overbeek R, Gocayne JD, Weidman JF, McDonald L, Utterback T, Cotton MD, Spriggs T, Artiach P, Kaine BP, Sykes SM, Sadow PW, D'Andrea KP, Bowman C, Fujii C, Garland SA, Mason TM, Olsen GJ, Fraser CM, Smith HO, Woese CR, Venter JC. The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 1997; 390:364-70. [PMID: 9389475 DOI: 10.1038/37052] [Citation(s) in RCA: 990] [Impact Index Per Article: 36.7] [Indexed: 02/05/2023]
Abstract
Archaeoglobus fulgidus is the first sulphur-metabolizing organism to have its genome sequence determined. Its genome of 2,178,400 base pairs contains 2,436 open reading frames (ORFs). The information processing systems and the biosynthetic pathways for essential components (nucleotides, amino acids and cofactors) have extensive correlation with their counterparts in the archaeon Methanococcus jannaschii. The genomes of these two Archaea indicate dramatic differences in the way these organisms sense their environment, perform regulatory and transport functions, and gain energy. In contrast to M. jannaschii, A. fulgidus has fewer restriction-modification systems, and none of its genes appears to contain inteins. A quarter (651 ORFs) of the A. fulgidus genome encodes functionally uncharacterized yet conserved proteins, two-thirds of which are shared with M. jannaschii (428 ORFs). Another quarter of the genome encodes new proteins indicating substantial archaeal gene diversity.
|
48
|
Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, Nelson K, Quackenbush J, Zhou L, Kirkness EF, Peterson S, Loftus B, Richardson D, Dodson R, Khalak HG, Glodek A, McKenney K, Fitzegerald LM, Lee N, Adams MD, Hickey EK, Berg DE, Gocayne JD, Utterback TR, Peterson JD, Kelley JM, Cotton MD, Weidman JM, Fujii C, Bowman C, Watthey L, Wallin E, Hayes WS, Borodovsky M, Karp PD, Smith HO, Fraser CM, Venter JC. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 1997; 388:539-47. [PMID: 9252185 DOI: 10.1038/41483] [Citation(s) in RCA: 2543] [Impact Index Per Article: 94.2] [Indexed: 02/06/2023]
Abstract
Helicobacter pylori, strain 26695, has a circular genome of 1,667,867 base pairs and 1,590 predicted coding sequences. Sequence analysis indicates that H. pylori has well-developed systems for motility, for scavenging iron, and for DNA restriction and modification. Many putative adhesins, lipoproteins and other outer membrane proteins were identified, underscoring the potential complexity of host-pathogen interaction. Based on the large number of sequence-related genes encoding outer membrane proteins and the presence of homopolymeric tracts and dinucleotide repeats in coding sequences, H. pylori, like several other mucosal pathogens, probably uses recombination and slipped-strand mispairing within repeats as mechanisms for antigenic variation and adaptive evolution. Consistent with its restricted niche, H. pylori has a few regulatory networks, and a limited metabolic repertoire and biosynthetic capacity. Its survival in acid conditions depends, in part, on its ability to establish a positive inside-membrane potential in low pH.
|
49
|
Stewart EA, McKusick KB, Aggarwal A, Bajorek E, Brady S, Chu A, Fang N, Hadley D, Harris M, Hussain S, Lee R, Maratukulam A, O'Connor K, Perkins S, Piercy M, Qin F, Reif T, Sanders C, She X, Sun WL, Tabar P, Voyticky S, Cowles S, Fan JB, Mader C, Quackenbush J, Myers RM, Cox DR. An STS-based radiation hybrid map of the human genome. Genome Res 1997; 7:422-33. [PMID: 9149939 DOI: 10.1101/gr.7.5.422] [Citation(s) in RCA: 239] [Impact Index Per Article: 8.9] [Indexed: 02/04/2023]
Abstract
We have constructed a physical map of the human genome by using a panel of 83 whole genome radiation hybrids (the Stanford G3 panel) in conjunction with 10,478 sequence-tagged sites (STSs) derived from random genomic DNA sequences, previously mapped genetic markers, and expressed sequences. Of these STSs, 5049 are framework markers that fall into 1766 high-confidence bins. An additional 945 STSs are indistinguishable in their map location from one or more of the framework markers. These 5994 mapped STSs have an average spacing of 500 kb. An additional 4484 STSs are positioned with respect to the framework markers. Comparison of the orders of markers on this map with orders derived from independent meiotic and YAC STS-content maps indicates that the error rate in defining high-confidence bins is < 5%. Analysis of 322 random cDNAs indicates that the map covers the vast majority of the human genome. This STS-based radiation hybrid map of the human genome brings us one step closer to the goal of a physical map containing 30,000 unique ordered landmarks with an average marker spacing of 100 kb.
|
50
|
Inoue I, Nakajima T, Williams CS, Quackenbush J, Puryear R, Powers M, Cheng T, Ludwig EH, Sharma AM, Hata A, Jeunemaitre X, Lalouel JM. A nucleotide substitution in the promoter of human angiotensinogen is associated with essential hypertension and affects basal transcription in vitro. J Clin Invest 1997; 99:1786-97. [PMID: 9120024 PMCID: PMC508000 DOI: 10.1172/jci119343] [Citation(s) in RCA: 396] [Impact Index Per Article: 14.7] [Indexed: 02/04/2023] Open
Abstract
In earlier studies, we provided statistical evidence that individual differences in the angiotensinogen gene, the precursor of the vasoactive hormone angiotensin II, constitute inherited predispositions to essential hypertension in humans. We have now identified a common variant in the proximal promoter, the presence of an adenine, instead of a guanine, 6 bp upstream from the initiation site of transcription, in significant association with the disorder. Tests of promoter activity and DNA binding studies with nuclear proteins suggest that this nucleotide substitution affects the basal transcription rate of the gene. These observations provide some biological insight about the possible mechanism of a genetic predisposition to essential hypertension; they may also have important evolutionary implications.
|