301
|
Flores-Morales A, Ståhlberg N, Tollet-Egnell P, Lundeberg J, Malek RL, Quackenbush J, Lee NH, Norstedt G. Microarray analysis of the in vivo effects of hypophysectomy and growth hormone treatment on gene expression in the rat. Endocrinology 2001; 142:3163-76. [PMID: 11416039 DOI: 10.1210/endo.142.7.8235] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.7] [Indexed: 11/19/2022]
Abstract
Complementary DNA microarrays containing 3000 different rat genes were used to study the consequences of severe hormonal deficiency (hypophysectomy) on the gene expression patterns in heart, liver, and kidney. Hybridization signals were seen from a majority of the arrayed complementary DNAs; nonetheless, tissue-specific expression patterns could be delineated. Hypophysectomy affected the expression of genes involved in a variety of cellular functions. Between 16% and 29% of the detected transcripts from each tissue changed expression level as a reaction to this condition. Chronic treatment of hypophysectomized animals with human GH also caused significant changes in gene expression patterns. The study confirms previous knowledge concerning certain gene expression changes in the above-mentioned situations and provides new information regarding hypophysectomy and chronic human GH effects in the rat. Furthermore, we have identified several new genes that respond to GH treatment. Our results represent a first step toward a more global understanding of gene expression changes in states of hormonal deficiency.
|
302
|
Abstract
Microarray experiments are providing unprecedented quantities of genome-wide data on gene-expression patterns. Although this technique has been enthusiastically developed and applied in many biological contexts, the management and analysis of the millions of data points that result from these experiments has received less attention. Sophisticated computational tools are available, but the methods that are used to analyse the data can have a profound influence on the interpretation of the results. A basic understanding of these computational tools is therefore required for optimal experimental design and meaningful data analysis.
|
303
|
Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, Bennett GL, Heaton MP, Laegreid WW, Rohrer GA, Chitko-McKown CG, Pertea G, Holt I, Karamycheva S, Liang F, Quackenbush J, Keele JW. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res 2001; 11:626-30. [PMID: 11282978 PMCID: PMC311058 DOI: 10.1101/gr.170101] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.3] [Indexed: 11/24/2022]
Abstract
An essential component of functional genomics studies is the sequence of DNA expressed in tissues of interest. To provide a resource of bovine-specific expressed sequence data and facilitate this powerful approach in cattle research, four normalized cDNA libraries were produced and arrayed for high-throughput sequencing. The libraries were made with RNA pooled from multiple tissues to increase efficiency of normalization and maximize the number of independent genes for which sequence data were obtained. Target tissues included those with highest likelihood to have impact on production parameters of animal health, growth, reproductive efficiency, and carcass merit. Success of normalization and inter- and intralibrary redundancy were assessed by collecting 6000-23,000 sequences from each of the libraries (68,520 total sequences deposited in GenBank). Sequence comparison and assembly of these sequences was performed in combination with 56,500 other bovine EST sequences present in the GenBank dbEST database to construct a cattle Gene Index (available from The Institute for Genomic Research at http://www.tigr.org/tdb/tgi.shtml). The 124,381 bovine ESTs present in GenBank at the time of the analysis form 16,740 assemblies that are listed and annotated on the Web site. Analysis of individual library sequence data indicates that the pooled-tissue approach was highly effective in preparing libraries for efficient deep sequencing.
|
304
|
Yuan Q, Quackenbush J, Sultana R, Pertea M, Salzberg SL, Buell CR. Rice bioinformatics. Analysis of rice sequence data and leveraging the data to other plant species. Plant Physiology 2001; 125:1166-74. [PMID: 11244096 PMCID: PMC1539370 DOI: 10.1104/pp.125.3.1166] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Indexed: 05/21/2023]
Abstract
Rice (Oryza sativa) is a model species for monocotyledonous plants, especially for members in the grass family. Several attributes such as small genome size, diploid nature, transformability, and establishment of genetic and molecular resources make it a tractable organism for plant biologists. With an estimated genome size of 430 Mb (Arumuganathan and Earle, 1991), it is feasible to obtain the complete genome sequence of rice using current technologies. An international effort has been established and is in the process of sequencing O. sativa spp. japonica var "Nipponbare" using a bacterial artificial chromosome/P1 artificial chromosome shotgun sequencing strategy. Annotation of the rice genome is performed using prediction-based and homology-based searches to identify genes. Annotation tools such as optimized gene prediction programs are being developed for rice to improve the quality of annotation. Resources are also being developed to leverage the rice genome sequence to partial genome projects such as expressed sequence tag projects, thereby maximizing the output from the rice genome project. To provide a low level of annotation for rice genomic sequences, we have aligned all rice bacterial artificial chromosome/P1 artificial chromosome sequences with The Institute for Genomic Research Gene Indices, a set of nonredundant transcripts generated from nine public plant expressed sequence tag projects (rice, wheat, sorghum, maize, barley, Arabidopsis, tomato, potato, and barrel medic). In addition, we have used data from The Institute for Genomic Research Gene Indices and the Arabidopsis and Rice Genome Projects to identify putative orthologues and paralogues among these nine genomes.
|
305
|
Kawai J, Shinagawa A, Shibata K, Yoshino M, Itoh M, Ishii Y, Arakawa T, Hara A, Fukunishi Y, Konno H, Adachi J, Fukuda S, Aizawa K, Izawa M, Nishi K, Kiyosawa H, Kondo S, Yamanaka I, Saito T, Okazaki Y, Gojobori T, Bono H, Kasukawa T, Saito R, Kadota K, Matsuda H, Ashburner M, Batalov S, Casavant T, Fleischmann W, Gaasterland T, Gissi C, King B, Kochiwa H, Kuehl P, Lewis S, Matsuo Y, Nikaido I, Pesole G, Quackenbush J, Schriml LM, Staubli F, Suzuki R, Tomita M, Wagner L, Washio T, Sakai K, Okido T, Furuno M, Aono H, Baldarelli R, Barsh G, Blake J, Boffelli D, Bojunga N, Carninci P, de Bonaldo MF, Brownstein MJ, Bult C, Fletcher C, Fujita M, Gariboldi M, Gustincich S, Hill D, Hofmann M, Hume DA, Kamiya M, Lee NH, Lyons P, Marchionni L, Mashima J, Mazzarelli J, Mombaerts P, Nordone P, Ring B, Ringwald M, Rodriguez I, Sakamoto N, Sasaki H, Sato K, Schönbach C, Seya T, Shibata Y, Storch KF, Suzuki H, Toyo-oka K, Wang KH, Weitz C, Whittaker C, Wilming L, Wynshaw-Boris A, Yoshida K, Hasegawa Y, Kawaji H, Kohtsuki S, Hayashizaki Y. Functional annotation of a full-length mouse cDNA collection. Nature 2001; 409:685-90. [PMID: 11217851 DOI: 10.1038/35055500] [Citation(s) in RCA: 487] [Impact Index Per Article: 21.2] [Indexed: 01/31/2023]
Abstract
The RIKEN Mouse Gene Encyclopaedia Project, a systematic approach to determining the full coding potential of the mouse genome, involves collection and sequencing of full-length complementary DNAs and physical mapping of the corresponding genes to the mouse genome. We organized an international functional annotation meeting (FANTOM) to annotate the first 21,076 cDNAs to be analysed in this project. Here we describe the first RIKEN clone collection, which is one of the largest described for any organism. Analysis of these cDNAs extends known gene families and identifies new ones.
|
306
|
Quackenbush J. Expression Profiler: A suite of web-based tools for the analysis of microarray gene expression data. Brief Bioinform 2001. [DOI: 10.1093/bib/2.4.388] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
|
307
|
Quackenbush J, Cho J, Lee D, Liang F, Holt I, Karamycheva S, Parvizi B, Pertea G, Sultana R, White J. The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species. Nucleic Acids Res 2001; 29:159-64. [PMID: 11125077 PMCID: PMC29813 DOI: 10.1093/nar/29.1.159] [Citation(s) in RCA: 318] [Impact Index Per Article: 13.8] [Indexed: 11/14/2022] Open
Abstract
While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
|
308
|
|
309
|
Yuan Q, Liang F, Hsiao J, Zismann V, Benito MI, Quackenbush J, Wing R, Buell R. Anchoring of rice BAC clones to the rice genetic map in silico. Nucleic Acids Res 2000; 28:3636-41. [PMID: 10982886 PMCID: PMC110739 DOI: 10.1093/nar/28.18.3636] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Open
Abstract
A wealth of molecular resources have been developed for rice genomics, including dense genetic maps, expressed sequence tags (ESTs), yeast artificial chromosome maps, bacterial artificial chromosome (BAC) libraries and BAC end sequence databases. Integration of genetic and physical maps involves labor-intensive empirical experiments. To accelerate the integration of the bacterial clone resources with the genetic map for the International Rice Genome Sequencing Project, we cleaned and filtered the available EST and BAC end sequences for repetitive sequences and then searched all available rice genetic markers with our filtered databases. We identified 418 genetic markers that aligned with at least one BAC end sequence with >95% sequence identity, providing a set of large insert clones with an average separation of 1 Mb that can serve as nucleation points for the sequencing phase of the International Rice Genome Sequencing Project.
|
310
|
Liang F, Holt I, Pertea G, Karamycheva S, Salzberg SL, Quackenbush J. An optimized protocol for analysis of EST sequences. Nucleic Acids Res 2000; 28:3657-65. [PMID: 10982889 PMCID: PMC110731 DOI: 10.1093/nar/28.18.3657] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022] Open
Abstract
The vast body of Expressed Sequence Tag (EST) data in the public databases provides an important resource for comparative and functional genomics studies and an invaluable tool for the annotation of genomic sequences. We have developed a rigorous protocol for reconstructing the sequences of transcribed genes from EST and gene sequence fragments. A key element in developing this protocol has been the evaluation of a number of sequence assembly programs to determine which most faithfully reproduce transcript sequences from EST data. The TIGR Gene Indices constructed using this protocol for human, mouse, rat and a variety of other plant and animal models have demonstrated their utility in a variety of applications and are freely available to the scientific research community.
|
311
|
Hegde P, Qi R, Abernathy K, Gay C, Dharap S, Gaspard R, Hughes JE, Snesrud E, Lee N, Quackenbush J. A concise guide to cDNA microarray analysis. Biotechniques 2000; 29:548-50, 552-4, 556 passim. [PMID: 10997270 DOI: 10.2144/00293bi01] [Citation(s) in RCA: 668] [Impact Index Per Article: 27.8] [Indexed: 11/23/2022] Open
Abstract
Microarray expression analysis has become one of the most widely used functional genomics tools. Efficient application of this technique requires the development of robust and reproducible protocols. We have optimized all aspects of the process, including PCR amplification of target cDNA clones, microarray printing, probe labeling and hybridization, and have developed strategies for data normalization and analysis.
|
312
|
Liang F, Holt I, Pertea G, Karamycheva S, Salzberg SL, Quackenbush J. Gene index analysis of the human genome estimates approximately 120,000 genes. Nat Genet 2000; 25:239-40. [PMID: 10835646 DOI: 10.1038/76126] [Citation(s) in RCA: 195] [Impact Index Per Article: 8.1] [Indexed: 11/08/2022]
Abstract
Although sequencing of the human genome will soon be completed, gene identification and annotation remains a challenge. Early estimates suggested that there might be 60,000-100,000 (ref. 1) human genes, but recent analyses of the available data from EST sequencing projects have estimated as few as 45,000 (ref. 2) or as many as 140,000 (ref. 3) distinct genes. The Chromosome 22 Sequencing Consortium estimated a minimum of 45,000 genes based on their annotation of the complete chromosome, although their data suggests there may be additional genes. The nearly 2,000,000 human ESTs in dbEST provide an important resource for gene identification and genome annotation, but these single-pass sequences must be carefully analysed to remove contaminating sequences, including those from genomic DNA, spurious transcription, and vector and bacterial sequences. We have developed a highly refined and rigorously tested protocol for cleaning, clustering and assembling EST sequences to produce high-fidelity consensus sequences for the represented genes (F.L. et al., manuscript submitted) and used this to create the TIGR Gene Indices, databases of expressed genes for human, mouse, rat and other species (http://www.tigr.org/tdb/tgi.html). Using highly refined and tested algorithms for EST analysis, we have arrived at two independent estimates indicating the human genome contains approximately 120,000 genes.
|
313
|
Abstract
The haploid nuclear genome of the African trypanosome, Trypanosoma brucei, is about 35 Mb and varies in size among different trypanosome isolates by as much as 25%. The nuclear DNA of this diploid organism is distributed among three size classes of chromosomes: the megabase chromosomes of which there are at least 11 pairs ranging from 1 Mb to more than 6 Mb (numbered I-XI from smallest to largest); several intermediate chromosomes of 200-900 kb and uncertain ploidy; and about 100 linear minichromosomes of 50-150 kb. Size differences of as much as four-fold can occur, both between the two homologues of a megabase chromosome pair in a specific trypanosome isolate and among chromosome pairs in different isolates. The genomic DNA sequences determined to date indicated that about 50% of the genome is coding sequence. The chromosomal telomeres possess TTAGGG repeats and many, if not all, of the telomeres of the megabase and intermediate chromosomes are linked to expression sites for genes encoding variant surface glycoproteins (VSGs). The minichromosomes serve as repositories for VSG genes since some but not all of their telomeres are linked to unexpressed VSG genes. A gene discovery program, based on sequencing the ends of cloned genomic DNA fragments, has generated more than 20 Mb of discontinuous single-pass genomic sequence data during the past year, and the complete sequences of chromosomes I and II (about 1 Mb each) in T. brucei GUTat 10.1 are currently being determined. It is anticipated that the entire genomic sequence of this organism will be known in a few years. Analysis of a test microarray of 400 cDNAs and small random genomic DNA fragments probed with RNAs from two developmental stages of T. brucei demonstrates that the microarray technology can be used to identify batteries of genes differentially expressed during the various life cycle stages of this parasite.
|
314
|
Quackenbush J, Liang F, Holt I, Pertea G, Upton J. The TIGR gene indices: reconstruction and representation of expressed gene sequences. Nucleic Acids Res 2000; 28:141-5. [PMID: 10592205 PMCID: PMC102391 DOI: 10.1093/nar/28.1.141] [Citation(s) in RCA: 271] [Impact Index Per Article: 11.3] [Indexed: 11/15/2022] Open
Abstract
Expressed sequence tags (ESTs) have provided a first glimpse of the collection of transcribed sequences in a variety of organisms. However, a careful analysis of this sequence data can provide significant additional functional, structural and evolutionary information. Our analysis of the public EST sequences, available through the TIGR Gene Indices (TGI; http://www.tigr.org/tdb/tdb.html), is an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed for selected organisms by first clustering, then assembling EST and annotated gene sequences from GenBank. This process produces a set of unique, high-fidelity virtual transcripts, or tentative consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, and to provide links between orthologous and paralogous genes.
|
315
|
Malek RL, Guo Q, Ruffy M, Liu ET, Holt I, Chandra I, Liang F, Upton J, Quackenbush J, Jove R, Yeatman TJ, Lee NH. Use of the Rat Gene Index to examine gene expression patterns from Src-transformed rat fibroblasts that exhibit broad differences in metastatic potential. Nat Genet 1999. [DOI: 10.1038/14362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
|
316
|
Doyle DJ, Quackenbush J. Symposium on Genomic Medicine, University of Maryland, Shady Grove Campus, Rockville, Maryland, March 17-18, 1997. Microbial & Comparative Genomics 1998; 2:99-102. [PMID: 9689218 DOI: 10.1089/omi.1.1997.2.99] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
|
317
|
Korenberg JR, Aaltonen J, Brahe C, Cabin D, Creau N, Delabar JM, Doering J, Gardiner K, Hubert RS, Ives J, Kessling A, Kudoh J, Lafrenière R, Murakami Y, Ohira M, Ohki M, Patterson D, Potier MC, Quackenbush J, Reeves RH, Sakaki Y, Shimizu N, Soeda E, Van Broeckhoven C, Yaspo ML. Report and abstracts of the Sixth International Workshop on Human Chromosome 21 Mapping 1996. Cold Spring Harbor, New York, USA. May 6-8, 1996. Cytogenetics and Cell Genetics 1998; 79:21-52. [PMID: 9533011 DOI: 10.1159/000134681] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
|
318
|
Fraser CM, Casjens S, Huang WM, Sutton GG, Clayton R, Lathigra R, White O, Ketchum KA, Dodson R, Hickey EK, Gwinn M, Dougherty B, Tomb JF, Fleischmann RD, Richardson D, Peterson J, Kerlavage AR, Quackenbush J, Salzberg S, Hanson M, van Vugt R, Palmer N, Adams MD, Gocayne J, Weidman J, Utterback T, Watthey L, McDonald L, Artiach P, Bowman C, Garland S, Fuji C, Cotton MD, Horst K, Roberts K, Hatch B, Smith HO, Venter JC. Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 1997; 390:580-6. [PMID: 9403685 DOI: 10.1038/37551] [Citation(s) in RCA: 1498] [Impact Index Per Article: 55.5] [Indexed: 02/05/2023]
Abstract
The genome of the bacterium Borrelia burgdorferi B31, the aetiologic agent of Lyme disease, contains a linear chromosome of 910,725 base pairs and at least 17 linear and circular plasmids with a combined size of more than 533,000 base pairs. The chromosome contains 853 genes encoding a basic set of proteins for DNA replication, transcription, translation, solute transport and energy metabolism, but, like Mycoplasma genitalium, it contains no genes for cellular biosynthetic reactions. Because B. burgdorferi and M. genitalium are distantly related eubacteria, we suggest that their limited metabolic capacities reflect convergent evolution by gene loss from more metabolically competent progenitors. Of 430 genes on 11 plasmids, most have no known biological function; 39% of plasmid genes are paralogues that form 47 gene families. The biological significance of the multiple plasmid-encoded genes is not clear, although they may be involved in antigenic variation or immune evasion.
|
319
|
Klenk HP, Clayton RA, Tomb JF, White O, Nelson KE, Ketchum KA, Dodson RJ, Gwinn M, Hickey EK, Peterson JD, Richardson DL, Kerlavage AR, Graham DE, Kyrpides NC, Fleischmann RD, Quackenbush J, Lee NH, Sutton GG, Gill S, Kirkness EF, Dougherty BA, McKenney K, Adams MD, Loftus B, Peterson S, Reich CI, McNeil LK, Badger JH, Glodek A, Zhou L, Overbeek R, Gocayne JD, Weidman JF, McDonald L, Utterback T, Cotton MD, Spriggs T, Artiach P, Kaine BP, Sykes SM, Sadow PW, D'Andrea KP, Bowman C, Fujii C, Garland SA, Mason TM, Olsen GJ, Fraser CM, Smith HO, Woese CR, Venter JC. The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 1997; 390:364-70. [PMID: 9389475 DOI: 10.1038/37052] [Citation(s) in RCA: 990] [Impact Index Per Article: 36.7] [Indexed: 02/05/2023]
Abstract
Archaeoglobus fulgidus is the first sulphur-metabolizing organism to have its genome sequence determined. Its genome of 2,178,400 base pairs contains 2,436 open reading frames (ORFs). The information processing systems and the biosynthetic pathways for essential components (nucleotides, amino acids and cofactors) have extensive correlation with their counterparts in the archaeon Methanococcus jannaschii. The genomes of these two Archaea indicate dramatic differences in the way these organisms sense their environment, perform regulatory and transport functions, and gain energy. In contrast to M. jannaschii, A. fulgidus has fewer restriction-modification systems, and none of its genes appears to contain inteins. A quarter (651 ORFs) of the A. fulgidus genome encodes functionally uncharacterized yet conserved proteins, two-thirds of which are shared with M. jannaschii (428 ORFs). Another quarter of the genome encodes new proteins indicating substantial archaeal gene diversity.
|
320
|
Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, Nelson K, Quackenbush J, Zhou L, Kirkness EF, Peterson S, Loftus B, Richardson D, Dodson R, Khalak HG, Glodek A, McKenney K, Fitzegerald LM, Lee N, Adams MD, Hickey EK, Berg DE, Gocayne JD, Utterback TR, Peterson JD, Kelley JM, Cotton MD, Weidman JM, Fujii C, Bowman C, Watthey L, Wallin E, Hayes WS, Borodovsky M, Karp PD, Smith HO, Fraser CM, Venter JC. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 1997; 388:539-47. [PMID: 9252185 DOI: 10.1038/41483] [Citation(s) in RCA: 2543] [Impact Index Per Article: 94.2] [Indexed: 02/06/2023]
Abstract
Helicobacter pylori, strain 26695, has a circular genome of 1,667,867 base pairs and 1,590 predicted coding sequences. Sequence analysis indicates that H. pylori has well-developed systems for motility, for scavenging iron, and for DNA restriction and modification. Many putative adhesins, lipoproteins and other outer membrane proteins were identified, underscoring the potential complexity of host-pathogen interaction. Based on the large number of sequence-related genes encoding outer membrane proteins and the presence of homopolymeric tracts and dinucleotide repeats in coding sequences, H. pylori, like several other mucosal pathogens, probably uses recombination and slipped-strand mispairing within repeats as mechanisms for antigenic variation and adaptive evolution. Consistent with its restricted niche, H. pylori has a few regulatory networks, and a limited metabolic repertoire and biosynthetic capacity. Its survival in acid conditions depends, in part, on its ability to establish a positive inside-membrane potential in low pH.
|
321
|
Stewart EA, McKusick KB, Aggarwal A, Bajorek E, Brady S, Chu A, Fang N, Hadley D, Harris M, Hussain S, Lee R, Maratukulam A, O'Connor K, Perkins S, Piercy M, Qin F, Reif T, Sanders C, She X, Sun WL, Tabar P, Voyticky S, Cowles S, Fan JB, Mader C, Quackenbush J, Myers RM, Cox DR. An STS-based radiation hybrid map of the human genome. Genome Res 1997; 7:422-33. [PMID: 9149939 DOI: 10.1101/gr.7.5.422] [Citation(s) in RCA: 239] [Impact Index Per Article: 8.9] [Indexed: 02/04/2023]
Abstract
We have constructed a physical map of the human genome by using a panel of 83 whole genome radiation hybrids (the Stanford G3 panel) in conjunction with 10,478 sequence-tagged sites (STSs) derived from random genomic DNA sequences, previously mapped genetic markers, and expressed sequences. Of these STSs, 5049 are framework markers that fall into 1766 high-confidence bins. An additional 945 STSs are indistinguishable in their map location from one or more of the framework markers. These 5994 mapped STSs have an average spacing of 500 kb. An additional 4484 STSs are positioned with respect to the framework markers. Comparison of the orders of markers on this map with orders derived from independent meiotic and YAC STS-content maps indicates that the error rate in defining high-confidence bins is < 5%. Analysis of 322 random cDNAs indicates that the map covers the vast majority of the human genome. This STS-based radiation hybrid map of the human genome brings us one step closer to the goal of a physical map containing 30,000 unique ordered landmarks with an average marker spacing of 100 kb.
|
322
|
Inoue I, Nakajima T, Williams CS, Quackenbush J, Puryear R, Powers M, Cheng T, Ludwig EH, Sharma AM, Hata A, Jeunemaitre X, Lalouel JM. A nucleotide substitution in the promoter of human angiotensinogen is associated with essential hypertension and affects basal transcription in vitro. J Clin Invest 1997; 99:1786-97. [PMID: 9120024 PMCID: PMC508000 DOI: 10.1172/jci119343] [Citation(s) in RCA: 396] [Impact Index Per Article: 14.7] [Indexed: 02/04/2023] Open
Abstract
In earlier studies, we provided statistical evidence that individual differences in the angiotensinogen gene, the precursor of the vasoactive hormone angiotensin II, constitute inherited predispositions to essential hypertension in humans. We have now identified a common variant in the proximal promoter, the presence of an adenine, instead of a guanine, 6 bp upstream from the initiation site of transcription, in significant association with the disorder. Tests of promoter activity and DNA binding studies with nuclear proteins suggest that this nucleotide substitution affects the basal transcription rate of the gene. These observations provide some biological insight about the possible mechanism of a genetic predisposition to essential hypertension; they may also have important evolutionary implications.
|
323
|
Schuler GD, Boguski MS, Stewart EA, Stein LD, Gyapay G, Rice K, White RE, Rodriguez-Tomé P, Aggarwal A, Bajorek E, Bentolila S, Birren BB, Butler A, Castle AB, Chiannilkulchai N, Chu A, Clee C, Cowles S, Day PJR, Dibling T, East C, Drouot N, Dunham I, Duprat S, Edwards C, Fan JB, Fang N, Fizames C, Garrett C, Green L, Hadley D, Harris M, Harrison P, Brady S, Hicks A, Holloway E, Hui L, Hussain S, Louis-Dit-Sully C, Ma J, MacGilvery A, Mader C, Maratukulam A, Matise TC, McKusick KB, Morissette J, Mungall A, Muselet D, Nusbaum HC, Page DC, Peck A, Perkins S, Piercy M, Qin F, Quackenbush J, Ranby S, Reif T, Rozen S, Sanders C, She X, Silva J, Slonim DK, Soderlund C, Sun WL, Tabar P, Thangarajah T, Vega-Czarny N, Vollrath D, Voyticky S, Wilmer T, Wu X, Adams MD, Auffray C, Walter NAR, Brandon R, Dehejia A, Goodfellow PN, Houlgatte R, Hudson JR, Ide SE, Iorio KR, Lee WY, Seki N, Nagase T, Ishikawa K, Nomura N, Phillips C, Polymeropoulos MH, Sandusky M, Schmitt K, Berry R, Swanson K, Torres R, Venter JC, Sikela JM, Beckmann JS, Weissenbach J, Myers RM, Cox DR, James MR, Bentley D, Deloukas P, Lander ES, Hudson TJ. A Gene Map of the Human Genome. Science 1996. [DOI: 10.1126/science.274.5287.540] [Citation(s) in RCA: 717] [Impact Index Per Article: 25.6] [Indexed: 11/02/2022]
|
324
|
Schuler GD, Boguski MS, Stewart EA, Stein LD, Gyapay G, Rice K, White RE, Rodriguez-Tomé P, Aggarwal A, Bajorek E, Bentolila S, Birren BB, Butler A, Castle AB, Chiannilkulchai N, Chu A, Clee C, Cowles S, Day PJ, Dibling T, Drouot N, Dunham I, Duprat S, East C, Edwards C, Fan JB, Fang N, Fizames C, Garrett C, Green L, Hadley D, Harris M, Harrison P, Brady S, Hicks A, Holloway E, Hui L, Hussain S, Louis-Dit-Sully C, Ma J, MacGilvery A, Mader C, Maratukulam A, Matise TC, McKusick KB, Morissette J, Mungall A, Muselet D, Nusbaum HC, Page DC, Peck A, Perkins S, Piercy M, Qin F, Quackenbush J, Ranby S, Reif T, Rozen S, Sanders C, She X, Silva J, Slonim DK, Soderlund C, Sun WL, Tabar P, Thangarajah T, Vega-Czarny N, Vollrath D, Voyticky S, Wilmer T, Wu X, Adams MD, Auffray C, Walter NA, Brandon R, Dehejia A, Goodfellow PN, Houlgatte R, Hudson JR, Ide SE, Iorio KR, Lee WY, Seki N, Nagase T, Ishikawa K, Nomura N, Phillips C, Polymeropoulos MH, Sandusky M, Schmitt K, Berry R, Swanson K, Torres R, Venter JC, Sikela JM, Beckmann JS, Weissenbach J, Myers RM, Cox DR, James MR, Bentley D, Deloukas P, Lander ES, Hudson TJ. A gene map of the human genome. Science 1996; 274:540-6. [PMID: 8849440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
The human genome is thought to harbor 50,000 to 100,000 genes, of which about half have been sampled to date in the form of expressed sequence tags. An international consortium was organized to develop and map gene-based sequence tagged site markers on a set of two radiation hybrid panels and a yeast artificial chromosome library. More than 16,000 human genes have been mapped relative to a framework map that contains about 1000 polymorphic genetic markers. The gene map unifies the existing genetic and physical maps with the nucleotide and protein sequence databases in a fashion that should speed the discovery of genes underlying inherited human disease. The integrated resource is available through a site on the World Wide Web at http://www.ncbi.nlm.nih.gov/SCIENCE96/.
|
325
|
Quackenbush J, Davies C, Bailis JM, Khristich JV, Diggle K, Marchuck Y, Tobin J, Clark SP, Rodkins A, Marcano S. An STS content map of human chromosome 11: localization of 910 YAC clones and 109 islands. Genomics 1995; 29:512-25. [PMID: 8666402 DOI: 10.1006/geno.1995.9974] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 02/01/2023]
Abstract
Physical mapping of human chromosomes at a resolution of 100 kb to 1 Mb will provide important reagents for gene identification and framework templates for ultimately determining the complete DNA sequence. Sequence-tagged site (STS) content mapping, coupled with large fragment cloning in yeast artificial chromosomes, provides an efficient mechanism for producing first-generation, low-resolution maps of human chromosomes. Previously, we produced a set of standardized STSs for human chromosome 11 regionally localized by fluorescence in situ hybridization or somatic cell hybrid analysis. In this paper, we used these as well as other STS content, and identify 109 islands spanning an estimated 218 Mb on the 126-Mb chromosome. Since about 62% of the islands contain markers ordered on chromosome 11 by genetic or radiation hybrid analysis, this data set represents a first-order approximation of a physical map of human chromosome 11. This set of clones, contigs, and associated STSs will provide the material for the production of a continuous overlapping set of YACs as well for high-resolution physical mapping based upon sampled and complete DNA sequencing.
|