151
|
Wendl MC, Korf I, Chinwalla AT, Hillier LW. Automated processing of raw DNA sequence data. IEEE Engineering in Medicine and Biology Magazine: the Quarterly Magazine of the Engineering in Medicine & Biology Society 2001; 20:41-8. [PMID: 11494768 DOI: 10.1109/51.940044] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/10/2022]
Affiliation(s)
- M C Wendl
- Genome Sequencing Center, Washington University, St. Louis, USA.
|
152
|
Null AP, Muddiman DC. Perspectives on the use of electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry for short tandem repeat genotyping in the post-genome era. Journal of Mass Spectrometry: JMS 2001; 36:589-606. [PMID: 11433532 DOI: 10.1002/jms.172] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 05/23/2023]
Abstract
The recent completion of the first rough draft of the human genome has provided fundamental information regarding our genetic make-up; however, the post-genome era will certainly require a host of new technologies to address complex biological questions. In particular, a rapid and accurate approach to characterize genetic markers, including short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) is demanded. STRs are the most informative of the two polymorphisms owing to their remarkable variability and even dispersity throughout eukaryotic genomes. Mass spectrometry is rapidly becoming a significant method in DNA analysis and has high probability of revolutionizing the way in which scientists probe the human genome. It is our responsibility as biomolecular mass spectrometrists to understand the issues in genetic analysis and the capabilities of mass spectrometry so that we may fulfill our role in developing a rapid, reliable technology to answer specific biological questions. This perspective is intended to familiarize the mass spectrometry community with modern genomics and to report on the current state of mass spectrometry, specifically electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry, for characterization of STRs.
Affiliation(s)
- A P Null
- Department of Chemistry, Virginia Commonwealth University, Richmond, Virginia 23284, USA
|
153
|
Zhuo D, Zhao WD, Wright FA, Yang HY, Wang JP, Sears R, Baer T, Kwon DH, Gordon D, Gibbs S, Dai D, Yang Q, Spitzner J, Krahe R, Stredney D, Stutz A, Yuan B. Assembly, Annotation, and Integration of UNIGENE Clusters into the Human Genome Draft. Genome Res 2001. [DOI: 10.1101/gr.164501] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/24/2022]
Abstract
The recent release of the first draft of the human genome provides an unprecedented opportunity to integrate human genes and their functions in a complete positional context. However, at least three significant technical hurdles remain: first, to assemble a complete and nonredundant human transcript index; second, to accurately place the individual transcript indices on the human genome; and third, to functionally annotate all human genes. Here, we report the extension of the UNIGENE database through the assembly of its sequence clusters into nonredundant sequence contigs. Each resulting consensus was aligned to the human genome draft. A unique location for each transcript within the human genome was determined by the integration of the restriction fingerprint, assembled genomic contig, and radiation hybrid (RH) maps. A total of 59,500 UNIGENE clusters were mapped on the basis of at least three independent criteria as compared with the 30,000 human genes/ESTs currently mapped in Genemap'99. Finally, the extension of the human transcript consensus in this study enabled a greater number of putative functional assignments than the 11,000 annotated entries in UNIGENE. This study reports a draft physical map with annotations for a majority of the human transcripts, called the Human Index of Nonredundant Transcripts (HINT). Such information can be immediately applied to the discovery of new genes and the identification of candidate genes for positional cloning.
|
154
|
Kan Z, Rouchka EC, Gish WR, States DJ. Gene structure prediction and alternative splicing analysis using genomically aligned ESTs. Genome Res 2001; 11:889-900. [PMID: 11337482 PMCID: PMC311065 DOI: 10.1101/gr.155001] [Citation(s) in RCA: 255] [Impact Index Per Article: 10.6] [Indexed: 11/24/2022]
Abstract
With the availability of a nearly complete sequence of the human genome, aligning expressed sequence tags (ESTs) to the genomic sequence has become a practical and powerful strategy for gene prediction. Elucidating gene structure is a complex problem requiring the identification of splice junctions, gene boundaries, and alternative splicing variants. We have developed a software tool, Transcript Assembly Program (TAP), to delineate gene structures using genomically aligned EST sequences. TAP assembles the joint gene structure of the entire genomic region from individual splice junction pairs, using a novel algorithm that uses the EST-encoded connectivity and redundancy information to sort out the complex alternative splicing patterns. A method called polyadenylation site scan (PASS) has been developed to detect poly-A sites in the genome. TAP uses these predictions to identify gene boundaries by segmenting the joint gene structure at polyadenylated terminal exons. Reconstructing 1007 known transcripts, TAP scored a sensitivity (Sn) of 60% and a specificity (Sp) of 92% at the exon level. The gene boundary identification process was found to be accurate 78% of the time. TAP also reports alternative splicing patterns in EST alignments. An analysis of alternative splicing in 1124 genic regions suggested that more than half of human genes undergo alternative splicing. Surprisingly, we saw an absolute majority of the detected alternative splicing events affect the coding region. Furthermore, the evolutionary conservation of alternative splicing between human and mouse was analyzed using an EST-based approach. (See http://stl.wustl.edu/~zkan/TAP/)
Affiliation(s)
- Z Kan
- Center for Computational Biology, Washington University, St. Louis, Missouri 63110, USA
|
155
|
Harrington JJ, Sherf B, Rundlett S, Jackson PD, Perry R, Cain S, Leventhal C, Thornton M, Ramachandran R, Whittington J, Lerner L, Costanzo D, McElligott K, Boozer S, Mays R, Smith E, Veloso N, Klika A, Hess J, Cothren K, Lo K, Offenbacher J, Danzig J, Ducar M. Creation of genome-wide protein expression libraries using random activation of gene expression. Nat Biotechnol 2001; 19:440-5. [PMID: 11329013 DOI: 10.1038/88107] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.4] [Indexed: 11/08/2022]
Abstract
Here we report the use of random activation of gene expression (RAGE) to create genome-wide protein expression libraries. RAGE libraries containing only 5 × 10^6 individual clones were found to express every gene tested, including genes that are normally silent in the parent cell line. Furthermore, endogenous genes were activated at similar frequencies and expressed at similar levels within RAGE libraries created from multiple human cell lines, demonstrating that RAGE libraries are inherently normalized. Pools of RAGE clones were used to isolate 19,547 human gene clusters, approximately 53% of which were novel when tested against public databases of expressed sequence tag (EST) and complementary DNA (cDNA). Isolation of individual clones confirmed that the activated endogenous genes can be expressed at high levels to produce biologically active proteins. The properties of RAGE libraries and RAGE expression clones are well suited for a number of biotechnological applications including gene discovery, protein characterization, drug development, and protein manufacturing.
Affiliation(s)
- J J Harrington
- Athersys, Inc., 3201 Carnegie Ave., Cleveland, OH 44115, USA.
|
156
|
Emmert-Buck MR, Strausberg RL, Krizman DB, Bonaldo MF, Bonner RF, Bostwick DG, Brown MR, Buetow KH, Chuaqui RF, Cole KA, Duray PH, Englert CR, Gillespie JW, Greenhut S, Grouse L, Hillier LW, Katz KS, Klausner RD, Kuznetzov V, Lash AE, Lennon G, Linehan WM, Liotta LA, Marra MA, Munson PJ, Ornstein DK, Prabhu VV, Prang C, Schuler GD, Soares MB, Tolstoshev CM, Vocke CD, Waterston RH. Molecular profiling of clinical tissues specimens: feasibility and applications. J Mol Diagn 2001; 2:60-6. [PMID: 11272889 PMCID: PMC1906897 DOI: 10.1016/s1525-1578(10)60617-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Indexed: 10/18/2022] Open
Affiliation(s)
- M R Emmert-Buck
- Pathogenetics Unit, Laboratory of Pathology, National Cancer Institute, Bethesda, Maryland, USA.
|
157
|
Park CH, Valore EV, Waring AJ, Ganz T. Hepcidin, a urinary antimicrobial peptide synthesized in the liver. J Biol Chem 2001; 276:7806-10. [PMID: 11113131 DOI: 10.1074/jbc.m008922200] [Citation(s) in RCA: 1506] [Impact Index Per Article: 62.8] [Indexed: 12/14/2022] Open
Abstract
Cysteine-rich antimicrobial peptides are abundant in animal and plant tissues involved in host defense. In insects, most are synthesized in the fat body, an organ analogous to the liver of vertebrates. From human urine, we characterized a cysteine-rich peptide with three forms differing by amino-terminal truncation, and we named it hepcidin (Hepc) because of its origin in the liver and its antimicrobial properties. Two predominant forms, Hepc20 and Hepc25, contained 20 and 25 amino acid residues with all 8 cysteines connected by intramolecular disulfide bonds. Reverse translation and search of the data bases found homologous liver cDNAs in species from fish to human and a corresponding human genomic sequence on human chromosome 19. The full cDNA by 5' rapid amplification of cDNA ends was 0.4 kilobase pair, in agreement with hepcidin mRNA size on Northern blots. The liver was the predominant site of mRNA expression. The encoded prepropeptide contains 84 amino acids, but only the 20-25-amino acid processed forms were found in urine. Hepcidins exhibited antifungal activity against Candida albicans, Aspergillus fumigatus, and Aspergillus niger and antibacterial activity against Escherichia coli, Staphylococcus aureus, Staphylococcus epidermidis, and group B Streptococcus. Hepcidin may be a vertebrate counterpart of cysteine-rich antimicrobial peptides produced in the fat body of insects.
Affiliation(s)
- C H Park
- Departments of Medicine and Pathology, UCLA School of Medicine, Los Angeles, California 90059, USA
|
158
|
Hill CA, Gutierrez JA. Analysis of the expressed genome of the lone star tick, Amblyomma americanum (Acari: Ixodidae) using an expressed sequence tag approach. Microbial & Comparative Genomics 2001; 5:89-101. [PMID: 11087176 DOI: 10.1089/10906590050179774] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]
Abstract
An expressed sequence tag (EST) approach was used to study the genome of two developmental stages of the lone star tick, Amblyomma americanum. cDNA libraries were constructed from the larval and adult stages of A. americanum. In total, 1942 ESTs were sequenced (1462 adult ESTs and 480 larval ESTs) and analyzed using bioinformatic programs. Contig assembly using the CAPII program revealed 11% and 15% redundancy of sequences in the larval and adult ESTs, respectively. Of the 1942 ESTs, 1738 sequences were considered quality sequences and of these, 771 or approximately 44.4% of the sequences were putatively identified based on amino acid identity using the protein Basic Local Alignment Search Tool (BLAST) algorithm. Putatively identified sequences were classified according to their predicted gene function. In total, 967 sequences, or 55.6% of the quality sequences, had limited or no protein similarity to previously identified gene products. Sequences lacking protein homology were analyzed using an automated sequence annotation system for predicted protein characteristics such as open reading frames, signal peptides, protein motifs, and transmembrane regions. In this paper we describe the sequencing of the largest number of ESTs obtained from an arachnid species to date and the subsequent detailed analysis of these sequences.
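The counts and percentages quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative and not taken from the paper.

```python
# Figures as quoted in the abstract above.
adult_ests = 1462
larval_ests = 480
total_ests = adult_ests + larval_ests      # abstract states 1942

quality_seqs = 1738                        # sequences considered "quality"
identified = 771                           # putatively identified by protein BLAST
unidentified = 967                         # limited or no protein similarity

# Percentages are taken relative to the quality sequences, as in the abstract.
pct_identified = round(100 * identified / quality_seqs, 1)
pct_unidentified = round(100 * unidentified / quality_seqs, 1)

print(total_ests, pct_identified, pct_unidentified)
```

The two categories partition the quality set (771 + 967 = 1738), which is why the two percentages sum to 100.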
Affiliation(s)
- C A Hill
- Elanco Animal Health, A Division of Eli Lilly and Company, Greenfield, Indiana 46140, USA.
|
160
|
Navarro E, Espinosa L. Improving quality of expressed sequence tag (EST) databases: recovery of reversed, antisense cDNA sequences. Microbial & Comparative Genomics 2001; 5:17-24. [PMID: 11011762 DOI: 10.1089/10906590050145230] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]
Abstract
Expressed sequence tag (EST) databases contain a significant number (5-20%) of reversed, antisense, cDNA sequences that can be recognized by the label "reversed clone: similarity on wrong strand" in the annotations to the sequence. Despite this high number of altered sequences, no attempt has been made to explain the alteration in molecular terms, or to evaluate their effect on the quality of the information curated in EST databases. In this paper we try to explain the way these altered sequences are originated, and propose a plausible mechanism: a "double priming" of the first strand oligo-dT primer at both ends of nascent cDNAs. In this way, a symmetrical cDNA intermediate is generated, an intermediate that can be cloned after partial digestion with the restriction enzyme used for the directional cloning. Furthermore, when "secondary" priming takes place inside the cDNA, the chain synthesized is prone to be truncated prematurely, with the subsequent loss of upstream information. One of the most subtle effects of this cloning alteration is the generation of virtual open reading frames (ORFs) in sequences with no homologues available for comparison. Nevertheless, and according to our model and our data, the "double priming mechanism" does not shift the ORF effected, so antisense sequences should be considered as normal ones after a simple transformation in their inverse-complementary forms.
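The "simple transformation" into inverse-complementary form mentioned at the end of this abstract is a reverse complement. A minimal sketch follows, assuming plain DNA strings over A, C, G, T and N; the function names and the `is_reversed` flag are hypothetical and stand in for whatever annotation marks a clone as reversed.

```python
# Translation table for DNA complements; N is left as N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def normalize_est(seq: str, is_reversed: bool) -> str:
    """Return the sense-strand form of an EST.

    `is_reversed` is an assumed flag, e.g. derived from an annotation such as
    "reversed clone: similarity on wrong strand".
    """
    return reverse_complement(seq) if is_reversed else seq
```

Applying `reverse_complement` twice returns the original sequence, so the transformation loses no information.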
|
161
|
Ohara O, Temple G. Directional cDNA library construction assisted by the in vitro recombination reaction. Nucleic Acids Res 2001; 29:E22. [PMID: 11160942 PMCID: PMC29629 DOI: 10.1093/nar/29.4.e22] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022] Open
Abstract
We report here a new directional cDNA library construction method using an in vitro site-specific recombination reaction, based on the integrase-excisionase system of bacteriophage lambda. Preliminary experiments revealed that in vitro recombinational cloning (RC) provided important advantages over conventional ligation-assisted cloning: it eliminated restriction digestion for directional cloning, generated low levels of chimeric clones, reduced size bias and, in our hands, gave a higher cloning efficiency than conventional ligation reactions. In a cDNA cloning experiment using an in vitro synthesized long poly(A)(+) RNA (7.8 kb), the RC gave a higher full-length cDNA clone content and about 10 times more transformants than conventional ligation-assisted cloning. Furthermore, characterization of rat brain cDNA clones yielded by the RC method showed that the frequency of cDNA clones >2 kb having internal NotI sites was approximately 6%, whereas these cDNAs could not be cloned at all or could be isolated only in a truncated form by conventional methods. Taken together, these results indicate that the RC method makes it possible to prepare cDNA libraries better representing the entire population of cDNAs, without sacrificing the simplicity of current conventional ligation-assisted methods.
Affiliation(s)
- O Ohara
- Kazusa DNA Research Institute, 1532-3 Yana, Kisarazu, Chiba 292-0812, Japan.
|
162
|
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann Y, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowki J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ, Szustakowki J. Initial sequencing and analysis of the human genome. Nature 2001; 409:860-921. [PMID: 11237011 DOI: 10.1038/35057062] [Citation(s) in RCA: 15031] [Impact Index Per Article: 626.3] [Indexed: 12/11/2022]
Abstract
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Affiliation(s)
- E S Lander
- Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, MA 02142, USA.
|
163
|
Abstract
To date, a comprehensive survey of the expression of lysyl oxidase (LOX), lysyl oxidase-like 1 (LOXL1), and lysyl oxidase-like 2 (LOXL2) has yet to be performed. The use of in vitro strategies to accomplish this task would prove daunting as it is both time-consuming and costly. We present a new in silico data mining strategy that directly addresses these limitations. Sequences corresponding to the 3' untranslated regions of LOX, LOXL1, and LOXL2 were individually queried against the human expressed sequence tag database (dbEST). In this manner, the entire tissue repertoire available in the dbEST was surveyed. This provided an estimate of the levels of mRNA transcripts in a variety of adult and fetal tissues. We have also employed this strategy to determine the pattern of expression and levels of a newly discovered gene, CGI-15. The veracity of this technique has been independently assessed by semiquantitative PCR analysis. The application of this technology is bounded only by the ever-growing information available in the GenBank, UniGene, and human EST databases. The utility of our data mining strategy to establish relative transcript levels in numerous tissues is presented.
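The tallying step behind such an in silico survey can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe how hits were counted or normalized, so the `(est_id, tissue)` hit format, the per-thousand scaling by library size, and all names and numbers here are hypothetical.

```python
from collections import Counter


def tissue_hit_counts(hits):
    """Count EST hits per tissue; `hits` is an iterable of (est_id, tissue) pairs."""
    return Counter(tissue for _est_id, tissue in hits)


def hits_per_thousand(counts, library_sizes):
    """Scale raw hit counts by the number of ESTs sampled from each tissue.

    Tissues with an unknown or zero library size are skipped.
    """
    return {
        tissue: 1000 * n / library_sizes[tissue]
        for tissue, n in counts.items()
        if library_sizes.get(tissue)
    }


# Hypothetical example data, not taken from the study.
example_hits = [("est1", "lung"), ("est2", "lung"), ("est3", "heart")]
example_sizes = {"lung": 4000, "heart": 1000}
example_levels = hits_per_thousand(tissue_hit_counts(example_hits), example_sizes)
```

Scaling by library size is one plausible way to make tissues comparable, since raw hit counts otherwise favor tissues that were sequenced more deeply.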
Affiliation(s)
- R Pires Martins
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, Michigan 48201, USA
|
164
|
Taniguchi Y, Lejukole HY, Yamada T, Akagi S, Takahashi S, Shimizu M, Yasue H, Sasaki Y. Analysis of expressed sequence tags from a cDNA library of somatic nuclear transfer-derived cloned bovine whole foetus. Anim Genet 2001; 32:1-6. [PMID: 11419338 DOI: 10.1046/j.1365-2052.2001.00701.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
Abstract
The expression profile of genes in specific tissues is studied through analysing expressed sequence tags (ESTs) and provides useful information for characterizing gene function and tissue physiology. Analysis of ESTs is achieved by partial sequencing and characterization of clones isolated randomly from cDNA libraries. In the present study, we analysed the genes expressed in the somatic nuclear transfer-derived cloned bovine foetus in the early period of foetal development. To this aim, we constructed a directionally cloned cDNA library from somatic nuclear transfer-derived cloned 60 day-old whole foetus of cattle and sequenced 3' end of 510 randomly isolated clones. By BLASTN analysis, we identified 403 unique clones: 186 showed homology to previously identified genes, 123 matched uncharacterized ESTs and 94 showed no significant matches to sequences already present in DNA databases. Analysis of these cDNA clones revealed that this library contained a variety of functional genes, while foetuin, insulin-like growth factor 2, collagen type I alpha I and maternal G10 transcript genes were the most abundant transcripts. Our study allowed the establishment of a first list of genes expressed in bovine whole foetus. In future, the list of genes might help facilitate the understanding of physiology of foetal development in somatic nuclear transfer-derived cloned bovine foetus.
Affiliation(s)
- Y Taniguchi
- Laboratory of Animal Genetics and Breeding, Graduate School of Agriculture, Kyoto University, Sakyoku, Kyoto 606-8502, Japan
|
165
|
Abstract
In the continuing search for a full-length cDNA cloning method, there is no clear winner. Perfecting these techniques may require the re-engineering of reverse transcriptase. There now exist two reasonably linear methods for deriving expression signatures from small amounts of biological material, but advances in serial analysis of gene expression provide a quantitative, if expensive, alternative to these methods.
Affiliation(s)
- S Bashiardes
- Department of Genetics, Washington University School of Medicine, Campus Box 8232, 4566 Scott Avenue, St Louis, Missouri 63110-1093, USA
|
166
|
Sugahara Y, Carninci P, Itoh M, Shibata K, Konno H, Endo T, Muramatsu M, Hayashizaki Y. Comparative evaluation of 5'-end-sequence quality of clones in CAP trapper and other full-length-cDNA libraries. Gene 2001; 263:93-102. [PMID: 11223247 DOI: 10.1016/s0378-1119(00)00557-6] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.3] [Indexed: 11/23/2022]
Abstract
To enhance the usefulness of the laboratory mouse and to facilitate the rapid assay of gene functions we have been collecting the entire set of mouse full-length cDNA by one-pass sequencing. To collect full-length cDNA clones efficiently, it is critical to construct high-quality cDNA libraries. In recent years, we have been developing a way to construct full-length cDNA libraries by using biotinylation of the cap structure (the 'CAP-trapper' method) coupled with treatment to increase reverse transcriptase efficiency at high temperature by the addition of trehalose. In this paper we report our evaluation of the quality of CAP trapper and a number of other full-length cDNA libraries, including the results of 5' end analysis of clones in CAP trapper and the other libraries. We used a procedure that compared the 5'-ends of cDNA clones with those of genes in the public databases. Our analysis showed that 63% of cDNA clones in CAP trapper libraries had sequences that were either the same length as those of equivalent genes in the public database or 5'-extended, and that 90% of these clones maintained their coding sequences. These results indicate that the CAP trapper library is a promising tool for collecting full-length cDNA in large-scale projects. Comparison of the quality of CAP trapper with that of other full-length-cDNA libraries confirmed the value of these libraries.
Affiliation(s)
- Y Sugahara
- Laboratory for Genome Exploration Research Project, Genomic Sciences Center and Genome Science Laboratory, RIKEN Tsukuba Institute, 3-1-1 Koyadai, Tsukuba, Ibaraki 305-0074, Japan
|
167
|
High-throughput development and characterization of a genomewide collection of gene-based single nucleotide polymorphism markers by chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Proc Natl Acad Sci U S A 2001; 98. [PMID: 11136232 PMCID: PMC14630 DOI: 10.1073/pnas.021506298] [Citation(s) in RCA: 197] [Impact Index Per Article: 8.2] [Indexed: 11/18/2022] Open
Abstract
We describe here a system for the rapid identification, assay development, and characterization of gene-based single nucleotide polymorphisms (SNPs). This system couples informatics tools that mine candidate SNPs from public expressed sequence tag resources and automatically designs assay reagents with detection by a chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry platform. As a proof of concept of this system, a genomewide collection of reagents for 9,115 gene-based SNP genetic markers was rapidly developed and validated. These data provide preliminary insights into patterns of polymorphism in a genomewide collection of gene-based polymorphisms.
|
168
|
Buetow KH, Edmonson M, MacDonald R, Clifford R, Yip P, Kelley J, Little DP, Strausberg R, Koester H, Cantor CR, Braun A. High-throughput development and characterization of a genomewide collection of gene-based single nucleotide polymorphism markers by chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Proc Natl Acad Sci U S A 2001; 98:581-4. [PMID: 11136232 PMCID: PMC14630 DOI: 10.1073/pnas.98.2.581] [Citation(s) in RCA: 292] [Impact Index Per Article: 12.2] [Indexed: 11/18/2022] Open
Abstract
We describe here a system for the rapid identification, assay development, and characterization of gene-based single nucleotide polymorphisms (SNPs). This system couples informatics tools that mine candidate SNPs from public expressed sequence tag resources and automatically designs assay reagents with detection by a chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry platform. As a proof of concept of this system, a genomewide collection of reagents for 9,115 gene-based SNP genetic markers was rapidly developed and validated. These data provide preliminary insights into patterns of polymorphism in a genomewide collection of gene-based polymorphisms.
Affiliation(s)
- K H Buetow
- Laboratory of Population Genetics, Division of Cancer Epidemiology and Genetics, and Office of Genomics, National Cancer Institute, Bethesda, MD 20892-5060, USA.
|
169
|
Hadano S, Yanagisawa Y, Skaug J, Fichter K, Nasir J, Martindale D, Koop BF, Scherer SW, Nicholson DW, Rouleau GA, Ikeda J, Hayden MR. Cloning and characterization of three novel genes, ALS2CR1, ALS2CR2, and ALS2CR3, in the juvenile amyotrophic lateral sclerosis (ALS2) critical region at chromosome 2q33-q34: candidate genes for ALS2. Genomics 2001; 71:200-13. [PMID: 11161814 DOI: 10.1006/geno.2000.6392] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.7] [Indexed: 11/22/2022]
Abstract
Amyotrophic lateral sclerosis is a progressive neurodegenerative disease that manifests as selective upper and lower motor neuron degeneration. The autosomal recessive form of juvenile amyotrophic lateral sclerosis (ALS2) has previously been mapped to the 1.7-cM interval flanked by D2S116 and D2S2237 on human chromosome 2q33-q34. We identified three novel full-length transcripts encoded by three distinct genes (HGMW-approved symbols ALS2CR1, ALS2CR2, and ALS2CR3) within the ALS2 critical region. The intron-exon organizations of these genes as well as those of CFLAR, CASP10, and CASP8, which were previously mapped to this region, were defined. These genes were evaluated for mutations in ALS2 patients, and no disease-associated sequence alterations in either exons or intron-exon boundaries were observed. Sequence analysis of overlapping RT-PCR products covering the whole coding sequence for each transcript revealed no aberrant mRNA sequences. These data strongly indicate that ALS2CR1, ALS2CR2, ALS2CR3, CFLAR, CASP10, and CASP8 are not causative genes for ALS2.
Affiliation(s)
- S Hadano
- NeuroGenes, International Cooperative Research Project, Japan Science and Technology Corporation, Isehara, 259-1193, Japan
|
170
|
Loriod B, Victorero G, Nguyen C. cDNA Macroarrays and Microarrays on Nylon Membranes with Radioactive Detection. ACTA ACUST UNITED AC 2001. [DOI: 10.1007/978-3-642-56517-5_4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 03/12/2023]
|
171
|
Abstract
Analysis of the human genome draft sequences has revealed a more complete portrait of the olfactory receptor gene repertoire in humans than was available previously. The new information provides a basis for deeper analysis of the functions of the receptors, and promises new insights into the evolutionary history of the family.
Affiliation(s)
- C Crasto
- Department of Neurobiology, Yale University Medical School, 333 Cedar Street, New Haven, CT 06510, USA.
|
172
|
Szabo S, Deng X, Khomenko T, Yoshida M, Jadus MR, Sandor Z, Gombos Z, Matsumoto H. Gene expression and gene therapy in experimental duodenal ulceration. J Physiol Paris 2001; 95:325-35. [PMID: 11595457 DOI: 10.1016/s0928-4257(01)00045-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 01/26/2023]
Abstract
Gastroduodenal ulceration is still poorly understood and changes in gene expression may provide new mechanistic insights. Previously, we demonstrated that angiogenic growth factors are potent ulcer healing agents, and the synthesis of bFGF, PDGF and VEGF is enhanced early in duodenal ulcer healing. The initial molecular event in duodenal ulceration seems to be the organ-specific early release of ET-1 in the pre-ulcerogenic stages after the administration of duodenal ulcerogen cysteamine in rats. We also briefly review here data from literature indicating a central role of ET-1 in gastroduodenal ulceration. After studying the involvement of immediate early genes (e.g. egr-1, Sp1) in ulcer development, we now investigated expression of other genes in the duodenal mucosa in the early stages of chemically induced duodenal ulceration in rats. Following a brief review of principles of gene expression and gene therapy, we review our preliminary gene expression studies, involving monitoring about 1200 genes which revealed about 160 signals and prominent changes in about 30 genes in the early stages of experimental duodenal ulceration. Cysteamine enhanced ET-B receptor gene expression in 30 min, while transcription factors (MAX, STAT 3) showed increased expression in 12 h. We recently also initiated gene therapy studies to enhance the local synthesis of PDGF and VEGF to accelerate duodenal ulcer healing, using a single dose of naked DNA (ND) or adenoviral (AV) vectors of VEGF and PDGF in rats with cysteamine-induced duodenal ulcers. Gene therapy with ND or AV of VEGF or PDGF significantly accelerated chronic duodenal ulcer healing, and increased levels of VEGF and PDGF were detected by Western blotting and ELISA in duodenal mucosa after both VEGF and PDGF gene therapy. Thus, gene expression studies provide new insights into the molecular mechanisms of duodenal ulceration and VEGF or PDGF gene therapy seems to be a new option to achieve a rapid ulcer healing.
Affiliation(s)
- S Szabo
- Path. & Lab. Med. Service, VA Medical Center, 5901 E. 7th Street, Long Beach, CA 90822-5201, USA.
|
173
|
Posey KL, Jones LB, Cerda R, Bajaj M, Huynh T, Hardin PE, Hardin SH. Survey of transcripts in the adult Drosophila brain. Genome Biol 2001; 2:RESEARCH0008. [PMID: 11276425 PMCID: PMC30707 DOI: 10.1186/gb-2001-2-3-research0008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Received: 09/08/2000] [Revised: 01/22/2001] [Accepted: 01/24/2001] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Classic methods of identifying genes involved in neural function include the laborious process of behavioral screening of mutagenized flies and then rescreening candidate lines for pleiotropic effects due to developmental defects. To accelerate the molecular analysis of brain function in Drosophila we constructed a cDNA library exclusively from adult brains. Our goal was to begin to develop a catalog of transcripts expressed in the brain. These transcripts are expected to contain a higher proportion of clones that are involved in neuronal function. RESULTS The library contains approximately 6.75 million independent clones. From our initial characterization of 271 randomly chosen clones, we expect that approximately 11% of the clones in this library will identify transcribed sequences not found in expressed sequence tag databases. Furthermore, 15% of these 271 clones are not among the 13,601 predicted Drosophila genes. CONCLUSIONS Our analysis of this unique Drosophila brain library suggests that the number of genes may be underestimated in this organism. This work complements the Drosophila genome project by providing information that facilitates more complete annotation of the genomic sequence. This library should be a useful resource that will help in determining how basic brain functions operate at the molecular level.
Affiliation(s)
- Karen L Posey
- Department of Biology and Biochemistry, Institute of Molecular Biology, University of Houston, Houston, TX 77204-5513, USA
- Leslie B Jones
- Department of Biology and Biochemistry, Institute of Molecular Biology, University of Houston, Houston, TX 77204-5513, USA
- Rosalinda Cerda
- Department of Biology and Biochemistry, Institute of Molecular Biology, University of Houston, Houston, TX 77204-5513, USA
- Monica Bajaj
- Department of Biology and Biochemistry, Institute of Molecular Biology, University of Houston, Houston, TX 77204-5513, USA
- Thao Huynh
- Department of Biology and Biochemistry, Institute of Molecular Biology, University of Houston, Houston, TX 77204-5513, USA
- Paul E Hardin
- Department of Biology and Biochemistry, Institute of Molecular Biology, University of Houston, Houston, TX 77204-5513, USA
- Susan H Hardin
- Department of Biology and Biochemistry, Institute of Molecular Biology, University of Houston, Houston, TX 77204-5513, USA
|
174
|
Ruiz A, Pujana MA, Estivill X. Isolation and characterisation of a novel human gene (C9orf11) on chromosome 9p21, a region frequently deleted in human cancer. Biochim Biophys Acta 2000; 1517:128-34. [PMID: 11118625 DOI: 10.1016/s0167-4781(00)00272-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
Abstract
The chromosome 9p21 region has been described to be frequently deleted in several neoplasias. The cyclin dependent kinase inhibitor 2A (CDKN2A or P16) gene was cloned in this region and identified as a tumour suppressor gene. However, much evidence indicates the existence of another tumour suppressor gene located proximal to the CDKN2A gene, which could be involved in cutaneous malignant melanoma (CMM) initiation. In the present report we have further investigated this 9p21 chromosomal region and cloned and characterised a novel gene within it (C9orf11). This gene shares no similarities to any known gene or predicted protein representing a novel human gene. Nevertheless, a putative leucine zipper pattern is located at the C-terminal end of the predicted protein, suggesting that it could dimerise. C9orf11 encodes for a protein of 294 amino acids with a predicted molecular mass of 32.8 kDa. C9orf11 is organised in eight exons that encompass a region of approx. 13 kb. Expression analysis demonstrates that C9orf11 is highly expressed in testis, although minor expression was seen in other tissues. Mutations in the C9orf11 gene were not detected in CMM families that were negative for CDKN2A mutations. Two SNPs for the C9orf11 gene have been identified, which could be used in segregation or association studies for other disorders.
Affiliation(s)
- A Ruiz
- Medical and Molecular Genetics Centre - IRO, Hospital Duran i Reynals, Autovia de Castelldefels km 2,7, 08907 L'Hospitalet de Llobregat, Barcelona, Catalonia, Spain
|
175
|
Andrews J, Bouffard GG, Cheadle C, Lü J, Becker KG, Oliver B. Gene Discovery Using Computational and Microarray Analysis of Transcription in the Drosophila melanogaster Testis. Genome Res 2000. [DOI: 10.1101/gr.159800] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.8] [Indexed: 11/24/2022]
Abstract
Identification and annotation of all the genes in the sequenced Drosophila genome is a work in progress. Wild-type testis function requires many genes and is thus of potentially high value for the identification of transcription units. We therefore undertook a survey of the repertoire of genes expressed in the Drosophila testis by computational and microarray analysis. We generated 3141 high-quality testis expressed sequence tags (ESTs). Testis ESTs computationally collapsed into 1560 cDNA set used for further analysis. Of those, 11% correspond to named genes, and 33% provide biological evidence for a predicted gene. A surprising 47% fail to align with existing ESTs and 16% with predicted genes in the current genome release. EST frequency and microarray expression profiles indicate that the testis mRNA population is highly complex and shows an extended range of transcript abundance. Furthermore, >80% of the genes expressed in the testis showed onefold overexpression relative to ovaries, or gonadectomized flies. Additionally, >3% showed more than threefold overexpression at p < 0.05. Surprisingly, 22% of the genes most highly overexpressed in testis match Drosophila genomic sequence, but not predicted genes. These data strongly support the idea that sequencing additional cDNA libraries from defined tissues, such as testis, will be important tools for refined annotation of the Drosophila genome. Additionally, these data suggest that the number of genes in Drosophila will significantly exceed the conservative estimate of 13,601. [The sequence data described in this paper have been submitted to the dbEST data library under accession nos. AI944400–AI947263 and BE661985–BE662262.] [The microarray data described in this paper have been submitted to the GEO data library under accession nos. GPLS, GSM3–GSM10.]
|
176
|
Andrews J, Bouffard GG, Cheadle C, Lü J, Becker KG, Oliver B. Gene discovery using computational and microarray analysis of transcription in the Drosophila melanogaster testis. Genome Res 2000; 10:2030-43. [PMID: 11116097 PMCID: PMC313064 DOI: 10.1101/gr.10.12.2030] [Citation(s) in RCA: 164] [Impact Index Per Article: 6.6] [Indexed: 11/25/2022]
Abstract
Identification and annotation of all the genes in the sequenced Drosophila genome is a work in progress. Wild-type testis function requires many genes and is thus of potentially high value for the identification of transcription units. We therefore undertook a survey of the repertoire of genes expressed in the Drosophila testis by computational and microarray analysis. We generated 3141 high-quality testis expressed sequence tags (ESTs). Testis ESTs computationally collapsed into 1560 cDNA set used for further analysis. Of those, 11% correspond to named genes, and 33% provide biological evidence for a predicted gene. A surprising 47% fail to align with existing ESTs and 16% with predicted genes in the current genome release. EST frequency and microarray expression profiles indicate that the testis mRNA population is highly complex and shows an extended range of transcript abundance. Furthermore, >80% of the genes expressed in the testis showed onefold overexpression relative to ovaries, or gonadectomized flies. Additionally, >3% showed more than threefold overexpression at p <0.05. Surprisingly, 22% of the genes most highly overexpressed in testis match Drosophila genomic sequence, but not predicted genes. These data strongly support the idea that sequencing additional cDNA libraries from defined tissues, such as testis, will be important tools for refined annotation of the Drosophila genome. Additionally, these data suggest that the number of genes in Drosophila will significantly exceed the conservative estimate of 13,601.
Affiliation(s)
- J Andrews
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
|
177
|
Grottke C, Mantwill K, Dietel M, Schadendorf D, Lage H. Identification of differentially expressed genes in human melanoma cells with acquired resistance to various antineoplastic drugs. Int J Cancer 2000; 88:535-46. [PMID: 11058868 DOI: 10.1002/1097-0215(20001115)88:4<535::aid-ijc4>3.0.co;2-v] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
Abstract
Malignant melanoma displays strong resistance against various antineoplastic drugs. The mechanisms conferring this intrinsic resistance are unclear. To better understand the molecular events associated with drug resistance in melanoma, a panel of human melanoma cell variants exhibiting low and high levels of resistance to 4 commonly used drugs in melanoma treatment, i.e., vindesine, etoposide, fotemustine and cisplatin, was characterized by differential display reverse transcription-polymerase chain reaction (DDRT-PCR). Of 269 mRNA fragments found to be altered in expression level by DDRT-PCR, a total of 11 cDNA clones was characterized after confirmation of a differential expression pattern by Northern blot analyses. These clones include 3 genes (DSM-1, DSM-3 and DSM-5) of known function, 4 previously sequenced genes (DSM-2, DSM-4, DSM-6 and DSM-7) of uncharacterized function and 4 novel genes (DSM-8-DSM-11) without match in GenBank. All of these genes exhibited altered mRNA expression in high level etoposide-resistant cells, whereby 7 genes (DSM-1-DSM-6 and DSM-8) were found to be decreased in the transcription rate in these etoposide-resistant cells. The mRNA synthesis of the remaining genes (DSM-7 and DSM-9-DSM-11) was enhanced in high level etoposide-resistant melanoma cells. The expression of 5 (DSM-5 and DSM-7-DSM-10) of the cloned cDNA encoding mRNAs was modulated in various independently established drug-resistant melanoma cells, indicating that they are associated with drug resistance. Further characterization of these genes may yield insight into the biology and development of drug resistance in malignant melanoma.
Affiliation(s)
- C Grottke
- Institute of Pathology, Charité, Campus Mitte, Humboldt University Berlin, Berlin, Germany
|
178
|
Kawamoto S, Yoshii J, Mizuno K, Ito K, Miyamoto Y, Ohnishi T, Matoba R, Hori N, Matsumoto Y, Okumura T, Nakao Y, Yoshii H, Arimoto J, Ohashi H, Nakanishi H, Ohno I, Hashimoto J, Shimizu K, Maeda K, Kuriyama H, Nishida K, Shimizu-Matsumoto A, Adachi W, Ito R, Kawasaki S, Chae KS. BodyMap: a collection of 3' ESTs for analysis of human gene expression information. Genome Res 2000; 10:1817-27. [PMID: 11076866 PMCID: PMC310944 DOI: 10.1101/gr.151500] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.4] [Indexed: 11/25/2022]
Abstract
BodyMap is a collection of site-directed 3' expressed sequence tags (ESTs) (gene signatures, GSs) that contains the transcript compositions of various human tissues and was the first systematic effort to acquire gene expression data. For the construction of BodyMap, cDNA libraries were made, preserving abundance information and histologic resolutions of tissue mRNAs. By sequencing 164,000 randomly selected clones, 88,587 GSs that represent chromosomally coded transcripts have been collected from 51 human organs and tissues. They were clustered into 18,722 independent 3' termini from transcripts, and more than 3000 of these were not found among ESTs assembled in UniGene (Build 75). Assessment of the prevalence of polyadenylation signals and comparison with GenBank cDNAs indicated that there was no significant contamination by internally primed cDNAs or genomic fragments but that there was a relatively high incidence (12%) of alternative polyadenylation sites. We evaluated the sensitivity and resolution of expression information in BodyMap by in silico Northern hybridization and selection of tissue-specific gene probes. BodyMap is a unique resource for estimation of the absolute abundance of transcripts and selection of gene probes for efficient hybridization-based gene expression profiling.
Affiliation(s)
- S Kawamoto
- Institute for Molecular and Cellular Biology, Osaka University, Osaka 565-0871, Japan
|
179
|
Abstract
Among higher eukaryotes, very little of the genome codes for protein. What is in the rest of the genome, or the "junk" DNA, that, in Homo sapiens, is estimated to be almost 97% of the genome? Is it possible that much of this "junk" is intron DNA? This is not a question that can be answered just by looking at the published data, even from the finished genomes. One cannot assume that there are no genes in a sequenced region, just because no genes were annotated. We introduce another approach to this problem, based on an analysis of the cDNA-to-genomic alignments, in all of the complete or nearly-complete genomes from the multicellular organisms. Our conclusion is that, in animals but not in plants, most of the "junk" is intron DNA.
Affiliation(s)
- G K Wong
- Human Genome Center, Department of Medicine, University of Washington, Seattle, Washington 98195, USA.
|
180
|
Gill RW, Sanseau P. Rapid in silico cloning of genes using expressed sequence tags (ESTs). Biotechnol Annu Rev 2000; 5:25-44. [PMID: 10874996 DOI: 10.1016/s1387-2656(00)05031-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Indexed: 12/23/2022]
Abstract
Expressed sequence tags (ESTs) are short single-pass DNA sequences obtained from either end of cDNA clones. These ESTs are derived from a vast number of cDNA libraries obtained from different species. Human ESTs are the bulk of the data and have been widely used to identify new members of gene families, as markers on the human chromosomes, to discover polymorphism sites and to compare expression patterns in different tissues or pathological states. Information strategies have been devised to query EST databases. Since most of the analysis is performed with a computer, the term "in silico" strategy has been coined. In this chapter we will review the current status of EST databases, the pros and cons of EST-type data and describe possible strategies to retrieve meaningful information.
Affiliation(s)
- R W Gill
- Glaxo-Wellcome, Genome Informatics, Genetics Division, Herts, England
|
181
|
Sleeman MA, Murison JG, Strachan L, Kumble K, Glenn MP, McGrath A, Grierson A, Havukkala I, Tan PL, Watson JD. Gene expression in rat dermal papilla cells: analysis of 2529 ESTs. Genomics 2000; 69:214-24. [PMID: 11031104 DOI: 10.1006/geno.2000.6300] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
Dermal papilla (DEPA) cells are resident at the base of hair follicles and are fundamental to hair growth and development. Cultured DEPA cells, in contrast to normal fibroblast cells, are capable of inducing de novo hair follicle growth in vivo. By differential screening of a DEPA cDNA library, we have demonstrated that dermal papilla cells are different from fibroblasts at the molecular level. We further studied these cells by random sequencing of 5130 clones from the DEPA cDNA library. Fifty percent had a BLASTX E value ≤ 1 × 10⁻²⁵. Twenty-one percent had similarity to proteins involved in cell structure/motility with 4 of the top 10 most abundant clones encoding extracellular matrix proteins. Clones encoding growth factor molecules were also abundant. The remaining 50.7% of clones had low similarity scores, demonstrating many novel molecules. For example, we identified a new CTGF family member, the rat homologue of Elm1.
Affiliation(s)
- M A Sleeman
- Genesis Research and Development Corporation Limited, Auckland, New Zealand.
|
182
|
Gary SC, Zerillo CA, Chiang VL, Gaw JU, Gray G, Hockfield S. cDNA cloning, chromosomal localization, and expression analysis of human BEHAB/brevican, a brain specific proteoglycan regulated during cortical development and in glioma. Gene 2000; 256:139-47. [PMID: 11054543 DOI: 10.1016/s0378-1119(00)00362-0] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.0] [Indexed: 10/18/2022]
Abstract
BEHAB (Brain Enriched HyAluronan Binding)/brevican, a brain-specific member of the lectican family of chondroitin sulfate proteoglycans (CSPGs), may play a role in both brain development and human glioma. BEHAB/brevican has been cloned from bovine, mouse and rat. Two isoforms have been reported: a full-length isoform that is secreted into the extracellular matrix (ECM) and a shorter isoform with a sequence that predicts a glycophosphatidylinositol (GPI) anchor. Here, we report the characterization of BEHAB/brevican isoforms in human brain. First, BEHAB/brevican maps to human chromosome 1q31. Second, we report the sequence of both isoforms of human BEHAB/brevican. The deduced protein sequence of full-length, secreted human BEHAB/brevican is 89.7, 83.3 and 83.2% identical to bovine, mouse and rat homologues, respectively. Third, by RNase protection analysis (RPA) we show the developmental regulation of BEHAB/brevican isoforms in normal human cortex. The secreted isoform is highly expressed from birth through 8 years of age and is downregulated by 20 years of age to low levels that are maintained in the normal adult cortex. The GPI isoform is expressed at uniformly low levels throughout development. Fourth, we confirm and extend previous studies from our laboratory, here demonstrating the upregulation of BEHAB/brevican mRNA in human glioma quantitatively. RPA analysis shows that both isoforms are upregulated in glioma, showing an approximately sevenfold increase in expression over normal levels. In contrast to the developmental regulation of BEHAB/brevican, where only the secreted isoform is regulated, both isoforms are increased in parallel in human glioma. The distinct patterns of regulation of expression of the two isoforms suggest distinct mechanisms of regulation of BEHAB/brevican during development and in glioma.
Affiliation(s)
- S C Gary
- Section of Neurobiology, Yale University School of Medicine, New Haven, CT 06510, USA.
|
183
|
Yoshikawa T, Seki N, Azuma T, Masuho Y, Muramatsu M, Miyajima N, Saito T. Isolation of a cDNA for a novel human RING finger protein gene, RNF18, by the virtual transcribed sequence (VTS) approach. Biochim Biophys Acta 2000; 1493:349-55. [PMID: 11018261 DOI: 10.1016/s0167-4781(00)00193-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
We have recently developed a novel database system, designated as the virtual transcribed sequence (VTS), which efficiently extracts many genes from public human genome databases, and tested the feasibility of this novel computational approach (N. Miyajima, C. Burge, T. Saito, Biochem. Biophys. Res. Commun. 272 (2000) 801; http://host45.maze.co.jp/vts/). In this study, using the VTS approach, we isolated a cDNA for a novel human gene with a RING finger motif (C₃HC₄), which is not deposited in public EST databases. The isolated cDNA clone is 2163 bp in length, and contains an open reading frame of 452 amino acids. We designated the novel gene as RNF18. A database search showed that the RNF18 gene had moderate similarity to SS-A/Ro52 protein, which is a ribonucleoprotein reactive with autoantibodies in patients with Sjögren's syndrome and systemic lupus erythematosus. Tissue distribution analyses by Northern blot and RT-PCR methods demonstrated that the RNF18 messenger RNA was preferentially expressed in testis. The exon-intron boundaries of the RNF18 gene were determined by aligning the cDNA sequence with the corresponding genome sequence. The isolated cDNA consists of eight exons that span about 11 kb of the genomic DNA. The precise chromosomal location of the RNF18 gene was determined by PCR-based radiation hybrid mapping, and the gene was located to the centromere region of chromosome 11 between markers NIB1900 and D11S1350. Taken together, the VTS approach should provide a novel cDNA cloning strategy for isolating unidentified genes, which are not found even in EST databases but are detectable computationally.
Affiliation(s)
- T Yoshikawa
- Biological Technology Laboratory, Helix Research Institute, Kisarazu, Chiba, Japan
|
184
|
Abstract
The end of the beginning of the Human Genome Project was announced on 26 June when the working draft or first assembly was announced. Here, Ian Dunham who led the group at the Sanger Centre that produced the first complete sequence of a human chromosome reflects on how it felt to be with the genome project from the beginning.
Affiliation(s)
- I Dunham
- The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, CB10 1SA, Cambridge, UK
|
185
|
Carninci P, Shibata Y, Hayatsu N, Sugahara Y, Shibata K, Itoh M, Konno H, Okazaki Y, Muramatsu M, Hayashizaki Y. Normalization and subtraction of cap-trapper-selected cDNAs to prepare full-length cDNA libraries for rapid discovery of new genes. Genome Res 2000; 10:1617-30. [PMID: 11042159 PMCID: PMC310980 DOI: 10.1101/gr.145100] [Citation(s) in RCA: 211] [Impact Index Per Article: 8.4] [Indexed: 11/24/2022]
Abstract
In the effort to prepare the mouse full-length cDNA encyclopedia, we previously developed several techniques to prepare and select full-length cDNAs. To increase the number of different cDNAs, we introduce here a strategy to prepare normalized and subtracted cDNA libraries in a single step. The method is based on hybridization of the first-strand, full-length cDNA with several RNA drivers, including starting mRNA as the normalizing driver and run-off transcripts from minilibraries containing highly expressed genes, rearrayed clones, and previously sequenced cDNAs as subtracting drivers. Our method keeps the proportion of full-length cDNAs in the subtracted/normalized library high. Moreover, our method dramatically enhances the discovery of new genes as compared to results obtained by using standard, full-length cDNA libraries. This procedure can be extended to the preparation of full-length cDNA encyclopedias from other organisms.
Affiliation(s)
- P Carninci
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center, Tsukuba, Japan.
|
186
|
Hirosawa M, Ishikawa K, Nagase T, Ohara O. Detection of spurious interruptions of protein-coding regions in cloned cDNA sequences by GeneMark analysis. Genome Res 2000; 10:1333-41. [PMID: 10984451 DOI: 10.1101/gr.129500] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
cDNA is an artificial copy of mRNA and, therefore, no cDNA can be completely free from suspicion of cloning errors. Because overlooking these cloning errors results in serious misinterpretation of cDNA sequences, development of an alerting system targeting spurious sequences in cloned cDNAs is an urgent requirement for massive cDNA sequence analysis. We describe here the application of a modified GeneMark program, originally designed for prokaryotic gene finding, for detection of artifacts in cDNA clones. This program serves to provide a warning when any spurious split of protein-coding regions is detected through statistical analysis of cDNA sequences based on Markov models. In this study, 817 cDNA sequences deposited in public databases by us were subjected to analysis using this alerting system to assess its sensitivity and specificity. The results indicated that any spurious split of protein-coding regions in cloned cDNAs could be sensitively detected and systematically revised by means of this system after the experimental validation of the alerts. Furthermore, this study offered us, for the first time, statistical data regarding the rates and types of errors causing protein-coding splits in cloned cDNAs obtained by conventional cloning methods.
Affiliation(s)
- M Hirosawa
- Kazusa DNA Research Institute, Kisarazu, Chiba 292-0812, Japan
|
187
|
Yang Z, Wong GK, Eberle MA, Kibukawa M, Passey DA, Hughes WR, Kruglyak L, Yu J. Sampling SNPs. Nat Genet 2000; 26:13-4. [PMID: 10973237 DOI: 10.1038/79113] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
188
|
Abstract
High-throughput gene sequencing has revolutionized the process used to identify novel molecular targets for drug discovery. Thousands of new gene sequences have been generated but only a limited number of these can be converted into validated targets likely to be involved in disease. We describe here some of the approaches used at SmithKline Beecham to select and validate novel targets. These include the identification of selective tissue gene product expression, such as for cathepsin K, a novel osteoclast-specific cysteine protease. We also describe the discovery and functional characterization of novel members of the G-protein coupled receptor superfamily and their pairing with natural ligands. Lastly, we discuss the promises of gene microarrays and proteomics, developing technologies that allow the parallel analyses of tissue expression patterns of thousands of genes or proteins, respectively.
Affiliation(s)
- C Debouck
- Discovery Chemistry & Platform Technologies, SmithKline Beecham Pharmaceuticals, Research & Development, King of Prussia, Pennsylvania 19406, USA.
|
189
|
Sweadner KJ, Rael E. The FXYD gene family of small ion transport regulators or channels: cDNA sequence, protein signature sequence, and expression. Genomics 2000; 68:41-56. [PMID: 10950925 DOI: 10.1006/geno.2000.6274] [Citation(s) in RCA: 326] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
A gene family of small membrane proteins, represented by phospholemman and the gamma subunit of Na,K-ATPase, was defined and characterized by the analysis of more than 1000 related ESTs (expressed sequence tags). In addition to new and more complete cDNA sequence for known family members (including MAT-8, CHIF, and RIC), the findings included two new family members and new splicing variants. A large number of EST replicates made it possible to derive curated DNA sequence with higher confidence and accuracy than from the sequencing of individual clones. The family has a core motif of 35 invariant and conserved amino acids centered on a single transmembrane span. Features of each predicted protein product were compared, and tissue distributions were determined. The gene family was named FXYD (pronounced fix-id) in recognition of invariant amino acids in its signature motif. The abundant proteins are involved in the control of ion transport.
Affiliation(s)
- K J Sweadner
- Neuroscience Center, Massachusetts General Hospital, 149 13th Street, Charlestown, Massachusetts 02129, USA.
|
190
|
Abstract
Serial Analysis of Gene Expression (SAGE) is an innovative technique that offers the potential of cataloging both the identity and relative frequencies of mRNA transcripts in a given poly(A(+)) RNA preparation. Although it is a very effective approach for determining the expression of mRNA populations, there are significant biases in the observed results that are inherent in the experimental process. These are caused by sampling error, sequencing error, nonuniqueness, and nonrandomness of tag sequences. The quantitative information desired from SAGE experiments consists of estimates of the number of genes and the frequency distribution of transcript copy numbers. Of additional concern is the extent to which a given tag sequence can be assumed to be unique to its gene. The present study takes these mathematical biases into account and presents a basis for maximum likelihood estimation of gene number and transcript copy frequencies given a set of experimental results. These estimates of the true state of genomic expression are markedly different from those based directly on the observations from the underlying experiments. It also is shown that while in many cases it is probable that a given tag sequence is unique within the genome, in larger genomes this cannot be safely assumed.
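The tag-uniqueness concern raised in this abstract can be illustrated with a rough calculation. The sketch below is not the maximum likelihood method of the study; it assumes, purely for illustration, that each gene's tag is drawn uniformly at random from all sequences of a fixed length (10 bases here, a hypothetical default). The abstract notes that real tags are nonrandom, so these figures are likely optimistic.

```python
def prob_tag_unique(n_genes: int, tag_length: int = 10) -> float:
    """Probability that one gene's tag is shared by no other gene.

    Assumption (not from the abstract): every gene draws its tag
    independently and uniformly from the 4**tag_length possible tags.
    """
    n_possible = 4 ** tag_length
    # Each of the other (n_genes - 1) genes must miss this particular tag.
    return (1.0 - 1.0 / n_possible) ** (n_genes - 1)


# Hypothetical genome sizes, chosen only to show the trend.
for n in (6_000, 35_000, 100_000):
    print(n, round(prob_tag_unique(n), 3))
```

Under these assumptions the probability of uniqueness falls as the number of genes grows, which matches the abstract's qualitative point that uniqueness cannot be safely assumed in larger genomes.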
Affiliation(s)
- J Stollberg
- Pacific Biomedical Research Center, University of Hawai'i at Manoa, Honolulu, Hawaii 96822, USA.
|
191
|
Smith RC, Rhodes SJ. Applications of developmental biology to medicine and animal agriculture. PROGRESS IN DRUG RESEARCH. FORTSCHRITTE DER ARZNEIMITTELFORSCHUNG. PROGRES DES RECHERCHES PHARMACEUTIQUES 2000; 54:213-56. [PMID: 10857390 DOI: 10.1007/978-3-0348-8391-7_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/16/2023]
Abstract
With the complete sequence of the human genome expected by winter 2001, genomic-based drug discovery efforts of the pharmaceutical industry are focusing on finding the relatively few therapeutically useful genes from among the total gene set. Methods to rapidly elucidate gene function will have increasing value in these investigations. The use of model organisms in functional genomics has begun to be recognized and exploited and is one example of the emerging use of the tools of developmental biology in recent drug discovery efforts. The use of protein products expressed during embryogenesis and the use of certain pluripotent cell populations (stem cells) as candidate therapeutics are other applications of developmental biology to the treatment of human diseases. These agents may be used to repair damaged or diseased tissues by inducing or directing developmental programs that recapitulate embryonic processes to replace specialized cells. The activation or silencing of embryonic genes in the disease state, particularly those encoding transcription factors, is another avenue of exploitation. Finally, the direct drug-induced manipulation of embryonic development is a unique application of developmental biology in animal agriculture.
Affiliation(s)
- R C Smith
- Department of Biology, Indiana University-Purdue University Indianapolis 46202-5132, USA
|
192
|
Lash AE, Tolstoshev CM, Wagner L, Schuler GD, Strausberg RL, Riggins GJ, Altschul SF. SAGEmap: a public gene expression resource. Genome Res 2000; 10:1051-60. [PMID: 10899154 PMCID: PMC310889 DOI: 10.1101/gr.10.7.1051] [Citation(s) in RCA: 281] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
We have constructed a public gene expression data repository and online data access and analysis, WWW and FTP sites for serial analysis of gene expression (SAGE) data. The WWW and FTP components of this resource, SAGEmap, are located at http://www.ncbi.nlm.nih.gov/sage and ftp://ncbi.nlm.nih.gov/pub/sage, respectively. We herein describe SAGE data submission procedures, the construction and characteristics of SAGE tags to gene assignments, the derivation and use of a novel statistical test designed specifically for differential-type analyses of SAGE data, and the organization and use of this resource.
Affiliation(s)
- A E Lash
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894 USA.
|
193
|
Abstract
Large-scale sequencing of human cDNA and genomic DNA libraries has produced a large collection of sequence data in public databases. To date, >900,000 human expressed sequence tag (EST) sequences and >80,000,000 bases of genomic DNA sequence have been deposited in Genbank. This ever-expanding data set is a rich source of gene-associated and anonymous single nucleotide polymorphisms (SNPs). DNA sequence variations can be found by comparing the sequences of redundant ESTs and by comparing sequences from overlapping genomic clones. Initial studies have shown that, with proper computer screening, informative SNP markers can be developed from these DNA databases in an efficient and cost-effective manner. Complete public access to these databases will allow individual investigators to add biological value to the human sequence data generated by large-scale sequencing centers.
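The idea of finding sequence variations by comparing redundant ESTs can be sketched as a column-wise scan of already-aligned reads. This is a minimal illustration, not the screening procedure used in the study: the function name, the `min_minor_count` threshold, and the toy reads are all hypothetical, and the reads are assumed to be pre-aligned and of equal length.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Return (position, allele_counts) for alignment columns in which
    at least two different bases are each seen min_minor_count times.

    Requiring repeated observations of each allele is a simple guard
    against single-read sequencing errors (an illustrative choice).
    """
    candidates = []
    for pos, column in enumerate(zip(*aligned_reads)):
        counts = Counter(base for base in column if base in "ACGT")
        supported = {b: c for b, c in counts.items() if c >= min_minor_count}
        if len(supported) >= 2:
            candidates.append((pos, supported))
    return candidates


# Toy, hypothetical reads covering the same region.
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACGAACGT",
    "ACGAACGT",
    "ACGTACCT",  # lone mismatch at position 6, treated as a likely error
]
print(candidate_snps(reads))
```

In this toy input only position 3 is reported, because the difference at position 6 is supported by a single read.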
Affiliation(s)
- Z Gu
- Division of Dermatology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
|
194
|
Xie H, Surka M, Howard J, Trimble WS. Characterization of the mammalian septin H5: distinct patterns of cytoskeletal and membrane association from other septin proteins. CELL MOTILITY AND THE CYTOSKELETON 2000; 43:52-62. [PMID: 10340703 DOI: 10.1002/(sici)1097-0169(1999)43:1<52::aid-cm6>3.0.co;2-5] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
The mechanisms controlling cytokinesis during yeast budding and animal cell fission appear quite different, yet both require members of the septin protein family. Mammalian homologs of this novel family of GTPases have been identified but little is known about their properties or functions. Using an antibody specific for the mammalian septin H5, we show that this protein is expressed at distinct levels in a variety of tissues. Tissue expression levels in different tissues did not coincide with those of the only previously characterized mammalian septin Nedd5. H5, like Nedd5, localizes to the cleavage furrow in mitotic fibroblast cells but in non-mitotic cells these proteins associate with actin filaments in different ways. Nedd5 predominantly localizes with stress fibers, but only associates with central portions of the microfilament bundles. In contrast, H5 associates with the entire length of the stress fibers and the cortical actin network. Conditions that disrupt the actin cytoskeleton also disrupt the filamentous patterns of both Nedd5 and H5, resulting in a punctate cytoplasmic pattern. Cell fractionation revealed that H5 co-fractionated with actin, while Nedd5 was predominantly restricted to the membrane fraction. Co-immunoprecipitation experiments revealed that although H5 will co-precipitate with Nedd5, the precipitation is not quantitative. Taken together, these results not only show that H5 behaves like a septin, but also demonstrate that individual septin proteins have distinct properties, suggesting that they may play different roles in cytokinesis and in other stages of the cell cycle.
Affiliation(s)
- H Xie
- The Hospital for Sick Children and Department of Biochemistry, University of Toronto, Ontario, Canada
|
195
|
Miyajima N, Burge CB, Saito T. Computational and experimental analysis identifies many novel human genes. Biochem Biophys Res Commun 2000; 272:801-7. [PMID: 10860834 DOI: 10.1006/bbrc.2000.2866] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Because of advances in automation, human genomic sequences are being deposited in public databases at a dramatic rate. However, the process of detecting genes in these sequences is still something of an art. Here we describe the implementation and testing of a relatively straightforward computational approach, the Virtual Transcribed Sequence project, which analyzes their gene content using the gene prediction program GENSCAN (GENSCAN 1.0 1,2) in combination with similarity-based methods. This approach identifies many novel human genes not found even in EST databases.
Affiliation(s)
- N Miyajima
- Department of Genome Informatics, Kazusa DNA Research Institute, Chiba, Japan
|
196
|
Abstract
The chapter gives an overview of bioinformatic techniques of importance in protein analysis. These include database searches, sequence comparisons and structural predictions. Links to useful World Wide Web (WWW) pages are given in relation to each topic. Databases with biological information are reviewed with emphasis on databases for nucleotide sequences (EMBL, GenBank, DDBJ), genomes, amino acid sequences (Swissprot, PIR, TrEMBL, GenePept), and three-dimensional structures (PDB). Integrated user interfaces for databases (SRS and Entrez) are described. An introduction to databases of sequence patterns and protein families is also given (Prosite, Pfam, Blocks). Furthermore, the chapter describes the widespread methods for sequence comparisons, FASTA and BLAST, and the corresponding WWW services. The techniques involving multiple sequence alignments are also reviewed: alignment creation with the Clustal programs, phylogenetic tree calculation with the Clustal or Phylip packages and tree display using Drawtree, njplot or phylo_win. Finally, the chapter also treats the issue of structural prediction. Different methods for secondary structure predictions are described (Chou-Fasman, Garnier-Osguthorpe-Robson, Predator, PHD). Techniques for predicting membrane proteins, antigenic sites and posttranslational modifications are also reviewed.
Affiliation(s)
- B Persson
- Stockholm Bioinformatic Centre, Sweden
|
197
|
Dempsey AA, Ton C, Liew CC. A cardiovascular EST repertoire: progress and promise for understanding cardiovascular disease. MOLECULAR MEDICINE TODAY 2000; 6:231-7. [PMID: 10840381 DOI: 10.1016/s1357-4310(00)01727-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
The application of expressed sequence tag (EST) technology has proven to be an effective tool for gene discovery and the generation of gene expression profiles. The generation of an EST resource for the cardiovascular system has revealed significant insights into the changes in gene expression that guide heart development and disease. Furthermore, an important genetic resource has been developed for cardiovascular biology that is valuable for data mining and disease gene discovery.
Affiliation(s)
- A A Dempsey
- The Cardiovascular Genome Unit, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
|
198
|
Abstract
The number of protein-coding genes in an organism provides a useful first measure of its molecular complexity. Single-celled prokaryotes and eukaryotes typically have a few thousand genes; for example, Escherichia coli has 4,300 and Saccharomyces cerevisiae has 6,000. Evolution of multicellularity appears to have been accompanied by a several-fold increase in gene number, the invertebrates Caenorhabditis elegans and Drosophila melanogaster having 19,000 and 13,600 genes, respectively. Here we estimate the number of human genes by comparing a set of human expressed sequence tag (EST) contigs with human chromosome 22 and with a non-redundant set of mRNA sequences. The two comparisons give mutually consistent estimates of approximately 35,000 genes, substantially lower than most previous estimates. Evolution of the increased physiological complexity of vertebrates may therefore have depended more on the combinatorial diversification of regulatory networks or alternative splicing than on a substantial increase in gene number.
Affiliation(s)
- B Ewing
- Department of Molecular Biotechnology, University of Washington, Seattle, Washington, USA
|
199
|
Hwang DM, Dempsey AA, Lee CY, Liew CC. Identification of differentially expressed genes in cardiac hypertrophy by analysis of expressed sequence tags. Genomics 2000; 66:1-14. [PMID: 10843799 DOI: 10.1006/geno.2000.6171] [Citation(s) in RCA: 86] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Cardiac hypertrophy is an adaptive response to chronic hemodynamic overload. We employed a whole-genome approach using expressed sequence tags (ESTs) to characterize gene transcription and identify new genes overexpressed in cardiac hypertrophy. Analysis of general transcription patterns revealed a proportional increase in transcripts related to cell/organism defense and a decrease in transcripts related to cell structure and motility in hypertrophic hearts compared to normal hearts. Detailed comparison of individual gene expression identified 64 genes potentially overexpressed in hypertrophy, of 232 candidate genes derived from a set of 77,692 cardiac ESTs, including 47,856 ESTs generated in our laboratory. Of these, 29 were good candidates (P < 0.0002) and 35 were weaker candidates (P < 0.005). RT-PCR of a number of these candidate genes demonstrated correspondence of EST-based predictions of gene expression with in vitro levels. Consistent with an organ under various stresses, up to one-half of the good candidates predicted to exhibit differential expression were genes potentially involved in stress response. Analyses of general transcription patterns and of single-gene expression levels were also suggestive of increased protein synthesis in the hypertrophic myocardium. Overall, these results depict a scenario compatible with current understanding of cardiac hypertrophy. However, the identification of several genes not previously known to exhibit increased expression in cardiac hypertrophy (e.g., prostaglandin D synthases; CD59 antigen) also suggests a number of new avenues for further investigation. These data demonstrate the utility of genome-based resources for investigating questions of cardiovascular biology and medicine.
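The abstract reports P values for EST-based candidates but does not state which statistical test produced them. As an illustration only, the sketch below applies a one-sided Fisher's exact test to EST counts for one gene in two libraries. The function name and the example counts and library sizes are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_one_sided(k_a: int, n_a: int, k_b: int, n_b: int) -> float:
    """P(X >= k_a) under the hypergeometric null.

    k_a, k_b: ESTs matching the gene in libraries A and B.
    n_a, n_b: total ESTs sequenced from libraries A and B.
    Null hypothesis: the gene's (k_a + k_b) ESTs fall into the two
    libraries in proportion to library size.
    """
    k = k_a + k_b
    denom = comb(n_a + n_b, k)
    upper = min(k, n_a)
    numer = sum(comb(n_a, x) * comb(n_b, k - x) for x in range(k_a, upper + 1))
    return numer / denom


# Hypothetical counts: 12 ESTs in a hypertrophic library of 30,000
# versus 2 ESTs in a normal library of 40,000.
p = fisher_one_sided(12, 30_000, 2, 40_000)
print(f"one-sided P = {p:.2e}")
```

A small P value here would flag the gene as a candidate for overexpression in library A; any threshold, such as the P < 0.0002 cutoff mentioned in the abstract, would still need the study's own test and multiple-comparison handling to be interpreted properly.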
Affiliation(s)
- D M Hwang
- The Cardiac Gene Unit, Department of Laboratory Medicine and Pathobiology, The Centre for Cardiovascular Research, The Toronto Hospital, Toronto, Ontario, M5G 1L5, Canada
|
200
|
Eckmann L, Smith JR, Housley MP, Dwinell MB, Kagnoff MF. Analysis by high density cDNA arrays of altered gene expression in human intestinal epithelial cells in response to infection with the invasive enteric bacteria Salmonella. J Biol Chem 2000; 275:14084-94. [PMID: 10799483 DOI: 10.1074/jbc.275.19.14084] [Citation(s) in RCA: 140] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Many clinically important enteric pathogens initiate disease by invading and passing through the intestinal epithelium, a process accompanied by increased epithelial expression of proinflammatory cytokines. To further define the role intestinal epithelial cells play in initiating and modulating the host response to infection with invasive bacteria, hybrid selection on high density cDNA arrays was used to characterize the mRNA expression profile of approximately 4,300 genes in human intestinal epithelial cells after infection with the prototypic invasive bacteria, Salmonella. Selected findings were further evaluated by reverse transcription-polymerase chain reaction, Northern blot analysis, and protein assays. Epithelial infection with Salmonella significantly up-regulated mRNA expression of a relatively small fraction of all genes tested. Of these, several cytokines (granulocyte colony-stimulating factor, inhibin A, Epstein-Barr virus-induced gene 3, interleukin-8, macrophage inflammatory protein-2alpha), kinases (TKT, Eck, HEK), transcription factors (interferon regulatory factor-1), and HLA class I were the most prominent. Furthermore, the transcription factor NF-kappaB is shown to be important for inducible mRNA expression for a broad group of genes tested. These findings expand the repertoire of known epithelial cell responses to infection with an invasive enteric pathogen. The results also show that evaluation of mRNA expression profiles by cDNA array analysis is a powerful approach to characterizing and understanding host-pathogen interactions.
Affiliation(s)
- L Eckmann
- Department of Medicine, University of California, San Diego, La Jolla, California 92093, USA.
|