1
Garay LA, Sitepu IR, Cajka T, Fiehn O, Cathcart E, Fry RW, Kanti A, Joko Nugroho A, Faulina SA, Stephanandra S, German JB, Boundy-Mills KL. Discovery of synthesis and secretion of polyol esters of fatty acids by four basidiomycetous yeast species in the order Sporidiobolales. J Ind Microbiol Biotechnol 2017; 44:923-936. [PMID: 28289902 DOI: 10.1007/s10295-017-1919-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 12/02/2016] [Accepted: 02/05/2017] [Indexed: 12/22/2022]
Abstract
Polyol esters of fatty acids (PEFA) are amphiphilic glycolipids produced by yeast that could play a role as natural, environmentally friendly biosurfactants. We recently reported discovery of a new PEFA-secreting yeast species, Rhodotorula babjevae, a basidiomycetous yeast that displays this behavior, in addition to a few other Rhodotorula yeasts reported in the 1960s. Additional yeast species within the taxonomic order Sporidiobolales were screened for secreted glycolipid production. PEFA production at or above 1 g L⁻¹ was detected in 19 out of 65 strains of yeast screened, belonging to 6 out of 30 yeast species tested. Four of these species were not previously known to secrete glycolipids. These results significantly increase the number of yeast species known to secrete PEFA, holding promise for expanding knowledge of PEFA synthesis and secretion mechanisms, as well as laying the groundwork for commercialization.
Affiliation(s)
- Luis A Garay
- Phaff Yeast Culture Collection, Department of Food Science and Technology, University of California, One Shields Ave, Davis, CA, 95616-8598, USA
- Irnayuli R Sitepu
- Phaff Yeast Culture Collection, Department of Food Science and Technology, University of California, One Shields Ave, Davis, CA, 95616-8598, USA
- Biotechnology Department, Indonesia International Institute for Life Sciences (i3L), Jalan Pulo Mas Barat Kav. 88, Jakarta, 13210, Indonesia
- Tomas Cajka
- West Coast Metabolomics Center, Genome Center, University of California, 451 Health Sciences Drive, Davis, CA, 95616, USA
- Oliver Fiehn
- West Coast Metabolomics Center, Genome Center, University of California, 451 Health Sciences Drive, Davis, CA, 95616, USA
- Biochemistry Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah, 21589, Saudi Arabia
- Erin Cathcart
- Phaff Yeast Culture Collection, Department of Food Science and Technology, University of California, One Shields Ave, Davis, CA, 95616-8598, USA
- Russell W Fry
- Phaff Yeast Culture Collection, Department of Food Science and Technology, University of California, One Shields Ave, Davis, CA, 95616-8598, USA
- Atit Kanti
- Research Center for Biology, Indonesian Institute of Sciences, Jalan Raya Jakarta - Bogor Km.46 Cibinong, Bogor, 16911, Indonesia
- Agustinus Joko Nugroho
- Research Center for Biology, Indonesian Institute of Sciences, Jalan Raya Jakarta - Bogor Km.46 Cibinong, Bogor, 16911, Indonesia
- Sarah Asih Faulina
- Research, Development and Innovation Agency, Ministry of Environment and Forestry, Jalan Gunung Batu No. 5, P.O. Box 165, Bogor, 16610, Indonesia
- Sira Stephanandra
- Research, Development and Innovation Agency, Ministry of Environment and Forestry, Jalan Gunung Batu No. 5, P.O. Box 165, Bogor, 16610, Indonesia
- J Bruce German
- Department of Food Science and Technology, University of California Davis, One Shields Avenue, Davis, CA, 95616, USA
- Kyria L Boundy-Mills
- Phaff Yeast Culture Collection, Department of Food Science and Technology, University of California, One Shields Ave, Davis, CA, 95616-8598, USA
2
Finding and mapping new genes faster than ever: revisited. Genetics 2014; 197:1063-7. [PMID: 25104804 DOI: 10.1534/genetics.114.165373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
This article recounts some of the early days of the Human Genome Project, covering the important and sometimes controversial role that complementary DNA-based approaches played in the discovery and mapping of the majority of human genes. It also describes my involvement in this effort and my lab's development of methods for rapid sequence identification and mapping of human genes.
3
Expressed sequence tags: normalization and subtraction of cDNA libraries. Methods Mol Biol 2009. [PMID: 19277560 DOI: 10.1007/978-1-60327-136-3_6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6]
Abstract
Expressed Sequence Tags (ESTs) provide a rapid and efficient approach for gene discovery and analysis of gene expression in eukaryotes. ESTs have also become particularly important with recent expanded efforts in complete genome sequencing of understudied, nonmodel eukaryotes such as protists and algae. For these projects, ESTs provide an invaluable source of data for gene identification and prediction of exon-intron boundaries. The generation of EST data, although straightforward in concept, nonetheless requires great care to ensure the highest efficiency and return for the investment in time and funds. To this end, key steps in the process include generation of a normalized cDNA library to facilitate a high gene discovery rate, followed by serial subtraction of normalized libraries to maintain the discovery rate. Here we describe in detail protocols for normalization and subtraction of cDNA libraries, followed by an example using the toxic dinoflagellate Alexandrium tamarense.
4
Weisemann JM, Boguski MS, Ouellette BF. Sequence databases: integrated information retrieval and data submission. Current Protocols in Human Genetics 2008; Chapter 6:Unit 6.7. [PMID: 18428302 DOI: 10.1002/0471142905.hg0607s27] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
Abstract
This unit describes the NCBI's Entrez database browser. Entrez integrates DNA and protein sequence data, three-dimensional structures, and taxonomic information with their associated abstracts and citations contained in PubMed (MEDLINE). It is possible to search the Entrez information space using conventional search queries (authors, gene names, map location) as well as by bibliographic associations (articles that are related to one another) and sequence homology. Also described are the procedures for submission of new data, updates, and corrections to the sequence databases.
Affiliation(s)
- J M Weisemann
- National Center for Biotechnology Information, Bethesda, Maryland, USA
5
Wang JPZ, Lindsay BG, Cui L, Wall PK, Marion J, Zhang J, dePamphilis CW. Gene capture prediction and overlap estimation in EST sequencing from one or multiple libraries. BMC Bioinformatics 2005; 6:300. [PMID: 16351717 PMCID: PMC1369009 DOI: 10.1186/1471-2105-6-300] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 12/03/2004] [Accepted: 12/13/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In expressed sequence tag (EST) sequencing, we are often interested in how many genes we can capture in an EST sample of a targeted size. This information provides insights into sequencing efficiency in experimental design, as well as clues to the diversity of expressed genes in the tissue from which the library was constructed. RESULTS: We propose a compound Poisson process model that can accurately predict the gene capture in a future EST sample based on an initial EST sample. It also allows estimation of the number of expressed genes in one cDNA library or co-expressed in two cDNA libraries. The superior performance of the new prediction method over an existing approach is established by a simulation study. Our analysis of four Arabidopsis thaliana EST sets suggests that the number of expressed genes present in four different cDNA libraries of Arabidopsis thaliana varies from 9155 (root) to 12005 (silique). An observed fraction of co-expressed genes in two different EST sets as low as 25% can correspond to an actual overlap fraction greater than 65%. CONCLUSION: The proposed method provides a convenient tool for gene capture prediction and cDNA library property diagnosis in EST sequencing.
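The gap between observed and actual overlap described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' compound Poisson model. It is a toy example that assumes every gene is equally expressed and that each EST is drawn uniformly at random; the function name `simulate_overlap` and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_overlap(n_genes=1000, n_shared=700, n_reads=500, seed=0):
    """Toy model (an assumption, not the published method): two libraries
    of n_genes genes each share n_shared genes, and every EST is a uniform
    random draw from its library's gene list."""
    rng = random.Random(seed)

    # Library A holds genes 0..n_genes-1; library B is shifted so that
    # exactly n_shared gene identifiers appear in both libraries.
    lib_a = list(range(n_genes))
    lib_b = list(range(n_genes - n_shared, 2 * n_genes - n_shared))

    # Genes actually captured by a shallow EST sample from each library.
    seen_a = {rng.choice(lib_a) for _ in range(n_reads)}
    seen_b = {rng.choice(lib_b) for _ in range(n_reads)}

    # Observed overlap: fraction of genes seen in A that were also seen in B.
    observed = len(seen_a & seen_b) / len(seen_a)
    # Actual overlap: fraction of library A's genes that library B expresses.
    actual = n_shared / n_genes
    return observed, actual


if __name__ == "__main__":
    observed, actual = simulate_overlap()
    print(f"observed overlap: {observed:.2f}, actual overlap: {actual:.2f}")
```

Because each shallow sample captures only part of its library, a shared gene is counted as co-expressed only when both samples happen to capture it, so the observed fraction falls well below the actual one under these assumptions.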
Affiliation(s)
- Ji-Ping Z Wang
- Department of Statistics, Northwestern University, Evanston, IL 60208, USA
- Bruce G Lindsay
- Department of Statistics, Penn State University, University Park 16802, USA
- Liying Cui
- Department of Biology, Penn State University, University Park 16802, USA
- P Kerr Wall
- Department of Biology, Penn State University, University Park 16802, USA
- Josh Marion
- Department of Computer Science, Penn State University, University Park 16802, USA
- Jiaxuan Zhang
- College of Software, Tsinghua University, Beijing, 100086, PR China
6
Imanishi T, Itoh T, Suzuki Y, O'Donovan C, Fukuchi S, Koyanagi KO, Barrero RA, Tamura T, Yamaguchi-Kabata Y, Tanino M, Yura K, Miyazaki S, Ikeo K, Homma K, Kasprzyk A, Nishikawa T, Hirakawa M, Thierry-Mieg J, Thierry-Mieg D, Ashurst J, Jia L, Nakao M, Thomas MA, Mulder N, Karavidopoulou Y, Jin L, Kim S, Yasuda T, Lenhard B, Eveno E, Suzuki Y, Yamasaki C, Takeda JI, Gough C, Hilton P, Fujii Y, Sakai H, Tanaka S, Amid C, Bellgard M, Bonaldo MDF, Bono H, Bromberg SK, Brookes AJ, Bruford E, Carninci P, Chelala C, Couillault C, de Souza SJ, Debily MA, Devignes MD, Dubchak I, Endo T, Estreicher A, Eyras E, Fukami-Kobayashi K, Gopinath GR, Graudens E, Hahn Y, Han M, Han ZG, Hanada K, Hanaoka H, Harada E, Hashimoto K, Hinz U, Hirai M, Hishiki T, Hopkinson I, Imbeaud S, Inoko H, Kanapin A, Kaneko Y, Kasukawa T, Kelso J, Kersey P, Kikuno R, Kimura K, Korn B, Kuryshev V, Makalowska I, Makino T, Mano S, Mariage-Samson R, Mashima J, Matsuda H, Mewes HW, Minoshima S, Nagai K, Nagasaki H, Nagata N, Nigam R, Ogasawara O, Ohara O, Ohtsubo M, Okada N, Okido T, Oota S, Ota M, Ota T, Otsuki T, Piatier-Tonneau D, Poustka A, Ren SX, Saitou N, Sakai K, Sakamoto S, Sakate R, Schupp I, Servant F, Sherry S, Shiba R, Shimizu N, Shimoyama M, Simpson AJ, Soares B, Steward C, Suwa M, Suzuki M, Takahashi A, Tamiya G, Tanaka H, Taylor T, Terwilliger JD, Unneberg P, Veeramachaneni V, Watanabe S, Wilming L, Yasuda N, Yoo HS, Stodolsky M, Makalowski W, Go M, Nakai K, Takagi T, Kanehisa M, Sakaki Y, Quackenbush J, Okazaki Y, Hayashizaki Y, Hide W, Chakraborty R, Nishikawa K, Sugawara H, Tateno Y, Chen Z, Oishi M, Tonellato P, Apweiler R, Okubo K, Wagner L, Wiemann S, Strausberg RL, Isogai T, Auffray C, Nomura N, Gojobori T, Sugano S. Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol 2004; 2:e162. [PMID: 15103394 PMCID: PMC393292 DOI: 10.1371/journal.pbio.0020162] [Citation(s) in RCA: 267] [Impact Index Per Article: 12.7] [Received: 12/19/2003] [Accepted: 04/01/2004] [Indexed: 01/08/2023]
Abstract
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
Affiliation(s)
- Tadashi Imanishi
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Takeshi Itoh
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Bioinformatics Laboratory, Genome Research Department, National Institute of Agrobiological Sciences, Ibaraki, Japan
- Yutaka Suzuki
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
- Claire O'Donovan
- EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Satoshi Fukuchi
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Roberto A Barrero
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Takuro Tamura
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- BITS Company, Shizuoka, Japan
- Yumi Yamaguchi-Kabata
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Motohiko Tanino
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Kei Yura
- Quantum Bioinformatics Group, Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Kyoto, Japan
- Satoru Miyazaki
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Kazuho Ikeo
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Keiichi Homma
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Arek Kasprzyk
- EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Tetsuo Nishikawa
- Reverse Proteomics Research Institute, Chiba, Japan
- Central Research Laboratory, Hitachi, Tokyo, Japan
- Mika Hirakawa
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
- Jean Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Centre National de la Recherche Scientifique (CNRS), Laboratoire de Physique Mathematique, Montpellier, France
- Danielle Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Centre National de la Recherche Scientifique (CNRS), Laboratoire de Physique Mathematique, Montpellier, France
- Jennifer Ashurst
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Libin Jia
- National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Mitsuteru Nakao
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Michael A Thomas
- Department of Biological Sciences, Idaho State University, Pocatello, Idaho, United States of America
- Nicola Mulder
- EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Youla Karavidopoulou
- EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Lihua Jin
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Sangsoo Kim
- Korea Research Institute of Bioscience and Biotechnology, Taejeon, Korea
- Boris Lenhard
- Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden
- Eric Eveno
- Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
- Yoshiyuki Suzuki
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Chisato Yamasaki
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Jun-ichi Takeda
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Craig Gough
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Phillip Hilton
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Yasuyuki Fujii
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Hiroaki Sakai
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Tokyo Research Laboratories, Kyowa Hakko Kogyo Company, Tokyo, Japan
- Susumu Tanaka
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Clara Amid
- MIPS - Institute for Bioinformatics, GSF - National Research Center for Environment and Health, Neuherberg, Germany
- Matthew Bellgard
- Centre for Bioinformatics and Biological Computing, School of Information Technology, Murdoch University, Murdoch, Western Australia, Australia
- Maria de Fatima Bonaldo
- Medical Education and Biomedical Research Facility, University of Iowa, Iowa City, Iowa, United States of America
- Hidemasa Bono
- Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Kanagawa, Japan
- Susan K Bromberg
- Medical College of Wisconsin, Milwaukee, Wisconsin, United States of America
- Anthony J Brookes
- Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden
- Elspeth Bruford
- HUGO Gene Nomenclature Committee, University College London, London, United Kingdom
- Claude Chelala
- Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- Christine Couillault
- Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
- Marie-Anne Debily
- Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- Inna Dubchak
- Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Toshinori Endo
- Department of Bioinformatics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Eduardo Eyras
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Kaoru Fukami-Kobayashi
- Bioresource Information Division, RIKEN BioResource Center, RIKEN Tsukuba Institute, Ibaraki, Japan
- Gopal R. Gopinath
- Genome Knowledgebase, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Esther Graudens
- Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
- Yoonsoo Hahn
- Korea Research Institute of Bioscience and Biotechnology, Taejeon, Korea
- Michael Han
- MIPS - Institute for Bioinformatics, GSF - National Research Center for Environment and Health, Neuherberg, Germany
- Ze-Guang Han
- Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
- Chinese National Human Genome Center at Shanghai, Shanghai, China
- Kousuke Hanada
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Hideki Hanaoka
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Erimi Harada
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Katsuyuki Hashimoto
- Division of Genetic Resources, National Institute of Infectious Diseases, Tokyo, Japan
- Ursula Hinz
- Swiss Institute of Bioinformatics, Geneva, Switzerland
- Momoki Hirai
- Graduate School of Frontier Sciences, Department of Integrated Biosciences, University of Tokyo, Chiba, Japan
- Teruyoshi Hishiki
- Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Ian Hopkinson
- Department of Primary Care and Population Sciences, Royal Free University College Medical School, University College London, London, United Kingdom
- Clinical and Molecular Genetics Unit, The Institute of Child Health, London, United Kingdom
- Sandrine Imbeaud
- Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
- Hidetoshi Inoko
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Department of Genetic Information, Division of Molecular Life Science, School of Medicine, Tokai University, Kanagawa, Japan
- Alexander Kanapin
- EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Yayoi Kaneko
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Takeya Kasukawa
- Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Kanagawa, Japan
- Janet Kelso
- South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
- Paul Kersey
- EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Bernhard Korn
- RZPD Resource Center for Genome Research, Heidelberg, Germany
- Vladimir Kuryshev
- Molecular Genome Analysis, German Cancer Research Center-DKFZ, Heidelberg, Germany
- Izabela Makalowska
- Pennsylvania State University, University Park, Pennsylvania, United States of America
- Takashi Makino
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Shuhei Mano
- Department of Genetic Information, Division of Molecular Life Science, School of Medicine, Tokai University, Kanagawa, Japan
- Regine Mariage-Samson
- Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- Jun Mashima
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Hideo Matsuda
- Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
- Hans-Werner Mewes
- MIPS - Institute for Bioinformatics, GSF - National Research Center for Environment and Health, Neuherberg, Germany
- Shinsei Minoshima
- Medical Photobiology Department, Photon Medical Research Center, Hamamatsu University School of Medicine, Shizuoka, Japan
- Department of Molecular Biology, Keio University School of Medicine, Tokyo, Japan
- Hideki Nagasaki
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Naoki Nagata
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Rajni Nigam
- Medical College of Wisconsin, Milwaukee, Wisconsin, United States of America
- Osamu Ogasawara
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Masafumi Ohtsubo
- Department of Molecular Biology, Keio University School of Medicine, Tokyo, Japan
- Norihiro Okada
- Department of Biological Sciences, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Kanagawa, Japan
- Toshihisa Okido
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Satoshi Oota
- Bioresource Information Division, RIKEN BioResource Center, RIKEN Tsukuba Institute, Ibaraki, Japan
- Motonori Ota
- Global Scientific Information and Computing Center, Tokyo Institute of Technology, Tokyo, Japan
- Toshio Ota
- Tokyo Research Laboratories, Kyowa Hakko Kogyo Company, Tokyo, Japan
- Tetsuji Otsuki
- Molecular Biology Laboratory, Medicinal Research Laboratories, Taisho Pharmaceutical Company, Saitama, Japan
- Annemarie Poustka
- Molecular Genome Analysis, German Cancer Research Center-DKFZ, Heidelberg, Germany
- Shuang-Xi Ren
- Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
- Chinese National Human Genome Center at Shanghai, Shanghai, China
- Naruya Saitou
- Department of Population Genetics, National Institute of Genetics, Shizuoka, Japan
- Katsunaga Sakai
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Shigetaka Sakamoto
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Ryuichi Sakate
- Graduate School of Frontier Sciences, Department of Integrated Biosciences, University of Tokyo, Chiba, Japan
- Ingo Schupp
- Molecular Genome Analysis, German Cancer Research Center-DKFZ, Heidelberg, Germany
- Florence Servant
- EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Stephen Sherry
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Rie Shiba
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Nobuyoshi Shimizu
- Department of Molecular Biology, Keio University School of Medicine, Tokyo, Japan
- Mary Shimoyama
- Medical College of Wisconsin, Milwaukee, Wisconsin, United States of America
- Bento Soares
- Medical Education and Biomedical Research Facility, University of Iowa, Iowa City, Iowa, United States of America
- Charles Steward
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Makiko Suwa
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Mami Suzuki
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- Aiko Takahashi
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Gen Tamiya
- Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Department of Genetic Information, Division of Molecular Life Science, School of Medicine, Tokai University, Kanagawa, Japan
| | - Hiroshi Tanaka
- 33Department of Bioinformatics, Medical Research Institute, Tokyo Medical and Dental UniversityTokyoJapan
| | - Todd Taylor
- 57Human Genome Research Group, Genomic Sciences Center, RIKEN Yokohama InstituteKanagawaJapan
| | - Joseph D Terwilliger
- 58Columbia University and Columbia Genome CenterNew York, New YorkUnited States of America
| | - Per Unneberg
- 59Department of Biotechnology, Royal Institute of TechnologyStockholmSweden
| | - Vamsi Veeramachaneni
- 48 Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Shinya Watanabe
- 3 Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Laurens Wilming
- 15 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Norikazu Yasuda
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- 7 Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
| | - Hyang-Sook Yoo
- 18 Korea Research Institute of Bioscience and Biotechnology, Taejeon, Korea
| | - Marvin Stodolsky
- 60 Biology Division and Genome Task Group, Office of Biological and Environmental Research, United States Department of Energy, Washington, D.C., United States of America
| | - Wojciech Makalowski
- 48 Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Mitiko Go
- 61 Faculty of Bio-Science, Nagahama Institute of Bio-Science and Technology, Shiga, Japan
| | - Kenta Nakai
- 3 Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Toshihisa Takagi
- 3 Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Minoru Kanehisa
- 12 Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
| | - Yoshiyuki Sakaki
- 3 Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- 57 Human Genome Research Group, Genomic Sciences Center, RIKEN Yokohama Institute, Kanagawa, Japan
| | - John Quackenbush
- 62 Institute for Genomic Research, Rockville, Maryland, United States of America
| | - Yasushi Okazaki
- 26 Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Kanagawa, Japan
| | - Yoshihide Hayashizaki
- 26 Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Kanagawa, Japan
| | - Winston Hide
- 44 South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
| | - Ranajit Chakraborty
- 63 Center for Genome Information, Department of Environmental Health, University of Cincinnati, Cincinnati, Ohio, United States of America
| | - Ken Nishikawa
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Hideaki Sugawara
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Yoshio Tateno
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Zhu Chen
- 21 Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
- 37 Chinese National Human Genome Center at Shanghai, Shanghai, China
- 64 State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, Rui-Jin Hospital, Shanghai Second Medical University, Shanghai, China
| | | | - Peter Tonellato
- 65 PointOne Systems, Wauwatosa, Wisconsin, United States of America
| | - Rolf Apweiler
- 4 EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Kousaku Okubo
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- 40 Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
| | - Lukas Wagner
- 13 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Stefan Wiemann
- 47 Molecular Genome Analysis, German Cancer Research Center-DKFZ, Heidelberg, Germany
| | - Robert L Strausberg
- 16 National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Takao Isogai
- 10 Reverse Proteomics Research Institute, Chiba, Japan
- 66 Graduate School of Life and Environmental Sciences, University of Tsukuba, Ibaraki, Japan
| | - Charles Auffray
- 20 Genexpress - CNRS - Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- 21 Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
| | - Nobuo Nomura
- 40 Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
| | - Takashi Gojobori
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
- 67 Department of Genetics, Graduate University for Advanced Studies, Shizuoka, Japan
| | - Sumio Sugano
- 3 Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- 40 Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- 68 Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
|
7
|
Zhang Y, Mei P, Lou R, Zhang MQ, Wu G, Qiang B, Zhang Z, Shen Y. Gene expression profiling in developing human hippocampus. J Neurosci Res 2002; 70:200-8. [PMID: 12271469 DOI: 10.1002/jnr.10322] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022]
Abstract
The gene expression profile of developing human hippocampus is of particular interest and importance to neurobiologists devoted to development of the human brain and related diseases. To gain further molecular insight into the developmental and functional characteristics, we analyzed the expression profile of active genes in developing human hippocampus. Expressed sequence tags (ESTs) were selected by sequencing randomly selected clones from an original 3'-directed cDNA library of 150-day human fetal hippocampus, and a digital expression profile of 946 known genes that could be divided into 16 categories was generated. We also used for comparison 14 other expression profiles of related human neural cells/tissues, including human adult hippocampus. To yield more confidence regarding differential expression, a method was applied to attach normalized expression data to genes with a low false-positive rate (<0.05). Finally, hierarchical cluster analysis was used to exhibit related gene expression patterns. Our results are in accordance with anatomical and physiological observations made during the developmental process of the human hippocampus. Furthermore, some novel findings appeared to be unique to our results. The abundant expression of genes for cell surface components and disease-related genes drew our attention. Twenty-four genes are significantly different from adult, and 13 genes might be developing hippocampus-specific candidate genes, including wnt2b and some Alzheimer's disease-related genes. Our results could provide useful information on the ontogeny, development, and function of cells in the human hippocampus at the molecular level and underscore the utility of large-scale, parallel gene expression analyses in the study of complex biological phenomena.
Affiliation(s)
- Yan Zhang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, People's Republic of China
|
8
|
Maddouri M, Elloumi M. A data mining approach based on machine learning techniques to classify biological sequences. Knowl Based Syst 2002. [DOI: 10.1016/s0950-7051(01)00143-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]
|
9
|
Jia L, Young MF, Powell J, Yang L, Ho NC, Hotchkiss R, Robey PG, Francomano CA. Gene expression profile of human bone marrow stromal cells: high-throughput expressed sequence tag sequencing analysis. Genomics 2002; 79:7-17. [PMID: 11827452 DOI: 10.1006/geno.2001.6683] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/22/2022]
Abstract
Human bone marrow stromal cells (HBMSC) are pluripotent cells with the potential to differentiate into osteoblasts, chondrocytes, myelosupportive stroma, and marrow adipocytes. We used high-throughput DNA sequencing analysis to generate 4258 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' (97) and 3' (4161) ends of human cDNA clones from a HBMSC cDNA library. Our goal was to obtain tag sequences from the maximum number of possible genes and to deposit them in the publicly accessible database for ESTs (dbEST of the National Center for Biotechnology Information). Comparisons of our EST sequencing data with nonredundant human mRNA and protein databases showed that the ESTs represent 1860 gene clusters. The EST sequencing data analysis showed 60 novel genes found only in this cDNA library after BLAST analysis against 3.0 million ESTs in NCBI's dbEST database. The BLAST search also showed the identified ESTs that have close homology to known genes, which suggests that these may be newly recognized members of known gene families. The gene expression profile of this cell type is revealed by analyzing both the frequency with which a message is encountered and the functional categorization of expressed sequences. Comparing an EST sequence with the human genomic sequence database enables assignment of an EST to a specific chromosomal region (a process called digital gene localization) and often enables immediate partial determination of intron/exon boundaries within the genomic structure. It is expected that high-throughput EST sequencing and data mining analysis will greatly promote our understanding of gene expression in these cells and of growth and development of the skeleton.
Affiliation(s)
- Libin Jia
- Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
10
|
Wang YH, McWilliam SM, Barendse W, Kata SR, Womack JE, Moore SS, Lehnert SA. Mapping of 12 bovine ribosomal protein genes using a bovine radiation hybrid panel. Anim Genet 2001; 32:269-73. [PMID: 11683713 DOI: 10.1046/j.1365-2052.2001.00791.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023]
Abstract
Twelve bovine ribosomal protein genes, for which sequence data had been acquired from complementary deoxyribonucleic acid (cDNA) clones isolated from a cattle skin cDNA library, were mapped. As ribosomal protein genes are a group of highly conserved house keeping genes, specific primers were designed to span the intron-exon splice sites and to amplify intronic sequences, in order to obtain bovine-specific polymerase chain reaction (PCR) products. Two of 12 ribosomal protein genes were genotyped in this way and the remaining 10 were mapped using additional primers designed from within the intron. Eleven previously unmapped ribosomal protein genes were localized and one previously reported ribosomal protein gene localization was confirmed. The 12 ribosomal protein genes mapped in this study are spread over 10 chromosomes, including the X chromosome. The locations show conservation of comparative map position in cattle and human.
Affiliation(s)
- Y H Wang
- CSIRO Livestock Industries, Molecular Animal Genetics Centre, Gehrmann Laboratories, Brisbane, Australia.
|
11
|
Weisemann JM, Boguski MS, Ouellette BF. Sequence databases: integrated information retrieval and data submission. Current Protocols in Molecular Biology 2001; Chapter 19:Unit 19.2. [PMID: 18265176 DOI: 10.1002/0471142727.mb1902s51] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
This unit provides an overview of biomedical information resources, focusing on sequence data, structure information, and the associated literature, and also discusses how nucleotide sequence data gets into the databases in the first place. Some specific databases covered here are MEDLINE, GenBank, and Entrez.
Affiliation(s)
- J M Weisemann
- National Center for Biotechnology Information, Bethesda, Maryland, USA
|
12
|
Sarrazin P, Bkaily G, Haché R, Patry C, Dumais R, Rocha FA, de Brum-Fernandes AJ. Characterization of the prostaglandin receptors in human osteoblasts in culture. Prostaglandins Leukot Essent Fatty Acids 2001; 64:203-10. [PMID: 11334557 DOI: 10.1054/plef.1999.0127] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
Prostaglandins have complex actions on bone metabolism that depend on interactions with different types and subtypes of receptors. Our objective was to characterize the prostaglandin receptors present in primary cultures of human osteoblasts. RT-PCR analysis revealed the presence of DP, EP(4), IP, FP and TP receptor mRNA in primary cultures of human osteoblasts. FP receptor mRNA was detected only after 3 weeks of confluency; all the others were detected at every culture time tested. To verify the functionality of these receptors we challenged the cells with the prostanoids and synthetic analogues and determined the intracellular levels of cAMP. All receptors found by RT-PCR were coupled to second messengers except for the DP subtype. These results clearly show the presence of functional EP(4), IP, FP and TP receptors in human osteoblasts in culture.
Affiliation(s)
- P Sarrazin
- Department of Medicine, Faculty of Medicine, Université de Sherbrooke, Sherbrooke, Canada
|
13
|
Navarro E, Espinosa L. Improving quality of expressed sequence tag (EST) databases: recovery of reversed, antisense cDNA sequences. Microbial & Comparative Genomics 2001; 5:17-24. [PMID: 11011762 DOI: 10.1089/10906590050145230] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]
Abstract
Expressed sequence tag (EST) databases contain a significant number (5-20%) of reversed, antisense, cDNA sequences that can be recognized by the label "reversed clone: similarity on wrong strand" in the annotations to the sequence. Despite this high number of altered sequences, no attempt has been made to explain the alteration in molecular terms, or to evaluate their effect on the quality of the information curated in EST databases. In this paper we try to explain how these altered sequences originate, and propose a plausible mechanism: a "double priming" of the first strand oligo-dT primer at both ends of nascent cDNAs. In this way, a symmetrical cDNA intermediate is generated, an intermediate that can be cloned after partial digestion with the restriction enzyme used for the directional cloning. Furthermore, when "secondary" priming takes place inside the cDNA, the chain synthesized is prone to be truncated prematurely, with the subsequent loss of upstream information. One of the most subtle effects of this cloning alteration is the generation of virtual open reading frames (ORFs) in sequences with no homologues available for comparison. Nevertheless, and according to our model and our data, the "double priming mechanism" does not shift the ORF affected, so antisense sequences should be considered as normal ones after a simple transformation in their inverse-complementary forms.
|
14
|
Abstract
A large body of immunologic, epidemiologic, and genetic data indicate that tissue injury in multiple sclerosis (MS) results from an abnormal immune response to one or more myelin antigens that develops in genetically susceptible individuals after exposure to an as-yet undefined causal agent. The genetic component of MS etiology is believed to result from the action of several genes of moderate effect. The incomplete penetrance of MS susceptibility alleles probably reflects interactions with other genes, post-transcriptional regulatory mechanisms, and significant nutritional and environmental influences. Equally significant, it is also likely that genetic heterogeneity exists, meaning that specific genes influence susceptibility and pathogenesis in some affected individuals but not in others. Results in multiplex MS families confirm the genetic importance of the MHC region in conferring susceptibility to MS. Susceptibility may be mediated by the class II genes themselves (DR, DQ or both), related to the known function of these molecules in the normal immune response, e.g. antigen binding and presentation and T cell repertoire determination. The possibility that other genes in the MHC or the telomeric region of the MHC are responsible for the observed genetic effect cannot be excluded. The data also indicate that although the MHC region plays a significant role in MS susceptibility, much of the genetic effect in MS remains to be explained. Some loci may be involved in the initial pathogenic events, while others could influence the development and progression of the disease. The past few years have seen real progress in the development of laboratory and analytical approaches to study non-Mendelian complex genetic disorders and in defining the pathological basis of demyelination, setting the stage for the final characterization of the genes involved in MS susceptibility and pathogenesis. Their identification and characterization is likely to define the basic etiology of the disease, improve risk assessment and influence therapeutics.
Affiliation(s)
- J R Oksenberg
- Department of Neurology, School of Medicine, University of California, 94143-0435, San Francisco, CA, USA.
|
15
|
Konno H, Fukunishi Y, Shibata K, Itoh M, Carninci P, Sugahara Y, Hayashizaki Y. Computer-Based Methods for the Mouse Full-Length cDNA Encyclopedia: Real-Time Sequence Clustering for Construction of a Nonredundant cDNA Library. Genome Res 2001. [DOI: 10.1101/gr.145701] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022]
Abstract
We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. Our cDNA library construction process comprises assessment of library quality, sequencing the 3′ ends of inserts and clustering, and completing a re-array to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5′ ends of the inserts to check the quality of the library; then we determine the sequencing priority of each library. Selected libraries undergo large-scale sequencing of the 3′ ends of the inserts and clustering of the tag sequences. After clustering, the nonredundant library is constructed from the original libraries, which have redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is saved in the database according to this identifier. At press time, our system has been in place for the past two years; we have clustered 939,725 3′ end sequences into 127,385 groups from 227 cDNA libraries/sublibraries (see http://genome.gse.riken.go.jp/). [The sequence data described in this paper have been submitted to the DDBJ data library under accession nos. AV00011–AV175734, AV204013–AV382295, and BB561685–BB609425.]
|
16
|
Goh SH, Park JH, Lee YJ, Lee HG, Yoo HS, Lee IC, Park JH, Kim YS, Lee CC. Gene expression profile and identification of differentially expressed transcripts during human intrathymic T-cell development by cDNA sequencing analysis. Genomics 2000; 70:1-18. [PMID: 11087656 DOI: 10.1006/geno.2000.6342] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
The development of immature thymocytes to mature T-lymphocytes is a central process for establishing a functional immune system. The gene regulatory events involved in this process are of outstanding interest in understanding the generation of the T-cell repertoire as well as the differentiation of lineage-specific cells, such as CD4(+) helper T-cells or CD8(+) cytotoxic T-lymphocytes. While some essential genes involved in lineage decision and thymocyte differentiation have been already identified, the exact regulatory mechanisms and differential gene expressions are still unknown. The present study was performed to analyze the gene expression profile during T-cell development, in particular, during the differentiation of immature thymocytes into CD4(+) mature T-cells by analyses of expressed sequence tags (ESTs), and to elucidate novel human genes involved in this process. Based on distinct developmental stages, three PCR-based cDNA libraries from immature CD3(-),4(-),8(-) triple-negative, CD4(+),8(+) double-positive, and mature CD4(+),8(-) single-positive thymocytes were constructed. A total of 1477 randomly selected clones were analyzed by automated single-pass sequencing, and the assembly of ESTs resulted in 1027 different species of contig sequences. Among them, 392 contig sequences were matched to known genes, and several novel transcripts were discovered. The matched clones were classified into seven categories according to their functional aspects, and the gene expression profiles of the three thymocyte subsets were compared. The information obtained in current study will serve as a valuable resource for elucidating the molecular mechanism of intrathymic T-cell development.
Affiliation(s)
- S H Goh
- Genome Research Center, Korea Research Institute of Bioscience and Biotechnology, Taejon, 305-333, Korea
|
17
|
Dempsey AA, Ton C, Liew CC. A cardiovascular EST repertoire: progress and promise for understanding cardiovascular disease. Molecular Medicine Today 2000; 6:231-7. [PMID: 10840381 DOI: 10.1016/s1357-4310(00)01727-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 10/18/2022]
Abstract
The application of expressed sequence tag (EST) technology has proven to be an effective tool for gene discovery and the generation of gene expression profiles. The generation of an EST resource for the cardiovascular system has revealed significant insights into the changes in gene expression that guide heart development and disease. Furthermore, an important genetic resource has been developed for cardiovascular biology that is valuable for data mining and disease gene discovery.
Affiliation(s)
- A A Dempsey
- The Cardiovascular Genome Unit, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
|
18
|
Hwang DM, Dempsey AA, Lee CY, Liew CC. Identification of differentially expressed genes in cardiac hypertrophy by analysis of expressed sequence tags. Genomics 2000; 66:1-14. [PMID: 10843799 DOI: 10.1006/geno.2000.6171] [Citation(s) in RCA: 86] [Impact Index Per Article: 3.4] [Indexed: 11/22/2022]
Abstract
Cardiac hypertrophy is an adaptive response to chronic hemodynamic overload. We employed a whole-genome approach using expressed sequence tags (ESTs) to characterize gene transcription and identify new genes overexpressed in cardiac hypertrophy. Analysis of general transcription patterns revealed a proportional increase in transcripts related to cell/organism defense and a decrease in transcripts related to cell structure and motility in hypertrophic hearts compared to normal hearts. Detailed comparison of individual gene expression identified 64 genes potentially overexpressed in hypertrophy, of 232 candidate genes derived from a set of 77,692 cardiac ESTs, including 47,856 ESTs generated in our laboratory. Of these, 29 were good candidates (P < 0.0002) and 35 were weaker candidates (P < 0.005). RT-PCR of a number of these candidate genes demonstrated correspondence of EST-based predictions of gene expression with in vitro levels. Consistent with an organ under various stresses, up to one-half of the good candidates predicted to exhibit differential expression were genes potentially involved in stress response. Analyses of general transcription patterns and of single-gene expression levels were also suggestive of increased protein synthesis in the hypertrophic myocardium. Overall, these results depict a scenario compatible with current understanding of cardiac hypertrophy. However, the identification of several genes not previously known to exhibit increased expression in cardiac hypertrophy (e.g., prostaglandin D synthases; CD59 antigen) also suggests a number of new avenues for further investigation. These data demonstrate the utility of genome-based resources for investigating questions of cardiovascular biology and medicine.
Affiliation(s)
- D M Hwang
- The Cardiac Gene Unit, Department of Laboratory Medicine and Pathobiology, The Centre for Cardiovascular Research, The Toronto Hospital, Toronto, Ontario, M5G 1L5, Canada
|
19
|
Navarro E, Espinosa L, Adell T, Torà M, Berrozpe G, Real FX. Expressed sequence tag (EST) phenotyping of HT-29 cells: cloning of ser/thr protein kinase EMK1, kinesin KIF3B, and of transcripts that include Alu repeated elements. Biochimica et Biophysica Acta 1999; 1450:254-64. [PMID: 10395937 DOI: 10.1016/s0167-4889(99)00051-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 11/16/2022]
Abstract
To study the mechanisms that control epithelial commitment and differentiation we have used undifferentiated HT-29 colon cancer cells and a subpopulation of mucus secreting cells obtained by selection of HT-29 cells in 10⁻⁶ M methotrexate (M6 cells) as experimental models. We isolated cDNAs encoding transcripts overexpressed in early confluent M6 cells regarding steady-state levels in HT-29 cells by subtractive hybridisation. Fifty-one cDNA clones, corresponding to 34 independent transcripts, were isolated, partially sequenced by their 5' end, and classified into four groups according to their identity: transcripts that included a repeated sequence of the Alu family (10 clones, among them those encoding ribonucleoprotein RNP-L and E-cadherin), transcripts encoded by the mitochondrial genome (nine clones), transcripts encoding components of the protein synthesis machinery (23 clones, including the human ribosomal protein L38 not previously cloned in humans) and nine additional cDNAs that could not be classified in the previous groups. These last included ferritin, cytokeratin 18, translationally controlled human tumour protein (TCHTP), mt-aldehyde dehydrogenase, as well as unknown transcripts (three clones), and the human homologues of the molecular motor kinesin KIF3B and of the ser/thr protein kinase EMK1. Spot dot and Northern blot analyses showed that ser/thr protein kinase EMK1 was differentially expressed in M6 cells when compared with parental HT-29 cells. Steady-state levels of EMK1 were higher in proliferating, preconfluent, M6 and HT-29 cells than in 2 days post confluence (dpc) and 8dpc M6 and HT-29 cells. Transcripts that included an Alu repeat were also shown to be differentially expressed and accumulated in differentiating M6 cells when analysed by Northern blot. The significance of the transcripts cloned is discussed in the context of the commitment and differentiation of the M6 cells to the mucus secreting lineage of epithelial cells.
Affiliation(s)
- E Navarro
- Unitat de Biologia Cel.lular i Molecular, Institut Municipal d'Investigació Mèdica (IMIM), C/ Dr Aiguader 80, E-08003, Barcelona, Spain.
|
20
|
Affiliation(s)
- D M Church
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
|
21
|
Shworak NW, Liu J, Petros LM, Zhang L, Kobayashi M, Copeland NG, Jenkins NA, Rosenberg RD. Multiple isoforms of heparan sulfate D-glucosaminyl 3-O-sulfotransferase. Isolation, characterization, and expression of human cDNAs and identification of distinct genomic loci. J Biol Chem 1999; 274:5170-84. [PMID: 9988767 DOI: 10.1074/jbc.274.8.5170] [Citation(s) in RCA: 179] [Impact Index Per Article: 6.9] [Indexed: 11/06/2022]
Abstract
3-O-Sulfated glucosaminyl residues are rare constituents of heparan sulfate and are essential for the activity of anticoagulant heparan sulfate. Cellular production of the critical active structure is controlled by the rate-limiting enzyme, heparan sulfate D-glucosaminyl 3-O-sulfotransferase-1 (3-OST-1) (EC 2.8.2.23). We have probed the expressed sequence tag data base with the carboxyl-terminal sulfotransferase domain of 3-OST-1 to reveal three novel, incomplete human cDNAs. These were utilized in library screens to isolate full-length cDNAs. Clones corresponding to predominant transcripts were obtained for the 367-, 406-, and 390-amino acid enzymes 3-OST-2, 3-OST-3A, and 3-OST-3B, respectively. These type II integral membrane proteins are comprised of a divergent amino-terminal region and a very homologous carboxyl-terminal sulfotransferase domain of approximately 260 residues. Also recovered were partial length clones for 3-OST-4. Expression of the full-length enzymes confirms the 3-O-sulfation of specific glucosaminyl residues within heparan sulfate (Liu, J., Shworak, N. W., Sinaÿ, P., Schwartz, J. J., Zhang, L., Fritze, L. M. S., and Rosenberg, R. D. (1999) J. Biol. Chem. 274, 5185-5192). Southern analyses suggest the human 3OST1, 3OST2, and 3OST4 genes, and the corresponding mouse isologs, are single copy. However, 3OST3A and 3OST3B genes are each duplicated in humans and show at least one copy each in mice. Intriguingly, the entire sulfotransferase domain sequence of the 3-OST-3B cDNA (774 base pairs) was 99.2% identical to the same region of 3-OST-3A. Together, these data argue that the structure of this functionally important region is actively maintained by gene conversion between 3OST3A and 3OST3B loci. Interspecific mouse back-cross analysis identified the loci for mouse 3Ost genes and syntenic assignments of corresponding human isologs were confirmed by the identification of mapped sequence-tagged site markers. Northern blot analyses indicate brain exclusive and brain predominant expression of 3-OST-4 and 3-OST-2 transcripts, respectively; whereas, 3-OST-3A and 3-OST-3B isoforms show widespread expression of multiple transcripts. The reiteration and conservation of the 3-OST sulfotransferase domain suggest that this structure is a self-contained functional unit. Moreover, the extensive number of 3OST genes with diverse expression patterns of multiple transcripts suggests that the novel 3-OST enzymes, like 3-OST-1, regulate important biologic properties of heparan sulfate proteoglycans.
Affiliation(s)
- N W Shworak
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
22
Okihana H, Yamada K. Preparation of a cDNA library and preliminary assessment of 1400 genes from mouse growth cartilage. J Bone Miner Res 1999; 14:304-10. [PMID: 9933486 DOI: 10.1359/jbmr.1999.14.2.304] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Cartilage is an inconvenient tissue for the isolation of mRNA, and this has hampered studies of its component mRNAs conducted to date. Here, we describe the preparation of a good quality cDNA library from mouse growth cartilage (mGC). A total of 1.7 µg of poly(A)+ RNA was obtained from about 1200 pieces of the mGC zone of 60 young mice (BALB/c, 4 weeks old). Using this poly(A)+ RNA, we constructed a cDNA library using the pAP3neo vector by the linker-primer method. The complexity of the cDNA library was 2.6 × 10⁶ colony-forming units (cfu), which signified that almost all of the mRNA components in the mGC were present in this cDNA library. From this library, 1401 clones were randomly selected and their insert sizes were examined. Of these clones, 166 (12%) had no inserts, 466 (33%) had inserts ranging in size from 0-0.9 kbp, 480 (34%) had inserts of 1.0-1.9 kbp, 162 (12%) had inserts of 2.0-2.9 kbp, and 127 (9%) had sizes greater than 3.0 kbp. The average insert size was 1.45 kbp. The number of cfu and the insert size data qualified this library as of reasonably good quality. Clones with an insert size greater than 1 kbp (769 clones) were sequenced from their 5' ends. Among the 769 clones examined, 608 gave sequence data. Among these, 196 (32%) were unknown, 2 were only poly A, and 410 (67%) coded for known proteins. Of these, 55 clones coded for type II (pro)collagen, 54 for osteonectin, and 22 for other cartilage collagens (type IX, type X, and type XI). The rest included cartilage extracellular matrix genes, general cellular genes, and others. To judge further the quality of the library, 45 species coding for type II collagen chain were aligned based on their 5' end sequences. Three species (7%) contained almost the full-length insert, and the shortest one was 1.5 kbp in length (full-length 5.6 kbp). These data show that this cDNA library is of reasonably good quality, making it likely that the large number of unknown inserts (32%) will provide a suitable pool for the identification and functional determination of new GC genes.
Affiliation(s)
- H Okihana
- Fujimoto Pharmaceutical Corporation, Osaka, Japan
23
Harter C, Ripoll C, Lenoir M, Hamel CP, Rebillard G. Expression pattern of mammalian cochlea outer hair cell (OHC) mRNA: screening of a rat OHC cDNA library. DNA Cell Biol 1999; 18:1-10. [PMID: 10025504 DOI: 10.1089/104454999315574] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
The aim of this study was to characterize the mRNA content of mammalian cochlear outer hair cells (OHCs) and to search for specific genes possibly involved in their unique properties. Indeed, OHCs, which feature high-frequency electromotility, are responsible for the exquisite sensitivity and frequency selectivity of the cochlea. Damage to these cells, which occurs in various conditions, causes a reduction in the cochlear sensitivity by about 50 dB and the alteration of frequency discrimination. Total RNA was extracted from about 2000 mechanically dissociated OHCs, and a polymerase chain reaction (PCR) amplified cDNA library was constructed. The presence of the alpha-9 acetylcholine receptor subunit, preferentially expressed in OHCs, was found by direct PCR amplification of the library. A systematic sequencing of 218 clones showed 78% known genes, 11% EST-related sequences, and 11% unknown genes. The known-gene group was characterized by two main features: a large proportion (55%) of mitochondrial transcripts and an abundance of calcium-binding proteins, such as calmodulin and calbindin, for which expression has already been demonstrated in OHCs. Another protein, oncomodulin, recently shown to be OHC specific, was also found, and its mRNA expression was confirmed by in situ hybridization. Among the 24 unknown genes, 7 were expressed in a restricted pattern, including one expressed in cochlea and spleen and, to a lesser extent, in lungs.
Affiliation(s)
- C Harter
- INSERM U. 254 et Université Montpellier I, Hôpital Saint Charles, France
24
Ibrahim MM, Razmara M, Nguyen D, Donahue RJ, Wubah JA, Knudsen TB. Altered expression of mitochondrial 16S ribosomal RNA in p53-deficient mouse embryos revealed by differential display. Biochim Biophys Acta 1998; 1403:254-64. [PMID: 9685670 DOI: 10.1016/s0167-4889(98)00066-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.3] [Indexed: 02/08/2023]
Abstract
Inactivation of the tumor suppressor p53 is associated with neural tube defects and altered teratogenicity in early embryos. To gain insight into the function of p53 during early embryogenesis, RNA profiles of wild-type p53(+/+) and p53(-/-) null mutant mouse embryos were compared at the head-fold stage (day 8 post coitum) using HPLC-based mRNA differential display. The results of this screen revealed a deficiency of mitochondrial 16S ribosomal RNA in p53(-/-) embryos. RT-PCR showed abnormalities in 16S rRNA levels relative to some representative nuclear (COIV, beta-actin) and mitochondrial (COIII) transcripts in p53(-/-) embryos, and that 16S rRNA expression increased with development of p53(+/+) embryos during neurulation. Embryos that lack p53 also displayed weakened cytochrome c oxidase staining and reduced ATP content. During neurulation, the mouse embryo switches from an anaerobic (glycolytic) to an aerobic (oxidative) metabolism. The preliminary results of the present study suggest that p53 may be involved, directly or indirectly, in this transition.
Affiliation(s)
- M M Ibrahim
- Department of Pathology, Anatomy and Cell Biology, Jefferson Medical College, 1020 Locust Street, Philadelphia, PA 19107, USA
25
Kim TA, Lim J, Ota S, Raja S, Rogers R, Rivnay B, Avraham H, Avraham S. NRP/B, a novel nuclear matrix protein, associates with p110(RB) and is involved in neuronal differentiation. J Cell Biol 1998; 141:553-66. [PMID: 9566959 PMCID: PMC2132755 DOI: 10.1083/jcb.141.3.553] [Citation(s) in RCA: 89] [Impact Index Per Article: 3.3] [Indexed: 02/07/2023]
Abstract
The nuclear matrix is defined as the insoluble framework of the nucleus and has been implicated in the regulation of gene expression, the cell cycle, and nuclear structural integrity via linkage to intermediate filaments of the cytoskeleton. We have discovered a novel nuclear matrix protein, NRP/B (nuclear restricted protein/brain), which contains two major structural elements: a BTB domain-like structure in the predicted NH2 terminus, and a "kelch motif" in the predicted COOH-terminal domain. NRP/B mRNA (5.5 kb) is predominantly expressed in human fetal and adult brain with minor expression in kidney and pancreas. During mouse embryogenesis, NRP/B mRNA expression is upregulated in the nervous system. The NRP/B protein is expressed in rat primary hippocampal neurons, but not in primary astrocytes. NRP/B expression was upregulated during the differentiation of murine Neuro 2A and human SH-SY5Y neuroblastoma cells. Overexpression of NRP/B in these cells augmented neuronal process formation. Treatment with antisense NRP/B oligodeoxynucleotides inhibited the neurite development of rat primary hippocampal neurons as well as the neuronal process formation during neuronal differentiation of PC-12 cells. Since the hypophosphorylated form of retinoblastoma protein (p110(RB)) is found to be associated with the nuclear matrix and overexpression of p110(RB) induces neuronal differentiation, we investigated whether NRP/B is associated with p110(RB). Both in vivo and in vitro experiments demonstrate that NRP/B can be phosphorylated and can bind to the functionally active hypophosphorylated form of the p110(RB) during neuronal differentiation of SH-SY5Y neuroblastoma cells induced by retinoic acid. Our studies indicate that NRP/B is a novel nuclear matrix protein, specifically expressed in primary neurons, that interacts with p110(RB) and participates in the regulation of neuronal process formation.
Affiliation(s)
- T A Kim
- Divisions of Experimental Medicine and Hematology/Oncology, Beth Israel Deaconess Medical Center, Harvard Institutes of Medicine, Boston, Massachusetts 02115, USA
26
Lesche R, Rüther U. Close linkage of p130 and Ft1 is conserved among mammals. Mamm Genome 1998; 9:253-5. [PMID: 9501314 DOI: 10.1007/s003359900737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023]
Affiliation(s)
- R Lesche
- Institut für Molekularbiologie, Medizinische Hochschule Hannover, Germany
27
Lesche R, Peetz A, van der Hoeven F, Rüther U. Ft1, a novel gene related to ubiquitin-conjugating enzymes, is deleted in the Fused toes mouse mutation. Mamm Genome 1997; 8:879-83. [PMID: 9383278 DOI: 10.1007/s003359900604] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.3] [Indexed: 02/05/2023]
Abstract
The dominant mouse mutation Fused toes is characterized by partial syndactyly of the limbs and thymic hyperplasia. Both morphological abnormalities were shown to be related to impaired regulation of programmed cell death. Ft/Ft embryos die in midgestation showing severe malformations of fore- and midbrain as well as randomized situs. In Ft mice a large chromosomal deletion (about 300 kb) occurred after insertional mutagenesis. In this report we describe the identification of the first gene that has been mutated by Fused toes. The expression of the novel gene Ft1 is reduced in Ft/+ mice and completely absent in Ft/Ft embryos. Analysis of the Ft1 cDNA revealed an open reading frame that could code for a 32-kDa protein with similarities to ubiquitin-conjugating enzymes. Ft1 transcripts with alternative 5' UTR sequences as well as differential usage of polyadenylation sites were found. Interestingly, the 3' parts of the longest Ft1 transcripts are identical to the reverse complement of the 3'-most sequences of the Rb-related p130 gene. Both genes are transcribed in opposite directions and overlap in their 3' UTRs. Despite the close linkage, p130 expression appeared not to be affected by the Ft mutation. In wild type mice, Ft1 expression levels were found to be high in brain, kidney, and testes and detectable in all other adult organs and throughout embryonic development. Finally, we show that Ft1 is conserved among mammals and identify the human homolog.
28
Vincent C, Tarbouriech N, Härtlein M. Genomic organization, cDNA sequence, bacterial expression, and purification of human seryl-tRNA synthase. Eur J Biochem 1997; 250:77-84. [PMID: 9431993 DOI: 10.1111/j.1432-1033.1997.00077.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 02/05/2023]
Abstract
In this paper, we report the cDNA sequence and deduced primary sequence for human cytosolic seryl-tRNA synthetase, and its expression in Escherichia coli. Two human brain cDNA clones of different origin, containing overlapping fragments coding for human seryl-tRNA synthetase were sequenced: HFBDN14 (fetal brain clone); and IB48 (infant brain clone). For both clones the 5' region of the cDNA was missing. This 5' region was obtained via PCR methods using a human brain 5' RACE-Ready cDNA library. The complete cDNA sequence allowed us to define primers to isolate and characterize the intron/exon structure of the serS gene, consisting of 10 introns and 11 exons. The introns' sizes range from 283 bp to more than 3000 bp and the size of the exons from 71 bp to 222 bp. The availability of the gene structure of the human enzyme could help to clarify some aspects of the molecular evolution of class-II aminoacyl-tRNA synthetases. The human seryl-tRNA synthetase has been expressed in E. coli, purified (95% pure as determined by SDS/PAGE) and kinetic parameters have been measured for its substrate tRNA. The human seryl-tRNA synthetase sequence (514 amino acid residues) shows significant sequence identity with seryl-tRNA synthetases from E. coli (25%), Saccharomyces cerevisiae (40%), Arabidopsis thaliana (41%) and Caenorhabditis elegans (60%). The partial sequences from published mammalian seryl-tRNA synthetases are very similar to the human enzyme (94% and 92% identity for mouse and Chinese hamster seryl-tRNA synthetase, respectively). Human seryl-tRNA synthetase, similar to several other class-I and class-II human aminoacyl-tRNA synthetases, is clearly related to its bacterial counterparts, independent of an additional C-terminal domain and an N-terminal insertion identified in the human enzyme. In functional studies, the enzyme aminoacylates calf liver tRNA and prokaryotic E. coli tRNA.
29
Gong Z, Yan T, Liao J, Lee SE, He J, Hew CL. Rapid identification and isolation of zebrafish cDNA clones. Gene 1997; 201:87-98. [PMID: 9409775 DOI: 10.1016/s0378-1119(97)00431-9] [Citation(s) in RCA: 70] [Impact Index Per Article: 2.5] [Indexed: 02/05/2023]
Abstract
A fast and economical approach, referred to as cDNA clone tagging, was adapted to identify and isolate zebrafish cDNA clones. The basic approach was to partially sequence the coding region of size-selected cDNA clones and the partial sequences were then used as tags for identifying the clones through homology search. To benefit maximally from the tagging approach, two cDNA libraries, derived from embryonic and adult fish poly(A)+ RNAs, respectively, were constructed by unidirectional cloning; conceptually, they have the potential to represent all expressed zebrafish genes. A total of 1084 clones were sequenced from the two libraries, and 511 clones were identified, based on sequence homology. These identified clones were derived from at least 261 genes, encoding 48 translational machinery proteins, 47 cytosolic proteins, 43 cytoskeletal proteins, 41 nuclear proteins, 32 membrane proteins, 22 secreted proteins, 20 mitochondrial proteins and 8 proteins with an unknown location. Of the 261 distinct cDNA clones identified, 254 were isolated for the first time in the zebrafish. These tagged cDNA clones, identified and unidentified, provide rich resources for developmental analysis as well as mapping of the zebrafish genome. The long-term objective of this study is to establish a tagged zebrafish gene library that can be accessed both by hybridization screening against the plasmid DNAs and by electronic screening using the sequence information.
Affiliation(s)
- Z Gong
- School of Biological Sciences, National University of Singapore.
30
Brady KP, Rowe LB, Her H, Stevens TJ, Eppig J, Sussman DJ, Sikela J, Beier DR. Genetic mapping of 262 loci derived from expressed sequences in a murine interspecific cross using single-strand conformational polymorphism analysis. Genome Res 1997; 7:1085-93. [PMID: 9371744 PMCID: PMC310685 DOI: 10.1101/gr.7.11.1085] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
We have demonstrated previously that noncoding sequences of genes are a robust source of polymorphisms between mouse species when tested using single-strand conformation polymorphism (SSCP) analysis, and that these polymorphisms are useful for genetic mapping. In this report we demonstrate that presumptive 3'-untranslated region sequence obtained from expressed sequence tags (ESTs) can be analyzed in a similar fashion, and we have used this approach to map 262 loci using an interspecific backcross. These results demonstrate SSCP analysis of genes or ESTs is a simple and efficient means for the genetic localization of transcribed sequences, and is furthermore an approach that is applicable to any system for which there is sufficient sequence polymorphism.
Affiliation(s)
- K P Brady
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
31
Niimi T, Kumagai C, Okano M, Kitagawa Y. Differentiation-dependent expression of laminin-8 (alpha 4 beta 1 gamma 1) mRNAs in mouse 3T3-L1 adipocytes. Matrix Biol 1997; 16:223-30. [PMID: 9402012 DOI: 10.1016/s0945-053x(97)90011-1] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.5] [Indexed: 02/05/2023]
Abstract
We report that laminin-8 (alpha 4 beta 1 gamma 1) is the specific isoform of laminin synthesized in adipocytes. Reverse transcription-polymerase chain reaction (RT-PCR) of mRNA from mouse 3T3-L1 cells with paired primers for alpha 1, alpha 2, alpha 3, alpha 4, alpha 5, beta 1, beta 2, beta 3, gamma 1 and gamma 2 laminins yielded amplified fragments only for alpha 4, beta 1 and gamma 1. A polyclonal antibody against mouse laminin-1 (alpha 1 beta 1 gamma 1) precipitated alpha 4 in addition to beta 1 and gamma 1, while the antibody against a deduced peptide sequence of mouse alpha 4 precipitated beta 1 and gamma 1 in addition to alpha 4. Thus, laminin-8 (alpha 4 beta 1 gamma 1) is the only isoform expressed in 3T3-L1 cells. Northern blots showed that the levels of alpha 4, beta 1 and gamma 1 mRNAs increased 2.5-fold during adipose conversion of 3T3-L1 cells. A 1062 bp cDNA fragment cloned by RT-PCR demonstrated a polymorphism in the mouse alpha 4 gene which would lead to five amino acid changes in the domain G.
Affiliation(s)
- T Niimi
- Graduate Program for Biochemical Regulation, Graduate School of Agricultural Sciences, Nagoya University, Japan
32
Abstract
Genes differentially expressed in different tissues, during development, or during specific pathologies are of foremost interest to both basic and pharmaceutical research. "Transcript profiles" or "digital Northerns" are generated routinely by partially sequencing thousands of randomly selected clones from relevant cDNA libraries. Differentially expressed genes can then be detected from variations in the counts of their cognate sequence tags. Here we present the first systematic study on the influence of random fluctuations and sampling size on the reliability of this kind of data. We establish a rigorous significance test and demonstrate its use on publicly available transcript profiles. The theory links the threshold of selection of putatively regulated genes (e.g., the number of pharmaceutical leads) to the fraction of false positive clones one is willing to risk. Our results delineate more precisely and extend the limits within which digital Northern data can be used.
Affiliation(s)
- S Audic
- Laboratory of Structural and Genetic Information, Centre National de la Recherche Scientifique-E.P.91, Marseille 13402, France
33
Maier E, Meier-Ewert S, Bancroft D, Lehrach H. Automated array technologies for gene expression profiling. Drug Discov Today 1997. [DOI: 10.1016/s1359-6446(97)01054-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]
34
Soto-Prior A, Lavigne-Rebillard M, Lenoir M, Ripoll C, Rebillard G, Vago P, Pujol R, Hamel CP. Identification of preferentially expressed cochlear genes by systematic sequencing of a rat cochlea cDNA library. Brain Res Mol Brain Res 1997; 47:1-10. [PMID: 9221896 DOI: 10.1016/s0169-328x(97)00033-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023]
Abstract
107 expressed sequence tags (ESTs) from a rat cochlea cDNA library were identified by systematic sequencing coupled to database selection and RT-PCR analysis of novel sequences. This approach led us to select a clone, pCO8, showing no significant homology with any database sequence, that corresponds to an mRNA whose expression is restricted to the cochlea, except for traces detected in brain. Additional clones with novel sequences enriched in the cochlea were also found. ESTs bearing significant homologies with database sequences (63 out of 107) were classified according to the putatively encoded protein. They include tissue-specific genes not previously described in the cochlea as well as known genes from other species. We performed in situ hybridization in cochlear tissues to localize the pCO8 mRNA and that of clone pCO6 which is 100% homologous to the delayed rectifier potassium channel drk1. We found that both mRNAs were exclusively expressed in the cellular body of the primary auditory neurons from the spiral ganglion of the cochlea. These results indicate that this approach is an efficient way to identify novel genes that could be of importance in cochlear function.
Affiliation(s)
- A Soto-Prior
- INSERM U254 and Universités de Montpellier 1 et 2, CHU Hôpital Saint Charles, France
35
Becker KG, Mattson DH, Powers JM, Gado AM, Biddison WE. Analysis of a sequenced cDNA library from multiple sclerosis lesions. J Neuroimmunol 1997; 77:27-38. [PMID: 9209265 DOI: 10.1016/s0165-5728(97)00045-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.2] [Indexed: 02/04/2023]
Abstract
To identify genes that are expressed in MS pathogenesis, we have analyzed a normalized cDNA library made from mRNA obtained from CNS lesions of a patient with primary progressive MS. Complementary DNA clones obtained from this library were subjected to automated DNA sequencing to generate expressed sequence tags. Analysis of this MS cDNA library revealed the presence of 54 cDNAs that were associated with immune activation and indicated the presence of an ongoing inflammatory response with evidence of both cell-mediated and humoral immune responses. The surprising finding was that 16 of the cDNAs encoded autoantigens associated with seven other autoimmune disorders, while only three of these 16 autoantigen cDNAs were present in a similarly constructed adult brain library. Such aberrant autoantigen expression could provide a source of secondary autoimmune stimulation that could contribute to the ongoing inflammatory response in MS. In addition, two cDNAs were found that mapped to a known MS susceptibility locus (5p14-p12): one encoded an excitatory amino acid transporter and the other a human homologue of the Drosophila disabled gene. This approach to the molecular biology of MS pathogenesis may help to illuminate previously unappreciated aspects of this disease.
Affiliation(s)
- K G Becker
- Molecular Immunology Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda MD, USA
36
Yu W, Andersson B, Worley KC, Muzny DM, Ding Y, Liu W, Ricafrente JY, Wentland MA, Lennon G, Gibbs RA. Large-scale concatenation cDNA sequencing. Genome Res 1997; 7:353-8. [PMID: 9110174 PMCID: PMC139146 DOI: 10.1101/gr.7.4.353] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.4] [Received: 11/15/1996] [Accepted: 02/04/1997] [Indexed: 02/04/2023]
Abstract
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7-2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (> 20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥ 98% identity), and 16 clones generated nonexact matches (57%-97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching.
37
Gianfrancesco F, Esposito T, Ruini L, Houlgatte R, Nagaraja R, D'Esposito M, Rocchi M, Auffray C, Schlessinger D, D'Urso M, Forabosco A. Mapping of 59 EST gene markers in 31 intervals spanning the human X chromosome. Gene 1997; 187:179-84. [PMID: 9099878 DOI: 10.1016/s0378-1119(96)00743-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023]
Abstract
The positioning of Expressed Sequence Tags (ESTs) constitutes an important step towards a functional map of the human genome, including candidate genes for human genetic disorders that have been localized by linkage analysis. We localized 59 ESTs on the human X chromosome, including 44 derived from infant brain and 15 from adult muscle cDNA libraries. Localizations by a somatic cell hybrid panel were refined for five cDNAs by mapping them in yeast artificial chromosome (YAC) contigs.
Affiliation(s)
- F Gianfrancesco
- International Institute of Genetics and Biophysics (IIGB), CNR, Naples, Italy
38
Touchman JW, Bouffard GG, Weintraub LA, Idol JR, Wang L, Robbins CM, Nussbaum JC, Lovett M, Green ED. 2006 expressed-sequence tags derived from human chromosome 7-enriched cDNA libraries. Genome Res 1997; 7:281-92. [PMID: 9074931 DOI: 10.1101/gr.7.3.281] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 02/04/2023]
Abstract
The establishment and mapping of gene-specific DNA sequences greatly complement the ongoing efforts to map and sequence all human chromosomes. To facilitate our studies of human chromosome 7, we have generated and analyzed 2006 expressed-sequence tags (ESTs) derived from a collection of direct selection cDNA libraries that are highly enriched for human chromosome 7 gene sequences. Similarity searches indicate that approximately two-thirds of the ESTs are not represented by sequences in the public databases, including those in dbEST. In addition, a large fraction (68%) of the ESTs do not have redundant or overlapping sequences within our collection. Human DNA-specific sequence-tagged sites (STSs) have been developed from 190 of the ESTs. Remarkably, 180 (96%) of these STSs map to chromosome 7, demonstrating the robustness of chromosome enrichment in constructing the direct selection cDNA libraries. Thus far, 140 of these EST-specific STSs have been assigned unequivocally to YAC contigs that are distributed across the chromosome. Together, these studies provide > 2000 ESTs highly enriched for chromosome 7 gene sequences, 180 new chromosome 7 STSs corresponding to ESTs, and a definitive demonstration of the ability to enrich for chromosome-specific cDNAs by direct selection. Furthermore, the libraries, sequence data, and mapping information will contribute to the construction of a chromosome 7 transcript map.
Affiliation(s)
- J W Touchman
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
39
Nabetani A, Hatada I, Morisaki H, Oshimura M, Mukai T. Mouse U2af1-rs1 is a neomorphic imprinted gene. Mol Cell Biol 1997; 17:789-98. [PMID: 9001233 PMCID: PMC231805 DOI: 10.1128/mcb.17.2.789] [Citation(s) in RCA: 80] [Impact Index Per Article: 2.9] [Indexed: 02/03/2023]
Abstract
The mouse U2af1-rs1 gene is an endogenous imprinted gene on the proximal region of chromosome 11. This gene is transcribed exclusively from the unmethylated paternal allele, while the methylated maternal allele is silent. An analysis of genome structure of this gene revealed that the whole gene is located in an intron of the Murr1 gene. Although none of the three human U2af1-related genes have been mapped to chromosome 2, the human homolog of Murr1 is assigned to chromosome 2. The mouse Murr1 gene is transcribed biallelically, and therefore it is not imprinted in neonatal mice. Allele-specific methylation is limited to a region around U2af1-rs1 in an intron of Murr1. These results suggest that in chromosomal homology and genomic imprinting, the U2af1-rs1 gene is distinct from the genome region surrounding it. We have proposed the neomorphic origin of the U2af1-rs1 gene by retrotransposition and the particular mechanism of genomic imprinting of ectopic genes.
Affiliation(s)
- A Nabetani
- Department of Bioscience, National Cardiovascular Center Research Institute, Suita, Osaka, Japan
40
Zhang MQ. Identification of protein coding regions in the human genome by quadratic discriminant analysis. Proc Natl Acad Sci U S A 1997; 94:565-8. [PMID: 9012824 PMCID: PMC19553 DOI: 10.1073/pnas.94.2.565] [Citation(s) in RCA: 198] [Impact Index Per Article: 7.1] [Received: 08/12/1996] [Accepted: 10/29/1996] [Indexed: 02/03/2023]
Abstract
A new method for predicting internal coding exons in genomic DNA sequences has been developed. This method is based on a prediction algorithm that uses the quadratic discriminant function for multivariate statistical pattern recognition. Substantial improvements have been made (with only 9 discriminant variables) when compared with existing methods: HEXON [Solovyev, V. V., Salamov, A. A. & Lawrence, C. B. (1994) Nucleic Acids Res. 22, 5156-5163] (based on linear discriminant analysis) and GRAIL2 [Uberbacher, E. C. & Mural, R. J. (1991) Proc. Natl. Acad. Sci. USA 88, 11261-11265] (based on neural networks). A computer program called MZEF is freely available to the genome community and allows users to adjust prior probability and to output alternative overlapping exons.
Affiliation(s)
- M Q Zhang
- Cold Spring Harbor Laboratory, NY 11724, USA
|
41
|
Schuler GD, Boguski MS, Stewart EA, Stein LD, Gyapay G, Rice K, White RE, Rodriguez-Tomé P, Aggarwal A, Bajorek E, Bentolila S, Birren BB, Butler A, Castle AB, Chiannilkulchai N, Chu A, Clee C, Cowles S, Day PJR, Dibling T, East C, Drouot N, Dunham I, Duprat S, Edwards C, Fan JB, Fang N, Fizames C, Garrett C, Green L, Hadley D, Harris M, Harrison P, Brady S, Hicks A, Holloway E, Hui L, Hussain S, Louis-Dit-Sully C, Ma J, MacGilvery A, Mader C, Maratukulam A, Matise TC, McKusick KB, Morissette J, Mungall A, Muselet D, Nusbaum HC, Page DC, Peck A, Perkins S, Piercy M, Qin F, Quackenbush J, Ranby S, Reif T, Rozen S, Sanders C, She X, Silva J, Slonim DK, Soderlund C, Sun WL, Tabar P, Thangarajah T, Vega-Czarny N, Vollrath D, Voyticky S, Wilmer T, Wu X, Adams MD, Auffray C, Walter NAR, Brandon R, Dehejia A, Goodfellow PN, Houlgatte R, Hudson JR, Ide SE, Iorio KR, Lee WY, Seki N, Nagase T, Ishikawa K, Nomura N, Phillips C, Polymeropoulos MH, Sandusky M, Schmitt K, Berry R, Swanson K, Torres R, Venter JC, Sikela JM, Beckmann JS, Weissenbach J, Myers RM, Cox DR, James MR, Bentley D, Deloukas P, Lander ES, Hudson TJ. A Gene Map of the Human Genome. Science 1996. [DOI: 10.1126/science.274.5287.540] [Citation(s) in RCA: 717] [Impact Index Per Article: 24.7] [Indexed: 11/02/2022]
|
42
|
Hillier LD, Lennon G, Becker M, Bonaldo MF, Chiapelli B, Chissoe S, Dietrich N, DuBuque T, Favello A, Gish W, Hawkins M, Hultman M, Kucaba T, Lacy M, Le M, Le N, Mardis E, Moore B, Morris M, Parsons J, Prange C, Rifkin L, Rohlfing T, Schellenberg K, Bento Soares M, Tan F, Thierry-Mieg J, Trevaskis E, Underwood K, Wohldman P, Waterston R, Wilson R, Marra M. Generation and analysis of 280,000 human expressed sequence tags. Genome Res 1996; 6:807-28. [PMID: 8889549 DOI: 10.1101/gr.6.9.807] [Citation(s) in RCA: 327] [Impact Index Per Article: 11.3] [Indexed: 02/02/2023]
Abstract
We report the generation of 319,311 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' and 3' ends of 194,031 human cDNA clones. Our goal has been to obtain tag sequences from many different genes and to deposit these in the publicly accessible Data Base for Expressed Sequence Tags. Highly efficient automatic screening of the data allows deposition of the annotated sequences without delay. Sequences have been generated from 26 oligo(dT) primed directionally cloned libraries, of which 18 were normalized. The libraries were constructed using mRNA isolated from 17 different tissues representing three developmental states. Comparisons of a subset of our data with nonredundant human mRNA and protein data bases show that the ESTs represent many known sequences and contain many that are novel. Analysis of protein families using Hidden Markov Models confirms this observation and supports the contention that although normalization reduces significantly the relative abundance of redundant cDNA clones, it does not result in the complete removal of members of gene families.
Affiliation(s)
- L D Hillier
- Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri 63108, USA.
|
43
|
Aaronson JS, Eckman B, Blevins RA, Borkowski JA, Myerson J, Imran S, Elliston KO. Toward the development of a gene index to the human genome: an assessment of the nature of high-throughput EST sequence data. Genome Res 1996; 6:829-45. [PMID: 8889550 DOI: 10.1101/gr.6.9.829] [Citation(s) in RCA: 84] [Impact Index Per Article: 2.9] [Indexed: 02/02/2023]
Abstract
A rigorous analysis of the Merck-sponsored EST data with respect to known gene sequences increases the utility of the data set and helps refine methods for building a gene index. A highly curated human transcript data base was used as a reference data set of known genes. A detailed analysis of EST sequences derived from known genes was performed to assess the accuracy of EST sequence annotation. The EST data was screened to remove low-quality and low-complexity sequences. A set of high-quality ESTs similar to the transcript data base was identified using BLAST; this subset of ESTs was compared with the set of known genes using the Smith-Waterman algorithm. Error rates of several types were assessed based on a flexible match criterion defining sequence identity. The rate of lane-tracking errors is very low, approximately 0.5%. Insert size data is accurate within approximately 20%. Reversed clone and internal priming error rates are approximately 5% and 2.5%, respectively, contributing to the incorrect identification of reads as 3' ends of genes. Follow-up investigation reveals that a significant number of clones, miscategorized as reversed, represent overlapping genes on the opposite strand of entries in the transcript data base. Relevance of these results to the creation of a high-quality index to the human genome capable of supporting diverse genomic investigations is discussed.
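The Smith-Waterman comparison step mentioned in the abstract can be sketched as a minimal local-alignment routine. The scoring values (`match=2`, `mismatch=-1`, `gap=-2`) and the linear gap penalty are illustrative assumptions; the abstract does not state the parameters or the match criterion the authors used.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Clamping every cell at zero is what makes the alignment local: a poorly matching prefix or suffix, such as vector sequence flanking an EST read, does not drag down the score of the well-matching core.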
Affiliation(s)
- J S Aaronson
- Merck Research Laboratories, Department of Bioinformatics, Rahway, New Jersey 07065, USA.
|
44
|
Bonaldo MF, Lennon G, Soares MB. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 1996; 6:791-806. [PMID: 8889548 DOI: 10.1101/gr.6.9.791] [Citation(s) in RCA: 363] [Impact Index Per Article: 12.5] [Indexed: 02/02/2023]
Abstract
Large-scale sequencing of cDNAs randomly picked from libraries has proven to be a very powerful approach to discover (putatively) expressed sequences that, in turn, once mapped, may greatly expedite the process involved in the identification and cloning of human disease genes. However, the integrity of the data and the pace at which novel sequences can be identified depend to a great extent on the cDNA libraries that are used. Because altogether, in a typical cell, the mRNAs of the prevalent and intermediate frequency classes comprise as much as 50-65% of the total mRNA mass, but represent no more than 1000-2000 different mRNAs, redundant identification of mRNAs of these two frequency classes is destined to become overwhelming relatively early in any such random gene discovery programs, thus seriously compromising their cost-effectiveness. With the goal of facilitating such efforts, previously we developed a method to construct directionally cloned normalized cDNA libraries and applied it to generate infant brain (INIB) and fetal liver/spleen (INFLS) libraries, from which a total of 45,192 and 86,088 expressed sequence tags, respectively, have been derived. While improving the representation of the longest cDNAs in our libraries, we developed three additional methods to normalize cDNA libraries and generated over 35 libraries, most of which have been contributed to our Integrated Molecular Analysis of Genomes and Their Expression (IMAGE) Consortium and thus distributed widely and used for sequencing and mapping. In an attempt to facilitate the process of gene discovery further, we have also developed a subtractive hybridization approach designed specifically to eliminate (or reduce significantly the representation of) large pools of arrayed and (mostly) sequenced clones from normalized libraries yet to be (or just partly) surveyed.
Here we present a detailed description and a comparative analysis of four methods that we developed and used to generate normalized cDNA libraries from human (15), mouse (3), rat (2), as well as the parasite Schistosoma mansoni (1). In addition, we describe the construction and preliminary characterization of a subtracted liver/spleen library (INFLS-SI) that resulted from the elimination (or reduction of representation) of approximately 5000 INFLS-IMAGE clones from the INFLS library.
Affiliation(s)
- M F Bonaldo
- Department of Psychiatry, College of Physicians and Surgeons of Columbia University, New York, New York, USA
|
45
|
Le Provost F, Lépingle A, Martin P. A survey of the goat genome transcribed in the lactating mammary gland. Mamm Genome 1996; 7:657-66. [PMID: 8703118 DOI: 10.1007/s003359900201] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.6] [Indexed: 02/01/2023]
Abstract
To fulfill its primary function, which is to synthesize milk during the course of lactation, the mammary gland requires efficient transcriptional, translational, and secretory machineries involving multiple genes among which promising candidates underlying the genetic variation of milk production have to be found. With the aim of providing a first transcriptional profile of lactating mammary tissue, a non-normalized cDNA library has been constructed from the udder of a lactating goat. After having discarded cDNA clones encoding the major milk proteins, the rapid characterization of genes expressed in this tissue, by automated partial cDNA sequencing, was used to analyze a total of 435 cDNA clones. Examination of the Expressed Sequence Tags (ESTs) for similarities with sequence databases identified 234 cDNAs corresponding to 140 unique genes or proteins. Eighty-three clones, not similar to any current database entries, representing 77 novel sequences unrelated to previously described genes, were thus identified. Tissue specificity and relative abundance of 18 of these 77 unidentified clones were examined by dot blot and RT-PCR experiments. Sequence data were subsequently used to assign six genes of unknown localization in the bovine genome to synteny groups by use of bovine-hamster cell hybrids and PCR.
Affiliation(s)
- F Le Provost
- Laboratoire de Génétique Biochimique et de Cytogénétique, Institut National de la Recherche Agronomique, 78352 Jouy-en-Josas Cedex, France
|
46
|
Wintero AK, Fredholm M, Davies W. Evaluation and characterization of a porcine small intestine cDNA library: analysis of 839 clones. Mamm Genome 1996; 7:509-17. [PMID: 8672129 DOI: 10.1007/s003359900153] [Citation(s) in RCA: 53] [Impact Index Per Article: 1.8] [Indexed: 02/01/2023]
Abstract
A porcine small intestine directionally cloned cDNA library was constructed in the vector lambda Zap II. Clones were hybridized with total labeled cDNA such that putative high-copy number transcripts could be differentiated from middle- and low-copy number transcripts prior to selection and characterization by DNA sequencing. More than 2000 non-hybridizing and 242 hybridizing clones were collected. In total, 839 clones were sequenced from the 3' end of the cDNA, and after inter-clone comparison, the unique clones were sequenced from the 5' end of the cDNA. The 5' data were used to query the sequence in databases and resulted in the identification of 630 different gene transcripts, of which 604 are new porcine genes. The identity of 361 transcripts could be identified from sequence comparison studies. The validity of this semi-random selection approach was verified by the identification of a large number of unique transcripts.
Affiliation(s)
- A K Wintero
- Royal Veterinary and Agricultural University, Department of Animal Science and Animal Health, Division of Animal Genetics, Bülowsvej 13, 1870 Frederiksberg C, Denmark
|
47
|
Lyu MS, Park DJ, Rhee SG, Kozak CA. Genetic mapping of the human and mouse phospholipase C genes. Mamm Genome 1996; 7:501-4. [PMID: 8672127 DOI: 10.1007/s003359900151] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 02/01/2023]
Abstract
To determine chromosome positions for 10 mouse phospholipase C (PLC) genes, we typed the progeny of two sets of genetic crosses for inheritance of restriction enzyme polymorphisms of each PLC. Four mouse chromosomes, Chr 1, 11, 12, and 19, contained single PLC genes. Four PLC loci, Plcb1, Plcb2, Plcb4, and Plcg1, mapped to three sites on distal mouse Chr 2. Two PLC genes, Plcd1 and Plcg2, mapped to distinct sites on Chr 8. We mapped the human homologs of eight of these genes to six chromosomes by analysis of human x rodent somatic cell hybrids. The map locations of seven of these genes were consistent with previously defined regions of conserved synteny; Plcd1 defines a new region of homology between human Chr 3 and mouse Chr 8.
Affiliation(s)
- M S Lyu
- Laboratory of Molecular Microbiology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
|
48
|
Coles LS, Diamond P, Occhiodoro F, Vadas MA, Shannon MF. Cold shock domain proteins repress transcription from the GM-CSF promoter. Nucleic Acids Res 1996; 24:2311-7. [PMID: 8710501 PMCID: PMC145951 DOI: 10.1093/nar/24.12.2311] [Citation(s) in RCA: 50] [Impact Index Per Article: 1.7] [Indexed: 02/01/2023]
Abstract
The human granulocyte-macrophage colony stimulating factor (GM-CSF) gene promoter binds a sequence-specific single-strand DNA binding protein termed NF-GMb. We previously demonstrated that the NF-GMb binding sites were required for repression of tumor necrosis factor-alpha (TNF-alpha) induction of the proximal GM-CSF promoter sequences in fibroblasts. We now describe the isolation of two different cDNA clones that encode cold shock domain (CSD) proteins with NF-GMb binding characteristics. One is identical to the previously reported CSD protein dbpB and the other is a previously unreported variant of the dbpA CSD factor. This is the first report of CSD factors binding to a cytokine gene. Nuclear NF-GMb and expressed CSD proteins have the same binding specificity for the GM-CSF promoter and other CSD binding sites. We present evidence that CSD factors are components of the nuclear NF-GMb complex. We also demonstrate that overexpression of the CSD proteins leads to complete repression of the proximal GM-CSF promoter containing the NF-GMb/CSD binding sites. Surprisingly, we show that CSD overexpression can also directly repress a region of the promoter which apparently lacks NF-GMb/CSD binding sites. NF-GMb/CSD factors may hence be acting by two different mechanisms. We discuss the potential importance of CSD factors in maintaining strict regulation of the GM-CSF gene.
Affiliation(s)
- L S Coles
- Division of Human Immunology, Hanson Centre for Cancer Research, Institute of Medical and Veterinary Science, Adelaide, South Australia, Australia
|
49
|
Hayes PD, Schmitt K, Jones HB, Gyapay G, Weissenbach J, Goodfellow PN. Regional assignment of human ESTs by whole-genome radiation hybrid mapping. Mamm Genome 1996; 7:446-50. [PMID: 8662228 DOI: 10.1007/s003359900130] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 02/01/2023]
Abstract
The UK HGMP Resource Centre's collection of human partial cDNA sequences (ESTs) has been examined for suitability for mapping by PCR on a panel of somatic cell hybrids. The chromosomal assignments of 92 ESTs were determined with a monochromosomal hybrid panel, and a subset of 45 were linked to genetic markers with a panel of whole-genome radiation hybrids (WG-RHs). These results demonstrate the potential of WG-RHs to construct a transcript map of the human genome.
Affiliation(s)
- P D Hayes
- HGMP Resource Centre, Hinxton, Cambridgeshire, UK
|
50
|
Abstract
Great progress is being made in the elucidation of disease genes, especially in hereditary skin diseases, by genome analysis, including functional and positional cloning. In more than fifty skin disorders, not only the chromosomal localizations, but also the abnormalities of the disease genes have been identified. As a resource for candidate genes, the expressed gene catalogues generated by large scale cDNA sequencing analysis are available. The isolation of disease genes may not directly serve to provide any therapeutic aids for the corresponding diseases at present, but the discovery of disease genes is expected to revolutionize our understanding of the pathogenesis and diagnosis of these skin disorders.
Affiliation(s)
- K Yamanishi
- Department of Dermatology, Kyoto Prefectural University of Medicine, Japan
|