1201
|
Yamauchi T, Kamon J, Ito Y, Tsuchida A, Yokomizo T, Kita S, Sugiyama T, Miyagishi M, Hara K, Tsunoda M, Murakami K, Ohteki T, Uchida S, Takekawa S, Waki H, Tsuno NH, Shibata Y, Terauchi Y, Froguel P, Tobe K, Koyasu S, Taira K, Kitamura T, Shimizu T, Nagai R, Kadowaki T. Cloning of adiponectin receptors that mediate antidiabetic metabolic effects. Nature 2003; 423:762-9. [PMID: 12802337 DOI: 10.1038/nature01705] [Citation(s) in RCA: 2308] [Impact Index Per Article: 104.9] [Received: 12/31/2002] [Accepted: 05/01/2003] [Indexed: 12/12/2022]
Abstract
Adiponectin (also known as 30-kDa adipocyte complement-related protein; Acrp30) is a hormone secreted by adipocytes that acts as an antidiabetic and anti-atherogenic adipokine. Levels of adiponectin in the blood are decreased under conditions of obesity, insulin resistance and type 2 diabetes. Administration of adiponectin causes glucose-lowering effects and ameliorates insulin resistance in mice. Conversely, adiponectin-deficient mice exhibit insulin resistance and diabetes. This insulin-sensitizing effect of adiponectin seems to be mediated by an increase in fatty-acid oxidation through activation of AMP kinase and PPAR-alpha. Here we report the cloning of complementary DNAs encoding adiponectin receptors 1 and 2 (AdipoR1 and AdipoR2) by expression cloning. AdipoR1 is abundantly expressed in skeletal muscle, whereas AdipoR2 is predominantly expressed in the liver. These two adiponectin receptors are predicted to contain seven transmembrane domains, but to be structurally and functionally distinct from G-protein-coupled receptors. Expression of AdipoR1/R2 or suppression of AdipoR1/R2 expression by small-interfering RNA supports our conclusion that they serve as receptors for globular and full-length adiponectin, and that they mediate increased AMP kinase and PPAR-alpha ligand activities, as well as fatty-acid oxidation and glucose uptake by adiponectin.
Affiliation(s)
- Toshimasa Yamauchi
- Department of Internal Medicine, Graduate School of Medicine, University of Tokyo, Tokyo 113-8655, Japan
|
1202
|
|
1203
|
|
1204
|
Baldarelli RM, Hill DP, Blake JA, Adachi J, Furuno M, Bradt D, Corbani LE, Cousins S, Frazer KS, Qi D, Yang L, Ramachandran S, Reed D, Zhu Y, Kasukawa T, Ringwald M, King BL, Maltais LJ, McKenzie LM, Schriml LM, Maglott D, Church DM, Pruitt K, Eppig JT, Richardson JE, Kadin JA, Bult CJ. Connecting sequence and biology in the laboratory mouse. Genome Res 2003; 13:1505-19. [PMID: 12819150 PMCID: PMC403701 DOI: 10.1101/gr.991003] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
Abstract
The Mouse Genome Sequencing Consortium and the RIKEN Genome Exploration Research group have generated large sets of sequence data representing the mouse genome and transcriptome, respectively. These data provide a valuable foundation for genomic research. The challenges for the informatics community are how to integrate these data with the ever-expanding knowledge about the roles of genes and gene products in biological processes, and how to provide useful views to the scientific community. Public resources, such as the National Center for Biotechnology Information (NCBI; http://www.ncbi.nih.gov), and model organism databases, such as the Mouse Genome Informatics database (MGI; http://www.informatics.jax.org), maintain the primary data and provide connections between sequence and biology. In this paper, we describe how the partnership of MGI and NCBI LocusLink contributes to the integration of sequence and biology, especially in the context of the large-scale genome and transcriptome data now available for the laboratory mouse. In particular, we describe the methods and results of integration of 60,770 FANTOM2 mouse cDNAs with gene records in the databases of MGI and LocusLink.
Affiliation(s)
- Richard M Baldarelli
- Mouse Genome Informatics Group, The Jackson Laboratory, Bar Harbor, Maine 04609, USA.
|
1205
|
Forrest ARR, Taylor D, Grimmond S. Exploration of the cell-cycle genes found within the RIKEN FANTOM2 data set. Genome Res 2003; 13:1366-75. [PMID: 12819135 PMCID: PMC403664 DOI: 10.1101/gr.1012403] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Received: 12/03/2002] [Accepted: 03/31/2003] [Indexed: 01/13/2023]
Abstract
The cell cycle is one of the most fundamental processes within a cell. Phase-dependent expression and cell-cycle checkpoints require a high level of control. A large number of genes with varying functions and modes of action are responsible for this biology. In a targeted exploration of the FANTOM2-Variable Protein Set, a number of mouse homologs to known cell-cycle regulators as well as novel members of cell-cycle families were identified. Focusing on two prototype cell-cycle families, the cyclins and the NIMA-related kinases (NEKs), we believe we have identified all of the mouse members of these families, 24 cyclins and 10 NEKs, and mapped them to ENSEMBL transcripts. To attempt to globally identify all potential cell cycle-related genes within mouse, the MGI (Mouse Genome Database) assignments for the RIKEN Representative Set (RPS) and the results from two homology-based queries were merged. We identified 1415 genes with possible cell-cycle roles, and 1758 potential paralogs. We comment on the genes identified in this screen and evaluate the merits of each approach.
Affiliation(s)
- Alistair R R Forrest
- The Institute for Molecular Bioscience, University of Queensland, Queensland Q4072, Australia.
|
1206
|
Semple CAM. The comparative proteomics of ubiquitination in mouse. Genome Res 2003; 13:1389-94. [PMID: 12819137 PMCID: PMC403670 DOI: 10.1101/gr.980303] [Citation(s) in RCA: 105] [Impact Index Per Article: 4.8] [Received: 11/08/2002] [Accepted: 03/06/2003] [Indexed: 11/24/2022]
Abstract
Ubiquitination is a common posttranslational modification in eukaryotic cells, influencing many fundamental cellular processes. Defects in ubiquitination and the processes it mediates are involved in many human disease states. The ubiquitination of a substrate involves four classes of enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), a ubiquitin protein ligase (E3), and a de-ubiquitinating enzyme (DUB). A substantial number of E1s (four), E2s (13), E3s (97), and DUBs (six) that were previously unknown in the mouse are included in the FANTOM2 Representative Transcript and Protein Set (RTPS). Many of the genes encoding these proteins will constitute promising candidates for involvement in disease. In addition, the RTPS provides the basis for the most comprehensive survey of ubiquitination-associated proteins across eukaryotes undertaken to date. Comparisons of these proteins across human and other organisms suggest that eukaryotic evolution has been associated with an increase in the number and diversity of E3s (possessing either zinc-finger RING, F-box, or HECT domains) and DUBs (containing the ubiquitin thiolesterase family 2 domain). These increases in numbers are too large to be accounted for by the presence of fragmentary proteins in the data sets examined. Much of this innovation appears to have been associated with the emergence of multicellular organisms, and subsequently of vertebrates, increasing the opportunity for complex regulation of ubiquitination-mediated cellular and developmental processes.
|
1207
|
Kiyosawa H, Yamanaka I, Osato N, Kondo S, Hayashizaki Y. Antisense transcripts with FANTOM2 clone set and their implications for gene regulation. Genome Res 2003; 13:1324-34. [PMID: 12819130 PMCID: PMC403655 DOI: 10.1101/gr.982903] [Citation(s) in RCA: 191] [Impact Index Per Article: 8.7] [Indexed: 02/04/2023]
Abstract
We have used the FANTOM2 mouse cDNA set (60,770 clones), public mRNA data, and mouse genome sequence data to identify 2481 pairs of sense-antisense transcripts and 899 further pairs of nonantisense bidirectional transcription based upon genomic mapping. The analysis greatly expands the number of known examples of sense-antisense transcript and nonantisense bidirectional transcription pairs in mammals. The FANTOM2 cDNA set appears to contain substantially large numbers of noncoding transcripts suitable for antisense transcript analysis. The average proportion of loci encoding sense-antisense transcript and nonantisense bidirectional transcription pairs on autosomes was 15.1 and 5.4%, respectively. Those on the X chromosome were 6.3 and 4.2%, respectively. Sense-antisense transcript pairs, rather than nonantisense bidirectional transcription pairs, may be less prevalent on the X chromosome, possibly due to X chromosome inactivation. Sense and antisense transcripts tended to be isolated from the same libraries, where nonantisense bidirectional transcription pairs were not apparently coregulated. The existence of large numbers of natural antisense transcripts implies that the regulation of gene expression by antisense transcripts is more common than previously recognized. The viewer showing mapping patterns of sense-antisense transcript pairs and nonantisense bidirectional transcription pairs on the genome and other related statistical data is available on our Web site.
Affiliation(s)
- Hidenori Kiyosawa
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1208
|
Kasukawa T, Furuno M, Nikaido I, Bono H, Hume DA, Bult C, Hill DP, Baldarelli R, Gough J, Kanapin A, Matsuda H, Schriml LM, Hayashizaki Y, Okazaki Y, Quackenbush J. Development and evaluation of an automated annotation pipeline and cDNA annotation system. Genome Res 2003; 13:1542-51. [PMID: 12819153 PMCID: PMC403710 DOI: 10.1101/gr.992803] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Indexed: 11/25/2022]
Abstract
Manual curation has long been held to be the "gold standard" for functional annotation of DNA sequence. Our experience with the annotation of more than 20,000 full-length cDNA sequences revealed problems with this approach, including inaccurate and inconsistent assignment of gene names, as well as many good assignments that were difficult to reproduce using only computational methods. For the FANTOM2 annotation of more than 60,000 cDNA clones, we developed a number of methods and tools to circumvent some of these problems, including an automated annotation pipeline that provides high-quality preliminary annotation for each sequence by introducing an "uninformative filter" that eliminates uninformative annotations, controlled vocabularies to accurately reflect both the functional assignments and the evidence supporting them, and a highly refined, Web-based manual annotation tool that allows users to view a wide array of sequence analyses and to assign gene names and putative functions using a consistent nomenclature. The ultimate utility of our approach is reflected in the low rate of reassignment of automated assignments by manual curation. Based on these results, we propose a new standard for large-scale annotation, in which the initial automated annotations are manually investigated and then computational methods are iteratively modified and improved based on the results of manual curation.
Affiliation(s)
- Takeya Kasukawa
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1209
|
Kawasawa Y, McKenzie LM, Hill DP, Bono H, Yanagisawa M. G protein-coupled receptor genes in the FANTOM2 database. Genome Res 2003; 13:1466-77. [PMID: 12819145 PMCID: PMC403690 DOI: 10.1101/gr.1087603] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022]
Abstract
G protein-coupled receptors (GPCRs) comprise the largest family of receptor proteins in mammals and play important roles in many physiological and pathological processes. Gene expression of GPCRs is temporally and spatially regulated, and many splicing variants are also described. In many instances, different expression profiles of GPCR gene are accountable for the changes of its biological function. Therefore, it is intriguing to assess the complexity of the transcriptome of GPCRs in various mammalian organs. In this study, we took advantage of the FANTOM2 (Functional Annotation Meeting of Mouse cDNA 2) project, which aimed to collect full-length cDNAs inclusively from mouse tissues, and found 410 candidate GPCR cDNAs. Clustering of these clones into transcriptional units (TUs) reduced this number to 213. Out of these, 165 genes were represented within the known 308 GPCRs in the Mouse Genome Informatics (MGI) resource. The remaining 48 genes were new to mouse, and 14 of them had no clear mammalian ortholog. To dissect the detailed characteristics of each transcript, tissue distribution pattern and alternative splicing were also ascertained. We found many splicing variants of GPCRs that may have a relevance to disease occurrence. In addition, the difficulty in cloning tissue-specific and infrequently transcribed GPCRs is discussed further.
MESH Headings
- Alternative Splicing/genetics
- Animals
- DNA, Complementary/genetics
- Databases, Genetic/statistics & numerical data
- GTP-Binding Proteins/classification
- GTP-Binding Proteins/genetics
- Humans
- Membrane Proteins/classification
- Membrane Proteins/genetics
- Mice
- Nerve Tissue Proteins
- Organ Specificity/genetics
- Proteome/genetics
- Receptor, Anaphylatoxin C5a
- Receptors, Cell Surface/classification
- Receptors, Cell Surface/genetics
- Receptors, Chemokine/classification
- Receptors, Chemokine/genetics
- Receptors, Cytokine/classification
- Receptors, Cytokine/genetics
- Receptors, G-Protein-Coupled
- Receptors, Galanin
- Receptors, Lysophospholipid
- Receptors, Neuropeptide/classification
- Receptors, Neuropeptide/genetics
- Receptors, Odorant/classification
- Receptors, Odorant/genetics
- Receptors, Purinergic/classification
- Receptors, Purinergic/genetics
- Receptors, Purinergic P2/genetics
- Signal Transduction/genetics
- Transcription, Genetic/genetics
Affiliation(s)
- Yuka Kawasawa
- Howard Hughes Medical Institute, Department of Molecular Genetics, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas 75390-9050, USA.
|
1210
|
Numata K, Kanai A, Saito R, Kondo S, Adachi J, Wilming LG, Hume DA, Hayashizaki Y, Tomita M. Identification of putative noncoding RNAs among the RIKEN mouse full-length cDNA collection. Genome Res 2003; 13:1301-6. [PMID: 12819127 PMCID: PMC403720 DOI: 10.1101/gr.1011603] [Citation(s) in RCA: 115] [Impact Index Per Article: 5.2] [Indexed: 11/24/2022]
Abstract
With the sequencing and annotation of genomes and transcriptomes of several eukaryotes, the importance of noncoding RNA (ncRNA), that is, RNA molecules that are not translated to protein products, has become more evident. A subclass of ncRNA transcripts are encoded by highly regulated, multi-exon, transcriptional units, are processed like typical protein-coding mRNAs and are increasingly implicated in regulation of many cellular functions in eukaryotes. This study describes the identification of candidate functional ncRNAs from among the RIKEN mouse full-length cDNA collection, which contains 60,770 sequences, by using a systematic computational filtering approach. We initially searched for previously reported ncRNAs and found nine murine ncRNAs and homologs of several previously described nonmouse ncRNAs. Through our computational approach to filter artifact-free clones that lack protein coding potential, we extracted 4280 transcripts as the largest-candidate set. Many clones in the set had EST hits, potential CpG islands surrounding the transcription start sites, and homologies with the human genome. This implies that many candidates are indeed transcribed in a regulated manner. Our results demonstrate that ncRNAs are a major functional subclass of processed transcripts in mammals.
Affiliation(s)
- Koji Numata
- Graduate School of Media and Governance, Bioinformatics Program, Keio University, Fujisawa, Kanagawa 252-8520, Japan
|
1211
|
Holmes R, Williamson C, Peters J, Denny P, Wells C. A comprehensive transcript map of the mouse Gnas imprinted complex. Genome Res 2003; 13:1410-5. [PMID: 12819140 PMCID: PMC403675 DOI: 10.1101/gr.955503] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.2] [Indexed: 11/24/2022]
Abstract
The recent publication of the FANTOM mouse transcriptome has provided a unique opportunity to study the diversity of transcripts arising from a single gene locus. We have focused on the Gnas complex, as imprinting loci themselves provide unique insights into transcriptional regulation. Thirteen full-length cDNAs from the FANTOM2 set were mapped to the Gnas locus. These represented one previously described transcript and 12 putative new transcripts. Of these, eight were found to be differentially expressed from either the maternal or paternal allele. Two clones extended Nespas in the 3' direction, providing evidence of antisense transcription spanning a 30-kb genomic region from a single allele. The transcripts were summarized into six transcriptional units, Nespas, Nesp, Gnasxl, F7, exon 1A, and Gnas. The resolution of the Gnas transcript map by the FANTOM2 clones revealed a pattern of alternate splicing. In addition to the transcripts described previously as splicing onto exon 2 of Gnas, each new sense transcript had an alternate short 3'UTR independent of Gnas. Both spliced and unspliced variants of the new imprinted sense transcripts were found. Whereas the functional significance of these alternate transcripts is not known, the availability of the FANTOM clones has provided remarkable insights into the repertoire of transcripts in the Gnas complex locus.
Affiliation(s)
- Rebecca Holmes
- The Mammalian Genetics Unit, Medical Research Council, Harwell OX11 0RD, Oxfordshire, UK
|
1212
|
Abstract
We propose herein a new method of DNA distribution, whereby DNA clones or PCR products are printed directly onto the pages of books and delivered to users along with relevant scientific information. DNA sheets, comprising water-soluble paper onto which DNA is spotted, can be bound into books. Readers can easily extract the DNA fragments from DNA sheets and amplify them using PCR. We show that DNA sheets can withstand various conditions that may be experienced during bookbinding and delivery, such as high temperatures and humidity. Almost all genes (95%-100% of randomly selected RIKEN mouse cDNA clones) were recovered successfully by use of PCR. Readers can start their experiments after a 2-h PCR amplification without waiting for the delivery of DNA clones. The DNA Book thus provides a novel method for delivering DNA in a timely and cost-effective manner. A sample DNA sheet (carrying RIKEN mouse cDNA clones encoding genes of enzymes for the TCA cycle) is included in this issue for field-testing. We would greatly appreciate it if readers could attempt to extract DNA and report the results and whether the DNA sheet was shipped to readers in good condition.
Affiliation(s)
- Jun Kawai
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1213
|
Miki H, Setou M, Hirokawa N. Kinesin superfamily proteins (KIFs) in the mouse transcriptome. Genome Res 2003; 13:1455-65. [PMID: 12819144 PMCID: PMC403687 DOI: 10.1101/gr.984503] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]
Abstract
In the post genomic era where virtually all the genes and the proteins are known, an important task is to provide a comprehensive analysis of the expression of important classes of genes, such as those that are required for intracellular transport. We report the comprehensive analysis of the Kinesin Superfamily, which is the first and only large protein family whose constituents have been completely identified and confirmed in silico and at the cDNA, mRNA level. In FANTOM2, we have found 90 clones from 33 Kinesin Superfamily Protein (KIF) gene loci. The clones were analyzed in reference to sequence state, library of origin, detection methods, and alternative splicing. More than half of the representative transcriptional units (TU) were full length. The FANTOM2 library also contains novel splice variants previously unreported. We have compared and evaluated various protein classification tools and protein search methods using this data set. This report provides a foundation for future research of the intracellular transport along microtubules and proves the significance of intracellular transport protein transcripts as part of the transcriptome.
Affiliation(s)
- Harukata Miki
- Department of Cell Biology and Anatomy, Graduate School of Medicine, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
|
1214
|
Bono H, Yagi K, Kasukawa T, Nikaido I, Tominaga N, Miki R, Mizuno Y, Tomaru Y, Goto H, Nitanda H, Shimizu D, Makino H, Morita T, Fujiyama J, Sakai T, Shimoji T, Hume DA, Hayashizaki Y, Okazaki Y. Systematic expression profiling of the mouse transcriptome using RIKEN cDNA microarrays. Genome Res 2003; 13:1318-23. [PMID: 12819129 PMCID: PMC403653 DOI: 10.1101/gr.1075103] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.9] [Indexed: 11/24/2022]
Abstract
The number of known mRNA transcripts in the mouse has been greatly expanded by the RIKEN Mouse Gene Encyclopedia project. Validation of their reproducible expression in a tissue is an important contribution to the study of functional genomics. In this report, we determine the expression profile of 57,931 clones on 20 mouse tissues using cDNA microarrays. Of these 57,931 clones, 22,928 clones correspond to the FANTOM2 clone set. The set represents 20,234 transcriptional units (TUs) out of 33,409 TUs in the FANTOM2 set. We identified 7206 separate clones that satisfied stringent criteria for tissue-specific expression. Gene Ontology terms were assigned for these 7206 clones, and the proportion of 'molecular function' ontology for each tissue-specific clone was examined. These data will provide insights into the function of each tissue. Tissue-specific gene expression profiles obtained using our cDNA microarrays were also compared with the data extracted from the GNF Expression Atlas based on Affymetrix microarrays. One major outcome of the RIKEN transcriptome analysis is the identification of numerous nonprotein-coding mRNAs. The expression profile was also used to obtain evidence of expression for putative noncoding RNAs. In addition, 1926 clones (70%) of 2768 clones that were categorized as "unknown EST," and 1969 (58%) clones of 3388 clones that were categorized as "unclassifiable" were also shown to be reproducibly expressed.
Affiliation(s)
- Hidemasa Bono
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1215
|
Nagashima T, Silva DG, Petrovsky N, Socha LA, Suzuki H, Saito R, Kasukawa T, Kurochkin IV, Konagaya A, Schönbach C. Inferring higher functional information for RIKEN mouse full-length cDNA clones with FACTS. Genome Res 2003; 13:1520-33. [PMID: 12819151 PMCID: PMC403704 DOI: 10.1101/gr.1019903] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 11/26/2002] [Accepted: 03/04/2003] [Indexed: 01/22/2023]
Abstract
FACTS (Functional Association/Annotation of cDNA Clones from Text/Sequence Sources) is a semiautomated knowledge discovery and annotation system that integrates molecular function information derived from sequence analysis results (sequence inferred) with functional information extracted from text. Text-inferred information was extracted from keyword-based retrievals of MEDLINE abstracts and by matching of gene or protein names to OMIM, BIND, and DIP database entries. Using FACTS, we found that 47.5% of the 60,770 RIKEN mouse cDNA FANTOM2 clone annotations were informative for text searches. MEDLINE queries yielded molecular interaction-containing sentences for 23.1% of the clones. When disease MeSH and GO terms were matched with retrieved abstracts, 22.7% of clones were associated with potential diseases, and 32.5% with GO identifiers. A significant number (23.5%) of disease MeSH-associated clones were also found to have a hereditary disease association (OMIM Morbidmap). Inferred neoplastic and nervous system disease represented 49.6% and 36.0% of disease MeSH-associated clones, respectively. A comparison of sequence-based GO assignments with informative text-based GO assignments revealed that for 78.2% of clones, identical GO assignments were provided for that clone by either method, whereas for 21.8% of clones, the assignments differed. In contrast, for OMIM assignments, only 28.5% of clones had identical sequence-based and text-based OMIM assignments. Sequence, sentence, and term-based functional associations are included in the FACTS database (http://facts.gsc.riken.go.jp/), which permits results to be annotated and explored through web-accessible keyword and sequence search interfaces. The FACTS database will be a critical tool for investigating the functional complexity of the mouse transcriptome, cDNA-inferred interactome (molecular interactions), and pathome (pathologies).
Affiliation(s)
- Takeshi Nagashima
- Biomedical Knowledge Discovery Team, Bioinformatics Group, RIKEN Genomic Sciences Center (GSC), Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
|
1216
|
Lenhard B, Wahlestedt C, Wasserman WW. GeneLynx mouse: integrated portal to the mouse genome. Genome Res 2003; 13:1501-4. [PMID: 12819149 PMCID: PMC403699 DOI: 10.1101/gr.951403] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
Abstract
GeneLynx Mouse is a meta-database providing an extensive collection of hyperlinks to mouse gene-specific information in diverse databases available via the Internet. The GeneLynx project is based on the simple notion that given any gene-specific identifier (e.g., accession number, gene name, text, or sequence), scientists should be able to access a single location that provides a set of links to all the publicly available information pertinent to the specified gene. The recent climax in the mouse genome and RIKEN cDNA sequencing projects provided the data necessary for the development of a gene-centric mouse information portal based on the GeneLynx ideals. Clusters of RIKEN cDNA sequences were used to define the initial set of mouse genes. Like its human counterpart, GeneLynx Mouse is designed as an extensible relational database with an intuitive and user-friendly Web interface. Data is automatically extracted from diverse resources, using appropriate approaches to maximize the coverage. To promote cross-database interoperability, an indexing utility is provided to facilitate the establishment of hyperlinks in external databases. As a result of the integration of the human and mouse systems, GeneLynx now serves as a powerful comparative genomics data mining resource. GeneLynx Mouse can be freely accessed at http://mouse.genelynx.org.
Affiliation(s)
- Boris Lenhard
- Center for Genomics and Bioinformatics, Karolinska Institutet, 17177 Stockholm, Sweden.
|
1217
|
Schriml LM, Hill DP, Blake JA, Bono H, Wynshaw-Boris A, Pavan WJ, Ring BZ, Beisel K, Setou M, Okazaki Y. Human disease genes and their cloned mouse orthologs: exploration of the FANTOM2 cDNA sequence data set. Genome Res 2003; 13:1496-500. [PMID: 12819148 PMCID: PMC403698 DOI: 10.1101/gr.979503] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
The FANTOM2 cDNA sequence data set is an excellent model to demonstrate the power of large-scale cDNA sequencing, with the goal of providing a full-length transcript sequence for each mouse gene. This data set enhances the use of the mouse as a model for human disease. Here we identify mouse cDNA sequences in the FANTOM2 data set for a set of 67 human disease genes that as of May 2002 had no corresponding mouse cDNA annotated in the Mouse Genome Informatics (MGI) database. These 67 human disease genes include genes related to neurological and eye disorders and cancer. We also present a list of the human disease genes and their cloned mouse orthologs found in two public databases, LocusLink and MGI. Allelic variant and gene functional information available in MGI provides additional information relative to these mouse models, whereas computed sequence-based connections at NCBI support facile navigation through multiple genomes.
Affiliation(s)
- Lynn M Schriml
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
1218
|
Furuno M, Kasukawa T, Saito R, Adachi J, Suzuki H, Baldarelli R, Hayashizaki Y, Okazaki Y. CDS annotation in full-length cDNA sequence. Genome Res 2003; 13:1478-87. [PMID: 12819146 PMCID: PMC403693 DOI: 10.1101/gr.1060303] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.6] [Indexed: 11/24/2022]
Abstract
The identification of coding sequences (CDS) is an important step in the functional annotation of genes. CDS prediction for mammalian genes from genomic sequence is complicated by the vast abundance of intergenic sequence in the genome, and provides little information about how different parts of potential CDS regions are expressed. In contrast, mammalian gene CDS prediction from cDNA sequence offers obvious advantages, yet encounters a different set of complexities when performed on high-throughput cDNA (HTC) sequences, such as the set of 60,770 cDNAs isolated from full-length enriched libraries of the FANTOM2 project. We developed a CDS annotation strategy that uses a variety of different CDS prediction programs to annotate the CDS regions of FANTOM2 cDNAs. These include rsCDS, which uses sequence similarity to known proteins; ProCrest; Longest-ORF and Truncated-ORF, which are ab initio based predictors; and finally, DECODER and NCBI CDS predictor, which use a combination of both principles. Aided by graphical displays of these CDS prediction results in the context of other sequence similarity results for each cDNA, FANTOM2 CDS inspection by curators and follow-up quality control procedures resulted in high quality CDS predictions for a total of 14,345 FANTOM2 clones.
Affiliation(s)
- Masaaki Furuno
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1219
|
Brusic V, Pillai RS, Silva DG, Petrovsky N, Schönbach C. Cytokine-related genes identified from the RIKEN full-length mouse cDNA data set. Genome Res 2003; 13:1307-17. [PMID: 12819128 PMCID: PMC403723 DOI: 10.1101/gr.1016503] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
To identify novel cytokine-related genes, we searched the set of 60,770 annotated RIKEN mouse cDNA clones (FANTOM2 clones), using keywords such as cytokine itself or cytokine names (such as interferon, interleukin, epidermal growth factor, fibroblast growth factor, and transforming growth factor). This search produced 108 known cytokines and cytokine-related products such as cytokine receptors, cytokine-associated genes, or their products (enhancers, accessory proteins, cytokine-induced genes). We found 15 clusters of FANTOM2 clones that are candidates for novel cytokine-related genes. These encoded products with strong sequence similarity to guanylate-binding protein (GBP-5), interleukin-1 receptor-associated kinase 2 (IRAK-2), interleukin 20 receptor alpha isoform 3, a member of the interferon-inducible proteins of the Ifi 200 cluster, four members of the membrane-associated family 1-8 of interferon-inducible proteins, one p27-like protein, and a hypothetical protein containing a Toll/Interleukin receptor domain. All four clones representing novel candidates of gene products from the family contain a novel highly conserved cross-species domain. Clones similar to growth factor-related products included transforming growth factor beta-inducible early growth response protein 2 (TIEG-2), TGFbeta-induced factor 2, integrin beta-like 1, latent TGF-binding protein 4S, and FGF receptor 4B. We performed a detailed sequence analysis of the candidate novel genes to elucidate their likely functional properties.
|
1220
|
Gustincich S, Batalov S, Beisel KW, Bono H, Carninci P, Fletcher CF, Grimmond S, Hirokawa N, Jarvis ED, Jegla T, Kawasawa Y, LeMieux J, Miki H, Raviola E, Teasdale RD, Tominaga N, Yagi K, Zimmer A, Hayashizaki Y, Okazaki Y. Analysis of the mouse transcriptome for genes involved in the function of the nervous system. Genome Res 2003; 13:1395-401. [PMID: 12819138 PMCID: PMC403671 DOI: 10.1101/gr.1135303] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
We analyzed the mouse Representative Transcript and Protein Set for molecules involved in brain function. We found full-length cDNAs of many known brain genes and discovered new members of known brain gene families, including Family 3 G-protein coupled receptors, voltage-gated channels, and connexins. We also identified previously unknown candidates for secreted neuroactive molecules. The existence of a large number of unique brain ESTs suggests an additional molecular complexity that remains to be explored. A list of genes containing CAG stretches in the coding region represents a first step in the potential identification of candidates for hereditary neurological disorders.
Affiliation(s)
- Stefano Gustincich
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts 02115, USA.
|
1221
|
Bono H, Nikaido I, Kasukawa T, Hayashizaki Y, Okazaki Y. Comprehensive analysis of the mouse metabolome based on the transcriptome. Genome Res 2003; 13:1345-9. [PMID: 12819132 PMCID: PMC403659 DOI: 10.1101/gr.974603] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
The complete set of cDNAs encoding the enzymes of known metabolic pathways has not previously been available for any mammal. Here, transcripts encoding the metabolic pathways of the mouse (mouse metabolome) were reconstructed by making use of the KEGG metabolic pathway database and gene ontology (GO) assignment to the mouse representative transcript and protein set (RTPS), which contains all available mouse transcript sequences including the FANTOM set of RIKEN mouse cDNA clones. By assigning EC numbers extracted from the molecular function ontology in GO, the known mouse transcriptome was predicted to encode enzymes with 726 unique EC numbers. Of these, 648 EC numbers were newly assigned based on the FANTOM set. The mouse metabolome confirmed by cDNA analysis includes almost all of the enzymes of well known pathways such as the tricarboxylic acid cycle and urea cycle. On the other hand, analysis of enzymes required for the tryptophan metabolism pathway revealed a lack of connectivity, indicating that cDNAs/genes encoding several key enzymes remain to be identified. The information derived from coexpression from the cDNA microarray analysis of enzymes of known function may lead to identification of the missing components of the metabolome, and will add new insights into the connectivity of the mammalian metabolic pathways.
Affiliation(s)
- Hidemasa Bono
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1222
|
Batalov S, Su AI, Ching KA, Fletcher CF. Functional Annotation of RIKEN Mouse cDNA Clones Using GNF Expression Atlas. Genome Res 2003. [DOI: 10.1101/gr.1457103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
1223
|
Zavolan M, Kondo S, Schonbach C, Adachi J, Hume DA, Hayashizaki Y, Gaasterland T. Impact of alternative initiation, splicing, and termination on the diversity of the mRNA transcripts encoded by the mouse transcriptome. Genome Res 2003; 13:1290-300. [PMID: 12819126 PMCID: PMC403716 DOI: 10.1101/gr.1017303] [Citation(s) in RCA: 149] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2002] [Accepted: 02/25/2003] [Indexed: 11/25/2022]
Abstract
We analyzed the FANTOM2 clone set of 60,770 RIKEN full-length mouse cDNA sequences and 44,122 public mRNA sequences. We developed a new computational procedure to identify and classify the forms of splice variation evident in this data set and organized the results into a publicly accessible database that can be used for future expression array construction, structural genomics, and analyses of the mechanism and regulation of alternative splicing. Statistical analysis shows that at least 41% and possibly as much as 60% of multiexon genes in mouse have multiple splice forms. Of the transcription units with multiple splice forms, 49% contain transcripts in which the apparent use of an alternative transcription start (stop) is accompanied by alternative splicing of the initial (terminal) exon. This implies that alternative transcription may frequently induce alternative splicing. The fact that 73% of all exons with splice variation fall within the annotated coding region indicates that most splice variation is likely to affect the protein form. Finally, we compared the set of constitutive (present in all transcripts) exons with the set of cryptic (present only in some transcripts) exons and found statistically significant differences in their length distributions, the nucleotide distributions around their splice junctions, and the frequencies of occurrence of several short sequence motifs.
Affiliation(s)
- Mihaela Zavolan
- Laboratory of Computational Genomics, The Rockefeller University, New York, New York 10021-6399, USA.
|
1224
|
Tajul-Arifin K, Teasdale R, Ravasi T, Hume DA, Mattick JS. Identification and analysis of chromodomain-containing proteins encoded in the mouse transcriptome. Genome Res 2003; 13:1416-29. [PMID: 12819141 PMCID: PMC403676 DOI: 10.1101/gr.1015703] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
Abstract
The chromodomain is 40-50 amino acids in length and is conserved in a wide range of chromatic and regulatory proteins involved in chromatin remodeling. Chromodomain-containing proteins can be classified into families based on their broader characteristics, in particular the presence of other types of domains, and which correlate with different subclasses of the chromodomains themselves. Hidden Markov model (HMM)-generated profiles of different subclasses of chromodomains were used here to identify sequences encoding chromodomain-containing proteins in the mouse transcriptome and genome. A total of 36 different loci encoding proteins containing chromodomains, including 17 novel loci, were identified. Six of these loci (including three apparent pseudogenes, a novel HP1 ortholog, and two novel Msl-3 transcription factor-like proteins) are not present in the human genome, whereas the human genome contains four loci (two CDY orthologs and two apparent CDY pseudogenes) that are not present in mouse. A number of these loci exhibit alternative splicing to produce different isoforms, including 43 novel variants, some of which lack the chromodomain. The likely functions of these proteins are discussed in relation to the known functions of other chromodomain-containing proteins within the same family.
Affiliation(s)
- Khairina Tajul-Arifin
- ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St. Lucia, Queensland 4072, Australia
|
1225
|
Wada S, Tokuoka M, Shoguchi E, Kobayashi K, Di Gregorio A, Spagnuolo A, Branno M, Kohara Y, Rokhsar D, Levine M, Saiga H, Satoh N, Satou Y. A genomewide survey of developmentally relevant genes in Ciona intestinalis. II. Genes for homeobox transcription factors. Dev Genes Evol 2003; 213:222-34. [PMID: 12736825 DOI: 10.1007/s00427-003-0321-0] [Citation(s) in RCA: 115] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2002] [Accepted: 03/11/2003] [Indexed: 11/25/2022]
Abstract
Homeobox-containing genes play crucial roles in various developmental processes, including body-plan specification, pattern formation and cell-type specification. The present study searched the draft genome sequence and cDNA/EST database of the basal chordate Ciona intestinalis to identify 83 homeobox-containing genes in this animal. This number of homeobox genes in the Ciona genome is smaller than that in the Caenorhabditis elegans, Drosophila melanogaster, human and mouse genomes. Of the 83 genes, 76 have possible human orthologues and 7 may be unique to Ciona. The ascidian homeobox genes were classified into 11 classes, including Hox class, NK class, Paired class, POU class, LIM class, TALE class, SIX class, Prox class, Cut class, ZFH class and HNF1 class, according to the classification scheme devised for known homeobox genes. As to the Hox cluster, the Ciona genome contains single copies of each of the paralogous groups, suggesting that there is a single Hox cluster, if any, but genes orthologous to Hox7, 8, 9 and 11 were not found in the genome. In addition, loss of genes had occurred independently in the Ciona lineage and was noticed in Gbx of the EHGbox subclass, Sax, NK3, Vax and vent of the NK class, Cart, Og9, Anf and Mix of the Paired class, POU-I, III, V and VI of the POU class, Lhx6/7 of the LIM class, TGIF of the TALE class, Cux and SATB of the Cut class, and ZFH1 of the ZFH class, which might have reduced the number of Ciona homeobox genes. Interestingly, one of the newly identified Ciona intestinalis genes and its vertebrate counterparts constitute a novel subclass of HNF1 class homeobox genes. Furthermore, evidence for the gene structures and expression of 54 of the 83 homeobox genes was provided by analysis of ESTs, suggesting that cDNAs for these 54 genes are available. 
The present data thus reveal the repertoire of homeodomain-containing transcription factors in the Ciona genome, which will be useful for future research on the development and evolution of chordates.
Affiliation(s)
- Shuichi Wada
- Department of Zoology, Graduate School of Science, Kyoto University, Sakyo-ku, Kyoto 606-8502, Japan
|
1226
|
Kanapin A, Batalov S, Davis MJ, Gough J, Grimmond S, Kawaji H, Magrane M, Matsuda H, Schönbach C, Teasdale RD, Yuan Z. Mouse proteome analysis. Genome Res 2003; 13:1335-44. [PMID: 12819131 PMCID: PMC403658 DOI: 10.1101/gr.978703] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2002] [Accepted: 03/05/2003] [Indexed: 11/25/2022]
Abstract
A general overview of the protein sequence set for the mouse transcriptome produced during the FANTOM2 sequencing project is presented here. We applied different algorithms to characterize protein sequences derived from a nonredundant representative protein set (RPS) and a variant protein set (VPS) of the mouse transcriptome. The functional characterization and assignment of Gene Ontology terms was done by analysis of the proteome using InterPro. The Superfamily database analyses gave a detailed structural classification according to SCOP and provide additional evidence for the functional characterization of the proteome data. The MDS database analysis revealed new domains which are not presented in existing protein domain databases. Thus the transcriptome gives us a unique source of data for the detection of new functional groups. The data obtained for the RPS and VPS sets facilitated the comparison of different patterns of protein expression. A comparison of other existing mouse and human protein sequence sets (e.g., the International Protein Index) demonstrates the common patterns in mammalian proteomes. The analysis of the membrane organization within the transcriptome of multiple eukaryotes provides valuable statistics about the distribution of secretory and transmembrane proteins.
Affiliation(s)
- Alexander Kanapin
- EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK.
|
1227
|
Nikaido I, Saito C, Mizuno Y, Meguro M, Bono H, Kadomura M, Kono T, Morris GA, Lyons PA, Oshimura M, Hayashizaki Y, Okazaki Y. Discovery of imprinted transcripts in the mouse transcriptome using large-scale expression profiling. Genome Res 2003; 13:1402-9. [PMID: 12819139 PMCID: PMC403673 DOI: 10.1101/gr.1055303] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Candidate imprinted transcriptional units in the mouse genome were identified systematically from 27,663 FANTOM2 full-length mouse cDNA clones by expression profiling. Large-scale cDNA microarrays were used to detect differential expression dependent upon chromosomal parent of origin by comparing the mRNA levels in the total tissue of 9.5 dpc parthenogenote and androgenote mouse embryos. Of the FANTOM2 transcripts, 2114 were identified as candidates on the basis of the array data. Of these, 39 mapped to known imprinted regions of the mouse genome, 56 were considered as nonprotein-coding RNAs, and 159 were natural antisense transcripts. The imprinted expression of two transcripts located in the mouse chromosomal region syntenic to the human Prader-Willi syndrome region was confirmed experimentally. We further mapped all candidate imprinted transcripts to the mouse and human genomes and showed their correlation with the imprinting disease loci. These data provide a major resource for understanding the role of imprinting in mammalian inherited traits.
Affiliation(s)
- Itoshi Nikaido
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1228
|
Exon Structure Analysis, Ortholog Identification, and SNP Candidate Screening by Mapping Mouse RIKEN Sequences to Multiple Genome Assemblies. Genome Res 2003. [DOI: 10.1101/gr.1458903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
1229
|
Ravasi T, Huber T, Zavolan M, Forrest A, Gaasterland T, Grimmond S, Hume DA. Systematic characterization of the zinc-finger-containing proteins in the mouse transcriptome. Genome Res 2003; 13:1430-42. [PMID: 12819142 PMCID: PMC403681 DOI: 10.1101/gr.949803] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2002] [Accepted: 02/19/2003] [Indexed: 12/20/2022]
Abstract
Zinc-finger-containing proteins can be classified into evolutionary and functionally divergent protein families that share one or more domains in which a zinc ion is tetrahedrally coordinated by cysteines and histidines. The zinc finger domain defines one of the largest protein superfamilies in mammalian genomes; 46 different conserved zinc finger domains are listed in InterPro (http://www.ebi.ac.uk/InterPro). Zinc finger proteins can bind to DNA, RNA, other proteins, or lipids as a modular domain in combination with other conserved structures. Owing to this combinatorial diversity, different members of zinc finger superfamilies contribute to many distinct cellular processes, including transcriptional regulation, mRNA stability and processing, and protein turnover. Accordingly, mutations of zinc finger genes lead to aberrations in a broad spectrum of biological processes such as development, differentiation, apoptosis, and immunological responses. This study provides the first comprehensive classification of zinc finger proteins in a mammalian transcriptome. Specific detailed analysis of the SP/Krüppel-like factors and the E3 ubiquitin-ligase RING-H2 families illustrates the importance of such an analysis for a more comprehensive functional classification of large protein families. We describe the characterization of a new family of C2H2 zinc-finger-containing proteins and a new conserved domain characteristic of this family, the identification and characterization of Sp8, a new member of the Sp family of transcriptional regulators, and the identification of five new RING-H2 proteins.
Affiliation(s)
- Timothy Ravasi
- Institute for Molecular Bioscience, Brisbane, Australia.
|
1230
|
Forrest ARR, Ravasi T, Taylor D, Huber T, Hume DA, Grimmond S. Phosphoregulators: protein kinases and protein phosphatases of mouse. Genome Res 2003; 13:1443-54. [PMID: 12819143 PMCID: PMC403684 DOI: 10.1101/gr.954803] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2002] [Accepted: 02/19/2003] [Indexed: 11/24/2022]
Abstract
With the completion of the human and mouse genome sequences, the task now turns to identifying their encoded transcripts and assigning gene function. In this study, we have undertaken a computational approach to identify and classify all of the protein kinases and phosphatases present in the mouse gene complement. A nonredundant set of these sequences was produced by mining Ensembl gene predictions and publicly available cDNA sequences with a panel of InterPro domains. This approach identified 561 candidate protein kinases and 162 candidate protein phosphatases. This cohort was then analyzed using TribeMCL protein sequence similarity clustering followed by CLUSTALV alignment and hierarchical tree generation. This approach allowed us to (1) distinguish between true members of the protein kinase and phosphatase families and enzymes of related biochemistry, (2) determine the structure of the families, and (3) suggest functions for previously uncharacterized members. The classifications obtained by this approach were in good agreement with previous schemes and allowed us to demonstrate domain associations with a number of clusters. Finally, we comment on the complementary nature of cDNA and genome-based gene detection and the impact of the FANTOM2 transcriptome project.
|
1231
|
Suzuki H, Saito R, Kanamori M, Kai C, Schönbach C, Nagashima T, Hosaka J, Hayashizaki Y. The mammalian protein-protein interaction database and its viewing system that is linked to the main FANTOM2 viewer. Genome Res 2003; 13:1534-41. [PMID: 12819152 PMCID: PMC403706 DOI: 10.1101/gr.956303] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Here, we describe the development of a mammalian protein-protein interaction (PPI) database and of a PPI Viewer application to display protein interaction networks (http://fantom21.gsc.riken.go.jp/PPI/). In the database, we stored the mammalian PPIs identified through our PPI assays (internal PPIs), as well as those we extracted and processed (external PPIs) from publicly available data sources, the DIP and BIND databases and MEDLINE abstracts by using FACTS, a new functional inference and curation system. We integrated the internal and external PPIs into the PPI database, which is linked to the main FANTOM2 viewer. In addition, we incorporated into the PPI Viewer information regarding the luciferase reporter activity of internal PPIs and the data confidence of external PPIs; these data enable visualization and evaluation of the reliability of each interaction. Using the described system, we successfully identified several interactions of biological significance. Therefore, the PPI Viewer is a useful tool for exploring FANTOM2 clone-related protein interactions and their potential effects on signaling and cellular communication.
Affiliation(s)
- Harukazu Suzuki
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1232
|
Levanon EY, Sorek R. The importance of alternative splicing in the drug discovery process. ACTA ACUST UNITED AC 2003. [DOI: 10.1016/s1477-3627(03)02322-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
1233
|
Carninci P, Waki K, Shiraki T, Konno H, Shibata K, Itoh M, Aizawa K, Arakawa T, Ishii Y, Sasaki D, Bono H, Kondo S, Sugahara Y, Saito R, Osato N, Fukuda S, Sato K, Watahiki A, Hirozane-Kishikawa T, Nakamura M, Shibata Y, Yasunishi A, Kikuchi N, Yoshiki A, Kusakabe M, Gustincich S, Beisel K, Pavan W, Aidinis V, Nakagawara A, Held WA, Iwata H, Kono T, Nakauchi H, Lyons P, Wells C, Hume DA, Fagiolini M, Hensch TK, Brinkmeier M, Camper S, Hirota J, Mombaerts P, Muramatsu M, Okazaki Y, Kawai J, Hayashizaki Y. Targeting a complex transcriptome: the construction of the mouse full-length cDNA encyclopedia. Genome Res 2003; 13:1273-89. [PMID: 12819125 PMCID: PMC403712 DOI: 10.1101/gr.1119703] [Citation(s) in RCA: 137] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3'-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5' end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5'-end clusters identify regions that are potential promoters for 8637 known genes and 5'-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.
Affiliation(s)
- Piero Carninci
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
1234
|
Grimmond SM, Miranda KC, Yuan Z, Davis MJ, Hume DA, Yagi K, Tominaga N, Bono H, Hayashizaki Y, Okazaki Y, Teasdale RD. The mouse secretome: functional classification of the proteins secreted into the extracellular environment. Genome Res 2003; 13:1350-9. [PMID: 12819133 PMCID: PMC403661 DOI: 10.1101/gr.983703] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2002] [Accepted: 04/17/2003] [Indexed: 11/25/2022]
Abstract
We have developed a computational strategy to identify the set of soluble proteins secreted into the extracellular environment of a cell. Within the protein sequences predominantly derived from the RIKEN representative transcript and protein set, we identified 2033 unique soluble proteins that are potentially secreted from the cell. These proteins contain a signal peptide required for entry into the secretory pathway and lack any transmembrane domains or intracellular localization signals. This class of proteins, which we have termed the mouse secretome, included >500 novel proteins and 92 proteins <100 amino acids in length. Functional analysis of the secretome included identification of human orthologs, functional units based on InterPro and SCOP Superfamily predictions, and expression of the proteins within the RIKEN READ microarray database. To highlight the utility of this information, we discuss the CUB domain-containing protein family.
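The selection rule stated in this abstract (a signal peptide is present, and there are no transmembrane domains or intracellular localization signals) can be read as a simple filter over per-protein annotations. The sketch below is illustrative only: the `ProteinAnnotation` record, its field names, and the three example proteins are hypothetical, and in practice each flag would come from a separate prediction tool that the abstract does not name.

```python
from dataclasses import dataclass


@dataclass
class ProteinAnnotation:
    # Hypothetical per-protein summary of upstream predictions.
    name: str
    length: int                      # amino acids
    has_signal_peptide: bool
    transmembrane_domains: int
    has_intracellular_signal: bool   # e.g. a retention or targeting motif


def is_candidate_secreted(p):
    """Apply the three criteria described in the abstract."""
    return (
        p.has_signal_peptide
        and p.transmembrane_domains == 0
        and not p.has_intracellular_signal
    )


# Made-up records for illustration; not taken from the RIKEN set.
proteins = [
    ProteinAnnotation("protA", 85, True, 0, False),
    ProteinAnnotation("protB", 420, True, 2, False),
    ProteinAnnotation("protC", 310, True, 0, True),
]

secretome = [p.name for p in proteins if is_candidate_secreted(p)]
short_secreted = [
    p.name for p in proteins if is_candidate_secreted(p) and p.length < 100
]
print(secretome, short_secreted)
```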
Affiliation(s)
- Sean M Grimmond
- Institute for Molecular Bioscience and ARC Special Research Centre for Functional and Applied Genomics, University of Queensland, St. Lucia 4072, Australia
|
1235
|
Wells CA, Ravasi T, Sultana R, Yagi K, Carninci P, Bono H, Faulkner G, Okazaki Y, Quackenbush J, Hume DA, Lyons PA. Continued discovery of transcriptional units expressed in cells of the mouse mononuclear phagocyte lineage. Genome Res 2003; 13:1360-5. [PMID: 12819134 PMCID: PMC403663 DOI: 10.1101/gr.1056103] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2002] [Accepted: 02/25/2003] [Indexed: 11/24/2022]
Abstract
The current RIKEN transcript set represents a significant proportion of the mouse transcriptome but transcripts expressed in the innate and acquired immune systems are poorly represented. In the present study we have assessed the complexity of the transcriptome expressed in mouse macrophages before and after treatment with lipopolysaccharide, a global regulator of macrophage gene expression, using existing RIKEN 19K arrays. By comparison to array profiles of other cells and tissues, we identify a large set of macrophage-enriched genes, many of which have obvious functions in endocytosis and phagocytosis. In addition, a significant number of LPS-inducible genes were identified. The data suggest that macrophages are a complex source of mRNA for transcriptome studies. To assess complexity and identify additional macrophage expressed genes, cDNA libraries were created from purified populations of macrophage and dendritic cells, a functionally related cell type. Sequence analysis revealed a high incidence of novel mRNAs within these cDNA libraries. These studies provide insights into the depths of transcriptional complexity still untapped amongst products of inducible genes, and identify macrophage and dendritic cell populations as a starting point for sampling the inducible mammalian transcriptome.
Affiliation(s)
- Christine A Wells
- Institute for Molecular Bioscience and ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
|
1236
|
Shimizu H, Taniguchi H, Hippo Y, Hayashizaki Y, Aburatani H, Ishikawa T. Characterization of the mouse Abcc12 gene and its transcript encoding an ATP-binding cassette transporter, an orthologue of human ABCC12. Gene 2003; 310:17-28. [PMID: 12801629 DOI: 10.1016/s0378-1119(03)00504-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
We have recently reported on two novel human ABC transporters, ABCC11 and ABCC12, the genes of which are tandemly located on human chromosome 16q12.1 [Biochem. Biophys. Res. Commun. 288 (2001) 933]. The present study addresses the cloning and characterization of Abcc12, a mouse orthologue of human ABCC12. The cloned Abcc12 cDNA was 4511 bp long, comprising a 4101 bp open reading frame. The deduced peptide consists of 1367 amino acids and exhibits high sequence identity (84.5%) with human ABCC12. The mouse Abcc12 gene consists of at least 29 exons and is located on the mouse chromosome 8D3 locus where conserved linkage homologies have hitherto been identified with human chromosome 16q12.1. The mouse Abcc12 gene was expressed at high levels exclusively in the seminiferous tubules in the testis. In addition to the Abcc12 transcript, two splicing variants encoding short peptides (775 and 687 amino acid residues) were detected. In spite of the genes coding for both ABCC11 and ABCC12 being tandemly located on human chromosome 16q12.1, no putative mouse orthologous gene corresponding to the human ABCC11 was detected at the mouse chromosome 8D3 locus.
MESH Headings
- ATP-Binding Cassette Transporters/genetics
- Alternative Splicing
- Amino Acid Sequence
- Animals
- Base Sequence
- Blotting, Northern
- Chromosome Mapping
- Cloning, Molecular
- DNA, Complementary/chemistry
- DNA, Complementary/genetics
- Embryo, Mammalian/metabolism
- Female
- Gene Expression
- Gene Expression Regulation, Developmental
- Humans
- In Situ Hybridization
- Male
- Mice
- Mice, Inbred BALB C
- Molecular Sequence Data
- Phylogeny
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Sequence Alignment
- Sequence Analysis, DNA
- Sequence Homology, Amino Acid
- Testis/metabolism
- Transcription, Genetic
Affiliation(s)
- Hidetada Shimizu
- Department of Biomolecular Engineering, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Nagatsuta 4259, Midori-ku, 226-8501, Yokohama, Japan
|
1237
|
Edgar AJ. The gene structure and expression of human ABHD1: overlapping polyadenylation signal sequence with Sec12. BMC Genomics 2003; 4:18. [PMID: 12735795 PMCID: PMC156608 DOI: 10.1186/1471-2164-4-18] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Received: 12/11/2002] [Accepted: 05/07/2003] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Overlapping sense/antisense genes orientated in a tail-to-tail manner, often involving only the 3'UTRs, form the majority of gene pairs in mammalian genomes and can lead to the formation of double-stranded RNA that triggers the destruction of homologous mRNAs. Overlapping polyadenylation signal sequences have not been described previously. RESULTS: An instance of gene overlap has been found involving a shared single functional polyadenylation site. The genes involved are the human alpha/beta hydrolase domain containing gene 1 (ABHD1) and Sec12 genes. The nine-exon human ABHD1 gene is located on chromosome 2p23.3 and encodes a 405-residue protein containing a catalytic triad analogous to that present in serine proteases. The Sec12 protein promotes efficient guanine nucleotide exchange on the Sar1 GTPase in the ER. Their sequences overlap for 42 bp in the 3'UTR in an antisense manner. Analysis by 3' RACE identified a single functional polyadenylation site, ATTAAA, within the 3'UTR of ABHD1 and a single polyadenylation signal, AATAAA, within the 3'UTR of Sec12. These polyadenylation signals overlap, sharing three bp. They are also conserved in mouse and rat. ABHD1 was expressed in all tissues and cells examined, but levels of ABHD1 varied greatly, being high in skeletal muscle and testis and low in spleen and fibroblasts. CONCLUSIONS: Mammalian ABHD1 and Sec12 genes contain a conserved 42 bp overlap in their 3'UTR, and share a conserved TTTATTAAA/TTTAATAAA sequence that serves as a polyadenylation signal for both genes. No inverse correlation between the respective levels of ABHD1 and Sec12 RNA was found to indicate that any RNA interference occurred.
Affiliation(s)
- Alasdair J Edgar
- Department of Clinical and Diagnostic Oral Science, Clinical Research Centre, Queen Mary, University of London, 2 Newark Street, London, E1 2AD, UK.
|
1238
|
Affiliation(s)
- Gunter Schumann
- Molecular Genetics Laboratory, Department of Psychiatry, Central Institute of Mental Health, J5, D-68159 Mannheim, Germany.
|
1239
|
Reboul J, Vaglio P, Rual JF, Lamesch P, Martinez M, Armstrong CM, Li S, Jacotot L, Bertin N, Janky R, Moore T, Hudson JR, Hartley JL, Brasch MA, Vandenhaute J, Boulton S, Endress GA, Jenna S, Chevet E, Papasotiropoulos V, Tolias PP, Ptacek J, Snyder M, Huang R, Chance MR, Lee H, Doucette-Stamm L, Hill DE, Vidal M. C. elegans ORFeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression. Nat Genet 2003; 34:35-41. [PMID: 12679813 DOI: 10.1038/ng1140] [Citation(s) in RCA: 293] [Impact Index Per Article: 13.3] [Received: 01/03/2003] [Accepted: 03/14/2003] [Indexed: 11/08/2022]
Abstract
To verify the genome annotation and to create a resource to functionally characterize the proteome, we attempted to Gateway-clone all predicted protein-encoding open reading frames (ORFs), or the 'ORFeome,' of Caenorhabditis elegans. We successfully cloned approximately 12,000 ORFs (ORFeome 1.1), of which roughly 4,000 correspond to genes that are untouched by any cDNA or expressed-sequence tag (EST). More than 50% of predicted genes needed corrections in their intron-exon structures. Notably, approximately 11,000 C. elegans proteins can now be expressed under many conditions and characterized using various high-throughput strategies, including large-scale interactome mapping. We suggest that similar ORFeome projects will be valuable for other organisms, including humans.
Affiliation(s)
- Jérôme Reboul
- Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
|
1240
|
Lim LP, Lau NC, Weinstein EG, Abdelhakim A, Yekta S, Rhoades MW, Burge CB, Bartel DP. The microRNAs of Caenorhabditis elegans. Genes Dev 2003; 17:991-1008. [PMID: 12672692 PMCID: PMC196042 DOI: 10.1101/gad.1074403] [Citation(s) in RCA: 875] [Impact Index Per Article: 39.8] [Indexed: 12/15/2022]
Abstract
MicroRNAs (miRNAs) are an abundant class of tiny RNAs thought to regulate the expression of protein-coding genes in plants and animals. In the present study, we describe a computational procedure to identify miRNA genes conserved in more than one genome. Applying this program, known as MiRscan, together with molecular identification and validation methods, we have identified most of the miRNA genes in the nematode Caenorhabditis elegans. The total number of validated miRNA genes stands at 88, with no more than 35 genes remaining to be detected or validated. These 88 miRNA genes represent 48 gene families; 46 of these families (comprising 86 of the 88 genes) are conserved in Caenorhabditis briggsae, and 22 families are conserved in humans. More than a third of the worm miRNAs, including newly identified members of the lin-4 and let-7 gene families, are differentially expressed during larval development, suggesting a role for these miRNAs in mediating larval developmental transitions. Most are present at very high steady-state levels: more than 1000 molecules per cell, with some exceeding 50,000 molecules per cell. Our census of the worm miRNAs and their expression patterns helps define this class of noncoding RNAs, lays the groundwork for functional studies, and provides the tools for more comprehensive analyses of miRNA genes in other species.
MESH Headings
- Animals
- Base Sequence
- Blotting, Northern
- Caenorhabditis elegans/genetics
- Caenorhabditis elegans/growth & development
- Cloning, Molecular
- Computational Biology
- Conserved Sequence
- Evolution, Molecular
- Gene Expression Regulation
- Gene Expression Regulation, Developmental
- Gene Library
- Genes, Helminth
- Humans
- MicroRNAs/genetics
- Molecular Sequence Data
- Nucleic Acid Conformation
- RNA, Helminth/chemistry
- RNA, Helminth/genetics
- RNA, Untranslated/chemistry
- RNA, Untranslated/genetics
- RNA, Untranslated/metabolism
- Sequence Homology, Nucleic Acid
- Transcription Initiation Site
Affiliation(s)
- Lee P Lim
- Department of Biology, Massachusetts Institute of Technology, Cambridge 02139, USA
|
1241
|
Kaneko S. Alternative splicing of Cav2 genes and their functional significance. Nihon Yakurigaku Zasshi 2003; 121:233-40. [PMID: 12777842 DOI: 10.1254/fpj.121.233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2023]
Abstract
Alternative splicing is one of the most pharmacologically and physiologically significant mechanisms for the functional diversity of mammalian genomes. Here I review recent results on the diversity of the Ca(v)2 subclass of voltage-dependent Ca(2+) channel genes in neurons. Although the entire picture of alternative splicing is not yet understood, emerging evidence suggests that the Ca(v)2 isoforms permit optimization of Ca(2+) signaling in different regions of the brain with specific pharmacological ligands.
Affiliation(s)
- Shuji Kaneko
- Department of Pharmacology, Graduate School of Pharmaceutical Sciences, Kyoto University, Japan.
|
1243
|
Svenson KL, Bogue MA, Peters LL. Invited review: Identifying new mouse models of cardiovascular disease: a review of high-throughput screens of mutagenized and inbred strains. J Appl Physiol (1985) 2003; 94:1650-9; discussion 1673. [PMID: 12626479 DOI: 10.1152/japplphysiol.01029.2003] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.5] [Indexed: 11/22/2022] Open
Abstract
The mouse is a proven model for studying human disease. Many strains exist that exhibit either natural or engineered genetic variation and thereby enable the elucidation of pathways involved in the development of cardiovascular disease. Although those mouse models have been fundamental to advancing our knowledge base, we are still at an early stage in understanding how genes contribute to complex disorders. There remains a need for new animal models that closely represent human disease. To expedite their development, we have established the Center for New Mouse Models of Heart, Lung, Blood, and Sleep Disorders at The Jackson Laboratory. We are using a phenotype-driven approach to identify mutations leading to atherosclerosis, hypertension, obesity, blood disorders, lung dysfunction, thrombosis, and disordered sleep. Our high-throughput, comprehensive phenotyping draws from two sources for new models: 1) the natural variation among over 40 inbred mouse strains and 2) chemically induced, whole-genome mutagenized mice. Here, we review our cardiovascular screens and present some hypertensive, obese, and cardiovascular models identified with this approach.
|
1244
|
Paigen K. One hundred years of mouse genetics: an intellectual history. II. The molecular revolution (1981-2002). Genetics 2003; 163:1227-35. [PMID: 12702670 PMCID: PMC1462511 DOI: 10.1093/genetics/163.4.1227] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022] Open
Affiliation(s)
- Kenneth Paigen
- The Jackson Laboratory, Bar Harbor, Maine 04609-1517, USA
|
1245
|
Abstract
Autoimmune diseases are, in general, under complex genetic control and subject to strong interactions between genetics and the environment. Greater knowledge of the underlying genetics will provide immunologists with a framework for study of the immune dysregulation that occurs in such diseases. Ascertaining the number of genes that are involved and their characterization have, however, proven to be difficult. Improved methods of genetic analysis and the availability of a draft sequence of the complete mouse genome have markedly improved the outlook for such research, and they have emphasized the advantages of mice as a model system. In this review, we provide an overview of the genetic analysis of autoimmune diseases and of the crucial role of congenic and consomic mouse strains in such research.
Affiliation(s)
- Ute C Rogner
- Institut Pasteur, Unité Génétique Moléculaire Murine, 25 rue du Docteur Roux, 75015 Paris, France
|
1246
|
Saito R, Suzuki H, Hayashizaki Y. Global insights into protein complexes through integrated analysis of the reliable interactome and knockout lethality. Biochem Biophys Res Commun 2003; 301:633-40. [PMID: 12565826 DOI: 10.1016/s0006-291x(03)00013-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/23/2022]
Abstract
We performed an integrated computational analysis of data derived from a comprehensive set of protein-protein interactions (interactome) and a phenotype dataset on lethality in Saccharomyces cerevisiae. For the analysis, we selected reliable interactome data using our previous 'interaction generality,' a computational approach to assess reliability of interactions. Those efforts gave clear evidence that proteins with lethal phenotypes in knockout studies (lethal proteins) may interact with each other to form functional protein complexes to perform their cellular roles. However, our analysis indicates that interactions between lethal proteins are largely restricted to the same cellular pathway or function, and it is quite unlikely that they interact with other lethal proteins functioning in different cellular roles. Furthermore, our results allowed us to predict the functions of thus far uncharacterized lethal proteins with an estimated 93% accuracy. Thus, the analysis described here can provide global insights into the biological features of the protein complexes.
Affiliation(s)
- Rintaro Saito
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), Yokohama, Japan
|
1247
|
Guigo R, Dermitzakis ET, Agarwal P, Ponting CP, Parra G, Reymond A, Abril JF, Keibler E, Lyle R, Ucla C, Antonarakis SE, Brent MR. Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes. Proc Natl Acad Sci U S A 2003; 100:1140-5. [PMID: 12552088 PMCID: PMC298740 DOI: 10.1073/pnas.0337561100] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.0] [Received: 10/21/2002] [Accepted: 12/11/2002] [Indexed: 11/18/2022] Open
Abstract
A primary motivation for sequencing the mouse genome was to accelerate the discovery of mammalian genes by using sequence conservation between mouse and human to identify coding exons. Achieving this goal proved challenging because of the large proportion of the mouse and human genomes that is apparently conserved but apparently does not code for protein. We developed a two-stage procedure that exploits the mouse and human genome sequences to produce a set of genes with a much higher rate of experimental verification than previously reported prediction methods. RT-PCR amplification and direct sequencing applied to an initial sample of mouse predictions that do not overlap previously known genes verified the regions flanking one intron in 139 predictions, with verification rates reaching 76%. On average, the confirmed predictions show more restricted expression patterns than the mouse orthologs of known human genes, and two-thirds lack homologs in fish genomes, demonstrating the sensitivity of this dual-genome approach to hard-to-find genes. We verified 112 previously unknown homologs of known proteins, including two homeobox proteins relevant to developmental biology, an aquaporin, and a homolog of dystrophin. We estimate that transcription and splicing can be verified for >1,000 gene predictions identified by this method that do not overlap known genes. This is likely to constitute a significant fraction of the previously unknown, multiexon mammalian genes.
Affiliation(s)
- Roderic Guigo
- Research Group in Biomedical Informatics, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra/Centre de Regulació Genòmica, E08003 Barcelona, Catalonia, Spain
|
1248
|
Kislinger T, Rahman K, Radulovic D, Cox B, Rossant J, Emili A. PRISM, a generic large scale proteomic investigation strategy for mammals. Mol Cell Proteomics 2003; 2:96-106. [PMID: 12644571 DOI: 10.1074/mcp.m200074-mcp200] [Citation(s) in RCA: 134] [Impact Index Per Article: 6.1] [Indexed: 11/06/2022] Open
Abstract
We have developed a systematic analytical approach, termed PRISM (Proteomic Investigation Strategy for Mammals), that permits routine, large scale protein expression profiling of mammalian cells and tissues. PRISM combines subcellular fractionation, multidimensional liquid chromatography-tandem mass spectrometry-based protein shotgun sequencing, and two newly developed computer algorithms, STATQUEST and GOClust, as a means to rapidly identify, annotate, and categorize thousands of expressed mammalian proteins. The application of PRISM to adult mouse lung and liver resulted in the high confidence identification of over 2,100 unique proteins including more than 100 integral membrane proteins, 400 nuclear proteins, and 500 uncharacterized proteins, the largest proteome study carried out to date on this important model organism. Automated clustering of the identified proteins into Gene Ontology annotation groups allowed for streamlined analysis of the large data set, revealing interesting and physiologically relevant patterns of tissue and organelle specificity. PRISM therefore offers an effective platform for in-depth investigation of complex mammalian proteomes.
Affiliation(s)
- Thomas Kislinger
- Program in Proteomics and Bioinformatics, Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
|
1250
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2003. [PMCID: PMC2448450 DOI: 10.1002/cfg.228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|