1
Guerrini MM, Oguchi A, Suzuki A, Murakawa Y. Cap analysis of gene expression (CAGE) and noncoding regulatory elements. Semin Immunopathol 2021; 44:127-136. [PMID: 34468849 DOI: 10.1007/s00281-021-00886-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Accepted: 08/13/2021] [Indexed: 01/06/2023]
Abstract
Cap analysis of gene expression (CAGE) was developed to detect the 5' end of RNA. Trapping of the RNA 5'-cap structure enables the enrichment and selective sequencing of complete transcripts. Upscaled high-throughput versions of CAGE have enabled the genome-wide identification of transcription start sites, including transcriptionally active promoters and enhancers. CAGE sequencing can be exploited to draw comprehensive maps of active genomic regulatory elements in a cell type- and activation-specific manner. The cells of the immune system are among the best candidates to be analyzed in humans, since they are easily accessible. In this review, we discuss how CAGE data are instrumental for integrative analyses with quantitative trait loci and omics data, and their usefulness in the mechanistic interpretation of the effects of genetic variations over the entire human genome. Integrating CAGE data with the currently available omics information will contribute to better understanding of the genome-wide association study variants that lie outside of annotated genes, deepening our knowledge on human diseases, and enabling the targeted design of more specific therapeutic interventions.
Affiliation(s)
- Matteo Maurizio Guerrini
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan.
- Akiko Oguchi
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Yasuhiro Murakawa
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- IFOM-the FIRC Institute of Molecular Oncology, Milan, Italy
2
Sano K, Hayashi T, Suehara Y, Hosoya M, Takamochi K, Kohsaka S, Kishikawa S, Kishi M, Saito S, Takahashi F, Kaneko K, Suzuki K, Yao T, Ishijima M, Saito T. Transcription start site-level expression of thyroid transcription factor 1 isoforms in lung adenocarcinoma and its clinicopathological significance. J Pathol Clin Res 2021; 7:361-374. [PMID: 34014042 PMCID: PMC8185369 DOI: 10.1002/cjp2.213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2021] [Revised: 03/02/2021] [Accepted: 03/11/2021] [Indexed: 11/22/2022]
Abstract
There are multiple transcription start sites (TSSs) in agreement with multiple transcript variants encoding different isoforms of NKX2-1/TTF-1 (thyroid transcription factor 1); however, the clinicopathological significance of each transcript isoform of NKX2-1/TTF-1 in lung adenocarcinoma (LAD) is unknown. Herein, TSS-level expression of NKX2-1/TTF-1 isoforms was evaluated in 71 LADs using bioinformatic analysis of cap analysis of gene expression (CAGE)-sequencing data, which provides genome-wide expression levels of the 5'-untranslated regions and the TSSs of different isoforms. Results of CAGE were further validated in 664 LADs using in situ hybridisation. Fourteen of 17 TSSs in NKX2-1/TTF-1 (80% of known TSSs in FANTOM5, an atlas of mammalian promoters) were identified in LADs, including TSSs 1-13 and 15; four isoforms of NKX2-1/TTF-1 transcripts (NKX2-1_001, NKX2-1_002, NKX2-1_004, and NKX2-1_005) were expressed in LADs, although NKX2-1_005 did not contain a homeodomain. Among those, six TSSs regulated NKX2-1_004 and NKX2-1_005, both of which contain exon 1. LADs with low expression of isoforms from TSS region 11 regulating exon 1 were significantly associated with poor prognosis in the CAGE data set. In the validation set, 62 tumours (9.3%) showed no expression of NKX2-1/TTF-1 exon 1; such tumours were significantly associated with older age, EGFR wild-type tumours, and poor prognosis. In contrast, 94 tumours, including 22 of 30 pulmonary invasive mucinous adenocarcinomas (IMAs) exhibited exon 1 expression without immunohistochemical TTF-1 protein expression. Furthermore, IMAs commonly exhibited higher exon 1 expression relative to that of exon 4/5, which contained a homeodomain in comparison with EGFR-mutated LADs. 
These transcriptome and clinicopathological results reveal that LADs use at least 80% of NKX2-1 TSSs and expression of the NKX2-1/TTF-1 transcript isoform without exon 1 (NKX2-1_004 and NKX2-1_005) defines a distinct subset of LAD characterised by aggressive behaviour in older patients. Moreover, usage of alternative TSS regions regulating NKX2-1_005 may occur in subsets of LADs.
Affiliation(s)
- Kei Sano
- Department of Human Pathology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Takuo Hayashi
- Department of Human Pathology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Yoshiyuki Suehara
- Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Masaki Hosoya
- Department of Medical Oncology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Kazuya Takamochi
- Department of General Thoracic Surgery, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Shinji Kohsaka
- Division of Cellular Signaling, National Cancer Center Research Institute, Tokyo, Japan
- Satsuki Kishikawa
- Department of Human Pathology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Monami Kishi
- Department of Human Pathology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Satomi Saito
- Department of Human Pathology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Fumiyuki Takahashi
- Department of Respiratory Medicine, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Kazuo Kaneko
- Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Kenji Suzuki
- Department of General Thoracic Surgery, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Takashi Yao
- Department of Human Pathology, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Muneaki Ishijima
- Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Tsuyoshi Saito
- Department of Human Pathology, Juntendo University Graduate School of Medicine, Tokyo, Japan
3
Yoshida E, Terao Y, Hayashi N, Mogushi K, Arakawa A, Tanaka Y, Ito Y, Ohmiya H, Hayashizaki Y, Takeda S, Itoh M, Kawaji H. Promoter-level transcriptome in primary lesions of endometrial cancer identified biomarkers associated with lymph node metastasis. Sci Rep 2017; 7:14160. [PMID: 29074988 PMCID: PMC5658375 DOI: 10.1038/s41598-017-14418-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/14/2017] [Accepted: 10/11/2017] [Indexed: 12/21/2022]
Abstract
For endometrial cancer patients, lymphadenectomy is recommended to exclude rarely metastasized cancer cells. This procedure is performed even in patients with low risk of recurrence despite the risk of complications such as lymphedema. A method to accurately identify cases with no lymph node metastases (LN-) before lymphadenectomy is therefore highly required. We approached this clinical problem by examining primary lesions of endometrial cancers with CAGE (Cap Analysis Gene Expression), which quantifies promoter-level expression across the genome. Fourteen profiles delineated distinct transcriptional networks between LN+ and LN- cases, within those classified as having the low or intermediate risk of recurrence. Subsequent quantitative reverse transcription polymerase chain reaction (qRT-PCR) analyses of 115 primary tumors showed SEMA3D mRNA and TACC2 isoforms expressed through a novel promoter as promising biomarkers with high accuracy (area under the receiver operating characteristic curve, 0.929) when used in combination. Our high-resolution transcriptome provided evidence of distinct molecular profiles underlying LN+/LN- status in endometrial cancers, raising the possibility of preoperative diagnosis to reduce unnecessary operations in patients with minimum recurrence risk.
Affiliation(s)
- Emiko Yoshida
- Department of Obstetrics & Gynecology, Juntendo University Faculty of Medicine, Tokyo, Japan
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, Japan
- Yasuhisa Terao
- Department of Obstetrics & Gynecology, Juntendo University Faculty of Medicine, Tokyo, Japan
- Noriko Hayashi
- Department of Obstetrics & Gynecology, Juntendo University Faculty of Medicine, Tokyo, Japan
- Kaoru Mogushi
- Intractable Disease Research Center, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Atsushi Arakawa
- Department of Human Pathology, Juntendo University Faculty of Medicine, Tokyo, Japan
- Yuji Tanaka
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, Japan
- Preventive Medicine and Applied Genomics Unit, RIKEN Advanced Center for Computing and Communication, Yokohama, Japan
- Yosuke Ito
- Department of Obstetrics & Gynecology, Juntendo University Faculty of Medicine, Tokyo, Japan
- Preventive Medicine and Applied Genomics Unit, RIKEN Advanced Center for Computing and Communication, Yokohama, Japan
- Hiroko Ohmiya
- Preventive Medicine and Applied Genomics Unit, RIKEN Advanced Center for Computing and Communication, Yokohama, Japan
- Satoru Takeda
- Department of Obstetrics & Gynecology, Juntendo University Faculty of Medicine, Tokyo, Japan
- Masayoshi Itoh
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, Japan
- RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Japan
- Hideya Kawaji
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, Japan
- Preventive Medicine and Applied Genomics Unit, RIKEN Advanced Center for Computing and Communication, Yokohama, Japan
- RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Japan
4
Hongo Y, Ikuta T, Takaki Y, Shimamura S, Shigenobu S, Maruyama T, Yoshida T. Expression of genes involved in the uptake of inorganic carbon in the gill of a deep-sea vesicomyid clam harboring intracellular thioautotrophic bacteria. Gene 2016; 585:228-40. [PMID: 27016297 DOI: 10.1016/j.gene.2016.03.033] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 08/25/2015] [Revised: 03/17/2016] [Accepted: 03/21/2016] [Indexed: 11/20/2022]
Abstract
Deep-sea vesicomyid clams, including the genus Phreagena (formerly Calyptogena), harbor thioautotrophic bacterial symbionts in the host symbiosome, which consists of cytoplasmic vacuoles in gill epithelial cells called bacteriocytes. The symbiont requires inorganic carbon (Ci), such as CO2, HCO3(-), and CO3(2-), to synthesize organic compounds, which are utilized by the host clam. The dominant Ci in seawater is HCO3(-), which is impermeable to cell membranes. Within the bacteriocyte, cytoplasmic carbonic anhydrase (CA) from the host, which catalyzes the inter-conversion between CO2 and HCO3(-), has been shown to be abundant and is thought to supply intracellular CO2 to symbionts in the symbiosome. However, the mechanism of Ci uptake by the host gill from seawater is poorly understood. To elucidate the influx pathway of Ci into the bacteriocyte, we isolated the genes related to Ci uptake via the pyrosequencing of cDNA from the gill of Phreagena okutanii, and investigated their expression patterns. Using phylogenetic and amino acid sequence analyses, three solute carrier family 4 (SLC4) bicarbonate transporters (slc4co1, slc4co2, and slc4co4) and two membrane-associated CAs (mcaco1 and mcaco2) were identified as candidate genes for Ci uptake. In an in situ hybridization analysis of gill sections, the expression of mcaco1 and mcaco2 was detected in the bacteriocytes and asymbiotic non-ciliated cells, respectively, and the expression of slc4co1 and slc4co2 was detected in the asymbiotic cells, including the intermediate cells of the inner area and the non-ciliated cells of the external area. Although subcellular localizations of the products of these genes have not been fully elucidated, they may play an important role in the uptake of Ci into the bacteriocytes. These findings will improve our understanding of the Ci transport system in the symbiotic relationships of chemosynthetic bivalves.
Affiliation(s)
- Yuki Hongo
- Department of Research Center of Aquatic Genomics, National Research Institute of Fisheries Science, Fisheries Research Agency, 2-12-4 Fukuura, Kanazawa, Yokohama, Kanagawa 236-8648, Japan; Department of Marine Biodiversity Research, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-cho, Yokosuka 237-0061, Japan; Tokyo University of Marine Science and Technology, 4-5-7 Konan, Minato-ku, Tokyo, 108-8477, Japan.
- Tetsuro Ikuta
- Department of Marine Biodiversity Research, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-cho, Yokosuka 237-0061, Japan.
- Yoshihiro Takaki
- Department of Subsurface Geobiological Analysis and Research, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-cho, Yokosuka 237-0061, Japan.
- Shigeru Shimamura
- Department of Subsurface Geobiological Analysis and Research, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-cho, Yokosuka 237-0061, Japan.
- Shuji Shigenobu
- Functional Genomics Facility, National Institute for Basic Biology, Okazaki 444-8585, Japan.
- Tadashi Maruyama
- Department of Marine Biodiversity Research, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-cho, Yokosuka 237-0061, Japan; Research and Development Center for Marine Biosciences, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-cho, Yokosuka 237-0061, Japan.
- Takao Yoshida
- Department of Marine Biodiversity Research, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-cho, Yokosuka 237-0061, Japan; Tokyo University of Marine Science and Technology, 4-5-7 Konan, Minato-ku, Tokyo, 108-8477, Japan.
5
6
de Hoon M, Shin JW, Carninci P. Paradigm shifts in genomics through the FANTOM projects. Mamm Genome 2015; 26:391-402. [PMID: 26253466 PMCID: PMC4602071 DOI: 10.1007/s00335-015-9593-8] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Received: 03/22/2015] [Accepted: 07/08/2015] [Indexed: 12/18/2022]
Abstract
Big leaps in science happen when scientists from different backgrounds interact. In the past 15 years, the FANTOM Consortium has brought together scientists from different fields to analyze and interpret genomic data produced with novel technologies, including mouse full-length cDNAs and, more recently, expression profiling at single-nucleotide resolution by cap-analysis gene expression. The FANTOM Consortium has provided the most comprehensive mouse cDNA collection for functional studies and extensive maps of the human and mouse transcriptome comprising promoters, enhancers, as well as the network of their regulatory interactions. More importantly, serendipitous observations of the FANTOM dataset led us to realize that the mammalian genome is pervasively transcribed, even from retrotransposon elements, which were previously considered junk DNA. The majority of products from the mammalian genome are long non-coding RNAs (lncRNAs), including sense-antisense, intergenic, and enhancer RNAs. While the biological function has been elucidated for some lncRNAs, more than 98 % of them remain without a known function. We argue that large-scale studies are urgently needed to address the functional role of lncRNAs.
Affiliation(s)
- Michiel de Hoon
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, 230-0045, Japan.
- Jay W Shin
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, 230-0045, Japan.
- Piero Carninci
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, 230-0045, Japan.
7
Poletti V, Delli Carri A, Malagoli Tagliazucchi G, Faedo A, Petiti L, Mazza EMC, Peano C, De Bellis G, Bicciato S, Miccio A, Cattaneo E, Mavilio F. Genome-Wide Definition of Promoter and Enhancer Usage during Neural Induction of Human Embryonic Stem Cells. PLoS One 2015; 10:e0126590. [PMID: 25978676 PMCID: PMC4433211 DOI: 10.1371/journal.pone.0126590] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/16/2014] [Accepted: 04/06/2015] [Indexed: 11/21/2022]
Abstract
Genome-wide mapping of transcriptional regulatory elements is an essential tool for understanding the molecular events orchestrating self-renewal, commitment and differentiation of stem cells. We combined high-throughput identification of transcription start sites with genome-wide profiling of histone modifications to map active promoters and enhancers in embryonic stem cells (ESCs) induced to neuroepithelial-like stem cells (NESCs). Our analysis showed that most promoters are active in both cell types while approximately half of the enhancers are cell-specific and account for most of the epigenetic changes occurring during neural induction, and most likely for the modulation of the promoters to generate cell-specific gene expression programs. Interestingly, the majority of the promoters activated or up-regulated during neural induction have a “bivalent” histone modification signature in ESCs, suggesting that developmentally-regulated promoters are already poised for transcription in ESCs, which are apparently pre-committed to neuroectodermal differentiation. Overall, our study provides a collection of differentially used enhancers, promoters, transcription start sites, protein-coding and non-coding RNAs in human ESCs and ESC-derived NESCs, and a broad, genome-wide description of promoter and enhancer usage and of gene expression programs characterizing the transition from a pluripotent to a neural-restricted cell fate.
Affiliation(s)
- Valentina Poletti
- Division of Genetics and Cell Biology, Scientific Institute H. San Raffaele, Milan, Italy
- Genethon, Evry, France
- Andrea Faedo
- Department of Biosciences, University of Milano, Milan, Italy
- Luca Petiti
- Department of Life Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Clelia Peano
- Institute of Biomedical Technologies, National Research Council, Milan, Italy
- Gianluca De Bellis
- Institute of Biomedical Technologies, National Research Council, Milan, Italy
- Silvio Bicciato
- Department of Life Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Annarita Miccio
- Department of Life Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Imagine Institute, Paris, France
- Elena Cattaneo
- Department of Biosciences, University of Milano, Milan, Italy
- Fulvio Mavilio
- Genethon, Evry, France
- Department of Life Sciences, University of Modena and Reggio Emilia, Modena, Italy
8
Zhang Y, Li ZX, Yu XD, Fan J, Pickett JA, Jones HD, Zhou JJ, Birkett MA, Caulfield J, Napier JA, Zhao GY, Cheng XG, Shi Y, Bruce TJA, Xia LQ. Molecular characterization of two isoforms of a farnesyl pyrophosphate synthase gene in wheat and their roles in sesquiterpene synthesis and inducible defence against aphid infestation. The New Phytologist 2015; 206:1101-1115. [PMID: 25644034 DOI: 10.1111/nph.13302] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/28/2014] [Accepted: 12/16/2014] [Indexed: 05/09/2023]
Abstract
Aphids are important pests of wheat (Triticum aestivum) that affect crop production globally. Herbivore-induced emission of sesquiterpenes can repel pests, and farnesyl pyrophosphate synthase (FPS) is a key enzyme involved in sesquiterpene biosynthesis. However, fps orthologues in wheat and their functional roles in sesquiterpene synthesis and defence against aphid infestation are unknown. Here, two fps isoforms, Tafps1 and Tafps2, were identified in wheat. Quantitative real-time polymerase chain reaction (qRT-PCR) and in vitro catalytic activity analyses were conducted to investigate expression patterns and activity. Heterologous expression of these isoforms in Arabidopsis thaliana, virus-induced gene silencing (VIGS) in wheat and aphid behavioural assays were performed to understand the functional roles of these two isoforms. We demonstrated that Tafps1 and Tafps2 played different roles in induced responses to aphid infestation and in sesquiterpene synthesis. Heterologous expression in A. thaliana resulted in repulsion of the peach aphid (Myzus persicae). Wheat plants with these two isoforms transiently silenced were significantly attractive to grain aphid (Sitobion avenae). Our results provide new insights into induced defence against aphid herbivory in wheat, in particular, the different roles of the two Tafps isoforms in both sesquiterpene biosynthesis and defence against aphid infestation.
Affiliation(s)
- Yan Zhang
- The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), 12 Zhongguancun South Street, Beijing, 100081, China
- Tobacco Research Institute, Chinese Academy of Agricultural Sciences (CAAS)/Key Laboratory of Tobacco Biology and Processing, Ministry of Agriculture, 11 Keyuanjing 4 Road, Laoshan District, Qingdao, 266101, China
- Zhi-Xia Li
- The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), 12 Zhongguancun South Street, Beijing, 100081, China
- Xiu-Dao Yu
- The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), 12 Zhongguancun South Street, Beijing, 100081, China
- Jia Fan
- Institute of Plant Protection, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100193, China
- John A Pickett
- Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, UK
- Huw D Jones
- Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, UK
- John Caulfield
- Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, UK
- Guang-Yao Zhao
- The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), 12 Zhongguancun South Street, Beijing, 100081, China
- Xian-Guo Cheng
- Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences (CAAS), 12 Zhongguancun South Street, Beijing, 100081, China
- Yi Shi
- Tobacco Research Institute, Chinese Academy of Agricultural Sciences (CAAS)/Key Laboratory of Tobacco Biology and Processing, Ministry of Agriculture, 11 Keyuanjing 4 Road, Laoshan District, Qingdao, 266101, China
- Toby J A Bruce
- Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, UK
- Lan-Qin Xia
- The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), 12 Zhongguancun South Street, Beijing, 100081, China
9
Chakraborty P, William Buaas F, Sharma M, Smith BE, Greenlee AR, Eacker SM, Braun RE. Androgen-dependent sertoli cell tight junction remodeling is mediated by multiple tight junction components. Mol Endocrinol 2014; 28:1055-72. [PMID: 24825397 DOI: 10.1210/me.2013-1134] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 12/11/2022]
Abstract
Sertoli cell tight junctions (SCTJs) of the seminiferous epithelium create a specialized microenvironment in the testis to aid differentiation of spermatocytes and spermatids from spermatogonial stem cells. SCTJs must be chronically broken and rebuilt with high fidelity to allow the transmigration of preleptotene spermatocytes from the basal to adluminal epithelial compartment. Impairment of androgen signaling in Sertoli cells perturbs SCTJ remodeling. Claudin (CLDN) 3, a tight junction component under androgen regulation, localizes to newly forming SCTJs and is absent in Sertoli cell androgen receptor knockout (SCARKO) mice. We show here that Cldn3-null mice do not phenocopy SCARKO mice: Cldn3(-/-) mice are fertile, show uninterrupted spermatogenesis, and exhibit fully functional SCTJs based on imaging and small molecule tracer analyses, suggesting that other androgen-regulated genes must contribute to the SCARKO phenotype. To further investigate the SCTJ phenotype observed in SCARKO mutants, we generated a new SCARKO model and extensively analyzed the expression of other tight junction components. In addition to Cldn3, we identified altered expression of several other SCTJ molecules, including down-regulation of Cldn13 and a noncanonical tight junction protein 2 isoform (Tjp2iso3). Chromatin immunoprecipitation was used to demonstrate direct androgen receptor binding to regions of these target genes. Furthermore, we demonstrated that CLDN13 is a constituent of SCTJs and that TJP2iso3 colocalizes with tricellulin, a constituent of tricellular junctions, underscoring the importance of androgen signaling in the regulation of both bicellular and tricellular Sertoli cell tight junctions.
Affiliation(s)
- Papia Chakraborty
- The Jackson Laboratory (P.C., F.W.B., M.S., B.E.S., A.R.G., R.E.B.), Bar Harbor, Maine 04609; and Department of Neurology (S.M.E.), Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
10
Rid R, Abdel-Hadi O, Maier R, Wagner M, Hundsberger H, Hintner H, Bauer J, Onder K. From the ORFeome concept to highly comprehensive, full-genome screening libraries. Assay Drug Dev Technol 2012; 11:52-7. [PMID: 22621725 DOI: 10.1089/adt.2012.450] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022]
Abstract
Recombination-based cloning techniques have in recent times facilitated the establishment of genome-scale single-gene ORFeome repositories. Their further handling and downstream application in systematic fashion is, however, practically impeded because of logistical plus economic challenges. At this juncture, simultaneously transferring entire gene collections in compiled pool format could represent an advanced compromise between systematic ORFeome (an organism's entire set of protein-encoding open reading frames) projects and traditional random library approaches, but has not yet been considered in great detail. In our endeavor to merge the comprehensiveness of ORFeomes with a basically simple, streamlined, and easily executable single-tube design, we have here produced five different pooled screening-ready libraries for both Staphylococcus aureus and Homo sapiens. By evaluating the parallel transfer efficiencies of differentially sized genes from initial polymerase chain reaction (PCR) product amplification to entry and final destination library construction via quantitative real-time PCR, we found that the complexity of the gene population is fairly stably maintained once an entry resource has been successfully established, and that no apparent size-selection bias loss of large inserts takes place. Recombinational transfer processes are hence robust enough for straightforwardly achieving such pooled screening libraries.
Affiliation(s)
- Raphaela Rid
- Division of Molecular Dermatology, Department of Dermatology, Paracelsus Private Medical University Salzburg, Austria
11
Itoh M, Kojima M, Nagao-Sato S, Saijo E, Lassmann T, Kanamori-Katayama M, Kaiho A, Lizio M, Kawaji H, Carninci P, Forrest ARR, Hayashizaki Y. Automated workflow for preparation of cDNA for cap analysis of gene expression on a single molecule sequencer. PLoS One 2012; 7:e30809. [PMID: 22303458 PMCID: PMC3268765 DOI: 10.1371/journal.pone.0030809] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 08/09/2011] [Accepted: 12/21/2011] [Indexed: 11/19/2022]
Abstract
Background: Cap analysis of gene expression (CAGE) is a 5′ sequence tag technology to globally determine transcriptional starting sites in the genome and their expression levels and has most recently been adapted to the HeliScope single molecule sequencer. Despite significant simplifications in the CAGE protocol, it has until now been a labour intensive protocol. Methodology: In this study we set out to adapt the protocol to a robotic workflow, which would increase throughput and reduce handling. The automated CAGE cDNA preparation system we present here can prepare 96 ‘HeliScope ready’ CAGE cDNA libraries in 8 days, as opposed to 6 weeks by a manual operator. We compare the results obtained using the same RNA in manual libraries and across multiple automation batches to assess reproducibility. Conclusions: We show that the sequencing was highly reproducible and comparable to manual libraries with an 8-fold increase in productivity. The automated CAGE cDNA preparation system can prepare 96 CAGE sequencing samples simultaneously. Finally we discuss how the system could be used for CAGE on Illumina/SOLiD platforms, RNA-seq and full-length cDNA generation.
Affiliation(s)
- Masayoshi Itoh
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Miki Kojima
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Eri Saijo
- K. K. Dnaform, Ono-cho, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Timo Lassmann
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Ai Kaiho
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Marina Lizio
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Hideya Kawaji
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Piero Carninci
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Yoshihide Hayashizaki
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan
| |
Collapse
|
12
|
Ruan X, Ruan Y. Genome wide full-length transcript analysis using 5' and 3' paired-end-tag next generation sequencing (RNA-PET). Methods Mol Biol 2012; 809:535-562. [PMID: 22113299 DOI: 10.1007/978-1-61779-376-9_35] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 05/31/2023]
Abstract
RNA-PET is a paired-end tag (PET) sequencing method for analyzing full-length mRNA transcripts on next-generation sequencing platforms such as Illumina GA and SOLiD. Unlike RNA-Seq, which sequences randomly sheared short RNA fragments, RNA-PET captures and sequences the 5' and 3' end tags of full-length cDNA fragments of all expressed genes in a biological sample. When mapped to a reference genome, RNA-PET sequences can demarcate the boundaries of transcription units genome-wide, in addition to quantifying the transcription level of each expressed gene. Furthermore, a unique feature of RNA-PET is its ability to identify fusion transcripts. Therefore, RNA-PET has been regarded as the best PET for genome annotation (1). Here in this chapter, we describe the details of the RNA-PET protocol and discuss the critical issues.
Affiliation(s)
- Xiaoan Ruan
  - Genome Institute of Singapore, Singapore, Singapore
|
13
|
Park DJ. Lariat-dependent nested PCR for flanking sequence determination. Methods Mol Biol 2011; 687:43-55. [PMID: 20967600 DOI: 10.1007/978-1-60761-944-4_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2023]
Abstract
Methods detailed in this chapter relate to the use of Lariat-dependent Nested (LaNe) PCR to characterize unknown RNA or DNA sequence flanking known regions. A multitude of approaches designed to determine flanking sequences have been described in the literature. Variously, problems related to these approaches include lack of resolution or failure, depending on experimental context, and complex handling. LaNe-based methods are designed to harness "two-sided" gene-specific PCR with the option of nesting but without the requirement for inefficient and involved enzyme preprocessing steps.
Affiliation(s)
- Daniel J Park
  - Genetic Epidemiology Laboratory, Department of Pathology, The University of Melbourne, Parkville, Victoria, Australia.
|
14
|
Salimullah M, Sakai M, Mizuho S, Plessy C, Carninci P. NanoCAGE: a high-resolution technique to discover and interrogate cell transcriptomes. Cold Spring Harb Protoc 2011; 2011:pdb.prot5559. [PMID: 21205859 DOI: 10.1101/pdb.prot5559] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 11/24/2022]
Abstract
Cap analysis gene expression (CAGE) is a method to identify the 5' ends of transcripts, allowing the discovery of new promoters and the quantification of gene activity. Combining promoter location and their expression levels, CAGE data are essential for annotation-agnostic studies of regulatory gene networks. However, CAGE requires large amounts of input RNA, which usually are not obtainable from highly refined samples such as tissue microdissections or subcellular fractions. The nanoCAGE method can capture the 5' ends of transcripts from as little as 10 ng of total RNA and takes advantage of the capacity of current sequencers to produce longer (50-100 bp) reads. The method prepares cap-selected cDNAs ready for direct sequencing of their 5' ends (optionally mate-paired with the 3' end) that can provide information about downstream sequences. This protocol describes how to prepare nanoCAGE libraries from as little as 50 ng of total RNA within two working days. The libraries can be sequenced using an Illumina sequencer Genome Analyzer IIX [corrected] with a level of sensitivity 1000 times higher than CAGE.
Affiliation(s)
- Md Salimullah
  - RIKEN Yokohama Institute, Omics Science Center, Yokohama City, Kanagawa, 230-0045, Japan
|
15
|
Schaefer U, Kodzius R, Kai C, Kawai J, Carninci P, Hayashizaki Y, Bajic VB. High sensitivity TSS prediction: estimates of locations where TSS cannot occur. PLoS One 2010; 5:e13934. [PMID: 21085627 PMCID: PMC2981523 DOI: 10.1371/journal.pone.0013934] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 07/04/2010] [Accepted: 10/19/2010] [Indexed: 11/26/2022]
Abstract
Background Although transcription in mammalian genomes can initiate from various genomic positions (e.g., 3′UTR, coding exons, etc.), most locations on genomes are not prone to transcription initiation. It is of practical and theoretical interest to be able to estimate such collections of non-TSS locations (NTLs). The identification of large portions of NTLs can contribute to better focusing the search for TSS locations and thus contribute to promoter and gene finding. It can help in the assessment of 5′ completeness of expressed sequences, contribute to more successful experimental designs, as well as more accurate gene annotation. Methodology Using comprehensive collections of Cap Analysis of Gene Expression (CAGE) and other transcript data from mouse and human genomes, we developed a methodology that allows us, by performing computational TSS prediction with very high sensitivity, to annotate, with a high accuracy in a strand specific manner, locations of mammalian genomes that are highly unlikely to harbor transcription start sites (TSSs). The properties of the immediate genomic neighborhood of 98,682 accurately determined mouse and 113,814 human TSSs are used to determine features that distinguish genomic transcription initiation locations from those that are not likely to initiate transcription. In our algorithm we utilize various constraining properties of features identified in the upstream and downstream regions around TSSs, as well as statistical analyses of these surrounding regions. Conclusions Our analysis of human chromosomes 4, 21 and 22 estimates ∼46%, ∼41% and ∼27% of these chromosomes, respectively, as being NTLs. This suggests that on average more than 40% of the human genome can be expected to be highly unlikely to initiate transcription. Our method represents the first one that utilizes high-sensitivity TSS prediction to identify, with high accuracy, large portions of mammalian genomes as NTLs. 
The server with our algorithm implemented is available at http://cbrc.kaust.edu.sa/ddm/.
MESH Headings
- Algorithms
- Animals
- Base Sequence
- Chromosomes, Human, Pair 21/genetics
- Chromosomes, Human, Pair 22/genetics
- Chromosomes, Human, Pair 4/genetics
- Computational Biology/methods
- Genome/genetics
- Genome, Human/genetics
- Humans
- Internet
- Mice
- Molecular Sequence Data
- Promoter Regions, Genetic/genetics
- Receptors, Opioid, mu/genetics
- Reproducibility of Results
- Transcription Initiation Site
- Transcription, Genetic
Affiliation(s)
- Ulf Schaefer
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Rimantas Kodzius
  - Division of Physical Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Chikatoshi Kai
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, Japan
- Jun Kawai
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, Japan
- Piero Carninci
  - Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, Wako, Saitama, Japan
- Yoshihide Hayashizaki
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, Japan
- Vladimir B. Bajic
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
|
16
|
Kapteyn J, He R, McDowell ET, Gang DR. Incorporation of non-natural nucleotides into template-switching oligonucleotides reduces background and improves cDNA synthesis from very small RNA samples. BMC Genomics 2010; 11:413. [PMID: 20598146 PMCID: PMC2996941 DOI: 10.1186/1471-2164-11-413] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Received: 12/11/2009] [Accepted: 07/02/2010] [Indexed: 01/15/2023]
Abstract
Background The template switching PCR (TS-PCR) method of cDNA synthesis represents one of the most straightforward approaches to generating full length cDNA for sequencing efforts. However, when applied to very small RNA samples, such as those obtained from tens or hundreds of cells, this approach leads to high background and low cDNA yield due to concatamerization of the TS oligo. Results In this study, we describe the application of nucleotide isomers that form non-standard base pairs in the template switching oligo to prevent background cDNA synthesis. When such bases are added to the 5' end of the template switching (TS) oligo, they inhibit MMLV-RT from extending the cDNA beyond the TS oligo, thus increasing cDNA yield by reducing formation of concatamers of the TS oligo that are the source of significant background. Conclusions Our results demonstrate that this novel approach for cDNA synthesis has valuable utility for application of ultra-high throughput technologies, such as whole transcriptome sequencing using 454 technology, to very small biological samples comprised of tens of cells as might be obtained via approaches like laser microdissection.
Affiliation(s)
- Jeremy Kapteyn
  - School of Plant Sciences and BIO5 Institute, The University of Arizona, 1657 E Helen Street, Tucson, AZ 85721, USA
|
17
|
Kondou Y, Higuchi M, Matsui M. High-throughput characterization of plant gene functions by using gain-of-function technology. Annual Review of Plant Biology 2010; 61:373-93. [PMID: 20192750 DOI: 10.1146/annurev-arplant-042809-112143] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 05/23/2023]
Abstract
Gain-of-function approaches have been used as an alternative or complementary method to loss-of-function approaches as well as to confer new functions to plants. Gain-of-function is achieved by increasing gene expression levels through the random activation of endogenous genes by transcriptional enhancers or the expression of individual transgenes by transformation. The advantages of gain-of-function approaches compared to loss-of-function approaches for the characterization of gene functions include the abilities to (a) analyze individual gene family members, (b) characterize the function of genes from nonmodel plants using a heterologous expression system, and (c) identify genes that confer stress tolerance to plants that result from the introduction of transgenes. In this review, we describe the current status of gain-of-function mutagenesis and provide several examples of how gene functions have been characterized via high-throughput screening using gain-of-function technology.
Affiliation(s)
- Youichi Kondou
  - Plant Functional Genomics Research Team, RIKEN Plant Science Center, Tsurumi-ku, Yokohama, Japan.
|
18
|
Zeng SH, Liu D, Wang Y. [Advances of gene enrichment in plant genome]. Yi Chuan = Hereditas 2009; 31:799-808. [PMID: 19689940 DOI: 10.3724/sp.j.1005.2009.00799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
The genome size varies greatly in higher plants. Repetitive sequences account for most of the large plant genomes while low-copy or single copy genic sequences, referred to as gene space, take up only a small portion of the genomes. Considering the large amount of repetitive sequences, it is a great challenge to obtain genic sequences using high-throughput methods in non-model plants bearing large genomes. Currently, several approaches have been developed for gene enrichment on a genome-wide scale, such as cDNA library, methylation filtration library, high Cot library and transposon tagging. Here, we reviewed the technical principles, advantages and disadvantages of these methods, as well as the recent development of methylation filtration technology. An in-depth discussion was performed for selection of one method or combination of methods according to the research objectives and plant materials, especially for plants with large genomes.
Affiliation(s)
- Shao-Hua Zeng
  - Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China.
|
19
|
Rach EA, Yuan HY, Majoros WH, Tomancak P, Ohler U. Motif composition, conservation and condition-specificity of single and alternative transcription start sites in the Drosophila genome. Genome Biol 2009; 10:R73. [PMID: 19589141 PMCID: PMC2728527 DOI: 10.1186/gb-2009-10-7-r73] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.1] [Received: 12/29/2008] [Revised: 04/21/2009] [Accepted: 07/09/2009] [Indexed: 01/05/2023]
Abstract
A map of transcription start sites across the Drosophila genome, providing insights into initiation patterns and spatiotemporal conditions. Background Transcription initiation is a key component in the regulation of gene expression. mRNA 5' full-length sequencing techniques have enhanced our understanding of mammalian transcription start sites (TSSs), revealing different initiation patterns on a genomic scale. Results To identify TSSs in Drosophila melanogaster, we applied a hierarchical clustering strategy on available 5' expressed sequence tags (ESTs) and identified a high quality set of 5,665 TSSs for approximately 4,000 genes. We distinguished two initiation patterns: 'peaked' TSSs, and 'broad' TSS cluster groups. Peaked promoters were found to contain location-specific sequence elements; conversely, broad promoters were associated with non-location-specific elements. In alignments across other Drosophila genomes, conservation levels of sequence elements exceeded 90% within the melanogaster subgroup, but dropped considerably for distal species. Elements in broad promoters had lower levels of conservation than those in peaked promoters. When characterizing the distributions of ESTs, 64% of TSSs showed distinct associations to one out of eight different spatiotemporal conditions. Available whole-genome tiling array time series data revealed different temporal patterns of embryonic activity across the majority of genes with distinct alternative promoters. Many genes with maternally inherited transcripts were found to have alternative promoters utilized later in development. Core promoters of maternally inherited transcripts showed differences in motif composition compared to zygotically active promoters. Conclusions Our study provides a comprehensive map of Drosophila TSSs and the conditions under which they are utilized. 
Distinct differences in motif associations with initiation pattern and spatiotemporal utilization illustrate the complex regulatory code of transcription initiation.
Affiliation(s)
- Elizabeth A Rach
  - Program in Computational Biology and Bioinformatics, Duke University, Science Drive, Durham, NC 27708, USA
|
20
|
Seki M, Shinozaki K. Functional genomics using RIKEN Arabidopsis thaliana full-length cDNAs. Journal of Plant Research 2009; 122:355-66. [PMID: 19412652 DOI: 10.1007/s10265-009-0239-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 02/03/2009] [Accepted: 04/08/2009] [Indexed: 05/24/2023]
Abstract
Full-length cDNAs are essential for the correct annotation of genomic sequences as well as for the functional analysis of genes and their products. We have isolated about 240,000 RIKEN Arabidopsis full-length (RAFL) cDNA clones. These clones were clustered into about 17,000 non-redundant cDNA groups, i.e., about 60% of all Arabidopsis predicted genes. The sequence information of the RAFL cDNAs is useful for promoter analysis, and for the correct annotation of predicted transcriptional units and gene products. We prepared cDNA microarrays containing independent full-length cDNA groups and studied the expression profiles of genes under various stress- and hormone-treatment conditions, and in various mutants and transgenic plants. These expression profiling studies have shown the expression levels of many genes as a detailed snapshot describing the state of a biological system in planta under various conditions. We have applied RAFL cDNAs to the functional analysis of proteins using the full-length cDNA over-expressing (FOX) gene hunting system and the wheat germ cell-free protein synthesis system. The RAFL cDNA collection was also used for determination of the domain structure of proteins by NMR. In this review, we summarize the present state and perspectives of functional genomics using RAFL cDNAs.
Affiliation(s)
- Motoaki Seki
  - Plant Genomic Network Research Team, Plant Functional Genomics Research Group, RIKEN Plant Science Center, RIKEN Yokohama Institute, Yokohama 230-0045, Japan.
|
21
|
Qamar I, Park E, Gong EY, Lee HJ, Lee K. ARR19 (androgen receptor corepressor of 19 kDa), an antisteroidogenic factor, is regulated by GATA-1 in testicular Leydig cells. J Biol Chem 2009; 284:18021-32. [PMID: 19398553 DOI: 10.1074/jbc.m900896200] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 01/05/2023]
Abstract
ARR19 (androgen receptor corepressor of 19 kDa), which encodes for a leucine-rich protein, is expressed abundantly in the testis. Further analyses revealed that ARR19 was expressed in Leydig cells, and its expression was differentially regulated during Leydig cell development. Adenovirus-mediated overexpression of ARR19 in Leydig cells inhibited testicular steroidogenesis, down-regulating the expression of steroidogenic enzymes, which suggests that ARR19 is an antisteroidogenic factor. Interestingly, cAMP/luteinizing hormone attenuated ARR19 expression in a fashion similar to that of GATA-1, which was previously reported to be down-regulated by cAMP. Sequence analysis of the Arr19 promoter revealed the presence of two putative GATA-1 binding motifs. Further analyses with 5' deletion and point mutants of putative GATA-1 binding motifs showed that these GATA-1 binding sites were critical for high promoter activity. CREB-binding protein coactivated GATA-1 and markedly increased the activity of the Arr19 promoter. Both GATA-1 and CREB-binding proteins occupied the GATA-1 motifs within the Arr19 promoter, which was repressed by cAMP treatment. Altogether, these findings demonstrate that ARR19 is the target gene of GATA-1 and suggest that ARR19 gene expression in testicular Leydig cells is regulated by luteinizing hormone/cAMP signaling via the control of GATA-1 expression, resulting in the control of testicular steroidogenesis.
Affiliation(s)
- Imteyaz Qamar
  - Hormone Research Center, School of Biological Sciences and Technology, Chonnam National University, Gwangju 500-757, Republic of Korea
|
22
|
Nikoh N, Nakabachi A. Aphids acquired symbiotic genes via lateral gene transfer. BMC Biol 2009; 7:12. [PMID: 19284544 PMCID: PMC2662799 DOI: 10.1186/1741-7007-7-12] [Citation(s) in RCA: 123] [Impact Index Per Article: 8.2] [Received: 02/12/2009] [Accepted: 03/10/2009] [Indexed: 11/10/2022]
Abstract
Background Aphids possess bacteriocytes, which are cells specifically differentiated to harbour the obligate mutualist Buchnera aphidicola (γ-Proteobacteria). Buchnera has lost many of the genes that appear to be essential for bacterial life. From the bacteriocyte of the pea aphid Acyrthosiphon pisum, we previously identified two clusters of expressed sequence tags that display similarity only to bacterial genes. Southern blot analysis demonstrated that they are encoded in the aphid genome. In this study, in order to assess the possibility of lateral gene transfer, we determined the full-length sequences of these transcripts, and performed detailed structural and phylogenetic analyses. We further examined their expression levels in the bacteriocyte using real-time quantitative RT-PCR. Results Sequence similarity searches demonstrated that these fully sequenced transcripts are significantly similar to the bacterial genes ldcA (product, LD-carboxypeptidase) and rlpA (product, rare lipoprotein A), respectively. Buchnera lacks these genes, whereas many other bacteria, including Escherichia coli, a close relative of Buchnera, possess both ldcA and rlpA. Molecular phylogenetic analysis clearly demonstrated that the aphid ldcA was derived from a rickettsial bacterium closely related to the extant Wolbachia spp. (α-Proteobacteria, Rickettsiales), which are intracellular symbionts of various lineages of arthropods. The evolutionary origin of rlpA was not fully resolved, but it was clearly demonstrated that its double-ψ β-barrel domain is of bacterial origin. Real-time quantitative RT-PCR demonstrated that ldcA and rlpA are expressed 11.6 and 154-fold higher in the bacteriocyte than in the whole body, respectively. LdcA is an enzyme required for recycling murein (peptidoglycan), which is a component of the bacterial cell wall. 
As Buchnera possesses a cell wall composed of murein but lacks ldcA, a high level of expression of the aphid ldcA in the bacteriocyte may be essential to maintain Buchnera. Although the function of RlpA is not well known, conspicuous up-regulation of the aphid rlpA in the bacteriocyte implies that this gene is also essential for Buchnera. Conclusion In this study, we obtained several lines of evidence indicating that aphids acquired genes from bacteria via lateral gene transfer and that these genes are used to maintain the obligately mutualistic bacterium, Buchnera.
Affiliation(s)
- Naruo Nikoh
  - Division of Natural Sciences, The Open University of Japan, Chiba, Japan.
|
23
|
Brinkmeier ML, Davis SW, Carninci P, MacDonald JW, Kawai J, Ghosh D, Hayashizaki Y, Lyons RH, Camper SA. Discovery of transcriptional regulators and signaling pathways in the developing pituitary gland by bioinformatic and genomic approaches. Genomics 2009; 93:449-60. [PMID: 19121383 DOI: 10.1016/j.ygeno.2008.11.010] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Received: 12/13/2007] [Revised: 11/13/2008] [Accepted: 11/24/2008] [Indexed: 01/15/2023]
Abstract
We report a catalog of the mouse embryonic pituitary gland transcriptome consisting of five cDNA libraries including wild type tissue from E12.5 and E14.5, Prop1(df/df) mutant at E14.5, and two cDNA subtractions: E14.5 WT-E14.5 Prop1(df/df) and E14.5 WT-E12.5 WT. DNA sequence information is assembled into a searchable database with gene ontology terms representing 12,009 expressed genes. We validated coverage of the libraries by detecting most known homeobox gene transcription factor cDNAs. A total of 45 homeobox genes were detected as part of the pituitary transcriptome, representing most expected ones, which validated library coverage, and many novel ones, underscoring the utility of this resource as a discovery tool. We took a similar approach for signaling-pathway members with novel pituitary expression and found 157 genes related to the BMP, FGF, WNT, SHH and NOTCH pathways. These genes are exciting candidates for regulators of pituitary development and function.
Affiliation(s)
- Michelle L Brinkmeier
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109-5618, USA
|
24
|
Sato K, Shin-I T, Seki M, Shinozaki K, Yoshida H, Takeda K, Yamazaki Y, Conte M, Kohara Y. Development of 5006 full-length CDNAs in barley: a tool for accessing cereal genomics resources. DNA Res 2009; 16:81-9. [PMID: 19150987 PMCID: PMC2671202 DOI: 10.1093/dnares/dsn034] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Indexed: 11/14/2022]
Abstract
A collection of 5006 full-length (FL) cDNA sequences was developed in barley. Fifteen mRNA samples from various organs and treatments were pooled to develop a cDNA library using the CAP trapper method. More than 60% of the clones were confirmed to have complete coding sequences, based on comparison with rice amino acid and UniProt sequences. Blastn homologies (E<1E-5) to rice genes and Arabidopsis genes were 89 and 47%, respectively. Of the 5028 possible amino acid sequences derived from the 5006 FLcDNAs, 4032 (80.2%) were classified into 1678 GreenPhyl multigenic families. There were 555 cDNAs showing low homology to both rice and Arabidopsis. Gene ontology annotation by InterProScan indicated that many of these cDNAs (71%) have no known molecular functions and may be unique to barley. The cDNAs showed high homology to Barley 1 GeneChip oligo probes (81%) and the wheat gene index (84%). The high homology between FLcDNAs (27%) and mapped barley expressed sequence tags enabled assigning linkage map positions to 151–233 FLcDNAs on each of the seven barley chromosomes. These comprehensive barley FLcDNAs provide a strong platform to connect pre-existing genomic and genetic resources and accelerate gene identification and genome analysis in barley and related species.
Affiliation(s)
- Kazuhiro Sato
  - Research Institute for Bioresources, Okayama University, Kurashiki, Japan.
|
25
|
Seki M, Kamiya A, Carninci P, Hayashizaki Y, Shinozaki K. Generation of full-length cDNA libraries: focus on plants. Methods Mol Biol 2009; 533:49-68. [PMID: 19277562 DOI: 10.1007/978-1-60327-136-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
Full-length cDNAs are essential for the correct annotation of transcriptional units and gene products from genomic sequence data and for functional analysis of the genes. Full-length cDNA libraries are very important resources for isolation of the full-length cDNAs. The biotinylated cap trapper method using the trehalose-thermostabilized reverse transcriptase has been developed and has become an efficient method for construction of high-content full-length cDNA libraries. We have constructed full-length cDNA libraries from various plants and animals using this method. The protocol of the method is described in this chapter.
Affiliation(s)
- Motoaki Seki
  - Plant Functional Genomics Research Group, RIKEN Genomic Sciences Center, Yokohama, Japan
|
26
|
Abstract
Critical steps in cDNA library preparation include efficient cDNA synthesis, selection of full-length cDNAs, normalizing their abundance, and the subtraction of redundant transcripts. The use of trehalose and sorbitol stabilizes the activity of the reverse transcriptase, leading to efficient cDNA synthesis, and the cap-trapping method is used for efficient full-length cDNA selection. Through the incorporation of additional normalization and subtraction steps that eliminate the size bias and expressed gene frequency, it is possible to attain cDNA libraries that include larger or rarely expressed genes. This chapter describes an efficient method to construct a full-length cDNA library, with a focus on metazoan samples.
Affiliation(s)
- Masako Harada
  - Genome Exploration Research Group, Genomic Sciences Center (GSC), RIKEN, Yokohama, Japan
|
27
|
Taji T, Sakurai T, Mochida K, Ishiwata A, Kurotani A, Totoki Y, Toyoda A, Sakaki Y, Seki M, Ono H, Sakata Y, Tanaka S, Shinozaki K. Large-scale collection and annotation of full-length enriched cDNAs from a model halophyte, Thellungiella halophila. BMC Plant Biology 2008; 8:115. [PMID: 19014467 PMCID: PMC2621223 DOI: 10.1186/1471-2229-8-115] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 07/25/2008] [Accepted: 11/12/2008] [Indexed: 05/15/2023]
Abstract
BACKGROUND Thellungiella halophila (also known as Thellungiella salsuginea) is a model halophyte with a small plant size, short life cycle, and small genome. It easily undergoes genetic transformation by the floral dipping method used with its close relative, Arabidopsis thaliana. Thellungiella genes exhibit high sequence identity (approximately 90% at the cDNA level) with Arabidopsis genes. Furthermore, Thellungiella not only shows tolerance to extreme salinity stress, but also to chilling, freezing, and ozone stress, supporting the use of Thellungiella as a good genomic resource in studies of abiotic stress tolerance. RESULTS We constructed a full-length enriched Thellungiella (Shan Dong ecotype) cDNA library from various tissues and whole plants subjected to environmental stresses, including high salinity, chilling, freezing, and abscisic acid treatment. We randomly selected about 20,000 clones and sequenced them from both ends to obtain a total of 35 171 sequences. CAP3 software was used to assemble the sequences and cluster them into 9569 nonredundant cDNA groups. We named these cDNAs "RTFL" (RIKEN Thellungiella Full-Length) cDNAs. Information on functional domains and Gene Ontology (GO) terms for the RTFL cDNAs were obtained using InterPro. The 8289 genes assigned to InterPro IDs were classified according to the GO terms using Plant GO Slim. Categorical comparison between the whole Arabidopsis genome and Thellungiella genes showing low identity to Arabidopsis genes revealed that the population of Thellungiella transport genes is approximately 1.5 times the size of the corresponding Arabidopsis genes. This suggests that these genes regulate a unique ion transportation system in Thellungiella. CONCLUSION As the number of Thellungiella halophila (Thellungiella salsuginea) expressed sequence tags (ESTs) was 9388 in July 2008, the number of ESTs has increased to approximately four times the original value as a result of this effort. 
Our sequences will thus contribute to correct future annotation of the Thellungiella genome sequence. The full-length enriched cDNA clones will enable the construction of overexpressing mutant plants by introduction of the cDNAs driven by a constitutive promoter, the complementation of Thellungiella mutants, and the determination of promoter regions in the Thellungiella genome.
Affiliation(s)
- Teruaki Taji
- Faculty of Applied Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo 156-8502, Japan
- Laboratory of Plant Molecular Biology, RIKEN Tsukuba Institute, 3-1-1 Koyadai, Tsukuba, Ibaraki 305-0074, Japan
- Tetsuya Sakurai
- RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Keiichi Mochida
- RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Atsushi Ishiwata
- RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Atsushi Kurotani
- RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Yasushi Totoki
- RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- MetaSystems Research Team, RIKEN Advanced Science Institute, Yokohama, 230-0045, Japan
- Atsushi Toyoda
- RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Yoshiyuki Sakaki
- RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Motoaki Seki
- RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Hirokazu Ono
- Faculty of Applied Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo 156-8502, Japan
- Yoichi Sakata
- Faculty of Applied Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo 156-8502, Japan
- Shigeo Tanaka
- Faculty of Applied Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo 156-8502, Japan
- Kazuo Shinozaki
- Laboratory of Plant Molecular Biology, RIKEN Tsukuba Institute, 3-1-1 Koyadai, Tsukuba, Ibaraki 305-0074, Japan
- RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
28
|
Yamashita R, Suzuki Y, Takeuchi N, Wakaguri H, Ueda T, Sugano S, Nakai K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) genes and analysis of their characteristics. Nucleic Acids Res 2008; 36:3707-15. [PMID: 18480124 PMCID: PMC2441802 DOI: 10.1093/nar/gkn248] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Received: 12/29/2007] [Revised: 03/25/2008] [Accepted: 04/17/2008] [Indexed: 12/03/2022]
Abstract
Although considerable knowledge has accumulated on transcriptional regulation in eukaryotes, knowledge of their translational regulation remains limited. Thus, we performed a comprehensive detection of terminal oligo-pyrimidine (TOP), one of the well-characterized cis-regulatory motifs for translational control, located immediately downstream of the transcriptional start sites of mRNAs. Utilizing our precise 5'-end information from full-length cDNAs, we screened 1645 candidate TOP genes by position-specific matrix search. These included not only 75 of 78 ribosomal protein genes but also eight previously identified non-ribosomal-protein TOP genes. We further experimentally validated the translational activities of 83 TOP candidate genes. Clear translational regulation in response to stimulation with 12-O-tetradecanoyl-1-phorbol-13-acetate was observed for at least 41 of them, indicating that there should be a few hundred human genes that are subject to regulation at the translational level via TOPs. Our results suggest that TOP genes encode not only the previously characterized ribosomal proteins and translation-related proteins but also a wider variety of proteins, such as lysosome-related and metabolism-related proteins, playing pivotal roles in controlling the expression of the majority of cellular mRNAs.
Affiliation(s)
- Riu Yamashita
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562 and Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), 4-5-3 Chiyoda-ku, Tokyo, Japan
- Yutaka Suzuki
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562 and Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), 4-5-3 Chiyoda-ku, Tokyo, Japan
- Nono Takeuchi
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562 and Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), 4-5-3 Chiyoda-ku, Tokyo, Japan
- Hiroyuki Wakaguri
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562 and Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), 4-5-3 Chiyoda-ku, Tokyo, Japan
- Takuya Ueda
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562 and Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), 4-5-3 Chiyoda-ku, Tokyo, Japan
- Sumio Sugano
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562 and Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), 4-5-3 Chiyoda-ku, Tokyo, Japan
- Kenta Nakai
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562 and Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), 4-5-3 Chiyoda-ku, Tokyo, Japan
|
29
|
3G vector-primer plasmid for constructing full-length-enriched cDNA libraries. Anal Biochem 2008; 380:149-51. [PMID: 18544335 DOI: 10.1016/j.ab.2008.05.022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/12/2008] [Revised: 05/18/2008] [Accepted: 05/19/2008] [Indexed: 11/21/2022]
Abstract
We designed a 3G vector-primer plasmid for the generation of full-length-enriched complementary DNA (cDNA) libraries. By employing the terminal transferase activity of reverse transcriptase and the modified strand replacement method, this plasmid (assembled with a polydT end and a deoxyguanosine [dG] end) combines priming full-length cDNA strand synthesis and directional cDNA cloning. As a result, the number of steps involved in cDNA library preparation is decreased while simplifying downstream gene manipulation, sequencing, and subcloning. The 3G vector-primer plasmid method yields fully represented plasmid primed libraries that are equivalent to those made by the SMART (switching mechanism at 5' end of RNA transcript) approach.
|
30
|
Abstract
Recent progress in the analysis of the mouse transcriptome has led to unexpected discoveries. The mouse genomic sequences read by RNA polymerase II may be six times more than previously expected for human chromosomes. The transcript-abundant regions (named "transcription forests") occupy more than half of the genomic sequence and are divided by transcript-scarce regions (transcription deserts). Many of the coding mRNAs may have partially overlapping antisense RNAs. There are transcripts bridging several adjacent genes that were previously regarded as distinct. The transcription start sites, detected as cap analysis of gene expression (CAGE) tags, have been mapped onto the mouse genomic sequences. Distributions of CAGE tags show that the shapes of mammalian gene promoters can be classified into four major categories. These shapes are conserved between mouse and human. Most genes have exonic transcription start sites, especially in the 3' untranslated region (3' UTR) sequences. The term "RNA continent" has been coined to express this unexpectedly complex and prodigious mouse transcriptome. More than half of the RNA polymerase II transcripts are regarded as noncoding RNAs (ncRNAs). The great variety of ncRNAs in the mammalian transcriptome implies that there are many functional ncRNAs in cells. In particular, the evolutionarily conserved microRNAs play critical roles in mammalian development and other biological functions. Moreover, many other ncRNAs have also been shown to have biologically significant functions, mainly in the regulation of gene expression. The functional survey of the RNA continent has just started. We describe the state of the art of the RNA continent and its impact on modern molecular biology, especially on cancer research.
Affiliation(s)
- Jun Yasuda
- Functional RNA Research Program, Frontier Research System, RIKEN Yokohama Institute, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
|
31
|
Abstract
The principal route to understanding the biological significance of the genome sequence comes from discovery and characterization of that portion of the genome that is transcribed into RNA products. We now know that this 'transcriptome' is unexpectedly complex and its precise definition in any one species requires multiple technical approaches and an ability to work on a very large scale. A key step is the development of technologies able to capture snapshots of the complexity of the various kinds of RNA generated by the genome. As the human, mouse and other model genome sequencing projects approach completion, considerable effort has been focused on identifying and annotating the protein-coding genes as the principal output of the genome. In pursuing this aim, several key technologies have been developed to generate large numbers and highly diverse sets of full-length cDNAs and their variants. However, the search has identified another hidden transcriptional universe comprising a wide variety of non-protein coding RNA transcripts. Despite initial scepticism, various experiments and complementary technologies have demonstrated that these RNAs are dynamically transcribed and a subset of them can act as sense-antisense RNAs, which influence the transcriptional output of the genome. Recent experimental evidence suggests that the list of non-protein coding RNAs is still largely incomplete and that transcription is substantially more complex even than currently thought.
Affiliation(s)
- Piero Carninci
- Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Institute, Wako, Saitama, Japan.
|
32
|
Ruan Y, Ooi HS, Choo SW, Chiu KP, Zhao XD, Srinivasan K, Yao F, Choo CY, Liu J, Ariyaratne P, Bin WG, Kuznetsov VA, Shahab A, Sung WK, Bourque G, Palanisamy N, Wei CL. Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs). Genome Res 2007; 17:828-38. [PMID: 17568001 PMCID: PMC1891342 DOI: 10.1101/gr.6018607] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.6] [Indexed: 11/25/2022]
Abstract
Identification of unconventional functional features such as fusion transcripts is a challenging task in the effort to annotate all functional DNA elements in the human genome. Paired-End diTag (PET) analysis possesses a unique capability to accurately and efficiently characterize the two ends of DNA fragments, which may have either normal or unusual compositions. This unique nature of PET analysis makes it an ideal tool for uncovering unconventional features residing in the human genome. Using the PET approach for comprehensive transcriptome analysis, we were able to identify fusion transcripts derived from genome rearrangements and actively expressed retrotransposed pseudogenes, which would be difficult to capture by other means. Here, we demonstrate this unique capability through the analysis of 865,000 individual transcripts in two types of cancer cells. In addition to the characterization of a large number of differentially expressed alternative 5' and 3' transcript variants and novel transcriptional units, we identified 70 fusion transcript candidates in this study. One was validated as the product of a fusion gene between BCAS4 and BCAS3 resulting from an amplification followed by a translocation event between the two loci, chr20q13 and chr17q23. Through an examination of PETs that mapped to multiple genomic locations, we identified 4055 retrotransposed loci in the human genome, of which at least three were found to be transcriptionally active. The PET mapping strategy presented here promises to be a useful tool in annotating the human genome, especially aberrations in human cancer genomes.
Affiliation(s)
- Yijun Ruan
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Corresponding author. Fax 65-64789059.
- Hong Sain Ooi
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Siew Woh Choo
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Kuo Ping Chiu
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Xiao Dong Zhao
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- K.G. Srinivasan
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Fei Yao
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Chiou Yu Choo
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Jun Liu
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Pramila Ariyaratne
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Wilson G.W. Bin
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Vladimir A. Kuznetsov
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Atif Shahab
- Bioinformatics Institute, Singapore 138671, Singapore
- Wing-Kin Sung
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- School of Computing, National University of Singapore, Singapore 117543, Singapore
- Guillaume Bourque
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Chia-Lin Wei
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Corresponding author. Fax 65-64789059.
|
33
|
Baxter LL, Hsu BJ, Umayam L, Wolfsberg TG, Larson DM, Frith MC, Kawai J, Hayashizaki Y, Carninci P, Pavan WJ. Informatic and genomic analysis of melanocyte cDNA libraries as a resource for the study of melanocyte development and function. Pigment Cell Res 2007; 20:201-9. [PMID: 17516927 DOI: 10.1111/j.1600-0749.2007.00372.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 02/03/2023]
Abstract
As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
Affiliation(s)
- Laura L Baxter
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20855, USA
|
34
|
Shaw PJ, Ponmee N, Karoonuthaisiri N, Kamchonwongpaisan S, Yuthavong Y. Characterization of human malaria parasite Plasmodium falciparum eIF4E homologue and mRNA 5' cap status. Mol Biochem Parasitol 2007; 155:146-55. [PMID: 17692399 DOI: 10.1016/j.molbiopara.2007.07.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 11/27/2006] [Revised: 06/01/2007] [Accepted: 07/05/2007] [Indexed: 11/30/2022]
Abstract
The mRNA 5' cap is an essential structural feature for translation of eukaryotic mRNA. Translation is initiated by recognition of the cap by the translation initiation factor eIF4E. To further our understanding of mRNA translation in the human malaria parasite Plasmodium falciparum, we have investigated the parasite eIF4E and its interaction with capped mRNA. We have purified P. falciparum eIF4E as a recombinant protein and demonstrated that it has canonical mRNA cap binding activity. We used this protein to purify P. falciparum capped mRNAs from total parasite RNA. Microarray analysis comparing total and eIF4E-purified capped mRNAs shows that 34 features were more than twofold under-represented in the purified RNA sample, including 19 features representative of nuclear transcripts. The putatively uncapped nuclear transcripts may represent a class of mRNAs targeted for storage and cap removal.
Affiliation(s)
- Philip J Shaw
- National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency, 113 Pahonyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand
|
35
|
Casas-Tinto S, Marr MT, Andreu P, Puig O. Characterization of the Drosophila insulin receptor promoter. Biochim Biophys Acta 2007; 1769:236-43. [PMID: 17462750 DOI: 10.1016/j.bbaexp.2007.03.003] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 12/12/2006] [Revised: 03/01/2007] [Accepted: 03/02/2007] [Indexed: 01/27/2023]
Abstract
The insulin receptor (InR) signaling pathway is largely conserved in metazoans and is required for normal growth and development in Drosophila. Despite the importance of this pathway in regulating growth, development and metabolism in Drosophila, little is known about how dInR expression is controlled in flies. Here we report the characterization of the dInR gene promoter and the analysis of its expression during embryo development. The Drosophila InR gene has three promoters spanning 40 kb in the genome. These promoters direct the expression of three distinct mRNA transcripts that share common exons downstream of the initiator codon ATG but have different 5'UTRs. All three promoters are differentially regulated, spatially and temporally, contributing to a very complex pattern of expression in the developing embryo. Our results indicate that dInR expression in Drosophila displays an intricate pattern of regulation that ensures adequate control of growth, development and metabolism.
Affiliation(s)
- Sergio Casas-Tinto
- Institute of Biotechnology, University of Helsinki, Viikinkaari 9, FIN-00014, Finland
|
36
|
Marr MT, D’Alessio JA, Puig O, Tjian R. IRES-mediated functional coupling of transcription and translation amplifies insulin receptor feedback. Genes Dev 2007; 21:175-83. [PMID: 17234883 PMCID: PMC1770900 DOI: 10.1101/gad.1506407] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.6] [Received: 10/30/2006] [Accepted: 11/17/2006] [Indexed: 11/25/2022]
Abstract
It is generally accepted that the growth rate of an organism is modulated by the availability of nutrients. One common mechanism to control cellular growth is through the global down-regulation of cap-dependent translation by eIF4E-binding proteins (4E-BPs). Here, we report evidence for a novel mechanism that allows eukaryotes to coordinate and selectively couple transcription and translation of target genes in response to a nutrient and growth signaling cascade. The Drosophila insulin-like receptor (dINR) pathway incorporates 4E-BP resistant cellular internal ribosome entry site (IRES) containing mRNAs, to functionally couple transcriptional activation with differential translational control in a cell that is otherwise translationally repressed by 4E-BP. Although examples of cellular IRESs have been previously reported, their critical role mediating a key physiological response has not been well documented. Our studies reveal an integrated transcriptional and translational response mechanism specifically dependent on a cellular IRES that coordinates an essential physiological signal responsible for monitoring nutrient and cell growth conditions.
Affiliation(s)
- Michael T. Marr
- Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, California 94720, USA
- Joseph A. D’Alessio
- Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, California 94720, USA
- Oscar Puig
- University of Helsinki, Institute of Biotechnology, Helsinki FI-00014, Finland
- Robert Tjian
- Howard Hughes Medical Institute, Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, California 94720, USA
|
37
|
Jia J, Fu J, Zheng J, Zhou X, Huai J, Wang J, Wang M, Zhang Y, Chen X, Zhang J, Zhao J, Su Z, Lv Y, Wang G. Annotation and expression profile analysis of 2073 full-length cDNAs from stress-induced maize (Zea mays L.) seedlings. Plant J 2006; 48:710-27. [PMID: 17076806 DOI: 10.1111/j.1365-313x.2006.02905.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Indexed: 05/12/2023]
Abstract
Full-length cDNAs are very important for genome annotation and functional analysis of genes. The number of full-length cDNAs from maize (Zea mays L.) remains limited. Here we report the construction of a full-length enriched cDNA library from osmotically stressed maize seedlings by using the modified CAP trapper method. From this library, 2073 full-length cDNAs were collected and further analyzed by sequencing from both the 5'- and 3'-ends. A total of 1728 (83.4%) sequences did not match known maize mRNA and full-length cDNA sequences in the GenBank database and represent new full-length genes. After alignment of the 2073 full-length cDNAs with 448 maize BAC sequences, it was found that 84 full-length cDNAs could be mapped to the BACs. Of these, 43 genes (51.2%) have been correctly annotated from the BAC clones, 37 genes (44.0%) have been annotated with a different exon-intron structure from our cDNA, and four genes (4.76%) had no annotations in the TIGR database. Expression analysis of 2073 full-length maize cDNAs using a cDNA macroarray led to the identification of 79 genes upregulated by stress treatments and 329 downregulated genes. Of the 79 stress-inducible genes, 30 genes contain ABRE, DRE, MYB, MYC core sequences or other abiotic-responsive cis-acting elements in their promoters. These results suggest that these cis-acting elements and the corresponding transcription factors take part in plant responses to osmotic stress either cooperatively or independently. Additionally, the data suggest that an ethylene signaling pathway may be involved in the maize response to drought stress.
Affiliation(s)
- Jinping Jia
- State Key Laboratory of Agrobiotechnology and National Center for Maize Improvement, China Agricultural University, Beijing, China
|
38
|
Rigault C, Le Borgne F, Demarquoy J. Genomic structure, alternative maturation and tissue expression of the human BBOX1 gene. Biochim Biophys Acta Mol Cell Biol Lipids 2006; 1761:1469-81. [PMID: 17110165 DOI: 10.1016/j.bbalip.2006.09.014] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 04/08/2006] [Revised: 09/29/2006] [Accepted: 09/29/2006] [Indexed: 02/07/2023]
Abstract
Gamma-butyrobetaine hydroxylase (BBOX1) is the enzyme responsible for the biosynthesis of l-carnitine, a key molecule of fatty acid metabolism. This cytosolic dimeric protein belongs to the dioxygenase family. In humans, enzyme activity has been detected in kidney, liver and brain. The human gene encoding gamma-butyrobetaine hydroxylase is located on chromosome 11. Although the protein structure and activity have been extensively described, little information is available concerning BBOX1 gene structure and expression. In this study, the organization of the human gene was determined. The structure and functions of the 5'- and 3'-untranslated regions of the human BBOX1 mRNA were characterized in kidney, liver and brain. Our experiments revealed that transcription initiation of the human BBOX1 gene might occur at 3 different exons, and that the expression level of each type of transcript is organ-specific. We showed that the use of 3 different promoters is responsible for the 5'-end heterogeneity. Investigations on BBOX1 mRNA maturation highlighted an alternative polyadenylation mechanism that generates two 3'-untranslated regions differing in length. This alternative polyadenylation exhibited tissue specificity.
Affiliation(s)
- Caroline Rigault
- Inserm - CRI-Dijon, University of Dijon, UFR Sciences Vie, 6 Blvd. Gabriel, 21000 Dijon, France
|
39
|
Wu G, Doberstein SK. HTS technologies in biopharmaceutical discovery. Drug Discov Today 2006; 11:718-24. [PMID: 16846799 DOI: 10.1016/j.drudis.2006.06.010] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 02/01/2006] [Revised: 04/07/2006] [Accepted: 06/14/2006] [Indexed: 12/31/2022]
Abstract
The concepts and philosophies of HTS can be productively applied to the discovery of new biopharmaceuticals. It is now possible, comprehensively and systematically, to enumerate, clone, produce and screen all secreted proteins, by building upon knowledge accumulated over the past two decades in HTS, genomics and parallel protein expression technologies. Each of the crucial operational components (comprehensive and high-quality cDNA library construction, proper protein-sequence classification, high-throughput protein production, medically relevant assays, state-of-the-art screening and data management) must be optimized to increase the chances of success. In this review, we draw comparisons between small-molecule and protein screening to illuminate common underlying principles as well as differences between the two operations.
Affiliation(s)
- Ge Wu
- Five Prime Therapeutics, 1650 Owens St., Suite 200, San Francisco, CA 94158, USA.
|
40
|
Yamashita R, Suzuki Y, Wakaguri H, Tsuritani K, Nakai K, Sugano S. DBTSS: DataBase of Human Transcription Start Sites, progress report 2006. Nucleic Acids Res 2006; 34:D86-9. [PMID: 16381981 PMCID: PMC1347491 DOI: 10.1093/nar/gkj129] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.6] [Indexed: 11/24/2022]
Abstract
DBTSS was first constructed in 2002 based on precise, experimentally determined 5′ end clones. Several major updates and additions have been made since the last report. First, the number of human clones has drastically increased, going from 190 964 to 1 359 000. Second, information about potential alternative promoters is presented because the number of 5′ end clones is now sufficient to determine several promoters for one gene. Namely, we defined putative promoter groups by clustering transcription start sites (TSSs) separated by <500 bases. A total of 8308 human genes and 4276 mouse genes were found to have putative multiple promoters. Third, DBTSS provides detailed sequence comparisons of user-specified TSSs. Finally, we have added TSS information for zebrafish, malaria and schyzon (a red algae model organism). DBTSS is accessible at .
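The promoter-grouping rule in this abstract (TSSs separated by <500 bases fall into one putative promoter group) can be illustrated with a minimal sketch. It assumes single-linkage grouping of sorted coordinates on one chromosome and strand; the function name `cluster_tss`, the `max_gap` parameter, and the example coordinates are hypothetical and are not taken from DBTSS.

```python
from typing import List


def cluster_tss(positions: List[int], max_gap: int = 500) -> List[List[int]]:
    """Group TSS coordinates into putative promoter groups.

    Assumption: a TSS joins the current group when it lies fewer than
    `max_gap` bases from the previous TSS in sorted order; otherwise it
    starts a new group.
    """
    clusters: List[List[int]] = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] < max_gap:
            clusters[-1].append(pos)  # close enough: same putative promoter
        else:
            clusters.append([pos])    # gap too large: new putative promoter
    return clusters


# Hypothetical TSS coordinates for a single gene.
tss = [1000, 1120, 1499, 2100, 2599, 5000]
groups = cluster_tss(tss)
has_multiple_promoters = len(groups) > 1
```

Under these assumptions, a gene whose TSSs fall into more than one group would be counted among the genes with putative multiple promoters.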
Affiliation(s)
- Yutaka Suzuki
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- To whom correspondence should be addressed. Tel: +81 4 7136 3607; Fax: +81 4 7136 3607
- Hiroyuki Wakaguri
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Sumio Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
|
41
|
Dalla E, Mignone F, Verardo R, Marchionni L, Marzinotto S, Lazarević D, Reid JF, Marzio R, Klarić E, Licastro D, Marcuzzi G, Gambetta R, Pierotti MA, Pesole G, Schneider C. Discovery of 342 putative new genes from the analysis of 5'-end-sequenced full-length-enriched cDNA human transcripts. Genomics 2005; 85:739-51. [PMID: 15885500 DOI: 10.1016/j.ygeno.2005.02.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 07/22/2004] [Revised: 01/31/2005] [Accepted: 02/16/2005] [Indexed: 12/31/2022]
Abstract
In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5'-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40% of our clones extended previously annotated 5' ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation. cDNA-microarray gene expression analysis showed that 260 of 342 unknown clones are expressed in at least one cell line and/or tissue. Further analysis of their sequences and the corresponding genomic locations allowed us to conclude that most of them represent potential novel genes, with only a small fraction having protein-coding potential.
Affiliation(s)
- E Dalla
- Laboratorio Nazionale Consorzio Interuniversitario Biotecnologie, AREA Science Park, 99 Padriciano, 34012 Trieste, Italy
|
42
|
Yamashita R, Suzuki Y, Sugano S, Nakai K. Genome-wide analysis reveals strong correlation between CpG islands with nearby transcription start sites of genes and their tissue specificity. Gene 2005; 350:129-36. [PMID: 15784181 DOI: 10.1016/j.gene.2005.01.012] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.6] [Received: 11/11/2004] [Revised: 12/28/2004] [Accepted: 01/24/2005] [Indexed: 10/25/2022]
Abstract
It has been envisaged that CpG islands are often observed near the transcriptional start sites (TSS) of housekeeping genes. However, neither the precise positions of CpG islands relative to TSS of genes nor the correlation between the presence of the CpG islands and the expression specificity of these genes is well understood. Using thousands of sequences with known TSS in human and mouse, we found that there is a clear peak in the distribution of CpG islands around TSS in the genes of these two species. Thus, we classified human (mouse) genes into 6600 (2948) CpG+ genes and 2619 (1830) CpG- ones, based on the presence of a CpG island within the -100 to +100 region. We estimated the degree of each gene being a housekeeper by the number of cDNA libraries where its ESTs were detected. Then, the tendency that a gene lacking CpG islands around its TSS is expressed with a higher degree of tissue specificity turned out to be evolutionarily conserved. We also confirmed this tendency by analyzing the gene ontology annotation of classified genes. Since no such clear correlation was found in the control data (mRNAs, pre-mRNAs, and chromosome banding pattern), we concluded that the effect of a CpG island near the TSS should be more important than the global GC content of the region where the gene resides.
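The CpG+/CpG- split described in this abstract can be sketched as an interval-overlap check. This is only an illustration under stated assumptions: "presence of a CpG island within the region" is read here as any overlap between an island and the window of 100 bases on either side of the TSS, island coordinates are inclusive and precomputed, and all names and coordinates (`has_cpg_island_near_tss`, `classify_genes`, the example genes) are hypothetical rather than the authors' implementation.

```python
def has_cpg_island_near_tss(tss, islands, flank=100):
    """Return True if any CpG island overlaps [tss - flank, tss + flank].

    Assumption: islands are (start, end) pairs with inclusive coordinates
    on the same chromosome as the TSS.
    """
    window_start, window_end = tss - flank, tss + flank
    return any(start <= window_end and end >= window_start
               for start, end in islands)


def classify_genes(tss_by_gene, islands, flank=100):
    """Label each gene CpG+ or CpG- from its TSS position."""
    return {
        gene: "CpG+" if has_cpg_island_near_tss(tss, islands, flank) else "CpG-"
        for gene, tss in tss_by_gene.items()
    }


# Hypothetical coordinates on one chromosome
islands = [(900, 1400)]
tss_by_gene = {"geneA": 1000, "geneB": 5000, "geneC": 1501}
labels = classify_genes(tss_by_gene, islands)
print(labels)
```

Requiring the island to cover the whole window, rather than merely overlap it, would be a stricter reading and would label fewer genes CpG+.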
Affiliation(s)
- Riu Yamashita
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1, Shirokane-dai Minato-ku, Tokyo 108-8639, Japan
|
43
|
Abstract
The utility of DNA sequence information for phylogenetics and phylogeography is now well known. Rather than attempt to summarize studies addressing this well-demonstrated utility, this chapter focuses on fundamental approaches and techniques that implement the collection of DNA sequence data for comparative phylogenetic purposes in a genomic context (phylogenomics). Whole genome sequencing approaches have changed the way we think about phylogenetics and have opened the way for new perspectives on "old" phylogenetics concerns. Some of these concerns are which gene regions to use and how much sequence information is needed for robust phylogenetic inference. Whole genome sequences of a few animal model organisms have gone a long way to implement approaches to better understand these important phylogenetic concerns. This chapter also addresses how genomics has made it more important for a clear understanding of orthology of gene regions in comparative biology. Finally, genome-enabled technologies that are affecting comparative biology are also discussed.
Affiliation(s)
- Rob DeSalle
- Department of Invertebrate Zoology, American Museum of Natural History, New York, New York 10024, USA
|
44
|
Abstract
Life science in the 21st century is developing rapidly through the structural analysis of biomolecules, the completion of the human genome sequence and the analysis of transcriptomes. The mouse transcriptome has been comprehensively analyzed using a gene discovery approach to collect full-length cDNA (FL-cDNA) clones. The framework of the transcriptome was then mapped out by an international Functional ANnoTation Of Mouse cDNA (FANTOM) effort, and a significant new population of noncoding transcripts was discovered. The geographical analogy of a second "RNA continent," separate from the "continent" of expressed proteins, aids the visualization of this concept. An unexpected number of variations was discovered in the mouse transcriptome. The animal transcriptome has evolved to produce several transcripts and proteins from a single "transcriptional unit". Transcriptome analysis has given rise to the FL-cDNA database and to the 60 770 FANTOM FL-cDNA clone set, and the DNABook was developed as an easier way to distribute these clones. In conjunction with genome sequence databases, transcriptome databases and clone banks will be platforms for developing advanced databases of gene function (e.g. the Genome Function Database). This will enable life science to make rapid progress towards understanding life as a system of molecules.
Affiliation(s)
- Yoshihide Hayashizaki
- Genome Exploration Research Group, Genomic Science Center (GSC), RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045 Japan.
|
45
|
Abstract
The Riken mouse genome encyclopedia is a comprehensive full-length cDNA collection and sequence database. High-level functional annotation is based on sequence homology search, expression profiling, mapping and protein-protein interactions. More than 1,000,000 clones prepared from 163 tissues were end-sequenced and classified into 128,000 clusters, and 60,000 representative clones were fully sequenced, representing 24,000 clear protein-encoding genes. The application of the mouse genome database for positional cloning and gene network regulation analysis is reported.
Affiliation(s)
- Yoshihide Hayashizaki
- Gene Exploration Research Group, Riken Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.
|
46
|
Ali M, Girimaji SC, Kumar A. Identification of a core promoter and a novel isoform of the human TSC1 gene transcript and structural comparison with mouse homolog. Gene 2004; 320:145-54. [PMID: 14597398 DOI: 10.1016/s0378-1119(03)00821-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022]
Abstract
Tuberous sclerosis complex (TSC) is an autosomal dominant disorder with loci on chromosome 9q34.12 (TSC1) and chromosome 16p13.3 (TSC2). Genes for both loci have been isolated and characterized. The promoters of both genes have not been characterized so far and little is known about the regulation of these genes. This study reports the characterization of the human TSC1 promoter region for the first time. We have identified a novel alternative isoform in the 5' untranslated region (UTR) of the TSC1 gene transcript involving exon 1. Alternative isoforms in the 5'UTR of the mouse Tsc1 gene transcript involving exon 1 and exon 2 have also been identified. We have identified three upstream open reading frames (uORFs) in the 5'UTR of the TSC1/Tsc1 gene. A comparative study of the 5'UTR of TSC1/Tsc1 gene has revealed that there is a high degree of similarity not only in the sequence but also in the splicing pattern of both human and mouse TSC1 genes. We have used PCR methodology to isolate approximately 1.6 kb of genomic DNA 5' to the TSC1 cDNA. This sequence has directed a high level of expression of luciferase activity in both HeLa and HepG2 cells. Successive 5' and 3' deletion analysis has suggested that an approximately 587-bp region, from position +77 to -510 relative to the transcription start site (TSS), contains the promoter activity. Interestingly, this region contains no consensus TATA box or CAAT box. However, a 521-bp fragment surrounding the TSS exhibits the characteristics of a CpG island which overlaps with the promoter region. The identification of the TSC1 promoter region will help in designing a suitable strategy to identify mutations in this region in patients who do not show any mutations in the coding regions. It will also help to study the regulation of the TSC1 gene and its role in tumorigenesis.
Affiliation(s)
- Mahmood Ali
- Department of Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bangalore 560 012, India
|
47
|
Jackson A, Jiao PE, Ni I, Fu GK. Agarose gel size fractionation of RNA for the cloning of full-length cDNAs. Anal Biochem 2003; 323:252-5. [PMID: 14656534 DOI: 10.1016/j.ab.2003.10.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Affiliation(s)
- Alan Jackson
- Incyte Corp, 3160 Porter Dr, Palo Alto, CA 94304, USA
|
48
|
Yamanaka I, Kiyosawa H, Kondo S, Saito T, Carninci P, Shinagawa A, Aizawa K, Fukuda S, Hara A, Itoh M, Kawai J, Shibata K, Arakawa T, Ishii Y, Hayashizaki Y. Mapping of 19032 mouse cDNAs on mouse chromosomes. J Struct Funct Genomics 2003; 2:23-8. [PMID: 12836671 DOI: 10.1023/a:1013203019444] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
Finding genes by the positional candidate approach requires abundant cDNAs mapped to chromosomes. To provide such important information, we computationally mapped 19032 of our mouse cDNAs to mouse chromosomes by using data from public databases. We used 2 approaches. In the first, we integrated the mapping data of cDNAs on the human genome, known gene-related data, and comparative mapping data. From this, we calculated map positions on the mouse chromosomes. For this first approach, we developed a simple and powerful criterion to choose the correct map position from candidate positions in sequence homology searches. In the second approach, we related cDNAs to expressed sequence tags (EST) previously mapped in radiation hybrid experiments. We discuss improving the mapping by combining the 2 methods.
Affiliation(s)
- Itaru Yamanaka
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan
|
49
|
Nishiyama T, Fujita T, Shin-I T, Seki M, Nishide H, Uchiyama I, Kamiya A, Carninci P, Hayashizaki Y, Shinozaki K, Kohara Y, Hasebe M. Comparative genomics of Physcomitrella patens gametophytic transcriptome and Arabidopsis thaliana: implication for land plant evolution. Proc Natl Acad Sci U S A 2003; 100:8007-12. [PMID: 12808149 PMCID: PMC164703 DOI: 10.1073/pnas.0932694100] [Citation(s) in RCA: 279] [Impact Index Per Article: 13.3] [Indexed: 11/18/2022]
Abstract
The mosses and flowering plants diverged >400 million years ago. The mosses have haploid-dominant life cycles, whereas the flowering plants are diploid-dominant. The common ancestors of land plants have been inferred to be haploid-dominant, suggesting that genes used in the diploid body of flowering plants were recruited from the genes used in the haploid body of the ancestors during the evolution of land plants. To assess this evolutionary hypothesis, we constructed an EST library of the moss Physcomitrella patens, and compared the moss transcriptome to the genome of Arabidopsis thaliana. We constructed full-length enriched cDNA libraries from auxin-treated, cytokinin-treated, and untreated gametophytes of P. patens, and sequenced both ends of >40,000 clones. These data, together with the mRNA sequences in the public databases, were assembled into 15,883 putative transcripts. Sequence comparisons of A. thaliana and P. patens showed that at least 66% of the A. thaliana genes had homologues in P. patens. Comparison of the P. patens putative transcripts with all known proteins, revealed 9,907 putative transcripts with high levels of similarity to vascular plant genes, and 850 putative transcripts with high levels of similarity to other organisms. The haploid transcriptome of P. patens appears to be quite similar to the A. thaliana genome, supporting the evolutionary hypothesis. Our study also revealed that a number of genes are moss specific and were lost in the flowering plant lineage.
Affiliation(s)
- Tomoaki Nishiyama
- Division of Speciation Mechanisms 2 and Computer Laboratory, National Institute for Basic Biology, Okazaki 444-8585, Japan
|
50
|
Carninci P, Waki K, Shiraki T, Konno H, Shibata K, Itoh M, Aizawa K, Arakawa T, Ishii Y, Sasaki D, Bono H, Kondo S, Sugahara Y, Saito R, Osato N, Fukuda S, Sato K, Watahiki A, Hirozane-Kishikawa T, Nakamura M, Shibata Y, Yasunishi A, Kikuchi N, Yoshiki A, Kusakabe M, Gustincich S, Beisel K, Pavan W, Aidinis V, Nakagawara A, Held WA, Iwata H, Kono T, Nakauchi H, Lyons P, Wells C, Hume DA, Fagiolini M, Hensch TK, Brinkmeier M, Camper S, Hirota J, Mombaerts P, Muramatsu M, Okazaki Y, Kawai J, Hayashizaki Y. Targeting a complex transcriptome: the construction of the mouse full-length cDNA encyclopedia. Genome Res 2003; 13:1273-89. [PMID: 12819125 PMCID: PMC403712 DOI: 10.1101/gr.1119703] [Citation(s) in RCA: 142] [Impact Index Per Article: 6.8] [Indexed: 11/24/2022]
Abstract
We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3'-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5' end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5'-end clusters identify regions that are potential promoters for 8637 known genes and 5'-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.
Affiliation(s)
- Piero Carninci
- Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|