151
Charchar FJ, Zimmerli LU, Tomaszewski M. The pressure of finding human hypertension genes: new tools, old dilemmas. J Hum Hypertens 2008; 22:821-8. [DOI: 10.1038/jhh.2008.67] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/09/2022]
152
Laubinger S, Zeller G, Henz SR, Sachsenberg T, Widmer CK, Naouar N, Vuylsteke M, Schölkopf B, Rätsch G, Weigel D. At-TAX: a whole genome tiling array resource for developmental expression analysis and transcript identification in Arabidopsis thaliana. Genome Biol 2008; 9:R112. [PMID: 18613972 PMCID: PMC2530869 DOI: 10.1186/gb-2008-9-7-r112] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Received: 05/15/2008] [Revised: 06/12/2008] [Accepted: 07/09/2008] [Indexed: 11/10/2022] Open
Abstract
Gene expression maps for model organisms, including Arabidopsis thaliana, have typically been created using gene-centric expression arrays. Here, we describe a comprehensive expression atlas, Arabidopsis thaliana Tiling Array Express (At-TAX), which is based on whole-genome tiling arrays. We demonstrate that tiling arrays are accurate tools for gene expression analysis and identified more than 1,000 unannotated transcribed regions. Visualizations of gene expression estimates, transcribed regions, and tiling probe measurements are accessible online at the At-TAX homepage.
Affiliation(s)
- Sascha Laubinger
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, Spemannstr. 37-39, 72076 Tübingen, Germany.
153
Defining diversity, specialization, and gene specificity in transcriptomes through information theory. Proc Natl Acad Sci U S A 2008; 105:9709-14. [PMID: 18606989 DOI: 10.1073/pnas.0803479105] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.1] [Indexed: 11/18/2022] Open
Abstract
The transcriptome is a set of genes transcribed in a given tissue under specific conditions and can be characterized by a list of genes with their corresponding frequencies of transcription. Transcriptome changes can be measured by counting gene tags from mRNA libraries or by measuring light signals in DNA microarrays. In any case, it is difficult to completely comprehend the global changes that occur in the transcriptome, given that thousands of gene expression measurements are involved. We propose an approach to define and estimate the diversity and specialization of transcriptomes and gene specificity. We define transcriptome diversity as the Shannon entropy of its frequency distribution. Gene specificity is defined as the mutual information between the tissues and the corresponding transcript, allowing detection of either housekeeping or highly specific genes and clarifying the meaning of these concepts in the literature. Tissue specialization is measured by average gene specificity. We introduce the formulae using a simple example and show their application in two datasets of gene expression in human tissues. Visualization of the positions of transcriptomes in a system of diversity and specialization coordinates makes it possible to understand at a glance their interrelations, summarizing in a powerful way which transcriptomes are richer in diversity of expressed genes, or which are relatively more specialized. The framework presented enlightens the relation among transcriptomes, allowing a better understanding of their changes through the development of the organism or in response to environmental stimuli.
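The three quantities named in this abstract can be illustrated with a small Python sketch. This is only one plausible reading of the definitions given here, not the authors' code: the function names are hypothetical, frequencies are assumed to be relative frequencies that sum to 1 within each tissue, and gene specificity is written as the average over tissues of (p_ij / p_i) * log2(p_ij / p_i), where p_i is the mean frequency of gene i across tissues. The exact estimator used in the paper may differ.

```python
import math


def shannon_entropy(freqs):
    """Transcriptome diversity: Shannon entropy (bits) of one tissue's
    relative transcript frequencies. Zero frequencies contribute nothing."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)


def gene_specificity(freqs_across_tissues):
    """Specificity of one gene, given its relative frequency in each tissue.

    Assumed form: (1/t) * sum_j (p_ij / p_i) * log2(p_ij / p_i),
    with p_i the mean frequency of the gene over the t tissues.
    Returns 0 for a gene expressed equally everywhere and log2(t)
    for a gene expressed in exactly one tissue.
    """
    t = len(freqs_across_tissues)
    mean = sum(freqs_across_tissues) / t
    if mean == 0:
        return 0.0
    total = sum((p / mean) * math.log2(p / mean)
                for p in freqs_across_tissues if p > 0)
    return total / t


def tissue_specialization(tissue_freqs, specificities):
    """Tissue specialization: gene specificities averaged with the
    tissue's own transcript frequencies as weights."""
    return sum(p * s for p, s in zip(tissue_freqs, specificities))


# Toy data (hypothetical): rows are tissues, columns are genes.
tissues = {
    "A": [0.5, 0.5, 0.0],
    "B": [0.5, 0.0, 0.5],
}
n_genes = 3
specificities = [
    gene_specificity([tissues[name][g] for name in tissues])
    for g in range(n_genes)
]
diversity_a = shannon_entropy(tissues["A"])
specialization_a = tissue_specialization(tissues["A"], specificities)
```

In this toy example the first gene is expressed equally in both tissues and gets specificity 0 (a housekeeping-like profile), while the other two genes are each confined to one tissue and reach the maximum of log2(2) = 1 bit.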
154
Long noncoding RNAs in mouse embryonic stem cell pluripotency and differentiation. Genome Res 2008; 18:1433-45. [PMID: 18562676 DOI: 10.1101/gr.078378.108] [Citation(s) in RCA: 596] [Impact Index Per Article: 37.3] [Indexed: 01/16/2023]
Abstract
The transcriptional networks that regulate embryonic stem (ES) cell pluripotency and lineage specification are the subject of considerable attention. To date such studies have focused almost exclusively on protein-coding transcripts. However, recent transcriptome analyses show that the mammalian genome contains thousands of long noncoding RNAs (ncRNAs), many of which appear to be expressed in a developmentally regulated manner. The functions of these remain untested. To identify ncRNAs involved in ES cell biology, we used a custom-designed microarray to examine the expression profiles of mouse ES cells differentiating as embryoid bodies (EBs) over a 16-d time course. We identified 945 ncRNAs expressed during EB differentiation, of which 174 were differentially expressed, many correlating with pluripotency or specific differentiation events. Candidate ncRNAs were identified for further characterization by an integrated examination of expression profiles, genomic context, chromatin state, and promoter analysis. Many ncRNAs showed coordinated expression with genomically associated developmental genes, such as Dlx1, Dlx4, Gata6, and Ecsit. We examined two novel developmentally regulated ncRNAs, Evx1as and Hoxb5/6as, which are derived from homeotic loci and share similar expression patterns and localization in mouse embryos with their associated protein-coding genes. Using chromatin immunoprecipitation, we provide evidence that both ncRNAs are associated with trimethylated H3K4 histones and histone methyltransferase MLL1, suggesting a role in epigenetic regulation of homeotic loci during ES cell differentiation. Taken together, our data indicate that long ncRNAs are likely to be important in processes directing pluripotency and alternative differentiation programs, in some cases through engagement of the epigenetic machinery.
155
Djebali S, Kapranov P, Foissac S, Lagarde J, Reymond A, Ucla C, Wyss C, Drenkow J, Dumais E, Murray RR, Lin C, Szeto D, Denoeud F, Calvo M, Frankish A, Harrow J, Makrythanasis P, Vidal M, Salehi-Ashtiani K, Antonarakis SE, Gingeras TR, Guigó R. Efficient targeted transcript discovery via array-based normalization of RACE libraries. Nat Methods 2008; 5:629-35. [PMID: 18500348 PMCID: PMC2713501 DOI: 10.1038/nmeth.1216] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Received: 03/12/2008] [Accepted: 04/24/2008] [Indexed: 11/09/2022]
Abstract
RACE (Rapid Amplification of cDNA Ends) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. Here, we describe a strategy that uses array hybridization to improve sampling efficiency of human transcripts. The products of the RACE reaction are hybridized onto tiling arrays, and the exons detected are used to delineate a series of RT-PCR reactions, through which the original RACE mixture is segregated into simpler RT-PCR reactions. These are independently cloned, and randomly selected clones are sequenced. This approach is superior to direct cloning and sequencing of RACE products: it specifically targets novel transcripts, and often results in overall normalization of transcript abundances. We show theoretically and experimentally that this strategy leads indeed to efficient sampling of novel transcripts, and we investigate multiplexing it by pooling RACE reactions from multiple interrogated loci prior to hybridization.
Affiliation(s)
- Sarah Djebali
- Grup de Recerca en Informàtica Biomèdica, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain
156
Lian Z, Karpikov A, Lian J, Mahajan MC, Hartman S, Gerstein M, Snyder M, Weissman SM. A genomic analysis of RNA polymerase II modification and chromatin architecture related to 3' end RNA polyadenylation. Genome Res 2008; 18:1224-37. [PMID: 18487515 DOI: 10.1101/gr.075804.107] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Indexed: 01/23/2023]
Abstract
Genomic analyses have been applied extensively to analyze the process of transcription initiation in mammalian cells, but less to transcript 3' end formation and transcription termination. We used a novel approach to prepare 3' end fragments from polyadenylated RNA, and mapped the position of the poly(A) addition site using oligonucleotide arrays tiling 1% of the human genome. This approach revealed more 3' ends than had been annotated. The distribution of these ends relative to RNA polymerase II (PolII) and di- and trimethylated lysine 4 and lysine 36 of histone H3 was compared. A substantial fraction of unannotated 3' ends of RNA are intronic and antisense to the embedding gene. Poly(A) ends of annotated messages lie on average 2 kb upstream of the end of PolII binding (termination). Near the termination sites, and in some internal sites, unphosphorylated and C-terminal domain (CTD) serine 2 phosphorylated PolII (POLR2A) accumulate, suggesting pausing of the polymerase and perhaps dephosphorylation prior to release. Lysine 36 trimethylation occurs across transcribed genes, sometimes alternating with stretches of DNA in which lysine 36 dimethylation is more prominent. Lysine 36 methylation decreases at or near the site of polyadenylation, sometimes disappearing before disappearance of phosphorylated RNA PolII or release of PolII from DNA. Our results suggest that transcription termination involves loss of histone 3 lysine 36 methylation and later release of RNA polymerase. The latter is often associated with polymerase pausing. Overall, our study reveals extensive sites of poly(A) addition and provides insights into the events that occur during 3' end formation.
Affiliation(s)
- Zheng Lian
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520-8005, USA
157
Cis- and trans-splicing of mRNAs mediated by tRNA sequences in eukaryotic cells. Proc Natl Acad Sci U S A 2008; 105:6864-9. [PMID: 18458335 DOI: 10.1073/pnas.0800420105] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Indexed: 01/03/2023] Open
Abstract
The formation of chimeric mRNAs is a strategy used by human cells to increase the complexity of their proteome, as revealed by the ENCODE project. Here, we use Saccharomyces cerevisiae to show a way by which trans-spliced mRNAs can be generated. We demonstrate that a pretRNA inserted into a premRNA context directs the splicing reaction precisely to the sites of the tRNA intron. A suppressor pretRNA gene was inserted, in cis, into the sequence encoding the third cytoplasmic loop of the Ste2 or Ste3 G protein-coupled receptor. The hybrid RNAs are spliced at the specific pretRNA splicing sites, releasing both functional tRNAs that suppress nonsense mutations and translatable mRNAs that activate the signal transduction pathway. The RNA molecules extracted from yeast cells were amplified by RT-PCR, and their sequences were determined, confirming the identity of the splice junctions. We then constructed two fusions between the premRNA sequence (STE2 or STE3) and the 5'- or 3'-pretRNA half, so that the two hybrid RNAs can associate with each other, in trans, through their tRNA halves. Splicing occurs at the predicted pretRNA sites, producing a chimeric STE3-STE2 receptor mRNA. RNA trans-splicing mediated by tRNA sequences, therefore, is a mechanism capable of producing new kinds of RNAs, which could code for novel proteins.
158
Abstract
Promoter-proximal polymerase II stalling prepares genes for prompt expression when signals are received. Stalling of RNA polymerase II near the promoter has recently been found to be much more common than previously thought. Genome-wide surveys of the phenomenon suggest that it is likely to be a rate-limiting control on gene activation that poises developmental and stimulus-responsive genes for prompt expression when inducing signals are received.
Affiliation(s)
- Jia Qian Wu
- Molecular, Cellular and Developmental Biology Department, Yale University, PO Box 208103, New Haven, CT 06511, USA.
159
Soldà G, Suyama M, Pelucchi P, Boi S, Guffanti A, Rizzi E, Bork P, Tenchini ML, Ciccarelli FD. Non-random retention of protein-coding overlapping genes in Metazoa. BMC Genomics 2008; 9:174. [PMID: 18416813 PMCID: PMC2330155 DOI: 10.1186/1471-2164-9-174] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 10/29/2007] [Accepted: 04/16/2008] [Indexed: 11/26/2022] Open
Abstract
Background: Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm). Results: For all analyzed species the observed number of overlapping genes is always lower than expected assuming functional neutrality, suggesting that gene overlap is negatively selected. The comparison to the random distribution also shows that retained overlaps do not exhibit random features: antiparallel overlaps are significantly enriched, while overlaps lying on the same strand and those involving coding sequences are highly underrepresented. We confirm that overlap is mostly species-specific and provide evidence that it frequently originates through the acquisition of terminal, non-coding exons. Finally, we show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression. Conclusion: Our data suggest that overlap between protein-coding genes is selected against in Metazoa. However, when retained it may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes. The tendency of overlaps to involve non-coding regions of the genes leads to the speculation that the advantages achieved by an overlapping arrangement may be optimized by evolving regulatory non-coding transcripts.
Affiliation(s)
- Giulia Soldà
- Department of Biology and Genetics for Medical Sciences, University of Milan, 20133 Milan, Italy.
160
Borel C, Gagnebin M, Gehrig C, Kriventseva EV, Zdobnov EM, Antonarakis SE. Mapping of small RNAs in the human ENCODE regions. Am J Hum Genet 2008; 82:971-81. [PMID: 18394580 DOI: 10.1016/j.ajhg.2008.02.016] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 11/16/2007] [Revised: 01/28/2008] [Accepted: 02/26/2008] [Indexed: 10/22/2022] Open
Abstract
The elucidation of the largely unknown transcriptome of small RNAs is crucial for the understanding of genome and cellular function. We report here the results of the analysis of small RNAs (<50 nt) in the ENCODE regions of the human genome. Size-fractionated RNAs from four different cell lines (HepG2, HeLaS3, GM06990, SK-N-SH) were mapped with the forward and reverse ENCODE high-density resolution tiling arrays. The top 1% of hybridization signals are termed SmRfrags (Small RNA fragments). Eight percent of SmRfrags overlap the GENCODE genes (CDS), whereas the majority map to intergenic regions (34%), intronic regions (53%), and untranslated regions (UTRs) (5%). In addition, 9.6% and 16.8% of SmRfrags in the 5' UTR regions overlap significantly with His/Pol II/TAF250 binding sites and DNase I hypersensitive sites, respectively (compared to the 5.3% and 9% expected). Interestingly, 17%-24% (depending on the cell line) of SmRfrags are sense-antisense strand pairs that show evidence of overlapping transcription. Only 3.4% and 7.2% of SmRfrags in intergenic regions overlap transcribed fragments (Txfrags) in HeLa and GM06990 cell lines, respectively. We hypothesized that a fraction of the identified SmRfrags corresponded to microRNAs. We tested by Northern blot a set of 15 high-likelihood predictions of microRNA candidates that overlap with SmRfrags and validated three potential microRNAs (approximately 20 nt in length). Notably, most of the remaining candidates showed a larger hybridizing band (approximately 100 nt) that could be a microRNA precursor. The small RNA transcriptome is emerging as an important and abundant component of genome function.
161
Osada N, Hashimoto K, Kameoka Y, Hirata M, Tanuma R, Uno Y, Inoue I, Hida M, Suzuki Y, Sugano S, Terao K, Kusuda J, Takahashi I. Large-scale analysis of Macaca fascicularis transcripts and inference of genetic divergence between M. fascicularis and M. mulatta. BMC Genomics 2008; 9:90. [PMID: 18294402 PMCID: PMC2287170 DOI: 10.1186/1471-2164-9-90] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Received: 09/27/2007] [Accepted: 02/24/2008] [Indexed: 01/26/2023] Open
Abstract
Background: Cynomolgus macaques (Macaca fascicularis) are widely used as experimental animals in biomedical research and are closely related to other laboratory macaques, such as rhesus macaques (M. mulatta). We isolated 85,721 clones and determined 9407 full-insert sequences from cynomolgus monkey brain, testis, and liver. These sequences were annotated based on homology to human genes and stored in a database, QFbase. Results: We found that 1024 transcripts did not represent any public human cDNA sequence and examined their expression using M. fascicularis oligonucleotide microarrays. Significant expression was detected for 544 (51%) of the unidentified transcripts. Moreover, we identified 226 genes containing exon alterations in the untranslated regions of the macaque transcripts, despite the highly conserved structure of the coding regions. Considering the polymorphism in the common ancestor of cynomolgus and rhesus macaques and the rate of PCR errors, the divergence time between the two species was estimated to be around 0.9 million years ago. Conclusion: Transcript data from Old World monkeys provide a means not only to determine the evolutionary difference between human and non-human primates but also to unveil hidden transcripts in the human genome. Increasing the genomic resources and information of macaque monkeys will greatly contribute to the development of evolutionary biology and biomedical sciences.
Affiliation(s)
- Naoki Osada
- Department of Biomedical Resources, National Institute of Biomedical Innovation, Ibaraki, Japan.
162
Harbers M. The current status of cDNA cloning. Genomics 2008; 91:232-42. [PMID: 18222633 DOI: 10.1016/j.ygeno.2007.11.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 08/17/2007] [Revised: 11/10/2007] [Accepted: 11/17/2007] [Indexed: 11/19/2022]
Abstract
The cloning of cDNAs, copies of cellular RNA, is one of the classical technologies in molecular biology. Over the past 30 years cDNA cloning technologies have been improved to enable the cloning of large cDNA collections, which are fundamental to today's understanding of the utilization of genetic information. With the discovery of noncoding RNAs, additional new approaches to the cloning of short RNAs have been developed. However, with the realization that much larger portions of genomes are transcribed than anticipated from genome annotations, cDNA cloning faces new challenges to uncover rare transcripts and to make the corresponding cDNAs available for functional studies. This review provides an overview on the current status of cDNA cloning and possibilities for the discovery and characterization of new RNA families.
Affiliation(s)
- Matthias Harbers
- DNAFORM, Inc., Leading Venture Plaza 2, 75-1 Ono-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0046, Japan.
163
Wu JQ, Du J, Rozowsky J, Zhang Z, Urban AE, Euskirchen G, Weissman S, Gerstein M, Snyder M. Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome. Genome Biol 2008; 9:R3. [PMID: 18173853 PMCID: PMC2395237 DOI: 10.1186/gb-2008-9-1-r3] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Received: 11/07/2007] [Revised: 12/06/2007] [Accepted: 01/03/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there is still much uncertainty regarding precisely what portion of the genome is transcribed, the exact structures of these novel transcripts, and the levels of the transcripts produced. RESULTS: We have interrogated the transcribed loci in 420 selected ENCyclopedia Of DNA Elements (ENCODE) regions using rapid amplification of cDNA ends (RACE) sequencing. We analyzed annotated known gene regions, but primarily we focused on novel transcriptionally active regions (TARs), which were previously identified by high-density oligonucleotide tiling arrays, and on random regions that were not believed to be transcribed. We found RACE sequencing to be very sensitive and were able to detect low levels of transcripts in specific cell types that were not detectable by microarrays. We also observed many instances of sense-antisense transcripts; further analysis suggests that many of the antisense transcripts (but not all) may be artifacts generated from the reverse transcription reaction. Our results show that the majority of the novel TARs analyzed (60%) are connected to other novel TARs or known exons. Of previously unannotated random regions, 17% were shown to produce overlapping transcripts. Furthermore, it is estimated that 9% of the novel transcripts encode proteins. CONCLUSION: We conclude that RACE sequencing is an efficient, sensitive, and highly accurate method for characterization of the transcriptome of specific cell/tissue types. Using this method, it appears that much of the genome is represented in polyA+ RNA. Moreover, a fraction of the novel RNAs can encode protein and are likely to be functional.
Affiliation(s)
- Jia Qian Wu
- Molecular, Cellular and Developmental Biology Department, KBT918, Yale University, New Haven, Connecticut 06511, USA
164
Frith MC, Carninci P, Kai C, Kawai J, Bailey TL, Hayashizaki Y, Mattick JS. Splicing bypasses 3′ end formation signals to allow complex gene architectures. Gene 2007; 403:188-93. [PMID: 17897791 DOI: 10.1016/j.gene.2007.08.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/08/2007] [Revised: 06/12/2007] [Accepted: 08/21/2007] [Indexed: 11/26/2022]
Abstract
Many genes are arranged in complex overlapping and interlaced patterns in eukaryotic genomes. It is unclear whether or how such genes can avoid interference from each other's RNA processing signals and retain distinct identities. This puzzle applies particularly to 3' end formation sites, which inherently terminate the transcript, and thus act as boundaries between adjacent genes. We hypothesise that the transcript processing machinery can bypass 3' end formation sites by splicing out an intron surrounding the site. We confirm a prediction of this hypothesis: the likelihood of transcripts extending beyond 3' end sites depends on the strength of 3' end formation signals located in exons in the mature transcript, but not of those in introns that are spliced out of the transcript. This bypassing mechanism permits nested and interleaved gene architectures, as well as fusion transcripts that combine exons from adjacent genes.
Affiliation(s)
- Martin C Frith
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Centre (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
165
Reymond A, Henrichsen CN, Harewood L, Merla G. Side effects of genome structural changes. Curr Opin Genet Dev 2007; 17:381-6. [PMID: 17913489 DOI: 10.1016/j.gde.2007.08.009] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.1] [Received: 08/14/2007] [Accepted: 08/17/2007] [Indexed: 12/13/2022]
Abstract
The first extensive catalog of structural human variation was recently released. It showed that large stretches of genomic DNA that vary considerably in copy number were extremely abundant. Thus it is conceivable that they play a major role in functional variation. Consistently, genomic insertions and deletions were shown to contribute to phenotypic differences by modifying not only the expression levels of genes within the aneuploid segments but also of normal copy-number neighboring genes. In this report, we review the possible mechanisms behind this latter effect.
Affiliation(s)
- Alexandre Reymond
- Center for Integrative Genomics, Genopode Building, University of Lausanne, CH-1015 Lausanne, Switzerland.
166
Onnebo SMN, Saiardi A. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell 2007; 129:647-9. [PMID: 17512396 DOI: 10.1016/j.cell.2007.05.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
Abstract
Noncoding RNAs (ncRNA) participate in epigenetic regulation but are poorly understood. Here we characterize the transcriptional landscape of the four human HOX loci at five base pair resolution in 11 anatomic sites and identify 231 HOX ncRNAs that extend known transcribed regions by more than 30 kilobases. HOX ncRNAs are spatially expressed along developmental axes and possess unique sequence motifs, and their expression demarcates broad chromosomal domains of differential histone methylation and RNA polymerase accessibility. We identified a 2.2 kilobase ncRNA residing in the HOXC locus, termed HOTAIR, which represses transcription in trans across 40 kilobases of the HOXD locus. HOTAIR interacts with Polycomb Repressive Complex 2 (PRC2) and is required for PRC2 occupancy and histone H3 lysine-27 trimethylation of HOXD locus. Thus, transcription of ncRNA may demarcate chromosomal domains of gene silencing at a distance; these results have broad implications for gene regulation in development and disease states.
Affiliation(s)
- Sara Maria Nancy Onnebo
- Medical Research Council (MRC) Cell Biology Unit and Laboratory for Molecular Cell Biology, Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK
167
Ruan Y, Ooi HS, Choo SW, Chiu KP, Zhao XD, Srinivasan K, Yao F, Choo CY, Liu J, Ariyaratne P, Bin WG, Kuznetsov VA, Shahab A, Sung WK, Bourque G, Palanisamy N, Wei CL. Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs). Genome Res 2007; 17:828-38. [PMID: 17568001 PMCID: PMC1891342 DOI: 10.1101/gr.6018607] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.6] [Indexed: 11/25/2022]
Abstract
Identification of unconventional functional features such as fusion transcripts is a challenging task in the effort to annotate all functional DNA elements in the human genome. Paired-End diTag (PET) analysis possesses a unique capability to accurately and efficiently characterize the two ends of DNA fragments, which may have either normal or unusual compositions. This unique nature of PET analysis makes it an ideal tool for uncovering unconventional features residing in the human genome. Using the PET approach for comprehensive transcriptome analysis, we were able to identify fusion transcripts derived from genome rearrangements and actively expressed retrotransposed pseudogenes, which would be difficult to capture by other means. Here, we demonstrate this unique capability through the analysis of 865,000 individual transcripts in two types of cancer cells. In addition to the characterization of a large number of differentially expressed alternative 5' and 3' transcript variants and novel transcriptional units, we identified 70 fusion transcript candidates in this study. One was validated as the product of a fusion gene between BCAS4 and BCAS3 resulting from an amplification followed by a translocation event between the two loci, chr20q13 and chr17q23. Through an examination of PETs that mapped to multiple genomic locations, we identified 4055 retrotransposed loci in the human genome, of which at least three were found to be transcriptionally active. The PET mapping strategy presented here promises to be a useful tool in annotating the human genome, especially aberrations in human cancer genomes.
Affiliation(s)
- Yijun Ruan
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Corresponding authors.E-mail ; fax 65-64789059.E-mail ; fax 65-64789059
| | - Hong Sain Ooi
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Siew Woh Choo
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Kuo Ping Chiu
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Xiao Dong Zhao
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - K.G. Srinivasan
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Fei Yao
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Chiou Yu Choo
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Jun Liu
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Pramila Ariyaratne
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Wilson G.W. Bin
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Vladimir A. Kuznetsov
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Atif Shahab
- Bioinformatics Institute, Singapore 138671, Singapore
| | - Wing-Kin Sung
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
- School of Computing, National University of Singapore, Singapore 117543, Singapore
| | - Guillaume Bourque
- Information and Mathematical Science Group, Genome Institute of Singapore, Singapore 138672, Singapore
| | | | - Chia-Lin Wei
- Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore
- Corresponding authors. Fax 65-64789059
|
168
|
Huvet M, Nicolay S, Touchon M, Audit B, d'Aubenton-Carafa Y, Arneodo A, Thermes C. Human gene organization driven by the coordination of replication and transcription. Genome Res 2007; 17:1278-85. [PMID: 17675363 PMCID: PMC1950896 DOI: 10.1101/gr.6533407] [Citation(s) in RCA: 116] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
In this work, we investigated a large-scale organization of the human genes with respect to putative replication origins. We developed an appropriate multiscale method to analyze the nucleotide compositional skew along the genome and found that in more than one-quarter of the genome, the skew profile presents characteristic patterns consisting of successions of N-shaped structures, designated here N-domains, bordered by putative replication origins. Our analysis of recent experimental timing data confirmed that, in a number of cases, domain borders coincide with replication initiation zones active in the early S phase, whereas the central regions replicate in the late S phase. Around the putative origins, genes are abundant and broadly expressed, and their transcription is co-oriented with replication fork progression. These features weaken progressively with the distance from putative replication origins. At the center of domains, genes are rare and expressed in few tissues. We propose that this specific organization could result from the constraints of accommodating the replication and transcription initiation processes at chromatin level, and reducing head-on collisions between the two machineries. Our findings provide a new model of gene organization in the human genome, which integrates transcription, replication, and chromatin structure as coordinated determinants of genome architecture.
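The compositional skew mentioned in the abstract can be illustrated with a fixed-window calculation. The sketch below assumes the commonly used definition S = (T-A)/(T+A) + (G-C)/(G+C); it is not the multiscale method the authors developed, and the function names and default window size are placeholders chosen for the example.

```python
from typing import List


def skew_profile(seq: str, window: int = 1000) -> List[float]:
    """Total skew S = S_TA + S_GC in consecutive non-overlapping windows.

    Assumes S_TA = (T-A)/(T+A) and S_GC = (G-C)/(G+C); a term is set to 0
    when its denominator is 0. Trailing bases that do not fill a window
    are ignored.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        a, c, g, t = (chunk.count(base) for base in "ACGT")
        s_ta = (t - a) / (t + a) if (t + a) else 0.0
        s_gc = (g - c) / (g + c) if (g + c) else 0.0
        profile.append(s_ta + s_gc)
    return profile


def cumulative_skew(profile: List[float]) -> List[float]:
    """Running sum of the windowed skew values."""
    total = 0.0
    out = []
    for s in profile:
        total += s
        out.append(total)
    return out
```

Plotting `skew_profile` along a chromosome is one simple way to look for the abrupt sign changes that the abstract associates with putative replication origins; detecting N-shaped domains reliably would require the multiscale analysis described in the paper.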
Affiliation(s)
- Maxime Huvet
- Centre de Génétique Moléculaire (CNRS), 91198 Gif-sur-Yvette, France
| | - Samuel Nicolay
- Laboratoire Joliot Curie et Laboratoire de Physique, Ecole Normale Supérieure de Lyon, CNRS, 69364 Lyon, France
- Corresponding author. Fax 33-1-69-82-38-77
| | - Marie Touchon
- Centre de Génétique Moléculaire (CNRS), 91198 Gif-sur-Yvette, France
- Génétique des Génomes Bactériens, CNRS URA2171, Institut Pasteur, 75015 Paris, France
- Atelier de Bioinformatique, Université Pierre et Marie Curie-Paris 6, 75005 Paris, France
| | - Benjamin Audit
- Laboratoire Joliot Curie et Laboratoire de Physique, Ecole Normale Supérieure de Lyon, CNRS, 69364 Lyon, France
| | | | - Alain Arneodo
- Laboratoire Joliot Curie et Laboratoire de Physique, Ecole Normale Supérieure de Lyon, CNRS, 69364 Lyon, France
| | - Claude Thermes
- Centre de Génétique Moléculaire (CNRS), 91198 Gif-sur-Yvette, France
- Corresponding author. Fax 33-1-69-82-38-77
|
169
|
Abstract
While the concept of a gene has been helpful in defining the relationship of a portion of a genome to a phenotype, this traditional term may not be as useful as it once was. Currently, "gene" has come to refer principally to a genomic region producing a polyadenylated mRNA that encodes a protein. However, the recent emergence of a large collection of unannotated transcripts with apparently little protein-coding capacity, collectively called transcripts of unknown function (TUFs), has begun to blur the physical boundaries and genomic organization of genic regions, with noncoding transcripts often overlapping protein-coding genes on the same (sense) and opposite (antisense) strands. Moreover, they are often located in intergenic regions, making the genic portions of the human genome an interleaved network of both annotated polyadenylated and nonpolyadenylated transcripts, including splice variants with novel 5' ends extending hundreds of kilobases. This complex transcriptional organization and other recently observed features of genomes argue for the reconsideration of the term "gene" and suggest that transcripts may be used to define the operational unit of a genome.
|
170
|
Lin JM, Collins PJ, Trinklein ND, Fu Y, Xi H, Myers RM, Weng Z. Transcription factor binding and modified histones in human bidirectional promoters. Genome Res 2007; 17:818-27. [PMID: 17568000 PMCID: PMC1891341 DOI: 10.1101/gr.5623407] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
Bidirectional promoters have received considerable attention because of their ability to regulate two downstream genes (divergent genes). They are also highly abundant, directing the transcription of approximately 11% of genes in the human genome. We categorized the presence of DNA sequence motifs, binding of transcription factors, and modified histones as overrepresented, shared, or underrepresented in bidirectional promoters with respect to unidirectional promoters. We found that a small set of motifs, including GABPA, MYC, E2F1, E2F4, NRF-1, CCAAT, YY1, and ACTACAnnTCC are overrepresented in bidirectional promoters, while the majority (73%) of known vertebrate motifs are underrepresented. We performed chromatin-immunoprecipitation (ChIP), followed by quantitative PCR for GABPA, on 118 regions in the human genome and showed that it binds to bidirectional promoters more frequently than unidirectional promoters, and its position-specific scoring matrix is highly predictive of binding. Signatures of active transcription, such as occupancy of RNA polymerase II and the modified histones H3K4me2, H3K4me3, and H3ac, are overrepresented in regions around bidirectional promoters, suggesting that a higher fraction of divergent genes are transcribed in a given cell than the fraction of other genes. Accordingly, analysis of whole-genome microarray data indicates that 68% of divergent genes are transcribed compared with 44% of all human genes. By combining the analysis of publicly available ENCODE data and a detailed study of GABPA, we survey bidirectional promoters with breadth and depth, leading to biological insights concerning their motif composition and bidirectional regulatory mode.
Affiliation(s)
- Jane M. Lin
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215, USA
| | - Patrick J. Collins
- Department of Genetics, Stanford University, School of Medicine, Stanford, California 94305-5120, USA
| | - Nathan D. Trinklein
- Department of Genetics, Stanford University, School of Medicine, Stanford, California 94305-5120, USA
| | - Yutao Fu
- Program in Bioinformatics and Systems Biology, Boston University, Boston, Massachusetts, 02215, USA
| | - Hualin Xi
- Program in Bioinformatics and Systems Biology, Boston University, Boston, Massachusetts, 02215, USA
| | - Richard M. Myers
- Department of Genetics, Stanford University, School of Medicine, Stanford, California 94305-5120, USA
| | - Zhiping Weng
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215, USA
- Program in Bioinformatics and Systems Biology, Boston University, Boston, Massachusetts, 02215, USA
- Corresponding author. Fax (617) 353-6766
|
171
|
Zheng D, Frankish A, Baertsch R, Kapranov P, Reymond A, Choo SW, Lu Y, Denoeud F, Antonarakis SE, Snyder M, Ruan Y, Wei CL, Gingeras TR, Guigó R, Harrow J, Gerstein MB. Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution. Genome Res 2007; 17:839-51. [PMID: 17568002 PMCID: PMC1891343 DOI: 10.1101/gr.5586307] [Citation(s) in RCA: 152] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein-coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high-throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.
Affiliation(s)
- Deyou Zheng
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Corresponding authors. Fax (360) 838-7861
| | - Adam Frankish
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1HH, United Kingdom
| | - Robert Baertsch
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California 95064, USA
| | | | - Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
| | - Siew Woh Choo
- Genome Institute of Singapore, Singapore 138672, Singapore
| | - Yontao Lu
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California 95064, USA
| | - France Denoeud
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, Passeig Marítim de la Barceloneta, 37-49, 08003, Barcelona, Catalonia, Spain
| | - Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
| | - Michael Snyder
- Molecular, Cellular & Developmental Biology Department, Yale University, New Haven, Connecticut 06520, USA
| | - Yijun Ruan
- Genome Institute of Singapore, Singapore 138672, Singapore
| | - Chia-Lin Wei
- Genome Institute of Singapore, Singapore 138672, Singapore
| | | | - Roderic Guigó
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, Passeig Marítim de la Barceloneta, 37-49, 08003, Barcelona, Catalonia, Spain
- Center for Genomic Regulation, Passeig Marítim de la Barceloneta, 37-49, 08003, Barcelona, Catalonia, Spain
| | - Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1HH, United Kingdom
| | - Mark B. Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Department of Computer Science, Yale University, New Haven, Connecticut 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
- Corresponding authors. Fax (360) 838-7861
|
172
|
Use of tiling array data and RNA secondary structure predictions to identify noncoding RNA genes. BMC Genomics 2007; 8:244. [PMID: 17645787 PMCID: PMC1949828 DOI: 10.1186/1471-2164-8-244] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2007] [Accepted: 07/23/2007] [Indexed: 11/23/2022] Open
Abstract
Background Within the last decade a large number of noncoding RNA genes have been identified, but this may only be the tip of the iceberg. Using comparative genomics, a large number of sequences that have signals concordant with conserved RNA secondary structures have been discovered in the human genome. Moreover, genome-wide transcription profiling with tiling arrays indicates that the majority of the genome is transcribed. Results We have combined tiling array data with genome-wide structural RNA predictions to search for novel noncoding and structural RNA genes that are expressed in the human neuroblastoma cell line SK-N-AS. Using this strategy, we identify thousands of human candidate RNA genes. To further verify the expression of these genes, we focused on candidate genes that had stable hairpin structures or a high level of covariance. Using northern blotting, we verify the expression of 2 out of 3 of the hairpin structures and 3 out of 9 high-covariance structures in SK-N-AS cells. Conclusion Our results demonstrate that many human noncoding, structured and conserved RNA genes remain to be discovered and that tissue-specific tiling array data can be used in combination with computational predictions of sequences encoding structural RNAs to improve the search for such genes.
|
173
|
Mehler MF, Mattick JS. Noncoding RNAs and RNA Editing in Brain Development, Functional Diversification, and Neurological Disease. Physiol Rev 2007; 87:799-823. [PMID: 17615389 DOI: 10.1152/physrev.00036.2006] [Citation(s) in RCA: 224] [Impact Index Per Article: 13.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open
Abstract
The progressive maturation and functional plasticity of the nervous system in health and disease involve a dynamic interplay between the transcriptome and the environment. There is a growing awareness that the previously unexplored molecular and functional interface mediating these complex gene-environmental interactions, particularly in brain, may encompass a sophisticated RNA regulatory network involving the twin processes of RNA editing and multifaceted actions of numerous subclasses of non-protein-coding RNAs. The mature nervous system encompasses a wide range of cell types and interconnections. Long-term changes in the strength of synaptic connections are thought to underlie memory retrieval, formation, stabilization, and effector functions. The evolving nervous system involves numerous developmental transitions, such as neurulation, neural tube patterning, neural stem cell expansion and maintenance, lineage elaboration, differentiation, axonal path finding, and synaptogenesis. Although the molecular bases for these processes are largely unknown, RNA-based epigenetic mechanisms appear to be essential for orchestrating these precise and versatile biological phenomena and in defining the etiology of a spectrum of neurological diseases. The concerted modulation of RNA editing and the selective expression of non-protein-coding RNAs during seminal as well as continuous state transitions may comprise the plastic molecular code needed to couple the intrinsic malleability of neural network connections to evolving environmental influences to establish diverse forms of short- and long-term memory, context-specific behavioral responses, and sophisticated cognitive capacities.
Affiliation(s)
- Mark F Mehler
- Institute for Brain Disorders and Neural Regeneration, Department of Neurology, Einstein Cancer Center, Albert Einstein College of Medicine, Bronx, New York 10461, USA.
|
174
|
Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, Goodnough LH, Helms JA, Farnham PJ, Segal E, Chang HY. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell 2007; 129:1311-23. [PMID: 17604720 PMCID: PMC2084369 DOI: 10.1016/j.cell.2007.05.022] [Citation(s) in RCA: 3313] [Impact Index Per Article: 194.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2007] [Revised: 03/28/2007] [Accepted: 05/09/2007] [Indexed: 02/09/2023]
Abstract
Noncoding RNAs (ncRNA) participate in epigenetic regulation but are poorly understood. Here we characterize the transcriptional landscape of the four human HOX loci at five base pair resolution in 11 anatomic sites and identify 231 HOX ncRNAs that extend known transcribed regions by more than 30 kilobases. HOX ncRNAs are spatially expressed along developmental axes and possess unique sequence motifs, and their expression demarcates broad chromosomal domains of differential histone methylation and RNA polymerase accessibility. We identified a 2.2 kilobase ncRNA residing in the HOXC locus, termed HOTAIR, which represses transcription in trans across 40 kilobases of the HOXD locus. HOTAIR interacts with Polycomb Repressive Complex 2 (PRC2) and is required for PRC2 occupancy and histone H3 lysine-27 trimethylation of HOXD locus. Thus, transcription of ncRNA may demarcate chromosomal domains of gene silencing at a distance; these results have broad implications for gene regulation in development and disease states.
Affiliation(s)
- John L Rinn
- Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
|
175
|
Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, Giresi PG, Goldy J, Hawrylycz M, Haydock A, Humbert R, James KD, Johnson BE, Johnson EM, Frum TT, Rosenzweig ER, Karnani N, Lee K, Lefebvre GC, Navas PA, Neri F, Parker SCJ, Sabo PJ, Sandstrom R, Shafer A, Vetrie D, Weaver M, Wilcox S, Yu M, Collins FS, Dekker J, Lieb JD, Tullius TD, Crawford GE, Sunyaev S, Noble WS, Dunham I, Denoeud F, Reymond A, Kapranov P, Rozowsky J, Zheng D, Castelo R, Frankish A, Harrow J, Ghosh S, Sandelin A, Hofacker IL, Baertsch R, Keefe D, Dike S, Cheng J, Hirsch HA, Sekinger EA, Lagarde J, Abril JF, Shahab A, Flamm C, Fried C, Hackermüller J, Hertel J, Lindemeyer M, Missal K, Tanzer A, Washietl S, Korbel J, Emanuelsson O, Pedersen JS, Holroyd N, Taylor R, Swarbreck D, Matthews N, Dickson MC, Thomas DJ, Weirauch MT, Gilbert J, Drenkow J, Bell I, Zhao X, Srinivasan KG, Sung WK, Ooi HS, Chiu KP, Foissac S, Alioto T, Brent M, Pachter L, Tress ML, Valencia A, Choo SW, Choo CY, Ucla C, Manzano C, Wyss C, Cheung E, Clark TG, Brown JB, Ganesh M, Patel S, Tammana H, Chrast J, Henrichsen CN, Kai C, Kawai J, Nagalakshmi U, Wu J, Lian Z, Lian J, Newburger P, Zhang X, Bickel P, Mattick JS, Carninci P, Hayashizaki Y, Weissman S, Hubbard T, Myers RM, Rogers J, Stadler PF, Lowe TM, Wei CL, Ruan Y, Struhl K, Gerstein M, Antonarakis SE, Fu Y, Green ED, Karaöz U, Siepel A, Taylor J, Liefer LA, Wetterstrand KA, Good PJ, Feingold EA, Guyer MS, Cooper GM, Asimenos G, Dewey CN, Hou M, Nikolaev S, Montoya-Burgos JI, Löytynoja A, Whelan S, Pardi F, Massingham T, Huang H, Zhang NR, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Seringhaus M, Church D, Rosenbloom K, Kent WJ, Stone EA, Batzoglou S, Goldman N, Hardison RC, Haussler D, 
Miller W, Sidow A, Trinklein ND, Zhang ZD, Barrera L, Stuart R, King DC, Ameur A, Enroth S, Bieda MC, Kim J, Bhinge AA, Jiang N, Liu J, Yao F, Vega VB, Lee CWH, Ng P, Shahab A, Yang A, Moqtaderi Z, Zhu Z, Xu X, Squazzo S, Oberley MJ, Inman D, Singer MA, Richmond TA, Munn KJ, Rada-Iglesias A, Wallerman O, Komorowski J, Fowler JC, Couttet P, Bruce AW, Dovey OM, Ellis PD, Langford CF, Nix DA, Euskirchen G, Hartman S, Urban AE, Kraus P, Van Calcar S, Heintzman N, Kim TH, Wang K, Qu C, Hon G, Luna R, Glass CK, Rosenfeld MG, Aldred SF, Cooper SJ, Halees A, Lin JM, Shulha HP, Zhang X, Xu M, Haidar JNS, Yu Y, Ruan Y, Iyer VR, Green RD, Wadelius C, Farnham PJ, Ren B, Harte RA, Hinrichs AS, Trumbower H, Clawson H, Hillman-Jackson J, Zweig AS, Smith K, Thakkapallayil A, Barber G, Kuhn RM, Karolchik D, Armengol L, Bird CP, de Bakker PIW, Kern AD, Lopez-Bigas N, Martin JD, Stranger BE, Woodroffe A, Davydov E, Dimas A, Eyras E, Hallgrímsdóttir IB, Huppert J, Zody MC, Abecasis GR, Estivill X, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VVB, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Koriabine M, Nefedov M, Osoegawa K, Yoshinaga Y, Zhu B, de Jong PJ. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 2007; 447:799-816. [PMID: 17571346 PMCID: PMC2212820 DOI: 10.1038/nature05874] [Citation(s) in RCA: 3826] [Impact Index Per Article: 225.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]
Abstract
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
|
176
|
Tsuritani K, Irie T, Yamashita R, Sakakibara Y, Wakaguri H, Kanai A, Mizushima-Sugano J, Sugano S, Nakai K, Suzuki Y. Distinct class of putative "non-conserved" promoters in humans: comparative studies of alternative promoters of human and mouse genes. Genome Res 2007; 17:1005-14. [PMID: 17567985 PMCID: PMC1899111 DOI: 10.1101/gr.6030107] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
Although recent studies have revealed that the majority of human genes are subject to regulation of alternative promoters, the biological relevance of this phenomenon remains unclear. We have also demonstrated that roughly half of the human RefSeq genes examined contain putative alternative promoters (PAPs). Here we report large-scale comparative studies of PAPs between human and mouse counterpart genes. Detailed sequence comparison of the 17,245 putative promoter regions (PPRs) in 5463 PAP-containing human genes revealed that PPRs in only a minor fraction of genes (807 genes) showed clear evolutionary conservation as one or more pairs. Also, we found that there were substantial qualitative differences between conserved and non-conserved PPRs, with the latter class being AT-rich PPRs of relative minor usage, enriched in repetitive elements and sometimes producing transcripts that encode small or no proteins. Systematic luciferase assays of these PPRs revealed that both classes of PPRs did have promoter activity, but that their strength ranges were significantly different. Furthermore, we demonstrate that these characteristic features of the non-conserved PPRs are shared with the PPRs of previously discovered putative non-protein coding transcripts. Taken together, our data suggest that there are two distinct classes of promoters in humans, with the latter class of promoters emerging frequently during evolution.
Affiliation(s)
- Katsuki Tsuritani
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, Minatoku, Tokyo 108-8639, Japan
| | - Takuma Irie
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
| | - Riu Yamashita
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, Minatoku, Tokyo 108-8639, Japan
| | - Yuta Sakakibara
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
| | - Hiroyuki Wakaguri
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
| | - Akinori Kanai
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
| | - Junko Mizushima-Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
- Laboratory of Viral Infection II Kitasato Institute for Life Sciences, Kitasato University, Tokyo 108-8641, Japan
| | - Sumio Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
| | - Kenta Nakai
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, Minatoku, Tokyo 108-8639, Japan
| | - Yutaka Suzuki
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
- Corresponding author. Fax +81-4-7136-3607
|
177
|
Galante PAF, Vidal DO, de Souza JE, Camargo AA, de Souza SJ. Sense-antisense pairs in mammals: functional and evolutionary considerations. Genome Biol 2007; 8:R40. [PMID: 17371592 PMCID: PMC1868933 DOI: 10.1186/gb-2007-8-3-r40] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2006] [Revised: 09/04/2006] [Accepted: 03/19/2007] [Indexed: 12/25/2022] Open
Abstract
Analysis of a catalog of S-AS pairs in the human and mouse genomes revealed several putative roles for natural antisense transcripts and showed that some are artifacts of cDNA library construction. Background A significant number of genes in mammalian genomes are being found to have natural antisense transcripts (NATs). These sense-antisense (S-AS) pairs are believed to be involved in several cellular phenomena. Results Here, we generated a catalog of S-AS pairs occurring in the human and mouse genomes by analyzing different sources of expressed sequences available in the public domain plus 122 massively parallel signature sequencing (MPSS) libraries from a variety of human and mouse tissues. Using this dataset of almost 20,000 S-AS pairs in both genomes we investigated, in a computational and experimental way, several putative roles that have been assigned to NATs, including gene expression regulation. Furthermore, these global analyses allowed us to better dissect and propose new roles for NATs. Surprisingly, we found that a significant fraction of NATs are artifacts produced by genomic priming during cDNA library construction. Conclusion We propose an evolutionary and functional model in which alternative polyadenylation and retroposition account for the origin of a significant number of functional S-AS pairs in mammalian genomes.
Affiliation(s)
- Pedro AF Galante
- Ludwig Institute for Cancer Research, São Paulo Branch, Hospital Alemão Oswaldo Cruz, Rua João Juliao 245, 1 andar, São Paulo, SP 01323-903, Brazil
- Department Of Biochemistry, University of São Paulo, Av. Prof. Lineu Prestes, 748 - sala 351, São Paulo, SP 05508-900, Brazil
| | - Daniel O Vidal
- Ludwig Institute for Cancer Research, São Paulo Branch, Hospital Alemão Oswaldo Cruz, Rua João Juliao 245, 1 andar, São Paulo, SP 01323-903, Brazil
| | - Jorge E de Souza
- Ludwig Institute for Cancer Research, São Paulo Branch, Hospital Alemão Oswaldo Cruz, Rua João Juliao 245, 1 andar, São Paulo, SP 01323-903, Brazil
| | - Anamaria A Camargo
- Ludwig Institute for Cancer Research, São Paulo Branch, Hospital Alemão Oswaldo Cruz, Rua João Juliao 245, 1 andar, São Paulo, SP 01323-903, Brazil
| | - Sandro J de Souza
- Ludwig Institute for Cancer Research, São Paulo Branch, Hospital Alemão Oswaldo Cruz, Rua João Juliao 245, 1 andar, São Paulo, SP 01323-903, Brazil
|
178
|
Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermüller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, Ucla C, Wyss C, Antonarakis SE, Denoeud F, Lagarde J, Drenkow J, Kapranov P, Gingeras TR, Guigó R, Snyder M, Gerstein MB, Reymond A, Hofacker IL, Stadler PF. Structured RNAs in the ENCODE selected regions of the human genome. Genome Res 2007; 17:852-64. [PMID: 17568003 PMCID: PMC1891344 DOI: 10.1101/gr.5650707] [Citation(s) in RCA: 136] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2006] [Accepted: 12/12/2006] [Indexed: 12/16/2022]
Abstract
Functional RNA structures play an important role both in noncoding RNA transcripts and as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic-stochastic context-free grammar (EvoFold) or energy-directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to approximately 2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).
Affiliation(s)
- Stefan Washietl
- Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria.
|
179
|
Denoeud F, Kapranov P, Ucla C, Frankish A, Castelo R, Drenkow J, Lagarde J, Alioto T, Manzano C, Chrast J, Dike S, Wyss C, Henrichsen CN, Holroyd N, Dickson MC, Taylor R, Hance Z, Foissac S, Myers RM, Rogers J, Hubbard T, Harrow J, Guigó R, Gingeras TR, Antonarakis SE, Reymond A. Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions. Genome Res 2007; 17:746-59. [PMID: 17567994 PMCID: PMC1891335 DOI: 10.1101/gr.5660607] [Citation(s) in RCA: 162] [Impact Index Per Article: 9.5] [Received: 06/19/2006] [Accepted: 01/22/2007] [Indexed: 11/24/2022]
Abstract
This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.
Affiliation(s)
- France Denoeud
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Catherine Ucla
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Adam Frankish
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Robert Castelo
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Jorg Drenkow
- Affymetrix, Inc., Santa Clara, California 95051, USA
- Julien Lagarde
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Tyler Alioto
- Center for Genomic Regulation, 08003 Barcelona, Catalonia, Spain
- Caroline Manzano
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Jacqueline Chrast
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Sujit Dike
- Affymetrix, Inc., Santa Clara, California 95051, USA
- Carine Wyss
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Nancy Holroyd
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Mark C. Dickson
- Department of Genetics, Stanford Human Genome Center, Stanford University School of Medicine, Stanford, California 94305-5120, USA
- Ruth Taylor
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Zahra Hance
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Sylvain Foissac
- Center for Genomic Regulation, 08003 Barcelona, Catalonia, Spain
- Richard M. Myers
- Department of Genetics, Stanford Human Genome Center, Stanford University School of Medicine, Stanford, California 94305-5120, USA
- Jane Rogers
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Tim Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Roderic Guigó
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Center for Genomic Regulation, 08003 Barcelona, Catalonia, Spain
- Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Alexandre Reymond
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
|
180
|
Doran G. RNAi - Is one suffix sufficient? J RNAi Gene Silencing 2007; 3:217-9. [PMID: 19771220 PMCID: PMC2737217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2007] [Indexed: 11/04/2022]
Affiliation(s)
- Graeme Doran
- Center for Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
|
181
|
Wiemer EAC. The role of microRNAs in cancer: no small matter. Eur J Cancer 2007; 43:1529-44. [PMID: 17531469 DOI: 10.1016/j.ejca.2007.04.002] [Citation(s) in RCA: 271] [Impact Index Per Article: 15.9] [Received: 03/29/2007] [Accepted: 04/02/2007] [Indexed: 12/19/2022]
Abstract
MicroRNAs are a recently discovered class of small, evolutionarily conserved, RNA molecules that negatively regulate gene expression at the post-transcriptional level. Mature microRNAs of approximately 20-22 nucleotides are formed from longer primary transcripts by two sequential processing steps mediated by a nuclear (Drosha) and a cytoplasmic (Dicer) RNAse III endonuclease. In the context of a protein complex, the RNA-induced silencing complex (RISC), microRNAs base-pair with target messenger RNA sequences causing translational repression and/or messenger RNA degradation. MicroRNAs have been implicated in the control of many fundamental cellular and physiological processes such as tissue development, cellular differentiation and proliferation, metabolic and signalling pathways, apoptosis and stem cell maintenance. Mounting evidence indicates that microRNAs also play a significant role in cellular transformation and carcinogenesis acting either as oncogenes or tumour suppressors. This review briefly introduces microRNAs in a historical perspective and focuses on the biogenesis of microRNAs, their mode of action, mammalian microRNA functions with emphasis on their involvement in disease - particularly cancer - and their potential therapeutic use.
Affiliation(s)
- Erik A C Wiemer
- Department of Medical Oncology, Josephine Nefkens Institute, Erasmus Medical Center, 3015 GE Rotterdam, The Netherlands.
|
182
|
HYBRIDdb: a database of hybrid genes in the human genome. BMC Genomics 2007; 8:128. [PMID: 17519042 PMCID: PMC1890557 DOI: 10.1186/1471-2164-8-128] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 09/26/2006] [Accepted: 05/23/2007] [Indexed: 11/30/2022] Open
Abstract
Background: Hybrid genes are candidate risk factors for human tumors by inducing mutation, translocation, inversion, or rearrangement of genes. The occurrence of hybrid genes may also have given rise to new transcripts during hominid evolution. Description: HYBRIDdb is a database of hybrid genes in humans. This system encompasses the bioinformatics analysis of mRNA, EST, cDNA, and genomic DNA sequences in the INDC databases, and can be used to identify hybrid genes. We searched for hybrid genes among the 28,171 genes listed in the NCBI database, and analyzed their structural patterns in the human genome. A total of 2,344 gene pairs were detected as hybrid forms of transcriptional products. We classified the hybrid genes into two groups: chromosomal-mediated translocation fusion transcripts and transcription-mediated fusion transcripts. Conclusion: The HYBRIDdb database will provide genome scientists with insight into potential roles for hybrid genes in human evolution and disease.
|
183
|
Kapranov P, Willingham AT, Gingeras TR. Genome-wide transcription and the implications for genomic organization. Nat Rev Genet 2007; 8:413-23. [PMID: 17486121 DOI: 10.1038/nrg2083] [Citation(s) in RCA: 529] [Impact Index Per Article: 31.1] [Indexed: 12/16/2022]
Abstract
Recent evidence of genome-wide transcription in several species indicates that the amount of transcription that occurs cannot be entirely accounted for by current sets of genome-wide annotations. Evidence indicates that most of both strands of the human genome might be transcribed, implying extensive overlap of transcriptional units and regulatory elements. These observations suggest that genomic architecture is not colinear, but is instead interleaved and modular, and that the same genomic sequences are multifunctional: that is, used for multiple independently regulated transcripts and as regulatory regions. What are the implications and consequences of such an interleaved genomic architecture in terms of increased information content, transcriptional complexity, evolution and disease states?
Affiliation(s)
- Philipp Kapranov
- Affymetrix, Inc., 3420 Central Expressway, Santa Clara, California 95051, USA
|
184
|
Haddad F, Qin AX, Giger JM, Guo H, Baldwin KM. Potential pitfalls in the accuracy of analysis of natural sense-antisense RNA pairs by reverse transcription-PCR. BMC Biotechnol 2007; 7:21. [PMID: 17480233 PMCID: PMC1876213 DOI: 10.1186/1472-6750-7-21] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.4] [Received: 10/05/2006] [Accepted: 05/04/2007] [Indexed: 01/25/2023] Open
Abstract
Background: The ability to accurately measure patterns of gene expression is essential in studying gene function. The reverse transcription polymerase chain reaction (RT-PCR) has become the method of choice for the detection and measurement of RNA expression patterns in both cells and small quantities of tissue. Our previous results show that there is a significant production of primer-independent cDNA synthesis using a popular RNase H- RT enzyme. A PCR product was amplified from RT reactions that were carried out without addition of RT-primer. This finding jeopardizes the accuracy of RT-PCR when analyzing RNA that is expressed in both orientations. Current literature findings suggest that naturally occurring antisense expression is widespread in the mammalian transcriptome and consists of both coding and non-coding regulatory RNA. The primary purpose of this present study was to investigate the occurrence of primer-independent cDNA synthesis and how it may influence the accuracy of detection of sense-antisense RNA pairs. Results: Our findings on cellular RNA and in vitro synthesized RNA suggest that these products are likely the results of RNA self-priming to generate random cDNA products, which contributes to the loss of strand specificity. The use of RNase H+ RT enzyme and carrying out the RT reaction at high temperature (50°C) greatly improved the strand specificity of the RT-PCR detection. Conclusion: While RT-PCR is a basic method used for the detection and quantification of RNA expression in cells, primer-independent cDNA synthesis can interfere with RT specificity, and may lead to misinterpretation of the results, especially when both sense and antisense RNA are expressed. For accurate interpretation of the results, it is essential to carry out the appropriate negative controls.
Affiliation(s)
- Fadia Haddad
- Physiology and Biophysics Department; University of California Irvine, Irvine, CA 92697; USA
- Anqi X Qin
- Physiology and Biophysics Department; University of California Irvine, Irvine, CA 92697; USA
- Julie M Giger
- Physiology and Biophysics Department; University of California Irvine, Irvine, CA 92697; USA
- Hongyan Guo
- Physiology and Biophysics Department; University of California Irvine, Irvine, CA 92697; USA
- Kenneth M Baldwin
- Physiology and Biophysics Department; University of California Irvine, Irvine, CA 92697; USA
|
185
|
Abstract
SUMMARY
It is usually thought that the development of complex organisms is controlled by protein regulatory factors and morphogenetic signals exchanged between cells and differentiating tissues during ontogeny. However, it is now evident that the majority of all animal genomes is transcribed, apparently in a developmentally regulated manner, suggesting that these genomes largely encode RNA machines and that there may be a vast hidden layer of RNA regulatory transactions in the background. I propose that the epigenetic trajectories of differentiation and development are primarily programmed by feed-forward RNA regulatory networks and that most of the information required for multicellular development is embedded in these networks, with cell–cell signalling required to provide important positional information and to correct stochastic errors in the endogenous RNA-directed program.
Affiliation(s)
- John S Mattick
- ARC Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia QLD 4072, Australia.
|
186
|
Pajares MJ, Ezponda T, Catena R, Calvo A, Pio R, Montuenga LM. Alternative splicing: an emerging topic in molecular and clinical oncology. Lancet Oncol 2007; 8:349-57. [PMID: 17395108 DOI: 10.1016/s1470-2045(07)70104-3] [Citation(s) in RCA: 202] [Impact Index Per Article: 11.9] [Indexed: 11/18/2022]
Abstract
Alternative pre-mRNA splicing is a key molecular event that allows for protein diversity. Through this process, a single gene increases its coding capacity by expressing several related proteins with diverse and even antagonistic functions. Aberrant splicing has been found to be associated with various diseases, including cancer. Mutations in splicing regulatory elements within the nucleotide sequence and alterations in the cellular-splicing-regulatory machinery both result in changes in the splicing pattern of many cancer-related genes. The analysis of cancer-specific alternative splicing and its molecular consequences is promising. In this review we summarise the current knowledge on the mechanisms governing abnormal alternative splicing in cancer and the biological consequences associated with the alteration of splicing in some relevant cancer-related genes. The use of alternative splicing as a potential source for new diagnostic, prognostic, predictive, and therapeutic tools is also discussed.
Affiliation(s)
- María J Pajares
- Oncology Division, Centre for Applied Medical Research, School of Medicine, University of Navarra, Pamplona, Spain
|
187
|
Carninci P, Hayashizaki Y. Noncoding RNA transcription beyond annotated genes. Curr Opin Genet Dev 2007; 17:139-44. [PMID: 17317145 DOI: 10.1016/j.gde.2007.02.008] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.6] [Received: 11/29/2006] [Accepted: 02/12/2007] [Indexed: 11/20/2022]
Abstract
Recent analyses based on high-throughput transcriptome data have revealed that the fraction of the genome that is transcribed largely exceeds the fraction encoding protein. Transcription of unconventional genes into noncoding RNAs is widespread and, in mammals, these RNAs comprise at least half the total number of RNAs transcribed by RNA polymerase II. Although the function of the majority of noncoding RNAs has yet to be discovered, many of them are transcribed from both strands of the genome, and evidence points towards a regulatory function for many noncoding RNAs in mammalian cells.
Affiliation(s)
- Piero Carninci
- Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan.
|
188
|
Louro R, Nakaya HI, Amaral PP, Festa F, Sogayar MC, da Silva AM, Verjovski-Almeida S, Reis EM. Androgen responsive intronic non-coding RNAs. BMC Biol 2007; 5:4. [PMID: 17263875 PMCID: PMC1800835 DOI: 10.1186/1741-7007-5-4] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Received: 09/01/2006] [Accepted: 01/30/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Transcription of large numbers of non-coding RNAs originating from intronic regions of human genes has been recently reported, but mechanisms governing their biosynthesis and biological functions are largely unknown. In this work, we evaluated the existence of a common mechanism of transcription regulation shared by protein-coding mRNAs and intronic RNAs by measuring the effect of androgen on the transcriptional profile of a prostate cancer cell line. RESULTS: Using a custom-built cDNA microarray enriched in intronic transcribed sequences, we found 39 intronic non-coding RNAs for which levels were significantly regulated by androgen exposure. Orientation-specific reverse transcription-PCR indicated that 10 of the 13 were transcribed in the antisense direction. These transcripts are long (0.5-5 kb), unspliced and apparently do not code for proteins. Interestingly, we found that the relative levels of androgen-regulated intronic transcripts could be correlated with the levels of the corresponding protein-coding gene (asGAS6 and asDNAJC3) or with the alternative usage of exons (asKDELR2 and asITGA6) in the corresponding protein-coding transcripts. Binding of the androgen receptor to a putative regulatory region upstream from asMYO5A, an androgen-regulated antisense intronic transcript, was confirmed by chromatin immunoprecipitation. CONCLUSION: Altogether, these results indicate that at least a fraction of naturally transcribed intronic non-coding RNAs may be regulated by common physiological signals such as hormones, and further corroborate the notion that the intronic complement of the transcriptome plays functional roles in the human gene-expression program.
Affiliation(s)
- Rodrigo Louro
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
- Helder I Nakaya
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
- Paulo P Amaral
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
- Fernanda Festa
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
- Mari C Sogayar
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
- Aline M da Silva
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
- Sergio Verjovski-Almeida
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
- Eduardo M Reis
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, Brazil
|
189
|
Prasanth KV, Spector DL. Eukaryotic regulatory RNAs: an answer to the 'genome complexity' conundrum. Genes Dev 2007; 21:11-42. [PMID: 17210785 DOI: 10.1101/gad.1484207] [Citation(s) in RCA: 301] [Impact Index Per Article: 17.7] [Indexed: 11/24/2022]
Abstract
A large portion of the eukaryotic genome is transcribed as noncoding RNAs (ncRNAs). While once thought of primarily as "junk," recent studies indicate that a large number of these RNAs play central roles in regulating gene expression at multiple levels. The increasing diversity of ncRNAs identified in the eukaryotic genome suggests a critical nexus between the regulatory potential of ncRNAs and the complexity of genome organization. We provide an overview of recent advances in the identification and function of eukaryotic ncRNAs and the roles played by these RNAs in chromatin organization, gene expression, and disease etiology.
|
190
|
Emerick MC, Parmigiani G, Agnew WS. Multivariate analysis and visualization of splicing correlations in single-gene transcriptomes. BMC Bioinformatics 2007; 8:16. [PMID: 17233916 PMCID: PMC1785386 DOI: 10.1186/1471-2105-8-16] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 05/19/2006] [Accepted: 01/18/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: RNA metabolism, through 'combinatorial splicing', can generate enormous structural diversity in the proteome. Alternative domains may interact, however, with unpredictable phenotypic consequences, necessitating integrated RNA-level regulation of molecular composition. Splicing correlations within transcripts of single genes provide valuable clues to functional relationships among molecular domains as well as genomic targets for higher-order splicing regulation. RESULTS: We present tools to visualize complex splicing patterns in full-length cDNA libraries. Developmental changes in pair-wise correlations are presented vectorially in 'clock plots' and linkage grids. Higher-order correlations are assessed statistically through Monte Carlo analysis of a log-linear model with an empirical-Bayes estimate of the true probabilities of observed and unobserved splice forms. Log-linear coefficients are visualized in a 'spliceprint,' a signature of splice correlations in the transcriptome. We present two novel metrics: the linkage change index, which measures the directional change in pair-wise correlation with tissue differentiation, and the accuracy index, a very simple goodness-of-fit metric that is more sensitive than the integrated squared error when applied to sparsely populated tables, and unlike chi-square, does not diverge at low variance. Considerable attention is given to sparse contingency tables, which are inherent to single-gene libraries. CONCLUSION: Patterns of splicing correlations are revealed, which span a broad range of interaction order and change in development. The methods have a broad scope of applicability beyond the single gene, including, for example, multiple gene interactions in the complete transcriptome.
Affiliation(s)
- Mark C Emerick
- Department of Physiology, Johns Hopkins Medical School, Baltimore, MD 21205 USA
- Giovanni Parmigiani
- Departments of Oncology, Zoology, Johns Hopkins Medical School, and Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205 USA
- William S Agnew
- Departments of Physiology and Neuroscience, Johns Hopkins Medical School, Baltimore, MD 21205 USA
|
191
|
Nakaya HI, Amaral PP, Louro R, Lopes A, Fachel AA, Moreira YB, El-Jundi TA, da Silva AM, Reis EM, Verjovski-Almeida S. Genome mapping and expression analyses of human intronic noncoding RNAs reveal tissue-specific patterns and enrichment in genes related to regulation of transcription. Genome Biol 2007; 8:R43. [PMID: 17386095 PMCID: PMC1868932 DOI: 10.1186/gb-2007-8-3-r43] [Citation(s) in RCA: 155] [Impact Index Per Article: 9.1] [Received: 10/17/2006] [Revised: 01/17/2007] [Accepted: 03/26/2007] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: RNAs transcribed from intronic regions of genes are involved in a number of processes related to post-transcriptional control of gene expression. However, the complement of human genes in which introns are transcribed, and the number of intronic transcriptional units and their tissue expression patterns are not known. RESULTS: A survey of mRNA and EST public databases revealed more than 55,000 totally intronic noncoding (TIN) RNAs transcribed from the introns of 74% of all unique RefSeq genes. Guided by this information, we designed an oligoarray platform containing sense and antisense probes for each of 7,135 randomly selected TIN transcripts plus the corresponding protein-coding genes. We identified exonic and intronic tissue-specific expression signatures for human liver, prostate and kidney. The most highly expressed antisense TIN RNAs were transcribed from introns of protein-coding genes significantly enriched (p = 0.002 to 0.022) in the 'Regulation of transcription' Gene Ontology category. RNA polymerase II inhibition resulted in increased expression of a fraction of intronic RNAs in cell cultures, suggesting that other RNA polymerases may be involved in their biosynthesis. Members of a subset of intronic and protein-coding signatures transcribed from the same genomic loci have correlated expression patterns, suggesting that intronic RNAs regulate the abundance or the pattern of exon usage in protein-coding messages. CONCLUSION: We have identified diverse intronic RNA expression patterns, pointing to distinct regulatory roles. This gene-oriented approach, using a combined intron-exon oligoarray, should permit further comparative analysis of intronic transcription under various physiological and pathological conditions, thus advancing current knowledge about the biological functions of these noncoding RNAs.
Affiliation(s)
- Helder I Nakaya
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Paulo P Amaral
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Rodrigo Louro
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- André Lopes
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Angela A Fachel
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Yuri B Moreira
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Tarik A El-Jundi
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Aline M da Silva
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Eduardo M Reis
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Sergio Verjovski-Almeida
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
|
192
|
Heber S, Sick B. Quality assessment of Affymetrix GeneChip data. OMICS 2006; 10:358-68. [PMID: 17069513 DOI: 10.1089/omi.2006.10.358] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.4] [Indexed: 11/12/2022]
Abstract
Affymetrix GeneChips are one of the best established microarray platforms. This powerful technique allows users to measure the expression of thousands of genes simultaneously. However, a microarray experiment is a sophisticated and time consuming endeavor with many potential sources of unwanted variation that could compromise the results if left uncontrolled. Increasing data volume and data complexity have triggered growing concern and awareness of the importance of assessing the quality of generated microarray data. In this review, we give an overview of current methods and software tools for quality assessment of Affymetrix GeneChip data. We focus on quality metrics, diagnostic plots, probe-level methods, pseudo-images, and classification methods to identify corrupted chips. We also describe RNA quality assessment methods which play an important role in challenging RNA sources like formalin embedded biopsies, laser-micro dissected samples, or single cells. No wet-lab methods are discussed in this paper.
Affiliation(s)
- Steffen Heber
- Department of Computer Science, North Carolina State University, Raleigh, North Carolina, USA
|
193
|
Abstract
For a long time, molecular evolutionary biologists have been focused on DNA and proteins, whereas RNA has lived in the shadow of its famous chemical cousins as a mere intermediary. Although this perspective has begun to change since genome-wide transcriptional profiling was successfully extended to evolutionary biology, it still echoes in evolutionary literature. In this mini-review, new developments of RNA biochemistry and transcriptomics are brought to the attention of evolutionary biologists. In particular, the unexpected abundance and functional significance of noncoding RNAs is briefly reviewed. Noncoding RNAs control a remarkable range of biological pathways and processes, all with obvious fitness consequences, such as initiation of translation, mRNA abundance, transposon jumping, chromosome architecture, stem cell maintenance, development of brain and muscles, insulin secretion, cancerogenesis and plant resistance to viral infections.
Affiliation(s)
- P Michalak
- Department of Biology, The University of Texas at Arlington, Arlington, TX 76010, USA.
|
194
|
Werner A, Schmutzler G, Carlile M, Miles CG, Peters H. Expression profiling of antisense transcripts on DNA arrays. Physiol Genomics 2006; 28:294-300. [PMID: 17105753 DOI: 10.1152/physiolgenomics.00127.2006] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 01/22/2023] Open
Abstract
The majority of mouse genes are estimated to undergo bidirectional transcription; however, their tissue-specific distribution patterns and physiological significance are largely unknown. This is in part due to the lack of methodology to routinely assess the expression of natural antisense transcripts (NATs) on a large scale. Here we tested whether commercial DNA arrays can be used to monitor antisense transcription in mouse kidney and brain. We took advantage of the reversely annotated oligonucleotides on the U74 mouse genome array from Affymetrix that hybridize to NATs overlapping with the sense transcript in the area of the probe set. In RNA samples from mouse kidney and brain, 11.9% and 10.1%, respectively, of 5,652 potential NATs returned positive and about half of the antisense RNAs were detected in both tissues, which was similar to the fraction of sense transcripts expressed in both tissues. Notably, we found that the majority of NATs are related to the sense transcriptome since corresponding sense transcripts were detected for 92.5% (kidney) and 74.5% (brain) of the detected antisense RNAs. Antisense RNA transcription was confirmed by real-time PCR and included additional RNA samples from heart, thymus, and liver. The randomly selected transcripts showed tissue specific expression patterns and varying sense/antisense ratios. The results indicate that antisense transcriptomes are tissue specific, and although pairing of sense/antisense transcripts are known to result in rapid degradation, our data provide proof of principle that the sensitivity of commercial DNA arrays is sufficient to assess NATs in total RNA of whole organs.
Affiliation(s)
- Andreas Werner
- Epithelial Research Group, Institute for Cell and Molecular Biosciences.
|
195
|
Thiebaut M, Kisseleva-Romanova E, Rougemaille M, Boulay J, Libri D. Transcription termination and nuclear degradation of cryptic unstable transcripts: a role for the nrd1-nab3 pathway in genome surveillance. Mol Cell 2006; 23:853-64. [PMID: 16973437 DOI: 10.1016/j.molcel.2006.07.029] [Citation(s) in RCA: 196] [Impact Index Per Article: 10.9] [Received: 04/24/2006] [Revised: 06/23/2006] [Accepted: 07/28/2006] [Indexed: 11/25/2022]
Abstract
Cryptic unstable transcripts (CUTs) are widely distributed in the genome of S. cerevisiae. These RNAs generally derive from nonannotated regions of the genome and are degraded rapidly and efficiently by the nuclear exosome via a pathway that involves degradative polyadenylation by a new poly(A) polymerase borne by the TRAMP complex. How much significant information is encrypted in CUTs, and what distinguishes a CUT from other Pol II transcripts, remain unclear to date. Here we report the dissection of the molecular mechanism that leads to degradation of a model CUT, NEL025c. We show that the Nrd1p-Nab3p-dependent pathway, involved in transcription termination of sno/snRNAs, is required, albeit not sufficient, for efficient degradation of NEL025c RNAs and at least a subset of other CUTs. Our results suggest an important role for the Nrd1p-Nab3p pathway in the control of gene expression throughout the genome.
Affiliation(s)
- Marilyne Thiebaut
- Centre de Génétique Moléculaire, Centre National de la Recherche Scientifique, 91190 Gif sur Yvette, France
|
196
|
Wahlestedt C. Natural antisense and noncoding RNA transcripts as potential drug targets. Drug Discov Today 2006; 11:503-8. [PMID: 16713901 DOI: 10.1016/j.drudis.2006.04.013] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.9] [Received: 12/06/2005] [Revised: 03/17/2006] [Accepted: 04/10/2006] [Indexed: 12/26/2022]
Abstract
Information on the complexity of mammalian RNA transcription has increased greatly in the past few years. Notably, thousands of sense transcripts (conventional protein-coding genes) have antisense transcript partners, most of which are noncoding. Interestingly, a number of antisense transcripts regulate the expression of their sense partners, either in a discordant (antisense knockdown results in sense-transcript elevation) or concordant (antisense knockdown results in concomitant sense-transcript reduction) manner. Two new pharmacological strategies based on the knockdown of antisense RNA transcripts by siRNA (or another RNA targeting principle) are proposed in this review. In the case of discordant regulation, knockdown of antisense transcript elevates the expression of the conventional (sense) gene, thereby conceivably mimicking agonist-activator action. In the case of concordant regulation, knockdown of antisense transcript, or concomitant knockdown of antisense and sense transcripts, results in an additive or even synergistic reduction of the conventional gene expression. Although both strategies have been demonstrated to be valid in cell culture, it remains to be seen whether they provide advantages in other contexts.
Affiliation(s)
- Claes Wahlestedt
- The Scripps Research Institute (Scripps Florida), 5353 Parkside Drive, RF-2, Jupiter, FL 33458, USA.
|
197
|
Costa FF. Non-coding RNAs: lost in translation? Gene 2006; 386:1-10. [PMID: 17113247 DOI: 10.1016/j.gene.2006.09.028] [Citation(s) in RCA: 97] [Impact Index Per Article: 5.4] [Received: 05/16/2006] [Revised: 08/15/2006] [Accepted: 09/13/2006] [Indexed: 01/07/2023]
Abstract
In the last ten years, several RNAs with no protein-coding potential have been accumulating in RNA databases and are in need of further molecular characterization. At the same time, examples of non-coding RNAs (ncRNAs) such as microRNAs, small RNAs, small interfering RNAs (siRNAs) and medium/large RNAs with various functions have been described in the literature. Recent evidence points to a widespread role of these molecules in eukaryotic cells, suggesting that the majority of the new ncRNA examples might have specific functions. The aim of this review is to describe several new functional ncRNAs that have been recently identified and characterized, providing some clues that these molecules might not be produced by chance or as by-products of transcription as has been speculated.
Affiliation(s)
- Fabrício F Costa
- Cancer Biology and Epigenomics Program, Children's Memorial Research Center and Northwestern University's Feinberg School of Medicine, 2300 Children's Plaza, Box 220, Chicago, IL 60614, USA
|
198
|
Fredlake CP, Hert DG, Mardis ER, Barron AE. What is the future of electrophoresis in large-scale genomic sequencing? Electrophoresis 2006; 27:3689-702. [PMID: 17031784 DOI: 10.1002/elps.200600408] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Indexed: 12/17/2022]
Abstract
Although a finished human genome reference sequence is now available, the ability to sequence large, complex genomes remains critically important for researchers in the biological sciences. In particular, continued human genomic sequence determination will ultimately help to realize the promise of medical care tailored to an individual's unique genetic identity. Many new technologies are being developed to decrease the costs and to dramatically increase the data acquisition rate of such sequencing projects. These new sequencing approaches include Sanger reaction-based technologies that have electrophoresis as the final separation step as well as those that use completely novel, nonelectrophoretic methods to generate sequence data. In this review, we discuss the various advances in sequencing technologies and evaluate the limitations of novel methods that currently preclude their complete acceptance in large-scale sequencing projects. Our primary goal is to analyze and predict the continuing role of electrophoresis in large-scale DNA sequencing, both in the near and longer term.
Affiliation(s)
- Christopher P Fredlake
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
|
199
|
|
200
|
Khaitovich P, Kelso J, Franz H, Visagie J, Giger T, Joerchel S, Petzold E, Green RE, Lachmann M, Pääbo S. Functionality of intergenic transcription: an evolutionary comparison. PLoS Genet 2006; 2:e171. [PMID: 17040132 PMCID: PMC1599769 DOI: 10.1371/journal.pgen.0020171] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Received: 06/29/2006] [Accepted: 08/28/2006] [Indexed: 11/30/2022]
Abstract
Although a large proportion of human transcription occurs outside the boundaries of known genes, the functional significance of this transcription remains unknown. We have compared the expression patterns of known genes as well as intergenic transcripts within the ENCODE regions between humans and chimpanzees in brain, heart, testis, and lymphoblastoid cell lines. We find that intergenic transcripts show patterns of tissue-specific conservation of their expression, which are comparable to exonic transcripts of known genes. This suggests that intergenic transcripts are subject to functional constraints that restrict their rate of evolutionary change as well as putative positive selection to an extent comparable to that of classical protein-coding genes. In brain and testis, we find that part of this intergenic transcription is caused by widespread use of alternative promoters. Further, we find that about half of the expression differences between humans and chimpanzees are due to intergenic transcripts. In order to convert the genetic information encoded in an organism's genomic sequence into the functional features, the genomic sequence must be transcribed. According to the current genome annotation, the human genome encodes 20,000–25,000 protein-coding transcripts and a smaller number of non-coding transcripts. There is, however, a growing body of evidence indicating that a much greater proportion of the human genome is transcribed than is accounted for by the existing annotation. Much of this evidence has been found using tiling arrays, microarrays that enable the measurement of transcription regardless of existing annotation. Although some have suggested that these transcripts represent previously unidentified functional RNAs as well as extensions of known genes, the extent of their functionality remains unknown. In this study, Khaitovich et al. assess the functionality of these novel transcripts by testing the extent to which their expression is conserved between humans and chimpanzees in different tissues. The results suggest that, surprisingly, the expression of both known and novel transcripts was affected by the same functional constraints during human and chimpanzee evolution.
|