1
Chen Y, Kuroki Y, Shaw G, Pask AJ, Yu H, Toyoda A, Fujiyama A, Renfree MB. Androgen and Oestrogen Affect the Expression of Long Non-Coding RNAs During Phallus Development in a Marsupial. Noncoding RNA 2018; 5:E3. [PMID: 30598023 PMCID: PMC6468475 DOI: 10.3390/ncrna5010003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/27/2018] [Revised: 12/19/2018] [Accepted: 12/27/2018] [Indexed: 12/24/2022] Open
Abstract
There is increasing evidence that long non-coding RNAs (lncRNAs) are important for normal reproductive development, yet very few lncRNAs have been identified in phalluses so far. Unlike eutherians, phallus development in the marsupial tammar wallaby occurs post-natally, enabling manipulation not possible in eutherians in which differentiation occurs in utero. We treated with sex steroids to determine the effects of androgen and oestrogen on lncRNA expression during phallus development. Hormonal manipulations altered the coding and non-coding gene expression profile of phalluses. We identified several predicted co-regulatory lncRNAs that appear to be co-expressed with the hormone-responsive candidate genes regulating urethral closure and phallus growth, namely IGF1, AR and ESR1. Interestingly, more than 50% of AR-associated coding genes and lncRNAs were also associated with ESR1. In addition, we identified and validated three novel co-regulatory and hormone-responsive lncRNAs: lnc-BMP5, lnc-ZBTB16 and lncRSPO4. Lnc-BMP5 was detected in the urethral epithelium of male phalluses and was downregulated by oestrogen in males. Lnc-ZBTB16 was downregulated by oestrogen treatment in male phalluses at day 50 post-partum (pp). LncRSPO4 was downregulated by adiol treatment in female phalluses but increased in male phalluses after castration. Thus, the expression pattern and hormone responsiveness of these lncRNAs suggests a physiological role in the development of the phallus.
Affiliation(s)
- Yu Chen
- School of BioSciences, The University of Melbourne 3010, VIC, Australia.
- Yoko Kuroki
- RIKEN, Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.
- Geoff Shaw
- School of BioSciences, The University of Melbourne 3010, VIC, Australia.
- Andrew J Pask
- School of BioSciences, The University of Melbourne 3010, VIC, Australia.
- Hongshi Yu
- School of BioSciences, The University of Melbourne 3010, VIC, Australia.
- Atsushi Toyoda
- Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan.
- Asao Fujiyama
- Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan.
- Marilyn B Renfree
- School of BioSciences, The University of Melbourne 3010, VIC, Australia.
2
Askarian-Amiri ME, Seyfoddin V, Smart CE, Wang J, Kim JE, Hansji H, Baguley BC, Finlay GJ, Leung EY. Emerging role of long non-coding RNA SOX2OT in SOX2 regulation in breast cancer. PLoS One 2014; 9:e102140. [PMID: 25006803 PMCID: PMC4090206 DOI: 10.1371/journal.pone.0102140] [Citation(s) in RCA: 100] [Impact Index Per Article: 10.0] [Received: 04/09/2014] [Accepted: 06/13/2014] [Indexed: 02/02/2023] Open
Abstract
The transcription factor SOX2 is essential for maintaining pluripotency in a variety of stem cells. It has important functions during embryonic development, is involved in cancer stem cell maintenance, and is often deregulated in cancer. The mechanism of SOX2 regulation has yet to be clarified, but the SOX2 gene lies in an intron of a long multi-exon non-coding RNA called SOX2 overlapping transcript (SOX2OT). Here, we show that the expression of SOX2 and SOX2OT is concordant in breast cancer, differentially expressed in estrogen receptor positive and negative breast cancer samples and that both are up-regulated in suspension culture conditions that favor growth of stem cell phenotypes. Importantly, ectopic expression of SOX2OT led to an almost 20-fold increase in SOX2 expression, together with a reduced proliferation and increased breast cancer cell anchorage-independent growth. We propose that SOX2OT plays a key role in the induction and/or maintenance of SOX2 expression in breast cancer.
Affiliation(s)
- Vahid Seyfoddin
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
- Chanel E. Smart
- University of Queensland Centre for Clinical Research, Royal Brisbane & Women's Hospital Campus, Herston, Queensland, Australia
- Jingli Wang
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
- Ji Eun Kim
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
- Herah Hansji
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
- Bruce C. Baguley
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
- Graeme J. Finlay
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
- Euphemia Y. Leung
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
3
Pelechano V, Wei W, Jakob P, Steinmetz LM. Genome-wide identification of transcript start and end sites by transcript isoform sequencing. Nat Protoc 2014; 9:1740-59. [PMID: 24967623 DOI: 10.1038/nprot.2014.121] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Indexed: 11/09/2022]
Abstract
Hundreds of transcript isoforms with varying boundaries and alternative regulatory signals are transcribed from the genome, even in a genetically homogeneous population of cells. To study this transcriptional heterogeneity, we developed transcript isoform sequencing (TIF-seq), a method that allows the genome-wide profiling of full-length transcript isoforms defined by their exact 5' and 3' boundaries. TIF-seq entails the generation of full-length cDNA libraries, followed by their circularization and the sequencing of the junction fragments spanning the 5' and 3' transcript ends. By determining the respective co-occurrence of start and end sites of individual transcript molecules, TIF-seq can distinguish variations that conventional approaches for mapping single ends cannot, such as short abortive transcripts, bicistronic messages and overlapping transcripts that differ in lengths. The TIF-seq protocol we describe here can be applied to any eukaryotic organism (e.g., yeast, human), and it requires 6-10 d for generating TIF-seq libraries, 10 d for sequencing and 2-3 d for analysis.
Affiliation(s)
- Vicent Pelechano
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
- Wu Wei
- [1] European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany. [2] Stanford Genome Technology Center, Palo Alto, California, USA
- Petra Jakob
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
- Lars M Steinmetz
- [1] European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany. [2] Stanford Genome Technology Center, Palo Alto, California, USA. [3] Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
4
Clark BS, Blackshaw S. Long non-coding RNA-dependent transcriptional regulation in neuronal development and disease. Front Genet 2014; 5:164. [PMID: 24936207 PMCID: PMC4047558 DOI: 10.3389/fgene.2014.00164] [Citation(s) in RCA: 118] [Impact Index Per Article: 11.8] [Received: 04/23/2014] [Accepted: 05/18/2014] [Indexed: 01/17/2023] Open
Abstract
Comprehensive analysis of the mammalian transcriptome has revealed that long non-coding RNAs (lncRNAs) may make up a large fraction of cellular transcripts. Recent years have seen a surge of studies aimed at functionally characterizing the role of lncRNAs in development and disease. In this review, we discuss new findings implicating lncRNAs in controlling development of the central nervous system (CNS). The evolution of the higher vertebrate brain has been accompanied by an increase in the levels and complexities of lncRNAs expressed within the developing nervous system. Although a limited number of CNS-expressed lncRNAs are now known to modulate the activity of proteins important for neuronal differentiation, the function of the vast majority of neuronal-expressed lncRNAs is still unknown. Topics of intense current interest include the mechanism by which CNS-expressed lncRNAs might function in epigenetic and transcriptional regulation during neuronal development, and how gain and loss of function of individual lncRNAs contribute to neurological diseases.
Affiliation(s)
- Brian S Clark
- Solomon Snyder Department of Neuroscience, Johns Hopkins University School of Medicine Baltimore, MD, USA
- Seth Blackshaw
- Solomon Snyder Department of Neuroscience, Johns Hopkins University School of Medicine Baltimore, MD, USA; Department of Ophthalmology, Johns Hopkins University School of Medicine Baltimore, MD, USA; Department of Neurology, Johns Hopkins University School of Medicine Baltimore, MD, USA; Center for High-Throughput Biology, Johns Hopkins University School of Medicine Baltimore, MD, USA; Institute for Cell Engineering, Johns Hopkins University School of Medicine Baltimore, MD, USA
5
Juárez-Méndez S, Zentella-Dehesa A, Villegas-Ruíz V, Pérez-González OA, Salcedo M, López-Romero R, Román-Basaure E, Lazos-Ochoa M, Montes de Oca-Fuentes VE, Vázquez-Ortiz G, Moreno J. Splice variants of zinc finger protein 695 mRNA associated to ovarian cancer. J Ovarian Res 2013; 6:61. [PMID: 24007497 PMCID: PMC3847372 DOI: 10.1186/1757-2215-6-61] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 05/28/2013] [Accepted: 08/24/2013] [Indexed: 12/22/2022] Open
Abstract
Background Studies of alternative mRNA splicing (AS) in health and disease have yet to yield the complete picture of protein diversity and its role in physiology and pathology. Some forms of cancer appear to be associated to certain alternative mRNA splice variants, but their role in the cancer development and outcome is unclear. Methods We examined AS profiles by means of whole genome exon expression microarrays (Affymetrix GeneChip 1.0) in ovarian tumors and ovarian cancer-derived cell lines, compared to healthy ovarian tissue. Alternatively spliced genes expressed predominantly in ovarian tumors and cell lines were confirmed by RT-PCR. Results Among several significantly overexpressed AS genes in malignant ovarian tumors and ovarian cancer cell lines, the most significant one was that of the zinc finger protein ZNF695, with two previously unknown mRNA splice variants identified in ovarian tumors and cell lines. The identity of ZNF695 AS variants was confirmed by cloning and sequencing of the amplicons obtained from ovarian cancer tissue and cell lines. Conclusions Alternative ZNF695 mRNA splicing could be a marker of ovarian cancer with possible implications on its pathogenesis.
6
Pardo LM, Rizzu P, Francescatto M, Vitezic M, Leday GGR, Sanchez JS, Khamis A, Takahashi H, van de Berg WDJ, Medvedeva YA, van de Wiel MA, Daub CO, Carninci P, Heutink P. Regional differences in gene expression and promoter usage in aged human brains. Neurobiol Aging 2013; 34:1825-36. [PMID: 23428183 DOI: 10.1016/j.neurobiolaging.2013.01.005] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 08/27/2012] [Revised: 11/29/2012] [Accepted: 01/07/2013] [Indexed: 10/27/2022]
Abstract
To characterize the promoterome of caudate and putamen regions (striatum), frontal and temporal cortices, and hippocampi from aged human brains, we used high-throughput cap analysis of gene expression to profile the transcription start sites and to quantify the differences in gene expression across the 5 brain regions. We also analyzed the extent to which methylation influenced the observed expression profiles. We sequenced more than 71 million cap analysis of gene expression tags corresponding to 70,202 promoter regions and 16,888 genes. More than 7000 transcripts were differentially expressed, mainly because of differential alternative promoter usage. Unexpectedly, 7% of differentially expressed genes were neurodevelopmental transcription factors. Functional pathway analysis on the differentially expressed genes revealed an overrepresentation of several signaling pathways (e.g., fibroblast growth factor and wnt signaling) in hippocampus and striatum. We also found that although 73% of methylation signals mapped within genes, the influence of methylation on the expression profile was small. Our study underscores alternative promoter usage as an important mechanism for determining the regional differences in gene expression at old age.
Affiliation(s)
- Luba M Pardo
- Section Medical Genomics, Department of Clinical Genetics, VU University Medical Center, Amsterdam, The Netherlands
7
Abstract
The importance of various classes of regulatory non-protein-coding RNA molecules (ncRNAs) in the normal functioning of the CNS is becoming increasingly evident. ncRNAs are involved in neuronal cell specification and patterning during development, but also in higher cognitive processes, such as structural plasticity and memory formation in the adult brain. We discuss advances in understanding of the function of ncRNAs in the CNS, with a focus on the potential involvement of specific species, such as microRNAs, endogenous small interfering RNAs, long intergenic non-coding RNAs, and natural antisense transcripts, in various neurodegenerative disorders. This emerging field is anticipated to profoundly affect clinical research, diagnosis, and therapy in neurology.
8
Macrophages.com: an on-line community resource for innate immunity research. Immunobiology 2011; 216:1203-11. [PMID: 21924793 DOI: 10.1016/j.imbio.2011.07.025] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 06/02/2011] [Accepted: 07/18/2011] [Indexed: 01/03/2023]
Abstract
Macrophages play a major role in tissue remodelling during development, wound healing and tissue homeostasis, and are central to innate immunity and to the pathology of tissue injury and inflammation. Given this fundamental role in many aspects of biological function, an enormous wealth of information has accumulated on these fascinating cells in the literature and other public repositories. With the escalation of genome-scale data derived from macrophages and related haematopoietic cell types, there is a growing need for an integrated resource that seeks to compile, organise and analyse our collective knowledge of macrophage biology. Here we describe a community-driven web-based resource, macrophages.com that aims to provide a portal onto various types of Omics data to facilitate comparative genomic studies, promoter and transcriptional network analyses, models of macrophage pathways together with other information on these cells. To this end, the website combines public and in-house analyses of expression data with pre-analysed views of co-expressed genes as supported by the network analysis tool BioLayout Express(3D), as well as providing access to maps of pathways active in macrophages. Macrophages.com also provides access to an extensive image library of macrophages in adult/embryonic tissue sections prepared from normal and transgenic mice. In addition, the site links to the Human Protein Atlas database so as to provide direct access to protein expression patterns in human macrophages. Finally, an integrated gene-centric portal provides the tools for rapid promoter analysis studies based on a comprehensive set of CAGE-derived transcription start site (TSS) sequences in human and mouse genomes as generated by the Functional Annotation of Mammalian genomes (FANTOM) projects initiated by the RIKEN Omics Science Center. Our aim is to continue to grow the macrophages.com resource using publicly available data, as well as in-house generated knowledge. 
In so doing we aim to provide a user-friendly community website and a community portal for access to comprehensive sets of macrophage-related data.
9
Raabe CA, Hoe CH, Randau G, Brosius J, Tang TH, Rozhdestvensky TS. The rocks and shallows of deep RNA sequencing: Examples in the Vibrio cholerae RNome. RNA (NEW YORK, N.Y.) 2011; 17:1357-1366. [PMID: 21610211 PMCID: PMC3138571 DOI: 10.1261/rna.2682311] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 02/18/2011] [Accepted: 04/15/2011] [Indexed: 05/30/2023]
Abstract
New deep RNA sequencing methodologies in transcriptome analyses identified a wealth of novel nonprotein-coding RNAs (npcRNAs). Recently, deep sequencing was used to delineate the small npcRNA transcriptome of the human pathogen Vibrio cholerae and 627 novel npcRNA candidates were identified. Here, we report the detection of 223 npcRNA candidates in V. cholerae by different cDNA library construction and conventional sequencing methods. Remarkably, only 39 of the candidates were common to both surveys. We therefore examined possible biasing influences in the transcriptome analyses. Key steps, including tailing and adapter ligations for generating cDNA, contribute qualitatively and quantitatively to the discrepancies between data sets. In addition, the state of 5'-end phosphorylation influences the efficiency of adapter ligation and C-tailing at the 3'-end of the RNA. Finally, our data indicate that the inclusion of sample-specific molecular identifier sequences during ligation steps also leads to biases in cDNA representation. In summary, even deep sequencing is unlikely to identify all RNA species, and caution should be used for meta-analyses among alternatively generated data sets.
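The size of the discrepancy reported in this abstract can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that the 627 and 223 candidates are the two lists being compared and that 39 is the number shared between them. The function name `overlap_summary` is hypothetical and not taken from the paper.

```python
def overlap_summary(n_a: int, n_b: int, n_shared: int) -> dict:
    """Summarise the overlap between two candidate lists.

    n_a, n_b  -- sizes of the two lists (assumed here: 627 and 223)
    n_shared  -- number of candidates found in both lists (assumed: 39)
    """
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed the smaller list")
    union = n_a + n_b - n_shared  # candidates seen in at least one survey
    return {
        "shared_fraction_of_a": n_shared / n_a,
        "shared_fraction_of_b": n_shared / n_b,
        "jaccard": n_shared / union,
    }


summary = overlap_summary(627, 223, 39)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions, fewer than one in five candidates from the smaller survey also appears in the larger one, which fits the abstract's caution about meta-analyses across differently generated data sets.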
Affiliation(s)
- Carsten A. Raabe
- Institute of Experimental Pathology, University of Muenster, 48149 Muenster, Germany
- Chee Hock Hoe
- Infectious Diseases Cluster, Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia, 13200 Penang, Malaysia
- Gerrit Randau
- Institute of Experimental Pathology, University of Muenster, 48149 Muenster, Germany
- Juergen Brosius
- Institute of Experimental Pathology, University of Muenster, 48149 Muenster, Germany
- Thean Hock Tang
- Infectious Diseases Cluster, Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia, 13200 Penang, Malaysia
10
Substance-specific and shared transcription and epigenetic changes in the human hippocampus chronically exposed to cocaine and alcohol. Proc Natl Acad Sci U S A 2011; 108:6626-31. [PMID: 21464311 DOI: 10.1073/pnas.1018514108] [Citation(s) in RCA: 171] [Impact Index Per Article: 13.2] [Indexed: 11/18/2022] Open
Abstract
The hippocampus is a key brain region involved in both short- and long-term memory processes and may play critical roles in drug-associated learning and addiction. Using whole genome sequencing of mRNA transcripts (RNA-Seq) and immunoprecipitation-enriched genomic DNA (ChIP-Seq) coupled with histone H3 lysine 4 trimethylation (H3K4me3), we found extensive hippocampal gene expression changes common to both cocaine-addicted and alcoholic individuals that may reflect neuronal adaptations common to both addictions. However, we also observed functional changes that were related only to long-term cocaine exposure, particularly the inhibition of mitochondrial inner membrane functions related to oxidative phosphorylation and energy metabolism, which has also been observed previously in neurodegenerative diseases. Cocaine- and alcohol-related histone H3K4me3 changes highly overlapped, but greater effects were detected under cocaine exposure. There was no direct correlation, however, between either cocaine- or alcohol-related histone H3K4me3 and gene expression changes at an individual gene level, indicating that transcriptional regulation as well as drug-related gene expression changes are outcomes of a complex gene-regulatory process that includes multifaceted histone modifications.
11
Mantila Roosa SM, Liu Y, Turner CH. Alternative splicing in bone following mechanical loading. Bone 2011; 48:543-51. [PMID: 21095247 PMCID: PMC3039044 DOI: 10.1016/j.bone.2010.11.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 06/21/2010] [Revised: 11/05/2010] [Accepted: 11/08/2010] [Indexed: 12/22/2022]
Abstract
It is estimated that more than 90% of human genes express multiple mRNA transcripts due to alternative splicing. Consequently, the proteins produced by different splice variants will likely have different functions and expression levels. Several genes with splice variants are known in bone, with functions that affect osteoblast function and bone formation. The primary goal of this study was to evaluate the extent of alternative splicing in a bone subjected to mechanical loading and subsequent bone formation. We used the rat forelimb loading model, in which the right forelimb was loaded axially for 3 min, while the left forearm served as a non-loaded control. Animals were subjected to loading sessions every day, with 24 h between sessions. Ulnae were sampled at 11 time points, from 4 h to 32 days after beginning loading. RNA was isolated and mRNA abundance was measured at each time point using Affymetrix exon arrays (GeneChip® Rat Exon 1.0 ST Arrays). An ANOVA model was used to identify potential alternatively spliced genes across the time course, and five alternatively spliced genes were validated with qPCR: Akap12, Fn1, Pcolce, Sfrp4, and Tpm1. The number of alternatively spliced genes varied with time, ranging from a low of 68 at 12 h to a high of 992 at 16 d. We identified genes across the time course that encoded proteins with known functions in bone formation, including collagens, matrix proteins, and components of the Wnt/β-catenin and TGF-β signaling pathways. We also identified alternatively spliced genes encoding cytokines, ion channels, muscle-related genes, and solute carriers that do not have a known function in bone formation and represent potentially novel findings. In addition, a functional characterization was performed to categorize the global functions of the alternatively spliced genes in our data set. In conclusion, mechanical loading induces alternative splicing in bone, which may play an important role in the response of bone to mechanical loading.
Affiliation(s)
- Sara M Mantila Roosa
- Department of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA.
12
Skandalis A, Frampton M, Seger J, Richards MH. The adaptive significance of unproductive alternative splicing in primates. RNA (NEW YORK, N.Y.) 2010; 16:2014-2022. [PMID: 20719917 PMCID: PMC2941109 DOI: 10.1261/rna.2127910] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 02/13/2010] [Accepted: 07/12/2010] [Indexed: 05/29/2023]
Abstract
Alternative gene splicing is pervasive in metazoa, particularly in humans, where the majority of genes generate splice variant transcripts. Characterizing the biological significance of alternative transcripts is methodologically difficult since it is impractical to assess thousands of splice variants as to whether they actually encode proteins, whether these proteins are functional, or whether transcripts have a function independent of protein synthesis. Consequently, to elucidate the functional significance of splice variants and to investigate mechanisms underlying the fidelity of mRNA splicing, we used an indirect approach based on analyzing the evolutionary conservation of splice variants among species. Using DNA polymerase β as an indicator locus, we cloned and characterized the types and frequencies of transcripts generated in primary cell lines of five primate species. Overall, we found that in addition to the canonical DNA polymerase β transcript, there were 25 alternative transcripts generated, most containing premature terminating codons. We used a statistical method borrowed from community ecology to show that there is significant diversity and little conservation in alternative splicing patterns among species, despite high sequence similarity in the underlying genomic (exonic) sequences. However, the frequency of alternative splicing at this locus correlates well with life history parameters such as the maximal longevity of each species, indicating that the alternative splicing of unproductive splice variants may have adaptive significance, even if the specific RNA transcripts themselves have no function. These results demonstrate the validity of the phylogenetic conservation approach in elucidating the biological significance of alternative splicing.
Affiliation(s)
- Adonis Skandalis
- Department of Biological Sciences, Brock University, St. Catharines, Ontario, Canada.
13
Chodroff RA, Goodstadt L, Sirey TM, Oliver PL, Davies KE, Green ED, Molnár Z, Ponting CP. Long noncoding RNA genes: conservation of sequence and brain expression among diverse amniotes. Genome Biol 2010; 11:R72. [PMID: 20624288 PMCID: PMC2926783 DOI: 10.1186/gb-2010-11-7-r72] [Citation(s) in RCA: 194] [Impact Index Per Article: 13.9] [Received: 03/04/2010] [Revised: 05/17/2010] [Accepted: 07/12/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Long considered to be the building block of life, it is now apparent that protein is only one of many functional products generated by the eukaryotic genome. Indeed, more of the human genome is transcribed into noncoding sequence than into protein-coding sequence. Nevertheless, whilst we have developed a deep understanding of the relationships between evolutionary constraint and function for protein-coding sequence, little is known about these relationships for non-coding transcribed sequence. This dearth of information is partially attributable to a lack of established non-protein-coding RNA (ncRNA) orthologs among birds and mammals within sequence and expression databases. RESULTS Here, we performed a multi-disciplinary study of four highly conserved and brain-expressed transcripts selected from a list of mouse long intergenic noncoding RNA (lncRNA) loci that generally show pronounced evolutionary constraint within their putative promoter regions and across exon-intron boundaries. We identify some of the first lncRNA orthologs present in birds (chicken), marsupial (opossum), and eutherian mammals (mouse), and investigate whether they exhibit conservation of brain expression. In contrast to conventional protein-coding genes, the sequences, transcriptional start sites, exon structures, and lengths for these non-coding genes are all highly variable. CONCLUSIONS The biological relevance of lncRNAs would be highly questionable if they were limited to closely related phyla. Instead, their preservation across diverse amniotes, their apparent conservation in exon structure, and similarities in their pattern of brain expression during embryonic and early postnatal stages together indicate that these are functional RNA molecules, of which some have roles in vertebrate brain development.
Affiliation(s)
- Rebecca A Chodroff
- Department of Physiology, Anatomy, and Genetics, Le Gros Clark Building South Parks Road, University of Oxford, Oxford OX1 3QX, UK
14
Ferraresso S, Milan M, Pellizzari C, Vitulo N, Reinhardt R, Canario AVM, Patarnello T, Bargelloni L. Development of an oligo DNA microarray for the European sea bass and its application to expression profiling of jaw deformity. BMC Genomics 2010; 11:354. [PMID: 20525278 PMCID: PMC2889902 DOI: 10.1186/1471-2164-11-354] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 03/08/2010] [Accepted: 06/03/2010] [Indexed: 11/10/2022] Open
Abstract
Background The European sea bass (Dicentrarchus labrax) is a marine fish of great importance for fisheries and aquaculture. Functional genomics offers the possibility to discover the molecular mechanisms underlying productive traits in farmed fish, and a step towards the application of marker assisted selection methods in this species. To this end, we report here on the development of an oligo DNA microarray for D. labrax. Results A database consisting of 19,048 unique transcripts was constructed, of which 12,008 (63%) could be annotated by similarity and 4,692 received a GO functional annotation. Two non-overlapping 60mer probes were designed for each unique transcript and in-situ synthesized on glass slides using Agilent SurePrint™ technology. Probe design was positively completed for 19,035 target clusters; the oligo microarray was then applied to profile gene expression in mandibles and whole-heads of fish affected by prognathism, a skeletal malformation that strongly affects sea bass production. Statistical analysis identified 242 transcripts that are significantly down-regulated in deformed individuals compared to normal fish, with a significant enrichment in genes related to nervous system development and functioning. A set of genes spanning a wide dynamic range in gene expression level were selected for quantitative RT-PCR validation. Fold change correlation between microarray and qPCR data was always significant. Conclusions The microarray platform developed for the European sea bass has a high level of flexibility, reliability, and reproducibility. Despite the well known limitations in achieving a proper functional annotation in non-model species, sufficient information was obtained to identify biological processes that are significantly enriched among differentially expressed genes. New insights were obtained on putative mechanisms involved on mandibular prognathism, suggesting that bone/nervous system development might play a role in this phenomenon.
Affiliation(s)
- Serena Ferraresso
- Department of Public Health, Comparative Pathology, and Veterinary Hygiene, Faculty of Veterinary Medicine, University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy
15
16
ExprAlign--the identification of ESTs in non-model species by alignment of cDNA microarray expression profiles. BMC Genomics 2009; 10:560. [PMID: 19939286 PMCID: PMC2790474 DOI: 10.1186/1471-2164-10-560] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/21/2009] [Accepted: 11/26/2009] [Indexed: 12/05/2022] Open
Abstract
Background Sequence identification of ESTs from non-model species offers distinct challenges, particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST-unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from large-scale microarray data.
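The core idea of the abstract above, suggesting an identity for an unannotated EST from the annotated gene whose expression profile it tracks most closely, can be illustrated with a minimal Python sketch. This is not the ExprAlign implementation: the function name `suggest_identities`, the `min_r` cut-off of 0.9, and the simple best-partner rule (in place of the paper's network of correlation 'mountains') are all assumptions made for illustration.

```python
import numpy as np

def suggest_identities(expr, names, known, min_r=0.9):
    """Suggest identities for unannotated genes from co-expression.

    expr  : 2-D array, one row per gene, one column per microarray
    names : list of gene/EST names, one per row of expr
    known : dict mapping annotated names to their identity
    min_r : hypothetical Pearson correlation cut-off (an assumption)
    """
    r = np.corrcoef(expr)            # Pearson r between every pair of rows
    suggestions = {}
    for i, name in enumerate(names):
        if name in known:
            continue                 # already annotated, nothing to suggest
        best_j, best_r = None, min_r
        for j, other in enumerate(names):
            # only annotated genes can lend an identity
            if other in known and r[i, j] >= best_r:
                best_j, best_r = j, r[i, j]
        if best_j is not None:
            suggestions[name] = (known[names[best_j]], float(best_r))
    return suggestions
```

A profile that correlates strongly with an annotated gene receives that gene's identity together with the supporting correlation; a profile with no partner above the cut-off is left unassigned.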
17
Kunec D, Nanduri B, Burgess SC. Experimental annotation of channel catfish virus by probabilistic proteogenomic mapping. Proteomics 2009; 9:2634-47. [PMID: 19391180 DOI: 10.1002/pmic.200800397] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 01/25/2023]
Abstract
Experimental identification of expressed proteins by proteomics constitutes the most reliable approach to identify genomic location and structure of protein-coding genes and substantially complements computational genome annotation. Channel catfish herpesvirus (CCV) is a simple comparative model for understanding herpesvirus biology and the evolution of the Herpesviridae. The canonical CCV genome has 76 predicted ORF and only 12 of these have been confirmed experimentally. We describe a modification of a statistical method, which assigns significance measures, q-values, to peptide identifications based on 2-D LC ESI MS/MS, real-decoy database searches and SEQUEST XCorr and DeltaC(n) scores. We used this approach to identify CCV proteins expressed during its replication in cell culture, to determine protein composition of mature virions and, consequently, to refine the canonical CCV genome annotation. To complement trypsin, we used partial proteinase K digestion, which yielded greater proteome coverage. At FDR <5%, for peptide identifications, we identified 25/76 previously predicted ORF using trypsin and 31/76 using proteinase K. Furthermore, we identified 17 novel protein-coding regions (7 potential ATG-initiated ORF). Most of these novel ORF encode small proteins (<100 amino acids). Directed, strand-specific reverse transcription real-time PCR confirmed RNA expression from 6/7 novel ATG-initiated ORF investigated.
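For readers unfamiliar with q-values from real-decoy (target-decoy) searches, the following minimal Python sketch shows the generic calculation. It is not the modified method described in the abstract: it uses a single score per hit rather than combined XCorr and DeltaC(n) scores, ignores tied scores, and the function name `target_decoy_qvalues` is hypothetical.

```python
def target_decoy_qvalues(hits):
    """Generic target-decoy q-values (illustrative, not the paper's method).

    hits : list of (score, is_decoy) pairs; higher score = better match.
    Returns (score, q_value) for each non-decoy hit, best score first.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # estimated FDR if the score threshold were placed at this hit
        fdrs.append(min(1.0, decoys / max(targets, 1)))
    # q-value: the smallest FDR at this threshold or any more permissive one
    qvals = []
    running = float("inf")
    for fdr in reversed(fdrs):
        running = min(running, fdr)
        qvals.append(running)
    qvals.reverse()
    return [(s, q) for (s, d), q in zip(ranked, qvals) if not d]
```

Accepting all target hits with a q-value below 0.05 would correspond to the "FDR <5%" criterion quoted above, under the assumptions of this simplified estimate.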
Affiliation(s)
- Dusan Kunec
- College of Veterinary Medicine, Mississippi State, MS 39762, USA.
18
Weikard R, Goldammer T, Eberlein A, Kuehn C. Novel transcripts discovered by mining genomic DNA from defined regions of bovine chromosome 6. BMC Genomics 2009; 10:186. [PMID: 19393061 PMCID: PMC2681481 DOI: 10.1186/1471-2164-10-186] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/01/2008] [Accepted: 04/24/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Linkage analyses strongly suggest a number of QTL for production, health and conformation traits in the middle part of bovine chromosome 6 (BTA6). The identification of the molecular background underlying the genetic variation at the QTL and subsequent functional studies require a well-annotated gene sequence map of the critical QTL intervals. To complete the sequence map of the defined subchromosomal regions on BTA6 poorly covered with comparative gene information, we focused on targeted isolation of transcribed sequences from bovine bacterial artificial chromosome (BAC) clones mapped to the QTL intervals. RESULTS Using the method of exon trapping, 92 unique exon trapping sequences (ETS) were discovered in a chromosomal region of poor gene coverage. Sequence identity to the current NCBI sequence assembly for BTA6 was detected for 91% of unique ETS. Comparative sequence similarity search revealed that 11% of the isolated ETS displayed high similarity to genomic sequences located on the syntenic chromosomes of the human and mouse reference genome assemblies. Nearly a third of the ETS identified similar equivalent sequences in genomic sequence scaffolds from the alternative Celera-based sequence assembly of the human genome. Screening gene, EST, and protein databases detected 17% of ETS with identity to known transcribed sequences. Expression analysis of a subset of the ETS showed that most ETS (84%) displayed a distinctive expression pattern in a multi-tissue panel of a lactating cow verifying their existence in the bovine transcriptome. CONCLUSION The results of our study demonstrate that the exon trapping method based on region-specific BAC clones is very useful for targeted screening for novel transcripts located within a defined chromosomal region being deficiently endowed with annotated gene information. 
The majority of identified ETS represent unknown noncoding sequences in intergenic regions on BTA6 displaying a distinctive tissue-specific expression profile. However, their definite regulatory function has to be analyzed in further studies. The novel transcripts will add new sequence information to annotate a complete bovine genome sequence assembly, contribute to establishing a detailed transcription map for targeted BTA6 regions and will also be helpful to dissect the molecular and regulatory background of the QTL detected on BTA6.
Affiliation(s)
- Rosemarie Weikard
- Forschungsinstitut für die Biologie Landwirtschaftlicher Nutztiere (FBN), Dummerstorf, Germany.
19
Rose D, Jöris J, Hackermüller J, Reiche K, Li Q, Stadler PF. Duplicated RNA genes in teleost fish genomes. J Bioinform Comput Biol 2009; 6:1157-75. [PMID: 19090022 DOI: 10.1142/s0219720008003886] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/01/2007] [Revised: 06/17/2008] [Accepted: 06/18/2008] [Indexed: 12/29/2022]
Abstract
Teleost fishes share a duplication of their entire genomes. We report here on a computational survey of structured non-coding RNAs (ncRNAs) in teleost genomes, focusing on the fate of fish-specific duplicates. As in other metazoan groups, we find evidence of a large number (11,543) of structured RNAs, most of which (~86%) are clade-specific or evolve so fast that their tetrapod homologs cannot be detected. In surprising contrast to protein-coding genes, the fish-specific genome duplication did not lead to a large number of paralogous ncRNAs: only 188 candidates, mostly microRNAs, appear in a larger copy number in teleosts than in tetrapods, suggesting that large-scale gene duplications do not play a major role in the expansion of the vertebrate ncRNA inventory.
Affiliation(s)
- Dominic Rose
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany.
20
Seim I, Carter SL, Herington AC, Chopin LK. Complex organisation and structure of the ghrelin antisense strand gene GHRLOS, a candidate non-coding RNA gene. BMC Mol Biol 2008; 9:95. [PMID: 18954468 PMCID: PMC2621237 DOI: 10.1186/1471-2199-9-95] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 06/06/2008] [Accepted: 10/28/2008] [Indexed: 12/13/2022] Open
Abstract
Background The peptide hormone ghrelin has many important physiological and pathophysiological roles, including the stimulation of growth hormone (GH) release, appetite regulation, gut motility and proliferation of cancer cells. We previously identified a gene on the opposite strand of the ghrelin gene, ghrelinOS (GHRLOS), which spans the promoter and untranslated regions of the ghrelin gene (GHRL). Here we further characterise GHRLOS. Results We have described GHRLOS mRNA isoforms that extend over 1.4 kb of the promoter region and 106 nucleotides of exon 4 of the ghrelin gene, GHRL. These GHRLOS transcripts initiate 4.8 kb downstream of the terminal exon 4 of GHRL and are present in the 3' untranslated exon of the adjacent gene TATDN2 (TatD DNase domain containing 2). Interestingly, we have also identified a putative non-coding TATDN2-GHRLOS chimaeric transcript, indicating that GHRLOS RNA biogenesis is extremely complex. Moreover, we have discovered that the 3' region of GHRLOS is also antisense, in a tail-to-tail fashion to a novel terminal exon of the neighbouring SEC13 gene, which is important in protein transport. Sequence analyses revealed that GHRLOS is riddled with stop codons, and that there is little nucleotide and amino-acid sequence conservation of the GHRLOS gene between vertebrates. The gene spans 44 kb on 3p25.3, is extensively spliced and harbours multiple variable exons. We have also investigated the expression of GHRLOS and found evidence of differential tissue expression. It is highly expressed in tissues which are emerging as major sites of non-coding RNA expression (the thymus, brain, and testis), as well as in the ovary and uterus. In contrast, very low levels were found in the stomach where sense, GHRL derived RNAs are highly expressed. Conclusion GHRLOS RNA transcripts display several distinctive features of non-coding (ncRNA) genes, including 5' capping, polyadenylation, extensive splicing and short open reading frames. 
The gene is also non-conserved, with differential and tissue-restricted expression. The overlapping genomic arrangement of GHRLOS with the ghrelin gene indicates that it is likely to have interesting regulatory and functional roles in the ghrelin axis.
Affiliation(s)
- Inge Seim
- Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, Queensland, Australia.
21
Rupert JL. Genomics and Environmental Hypoxia: What (and How) We Can Learn from the Transcriptome. High Alt Med Biol 2008; 9:115-22. [DOI: 10.1089/ham.2007.1070] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/02/2023] Open
Affiliation(s)
- Jim L. Rupert
- School of Human Kinetics, University of British Columbia, Vancouver, B.C., Canada
22
Tanney A, Oliver GR, Farztdinov V, Kennedy RD, Mulligan JM, Fulton CE, Farragher SM, Field JK, Johnston PG, Harkin DP, Proutski V, Mulligan KA. Generation of a non-small cell lung cancer transcriptome microarray. BMC Med Genomics 2008; 1:20. [PMID: 18513400 PMCID: PMC2426710 DOI: 10.1186/1755-8794-1-20] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 12/19/2007] [Accepted: 05/30/2008] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND Non-small cell lung cancer (NSCLC) is the leading cause of cancer mortality worldwide. At present no reliable biomarkers are available to guide the management of this condition. Microarray technology may allow appropriate biomarkers to be identified but present platforms are lacking disease focus and are thus likely to miss potentially vital information contained in patient tissue samples. METHODS A combination of large-scale in-house sequencing, gene expression profiling and public sequence and gene expression data mining were used to characterise the transcriptome of NSCLC and the data used to generate a disease-focused microarray - the Lung Cancer DSA research tool. RESULTS Built on the Affymetrix GeneChip platform, the Lung Cancer DSA research tool allows for interrogation of ~60,000 transcripts relevant to Lung Cancer, tens of thousands of which are unavailable on leading commercial microarrays. CONCLUSION We have developed the first high-density disease specific transcriptome microarray. We present the array design process and the results of experiments carried out to demonstrate the array's utility. This approach serves as a template for the development of other disease transcriptome microarrays, including non-neoplastic diseases.
Affiliation(s)
- Austin Tanney
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Gavin R Oliver
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Vadim Farztdinov
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Richard D Kennedy
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Jude M Mulligan
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Ciaran E Fulton
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Susan M Farragher
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- John K Field
- Roy Castle Lung Cancer Research Programme, The University of Liverpool Cancer Research Centre, 200 London Road, Liverpool, L3 9TA, UK
- Patrick G Johnston
- Centre for Cancer Research and Cell Biology, Queen's University of Belfast, 97 Lisburn Road, Belfast, BT9 7BL, UK
- D Paul Harkin
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Vitali Proutski
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
- Karl A Mulligan
- Almac Diagnostics Ltd, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
23
Harbers M. The current status of cDNA cloning. Genomics 2008; 91:232-42. [PMID: 18222633 DOI: 10.1016/j.ygeno.2007.11.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 08/17/2007] [Revised: 11/10/2007] [Accepted: 11/17/2007] [Indexed: 11/19/2022]
Abstract
The cloning of cDNAs, copies of cellular RNA, is one of the classical technologies in molecular biology. Over the past 30 years cDNA cloning technologies have been improved to enable the cloning of large cDNA collections, which are fundamental to today's understanding of the utilization of genetic information. With the discovery of noncoding RNAs, additional new approaches to the cloning of short RNAs have been developed. However, with the realization that much larger portions of genomes are transcribed than anticipated from genome annotations, cDNA cloning faces new challenges to uncover rare transcripts and to make the corresponding cDNAs available for functional studies. This review provides an overview on the current status of cDNA cloning and possibilities for the discovery and characterization of new RNA families.
Affiliation(s)
- Matthias Harbers
- DNAFORM, Inc., Leading Venture Plaza 2, 75-1 Ono-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0046, Japan.
24
Gracey AY. Interpreting physiological responses to environmental change through gene expression profiling. J Exp Biol 2008; 210:1584-92. [PMID: 17449823 DOI: 10.1242/jeb.004333] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.1] [Indexed: 12/27/2022]
Abstract
Identification of differentially expressed genes in response to environmental change offers insights into the roles of the transcriptome in the regulation of physiological responses. A variety of methods are now available to implement large-scale gene expression screens, and each method has specific advantages and disadvantages. Construction of custom cDNA microarrays remains the most popular route to implement expression screens in the non-model organisms favored by comparative physiologists, and we highlight some factors that should be considered when embarking along this path. Using a carp cDNA microarray, we have undertaken a broad, system-wide gene expression screen to investigate the physiological mechanisms underlying cold and hypoxia acclimation. This dataset provides a starting point from which to explore a range of specific mechanistic hypotheses at all levels of organization, from individual biochemical pathways to the level of the whole organism. We demonstrate the utility of two data analysis methods, Gene Ontology profiling and rank-based statistical methods, to summarize the probable physiological function of acclimation-induced gene expression changes, and to prioritize specific genes as candidates for further study.
Affiliation(s)
- Andrew Y Gracey
- Marine Environmental Biology, University of Southern California, 3616 Trousdale Parkway, Los Angeles, CA 90089, USA.
25
Abstract
The promise of the genome project was that a complete sequence would provide us with information that would transform biology and medicine. But the 'parts list' that has emerged from the genome project is far from the 'wiring diagram' and 'circuit logic' we need to understand the link between genotype, environment and phenotype. While genomic technologies such as DNA microarrays, proteomics and metabolomics have given us new tools and new sources of data to address these problems, a number of crucial elements remain to be addressed before we can begin to close the loop and develop a predictive quantitative biology that is the stated goal of so much of current biological research, including systems biology. Our approach to this problem has largely been one of integration, bringing together a vast wealth of information to better interpret the experimental data we are generating in genomic assays and creating publicly available databases and software tools to facilitate the work of others. Recently, we have used a similar approach to try to understand the biological networks that underlie the phenotypic responses we observe, starting us on the road to developing a predictive biology.
Affiliation(s)
- John Quackenbush
- Department of Biostatistics and Computational Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
26
Abstract
While less than 1.5% of the mammalian genome encodes proteins, it is now evident that the vast majority is transcribed, mainly into non-protein-coding RNAs. This raises the question of what fraction of the genome is functional, i.e., composed of sequences that yield functional products, are required for the expression (regulation or processing) of these products, or are required for chromosome replication and maintenance. Many of the observed noncoding transcripts are differentially expressed, and, while most have not yet been studied, increasing numbers are being shown to be functional and/or trafficked to specific subcellular locations, as well as exhibit subtle evidence of selection. On the other hand, analyses of conservation patterns indicate that only approximately 5% (3%-8%) of the human genome is under purifying selection for functions common to mammals. However, these estimates rely on the assumption that reference sequences (usually ancient transposon-derived sequences) have evolved neutrally, which may not be the case, and if so would lead to an underestimate of the fraction of the genome under evolutionary constraint. These analyses also do not detect functional sequences that are evolving rapidly and/or have acquired lineage-specific functions. Indeed, many regulatory sequences and known functional noncoding RNAs, including many microRNAs, are not conserved over significant evolutionary distances, and recent evidence from the ENCODE project suggests that many functional elements show no detectable level of sequence constraint. Thus, it is likely that much more than 5% of the genome encodes functional information, and although the upper bound is unknown, it may be considerably higher than currently thought.
Affiliation(s)
- Michael Pheasant
- ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia