101
Nitsche A, Stadler PF. Evolutionary clues in lncRNAs. Wiley Interdiscip Rev RNA 2016; 8. [PMID: 27436689 DOI: 10.1002/wrna.1376] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 02/24/2016] [Revised: 06/06/2016] [Accepted: 06/09/2016] [Indexed: 12/13/2022]
Abstract
The diversity of long non-coding RNAs (lncRNAs) in the human transcriptome is in stark contrast to the sparse exploration of their functions concomitant with their conservation and evolution. The pervasive transcription of the largely non-coding human genome makes the evolutionary age and conservation patterns of lncRNAs a topic of interest. Yet it is a fairly unexplored field, and these properties are not as easy to determine as for protein-coding genes. Although there are a few experimentally studied cases, which are conserved at the sequence level, most lncRNAs exhibit weak or untraceable primary sequence conservation. Recent studies shed light on the interspecies conservation of secondary structures among lncRNA homologs by using diverse computational methods. This highlights the importance of structure for the functionality of lncRNAs as opposed to the poor impact of primary sequence changes. Further clues in the evolution of lncRNAs are given by selective constraints on non-coding gene structures (e.g., promoters or splice sites) as well as the conservation of prevalent spatio-temporal expression patterns. However, a rapid evolutionary turnover is observable throughout the heterogeneous group of lncRNAs. This still gives rise to questions about its functional meaning. WIREs RNA 2017, 8:e1376. doi: 10.1002/wrna.1376
Affiliation(s)
- Anne Nitsche
- Bioinformatics Group, Department of Computer Science, University Leipzig, Leipzig, Germany; Institute de Biologie Moléculaire et Cellulaire, Université de Strasbourg, Cedex, France
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, University Leipzig, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University Leipzig, Leipzig, Germany; Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany; Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology - IZI, Leipzig, Germany; Center for Non-Coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark; Department of Theoretical Chemistry, University of Vienna, Wien, Austria; Santa Fe Institute, Santa Fe, NM, USA
102
Raj A, Wang SH, Shim H, Harpak A, Li YI, Engelmann B, Stephens M, Gilad Y, Pritchard JK. Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling. eLife 2016; 5. [PMID: 27232982 PMCID: PMC4940163 DOI: 10.7554/elife.13328] [Citation(s) in RCA: 111] [Impact Index Per Article: 12.3] [Received: 12/24/2015] [Accepted: 05/26/2016] [Indexed: 01/19/2023]
Abstract
Accurate annotation of protein coding regions is essential for understanding how genetic information is translated into function. We describe riboHMM, a new method that uses ribosome footprint data to accurately infer translated sequences. Applying riboHMM to human lymphoblastoid cell lines, we identified 7273 novel coding sequences, including 2442 translated upstream open reading frames. We observed an enrichment of footprints at inferred initiation sites after drug-induced arrest of translation initiation, validating many of the novel coding sequences. The novel proteins exhibit significant selective constraint in the inferred reading frames, suggesting that many are functional. Moreover, ~40% of bicistronic transcripts showed negative correlation in the translation levels of their two coding sequences, suggesting a potential regulatory role for these novel regions. Despite known limitations of mass spectrometry to detect protein expressed at low level, we estimated a 14% validation rate. Our work significantly expands the set of known coding regions in humans. DOI:http://dx.doi.org/10.7554/eLife.13328.001
Affiliation(s)
- Anil Raj
- Department of Genetics, Stanford University, Stanford, United States
- Sidney H Wang
- Department of Human Genetics, University of Chicago, Chicago, United States
- Heejung Shim
- Department of Human Genetics, University of Chicago, Chicago, United States
- Arbel Harpak
- Department of Biology, Stanford University, Stanford, United States
- Yang I Li
- Department of Genetics, Stanford University, Stanford, United States
- Brett Engelmann
- Department of Human Genetics, University of Chicago, Chicago, United States
- Matthew Stephens
- Department of Human Genetics, University of Chicago, Chicago, United States; Department of Statistics, University of Chicago, Chicago, United States
- Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, United States
- Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, United States; Department of Biology, Stanford University, Stanford, United States; Howard Hughes Medical Institute, Stanford University, Stanford, United States
103
Quinn JJ, Zhang QC, Georgiev P, Ilik IA, Akhtar A, Chang HY. Rapid evolutionary turnover underlies conserved lncRNA-genome interactions. Genes Dev 2016; 30:191-207. [PMID: 26773003 PMCID: PMC4719309 DOI: 10.1101/gad.272187.115] [Citation(s) in RCA: 137] [Impact Index Per Article: 15.2] [Indexed: 12/20/2022]
Abstract
Many long noncoding RNAs (lncRNAs) can regulate chromatin states, but the evolutionary origin and dynamics driving lncRNA-genome interactions are unclear. We adapted an integrative strategy that identifies lncRNA orthologs in different species despite limited sequence similarity, which is applicable to mammalian and insect lncRNAs. Analysis of the roX lncRNAs, which are essential for dosage compensation of the single X chromosome in Drosophila males, revealed 47 new roX orthologs in diverse Drosophilid species across ∼40 million years of evolution. Genetic rescue by roX orthologs and engineered synthetic lncRNAs showed that altering the number of focal, repetitive RNA structures determines roX ortholog function. Genomic occupancy maps of roX RNAs in four species revealed conserved targeting of X chromosome neighborhoods but rapid turnover of individual binding sites. Many new roX-binding sites evolved from DNA encoding a pre-existing RNA splicing signal, effectively linking dosage compensation to transcribed genes. Thus, dynamic change in lncRNAs and their genomic targets underlies conserved and essential lncRNA-genome interactions.
Affiliation(s)
- Jeffrey J Quinn
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California 94305, USA; Department of Bioengineering, Stanford University School of Medicine and School of Engineering, Stanford, California 94305, USA
- Qiangfeng C Zhang
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California 94305, USA
- Plamen Georgiev
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- Ibrahim A Ilik
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- Asifa Akhtar
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- Howard Y Chang
- Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, California 94305, USA
104
Melissari MT, Grote P. Roles for long non-coding RNAs in physiology and disease. Pflugers Arch 2016; 468:945-58. [DOI: 10.1007/s00424-016-1804-y] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Received: 01/08/2016] [Accepted: 02/24/2016] [Indexed: 01/04/2023]
105
Song Y, Ci D, Tian M, Zhang D. Stable methylation of a non-coding RNA gene regulates gene expression in response to abiotic stress in Populus simonii. J Exp Bot 2016; 67:1477-92. [PMID: 26712827 DOI: 10.1093/jxb/erv543] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Indexed: 05/24/2023]
Abstract
DNA methylation plays important roles in responses to environmental stimuli. However, in perennial plants, the roles of DNA methylation in stress-specific adaptions to different abiotic stresses remain unclear. Here, we present a systematic, comparative analysis of the methylome and gene expression in poplar under cold, osmotic, heat, and salt stress conditions from 3h to 24h. Comparison of the stress responses revealed different patterns of cytosine methylation in response to the four abiotic stresses. We isolated and sequenced 1376 stress-specific differentially methylated regions (SDMRs); annotation revealed that these SDMRs represent 1123 genes encoding proteins, 16 miRNA genes, and 17 long non-coding RNA (lncRNA) genes. The SDMR162 region, consisting of Psi-MIR396e and PsiLNCRNA00268512, is regulated by epigenetic pathways and we speculate that PsiLNCRNA00268512 regulates miR396e levels by acting as a target mimic. The ratios of methylated cytosine declined to ~35.1% after 1 month of recovery from abiotic stress and to ~15.3% after 6 months. Among methylated miRNA genes, only expression of the methylation-regulated gene MIRNA6445a showed long-term stability. Our data provide a strong basis for future work and improve our understanding of the effect of epigenetic regulation of non-coding RNA expression, which will enable in-depth functional analysis.
Affiliation(s)
- Yuepeng Song
- National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China; Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China
- Dong Ci
- National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China; Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China
- Min Tian
- National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China; Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China
- Deqiang Zhang
- National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China; Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, PR China
106
Cho YB, Lee EJ, Cho S, Kim TY, Park JH, Cho BK. Functional elucidation of the non-coding RNAs of Kluyveromyces marxianus in the exponential growth phase. BMC Genomics 2016; 17:154. [PMID: 26923790 PMCID: PMC4770515 DOI: 10.1186/s12864-016-2474-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/23/2015] [Accepted: 02/15/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND Non-coding RNAs (ncRNAs), which perform diverse regulatory roles, have been found in organisms from all superkingdoms of life. However, there have been limited numbers of studies on the functions of ncRNAs, especially in nonmodel organisms such as Kluyveromyces marxianus that is widely used in the field of industrial biotechnology. RESULTS In this study, we measured changes in transcriptome at three time points during the exponential growth phase of K. marxianus by using strand-specific RNA-seq. We found that approximately 60% of the transcriptome consists of ncRNAs transcribed from antisense and intergenic regions of the genome that were transcribed at lower levels than mRNA. In the transcriptome, a substantial number of long antisense ncRNAs (lancRNAs) are differentially expressed and enriched in carbohydrate and energy metabolism pathways. Furthermore, this enrichment is evolutionarily conserved, at least in yeast. Particularly, the mode of regulation of mRNA/lancRNA pairs is associated with mRNA transcription levels; the correlation between the pairs is positive at high mRNA transcriptional levels and negative at low levels. In addition, significant induction of mRNA and coverage of more than half of the mRNA sequence by a lancRNA strengthens the positive correlation between mRNA/lancRNA pairs. CONCLUSIONS Transcriptome sequencing of K. marxianus in the exponential growth phase reveals pervasive transcription of ncRNAs with evolutionarily conserved functions. Studies of the mode of regulation of mRNA/lancRNA pairs suggest that induction of lancRNA may be associated with switch-like behavior of mRNA/lancRNA pairs and efficient regulation of the carbohydrate and energy metabolism pathways in the exponential growth phase of K. marxianus being used in industrial applications.
Affiliation(s)
- Yoo-Bok Cho
- Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Republic of Korea.
- Eun Ju Lee
- Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Republic of Korea.
- Suhyung Cho
- Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Republic of Korea.
- Tae Yong Kim
- Biomaterials Lab., Samsung Advanced Institute of Technology (SAIT), 130 Samsung-ro, Yeongtong-gu, Suwon, 443-803, Republic of Korea.
- Jin Hwan Park
- Biomaterials Lab., Samsung Advanced Institute of Technology (SAIT), 130 Samsung-ro, Yeongtong-gu, Suwon, 443-803, Republic of Korea.
- Byung-Kwan Cho
- Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Republic of Korea; Intelligent Synthetic Biology Center, Daejeon, 305-701, Republic of Korea.
107
Yu H, Zhao X, Li Q. Genome-wide identification and characterization of long intergenic noncoding RNAs and their potential association with larval development in the Pacific oyster. Sci Rep 2016; 6:20796. [PMID: 26861843 PMCID: PMC4748301 DOI: 10.1038/srep20796] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 09/01/2015] [Accepted: 01/12/2016] [Indexed: 11/10/2022]
Abstract
An increasing amount of evidence suggests that long intergenic noncoding RNAs (lincRNAs) may play diverse roles in many cellular processes. However, little is known about lincRNAs in marine invertebrates. Here, we presented the first identification and characterization of lincRNAs in the Pacific oyster (Crassostrea gigas). We developed a pipeline and identified 11,668 lincRNAs in C. gigas based on RNA-Seq resources available. These lincRNAs exhibited many common characteristics with vertebrate lincRNAs: relatively short length, low exon numbers, low expression, and low sequence conservation. 1,175 lincRNAs were expressed in a tissue-specific manner, with 35.2% preferentially expressed in male gonad. 776 lincRNAs were specifically expressed in juvenile during different developmental stages. In addition, 47 lincRNAs were found to be potentially related to oyster settlement and metamorphosis. Such diverse temporal and spatial patterns of expression suggest that these lincRNAs might function in cell differentiation during early development, as well as sex differentiation and reproduction. Based on a co-expression network analysis, five lincRNAs were detected that have an expression correlation with key hub genes in four modules significantly correlated with larval development. Our study provides the first large-scale identification of lincRNAs in molluscs and offers new insights into potential functions of lincRNAs in marine invertebrates.
Affiliation(s)
- Hong Yu
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao 266003, China
- Xuelin Zhao
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao 266003, China
- Qi Li
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao 266003, China
108
Abdollahi-Arpanahi R, Morota G, Valente BD, Kranis A, Rosa GJM, Gianola D. Differential contribution of genomic regions to marked genetic variation and prediction of quantitative traits in broiler chickens. Genet Sel Evol 2016; 48:10. [PMID: 26842494 PMCID: PMC4739338 DOI: 10.1186/s12711-016-0187-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 03/16/2015] [Accepted: 01/15/2016] [Indexed: 11/15/2022]
Abstract
Background Genome-wide association studies in humans have found enrichment of trait-associated single nucleotide polymorphisms (SNPs) in coding regions of the genome and depletion of these in intergenic regions. However, a recent release of the ENCyclopedia of DNA elements showed that ~80 % of the human genome has a biochemical function. Similar studies on the chicken genome are lacking, thus assessing the relative contribution of its genic and non-genic regions to variation is relevant for biological studies and genetic improvement of chicken populations. Methods A dataset including 1351 birds that were genotyped with the 600K Affymetrix platform was used. We partitioned SNPs according to genome annotation data into six classes to characterize the relative contribution of genic and non-genic regions to genetic variation as well as their predictive power using all available quality-filtered SNPs. Target traits were body weight, ultrasound measurement of breast muscle and hen house egg production in broiler chickens. Six genomic regions were considered: intergenic regions, introns, missense, synonymous, 5′ and 3′ untranslated regions, and regions that are located 5 kb upstream and downstream of coding genes. Genomic relationship matrices were constructed for each genomic region and fitted in the models, separately or simultaneously. Kernel-based ridge regression was used to estimate variance components and assess predictive ability. Contribution of each class of genomic regions to dominance variance was also considered. Results Variance component estimates indicated that all genomic regions contributed to marked additive genetic variation and that the class of synonymous regions tended to have the greatest contribution. The marked dominance genetic variation explained by each class of genomic regions was similar and negligible (~0.05). In terms of prediction mean-square error, the whole-genome approach showed the best predictive ability. 
Conclusions All genic and non-genic regions contributed to phenotypic variation for the three traits studied. Overall, the contribution of additive genetic variance to the total genetic variance was much greater than that of dominance variance. Our results show that all genomic regions are important for the prediction of the targeted traits, and the whole-genome approach was reaffirmed as the best tool for genome-enabled prediction of quantitative traits.
Affiliation(s)
- Rostam Abdollahi-Arpanahi
- Department of Animal Sciences, University of Wisconsin, Madison, WI, USA; Department of Animal and Poultry Science, College of Aburaihan, University of Tehran, Pakdasht, Iran.
- Gota Morota
- Department of Animal Science, University of Nebraska, Lincoln, NE, USA.
- Bruno D Valente
- Department of Animal Sciences, University of Wisconsin, Madison, WI, USA; Department of Dairy Science, University of Wisconsin, Madison, WI, USA.
- Andreas Kranis
- Aviagen Ltd, Midlothian, UK; The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, UK.
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA.
- Daniel Gianola
- Department of Animal Sciences, University of Wisconsin, Madison, WI, USA; Department of Dairy Science, University of Wisconsin, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA.
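Note on entry 108: the abstract of Abdollahi-Arpanahi et al. mentions genomic relationship matrices built separately for each class of genomic region. The sketch below shows one way such per-class matrices could be computed. It assumes the widely used construction with centred allele counts scaled by 2Σp(1−p), which the abstract itself does not specify; the genotypes, class labels and function names are hypothetical and chosen only for illustration.

```python
def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix from allele counts (0, 1 or 2).

    genotypes: list of individuals, each a list with one allele count per SNP.
    Assumption: centred counts scaled by 2 * sum(p * (1 - p)); the abstract
    does not say which construction the authors used.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic; matrix is undefined")
    # Centre each allele count by its expectation 2p
    centred = [[ind[j] - 2.0 * freqs[j] for j in range(m)] for ind in genotypes]
    return [
        [sum(centred[a][j] * centred[b][j] for j in range(m)) / scale
         for b in range(n)]
        for a in range(n)
    ]


def grm_by_class(genotypes, snp_classes):
    """Return one relationship matrix per annotation class (hypothetical helper)."""
    matrices = {}
    for label in sorted(set(snp_classes)):
        cols = [j for j, c in enumerate(snp_classes) if c == label]
        subset = [[ind[j] for j in cols] for ind in genotypes]
        matrices[label] = genomic_relationship_matrix(subset)
    return matrices


# Hypothetical toy data: 3 birds, 4 SNPs, 2 annotation classes
genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1]]
snp_classes = ["intergenic", "intron", "intron", "intergenic"]
grms = grm_by_class(genotypes, snp_classes)
```

Each matrix in `grms` could then enter a prediction model on its own or together with the others, which mirrors the "separately or simultaneously" wording of the abstract. The model fitting itself is not shown.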
109
Yotsukura S, duVerle D, Hancock T, Natsume-Kitatani Y, Mamitsuka H. Computational recognition for long non-coding RNA (lncRNA): Software and databases. Brief Bioinform 2016; 18:9-27. [DOI: 10.1093/bib/bbv114] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 10/01/2015] [Revised: 12/10/2015] [Indexed: 01/22/2023]
110
Marí-Alexandre J, Sánchez-Izquierdo D, Gilabert-Estellés J, Barceló-Molina M, Braza-Boïls A, Sandoval J. miRNAs Regulation and Its Role as Biomarkers in Endometriosis. Int J Mol Sci 2016; 17:ijms17010093. [PMID: 26771608 PMCID: PMC4730335 DOI: 10.3390/ijms17010093] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Received: 12/21/2015] [Revised: 01/05/2016] [Accepted: 01/08/2016] [Indexed: 02/07/2023]
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs (18-22 nt) that function as modulators of gene expression. Since their discovery in 1993 in C. elegans, our knowledge about their biogenesis, function, and mechanism of action has increased enormously, especially in recent years, with the development of deep-sequencing technologies. New biogenesis pathways and sources of miRNAs are changing our concept of these molecules. The study of the miRNA contribution to pathological states is a field of great research interest. Different groups have reported the implication of miRNAs in pathologies such as cancer, diabetes, cardiovascular, and gynecological diseases. It is also well known that miRNAs are present in biofluids (plasma, serum, urine, semen, and menstrual blood), and they have been proposed as ideal candidates for disease biomarkers. The goal of this review is to highlight the current knowledge in the field of miRNAs, with a special emphasis on their role in endometriosis and the newest investigations addressing the use of miRNAs as biomarkers for this gynecological disease.
Affiliation(s)
- Josep Marí-Alexandre
- Unit of Hemostasia, Thrombosis, Atherosclerosis and Vascular Biology, Health Research Institute La Fe, Valencia 46026, Spain.
- Moisés Barceló-Molina
- Unit of Hemostasia, Thrombosis, Atherosclerosis and Vascular Biology, Health Research Institute La Fe, Valencia 46026, Spain.
- Aitana Braza-Boïls
- Unit of Hemostasia, Thrombosis, Atherosclerosis and Vascular Biology, Health Research Institute La Fe, Valencia 46026, Spain.
- Juan Sandoval
- Epigomics Unit, Health Research Institute La Fe, Valencia 46026, Spain.
111
McGettigan PA, Browne JA, Carrington SD, Crowe MA, Fair T, Forde N, Loftus BJ, Lohan A, Lonergan P, Pluta K, Mamo S, Murphy A, Roche J, Walsh SW, Creevey CJ, Earley B, Keady S, Kenny DA, Matthews D, McCabe M, Morris D, O'Loughlin A, Waters S, Diskin MG, Evans ACO. Fertility and genomics: comparison of gene expression in contrasting reproductive tissues of female cattle. Reprod Fertil Dev 2016; 28:11-24. [DOI: 10.1071/rd15354] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
Abstract
To compare gene expression among bovine tissues, large bovine RNA-seq datasets were used, comprising 280 samples from 10 different bovine tissues (uterine endometrium, granulosa cells, theca cells, cervix, embryos, leucocytes, liver, hypothalamus, pituitary, muscle) and generating 260 Gbases of data. Twin approaches were used: an information–theoretic analysis of the existing annotated transcriptome to identify the most tissue-specific genes and a de-novo transcriptome annotation to evaluate general features of the transcription landscape. Expression was detected for 97% of the Ensembl transcriptome with at least one read in one sample and between 28% and 66% at a level of 10 tags per million (TPM) or greater in individual tissues. Over 95% of genes exhibited some level of tissue-specific gene expression. This was mostly due to different levels of expression in different tissues rather than exclusive expression in a single tissue. Less than 1% of annotated genes exhibited a highly restricted tissue-specific expression profile and approximately 2% exhibited classic housekeeping profiles. In conclusion, it is the combined effects of the variable expression of large numbers of genes (73%–93% of the genome) and the specific expression of a small number of genes (<1% of the transcriptome) that contribute to determining the outcome of the function of individual tissues.
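The abstract above refers to an information-theoretic analysis for ranking genes by tissue specificity but does not name the measure. As a minimal sketch, assuming Shannon entropy over a gene's relative expression across tissues (one common choice, not necessarily the authors'), a gene expressed evenly everywhere scores the maximum of log2(number of tissues), while a gene confined to one tissue scores zero. The function name and expression values are hypothetical.

```python
import math


def expression_entropy(expr):
    """Shannon entropy (in bits) of one gene's expression across tissues.

    expr: non-negative expression values, one per tissue.
    Low entropy means expression is concentrated in few tissues.
    Assumption: the measure used in the cited study is not stated in the
    abstract; entropy serves here only as an illustration.
    """
    total = sum(expr)
    if total <= 0:
        raise ValueError("gene has no detectable expression")
    probs = [x / total for x in expr if x > 0]
    return sum(-p * math.log2(p) for p in probs)


# Hypothetical expression profiles across four tissues
housekeeping_like = [50.0, 50.0, 50.0, 50.0]   # even expression
tissue_restricted = [200.0, 0.0, 0.0, 0.0]     # one tissue only
graded = [120.0, 40.0, 20.0, 20.0]             # expressed everywhere, unevenly

h_even = expression_entropy(housekeeping_like)
h_single = expression_entropy(tissue_restricted)
h_graded = expression_entropy(graded)
```

The `graded` profile falls between the two extremes. That corresponds to the abstract's observation that tissue specificity mostly reflects different expression levels across tissues rather than exclusive expression in a single one.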
112
Tripathi KP, Evangelista D, Zuccaro A, Guarracino MR. Transcriptator: An Automated Computational Pipeline to Annotate Assembled Reads and Identify Non Coding RNA. PLoS One 2015; 10:e0140268. [PMID: 26581084 PMCID: PMC4651556 DOI: 10.1371/journal.pone.0140268] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 06/09/2015] [Accepted: 09/22/2015] [Indexed: 12/20/2022]
Abstract
RNA-seq is a new tool to measure RNA transcript counts, using high-throughput sequencing at an extraordinary accuracy. It provides quantitative means to explore the transcriptome of an organism of interest. However, interpreting this extremely large data into biological knowledge is a problem, and biologist-friendly tools are lacking. In our lab, we developed Transcriptator, a web application based on a computational Python pipeline with a user-friendly Java interface. This pipeline uses the web services available for BLAST (Basic Local Alignment Search Tool), QuickGO and DAVID (Database for Annotation, Visualization and Integrated Discovery) tools. It offers a report on statistical analysis of functional and Gene Ontology (GO) annotation enrichment. It helps users to identify enriched biological themes, particularly GO terms, pathways, domains, gene/protein features and protein-protein interaction information. It clusters the transcripts based on functional annotations and generates a tabular report of functional and gene ontology annotations for each transcript submitted to the web server. The implementation of QuickGO web services in our pipeline enables users to carry out GO-Slim analysis, whereas the integration of PORTRAIT (Prediction of transcriptomic non-coding RNA (ncRNA) by ab initio methods) helps to identify the non-coding RNAs and their regulatory role in the transcriptome. In summary, Transcriptator is a useful software tool for both NGS and array data. It helps users to characterize the de novo assembled reads obtained from NGS experiments for non-referenced organisms, and it also performs functional enrichment analysis of differentially expressed transcripts/genes for both RNA-seq and microarray experiments. It generates easy-to-read tables and interactive charts for better understanding of the data. The pipeline is modular in nature and provides an opportunity to add new plugins in the future.
Web application is freely available at: http://www-labgtp.na.icar.cnr.it/Transcriptator.
Affiliation(s)
- Kumar Parijat Tripathi
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
- Daniela Evangelista
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
- Antonio Zuccaro
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
- Mario Rosario Guarracino
- Laboratory for Genomics, Transcriptomics and Proteomics (LAB-GTP), High Performance Computing and Networking Institute (ICAR), National Research Council of Italy (CNR), Via Pietro Castellino, 111, Napoli, Italy
113
Koufariotis LT, Chen YPP, Chamberlain A, Vander Jagt C, Hayes BJ. A catalogue of novel bovine long noncoding RNA across 18 tissues. PLoS One 2015; 10:e0141225. [PMID: 26496443 PMCID: PMC4619662 DOI: 10.1371/journal.pone.0141225] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 06/08/2015] [Accepted: 10/05/2015] [Indexed: 11/19/2022]
Abstract
Long non-coding RNAs (lncRNAs) have been implicated in diverse biological roles including gene regulation and genomic imprinting. Identifying lncRNAs in cattle across many different tissues would contribute to the current repertoire of bovine lncRNAs and help further improve our understanding of the evolutionary importance and constraints of these transcripts. Additionally, it could aid in identifying sites in the genome outside of protein-coding genes where mutations could contribute to variation in complex traits. This is particularly important in cattle, as genomic predictions are increasingly used in genetic improvement for milk and meat production. Our aim was to identify and annotate novel long non-coding RNA transcripts in the bovine genome captured from RNA sequencing (RNA-Seq) data across 18 tissues, sampled in triplicate from a single cow. To address the main challenge in identifying lncRNAs, namely distinguishing lncRNA transcripts from unannotated genes and protein-coding genes, a lncRNA identification pipeline with a number of filtering steps was developed. A total of 9,778 transcripts passed the filtering pipeline. The bovine lncRNA catalogue includes MALAT1 and HOTAIR, both of which have been well described in human and mouse genomes. We attempted to validate the lncRNAs in libraries from three additional cows. 726 (87.47%) liver and 1,668 (55.27%) blood class 3 lncRNAs were validated with stranded liver and blood libraries, respectively. Additionally, this study identified a large number of novel unknown transcripts in the bovine genome with high protein-coding potential, illustrating a clear need for better annotations of protein-coding genes.
Affiliation(s)
- Lambros T. Koufariotis
- College of Science, Health and Engineering, La Trobe University Bundoora, Melbourne, Victoria, Australia
- Department of Environment and Primary Industries, AgriBio Bundoora, Melbourne, Victoria, Australia
- Dairy Futures Co-operative Research Centre, Melbourne, Victoria, Australia
- Yi-Ping Phoebe Chen
- College of Science, Health and Engineering, La Trobe University Bundoora, Melbourne, Victoria, Australia
- Amanda Chamberlain
- Department of Environment and Primary Industries, AgriBio Bundoora, Melbourne, Victoria, Australia
- Dairy Futures Co-operative Research Centre, Melbourne, Victoria, Australia
- Christy Vander Jagt
- Department of Environment and Primary Industries, AgriBio Bundoora, Melbourne, Victoria, Australia
- Dairy Futures Co-operative Research Centre, Melbourne, Victoria, Australia
- Ben J. Hayes
- College of Science, Health and Engineering, La Trobe University Bundoora, Melbourne, Victoria, Australia
- Department of Environment and Primary Industries, AgriBio Bundoora, Melbourne, Victoria, Australia
- Dairy Futures Co-operative Research Centre, Melbourne, Victoria, Australia
|
114
|
Holl HM, Gao S, Fei Z, Andrews C, Brooks SA. Generation of a de novo transcriptome from equine lamellar tissue. BMC Genomics 2015; 16:739. [PMID: 26432030 PMCID: PMC4592545 DOI: 10.1186/s12864-015-1948-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 01/30/2015] [Accepted: 09/22/2015] [Indexed: 12/13/2022]
Abstract
BACKGROUND Laminitis, the structural failure of interdigitated tissue that suspends the distal skeleton within the hoof capsule, is a devastating disease that is the second leading cause of both lameness and euthanasia in the horse. Current transcriptomic research focuses on the expression of known genes. However, as this tissue is quite unique and equine gene annotation is largely derived from computational predictions, there are likely yet uncharacterized transcripts that may be involved in the etiology of laminitis. In order to create a novel annotation resource, we performed whole transcriptome sequencing of sagittal lamellar sections from one control and two laminitis affected horses. RESULTS Whole transcriptome sequencing of the three samples resulted in 113 million reads. Overall, 88 % of the reads mapped to the equCab2 reference genome, allowing for the identification of 119,430 SNPs. The de novo assembly generated around 75,000 transcripts, of which 36,000 corresponded to known annotations. Annotated transcript models are hosted in a public data repository and thus can be easily accessed or loaded into genome browsers. RT-PCR of 12 selected assemblies confirmed structure and expression in lamellar tissue. CONCLUSIONS Transcriptome sequencing represents a powerful tool to expand on equine annotation and identify novel targets for further laminitis research.
Affiliation(s)
- Heather M Holl
- Department of Animal Sciences, University of Florida, Gainesville, FL, 32611, USA.
- Shan Gao
- Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, NY, 14853, USA.
- Zhangjun Fei
- Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, NY, 14853, USA.
- Caroline Andrews
- Laboratory of Molecular Immunoregulation, National Cancer Institute, Bethesda, MD, 20892, USA.
- Samantha A Brooks
- Department of Animal Sciences, University of Florida, Gainesville, FL, 32611, USA.
|
115
|
Szczepińska T, Kalisiak K, Tomecki R, Labno A, Borowski LS, Kulinski TM, Adamska D, Kosinska J, Dziembowski A. DIS3 shapes the RNA polymerase II transcriptome in humans by degrading a variety of unwanted transcripts. Genome Res 2015; 25:1622-33. [PMID: 26294688 PMCID: PMC4617959 DOI: 10.1101/gr.189597.115] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 01/16/2015] [Accepted: 07/16/2015] [Indexed: 01/13/2023]
Abstract
Human DIS3, the nuclear catalytic subunit of the exosome complex, contains exonucleolytic and endonucleolytic active domains. To identify DIS3 targets genome-wide, we combined comprehensive transcriptomic analyses of engineered HEK293 cells that expressed mutant DIS3, with Photoactivatable Ribonucleoside-Enhanced Cross-Linking and Immunoprecipitation (PAR-CLIP) experiments. In cells expressing DIS3 with both catalytic sites mutated, RNAs originating from unannotated genomic regions increased ∼2.5-fold, covering ∼70% of the genome and allowing for thousands of novel transcripts to be discovered. Previously described pervasive transcription products, such as Promoter Upstream Transcripts (PROMPTs), accumulated robustly upon DIS3 dysfunction, representing a significant fraction of PAR-CLIP reads. We have also detected relatively long putative premature RNA polymerase II termination products of protein-coding genes whose levels in DIS3 mutant cells can exceed the mature mRNAs, indicating that production of such truncated RNA is a common phenomenon. In addition, we found DIS3 to be involved in controlling the formation of paraspeckles, nuclear bodies that are organized around NEAT1 lncRNA, whose short form was overexpressed in cells with mutated DIS3. Moreover, the DIS3 mutations resulted in misregulation of expression of ∼50% of transcribed protein-coding genes, probably as a secondary effect of accumulation of various noncoding RNA species. Finally, cells expressing mutant DIS3 accumulated snoRNA precursors, which correlated with a strong PAR-CLIP signal, indicating that DIS3 is the main snoRNA-processing enzyme. EXOSC10 (RRP6) instead controls the levels of the mature snoRNAs. Overall, we show that DIS3 has a major nucleoplasmic function in shaping the human RNA polymerase II transcriptome.
Affiliation(s)
- Teresa Szczepińska
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Katarzyna Kalisiak
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Rafal Tomecki
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Anna Labno
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Lukasz S Borowski
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Tomasz M Kulinski
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Dorota Adamska
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Joanna Kosinska
- Department of Medical Genetics, Center for Biostructure Research, Medical University of Warsaw, 02-106 Warsaw, Poland
- Andrzej Dziembowski
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
|
116
|
Gloss BS, Dinger ME. The specificity of long noncoding RNA expression. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2015; 1859:16-22. [PMID: 26297315 DOI: 10.1016/j.bbagrm.2015.08.005] [Citation(s) in RCA: 159] [Impact Index Per Article: 15.9] [Received: 05/15/2015] [Revised: 08/12/2015] [Accepted: 08/12/2015] [Indexed: 01/09/2023]
Abstract
Over the last decade, long noncoding RNAs (lncRNAs) have emerged as a fundamental molecular class whose members play pivotal roles in the regulation of the genome. The observation of pervasive transcription of mammalian genomes in the early 2000s sparked a revolution in the understanding of information flow in eukaryotic cells and the incredible flexibility and dynamic nature of the transcriptome. As a molecular class, distinct loci yielding lncRNAs are set to outnumber those yielding mRNAs. However, like many important discoveries, the road leading to uncovering this diverse class of molecules that act through a remarkable repertoire of mechanisms, was not a straight one. The same characteristic that most distinguishes lncRNAs from mRNAs, i.e. their developmental-stage, tissue-, and cell-specific expression, was one of the major impediments to their discovery and recognition as potentially functional regulatory molecules. With growing numbers of lncRNAs being assigned to biological functions, the specificity of lncRNA expression is now increasingly recognized as a characteristic that imbues lncRNAs with great potential as biomarkers and for the development of highly targeted therapeutics. Here we review the history of lncRNA research and how technological advances and insight into biological complexity have gone hand-in-hand in shaping this revolution. We anticipate that as increasing numbers of these molecules, often described as the dark matter of the genome, are characterized and the structure-function relationship of lncRNAs becomes better understood, it may ultimately be feasible to decipher what these non-(protein)-coding genes encode. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
Affiliation(s)
- Brian S Gloss
- Division of Genomics and Epigenetics, Garvan Institute of Medical Research, Sydney, Australia; St Vincent's Clinical School, Faculty of Medicine, UNSW Australia
- Marcel E Dinger
- Division of Genomics and Epigenetics, Garvan Institute of Medical Research, Sydney, Australia; St Vincent's Clinical School, Faculty of Medicine, UNSW Australia.
|
117
|
Tabas-Madrid D, Alves-Cruzeiro J, Segura V, Guruceaga E, Vialas V, Prieto G, García C, Corrales FJ, Albar JP, Pascual-Montano A. Proteogenomics Dashboard for the Human Proteome Project. J Proteome Res 2015; 14:3738-49. [PMID: 26144527 DOI: 10.1021/acs.jproteome.5b00466] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Affiliation(s)
- Daniel Tabas-Madrid
- ProteoRed-ISCIII, National Center for Biotechnology-CSIC (CNB), C/Darwin 3, Madrid 28049, Spain
- Joao Alves-Cruzeiro
- ProteoRed-ISCIII, National Center for Biotechnology-CSIC (CNB), C/Darwin 3, Madrid 28049, Spain
- Victor Segura
- ProteoRed-ISCIII, Center for Applied Medical Research (CIMA), University of Navarra, Avda. Pío XII, 55, Pamplona E-31008, Spain
- Elizabeth Guruceaga
- ProteoRed-ISCIII, Center for Applied Medical Research (CIMA), University of Navarra, Avda. Pío XII, 55, Pamplona E-31008, Spain
- Vital Vialas
- ProteoRed-ISCIII, National Center for Biotechnology-CSIC (CNB), C/Darwin 3, Madrid 28049, Spain
- Gorka Prieto
- Department of Communication Engineering E.T.S. Ingeniería de Bilbao, University of the Basque Country (UPV/EHU), Alda. Urquijo, s/n, Bilbao 48013, Spain
- Carlos García
- Computer Science Faculty, Complutense University of Madrid (UCM), C/ Jose García Santesmases 9, Madrid 28040, Spain
- Fernando J. Corrales
- ProteoRed-ISCIII, Center for Applied Medical Research (CIMA), University of Navarra, Avda. Pío XII, 55, Pamplona E-31008, Spain
- Juan Pablo Albar
- ProteoRed-ISCIII, National Center for Biotechnology-CSIC (CNB), C/Darwin 3, Madrid 28049, Spain
- Alberto Pascual-Montano
- ProteoRed-ISCIII, National Center for Biotechnology-CSIC (CNB), C/Darwin 3, Madrid 28049, Spain
|
118
|
Nakagawa S. Lessons from reverse-genetic studies of lncRNAs. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2015; 1859:177-83. [PMID: 26117798 DOI: 10.1016/j.bbagrm.2015.06.011] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 05/07/2015] [Revised: 06/16/2015] [Accepted: 06/18/2015] [Indexed: 11/18/2022]
Abstract
The functions of long noncoding RNAs (lncRNAs) have mainly been studied using cultured cell lines, and this approach has revealed the involvement of lncRNAs in a variety of biological processes, including the epigenetic control of gene expression, post-transcriptional regulation of mRNA, and cellular proliferation and differentiation. Recently, increasing numbers of studies have investigated the functions of lncRNAs using gene-targeted model mice, largely confirming the physiological importance of lncRNA-mediated regulation in individual animals. In some cases, however, the results obtained by studies using knockout mice have been somewhat inconsistent with those of the preceding cell-based analyses. In this review, I will summarize the lessons that we are learning from the reverse-genetic studies of lncRNAs, namely the importance of noncoding DNA elements, the weak correlation between expression level and phenotypic prominence, the existence of tissue- and condition-specific phenotypes and incomplete penetrance, and the function of lncRNAs as precursor molecules. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
Affiliation(s)
- Shinichi Nakagawa
- RNA Biology Laboratory, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
|
119
|
Raabe CA, Brosius J. Does every transcript originate from a gene? Ann N Y Acad Sci 2015; 1341:136-48. [PMID: 25847549 DOI: 10.1111/nyas.12741] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 12/15/2014] [Revised: 02/05/2015] [Accepted: 02/11/2015] [Indexed: 12/20/2022]
Abstract
Outdated gene definitions favored regions corresponding to mature messenger RNAs, in particular, the open reading frame. In eukaryotes, the intergenic space was widely regarded nonfunctional and devoid of RNA transcription. Original concepts were based on the assumption that RNA expression was restricted to known protein-coding genes and a few so-called structural RNA genes, such as ribosomal RNAs or transfer RNAs. With the discovery of introns and, more recently, sensitive techniques for monitoring genome-wide transcription, this view had to be substantially modified. Tiling microarrays and RNA deep sequencing revealed myriads of transcripts, which cover almost entire genomes. The tremendous complexity of non-protein-coding RNA transcription has to be integrated into novel gene definitions. Despite an ever-growing list of functional RNAs, questions concerning the mass of identified transcripts are under dispute. Here, we examined genome-wide transcription from various angles, including evolutionary considerations, and suggest, in analogy to novel alternative splice variants that do not persist, that the vast majority of transcripts represent raw material for potential, albeit rare, exaptation events.
Affiliation(s)
- Carsten A Raabe
- Institute of Experimental Pathology, ZMBE, University of Münster, Münster, Germany
|
120
|
Chen HI, Liu Y, Zou Y, Lai Z, Sarkar D, Huang Y, Chen Y. Differential expression analysis of RNA sequencing data by incorporating non-exonic mapped reads. BMC Genomics 2015; 16 Suppl 7:S14. [PMID: 26099631 PMCID: PMC4474535 DOI: 10.1186/1471-2164-16-s7-s14] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
Abstract
Background RNA sequencing (RNA-seq) is a powerful tool for genome-wide expression profiling of biological samples with the advantage of high-throughput and high resolution. There are many existing algorithms nowadays for quantifying expression levels and detecting differential gene expression, but none of them takes the misaligned reads that are mapped to non-exonic regions into account. We developed a novel algorithm, XBSeq, where a statistical model was established based on the assumption that observed signals are the convolution of true expression signals and sequencing noises. The mapped reads in non-exonic regions are considered as sequencing noises, which follow a Poisson distribution. Given measurable observed and noise signals from RNA-seq data, true expression signals, assumed to be governed by the negative binomial distribution, can be delineated, and thus differentially expressed genes can be detected accurately. Results We implemented our novel XBSeq algorithm and evaluated it by using a set of simulated expression datasets under different conditions, using a combination of negative binomial and Poisson distributions with parameters derived from real RNA-seq data. We compared the performance of our method with other commonly used differential expression analysis algorithms. We also evaluated the changes in true and false positive rates with variations in biological replicates, differential fold changes, and expression levels in non-exonic regions. We also tested the algorithm on a set of real RNA-seq data where the common and different detection results from different algorithms were reported. Conclusions In this paper, we proposed a novel XBSeq, a differential expression analysis algorithm for RNA-seq data that takes non-exonic mapped reads into consideration. When background noise is at baseline level, the performance of XBSeq and DESeq are mostly equivalent. However, our method surpasses DESeq and other algorithms with the increase of non-exonic mapped reads. Only in very low read count condition XBSeq had a slightly higher false discovery rate, which may be improved by adjusting the background noise effect in this situation. Taken together, by considering non-exonic mapped reads, XBSeq can provide accurate expression measurement and thus detect differentially expressed genes even in noisy conditions.
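The modelling assumption stated in this abstract (observed count = negative binomial true signal + Poisson background noise) can be illustrated with a small simulation. The following is a minimal sketch and not the XBSeq implementation: the function names, the parameter values, and the simple moment-subtraction estimator are all assumptions made for illustration. It relies only on the fact that, for independent signal and noise, means and variances add, so subtracting the moments of a separately measured background recovers the moments of the true signal.

```python
import math
import random
from statistics import fmean, pvariance


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's method (adequate for lam up to a few hundred)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_negative_binomial(rng, mean, dispersion):
    """Gamma-Poisson mixture with mean `mean` and variance mean + dispersion * mean**2."""
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return sample_poisson(rng, rate)


def simulate(n_samples, true_mean, dispersion, noise_mean, seed=0):
    """Return (observed, background) counts under the assumed signal-plus-noise model."""
    rng = random.Random(seed)
    observed, background = [], []
    for _ in range(n_samples):
        signal = sample_negative_binomial(rng, true_mean, dispersion)
        # Exonic count: true signal plus Poisson noise.
        observed.append(signal + sample_poisson(rng, noise_mean))
        # Independent non-exonic measurement of the same noise process.
        background.append(sample_poisson(rng, noise_mean))
    return observed, background


def estimate_true_moments(observed, background):
    """Moment subtraction; valid only if signal and noise are independent."""
    mean_true = fmean(observed) - fmean(background)
    var_true = pvariance(observed) - pvariance(background)
    return mean_true, var_true


if __name__ == "__main__":
    obs, bg = simulate(5000, true_mean=50.0, dispersion=0.2, noise_mean=5.0, seed=1)
    mean_true, var_true = estimate_true_moments(obs, bg)
    print(f"estimated true mean {mean_true:.1f}, variance {var_true:.1f}")
```

With the hypothetical parameters used here, the theoretical true-signal mean is 50 and the variance is 50 + 0.2 × 50² = 550, so the estimates should land close to those values; how the published algorithm actually estimates and tests these quantities is not specified in the abstract.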
|
121
|
Sibley CR, Emmett W, Blazquez L, Faro A, Haberman N, Briese M, Trabzuni D, Ryten M, Weale ME, Hardy J, Modic M, Curk T, Wilson SW, Plagnol V, Ule J. Recursive splicing in long vertebrate genes. Nature 2015; 521:371-375. [PMID: 25970246 PMCID: PMC4471124 DOI: 10.1038/nature14466] [Citation(s) in RCA: 111] [Impact Index Per Article: 11.1] [Received: 05/30/2014] [Accepted: 04/09/2015] [Indexed: 12/13/2022]
Abstract
It is generally believed that splicing removes introns as single units from precursor messenger RNA transcripts. However, some long Drosophila melanogaster introns contain a cryptic site, known as a recursive splice site (RS-site), that enables a multi-step process of intron removal termed recursive splicing. The extent to which recursive splicing occurs in other species and its mechanistic basis have not been examined. Here we identify highly conserved RS-sites in genes expressed in the mammalian brain that encode proteins functioning in neuronal development. Moreover, the RS-sites are found in some of the longest introns across vertebrates. We find that vertebrate recursive splicing requires initial definition of an 'RS-exon' that follows the RS-site. The RS-exon is then excluded from the dominant mRNA isoform owing to competition with a reconstituted 5' splice site formed at the RS-site after the first splicing step. Conversely, the RS-exon is included when preceded by cryptic promoters or exons that fail to reconstitute an efficient 5' splice site. Most RS-exons contain a premature stop codon such that their inclusion can decrease mRNA stability. Thus, by establishing a binary splicing switch, RS-sites demarcate different mRNA isoforms emerging from long genes by coupling cryptic elements with inclusion of RS-exons.
Affiliation(s)
- Christopher R Sibley
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Warren Emmett
- University College London Genetics Institute, Gower Street, London WC1E 6BT, UK
- Lorea Blazquez
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Ana Faro
- Department of Cell and Developmental Biology, University College London, Gower Street, London WC1E 6BT, UK
- Nejc Haberman
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Michael Briese
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Institute for Clinical Neurobiology, University of Würzburg, Versbacherstr. 5, 97078, Würzburg, Germany
- Daniah Trabzuni
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Department of Genetics, King Faisal Specialist Hospital and Research Centre, Riyadh 11211, Saudi Arabia
- Mina Ryten
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Department of Medical & Molecular Genetics, King’s College London, Guy’s Hospital, London, UK
- Michael E Weale
- King’s College London, Department of Medical & Molecular Genetics, Guy’s Hospital, London SE1 9RT, UK
- John Hardy
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Miha Modic
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Institute of Stem Cell Research, German Research Center for Environmental Health, Helmholtz Center Munich, 85764 Neuherberg, Germany
- Tomaž Curk
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Stephen W Wilson
- Department of Cell and Developmental Biology, University College London, Gower Street, London WC1E 6BT, UK
- Vincent Plagnol
- University College London Genetics Institute, Gower Street, London WC1E 6BT, UK
- Jernej Ule
- Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
|
122
|
Rutkowski AJ, Erhard F, L'Hernault A, Bonfert T, Schilhabel M, Crump C, Rosenstiel P, Efstathiou S, Zimmer R, Friedel CC, Dölken L. Widespread disruption of host transcription termination in HSV-1 infection. Nat Commun 2015; 6:7126. [PMID: 25989971 PMCID: PMC4441252 DOI: 10.1038/ncomms8126] [Citation(s) in RCA: 198] [Impact Index Per Article: 19.8] [Received: 01/17/2015] [Accepted: 04/07/2015] [Indexed: 02/07/2023]
Abstract
Herpes simplex virus 1 (HSV-1) is an important human pathogen and a paradigm for virus-induced host shut-off. Here we show that global changes in transcription and RNA processing and their impact on translation can be analysed in a single experimental setting by applying 4sU-tagging of newly transcribed RNA and ribosome profiling to lytic HSV-1 infection. Unexpectedly, we find that HSV-1 triggers the disruption of transcription termination of cellular, but not viral, genes. This results in extensive transcription for tens of thousands of nucleotides beyond poly(A) sites and into downstream genes, leading to novel intergenic splicing between exons of neighbouring cellular genes. As a consequence, hundreds of cellular genes seem to be transcriptionally induced but are not translated. In contrast to previous reports, we show that HSV-1 does not inhibit co-transcriptional splicing. Our approach thus substantially advances our understanding of HSV-1 biology and establishes HSV-1 as a model system for studying transcription termination. Herpes simplex virus 1 (HSV-1) efficiently shuts down host gene expression in infected cells. Here Rutkowski et al. analyse the genome-wide changes in transcription and translation in infected cells, and show that HSV-1 triggers an extensive disruption of transcription termination of cellular genes.
Affiliation(s)
- Andrzej J Rutkowski
- Division of Infectious Diseases, Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK
- Florian Erhard
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstraße 17, 80333 München, Germany
- Anne L'Hernault
- Division of Infectious Diseases, Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK
- Thomas Bonfert
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstraße 17, 80333 München, Germany
- Markus Schilhabel
- Institut für Klinische Molekularbiologie, Christian-Albrechts-Universität Kiel, Schittenhelmstraße 12, 24105 Kiel, Germany
- Colin Crump
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge CB2 1QP, UK
- Philip Rosenstiel
- Institut für Klinische Molekularbiologie, Christian-Albrechts-Universität Kiel, Schittenhelmstraße 12, 24105 Kiel, Germany
- Stacey Efstathiou
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge CB2 1QP, UK
- Ralf Zimmer
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstraße 17, 80333 München, Germany
- Caroline C Friedel
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstraße 17, 80333 München, Germany
- Lars Dölken
- [1] Division of Infectious Diseases, Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK [2] Institut für Virologie, Julius-Maximilians-Universität Würzburg, Versbacher Straße 7, 97078 Würzburg, Germany
|
123
|
Roberts TC, Morris KV, Wood MJA. The role of long non-coding RNAs in neurodevelopment, brain function and neurological disease. Philos Trans R Soc Lond B Biol Sci 2015; 369:rstb.2013.0507. [PMID: 25135968 DOI: 10.1098/rstb.2013.0507] [Citation(s) in RCA: 140] [Impact Index Per Article: 14.0] [Indexed: 12/14/2022]
Abstract
Long non-coding RNAs (lncRNAs) are transcripts with low protein-coding potential that represent a large proportion of the transcriptional output of the cell. Many lncRNAs exhibit features indicative of functionality including tissue-restricted expression, localization to distinct subcellular structures, regulated expression and evolutionary conservation. Some lncRNAs have been shown to associate with chromatin-modifying activities and transcription factors, suggesting that a common mode of action may be to guide protein complexes to target genomic loci. However, the functions (if any) of the vast majority of lncRNA transcripts are currently unknown, and the subject of investigation. Here, we consider the putative role(s) of lncRNAs in neurodevelopment and brain function with an emphasis on the epigenetic regulation of gene expression. Associations of lncRNAs with neurodevelopmental/neuropsychiatric disorders, neurodegeneration and brain cancers are also discussed.
Affiliation(s)
- Thomas C Roberts
- Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK; Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
- Kevin V Morris
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA; School of Biotechnology and Biomedical Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- Matthew J A Wood
- Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK
|
124
|
Caudron-Herger M, Cook PR, Rippe K, Papantonis A. Dissecting the nascent human transcriptome by analysing the RNA content of transcription factories. Nucleic Acids Res 2015; 43:e95. [PMID: 25897132 PMCID: PMC4538806 DOI: 10.1093/nar/gkv390] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 02/16/2015] [Accepted: 04/13/2015] [Indexed: 11/21/2022]
Abstract
While mapping total and poly-adenylated human transcriptomes has now become routine, characterizing nascent transcripts remains challenging, largely because nascent RNAs have such short half-lives. Here, we describe a simple, fast and cost-effective method to isolate RNA associated with transcription factories, the sites responsible for the majority of nuclear transcription. Following stimulation of human endothelial cells with the pro-inflammatory cytokine TNFα, we isolate and analyse the RNA content of factories by sequencing. Comparison with total, poly(A)+ and chromatin RNA fractions reveals that sequencing of purified factory RNA maps the complete nascent transcriptome; it is rich in intronic unprocessed transcript, as well as long intergenic non-coding (lincRNAs) and enhancer-associated RNAs (eRNAs), micro-RNA precursors and repeat-derived RNAs. Hence, we verify that transcription factories produce most nascent RNA and confer a regulatory role via their association with a set of specifically-retained non-coding transcripts.
Affiliation(s)
- Peter R Cook
- Sir William Dunn School of Pathology, University of Oxford, OX1 3RE Oxford, UK
- Karsten Rippe
- Deutsches Krebsforschungszentrum (DKFZ) & BioQuant, D-69120 Heidelberg, Germany
- Argyris Papantonis
- Sir William Dunn School of Pathology, University of Oxford, OX1 3RE Oxford, UK; Center for Molecular Medicine, University of Cologne, D-50931 Cologne, Germany
|
125
|
Adelson DL, Raison JM, Garber M, Edgar RC. Interspersed repeats in the horse (Equus caballus); spatial correlations highlight conserved chromosomal domains. Anim Genet 2010; 41 Suppl 2:91-9. [PMID: 21070282 DOI: 10.1111/j.1365-2052.2010.02115.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/25/2022]
Abstract
The interspersed repeat content of mammalian genomes has been best characterized in human, mouse and cow. In this study, we carried out de novo identification of repeated elements in the equine genome and identified previously unknown elements present at low copy number. The equine genome contains typical eutherian mammal repeats, but also has a significant number of hybrid repeats in addition to clade-specific Long Interspersed Nuclear Elements (LINE). Equus caballus clade specific LINE 1 (L1) repeats can be classified into approximately five subfamilies, three of which have undergone significant expansion. There are 1115 full-length copies of these equine L1, but of the 103 presumptive active copies, 93 fall within a single subfamily, indicating a rapid recent expansion of this subfamily. We also analysed both interspersed and simple sequence repeats (SSR) genome-wide, finding that some repeat classes are spatially correlated with each other as well as with G+C content and gene density. Based on these spatial correlations, we have confirmed that recently-described ancestral vs. clade-specific genome territories can be defined by their repeat content. The clade-specific Short Interspersed Nuclear Element correlations were scattered over the genome and appear to have been extensively remodelled. In contrast, territories enriched for ancestral repeats tended to be contiguous domains. To determine if the latter territories were evolutionarily conserved, we compared these results with a similar analysis of the human genome, and observed similar ancestral repeat enriched domains. These results indicate that ancestral, evolutionarily conserved mammalian genome territories can be identified on the basis of repeat content alone. Interspersed repeats of different ages appear to be analogous to geologic strata, allowing identification of ancient vs. newly remodelled regions of mammalian genomes.
Affiliation(s)
- D L Adelson
- School of Molecular and Biomedical Science, University of Adelaide, North Terrace, Adelaide, South Australia, Australia.
|
126
|
Lee ES, Akef A, Mahadevan K, Palazzo AF. The consensus 5' splice site motif inhibits mRNA nuclear export. PLoS One 2015; 10:e0122743. [PMID: 25826302 PMCID: PMC4380460 DOI: 10.1371/journal.pone.0122743] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 10/18/2014] [Accepted: 02/12/2015] [Indexed: 11/19/2022] Open
Abstract
In eukaryotes, mRNAs are synthesized in the nucleus and then exported to the cytoplasm where they are translated into proteins. We have mapped an element which, when present in the 3' terminal exon or in an unspliced mRNA, inhibits mRNA nuclear export. This element has the same sequence as the consensus 5' splice site motif that is used to define the start of introns. Previously it was shown that when this motif is retained in the mRNA, it causes defects in 3' cleavage and polyadenylation and promotes mRNA decay. Our new data indicate that this motif also inhibits nuclear export and promotes the targeting of transcripts to nuclear speckles, foci within the nucleus which have been linked to splicing. The motif, however, does not disrupt splicing or the recruitment of UAP56 or TAP/Nxf1 to the RNA, which are normally required for nuclear export. Genome-wide analysis of human mRNAs, lncRNAs and eRNAs indicates that this motif is depleted from naturally intronless mRNAs and eRNAs, but less so in lncRNAs. This motif is also depleted from the beginning and ends of the 3' terminal exons of spliced mRNAs, but less so for lncRNAs. Our data suggest that the presence of the 5' splice site motif in mature RNAs promotes their nuclear retention and may help to distinguish mRNAs from misprocessed transcripts and transcriptional noise.
Affiliation(s)
- Eliza S. Lee
- Department of Biochemistry, University of Toronto, 1 King’s College Circle, MSB Room 5336, Toronto, ON, M5S 1A8, Canada
| | - Abdalla Akef
- Department of Biochemistry, University of Toronto, 1 King’s College Circle, MSB Room 5336, Toronto, ON, M5S 1A8, Canada
| | - Kohila Mahadevan
- Department of Biochemistry, University of Toronto, 1 King’s College Circle, MSB Room 5336, Toronto, ON, M5S 1A8, Canada
| | - Alexander F. Palazzo
- Department of Biochemistry, University of Toronto, 1 King’s College Circle, MSB Room 5336, Toronto, ON, M5S 1A8, Canada
|
127
|
Barichievy S, Naidoo J, Mhlanga MM. Non-coding RNAs and HIV: viral manipulation of host dark matter to shape the cellular environment. Front Genet 2015; 6:108. [PMID: 25859257 PMCID: PMC4374539 DOI: 10.3389/fgene.2015.00108] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/18/2014] [Accepted: 03/02/2015] [Indexed: 11/13/2022] Open
Abstract
On October 28th 1943 Winston Churchill said “we shape our buildings, and afterward our buildings shape us” (Humes, 1994). Churchill was pondering how and when to rebuild the British House of Commons, which had been destroyed by enemy bombs on May 10th 1941. The old House had been small and insufficient to hold all its members, but was restored to its original form in 1950 in order to recapture the “convenience and dignity” that the building had shaped into its parliamentary members. The circular loop whereby buildings or dwellings are shaped and go on to shape those that reside in them is also true of pathogens and their hosts. As obligate parasites, pathogens need to alter their cellular host environments to ensure survival. Typically pathogens modify cellular transcription profiles and in doing so, the pathogen in turn is affected, thereby closing the loop. As key orchestrators of gene expression, non-coding RNAs provide a vast and extremely precise set of tools for pathogens to target in order to shape the cellular environment. This review will focus on host non-coding RNAs that are manipulated by the infamous intracellular pathogen, the human immunodeficiency virus (HIV). We will briefly describe both short and long host non-coding RNAs and discuss how HIV gains control of these factors to ensure widespread dissemination throughout the host as well as the establishment of lifelong, chronic infection.
Affiliation(s)
- Samantha Barichievy
- Gene Expression and Biophysics Group, Synthetic Biology Emerging Research Area, Council for Scientific and Industrial Research, Pretoria, South Africa; Discovery Sciences, Research & Development, AstraZeneca, Mölndal, Sweden
| | - Jerolen Naidoo
- Gene Expression and Biophysics Group, Synthetic Biology Emerging Research Area, Council for Scientific and Industrial Research, Pretoria, South Africa
| | - Musa M Mhlanga
- Gene Expression and Biophysics Group, Synthetic Biology Emerging Research Area, Council for Scientific and Industrial Research, Pretoria, South Africa; Gene Expression and Biophysics Unit, Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, Lisbon, Portugal
|
128
|
Abstract
Riboswitches present a ubiquitous genetic regulatory mechanism for prokaryotes and have been found in HIV1, fungi, plants, and even H. sapiens. We present an overview of approaches to predict riboswitch aptamers and, more generally, RNA conformational switches.
Affiliation(s)
- P Clote
- Biology Department, Boston College, Boston, Massachusetts, USA.
|
129
|
Legeai F, Derrien T. Identification of long non-coding RNAs in insects genomes. Current Opinion in Insect Science 2015; 7:37-44. [PMID: 32846672 DOI: 10.1016/j.cois.2015.01.003] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 11/30/2014] [Revised: 01/07/2015] [Accepted: 01/07/2015] [Indexed: 06/11/2023]
Abstract
The development of high-throughput sequencing technologies (HTS) has allowed researchers to better assess the complexity and diversity of the transcriptome. Among the many classes of non-coding RNAs (ncRNAs) identified in the last decade, long non-coding RNAs (lncRNAs) represent a diverse and numerous repertoire of important ncRNAs, reinforcing the view that they are of central importance to the cell machinery in all branches of life. Although lncRNAs have been implicated in essential biological processes such as imprinting, gene regulation or dosage compensation, especially in mammals, the repertoire of lncRNAs is poorly characterized for many non-model organisms. In this review, we first focus on what is known about experimentally validated lncRNAs in insects and then review bioinformatic methods to annotate lncRNAs in the genomes of hexapods.
Affiliation(s)
- Fabrice Legeai
- INRA, UMR1349, Institute of Genetics, Environment and Plant Protection, Domaine de la Motte, BP35327, 35653 Le Rheu cedex, France; IRISA/INRIA GenScale, Campus Beaulieu, 35000 Rennes, France.
| | - Thomas Derrien
- CNRS, UMR 6290, Institut de Génétique et Développement de Rennes, Université de Rennes 1, 2 Avenue du Pr. Léon Bernard, 35000 Rennes, France
|
130
|
Palazzo AF, Lee ES. Non-coding RNA: what is functional and what is junk? Front Genet 2015; 6:2. [PMID: 25674102 PMCID: PMC4306305 DOI: 10.3389/fgene.2015.00002] [Citation(s) in RCA: 557] [Impact Index Per Article: 55.7] [Received: 11/20/2014] [Accepted: 01/06/2015] [Indexed: 12/12/2022] Open
Abstract
The genomes of large multicellular eukaryotes are composed mostly of non-protein coding DNA. Although there has been much agreement that a small fraction of these genomes has important biological functions, there has been much debate as to whether the rest contributes to development and/or homeostasis. Much of the speculation has centered on the genomic regions that are transcribed into RNA at some low level. Unfortunately these RNAs have been arbitrarily assigned various names, such as “intergenic RNA,” “long non-coding RNAs” etc., which has led to some confusion in the field. Many researchers believe that these transcripts represent a vast, uncharted world of functional non-coding RNAs (ncRNAs), simply because they exist. However, there are reasons to question this Panglossian view because it ignores our current understanding of how evolution shapes eukaryotic genomes and how the gene expression machinery works in eukaryotic cells. Although there are undoubtedly many more functional ncRNAs yet to be discovered and characterized, it is also likely that many of these transcripts are simply junk. Here, we discuss how to determine whether any given ncRNA has a function. Importantly, we advocate that in the absence of any such data, the appropriate null hypothesis is that the RNA in question is junk.
Affiliation(s)
| | - Eliza S Lee
- Department of Biochemistry, University of Toronto, Toronto, ON, Canada
|
131
|
Ramos MJN, Coito JL, Silva HG, Cunha J, Costa MMR, Rocheta M. Flower development and sex specification in wild grapevine. BMC Genomics 2014; 15:1095. [PMID: 25495781 PMCID: PMC4363350 DOI: 10.1186/1471-2164-15-1095] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 03/31/2014] [Accepted: 11/26/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Wild plants of Vitis closely related to the cultivated grapevine (V. v. vinifera) are believed to have been first domesticated 10,000 years BC around the Caspian Sea. V. v. vinifera is hermaphrodite whereas V. v. sylvestris is a dioecious species. Male flowers show a reduced pistil without style or stigma and female flowers present reflexed stamens with infertile pollen. V. vinifera produce perfect flowers with all functional structures. The mechanism for flower sex determination and specification in grapevine is still unknown. RESULTS To understand which genes are involved during the establishment of male, female and complete flowers, we analysed and compared the transcription profiles of four developmental stages of the three genders. We showed that sex determination is a late event during flower development and that the expression of genes from the ABCDE model is not directly correlated with the establishment of sexual dimorphism. We propose a temporal comprehensive model in which two mutations in two linked genes could be players in sex determination and indirectly establish the Vitis domestication process. Additionally, we also found clusters of genes differentially expressed between genders and between developmental stages that suggest a role involved in sex differentiation. Also, the detection of differentially transcribed regions that extended existing gene models (intergenic regions) between sexes suggests that they may account for some of the variation between the subspecies. CONCLUSIONS There is no evidence of differences of expression levels in genes from the ABCDE model that could explain the shift from hermaphroditism to dioecy. 
We propose that sex specification occurs after floral organ identity has been established and that sex determination genes might therefore act downstream of the ABCDE model genes. For the first time, a full transcriptomic analysis was performed in different flower developmental stages in the same individual. Our experimental approach enabled us to create a comprehensive catalogue of transcribed genes across developmental stages and genders that will contribute to future work on sex determination in seed plants.
Affiliation(s)
- Miguel Jesus Nunes Ramos
- Universidade de Lisboa, Instituto Superior de Agronomia, CBAA, Tapada da Ajuda, 1359-017 Lisboa, Portugal
| | - João Lucas Coito
- Universidade de Lisboa, Instituto Superior de Agronomia, CBAA, Tapada da Ajuda, 1359-017 Lisboa, Portugal
| | - Helena Gomes Silva
- Center for Biodiversity Functional and Integrative Genomics (BioFIG), Plant Functional Biology Center, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
| | - Jorge Cunha
- Instituto Nacional de Investigação Agrária e Veterinária, Quinta d’Almoinha, Dois Portos, Portugal
- ITQB, Universidade Nova de Lisboa, Oeiras, Portugal
| | - Maria Manuela Ribeiro Costa
- Center for Biodiversity Functional and Integrative Genomics (BioFIG), Plant Functional Biology Center, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
| | - Margarida Rocheta
- Universidade de Lisboa, Instituto Superior de Agronomia, CBAA, Tapada da Ajuda, 1359-017 Lisboa, Portugal
|
132
|
Abstract
Long noncoding RNAs (lncRNAs) are a group of transcripts that are longer than 200 nucleotides and have no protein-coding function. LncRNAs can regulate gene expression at the levels of epigenetic modification, transcription and post-transcriptional processing, and participate in many physiological and pathological processes. It is becoming evident that lncRNAs may be an important class of pervasive genes involved in carcinogenesis and metastasis. Moreover, emerging studies have demonstrated that a class of lncRNAs are dysregulated in hepatocellular carcinoma (HCC) and closely related with tumorigenesis, metastasis and prognosis. As such, lncRNAs may be promising novel molecules for disease diagnosis, treatment and prognosis. Here, we review the recent progress in understanding the role of lncRNAs in HCC.
|
133
|
Quek XC, Thomson DW, Maag JLV, Bartonicek N, Signal B, Clark MB, Gloss BS, Dinger ME. lncRNAdb v2.0: expanding the reference database for functional long noncoding RNAs. Nucleic Acids Res 2014; 43:D168-73. [PMID: 25332394 PMCID: PMC4384040 DOI: 10.1093/nar/gku988] [Citation(s) in RCA: 396] [Impact Index Per Article: 36.0] [Indexed: 01/09/2023] Open
Abstract
Despite the prevalence of long noncoding RNA (lncRNA) genes in eukaryotic genomes, only a small proportion have been examined for biological function. lncRNAdb, available at http://lncrnadb.org, provides users with a comprehensive, manually curated reference database of 287 eukaryotic lncRNAs that have been described independently in the scientific literature. In addition to capturing a great proportion of the recent literature describing functions for individual lncRNAs, lncRNAdb now offers an improved user interface enabling greater accessibility to sequence information, expression data and the literature. The new features in lncRNAdb include the integration of Illumina Body Atlas expression profiles, nucleotide sequence information, a BLAST search tool and easy export of content via direct download or a REST API. lncRNAdb is now endorsed by RNAcentral and is in compliance with the International Nucleotide Sequence Database Collaboration.
Affiliation(s)
- Xiu Cheng Quek
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2052, Australia
| | - Daniel W Thomson
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia
| | - Jesper L V Maag
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2052, Australia
| | - Nenad Bartonicek
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia
| | - Bethany Signal
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia
| | - Michael B Clark
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia; MRC Functional Genomics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3PT, UK
| | - Brian S Gloss
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2052, Australia
| | - Marcel E Dinger
- Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2052, Australia
|
134
|
Qi Y, Kang YN, Zhao XD. Unexpected roles of long non-coding RNAs in cancer biology. ACTA ACUST UNITED AC 2014. [DOI: 10.1007/s12204-014-1538-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/21/2022]
|
135
|
Schliebner I, Becher R, Hempel M, Deising HB, Horbach R. New gene models and alternative splicing in the maize pathogen Colletotrichum graminicola revealed by RNA-Seq analysis. BMC Genomics 2014; 15:842. [PMID: 25281481 PMCID: PMC4194422 DOI: 10.1186/1471-2164-15-842] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 04/17/2014] [Accepted: 09/09/2014] [Indexed: 11/27/2022] Open
Abstract
BACKGROUND An annotated genomic sequence of the corn anthracnose fungus Colletotrichum graminicola has been published previously, but correct identification of gene models by means of automated gene annotation remains a challenge. RNA-Seq offers the potential for substantially improved gene annotations and for the identification of posttranscriptional RNA modifications, such as alternative splicing and RNA editing. RESULTS Based on the nucleotide sequence information of transcripts, we identified 819 novel transcriptionally active regions (nTARs) and revised 906 incorrectly predicted gene models, including revisions of exon-intron structure, gene orientation and sequencing errors. Among the nTARs, 146 share significant similarity with proteins that have been identified in other species suggesting that they are hitherto unidentified genes in C. graminicola. Moreover, 5'- and 3'-UTR sequences of 4378 genes have been retrieved and alternatively spliced variants of 69 genes have been identified. Comparative analysis of RNA-Seq data and the genome sequence did not provide evidence for RNA editing in C. graminicola. CONCLUSIONS We successfully employed deep sequencing RNA-Seq data in combination with an elaborate bioinformatics strategy in order to identify novel genes, incorrect gene models and mechanisms of transcript processing in the corn anthracnose fungus C. graminicola. Sequence data of the revised genome annotation including several hundreds of novel transcripts, improved gene models and candidate genes for alternative splicing have been made accessible in a comprehensive database. Our results significantly contribute to both routine laboratory experiments and large-scale genomics or transcriptomic studies in C. graminicola.
Affiliation(s)
- Ivo Schliebner
- Interdisciplinary Center for Crop Plant Research, Martin-Luther-University Halle-Wittenberg, Betty-Heimann-Str. 3, D-06120 Halle (Saale), Germany
| | - Rayko Becher
- Interdisciplinary Center for Crop Plant Research, Martin-Luther-University Halle-Wittenberg, Betty-Heimann-Str. 3, D-06120 Halle (Saale), Germany
| | - Marcus Hempel
- Interdisciplinary Center for Crop Plant Research, Martin-Luther-University Halle-Wittenberg, Betty-Heimann-Str. 3, D-06120 Halle (Saale), Germany
| | - Holger B Deising
- Interdisciplinary Center for Crop Plant Research, Martin-Luther-University Halle-Wittenberg, Betty-Heimann-Str. 3, D-06120 Halle (Saale), Germany
- Institute for Agricultural and Nutritional Sciences, Martin-Luther-University Halle-Wittenberg, Betty-Heimann-Str. 3, D-06120 Halle (Saale), Germany
| | - Ralf Horbach
- Interdisciplinary Center for Crop Plant Research, Martin-Luther-University Halle-Wittenberg, Betty-Heimann-Str. 3, D-06120 Halle (Saale), Germany
|
136
|
Abstract
Over the past decade there has been a greater understanding of genomic complexity in eukaryotes, ushered in by the immense technological advances in high-throughput sequencing of DNA and its corresponding RNA transcripts. This has resulted in the realization that beyond protein-coding genes, there are a large number of transcripts that do not encode proteins and, therefore, may perform their function through RNA sequences and/or through secondary and tertiary structural determinants. This review is focused on the latest findings on a class of noncoding RNAs that are relatively large (>200 nucleotides), display nuclear localization, and use different strategies to regulate transcription. These are exciting times for discovering the biological scope and the mechanism of action for these RNA molecules, which have roles in dosage compensation, imprinting, enhancer function, and transcriptional regulation, with a great impact on development and disease.
Affiliation(s)
- Roberto Bonasio
- Department of Cell and Developmental Biology and Epigenetics Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104
|
137
|
Abstract
A paragraph from the highlights of “Transcriptomics: Throwing light on dark matter” by L. Flintoft (Nature Reviews Genetics 11, 455, 2010), says: “Reports over the past few years of extensive transcription throughout eukaryotic genomes have led to considerable excitement. However, doubts have been raised about the methods that have detected this pervasive transcription and about how much of it is functional.” Since the appearance of the ENCODE project and due to follow-up work, a shift from the pervasive transcription observed from RNA-Seq data to its functional validation is gradually occurring. However, much less attention has been turned to the problem of deciphering the complexity of transcriptome data, which determines uncertainty with regard to identification, quantification and differential expression of genes and non-coding RNAs. The aim of this mini-review is to emphasize transcriptome-related problems of direct and inverse nature for which novel inference approaches are needed.
Affiliation(s)
- Enrico Capobianco
- Center for Computational Science, University of Miami, Miami, FL, USA
|
138
|
Cao J. The functional role of long non-coding RNAs and epigenetics. Biol Proced Online 2014; 16:11. [PMID: 25276098 PMCID: PMC4177375 DOI: 10.1186/1480-9222-16-11] [Citation(s) in RCA: 256] [Impact Index Per Article: 23.3] [Received: 06/09/2014] [Accepted: 09/06/2014] [Indexed: 02/07/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) are non-protein coding transcripts longer than 200 nucleotides. These lncRNAs influence post-transcriptional regulation by interfering with microRNA pathways and are involved in diverse cellular processes. The regulation of gene expression by lncRNAs at the epigenetic, transcriptional and post-transcriptional levels is well known and widely studied. It has recently been recognized that lncRNAs affect many biological and pathological processes, such as stem cell pluripotency, neurogenesis and oncogenesis. This review focuses on the functional roles of lncRNAs in epigenetics and summarizes related research progress.
Affiliation(s)
- Jinneng Cao
- Department of respiratory medicine, Fuyong People's Hospital, Baoan District, Shenzhen 518103, Guangdong, People's Republic of China
|
139
|
Lee JH, Lee T, Lee HK, Cho BW, Shin DH, Do KT, Sung S, Kwak W, Kim HJ, Kim H, Cho S, Park KD. Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB. Asian-Australasian Journal of Animal Sciences 2014; 27:1236-43. [PMID: 25178365 PMCID: PMC4150188 DOI: 10.5713/ajas.2013.13694] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/04/2013] [Revised: 02/20/2014] [Accepted: 06/21/2014] [Indexed: 01/17/2023]
Abstract
Genetics is important for the breeding and selection of horses, but well-established horse-related browsers and databases are lacking. To better understand horses, more variants and other integrated information are needed. We therefore constructed a horse genomic variant database that includes expression and other information. The Horse Single Nucleotide Polymorphism and Expression Database (HSDB) (http://snugenome2.snu.ac.kr/HSDB) provides a number of previously unexplored genomic variants in the horse genome, including rare variants, identified using population genome sequences of eighteen horses and RNA-seq of four horses. The identified single nucleotide polymorphisms (SNPs) were confirmed by comparing them with SNP chip data and with variants from RNA-seq, which showed concordance levels of 99.02% and 96.6%, respectively. Moreover, the database provides the genomic variants with their corresponding transcriptional profiles from the same individuals to help understand the functional aspects of these variants. The database will contribute to genetic improvement and breeding strategies of Thoroughbreds.
Affiliation(s)
- Joon-Ho Lee
- Genomic Informatics Center, Hankyong National University, Anseong 456-749, Korea
| | - Taeheon Lee
- Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Korea
| | - Hak-Kyo Lee
- Genomic Informatics Center, Hankyong National University, Anseong 456-749, Korea
| | - Byung-Wook Cho
- Department of Animal Science, College of Life Sciences, Pusan National University, Miryang 627-702, Korea
| | - Dong-Hyun Shin
- Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Korea
| | - Kyoung-Tag Do
- Department of Equine Sciences, Sorabol College, Gyeongju 780-711, Korea
| | - Samsun Sung
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea
| | - Woori Kwak
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Korea
| | - Hyeon Jeong Kim
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea
| | - Heebal Kim
- Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Korea
| | - Seoae Cho
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea
| | - Kyung-Do Park
- Genomic Informatics Center, Hankyong National University, Anseong 456-749, Korea
|
141
|
White NM, Cabanski CR, Silva-Fisher JM, Dang HX, Govindan R, Maher CA. Transcriptome sequencing reveals altered long intergenic non-coding RNAs in lung cancer. Genome Biol 2014; 15:429. [PMID: 25116943 PMCID: PMC4156652 DOI: 10.1186/s13059-014-0429-8] [Citation(s) in RCA: 165] [Impact Index Per Article: 15.0] [Received: 04/15/2014] [Accepted: 07/31/2014] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Long intergenic non-coding RNAs (lncRNAs) represent an emerging and under-studied class of transcripts that play a significant role in human cancers. Due to the tissue- and cancer-specific expression patterns observed for many lncRNAs, it is believed that they could serve as ideal diagnostic biomarkers. However, until each tumor type is examined more closely, many of these lncRNAs will remain elusive. RESULTS Here we characterize the lncRNA landscape in lung cancer using publicly available transcriptome sequencing data from a cohort of 567 adenocarcinoma and squamous cell carcinoma tumors. Through this compendium we identify over 3,000 unannotated intergenic transcripts representing novel lncRNAs. Through comparison of both adenocarcinoma and squamous cell carcinomas with matched controls we discover 111 differentially expressed lncRNAs, which we term lung cancer-associated lncRNAs (LCALs). A pan-cancer analysis of 324 additional tumor and adjacent normal pairs enables us to identify a subset of lncRNAs that display enriched expression specific to lung cancer as well as a subset that appear to be broadly deregulated across human cancers. Integration of exome sequencing data reveals that expression levels of many LCALs have significant associations with the mutational status of key oncogenes in lung cancer. Functional validation, using both knockdown and overexpression, shows that the most differentially expressed lncRNA, LCAL1, plays a role in cellular proliferation. CONCLUSIONS Our systematic characterization of publicly available transcriptome data provides the foundation for future efforts to understand the role of LCALs, develop novel biomarkers, and improve knowledge of lung tumor biology.
|
142
|
Influence of RNA extraction methods and library selection schemes on RNA-seq data. BMC Genomics 2014; 15:675. [PMID: 25113896 PMCID: PMC4148917 DOI: 10.1186/1471-2164-15-675] [Citation(s) in RCA: 123] [Impact Index Per Article: 11.2] [Received: 04/09/2014] [Accepted: 08/04/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Gene expression analysis by RNA sequencing is now widely used in a number of applications surveying the whole transcriptomes of cells and tissues. The recent introduction of ribosomal RNA depletion protocols, such as RiboZero, has extended the view of the polyadenylated transcriptome to the poly(A)- fraction of the RNA. However, substantial amounts of intronic transcriptional activity have been reported in RiboZero protocols, raising issues regarding their potential nuclear origin and the impact on the actual sequence depth in exonic regions. RESULTS Using HEK293 human cells as source material, we assessed here the impact of the two commonly used RNA extraction methods and of the library construction protocols (rRNA depletion versus mRNA) on 1) the relative abundance of intronic reads and 2) the estimation of gene expression values. We benchmarked the rRNA depletion-based sequencing with a specific analysis of the cytoplasmic and nuclear transcriptome fractions, suggesting that the large majority of the intronic reads correspond to unprocessed nuclear transcripts rather than to independent transcriptional units. We show that Qiagen or TRIzol extraction methods differentially retain nuclear RNA species, and that consequently, rRNA depletion-based RNA sequencing protocols are particularly sensitive to the extraction methods. CONCLUSIONS We could show that the combination of TRIzol-based RNA extraction with rRNA depletion sequencing protocols led to the largest fraction of intronic reads, after the sequencing of the nuclear transcriptome. We discuss here the impact of the various strategies on gene expression and alternative splicing estimation measures. Further, we propose guidelines and a double selection strategy for minimizing the expression biases, without loss of information.
|
143
|
Brosius J. The persistent contributions of RNA to eukaryotic gen(om)e architecture and cellular function. Cold Spring Harb Perspect Biol 2014; 6:a016089. [PMID: 25081515 DOI: 10.1101/cshperspect.a016089] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 02/01/2023]
Abstract
Currently, the best scenario for the earliest forms of life is based on RNA molecules as they have the proven ability to catalyze enzymatic reactions and harbor genetic information. Evolutionary principles valid today already become apparent in such models. Furthermore, many features of eukaryotic genome architecture might have their origins in an RNA or RNA/protein (RNP) world, including the onset of a further transition, when DNA replaced RNA as the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatory function via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes, conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extant genomes. Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery.
Affiliation(s)
- Jürgen Brosius
- Institute of Experimental Pathology (ZMBE), University of Münster, D-48149 Münster, Germany
|
144
|
Padrón A, Molina-Cruz A, Quinones M, Ribeiro JM, Ramphul U, Rodrigues J, Shen K, Haile A, Ramirez JL, Barillas-Mury C. In depth annotation of the Anopheles gambiae mosquito midgut transcriptome. BMC Genomics 2014; 15:636. [PMID: 25073905 PMCID: PMC4131051 DOI: 10.1186/1471-2164-15-636] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 02/27/2014] [Accepted: 07/01/2014] [Indexed: 11/25/2022] Open
Abstract
Background Genome sequencing of Anopheles gambiae was completed more than ten years ago and has accelerated research on malaria transmission. However, annotation needs to be refined and verified experimentally, as most predicted transcripts have been identified by comparative analysis with genomes from other species. The mosquito midgut—the first organ to interact with Plasmodium parasites—mounts effective antiplasmodial responses that limit parasite survival and disease transmission. High-throughput Illumina sequencing of the midgut transcriptome was used to identify new genes and transcripts, contributing to the refinement of An. gambiae genome annotation. Results We sequenced ~223 million reads from An. gambiae midgut cDNA libraries generated from susceptible (G3) and refractory (L35) mosquito strains. Mosquitoes were infected with either Plasmodium berghei or Plasmodium falciparum, and midguts were collected after the first or second Plasmodium infection. In total, 22,889 unique midgut transcript models were generated from both An. gambiae strain sequences combined, and 76% are potentially novel. Of these novel transcripts, 49.5% aligned with annotated genes and appear to be isoforms or pre-mRNAs of reference transcripts, while 50.5% mapped to regions between annotated genes and represent novel intergenic transcripts (NITs). Predicted models were validated for midgut expression using qRT-PCR and microarray analysis, and novel isoforms were confirmed by sequencing predicted intron-exon boundaries. Coding potential analysis revealed that 43% of total midgut transcripts appear to be long non-coding RNA (lncRNA), and functional annotation of NITs showed that 68% had no homology to current databases from other species. Reads were also analyzed using de novo assembly and predicted transcripts compared with genome mapping-based models. Finally, variant analysis of G3 and L35 midgut transcripts detected 160,742 variants with respect to the An. gambiae PEST genome, and 74% were new variants. Intergenic transcripts had a higher frequency of variation compared with non-intergenic transcripts. Conclusion This in-depth Illumina sequencing and assembly of the An. gambiae midgut transcriptome doubled the number of known transcripts and tripled the number of variants known in this mosquito species. It also revealed existence of a large number of lncRNA and opens new possibilities for investigating the biological function of many newly discovered transcripts. Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-636) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alvaro Molina-Cruz
- Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD, USA.
|
145
|
Pervasive transcription: illuminating the dark matter of bacterial transcriptomes. Nat Rev Microbiol 2014; 12:647-53. [DOI: 10.1038/nrmicro3316] [Citation(s) in RCA: 183] [Impact Index Per Article: 16.6] [Indexed: 01/15/2023]
|
146
|
Vikram R, Ramachandran R, Abdul KSM. Functional significance of long non-coding RNAs in breast cancer. Breast Cancer 2014; 21:515-21. [PMID: 25038622 DOI: 10.1007/s12282-014-0554-y] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 04/20/2014] [Accepted: 06/30/2014] [Indexed: 01/26/2023]
Abstract
Most of the genome is transcribed into transcripts with no protein-coding potential. However, these transcripts do not represent transcriptional 'noise'; rather, they play an important role in cellular metabolism and development. Non-coding transcripts of 200 bases to 100 kb in length are termed long non-coding RNAs, the majority of which are yet to be characterised thoroughly. Long non-coding RNAs (lncRNAs) play a significant role in cellular processes ranging from transcriptional to post-transcriptional regulation. In this review, we highlight the recent efforts to characterise the major functions of lncRNAs in breast cancer. lncRNA expression is altered in several cancer types. Further, the aberrant regulation of lncRNAs promotes tumour development as they are involved in several cancer-associated pathways.
Affiliation(s)
- Rajeev Vikram
- School of Science and Technology, Nottingham Trent University, Clifton Campus, Nottingham, NG11 8NS, UK
|
147
|
Zhao X, Zhu W, Zha W, Chen F, Wu Z, Liu Y, Huang M. Expression profiles and initial confirmation of long noncoding RNAs in Chinese patients with pulmonary adenocarcinoma. Onco Targets Ther 2014; 7:1195-204. [PMID: 25061321 PMCID: PMC4085304 DOI: 10.2147/ott.s64033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022] Open
Abstract
Background The purpose of this study was to investigate differentially expressed long noncoding RNAs (lncRNAs) in pulmonary adenocarcinoma tissue and adjacent noncancerous tissue from Chinese patients using lncRNA expression microarray and preliminary analysis. Methods RNA extracted from three paired pulmonary adenocarcinoma tissue and adjacent noncancerous tissue specimens was used to synthesize double-stranded complementary DNA after labeling and hybridization. The complementary DNA was labeled and hybridized to the lncRNA expression microarray, and array data were analyzed for hierarchical clustering. Gene coexpression networks were constructed to identify interactions among genes. To validate the microarray findings, we measured the relative expression levels of four random differentially expressed lncRNAs in the same tissue used for microarray using real-time quantitative polymerase chain reaction. The expression level of one lncRNA, AK124939, in the paired pulmonary adenocarcinoma/adjacent noncancerous tissue of another 30 patients was measured using real-time quantitative polymerase chain reaction. The experimental data were further analyzed and compared with clinical features. Results Of 39,000 lncRNAs investigated, 704 were differentially expressed in pulmonary adenocarcinoma tissue; 385 were upregulated and 319 were downregulated compared with those in the adjacent noncancerous tissue (fold change ≥2 or ≤−2, P<0.05). AK124939 expression levels in poorly differentiated adenocarcinoma tissue were lower than those found in well to moderately differentiated adenocarcinoma tissue (P=0.05). Conclusion There are significant differences in the lncRNA expression profiles in Chinese patients with pulmonary adenocarcinoma. LncRNAs such as AK124939 may be anticancer factors related to the progression of pulmonary adenocarcinoma.
Affiliation(s)
- Xin Zhao
- Department of Respiratory Medicine, The First Affiliated Hospital, Nanjing Medical University, Nanjing, People's Republic of China
- Wen Zhu
- Department of Respiratory Medicine, The First Affiliated Hospital, Nanjing Medical University, Nanjing, People's Republic of China
- Wangjian Zha
- Department of Respiratory Medicine, The First Affiliated Hospital, Nanjing Medical University, Nanjing, People's Republic of China
- Feifei Chen
- Department of Respiratory Medicine, The First Affiliated Hospital, Nanjing Medical University, Nanjing, People's Republic of China
- Zhenzhen Wu
- Department of Respiratory Medicine, The First Affiliated Hospital, Nanjing Medical University, Nanjing, People's Republic of China
- Yanan Liu
- Department of Respiratory Medicine, The First Affiliated Hospital, Nanjing Medical University, Nanjing, People's Republic of China
- Mao Huang
- Department of Respiratory Medicine, The First Affiliated Hospital, Nanjing Medical University, Nanjing, People's Republic of China
|
148
|
Johnson R, Guigó R. The RIDL hypothesis: transposable elements as functional domains of long noncoding RNAs. RNA (NEW YORK, N.Y.) 2014; 20:959-76. [PMID: 24850885 PMCID: PMC4114693 DOI: 10.1261/rna.044560.114] [Citation(s) in RCA: 205] [Impact Index Per Article: 18.6] [Indexed: 05/08/2023]
Abstract
Our genome contains tens of thousands of long noncoding RNAs (lncRNAs), many of which are likely to have genetic regulatory functions. It has been proposed that lncRNA are organized into combinations of discrete functional domains, but the nature of these and their identification remain elusive. One class of sequence elements that is enriched in lncRNA is represented by transposable elements (TEs), repetitive mobile genetic sequences that have contributed widely to genome evolution through a process termed exaptation. Here, we link these two concepts by proposing that exonic TEs act as RNA domains that are essential for lncRNA function. We term such elements Repeat Insertion Domains of LncRNAs (RIDLs). A growing number of RIDLs have been experimentally defined, where TE-derived fragments of lncRNA act as RNA-, DNA-, and protein-binding domains. We propose that these reflect a more general phenomenon of exaptation during lncRNA evolution, where inserted TE sequences are repurposed as recognition sites for both protein and nucleic acids. We discuss a series of genomic screens that may be used in the future to systematically discover RIDLs. The RIDL hypothesis has the potential to explain how functional evolution can keep pace with the rapid gene evolution observed in lncRNA. More practically, TE maps may in the future be used to predict lncRNA function.
Affiliation(s)
- Rory Johnson
- Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), 08003 Barcelona, Spain
- Corresponding author
- Roderic Guigó
- Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), 08003 Barcelona, Spain
|
149
|
Thavamanikumar S, Southerton S, Thumma B. RNA-Seq using two populations reveals genes and alleles controlling wood traits and growth in Eucalyptus nitens. PLoS One 2014; 9:e101104. [PMID: 24967893 PMCID: PMC4072731 DOI: 10.1371/journal.pone.0101104] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 03/23/2014] [Accepted: 06/02/2014] [Indexed: 11/17/2022] Open
Abstract
Eucalyptus nitens is a perennial forest tree species grown mainly for kraft pulp production in many parts of the world. Kraft pulp yield (KPY) is a key determinant of plantation profitability and increasing the KPY of trees grown in plantations is a major breeding objective. To speed up the breeding process, molecular markers that can predict KPY are desirable. To achieve this goal, we carried out RNA-Seq studies on trees at extremes of KPY in two different trials to identify genes and alleles whose expression correlated with KPY. KPY is positively correlated with growth measured as diameter at breast height (DBH) in both trials. In total, six RNA bulks from two treatments were sequenced on an Illumina HiSeq platform. At 5% false discovery rate level, 3953 transcripts showed differential expression in the same direction in both trials; 2551 (65%) were down-regulated and 1402 (35%) were up-regulated in low KPY samples. The genes up-regulated in low KPY trees were largely involved in biotic and abiotic stress response reflecting the low growth among low KPY trees. Genes down-regulated in low KPY trees mainly belonged to gene categories involved in wood formation and growth. Differential allelic expression was observed in 2103 SNPs (in 1068 genes) and of these 640 SNPs (30%) occurred in 313 unique genes that were also differentially expressed. These SNPs may represent the cis-acting regulatory variants that influence total gene expression. In addition we also identified 196 genes which had Ka/Ks ratios greater than 1.5, suggesting that these genes are under positive selection. Candidate genes and alleles identified in this study will provide a valuable resource for future association studies aimed at identifying molecular markers for KPY and growth.
Affiliation(s)
- Saravanan Thavamanikumar
- Department of Forest and Ecosystem Science, University of Melbourne, Creswick, Victoria, Australia
- Bala Thumma
- CSIRO Plant Industry, Acton, ACT, Australia
|
150
|
Hu HY, He L, Khaitovich P. Deep sequencing reveals a novel class of bidirectional promoters associated with neuronal genes. BMC Genomics 2014; 15:457. [PMID: 24916849 PMCID: PMC4094773 DOI: 10.1186/1471-2164-15-457] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 11/14/2013] [Accepted: 05/27/2014] [Indexed: 12/22/2022] Open
Abstract
Background Comprehensive annotation of transcripts expressed in a given tissue is a critical step towards the understanding of regulatory and functional pathways that shape the transcriptome. Results Here, we reconstructed a cumulative transcriptome of the human prefrontal cortex (PFC) based on approximately 300 million strand-specific RNA sequence (RNA-seq) reads collected at different stages of postnatal development. We find that more than 50% of reconstructed transcripts represent novel transcriptome elements, including 8,343 novel exons and exon extensions of annotated coding genes, 11,217 novel antisense transcripts and 29,541 novel intergenic transcripts or their fragments showing canonical features of long non-coding RNAs (lncRNAs). Our analysis further led to a surprising discovery of a novel class of bidirectional promoters (NBiPs) driving divergent transcription of mRNA and novel lncRNA pairs and displaying a distinct set of sequence and epigenetic features. In contrast to known bidirectional and unidirectional promoters, NBiPs are strongly associated with genes involved in neuronal functions and regulated by neuron-associated transcription factors. Conclusions Taken together, our results demonstrate that large portions of the human transcriptome remain uncharacterized. The distinct sequence and epigenetic features of NBiPs, as well as their specific association with neuronal genes, further suggest the existence of regulatory pathways specific to the human brain. Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-457) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Hai Yang Hu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, 320 Yue Yang Road, 200031 Shanghai, China.
|