251
|
Transcriptomics in the RNA-seq era. Curr Opin Chem Biol 2013; 17:4-11. [PMID: 23290152 DOI: 10.1016/j.cbpa.2012.12.008] [Citation(s) in RCA: 190] [Impact Index Per Article: 15.8] [Received: 08/30/2012] [Revised: 11/07/2012] [Accepted: 12/02/2012] [Indexed: 12/12/2022]
Abstract
The transcriptomics field has developed rapidly with the advent of next-generation sequencing technologies. RNA-seq has now displaced microarrays as the preferred method for gene expression profiling. The comprehensive nature of the data generated has been a boon in terms of transcript identification but analysis challenges remain. Key among these problems is the development of suitable expression metrics for expression level comparisons and methods for identification of differentially expressed genes (and exons). Several approaches have been developed but as yet no consensus exists on the best pipeline to use. De novo transcriptome approaches are increasingly viable for organisms lacking a sequenced genome. The reduction in starting RNA required has enabled the development of new applications such as single cell transcriptomics. The emerging picture of mammalian transcription is complex with further refinement expected with the integration of epigenomic data generated by projects such as ENCODE.
|
252
|
Schaffer ME, Platero JS. Pharmacogenomics in Cancer Therapeutics. Pharmacogenomics 2013. [DOI: 10.1016/b978-0-12-391918-2.00004-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022] Open
|
253
|
Abstract
Deep sequencing has many possible applications; one of them is the identification and quantification of RNA editing sites. The most common type of RNA editing is adenosine to inosine (A-to-I) editing. A prerequisite for this editing process is a double-stranded RNA (dsRNA) structure. Such dsRNAs are formed as part of the microRNA (miRNA) maturation process, and it is therefore expected that miRNAs are affected by A-to-I editing. Indeed, tens of editing sites were found in miRNAs, some of which change the miRNA binding specificity. Here, we describe a protocol for the identification of RNA editing sites in mature miRNAs using deep sequencing data.
Affiliation(s)
- Shahar Alon
- George S. Wise Faculty of Life Sciences, Department of Neurobiology, Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
|
254
|
Lefkovits I. Alacrity of Cells Engaged in the Immune Response. Scand J Immunol 2012; 77:1-12. [DOI: 10.1111/sji.12003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/22/2012] [Accepted: 09/24/2012] [Indexed: 01/30/2023]
Affiliation(s)
- I. Lefkovits
- Department of Biomedicine; University Hospital Basel; Basel; Switzerland
|
255
|
Altelaar AFM, Munoz J, Heck AJR. Next-generation proteomics: towards an integrative view of proteome dynamics. Nat Rev Genet 2012. [PMID: 23207911 DOI: 10.1038/nrg3356] [Citation(s) in RCA: 526] [Impact Index Per Article: 40.5] [Indexed: 12/12/2022]
Abstract
Next-generation sequencing allows the analysis of genomes, including those representing disease states. However, the causes of most disorders are multifactorial, and systems-level approaches, including the analysis of proteomes, are required for a more comprehensive understanding. The proteome is extremely multifaceted owing to splicing and protein modifications, and this is further amplified by the interconnectivity of proteins into complexes and signalling networks that are highly divergent in time and space. Proteome analysis heavily relies on mass spectrometry (MS). MS-based proteomics is starting to mature and to deliver through a combination of developments in instrumentation, sample preparation and computational analysis. Here we describe this emerging next generation of proteomics and highlight recent applications.
Affiliation(s)
- A F Maarten Altelaar
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
|
256
|
Evans VC, Barker G, Heesom KJ, Fan J, Bessant C, Matthews DA. De novo derivation of proteomes from transcriptomes for transcript and protein identification. Nat Methods 2012; 9:1207-11. [PMID: 23142869 PMCID: PMC3581816 DOI: 10.1038/nmeth.2227] [Citation(s) in RCA: 135] [Impact Index Per Article: 10.4] [Received: 05/16/2012] [Accepted: 09/21/2012] [Indexed: 11/30/2022]
Abstract
Identification of proteins by tandem mass spectrometry requires a reference protein database, but these are only available for model species. Here we demonstrate that, for a non-model species, the sequencing of expressed mRNA can generate a protein database for mass spectrometry-based identification. This combination of high-throughput sequencing and protein identification technologies allows detection of genes and proteins. We use human cells infected with human adenovirus as a complex and dynamic model to demonstrate the robustness of this approach. Our proteomics informed by transcriptomics (PIT) technique identifies >99% of over 3,700 distinct proteins identified using traditional analysis that relies on comprehensive human and adenovirus protein lists. We show that this approach can also be used to highlight genes and proteins undergoing dynamic changes in post-transcriptional protein stability.
Affiliation(s)
- Vanessa C. Evans
- School of Cellular and Molecular Medicine, University of Bristol, University Walk, Bristol. BS8 1TD. UK
- Gary Barker
- School of Biological Sciences, University of Bristol, University Walk, Bristol. BS8 1TD. UK
- Kate J. Heesom
- School of Biochemistry, University of Bristol, University Walk, Bristol. BS8 1TD. UK
- Jun Fan
- Bioinformatics Group, Cranfield Health, Cranfield University, Cranfield, Bedfordshire. MK43 0AL. UK
- Conrad Bessant
- Bioinformatics Group, Cranfield Health, Cranfield University, Cranfield, Bedfordshire. MK43 0AL. UK
- David A. Matthews
- School of Cellular and Molecular Medicine, University of Bristol, University Walk, Bristol. BS8 1TD. UK
|
257
|
Transcription Factors and Gene Expression. Mol Pharmacol 2012. [DOI: 10.1002/9781118451908.ch8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022] Open
|
258
|
Dahabieh MS, Samanta D, Brodovitch JC, Frech C, O'Neill MA, Pinto BM. Sequence-dependent structural dynamics of primate adenosine-to-inosine editing substrates. Chembiochem 2012. [PMID: 23193088 DOI: 10.1002/cbic.201200526] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/09/2023]
Abstract
Humans have the highest level of adenosine-to-inosine (A-to-I) editing amongst primates, yet the reasons for this difference remain unclear. Sequence analysis of the Alu Sg elements (A-to-I RNA substrates) corresponding to the Nup50 gene in human, chimp, and rhesus reveals subtle sequence variations surrounding the edit sites. We have developed three constructs that represent human (HuAp5), chimp (ChAp5), and rhesus (RhAp5) Nup50 Alu Sg A-to-I editing substrates. Here, 2-aminopurine (2-Ap) was substituted for edited adenosine (A5) so as to monitor the fluorescence intensity with respect to temperature. UV and steady-state fluorescence (SSF) T(M) plots indicate that local and global unfolding are coincident, with the human construct displaying a T(M) of approximately 70°C, compared to 60°C for chimp and 54°C for rhesus. However, time-resolved fluorescence (TRF) resolves three different fluorescence lifetimes that we assign to folded, intermediate(s), and unfolded states. The TRF data fit well to a two-intermediate model, whereby both intermediates (M, J) are in equilibrium with each other, and the folded/unfolded states. Our model suggests that, at 37°C, human state J and the folded state will be the most heavily populated in comparison to the other primate constructs. In order for adenosine deaminase acting on RNA (ADAR) to efficiently dock, a stable duplex must be present that corresponds to the human construct, globally. Next, the enzyme must "flip out" the base of interest to facilitate the A-to-I conversion; a nucleotide in an intermediate-like position would enhance this conformational change. Our experiments demonstrate that subtle variations in RNA sequence might contribute to the high A-to-I editing levels found in humans.
|
259
|
Chen R, Snyder M. Promise of personalized omics to precision medicine. Wiley Interdiscip Rev Syst Biol Med 2012. [PMID: 23184638 DOI: 10.1002/wsbm.1198] [Citation(s) in RCA: 201] [Impact Index Per Article: 15.5] [Indexed: 12/11/2022]
Abstract
The rapid development of high-throughput technologies and computational frameworks enables the examination of biological systems in unprecedented detail. The ability to study biological phenomena at omics levels in turn is expected to lead to significant advances in personalized and precision medicine. Patients can be treated according to their own molecular characteristics. Individual omes as well as the integrated profiles of multiple omes, such as the genome, the epigenome, the transcriptome, the proteome, the metabolome, the antibodyome, and other omics information are expected to be valuable for health monitoring, preventative measures, and precision medicine. Moreover, omics technologies have the potential to transform medicine from traditional symptom-oriented diagnosis and treatment of diseases toward disease prevention and early diagnostics. We discuss here the advances and challenges in systems biology-powered personalized medicine at its current stage, as well as a prospective view of future personalized health care at the end of this review.
Affiliation(s)
- Rui Chen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
|
260
|
Abstract
The central dogma of molecular biology has come under scrutiny in recent years. Here, we reviewed high-throughput mRNA and protein expression data of Escherichia coli, Saccharomyces cerevisiae, and several mammalian cells. At both single cell and population scales, the statistical comparisons between the entire transcriptomes and proteomes show clear correlation structures. In contrast, the pair-wise correlations of single transcripts to proteins show nullity. These data suggest that the organizing structure guiding cellular processes is observed at omics-wide scale, and not at single molecule level. The central dogma, thus, globally emerges as an average integrated flow of cellular information.
Affiliation(s)
- Vincent Piras
- Institute for Advanced Biosciences, Keio University Tsuruoka, Yamagata, Japan ; Graduate School of Media and Governance, Keio University Fujisawa, Kanagawa, Japan
|
261
|
Systematic investigation of insertional and deletional RNA-DNA differences in the human transcriptome. BMC Genomics 2012; 13:616. [PMID: 23148664 PMCID: PMC3505181 DOI: 10.1186/1471-2164-13-616] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 07/27/2012] [Accepted: 11/07/2012] [Indexed: 12/04/2022] Open
Abstract
Background The genomic information which is transcribed into the primary RNA can be altered by RNA editing at the transcriptional or post-transcriptional level, which provides an effective way to create transcript diversity in an organism. Altering can occur through substitutional RNA editing or via the insertion or deletion of nucleotides relative to the original template. Taking advantage of recent high throughput sequencing technology combined with bioinformatics tools, several groups have recently studied the genome-wide substitutional RNA editing profiles in human. However, while insertional/deletional (indel) RNA editing is well known in several lower species, only very scarce evidence supports the existence of insertional editing events in higher organisms such as human, and no previous work has specifically focused on indel differences between RNA and their matching DNA in human. Here, we provide the first study to examine the possibility of genome-wide indel RNA-DNA differences in one human individual, NA12878, whose RNA and matching genome have been deeply sequenced. Results We apply different computational tools that are capable of identifying indel differences between RNA reads and the matching reference genome and we initially find hundreds of such indel candidates. However, with careful further analysis and filtering, we conclude that all candidates are false-positives created by splice junctions, paralog sequences, diploid alleles, and known genomic indel variations. Conclusions Overall, our study suggests that indel RNA editing events are unlikely to exist broadly in the human transcriptome and emphasizes the necessity of a robust computational filter pipeline to obtain high confidence RNA-DNA difference results when analyzing high throughput sequencing data as suggested in the recent genome-wide RNA editing studies.
|
262
|
Iliuk AB, Tao WA. Is phosphoproteomics ready for clinical research? Clin Chim Acta 2012; 420:23-7. [PMID: 23159844 DOI: 10.1016/j.cca.2012.10.063] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 10/31/2012] [Accepted: 10/31/2012] [Indexed: 12/29/2022]
Abstract
BACKGROUND For many diseases such as cancer where phosphorylation-dependent signaling is the foundation of disease onset and progression, single-gene testing and genomic profiling alone are not sufficient in providing most critical information. The reason for this is that in these activated pathways the signaling changes and drug resistance are often not directly correlated with changes in protein expression levels. In order to obtain the essential information needed to evaluate pathway activation or the effects of certain drugs and therapies on the molecular level, the analysis of changes in protein phosphorylation is critical. METHODS Existing approaches do not differentiate clinical disease subtypes on the protein and signaling pathway level, and therefore hamper the predictive management of the disease and the selection of therapeutic targets. CONCLUSIONS The mini-review examines the impact of emerging systems biology tools and the possibility of applying phosphoproteomics to clinical research.
Affiliation(s)
- Anton B Iliuk
- Department of Biochemistry, Purdue University, West Lafayette, IN 47907, United States
|
263
|
Saletore Y, Meyer K, Korlach J, Vilfan ID, Jaffrey S, Mason CE. The birth of the Epitranscriptome: deciphering the function of RNA modifications. Genome Biol 2012; 13:175. [PMID: 23113984 PMCID: PMC3491402 DOI: 10.1186/gb-2012-13-10-175] [Citation(s) in RCA: 346] [Impact Index Per Article: 26.6] [Accepted: 10/31/2012] [Indexed: 01/28/2023] Open
Abstract
Recent studies have found methyl-6-adenosine in thousands of mammalian genes, and this modification is most pronounced near the beginning of the 3' UTR. We present a perspective on current work and new single-molecule sequencing methods for detecting RNA base modifications.
Affiliation(s)
- Yogesh Saletore
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY 10065, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10065, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA
- Kate Meyer
- Department of Pharmacology, Weill Cornell Medical College, New York, NY 10065, USA
- Jonas Korlach
- Pacific Biosciences, 1380 Willow Rd, Menlo Park, CA 94025, USA
- Igor D Vilfan
- Pacific Biosciences, 1380 Willow Rd, Menlo Park, CA 94025, USA
- Samie Jaffrey
- Department of Pharmacology, Weill Cornell Medical College, New York, NY 10065, USA
- Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY 10065, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10065, USA
|
264
|
Large-scale profiling and identification of potential regulatory mechanisms for allelic gene expression in colorectal cancer cells. Gene 2012; 512:16-22. [PMID: 23064046 DOI: 10.1016/j.gene.2012.10.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/27/2012] [Revised: 09/28/2012] [Accepted: 10/02/2012] [Indexed: 01/20/2023]
Abstract
Allelic variation in gene expression is common in humans and this variation is associated with phenotypic variation. In this study, we employed high-density single nucleotide polymorphism (SNP) chips containing 13,900 exonic SNPs to identify genes with allelic gene expression in cells from colorectal cancer cell lines. We found 2 monoallelically expressed genes (ERAP2 and MYLK4), 32 genes with an allelic imbalance in their expression, and 13 genes showing allele substitution by RNA editing. Among a total of 34 allelically expressed genes in colorectal cancer cells, 15 genes (44.1%) were associated with cis-acting eQTL, indicating that large portions of allelically expressed genes are regulated by cis-acting mechanisms of gene expression. In addition, potential regulatory variants present in the proximal promoter regions of genes showing either monoallelic expression or allelic imbalance were not tightly linked with coding SNPs, which were detected with allelic gene expression. These results suggest that multiple rare variants could be involved in the cis-acting regulatory mechanism of allelic gene expression. In the comparison with allelic gene expression data from Centre d'Etude du Polymorphisme Humain (CEPH) family B cells, 12 genes showed B-cell specific allelic imbalance and 1 noncoding SNP showed colorectal cancer cell-specific allelic imbalance. In addition, different patterns of allele substitution were observed between B cells and colorectal cancer cells. Overall, our study not only indicates that allelic gene expression is common in colorectal cancer cells, but our study also provides a better understanding of allele-specific gene expression in colorectal cancer cells.
|
265
|
Abstract
Transcriptomics is the study of how our genes are regulated and expressed in different biological settings. Technical advances now enable quantitative assessment of all expressed genes (ie, the entire "transcriptome") in a given tissue at a given time. These approaches provide a powerful tool for understanding complex biological systems and for developing novel biomarkers. This chapter will introduce basic concepts in transcriptomics and available technologies for developing transcriptomic biomarkers. We will then review current and emerging applications in cardiovascular medicine.
Affiliation(s)
- Dawn M Pedrotty
- Penn Cardiovascular Institute and Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
266
|
Blanc V, Xie Y, Luo J, Kennedy S, Davidson NO. Intestine-specific expression of Apobec-1 rescues apolipoprotein B RNA editing and alters chylomicron production in Apobec1 -/- mice. J Lipid Res 2012; 53:2643-55. [PMID: 22993231 DOI: 10.1194/jlr.m030494] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Intestinal apolipoprotein B (apoB) mRNA undergoes C-to-U editing, mediated by the catalytic deaminase apobec-1, which results in translation of apoB48. Apobec1(-/-) mice produce only apoB100 and secrete larger chylomicron particles than those observed in wild-type (WT) mice. Here we show that transgenic rescue of intestinal apobec-1 expression (Apobec1(Int/O)) restores C-to-U RNA editing of apoB mRNA in vivo, including the canonical site at position 6666 and also at approximately 20 other newly identified downstream sites present in WT mice. The small intestine of Apobec1(Int/O) mice produces only apoB48, and the liver produces only apoB100. Serum chylomicron particles were smaller in Apobec1(Int/O) mice compared with those from Apobec1(-/-) mice, and the predominant fraction of serum apoB48 in Apobec1(Int/O) mice migrated in lipoproteins smaller than chylomicrons, even when these mice were fed a high-fat diet. Because apoB48 arises exclusively from the intestine in Apobec1(Int/O) mice and intestinal apoB48 synthesis and secretion rates were comparable to WT mice, we were able to infer the major sites of origin of serum apoB48 in WT mice. Our findings imply that less than 25% of serum apoB48 in WT mice arises from the intestine, with the majority originating from the liver.
Affiliation(s)
- Valerie Blanc
- Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
|
267
|
Deep sequencing reveals small RNA characterization of invasive micropapillary carcinomas of the breast. Breast Cancer Res Treat 2012; 136:77-87. [PMID: 22976804 DOI: 10.1007/s10549-012-2166-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 04/09/2012] [Accepted: 07/10/2012] [Indexed: 12/14/2022]
Abstract
Invasive micropapillary carcinoma (IMPC) is an uncommon histological type of breast cancer. IMPC has a special growth pattern and a more aggressive behavior than invasive ductal carcinomas of no special types (IDC-NSTs). microRNAs are a large class of non-coding RNAs involved in the regulation of various biological processes. Here, we analyzed the small RNA transcriptomes of five formalin-fixed paraffin-embedded (FFPE) pure IMPC samples and five FFPE IDC-NSTs samples by means of next-generation sequencing, generating a total of >170,000,000 clean reads. In an unsupervised cluster analysis, differently expressed miRNAs generated a tree with clear distinction between IMPC and IDC-NSTs classes. Paired fresh-frozen and FFPE specimens showed very similar miRNA expression profiles. By means of RT-qPCR, we further investigated miRNA expression in more IMPC (n = 22) and IDC-NSTs (n = 24) FFPE samples and found let-7b, miR-30c, miR-148a, miR-181a, miR-181a*, and miR-181b were significantly differently expressed between the two groups. We also elucidated several features of miRNA in these breast cancer tissues including 5' variability, miRNA editing, and 3' untemplated addition. Our findings will lead to further understanding of the invasive potency of IMPC and gain an insight into the diversity and complexity of small RNA molecules in breast cancer tissues.
|
268
|
Park KD, Park J, Ko J, Kim BC, Kim HS, Ahn K, Do KT, Choi H, Kim HM, Song S, Lee S, Jho S, Kong HS, Yang YM, Jhun BH, Kim C, Kim TH, Hwang S, Bhak J, Lee HK, Cho BW. Whole transcriptome analyses of six thoroughbred horses before and after exercise using RNA-Seq. BMC Genomics 2012; 13:473. [PMID: 22971240 PMCID: PMC3472166 DOI: 10.1186/1471-2164-13-473] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Received: 11/09/2011] [Accepted: 09/06/2012] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Thoroughbred horses are the most expensive domestic animals, and their running ability and knowledge about their muscle-related diseases are important in animal genetics. While the horse reference genome is available, there has been no large-scale functional annotation of the genome using expressed genes derived from transcriptomes. RESULTS We present a large-scale analysis of whole transcriptome data. We sequenced the whole mRNA from the blood and muscle tissues of six thoroughbred horses before and after exercise. By comparing current genome annotations, we identified 32,361 unigene clusters spanning 51.83 Mb that contained 11,933 (36.87%) annotated genes. More than 60% (20,428) of the unigene clusters did not match any current equine gene model. We also identified 189,973 single nucleotide variations (SNVs) from the sequences aligned against the horse reference genome. Most SNVs (171,558 SNVs; 90.31%) were novel when compared with over 1.1 million equine SNPs from two SNP databases. Using differential expression analysis, we further identified a number of exercise-regulated genes: 62 up-regulated and 80 down-regulated genes in the blood, and 878 up-regulated and 285 down-regulated genes in the muscle. Six of 28 previously-known exercise-related genes were over-expressed in the muscle after exercise. Among the differentially expressed genes, there were 91 transcription factor-encoding genes, which included 56 functionally unknown transcription factor candidates that are probably associated with an early regulatory exercise mechanism. In addition, we found interesting RNA expression patterns where different alternative splicing forms of the same gene showed reversed expressions before and after exercising. CONCLUSION The first sequencing-based horse transcriptome data, extensive analysis results, differentially expressed genes before and after exercise, and candidate genes that are related to the exercise are provided in this study.
Affiliation(s)
- Kyung-Do Park
- Department of Biotechnology, Hankyong National University, Anseong, Republic of Korea
|
269
|
Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi AM, Tanzer A, Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen X, Chrast J, Curado J, Derrien T, Drenkow J, Dumais E, Dumais J, Duttagupta R, Falconnet E, Fastuca M, Fejes-Toth K, Ferreira P, Foissac S, Fullwood MJ, Gao H, Gonzalez D, Gordon A, Gunawardena H, Howald C, Jha S, Johnson R, Kapranov P, King B, Kingswood C, Luo OJ, Park E, Persaud K, Preall JB, Ribeca P, Risk B, Robyr D, Sammeth M, Schaffer L, See LH, Shahab A, Skancke J, Suzuki AM, Takahashi H, Tilgner H, Trout D, Walters N, Wang H, Wrobel J, Yu Y, Ruan X, Hayashizaki Y, Harrow J, Gerstein M, Hubbard T, Reymond A, Antonarakis SE, Hannon G, Giddings MC, Ruan Y, Wold B, Carninci P, Guigó R, Gingeras TR. Landscape of transcription in human cells. Nature 2012; 489:101-8. [PMID: 22955620 PMCID: PMC3684276 DOI: 10.1038/nature11233] [Citation(s) in RCA: 3893] [Impact Index Per Article: 299.5] [Received: 12/10/2011] [Accepted: 05/15/2012] [Indexed: 02/07/2023]
Abstract
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.
Affiliation(s)
- Sarah Djebali
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Carrie A. Davis
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Angelika Merkel
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Alex Dobin
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Timo Lassmann
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Ali M. Mortazavi
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- University of California Irvine, Dept. of Developmental and Cell Biology, 2300 Biological Sciences III, Irvine, CA USA 92697
- Andrea Tanzer
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Julien Lagarde
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Wei Lin
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Felix Schlesinger
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Chenghai Xue
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Georgi K. Marinov
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Jainab Khatun
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- Brian A. Williams
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Chris Zaleski
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Joel Rozowsky
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Maik Röder
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Felix Kokocinski
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
- Rehab F. Abdelhamid
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
- Tyler Alioto
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88. Barcelona, Catalunya, Spain 08003
- Igor Antoshechkin
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
- Michael T. Baer
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Nadav S. Bar
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
| | - Philippe Batut
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Kimberly Bell
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Ian Bell
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
| | - Sudipto Chakrabortty
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Xian Chen
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
| | - Jacqueline Chrast
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
| | - Joao Curado
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Thomas Derrien
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Jorg Drenkow
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Erica Dumais
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
| | - Jacqueline Dumais
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
| | - Radha Duttagupta
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
| | - Emilie Falconnet
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
| | - Meagan Fastuca
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Kata Fejes-Toth
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Pedro Ferreira
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Sylvain Foissac
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
| | - Melissa J. Fullwood
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
| | - Hui Gao
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
| | - David Gonzalez
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Assaf Gordon
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Harsha Gunawardena
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
| | - Cedric Howald
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
| | - Sonali Jha
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Rory Johnson
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Philipp Kapranov
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
- St. Laurent Institute, One Kendall Square, Cambridge, MA
| | - Brandon King
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
| | - Colin Kingswood
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Oscar J. Luo
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
| | - Eddie Park
- University of California Irvine, Dept. of Developmental and Cell Biology, 2300 Biological Sciences III, Irvine, CA USA 92697
| | - Kimberly Persaud
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Jonathan B. Preall
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Paolo Ribeca
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Brian Risk
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
| | - Daniel Robyr
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
| | - Michael Sammeth
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Lorian Schaffer
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
| | - Lei-Hoon See
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Atif Shahab
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
| | - Jorgen Skancke
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
- Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
| | - Ana Maria Suzuki
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
| | - Hazuki Takahashi
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
| | - Hagen Tilgner
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Diane Trout
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
| | - Nathalie Walters
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
| | - Huaien Wang
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - John Wrobel
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
| | - Yanbao Yu
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
| | - Xiaoan Ruan
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
| | - Yoshihide Hayashizaki
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
| | - Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
- Department of Computer Science, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
| | - Tim Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire United Kingdom CB10 1SA
| | - Alexandre Reymond
- University of Lausanne, Center for Integrative Genomics, Genopode building, Lausanne, Switzerland 1015
| | - Stylianos E. Antonarakis
- University of Geneva Medical School, Department of Genetic Medicine and Development and iGE3 Institute of Genetics and Genomics of Geneva, 1 rue Michel-Servet, Geneva, Switzerland 1015
| | - Gregory Hannon
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
| | - Morgan C. Giddings
- Boise State University, College of Arts & Sciences, 1910 University Dr. Boise, ID USA 83725
- University of North Carolina at Chapel Hill, Department of Biochemistry & Biophysics, 120 Mason Farm Rd., Chapel Hill, NC USA 27599
| | - Yijun Ruan
- Genome Institute of Singapore, Genome Technology and Biology, 60 Biopolis Street, #02-01, Genome, Singapore, Singapore 138672
| | - Barbara Wold
- California Institute of Technology, Division of Biology, 91125. 2 Beckman Institute, Pasadena, CA USA 91125
| | - Piero Carninci
- RIKEN Yokohama Institute, RIKEN Omics Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa Japan 230-0045
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88 . Barcelona, Catalunya, Spain 08003
| | - Thomas R. Gingeras
- Cold Spring Harbor Laboratory, Functional Genomics, 1 Bungtown Rd. Cold Spring Harbor, NY, USA 11742
- Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA. USA 95051
| |
|
270
|
Picardi E, Gallo A, Galeano F, Tomaselli S, Pesole G. A novel computational strategy to identify A-to-I RNA editing sites by RNA-Seq data: de novo detection in human spinal cord tissue. PLoS One 2012; 7:e44184. [PMID: 22957051 PMCID: PMC3434223 DOI: 10.1371/journal.pone.0044184] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 06/08/2012] [Accepted: 07/30/2012] [Indexed: 01/30/2023] Open
Abstract
RNA editing is a post-transcriptional process occurring in a wide range of organisms. In human brain, the A-to-I RNA editing, in which individual adenosine (A) bases in pre-mRNA are modified to yield inosine (I), is the most frequent event. Modulating gene expression, RNA editing is essential for cellular homeostasis. Indeed, its deregulation has been linked to several neurological and neurodegenerative diseases. To date, many RNA editing sites have been identified by next generation sequencing technologies employing massive transcriptome sequencing together with whole genome or exome sequencing. While genome and transcriptome reads are not always available for single individuals, RNA-Seq data are widespread through public databases and represent a relevant source of yet unexplored RNA editing sites. In this context, we propose a simple computational strategy to identify genomic positions enriched in novel hypothetical RNA editing events by means of a new two-step mapping procedure requiring only RNA-Seq data and no a priori knowledge of RNA editing characteristics and genomic reads. We assessed the suitability of our procedure by confirming A-to-I candidates using conventional Sanger sequencing and performing RNA-Seq as well as whole exome sequencing of human spinal cord tissue from a single individual.
Affiliation(s)
- Ernesto Picardi
- Dipartimento di Bioscienze, Biotecnologie e Scienze Farmacologiche, Università di Bari, Bari, Italy
- Istituto di Biomembrane e Bioenergetica, Consiglio Nazionale delle Ricerche, Bari, Italy
| | - Angela Gallo
- RNA Editing Laboratory, Oncohaematology Department, Ospedale Pediatrico “Bambino Gesù”, IRCCS, Rome, Italy
| | - Federica Galeano
- RNA Editing Laboratory, Oncohaematology Department, Ospedale Pediatrico “Bambino Gesù”, IRCCS, Rome, Italy
| | - Sara Tomaselli
- RNA Editing Laboratory, Oncohaematology Department, Ospedale Pediatrico “Bambino Gesù”, IRCCS, Rome, Italy
| | - Graziano Pesole
- Dipartimento di Bioscienze, Biotecnologie e Scienze Farmacologiche, Università di Bari, Bari, Italy
- Istituto di Biomembrane e Bioenergetica, Consiglio Nazionale delle Ricerche, Bari, Italy
| |
|
271
|
Pertea M. The human transcriptome: an unfinished story. Genes (Basel) 2012; 3:344-60. [PMID: 22916334 PMCID: PMC3422666 DOI: 10.3390/genes3030344] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.2] [Received: 05/15/2012] [Revised: 06/14/2012] [Accepted: 06/25/2012] [Indexed: 11/16/2022] Open
Abstract
Despite recent technological advances, the study of the human transcriptome is still in its early stages. Here we provide an overview of the complex human transcriptomic landscape, present the bioinformatics challenges posed by the vast quantities of transcriptomic data, and discuss some of the studies that have tried to determine how much of the human genome is transcribed. Recent evidence has suggested that more than 90% of the human genome is transcribed into RNA. However, this view has been strongly contested by groups of scientists who argued that many of the observed transcripts are simply the result of transcriptional noise. In this review, we conclude that the full extent of transcription remains an open question that will not be fully addressed until we decipher the complete range and biological diversity of the transcribed genomic sequences.
Affiliation(s)
- Mihaela Pertea
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| |
|
272
|
Kleinman CL, Adoue V, Majewski J. RNA editing of protein sequences: a rare event in human transcriptomes. RNA 2012; 18:1586-96. [PMID: 22832026 PMCID: PMC3425774 DOI: 10.1261/rna.033233.112] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 04/27/2012] [Accepted: 06/19/2012] [Indexed: 05/21/2023]
Abstract
RNA editing, the post-transcriptional recoding of RNA molecules, has broad potential implications for gene expression. Several recent studies of human transcriptomes reported a high number of differences between DNA and RNA, including events not explained by any known mammalian RNA-editing mechanism. However, RNA-editing estimates differ by orders of magnitude, since technical limitations of high-throughput sequencing have been sometimes overlooked and sequencing errors have been confounded with editing sites. Here, we developed a series of computational approaches to analyze the extent of this process in the human transcriptome, identifying and addressing the major sources of error of a large-scale approach. We apply the detection pipeline to deep sequencing data from lymphoblastoid cell lines expressing ADAR1 at high levels, and show that noncanonical editing is unlikely to occur, with at least 85%-98% of candidate sites being the result of sequencing and mapping artifacts. By implementing a method to detect intronless gene duplications, we show that most noncanonical sites previously validated originate in read mismapping within these regions. Canonical A-to-G editing, on the other hand, is widespread in noncoding Alu sequences and rare in exonic and coding regions, where the validation rate also dropped. The genomic distribution of editing sites we find, together with the lack of consistency across studies or biological replicates, suggests a minor quantitative impact of this process in the overall recoding of protein sequences. We propose instead a primary role of ADAR1 protein as a defense system against elements potentially damaging to the genome.
Affiliation(s)
- Claudia L Kleinman
- Department of Human Genetics, McGill University–Genome Quebec Innovation Centre, Montreal, Quebec H3A 1A4, Canada.
| | | | | |
|
273
|
A method to identify RNA A-to-I editing targets using I-specific cleavage and exon array analysis. Mol Cell Probes 2012; 27:38-45. [PMID: 22960667 DOI: 10.1016/j.mcp.2012.08.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/05/2012] [Revised: 08/17/2012] [Accepted: 08/20/2012] [Indexed: 11/21/2022]
Abstract
RNA A-to-I editing is the most common single-base editing in the animal kingdom. Dysregulations of RNA A-to-I editing are associated with developmental defects in mouse and human diseases. Mouse knockout models deficient in ADAR activities show lethal phenotypes associated with defects in nervous system, failure of hematopoiesis and reduced tolerance to stress. While several methods of identifying RNA A-to-I editing sites are currently available, most of the critical editing targets responsible for the important biological functions of ADARs remain unknown. Here we report a method to systematically analyze RNA A-to-I editing targets by combining I-specific cleavage and exon array analysis. Our results show that I-specific cleavage on editing sites causes more than twofold signal reductions in edited exons of known targets such as Gria2, Htr2c, Gabra3 and Cyfip2 in mice. This method provides an experimental approach for genome-wide analysis of RNA A-to-I editing targets with exon-level resolution. We believe this method will help expedite inquiry into the roles of RNA A-to-I editing in various biological processes and diseases.
|
274
|
Stoeckle MY, Kerr KCR. Frequency matrix approach demonstrates high sequence quality in avian BARCODEs and highlights cryptic pseudogenes. PLoS One 2012; 7:e43992. [PMID: 22952842 PMCID: PMC3428349 DOI: 10.1371/journal.pone.0043992] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 04/03/2012] [Accepted: 07/26/2012] [Indexed: 11/19/2022] Open
Abstract
The accuracy of DNA barcode databases is critical for research and practical applications. Here we apply a frequency matrix to assess sequencing errors in a very large set of avian BARCODEs. Using 11,000 sequences from 2,700 bird species, we show most avian cytochrome c oxidase I (COI) nucleotide and amino acid sequences vary within a narrow range. Except for third codon positions, nearly all (96%) sites were highly conserved or limited to two nucleotides or two amino acids. A large number of positions had very low frequency variants present in single individuals of a species; these were strongly concentrated at the ends of the barcode segment, consistent with sequencing error. In addition, a small fraction (0.1%) of BARCODEs had multiple very low frequency variants shared among individuals of a species; these were found to represent overlooked cryptic pseudogenes lacking stop codons. The calculated upper limit of sequencing error was 8 × 10^-5 errors/nucleotide, which was relatively high for direct Sanger sequencing of amplified DNA, but unlikely to compromise species identification. Our results confirm the high quality of the avian BARCODE database and demonstrate significant quality improvement in avian COI records deposited in GenBank over the past decade. This approach has potential application for genetic database quality control, discovery of cryptic pseudogenes, and studies of low-level genetic variation.
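As a rough illustration of what a per-position frequency matrix over aligned sequences might look like, the sketch below counts the fraction of each nucleotide in every alignment column and lists variants below a chosen frequency. The function names, the toy sequences, and the threshold are hypothetical and are not taken from the study; the authors' actual procedure may differ.

```python
from collections import Counter


def frequency_matrix(aligned_seqs):
    """Return one {base: fraction} dict per alignment column.

    Assumes all sequences are already aligned and of equal length
    (an assumption of this sketch, not a statement about the study).
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must have equal length")
    matrix = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        total = len(column)
        matrix.append({base: n / total for base, n in counts.items()})
    return matrix


def low_frequency_variants(matrix, threshold):
    """List (position, base, frequency) for bases rarer than threshold."""
    return [
        (pos, base, freq)
        for pos, column in enumerate(matrix)
        for base, freq in sorted(column.items())
        if freq < threshold
    ]


# Toy data: four aligned sequences, one rare variant at the last position.
toy_seqs = ["ACGT", "ACGT", "ACGT", "ACGA"]
toy_matrix = frequency_matrix(toy_seqs)
rare = low_frequency_variants(toy_matrix, threshold=0.3)
```

In this toy case the only variant below the threshold is the `A` at the final column, which is the kind of end-of-read, single-individual variant the abstract attributes to sequencing error.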
Affiliation(s)
- Mark Y Stoeckle
- Program for the Human Environment, Rockefeller University, New York, New York, United States of America.
| | | |
|
275
|
Cai G, Li H, Lu Y, Huang X, Lee J, Müller P, Ji Y, Liang S. Accuracy of RNA-Seq and its dependence on sequencing depth. BMC Bioinformatics 2012; 13 Suppl 13:S5. [PMID: 23320920 PMCID: PMC3426807 DOI: 10.1186/1471-2105-13-s13-s5] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 01/18/2023] Open
Abstract
Background: The cost of DNA sequencing has undergone a dramatic reduction in the past decade. As a result, sequencing technologies have been increasingly applied to genomic research. RNA-Seq is becoming a common technique for surveying gene expression based on DNA sequencing. As it is not clear how increased sequencing capacity has affected measurement accuracy of mRNA, we sought to investigate that relationship. Result: We empirically evaluate the accuracy of repeated gene expression measurements using RNA-Seq. We identify library preparation steps prior to DNA sequencing as the main source of error in this process. Studying three datasets, we show that the accuracy indeed improves with the sequencing depth. However, the rate of improvement as a function of sequence reads is generally slower than predicted by the binomial distribution. We therefore used the beta-binomial distribution to model the overdispersion. The overdispersion parameters we introduced depend explicitly on the number of reads so that the resulting statistical uncertainty is consistent with the empirical data that measurement accuracy increases with the sequencing depth. The overdispersion parameters were determined by maximizing the likelihood. We show that our modified beta-binomial model had lower false discovery rate than the binomial or the pure beta-binomial models. Conclusion: We proposed a novel form of overdispersion guaranteeing that the accuracy improves with sequencing depth. We demonstrated that the new form provides a better fit to the data.
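To make the binomial versus beta-binomial contrast in this abstract concrete, the sketch below compares the textbook variance formulas for the two distributions. It uses a constant overdispersion parameter `rho` (the intra-class correlation, 1/(alpha + beta + 1)); this is the standard beta-binomial, not the read-dependent form the authors propose, whose exact parameterization is not given in the abstract. All names and numbers are illustrative.

```python
def binomial_variance(n: int, p: float) -> float:
    """Variance of a Binomial(n, p) count."""
    return n * p * (1 - p)


def beta_binomial_variance(n: int, p: float, rho: float) -> float:
    """Variance of a beta-binomial count with mean n*p.

    rho is the intra-class correlation, 1 / (alpha + beta + 1);
    rho = 0 recovers the binomial case.
    """
    return n * p * (1 - p) * (1 + (n - 1) * rho)


def proportion_variance(n: int, p: float, rho: float) -> float:
    """Variance of the estimated proportion X / n under the beta-binomial."""
    return beta_binomial_variance(n, p, rho) / n**2


# With constant rho, the proportion variance approaches p*(1-p)*rho
# instead of shrinking to zero as the read count n grows.
shallow = proportion_variance(100, 0.5, 0.01)
deep = proportion_variance(1_000_000, 0.5, 0.01)
floor = 0.5 * 0.5 * 0.01
```

Under a constant `rho` the uncertainty of the estimated proportion levels off at a floor however deep the sequencing, which may be one way to read the abstract's motivation for letting the overdispersion depend on the number of reads.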
Affiliation(s)
- Guoshuai Cai
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | | | | | | | | | | | | | | |
|
276
|
Abstract
Prions are agents of analog, protein conformation-based inheritance that can confer beneficial phenotypes to cells, especially under stress. Combined with genetic variation, prion-mediated inheritance can be channeled into prion-independent genomic inheritance. Latest screening shows that prions are common, at least in fungi. Thus, there is non-negligible flow of information from proteins to the genome in modern cells, in a direct violation of the Central Dogma of molecular biology. The prion-mediated heredity that violates the Central Dogma appears to be a specific, most radical manifestation of the widespread assimilation of protein (epigenetic) variation into genetic variation. The epigenetic variation precedes and facilitates genetic adaptation through a general 'look-ahead effect' of phenotypic mutations. This direction of the information flow is likely to be one of the important routes of environment-genome interaction and could substantially contribute to the evolution of complex adaptive traits.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
| |
|
277
|
Detecting transcription of ribosomal protein pseudogenes in diverse human tissues from RNA-seq data. BMC Genomics 2012; 13:412. [PMID: 22908858 PMCID: PMC3478165 DOI: 10.1186/1471-2164-13-412] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 04/12/2012] [Accepted: 08/10/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Ribosomal proteins (RPs) have about 2000 pseudogenes in the human genome. While anecdotal reports of RP pseudogene transcription exist, it is unclear to what extent these pseudogenes are transcribed. RP pseudogene transcription is difficult to identify in microarrays due to potential cross-hybridization between transcripts from the parent genes and pseudogenes. Recently, transcriptome sequencing (RNA-seq) has provided an opportunity to ascertain the transcription of pseudogenes. A challenge for pseudogene expression discovery in RNA-seq data lies in the difficulty of uniquely identifying reads mapped to pseudogene regions, which are typically also similar to the parent genes. RESULTS: Here we developed a specialized pipeline for pseudogene transcription discovery. We first construct a "composite genome" that includes the entire human genome sequence as well as mRNA sequences of real ribosomal protein genes. We then map all sequence reads to the composite genome, and only exact matches are retained. Moreover, we restrict our analysis to strictly defined mappable regions and calculate the RPKM values as a measurement of pseudogene transcription levels. We report evidence for the transcription of RP pseudogenes in 16 human tissues. By analyzing the Human Body Map 2.0 study RNA-sequencing data using our pipeline, we identified that one ribosomal protein (RP) pseudogene (PGOHUM-249508) is transcribed with RPKM 170 in thyroid. Moreover, three other RP pseudogenes are transcribed with RPKM > 10, a level similar to that of the normal RP genes, in white blood cell, kidney, and testes, respectively. Furthermore, an additional thirteen RP pseudogenes are of RPKM > 5, corresponding to the 20-30 percentile among all genes. Unlike ribosomal protein genes that are constitutively expressed in almost all tissues, RP pseudogenes are differentially expressed, suggesting that they may contribute to tissue-specific biological processes.
CONCLUSIONS: Using a specialized bioinformatics method, we identified the transcription of ribosomal protein pseudogenes in human tissues using RNA-seq data.
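For readers unfamiliar with the RPKM values quoted in this abstract, here is a minimal sketch of the standard definition (reads per kilobase of region per million mapped reads). The function name and the example numbers are illustrative and are not taken from the study, which additionally restricts counting to exact matches in mappable regions.

```python
def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of region per Million mapped reads.

    RPKM = read_count * 1e9 / (region_length_bp * total_mapped_reads)
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    if read_count < 0:
        raise ValueError("read count must be non-negative")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 1 kb region in a library
# of 10 million mapped reads.
example_value = rpkm(500, 1_000, 10_000_000)
```

Because the measure normalizes by both region length and library size, values such as "RPKM > 10" can be compared across regions of different length and across tissues sequenced to different depths.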
|
278
|
Zhu H, Urban DJ, Blashka J, McPheeters MT, Kroeze WK, Mieczkowski P, Overholser JC, Jurjus GJ, Dieter L, Mahajan GJ, Rajkowska G, Wang Z, Sullivan PF, Stockmeier CA, Roth BL. Quantitative analysis of focused A-to-I RNA editing sites by ultra-high-throughput sequencing in psychiatric disorders. PLoS One 2012; 7:e43227. [PMID: 22912834 PMCID: PMC3422315 DOI: 10.1371/journal.pone.0043227] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 04/04/2012] [Accepted: 07/18/2012] [Indexed: 12/01/2022] Open
Abstract
A-to-I RNA editing is a post-transcriptional modification of single nucleotides in RNA by adenosine deamination, which thereby diversifies the gene products encoded in the genome. Thousands of potential RNA editing sites have been identified by recent studies (e.g. see Li et al, Science 2009); however, only a handful of these sites have been independently confirmed. Here, we systematically and quantitatively examined 109 putative coding region A-to-I RNA editing sites in three sets of normal human brain samples by ultra-high-throughput sequencing (uHTS). Forty of 109 putative sites, including 25 previously confirmed sites, were validated as truly edited in our brain samples, suggesting an overestimation of A-to-I RNA editing in these putative sites by Li et al (2009). To evaluate RNA editing in human disease, we analyzed 29 of the confirmed sites in subjects with major depressive disorder and schizophrenia using uHTS. In striking contrast to many prior studies, we did not find significant alterations in the frequency of RNA editing at any of the editing sites in samples from these patients, including within the 5HT2C serotonin receptor (HTR2C). Our results indicate that uHTS is a fast, quantitative and high-throughput method to assess RNA editing in human physiology and disease and that many prior studies of RNA editing may overestimate both the extent and disease-related variability of RNA editing at the sites we examined in the human brain.
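The "frequency of RNA editing" at a site is commonly estimated from sequencing reads as the fraction of reads showing G (inosine is read as guanosine by sequencers) among reads showing A or G. The sketch below illustrates that common convention only; the function name and counts are hypothetical, and the abstract does not state the exact estimator the authors used.

```python
def editing_frequency(a_reads: int, g_reads: int) -> float:
    """Fraction of edited reads at one site: G / (A + G).

    a_reads: reads supporting the unedited base (A)
    g_reads: reads supporting the edited base (I, sequenced as G)
    """
    if a_reads < 0 or g_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = a_reads + g_reads
    if total == 0:
        raise ValueError("site has no A or G coverage")
    return g_reads / total


# Hypothetical site covered by 100 reads, 25 of which carry G.
example_frequency = editing_frequency(a_reads=75, g_reads=25)
```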
Affiliation(s)
- Hu Zhu
- Department of Pharmacology, University of North Carolina Chapel Hill Medical School, Chapel Hill, North Carolina, United States of America
| | - Daniel J. Urban
- Department of Pharmacology, University of North Carolina Chapel Hill Medical School, Chapel Hill, North Carolina, United States of America
| | - Jared Blashka
- Department of Pharmacology, University of North Carolina Chapel Hill Medical School, Chapel Hill, North Carolina, United States of America
| | - Matthew T. McPheeters
- Department of Pharmacology, University of North Carolina Chapel Hill Medical School, Chapel Hill, North Carolina, United States of America
| | - Wesley K. Kroeze
- Department of Pharmacology, University of North Carolina Chapel Hill Medical School, Chapel Hill, North Carolina, United States of America
| | - Piotr Mieczkowski
- Department of Genetics, School of Medicine, Chapel Hill, North Carolina, United States of America
| | - James C. Overholser
- Department of Psychology, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - George J. Jurjus
- Department of Psychiatry, Case Western Reserve University, Cleveland, Ohio, United States of America
- Department of Psychiatry, Louis Stokes Cleveland VA Medical Center, Cleveland, Ohio, United States of America
| | - Lesa Dieter
- Department of Psychology, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Gouri J. Mahajan
- Center for Psychiatric Neuroscience, Department of Psychiatry and Human Behavior, University of Mississippi Medical Center, Jackson, Mississippi, United States of America
| | - Grazyna Rajkowska
- Center for Psychiatric Neuroscience, Department of Psychiatry and Human Behavior, University of Mississippi Medical Center, Jackson, Mississippi, United States of America
| | - Zefeng Wang
- Department of Pharmacology, University of North Carolina Chapel Hill Medical School, Chapel Hill, North Carolina, United States of America
| | - Patrick F. Sullivan
- Department of Genetics, School of Medicine, Chapel Hill, North Carolina, United States of America
| | - Craig A. Stockmeier
- Department of Psychiatry, Case Western Reserve University, Cleveland, Ohio, United States of America
- Center for Psychiatric Neuroscience, Department of Psychiatry and Human Behavior, University of Mississippi Medical Center, Jackson, Mississippi, United States of America
| | - Bryan L. Roth
- Department of Pharmacology, University of North Carolina Chapel Hill Medical School, Chapel Hill, North Carolina, United States of America
| |
|
279
|
Zook JM, Samarov D, McDaniel J, Sen SK, Salit M. Synthetic spike-in standards improve run-specific systematic error analysis for DNA and RNA sequencing. PLoS One 2012; 7:e41356. [PMID: 22859977 PMCID: PMC3409179 DOI: 10.1371/journal.pone.0041356] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Received: 02/28/2012] [Accepted: 06/20/2012] [Indexed: 01/04/2023] Open
Abstract
While the importance of random sequencing errors decreases at higher DNA or RNA sequencing depths, systematic sequencing errors (SSEs) dominate at high sequencing depths and can be difficult to distinguish from biological variants. These SSEs can cause base quality scores to underestimate the probability of error at certain genomic positions, resulting in false positive variant calls, particularly in mixtures such as samples with RNA editing, tumors, circulating tumor cells, bacteria, mitochondrial heteroplasmy, or pooled DNA. Most algorithms proposed for correction of SSEs require a data set used to calculate association of SSEs with various features in the reads and sequence context. This data set is typically either from a part of the data set being “recalibrated” (Genome Analysis ToolKit, or GATK) or from a separate data set with special characteristics (SysCall). Here, we combine the advantages of these approaches by adding synthetic RNA spike-in standards to human RNA, and use GATK to recalibrate base quality scores with reads mapped to the spike-in standards. Compared to conventional GATK recalibration that uses reads mapped to the genome, spike-ins improve the accuracy of Illumina base quality scores by a mean of 5 Phred-scaled quality score units, and by as much as 13 units at CpG sites. In addition, since the spike-in data used for recalibration are independent of the genome being sequenced, our method allows run-specific recalibration even for the many species without a comprehensive and accurate SNP database. We also use GATK with the spike-in standards to demonstrate that the Illumina RNA sequencing runs overestimate quality scores for AC, CC, GC, GG, and TC dinucleotides, while SOLiD has fewer dinucleotide SSEs but more SSEs for certain cycles. We conclude that using these DNA and RNA spike-in standards with GATK improves base quality score recalibration.
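The "Phred-scaled quality score units" in this abstract follow the standard definition Q = -10 · log10(p), where p is the estimated probability that a base call is wrong. The helper names below are illustrative; the sketch only shows the conversion, not GATK's recalibration procedure.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -10 * math.log10(p)


# A shift of 5 Phred units changes the implied error probability
# by a factor of 10 ** 0.5 (about 3.16).
ratio_for_5_units = phred_to_error_prob(25) / phred_to_error_prob(30)
```

Read this way, the reported mean improvement of 5 units corresponds to roughly a threefold change in the error probability implied by a quality score, and 13 units to roughly a twentyfold change.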
Affiliation(s)
- Justin M Zook
- Biochemical Science Division, National Institute of Standards and Technology, Gaithersburg, Maryland, United States of America.
|
280
|
Affiliation(s)
- Daniel Macarthur
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA.
|
281
|
Wu JR, Zeng R. Molecular basis for population variation: from SNPs to SAPs. FEBS Lett 2012; 586:2841-5. [PMID: 22828278 DOI: 10.1016/j.febslet.2012.07.036] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 06/16/2012] [Revised: 07/14/2012] [Accepted: 07/16/2012] [Indexed: 01/09/2023]
Abstract
Single nucleotide polymorphisms (SNPs) are one type of genomic DNA variation in a population. Correspondingly, single amino-acid polymorphisms (SAPs) derived from non-synonymous SNPs represent protein variations in a population. Recently, using proteomic approaches, SAPs in the plasma proteomes of an Asian population were systematically identified for the first time. That study showed that heterozygous and homozygous proteins with various SAPs have different associations with particular traits in the population. Recent discoveries of widespread differences between RNA and DNA sequences indicate that RNA editing is also a source of SAPs, one that is independent of genomic SNPs. Furthermore, we argue that there are de novo SAPs that are not encoded by either DNA or RNA sequences.
Affiliation(s)
- Jia-Rui Wu
- Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.
|
282
|
Longo G, Miquel PA, Sonnenschein C, Soto AM. Is information a proper observable for biological organization? Prog Biophys Mol Biol 2012; 109:108-14. [PMID: 22796169 DOI: 10.1016/j.pbiomolbio.2012.06.004] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 02/24/2012] [Revised: 05/03/2012] [Accepted: 06/19/2012] [Indexed: 01/08/2023]
Abstract
In the last century, jointly with the advent of computers, mathematical theories of information were developed. Shortly thereafter, during the ascent of molecular biology, the concept of information was rapidly transferred into biology at large. Several philosophers and biologists have argued against adopting this concept based on epistemological and ontological arguments, and also because it encouraged genetic determinism. While the theories of elaboration and transmission of information are valid mathematical theories, their own logic and implicit causal structure make them inimical to biology, and because of this, their applications have hindered and are still hindering the development of a sound theory of organisms. Our analysis concentrates on the development of information theories in mathematics and on the differences between these theories regarding the relationship among complexity, information and entropy.
Affiliation(s)
- G Longo
- CREA, Ecole Polytechnique, 32 Boulevard Victor, 75015 Paris, France.
|
283
|
Rosenthal JJC, Seeburg PH. A-to-I RNA editing: effects on proteins key to neural excitability. Neuron 2012; 74:432-9. [PMID: 22578495 DOI: 10.1016/j.neuron.2012.04.010] [Citation(s) in RCA: 109] [Impact Index Per Article: 8.4] [Accepted: 04/11/2012] [Indexed: 10/28/2022]
Abstract
RNA editing by adenosine deamination is a process used to diversify the proteome. The expression of ADARs, the editing enzymes, is ubiquitous among true metazoans, and so adenosine deamination is thought to be universal. By changing codons at the level of mRNA, protein function can be altered, perhaps in response to physiological demand. Although the number of editing sites identified in recent years has been rising exponentially, their effects on protein function, in general, are less well understood. This review assesses the state of the field and highlights particular cases where the biophysical alterations and functional effects caused by RNA editing have been studied in detail.
Affiliation(s)
- Joshua J C Rosenthal
- Institute of Neurobiology and Department of Biochemistry, University of Puerto Rico Medical Sciences Campus, San Juan, Puerto Rico 00901, USA
|
284
|
Abstract
Transcription by RNA polymerase II is the process that copies DNA into RNA, leading to the expression of a specific gene. Averaged estimates of polymerase elongation rates in mammalian cells have been shown to vary between 1 and 4 kilobases per minute. However, recent advances in live-cell imaging have allowed direct measurements of RNA biogenesis from a single gene, with rates that exceeded 50 kb/min. This unexpected finding opens novel and intriguing perspectives on the control of metazoan transcription.
Affiliation(s)
- Alessandro Marcello
- Laboratory of Molecular Virology, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy.
|
285
|
Comprehensive analysis of mRNA methylation reveals enrichment in 3' UTRs and near stop codons. Cell 2012; 149:1635-46. [PMID: 22608085 DOI: 10.1016/j.cell.2012.05.003] [Citation(s) in RCA: 3027] [Impact Index Per Article: 232.8] [Received: 12/12/2011] [Revised: 03/15/2012] [Accepted: 03/27/2012] [Indexed: 02/07/2023]
Abstract
Methylation of the N6 position of adenosine (m6A) is a posttranscriptional modification of RNA with poorly understood prevalence and physiological relevance. The recent discovery that FTO, an obesity risk gene, encodes an m6A demethylase implicates m6A as an important regulator of physiological processes. Here, we present a method for transcriptome-wide m6A localization, which combines m6A-specific methylated RNA immunoprecipitation with next-generation sequencing (MeRIP-Seq). We use this method to identify mRNAs of 7,676 mammalian genes that contain m6A, indicating that m6A is a common base modification of mRNA. The m6A modification exhibits tissue-specific regulation and is markedly increased throughout brain development. We find that m6A sites are enriched near stop codons and in 3' UTRs, and we uncover an association between m6A residues and microRNA-binding sites within 3' UTRs. These findings provide a resource for identifying transcripts that are substrates for adenosine methylation and reveal insights into the epigenetic regulation of the mammalian transcriptome.
|
286
|
Maiolica A, Jünger MA, Ezkurdia I, Aebersold R. Targeted proteome investigation via selected reaction monitoring mass spectrometry. J Proteomics 2012; 75:3495-513. [PMID: 22579752 DOI: 10.1016/j.jprot.2012.04.048] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Received: 11/21/2011] [Revised: 04/27/2012] [Accepted: 04/29/2012] [Indexed: 12/20/2022]
Abstract
Due to the enormous complexity of proteomes, which constitute the entirety of protein species expressed by a certain cell or tissue, proteome-wide studies performed in discovery mode are still limited in their ability to reproducibly identify and quantify all proteins present in complex biological samples. Therefore, the targeted analysis of informative subsets of the proteome has been beneficial to generate reproducible data sets across multiple samples. Here we review the repertoire of antibody- and mass spectrometry (MS)-based analytical tools that is currently available for the directed analysis of predefined sets of proteins. The topics of emphasis for this review are Selected Reaction Monitoring (SRM) mass spectrometry, emerging tools to control error rates in targeted proteomic experiments, and some representative examples of applications. The ability to cost- and time-efficiently generate specific and quantitative assays for large numbers of proteins and posttranslational modifications has the potential to greatly expand the range of targeted proteomic coverage in biological studies. This article is part of a Special Section entitled: Understanding genome regulation and genetic diversity by mass spectrometry.
Affiliation(s)
- Alessio Maiolica
- Department of Biology, Institute of Molecular Systems Biology, Zurich, Switzerland
|
287
|
Chen R, Mias GI, Li-Pook-Than J, Jiang L, Lam HYK, Chen R, Miriami E, Karczewski KJ, Hariharan M, Dewey FE, Cheng Y, Clark MJ, Im H, Habegger L, Balasubramanian S, O'Huallachain M, Dudley JT, Hillenmeyer S, Haraksingh R, Sharon D, Euskirchen G, Lacroute P, Bettinger K, Boyle AP, Kasowski M, Grubert F, Seki S, Garcia M, Whirl-Carrillo M, Gallardo M, Blasco MA, Greenberg PL, Snyder P, Klein TE, Altman RB, Butte AJ, Ashley EA, Gerstein M, Nadeau KC, Tang H, Snyder M. Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell 2012; 148:1293-307. [PMID: 22424236 DOI: 10.1016/j.cell.2012.02.009] [Citation(s) in RCA: 881] [Impact Index Per Article: 67.8] [Received: 10/11/2011] [Revised: 01/27/2012] [Accepted: 02/04/2012] [Indexed: 12/18/2022]
Abstract
Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.
Affiliation(s)
- Rui Chen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
|
288
|
Riva L, Luzi L, Pelicci PG. Genomics of acute myeloid leukemia: the next generation. Front Oncol 2012; 2:40. [PMID: 22666660 PMCID: PMC3364462 DOI: 10.3389/fonc.2012.00040] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 12/22/2011] [Accepted: 02/27/2012] [Indexed: 11/18/2022]
Abstract
Acute myeloid leukemia (AML) is, like other types of cancer, a genetic disorder of somatic cells. The detection of somatic molecular abnormalities that may cause and maintain AML is crucial for patient stratification. The development of mutation-specific therapeutic interventions will hopefully increase cure rates and improve patients’ quality of life. This review illustrates how next generation sequencing technologies are changing the study of cancer genomics of adult AML patients.
Affiliation(s)
- Laura Riva
- Department of Experimental Oncology, European Institute of Oncology Milan, Italy
|
289
|
|
290
|
Danecek P, Nellåker C, McIntyre RE, Buendia-Buendia JE, Bumpstead S, Ponting CP, Flint J, Durbin R, Keane TM, Adams DJ. High levels of RNA-editing site conservation amongst 15 laboratory mouse strains. Genome Biol 2012; 13:R26. [PMID: 22524474 PMCID: PMC3446300 DOI: 10.1186/gb-2012-13-4-r26] [Citation(s) in RCA: 126] [Impact Index Per Article: 9.7] [Received: 01/06/2012] [Revised: 04/18/2012] [Accepted: 04/23/2012] [Indexed: 11/12/2022]
Abstract
Background: Adenosine-to-inosine (A-to-I) editing is a site-selective post-transcriptional alteration of double-stranded RNA by ADAR deaminases that is crucial for homeostasis and development. Recently the Mouse Genomes Project generated genome sequences for 17 laboratory mouse strains and rich catalogues of variants. We also generated RNA-seq data from whole brain RNA from 15 of the sequenced strains. Results: Here we present a computational approach that takes an initial set of transcriptome/genome mismatch sites and filters these calls taking into account systematic biases in alignment, single nucleotide variant calling, and sequencing depth to identify RNA editing sites with high accuracy. We applied this approach to our panel of mouse strain transcriptomes identifying 7,389 editing sites with an estimated false-discovery rate of between 2.9 and 10.5%. The overwhelming majority of these edits were of the A-to-I type, with less than 2.4% not of this class, and only three of these edits could not be explained as alignment artifacts. We validated 24 novel RNA editing sites in coding sequence, including two non-synonymous edits in the Cacna1d gene that fell into the IQ domain portion of the Cav1.2 voltage-gated calcium channel, indicating a potential role for editing in the generation of transcript diversity. Conclusions: We show that despite over two million years of evolutionary divergence, the sites edited and the level of editing at each site is remarkably consistent across the 15 strains. In the Cds2 gene we find evidence for RNA editing acting to preserve the ancestral transcript sequence despite genomic sequence divergence.
Affiliation(s)
- Petr Danecek
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1HH, UK
|
291
|
Van Bers NEM, Santure AW, Van Oers K, De Cauwer I, Dibbits BW, Mateman C, Crooijmans RPMA, Sheldon BC, Visser ME, Groenen MAM, Slate J. The design and cross-population application of a genome-wide SNP chip for the great tit Parus major. Mol Ecol Resour 2012; 12:753-70. [PMID: 22487530 DOI: 10.1111/j.1755-0998.2012.03141.x] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 12/31/2022]
Abstract
The vast amount of phenotypic information collected in some wild animal populations makes them extremely valuable for unravelling the genetics of ecologically important traits and understanding how populations adapt to changes in their environment. Next generation sequencing has revolutionized the development of large marker panels in species previously lacking genomic resources. In this study, a unique genomics toolkit was developed for the great tit (Parus major), a model species in ecology and behavioural biology. This toolkit consists of nearly 100,000 SNPs, over 250 million nucleotides of assembled genomic DNA and more than 80 million nucleotides of assembled expressed sequences. A SNP chip with 9193 SNP markers expected to be spaced evenly along the great tit genome was used to genotype 4702 birds from two of the most intensively studied natural vertebrate populations [Wytham Woods/Bagley Woods (United Kingdom) and de Hoge Veluwe/Westerheide (The Netherlands)]. We show that (i) SNPs identified in either of the two populations have a high genotyping success in the other population, (ii) the minor allele frequencies of the SNPs are highly correlated between the two populations and (iii) despite this high correlation, a large number of SNPs display significant differentiation (F_ST) between the populations, with an overrepresentation of genes involved in cardiovascular development close to these SNPs. The developed resources provide the basis for unravelling the genetics of important traits in many long-term studies of great tits. More generally, the protocols and pitfalls encountered will be of use for those developing similar resources.
Affiliation(s)
- Nikkie E M Van Bers
- Animal Breeding and Genomics Centre, Wageningen University, De Elst 1, Wageningen, 6708 WD, the Netherlands
|
292
|
Accurate identification of human Alu and non-Alu RNA editing sites. Nat Methods 2012; 9:579-81. [PMID: 22484847 DOI: 10.1038/nmeth.1982] [Citation(s) in RCA: 282] [Impact Index Per Article: 21.7] [Received: 01/12/2012] [Accepted: 03/29/2012] [Indexed: 11/08/2022]
Abstract
We developed a computational framework to robustly identify RNA editing sites using transcriptome and genome deep-sequencing data from the same individual. As compared with previous methods, our approach identified a large number of Alu and non-Alu RNA editing sites with high specificity. We also found that editing of non-Alu sites appears to be dependent on nearby edited Alu sites, possibly through the locally formed double-stranded RNA structure.
|
293
|
Kleinman CL, Majewski J. Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science 2012; 335:1302; author reply 1302. [PMID: 22422962 DOI: 10.1126/science.1209658] [Citation(s) in RCA: 136] [Impact Index Per Article: 10.5] [Indexed: 02/02/2023]
Abstract
Li et al. (Research Articles, 1 July 2011, p. 53; published online 19 May 2011) reported large numbers of differences between DNA and messenger RNA in human cells, indicating unprecedented levels of RNA editing, and including sequence changes not produced by any of the known RNA editing mechanisms. However, common sources of systematic errors in high-throughput sequencing technology, which were not properly accounted for in this study, explain most of the claimed differences.
Affiliation(s)
- Claudia L Kleinman
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada.
|
294
|
Lin W, Piskol R, Tan MH, Li JB. Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science 2012; 335:1302; author reply 1302. [PMID: 22422964 DOI: 10.1126/science.1210419] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 02/01/2023]
Abstract
Li et al. (Research Articles, 1 July 2011, p. 53; published online 19 May 2011) reported widespread differences between the RNA and DNA sequences of the same human cells, including all 12 possible mismatch types. Before accepting such a fundamental claim, a deeper analysis of the sequencing data is required to discern true differences between RNA and DNA from potential artifacts.
Affiliation(s)
- Wei Lin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
|
295
|
Lin W, Piskol R, Tan MH, Li JB. Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science 2012. [PMID: 22422964 DOI: 10.1126/science.1210624] [Citation(s) in RCA: 127] [Impact Index Per Article: 9.8] [Indexed: 10/28/2022]
Abstract
Li et al. (Research Articles, 1 July 2011, p. 53; published online 19 May 2011) reported widespread differences between the RNA and DNA sequences of the same human cells, including all 12 possible mismatch types. Before accepting such a fundamental claim, a deeper analysis of the sequencing data is required to discern true differences between RNA and DNA from potential artifacts.
Affiliation(s)
- Wei Lin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
|
296
|
Murphy D, Konopacka A, Hindmarch C, Paton JFR, Sweedler JV, Gillette MU, Ueta Y, Grinevich V, Lozic M, Japundzic-Zigon N. The hypothalamic-neurohypophyseal system: from genome to physiology. J Neuroendocrinol 2012; 24:539-53. [PMID: 22448850 PMCID: PMC3315060 DOI: 10.1111/j.1365-2826.2011.02241.x] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Abstract
The elucidation of the genomes of a large number of mammalian species has produced a huge amount of data on which to base physiological studies. These endeavours have also produced surprises, not least of which has been the revelation that the number of protein coding genes needed to make a mammal is only 22 333 (give or take). However, this small number belies an unanticipated complexity that has only recently been revealed as a result of genomic studies. This complexity is evident at a number of levels: (i) cis-regulatory sequences; (ii) noncoding and antisense mRNAs, most of which have no known function; (iii) alternative splicing that results in the generation of multiple, subtly different mature mRNAs from the precursor transcript encoded by a single gene; and (iv) post-translational processing and modification. In this review, we examine the steps being taken to decipher genome complexity in the context of gene expression, regulation and function in the hypothalamic-neurohypophyseal system (HNS). Five unique stories explain: (i) the use of transcriptomics to identify genes involved in the response to physiological (dehydration) and pathological (hypertension) cues; (ii) the use of mass spectrometry for single-cell level identification of biologically active peptides in the HNS, and to measure in vitro release; (iii) the use of transgenic lines that express fusion transgenes enabling (by cross-breeding) the generation of double transgenic lines that can be used to study vasopressin (AVP) and oxytocin (OXT) neurones in the HNS, as well as their neuroanatomy, electrophysiology and activation upon exposure to any given stimulus; (iv) the use of viral vectors to demonstrate that somato-dendritically released AVP plays an important role in cardiovascular homeostasis by binding to V1a receptors on local somata and dendrites; and (v) the use of virally-mediated optogenetics to dissect the role of OXT and AVP in the modulation of a wide variety of behaviours.
Affiliation(s)
- D Murphy
- Henry Wellcome Laboratories for Integrative Neuroscience and Endocrinology, University of Bristol, Bristol, UK.
|
297
|
Gu T, Buaas FW, Simons AK, Ackert-Bicknell CL, Braun RE, Hibbs MA. Canonical A-to-I and C-to-U RNA editing is enriched at 3'UTRs and microRNA target sites in multiple mouse tissues. PLoS One 2012; 7:e33720. [PMID: 22448268 PMCID: PMC3308996 DOI: 10.1371/journal.pone.0033720] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Received: 12/13/2011] [Accepted: 02/15/2012] [Indexed: 11/30/2022]
Abstract
RNA editing is a process that modifies RNA nucleotides and changes the efficiency and fidelity of the central dogma. Enzymes that catalyze RNA editing are required for life, and defects in RNA editing are associated with many diseases. Recent advances in sequencing have enabled the genome-wide identification of RNA editing sites in mammalian transcriptomes. Here, we demonstrate that canonical RNA editing (A-to-I and C-to-U) occurs in liver, white adipose, and bone tissues of the laboratory mouse, and we show that apparent non-canonical editing (all other possible base substitutions) is an artifact of current high-throughput sequencing technology. Further, we report that high-confidence canonical RNA editing sites can cause non-synonymous amino acid changes and are significantly enriched in 3′ UTRs, specifically at microRNA target sites, suggesting both regulatory and functional consequences for RNA editing.
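The split between canonical and apparent non-canonical editing can be made concrete with a small sketch. It assumes strand-aware calls in which each site is given as a (DNA base, cDNA base) pair on the transcribed strand, so that A-to-I appears as A>G (inosine is read as G) and C-to-U appears as C>T. The function and constant names are hypothetical and do not come from the paper.

```python
from itertools import permutations

BASES = ("A", "C", "G", "T")

# Canonical editing types as they appear in cDNA reads on the transcribed strand
# (assumption of this sketch: strand information is available for every site).
CANONICAL = {("A", "G"): "A-to-I", ("C", "T"): "C-to-U"}


def all_mismatch_types():
    """Return every ordered (DNA base, cDNA base) pair with differing bases."""
    return list(permutations(BASES, 2))


def classify_mismatch(dna_base: str, rna_base: str) -> str:
    """Label a DNA/cDNA base pair as a match, a canonical edit, or non-canonical."""
    dna_base, rna_base = dna_base.upper(), rna_base.upper()
    if dna_base not in BASES or rna_base not in BASES:
        raise ValueError("bases must be one of A, C, G, T")
    if dna_base == rna_base:
        return "match"
    return CANONICAL.get((dna_base, rna_base), "non-canonical")


if __name__ == "__main__":
    for dna, rna in all_mismatch_types():
        print(f"{dna}>{rna}: {classify_mismatch(dna, rna)}")
```

Of the 12 possible mismatch types, only two fall into the canonical classes under this convention; the remaining ten are the substitutions the abstract attributes to sequencing artifacts. With unstranded data the reverse-complement forms (T>C and G>A) would also have to be considered, which this sketch does not handle.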
Affiliation(s)
- Tongjun Gu
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
|
298
|
Pickrell JK, Gilad Y, Pritchard JK. Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science 2012; 335:1302; author reply 1302. [PMID: 22422963 PMCID: PMC5207799 DOI: 10.1126/science.1210484] [Citation(s) in RCA: 143] [Impact Index Per Article: 11.0] [Indexed: 11/02/2022]
Abstract
Li et al. (Research Articles, 1 July 2011, p. 53; published online 19 May 2011) reported more than 10,000 mismatches between messenger RNA and DNA sequences from the same individuals, which they attributed to previously unrecognized mechanisms of gene regulation. We found that at least 88% of these sequence mismatches can likely be explained by technical artifacts such as errors in mapping sequencing reads to a reference genome, sequencing errors, and genetic variation.
Affiliation(s)
- Joseph K Pickrell
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
|
299
|
|
300
|
|