2
He Y, Yuan C, Chen L, Lei M, Zellmer L, Huang H, Liao DJ. Transcriptional-Readthrough RNAs Reflect the Phenomenon of "A Gene Contains Gene(s)" or "Gene(s) within a Gene" in the Human Genome, and Thus Are Not Chimeric RNAs. Genes (Basel) 2018; 9:E40. [PMID: 29337901 PMCID: PMC5793191 DOI: 10.3390/genes9010040] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/29/2017] [Revised: 12/29/2017] [Accepted: 01/07/2018] [Indexed: 02/06/2023]
Abstract
Tens of thousands of chimeric RNAs, i.e., RNAs with sequences of two genes, have been identified in human cells. Most of them are formed by two neighboring genes on the same chromosome and are considered to be derived via transcriptional readthrough, but a true readthrough event still awaits more evidence, and trans-splicing that joins two transcripts together remains a possible mechanism. We regard those genomic loci that are transcriptionally read through as unannotated genes, because their transcriptional and posttranscriptional regulation is the same as that of already-annotated genes, including fusion genes formed due to genetic alterations. Therefore, readthrough RNAs and fusion-gene-derived RNAs are not chimeras. Only those two-gene RNAs formed at the RNA level, likely via trans-splicing, without corresponding genes as genomic parents, should be regarded as authentic chimeric RNAs. However, since the procedural and mechanistic details of trans-splicing in human cells have never been disclosed, we doubt the existence of trans-splicing. There are thus probably no authentic chimeras in humans once readthrough and fusion-gene-derived RNAs are all put back into the group of ordinary RNAs. It should therefore be further determined whether in human cells all two-neighboring-gene RNAs are derived from transcriptional readthrough and whether trans-splicing truly exists.
Affiliation(s)
- Yan He
  - Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang 550004, Guizhou, China.
- Chengfu Yuan
  - Department of Biochemistry, China Three Gorges University, Yichang City 443002, Hubei, China.
- Lichan Chen
  - Hormel Institute, University of Minnesota, Austin, MN 55912, USA.
- Mingjuan Lei
  - Hormel Institute, University of Minnesota, Austin, MN 55912, USA.
- Lucas Zellmer
  - Masonic Cancer Center, University of Minnesota, 435 E. River Road, Minneapolis, MN 55455, USA.
- Hai Huang
  - School of Clinical Laboratory Science, Guizhou Medical University, Guiyang 550004, Guizhou, China.
- Dezhong Joshua Liao
  - Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang 550004, Guizhou, China.
  - Department of Pathology, Guizhou Medical University Hospital, Guiyang 550004, Guizhou, China.
3
Audoux J, Salson M, Grosset CF, Beaumeunier S, Holder JM, Commes T, Philippe N. SimBA: A methodology and tools for evaluating the performance of RNA-Seq bioinformatic pipelines. BMC Bioinformatics 2017; 18:428. [PMID: 28969586 PMCID: PMC5623974 DOI: 10.1186/s12859-017-1831-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/20/2017] [Accepted: 09/08/2017] [Indexed: 11/10/2022]
Abstract
Background: The evolution of next-generation sequencing (NGS) technologies has led to increased focus on RNA-Seq. Many bioinformatic tools have been developed for RNA-Seq analysis, each with unique performance characteristics and configuration parameters. Users face an increasingly complex task in understanding which bioinformatic tools are best for their specific needs and how they should be configured. In order to provide some answers to these questions, we investigate the performance of leading bioinformatic tools designed for RNA-Seq analysis and propose a methodology for systematic evaluation and comparison of performance to help users make well-informed choices. Results: To evaluate RNA-Seq pipelines, we developed a suite of two benchmarking tools. SimCT generates simulated datasets that get as close as possible to specific real biological conditions, accompanied by the list of genomic incidents and mutations that have been inserted. BenchCT then compares the output of any bioinformatics pipeline that has been run against a SimCT dataset with the simulated genomic and transcriptional variations it contains, to give an accurate performance evaluation in addressing a specific biological question. We used these tools to simulate a real-world genomic medicine question involving the comparison of healthy and cancerous cells. Results revealed that performance in addressing a particular biological context varied significantly depending on the choice of tools and settings used. We also found that by combining the output of certain pipelines, substantial performance improvements could be achieved. Conclusion: Our research emphasizes the importance of selecting and configuring bioinformatic tools for the specific biological question being investigated to obtain optimal results. Pipeline designers, developers and users should include benchmarking in the context of their biological question as part of their design and quality control process. Our SimBA suite of benchmarking tools provides a reliable basis for comparing the performance of RNA-Seq bioinformatics pipelines in addressing a specific biological question. We would like to see the creation of a reference corpus of data-sets that would allow accurate comparison between benchmarks performed by different groups and the publication of more benchmarks based on this public corpus. SimBA software and data-set are available at http://cractools.gforge.inria.fr/softwares/simba/. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1831-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jérôme Audoux
  - SeqOne, IRMB, CHRU de Montpellier - Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France.
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France.
- Mikaël Salson
  - University Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Lille, F-59000, France.
- Sacha Beaumeunier
  - SeqOne, IRMB, CHRU de Montpellier - Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France.
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France.
- Jean-Marc Holder
  - SeqOne, IRMB, CHRU de Montpellier - Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France.
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France.
- Thérèse Commes
  - SeqOne, IRMB, CHRU de Montpellier - Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France.
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France.
- Nicolas Philippe
  - SeqOne, IRMB, CHRU de Montpellier - Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France.
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France.
4
Rufflé F, Audoux J, Boureux A, Beaumeunier S, Gaillard JB, Bou Samra E, Megarbane A, Cassinat B, Chomienne C, Alves R, Riquier S, Gilbert N, Lemaitre JM, Bacq-Daian D, Bougé AL, Philippe N, Commes T. New chimeric RNAs in acute myeloid leukemia. F1000Res 2017; 6. [PMID: 29623188 PMCID: PMC5861515 DOI: 10.12688/f1000research.11352.2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Accepted: 12/05/2017] [Indexed: 12/24/2022]
Abstract
Background: High-throughput next generation sequencing (NGS) technologies enable the detection of biomarkers used for tumor classification, disease monitoring and cancer therapy. Whole-transcriptome analysis using RNA-seq is important, not only as a means of understanding the mechanisms responsible for complex diseases but also to efficiently identify novel genes/exons, splice isoforms, RNA editing, allele-specific mutations, differential gene expression and fusion-transcripts or chimeric RNA (chRNA). Methods: We used Crac, a tool that uses genomic locations and local coverage to classify biological events and directly infer splice and chimeric junctions within a single read. Crac’s algorithm extracts transcriptional chimeric events irrespective of annotation with a high sensitivity, and CracTools was used to aggregate, annotate and filter the chRNA reads. The selected chRNA candidates were validated by real time PCR and sequencing. In order to check the tumor specific expression of chRNA, we analyzed a publicly available dataset using a new tag search approach. Results: We present data related to acute myeloid leukemia (AML) RNA-seq analysis. We highlight novel biological cases of chRNA, in addition to previously well characterized leukemia chRNA. We have identified and validated 17 chRNAs among 3 AML patients: 10 from an AML patient with a translocation between chromosomes 15 and 17 (AML-t(15;17)), 4 from a patient with normal karyotype (AML-NK), and 3 from a patient with a chromosome 16 inversion (AML-inv16). The new fusion transcripts can be classified into four groups according to the exon organization. Conclusions: All groups suggest complex but distinct synthesis mechanisms involving either collinear exons of different genes, non-collinear exons, or exons of different chromosomes. Finally, we check tumor-specific expression in a larger RNA-seq AML cohort and identify new AML biomarkers that could improve diagnosis and prognosis of AML.
Affiliation(s)
- Florence Rufflé
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Jerome Audoux
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Anthony Boureux
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Sacha Beaumeunier
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Elias Bou Samra
  - Université Paris Sud, Université Paris-Saclay, Orsay, France.
  - Institut Curie, PSL Research University, Paris, France.
- Bruno Cassinat
  - Laboratoire de Biologie Cellulaire, Hôpital Saint-Louis, Assistance publique - Hôpitaux de Paris (AP-HP), Paris, France.
- Christine Chomienne
  - Laboratoire de Biologie Cellulaire, Hôpital Saint-Louis, Assistance publique - Hôpitaux de Paris (AP-HP), Paris, France.
  - Hôpital Saint-Louis, Université Paris Diderot, INSERM UMRS 1131, Paris, France.
- Ronnie Alves
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Instituto Tecnológico Vale, Nazaré, Belém, PA, Brazil.
- Sebastien Riquier
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Nicolas Gilbert
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Jean-Marc Lemaitre
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Anne Laure Bougé
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Nicolas Philippe
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.
- Therese Commes
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France.
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France.