1
|
Tan Y, Mohanty V, Liang S, Dou J, Ma J, Kim KH, Bonder MJ, Shi X, Lee C, Chong Z, Chen K. novoRNABreak: Local Assembly for Novel Splice Junction and Fusion Transcript Detection from RNA-Seq Data. Journal of Bioinformatics and Systems Biology: Open Access 2023; 6:74-81. [PMID: 39301431 PMCID: PMC11412692 DOI: 10.26502/jbsb.5107050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
We present novoRNABreak, a unified framework for cancer-specific novel splice junction and fusion transcript detection in RNA-seq data obtained from human cancer samples. novoRNABreak is based on a local assembly model, which offers a tradeoff between alignment-based and de novo whole transcriptome assembly (WTA) methods. This approach is accurate and sensitive in assembling novel junctions that are difficult to align directly or that have multiple alignments. It is also more efficient because it focuses on junctions rather than full-length transcripts. The performance of novoRNABreak is demonstrated by a comprehensive set of experiments using synthetic data generated from a reference genome, as well as real RNA-seq data from breast cancer and prostate cancer samples. The results show that our tool achieves better performance by fully utilizing unmapped reads and precisely identifying junctions where short reads or small exons have multiple alignments. novoRNABreak is a fully fledged program available on GitHub (https://github.com/KChen-lab/novoRNABreak).
Affiliation(s)
- Yukun Tan
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Vakul Mohanty
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Shaoheng Liang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Jinzhuang Dou
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Jun Ma
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Kun Hee Kim
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Marc Jan Bonder
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
| | - Xinghua Shi
- Department of Computer & Information Sciences, College of Science and Technology, Temple University, Philadelphia, PA, 19122, USA
| | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Zechen Chong
- Department of Genetics, the University of Alabama at Birmingham, Birmingham, AL, 35233, USA
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| |
|
2
|
Pavlovich PV, Cauchy P. Sequences to Differences in Gene Expression: Analysis of RNA-Seq Data. Methods Mol Biol 2022; 2508:279-318. [PMID: 35737247 DOI: 10.1007/978-1-0716-2376-3_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
RNA-Seq is now a routinely employed assay to measure gene expression. As the technique has matured over the last decade, so have dedicated analytic tools. In this chapter, we first describe the mainstream as well as the most up-to-date protocols and their implications for downstream analysis. We then detail the steps of RNA-Seq analysis in three main stages: (i) preprocessing and data preparation, (ii) upstream processing, and (iii) high-level analyses. We review the most recent and relevant tools as one workflow following a stepwise order. The chapter further covers in-depth features of these tools. Details of the required code and of the underlying statistics are provided throughout the chapter. We illustrate these steps with an analysis of publicly available RNA-Seq data.
Affiliation(s)
| | - Pierre Cauchy
- Universitätsklinikum Freiburg, Freiburg, Germany.
- Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany.
| |
|
3
|
CStone: A de novo transcriptome assembler for short-read data that identifies non-chimeric contigs based on underlying graph structure. PLoS Comput Biol 2021; 17:e1009631. [PMID: 34813594 PMCID: PMC8651127 DOI: 10.1371/journal.pcbi.1009631] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/23/2021] [Revised: 12/07/2021] [Accepted: 11/11/2021] [Indexed: 11/19/2022] Open
Abstract
With the exponential growth of sequence information stored over the last decade, including that of de novo assembled contigs from RNA-Seq experiments, quantification of chimeric sequences has become essential when assembling read data. In transcriptomics, de novo assembled chimeras can closely resemble underlying transcripts, but patterns such as those seen between co-evolving sites, or mapped read counts, become obscured. We have created a de Bruijn-based de novo assembler for RNA-Seq data that utilizes a classification system to describe the complexity of the underlying graphs from which contigs are created. Each contig is labelled with one of three levels, indicating whether or not ambiguous paths exist. A by-product of this is information on the range of complexity of the underlying gene families present. As a demonstration of CStone's ability to assemble high-quality contigs, and to label them in this manner, both simulated and real data were used. For simulated data, ten million read pairs were generated from cDNA libraries representing four species: Drosophila melanogaster, Panthera pardus, Rattus norvegicus and Serinus canaria. These were assembled using CStone, Trinity and rnaSPAdes, the latter two being high-quality, well-established de novo assemblers. For real data, two RNA-Seq datasets, each consisting of ≈30 million read pairs and representing two adult D. melanogaster whole-body samples, were used. The contigs that CStone produced were comparable in quality to those of Trinity and rnaSPAdes in terms of length, sequence identity of aligned regions and the range of cDNA transcripts represented, whilst providing additional information on chimerism. Here we describe the details of CStone's assembly and classification process, and propose that similar classification systems can be incorporated into other de novo assembly tools. Within a related side study, we explore the effects that chimeras within reference sets have on the identification of differentially expressed genes. CStone is available at: https://sourceforge.net/projects/cstone/.
Within transcriptome reference sets, non-chimeric sequences are representations of transcribed genes, while artificially generated chimeric ones are mosaics of two or more pieces of DNA incorrectly pieced together. One area where such sets are utilized is in the quantification of gene expression patterns, where RNA-Seq reads are mapped to the sequences within, and subsequent count values reflect expression levels. Artificial chimeras can have a negative impact on count values by erroneously increasing variation in relation to the reads being mapped. Reference sets can be created from de novo assembled contigs, but chimeras can be introduced during the assembly process via the required traversal of graphs, representing gene families, constructed from the RNA-Seq data. Graph complexity determines how likely chimeras are to arise. We have created CStone, a de novo assembler that utilizes a classification system to describe such complexity. Contigs created by CStone are labelled in a manner that indicates whether or not they are non-chimeric. This encourages contig-dependent results to be presented with increased objectivity by maintaining the context of ambiguity associated with the assembly process. CStone has been tested extensively. Additionally, we have quantified the relationship between chimeras within reference sets and the identification of differentially expressed genes.
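The idea of labelling a contig by whether its underlying graph contains ambiguous paths can be illustrated with a minimal Python sketch. It is not CStone's implementation: the abstract does not define the three levels, so the sketch only distinguishes "linear" from "branching" graphs, and the function names `build_de_bruijn` and `has_ambiguous_paths` are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def has_ambiguous_paths(graph):
    """Return True if any node has more than one successor or predecessor.

    Assumption for illustration: a branch point is treated as the only
    source of ambiguity; a real assembler would use a richer classification.
    """
    in_degree = defaultdict(int)
    for successors in graph.values():
        if len(successors) > 1:
            return True
        for nxt in successors:
            in_degree[nxt] += 1
    return any(count > 1 for count in in_degree.values())


# One transcript gives a linear graph; a second isoform sharing a prefix
# introduces a branch, so a traversal could produce a chimeric contig.
single = build_de_bruijn(["ATGGCGTGCA"], k=4)
mixed = build_de_bruijn(["ATGGCGTGCA", "ATGGCTTACA"], k=4)
```

In this toy case `single` has no branch points, whereas in `mixed` the node `GGC` has two successors, which is the kind of situation in which a graph traversal can join pieces of different transcripts.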
|
4
|
Sigman MJ, Panda K, Kirchner R, McLain LL, Payne H, Peasari JR, Husbands AY, Slotkin RK, McCue AD. An siRNA-guided ARGONAUTE protein directs RNA polymerase V to initiate DNA methylation. Nature Plants 2021; 7:1461-1474. [PMID: 34750500 PMCID: PMC8592841 DOI: 10.1038/s41477-021-01008-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 03/12/2021] [Accepted: 09/09/2021] [Indexed: 05/03/2023]
Abstract
In mammals and plants, cytosine DNA methylation is essential for the epigenetic repression of transposable elements and foreign DNA. In plants, DNA methylation is guided by small interfering RNAs (siRNAs) in a self-reinforcing cycle termed RNA-directed DNA methylation (RdDM). RdDM requires the specialized RNA polymerase V (Pol V), and the key unanswered question is how Pol V is first recruited to new target sites without pre-existing DNA methylation. We find that Pol V follows and is dependent on the recruitment of an AGO4-clade ARGONAUTE protein, and any siRNA can guide the ARGONAUTE protein to the new target locus independent of pre-existing DNA methylation. These findings reject long-standing models of RdDM initiation and instead demonstrate that siRNA-guided ARGONAUTE targeting is necessary, sufficient and first to target Pol V recruitment and trigger the cycle of RdDM at a transcribed target locus, thereby establishing epigenetic silencing.
Affiliation(s)
- Meredith J Sigman
- Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA
- Donald Danforth Plant Science Center, St. Louis, MO, USA
| | - Kaushik Panda
- Donald Danforth Plant Science Center, St. Louis, MO, USA
| | - Rachel Kirchner
- Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA
- Medical Scientist Training Program, University of Wisconsin, Madison, WI, USA
| | | | - Hayden Payne
- Donald Danforth Plant Science Center, St. Louis, MO, USA
- Graduate Program in the School of Plant Sciences, University of Arizona, Tucson, AZ, USA
| | - John Reddy Peasari
- Donald Danforth Plant Science Center, St. Louis, MO, USA
- Bioinformatics and Computational Biology Program, Saint Louis University, St. Louis, MO, USA
| | - Aman Y Husbands
- Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA
| | - R Keith Slotkin
- Donald Danforth Plant Science Center, St. Louis, MO, USA.
- Division of Biological Sciences, University of Missouri, Columbia, MO, USA.
| | - Andrea D McCue
- Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA
- Donald Danforth Plant Science Center, St. Louis, MO, USA
| |
|
5
|
Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, Balliu B, Koslicki D, Skums P, Zelikovsky A, Alkan C, Mutlu O, Mangul S. Technology dictates algorithms: recent developments in read alignment. Genome Biol 2021; 22:249. [PMID: 34446078 PMCID: PMC8390189 DOI: 10.1186/s13059-021-02443-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 10/15/2020] [Accepted: 07/28/2021] [Indexed: 01/08/2023] Open
Abstract
Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.
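As a concrete picture of one common algorithmic foundation for read alignment, the following Python sketch shows hash-based seed-and-extend: index the reference by k-mers, look up seeds from the read, and verify the best-supported candidate position. It is a simplified illustration under stated assumptions (substitutions only, no indels, forward strand only), not a reproduction of any aligner evaluated in the survey; `build_index` and `align` are hypothetical names.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Hash every k-mer of the reference to the positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def align(read, reference, index, k, max_mismatches=2):
    """Seed with exact k-mer hits, then verify candidates by Hamming distance."""
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            start = pos - offset  # implied start of the read on the reference
            if 0 <= start <= len(reference) - len(read):
                votes[start] += 1
    # Try candidate starts in order of decreasing seed support.
    for start, _ in votes.most_common():
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            return start, mismatches
    return None


reference = "TTGACC" + "ATGGCGTGCA" + "TTAGC"
index = build_index(reference, k=4)
hit = align("ATGGCCTGCA", reference, index, k=4)  # read with one substitution
```

The seeding step keeps the expensive verification limited to a handful of candidate positions, which is the main reason this family of methods scales to large references.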
Affiliation(s)
- Mohammed Alser
- Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
| | - Jeremy Rotman
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Dhrithi Deshpande
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
| | - Kodi Taraszka
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Huwenbo Shi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Pelin Icer Baykal
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
| | - Harry Taegyun Yang
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Bioinformatics Interdepartmental Ph.D. Program, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Victor Xue
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Sergey Knyazev
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
| | - Benjamin D Singer
- Division of Pulmonary and Critical Care Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Department of Biochemistry & Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, USA
- Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
| | - Brunilda Balliu
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - David Koslicki
- Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16801, USA
- Biology Department, Pennsylvania State University, University Park, PA, 16801, USA
- The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16801, USA
| | - Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
| | - Can Alkan
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Bilkent-Hacettepe Health Sciences and Technologies Program, Ankara, Turkey
| | - Onur Mutlu
- Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
| | - Serghei Mangul
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA.
| |
|
6
|
Kuitche E, Jammali S, Ouangraoua A. SimSpliceEvol: alternative splicing-aware simulation of biological sequence evolution. BMC Bioinformatics 2019; 20:640. [PMID: 31842741 PMCID: PMC6916212 DOI: 10.1186/s12859-019-3207-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/01/2023] Open
Abstract
Background: It is now well established that eukaryotic coding genes can produce more than one type of transcript through the mechanisms of alternative splicing and alternative transcription. Because gold-standard real data on alternative splicing are lacking, simulated data are a good option for evaluating the accuracy and efficiency of methods developed for splice-aware sequence analysis. However, existing sequence evolution simulation methods do not model alternative splicing, so they cannot be used to test spliced sequence analysis methods. Results: We propose a new method called SimSpliceEvol for simulating the evolution of sets of alternative transcripts along the branches of an input gene tree. In addition to traditional sequence evolution events, the simulation also includes gene exon-intron structure evolution events and alternative splicing events that modify the sets of transcripts produced from genes. SimSpliceEvol was implemented in Python. The source code is freely available at https://github.com/UdeS-CoBIUS/SimSpliceEvol. Conclusions: Data generated using SimSpliceEvol are useful for testing spliced RNA sequence analysis methods that require knowledge of the real evolutionary relationships between the sequences, such as methods for spliced alignment of cDNA and genomic sequences, multiple cDNA alignment, orthologous exon identification, splicing orthology inference, and transcript phylogeny inference.
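One kind of alternative splicing event that changes the set of transcripts produced from a gene is exon skipping. The short Python sketch below enumerates the isoforms obtained by skipping a single internal exon. It only illustrates the event type and is not SimSpliceEvol's simulation model; the function name `exon_skipping_isoforms` and the toy exon sequences are assumptions.

```python
def exon_skipping_isoforms(exons):
    """Return the full-length transcript plus each isoform lacking one internal exon.

    The first and last exons are always kept (an assumption made here so that
    every isoform keeps the same start and end).
    """
    isoforms = ["".join(exons)]
    for skip in range(1, len(exons) - 1):
        kept = exons[:skip] + exons[skip + 1:]
        isoforms.append("".join(kept))
    return isoforms


toy_exons = ["ATG", "GCC", "TTA", "TGA"]
isoforms = exon_skipping_isoforms(toy_exons)
# -> ['ATGGCCTTATGA', 'ATGTTATGA', 'ATGGCCTGA']
```

A simulator that applies such events along the branches of a gene tree yields transcript sets whose true relationships are known, which is what makes the generated data usable as a benchmark.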
Affiliation(s)
- Esaie Kuitche
- Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l'Université, Quebec, J1K2R1, Canada.
| | - Safa Jammali
- Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l'Université, Quebec, J1K2R1, Canada; Department of Biochemistry, University of Sherbrooke, 3001 12e avenue Nord, Quebec, J1H5N4, Canada
| | - Aïda Ouangraoua
- Department of Computer Science, University of Sherbrooke, 2500 Boulevard de l'Université, Quebec, J1K2R1, Canada
| |
|
7
|
Ji S, Liu Z, Liu B, Wang Y. Comparative analysis of biocontrol agent Trichoderma asperellum ACCC30536 transcriptome during its interaction with Populus davidiana × P. alba var. pyramidalis. Microbiol Res 2019; 227:126294. [PMID: 31421718 DOI: 10.1016/j.micres.2019.126294] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/15/2019] [Revised: 06/13/2019] [Accepted: 06/18/2019] [Indexed: 12/11/2022]
Abstract
After exposure to Populus davidiana × P. alba var. pyramidalis (PdPap), the expression of genes in Trichoderma asperellum was compared across four transcriptomes. The 20 most highly expressed genes included six heat shock proteins and three hydrophobins, indicating that Trichoderma can rapidly adapt to environmental stresses and elicit a plant defense response. Genes involved in the interaction between Trichoderma and the plant, for example sugar transporters, EPL1s, endoxylanases, pectin lyases, and nitrilases, showed increasing expression levels. Interestingly, sugar transporters also showed high expression when T. asperellum was cultured on medium lacking a carbon substrate, which would contribute to T. asperellum's survival and domination in ecological niche competition. Genes related to mycoparasitism were expressed abundantly following T. asperellum's interaction with PdPap, indicating that PdPap induction could enhance the mycoparasitic ability of T. asperellum. Twelve chitinases and five glucanases showed higher expression in transcriptome Cs, indicating that T. asperellum secretes both types of enzyme before interacting with pathogens, allowing T. asperellum to implement mycoparasitism and obtain more energy. Many novel transcripts were obtained in each transcriptome, which may play important roles in the biocontrol process of T. asperellum. Interestingly, T. asperellum undergoes constitutive alternative splicing in the biocontrol process: seven biocontrol genes were alternatively spliced via intron retention. qRT-PCR analysis showed that intron retention is negatively associated with the expression of chitinase, oligopeptide transporters, and beta-lactamase. However, the percentage of MAPK intron retention was quite low, suggesting that intron retention has little effect on the function of MAPK.
Affiliation(s)
- Shida Ji
- Key Laboratory of Biogeography and Bioresource in Arid Land, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, 830011, China
| | - Zhihua Liu
- State Key Laboratory of Tree Genetics and Breeding (Northeast Forestry University), 26 Hexing Road, 150040, Harbin, China
| | - Bin Liu
- State Key Laboratory of Tree Genetics and Breeding (Northeast Forestry University), 26 Hexing Road, 150040, Harbin, China
| | - Yucheng Wang
- Key Laboratory of Biogeography and Bioresource in Arid Land, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, 830011, China; State Key Laboratory of Tree Genetics and Breeding (Northeast Forestry University), 26 Hexing Road, 150040, Harbin, China.
| |
|
8
|
Sun LF, Zhang B, Chen XJ, Wang XY, Zhang BW, Ji YY, Wu KC, Wu J, Jin ZB. Circular RNAs in human and vertebrate neural retinas. RNA Biol 2019; 16:821-829. [PMID: 30874468 DOI: 10.1080/15476286.2019.1591034] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/12/2022] Open
Abstract
Circular RNAs (circRNAs) belong to an endogenous class of RNA molecules with both ends covalently linked in a circle. Although their expression pattern in the mammalian brain has been well studied, the characteristics and functions of circRNAs in retinas remain unknown. To reveal the whole expression profiles of circRNAs in the neural retina, we investigated retinal RNAs of human, monkey, mouse, pig, zebrafish and tree shrew and detected thousands of circRNAs showing conservation and variation in the retinas across different vertebrate species. We further investigated one of the abundant circRNAs, circPDE4B, identified in human retina. Silencing of circPDE4B significantly inhibited the proliferation of human A549 cells. Functional assays demonstrated that circPDE4B could sponge miR-181C, thereby altering the cell phenotype. We have explored the retinal circRNA repertoires across human and different vertebrates, which provide new insights into the important role of circRNAs in the vertebrate retinas, as well as in related human diseases.
Affiliation(s)
- Lan-Fang Sun
- Laboratory for Stem Cell and Retinal Regeneration, Institute of Stem Cell Research, Division of Ophthalmic Genetics, The Eye Hospital, Wenzhou Medical University, Wenzhou, China; State Key Laboratory of Ophthalmology, Optometry and Vision Science, Wenzhou Medical University, National International Joint Research Center for Regenerative Medicine and Neurogenetics, Wenzhou, China
| | - Bing Zhang
- Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China; Computational Genomics Lab, Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Xue-Jiao Chen
- Laboratory for Stem Cell and Retinal Regeneration, Institute of Stem Cell Research, Division of Ophthalmic Genetics, The Eye Hospital, Wenzhou Medical University, Wenzhou, China; State Key Laboratory of Ophthalmology, Optometry and Vision Science, Wenzhou Medical University, National International Joint Research Center for Regenerative Medicine and Neurogenetics, Wenzhou, China
| | - Xiao-Yun Wang
- Laboratory for Stem Cell and Retinal Regeneration, Institute of Stem Cell Research, Division of Ophthalmic Genetics, The Eye Hospital, Wenzhou Medical University, Wenzhou, China; State Key Laboratory of Ophthalmology, Optometry and Vision Science, Wenzhou Medical University, National International Joint Research Center for Regenerative Medicine and Neurogenetics, Wenzhou, China
| | - Bo-Wen Zhang
- Laboratory for Stem Cell and Retinal Regeneration, Institute of Stem Cell Research, Division of Ophthalmic Genetics, The Eye Hospital, Wenzhou Medical University, Wenzhou, China; State Key Laboratory of Ophthalmology, Optometry and Vision Science, Wenzhou Medical University, National International Joint Research Center for Regenerative Medicine and Neurogenetics, Wenzhou, China
| | - Yang-Yang Ji
- Laboratory for Stem Cell and Retinal Regeneration, Institute of Stem Cell Research, Division of Ophthalmic Genetics, The Eye Hospital, Wenzhou Medical University, Wenzhou, China; State Key Laboratory of Ophthalmology, Optometry and Vision Science, Wenzhou Medical University, National International Joint Research Center for Regenerative Medicine and Neurogenetics, Wenzhou, China
| | - Kun-Chao Wu
- Laboratory for Stem Cell and Retinal Regeneration, Institute of Stem Cell Research, Division of Ophthalmic Genetics, The Eye Hospital, Wenzhou Medical University, Wenzhou, China; State Key Laboratory of Ophthalmology, Optometry and Vision Science, Wenzhou Medical University, National International Joint Research Center for Regenerative Medicine and Neurogenetics, Wenzhou, China
| | - Jinyu Wu
- Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China
| | - Zi-Bing Jin
- Laboratory for Stem Cell and Retinal Regeneration, Institute of Stem Cell Research, Division of Ophthalmic Genetics, The Eye Hospital, Wenzhou Medical University, Wenzhou, China; State Key Laboratory of Ophthalmology, Optometry and Vision Science, Wenzhou Medical University, National International Joint Research Center for Regenerative Medicine and Neurogenetics, Wenzhou, China
| |
|
9
|
Jammali S, Aguilar JD, Kuitche E, Ouangraoua A. SplicedFamAlign: CDS-to-gene spliced alignment and identification of transcript orthology groups. BMC Bioinformatics 2019; 20:133. [PMID: 30925859 PMCID: PMC6439985 DOI: 10.1186/s12859-019-2647-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: The inference of splicing orthology relationships between gene transcripts is a basic step for the prediction of transcripts and the annotation of gene structures in genomes. The splicing structure of a sequence refers to the exon extremity information in a CDS or the exon-intron extremity information in a gene sequence. Splicing orthologous CDS are pairs of CDS with similar sequences and conserved splicing structures from orthologous genes. Spliced alignment, which consists of aligning a spliced cDNA sequence against an unspliced genomic sequence, constitutes a promising yet unexplored approach for the identification of splicing orthology relationships. Existing spliced alignment algorithms do not exploit the information on the splicing structure of the input sequences, namely the exon structure of the cDNA sequence and the exon-intron structure of the genomic sequences. Yet this information is often available for coding DNA sequences (CDS) and gene sequences annotated in databases, and it can help improve the accuracy of the computed spliced alignments. To address this issue, we introduce a new spliced alignment problem and a method called SplicedFamAlign (SFA) for computing the alignment of a spliced CDS against a gene sequence while accounting for the splicing structures of the input sequences, and then inferring transcript splicing orthology groups in a gene family based on spliced alignments. RESULTS: The experimental results show that SFA outperforms existing spliced alignment methods in terms of accuracy and execution time for CDS-to-gene alignment. We also show that the performance of SFA remains high for various levels of sequence similarity between input sequences, thanks to accounting for the splicing structure of the input sequences.
It is important to note that, unlike all current spliced alignment methods, which are meant for cDNA-to-genome alignments and can be used for CDS-to-gene alignments, SFA is the first method specifically designed for CDS-to-gene alignments. CONCLUSION: We show the usefulness of SFA for the comparison of genes and transcripts within a gene family for the purpose of analyzing splicing orthologies. It can also be used for gene structure annotation and alternative splicing analyses. SplicedFamAlign was implemented in Python. Source code is freely available at https://github.com/UdeS-CoBIUS/SpliceFamAlign.
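To make the notion of using a CDS's splicing structure during CDS-to-gene alignment more tangible, the sketch below places each annotated CDS exon on a gene sequence from left to right, so that the gaps between placements correspond to introns. It is a deliberately naive illustration that requires exact matches and is not the SFA algorithm; `place_exons`, its parameters, and the toy sequences are all hypothetical.

```python
def place_exons(cds, exon_ends, gene):
    """Locate each CDS exon in the gene by exact match, in left-to-right order.

    cds:       spliced coding sequence
    exon_ends: end coordinate (exclusive) of each exon within the CDS
    gene:      unspliced gene sequence
    Returns a list of (gene_start, gene_end) intervals, or None if an exon
    cannot be placed. Exact matching is an assumption made for brevity.
    """
    placements = []
    cds_start, search_from = 0, 0
    for cds_end in exon_ends:
        exon = cds[cds_start:cds_end]
        gene_start = gene.find(exon, search_from)
        if gene_start == -1:
            return None
        placements.append((gene_start, gene_start + len(exon)))
        search_from = gene_start + len(exon)  # the next exon must lie downstream
        cds_start = cds_end
    return placements


toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "TTATGA"  # exon, intron, exon
toy_cds = "ATGGCCTTATGA"
blocks = place_exons(toy_cds, [6, 12], toy_gene)
```

Knowing the exon boundaries in advance turns the problem into placing a few blocks in order, rather than discovering the block structure from sequence similarity alone.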
Affiliation(s)
- Safa Jammali
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada
- Department of Biochemistry, Faculty of Medicine and Health Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada
| | - Jean-David Aguilar
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada
- Department of Biochemistry, Faculty of Medicine and Health Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada
| | - Esaie Kuitche
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada
| | - Aïda Ouangraoua
- Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada
| |
|
10
|
Owen N, Moosajee M. RNA-sequencing in ophthalmology research: considerations for experimental design and analysis. Ther Adv Ophthalmol 2019; 11:2515841419835460. [PMID: 30911735 PMCID: PMC6421592 DOI: 10.1177/2515841419835460] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/03/2018] [Accepted: 02/08/2019] [Indexed: 12/13/2022] Open
Abstract
High-throughput, massively parallel sequence analysis has revolutionized the way that researchers design and execute scientific investigations. Vast amounts of sequence data can be generated in short periods of time. In ophthalmology and vision research, extensive interrogation of patient samples for underlying causative DNA mutations has resulted in the discovery of many new genes relevant to eye disease. However, such analysis remains functionally limited. RNA-sequencing accurately snapshots thousands of genes, capturing many subtypes of RNA molecules, and has become the gold standard for transcriptome gene expression quantification. RNA-sequencing has the potential to advance our understanding of eye development and disease; it can reveal new candidates to improve our molecular diagnosis rates and highlight therapeutic targets for intervention. However, given the wide range of applications, the design of such experiments can be problematic: no single optimal pipeline exists, and several factors must therefore be considered for optimal study design. We review the key steps involved in RNA-sequencing experimental design and the downstream bioinformatic pipelines used for differential gene expression. We provide guidance on the application of RNA-sequencing to ophthalmology and on sources of open-access eye-related data sets.
Affiliation(s)
- Nicholas Owen
- Development, Ageing and Disease Theme, UCL Institute of Ophthalmology, University College London, London, UK
|
11
|
Mapleson D, Venturini L, Kaithakottil G, Swarbreck D. Efficient and accurate detection of splice junctions from RNA-seq with Portcullis. Gigascience 2018; 7:5173486. [PMID: 30418570 PMCID: PMC6302956 DOI: 10.1093/gigascience/giy131] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 04/06/2018] [Accepted: 10/25/2018] [Indexed: 12/14/2022] Open
Abstract
Next-generation sequencing technologies enable rapid and cheap genome-wide transcriptome analysis, providing vital information about gene structure, transcript expression, and alternative splicing. Key to this is the accurate identification of exon-exon junctions from RNA sequenced (RNA-seq) reads. A number of RNA-seq aligners capable of splitting reads across these splice junctions (SJs) have been developed; however, it has been shown that while they correctly identify most genuine SJs available in a given sample, they also often produce large numbers of incorrect SJs. Here, we describe the extent of this problem using popular RNA-seq mapping tools and present a new method, called Portcullis, to rapidly filter false SJs derived from spliced alignments. We show that Portcullis distinguishes between genuine and false-positive junctions to a high degree of accuracy across different species, samples, expression levels, error profiles, and read lengths. Portcullis is portable, efficient, and, to our knowledge, currently the only SJ prediction tool that reliably scales for use with large RNA-seq datasets and large, highly fragmented genomes, while delivering accurate SJs.
Affiliation(s)
- Daniel Mapleson
- Earlham Institute, Norwich Research Park, NR4 7UZ, Norwich, United Kingdom
- Luca Venturini
- Earlham Institute, Norwich Research Park, NR4 7UZ, Norwich, United Kingdom
- Gemy Kaithakottil
- Earlham Institute, Norwich Research Park, NR4 7UZ, Norwich, United Kingdom
- David Swarbreck
- Earlham Institute, Norwich Research Park, NR4 7UZ, Norwich, United Kingdom
|
12
|
HSRA: Hadoop-based spliced read aligner for RNA sequencing data. PLoS One 2018; 13:e0201483. [PMID: 30063721 PMCID: PMC6067734 DOI: 10.1371/journal.pone.0201483] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/19/2018] [Accepted: 07/16/2018] [Indexed: 01/18/2023] Open
Abstract
Nowadays, the analysis of transcriptome sequencing (RNA-seq) data has become the standard method for quantifying the levels of gene expression. In RNA-seq experiments, the mapping of short reads to a reference genome or transcriptome is considered a crucial step that remains as one of the most time-consuming. With the steady development of Next Generation Sequencing (NGS) technologies, unprecedented amounts of genomic data introduce significant challenges in terms of storage, processing and downstream analysis. As cost and throughput continue to improve, there is a growing need for new software solutions that minimize the impact of increasing data volume on RNA read alignment. In this work we introduce HSRA, a Big Data tool that takes advantage of the MapReduce programming model to extend the multithreading capabilities of a state-of-the-art spliced read aligner for RNA-seq data (HISAT2) to distributed memory systems such as multi-core clusters or cloud platforms. HSRA has been built upon the Hadoop MapReduce framework and supports both single- and paired-end reads from FASTQ/FASTA datasets, providing output alignments in SAM format. The design of HSRA has been carefully optimized to avoid the main limitations and major causes of inefficiency found in previous Big Data mapping tools, which cannot fully exploit the raw performance of the underlying aligner. On a 16-node multi-core cluster, HSRA is on average 2.3 times faster than previous Hadoop-based tools. Source code in Java as well as a user’s guide are publicly available for download at http://hsra.dec.udc.es.
|
13
|
De Vito A, Lazzaro M, Palmisano I, Cittaro D, Riba M, Lazarevic D, Bannai M, Gabellini D, Schiaffino MV. Amino acid deprivation triggers a novel GCN2-independent response leading to the transcriptional reactivation of non-native DNA sequences. PLoS One 2018; 13:e0200783. [PMID: 30020994 PMCID: PMC6051655 DOI: 10.1371/journal.pone.0200783] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/10/2018] [Accepted: 07/03/2018] [Indexed: 12/18/2022] Open
Abstract
In a variety of species, reduced food intake, and in particular protein or amino acid (AA) restriction, extends lifespan and healthspan. However, the underlying epigenetic and/or transcriptional mechanisms are largely unknown, and dissection of specific pathways in cultured cells may contribute to filling this gap. We have previously shown that, in mammalian cells, deprivation of essential AAs (methionine/cysteine or tyrosine) leads to the transcriptional reactivation of integrated silenced transgenes, including plasmid and retroviral vectors and latent HIV-1 provirus, by a process involving epigenetic chromatin remodeling and histone acetylation. Here we show that the deprivation of methionine/cysteine also leads to the transcriptional upregulation of endogenous retroviruses, suggesting that essential AA starvation affects the expression not only of exogenous non-native DNA sequences, but also of endogenous anciently-integrated and silenced parasitic elements of the genome. Moreover, we show that the transgene reactivation response is highly conserved in different mammalian cell types, and it is reproducible with deprivation of most essential AAs. The General Control Non-derepressible 2 (GCN2) kinase and the downstream integrated stress response represent the best candidates mediating this process; however, by pharmacological approaches, RNA interference and genomic editing, we demonstrate that they are not implicated. Instead, the response requires MEK/ERK and/or JNK activity and is reproduced by ribosomal inhibitors, suggesting that it is triggered by a novel nutrient-sensing and signaling pathway, initiated by translational block at the ribosome, and independent of mTOR and GCN2. Overall, these findings point to a general transcriptional response to essential AA deprivation, which affects the expression of non-native genomic sequences, with relevant implications for the epigenetic/transcriptional effects of AA restriction in health and disease.
Affiliation(s)
- Annarosaria De Vito
- Division of Genetics and Cell Biology, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Massimo Lazzaro
- Division of Genetics and Cell Biology, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Ilaria Palmisano
- Division of Genetics and Cell Biology, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Davide Cittaro
- Center for Translational Genomics and Bioinformatics, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Michela Riba
- Center for Translational Genomics and Bioinformatics, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Dejan Lazarevic
- Center for Translational Genomics and Bioinformatics, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Makoto Bannai
- Frontier Research Labs, Institute for Innovation, Ajinomoto Co., Kawasaki, Tokyo, Japan
- Davide Gabellini
- Division of Genetics and Cell Biology, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Maria Vittoria Schiaffino
- Division of Genetics and Cell Biology, IRCCS San Raffaele Scientific Institute, Milan, Italy
|
14
|
Lecluze E, Jégou B, Rolland AD, Chalmel F. New transcriptomic tools to understand testis development and functions. Mol Cell Endocrinol 2018; 468:47-59. [PMID: 29501799 DOI: 10.1016/j.mce.2018.02.019] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 11/10/2017] [Revised: 02/26/2018] [Accepted: 02/27/2018] [Indexed: 12/16/2022]
Abstract
The testis plays a central role in the male reproductive system, secreting several hormones including male steroids and producing male gametes. A complex and coordinated molecular program is required for the proper differentiation of testicular cell types and maintenance of their functions in adulthood. The testicular transcriptome displays the highest levels of complexity and specificity across all tissues in a wide range of species. Many studies have used high-throughput sequencing technologies to define the molecular dynamics and regulatory networks in the testis as well as to identify novel genes or gene isoforms expressed in this organ. This review intends to highlight the complementarity of these transcriptomic studies and to show how the use of different sequencing protocols contributes to improving our global understanding of testicular biology.
Affiliation(s)
- Estelle Lecluze
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, Environnement et travail) - UMR_S1085, F-35000 Rennes, France
- Bernard Jégou
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, Environnement et travail) - UMR_S1085, F-35000 Rennes, France
- Antoine D Rolland
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, Environnement et travail) - UMR_S1085, F-35000 Rennes, France
- Frédéric Chalmel
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, Environnement et travail) - UMR_S1085, F-35000 Rennes, France
|
15
|
Wang M, Wang P, Liang F, Ye Z, Li J, Shen C, Pei L, Wang F, Hu J, Tu L, Lindsey K, He D, Zhang X. A global survey of alternative splicing in allopolyploid cotton: landscape, complexity and regulation. THE NEW PHYTOLOGIST 2018; 217:163-178. [PMID: 28892169 DOI: 10.1111/nph.14762] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 03/15/2017] [Accepted: 07/25/2017] [Indexed: 05/21/2023]
Abstract
Alternative splicing (AS) is a crucial regulatory mechanism in eukaryotes, which acts by greatly increasing transcriptome diversity. The extent and complexity of AS has been revealed in model plants using high-throughput next-generation sequencing. However, this technique is less effective in accurately identifying transcript isoforms in polyploid species because of the high sequence similarity between coexisting subgenomes. Here we characterize AS in the polyploid species cotton. Using Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq), we developed an integrated pipeline for Iso-Seq transcriptome data analysis (https://github.com/Nextomics/pipeline-for-isoseq). We identified 176 849 full-length transcript isoforms from 44 968 gene models and updated gene annotation. These data led us to identify 15 102 fibre-specific AS events and estimate that c. 51.4% of homoeologous genes produce divergent isoforms in each subgenome. We reveal that AS allows differential regulation of the same gene by miRNAs at the isoform level. We also show that nucleosome occupancy and DNA methylation play a role in defining exons at the chromatin level. This study provides new insights into the complexity and regulation of AS, and will enhance our understanding of AS in polyploid species. Our methodology for Iso-Seq data analysis will be a useful reference for the study of AS in other species.
Affiliation(s)
- Maojun Wang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Pengcheng Wang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Fan Liang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Zhengxiu Ye
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Jianying Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Chao Shen
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Liuling Pei
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Feng Wang
- Nextomics Biosciences, Wuhan, 430000, Hubei, China
- Jiang Hu
- Nextomics Biosciences, Wuhan, 430000, Hubei, China
- Lili Tu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Keith Lindsey
- Department of Biosciences, Durham University, Durham, DH1 3LE, UK
- Daohua He
- College of Agronomy, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Xianlong Zhang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
|
16
|
Nellore A, Collado-Torres L, Jaffe AE, Alquicira-Hernández J, Wilks C, Pritt J, Morton J, Leek JT, Langmead B. Rail-RNA: scalable analysis of RNA-seq splicing and coverage. Bioinformatics 2017; 33:4033-4040. [PMID: 27592709 PMCID: PMC5860083 DOI: 10.1093/bioinformatics/btw575] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 03/14/2016] [Revised: 06/29/2016] [Accepted: 08/26/2016] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it requires extra work to obtain analysis products that incorporate data from across samples. RESULTS: We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 667 RNA-seq samples from the GEUVADIS project on Amazon Web Services in under 16 h for US$0.91 per sample. Rail-RNA outputs alignments in SAM/BAM format, but it also outputs (i) base-level coverage bigWigs for each sample; (ii) coverage bigWigs encoding normalized mean and median coverages at each base across samples analyzed; and (iii) exon-exon splice junctions and indels (features) in columnar formats that juxtapose coverages in samples in which a given feature is found. Supplementary outputs are ready for use with downstream packages for reproducible statistical analysis. We use Rail-RNA to identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounding variables. AVAILABILITY AND IMPLEMENTATION: Rail-RNA is open-source software available at http://rail.bio. CONTACTS: anellore@gmail.com or langmea@cs.jhu.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Abhinav Nellore
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Leonardo Collado-Torres
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- Andrew E Jaffe
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- José Alquicira-Hernández
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Undergraduate Program on Genomic Sciences, National Autonomous University of Mexico, Mexico City, D.F., Mexico
- Christopher Wilks
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Jacob Pritt
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- James Morton
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Jeffrey T Leek
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Ben Langmead
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
|
17
|
Evolutionary relationships among protein lysine deacetylases of parasites causing neglected diseases. INFECTION GENETICS AND EVOLUTION 2017; 53:175-188. [DOI: 10.1016/j.meegid.2017.05.011] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 08/02/2016] [Revised: 05/10/2017] [Accepted: 05/12/2017] [Indexed: 12/20/2022]
|
18
|
Ding L, Rath E, Bai Y. Comparison of Alternative Splicing Junction Detection Tools Using RNA-Seq Data. Curr Genomics 2017; 18:268-277. [PMID: 28659722 PMCID: PMC5476949 DOI: 10.2174/1389202918666170215125048] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/09/2016] [Revised: 11/28/2016] [Accepted: 12/01/2016] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND: Alternative splicing (AS) is a posttranscriptional process that produces different transcripts from the same gene and is important to produce diverse protein products in response to environmental stimuli. AS occurs at specific sites on the mRNA sequence, some of which have been defined. Multiple bioinformatics tools have been developed to detect AS from experimental data. OBJECTIVES: The goal of this review is to help researchers use specific tools to aid their research and to develop new AS detection tools based on these previously established tools. METHOD: We selected 15 AS detection tools that were recently published; we classified and delineated them on several aspects. Also, a performance comparison of these tools with the same starting input was conducted. RESULT: We reviewed the following categorized features of the tools: publication information, working principles, generic and distinct workflows, running platform, input data requirement, sequencing depth dependency, reads mapped to multiple locations, isoform annotation basis, precise detected AS types, and performance benchmarks. CONCLUSION: Through comparisons of these tools, we provide a panorama of the advantages and shortcomings of each tool and their scopes of application.
Affiliation(s)
- Yongsheng Bai
- Department of Biology; The Center for Genomic Advocacy, Indiana State University, Terre Haute, IN, USA
|
19
|
Simulation-based comprehensive benchmarking of RNA-seq aligners. Nat Methods 2016; 14:135-139. [PMID: 27941783 DOI: 10.1038/nmeth.4106] [Citation(s) in RCA: 163] [Impact Index Per Article: 20.4] [Received: 04/18/2016] [Accepted: 11/15/2016] [Indexed: 01/27/2023]
Abstract
Alignment is the first step in most RNA-seq analysis pipelines, and the accuracy of downstream analyses depends heavily on it. Unlike most steps in the pipeline, alignment is particularly amenable to benchmarking with simulated data. We performed a comprehensive benchmarking of 14 common splice-aware aligners for base, read, and exon junction-level accuracy and compared default with optimized parameters. We found that performance varied by genome complexity, and accuracy and popularity were poorly correlated. The most widely cited tool underperforms for most metrics, particularly when using default settings.
|
20
|
mRNA changes in nucleus accumbens related to methamphetamine addiction in mice. Sci Rep 2016; 6:36993. [PMID: 27869204 PMCID: PMC5116666 DOI: 10.1038/srep36993] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 05/25/2016] [Accepted: 10/21/2016] [Indexed: 11/12/2022] Open
Abstract
Methamphetamine (METH) is a highly addictive psychostimulant that elicits aberrant changes in the expression of microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) in the nucleus accumbens of mice, indicating a potential role of METH in post-transcriptional regulation. To decipher the potential consequences of this post-transcriptional regulation in response to METH, we performed strand-specific RNA sequencing (ssRNA-Seq) to identify alterations in mRNA expression and alternative splicing in the nucleus accumbens of mice following exposure to METH. METH-mediated changes in mRNAs were analyzed and correlated with previously reported changes in non-coding RNAs (miRNAs and lncRNAs) to determine the potential functions of the mRNA changes observed here and how non-coding RNAs are involved. A total of 2171 mRNAs were differentially expressed in response to METH, with functions involved in synaptic plasticity, mitochondrial energy metabolism and immune response. Of these mRNAs, 309 and 589 are potential targets of miRNAs and lncRNAs, respectively. In addition, METH treatment decreases mRNA alternative splicing, and there are 818 METH-specific events not observed in saline-treated mice. Our results suggest that METH-mediated addiction could be attributed to changes in miRNAs and lncRNAs and, consequently, to changes in mRNA alternative splicing and expression. In conclusion, our study reported a methamphetamine-modified nucleus accumbens transcriptome and provided non-coding RNA-mRNA interaction networks possibly involved in METH addiction.
|
21
|
Böhmdorfer G, Sethuraman S, Rowley MJ, Krzyszton M, Rothi MH, Bouzit L, Wierzbicki AT. Long non-coding RNA produced by RNA polymerase V determines boundaries of heterochromatin. eLife 2016; 5. [PMID: 27779094 PMCID: PMC5079748 DOI: 10.7554/elife.19092] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 06/24/2016] [Accepted: 10/06/2016] [Indexed: 01/10/2023] Open
Abstract
RNA-mediated transcriptional gene silencing is a conserved process where small RNAs target transposons and other sequences for repression by establishing chromatin modifications. A central element of this process are long non-coding RNAs (lncRNA), which in Arabidopsis thaliana are produced by a specialized RNA polymerase known as Pol V. Here we show that non-coding transcription by Pol V is controlled by preexisting chromatin modifications located within the transcribed regions. Most Pol V transcripts are associated with AGO4 but are not sliced by AGO4. Pol V-dependent DNA methylation is established on both strands of DNA and is tightly restricted to Pol V-transcribed regions. This indicates that chromatin modifications are established in close proximity to Pol V. Finally, Pol V transcription is preferentially enriched on edges of silenced transposable elements, where Pol V transcribes into TEs. We propose that Pol V may play an important role in the determination of heterochromatin boundaries. DOI:http://dx.doi.org/10.7554/eLife.19092.001
Affiliation(s)
- Gudrun Böhmdorfer
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
- Shriya Sethuraman
- Bioinformatics Graduate Program, University of Michigan, Ann Arbor, United States
- M Jordan Rowley
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
- Michal Krzyszton
- Faculty of Biology, Institute of Genetics and Biotechnology, University of Warsaw, Warsaw, Poland
- M Hafiz Rothi
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
- Lilia Bouzit
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
- Andrzej T Wierzbicki
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
|
22
|
Genome-wide analysis of alternative splicing during human heart development. Sci Rep 2016; 6:35520. [PMID: 27752099 PMCID: PMC5067579 DOI: 10.1038/srep35520] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/02/2015] [Accepted: 09/27/2016] [Indexed: 12/16/2022] Open
Abstract
Alternative splicing (AS) drives determinative changes during mouse heart development. Recent high-throughput technological advancements have facilitated genome-wide analysis of AS, but such an analysis of the transition from the human foetal heart to the adult stage has not been reported. Here, we present a high-resolution global analysis of AS transitions between human foetal and adult hearts. RNA-sequencing data showed that extensive AS transitions occurred between human foetal and adult hearts, and that AS events occurred more frequently in protein-coding genes than in long non-coding RNA (lncRNA). A significant difference in AS patterns was found between foetal and adult hearts. The predicted difference in AS events was further confirmed using quantitative reverse transcription-polymerase chain reaction analysis of human heart samples. Functional foetal-specific AS event analysis showed enrichment associated with cell proliferation-related pathways including cell cycle, whereas adult-specific AS events were associated with protein synthesis. Furthermore, 42.6% of foetal-specific AS events showed significant changes in gene expression levels between foetal and adult hearts. Genes exhibiting both foetal-specific AS and differential expression were highly enriched in cell cycle-associated functions. In conclusion, we provided a genome-wide profiling of AS transitions between foetal and adult hearts and proposed that AS transitions and differential gene expression may play determinative roles in human heart development.
|
23
|
Hao Y, Feng Y, Yang P, Cui Y, Liu J, Yang C, Gu X. Transcriptome analysis reveals that constant heat stress modifies the metabolism and structure of the porcine longissimus dorsi skeletal muscle. Mol Genet Genomics 2016; 291:2101-2115. [PMID: 27561287 DOI: 10.1007/s00438-016-1242-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 01/28/2016] [Accepted: 08/18/2016] [Indexed: 12/31/2022]
Abstract
Exposure to high ambient temperatures is detrimental to pig rearing and porcine meat quality. Deep molecular sequencing allows for genomic characterization of porcine skeletal muscles and helps understand how the genomic landscape may impact meat quality. To this end, we performed mRNA-seq to molecularly dissect the impact of heat stress on porcine skeletal muscles, longissimus dorsi. Sixteen castrated, male DLY pigs [which are crossbreeds between Duroc (D) boars and Landrace (L) × Yorkshire (Y) sows, 79.0 ± 1.5 kg BW] were evenly split into two groups that were subjected to either control (CON) (22 °C; 55 % humidity) or constant heat stress (H30; 30 °C; 55 % humidity) conditions for 21 days. Seventy-eight genes were found to be differentially expressed, of which 37 were up-regulated and 41 were down-regulated owing to constant heat stress. We predicted 5247 unknown genes and 6108 novel transcribed units attributed to alternative splicing (AS) events in the skeletal muscle. Furthermore, 30,761 and 31,360 AS events were observed in the CON and H30 RNA-seq libraries, respectively. The differentially expressed genes in the porcine skeletal muscles were involved in glycolysis, lactate metabolism, lipid metabolism, cellular defense, and stress responses. Additionally, the expression levels of these genes were associated with variations in meat quality between the CON and H30 groups, indicating that heat stress modulated genes crucial to skeletal muscle development and metabolism. Our transcriptomic analysis provides valuable information for understanding the molecular mechanisms governing porcine skeletal muscle development. Such insights may lead to innovative strategies to improve meat quality of pigs under heat stress.
Affiliation(s)
- Yue Hao
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
- Yuejin Feng
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
- Peige Yang
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
- Yanjun Cui
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
- Jiru Liu
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China; College of Veterinary and Animal Science, Shenyang Agricultural University, Shenyang, 110866, China
- Chunhe Yang
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
- Xianhong Gu
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
|
24
|
Wang J, Li X, Wang L, Li J, Zhao Y, Bou G, Li Y, Jiao G, Shen X, Wei R, Liu S, Xie B, Lei L, Li W, Zhou Q, Liu Z. A novel long intergenic noncoding RNA indispensable for the cleavage of mouse two-cell embryos. EMBO Rep 2016; 17:1452-1470. [PMID: 27496889 DOI: 10.15252/embr.201642051] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 01/19/2016] [Accepted: 07/07/2016] [Indexed: 01/30/2023] Open
Abstract
Endogenous retroviruses (ERVs) are transcriptionally active in cleavage stage embryos, yet their functions are unknown. ERV sequences are present in the majority of long intergenic noncoding RNAs (lincRNAs) in mouse and humans, playing key roles in many cellular processes and diseases. Here, we identify LincGET as a nuclear lincRNA that is GLN-, MERVL-, and ERVK-associated and essential for mouse embryonic development beyond the two-cell stage. LincGET is expressed in late two- to four-cell mouse embryos. Its depletion leads to developmental arrest at the late G2 phase of the two-cell stage and to MAPK signaling pathway inhibition. LincGET forms an RNA-protein complex with hnRNP U, FUBP1, and ILF2, promoting the cis-regulatory activity of long terminal repeats (LTRs) in GLN, MERVL, and ERVK (GLKLTRs), and inhibiting RNA alternative splicing, partially by downregulating hnRNP U, FUBP1, and ILF2 protein levels. Hnrnpu or Ilf2 mRNA injection at the pronuclear stage also decreases the preimplantation developmental rate, and Fubp1 mRNA injection at the pronuclear stage causes a block at the two-cell stage. Thus, as the first functional ERV-associated lincRNA, LincGET provides clues for ERV functions in cleavage stage embryonic development.
Affiliation(s)
- Jiaqiang Wang
- College of Life Science, Northeast Agricultural University, Harbin, China; State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Xin Li
- College of Life Science, Northeast Agricultural University, Harbin, China; State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Leyun Wang
- College of Life Science, Northeast Agricultural University, Harbin, China; State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Jingyu Li
- College of Life Science, Northeast Agricultural University, Harbin, China
| | - Yanhua Zhao
- College of Life Science, Northeast Agricultural University, Harbin, China
| | - Gerelchimeg Bou
- College of Life Science, Northeast Agricultural University, Harbin, China
| | - Yufei Li
- State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Guanyi Jiao
- State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Xinghui Shen
- Department of Histology and Embryology, Harbin Medical University, Harbin, China
| | - Renyue Wei
- College of Life Science, Northeast Agricultural University, Harbin, China
| | - Shichao Liu
- College of Life Science, Northeast Agricultural University, Harbin, China
| | - Bingteng Xie
- College of Life Science, Northeast Agricultural University, Harbin, China
| | - Lei Lei
- Department of Histology and Embryology, Harbin Medical University, Harbin, China
| | - Wei Li
- State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Qi Zhou
- State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Zhonghua Liu
- College of Life Science, Northeast Agricultural University, Harbin, China
| |
|
25
|
Wang XG, Ju ZH, Hou MH, Jiang Q, Yang CH, Zhang Y, Sun Y, Li RL, Wang CF, Zhong JF, Huang JM. Deciphering Transcriptome and Complex Alternative Splicing Transcripts in Mammary Gland Tissues from Cows Naturally Infected with Staphylococcus aureus Mastitis. PLoS One 2016; 11:e0159719. [PMID: 27459697 PMCID: PMC4961362 DOI: 10.1371/journal.pone.0159719] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 02/22/2016] [Accepted: 07/06/2016] [Indexed: 11/20/2022] Open
Abstract
Alternative splicing (AS) contributes to the complexity of the mammalian proteome and plays an important role in diseases, including infectious diseases. The differential AS patterns of transcript sequences between healthy (HS3A) cows and mastitic (HS8A) cows naturally infected by Staphylococcus aureus were compared to understand the molecular mechanisms underlying mastitis resistance and susceptibility. In this study, using the Illumina paired-end RNA sequencing method, 1352 differentially expressed genes (DEGs) with higher than twofold changes were found in the HS3A and HS8A mammary gland tissues. Gene ontology and KEGG pathway analyses revealed that the cytokine–cytokine receptor interaction pathway is the most significantly enriched pathway. Approximately 16k annotated unigenes were identified in each of the two libraries, based on the bovine Bos taurus UMD3.1 sequence assembly and search. A total of 52.62% and 51.24% of the annotated unigenes in the two libraries were alternatively spliced in terms of exon skipping, intron retention, alternative 5′ splicing and alternative 3′ splicing. Additionally, 1,317 AS unigenes were HS3A-specific, whereas 1,093 AS unigenes were HS8A-specific. Some immune-related genes, such as ITGB6, MYD88, ADA, ACKR1, and TNFRSF1B, and their potential relationships with mastitis were highlighted. On chromosomes 2, 4, 6, 7, 10, 13, 14, 17, and 20, 3.66% (HS3A) and 5.4% (HS8A) novel transcripts, which harbor known quantitative trait loci associated with clinical mastitis, were identified. Many DEGs in the healthy and mastitic mammary glands are involved in immune, defense, and inflammation responses. These DEGs, which exhibit diverse and specific splicing patterns and events, can endow dairy cattle with potentially complex genetic resistance against mastitis.
Affiliation(s)
- Xiu Ge Wang
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Zhi Hua Ju
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Ming Hai Hou
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Qiang Jiang
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Chun Hong Yang
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Yan Zhang
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Yan Sun
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Rong Ling Li
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Chang Fa Wang
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Ji Feng Zhong
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| | - Jin Ming Huang
- Dairy Cattle Research Center, Shandong Academy of Agricultural Sciences, Jinan, Shandong, P.R. China
| |
|
26
|
Implication of Long noncoding RNAs in the endothelial cell response to hypoxia revealed by RNA-sequencing. Sci Rep 2016; 6:24141. [PMID: 27063004 PMCID: PMC4827084 DOI: 10.1038/srep24141] [Citation(s) in RCA: 113] [Impact Index Per Article: 14.1] [Received: 01/14/2016] [Accepted: 03/21/2016] [Indexed: 01/01/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) are non-protein-coding RNAs that regulate gene expression. Although a relevant role in hypoxic endothelium has been shown for some lncRNAs, the regulation and function of lncRNAs in vascular physiopathology are still largely unknown. Transcriptomic changes induced by endothelial cell exposure to hypoxia were investigated using next-generation sequencing techniques. Paired-end sequencing of polyadenylated RNA derived from human umbilical vein endothelial cells (HUVECs) exposed to 1% O2 or normoxia was performed. Bioinformatics analysis identified ≈2000 differentially expressed genes, including 122 lncRNAs. Extensive validation was performed by both microarray and qPCR. Among the validated lncRNAs, H19, MIR210HG, MEG9, MALAT1 and MIR22HG were also induced in a mouse model of hindlimb ischemia. To test the functional relevance of lncRNAs in endothelial cells, knockdown of H19 expression was performed. H19 inhibition decreased HUVEC growth and induced accumulation of the cells in the G1 phase of the cell cycle; accordingly, p21 (CDKN1A) expression was increased. Additionally, H19 knockdown diminished the ability of HUVECs to form capillary-like structures when plated on Matrigel. In conclusion, a high-confidence signature of lncRNAs modulated by hypoxia in HUVECs was identified and a significant impact of H19 lncRNA was shown.
|
27
|
Involvement of Alternative Splicing in Barley Seed Germination. PLoS One 2016; 11:e0152824. [PMID: 27031341 PMCID: PMC4816419 DOI: 10.1371/journal.pone.0152824] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 02/24/2016] [Accepted: 03/18/2016] [Indexed: 11/19/2022] Open
Abstract
Seed germination activates many new biological processes, including DNA, membrane and mitochondrial repair, and requires active protein synthesis and a sufficient energy supply. Alternative splicing (AS) regulates many cellular processes, including cell differentiation and environmental adaptation. However, limited information is available on the regulation of seed germination at the post-transcriptional level. We have conducted RNA-sequencing experiments to dissect AS events in barley seed germination. We identified between 552 and 669 common AS transcripts in germinating barley embryos from four barley varieties (Hordeum vulgare L. Bass, Baudin, Harrington and Stirling). Alternative 3′ splicing (34%-45%), intron retention (32%-34%) and alternative 5′ splicing (16%-21%) were the three major AS events in germinating embryos. The AS transcripts were predominantly mapped onto ribosome, RNA transport machinery, spliceosome, plant hormone signal transduction, glycolysis, and sugar and carbon metabolism pathways. Transcripts of these genes were also very abundant in the early stage of seed germination. Correlation analysis of gene expression showed that AS hormone-responsive transcripts could also be co-expressed with genes responsible for protein biosynthesis and sugar metabolism. Our RNA-sequencing data revealed that AS could play important roles in barley seed germination.
|
28
|
Mapping and differential expression analysis from short-read RNA-Seq data in model organisms. QUANTITATIVE BIOLOGY 2016. [DOI: 10.1007/s40484-016-0060-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
29
|
Histological and transcriptome analyses of testes from Duroc and Meishan boars. Sci Rep 2016; 6:20758. [PMID: 26865000 PMCID: PMC4749976 DOI: 10.1038/srep20758] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 08/21/2015] [Accepted: 01/07/2016] [Indexed: 12/14/2022] Open
Abstract
Meishan boars are known for their early sexual maturity. However, they exhibit a significantly smaller testicular size and a reduced proportion of Sertoli cells and daily sperm production compared with Duroc boars. The testes of Duroc and Meishan boars at 20, 75 and 270 days of age were used for histological and transcriptome analyses. Haematoxylin-eosin staining was conducted to observe the histological structure of the testes in Duroc and Meishan boars at different ages. Although spermatogenesis occurred prior to 75 days in Meishan boars, the numbers of spermatogonia and Sertoli cells in Meishan boars were lower than those in Duroc boars at adulthood. The diameters of the seminiferous tubules differed significantly between the two breeds during the initiation of seminiferous tubule development. We obtained differentially expressed functional genes and analysed seven pathways involved in male sexual maturity and spermatogenesis using RNA-seq. We also detected four main alternative splicing events and many single nucleotide polymorphisms in the testes. Eight functionally important genes were validated by qPCR, and Neurotrophin 3 was subjected to quantification and cellular localization analysis. Our study provides the first transcriptome evidence for the differences in sexual function development between Meishan and Duroc boars.
|
30
|
Liang C, Cheng S, Zhang Y, Sun Y, Fernie AR, Kang K, Panagiotou G, Lo C, Lim BL. Transcriptomic, proteomic and metabolic changes in Arabidopsis thaliana leaves after the onset of illumination. BMC PLANT BIOLOGY 2016; 16:43. [PMID: 26865323 PMCID: PMC4750186 DOI: 10.1186/s12870-016-0726-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 07/06/2015] [Accepted: 01/28/2016] [Indexed: 05/19/2023]
Abstract
BACKGROUND Light plays an important role in plant growth and development. In this study, the impact of light on the physiology of 20-d-old Arabidopsis leaves was examined through transcriptomic, proteomic and metabolomic analysis. Since the energy-generating electron transport chains in chloroplasts and mitochondria are encoded by both nuclear and organellar genomes, sequencing total RNA after removal of ribosomal RNAs provides essential information on transcription of organellar genomes. The changes in the levels of ADP, ATP, NADP(+), NADPH and 41 metabolites upon illumination were also quantified. RESULTS Upon illumination, while the transcription of the genes encoded by the plastid genome did not change significantly, the transcription of nuclear genes encoding different functional complexes in the photosystem was differentially regulated, whereas members of the same complex were co-regulated with each other. The abundances of mRNAs and proteins encoded by all three genomes are, however, not always positively correlated. One such example is the negative correlation between mRNA and protein abundances of the photosystem components, which reflects the importance of post-transcriptional regulation in plant physiology. CONCLUSION This study provides systems-wide datasets which allow plant researchers to examine the changes in leaf transcriptomes, proteomes and key metabolites upon illumination and to determine whether there are any correlations between changes in transcript and protein abundances of a particular gene or pathway upon illumination. The integration of data on the organelles and the photosystems, Calvin-Benson cycle, carbohydrate metabolism, glycolysis, the tricarboxylic acid cycle and respiratory chain thereby provides a more complete picture of the changes in plant physiology upon illumination than has been attained to date.
Affiliation(s)
- Chao Liang
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.
| | - Shifeng Cheng
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.
| | - Youjun Zhang
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany.
| | - Yuzhe Sun
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.
| | - Alisdair R Fernie
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany.
| | - Kang Kang
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.
| | - Gianni Panagiotou
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.
| | - Clive Lo
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.
| | - Boon Leong Lim
- School of Biological Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.
- State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong, China.
| |
|
31
|
MicroRNA-222 regulates muscle alternative splicing through Rbm24 during differentiation of skeletal muscle cells. Cell Death Dis 2016; 7:e2086. [PMID: 26844700 PMCID: PMC4849150 DOI: 10.1038/cddis.2016.10] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 10/22/2015] [Revised: 12/15/2015] [Accepted: 01/03/2016] [Indexed: 02/01/2023]
Abstract
A number of microRNAs have been shown to regulate skeletal muscle development and differentiation. MicroRNA-222 is downregulated during myogenic differentiation and its overexpression leads to alteration of muscle differentiation process and specialized structures. By using RNA-induced silencing complex (RISC) pulldown followed by RNA sequencing, combined with in silico microRNA target prediction, we have identified two new targets of microRNA-222 involved in the regulation of myogenic differentiation, Ahnak and Rbm24. Specifically, the RNA-binding protein Rbm24 is a major regulator of muscle-specific alternative splicing and its downregulation by microRNA-222 results in defective exon inclusion impairing the production of muscle-specific isoforms of Coro6, Fxr1 and NACA transcripts. Reconstitution of normal levels of Rbm24 in cells overexpressing microRNA-222 rescues muscle-specific splicing. In conclusion, we have identified a new function of microRNA-222 leading to alteration of myogenic differentiation at the level of alternative splicing, and we provide evidence that this effect is mediated by Rbm24 protein.
|
32
|
Zhang Q, Zhang X, Pettolino F, Zhou G, Li C. Changes in cell wall polysaccharide composition, gene transcription and alternative splicing in germinating barley embryos. JOURNAL OF PLANT PHYSIOLOGY 2016; 191:127-139. [PMID: 26788957 DOI: 10.1016/j.jplph.2015.12.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/20/2015] [Revised: 12/17/2015] [Accepted: 12/17/2015] [Indexed: 06/05/2023]
Abstract
Barley (Hordeum vulgare L.) seed germination initiates many important biological processes such as DNA, membrane and mitochondrial repair. However, little is known about cell wall modifications in germinating embryos. We have investigated changes in cell wall polysaccharide composition, gene transcription and alternative splicing events in four barley varieties at 24 h and 48 h of germination. Cell wall components in germinating barley embryos changed rapidly, with increases in cellulose and (1,3)(1,4)-β-D-glucan (20-100%) within 24 h, but decreases in heteroxylan and arabinan (3-50%). There were also significant changes in the levels of type I arabinogalactans and heteromannans. Alternative splicing played very important roles in cell wall modifications. At least 22 cell wall transcripts were found to undergo alternative 3' splicing, alternative 5' splicing or intron retention. These genes encoded enzymes catalyzing the synthesis and degradation of cellulose, heteroxylan, (1,3)(1,4)-β-D-glucan and other cell wall polymers. Furthermore, transcriptional regulation also played very important roles in cell wall modifications. Transcript levels of primary wall cellulose synthase, heteroxylan-synthesizing and nucleotide sugar inter-conversion genes were very high in germinating embryos. At least 50 cell wall genes changed transcript levels significantly. Expression patterns of many cell wall genes coincided with changes in polysaccharide composition. Our data showed that cell wall polysaccharide metabolism was very active in germinating barley embryos and was regulated at both transcriptional and post-transcriptional levels.
Affiliation(s)
- Qisen Zhang
- Australian Export Grains Innovation Centre, 3 Baron-Hay Court, South Perth, WA 6155, Australia.
| | - Xiaoqi Zhang
- Western Barley Genetics Alliance, Murdoch University, 90 South Street, Murdoch, WA 6150 Australia.
| | | | - Gaofeng Zhou
- Department of Agriculture and Food Western Australia, 3 Baron-Hay Court, South Perth, WA 6155, Australia.
| | - Chengdao Li
- Australian Export Grains Innovation Centre, 3 Baron-Hay Court, South Perth, WA 6155, Australia; Western Barley Genetics Alliance, Murdoch University, 90 South Street, Murdoch, WA 6150 Australia; Department of Agriculture and Food Western Australia, 3 Baron-Hay Court, South Perth, WA 6155, Australia.
| |
|
33
|
Exploiting RNA-sequencing data from the porcine testes to identify the key genes involved in spermatogenesis in Large White pigs. Gene 2015; 573:303-9. [DOI: 10.1016/j.gene.2015.07.057] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/22/2015] [Revised: 07/07/2015] [Accepted: 07/16/2015] [Indexed: 11/19/2022]
|
34
|
RNA-seq transcriptome analysis of extensor digitorum longus and soleus muscles in large white pigs. Mol Genet Genomics 2015; 291:687-701. [DOI: 10.1007/s00438-015-1138-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 07/22/2015] [Accepted: 10/17/2015] [Indexed: 10/24/2022]
|
35
|
Thangam M, Gopal RK. CRCDA--Comprehensive resources for cancer NGS data analysis. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav092. [PMID: 26450948 PMCID: PMC4597977 DOI: 10.1093/database/bav092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/27/2015] [Accepted: 08/31/2015] [Indexed: 12/24/2022]
Abstract
Next generation sequencing (NGS) innovations have set a compelling landmark in life science and changed the direction of research in clinical oncology through their capacity to help diagnose and treat cancer. The aim of our portal, comprehensive resources for cancer NGS data analysis (CRCDA), is to provide a collection of NGS tools and pipelines under diverse classes, together with cancer pathways and databases and literature information from PubMed. The literature data were restricted to the 18 most common cancer types, such as breast cancer, colon cancer and other cancers that occur in the worldwide population. For convenience, NGS-cancer tools have been categorized into cancer genomics, cancer transcriptomics, cancer epigenomics, quality control and visualization. Pipelines for variant detection, quality control and data analysis are listed to provide out-of-the-box solutions for NGS data analysis, which may help researchers overcome challenges in selecting and configuring individual tools for analysing exome, whole genome and transcriptome data. An extensive search page was developed that can be queried by (i) type of data [literature, gene data and sequence read archive (SRA) data] and (ii) type of cancer (selected based on global incidence and accessibility of data). For each category of analysis, a variety of tools is available, and the biggest challenge lies in finding and using the right tool for the right application. The objective of this work is to collect the tools available in each category from various places and to arrange the tools and other data in a simple and user-friendly manner so that biologists and oncologists can find information more easily. To the best of our knowledge, we have collected and presented a comprehensive package of most of the resources available for NGS data analysis in cancer. Given these factors, we believe that this website will be a useful resource for the NGS research community working on cancer.
Database URL: http://bioinfo.au-kbc.org.in/ngs/ngshome.html.
Affiliation(s)
- Manonanthini Thangam
- AU-KBC Research Centre, MIT Campus of Anna University, Chromepet, Chennai, India
| | - Ramesh Kumar Gopal
- AU-KBC Research Centre, MIT Campus of Anna University, Chromepet, Chennai, India
| |
|
36
|
Yang C, Wu PY, Tong L, Phan JH, Wang MD. The impact of RNA-seq aligners on gene expression estimation. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2015; 2015:462-471. [PMID: 27583310 DOI: 10.1145/2808719.2808767] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 02/01/2023]
Abstract
While numerous RNA-seq data analysis pipelines are available, research has shown that the choice of pipeline influences the results of differentially expressed gene detection and gene expression estimation. Gene expression estimation is a key step in RNA-seq data analysis, since the accuracy of gene expression estimates profoundly affects the subsequent analysis. Generally, gene expression estimation involves sequence alignment and quantification, and accurate gene expression estimation requires accurate alignment. However, the impact of aligners on gene expression estimation remains unclear. We address this need by constructing nine pipelines consisting of nine spliced aligners and one quantifier. We then use simulated data to investigate the impact of aligners on gene expression estimation. To evaluate alignment, we introduce three alignment performance metrics, (1) the percentage of reads aligned, (2) the percentage of reads aligned with zero mismatch (ZeroMismatchPercentage), and (3) the percentage of reads aligned with at most one mismatch (ZeroOneMismatchPercentage). We then evaluate the impact of alignment performance on gene expression estimation using three metrics, (1) gene detection accuracy, (2) the number of genes falsely quantified (FalseExpNum), and (3) the number of genes with falsely estimated fold changes (FalseFcNum). We found that among various pipelines, FalseExpNum and FalseFcNum are correlated. Moreover, FalseExpNum is linearly correlated with the percentage of reads aligned and ZeroMismatchPercentage, and FalseFcNum is linearly correlated with ZeroMismatchPercentage. Because of this correlation, the percentage of reads aligned and ZeroMismatchPercentage may be used to assess the performance of gene expression estimation for all RNA-seq datasets.
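The three alignment performance metrics named in this abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name, the input format (one mismatch count per read, with `None` for an unaligned read) and the choice of total reads as the denominator for all three percentages are assumptions made purely for illustration, since the abstract does not state them.

```python
from typing import Dict, Optional, Sequence


def alignment_metrics(mismatches: Sequence[Optional[int]]) -> Dict[str, float]:
    """Compute three alignment metrics from per-read mismatch counts.

    `mismatches` holds one entry per read: the number of mismatches in its
    alignment, or None if the read was not aligned (hypothetical input format).
    All percentages are relative to the total number of reads (an assumption).
    """
    total = len(mismatches)
    if total == 0:
        raise ValueError("no reads supplied")

    # Keep only reads that were aligned at all.
    aligned = [m for m in mismatches if m is not None]
    zero = sum(1 for m in aligned if m == 0)
    zero_or_one = sum(1 for m in aligned if m <= 1)

    return {
        "PercentAligned": 100.0 * len(aligned) / total,
        "ZeroMismatchPercentage": 100.0 * zero / total,
        "ZeroOneMismatchPercentage": 100.0 * zero_or_one / total,
    }
```

In practice the per-read mismatch counts would come from the aligner's output (for example an edit-distance tag in a SAM/BAM record); how each of the nine aligners reports them is outside what the abstract describes.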
Affiliation(s)
- Cheng Yang
- Department of Biomedical Engineering, Georgia Institute of Technology, Emory University, and Peking University, Atlanta, GA 30332, USA
| | - Po-Yen Wu
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
| | - Li Tong
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
| | - John H Phan
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
| | - May D Wang
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
| |
|
37
|
Li L, Liang XF, He S, Sun J, Wen ZY, He YH, Cai WJ, Wang YP, Tao YX. Transcriptome analysis of grass carp (Ctenopharyngodon idella) fed with animal and plant diets. Gene 2015; 574:371-9. [PMID: 26283148 DOI: 10.1016/j.gene.2015.08.030] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/27/2015] [Revised: 08/09/2015] [Accepted: 08/13/2015] [Indexed: 10/23/2022]
Abstract
Numerous studies have focused on replacing fish meal with alternative protein sources. However, little is currently known about the molecular mechanisms by which fish utilize diets with different protein sources. Grass carp is a typical herbivorous fish. To elucidate the relationship between gene expression and utilization of animal and plant diets, transcriptome sequencing was performed in grass carp fed with chironomid larvae and duckweed. Grass carp fed with duckweed had a significantly higher relative gut length than those fed with chironomid larvae. A total of 4435 differentially expressed genes were identified between grass carp fed with chironomid larvae and duckweed in brain, liver and gut; these genes were involved in cell proliferation and differentiation, appetite control, circadian rhythm, digestion and metabolism pathways. These pathways might play important roles in the utilization of diets with different protein sources in grass carp. The findings could provide new insight into the replacement of fish meal in artificial diets.
Affiliation(s)
- Ling Li
- College of Fisheries, Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, Wuhan, Hubei 430070, China
| | - Xu-Fang Liang
- College of Fisheries, Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, Wuhan, Hubei 430070, China.
| | - Shan He
- College of Fisheries, Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, Wuhan, Hubei 430070, China
| | - Jian Sun
- College of Fisheries, Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, Wuhan, Hubei 430070, China
| | - Zheng-Yong Wen
- College of Fisheries, Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, Wuhan, Hubei 430070, China
| | - Yu-Hui He
- College of Fisheries, Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, Wuhan, Hubei 430070, China
| | - Wen-Jing Cai
- College of Fisheries, Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, Wuhan, Hubei 430070, China
| | - Ya-Ping Wang
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei 430070, China
| | - Ya-Xiong Tao
- Department of Anatomy, Physiology, and Pharmacology, College of Veterinary Medicine, Auburn University, Auburn, AL 36849-5519, United States
| |
|
38
|
Okada S, Sakurai M, Ueda H, Suzuki T. Biochemical and Transcriptome-Wide Identification of A-to-I RNA Editing Sites by ICE-Seq. Methods Enzymol 2015; 560:331-53. [PMID: 26253977 DOI: 10.1016/bs.mie.2015.03.014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
Inosine (I) is a modified adenosine (A) in RNA. In Metazoa, I is generated by hydrolytic deamination of A, catalyzed by adenosine deaminase acting on RNA (ADAR) in a process called A-to-I RNA editing. A-to-I RNA editing affects various biological processes by modulating gene expression. In addition, dysregulation of A-to-I RNA editing results in pathological consequences. I on RNA strands is converted to guanosine (G) during cDNA synthesis by reverse transcription. Thus, the conventional method used to identify A-to-I RNA editing sites compares cDNA sequences with their corresponding genomic sequences. Combined with deep sequencing, this method has been applied to transcriptome-wide screening of A-to-I RNA editing sites. This approach, however, produces a large number of false positives, mainly owing to mapping errors. To address this issue, we developed a biochemical method called inosine chemical erasing (ICE) to reliably identify genuine A-to-I RNA editing sites. In addition, we applied the ICE method combined with RNA-seq, referred to as ICE-seq, to identify transcriptome-wide A-to-I RNA editing sites. In this chapter, we describe the detailed protocol for ICE-seq, which can be applied to various sources and taxa.
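The conventional comparison described in this abstract (an A in the genome read as G in the cDNA) can be sketched in a few lines of Python. This is only an illustration of that baseline idea, not the ICE-seq protocol itself: the function name is hypothetical, and the sketch assumes the two sequences are already aligned, equal in length, gap-free and on the same strand.

```python
from typing import List


def candidate_a_to_i_sites(genomic: str, cdna: str) -> List[int]:
    """Return 0-based positions where the genome has A and the cDNA has G.

    Assumes `genomic` and `cdna` are pre-aligned, same-strand, gap-free
    sequences of equal length (a simplification for illustration).
    """
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")

    genomic = genomic.upper()
    cdna = cdna.upper()

    # An A-to-G mismatch is only a candidate editing site: a SNP, a
    # sequencing error or a mapping error produces the same signature.
    return [
        i
        for i, (g_base, c_base) in enumerate(zip(genomic, cdna))
        if g_base == "A" and c_base == "G"
    ]
```

Because every A-to-G mismatch is reported, a filter of this kind inherits the false positives the abstract attributes to mapping errors, which is the problem the biochemical ICE step is described as addressing.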
Collapse
Affiliation(s)
- Shunpei Okada
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Masayuki Sakurai
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Hiroki Ueda
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Tsutomu Suzuki
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Bunkyo-ku, Tokyo, Japan.
|
39
|
Li Z, Long Y, Zhong L, Song G, Zhang X, Yuan L, Cui Z, Dai H. RNA sequencing provides insights into the toxicogenomic response of ZF4 cells to methyl methanesulfonate. J Appl Toxicol 2015; 36:94-104. [DOI: 10.1002/jat.3147] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/19/2014] [Revised: 02/10/2015] [Accepted: 02/10/2015] [Indexed: 12/16/2022]
Affiliation(s)
- Zhouquan Li
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
- University of Chinese Academy of Sciences; Yuquan Road 19A Beijing 100039 People's Republic of China
| | - Yong Long
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
| | - Liqiao Zhong
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
- University of Chinese Academy of Sciences; Yuquan Road 19A Beijing 100039 People's Republic of China
| | - Guili Song
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
| | - Xiaohua Zhang
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
| | - Li Yuan
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
| | - Zongbin Cui
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
| | - Heping Dai
- State Key Laboratory of Freshwater Ecology and Biotechnology; Institute of Hydrobiology, Chinese Academy of Sciences; 7 Southern East Lake Road Wuhan 430072 People's Republic of China
|
40
|
Bahrami-Samani E, Vo DT, de Araujo PR, Vogel C, Smith AD, Penalva LOF, Uren PJ. Computational challenges, tools, and resources for analyzing co- and post-transcriptional events in high throughput. Wiley Interdiscip Rev RNA 2015; 6:291-310. [PMID: 25515586 PMCID: PMC4397117 DOI: 10.1002/wrna.1274] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/16/2014] [Revised: 10/24/2014] [Accepted: 10/29/2014] [Indexed: 11/10/2022]
Abstract
Co- and post-transcriptional regulation of gene expression is complex and multifaceted, spanning the complete RNA lifecycle from genesis to decay. High-throughput profiling of the constituent events and processes is achieved through a range of technologies that continue to expand and evolve. Fully leveraging the resulting data is nontrivial, and requires the use of computational methods and tools carefully crafted for specific data sources and often intended to probe particular biological processes. Drawing upon databases of information pre-compiled by other researchers can further elevate analyses. Within this review, we describe the major co- and post-transcriptional events in the RNA lifecycle that are amenable to high-throughput profiling. We place specific emphasis on the analysis of the resulting data, in particular the computational tools and resources available, as well as looking toward future challenges that remain to be addressed.
Affiliation(s)
- Emad Bahrami-Samani
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
| | - Dat T. Vo
- Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
| | - Patricia Rosa de Araujo
- Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
| | - Christine Vogel
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY
| | - Andrew D. Smith
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
| | - Luiz O. F. Penalva
- Children’s Cancer Research Institute and Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, TX
| | - Philip J. Uren
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA
|
41
|
Suzuki T, Ueda H, Okada S, Sakurai M. Transcriptome-wide identification of adenosine-to-inosine editing using the ICE-seq method. Nat Protoc 2015; 10:715-32. [PMID: 25855956 DOI: 10.1038/nprot.2015.037] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Indexed: 01/30/2023]
Abstract
Inosine (I), a modified base found in the double-stranded regions of RNA in metazoans, has various roles in biological processes by modulating gene expression. Inosine is generated from adenosine (A) catalyzed by ADAR (adenosine deaminase acting on RNA) enzymes in a process called A-to-I RNA editing. As inosine is converted to guanosine (G) by reverse transcription, the editing sites can be identified by simply comparing cDNA sequences with the corresponding genomic sequence. One approach to screening I sites is by deep sequencing based on A-to-G conversion from genomic sequence to cDNA; however, this approach produces a high rate of false positives because it cannot efficiently eliminate G signals arising from inevitable mapping errors. To address this issue, we developed a biochemical method to identify inosines called inosine chemical erasing (ICE), which is based on cyanoethylation combined with reverse transcription. ICE was subsequently combined with deep sequencing (ICE-seq) for the reliable identification of transcriptome-wide A-to-I editing sites. Here we describe a protocol for the practical application of ICE-seq, which can be completed within 22 d, and which allows the accurate identification of transcriptome-wide A-to-I RNA editing sites.
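The logic of ICE described above can be sketched as a comparison of G fractions at a genomic A position between an untreated library and a cyanoethylated one: a genuine inosine gives a G signal that is largely lost after treatment, whereas a G arising from a mapping error or a SNP persists. This is a minimal illustration and not the authors' pipeline; the `SiteCounts` layout, the function name `is_erased_site` and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Read counts at one genomic A position (hypothetical layout)."""
    a_reads: int  # reads supporting A
    g_reads: int  # reads supporting G (inosine is read as G after reverse transcription)

    def g_fraction(self) -> float:
        total = self.a_reads + self.g_reads
        return self.g_reads / total if total else 0.0


def is_erased_site(untreated: SiteCounts, treated: SiteCounts,
                   min_g_fraction: float = 0.05, min_drop: float = 0.5) -> bool:
    """Return True when the G signal seen without treatment is largely lost after cyanoethylation.

    Both thresholds are illustrative assumptions, not values from the protocol.
    """
    before = untreated.g_fraction()
    after = treated.g_fraction()
    if before < min_g_fraction:
        return False  # no A-to-G signal to begin with
    # Relative loss of the G signal after treatment
    return (before - after) / before >= min_drop


# Candidate editing site: the G reads largely disappear after treatment
edited = is_erased_site(SiteCounts(60, 40), SiteCounts(95, 5))
# SNP or mapping artefact: the G reads persist after treatment
artefact = is_erased_site(SiteCounts(60, 40), SiteCounts(58, 42))
```

A real analysis would also need coverage filters and a statistical test rather than fixed cut-offs; the sketch only shows the direction of the comparison.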
Affiliation(s)
- Tsutomu Suzuki
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
| | - Hiroki Ueda
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
| | - Shunpei Okada
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
| | - Masayuki Sakurai
- Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Tokyo, Japan
|
42
|
He S, Liang XF, Li L, Sun J, Wen ZY, Cheng XY, Li AX, Cai WJ, He YH, Wang YP, Tao YX, Yuan XC. Transcriptome analysis of food habit transition from carnivory to herbivory in a typical vertebrate herbivore, grass carp Ctenopharyngodon idella. BMC Genomics 2015; 16:15. [PMID: 25608568 PMCID: PMC4307112 DOI: 10.1186/s12864-015-1217-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 10/13/2014] [Accepted: 01/02/2015] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Although feeding behavior and food habit are ecologically and economically important properties, little is known about the formation and evolution of herbivory. Grass carp (Ctenopharyngodon idella) is an ecologically appealing model of a vertebrate herbivore, widely cultivated around the world as an edible fish or as a biological control agent for aquatic weeds. Grass carp exhibit a food habit transition from carnivory to herbivory during development. However, little is currently known about the genes regulating this unique food habit transition and the formation of herbivory, or about how the fish achieve higher growth rates on plant materials, which have a relatively poor nutritional quality.
RESULTS: We showed that grass carp fed with duckweed (modeling fish after food habit transition) had a significantly higher relative gut length than fish before food habit transition or those fed with chironomid larvae (fish without transition). Using transcriptome sequencing, we identified 10,184 differentially expressed genes between grass carp before and after transition in brain, liver and gut. By eliminating genes potentially involved in development (via comparing fish with or without food habit transition), we identified changes in expression of genes involved in cell proliferation and differentiation, appetite control, circadian rhythm, and digestion and metabolism between fish before and after food habit transition. Up-regulation of GHRb, Egfr, Fgf, Fgfbp1, Insra, Irs2, Jak, STAT, PKC and PI3K expression in fish fed with duckweed, consistent with faster gut growth, could promote the food habit transition. Grass carp after food habit transition had an increased appetite signal in the brain. Altered expression of Per, Cry, Clock, Bmal2, Pdp, Dec and Fbxl3 might reset the circadian phase of fish after food habit transition. Expression of genes involved in digestion and metabolism was significantly different between fish before and after the transition.
CONCLUSIONS: We suggest that the food habit transition from carnivory to herbivory in grass carp might be due to enhanced gut growth, increased appetite, resetting of circadian phase and enhanced digestion and metabolism. We also found extensive alternative splicing and novel transcripts accompanying food habit transition. These differences together might account for the food habit transition and the formation of herbivory in grass carp.
Affiliation(s)
- Shan He
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Xu-Fang Liang
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Ling Li
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Jian Sun
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Zheng-Yong Wen
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Xiao-Yan Cheng
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Ai-Xuan Li
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Wen-Jing Cai
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Yu-Hui He
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
| | - Ya-Ping Wang
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, 430072, Wuhan, China.
| | - Ya-Xiong Tao
- Department of Anatomy, Physiology, and Pharmacology, College of Veterinary Medicine, Auburn University, Auburn, AL, 36849-5519, USA.
| | - Xiao-Chen Yuan
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, College of Fisheries, Huazhong Agricultural University, Hubei Collaborative Innovation Center for Freshwater Aquaculture, 430070, Wuhan, China.
|
43
|
Yang Y, Xiong J, Zhou Z, Huo F, Miao W, Ran C, Liu Y, Zhang J, Feng J, Wang M, Wang M, Wang L, Yao B. The genome of the myxosporean Thelohanellus kitauei shows adaptations to nutrient acquisition within its fish host. Genome Biol Evol 2014; 6:3182-98. [PMID: 25381665 PMCID: PMC4986447 DOI: 10.1093/gbe/evu247] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 12/26/2022] Open
Abstract
Members of Myxozoa, a parasitic metazoan taxon, have considerable detrimental effects on fish hosts and also have been associated with human food-borne illness. Little is known about their biology and metabolism. Analysis of the genome of Thelohanellus kitauei and comparative analysis with genomes of its two free-living cnidarian relatives revealed that T. kitauei has adapted to parasitism, as indicated by the streamlined metabolic repertoire and the tendency toward anabolism rather than catabolism. Thelohanellus kitauei mainly secretes proteases and protease inhibitors for nutrient digestion (parasite invasion), and depends on endocytosis (mainly low-density lipoprotein receptors-mediated type) and secondary carriers for nutrient absorption. Absence of both classic and complementary anaerobic pathways and gluconeogenesis, the lack of de novo synthesis and reduced activity in hydrolysis of fatty acids, amino acids, and nucleotides indicated that T. kitauei in this vertebrate host-parasite system has adapted to inhabit a physiological environment extremely rich in both oxygen and nutrients (especially glucose), which is consistent with its preferred parasitic site, that is, the host gut submucosa. Taking advantage of the genomic and transcriptomic information, 23 potential nutrition-related T. kitauei-specific chemotherapeutic targets were identified. This first genome sequence of a myxozoan will facilitate development of potential therapeutics for efficient control of myxozoan parasites and ultimately prevent myxozoan-induced fish-borne illnesses in humans.
Affiliation(s)
- Yalin Yang
- Key Laboratory for Feed Biotechnology of the Ministry of Agriculture, Feed Research Institute, Chinese Academy of Agricultural Sciences, Beijing, People's Republic of China
| | - Jie Xiong
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, People's Republic of China
| | - Zhigang Zhou
- Key Laboratory for Feed Biotechnology of the Ministry of Agriculture, Feed Research Institute, Chinese Academy of Agricultural Sciences, Beijing, People's Republic of China
| | - Fengmin Huo
- Key Laboratory for Feed Biotechnology of the Ministry of Agriculture, Feed Research Institute, Chinese Academy of Agricultural Sciences, Beijing, People's Republic of China
| | - Wei Miao
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, People's Republic of China
| | - Chao Ran
- Key Laboratory for Feed Biotechnology of the Ministry of Agriculture, Feed Research Institute, Chinese Academy of Agricultural Sciences, Beijing, People's Republic of China
| | - Yuchun Liu
- Key Laboratory for Feed Biotechnology of the Ministry of Agriculture, Feed Research Institute, Chinese Academy of Agricultural Sciences, Beijing, People's Republic of China
| | - Jinyong Zhang
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, People's Republic of China
| | - Jinmei Feng
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, People's Republic of China
| | - Meng Wang
- Tianjin Biochip Corporation, Tianjin, People's Republic of China
| | - Min Wang
- TEDA School of Biological Sciences and Biotechnology, Nankai University, Tianjin, People's Republic of China
| | - Lei Wang
- TEDA School of Biological Sciences and Biotechnology, Nankai University, Tianjin, People's Republic of China
| | - Bin Yao
- Key Laboratory for Feed Biotechnology of the Ministry of Agriculture, Feed Research Institute, Chinese Academy of Agricultural Sciences, Beijing, People's Republic of China
|
44
|
Inamdar H, Datta A, Manjari K S, Joshi R. Rule-based integration of RNA-Seq analyses tools for identification of novel transcripts. J Bioinform Comput Biol 2014; 12:1450026. [DOI: 10.1142/s0219720014500267] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
Recent evidence suggests that more of the genome is transcribed than was anticipated, giving rise to a large number of unknown or novel transcripts. Identification of novel transcripts can provide key insights into important cellular functions as well as the molecular mechanisms underlying complex diseases such as cancer. RNA-Seq has emerged as a powerful tool to detect novel transcripts that previous profiling techniques failed to identify. A number of tools are available for identifying novel transcripts at different levels. Read mappers such as TopHat, MapSplice, and SOAPsplice predict novel junctions, which are indicators of novel transcripts. Cufflinks assembles novel transcripts based on alignment information, and Oases performs de novo construction of transcripts. A common limitation of all these tools is the prediction of a sizable number of spurious or false positive (FP) novel transcripts. An approach is proposed that integrates information from all of the above sources and simultaneously scrutinizes FPs to correctly identify authentic novel transcripts of high confidence.
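The abstract does not spell out the integration rules, so the following is only a sketch of one plausible rule of this kind: keep a junction as a novel candidate when it is absent from the annotation and is reported by at least a minimum number of independent tools. The function name, the `Junction` tuple layout and the `min_tools` threshold are assumptions; the tool names are used purely as dictionary labels and no tool is actually run.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# (chromosome, donor position, acceptor position); a hypothetical representation
Junction = Tuple[str, int, int]


def consensus_novel_junctions(calls: Dict[str, Set[Junction]],
                              annotated: Set[Junction],
                              min_tools: int = 2) -> Set[Junction]:
    """Keep unannotated junctions that are reported by at least `min_tools` tools."""
    support = Counter()
    for junctions in calls.values():
        support.update(junctions)  # sets guarantee one vote per tool per junction
    return {j for j, n in support.items() if n >= min_tools and j not in annotated}


known = ("chr1", 100, 500)
calls = {
    "TopHat": {known, ("chr1", 800, 1200)},
    "MapSplice": {known, ("chr2", 50, 90)},
    "SOAPsplice": {known, ("chr1", 800, 1200), ("chr3", 10, 70)},
}
novel = consensus_novel_junctions(calls, annotated={known})
```

Raising `min_tools` trades sensitivity for fewer false positives, which is the balance the abstract says the integration is meant to address.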
Affiliation(s)
- Harshal Inamdar
- Bioinformatics Group, Centre for Development of Advanced Computing, Pune University Campus, Ganeshkhind, Pune 411007, India
| | - Avik Datta
- Bioinformatics Group, Centre for Development of Advanced Computing, Pune University Campus, Ganeshkhind, Pune 411007, India
| | - Sunitha Manjari K
- Bioinformatics Group, Centre for Development of Advanced Computing, Pune University Campus, Ganeshkhind, Pune 411007, India
| | - Rajendra Joshi
- Bioinformatics Group, Centre for Development of Advanced Computing, Pune University Campus, Ganeshkhind, Pune 411007, India
|
45
|
Wu PY, Chandramohan R, Phan JH, Mahle WT, Gaynor JW, Maher KO, Wang MD. Cardiovascular transcriptomics and epigenomics using next-generation sequencing: challenges, progress, and opportunities. Circ Cardiovasc Genet 2014; 7:701-10. [PMID: 25518043 PMCID: PMC4983435 DOI: 10.1161/circgenetics.113.000129] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 02/03/2023]
Affiliation(s)
- Po-Yen Wu
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - Raghu Chandramohan
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - John H Phan
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - William T Mahle
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - J William Gaynor
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - Kevin O Maher
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - May D Wang
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.).
|
46
|
Genome-wide investigation and transcriptome analysis of the WRKY gene family in Gossypium. Mol Genet Genomics 2014; 290:151-71. [DOI: 10.1007/s00438-014-0904-7] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 12/16/2013] [Accepted: 08/19/2014] [Indexed: 10/24/2022]
|
47
|
Parada GE, Munita R, Cerda CA, Gysling K. A comprehensive survey of non-canonical splice sites in the human transcriptome. Nucleic Acids Res 2014; 42:10564-78. [PMID: 25123659 PMCID: PMC4176328 DOI: 10.1093/nar/gku744] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Indexed: 01/01/2023] Open
Abstract
We uncovered the diversity of non-canonical splice sites in the human transcriptome using deep transcriptome profiling. We mapped a total of 3.7 billion human RNA-seq reads and developed a set of stringent filters to avoid false non-canonical splice site detections. We identified 184 splice sites with non-canonical dinucleotides and U2/U12-like consensus sequences. We selected 10 of the U2/U12-like non-canonical splice site events identified here and successfully validated 9 of them via reverse transcriptase-polymerase chain reaction and Sanger sequencing. Analyses of the 184 U2/U12-like non-canonical splice sites indicate that 51% of them are not annotated in GENCODE. In addition, 28% of them are conserved in mouse and 76% are involved in alternative splicing events, some of them with tissue-specific alternative splicing patterns. Interestingly, our analysis identified some U2/U12-like non-canonical splice sites that are converted into canonical splice sites by RNA A-to-I editing. Moreover, the U2/U12-like non-canonical splice sites have a differential distribution of splicing regulatory sequences, which may contribute to their recognition and regulation. Our analysis provides a high-confidence group of U2/U12-like non-canonical splice sites, which exhibit distinctive features among the total human splice sites.
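To make the dinucleotide classification concrete, the sketch below labels an intron by its first and last two bases and shows how an A-to-I edit (inosine being read as G) can turn a non-canonical pair into a canonical one. It is an illustration only: the set treated as canonical (GT-AG, GC-AG, AT-AC) is a common convention assumed here, the function names are hypothetical, and the example sequence is invented.

```python
# Donor/acceptor dinucleotide pairs treated as canonical in this sketch (an assumption)
CANONICAL = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


def splice_site_class(intron: str) -> str:
    """Classify an intron by its terminal dinucleotides."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to carry two dinucleotides")
    pair = (seq[:2], seq[-2:])
    return "canonical" if pair in CANONICAL else "non-canonical"


def after_a_to_i_editing(intron: str, position: int) -> str:
    """Return the sequence as read after editing the A at `position` to inosine (read as G)."""
    seq = intron.upper()
    if seq[position] != "A":
        raise ValueError("only adenosines can be edited")
    return seq[:position] + "G" + seq[position + 1:]


intron = "ATAAGTTTCCCTTTTCAG"               # AT-AG: non-canonical
before = splice_site_class(intron)
edited = after_a_to_i_editing(intron, 0)     # donor AT becomes GT
after = splice_site_class(edited)
```

The stringent read-level filters mentioned in the abstract are not modelled here; the sketch covers only the sequence-level definition.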
Affiliation(s)
- Guillermo E Parada
- Nucleus Millennium in Stress and Addiction, Department of Cellular and Molecular Biology, Faculty of Biological Sciences, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile
| | - Roberto Munita
- Nucleus Millennium in Stress and Addiction, Department of Cellular and Molecular Biology, Faculty of Biological Sciences, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile
| | - Cledi A Cerda
- Nucleus Millennium in Stress and Addiction, Department of Cellular and Molecular Biology, Faculty of Biological Sciences, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile
| | - Katia Gysling
- Nucleus Millennium in Stress and Addiction, Department of Cellular and Molecular Biology, Faculty of Biological Sciences, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile
|
48
|
Butterfield YS, Kreitzman M, Thiessen N, Corbett RD, Li Y, Pang J, Ma YP, Jones SJM, Birol İ. JAGuaR: junction alignments to genome for RNA-seq reads. PLoS One 2014; 9:e102398. [PMID: 25062255 PMCID: PMC4111418 DOI: 10.1371/journal.pone.0102398] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 09/19/2013] [Accepted: 06/19/2014] [Indexed: 01/06/2023] Open
Abstract
JAGuaR is an alignment protocol for RNA-seq reads that uses an extended reference to increase alignment sensitivity. It uses BWA to align reads to the genome and to reference transcript models (including annotated exon-exon junctions), specifically allowing for the possibility of a single read spanning multiple exons. Reads aligned to the transcript models are then re-mapped onto genomic coordinates, transforming alignments that span multiple exons into large-gapped alignments on the genome. While JAGuaR does not detect novel junctions, we demonstrate that it generates fast and accurate transcriptome alignments, which allow for both sensitive and specific SNV calling.
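The re-mapping step can be illustrated with a small coordinate conversion: an ungapped alignment on a spliced transcript model becomes a genomic alignment whose CIGAR string carries an `N` operation for each intron it crosses. This is a sketch of the general idea, not JAGuaR's code; the function name, the 0-based half-open exon intervals and the restriction to forward-strand, mismatch-free alignments are simplifying assumptions.

```python
from typing import List, Tuple


def transcript_to_genome(exons: List[Tuple[int, int]], t_start: int,
                         read_len: int) -> Tuple[int, str]:
    """Map an ungapped transcript alignment to a genomic start and gapped CIGAR.

    exons: sorted, non-overlapping (start, end) genomic intervals, 0-based half-open.
    t_start: 0-based offset of the read on the spliced transcript.
    """
    cigar = []
    g_start = None
    remaining = read_len
    offset = t_start      # transcript bases still to skip before the read begins
    prev_end = None
    for start, end in exons:
        if g_start is None:
            length = end - start
            if offset >= length:
                offset -= length   # read starts in a later exon
                continue
            g_start = start + offset
            block_start = g_start
        else:
            cigar.append(f"{start - prev_end}N")  # intron skipped on the genome
            block_start = start
        used = min(remaining, end - block_start)
        cigar.append(f"{used}M")
        remaining -= used
        prev_end = end
        if remaining == 0:
            return g_start, "".join(cigar)
    raise ValueError("read extends beyond the transcript model")


exons = [(100, 150), (200, 250), (400, 450)]
pos, cigar = transcript_to_genome(exons, t_start=40, read_len=70)
```

Here a 70-base read starting 40 bases into the transcript covers the last 10 bases of the first exon, all of the second, and the first 10 bases of the third.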
Affiliation(s)
| | - Maayan Kreitzman
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
| | - Nina Thiessen
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
| | - Yisu Li
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
| | - Johnson Pang
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
| | - Yussanne P. Ma
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
| | - İnanç Birol
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
|
49
|
Kloetgen A, Münch PC, Borkhardt A, Hoell JI, McHardy AC. Biochemical and bioinformatic methods for elucidating the role of RNA-protein interactions in posttranscriptional regulation. Brief Funct Genomics 2014; 14:102-14. [PMID: 24951655 PMCID: PMC4471435 DOI: 10.1093/bfgp/elu020] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 01/03/2023] Open
Abstract
Our understanding of transcriptional gene regulation has dramatically increased over the past decades, and many regulators of gene expression, such as transcription factors, have been analyzed extensively. Additionally, in recent years, deeper insights into the physiological roles of RNA have been obtained. More precisely, splicing, polyadenylation, various modifications, localization and the translation of messenger RNAs (mRNAs) are regulated by their interaction with RNA-binding proteins (RBPs). New technologies now enable the analysis of this regulation at different levels. A technique known as ultraviolet (UV) cross-linking and immunoprecipitation (CLIP) allows us to determine physical protein–RNA interactions on a genome-wide scale. UV cross-linking introduces covalent bonds between interacting RBPs and RNAs. In combination with immunoprecipitation and deep sequencing techniques, tens of millions of short reads (representing bound RNAs by an RBP of interest) are generated and are used to characterize the regulatory network mediated by an RBP. Other methods, such as mass spectrometry, can also be used for characterization of cross-linked RBPs and RNAs instead of CLIP methods. In this review, we discuss experimental and computational methods for the generation and analysis of CLIP data. The computational methods include short-read alignment, annotation and RNA-binding motif discovery. We describe the challenges of analyzing CLIP data and indicate areas where improvements are needed.
Affiliation(s)
- Alice C McHardy
- Corresponding author. Alice C. McHardy, Heinrich-Heine University, Department of Algorithmic Bioinformatics, Universitaetsstrasse 1, 40225 Duesseldorf, Germany. Tel.: +49-211-8110427; Fax: +49-211-8113464; E-mail:
|
50
|
Conte I, Merella S, Garcia-Manteiga JM, Migliore C, Lazarevic D, Carrella S, Marco-Ferreres R, Avellino R, Davidson NP, Emmett W, Sanges R, Bockett N, Van Heel D, Meroni G, Bovolenta P, Stupka E, Banfi S. The combination of transcriptomics and informatics identifies pathways targeted by miR-204 during neurogenesis and axon guidance. Nucleic Acids Res 2014; 42:7793-806. [PMID: 24895435 PMCID: PMC4081098 DOI: 10.1093/nar/gku498] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 02/04/2023]
Abstract
Vertebrate organogenesis is critically sensitive to gene dosage and even subtle variations in the expression levels of key genes may result in a variety of tissue anomalies. MicroRNAs (miRNAs) are fundamental regulators of gene expression and their role in vertebrate tissue patterning is just beginning to be elucidated. To gain further insight into this issue, we analysed the transcriptomic consequences of manipulating the expression of miR-204 in the Medaka fish model system. We used RNA-Seq and an innovative bioinformatics approach, which combines conventional differential expression analysis with the behavior expected of miR-204 targets after its overexpression and knockdown. With this approach combined with a correlative analysis of the putative targets, we identified a wider set of miR-204 target genes belonging to different pathways. Together, these approaches confirmed that miR-204 has a key role in eye development and further highlighted its putative function in neural differentiation processes, including axon guidance as supported by in vivo functional studies. Overall, our results demonstrate the advantage of integrating next-generation sequencing and bioinformatics approaches to investigate miRNA biology and provide important new information on the role of miRNAs in the control of axon guidance and more broadly in nervous system development.
Affiliation(s)
- Ivan Conte
- Telethon Institute of Genetics and Medicine, Via Pietro Castellino, 111, 80131 Naples, Italy
- Stefania Merella
- Center For Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina, 58, 20132 Milan, Italy
- Jose Manuel Garcia-Manteiga
- Center For Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina, 58, 20132 Milan, Italy
- Chiara Migliore
- CBM Scrl, c/o Area Science Park, Basovizza, 30143 Trieste, Italy
- Dejan Lazarevic
- Center For Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina, 58, 20132 Milan, Italy
- Sabrina Carrella
- Telethon Institute of Genetics and Medicine, Via Pietro Castellino, 111, 80131 Naples, Italy
- Raquel Marco-Ferreres
- Centro de Biología Molecular 'Severo Ochoa', CSIC-UAM, c/Nicolas Cabrera 1, Madrid 28049, Spain; CIBER de Enfermedades Raras (CIBERER), c/ Nicolas Cabrera 1, Madrid 28049, Spain
- Raffaella Avellino
- Telethon Institute of Genetics and Medicine, Via Pietro Castellino, 111, 80131 Naples, Italy
- Nathan Paul Davidson
- Telethon Institute of Genetics and Medicine, Via Pietro Castellino, 111, 80131 Naples, Italy
- Warren Emmett
- UCL Cancer Institute, Huntley Street, University College London, London WC1E 6BT, UK
- Remo Sanges
- Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Napoli, Italy
- Nicholas Bockett
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
- David Van Heel
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
- Germana Meroni
- CBM Scrl, c/o Area Science Park, Basovizza, 30143 Trieste, Italy
- Paola Bovolenta
- Centro de Biología Molecular 'Severo Ochoa', CSIC-UAM, c/Nicolas Cabrera 1, Madrid 28049, Spain; CIBER de Enfermedades Raras (CIBERER), c/ Nicolas Cabrera 1, Madrid 28049, Spain
- Elia Stupka
- Center For Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina, 58, 20132 Milan, Italy
- Sandro Banfi
- Telethon Institute of Genetics and Medicine, Via Pietro Castellino, 111, 80131 Naples, Italy; Medical Genetics, Department of Biochemistry, Biophysics and General Pathology, Second University of Naples, 80138 Naples, Italy
|