1
Shi Q, Zhang Q, Shao M. Accurate assembly of multiple RNA-seq samples with Aletsch. Bioinformatics 2024; 40:i307-i317. [PMID: 38940157 PMCID: PMC11211816 DOI: 10.1093/bioinformatics/btae215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION High-throughput RNA sequencing has become indispensable for decoding gene activities, yet the challenge of reconstructing full-length transcripts persists. Traditional single-sample assemblers frequently produce fragmented transcripts, especially in single-cell RNA-seq data. While algorithms designed for assembling multiple samples exist, they encounter various limitations. RESULTS We present Aletsch, a new assembler for multiple bulk or single-cell RNA-seq samples. Aletsch incorporates several algorithmic innovations, including a "bridging" system that can effectively integrate multiple samples to restore missed junctions in individual samples, and a new graph-decomposition algorithm that leverages "supporting" information across multiple samples to guide the decomposition of complex vertices. A standout feature of Aletsch is its application of a random forest model with 50 well-designed features for scoring transcripts. We demonstrate its robust adaptability across different chromosomes, datasets, and species. Our experiments, conducted on RNA-seq data from several protocols, firmly demonstrate Aletsch's significant outperformance over existing meta-assemblers. As an example, when measured with the partial area under the precision-recall curve (pAUC, constrained by precision), Aletsch surpasses the leading assemblers TransMeta by 22.9%-62.1% and PsiCLASS by 23.0%-175.5% on human datasets. AVAILABILITY AND IMPLEMENTATION Aletsch is freely available at https://github.com/Shao-Group/aletsch. Scripts that reproduce the experimental results of this manuscript are available at https://github.com/Shao-Group/aletsch-test.
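The pAUC metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition, so the following assumes that pAUC is the trapezoidal area under the precision-recall curve restricted to operating points whose precision is at least a chosen threshold. The function names `partial_pr_auc` and `relative_gain_percent`, the `min_precision` parameter, and the sample points are all hypothetical.

```python
def partial_pr_auc(points, min_precision):
    """Trapezoidal area under a precision-recall curve, restricted to
    points with precision >= min_precision (an assumed definition).

    points: iterable of (recall, precision) pairs.
    """
    # Keep only the operating points that satisfy the precision constraint,
    # ordered by increasing recall.
    kept = sorted((r, p) for r, p in points if p >= min_precision)
    area = 0.0
    # Sum trapezoids between consecutive retained points.
    for (r0, p0), (r1, p1) in zip(kept, kept[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


def relative_gain_percent(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Hypothetical (recall, precision) points for one assembler.
curve = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.6), (0.4, 0.4)]
pauc = partial_pr_auc(curve, min_precision=0.5)
```

Under this reading, a statement such as "surpasses TransMeta by 22.9%" would correspond to `relative_gain_percent` applied to the two assemblers' pAUC values on the same dataset.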
Affiliation(s)
- Qian Shi
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, United States
- Qimin Zhang
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, United States
- Mingfu Shao
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, United States
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, United States
2
Liu Z, Ouyang T, Yang Y, Sheng Y, Shi H, Liu Q, Bai Y, Ge Q. The Impact of Blood Sample Processing on Ribonucleic Acid (RNA) Sequencing. Genes (Basel) 2024; 15:502. [PMID: 38674435 PMCID: PMC11050547 DOI: 10.3390/genes15040502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 04/12/2024] [Accepted: 04/13/2024] [Indexed: 04/28/2024] Open
Abstract
In gene quantification and expression analysis, issues with sample selection and processing can be serious, as they can easily introduce irrelevant variables and lead to ambiguous results. This study aims to investigate the extent and mechanism of the impact of sample selection and processing on ribonucleic acid (RNA) sequencing. RNA from PBMCs and blood samples was investigated in this study. The integrity of this RNA was measured under different storage times. All the samples underwent high-throughput sequencing for comprehensive evaluation. The differentially expressed genes and their potential functions were analyzed after the samples were placed at room temperature for 0 h, 4 h, and 8 h, and different feature changes in these samples were also revealed. The sequencing results showed that the differences in gene expression were higher with an increased storage time, while the total number of genes detected did not change significantly. There were five genes showing gradient patterns over different storage times, all of which were protein-coding genes that had not been mentioned in previous studies. The effect of different storage times on seemingly the same samples was analyzed in the present study. This research, therefore, provides a theoretical basis for the long-term consideration of whether sample processing should be adequately addressed.
Affiliation(s)
- Qinyu Ge
- State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing 211189, China; (Z.L.); (T.O.); (Y.Y.); (Y.S.); (H.S.); (Q.L.); (Y.B.)
3
Yu T, Zhao X, Li G. TransMeta simultaneously assembles multisample RNA-seq reads. Genome Res 2022; 32:1398-1407. [PMID: 35858749 PMCID: PMC9341511 DOI: 10.1101/gr.276434.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 06/03/2022] [Indexed: 11/25/2022]
Abstract
Assembling RNA-seq reads into full-length transcripts is crucial in transcriptomic studies and poses computational challenges. Here we present TransMeta, a simple and robust algorithm that simultaneously assembles RNA-seq reads from multiple samples. TransMeta is designed based on the newly introduced vector-weighted splicing graph model, which enables accurate reconstruction of the consensus transcriptome via incorporating a cosine similarity-based combing strategy and a newly designed label-setting path-searching strategy. Tests on both simulated and real data sets show that TransMeta consistently outperforms PsiCLASS, StringTie2 plus its merge mode, and Scallop plus TACO, the most popular tools, in terms of precision and recall under a wide range of coverage thresholds at the meta-assembly level. Additionally, TransMeta consistently shows superior performance at the individual sample level.
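The abstract names a cosine similarity-based combing strategy on a vector-weighted splicing graph but does not describe how it is applied. As a minimal sketch of the underlying measure only, the following assumes that each graph edge carries a vector of per-sample weights and compares two such vectors; the function name `cosine_similarity` and the example vectors are hypothetical, and this is not TransMeta's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length weight vectors.

    Returns 0.0 when either vector is all zeros, a convention chosen
    here to avoid division by zero.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical per-sample coverage vectors for two edges (three samples).
edge_a = [10.0, 20.0, 30.0]
edge_b = [20.0, 40.0, 60.0]  # same expression profile at twice the depth
similarity = cosine_similarity(edge_a, edge_b)
```

Because cosine similarity ignores vector magnitude, two edges with proportional coverage across samples score close to 1 regardless of overall depth, which is one plausible reason to prefer it when deciding whether edges belong to the same transcript.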
Affiliation(s)
- Ting Yu
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Xiaoyu Zhao
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- School of Mathematics, Shandong University, Jinan, Shandong 250100, China
- Guojun Li
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- School of Mathematical Science, Liaocheng University, Liaocheng 252000, China
4
Gatter T, Stadler PF. Ryūtō: Improved multi-sample transcript assembly for differential transcript expression analysis and more. Bioinformatics 2021; 37:4307-4313. [PMID: 34255826 DOI: 10.1093/bioinformatics/btab494] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/09/2021] [Revised: 06/21/2021] [Accepted: 07/01/2021] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION Accurate assembly of RNA-seq is a crucial step in many analytic tasks such as gene annotation or expression studies. Despite ongoing research, progress on traditional single sample assembly has brought no major breakthrough. Multi-sample RNA-Seq experiments provide more information than single sample datasets and thus constitute a promising area of research. Yet, this advantage is challenging to utilize due to the large amount of accumulating errors. RESULTS We present an extension to Ryūtō enabling the reconstruction of consensus transcriptomes from multiple RNA-seq data sets, incorporating consensus calling at low level features. We report stable improvements already at 3 replicates. Ryūtō outperforms competing approaches, providing a better and user-adjustable sensitivity-precision trade-off. Ryūtō's unique ability to utilize a (incomplete) reference for multi sample assemblies greatly increases precision. We demonstrate benefits for differential expression analysis. CONCLUSION Ryūtō consistently improves assembly on replicates of the same tissue independent of filter settings, even when mixing conditions or time series. Consensus voting in Ryūtō is especially effective at high precision assembly, while Ryūtō's conventional mode can reach higher recall. AVAILABILITY Ryūtō is available at https://github.com/studla/RYUTO. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Thomas Gatter
- Bioinformatics Group, Department of Computer Science & Interdisciplinary Center for Bioinformatics, Universität Leipzig, D-04107 Leipzig, Germany
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science & Interdisciplinary Center for Bioinformatics, Universität Leipzig, D-04107 Leipzig, Germany
- Discrete Biomath Group, Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria
- Santa Fe Institute, Santa Fe, NM 87501, USA
5
Shi X, Neuwald AF, Wang X, Wang TL, Hilakivi-Clarke L, Clarke R, Xuan J. IntAPT: integrated assembly of phenotype-specific transcripts from multiple RNA-seq profiles. Bioinformatics 2021; 37:650-658. [PMID: 33016988 PMCID: PMC8097681 DOI: 10.1093/bioinformatics/btaa852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2019] [Revised: 08/27/2020] [Accepted: 09/21/2020] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION High-throughput RNA sequencing has revolutionized the scope and depth of transcriptome analysis. Accurate reconstruction of a phenotype-specific transcriptome is challenging due to the noise and variability of RNA-seq data. This requires computational identification of transcripts from multiple samples of the same phenotype, given the underlying consensus transcript structure. RESULTS We present a Bayesian method, integrated assembly of phenotype-specific transcripts (IntAPT), that identifies phenotype-specific isoforms from multiple RNA-seq profiles. IntAPT features a novel two-layer Bayesian model to capture the presence of isoforms at the group layer and to quantify the abundance of isoforms at the sample layer. A spike-and-slab prior is used to model the isoform expression and to enforce the sparsity of expressed isoforms. Dependencies between the existence of isoforms and their expression are modeled explicitly to facilitate parameter estimation. Model parameters are estimated iteratively using Gibbs sampling to infer the joint posterior distribution, from which the presence and abundance of isoforms can reliably be determined. Studies using both simulations and real datasets show that IntAPT consistently outperforms existing methods. Experimental results demonstrate that, despite sequencing errors, IntAPT exhibits a robust performance among multiple samples, resulting in notably improved identification of expressed isoforms of low abundance. AVAILABILITY AND IMPLEMENTATION The IntAPT package is available at http://github.com/henryxushi/IntAPT. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xu Shi
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Xiao Wang
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
- Tian-Li Wang
- Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, USA
- Robert Clarke
- Hormel Institute, University of Minnesota, 801 16th Ave NE, Austin, MN 55912, USA
- Jianhua Xuan
- Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
6
Kemski MM, Rappleye CA, Dabrowski K, Bruno RS, Wick M. Transcriptomic response to soybean meal-based diets as the first formulated feed in juvenile yellow perch (Perca flavescens). Sci Rep 2020; 10:3998. [PMID: 32132548 PMCID: PMC7055240 DOI: 10.1038/s41598-020-59691-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/01/2019] [Accepted: 01/20/2020] [Indexed: 12/24/2022] Open
Abstract
With increasing levels of fish meal (FM) protein in aquafeeds being replaced with soybean meal (SBM) protein, understanding the molecular mechanisms involved in response to alternative diets has become a critical concern. Thus, the goal of this study was to examine transcriptional differences in the intestine of juvenile yellow perch through RNA-sequencing (RNA-seq), after their initial introduction to a formulated diet with 75% SBM protein inclusion for 61 days, compared to those fed a traditional FM-based diet. Transcriptomic analysis revealed a concise set of differentially expressed genes in juveniles fed the SBM-based diet, the majority of which were intrinsic to the cholesterol biosynthesis pathway. Total body lipid and cholesterol levels were also analyzed, with no between-treatment differences detected. Results of this study demonstrate that in response to SBM-based diets, yellow perch juveniles up-regulate the cholesterol biosynthesis pathway in order to maintain homeostasis. These findings suggest that the upregulation of the cholesterol biosynthesis pathway may negatively impact fish growth due to its large energy expenditure, and future studies are warranted.
Affiliation(s)
- Megan M Kemski
- Department of Food Science and Technology, The Ohio State University, Columbus, OH, USA
- School of Environment and Natural Resources, The Ohio State University, Columbus, OH, USA
- Chad A Rappleye
- Department of Microbiology, The Ohio State University, Columbus, OH, USA
- Konrad Dabrowski
- School of Environment and Natural Resources, The Ohio State University, Columbus, OH, USA
- Richard S Bruno
- Department of Human Sciences, The Ohio State University, Columbus, OH, USA
- Macdonald Wick
- Department of Animal Sciences, The Ohio State University, Columbus, OH, USA.
7
Song L, Sabunciyan S, Yang G, Florea L. A multi-sample approach increases the accuracy of transcript assembly. Nat Commun 2019; 10:5000. [PMID: 31676772 PMCID: PMC6825223 DOI: 10.1038/s41467-019-12990-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/16/2019] [Accepted: 10/11/2019] [Indexed: 01/21/2023] Open
Abstract
Transcript assembly from RNA-seq reads is a critical step in gene expression and subsequent functional analyses. Here we present PsiCLASS, an accurate and efficient transcript assembler based on an approach that simultaneously analyzes multiple RNA-seq samples. PsiCLASS combines mixture statistical models for exonic feature selection across multiple samples with splice graph based dynamic programming algorithms and a weighted voting scheme for transcript selection. PsiCLASS achieves significantly better sensitivity-precision tradeoff, and renders precision up to 2-3 fold higher than the StringTie system and Scallop plus TACO, the two best current approaches. PsiCLASS is efficient and scalable, assembling 667 GEUVADIS samples in 9 h, and has robust accuracy with large numbers of samples. Transcript assembly is an important step in analysis of RNA-seq data whose accuracy influences downstream quantification, detection and characterization of alternative splice variants. Here, the authors develop PsiCLASS, a transcript assembler leveraging simultaneous analysis of multiple RNA-seq samples.
Affiliation(s)
- Li Song
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Data Sciences, Dana Farber Cancer Institute, Boston, MA, USA
- Sarven Sabunciyan
- Department of Pediatrics, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Guangyu Yang
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Liliana Florea
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
8
Aguiar D, Cheng LF, Dumitrascu B, Mordelet F, Pai AA, Engelhardt BE. Bayesian nonparametric discovery of isoforms and individual specific quantification. Nat Commun 2018; 9:1681. [PMID: 29703885 PMCID: PMC5923247 DOI: 10.1038/s41467-018-03402-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/10/2017] [Accepted: 02/11/2018] [Indexed: 12/18/2022] Open
Abstract
Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur in different frequencies across tissues and samples. Here, we develop biisq, a Bayesian nonparametric model for isoform discovery and individual specific quantification from short-read RNA-seq data. biisq does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and demonstrate superior precision and recall for simulations compared to state-of-the-art isoform reconstruction methods. biisq shows the most gains for low abundance isoforms, with 36% more isoforms correctly inferred at low coverage versus a multi-sample method and 170% more versus single-sample methods. We estimate isoforms in the GEUVADIS RNA-seq data and validate inferred isoforms by associating genetic variants with isoform ratios. Alternative splicing leads to transcript isoform diversity. Here, Aguiar et al. develop biisq, a Bayesian nonparametric approach to discover and quantify isoforms from RNA-seq data.
Affiliation(s)
- Derek Aguiar
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA.
- Li-Fang Cheng
- Department of Electrical Engineering, Princeton University, Princeton, NJ, 08540, USA
- Bianca Dumitrascu
- Lewis-Sigler Institute, Princeton University, Princeton, NJ, 08544, USA
- Fantine Mordelet
- Institute for Genome Sciences and Policy, Duke University, Durham, NC, 27708, USA
- Athma A Pai
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA; RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA, 01605, USA
- Barbara E Engelhardt
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA; Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, 08540, USA
9
Stavrianakou M, Perez R, Wu C, Sachs MS, Aramayo R, Harlow M. Draft de novo transcriptome assembly and proteome characterization of the electric lobe of Tetronarce californica: a molecular tool for the study of cholinergic neurotransmission in the electric organ. BMC Genomics 2017; 18:611. [PMID: 28806931 PMCID: PMC5557070 DOI: 10.1186/s12864-017-3890-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/24/2016] [Accepted: 06/21/2017] [Indexed: 11/10/2022] Open
Abstract
Background The electric organ of Tetronarce californica (an electric ray formerly known as Torpedo californica) is a classic preparation for biochemical studies of cholinergic neurotransmission. To broaden the usefulness of this preparation, we have performed a transcriptome assembly of the presynaptic component of the electric organ (the electric lobe). We combined our assembled transcriptome with a previous transcriptome of the postsynaptic electric organ, to define a MetaProteome containing pre- and post-synaptic components of the electric organ. Results Sequencing yielded 102 million paired-end 100 bp reads. De novo Trinity assembly was performed at Kmer 25 (default) and Kmers 27, 29, and 31. Trinity generated around 103,000 transcripts and 78,000 genes per assembly. Assemblies were evaluated based on the number of bases/transcripts assembled, RSEM-EVAL scores and informational content and completeness. We found that different assemblies scored differently according to the evaluation criteria used, and that while each individual assembly contained unique information, much of the assembly information was shared by all assemblies. To generate the presynaptic transcriptome (electric lobe), while capturing all information, assemblies were first clustered and then combined with postsynaptic transcripts (electric organ) downloaded from NCBI. The completeness of the resulting clustered predicted MetaProteome was rigorously evaluated by comparing its information against the predicted proteomes from Homo sapiens, Callorhinchus milli, and the Transporter Classification Database (TCDB). Conclusions In summary, we obtained a MetaProteome containing 92%, 88.5%, and 66% of the expected set of ultra-conserved sequences (i.e., BUSCOs), expected to be found for Eukaryotes, Metazoa, and Vertebrata, respectively. We cross-annotated the conserved set of proteins shared between the T. californica MetaProteome and the proteomes of H. sapiens and C. milli, using the H. sapiens genome as a reference. This information was used to predict the position in human pathways of the conserved members of the T. californica MetaProteome. We found proteins not detected before in T. californica, corresponding to processes involved in synaptic vesicle biology. Finally, we identified 42 transporter proteins in TCDB that were detected by the T. californica MetaProteome (electric fish) and not selected by a control proteome consisting of the combined proteomes of 12 widely diverse non-electric fishes by Reverse-Blast-Hit Blast. Combined, the information provided here is not only a unique tool for the study of cholinergic neurotransmission, but it is also a starting point for understanding the evolution of early vertebrates. Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3890-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Maria Stavrianakou
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Ricardo Perez
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Cheng Wu
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Matthew S Sachs
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA
- Rodolfo Aramayo
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA.
- Mark Harlow
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, 77843-3258, USA.
10
Ramanouskaya TV, Grinev VV. The determinants of alternative RNA splicing in human cells. Mol Genet Genomics 2017; 292:1175-1195. [PMID: 28707092 DOI: 10.1007/s00438-017-1350-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 02/02/2017] [Accepted: 07/06/2017] [Indexed: 12/29/2022]
Abstract
Alternative splicing represents an important level of the regulation of gene function in eukaryotic organisms. It plays a critical role in virtually every biological process within an organism, including regulation of cell division and cell death, differentiation of tissues in the embryo and the adult organism, as well as in cellular response to diverse environmental factors. In turn, studies of the last decade have shown that alternative splicing itself is controlled by different mechanisms. Unfortunately, there is no clear understanding of how these diverse mechanisms, or determinants, regulate and constrain the set of alternative RNA species produced from any particular gene in every cell of the human body. Here, we provide a consolidated overview of alternative splicing determinants including RNA-protein interactions, epigenetic regulation via chromatin remodeling, coupling of transcription-to-alternative splicing, effect of secondary structures in pre-RNA, and function of the RNA quality control systems. We also extensively and critically discuss some mechanistic insights on coordinated inclusion/exclusion of exons during the formation of mature RNA molecules. We conclude that the final structure of RNA is pre-determined by a complex interplay between cis- and trans-acting factors. Altogether, currently available empirical data significantly expand our understanding of the functioning of the alternative splicing machinery of cells in normal and pathological conditions. On the other hand, there are still many blind spots that require further deep investigations.
11
Babonis LS, Martindale MQ, Ryan JF. Do novel genes drive morphological novelty? An investigation of the nematosomes in the sea anemone Nematostella vectensis. BMC Evol Biol 2016; 16:114. [PMID: 27216622 PMCID: PMC4877951 DOI: 10.1186/s12862-016-0683-3] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 04/27/2016] [Accepted: 05/12/2016] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND The evolution of novel genes is thought to be a critical component of morphological innovation but few studies have explicitly examined the contribution of novel genes to the evolution of novel tissues. Nematosomes, the free-floating cellular masses that circulate through the body cavity of the sea anemone Nematostella vectensis, are the defining apomorphy of the genus Nematostella and are a useful model for understanding the evolution of novel tissues. Although many hypotheses have been proposed, the function of nematosomes is unknown. To gain insight into their putative function and to test hypotheses about the role of lineage-specific genes in the evolution of novel structures, we have re-examined the cellular and molecular biology of nematosomes. RESULTS Using behavioral assays, we demonstrate that nematosomes are capable of immobilizing live brine shrimp (Artemia salina) by discharging their abundant cnidocytes. Additionally, the ability of nematosomes to engulf fluorescently labeled bacteria (E. coli) reveals the presence of phagocytes in this tissue. Using RNA-Seq, we show that the gene expression profile of nematosomes is distinct from that of the tentacles and the mesenteries (their tissue of origin) and, further, that nematosomes (a Nematostella-specific tissue) are enriched in Nematostella-specific genes. CONCLUSIONS Despite the small number of cell types they contain, nematosomes are distinct among tissues, both functionally and molecularly. We provide the first evidence that nematosomes comprise part of the innate immune system in N. vectensis, and suggest that this tissue is potentially an important place to look for genes associated with pathogen stress. Finally, we demonstrate that Nematostella-specific genes comprise a significant proportion of the differentially expressed genes in all three of the tissues we examined and may play an important role in novel cell functions.
Affiliation(s)
- Leslie S Babonis
- Whitney Laboratory for Marine Bioscience, University of Florida, 9505 Ocean Shore Blvd, St. Augustine, FL, 32080, USA.
- Mark Q Martindale
- Whitney Laboratory for Marine Bioscience, University of Florida, 9505 Ocean Shore Blvd, St. Augustine, FL, 32080, USA
- Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- Joseph F Ryan
- Whitney Laboratory for Marine Bioscience, University of Florida, 9505 Ocean Shore Blvd, St. Augustine, FL, 32080, USA
- Department of Biology, University of Florida, Gainesville, FL, 32611, USA