1. Dwivedi SL, Quiroz LF, Reddy ASN, Spillane C, Ortiz R. Alternative Splicing Variation: Accessing and Exploiting in Crop Improvement Programs. Int J Mol Sci 2023; 24:15205. [PMID: 37894886 PMCID: PMC10607462 DOI: 10.3390/ijms242015205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Revised: 10/09/2023] [Accepted: 10/10/2023] [Indexed: 10/29/2023] [Open Access]
Abstract
Alternative splicing (AS) is a gene regulatory mechanism modulating gene expression in multiple ways. AS is prevalent in all eukaryotes including plants. AS generates two or more mRNAs from the precursor mRNA (pre-mRNA) to regulate transcriptome complexity and proteome diversity. Advances in next-generation sequencing, omics technology, bioinformatics tools, and computational methods provide new opportunities to quantify and visualize AS-based quantitative trait variation associated with plant growth, development, reproduction, and stress tolerance. Domestication, polyploidization, and environmental perturbation may evolve novel splicing variants associated with agronomically beneficial traits. To date, pre-mRNAs from many genes are spliced into multiple transcripts that cause phenotypic variation for complex traits, both in the model plant Arabidopsis and in field crops. Cataloguing and exploiting such variation may provide new paths to enhance climate resilience, resource-use efficiency, productivity, and nutritional quality of staple food crops. This review provides insights into AS variation alongside a gene expression analysis to select for novel phenotypic diversity for use in breeding programs. AS contributes to heterosis, enhances plant symbiosis (mycorrhiza and rhizobium), and provides a mechanistic link between the core clock genes and diverse environmental cues.
Affiliation(s)
- Luis Felipe Quiroz
  - Agriculture and Bioeconomy Research Centre, Ryan Institute, University of Galway, University Road, H91 REW4 Galway, Ireland
- Anireddy S N Reddy
  - Department of Biology and Program in Cell and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Charles Spillane
  - Agriculture and Bioeconomy Research Centre, Ryan Institute, University of Galway, University Road, H91 REW4 Galway, Ireland
- Rodomiro Ortiz
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, 23053 Alnarp, SE, Sweden

2. Paré L, Bideau L, Baduel L, Dalle C, Benchouaia M, Schneider SQ, Laplane L, Clément Y, Vervoort M, Gazave E. Transcriptomic landscape of posterior regeneration in the annelid Platynereis dumerilii. BMC Genomics 2023; 24:583. [PMID: 37784028 PMCID: PMC10546743 DOI: 10.1186/s12864-023-09602-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 08/18/2023] [Indexed: 10/04/2023] [Open Access]
Abstract
BACKGROUND Restorative regeneration, the capacity to reform a lost body part following amputation or injury, is an important and still poorly understood process in animals. Annelids, or segmented worms, show amazing regenerative capabilities, and as such are a crucial group to investigate. Elucidating the molecular mechanisms that underpin regeneration in this major group remains a key goal. Among annelids, the nereidid Platynereis dumerilii (re)emerged recently as a front-line regeneration model. Following amputation of its posterior part, Platynereis worms can regenerate both differentiated tissues of their terminal part as well as a growth zone that contains putative stem cells. While this regeneration process follows specific and reproducible stages that have been well characterized, the transcriptomic landscape of these stages remains to be uncovered. RESULTS We generated a high-quality de novo Reference transcriptome for the annelid Platynereis dumerilii. We produced and analyzed three RNA-sequencing datasets, encompassing five stages of posterior regeneration, along with blastema stages and non-amputated tissues as controls. We included two of these regeneration RNA-seq datasets, as well as embryonic and tissue-specific datasets from the literature to produce a Reference transcriptome. We used this Reference transcriptome to perform in-depth analyses of RNA-seq data during the course of regeneration to reveal the important dynamics of the gene expression process, with thousands of genes differentially expressed between stages, as well as unique and specific gene expression at each regeneration stage. The study of these genes highlighted the importance of the nervous system at both early and late stages of regeneration, as well as the enrichment of RNA-binding proteins (RBPs) during almost the entire regeneration process.
CONCLUSIONS In this study, we provided a high-quality de novo Reference transcriptome for the annelid Platynereis that is useful for investigating various developmental processes, including regeneration. Our extensive stage-specific transcriptional analysis during the course of posterior regeneration sheds light upon major molecular mechanisms and pathways, and will foster many specific studies in the future.
Affiliation(s)
- Louis Paré
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
- Loïc Bideau
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
- Loeiza Baduel
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
- Caroline Dalle
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
- Médine Benchouaia
  - Département de biologie, GenomiqueENS, Institut de Biologie de l'ENS (IBENS), École normale supérieure, CNRS, INSERM, Université PSL, Paris, 75005, France
- Stephan Q Schneider
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei, 11529, Taiwan
- Lucie Laplane
  - Université Paris I Panthéon-Sorbonne, CNRS UMR 8590 Institut d'Histoire et de Philosophie des Sciences et des Techniques (IHPST), Paris, France
  - Gustave Roussy, UMR 1287, Villejuif, France
- Yves Clément
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
- Michel Vervoort
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France
- Eve Gazave
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, F-75013, France

3. Burioli EAV, Hammel M, Vignal E, Vidal-Dupiol J, Mitta G, Thomas F, Bierne N, Destoumieux-Garzón D, Charrière GM. Transcriptomics of mussel transmissible cancer MtrBTN2 suggests accumulation of multiple cancer traits and oncogenic pathways shared among bilaterians. Open Biol 2023; 13:230259. [PMID: 37816387 PMCID: PMC10564563 DOI: 10.1098/rsob.230259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 09/12/2023] [Indexed: 10/12/2023] [Open Access]
Abstract
Transmissible cancer cell lines are rare biological entities giving rise to diseases at the crossroads of cancer and parasitic diseases. These malignant cells have acquired the amazing capacity to spread from host to host. They have been described only in dogs, Tasmanian devils and marine bivalves. The Mytilus trossulus bivalve transmissible neoplasia 2 (MtrBTN2) lineage has even acquired the capacity to spread inter-specifically between marine mussels of the Mytilus edulis complex worldwide. To identify the oncogenic processes underpinning the biology of these atypical cancers we performed transcriptomics of MtrBTN2 cells. Differential expression, enrichment, protein-protein interaction network, and targeted analyses were used. Overall, our results suggest the accumulation of multiple cancerous traits that may be linked to the long-term evolution of MtrBTN2. We also highlight that vertebrate and lophotrochozoan cancers could share a large panel of common drivers, which supports the hypothesis of an ancient origin of oncogenic processes in bilaterians.
Affiliation(s)
- E A V Burioli
  - IHPE, Univ Montpellier, CNRS, IFREMER, Univ Perpignan Via Domitia, Montpellier, France
- M Hammel
  - IHPE, Univ Montpellier, CNRS, IFREMER, Univ Perpignan Via Domitia, Montpellier, France
  - ISEM, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
- E Vignal
  - IHPE, Univ Montpellier, CNRS, IFREMER, Univ Perpignan Via Domitia, Montpellier, France
- J Vidal-Dupiol
  - IHPE, Univ Montpellier, CNRS, IFREMER, Univ Perpignan Via Domitia, Montpellier, France
- G Mitta
  - IFREMER, UMR 241 Écosystèmes Insulaires Océaniens, Labex Corail, Centre Ifremer du Pacifique, Tahiti, Polynésie française
- F Thomas
  - CREEC/CANECEV (CREES), MIVEGEC, Unité Mixte de Recherches, IRD 224-CNRS 5290-Université de Montpellier, Montpellier, France
- N Bierne
  - ISEM, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
- D Destoumieux-Garzón
  - IHPE, Univ Montpellier, CNRS, IFREMER, Univ Perpignan Via Domitia, Montpellier, France
- G M Charrière
  - IHPE, Univ Montpellier, CNRS, IFREMER, Univ Perpignan Via Domitia, Montpellier, France

4. Yu F, Luo W, Xie W, Li Y, Liu Y, Ye X, Peng T, Wang H, Huang T, Hu Z. The effects of long-term hexabromocyclododecanes contamination on microbial communities in the microcosms. Chemosphere 2023; 325:138412. [PMID: 36925001 DOI: 10.1016/j.chemosphere.2023.138412] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/22/2022] [Revised: 01/21/2023] [Accepted: 03/13/2023] [Indexed: 06/18/2023]
Abstract
The adaptation of microbial communities to long-term contamination with hexabromocyclododecanes (HBCDs) has not been well studied. Our previous study found that HBCDs contamination in microcosms constructed from sediments of two different mangrove forests resulted in serious acidification (pH 2-3) within 8 months. This study reanalyzed the previous sequencing data and compared them with data after 20 months to investigate the adaptive properties of microbial communities under the stress of HBCDs and acidification. It was hypothesized that the reassembly was based on the fitness of taxa. The results indicated that eukaryotes and fungi might have better adaptive capacity to these deteriorated habitats. Eukaryotic taxa Eufallia and Syncystis, and fungal taxa Wickerhamomyces were only detected after 20 months of contamination. Moreover, eukaryotic taxa Caloneis and Nitzschia, and fungal taxa Talaromyces were dominant in most of the microbial communities (14.467-95.941%). The functional compositions were sediment-dependent and more divergent than community reassemblies. Network and co-occurrence analysis suggested that acidophiles such as Acidisoma and Acidiphilium were gaining more positive relations under the long-term stress. The acidophilic taxa and genes involved in resistance to the acidification and toxicity of HBCDs were enriched, for example, bacteria Acidisoma and Acidiphilium, archaea Thermogymnomonas, and eukaryotes Nitzschia, and genes kdpC, odc1, polA, gst, and sod-2. These genes are involved in oxidative stress response, energy metabolism, DNA damage repair, potassium transportation, and decarboxylation. This suggests that the microbial communities might cope with the stress from HBCDs and acidification via multiple pathways. The present research sheds light on the evolution of microbial communities under the long-term stress of HBCDs contamination and acidification.
Affiliation(s)
- Fei Yu
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Wenqi Luo
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Wei Xie
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Yuyang Li
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Yongjin Liu
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Xueying Ye
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Tao Peng
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Hui Wang
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Tongwang Huang
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
- Zhong Hu
  - Department of Biology, College of Science, Shantou University, Shantou, Guangdong Province, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, Guangdong Province, China

5. Schreibing F, Anslinger TM, Kramann R. Fibrosis in Pathology of Heart and Kidney: From Deep RNA-Sequencing to Novel Molecular Targets. Circ Res 2023; 132:1013-1033. [PMID: 37053278 DOI: 10.1161/circresaha.122.321761] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 04/15/2023]
Abstract
Diseases of the heart and the kidney, including heart failure and chronic kidney disease, can dramatically impair life expectancy and the quality of life of patients. The heart and kidney form a functional axis; therefore, functional impairment of one organ will inevitably affect the function of the other. Fibrosis represents the common final pathway of diseases of both organs, regardless of the disease entity. Thus, inhibition of fibrosis represents a promising therapeutic approach to treat diseases of both organs and to resolve functional impairment. However, despite the growing knowledge in this field, the exact pathomechanisms that drive fibrosis remain elusive. RNA-sequencing approaches, particularly single-cell RNA-sequencing, have revolutionized the investigation of pathomechanisms at a molecular level and facilitated the discovery of disease-associated cell types and mechanisms. In this review, we give a brief overview of the evolution of RNA-sequencing techniques, summarize the most recent insights into the pathogenesis of heart and kidney fibrosis, and discuss how transcriptomic data can be used to identify new drug targets and to develop novel therapeutic strategies.
Affiliation(s)
- Felix Schreibing
  - Institute of Experimental Medicine and Systems Biology (F.S., T.M.A., R.K.), RWTH Aachen University, Medical Faculty, Aachen, Germany
  - Division of Nephrology and Clinical Immunology (F.S., T.M.A., R.K.), RWTH Aachen University, Medical Faculty, Aachen, Germany
- Teresa M Anslinger
  - Institute of Experimental Medicine and Systems Biology (F.S., T.M.A., R.K.), RWTH Aachen University, Medical Faculty, Aachen, Germany
  - Division of Nephrology and Clinical Immunology (F.S., T.M.A., R.K.), RWTH Aachen University, Medical Faculty, Aachen, Germany
- Rafael Kramann
  - Institute of Experimental Medicine and Systems Biology (F.S., T.M.A., R.K.), RWTH Aachen University, Medical Faculty, Aachen, Germany
  - Division of Nephrology and Clinical Immunology (F.S., T.M.A., R.K.), RWTH Aachen University, Medical Faculty, Aachen, Germany
  - Department of Internal Medicine, Nephrology and Transplantation, Erasmus Medical Center, Rotterdam, The Netherlands (R.K.)

6. Guo W, Coulter M, Waugh R, Zhang R. The value of genotype-specific reference for transcriptome analyses in barley. Life Sci Alliance 2022; 5:e202101255. [PMID: 35459738 PMCID: PMC9034525 DOI: 10.26508/lsa.202101255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/05/2021] [Revised: 04/10/2022] [Accepted: 04/11/2022] [Indexed: 12/31/2022] [Open Access]
Abstract
It is increasingly apparent that although different genotypes within a species share "core" genes, they also contain variable numbers of "specific" genes and different structures of "core" genes that are only present in a subset of individuals. Using a common reference genome may thus lead to a loss of genotype-specific information in the assembled Reference Transcript Dataset (RTD) and the generation of erroneous, incomplete or misleading transcriptomics analysis results. In this study, we assembled genotype-specific RTD (sRTD) and common reference-based RTD (cRTD) from RNA-seq data of cultivated Barke and Morex barley, respectively. Our quantitative evaluation showed that the sRTD has a significantly higher diversity of transcripts and alternative splicing events, whereas the cRTD missed 40% of transcripts present in the sRTD and it only has ∼70% accurate transcript assemblies. We found that the sRTD is more accurate for transcript quantification as well as differential expression analysis. However, gene-level quantification is less affected, which may be a reasonable compromise when a high-quality genotype-specific reference is not available.
Affiliation(s)
- Wenbin Guo
  - Information and Computational Sciences, James Hutton Institute, Dundee, UK
- Max Coulter
  - Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
- Robbie Waugh
  - Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
  - Cell and Molecular Sciences, James Hutton Institute, Dundee, UK
- Runxuan Zhang
  - Information and Computational Sciences, James Hutton Institute, Dundee, UK

7. de la Rubia I, Srivastava A, Xue W, Indi JA, Carbonell-Sala S, Lagarde J, Albà MM, Eyras E. RATTLE: reference-free reconstruction and quantification of transcriptomes from Nanopore sequencing. Genome Biol 2022; 23:153. [PMID: 35804393 PMCID: PMC9264490 DOI: 10.1186/s13059-022-02715-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/05/2021] [Accepted: 06/20/2022] [Indexed: 11/04/2022] [Open Access]
Abstract
Nanopore sequencing enables the efficient and unbiased measurement of transcriptomes. Current methods for transcript identification and quantification rely on mapping reads to a reference genome, which precludes the study of species with a partial or missing reference or the identification of disease-specific transcripts not readily identifiable from a reference. We present RATTLE, a tool to perform reference-free reconstruction and quantification of transcripts using only Nanopore reads. Using simulated data and experimental data from isoform spike-ins, human tissues, and cell lines, we show that RATTLE accurately determines transcript sequences and their abundances, and shows good scalability with the number of transcripts.
Affiliation(s)
- Ivan de la Rubia
  - EMBL Australia Partner Laboratory Network at the Australian National University, Acton, Canberra, ACT, 2601, Australia
  - Pompeu Fabra University (UPF), E08003, Barcelona, Spain
- Akanksha Srivastava
  - EMBL Australia Partner Laboratory Network at the Australian National University, Acton, Canberra, ACT, 2601, Australia
  - Australian National University, Acton, Canberra, ACT, 2601, Australia
- Wenjing Xue
  - EMBL Australia Partner Laboratory Network at the Australian National University, Acton, Canberra, ACT, 2601, Australia
  - Australian National University, Acton, Canberra, ACT, 2601, Australia
- Joel A Indi
  - EMBL Australia Partner Laboratory Network at the Australian National University, Acton, Canberra, ACT, 2601, Australia
  - Universidade de Lisboa, Lisboa, Portugal
- Silvia Carbonell-Sala
  - Pompeu Fabra University (UPF), E08003, Barcelona, Spain
  - Centre for Regulatory Genomics (CRG), E08001, Barcelona, Spain
- Julien Lagarde
  - Pompeu Fabra University (UPF), E08003, Barcelona, Spain
  - Centre for Regulatory Genomics (CRG), E08001, Barcelona, Spain
- M Mar Albà
  - Pompeu Fabra University (UPF), E08003, Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), E08010, Barcelona, Spain
  - Hospital del Mar Medical Research Institute (IMIM), E08001, Barcelona, Spain
- Eduardo Eyras
  - EMBL Australia Partner Laboratory Network at the Australian National University, Acton, Canberra, ACT, 2601, Australia
  - Australian National University, Acton, Canberra, ACT, 2601, Australia
  - Catalan Institution for Research and Advanced Studies (ICREA), E08010, Barcelona, Spain
  - Hospital del Mar Medical Research Institute (IMIM), E08001, Barcelona, Spain

8. Dias MC, Caldeira C, Gastauer M, Ramos S, Oliveira G. Cross-species transcriptomes reveal species-specific and shared molecular adaptations for plants development on iron-rich rocky outcrops soils. BMC Genomics 2022; 23:313. [PMID: 35439930 PMCID: PMC9020022 DOI: 10.1186/s12864-022-08449-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/22/2021] [Accepted: 02/23/2022] [Indexed: 12/13/2022] [Open Access]
Abstract
BACKGROUND Canga is the Brazilian term for the savanna-like vegetation harboring several endemic species on iron-rich rocky outcrops, usually considered for mining activities. Parkia platycephala Benth. and Stryphnodendron pulcherrimum (Willd.) Hochr. naturally occur in the cangas of Serra dos Carajás (eastern Amazonia, Brazil) and the surrounding forest, indicating high phenotypic plasticity. The morphological and physiological mechanisms of the plants' establishment in the canga environment are well studied, but the molecular adaptive responses are still unknown. To understand these adaptive responses, we aimed to identify molecular mechanisms that allow the establishment of these plants in the canga environment. RESULTS Plants were grown in canga and forest substrates collected in the Carajás Mineral Province. RNA was extracted from pooled leaf tissue, and RNA-seq paired-end reads were assembled into representative transcriptomes for P. platycephala and S. pulcherrimum containing 31,728 and 31,311 primary transcripts, respectively. We identified both species-specific and core molecular responses in plants grown in the canga substrate using differential expression analyses. In the species-specific analysis, we identified 1,112 and 838 differentially expressed genes for P. platycephala and S. pulcherrimum, respectively. Enrichment analyses showed that unique biological processes and metabolic pathways were affected for each species. Comparative differential expression analysis was based on shared single-copy orthologs. The overall pattern of ortholog expression was species-specific. Even so, we identified almost 300 altered genes between plants in canga and forest substrates with conserved responses in the two species. The genes were functionally associated with the response to light stimulus and the circadian rhythm pathway. CONCLUSIONS Plants possess species-specific adaptive responses to cope with the substrates. Our results also suggest that plants adapted to both canga and forest environments can adjust the circadian rhythm in a substrate-dependent manner. The circadian clock gene modulation might be a central mechanism regulating the plants' development in the canga substrate in the studied legume species. The mechanism may be shared as a common mechanism for abiotic stress compensation in other native species.
Affiliation(s)
- Mariana Costa Dias
  - Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
  - Universidade Federal de Minas Gerais, Avenida Antônio Carlos 6627, Belo Horizonte, Minas Gerais, CEP 31270-901, Brazil
- Cecílio Caldeira
  - Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
- Markus Gastauer
  - Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
- Silvio Ramos
  - Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil
- Guilherme Oliveira
  - Instituto Tecnológico Vale, Rua Boaventura da Silva 955, Belém, Pará, CEP 66055-090, Brazil

9. Shrestha AMS, B Guiao JE, R Santiago KC. Assembly-free rapid differential gene expression analysis in non-model organisms using DNA-protein alignment. BMC Genomics 2022; 23:97. [PMID: 35120462 PMCID: PMC8815227 DOI: 10.1186/s12864-021-08278-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2021] [Accepted: 12/22/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
BACKGROUND RNA-seq is being increasingly adopted for gene expression studies in a panoply of non-model organisms, with applications spanning the fields of agriculture, aquaculture, ecology, and environment. For organisms that lack a well-annotated reference genome or transcriptome, a conventional RNA-seq data analysis workflow requires constructing a de-novo transcriptome assembly and annotating it against a high-confidence protein database. The assembly serves as a reference for read mapping, and the annotation is necessary for functional analysis of genes found to be differentially expressed. However, assembly is computationally expensive. It is also prone to errors that impact expression analysis, especially since sequencing depth is typically much lower for expression studies than for transcript discovery. RESULTS We propose a shortcut, in which we obtain counts for differential expression analysis by directly aligning RNA-seq reads to the high-confidence proteome that would have been otherwise used for annotation. By avoiding assembly, we drastically cut down computational costs - the running time on a typical dataset improves from the order of tens of hours to under half an hour, and the memory requirement is reduced from the order of tens of Gbytes to tens of Mbytes. We show through experiments on simulated and real data that our pipeline not only reduces computational costs, but has higher sensitivity and precision than a typical assembly-based pipeline. A Snakemake implementation of our workflow is available at: https://bitbucket.org/project_samar/samar . CONCLUSIONS The flip side of RNA-seq becoming accessible to even modestly resourced labs has been that the time, labor, and infrastructure cost of bioinformatics analysis has become a bottleneck. Assembly is one such resource-hungry process, and we show here that it can be avoided for quick and easy, yet more sensitive and precise, differential gene expression analysis in non-model organisms.
Affiliation(s)
- Anish M S Shrestha
  - Bioinformatics Lab, Advanced Research Institute for Informatics, Computing, and Networking (AdRIC), De La Salle University, Manila, Philippines
  - Department of Software Technology, College of Computer Studies, De La Salle University, Manila, Philippines
- Joyce Emlyn B Guiao
  - Bioinformatics Lab, Advanced Research Institute for Informatics, Computing, and Networking (AdRIC), De La Salle University, Manila, Philippines
  - Department of Mathematics and Statistics, College of Science, De La Salle University, Manila, Philippines
- Kyle Christian R Santiago
  - Bioinformatics Lab, Advanced Research Institute for Informatics, Computing, and Networking (AdRIC), De La Salle University, Manila, Philippines
  - Department of Software Technology, College of Computer Studies, De La Salle University, Manila, Philippines

10. Hernández-Fernández J, Pinzón Velasco AM, López Barrera EA, Rodríguez Becerra MDP, Villanueva-Cañas JL, Alba MM, Mariño Ramírez L. De novo assembly and functional annotation of blood transcriptome of loggerhead turtle, and in silico characterization of peroxiredoxins and thioredoxins. PeerJ 2021; 9:e12395. [PMID: 34820176 PMCID: PMC8606161 DOI: 10.7717/peerj.12395] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/14/2021] [Accepted: 10/06/2021] [Indexed: 12/21/2022] [Open Access]
Abstract
The aim of this study was to generate and analyze the atlas of the loggerhead turtle blood transcriptome by RNA-seq, as well as identify and characterize thioredoxin (Txns) and peroxiredoxin (Prdxs) antioxidant enzymes of the greatest interest in the control of peroxide levels and other biological functions. The transcriptome of loggerhead turtle was sequenced using the Illumina HiSeq 2000 platform and de novo assembly was performed using the Trinity pipeline. The assembly comprised 515,597 contigs with an N50 of 2,631 bp. Contigs were analyzed with CD-Hit obtaining 374,545 unigenes, of which 165,676 had ORFs encoding putative proteins longer than 100 amino acids. A total of 52,147 (31.5%) of these transcripts had significant homology matches in at least one of the five databases used. From the enrichment of GO terms, 180 proteins with antioxidant activity were identified, among these 28 Prdxs and 50 putative Txns. The putative proteins of loggerhead turtles encoded by the genes Prdx1, Prdx3, Prdx5, Prdx6, Txn and Txnip were predicted and characterized in silico. When comparing Prdxs and Txns of loggerhead turtle with homologous human proteins, they showed 18 (9%), 52 (18%), 94 (43%), 36 (16%), 35 (33%) and 74 (19%) amino acid mutations respectively. However, they showed high conservation in active sites and structural motifs (98%), with few specific modifications. Of these, Prdx1, Prdx3, Prdx5, Prdx6, Txn and Txnip presented 0, 25, 18, three, six and two deleterious changes. This study provides a high quality blood transcriptome and functional annotation of loggerhead sea turtles.
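For readers unfamiliar with the N50 statistic quoted in this abstract, the following is a minimal sketch of how it is commonly computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the summed assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Hypothetical helper for illustration only; not code from the cited study.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2            # half of the total assembly length
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:            # first contig that crosses the halfway point
            return length


# Toy example with made-up contig lengths (total 1500 bp, halfway point 750 bp).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembler's FASTA output; a single very long contig can dominate the value, which is why N50 is usually reported together with the contig count, as in the abstract.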
Affiliation(s)
- Javier Hernández-Fernández
  - Department of Natural and Environmental Sciences, Faculty of Science and Engineering, Genetics, Molecular Biology and Bioinformatic Research Group-GENBIMOL, Universidad Jorge Tadeo Lozano, Bogotá, D.C., Colombia
  - Faculty of Sciences, Department of Biology, Pontificia Universidad Javeriana, Bogotá, D.C., Colombia
- Ellie Anne López Barrera
  - Institute of Environmental Studies and Services. IDEASA Research Group-IDEASA, Sergio Arboleda University, Bogotá, D.C., Colombia
- María Del Pilar Rodríguez Becerra
  - Department of Natural and Environmental Sciences, Faculty of Science and Engineering, Genetics, Molecular Biology and Bioinformatic Research Group-GENBIMOL, Universidad Jorge Tadeo Lozano, Bogotá, D.C., Colombia
- M Mar Alba
  - Evolutionary Genomics Group, Research Program on Biomedical Informatics (GRIB), Hospital del Mar Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain

11. CStone: A de novo transcriptome assembler for short-read data that identifies non-chimeric contigs based on underlying graph structure. PLoS Comput Biol 2021; 17:e1009631. [PMID: 34813594 PMCID: PMC8651127 DOI: 10.1371/journal.pcbi.1009631] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/23/2021] [Revised: 12/07/2021] [Accepted: 11/11/2021] [Indexed: 11/19/2022] [Open Access]
Abstract
With the exponential growth of sequence information stored over the last decade, including that of de novo assembled contigs from RNA-Seq experiments, quantification of chimeric sequences has become essential when assembling read data. In transcriptomics, de novo assembled chimeras can closely resemble underlying transcripts, but patterns such as those seen between co-evolving sites, or mapped read counts, become obscured. We have created a de Bruijn based de novo assembler for RNA-Seq data that utilizes a classification system to describe the complexity of underlying graphs from which contigs are created. Each contig is labelled with one of three levels, indicating whether or not ambiguous paths exist. A by-product of this is information on the range of complexity of the underlying gene families present. As a demonstration of CStone's ability to assemble high-quality contigs, and to label them in this manner, both simulated and real data were used. For simulated data, ten million read pairs were generated from cDNA libraries representing four species, Drosophila melanogaster, Panthera pardus, Rattus norvegicus and Serinus canaria. These were assembled using CStone, Trinity and rnaSPAdes; the latter two being high-quality, well established, de novo assemblers. For real data, two RNA-Seq datasets, each consisting of ≈30 million read pairs, representing two adult D. melanogaster whole-body samples were used. The contigs that CStone produced were comparable in quality to those of Trinity and rnaSPAdes in terms of length, sequence identity of aligned regions and the range of cDNA transcripts represented, whilst providing additional information on chimerism. Here we describe the details of CStone's assembly and classification process, and propose that similar classification systems can be incorporated into other de novo assembly tools. Within a related side study, we explore the effects that chimeras within reference sets have on the identification of differentially expressed genes. CStone is available at: https://sourceforge.net/projects/cstone/.
Within transcriptome reference sets, non-chimeric sequences are representations of transcribed genes, while artificially generated chimeric ones are mosaics of two or more pieces of DNA incorrectly pieced together. One area where such sets are utilized is in the quantification of gene expression patterns, where RNA-Seq reads are mapped to the sequences within, and subsequent count values reflect expression levels. Artificial chimeras can have a negative impact on count values by erroneously increasing variation in relation to the reads being mapped. Reference sets can be created from de novo assembled contigs, but chimeras can be introduced during the assembly process via the required traversal of graphs, representing gene families, constructed from the RNA-Seq data. Graph complexity determines how likely chimeras are to arise. We have created CStone, a de novo assembler that utilizes a classification system to describe such complexity. Contigs created by CStone are labelled in a manner that indicates whether or not they are non-chimeric. This encourages contig dependent results to be presented with increased objectivity by maintaining the context of ambiguity associated with the assembly process. CStone has been tested extensively. Additionally, we have quantified the relationship between chimeras within reference sets and the identification of differentially expressed genes.
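The idea that graph complexity signals possible chimerism can be sketched in a few lines of Python. This is not CStone's algorithm or its three-level labelling, which the abstract does not specify; it is a simplified two-way check, under the assumption that a de Bruijn graph with any branching node contains ambiguous paths. The function names and the toy reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in a read."""
    successors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            successors[kmer[:-1]].add(kmer[1:])
    return successors


def has_ambiguous_path(successors):
    """Return True if any node has more than one successor or predecessor.

    A graph without branching can be traversed in only one way, so a contig
    read from it cannot be a mosaic of alternative paths.
    """
    predecessors = defaultdict(set)
    for node, next_nodes in successors.items():
        for next_node in next_nodes:
            predecessors[next_node].add(node)
    branches_out = any(len(nodes) > 1 for nodes in successors.values())
    branches_in = any(len(nodes) > 1 for nodes in predecessors.values())
    return branches_out or branches_in


# Toy reads (illustration only): one linear graph, one that forks after "CGT".
linear_graph = de_bruijn_edges(["ACGTAC"], k=4)
forked_graph = de_bruijn_edges(["ACGTT", "ACGTA"], k=4)
print(has_ambiguous_path(linear_graph), has_ambiguous_path(forked_graph))
```

A real assembler would also need to handle sequencing errors, reverse complements and k-mer coverage, all of which are left out of this sketch.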
|
12
|
Bollati E, Rosenberg Y, Simon-Blecher N, Tamir R, Levy O, Huang D. Untangling the molecular basis of coral response to sedimentation. Mol Ecol 2021; 31:884-901. [PMID: 34738686 DOI: 10.1111/mec.16263] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/24/2021] [Revised: 10/25/2021] [Accepted: 10/28/2021] [Indexed: 12/23/2022]
Abstract
Urbanized coral reefs are often chronically affected by sedimentation and reduced light levels, yet many species of corals appear to be able to thrive under these highly disturbed conditions. Recently, these marginal ecosystems have gained attention as potential climate change refugia due to the shading effect of suspended sediment, as well as potential reservoirs for stress-tolerant species. However, little research exists on the impact of sedimentation on coral physiology, particularly at the molecular level. Here, we investigated the transcriptomic response to sediment stress in corals of the family Merulinidae from a chronically turbid reef (one genet each of Goniastrea pectinata and Mycedium elephantotus from Singapore) and a clear-water reef (multiple genets of G. pectinata from the Gulf of Aqaba/Eilat). In two ex-situ experiments, we exposed corals to either natural sediment or artificial sediment enriched with organic matter and used whole-transcriptome sequencing (RNA sequencing) to quantify gene expression. Analysis revealed a shared basis for the coral transcriptomic response to sediment stress, which involves the expression of genes broadly related to energy metabolism and immune response. In particular, sediment exposure induced upregulation of anaerobic glycolysis and glyoxylate bypass enzymes, as well as genes involved in hydrogen sulphide metabolism and in pathogen pattern recognition. Our results point towards hypoxia as a probable driver of this transcriptomic response, providing a molecular basis to previous work that identified hypoxia as a primary cause of tissue necrosis in sediment-stressed corals. Potential metabolic and immunity trade-offs of corals living under chronic sedimentation should be considered in future studies on the ecology and conservation of turbid reefs.
Affiliation(s)
- Elena Bollati
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore; Department of Biology, Marine Biology Section, University of Copenhagen, Helsingør, Denmark
- Yaeli Rosenberg
- Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
- Noa Simon-Blecher
- Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
- Raz Tamir
- School of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel; The Interuniversity Institute for Marine Sciences in Eilat, Eilat, Israel
- Oren Levy
- Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel; The Interuniversity Institute for Marine Sciences in Eilat, Eilat, Israel
- Danwei Huang
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore; Tropical Marine Science Institute, National University of Singapore, Singapore, Singapore; Centre for Nature-based Climate Solutions, National University of Singapore, Singapore, Singapore
|
13
|
Bucchini F, Del Cortona A, Kreft Ł, Botzki A, Van Bel M, Vandepoele K. TRAPID 2.0: a web application for taxonomic and functional analysis of de novo transcriptomes. Nucleic Acids Res 2021; 49:e101. [PMID: 34197621 PMCID: PMC8464036 DOI: 10.1093/nar/gkab565] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/16/2020] [Revised: 06/07/2021] [Accepted: 06/16/2021] [Indexed: 12/24/2022] Open
Abstract
Advances in high-throughput sequencing have resulted in a massive increase of RNA-Seq transcriptome data. However, the promise of rapid gene expression profiling in a specific tissue, condition, unicellular organism or microbial community comes with new computational challenges. Owing to the limited availability of well-resolved reference genomes, de novo assembled (meta)transcriptomes have emerged as popular tools for investigating the gene repertoire of previously uncharacterized organisms. Yet, despite their potential, these datasets often contain fragmented or contaminant sequences, and their analysis remains difficult. To alleviate some of these challenges, we developed TRAPID 2.0, a web application for the fast and efficient processing of assembled transcriptome data. The initial processing phase performs a global characterization of the input data, providing each transcript with several layers of annotation, comprising structural, functional, and taxonomic information. The exploratory phase enables downstream analyses from the web application. Available analyses include the assessment of gene space completeness, the functional analysis and comparison of transcript subsets, and the study of transcripts in an evolutionary context. A comparison with similar tools highlights TRAPID’s unique features. Finally, analyses performed within TRAPID 2.0 are complemented by interactive data visualizations, facilitating the extraction of new biological insights, as demonstrated with diatom community metatranscriptomes.
Affiliation(s)
- François Bucchini
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Andrea Del Cortona
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Łukasz Kreft
- VIB Bioinformatics Core, VIB, 9052 Ghent, Belgium
- Michiel Van Bel
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Klaas Vandepoele
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, 9052 Ghent, Belgium
|
14
|
Robinson EK, Jagannatha P, Covarrubias S, Cattle M, Smaliy V, Safavi R, Shapleigh B, Abu-Shumays R, Jain M, Cloonan SM, Akeson M, Brooks AN, Carpenter S. Inflammation drives alternative first exon usage to regulate immune genes including a novel iron-regulated isoform of Aim2. eLife 2021; 10:69431. [PMID: 34047695 PMCID: PMC8260223 DOI: 10.7554/elife.69431] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/15/2021] [Accepted: 05/21/2021] [Indexed: 12/11/2022] Open
Abstract
Determining the layers of gene regulation within the innate immune response is critical to our understanding of the cellular responses to infection and dysregulation in disease. We identified a conserved mechanism of gene regulation in human and mouse via changes in alternative first exon (AFE) usage following inflammation, resulting in changes to the isoforms produced. Of these AFE events, we identified 95 unannotated transcription start sites in mice using a de novo transcriptome generated by long-read native RNA-sequencing, one of which is in the cytosolic receptor for dsDNA and known inflammatory inducible gene, Aim2. We show that this unannotated AFE isoform of Aim2 is the predominant isoform expressed during inflammation and contains an iron-responsive element in its 5′UTR enabling mRNA translation to be regulated by iron levels. This work highlights the importance of examining alternative isoform changes and translational regulation in the innate immune response and uncovers novel regulatory mechanisms of Aim2.
Affiliation(s)
- Elektra K Robinson
- Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, United States
- Pratibha Jagannatha
- Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, United States; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, United States
- Sergio Covarrubias
- Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, United States
- Matthew Cattle
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, United States
- Valeriya Smaliy
- Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, United States
- Rojin Safavi
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, United States
- Barbara Shapleigh
- Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, United States
- Robin Abu-Shumays
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, United States
- Miten Jain
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, United States
- Suzanne M Cloonan
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, United States
- Mark Akeson
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, United States
- Angela N Brooks
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, United States
- Susan Carpenter
- Department of Molecular, Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, United States
|
15
|
Scott MA, Woolums AR, Swiderski CE, Perkins AD, Nanduri B, Smith DR, Karisch BB, Epperson WB, Blanton JR. Comprehensive at-arrival transcriptomic analysis of post-weaned beef cattle uncovers type I interferon and antiviral mechanisms associated with bovine respiratory disease mortality. PLoS One 2021; 16:e0250758. [PMID: 33901263 PMCID: PMC8075194 DOI: 10.1371/journal.pone.0250758] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/31/2020] [Accepted: 04/13/2021] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Despite decades of extensive research, bovine respiratory disease (BRD) remains the most devastating disease in beef cattle production. Establishing a clinical diagnosis often relies upon visual detection of non-specific signs, leading to low diagnostic accuracy. Thus, post-weaned beef cattle are often metaphylactically administered antimicrobials at facility arrival, which poses concerns regarding antimicrobial stewardship and resistance. Additionally, there is a lack of high-quality research that addresses the gene-by-environment interactions that underlie why some cattle that develop BRD die while others survive. Therefore, it is necessary to decipher the underlying host genomic factors associated with BRD mortality versus survival to help determine BRD risk and severity. Using transcriptomic analysis of at-arrival whole blood samples from cattle that died of BRD, as compared to those that developed signs of BRD but lived (n = 3 DEAD, n = 3 ALIVE), we identified differentially expressed genes (DEGs) and associated pathways in cattle that died of BRD. Additionally, we evaluated unmapped reads, which are often overlooked within transcriptomic experiments. RESULTS 69 DEGs (FDR<0.10) were identified between ALIVE and DEAD cohorts. Several DEGs possess immunological and proinflammatory function and associations with TLR4 and IL6. Biological processes, pathways, and disease phenotype associations related to type-I interferon production and antiviral defense were enriched in DEAD cattle at arrival. Unmapped reads aligned primarily to various ungulate assemblies, but failed to align to viral assemblies. CONCLUSION This study further revealed increased proinflammatory immunological mechanisms in cattle that develop BRD. DEGs upregulated in DEAD cattle were predominantly involved in innate immune pathways typically associated with antiviral defense, although no viral genes were identified within unmapped reads. 
Our findings provide genomic targets for further analysis in cattle at highest risk of BRD, suggesting that mechanisms related to type I interferons and antiviral defense may be indicative of viral respiratory disease at arrival and contribute to eventual BRD mortality.
Affiliation(s)
- Matthew A. Scott
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Amelia R. Woolums
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Cyprianna E. Swiderski
- Department of Clinical Sciences, Mississippi State University, Mississippi State, MS, United States of America
- Andy D. Perkins
- Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS, United States of America
- Bindu Nanduri
- Department of Basic Sciences, Mississippi State University College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States of America
- David R. Smith
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Brandi B. Karisch
- Department of Animal and Dairy Sciences, Mississippi State University, Mississippi State, MS, United States of America
- William B. Epperson
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- John R. Blanton
- Department of Animal and Dairy Sciences, Mississippi State University, Mississippi State, MS, United States of America
|
16
|
Behera S, Voshall A, Moriyama EN. Plant Transcriptome Assembly: Review and Benchmarking. Bioinformatics 2021. [DOI: 10.36255/exonpublications.bioinformatics.2021.ch7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
|
17
|
Guang A, Howison M, Zapata F, Lawrence C, Dunn CW. Revising transcriptome assemblies with phylogenetic information. PLoS One 2021; 16:e0244202. [PMID: 33434218 PMCID: PMC7802918 DOI: 10.1371/journal.pone.0244202] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/06/2020] [Accepted: 12/04/2020] [Indexed: 11/18/2022] Open
Abstract
A common transcriptome assembly error is to mistake different transcripts of the same gene as transcripts from multiple closely related genes. This error is difficult to identify during assembly, but in a phylogenetic analysis such errors can be diagnosed from gene phylogenies where they appear as clades of tips from the same species with improbably short branch lengths. treeinform is a method that uses phylogenetic information across species to refine transcriptome assemblies within species. It identifies transcripts of the same gene that were incorrectly assigned to multiple genes and reassigns them as transcripts of the same gene. The treeinform method is implemented in Agalma, available at https://bitbucket.org/caseywdunn/agalma, and the general approach is relevant in a variety of other contexts.
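The diagnostic described above (single-species clades with very short branches) can be sketched as a small tree traversal. This is an illustrative Python sketch and not the Agalma implementation: the `Node` class, the use of summed subtree branch length as the criterion, the threshold value and the toy tree are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """A node in a rooted gene tree; tips carry a transcript name and species."""
    branch_length: float = 0.0
    name: Optional[str] = None
    species: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def tips(node: Node) -> List[Node]:
    """Return all tips (leaf nodes) below a node."""
    if not node.children:
        return [node]
    return [tip for child in node.children for tip in tips(child)]


def subtree_length(node: Node) -> float:
    """Sum the branch lengths below a node, excluding the node's own branch."""
    return sum(c.branch_length + subtree_length(c) for c in node.children)


def candidate_clades(node: Node, threshold: float) -> Iterator[List[str]]:
    """Yield tip names of the largest single-species clades shorter than threshold.

    Each yielded clade is a candidate set of transcripts that may belong to
    one gene rather than to several closely related genes.
    """
    if not node.children:
        return
    leaves = tips(node)
    one_species = len({leaf.species for leaf in leaves}) == 1
    if one_species and subtree_length(node) < threshold:
        yield sorted(leaf.name for leaf in leaves)
        return
    for child in node.children:
        yield from candidate_clades(child, threshold)


# Toy gene tree (illustration only): t1 and t2 are near-identical sp1 tips.
tree = Node(children=[
    Node(0.3, children=[Node(0.001, "t1", "sp1"), Node(0.002, "t2", "sp1")]),
    Node(0.4, "t3", "sp2"),
])
print(list(candidate_clades(tree, threshold=0.01)))
```

Choosing the threshold is the sensitive step: set too high, true paralogs would be merged; set too low, split transcripts of one gene would be missed.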
Affiliation(s)
- August Guang
- Center for Computational Biology of Human Disease, Brown University, Providence, RI, United States of America
- Center for Computation and Visualization, Brown University, Providence, RI, United States of America
- Mark Howison
- Research Improving People’s Lives, Providence, RI, United States of America
- Felipe Zapata
- Department of Ecology & Evolutionary Biology, University of California-Los Angeles, Los Angeles, CA, United States of America
- Charles Lawrence
- Department of Applied Mathematics, Brown University, Providence, RI, United States of America
- Casey W. Dunn
- Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT, United States of America
|
18
|
Marisaldi L, Basili D, Gioacchini G, Canapa A, Carnevali O. De novo transcriptome assembly, functional annotation and characterization of the Atlantic bluefin tuna (Thunnus thynnus) larval stage. Mar Genomics 2020; 58:100834. [PMID: 33371994 DOI: 10.1016/j.margen.2020.100834] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/29/2020] [Revised: 12/10/2020] [Accepted: 12/11/2020] [Indexed: 10/22/2022]
Abstract
In the present work, we assembled and characterized a de novo larval transcriptome of the Atlantic bluefin tuna Thunnus thynnus by taking advantage of publicly available databases with the goal of better understanding its larval development. The assembled transcriptome comprised 37,117 protein-coding transcripts, of which 13,633 were full-length (>80% coverage), with an Ex90N50 of 3061 bp and 76% of complete and single-copy core vertebrate gene orthologues. Of these transcripts, 34,980 had a hit against the EggNOG database and 14,983 against the KEGG database. Codon usage bias was identified in processes such as translation and muscle development. By comparing our data with a set of representative fish species, 87.1% of tuna transcripts were included in orthogroups with other species and 5.1% in assembly-specific orthogroups, which were enriched in terms related to muscle and bone development, visual system and ion transport. Following this comparative approach, protein families related to myosin, extracellular matrix and immune system were found to be significantly expanded in the Atlantic bluefin tuna. Altogether, these results provide a glimpse of how the Atlantic bluefin tuna might have achieved early physical advantages over competing species in the pelagic environment. The information generated lays the foundation for future research on the more detailed exploration of physiological responses at the molecular level in different larval stages and paves the way to evolutionary studies on the Atlantic bluefin tuna.
Affiliation(s)
- Luca Marisaldi
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy
- Danilo Basili
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy; Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
- Giorgia Gioacchini
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy
- Adriana Canapa
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy
- Oliana Carnevali
- Department of Life and Environmental Sciences, Università Politecnica delle Marche, Ancona 60131, Italy
|
19
|
An improved de novo assembling and polishing of Solea senegalensis transcriptome shed light on retinoic acid signalling in larvae. Sci Rep 2020; 10:20654. [PMID: 33244091 PMCID: PMC7691524 DOI: 10.1038/s41598-020-77201-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/19/2020] [Accepted: 11/06/2020] [Indexed: 12/17/2022] Open
Abstract
Senegalese sole is an economically important flatfish species in aquaculture and an attractive model to decipher the molecular mechanisms governing the severe transformations occurring during metamorphosis, where retinoic acid seems to play a key role in tissue remodeling. In this study, a robust sole transcriptome was envisaged by reducing the number of assembled libraries (27 out of 111 available), fine-tuning a new automated and reproducible set of workflows for de novo assembling based on several assemblers, and removing low confidence transcripts after mapping onto a sole female genome draft. From a total of 96 resulting assemblies, two "raw" transcriptomes, one containing only Illumina reads and another with Illumina and GS-FLX reads, were selected to provide SOLSEv5.0, the most informative transcriptome with low redundancy and devoid of most single-exon transcripts. It included both Illumina and GS-FLX reads and consisted of 51,348 transcripts of which 22,684 code for 17,429 different proteins described in databases, where 9527 were predicted as complete proteins. SOLSEv5.0 was used as reference for the study of retinoic acid (RA) signalling in sole larvae using drug treatments (DEAB, a RA synthesis blocker, and TTNPB, a RA-receptor agonist) for 24 and 48 h. Differential expression and functional interpretation were facilitated by an updated version of DEGenes Hunter. Acute exposure of both drugs triggered an intense, specific and transient response at 24 h but with hardly observable differences after 48 h at least in the DEAB treatments. Activation of RA signalling by TTNPB specifically increased the expression of genes in pathways related to RA degradation, retinol storage, carotenoid metabolism, homeostatic response and visual cycle, and also modified the expression of transcripts related to morphogenesis and collagen fibril organisation. 
In contrast, DEAB mainly decreased genes related to retinal production, impairing phototransduction signalling in the retina. A total of 755 transcripts mainly related to lipid metabolism, lipid transport and lipid homeostasis were altered in response to both treatments, indicating non-specific drug responses associated with intestinal absorption. These results indicate that a new assembling and transcript sieving were both necessary to provide a reliable transcriptome to identify the many aspects of RA action during sole development that are of relevance for sole aquaculture.
|
20
|
Mora-Márquez F, Vázquez-Poletti JL, Chano V, Collada C, Soto Á, de Heredia UL. Hardware Performance Evaluation of De novo Transcriptome Assembly Software in Amazon Elastic Compute Cloud. Curr Bioinform 2020. [DOI: 10.2174/1574893615666191219095817] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
Background:
Bioinformatics software for RNA-seq analysis has a high computational
requirement in terms of the number of CPUs, RAM size, and processor characteristics.
Specifically, de novo transcriptome assembly demands large computational infrastructure due to
the massive data size, and complexity of the algorithms employed. Comparative studies on the
quality of the transcriptome yielded by de novo assemblers have been previously published,
lacking, however, a hardware efficiency-oriented approach to help select the assembly hardware
platform in a cost-efficient way.
Objective:
We tested the performance of two popular de novo transcriptome assemblers, Trinity
and SOAPdenovo-Trans (SDNT), in terms of cost-efficiency and quality to assess limitations, and
provided troubleshooting and guidelines to run transcriptome assemblies efficiently.
Methods:
We built virtual machines with different hardware characteristics (CPU number, RAM
size) in the Amazon Elastic Compute Cloud of the Amazon Web Services. Using simulated and
real data sets, we measured the elapsed time, cost, CPU percentage and output size of small and
large data set assemblies.
Results:
For small data sets, SDNT outperformed Trinity by an order of magnitude, significantly
reducing the time duration and costs of the assembly. For large data sets, Trinity performed better
than SDNT. Both the assemblers provide good quality transcriptomes.
Conclusion:
The selection of the optimal transcriptome assembler and provision of computational
resources depend on the combined effect of size and complexity of RNA-seq experiments.
Affiliation(s)
- Fernando Mora-Márquez
- GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- José Luis Vázquez-Poletti
- GI Arquitectura de Sistemas Distribuidos, Dpto. Arquitectura de Computadores y Automatica, Facultad de Informatica, Universidad Complutense de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Víctor Chano
- GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Carmen Collada
- GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Álvaro Soto
- GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
- Unai López de Heredia
- GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politecnica de Madrid, Ciudad Universitaria, 28040 Madrid, Spain
|
21
|
Landis JB, Kurti A, Lawhorn AJ, Litt A, McCarthy EW. Differential Gene Expression with an Emphasis on Floral Organ Size Differences in Natural and Synthetic Polyploids of Nicotiana tabacum (Solanaceae). Genes (Basel) 2020; 11:E1097. [PMID: 32961813 PMCID: PMC7563459 DOI: 10.3390/genes11091097] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/25/2020] [Revised: 09/14/2020] [Accepted: 09/16/2020] [Indexed: 11/16/2022] Open
Abstract
Floral organ size, especially the size of the corolla, plays an important role in plant reproduction by facilitating pollination efficiency. Previous studies have outlined a hypothesized organ size pathway. However, the expression and function of many of the genes in the pathway have only been investigated in model diploid species; therefore, it is unknown how these genes interact in polyploid species. Although correlations between ploidy and cell size have been shown in many systems, it is unclear whether there is a difference in cell size between naturally occurring and synthetic polyploids. To address these questions comparing floral organ size and cell size across ploidy, we use natural and synthetic polyploids of Nicotiana tabacum (Solanaceae) as well as their known diploid progenitors. We employ a comparative transcriptomics approach to perform analyses of differential gene expression, focusing on candidate genes that may be involved in floral organ size, both across developmental stages and across accessions. We see differential expression of several known floral organ candidate genes including ARF2, BIG BROTHER, and GASA/GAST1. Results from linear models show that ploidy, cell width, and cell number positively influence corolla tube circumference; however, the effect of cell width varies by ploidy, and diploids have a significantly steeper slope than both natural and synthetic polyploids. These results demonstrate that polyploids have wider cells and that polyploidy significantly increases corolla tube circumference.
Collapse
Affiliation(s)
- Jacob B. Landis
  - Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA 92521, USA
  - School of Integrative Plant Science, Section of Plant Biology and the L.H. Bailey Hortorium, Cornell University, Ithaca, NY 14853, USA
- Amelda Kurti
  - Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA 92521, USA
- Amber J. Lawhorn
  - Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA 92521, USA
- Amy Litt
  - Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA 92521, USA
- Elizabeth W. McCarthy
  - Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA 92521, USA
  - Department of Biology, SUNY Cortland, Cortland, NY 13045, USA
|
22
|
The Peptide Venom Composition of the Fierce Stinging Ant Tetraponera aethiops (Formicidae: Pseudomyrmecinae). Toxins (Basel) 2019; 11:toxins11120732. [PMID: 31847368 PMCID: PMC6950161 DOI: 10.3390/toxins11120732] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/24/2019] [Revised: 12/06/2019] [Accepted: 12/11/2019] [Indexed: 12/19/2022] Open
Abstract
In the mutualisms involving certain pseudomyrmicine ants and different myrmecophytes (i.e., plants sheltering colonies of specialized “plant-ant” species in hollow structures), the ant venom contributes to the host plant biotic defenses by inducing the rapid paralysis of defoliating insects and causing intense pain to browsing mammals. Using integrated transcriptomic and proteomic approaches, we identified the venom peptidome of the plant-ant Tetraponera aethiops (Pseudomyrmecinae). The transcriptomic analysis of its venom glands revealed that 40% of the expressed contigs encoded only seven peptide precursors related to the ant venom peptides from the A-superfamily. Among the 12 peptide masses detected by liquid chromatography-mass spectrometry (LC–MS), nine mature peptide sequences were characterized and confirmed through proteomic analysis. These venom peptides, called pseudomyrmecitoxins (PSDTX), share amino acid sequence identities with myrmeciitoxins known for their dual offensive and defensive functions on both insects and mammals. Furthermore, we demonstrated through reduction/alkylation of the crude venom that four PSDTXs were homo- and heterodimeric. Thus, we provide the first insights into the defensive venom composition of the ant genus Tetraponera indicative of a streamlined peptidome.
|