1
Zicola J, Dasari P, Hahn KK, Ziese-Kubon K, Meurer A, Buhl T, Scholten S. De novo transcriptome assembly of the oak processionary moth Thaumetopoea processionea. BMC Genom Data 2024; 25:55. [PMID: 38851674 PMCID: PMC11161914 DOI: 10.1186/s12863-024-01237-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 05/24/2024] [Indexed: 06/10/2024] Open
Abstract
OBJECTIVES: The oak processionary moth (OPM) (Thaumetopoea processionea) is a species of moth (order: Lepidoptera) native to parts of central Europe. In recent years, however, it has become an invasive species in various countries, particularly the United Kingdom and the Netherlands. The larvae of the OPM are covered with urticating barbed hairs (setae) that cause irritating and allergic reactions at the last three larval stages (L3-L5). The aim of our study was to generate a de novo transcriptomic assembly for OPM larvae by including one non-allergenic stage (L2) and two allergenic stages (L4 and L5). A transcriptomic assembly will help identify potential allergenic peptides produced by OPM larvae, providing valuable information for developing novel therapeutic strategies and allergic immunodiagnostic assays. DATA: Transcriptomes of three larval stages of the OPM were de novo assembled and annotated using Trinity and Trinotate, respectively. A total of 145,251 transcripts from 99,868 genes were identified. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicated high completeness of the assembly. About 19,600 genes are differentially expressed between the non-allergenic and allergenic larval stages. The data provided here contribute to the characterization of OPM, which is both an invasive species and a health hazard.
Affiliation(s)
- Johan Zicola
  - Division of Crop Plant Genetics, Department of Crop Science, Georg-August-University Göttingen, Göttingen, Germany
  - Center for integrated Breeding Research (CiBreed), Göttingen, Germany
- Prasad Dasari
  - Department of Dermatology, University Medical Center Göttingen, Göttingen, Germany
- Katharina Klara Hahn
  - Department of Dermatology, University Medical Center Göttingen, Göttingen, Germany
- Katharina Ziese-Kubon
  - Division of Crop Plant Genetics, Department of Crop Science, Georg-August-University Göttingen, Göttingen, Germany
  - Center for integrated Breeding Research (CiBreed), Göttingen, Germany
- Armin Meurer
  - Faculty of Resource Management, University of Applied Sciences and Arts (HAWK), Göttingen, Germany
- Timo Buhl
  - Department of Dermatology, University Medical Center Göttingen, Göttingen, Germany
- Stefan Scholten
  - Division of Crop Plant Genetics, Department of Crop Science, Georg-August-University Göttingen, Göttingen, Germany
  - Center for integrated Breeding Research (CiBreed), Göttingen, Germany
2
Shabbir M, Mithani A. Roast: a tool for reference-free optimization of supertranscriptome assemblies. BMC Bioinformatics 2024; 25:2. [PMID: 38166712 PMCID: PMC10763045 DOI: 10.1186/s12859-023-05614-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 12/12/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND: Transcriptomic studies involving organisms for which reference genomes are not available typically start by generating a de novo transcriptome or supertranscriptome assembly from the raw RNA-seq reads. Assembling a supertranscriptome is, however, a challenging task because of the widely varying abundance of mRNA transcripts, alternative splicing, and sequencing errors. As a result, popular de novo supertranscriptome assembly tools generate assemblies containing contigs that are partially assembled, fragmented, falsely chimeric, or locally mis-assembled, which decreases assembly accuracy. Commonly available tools for assembly improvement rely primarily on running BLAST against closely related species, which makes their accuracy and reliability conditional on the availability of data for closely related organisms. RESULTS: We present ROAST, a tool for optimization of supertranscriptome assemblies that uses paired-end RNA-seq data from the Illumina sequencing platform to iteratively identify and fix assembly errors. It relies solely on the error signatures generated by RNA-seq alignment tools, including soft-clips, unexpected expression coverage, and reads with mates unmapped or mapped on a different contig, without performing BLAST searches against other organisms. Evaluation results using simulated as well as real datasets show that ROAST significantly improves assembly quality by identifying and fixing various assembly errors. CONCLUSION: ROAST provides a reference-free approach to optimizing supertranscriptome assemblies, highlighting its utility in refining de novo supertranscriptome assemblies of non-model organisms.
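One of the alignment error signatures named in this abstract, soft-clipping, can be read directly from the CIGAR field of a SAM record. The sketch below only illustrates that signal and is not ROAST's implementation: the function names and the `min_clip` threshold of 10 bases are assumptions introduced here.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (left, right) soft-clipped base counts for one CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips, so ignore them here.
    ops = [(n, op) for n, op in ops if op != "H"]
    if not ops:
        return (0, 0)  # e.g. "*" for a read with no alignment
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (left, right)


def clipped_read_fraction(cigars, min_clip=10):
    """Fraction of reads with at least `min_clip` soft-clipped bases at either end.

    `min_clip` is a hypothetical threshold chosen for illustration.
    """
    cigars = list(cigars)
    if not cigars:
        return 0.0
    clipped = sum(1 for c in cigars if max(soft_clip_lengths(c)) >= min_clip)
    return clipped / len(cigars)
```

A contig region where this fraction is unusually high would be a candidate for closer inspection; what counts as unusually high is left open in this sketch.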
Affiliation(s)
- Madiha Shabbir
  - Department of Life Sciences, Syed Babar Ali School of Science and Engineering, Lahore University of Management Sciences (LUMS), DHA, Lahore, 54792, Pakistan
- Aziz Mithani
  - Department of Life Sciences, Syed Babar Ali School of Science and Engineering, Lahore University of Management Sciences (LUMS), DHA, Lahore, 54792, Pakistan
3
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. Genome Biol Evol 2023; 15:evad211. [PMID: 38000902 PMCID: PMC10709115 DOI: 10.1093/gbe/evad211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023] Open
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
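The Ornstein-Uhlenbeck behaviour described in this abstract, values constrained around an optimum, can be illustrated with a small simulation. This is a minimal sketch of a single lineage rather than a model fitted on a phylogeny, and it is not the authors' code: the function names and the symbols `theta` (optimum), `alpha` (strength of the pull toward the optimum), and `sigma` (noise intensity) follow common notation and are assumptions introduced here.

```python
import math
import random


def stationary_variance(alpha, sigma):
    """Long-run variance of an OU process: sigma^2 / (2 * alpha)."""
    return sigma ** 2 / (2.0 * alpha)


def simulate_ou(x0, theta, alpha, sigma, dt, n_steps, rng=None):
    """Simulate one lineage under an OU process using its exact transition.

    x0: starting value; theta: optimum; alpha: pull toward theta (> 0);
    sigma: noise intensity; dt: step length.
    Returns the path as a list of n_steps + 1 values, starting with x0.
    """
    if rng is None:
        rng = random.Random()
    decay = math.exp(-alpha * dt)
    # Standard deviation of the noise accumulated over one step of length dt.
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    path = [x0]
    for _ in range(n_steps):
        # The expected value relaxes exponentially from the current value to theta.
        mean = theta + (path[-1] - theta) * decay
        path.append(mean + step_sd * rng.gauss(0.0, 1.0))
    return path
```

With `sigma` set to zero the path decays deterministically toward `theta`; with larger `alpha` the values stay closer to the optimum, which is the sense in which the process is constrained.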
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
4
Aurelio AMM, Fabián CAF, Carlos Iván CC, Felipe GL. Optimized method for differential gene expression analysis in non-model species: Case of Cedrela odorata L. MethodsX 2023; 11:102449. [PMID: 37920871 PMCID: PMC10618499 DOI: 10.1016/j.mex.2023.102449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 10/17/2023] [Indexed: 11/04/2023] Open
Abstract
The following protocol introduces a targeted methodological approach to differential gene expression analysis, which is particularly beneficial in the context of non-model species. While we acknowledge that biological complexity often involves the interplay of multiple genes in any given biological response, our method provides a strategy to streamline this complexity, enabling researchers to focus on a more manageable subset of genes of interest. In this context, the red cedar (Cedrela odorata L.) transcriptome and known or hypothetical genes related to the response to herbivory were used as reference. The protocol key points are:
- Implementation of a transcriptome thinning process to eliminate redundant and non-coding sequences, optimizing the analysis and reducing processing time.
- Use of a custom gene database to identify and retain coding sequences with high precision.
- Focus on specific genes of interest, allowing a more targeted analysis for specific experimental conditions.

This approach holds particular value for pilot studies, research with limited resources, or when rapid identification and validation of candidate genes are needed in species without a reference genome.
Affiliation(s)
- Aragón-Magadán Marco Aurelio
  - Agricultural and Livestock Researches, National Genetic Resources Center, National Institute of Forestry, Jalisco, Mexico
- Cruz-Cárdenas Carlos Iván
  - Agricultural and Livestock Researches, National Genetic Resources Center, National Institute of Forestry, Jalisco, Mexico
- Guzmán Luis Felipe
  - Agricultural and Livestock Researches, National Genetic Resources Center, National Institute of Forestry, Jalisco, Mexico
5
Dwivedi SL, Quiroz LF, Reddy ASN, Spillane C, Ortiz R. Alternative Splicing Variation: Accessing and Exploiting in Crop Improvement Programs. Int J Mol Sci 2023; 24:15205. [PMID: 37894886 PMCID: PMC10607462 DOI: 10.3390/ijms242015205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Revised: 10/09/2023] [Accepted: 10/10/2023] [Indexed: 10/29/2023] Open
Abstract
Alternative splicing (AS) is a gene regulatory mechanism modulating gene expression in multiple ways. AS is prevalent in all eukaryotes including plants. AS generates two or more mRNAs from the precursor mRNA (pre-mRNA) to regulate transcriptome complexity and proteome diversity. Advances in next-generation sequencing, omics technology, bioinformatics tools, and computational methods provide new opportunities to quantify and visualize AS-based quantitative trait variation associated with plant growth, development, reproduction, and stress tolerance. Domestication, polyploidization, and environmental perturbation may evolve novel splicing variants associated with agronomically beneficial traits. To date, pre-mRNAs from many genes are spliced into multiple transcripts that cause phenotypic variation for complex traits, both in model plant Arabidopsis and field crops. Cataloguing and exploiting such variation may provide new paths to enhance climate resilience, resource-use efficiency, productivity, and nutritional quality of staple food crops. This review provides insights into AS variation alongside a gene expression analysis to select for novel phenotypic diversity for use in breeding programs. AS contributes to heterosis, enhances plant symbiosis (mycorrhiza and rhizobium), and provides a mechanistic link between the core clock genes and diverse environmental clues.
Affiliation(s)
- Luis Felipe Quiroz
  - Agriculture and Bioeconomy Research Centre, Ryan Institute, University of Galway, University Road, H91 REW4 Galway, Ireland
- Anireddy S N Reddy
  - Department of Biology and Program in Cell and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Charles Spillane
  - Agriculture and Bioeconomy Research Centre, Ryan Institute, University of Galway, University Road, H91 REW4 Galway, Ireland
- Rodomiro Ortiz
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, 23053 Alnarp, SE, Sweden
6
Zhang R, Liu Q, Pan S, Zhang Y, Qin Y, Du X, Yuan Z, Lu Y, Song Y, Zhang M, Zhang N, Ma J, Zhang Z, Jia X, Wang K, He S, Liu S, Ni M, Liu X, Xu X, Yang H, Wang J, Seim I, Fan G. A single-cell atlas of West African lungfish respiratory system reveals evolutionary adaptations to terrestrialization. Nat Commun 2023; 14:5630. [PMID: 37699889 PMCID: PMC10497629 DOI: 10.1038/s41467-023-41309-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Accepted: 08/30/2023] [Indexed: 09/14/2023] Open
Abstract
The six species of lungfish possess both lungs and gills and are the closest extant relatives of tetrapods. Here, we report a single-cell transcriptome atlas of the West African lungfish (Protopterus annectens). This species manifests the most extreme form of terrestrialization, a life history strategy to survive dry periods that can last for years, characterized by dormancy and reversible adaptive changes of the gills and lungs. Our atlas highlights the cell type diversity of the West African lungfish, including gene expression consistent with phenotype changes of terrestrialization. Comparison with terrestrial tetrapods and ray-finned fishes reveals broad homology between the swim bladder and lung cell types as well as shared and idiosyncratic changes of the external gills of the West African lungfish and the internal gills of Atlantic salmon. The single-cell atlas presented here provides a valuable resource for further exploration of the respiratory system evolution in vertebrates and the diversity of lungfish terrestrialization.
Affiliation(s)
- Ruihua Zhang
  - College of Life Sciences, University of Chinese Academy of Sciences, 100049, Beijing, China
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Qun Liu
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
  - Department of Biology, University of Copenhagen, Copenhagen, 2100, Denmark
- Shanshan Pan
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Yingying Zhang
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Yating Qin
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Xiao Du
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
  - BGI Research, 518083, Shenzhen, China
- Zengbao Yuan
  - College of Life Sciences, University of Chinese Academy of Sciences, 100049, Beijing, China
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Yongrui Lu
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, 430072, Wuhan, China
- Yue Song
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Nannan Zhang
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Jie Ma
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
- Xiaodong Jia
  - Joint Laboratory for Translational Medicine Research, Liaocheng People's Hospital, 252000, Liaocheng, Shandong, P.R. China
- Kun Wang
  - Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, 710072, Xi'an, China
- Shunping He
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, 430072, Wuhan, China
- Shanshan Liu
  - BGI Research, 518083, Shenzhen, China
  - MGI Tech, 518083, Shenzhen, China
- Ming Ni
  - BGI Research, 518083, Shenzhen, China
  - MGI Tech, 518083, Shenzhen, China
- Xin Liu
  - BGI Research, 518083, Shenzhen, China
- Xun Xu
  - BGI Research, 518083, Shenzhen, China
  - Guangdong Provincial Key Laboratory of Genome Read and Write, BGI Research, 518083, Shenzhen, China
- Jian Wang
  - BGI Research, 518083, Shenzhen, China
- Inge Seim
  - Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, Nanjing, China
  - School of Biology and Environmental Science, Queensland University of Technology, Brisbane, 4000, Australia
- Guangyi Fan
  - BGI Research, 266555, Qingdao, China
  - Qingdao Key Laboratory of Marine Genomics, BGI Research, 266555, Qingdao, China
  - BGI Research, 518083, Shenzhen, China
7
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. bioRxiv 2023:2023.02.09.527893. [PMID: 37645857 PMCID: PMC10461906 DOI: 10.1101/2023.02.09.527893] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/31/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well-described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred model for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Canada
  - Department of Genetics, Washington University School of Medicine, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, USA
  - Department of Biological Sciences, University of Southern California, USA
8
Son KH, Aldonza MBD, Nam AR, Lee KH, Lee JW, Shin KJ, Kang K, Cho JY. Integrative mapping of the dog epigenome: Reference annotation for comparative intertissue and cross-species studies. Sci Adv 2023; 9:eade3399. [PMID: 37406108 DOI: 10.1126/sciadv.ade3399] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/09/2022] [Accepted: 06/02/2023] [Indexed: 07/07/2023]
Abstract
Dogs have become a valuable model in exploring multifaceted diseases and biology relevant to human health. Despite large-scale dog genome projects producing high-quality draft references, a comprehensive annotation of functional elements is still lacking. We addressed this through integrative next-generation sequencing of transcriptomes paired with five histone marks and DNA methylome profiling across 11 tissue types, deciphering the dog's epigenetic code by defining distinct chromatin states, super-enhancer, and methylome landscapes, and thus showed that these regions are associated with a wide range of biological functions and cell/tissue identity. In addition, we confirmed that the phenotype-associated variants are enriched in tissue-specific regulatory regions and, therefore, the tissue of origin of the variants can be traced. Ultimately, we delineated conserved and dynamic epigenomic changes at the tissue- and species-specific resolutions. Our study provides an epigenomic blueprint of the dog that can be used for comparative biology and medical research.
Affiliation(s)
- Keun Hong Son
  - Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
  - Comparative Medicine and Disease Research Center (CDRC), Science Research Center (SRC), Seoul National University, Seoul, Korea
  - BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
- Mark Borris D Aldonza
  - Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
  - Comparative Medicine and Disease Research Center (CDRC), Science Research Center (SRC), Seoul National University, Seoul, Korea
  - BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
- A-Reum Nam
  - Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
  - Comparative Medicine and Disease Research Center (CDRC), Science Research Center (SRC), Seoul National University, Seoul, Korea
  - BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
- Kang-Hoon Lee
  - Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
  - BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
- Jeong-Woon Lee
  - Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
  - Comparative Medicine and Disease Research Center (CDRC), Science Research Center (SRC), Seoul National University, Seoul, Korea
  - BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
- Kyung-Ju Shin
  - Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
  - BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
- Keunsoo Kang
  - Department of Microbiology, College of Natural Sciences, Dankook University, Cheonan, Korea
- Je-Yoel Cho
  - Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, Korea
  - Comparative Medicine and Disease Research Center (CDRC), Science Research Center (SRC), Seoul National University, Seoul, Korea
  - BK21 PLUS Program for Creative Veterinary Science Research and Research Institute for Veterinary Science, Seoul National University, Seoul, Korea
9
Shao C, Tao S, Liang Y. Comparative transcriptome analysis of juniper branches infected by Gymnosporangium spp. highlights their different infection strategies associated with cytokinins. BMC Genomics 2023; 24:173. [PMID: 37020280 PMCID: PMC10077639 DOI: 10.1186/s12864-023-09276-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 03/27/2023] [Indexed: 04/07/2023] Open
Abstract
BACKGROUND: Gymnosporangium asiaticum and G. yamadae can share Juniperus chinensis as the telial host, but the symptoms they cause are completely different. Infection by G. yamadae enlarges the phloem and cortex of young branches into a gall, whereas infection by G. asiaticum does not, suggesting that the two Gymnosporangium species interact with junipers through different molecular mechanisms. RESULTS: Comparative transcriptome analysis was performed to investigate gene regulation in juniper in response to infection by G. asiaticum and G. yamadae at different stages. Functional enrichment analysis showed that genes related to transport, catabolism and transcription pathways were up-regulated, while genes related to energy metabolism and photosynthesis were down-regulated in juniper branch tissues after infection with G. asiaticum and G. yamadae. Transcript profiling of G. yamadae-induced gall tissues revealed that more genes involved in photosynthesis, sugar metabolism, plant hormones and defense-related pathways were up-regulated in the vigorous development stage of the gall than in the initial stage, and that these genes were eventually repressed overall. Furthermore, the concentration of cytokinins (CKs) in the gall tissue and the telia of G. yamadae was significantly higher than in healthy branch tissues of juniper. In addition, a tRNA-isopentenyltransferase (tRNA-IPT) with high expression levels during the gall development stages was identified in G. yamadae. CONCLUSIONS: In general, our study provides new insights into the host-specific mechanisms by which G. asiaticum and G. yamadae differentially utilize CKs, and into their specific adaptations to juniper during co-evolution.
Affiliation(s)
- Chenxi Shao
  - The Key Laboratory for Silviculture and Conservation of Ministry of Education, College of Forestry, Beijing Forestry University, Beijing, 100083, China
- Siqi Tao
  - The Key Laboratory for Silviculture and Conservation of Ministry of Education, College of Forestry, Beijing Forestry University, Beijing, 100083, China
- Yingmei Liang
  - Museum of Beijing Forestry University, Beijing Forestry University, No. 35, Qinghua Eastern Road, Beijing, 100083, China
10
Abstract
Polyploidizations, or whole-genome duplications (WGDs), in plants have increased biological complexity, facilitated evolutionary innovation, and likely enabled adaptation under harsh conditions. Besides genomic data, transcriptome data have been widely employed to detect WGDs, due to their efficient accessibility to the gene space of a species. Age distributions based on synonymous substitutions (so-called KS age distributions) for paralogs assembled from transcriptome data have identified numerous WGDs in plants, paving the way for further studies on the importance of WGDs for the evolution of seed and flowering plants. However, it is still unclear how transcriptome-based age distributions compare to those based on genomic data. In this chapter, we implemented three different de novo transcriptome assembly pipelines with two popular assemblers, i.e., Trinity and SOAPdenovo-Trans. We selected six plant species with published genomes and transcriptomes to evaluate how assembled transcripts from different pipelines perform when using KS distributions to detect previously documented WGDs in the six species. Further, using genes predicted in each genome as references, we evaluated the effects of missing genes, gene family clustering, and de novo assembled transcripts on the transcriptome-based KS distributions. Our results show that, although the transcriptome-based KS distributions differ from the genome-based ones with respect to their shapes and scales, they are still reasonably reliable for unveiling WGDs, except in species where most duplicates originated from a recent WGD. We also discuss how to overcome some possible pitfalls when using transcriptome data to identify WGDs.
Affiliation(s)
- Jia Li
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, VIB, Ghent, Belgium
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Zhen Li
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
11
Church SH, Munro C, Dunn CW, Extavour CG. The evolution of ovary-biased gene expression in Hawaiian Drosophila. PLoS Genet 2023; 19:e1010607. [PMID: 36689550 PMCID: PMC9894553 DOI: 10.1371/journal.pgen.1010607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Revised: 02/02/2023] [Accepted: 01/09/2023] [Indexed: 01/24/2023] Open
Abstract
With detailed data on gene expression accessible from an increasingly broad array of species, we can test the extent to which our developmental genetic knowledge from model organisms predicts expression patterns and variation across species. But to know when differences in gene expression across species are significant, we first need to know how much evolutionary variation in gene expression we expect to observe. Here we provide an answer by analyzing RNAseq data across twelve species of Hawaiian Drosophilidae flies, focusing on gene expression differences between the ovary and other tissues. We show that over evolutionary time, there exists a cohort of ovary specific genes that is stable and that largely corresponds to described expression patterns from laboratory model Drosophila species. Our results also provide a demonstration of the prediction that, as phylogenetic distance increases, variation between species overwhelms variation between tissue types. Using ancestral state reconstruction of expression, we describe the distribution of evolutionary changes in tissue-biased expression, and use this to identify gains and losses of ovary-biased expression across these twelve species. We then use this distribution to calculate the evolutionary correlation in expression changes between genes, and demonstrate that genes with known interactions in D. melanogaster are significantly more correlated in their evolution than genes with no or unknown interactions. Finally, we use this correlation matrix to infer new networks of genes that share evolutionary trajectories, and we present these results as a dataset of new testable hypotheses about genetic roles and interactions in the function and evolution of the Drosophila ovary.
Affiliation(s)
- Samuel H Church
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Current address: Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Catriona Munro
- Collège de France, PSL Research University, CNRS, Inserm, Center for Interdisciplinary Research in Biology, Paris, France
- Casey W Dunn
- Current address: Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Cassandra G Extavour
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
|
12
|
Walter M, Puniamoorthy N. Discovering novel reproductive genes in a non-model fly using de novo GridION transcriptomics. Front Genet 2022; 13:1003771. [PMID: 36568389 PMCID: PMC9768217 DOI: 10.3389/fgene.2022.1003771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Accepted: 11/16/2022] [Indexed: 12/12/2022] Open
Abstract
Gene discovery has important implications for investigating phenotypic trait evolution, adaptation, and speciation. Male reproductive tissues, such as accessory glands (AGs), are hotspots for recruitment of novel genes that diverge rapidly even among closely related species/populations. These genes synthesize seminal fluid proteins that often affect post-copulatory sexual selection: they can mediate male-male sperm competition and ejaculate-female interactions that modify female remating, and can even influence reproductive incompatibilities among diverging species/populations. Although de novo transcriptomics has facilitated gene discovery in non-model organisms, reproductive gene discovery is still challenging without a reference database, as these genes are often novel and bear no homology to known proteins. Here, we use reference-free GridION long-read transcriptomics, from Oxford Nanopore Technologies (ONT), to discover novel AG genes and characterize their expression in the widespread dung fly, Sepsis punctum. Despite stark population differences in male reproductive traits (e.g., body size, testes size, and sperm length) as well as female remating, the male AG genes and their secretions of S. punctum are still unknown. We implement a de novo ONT transcriptome pipeline incorporating quality-filtering and rigorous error-correction procedures, and we evaluate gene sequence and gene expression results against high-quality Illumina short-read data. We discover highly-expressed reproductive genes in AG transcriptomes of S. punctum consisting of 40 high-quality and high-confidence ONT genes that cross-verify against Illumina genes, among which 26 are novel and specific to S. punctum. Novel genes account for an average of 81% of total gene expression and may be functionally relevant in seminal fluid protein production. For instance, 80% of genes encoding secretory proteins account for 74% of total gene expression.
In addition, median sequence similarities of ONT nucleotide and protein sequences match within-Illumina sequence similarities. Read-count based expression quantification in ONT is congruent with Illumina's Transcript per Million (TPM), both in overall pattern and within functional categories. Rapid genomic innovation followed by recruitment of de novo genes for high expression in S. punctum AG tissue, a pattern observed in other insects, could be a likely mechanism of evolution of these genes. The study also demonstrates the feasibility of adapting ONT transcriptomics for gene discovery in non-model systems.
|
13
|
Lotterhos KE, Fitzpatrick MC, Blackmon H. Simulation Tests of Methods in Evolution, Ecology, and Systematics: Pitfalls, Progress, and Principles. Annual Review of Ecology, Evolution, and Systematics 2022; 53:113-136. [PMID: 38107485 PMCID: PMC10723108 DOI: 10.1146/annurev-ecolsys-102320-093722] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/19/2023]
Abstract
Complex statistical methods are continuously developed across the fields of ecology, evolution, and systematics (EES). These fields, however, lack standardized principles for evaluating methods, which has led to high variability in the rigor with which methods are tested, a lack of clarity regarding their limitations, and the potential for misapplication. In this review, we illustrate the common pitfalls of method evaluations in EES, the advantages of testing methods with simulated data, and best practices for method evaluations. We highlight the difference between method evaluation and validation and review how simulations, when appropriately designed, can refine the domain in which a method can be reliably applied. We also discuss the strengths and limitations of different evaluation metrics. The potential for misapplication of methods would be greatly reduced if funding agencies, reviewers, and journals required principled method evaluation.
Affiliation(s)
- Katie E Lotterhos
- Department of Marine and Environmental Sciences, Northeastern University, Nahant, Massachusetts, USA
- Matthew C Fitzpatrick
- Appalachian Lab, University of Maryland Center for Environmental Science, Frostburg, Maryland, USA
- Heath Blackmon
- Department of Biology, Texas A&M University, College Station, Texas, USA
|
14
|
Proteotranscriptomics - A facilitator in omics research. Comput Struct Biotechnol J 2022; 20:3667-3675. [PMID: 35891789 PMCID: PMC9293588 DOI: 10.1016/j.csbj.2022.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 07/04/2022] [Accepted: 07/04/2022] [Indexed: 11/26/2022] Open
Abstract
Applications in omics research, such as comparative transcriptomics and proteomics, require the knowledge of the species-specific gene sequence and benefit from a comprehensive high-quality annotation of the coding genes to achieve high coverage. While protein-coding genes can in simple cases be detected by scanning the genome for open reading frames, in more complex genomes exonic sequences are separated by introns. Despite advances in sequencing technologies that allow for ever-growing numbers of genomes, the quality of many of the provided genome assemblies does not reach reference quality. These non-contiguous assemblies with gaps and the necessity to predict splice sites limit accurate gene annotation from solely genomic data. In contrast, the transcriptome only contains transcribed gene regions, is devoid of introns and thus provides the optimal basis for the identification of open reading frames. The additional integration of proteomics data to validate predicted protein-coding genes further enriches for accurate gene models. This review outlines the principles of the proteotranscriptomics approach, discusses common challenges and suggests methods for improvement.
|
15
|
Improving the Annotation of the Venom Gland Transcriptome of Pamphobeteus verdolaga, Prospecting Novel Bioactive Peptides. Toxins (Basel) 2022; 14:toxins14060408. [PMID: 35737069 PMCID: PMC9228390 DOI: 10.3390/toxins14060408] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/30/2022] [Revised: 06/06/2022] [Accepted: 06/07/2022] [Indexed: 02/01/2023] Open
Abstract
Spider venoms constitute a trove of novel peptides with biotechnological interest. A paucity of next-generation sequencing (NGS) data has meant that less than 1% of these peptides have been described. Increasing evidence indicates that a single transcriptome assembler underestimates the number of genes that can be assembled. Here, the transcriptome of the venom gland of the spider Pamphobeteus verdolaga was re-assembled using three freely available assemblers, Trinity, SOAPdenovo-Trans, and SPAdes, to obtain a more complete annotation. Assembler performance was evaluated by contig number, N50, read representation in the assembly, and BUSCO term retrieval against the arthropod dataset. Of all the sequences assembled by all software, 39.26% were common to the three assemblers, 27.88% were uniquely assembled by Trinity, and 27.65% were uniquely assembled by SPAdes. The non-redundant merging of all three assemblies’ output permitted the annotation of 9232 sequences, which was 23% more than with each individual software and 28% more than the previous P. verdolaga annotation; moreover, the description of 65 novel theraphotoxins was possible. In the generation of data for non-model organisms, as well as in the search for novel peptides with biotechnological interest, it is highly recommended to employ at least two different transcriptome assemblers.
|
16
|
Yin Z, Nie H, Jiang K, Yan X. Molecular Mechanisms Underlying Vibrio Tolerance in Ruditapes philippinarum Revealed by Comparative Transcriptome Profiling. Front Immunol 2022; 13:879337. [PMID: 35615362 PMCID: PMC9125321 DOI: 10.3389/fimmu.2022.879337] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/19/2022] [Accepted: 04/05/2022] [Indexed: 12/13/2022] Open
Abstract
The clam Ruditapes philippinarum is an important species in the marine aquaculture industry in China. However, in recent years, the aquaculture of R. philippinarum has been negatively impacted by various bacterial pathogens. In this study, the transcriptome libraries of R. philippinarum showing different levels of resistance to challenge with Vibrio anguillarum were constructed and RNA-seq was performed using the Illumina sequencing platform. Host immune factors were identified that responded to V. anguillarum infection, including C-type lectin domain, glutathione S-transferase 9, lysozyme, methyltransferase FkbM domain, heat shock 70 kDa protein, Ras-like GTP-binding protein RHO, C1q, F-box and BTB/POZ domain protein zf-C2H2. Ten genes were selected and verified by RT-qPCR, and nine of the gene expression results were consistent with those of RNA-seq. The lectin gene in the phagosome pathway was expressed at a significantly higher level after V. anguillarum infection, which might indicate the role of lectin in the immune response to V. anguillarum. Comparing the results from R. philippinarum resistant and nonresistant to V. anguillarum increases our understanding of the resistant genes and key pathways related to Vibrio challenge in this species. The results obtained here provide a reference for future immunological research focusing on the response of R. philippinarum to V. anguillarum infection.
Affiliation(s)
- Zhihui Yin
- Engineering and Technology Research Center of Shellfish Breeding in Liaoning Province, College of Fisheries and Life Science, Dalian Ocean University, Dalian, China
- Hongtao Nie
- Engineering and Technology Research Center of Shellfish Breeding in Liaoning Province, College of Fisheries and Life Science, Dalian Ocean University, Dalian, China
- Kunyin Jiang
- Engineering and Technology Research Center of Shellfish Breeding in Liaoning Province, College of Fisheries and Life Science, Dalian Ocean University, Dalian, China
- Xiwu Yan
- Engineering and Technology Research Center of Shellfish Breeding in Liaoning Province, College of Fisheries and Life Science, Dalian Ocean University, Dalian, China
|
17
|
Guo W, Coulter M, Waugh R, Zhang R. The value of genotype-specific reference for transcriptome analyses in barley. Life Sci Alliance 2022; 5:5/8/e202101255. [PMID: 35459738 PMCID: PMC9034525 DOI: 10.26508/lsa.202101255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/05/2021] [Revised: 04/10/2022] [Accepted: 04/11/2022] [Indexed: 12/31/2022] Open
Abstract
It is increasingly apparent that although different genotypes within a species share “core” genes, they also contain variable numbers of “specific” genes and different structures of “core” genes that are only present in a subset of individuals. We demonstrate in this study that, in barley, using a common reference genome may thus lead to a loss of genotype-specific information in the assembled Reference Transcript Dataset (RTD) and the generation of erroneous, incomplete, or misleading transcriptomics analysis results. We assembled genotype-specific RTD (sRTD) and common reference–based RTD (cRTD) from RNA-seq data of cultivated Barke and Morex barley, respectively. Our quantitative evaluation showed that the sRTD has a significantly higher diversity of transcripts and alternative splicing events, whereas the cRTD missed 40% of transcripts present in the sRTD and it only has ∼70% accurate transcript assemblies. We found that the sRTD is more accurate for transcript quantification as well as differential expression analysis. However, gene-level quantification is less affected, which may be a reasonable compromise when a high-quality genotype-specific reference is not available.
Affiliation(s)
- Wenbin Guo
- Information and Computational Sciences, James Hutton Institute, Dundee, UK
- Max Coulter
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
- Robbie Waugh
- Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
- Cell and Molecular Sciences, James Hutton Institute, Dundee, UK
- Runxuan Zhang
- Information and Computational Sciences, James Hutton Institute, Dundee, UK
|
18
|
Raghavan V, Kraft L, Mesny F, Rigerte L. A simple guide to de novo transcriptome assembly and annotation. Brief Bioinform 2022; 23:6514404. [PMID: 35076693 PMCID: PMC8921630 DOI: 10.1093/bib/bbab563] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022] Open
Abstract
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Affiliation(s)
- Venket Raghavan
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
- Louis Kraft
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
|
19
|
Narum S, News JK, Fountain-Jones N, Hooper Junior R, Ortiz-Barrientos D, O'Boyle B, Sibbett B. Editorial 2022. Mol Ecol Resour 2021; 22:1-8. [PMID: 34919782 DOI: 10.1111/1755-0998.13572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
20
|
CStone: A de novo transcriptome assembler for short-read data that identifies non-chimeric contigs based on underlying graph structure. PLoS Comput Biol 2021; 17:e1009631. [PMID: 34813594 PMCID: PMC8651127 DOI: 10.1371/journal.pcbi.1009631] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/23/2021] [Revised: 12/07/2021] [Accepted: 11/11/2021] [Indexed: 11/19/2022] Open
Abstract
With the exponential growth of sequence information stored over the last decade, including that of de novo assembled contigs from RNA-Seq experiments, quantification of chimeric sequences has become essential when assembling read data. In transcriptomics, de novo assembled chimeras can closely resemble underlying transcripts, but patterns such as those seen between co-evolving sites, or mapped read counts, become obscured. We have created a de Bruijn-based de novo assembler for RNA-Seq data that utilizes a classification system to describe the complexity of underlying graphs from which contigs are created. Each contig is labelled with one of three levels, indicating whether or not ambiguous paths exist. A by-product of this is information on the range of complexity of the underlying gene families present. As a demonstration of CStone's ability to assemble high-quality contigs, and to label them in this manner, both simulated and real data were used. For simulated data, ten million read pairs were generated from cDNA libraries representing four species, Drosophila melanogaster, Panthera pardus, Rattus norvegicus and Serinus canaria. These were assembled using CStone, Trinity and rnaSPAdes; the latter two being high-quality, well-established de novo assemblers. For real data, two RNA-Seq datasets, each consisting of ≈30 million read pairs, representing two adult D. melanogaster whole-body samples were used. The contigs that CStone produced were comparable in quality to those of Trinity and rnaSPAdes in terms of length, sequence identity of aligned regions and the range of cDNA transcripts represented, whilst providing additional information on chimerism. Here we describe the details of CStone's assembly and classification process, and propose that similar classification systems can be incorporated into other de novo assembly tools.
Within a related side study, we explore the effects that chimeras within reference sets have on the identification of differentially expressed genes. CStone is available at: https://sourceforge.net/projects/cstone/. Within transcriptome reference sets, non-chimeric sequences are representations of transcribed genes, while artificially generated chimeric ones are mosaics of two or more pieces of DNA incorrectly pieced together. One area where such sets are utilized is in the quantification of gene expression patterns, where RNA-Seq reads are mapped to the sequences within, and subsequent count values reflect expression levels. Artificial chimeras can have a negative impact on count values by erroneously increasing variation in relation to the reads being mapped. Reference sets can be created from de novo assembled contigs, but chimeras can be introduced during the assembly process via the required traversal of graphs, representing gene families, constructed from the RNA-Seq data. Graph complexity determines how likely chimeras are to arise. We have created CStone, a de novo assembler that utilizes a classification system to describe such complexity. Contigs created by CStone are labelled in a manner that indicates whether or not they are non-chimeric. This encourages contig-dependent results to be presented with increased objectivity by maintaining the context of ambiguity associated with the assembly process. CStone has been tested extensively. Additionally, we have quantified the relationship between chimeras within reference sets and the identification of differentially expressed genes.
|
21
|
Zhao C, Miao S, Yin Y, Zhu Y, Nabity P, Bansal R, Liu C. Tripartite parasitic and symbiotic interactions as a possible mechanism of horizontal gene transfer. Ecol Evol 2021; 11:7018-7028. [PMID: 34141272 PMCID: PMC8207144 DOI: 10.1002/ece3.7550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2021] [Revised: 03/23/2021] [Accepted: 03/25/2021] [Indexed: 12/03/2022] Open
Abstract
Herbivory is a highly sophisticated feeding behavior that requires abilities of plant defense suppression, phytochemical detoxification, and plant macromolecule digestion. For plant-sucking insects, salivary glands (SGs) play important roles in herbivory by secreting and injecting proteins into plant tissues to facilitate feeding. Little is known about how insects evolved secretory SG proteins for such specialized functions. Here, we investigated the composition and evolution of secretory SG proteins in the brown marmorated stink bug (Halyomorpha halys) and identified a group of secretory SG phospholipase C (PLC) genes with highest sequence similarity to bacterial homologs. Further analyses demonstrated that they were most closely related to PLCs of Xenorhabdus, a genus of Gammaproteobacteria living in symbiosis with insect-parasitizing nematodes. These results suggested that H. halys might have acquired these PLCs from Xenorhabdus through horizontal gene transfer (HGT), likely mediated by a nematode while it was parasitizing an insect host. We also showed that the original HGT event was followed by gene duplication and expansion, leading to functional diversification of the bacterial-origin PLC genes in H. halys. Thus, this study suggested that an herbivore might enhance adaptation through gaining genes from an endosymbiont of its parasite in the tripartite parasitic and symbiotic interactions.
Affiliation(s)
- Chaoyang Zhao
- Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA, USA
- Shaoming Miao
- Sino-American Biological Control Laboratory, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, China
- Yanfang Yin
- Sino-American Biological Control Laboratory, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, China
- Yanjuan Zhu
- Sino-American Biological Control Laboratory, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, China
- Paul Nabity
- Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA, USA
- Raman Bansal
- USDA-ARS, San Joaquin Valley Agricultural Sciences Center, Parlier, CA, USA
- Chenxi Liu
- Sino-American Biological Control Laboratory, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, China
|
22
|
Banerjee SM, Stoll JA, Allen CD, Lynch JM, Harris HS, Kenyon L, Connon RE, Sterling EJ, Naro-Maciel E, McFadden K, Lamont MM, Benge J, Fernandez NB, Seminoff JA, Benson SR, Lewison RL, Eguchi T, Summers TM, Hapdei JR, Rice MR, Martin S, Jones TT, Dutton PH, Balazs GH, Komoroske LM. Species and population specific gene expression in blood transcriptomes of marine turtles. BMC Genomics 2021; 22:346. [PMID: 33985425 PMCID: PMC8117300 DOI: 10.1186/s12864-021-07656-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/04/2020] [Accepted: 04/23/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Transcriptomic data has demonstrated utility to advance the study of physiological diversity and organisms' responses to environmental stressors. However, a lack of genomic resources and challenges associated with collecting high-quality RNA can limit its application for many wild populations. Minimally invasive blood sampling combined with de novo transcriptomic approaches has great potential to alleviate these barriers. Here, we advance these goals for marine turtles by generating high quality de novo blood transcriptome assemblies to characterize functional diversity and compare global transcriptional profiles between tissues, species, and foraging aggregations. RESULTS We generated high quality blood transcriptome assemblies for hawksbill (Eretmochelys imbricata), loggerhead (Caretta caretta), green (Chelonia mydas), and leatherback (Dermochelys coriacea) turtles. The functional diversity in assembled blood transcriptomes was comparable to those from more traditionally sampled tissues. A total of 31.3% of orthogroups identified were present in all four species, representing a core set of conserved genes expressed in blood and shared across marine turtle species. We observed strong species-specific expression of these genes, as well as distinct transcriptomic profiles between green turtle foraging aggregations that inhabit areas of greater or lesser anthropogenic disturbance. CONCLUSIONS Obtaining global gene expression data through non-lethal, minimally invasive sampling can greatly expand the applications of RNA-sequencing in protected long-lived species such as marine turtles. The distinct differences in gene expression signatures between species and foraging aggregations provide insight into the functional genomics underlying the diversity in this ancient vertebrate lineage. The transcriptomic resources generated here can be used in further studies examining the evolutionary ecology and anthropogenic impacts on marine turtles.
Affiliation(s)
- Shreya M Banerjee
- Department of Environmental Conservation, University of Massachusetts, Amherst, MA, USA
- Jamie Adkins Stoll
- Department of Environmental Conservation, University of Massachusetts, Amherst, MA, USA
- Camryn D Allen
- Marine Turtle Biology and Assessment Program, Protected Species Division, Pacific Islands Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Honolulu, HI, USA
- Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA
- Jennifer M Lynch
- Chemical Sciences Division, National Institute of Standards and Technology, Hawai'i Pacific University, Waimanalo, HI, USA
- Heather S Harris
- Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA
- Lauren Kenyon
- Department of Environmental Conservation, University of Massachusetts, Amherst, MA, USA
- Richard E Connon
- Department of Anatomy, Physiology and Cell Biology, University of California, Davis, Davis, CA, USA
- Eleanor J Sterling
- Center for Biodiversity and Conservation, American Museum of Natural History, New York, NY, USA
- Kathryn McFadden
- School of Agricultural, Forest, and Environmental Sciences, Clemson University, Clemson, SC, USA
- Margaret M Lamont
- United States Geological Survey, Wetland and Aquatic Research Center, Gainesville, FL, USA
- James Benge
- Section of Molecular Biology, Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
- Nadia B Fernandez
- Department of Environmental Conservation, University of Massachusetts, Amherst, MA, USA
- Jeffrey A Seminoff
- Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA
- Scott R Benson
- Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Moss Landing, CA, 95039, USA
- Moss Landing Marine Laboratories, San Jose State University, Moss Landing, CA, 95039, USA
- Rebecca L Lewison
- Department of Biology, San Diego State University, San Diego, CA, USA
- Tomoharu Eguchi
- Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA
- Jessy R Hapdei
- Jessy's Tag Services, Saipan, Commonwealth of the Northern Mariana Islands, USA
- Marc R Rice
- Hawai'i Preparatory Academy, Kamuela, HI, USA
- Summer Martin
- Marine Turtle Biology and Assessment Program, Protected Species Division, Pacific Islands Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Honolulu, HI, USA
- T Todd Jones
- Marine Turtle Biology and Assessment Program, Protected Species Division, Pacific Islands Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Honolulu, HI, USA
- Peter H Dutton
- Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA
- Lisa M Komoroske
- Department of Environmental Conservation, University of Massachusetts, Amherst, MA, USA
- Marine Mammal and Turtle Division, Southwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, La Jolla, CA, USA
|
23
|
Scott MA, Woolums AR, Swiderski CE, Perkins AD, Nanduri B, Smith DR, Karisch BB, Epperson WB, Blanton JR. Comprehensive at-arrival transcriptomic analysis of post-weaned beef cattle uncovers type I interferon and antiviral mechanisms associated with bovine respiratory disease mortality. PLoS One 2021; 16:e0250758. [PMID: 33901263 PMCID: PMC8075194 DOI: 10.1371/journal.pone.0250758] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/31/2020] [Accepted: 04/13/2021] [Indexed: 12/02/2022] Open
Abstract
Background: Despite decades of extensive research, bovine respiratory disease (BRD) remains the most devastating disease in beef cattle production. Establishing a clinical diagnosis often relies upon visual detection of non-specific signs, leading to low diagnostic accuracy. Thus, post-weaned beef cattle are often metaphylactically administered antimicrobials at facility arrival, which poses concerns regarding antimicrobial stewardship and resistance. Additionally, there is a lack of high-quality research that addresses the gene-by-environment interactions that underlie why some cattle that develop BRD die while others survive. Therefore, it is necessary to decipher the underlying host genomic factors associated with BRD mortality versus survival to help determine BRD risk and severity. Using transcriptomic analysis of at-arrival whole blood samples from cattle that died of BRD, as compared to those that developed signs of BRD but lived (n = 3 DEAD, n = 3 ALIVE), we identified differentially expressed genes (DEGs) and associated pathways in cattle that died of BRD. Additionally, we evaluated unmapped reads, which are often overlooked within transcriptomic experiments.
Results: 69 DEGs (FDR<0.10) were identified between ALIVE and DEAD cohorts. Several DEGs possess immunological and proinflammatory function and associations with TLR4 and IL6. Biological processes, pathways, and disease phenotype associations related to type-I interferon production and antiviral defense were enriched in DEAD cattle at arrival. Unmapped reads aligned primarily to various ungulate assemblies, but failed to align to viral assemblies.
Conclusion: This study further revealed increased proinflammatory immunological mechanisms in cattle that develop BRD. DEGs upregulated in DEAD cattle were predominantly involved in innate immune pathways typically associated with antiviral defense, although no viral genes were identified within unmapped reads. Our findings provide genomic targets for further analysis in cattle at highest risk of BRD, suggesting that mechanisms related to type I interferons and antiviral defense may be indicative of viral respiratory disease at arrival and contribute to eventual BRD mortality.
Affiliation(s)
- Matthew A Scott
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Amelia R Woolums
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Cyprianna E Swiderski
- Department of Clinical Sciences, Mississippi State University, Mississippi State, MS, United States of America
- Andy D Perkins
- Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS, United States of America
- Bindu Nanduri
- Department of Basic Sciences, Mississippi State University College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States of America
- David R Smith
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Brandi B Karisch
- Department of Animal and Dairy Sciences, Mississippi State University, Mississippi State, MS, United States of America
- William B Epperson
- Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- John R Blanton
- Department of Animal and Dairy Sciences, Mississippi State University, Mississippi State, MS, United States of America
24
Klein A, Husselmann LHH, Williams A, Bell L, Cooper B, Ragar B, Tabb DL. Proteomic Identification and Meta-Analysis in Salvia hispanica RNA-Seq de novo Assemblies. Plants (Basel) 2021; 10:765. [PMID: 33919777 PMCID: PMC8070742 DOI: 10.3390/plants10040765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/15/2021] [Revised: 03/26/2021] [Accepted: 03/28/2021] [Indexed: 11/24/2022]
Abstract
While proteomics has demonstrated its value for model organisms and for organisms with mature genome sequence annotations, proteomics has been of less value in nonmodel organisms that are unaccompanied by genome sequence annotations. This project sought to determine the value of RNA-Seq experiments as a basis for establishing a set of protein sequences to represent a nonmodel organism, in this case, the pseudocereal chia. Assembling four publicly available chia RNA-Seq datasets produced transcript sequence sets with a high BUSCO completeness, though the number of transcript sequences and Trinity "genes" varied considerably among them. After six-frame translation, ProteinOrtho detected substantial numbers of orthologs among other species within the taxonomic order Lamiales. These protein sequence databases demonstrated a good identification efficiency for three different LC-MS/MS proteomics experiments, though a seed proteome showed considerable variability in the identification of peptides based on seed protein sequence inclusion. If a proteomics experiment emphasizes a particular tissue, an RNA-Seq experiment incorporating that same tissue is more likely to support a database search identification of that proteome.
Affiliation(s)
- Ashwil Klein
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
- Lizex H. H. Husselmann
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
- Achmat Williams
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
- Liam Bell
- Centre for Proteomic and Genomic Research, Cape Town 7925, South Africa
- Bret Cooper
- USDA Agricultural Research Service, Beltsville, MD 20705, USA
- Brent Ragar
- Departments of Internal Medicine and Pediatrics, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02150, USA
- David L. Tabb
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
- Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7500, South Africa
- Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch 7602, South Africa
25
Fifer J, Bentlage B, Lemer S, Fujimura AG, Sweet M, Raymundo LJ. Going with the flow: How corals in high-flow environments can beat the heat. Mol Ecol 2021; 30:2009-2024. [PMID: 33655552 DOI: 10.1111/mec.15869] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/03/2019] [Revised: 01/28/2021] [Accepted: 02/16/2021] [Indexed: 12/18/2022]
Abstract
Coral reefs are experiencing unprecedented declines in health on a global scale leading to severe reductions in coral cover. One major cause of this decline is increasing sea surface temperature. However, conspecific colonies separated by even small spatial distances appear to show varying responses to this global stressor. One factor contributing to differential responses to heat stress is variability in the coral's micro-environment, such as the amount of water flow a coral experiences. High flow provides corals with a variety of health benefits, including heat stress mitigation. Here, we investigate how water flow affects coral gene expression and provides resilience to increasing temperatures. We examined host and photosymbiont gene expression of Acropora cf. pulchra colonies in discrete in situ flow environments during a natural bleaching event. In addition, we conducted controlled ex situ tank experiments where we exposed A. cf. pulchra to different flow regimes and acute heat stress. Notably, we observed distinct flow-driven transcriptomic signatures related to energy expenditure, growth, heterotrophy and a healthy coral host-photosymbiont relationship. We also observed disparate transcriptomic responses during bleaching recovery between the high- and low-flow sites. Additionally, corals exposed to high flow showed "frontloading" of specific heat-stress-related genes such as heat shock proteins, antioxidant enzymes, genes involved in apoptosis regulation, innate immunity and cell adhesion. We posit that frontloading is a result of increased oxidative metabolism generated by the increased water movement. Gene frontloading may at least partially explain the observation that colonies in high-flow environments show higher survival and/or faster recovery in response to bleaching events.
Affiliation(s)
- James Fifer
- University of Guam Marine Laboratory, UOG Station, Mangilao, GU, USA; Department of Biology, Boston University, Boston, MA, USA
- Bastian Bentlage
- University of Guam Marine Laboratory, UOG Station, Mangilao, GU, USA
- Sarah Lemer
- University of Guam Marine Laboratory, UOG Station, Mangilao, GU, USA
- Michael Sweet
- Aquatic Research Facility, Environmental Sustainability Research Centre, University of Derby, Derby, UK
- Laurie J Raymundo
- University of Guam Marine Laboratory, UOG Station, Mangilao, GU, USA
26
The Developmental Transcriptome of Bagworm, Metisa plana (Lepidoptera: Psychidae) and Insights into Chitin Biosynthesis Genes. Genes (Basel) 2020; 12:genes12010007. [PMID: 33374651 PMCID: PMC7822449 DOI: 10.3390/genes12010007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/18/2020] [Revised: 12/09/2020] [Accepted: 12/12/2020] [Indexed: 01/11/2023] Open
Abstract
Bagworm, Metisa plana (Lepidoptera: Psychidae), is a ubiquitous insect pest in oil palm plantations. M. plana infestation could reduce oil palm productivity by 40% if it remains untreated over two consecutive years. Despite the urgency to tackle this issue, the genome and transcriptome of M. plana have not yet been fully elucidated. Here, we report a comprehensive transcriptome dataset from four different developmental stages of M. plana, comprising egg, third instar larva, pupa and female adult. The de novo transcriptome assembly of the raw data produced a total of 193,686 transcripts, which were then annotated against the UniProt, NCBI non-redundant (NR), Gene Ontology, Cluster of Orthologous Group, and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. From this, 46,534 transcripts were annotated and mapped to 146 known metabolic or signalling KEGG pathways. The paper further identified 41 differentially expressed transcripts encoding seven genes in the chitin biosynthesis pathways, and their expression across each developmental stage was further analysed. The genetic diversity of M. plana was profiled, with 21,516 microsatellite sequences and 379,895 SNP loci found in the transcriptome. These datasets add valuable transcriptomic resources for further study of developmental gene expression, transcriptional regulation and functional gene activities involved in the development of M. plana. Identification of regulatory genes in the chitin biosynthesis pathway may also help in developing RNAi-mediated pest control management by targeting certain pathways, and in functional studies of the genes in M. plana.
27
Lataretu M, Hölzer M. RNAflow: An Effective and Simple RNA-Seq Differential Gene Expression Pipeline Using Nextflow. Genes (Basel) 2020; 11:E1487. [PMID: 33322033 PMCID: PMC7763471 DOI: 10.3390/genes11121487] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/25/2020] [Revised: 12/04/2020] [Accepted: 12/07/2020] [Indexed: 12/28/2022] Open
Abstract
RNA-Seq enables the identification and quantification of RNA molecules, often with the aim of detecting differentially expressed genes (DEGs). Although RNA-Seq has evolved into a standard technique, there is no universal gold standard for the computational analysis of these data. On top of that, previous studies proved the irreproducibility of RNA-Seq studies. Here, we present a portable, scalable, and parallelizable Nextflow RNA-Seq pipeline to detect DEGs, which assures a high level of reproducibility. The pipeline automatically takes care of common pitfalls, such as ribosomal RNA removal and low abundance gene filtering. Apart from various visualizations for the DEG results, we incorporated downstream pathway analysis for common species such as Homo sapiens and Mus musculus. We evaluated the DEG detection functionality using qRT-PCR data as a reference and observed a very high correlation of the logarithmized gene expression fold changes.
Affiliation(s)
- Marie Lataretu
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Martin Hölzer
- Methodology and Research Infrastructure, MF1 Bioinformatics, Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
28
Hölzer M. A decade of de novo transcriptome assembly: Are we there yet? Mol Ecol Resour 2020; 21:11-13. [PMID: 33030794 DOI: 10.1111/1755-0998.13268] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/01/2020] [Accepted: 09/23/2020] [Indexed: 01/15/2023]
Abstract
A decade ago, de novo transcriptome assembly evolved as a versatile and powerful approach to make evolutionary assumptions, analyse gene expression, and annotate novel transcripts, in particular, for non-model organisms lacking an appropriate reference genome. Various tools have been developed to generate a transcriptome assembly, and even more computational methods depend on the results of these tools for further downstream analyses. In this issue of Molecular Ecology Resources, Freedman et al. (Mol Ecol Resourc 2020) present a comprehensive analysis of errors in de novo transcriptome assemblies across public data sets and different assembly methods. They focus on two implicit assumptions that are often violated: First, the assembly presents an unbiased view of the transcriptome. Second, the expression estimates derived from the assembly are reasonable, albeit noisy, approximations of the relative frequency of expressed transcripts. They show that appropriate filtering can reduce this bias but can also lead to the loss of a reasonable number of highly expressed transcripts. Thus, to partly alleviate the noise in expression estimates, they propose a new normalization method called length-rescaled CPM. Remarkably, the authors found considerable distortions at the nucleotide level, which leads to an underestimation of diversity in transcriptome assemblies. The study by Freedman et al. (Mol Ecol Resourc 2020) clearly shows that we have not yet reached "high-quality" in the field of transcriptome assembly. Above all, it helps researchers be aware of these problems and filter and interpret their transcriptome assembly data appropriately and with caution.
Affiliation(s)
- Martin Hölzer
- MF1 Bioinformatics, Robert Koch Institute, Berlin, Germany; RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University, Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University, Jena, Germany
29
Hartmann S, Preick M, Abelt S, Scheffel A, Hofreiter M. Annotated genome sequences of the carnivorous plant Roridula gorgonias and a non-carnivorous relative, Clethra arborea. BMC Res Notes 2020; 13:426. [PMID: 32912303 PMCID: PMC7488092 DOI: 10.1186/s13104-020-05254-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/15/2020] [Accepted: 08/24/2020] [Indexed: 11/21/2022] Open
Abstract
Objective Plant carnivory is distributed across the tree of life and has evolved at least six times independently, but sequenced and annotated nuclear genomes of carnivorous plants are currently lacking. We have sequenced and structurally annotated the nuclear genome of the carnivorous Roridula gorgonias and that of a non-carnivorous relative, Madeira’s lily-of-the-valley-tree, Clethra arborea, both within the Ericales. These data add an important resource to study the evolutionary genetics of plant carnivory across angiosperm lineages and also for functional and systematic aspects of plants within the Ericales. Results Our assemblies have total lengths of 284 Mbp (R. gorgonias) and 511 Mbp (C. arborea) and show high BUSCO scores of 84.2% and 89.5%, respectively. We used their predicted genes together with publicly available data from other Ericales’ genomes and transcriptomes to assemble a phylogenomic data set for the inference of a species tree. However, groups of orthologs showed a marked absence of species represented by a transcriptome. We discuss possible reasons and caution against combining predicted genes from genome- and transcriptome-based assemblies.
Affiliation(s)
- Stefanie Hartmann
- Institute for Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476, Potsdam, Germany
- Michaela Preick
- Institute for Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476, Potsdam, Germany
- Silke Abelt
- Institute for Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476, Potsdam, Germany
- André Scheffel
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476, Potsdam, Germany
- Michael Hofreiter
- Institute for Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476, Potsdam, Germany
30
Termignoni-Garcia F, Louder MIM, Balakrishnan CN, O’Connell L, Edwards SV. Prospects for sociogenomics in avian cooperative breeding and parental care. Curr Zool 2020; 66:293-306. [PMID: 32440290 PMCID: PMC7233861 DOI: 10.1093/cz/zoz057] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/02/2019] [Accepted: 11/20/2019] [Indexed: 01/08/2023] Open
Abstract
For the last 40 years, the study of cooperative breeding (CB) in birds has proceeded primarily in the context of discovering the ecological, geographical, and behavioral drivers of helping. The advent of molecular tools in the early 1990s assisted in clarifying the relatedness of helpers to those helped, in some cases, confirming predictions of kin selection theory. Methods for genome-wide analysis of sequence variation, gene expression, and epigenetics promise to add new dimensions to our understanding of avian CB, primarily in the area of molecular and developmental correlates of delayed breeding and dispersal, as well as the ontogeny of achieving parental status in nature. Here, we outline key ways in which modern -omics approaches, in particular genome sequencing, transcriptomics, and epigenetic profiling such as ATAC-seq, can be used to add a new level of analysis of avian CB. Building on recent and ongoing studies of avian social behavior and sociogenomics, we review how high-throughput sequencing of a focal species or clade can provide a robust foundation for downstream, context-dependent destructive and non-destructive sampling of specific tissues or physiological states in the field for analysis of gene expression and epigenetics. -Omics approaches have the potential to inform not only studies of the diversification of CB over evolutionary time, but real-time analyses of behavioral interactions in the field or lab. Sociogenomics of birds represents a new branch in the network of methods used to study CB, and can help clarify ways in which the different levels of analysis of CB ultimately interact in novel and unexpected ways.
Affiliation(s)
- Flavia Termignoni-Garcia
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
- Matthew I M Louder
- International Research Center for Neurointelligence, The University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Lauren O’Connell
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Scott V Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
31
Lim WK, Mathuru AS. Design, challenges, and the potential of transcriptomics to understand social behavior. Curr Zool 2020; 66:321-330. [PMID: 32684913 PMCID: PMC7357267 DOI: 10.1093/cz/zoaa007] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/02/2019] [Accepted: 02/18/2020] [Indexed: 12/17/2022] Open
Abstract
Rapid advances in Ribonucleic Acid sequencing (or RNA-seq) technology for analyzing entire transcriptomes of desired tissue samples, or even of single cells at scale, have revolutionized biology in the past decade. Increasing accessibility and falling costs are making it possible to address many problems in biology that were once considered intractable, including the study of various social behaviors. RNA-seq is opening new avenues to understand long-standing questions on the molecular basis of behavioral plasticity and individual variation in the expression of a behavior. As whole transcriptomes are examined, it has become possible to make unbiased discoveries of underlying mechanisms with little or no necessity to predict genes involved in advance. However, researchers need to be aware of technical limitations and have to make specific decisions when applying RNA-seq to study social behavior. Here, we provide a perspective on the applications of RNA-seq and experimental design considerations for behavioral scientists who are unfamiliar with the technology but are considering using it in their research.
Affiliation(s)
- Wen Kin Lim
- Science Division, Yale-NUS College, 12 College Avenue West, Singapore
| | - Ajay S Mathuru
- Science Division, Yale-NUS College, 12 College Avenue West, Singapore; Institute of Molecular and Cell Biology (IMCB), 61 Biopolis Drive, Singapore; Department of Physiology, Yong Loo Lin School of Medicine (YLL), National University of Singapore, Singapore