Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Rana SB, Zadlock FJ 4th, Zhang Z, Murphy WR, Bentivegna CS. Comparison of De Novo Transcriptome Assemblers and k-mer Strategies Using the Killifish, Fundulus heteroclitus. PLoS One 2016;11:e0153104. [PMID: 27054874 DOI: 10.1371/journal.pone.0153104] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 08/06/2015] [Accepted: 03/23/2016] [Indexed: 11/25/2022]



Cited by Other Article(s)

Saldarriaga-Córdoba M, Clavero-León C, Rey-Suarez P, Nuñez-Rangel V, Avendaño-Herrera R, Solano-González S, Alzate JF. Unveiling Novel Kunitz- and Waprin-Type Toxins in the Micrurus mipartitus Coral Snake Venom Gland: An In Silico Transcriptome Analysis. Toxins (Basel) 2024;16:224. [PMID: 38787076 PMCID: PMC11126030 DOI: 10.3390/toxins16050224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 04/23/2024] [Accepted: 05/08/2024] [Indexed: 05/25/2024]

Khelghatibana F, Javan-Nikkhah M, Safaie N, Sobhani A, Shams S, Sari E. A reference transcriptome for walnut anthracnose pathogen, Ophiognomonia leptostyla, guides the discovery of candidate virulence genes. Fungal Genet Biol 2023;169:103828. [PMID: 37657751 DOI: 10.1016/j.fgb.2023.103828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 08/13/2023] [Accepted: 08/28/2023] [Indexed: 09/03/2023]

Liu S, Koslicki D. CMash: fast, multi-resolution estimation of k-mer-based Jaccard and containment indices. Bioinformatics 2022;38:i28-i35. [PMID: 35758788 PMCID: PMC9235470 DOI: 10.1093/bioinformatics/btac237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

Improving the Annotation of the Venom Gland Transcriptome of Pamphobeteus verdolaga, Prospecting Novel Bioactive Peptides. Toxins (Basel) 2022;14:toxins14060408. [PMID: 35737069 PMCID: PMC9228390 DOI: 10.3390/toxins14060408] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/30/2022] [Revised: 06/06/2022] [Accepted: 06/07/2022] [Indexed: 02/01/2023]

Adolfo LM, Rao X, Dixon RA. Identification of Pueraria spp. through DNA barcoding and comparative transcriptomics. BMC Plant Biol 2022;22:10. [PMID: 34979934 PMCID: PMC8722073 DOI: 10.1186/s12870-021-03383-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/12/2021] [Accepted: 12/05/2021] [Indexed: 06/14/2023]

Lee SG, Na D, Park C. Comparability of reference-based and reference-free transcriptome analysis approaches at the gene expression level. BMC Bioinformatics 2021;22:310. [PMID: 34674628 PMCID: PMC8529712 DOI: 10.1186/s12859-021-04226-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/23/2021] [Accepted: 06/01/2021] [Indexed: 11/10/2022]

Voshall A, Behera S, Li X, Yu XH, Kapil K, Deogun JS, Shanklin J, Cahoon EB, Moriyama EN. A consensus-based ensemble approach to improve transcriptome assembly. BMC Bioinformatics 2021;22:513. [PMID: 34674629 PMCID: PMC8532302 DOI: 10.1186/s12859-021-04434-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/28/2020] [Accepted: 10/10/2021] [Indexed: 01/02/2023]

Abstract

BACKGROUND

Systems-level analyses, such as differential gene expression analysis, co-expression analysis, and metabolic pathway reconstruction, depend on the accuracy of the transcriptome. Multiple tools exist to perform transcriptome assembly from RNAseq data. However, assembling high-quality transcriptomes is still not a trivial problem. This is especially the case for non-model organisms, where adequate reference genomes are often not available. Different methods produce different transcriptome models, and there is no easy way to determine which are more accurate. Furthermore, alternative-splicing events make these assembly problems even harder. While benchmarking transcriptome assemblies is critical, it is also not trivial because true reference transcriptomes are generally lacking.

RESULTS

In this study, we first provide a pipeline to generate a set of simulated benchmark transcriptomes and corresponding RNAseq data. Using the simulated benchmarking datasets, we compared the performance of various transcriptome assembly approaches, including both de novo and genome-guided methods. The results showed that assembly performance deteriorates significantly when alternative transcripts (isoforms) exist or, for genome-guided methods, when the reference is not available from the same genome. To improve transcriptome assembly performance by leveraging the overlapping predictions between different assemblies, we present a new consensus-based ensemble transcriptome assembly approach, ConSemble.

CONCLUSIONS

Without using a reference genome, ConSemble using four de novo assemblers achieved an accuracy up to twice as high as any of the de novo assemblers we compared. When a reference genome is available, ConSemble using four genome-guided assemblies removed many incorrectly assembled contigs with minimal impact on correctly assembled contigs, achieving higher precision and accuracy than individual genome-guided methods. Furthermore, ConSemble using de novo assemblers matched or exceeded the best-performing genome-guided assemblers even when the transcriptomes included isoforms. We thus demonstrated that the ConSemble consensus strategy, for both de novo and genome-guided assemblers, can improve transcriptome assembly. The RNAseq simulation pipeline, the benchmark transcriptome datasets, and the script to perform the ConSemble assembly are all freely available from http://bioinfolab.unl.edu/emlab/consemble/.
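The consensus idea described in this abstract (keeping predictions that several assemblies agree on) can be illustrated with a minimal sketch. The exact-match comparison, the `min_support` threshold of 3, and the assembler names below are illustrative assumptions only; they are not taken from the abstract and should not be read as the published ConSemble procedure.

```python
from collections import Counter


def consensus_contigs(assemblies, min_support=3):
    """Return the sequences reported by at least `min_support` assemblies.

    `assemblies` maps a (hypothetical) assembler name to its list of
    sequences. Sequences are compared by exact string identity, which is
    a simplifying assumption for this sketch.
    """
    support = Counter()
    for contigs in assemblies.values():
        # Count each sequence once per assembly, even if it is reported twice.
        support.update(set(contigs))
    return {seq for seq, n in support.items() if n >= min_support}


# Toy input: four hypothetical assemblers with partly overlapping output.
assemblies = {
    "assembler_a": ["ATGGCC", "ATGTTT", "ATGAAA"],
    "assembler_b": ["ATGGCC", "ATGTTT"],
    "assembler_c": ["ATGGCC", "ATGCCC"],
    "assembler_d": ["ATGGCC", "ATGTTT", "ATGCCC"],
}

kept = consensus_contigs(assemblies, min_support=3)
print(sorted(kept))  # ['ATGGCC', 'ATGTTT']
```

Raising `min_support` trades recall for precision: with a threshold of 4 only the sequence found by every assembler survives.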


Analysis of Gene Expression Changes in Plants Grown in Salty Soil in Response to Inoculation with Halophilic Bacteria. Int J Mol Sci 2021;22:ijms22073611. [PMID: 33807153 PMCID: PMC8036567 DOI: 10.3390/ijms22073611] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/17/2021] [Revised: 03/25/2021] [Accepted: 03/27/2021] [Indexed: 12/24/2022]

Cortese IJ, Castrillo ML, Zapata PD, Laczeski ME. Efecto del filtrado de secuencias en el ensamblado del genoma de Bacillus altitudinis aislado de Ilex paraguariensis [Effect of sequence filtering on the genome assembly of Bacillus altitudinis isolated from Ilex paraguariensis]. Acta Biológica Colombiana 2021. [DOI: 10.15446/abc.v26n2.86406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

De novo RNA sequencing analysis of Aeluropus littoralis halophyte plant under salinity stress. Sci Rep 2020;10:9148. [PMID: 32499577 PMCID: PMC7272644 DOI: 10.1038/s41598-020-65947-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/20/2019] [Accepted: 05/13/2020] [Indexed: 01/24/2023]

Abstract

The study of salt tolerance mechanisms in halophyte plants can provide valuable information for crop breeding and plant engineering programs. The aim of the present study was to investigate the whole transcriptome of Aeluropus littoralis in response to salinity stress (200 and 400 mM NaCl) by de novo RNA-sequencing. To assemble the transcriptome, the Trinity v2.4.0 and Bridger tools were compared using two k-mer sizes (25 and 32 bp). The de novo transcriptome assembled by Bridger (k-mer 32) was chosen as the final assembly for subsequent analysis. In total, 103,290 transcripts were obtained. The differential expression analysis (log₂FC > 1 and FDR < 0.01) showed that 1861 transcripts were differentially expressed, including 169 up- and 316 down-regulated transcripts in the 200 mM NaCl treatment and 1035 up- and 430 down-regulated transcripts in the 400 mM NaCl treatment compared to control. In addition, 89 transcripts were common to both treatments. The most important over-represented terms in the GO analysis of differentially expressed genes (FDR < 0.05) were chitin response, response to abscisic acid, and regulation of jasmonic acid mediated signaling pathway under the 400 mM NaCl treatment, and cell cycle, cell division, and mitotic cell cycle process under the 200 mM treatment. In addition, the phosphatidylcholine biosynthetic process term was common to both salt treatments. Interestingly, under the 400 mM salt treatment, the PRC1 complex that contributes to chromatin remodeling was also enriched, along with vacuole as a general salinity stress responsive cell component. Among enriched pathways, the MAPK signaling pathway (ko04016) and phytohormone signal transduction (ko04075) were significantly enriched in the 400 mM NaCl treatment, whereas DNA replication (ko03032) was the only pathway that was significantly enriched in the 200 mM NaCl treatment. Finally, our findings indicate salt-concentration-dependent responses of A. littoralis, in which well-known salinity stress-related pathways are induced in 400 mM NaCl, while less considered pathways, e.g. cell cycle and DNA replication, are highlighted under the 200 mM NaCl treatment.
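As a rough illustration of the thresholds and counts quoted in this abstract, the sketch below applies a fold-change/FDR filter and checks that the per-treatment counts add up to the reported total. Two assumptions are made here and are not stated in the abstract: the fold-change cutoff is applied to the absolute value of log₂FC (so that down-regulated transcripts can pass), and the 89 shared transcripts are counted once in the overall total. The function and variable names are hypothetical.

```python
def is_differentially_expressed(log2_fc, fdr, fc_cutoff=1.0, fdr_cutoff=0.01):
    """Apply the quoted thresholds; using |log2FC| is an assumption."""
    return abs(log2_fc) > fc_cutoff and fdr < fdr_cutoff


# Counts quoted in the abstract (up- and down-regulated per treatment).
counts = {
    "200 mM": {"up": 169, "down": 316},
    "400 mM": {"up": 1035, "down": 430},
}
shared = 89  # transcripts reported as common to both treatments

per_treatment = {name: c["up"] + c["down"] for name, c in counts.items()}
# Assumption: shared transcripts appear in both treatment lists,
# so they are subtracted once to obtain the number of distinct transcripts.
unique_total = sum(per_treatment.values()) - shared

print(per_treatment)  # {'200 mM': 485, '400 mM': 1465}
print(unique_total)   # 1861
```

Under these assumptions the arithmetic reproduces the 1861 differentially expressed transcripts reported in the abstract.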


Gen2EpiGUI: User-Friendly Pipeline for Analyzing Whole-Genome Sequencing Data for Epidemiological Studies of Neisseria gonorrhoeae. Sex Transm Dis 2020;47:e42-e44. [DOI: 10.1097/olq.0000000000001206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]

Cogne Y, Gouveia D, Chaumot A, Degli-Esposti D, Geffard O, Pible O, Almunia C, Armengaud J. Proteogenomics-Guided Evaluation of RNA-Seq Assembly and Protein Database Construction for Emergent Model Organisms. Proteomics 2020;20:e1900261. [PMID: 32249536 DOI: 10.1002/pmic.201900261] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/08/2019] [Revised: 03/24/2020] [Indexed: 11/10/2022]

Klein AH, Ballard KR, Storey KB, Motti CA, Zhao M, Cummins SF. Multi-omics investigations within the Phylum Mollusca, Class Gastropoda: from ecological application to breakthrough phylogenomic studies. Brief Funct Genomics 2020;18:377-394. [PMID: 31609407 DOI: 10.1093/bfgp/elz017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/18/2019] [Revised: 07/06/2019] [Accepted: 07/15/2019] [Indexed: 12/22/2022]

Evaluation of Seven Different RNA-Seq Alignment Tools Based on Experimental Data from the Model Plant Arabidopsis thaliana. Int J Mol Sci 2020;21:ijms21051720. [PMID: 32138290 PMCID: PMC7084517 DOI: 10.3390/ijms21051720] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 01/30/2020] [Revised: 02/28/2020] [Accepted: 02/29/2020] [Indexed: 01/15/2023]

Transcriptome Landscape Variation in the Genus Thymus. Genes (Basel) 2019;10:genes10080620. [PMID: 31426352 PMCID: PMC6723042 DOI: 10.3390/genes10080620] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/10/2019] [Revised: 07/31/2019] [Accepted: 08/12/2019] [Indexed: 12/13/2022]

Nuamtanong S, Reamtong O, Phuphisut O, Chotsiri P, Malaithong P, Dekumyoy P, Adisakwattana P. Transcriptome and excretory-secretory proteome of infective-stage larvae of the nematode Gnathostoma spinigerum reveal potential immunodiagnostic targets for development. Parasite 2019;26:34. [PMID: 31166909 PMCID: PMC6550564 DOI: 10.1051/parasite/2019033] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/18/2018] [Accepted: 05/16/2019] [Indexed: 01/02/2023]

Hölzer M, Marz M. De novo transcriptome assembly: A comprehensive cross-species comparison of short-read RNA-Seq assemblers. Gigascience 2019;8:giz039. [PMID: 31077315 PMCID: PMC6511074 DOI: 10.1093/gigascience/giz039] [Citation(s) in RCA: 109] [Impact Index Per Article: 21.8] [Received: 08/14/2018] [Revised: 12/21/2018] [Accepted: 03/09/2019] [Indexed: 12/13/2022]

Seoane P, Espigares M, Carmona R, Polonio Á, Quintana J, Cretazzo E, Bota J, Pérez-García A, Dios Alché JD, Gómez L, Claros MG. TransFlow: a modular framework for assembling and assessing accurate de novo transcriptomes in non-model organisms. BMC Bioinformatics 2018;19:416. [PMID: 30453874 PMCID: PMC6245506 DOI: 10.1186/s12859-018-2384-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/02/2022]

Abstract

BACKGROUND

Advances in high-throughput sequencing technologies are enabling de novo assembly of transcriptomes from more and more new organisms. Some degree of automation and evaluation is required to ensure reproducibility, repeatability and the selection of the best possible transcriptome. Workflows and pipelines are becoming an absolute requirement for this purpose, but the issue of evaluating de novo transcriptome assemblies in organisms lacking a sequenced genome remains unsolved. An automated, reproducible and flexible framework called TransFlow that accomplishes this task is described.

RESULTS

TransFlow, with its five independent modules, was designed to build different workflows depending on the nature of the original reads. This architecture enables different combinations of Illumina and Roche/454 sequencing data, and can be extended to other sequencing platforms. Its capabilities are illustrated with the selection of reliable plant reference transcriptomes and the assembly of six transcriptomes (three case studies for grapevine leaves, olive tree pollen, and chestnut stem, and another three for haustorium, epiphytic structures and their combination for the phytopathogenic fungus Podosphaera xanthii). The Arabidopsis and poplar transcriptomes proved to be the best references. A common result regarding de novo assemblies is that Illumina paired-end reads of 100 nt in length assembled with OASES can provide reliable transcriptomes, while the contribution of longer reads is noticeable only when they complement a set of short, single-end reads.

CONCLUSIONS

TransFlow can handle up to 181 different assembly strategies. Evaluation based on principal component analyses allows it to adapt to different sets of reads and provide a suitable transcriptome for each combination of reads and assemblers. As a result, each case study has its own behaviour and prioritises its own evaluation parameters, and the framework gives an objective and automated way of detecting the best transcriptome within a pool of them. Sequencing data type and quantity (preferably several hundred million reads of 2×100 nt or longer), assemblers (OASES for Illumina, MIRA4 and EULER-SR reconciled with CAP3 for Roche/454) and strategy (preferably scaffolding with OASES, and probably merging with Roche/454 when available) emerge as the most influential factors.


Affiliation(s)

Pedro Seoane Departmento de Biología Molecular y Bioquímica, Universidad de Málaga, Campus de Teatinos s/n, Malaga, 29071 Spain
Marina Espigares Departmento de Biología Molecular y Bioquímica, Universidad de Málaga, Campus de Teatinos s/n, Malaga, 29071 Spain
Rosario Carmona Plant Reproductive Biology Laboratory, Department of Biochemistry, Cell and Molecular Biology of Plants. Estación Experimental del Zaidín. CSIC, Prof. Albareda, 1, Granada, 18160 Spain
Álvaro Polonio Departamento de Microbiología, and Instituto de Hortofruticultura Subtropical y Mediterránea “La Mayora”, Universidad de Málaga, Consejo Superior de Investigaciones Científicas (IHSM-UMA-CSIC), Campus de Teatinos s/n, Malaga, 29071 Spain
Julia Quintana Department of Chemistry and Biochemistry, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, 01609-2280 USA
Enrico Cretazzo Instituto Andaluz de Investigación y Formación Agraria (IFAPA), Centro de Churriana, Cortijo de la Cruz s/n, Churriana, 29140 Spain
Josefina Bota Grup de Recerca en Biologia de les Plantes en Condicions Mediterrànies, Departament de Biologia, Universitat de les Illes Balears, Carretera de Valldemossa, km 7.5, Palma de Mallorca, 07122 Spain
Alejandro Pérez-García Departamento de Microbiología, and Instituto de Hortofruticultura Subtropical y Mediterránea “La Mayora”, Universidad de Málaga, Consejo Superior de Investigaciones Científicas (IHSM-UMA-CSIC), Campus de Teatinos s/n, Malaga, 29071 Spain
Juan de Dios Alché Plant Reproductive Biology Laboratory, Department of Biochemistry, Cell and Molecular Biology of Plants. Estación Experimental del Zaidín. CSIC, Prof. Albareda, 1, Granada, 18160 Spain
Luis Gómez Departamento de Sistemas y Recursos Naturales, ETSI Forestal, de Montes y del Medio Natural, Universidad Politécnica de Madrid, Ciudad Universitaria, Madrid, 28040 Spain CBGP, INIA-Universidad Politécnica de Madrid, Campus de Montegancedo, Pozuelo de Alarcón, 28223 Spain
M. Gonzalo Claros Departmento de Biología Molecular y Bioquímica, Universidad de Málaga, Campus de Teatinos s/n, Malaga, 29071 Spain


Phuphisut O, Ajawatanawong P, Limpanont Y, Reamtong O, Nuamtanong S, Ampawong S, Chaimon S, Dekumyoy P, Watthanakulpanich D, Swierczewski BE, Adisakwattana P. Transcriptomic analysis of male and female Schistosoma mekongi adult worms. Parasit Vectors 2018;11:504. [PMID: 30201055 PMCID: PMC6131826 DOI: 10.1186/s13071-018-3086-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 05/14/2018] [Accepted: 08/29/2018] [Indexed: 12/23/2022]

Abstract

Background

Schistosoma mekongi is one of five major causative agents of human schistosomiasis and is endemic to communities along the Mekong River in southern Lao People’s Democratic Republic (Laos) and northern Cambodia. Sporadic cases of schistosomiasis have been reported in travelers and immigrants who have visited endemic areas. The biology and molecular biology of Schistosoma mekongi are poorly understood, and few S. mekongi gene and transcript sequences are available in public databases.

Results

Transcriptome sequencing (RNA-Seq) data from male and female S. mekongi adult worms (three biological replicates for each sex) were analyzed, yielding approximately 304.9 and 363.3 million high-quality clean reads (Q30 > 90%) from male and female adult worms, respectively. A total of 119,604 contigs were assembled, with an average length of 1273 nt and an N50 of 2017 nt. From the contigs, 20,798 annotated protein sequences and 48,256 annotated transcript sequences were obtained using BLASTP and BLASTX searches against the UniProt Trematoda database. A total of 4658 and 3509 transcripts were predominantly expressed in male and female worms, respectively. Male-biased transcripts were mostly involved in structural organization, while female-biased transcripts were typically involved in cell differentiation and egg production. Interestingly, pathway enrichment analysis suggested that genes involved in the phosphatidylinositol signaling pathway may play important roles in the cellular processes and reproductive systems of S. mekongi worms.
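The assembly statistics quoted above (average contig length and N50) follow standard definitions. As a minimal sketch, N50 is the length of the contig at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. The toy lengths below are invented for illustration and are unrelated to the S. mekongi assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative length first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("n50() requires at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example with made-up contig lengths (in nt).
toy_lengths = [100, 200, 300, 400, 500]
mean_length = sum(toy_lengths) / len(toy_lengths)

print(mean_length)       # 300.0
print(n50(toy_lengths))  # 400
```

N50 exceeds the mean in the toy data for the same reason it does in the abstract: it weights contigs by the sequence they contribute, so long contigs dominate.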

Conclusions

We present comparative transcriptomic analyses of male and female S. mekongi adult worms, which provide a global view of the S. mekongi transcriptome as well as insights into differentially-expressed genes associated with each sex. This work provides valuable information and sequence resources for future studies of gene function and for ongoing whole genome sequencing efforts in S. mekongi.

Electronic supplementary material

The online version of this article (10.1186/s13071-018-3086-z) contains supplementary material, which is available to authorized users.


Ward MJ, Rokyta DR. Venom-gland transcriptomics and venom proteomics of the giant Florida blue centipede, Scolopendra viridis. Toxicon 2018;152:121-136. [PMID: 30086358 DOI: 10.1016/j.toxicon.2018.07.030] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/06/2018] [Revised: 07/25/2018] [Accepted: 07/31/2018] [Indexed: 12/19/2022]

von Reumont BM. Studying Smaller and Neglected Organisms in Modern Evolutionary Venomics Implementing RNASeq (Transcriptomics)-A Critical Guide. Toxins (Basel) 2018;10:toxins10070292. [PMID: 30012955 PMCID: PMC6070909 DOI: 10.3390/toxins10070292] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/25/2018] [Revised: 07/06/2018] [Accepted: 07/13/2018] [Indexed: 12/20/2022]

Abstract

Venoms are key evolutionary adaptations that species employ for defense, predation or competition. However, the processes and forces that drive the evolution of venoms and their toxin components remain in many aspects understudied. In particular, the venoms of many smaller, neglected (mostly invertebrate) organisms are not characterized in detail, especially with modern methods. For the majority of these taxa, even their biology is only vaguely known. Modern evolutionary venomics addresses the question of how venoms evolve by applying a plethora of -omics methods. These have recently become so sensitive that the venoms of smaller, neglected organisms are now more easily accessible for comparative study. More knowledge about these taxa is essential to better understand venom evolution in general. The methodological core pillars of integrative evolutionary venomics are genomics, transcriptomics and proteomics, which are complemented by functional morphology and the field of protein synthesis and activity tests. This manuscript focuses on transcriptomics (or RNASeq) as one toolbox to describe venom evolution in smaller, neglected taxa. It provides a hands-on guide that discusses a generalized RNASeq workflow, which can be adapted to individual projects. For neglected and small taxa, generalized recommendations are difficult to give and conclusions need to be drawn case by case. In the context of evolutionary venomics, this overview highlights the critical points, but also the promise, of RNASeq analyses. Methodologically, these concern the impact of read processing, possible improvements from performing multiple and merged assemblies, and adequate quantification of expressed transcripts. Readers are guided to reappraise their hypotheses on venom evolution in smaller organisms and how robustly these can be tested with the current transcriptomics toolbox. The complementary approach that combines transcriptomics with proteomics in particular, but also with genomics, is discussed as well. As recently shown, comparative proteomics is, for example, most important in preventing false positive identifications of possible toxin transcripts. Finally, future directions in transcriptomics, such as applying third-generation sequencing strategies to overcome the difficulties of short-read assemblies, are briefly addressed.


Evaluating the Performance of De Novo Assembly Methods for Venom-Gland Transcriptomics. Toxins (Basel) 2018;10:toxins10060249. [PMID: 29921759 PMCID: PMC6024825 DOI: 10.3390/toxins10060249] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 04/27/2018] [Revised: 06/14/2018] [Accepted: 06/15/2018] [Indexed: 11/17/2022]

Abstract

Venom-gland transcriptomics is a key tool in the study of the evolution, ecology, function, and pharmacology of animal venoms. In particular, gene-expression variation and coding sequences gained through transcriptomics provide key information for explaining functional venom variation over both ecological and evolutionary timescales. The accuracy and usefulness of inferences made through transcriptomics, however, are limited by the accuracy of the transcriptome assembly, which is a bioinformatic problem with several possible solutions. Several methods have been employed to assemble venom-gland transcriptomes, with the Trinity assembler being the most commonly applied among them. Although previous evidence of variation in performance among assembly software exists, particularly regarding recovery of difficult-to-assemble multigene families such as snake venom metalloproteinases, much work to date still employs a single assembly method. We evaluated the performance of several commonly used de novo assembly methods for the recovery of both nontoxin transcripts and complete, high-quality venom-gene transcripts across eleven snake and four scorpion transcriptomes. We varied k-mer sizes used by some assemblers to evaluate the impact of k-mer length on transcript recovery. We showed that the recovery of nontoxin transcripts and toxin transcripts is best accomplished through different assembly software, with SDT at smaller k-mer lengths and Trinity being best for nontoxin recovery and a combination of SeqMan NGen and a seed-and-extend approach implemented in Extender as the best means of recovering a complete set of toxin transcripts. In particular, Extender was the only means tested capable of assembling multiple isoforms of the diverse snake venom metalloproteinase family, while traditional approaches such as Trinity recovered at most one metalloproteinase transcript. Our work demonstrated that traditional metrics of assembly performance are not predictive of performance in the recovery of complete and high-quality toxin genes. Instead, effective venom-gland transcriptomic studies should combine and quality-filter the results of several assemblers with varying algorithmic strategies.
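Several of the abstracts on this page vary the k-mer size used by an assembler. As a minimal sketch of what that parameter means, the snippet below splits a sequence into overlapping substrings of length k and shows how a repeat that collapses into shared k-mers at a small k is resolved at a larger k. The toy sequence and the `kmers` helper are invented for illustration; real assemblers build graphs from k-mers of reads and are far more involved.

```python
def kmers(sequence, k):
    """Return all overlapping substrings of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Toy sequence in which "ATGC" occurs twice.
seq = "ATGCGATGCA"

small = kmers(seq, 3)
large = kmers(seq, 5)

# With k=3 the repeat produces duplicate k-mers (ATG and TGC occur twice).
print(len(small), len(set(small)))  # 8 6
# With k=5 every k-mer is unique, so the repeat no longer causes ambiguity.
print(len(large), len(set(large)))  # 6 6
```

The trade-off is that a larger k yields fewer k-mers per read, so lowly covered transcripts can be lost, which is one reason studies compare several k values.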


Hoang NV, Furtado A, Thirugnanasambandam PP, Botha FC, Henry RJ. De novo assembly and characterizing of the culm-derived meta-transcriptome from the polyploid sugarcane genome based on coding transcripts. Heliyon 2018;4:e00583. [PMID: 29862346 PMCID: PMC5968133 DOI: 10.1016/j.heliyon.2018.e00583] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/14/2017] [Revised: 03/02/2018] [Accepted: 03/16/2018] [Indexed: 12/31/2022]

Abstract

Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an earlier study by using about one billion paired-end (150 bp) reads of deep RNA sequencing of samples from 20 diverse sugarcane genotypes and combining de novo assemblies from different assemblers and different settings. Although many genes could be recovered, this resulted in a large combined assembly which created the need for clustering to reduce transcript redundancy while maintaining gene content. Here, we present a comprehensive analysis of the effect of different assembly settings and clustering methods on de novo assembly, annotation and transcript profiling focusing especially on the coding transcripts from the highly polyploid sugarcane genome. The new coding sequence-based transcript clustering resulted in a better representation of transcripts compared to the earlier approach, having 121,987 contigs, which included 78,052 main and 43,935 alternative transcripts. About 73%, 67%, 61% and 10% of the transcriptome was annotated against the NCBI NR protein database, GO terms, orthologous groups and KEGG orthologies, respectively. Using this set for a differential gene expression analysis between the young and mature sugarcane culm tissues, a total of 822 transcripts were found to be differentially expressed, including key transcripts involved in sugar/fiber accumulation in sugarcane. In the context of the lack of a whole genome sequence for sugarcane, the availability of a well annotated culm-derived meta-transcriptome through deep sequencing provides useful information on coding genes specific to the sugarcane culm and will certainly contribute to understanding the process of carbon partitioning, and biomass accumulation in the sugarcane culm.


Dhaygude K, Trontti K, Paviala J, Morandin C, Wheat C, Sundström L, Helanterä H. Transcriptome sequencing reveals high isoform diversity in the ant Formica exsecta. PeerJ 2017;5:e3998. [PMID: 29177112 PMCID: PMC5701548 DOI: 10.7717/peerj.3998] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 04/14/2017] [Accepted: 10/17/2017] [Indexed: 12/21/2022]

Gates K, Sandoval-Castillo J, Bernatchez L, Beheregaray LB. De novo transcriptome assembly and annotation for the desert rainbowfish (Melanotaenia splendida tatei) with comparison with candidate genes for future climates. Mar Genomics 2017;35:63-68. [DOI: 10.1016/j.margen.2017.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/30/2017] [Accepted: 05/15/2017] [Indexed: 01/25/2023]

Challenges and advances for transcriptome assembly in non-model species. PLoS One 2017;12:e0185020. [PMID: 28931057 PMCID: PMC5607178 DOI: 10.1371/journal.pone.0185020] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 01/30/2017] [Accepted: 09/04/2017] [Indexed: 12/28/2022] Open

Abstract

Analyses of high-throughput transcriptome sequences of non-model organisms are based on two main approaches: de novo assembly and genome-guided assembly using mapping to assign reads prior to assembly. Given the limits of mapping reads to a reference when it is highly divergent, as is frequently the case for non-model species, we evaluate whether using blastn would outperform mapping methods for read assignment in such situations (>15% divergence). We demonstrate its high performance by using simulated reads of lengths corresponding to those generated by the most common sequencing platforms, and over a realistic range of genetic divergence (0% to 30% divergence). Here we focus on gene identification and not on resolving the whole set of transcripts (i.e. the complete transcriptome). For simulated datasets, the transcriptome-guided assembly based on blastn recovers 94.8% of genes irrespective of read length at 0% divergence; however, assignment rate of reads is negatively correlated with both increasing divergence level and reducing read lengths. Nevertheless, we still observe 92.6% of recovered genes at 30% divergence irrespective of read length. This analysis also produces a categorization of genes relative to their assignment, and suggests guidelines for data processing prior to analyses of comparative transcriptomics and gene expression to minimize potential inferential bias associated with incorrect transcript assignment. We also compare the performances of de novo assembly alone vs in combination with a transcriptome-guided assembly based on blastn both via simulation and empirically, using data from a cyprinid fish species and from an oak species. For any simulated scenario, the transcriptome-guided assembly using blastn outperforms the de novo approach alone, including when the divergence level is beyond the reach of traditional mapping methods. 
Combining de novo assembly and a related reference transcriptome for read assignment also addresses the bias/error in contigs caused by the dependence on a related reference alone. Empirical data corroborate these findings when assembling transcriptomes from the two non-model organisms: Parachondrostoma toxostoma (fish) and Quercus pubescens (plant). For the fish species, out of the 31,944 genes known from D. rerio, the guided and de novo assemblies recover respectively 20,605 and 20,032 genes but the performance of the guided assembly approach is much higher for both the contiguity and completeness metrics. For the oak, out of the 29,971 genes known from Vitis vinifera, the transcriptome-guided and de novo assemblies display similar performance, but the new guided approach detects 16,326 genes where the de novo assembly only detects 9,385 genes.

Lopez L, Wolf EM, Pires JC, Edger PP, Koch MA. Molecular Resources from Transcriptomes in the Brassicaceae Family. Front Plant Sci 2017;8:1488. [PMID: 28900436 PMCID: PMC5581910 DOI: 10.3389/fpls.2017.01488] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/16/2017] [Accepted: 08/11/2017] [Indexed: 06/07/2023]

Johnson KL, Cassin AM, Lonsdale A, Bacic A, Doblin MS, Schultz CJ. Pipeline to Identify Hydroxyproline-Rich Glycoproteins. Plant Physiol 2017;174:886-903. [PMID: 28446635 PMCID: PMC5462032 DOI: 10.1104/pp.17.00294] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 03/06/2017] [Accepted: 04/21/2017] [Indexed: 05/14/2023]

Affiliation(s)

Kim L Johnson, Andrew M Cassin, Andrew Lonsdale, Antony Bacic, Monika S Doblin: Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia
Carolyn J Schultz: School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia

Hoang NV, Furtado A, Mason PJ, Marquardt A, Kasirajan L, Thirugnanasambandam PP, Botha FC, Henry RJ. A survey of the complex transcriptome from the highly polyploid sugarcane genome using full-length isoform sequencing and de novo assembly from short read sequencing. BMC Genomics 2017;18:395. [PMID: 28532419 PMCID: PMC5440902 DOI: 10.1186/s12864-017-3757-8] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Received: 10/04/2016] [Accepted: 05/03/2017] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Despite the economic importance of sugarcane in sugar and bioenergy production, there is not yet a reference genome available. Most of the sugarcane transcriptomic studies have been based on Saccharum officinarum gene indices (SoGI), expressed sequence tags (ESTs) and de novo assembled transcript contigs from short-reads; hence knowledge of the sugarcane transcriptome is limited in relation to transcript length and number of transcript isoforms.

RESULTS

The sugarcane transcriptome was sequenced using PacBio isoform sequencing (Iso-Seq) of a pooled RNA sample derived from leaf, internode and root tissues, of different developmental stages, from 22 varieties, to explore the potential for capturing full-length transcript isoforms. A total of 107,598 unique transcript isoforms were obtained, representing about 71% of the total number of predicted sugarcane genes. The majority of this dataset (92%) matched the plant protein database, while just over 2% was novel transcripts, and over 2% was putative long non-coding RNAs. About 56% and 23% of total sequences were annotated against the gene ontology and KEGG pathway databases, respectively. Comparison with de novo contigs from Illumina RNA-Sequencing (RNA-Seq) of the internode samples from the same experiment and public databases showed that the Iso-Seq method recovered more full-length transcript isoforms, had a higher N50 and average length of largest 1,000 proteins; whereas a greater representation of the gene content and RNA diversity was captured in RNA-Seq. Only 62% of PacBio transcript isoforms matched 67% of de novo contigs, while the non-matched proportions were attributed to the inclusion of leaf/root tissues and the normalization in PacBio, and the representation of more gene content and RNA classes in the de novo assembly, respectively. About 69% of PacBio transcript isoforms and 41% of de novo contigs aligned with the sorghum genome, indicating the high conservation of orthologs in the genic regions of the two genomes.

CONCLUSIONS

The transcriptome dataset should contribute to improved sugarcane gene models and sugarcane protein predictions; and will serve as a reference database for analysis of transcript expression in sugarcane.

Affiliation(s)

Nam V Hoang: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia; College of Agriculture and Forestry, Hue University, Hue, Vietnam
Agnelo Furtado: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
Patrick J Mason: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
Annelie Marquardt: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia; Sugar Research Australia, Indooroopilly, QLD, 4068, Australia
Lakshmi Kasirajan: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia; ICAR - Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India
Prathima P Thirugnanasambandam: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia; ICAR - Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India
Frederik C Botha: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia; Sugar Research Australia, Indooroopilly, QLD, 4068, Australia
Robert J Henry: Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
