1
Meyer D, Kosacka J, von Bergen M, Christ B, Marz M. Data report on gene expression after hepatic portal vein ligation (PVL) in rats. Front Genet 2024; 15:1421955. [PMID: 39233735; PMCID: PMC11371715; DOI: 10.3389/fgene.2024.1421955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Accepted: 07/09/2024] [Indexed: 09/06/2024] Open
Affiliation(s)
- Daria Meyer
- Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Jena, Germany
- Oncgnostics GmbH, Jena, Germany
- Joanna Kosacka
- Cell Transplantation/Molecular Hepatology Lab, Department of Visceral, Transplant, Thoracic, and Vascular Surgery, University of Leipzig Medical Center, Leipzig, Germany
- Martin von Bergen
- Molecular Systems Biology, Helmholtz Centre for Environmental Research-UFZ, Leipzig, Germany
- Bruno Christ
- Cell Transplantation/Molecular Hepatology Lab, Department of Visceral, Transplant, Thoracic, and Vascular Surgery, University of Leipzig Medical Center, Leipzig, Germany
- Manja Marz
- Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Jena, Germany
- FLI Leibniz Institute for Age Research, Jena, Germany
- European Virus Bioinformatics Center, Jena, Germany
- Michael Stifel Center Jena, Jena, Germany
- Aging Research Center (ARC), Jena, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Max Planck Institute for the Science of Human History, Jena, Germany
2
Kasianova AM, Penin AA, Schelkunov MI, Kasianov AS, Logacheva MD, Klepikova AV. Trans2express - de novo transcriptome assembly pipeline optimized for gene expression analysis. Plant Methods 2024; 20:128. [PMID: 39152473; PMCID: PMC11330051; DOI: 10.1186/s13007-024-01255-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 08/01/2024] [Indexed: 08/19/2024]
Abstract
BACKGROUND: As the genomes of many eukaryotic species, especially plants, are large and complex, their de novo sequencing and assembly is still a difficult task despite progress in sequencing technologies. An alternative to genome assembly is the assembly of the transcriptome, the set of RNA products of the expressed genes. While many de novo transcriptome assemblers exist, the challenges of transcriptomes (the existence of isoforms, the uneven expression levels across genes) complicate the generation of high-quality assemblies suitable for downstream analyses. RESULTS: We developed Trans2express, a web-based tool and pipeline for de novo hybrid transcriptome assembly and postprocessing based on rnaSPAdes with a set of subsequent filtering steps. The pipeline was tested on Arabidopsis thaliana cDNA sequencing data obtained using Illumina and Oxford Nanopore Technologies platforms and on three non-model plant species. Comparison of the structural characteristics of the transcriptome assembly with the reference Arabidopsis genome revealed the high quality of the assembled transcriptome, with 86.1% of Arabidopsis expressed genes assembled as a single contig. We tested the applicability of the transcriptome assembly for gene expression analysis. For both Arabidopsis and the non-model species, the results showed high congruence of gene expression levels and sets of differentially expressed genes between analyses based on the genome and those based on the transcriptome assembly. CONCLUSIONS: We present Trans2express, a protocol for de novo hybrid transcriptome assembly aimed at recovering a single transcript per gene. We expect this protocol to promote the characterization of transcriptomes and gene expression analysis in non-model plants, and the web-based tool to be of use to a wide range of plant biologists.
Affiliation(s)
- Aleksandra M Kasianova
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
- Aleksey A Penin
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Mikhail I Schelkunov
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
- Artem S Kasianov
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Maria D Logacheva
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Moscow, Russia
- Anna V Klepikova
- Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia
3
Ritsch M, Eulenfeld T, Lamkiewicz K, Schoen A, Weber F, Hölzer M, Marz M. Endogenous Bornavirus-like Elements in Bats: Evolutionary Insights from the Conserved Riboviral L-Gene in Microbats and Its Antisense Transcription in Myotis daubentonii. Viruses 2024; 16:1210. [PMID: 39205184; PMCID: PMC11360350; DOI: 10.3390/v16081210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 07/16/2024] [Accepted: 07/24/2024] [Indexed: 09/04/2024] Open
Abstract
Bats are ecologically diverse vertebrates characterized by their ability to host a wide range of viruses without apparent illness and the presence of numerous endogenous viral elements (EVEs). EVEs are well preserved, expressed, and may affect host biology and immunity, but their role in bat immune system evolution remains unclear. Among EVEs, endogenous bornavirus-like elements (EBLs) are bornavirus sequences integrated into animal genomes. Here, we identified a novel EBL in the microbat Myotis daubentonii, EBLL-Cultervirus.10-MyoDau (short name: CV.10-MyoDau), that shows protein-level conservation with the L-protein of a Cultervirus (Wuhan sharpbelly bornavirus). Surprisingly, we discovered a transcript on the antisense strand comprising three exons, which we named AMCR-MyoDau. The active transcription of AMCR-MyoDau in Myotis daubentonii tissues, confirmed by RNA-Seq analysis and RT-PCR, highlights its potential role during viral infections. Using comparative genomics comprising 63 bat genomes, we demonstrate nucleotide-level conservation of CV.10-MyoDau and AMCR-MyoDau across various bat species and its detection in 22 Yangochiroptera and 12 Yinpterochiroptera species. To the best of our knowledge, this marks the first occurrence of a conserved EVE shared among diverse bat species, which is accompanied by a conserved antisense transcript. This highlights the need for future research to explore the role of EVEs in shaping the evolution of bat immunity.
Affiliation(s)
- Muriel Ritsch
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Tom Eulenfeld
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Cluster of Excellence Balance of the Microverse, Friedrich Schiller University Jena, 07743 Jena, Germany
- Kevin Lamkiewicz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Andreas Schoen
- Institute for Virology, FB10-Veterinary Medicine, Justus Liebig University, 35392 Gießen, Germany
- Friedemann Weber
- Institute for Virology, FB10-Veterinary Medicine, Justus Liebig University, 35392 Gießen, Germany
- Martin Hölzer
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Genome Competence Center (MF1), Robert Koch Institute, 13353 Berlin, Germany
- Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103 Leipzig, Germany
- Fritz Lipmann Institute-Leibniz Institute on Aging, 07745 Jena, Germany
4
Lim PK, Wang R, Mutwil M. LSTrAP-denovo: Automated Generation of Transcriptome Atlases for Eukaryotic Species Without Genomes. Physiologia Plantarum 2024; 176:e14407. [PMID: 38973613; DOI: 10.1111/ppl.14407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 05/28/2024] [Indexed: 07/09/2024]
Abstract
Despite the abundance of species with transcriptomic data, a significant number of species still lack sequenced genomes, making it difficult to study gene function and expression in these organisms. While de novo transcriptome assembly can be used to assemble protein-coding transcripts from RNA-sequencing (RNA-seq) data, the datasets used often only feature samples of arbitrarily selected or similar experimental conditions, which might fail to capture condition-specific transcripts. We developed the Large-Scale Transcriptome Assembly Pipeline for de novo assembled transcripts (LSTrAP-denovo) to automatically generate transcriptome atlases of eukaryotic species. Specifically, given an NCBI TaxID, LSTrAP-denovo can (1) filter undesirable RNA-seq accessions based on read data, (2) select RNA-seq accessions via unsupervised machine learning to construct a sample-balanced dataset for download, (3) assemble transcripts via over-assembly, (4) functionally annotate coding sequences (CDS) from assembled transcripts and (5) generate transcriptome atlases in the form of expression matrices for downstream transcriptomic analyses. LSTrAP-denovo is easy to implement, written in Python, and is freely available at https://github.com/pengkenlim/LSTrAP-denovo/.
Affiliation(s)
- Peng Ken Lim
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Ruoxi Wang
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Marek Mutwil
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
5
Park JE, Patnaik BB, Sang MK, Song DK, Jeong JY, Hong CE, Kim YT, Shin HJ, Ziwei L, Patnaik HH, Hwang HJ, Park SY, Kang SW, Ko JH, Lee JS, Park HS, Jo YH, Han YS, Lee YS. Transcriptome sequencing of the endangered land snail Karaftohelix adamsi from the Island Ulleung: De novo assembly, annotation, valuation of fitness genes and SSR markers. Genes Genomics 2024; 46:851-870. [PMID: 38809491; DOI: 10.1007/s13258-024-01511-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 03/08/2024] [Indexed: 05/30/2024]
Abstract
BACKGROUND: The Bradybaenidae snail Karaftohelix adamsi is endemic to Korea, with the species tracked from Ulleung Island in North Gyeongsang Province of South Korea. K. adamsi has been classified under the Endangered Wildlife Class II species of Korea and faces a severe risk of extinction following habitat disturbances. With no available information at the DNA (genome) or mRNA (transcriptome) level for the species, conservation by utilizing informed molecular resources seems difficult. OBJECTIVE: In this study, we used Illumina short-read sequencing and Trinity de novo assembly to draft the reference transcriptome of K. adamsi. RESULTS: After assembly, 13,753 unigenes were obtained, of which 10,511 were annotated to public databases (a maximum of 10,165 unigenes found homologs in PANM DB). A total of 6,351, 3,535, 358, and 3,407 unigenes were ascribed to the functional categories under KOG, GO, KEGG, and IPS, respectively. Transcripts such as HSP 70, aquaporin, TLR, and MAPK, among others, were screened as putative functional resources for adaptation. DNA transposons were more abundant than retrotransposons in the assembled unigenes. Further, 2,164 SSRs were screened, with a prevalence of dinucleotide repeats such as AC/GT and AG/CT. CONCLUSION: The transcriptome-guided discovery of molecular resources in K. adamsi will not only serve as a basis for functional genomics studies but also provide sustainable tools to be utilized for the protection of the species in the wild. Moreover, the development of polymorphic SSRs is valuable for the identification of species from newer habitats and cross-species genotyping.
Affiliation(s)
- Jie Eun Park
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Chungnam, 31, Asan, South Korea
- Bharat Bhusan Patnaik
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- PG Department of Biosciences and Biotechnology, Fakir Mohan University, Nuapadhi, Balasore, Odisha, 756089, India
- Min Kyu Sang
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Chungnam, 31, Asan, South Korea
- Dae Kwon Song
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Chungnam, 31, Asan, South Korea
- Jun Yang Jeong
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- Chan Eui Hong
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- Yong Tae Kim
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- Hyeon Jun Shin
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- Liu Ziwei
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- Hongray Howrelia Patnaik
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- PG Department of Zoology, BJB Autonomous College, Bhubaneswar, Odisha, 751014, India
- Hee Ju Hwang
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- So Young Park
- Biodiversity Research Team, Animal & Plant Research Department, Nakdonggang National Institute of Biological Resources, Sangju, Gyeongbuk, South Korea
- Se Won Kang
- Biological Resource Center (BRC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Jeongeup, Jeonbuk, South Korea
- Jung Ho Ko
- Police Science Institute, Korean National Police University, Asan, 31539, Chungnam, Korea
- Jun Sang Lee
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Hong Seog Park
- Research Institute, GnC BIO Co., LTD, 621-6 Banseok-Dong, Yuseong-Gu, Daejeon, 34069, Korea
- Yong Hun Jo
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
- Yeon Soo Han
- College of Agriculture and Life Science, Chonnam National University, 77 Yongbong-Ro, Buk-Gu, Gwangju, 61186, South Korea
- Yong Seok Lee
- Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, 31538, South Korea
- Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Chungnam, 31, Asan, South Korea
- Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, Korea
6
Jackson DJ, Cerveau N, Posnien N. De novo assembly of transcriptomes and differential gene expression analysis using short-read data from emerging model organisms - a brief guide. Front Zool 2024; 21:17. [PMID: 38902827; PMCID: PMC11188175; DOI: 10.1186/s12983-024-00538-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 06/12/2024] [Indexed: 06/22/2024] Open
Abstract
Many questions in biology benefit greatly from the use of a variety of model systems. High-throughput sequencing methods have been a triumph in the democratization of diverse model systems. They allow for the economical sequencing of an entire genome or transcriptome of interest, and with technical variations can even provide insight into genome organization and the expression and regulation of genes. The analysis and biological interpretation of such large datasets can present significant challenges that depend on the 'scientific status' of the model system. While high-quality genome and transcriptome references are readily available for well-established model systems, the establishment of such references for an emerging model system often requires extensive resources such as finances, expertise and computation capabilities. The de novo assembly of a transcriptome represents an excellent entry point for genetic and molecular studies in emerging model systems as it can efficiently assess gene content while also serving as a reference for differential gene expression studies. However, the process of de novo transcriptome assembly is non-trivial, and as a rule must be empirically optimized for every dataset. For the researcher working with an emerging model system, and with little to no experience with assembling and quantifying short-read data from the Illumina platform, these processes can be daunting. In this guide we outline the major challenges faced when establishing a reference transcriptome de novo and we provide advice on how to approach such an endeavor. We describe the major experimental and bioinformatic steps, provide some broad recommendations and cautions for the newcomer to de novo transcriptome assembly and differential gene expression analyses. Moreover, we provide an initial selection of tools that can assist in the journey from raw short-read data to assembled transcriptome and lists of differentially expressed genes.
Affiliation(s)
- Daniel J Jackson
- University of Göttingen, Department of Geobiology, Goldschmidtstr.3, Göttingen, 37077, Germany
- Nicolas Cerveau
- University of Göttingen, Department of Geobiology, Goldschmidtstr.3, Göttingen, 37077, Germany
- Nico Posnien
- University of Göttingen, Department of Developmental Biology, GZMB, Justus-Von-Liebig-Weg 11, Göttingen, 37077, Germany
7
Sornsenee P, Surachat K, Kang DK, Mendoza R, Romyasamit C. Probiotic Insights from the Genomic Exploration of Lacticaseibacillus paracasei Strains Isolated from Fermented Palm Sap. Foods 2024; 13:1773. [PMID: 38891001; PMCID: PMC11172291; DOI: 10.3390/foods13111773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Revised: 05/31/2024] [Accepted: 06/04/2024] [Indexed: 06/20/2024] Open
Abstract
This study focused on L. paracasei strains isolated from fermented palm sap in southern Thailand that exhibit potential probiotic characteristics, including antibiotic susceptibility, resistance to gastrointestinal stresses, and antimicrobial activity against various pathogens. However, a thorough investigation of the whole genome sequences of L. paracasei isolates is required to ensure their safety and probiotic properties for human applications. This study aimed to sequence the genome of L. paracasei isolated from fermented palm sap, to assess its safety profile, and to conduct a comprehensive comparative genomic analysis with other Lacticaseibacillus species. The genome sizes of the seven L. paracasei strains ranged from 3,070,747 bp to 3,131,129 bp, with a GC content between 46.11% and 46.17%, supporting their classification as nomadic lactobacilli. In addition, the minimal presence of cloud genes and a significant number of core genes suggest a high degree of relatedness among the strains. Meanwhile, phylogenetic analysis of core genes revealed that the strains possessed distinct genes and were grouped into two distinct clades. Genomic analysis revealed key genes associated with probiotic functions, such as those involved in gastrointestinal and oxidative stress resistance, vitamin synthesis, and biofilm disruption. This study is consistent with previous studies that used whole-genome sequencing and bioinformatics to assess the safety and potential benefits of probiotics in various food fermentation processes. Our findings provide valuable insights into the potential use of seven L. paracasei strains isolated from fermented palm sap as probiotic and postbiotic candidates in functional foods and pharmaceuticals.
Affiliation(s)
- Phoomjai Sornsenee
- Department of Family and Preventive Medicine, Faculty of Medicine, Prince of Songkla University, Songkhla 90110, Thailand
- Komwit Surachat
- Department of Biomedical Sciences and Biomedical Engineering, Faculty of Medicine, Prince of Songkla University, Songkhla 90110, Thailand
- Dae-Kyung Kang
- Department of Animal Biotechnology, Dankook University, Cheonan 31116, Republic of Korea
- Remylin Mendoza
- Department of Animal Biotechnology, Dankook University, Cheonan 31116, Republic of Korea
- Chonticha Romyasamit
- Department of Medical Technology, School of Allied Health Sciences, Walailak University, Nakhon Si Thammarat 80160, Thailand
- Center of Excellence in Innovation of Essential Oil and Bioactive Compounds, Walailak University, Nakhon Si Thammarat 80160, Thailand
8
Seaman RP, Campbell R, Doe V, Yosufzai Z, Graber JH. A cloud-based training module for efficient de novo transcriptome assembly using Nextflow and Google cloud. Brief Bioinform 2024; 25:bbae313. [PMID: 38941113; PMCID: PMC11212313; DOI: 10.1093/bib/bbae313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 04/29/2024] [Accepted: 05/18/2024] [Indexed: 06/29/2024] Open
Abstract
This study describes the development of a resource module that is part of a learning platform named "NIGMS Sandbox for Cloud-based Learning" (https://github.com/NIGMS/NIGMS-Sandbox). The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module delivers learning materials on de novo transcriptome assembly using Nextflow in an interactive format that uses appropriate cloud resources for data access and analysis. Cloud computing is a powerful new means by which biomedical researchers can access resources and capacity that were previously either unattainable or prohibitively expensive. To take advantage of these resources, however, the biomedical research community needs new skills and knowledge. We present here a cloud-based training module, developed in conjunction with Google Cloud, Deloitte Consulting, and the NIH STRIDES Program, that uses the biological problem of de novo transcriptome assembly to demonstrate and teach the concepts of computational workflows (using Nextflow) and cost- and resource-efficient use of Cloud services (using Google Cloud Platform). Our work highlights the reduced necessity of on-site computing resources and the accessibility of cloud-based infrastructure for bioinformatics applications.
Affiliation(s)
- Ryan P Seaman
- MDI Biological Laboratory, 159 Old Bar Harbor Road, Bar Harbor, ME 04609, USA
- Ross Campbell
- Health Data and AI, 4301 Fairfax Dr, Unit 210, Deloitte Consulting LLP, Arlington, VA 22203, USA
- Valena Doe
- Google Cloud, Google, 1900 Reston Metro Plaza, Reston, VA 20190, USA
- Zelaikha Yosufzai
- Health Data and AI, 4301 Fairfax Dr, Unit 210, Deloitte Consulting LLP, Arlington, VA 22203, USA
- Joel H Graber
- MDI Biological Laboratory, 159 Old Bar Harbor Road, Bar Harbor, ME 04609, USA
9
Miranda S, Koop M, Angeli A, Lagrèze J, Malnoy M, Martens S. Assessment and Partial Characterization of Candidate Genes in Dihydrochalcone and Arbutin Biosynthesis in an Apple-Pear Hybrid by De Novo Transcriptome Assembly. Journal of Agricultural and Food Chemistry 2024; 72:11804-11819. [PMID: 38717061; DOI: 10.1021/acs.jafc.4c01006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2024]
Abstract
Apples (Malus × domestica Borkh.) and pears (Pyrus communis L.) are valuable, closely related crops within the Rosaceae family with reported nutraceutical properties derived from secondary metabolites including phloridzin and arbutin, which are distinctive phenolic metabolites characterizing apples and pears, respectively. Here, we generated a de novo transcriptome assembly of an intergeneric hybrid between apple and pear, which accumulates intermediate levels of phloridzin and arbutin. Combining RNA-seq, in silico functional annotation prediction, targeted gene expression analysis, and expression-metabolite correlations, we identified candidate genes for functional characterization, resulting in the identification of active arbutin synthases in the hybrid and parental genotypes. Although apples exhibit an active arbutin synthase in vitro, their natural lack of arbutin is explained by the absence of the substrate and broad substrate specificity. Altogether, our study serves as the basis for future assessment of potential physiological roles of identified genes by genome editing of hybrids and pears.
Affiliation(s)
- Simón Miranda
- Research and Innovation Centre, Edmund Mach Foundation, San Michele all'Adige 38098, Italy
- Marion Koop
- Research and Innovation Centre, Edmund Mach Foundation, San Michele all'Adige 38098, Italy
- Andrea Angeli
- Research and Innovation Centre, Edmund Mach Foundation, San Michele all'Adige 38098, Italy
- Jorge Lagrèze
- Research and Innovation Centre, Edmund Mach Foundation, San Michele all'Adige 38098, Italy
- Mickael Malnoy
- Research and Innovation Centre, Edmund Mach Foundation, San Michele all'Adige 38098, Italy
- Stefan Martens
- Research and Innovation Centre, Edmund Mach Foundation, San Michele all'Adige 38098, Italy
10
Ezieke AH, Serrano A, Peces M, Clarke W, Villa-Gomez D. Effect of feeding frequency on the anaerobic digestion of berry fruit waste. Waste Management (New York, N.Y.) 2024; 178:66-75. [PMID: 38377770; DOI: 10.1016/j.wasman.2024.02.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 01/29/2024] [Accepted: 02/05/2024] [Indexed: 02/22/2024]
Abstract
On-site anaerobic digesters for small agricultural farms typically have feeding schedules that fluctuate according to farm operations. Shocks in feeding, particularly for putrescible waste, can disrupt the stable operation of a digester. The effect of intermittent feeding on the anaerobic digestion of rejected raspberries was investigated in four 3 L reactors operated in semicontinuous mode for 350 days at 38 °C with a hydraulic retention time (HRT) of 25 days and an organic loading rate (OLR) of 1 g VS/L/d. During the acclimatisation period (147 days) the reactors received 5 feeds per week. The feeding regime of two reactors was then changed, while maintaining the same OLR and HRT, to one weekly feed event in one reactor and 3 equal feeds per week in another. The feeding regime did not significantly affect specific methane yield (369 ± 47 L/kg VS on average) despite very different weekly patterns in methane production. Volatile fatty acids (VFA) comprised >83% of the organics in the effluent, while the rest included non-inhibitory concentrations of phenolic compounds (515-556 mg gallic acid/L). The predominant microbial groups in all reactors were the archaeal genera Methanobacterium and Methanolinea and the bacterial phyla Bacteroidota and Firmicutes. Increasing the OLR to 2 g VS/L/d on day 238 resulted in failure of all reactors, attributed to insufficient alkalinity to counterbalance the VFA produced and to the pH decreasing below 6. Overall, the results suggest that optimal digestion of raspberry waste is maintained despite variations in feeding frequency, but acidification can occur with OLR changes.
Affiliation(s)
- Antonio Serrano
- The University of Queensland, School of Civil Engineering, Brisbane 4072, Australia; Institute of Water Research, University of Granada, Granada 18071, Spain; Department of Microbiology, Pharmacy Faculty, University of Granada, Campus de Cartuja s/n, Granada 18071, Spain
- Miriam Peces
- Department of Chemistry and Bioscience, Center for Microbial Communities, Aalborg University, Aalborg East 9220, Denmark
- William Clarke
- The University of Queensland, School of Civil Engineering, Brisbane 4072, Australia
- Denys Villa-Gomez
- The University of Queensland, School of Civil Engineering, Brisbane 4072, Australia
11
Bossert S, Pauly A, Danforth BN, Orr MC, Murray EA. Lessons from assembling UCEs: A comparison of common methods and the case of Clavinomia (Halictidae). Mol Ecol Resour 2024; 24:e13925. [PMID: 38183389; DOI: 10.1111/1755-0998.13925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 12/08/2023] [Accepted: 12/21/2023] [Indexed: 01/08/2024]
Abstract
Sequence data assembly is a foundational step in high-throughput sequencing, with untold consequences for downstream analyses. Despite this, few studies have interrogated the many methods for assembling phylogenomic UCE data for their comparative efficacy, or for how outputs may be impacted. We study this by comparing the most commonly used assembly methods for UCEs in the under-studied bee lineage Nomiinae and a representative sampling of relatives. Data for 63 UCE-only and 75 mixed taxa were assembled with five methods, including ABySS, HybPiper, SPAdes, Trinity and Velvet, and then benchmarked for their relative performance in terms of locus capture parameters and phylogenetic reconstruction. Unexpectedly, Trinity and Velvet trailed the other methods in terms of locus capture and DNA matrix density, whereas SPAdes performed favourably in most assessed metrics. In comparison with SPAdes, the guided-assembly approach HybPiper generally recovered the highest quality loci but in lower numbers. Based on our results, we formally move Clavinomia to Dieunomiini and render Epinomia once more a subgenus of Dieunomia. We strongly advise that future studies more closely examine the influence of assembly approach on their results, or, minimally, use better-performing assembly methods such as SPAdes or HybPiper. In this way, we can move forward with phylogenomic studies in a more standardized, comparable manner.
Affiliation(s)
- Silas Bossert
- Department of Entomology, Washington State University, Pullman, Washington, USA
- Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
| | - Alain Pauly
- Royal Belgian Institute of Natural Sciences, O.D. Taxonomy and Phylogeny, Brussels, Belgium
| | - Bryan N Danforth
- Department of Entomology, Cornell University, Ithaca, New York, USA
| | - Michael C Orr
- Entomologie, Staatliches Museum für Naturkunde Stuttgart, Stuttgart, Germany
| | - Elizabeth A Murray
- Department of Entomology, Washington State University, Pullman, Washington, USA
| |
|
12
|
Kang JN, Hur M, Kim CK, Yang SH, Lee SM. Enhancing transcriptome analysis in medicinal plants: multiple unigene sets in Astragalus membranaceus. FRONTIERS IN PLANT SCIENCE 2024; 15:1301526. [PMID: 38384760 PMCID: PMC10879423 DOI: 10.3389/fpls.2024.1301526] [Received: 09/26/2023] [Accepted: 01/22/2024] [Indexed: 02/23/2024]
Abstract
Astragalus membranaceus is a medicinal plant mainly used in East Asia and contains abundant secondary metabolites. Despite the importance of this plant, the available genomic and genetic information is still limited. De novo transcriptome construction is recognized as an essential method for transcriptome research when reference genome information is incomplete. In this study, we constructed three individual transcriptome sets (unigene sets) for detailed analysis of the phenylpropanoid biosynthesis pathway, whose products are major metabolites of A. membranaceus. Set-1 consisted of circular consensus sequences (CCSs) generated using PacBio sequencing (PacBio-seq). Set-2 consisted of unigenes from a hybrid assembly of Illumina sequencing (Illumina-seq) reads and PacBio CCSs using rnaSPAdes. Set-3 unigenes were assembled from Illumina-seq reads using the Trinity software. Construction of multiple unigene sets provides several advantages for transcriptome analysis. First, it provides an appropriate expression filtering threshold for assembly-based unigenes: a threshold of transcripts per million (TPM) ≥ 5 removed more than 88% of assembly-based unigenes, which were mostly short and lowly expressed. Second, assembly-based unigenes compensated for the incomplete length of PacBio CCSs: the ends of the 5′/3′ untranslated regions of phenylpropanoid-related unigenes derived from Set-1 were incomplete, which suggests that PacBio CCSs are unlikely to be full-length transcripts. Third, more isoform unigenes could be obtained from multiple unigene sets; isoform unigenes missing in Set-1 were detected in Set-2 and Set-3. Finally, gene ontology and Kyoto Encyclopedia of Genes and Genomes analyses showed that phenylpropanoid biosynthesis and carbohydrate metabolism were highly activated in A. membranaceus roots. Various sequencing technologies and assemblers have been developed for de novo transcriptome analysis. However, no technique is perfect for de novo transcriptome analysis, suggesting the need to construct multiple unigene sets. This method enables efficient transcript filtering and detection of longer and more diverse transcripts.
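The expression filter mentioned in this abstract (keeping unigenes with TPM ≥ 5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `tpm` and `filter_by_tpm`, the unigene IDs and the toy counts are hypothetical, and TPM is computed from raw read counts and unigene lengths in the usual way (reads per kilobase, rescaled so that all values sum to one million).

```python
def tpm(counts, lengths_bp):
    """Return transcripts per million for each unigene.

    counts: dict of unigene ID -> mapped read count (hypothetical input)
    lengths_bp: dict of unigene ID -> unigene length in base pairs
    """
    # Reads per kilobase: normalise each count by unigene length.
    rpk = {u: counts[u] / (lengths_bp[u] / 1000.0) for u in counts}
    # Scaling factor so that the TPM values sum to one million.
    scale = sum(rpk.values()) / 1e6
    if scale == 0:
        return {u: 0.0 for u in rpk}
    return {u: value / scale for u, value in rpk.items()}


def filter_by_tpm(tpm_values, threshold=5.0):
    """Keep only unigenes whose TPM is at least the threshold."""
    return {u: v for u, v in tpm_values.items() if v >= threshold}


if __name__ == "__main__":
    # Toy data: u2 is a weakly supported unigene.
    counts = {"u1": 1_000_000, "u2": 2, "u3": 500_000}
    lengths = {"u1": 1000, "u2": 1000, "u3": 500}
    kept = filter_by_tpm(tpm(counts, lengths), threshold=5.0)
    print(sorted(kept))
```

In practice the TPM values would normally come from a quantification tool rather than being recomputed by hand; the sketch only shows where the threshold is applied.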
Affiliation(s)
- Ji-Nam Kang
- Genomics Division, National Institute of Agricultural Sciences, Jeonju-si, Jeollabuk-do, Republic of Korea
| | - Mok Hur
- Department of Herbal Crop Resources, National Institute of Horticultural & Herbal Science, Eumseong-gun, Chungcheongbuk-do, Republic of Korea
| | - Chang-Kug Kim
- Genomics Division, National Institute of Agricultural Sciences, Jeonju-si, Jeollabuk-do, Republic of Korea
| | - So-Hee Yang
- Genomics Division, National Institute of Agricultural Sciences, Jeonju-si, Jeollabuk-do, Republic of Korea
| | - Si-Myung Lee
- Genomics Division, National Institute of Agricultural Sciences, Jeonju-si, Jeollabuk-do, Republic of Korea
| |
|
13
|
Martín-Manzo MV, Morelos-Castro RM, Munguia-Vega A, Soberanes-Yepiz ML, Cortés-Jacinto E. Transcriptome analysis of reproductive tract tissues of male river prawn Macrobrachium americanum. Mol Biol Rep 2024; 51:259. [PMID: 38302799 DOI: 10.1007/s11033-023-09125-6] [Received: 08/21/2023] [Accepted: 12/06/2023] [Indexed: 02/03/2024]
Abstract
BACKGROUND The river prawn, Macrobrachium americanum (M. americanum), is one of the largest prawns of the genus in Latin America and is an amphidromous species distributed along the Pacific coast of America. This prawn has commercial value due to its size and taste, making it a good option for aquaculture production. Its culture has been attempted in ponds and concrete tanks, but no technique has yet succeeded in supporting commercial production. Understanding the mechanisms that regulate reproduction at the molecular level is very important. This knowledge can provide tools for manipulating transcripts, which could increase the number or size of animals in the culture. Our understanding of the mechanism that regulates the reproduction of M. americanum at the molecular level is limited. AIM To perform and analyze the transcriptome assembly of the testes, vas deferens, and terminal ampulla of M. americanum to provide new molecular information about its reproduction. METHODS AND RESULTS The cDNA library was constructed and sequenced for each tissue to identify novel transcripts. A combined transcriptome with the three tissues was assembled using Trinity software. Unigenes were annotated using BLASTx and BLAST2GO. The transcriptome assembly generated 1,059,447 unigenes, of which 7,222 genes had significant hits (e-value < 1 × 10⁻⁵) when compared against the Swiss-Prot database. Around 75 genes were related to sex determination, testis development, spermatogenesis, spermiogenesis, fertilization, maturation of testicular cells, neuropeptides, hormones, hormone receptors, and/or embryogenesis. CONCLUSIONS These results provide new molecular information about M. americanum reproduction, representing a reference point for further genetic studies of this species.
Affiliation(s)
- Miriam Victoria Martín-Manzo
- Centro de Investigaciones Biológicas del Noroeste (CIBNOR), Playa Palo de Santa Rita Sur, Av. Instituto Politécnico Nacional 195, 23096, La Paz, BCS, Mexico
| | - Rosa María Morelos-Castro
- Centro de Investigaciones Biológicas del Noroeste Tepic, Investigadoras E Investigadores Por México-CONACYT. Unidad Nayarit, Nayarit, Mexico
| | - Adrian Munguia-Vega
- Applied Genomics Lab, Av. Gral. Félix Ortega Aguilar, 23000, La Paz, Baja California Sur, Mexico
- Conservation Genetics Laboratory, The University of Arizona, Tucson, AZ, 85721, USA
| | - Maritza Lourdes Soberanes-Yepiz
- Centro de Investigaciones Biológicas del Noroeste (CIBNOR), Playa Palo de Santa Rita Sur, Av. Instituto Politécnico Nacional 195, 23096, La Paz, BCS, Mexico
| | - Edilmar Cortés-Jacinto
- Centro de Investigaciones Biológicas del Noroeste (CIBNOR), Playa Palo de Santa Rita Sur, Av. Instituto Politécnico Nacional 195, 23096, La Paz, BCS, Mexico.
| |
|
14
|
Fonseca-González I, Velasquez-Agudelo E, Londoño-Mesa MH, Álvarez JC. De novo transcriptome sequencing and annotation of the Antarctic polychaete Microspio moorei (Spionidae) with its characterization of the heat stress-related proteins (HSP, SOD & CAT). Mar Genomics 2024; 73:101085. [PMID: 38301367 DOI: 10.1016/j.margen.2024.101085] [Received: 09/26/2023] [Revised: 12/07/2023] [Accepted: 01/22/2024] [Indexed: 02/03/2024]
Abstract
We present a de novo transcriptome assembly for the non-model Antarctic polychaete worm Microspio moorei (Spionidae) collected during an Antarctic field expedition in Fildes Bay, King George Island, Antarctic Peninsula, in 2017. Here, we report the first transcriptome reference array for Microspio spp. The gene sequences of the spionid worm were annotated from a wide range of functions (i.e., biological and metabolic processes, catalytic processes, and catalytic activity). HSP70, HSP90, SOD and CAT families were compared to reported annelid transcriptomes and proteomes. The phylogenetic analysis using COI, 16S, and 18S markers effectively clusters the species within the family. However, it also casts uncertainty on the monophyletic nature of the genus Microspio, indicating the necessity for additional data and potentially requiring a reevaluation of its grouping. Within these protein families, 3D model software was used to create one representative of their protein structures. Structural predictions were compared with related reported annelids living at different temperatures and a human X-ray reference. We found structural differences (RMSE >1.8) between the human HSP proteins but no significant differences between the polychaete-predicted proteins (RMSE <1.2). These results encourage further research of heat stress-related proteins, the development of genetic markers for climate change-induced temperature stress, and the study of the underlying mechanisms of the heat response. Moreover, these results motivate the extension of these findings to congeneric species.
Affiliation(s)
- Idalyd Fonseca-González
- LimnoBasE & Biotamar Research Group, Institute of Biology, University of Antioquia, Medellín 050010, Colombia
| | - Esteban Velasquez-Agudelo
- Research Group in Biodiversity, Evolution and Conservation (BEC), EAFIT University, Medellín 050022, Colombia
| | - Mario H Londoño-Mesa
- LimnoBasE & Biotamar Research Group, Institute of Biology, University of Antioquia, Medellín 050010, Colombia
| | - Javier C Álvarez
- Research Group in Biodiversity, Evolution and Conservation (BEC), EAFIT University, Medellín 050022, Colombia.
| |
|
15
|
Westrin KJ, Kretzschmar WW, Emanuelsson O. ClusTrast: a short read de novo transcript isoform assembler guided by clustered contigs. BMC Bioinformatics 2024; 25:54. [PMID: 38302873 PMCID: PMC10836024 DOI: 10.1186/s12859-024-05663-3] [Received: 11/21/2022] [Accepted: 01/18/2024] [Indexed: 02/03/2024] Open
Abstract
BACKGROUND Transcriptome assembly from RNA-sequencing data in species without a reliable reference genome has to be performed de novo, but studies have shown that de novo methods often have inadequate ability to reconstruct transcript isoforms. We address this issue by constructing an assembly pipeline whose main purpose is to produce a comprehensive set of transcript isoforms. RESULTS We present the de novo transcript isoform assembler ClusTrast, which takes short read RNA-seq data as input, assembles a primary assembly, clusters a set of guiding contigs, aligns the short reads to the guiding contigs, assembles each clustered set of short reads individually, and merges the primary and clusterwise assemblies into the final assembly. We tested ClusTrast on real datasets from six eukaryotic species, and showed that ClusTrast reconstructed more expressed known isoforms than any of the other tested de novo assemblers, at a moderate reduction in precision. For recall, ClusTrast was on top in the lower end of expression levels (<15% percentile) for all tested datasets, and over the entire range for almost all datasets. Reference transcripts were often (35-69% for the six datasets) reconstructed to at least 95% of their length by ClusTrast, and more than half of reference transcripts (58-81%) were reconstructed with contigs that exhibited polymorphism, as measured on a subset of reliably predicted contigs. ClusTrast recall increased when using a union of assembled transcripts from more than one assembly tool as primary assembly. CONCLUSION We suggest that ClusTrast can be a useful tool for studying isoforms in species without a reliable reference genome, in particular when the goal is to produce a comprehensive transcriptome set with polymorphic variants.
Affiliation(s)
- Karl Johan Westrin
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, 171 65, Solna, Sweden
| | - Warren W Kretzschmar
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, 171 65, Solna, Sweden
- Department of Medicine Huddinge, Center for Hematology and Regenerative Medicine (HERM), Karolinska Institute, 141 52, Flemingsberg, Sweden
| | - Olof Emanuelsson
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, 171 65, Solna, Sweden.
| |
|
16
|
Alvarez RV, Landsman D. GTax: improving de novo transcriptome assembly by removing foreign RNA contamination. Genome Biol 2024; 25:12. [PMID: 38191464 PMCID: PMC10773103 DOI: 10.1186/s13059-023-03141-2] [Received: 06/08/2022] [Accepted: 12/08/2023] [Indexed: 01/10/2024] Open
Abstract
The cost and complexity of generating a complete reference genome means that many organisms lack an annotated reference. An alternative is to use a de novo reference transcriptome. This technology is cost-effective but is susceptible to off-target RNA contamination. In this manuscript, we present GTax, a taxonomy-structured database of genomic sequences that can be used with BLAST to detect and remove foreign contamination in RNA sequencing samples before assembly. In addition, we use a de novo transcriptome assembly of Solanum lycopersicum (tomato) to demonstrate that removing foreign contamination in sequencing samples reduces the number of assembled chimeric transcripts.
Affiliation(s)
- Roberto Vera Alvarez
- Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH, Bethesda, MD, USA
| | - David Landsman
- Computational Biology Branch, National Center for Biotechnology Information, Intramural Research Program, National Library of Medicine, NIH, Bethesda, MD, USA.
| |
|
17
|
Khelghatibana F, Javan-Nikkhah M, Safaie N, Sobhani A, Shams S, Sari E. A reference transcriptome for walnut anthracnose pathogen, Ophiognomonia leptostyla, guides the discovery of candidate virulence genes. Fungal Genet Biol 2023; 169:103828. [PMID: 37657751 DOI: 10.1016/j.fgb.2023.103828] [Received: 04/12/2023] [Revised: 08/13/2023] [Accepted: 08/28/2023] [Indexed: 09/03/2023]
Abstract
Despite the economic losses due to walnut anthracnose, Ophiognomonia leptostyla is an orphan fungus with respect to genomic resources. In the present study, the transcriptome of O. leptostyla was assembled for the first time. RNA sequencing was conducted for fungal mycelia grown in liquid medium and for walnut leaf samples inoculated with fungal conidia and sampled at 48, 96 and 144 h post inoculation (hpi). The completeness, correctness, and contiguity of the de novo transcriptome assemblies generated with Trinity, Oases, SOAPdenovo-Trans and Bridger were compared to identify a single superior reference assembly. In most of the assessment criteria, including N50, Transrate score, number of ORFs with known description in gene bank, the percentage of reads mapped back to the transcript (RMBT), BUSCO score, Swiss-Prot coverage bin and RSEM-EVAL score, the Bridger assembly was superior and was thus used as a reference for profiling the O. leptostyla transcriptome in liquid medium vs. during walnut infection. The k-means clustering of transcripts resulted in four distinct transcription patterns across the three sampling time points. Most of the detected CAZy transcripts had elevated transcription at 96 hpi, which is hypothetically concurrent with the start of intracellular growth. The in-silico analysis revealed 103 candidate effectors, of which six were members of the Necrosis and Ethylene Inducing Like Protein (NLP) gene family belonging to three distinct k-means clusters. This study revealed a complex temporal pattern of CAZy and candidate effector transcription during the six days after O. leptostyla inoculation on walnut leaves, introducing a list of candidate virulence genes for validation in future studies.
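N50, the first contiguity criterion named in this abstract, is the length L such that transcripts of length at least L together account for at least half of the total assembled bases. A minimal sketch follows; the function name `n50` and the example lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths.

    Lengths are sorted from longest to shortest and accumulated until the
    running total reaches half of the summed length; the length at which
    that happens is the N50. An empty input returns 0.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


if __name__ == "__main__":
    # Hypothetical transcript lengths in base pairs.
    example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(example))
```

Because N50 rewards long sequences regardless of correctness, it is usually read alongside the other criteria listed in the abstract (read mapping rate, BUSCO, and so on) rather than on its own.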
Affiliation(s)
- Fatemeh Khelghatibana
- Department of Plant Pathology, Iranian Research Institute of Plant Protection, Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran.
| | - Mohammad Javan-Nikkhah
- Department of Plant Protection, College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran
| | - Naser Safaie
- Department of Plant Pathology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
| | - Ahmad Sobhani
- Agricultural Biotechnology Research Institute of Iran - Isfahan Branch, Agricultural Research, Education and Extension Organization (AREEO), Isfahan, Iran
| | - Somayeh Shams
- Department of Plant Production and Genetic Engineering, Faculty of Agriculture, University of Lorestan, Khorramabad, Iran
| | - Ehsan Sari
- Department of Microbiology and Plant Pathology, University of California, Riverside, CA, USA.
| |
|
18
|
Fu P, Wu Y, Zhang Z, Qiu Y, Wang Y, Peng Y. VIGA: a one-stop tool for eukaryotic virus identification and genome assembly from next-generation-sequencing data. Brief Bioinform 2023; 25:bbad444. [PMID: 38048079 PMCID: PMC10753531 DOI: 10.1093/bib/bbad444] [Received: 04/27/2023] [Revised: 10/26/2023] [Accepted: 11/11/2023] [Indexed: 12/05/2023] Open
Abstract
Identification of viruses and further assembly of viral genomes from next-generation-sequencing (NGS) data are essential steps in virome studies. This study presented a one-stop tool named VIGA (available at https://github.com/viralInformatics/VIGA) for eukaryotic virus identification and genome assembly from NGS data. It was composed of four modules, namely, identification, taxonomic annotation, assembly and novel virus discovery, which integrated several third-party tools such as BLAST, Trinity, MetaCompass and RagTag. Evaluation on multiple simulated and real virome datasets showed that VIGA assembled more complete virus genomes than its competitors on both the metatranscriptomic and metagenomic data and performed well in assembling virus genomes at the strain level. Finally, VIGA was used to investigate the virome in metatranscriptomic data from the Human Microbiome Project and revealed differences in virome composition and positive rate among prediabetes, Crohn's disease and ulcerative colitis. Overall, VIGA should greatly aid the identification and characterization of viromes, especially the known viruses, in future studies.
Affiliation(s)
- Ping Fu
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Yifan Wu
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Zhiyuan Zhang
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Ye Qiu
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Yirong Wang
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Yousong Peng
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| |
|
19
|
Schoen A, Hölzer M, Müller MA, Wallerang KB, Drosten C, Marz M, Lamp B, Weber F. Functional comparisons of the virus sensor RIG-I from humans, the microbat Myotis daubentonii, and the megabat Rousettus aegyptiacus, and their response to SARS-CoV-2 infection. J Virol 2023; 97:e0020523. [PMID: 37728614 PMCID: PMC10653997 DOI: 10.1128/jvi.00205-23] [Received: 02/07/2023] [Accepted: 07/09/2023] [Indexed: 09/21/2023] Open
Abstract
IMPORTANCE A common hypothesis holds that bats (order Chiroptera) are outstanding reservoirs for zoonotic viruses because of a special antiviral interferon (IFN) system. However, functional studies about key components of the bat IFN system are rare. RIG-I is a cellular sensor for viral RNA signatures that activates the antiviral signaling chain to induce IFN. We cloned and functionally characterized RIG-I genes from two species of the suborders Yangochiroptera and Yinpterochiroptera. The bat RIG-Is were conserved in their sequence and domain organization, and similar to human RIG-I in (i) mediating virus- and IFN-activated gene expression, (ii) antiviral signaling, (iii) temperature dependence, and (iv) recognition of RNA ligands. Moreover, RIG-I of Rousettus aegyptiacus (suborder Yinpterochiroptera) and of humans were found to recognize SARS-CoV-2 infection. Thus, members of both bat suborders encode RIG-Is that are comparable to their human counterpart. The ability of bats to harbor zoonotic viruses therefore seems due to other features.
Affiliation(s)
- Andreas Schoen
- Institute for Virology, FB10-Veterinary Medicine, Justus-Liebig University, Giessen, Germany
| | - Martin Hölzer
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Jena, Germany
- European Virus Bioinformatics Center, Jena, Germany
| | - Marcel A. Müller
- German Centre for Infection Research (DZIF), Partner Sites Giessen and Charité, Berlin, Germany
- Institute of Virology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
| | - Kai B. Wallerang
- Institute for Virology, FB10-Veterinary Medicine, Justus-Liebig University, Giessen, Germany
| | - Christian Drosten
- European Virus Bioinformatics Center, Jena, Germany
- German Centre for Infection Research (DZIF), Partner Sites Giessen and Charité, Berlin, Germany
- Institute of Virology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
| | - Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Jena, Germany
- European Virus Bioinformatics Center, Jena, Germany
| | - Benjamin Lamp
- Institute for Virology, FB10-Veterinary Medicine, Justus-Liebig University, Giessen, Germany
| | - Friedemann Weber
- Institute for Virology, FB10-Veterinary Medicine, Justus-Liebig University, Giessen, Germany
- European Virus Bioinformatics Center, Jena, Germany
- German Centre for Infection Research (DZIF), Partner Sites Giessen and Charité, Berlin, Germany
| |
|
20
|
White OW, Reyes-Betancort A, Carine MA, Chapman MA. Comparative transcriptomics and gene expression divergence associated with homoploid hybrid speciation in Argyranthemum. G3 (BETHESDA, MD.) 2023; 13:jkad158. [PMID: 37477910 PMCID: PMC10542503 DOI: 10.1093/g3journal/jkad158] [Received: 04/21/2023] [Revised: 04/21/2023] [Accepted: 06/28/2023] [Indexed: 07/22/2023]
Abstract
Ecological isolation is increasingly thought to play an important role in speciation, especially for the origin and reproductive isolation of homoploid hybrid species. However, the extent to which divergent and/or transgressive gene expression changes are involved in speciation is not well studied. In this study, we employ comparative transcriptomics to investigate gene expression changes associated with the origin and evolution of two homoploid hybrid plant species, Argyranthemum sundingii and A. lemsii (Asteraceae). As there is no standard methodology for comparative transcriptomics, we examined five different pipelines for data assembly and analysing gene expression across the four species (two hybrid and two parental). We note biases and problems with all pipelines, and the approach used affected the biological interpretation of the data. Using the approach that we found to be optimal, we identify transcripts showing differential expression (DE) between the parental taxa and between the homoploid hybrid species and their parents; in several cases, putative functions of these DE transcripts have a plausible role in ecological adaptation and could be the cause or consequence of ecological speciation. Although independently derived, the homoploid hybrid species have converged on similar expression phenotypes, likely due to adaptation to similar habitats.
Affiliation(s)
- Oliver W White
- Algae, Fungi and Plants Division, Department of Life Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
| | | | - Mark A Carine
- Algae, Fungi and Plants Division, Department of Life Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
| | - Mark A Chapman
- Biological Sciences, University of Southampton, Southampton SO17 1BJ, UK
| |
|
21
|
Zhang J, Zhang H, Ju Z, Peng Y, Pan Y, Xi W, Wei Y. JCcirc: circRNA full-length sequence assembly through integrated junction contigs. Brief Bioinform 2023; 24:bbad363. [PMID: 37833842 DOI: 10.1093/bib/bbad363] [Received: 07/31/2023] [Revised: 09/04/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
Recent studies have shed light on the potential of circular RNA (circRNA) as a biomarker for disease diagnosis and as a nucleic acid vaccine. The exploration of these functionalities requires correct circRNA full-length sequences; however, existing assembly tools can only correctly assemble some circRNAs, and their performance can be further improved. Here, we introduce a novel feature known as the junction contig (JC), which is an extension of the back-splice junction (BSJ). Leveraging the strengths of both BSJ and JC, we present a novel method called JCcirc (https://github.com/cbbzhang/JCcirc). It enables efficient reconstruction of all types of circRNA full-length sequences and their alternative isoforms using splice graphs and fragment coverage. Our findings demonstrate the superiority of JCcirc over existing methods on human simulation datasets: its average F1 score exceeds that of CircAST by 0.40 and those of both CIRI-full and circRNAfull by 0.13. For circRNAs below 400 bp, 400-800 bp, 800-1200 bp and above 1200 bp, the correct assembly rates are 0.13, 0.09, 0.04 and 0.03 higher, respectively, than those achieved by existing methods. Moreover, JCcirc also outperforms existing assembly tools on datasets from five other model species and on real sequencing datasets. These results show that JCcirc is a robust tool for accurately assembling circRNA full-length sequences, laying the foundation for the functional analysis of circRNAs.
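The F1 score used for the comparison in this abstract is the harmonic mean of precision and recall. The sketch below shows the arithmetic on a simulated benchmark; it assumes, for illustration only, that an assembled circRNA counts as correct when its identifier is present in the ground-truth set. The abstract does not state the matching criterion, and the function name `f1_score` and the example identifiers are hypothetical.

```python
def f1_score(predicted, truth):
    """Return the F1 score of a predicted set against a ground-truth set.

    precision = true positives / number of predictions
    recall    = true positives / number of ground-truth items
    F1        = 2 * precision * recall / (precision + recall)
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        # Covers empty inputs and the case of no overlap.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical identifiers of assembled and simulated circRNAs.
    assembled = {"circA", "circB", "circC", "circD"}
    simulated = {"circA", "circB", "circE"}
    print(round(f1_score(assembled, simulated), 4))
```

With two correct assemblies out of four predictions and three simulated circRNAs, precision is 0.5, recall is 2/3 and F1 is 4/7, so a reported F1 difference of 0.13 between tools is a sizeable gap on this scale.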
Affiliation(s)
- Jingjing Zhang
- University of Chinese Academy of Sciences, Beijing, China
- Shenzhen Key Laboratory of Intelligent Bioinformatics & Center for High Performance Computing, Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China
| | - Huiling Zhang
- College of Mathematics and Information, South China Agriculture University, Guangzhou, China
| | - Zhen Ju
- University of Chinese Academy of Sciences, Beijing, China
- Shenzhen Key Laboratory of Intelligent Bioinformatics & Center for High Performance Computing, Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China
| | - Yin Peng
- Guangdong Key Laboratory for Genome Stability and Disease Prevention and Regional Immunity and Diseases, Department of Pathology, Shenzhen University School of Medicine, Shenzhen, China
| | - Yi Pan
- Shenzhen Key Laboratory of Intelligent Bioinformatics & Center for High Performance Computing, Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China
| | - Wenhui Xi
- Shenzhen Key Laboratory of Intelligent Bioinformatics & Center for High Performance Computing, Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China
| | - Yanjie Wei
- Shenzhen Key Laboratory of Intelligent Bioinformatics & Center for High Performance Computing, Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China
| |
|
22
|
Kelliher JM, Robinson AJ, Longley R, Johnson LYD, Hanson BT, Morales DP, Cailleau G, Junier P, Bonito G, Chain PSG. The endohyphal microbiome: current progress and challenges for scaling down integrative multi-omic microbiome research. MICROBIOME 2023; 11:192. [PMID: 37626434 PMCID: PMC10463477 DOI: 10.1186/s40168-023-01634-7] [Received: 03/01/2023] [Accepted: 07/29/2023] [Indexed: 08/27/2023]
Abstract
As microbiome research has progressed, it has become clear that most, if not all, eukaryotic organisms are hosts to microbiomes composed of prokaryotes, other eukaryotes, and viruses. Fungi have only recently been considered holobionts with their own microbiomes, as filamentous fungi have been found to harbor bacteria (including cyanobacteria), mycoviruses, other fungi, and whole algal cells within their hyphae. Constituents of this complex endohyphal microbiome have been interrogated using multi-omic approaches. However, a lack of tools, techniques, and standardization for integrative multi-omics for small-scale microbiomes (e.g., intracellular microbiomes) has limited progress towards investigating and understanding the total diversity of the endohyphal microbiome and its functional impacts on fungal hosts. Understanding microbiome impacts on fungal hosts will advance explorations of how "microbiomes within microbiomes" affect broader microbial community dynamics and ecological functions. Progress to date as well as ongoing challenges of performing integrative multi-omics on the endohyphal microbiome is discussed herein. Addressing the challenges associated with the sample extraction, sample preparation, multi-omic data generation, and multi-omic data analysis and integration will help advance current knowledge of the endohyphal microbiome and provide a road map for shrinking microbiome investigations to smaller scales.
Affiliation(s)
| | | | - Reid Longley
- Los Alamos National Laboratory, Los Alamos, NM, USA
| | | | | | | | | | | | | | | |
|
23
|
Nakamae K, Bono H. DANGER analysis: risk-averse on/off-target assessment for CRISPR editing without a reference genome. BIOINFORMATICS ADVANCES 2023; 3:vbad114. [PMID: 37661945 PMCID: PMC10469126 DOI: 10.1093/bioadv/vbad114] [Received: 03/15/2023] [Revised: 08/11/2023] [Accepted: 08/22/2023] [Indexed: 09/05/2023]
Abstract
Motivation The CRISPR-Cas9 system has successfully achieved site-specific gene editing in organisms ranging from humans to bacteria. The technology efficiently generates mutants, allowing for phenotypic analysis of the on-target gene. However, some conventional studies did not investigate whether deleterious off-target effects partially affect the phenotype. Results Herein, we present a novel phenotypic assessment of CRISPR-mediated gene editing: Deleterious and ANticipatable Guides Evaluated by RNA-sequencing (DANGER) analysis. Using RNA-seq data, this bioinformatics pipeline can elucidate genomic on/off-target sites on mRNA-transcribed regions related to expression changes and then quantify phenotypic risk at the gene ontology term level. We demonstrated the risk-averse on/off-target assessment in RNA-seq data from gene-edited samples of human cells and zebrafish brains. Our DANGER analysis successfully detected off-target sites, and it quantitatively evaluated the potential contribution of deleterious off-targets to the transcriptome phenotypes of the edited mutants. Notably, DANGER analysis harnessed de novo transcriptome assembly to perform risk-averse on/off-target assessments without a reference genome. Thus, our resources would help assess genome editing in non-model organisms, individual human genomes, and atypical genomes from diseases and viruses. In conclusion, DANGER analysis facilitates the safer design of genome editing in all organisms with a transcriptome. Availability and implementation The script for the DANGER analysis pipeline is available at https://github.com/KazukiNakamae/DANGER_analysis. In addition, the Readme page of the software provides a tutorial on reproducing the results presented in this article. The Docker image of DANGER_analysis is also available at https://hub.docker.com/repository/docker/kazukinakamae/dangeranalysis/general.
Affiliation(s)
- Kazuki Nakamae
- Laboratory of Bio-DX, Genome Editing Innovation Center, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-0046, Japan
- Research and Development Department, PtBio Inc., 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-0046, Japan
| | - Hidemasa Bono
- Laboratory of Bio-DX, Genome Editing Innovation Center, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-0046, Japan
- Laboratory of Genome Informatics, Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-0046, Japan
|
24
|
Abstract
Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes: the Earth's code of life.
Affiliation(s)
- Roderic Guigó
- Bioinformatics and Genomics, Center for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Catalonia
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia
|
25
|
Shi X, Li J, Liu T, Zhao H, Leng H, Sun K, Feng J. Divergence of cochlear transcriptomics between reference‑based and reference‑free transcriptome analyses among Rhinolophus ferrumequinum populations. PLoS One 2023; 18:e0288404. [PMID: 37432940 DOI: 10.1371/journal.pone.0288404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 06/26/2023] [Indexed: 07/13/2023] Open
Abstract
Differences in gene expression within tissues can lead to differences in tissue function. Understanding the transcriptome of a species helps elucidate the molecular mechanisms underlying phenotypic divergence. Depending on whether a reference genome is available for the studied species, transcriptome analyses can be divided into reference‑based and reference‑free methods. Presently, comparisons of complete transcriptome analysis results between those two methods are still rare. In this study, we compared the cochlear transcriptome analysis results of greater horseshoe bats (Rhinolophus ferrumequinum) from three lineages in China with different acoustic phenotypes using reference‑based and reference‑free methods to explore their differences in subsequent analysis. The results obtained by the reference‑based method had lower false-positive rates and were more accurate because differentially expressed genes among the three populations obtained by this method had greater reliability and a higher annotation rate. Some phenotype-related enrichment terms, including those related to inorganic molecules and proton transmembrane channels, were also obtained only by the reference‑based method. However, the reference‑based method might have the limitation of incomplete information acquisition. Thus, we believe that a combination of reference‑free and reference‑based methods is ideal for transcriptome analyses. The results of our study provide a reference for the selection of transcriptome analysis methods in the future.
Affiliation(s)
- Xiaoxiao Shi
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun, Jilin, China
| | - Jun Li
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun, Jilin, China
| | - Tong Liu
- Department of Life Science, Jilin Agricultural University, Changchun, Jilin, China
| | - Hanbo Zhao
- Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Haixia Leng
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun, Jilin, China
| | - Keping Sun
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun, Jilin, China
- Key Laboratory of Vegetation Ecology, Ministry of Education, Changchun, Jilin, China
| | - Jiang Feng
- Jilin Provincial Key Laboratory of Animal Resource Conservation and Utilization, Northeast Normal University, Changchun, Jilin, China
- Department of Life Science, Jilin Agricultural University, Changchun, Jilin, China
|
26
|
Bitar M, Rivera I, Almeida I, Shi W, Ferguson K, Beesley J, Lakhani S, Edwards S, French J. Redefining normal breast cell populations using long noncoding RNAs. Nucleic Acids Res 2023; 51:6389-6410. [PMID: 37144467 PMCID: PMC10325898 DOI: 10.1093/nar/gkad339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 04/12/2023] [Accepted: 04/21/2023] [Indexed: 05/06/2023] Open
Abstract
Single-cell RNAseq has allowed unprecedented insight into gene expression across different cell populations in normal tissue and disease states. However, almost all studies rely on annotated gene sets to capture gene expression levels and sequencing reads that do not align to known genes are discarded. Here, we discover thousands of long noncoding RNAs (lncRNAs) expressed in human mammary epithelial cells and analyze their expression in individual cells of the normal breast. We show that lncRNA expression alone can discriminate between luminal and basal cell types and define subpopulations of both compartments. Clustering cells based on lncRNA expression identified additional basal subpopulations, compared to clustering based on annotated gene expression, suggesting that lncRNAs can provide an additional layer of information to better distinguish breast cell subpopulations. In contrast, these breast-specific lncRNAs poorly distinguish brain cell populations, highlighting the need to annotate tissue-specific lncRNAs prior to expression analyses. We also identified a panel of 100 breast lncRNAs that could discern breast cancer subtypes better than protein-coding markers. Overall, our results suggest that lncRNAs are an unexplored resource for new biomarker and therapeutic target discovery in the normal breast and breast cancer subtypes.
Affiliation(s)
- Mainá Bitar
- Cancer Program, QIMR Berghofer Medical Research Institute, Brisbane 4006, Australia
- Faculty of Medicine, The University of Queensland, Brisbane 4006, Australia
| | - Isela Sarahi Rivera
- Cancer Program, QIMR Berghofer Medical Research Institute, Brisbane 4006, Australia
- School of Biomedical Science and Institute of Health and Biomedical Innovation, Faculty of Health, Queensland University of Technology, Brisbane 4001, Australia
| | - Isabela Almeida
- Cancer Program, QIMR Berghofer Medical Research Institute, Brisbane 4006, Australia
- Faculty of Medicine, The University of Queensland, Brisbane 4006, Australia
| | - Wei Shi
- Cancer Program, QIMR Berghofer Medical Research Institute, Brisbane 4006, Australia
| | - Kaltin Ferguson
- UQ Centre for Clinical Research, The University of Queensland, Brisbane 4006, Australia
| | - Jonathan Beesley
- Cancer Program, QIMR Berghofer Medical Research Institute, Brisbane 4006, Australia
| | - Sunil R Lakhani
- UQ Centre for Clinical Research, The University of Queensland, Brisbane 4006, Australia
- Pathology Queensland, The Royal Brisbane & Women's Hospital, Brisbane 4006, Australia
| | - Stacey L Edwards
- Cancer Program, QIMR Berghofer Medical Research Institute, Brisbane 4006, Australia
- Faculty of Medicine, The University of Queensland, Brisbane 4006, Australia
| | - Juliet D French
- Cancer Program, QIMR Berghofer Medical Research Institute, Brisbane 4006, Australia
- Faculty of Medicine, The University of Queensland, Brisbane 4006, Australia
|
27
|
Han SY, Kim WY, Kim JS, Hwang I. Comparative transcriptomics reveals the role of altered energy metabolism in the establishment of single-cell C4 photosynthesis in Bienertia sinuspersici. Frontiers in Plant Science 2023; 14:1202521. [PMID: 37476170 PMCID: PMC10354284 DOI: 10.3389/fpls.2023.1202521] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/08/2023] [Accepted: 05/31/2023] [Indexed: 07/22/2023]
Abstract
Single-cell C4 photosynthesis (SCC4) in terrestrial plants without Kranz anatomy involves three steps: initial CO2 fixation in the cytosol, CO2 release in mitochondria, and a second CO2 fixation in central chloroplasts. Here, we investigated how the large number of mechanisms underlying these processes, which occur in three different compartments, are orchestrated in a coordinated manner to establish the C4 pathway in Bienertia sinuspersici, a SCC4 plant. Leaves were subjected to transcriptome analysis at three different developmental stages. Functional enrichment analysis revealed that SCC4 cycle genes are coexpressed with genes regulating cyclic electron flow and amino/organic acid metabolism, two key processes required for the production of energy molecules in C3 plants. Comparative gene expression profiling of B. sinuspersici and three other species (Suaeda aralocaspica, Amaranthus hypochondriacus, and Arabidopsis thaliana) showed that the direction of metabolic flux was determined via an alteration in energy supply in peripheral chloroplasts and mitochondria via regulation of gene expression in the direction of the C4 cycle. Based on these results, we propose that the redox homeostasis of energy molecules via energy metabolism regulation is key to the establishment of the SCC4 pathway in B. sinuspersici.
Affiliation(s)
- Sang-Yun Han
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Republic of Korea
| | - Woe-Yeon Kim
- Division of Applied Life Science (BK21+) and Research Institute of Life Science, Institute of Agriculture and Life Sciences, Gyeongsang National University, Jinju, Republic of Korea
| | - Jung Sun Kim
- Genomic Division, Department of Agricultural Bio-Resources, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju, Republic of Korea
| | - Inhwan Hwang
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Republic of Korea
|
28
|
Liu P, Ewald J, Pang Z, Legrand E, Jeon YS, Sangiovanni J, Hacariz O, Zhou G, Head JA, Basu N, Xia J. ExpressAnalyst: A unified platform for RNA-sequencing analysis in non-model species. Nat Commun 2023; 14:2995. [PMID: 37225696 DOI: 10.1038/s41467-023-38785-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 10/20/2022] [Accepted: 05/16/2023] [Indexed: 05/26/2023] Open
Abstract
The increasing application of RNA sequencing to study non-model species demands easy-to-use and efficient bioinformatics tools to help researchers quickly uncover biological and functional insights. We developed ExpressAnalyst (www.expressanalyst.ca), a web-based platform for processing, analyzing, and interpreting RNA-sequencing data from any eukaryotic species. ExpressAnalyst contains a series of modules that cover everything from processing and annotation of FASTQ files to statistical and functional analysis of count tables or gene lists. All modules are integrated with EcoOmicsDB, an ortholog database that enables comprehensive analysis for species without a reference transcriptome. By coupling ultra-fast read mapping algorithms with high-resolution ortholog databases through a user-friendly web interface, ExpressAnalyst allows researchers to obtain global expression profiles and gene-level insights from raw RNA-sequencing reads within 24 h. Here, we present ExpressAnalyst and demonstrate its utility with a case study of RNA-sequencing data from multiple non-model salamander species, including two that do not have a reference transcriptome.
Affiliation(s)
- Peng Liu
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Jessica Ewald
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Zhiqiang Pang
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Elena Legrand
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Yeon Seon Jeon
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Jonathan Sangiovanni
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Orcun Hacariz
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Guangyan Zhou
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Jessica A Head
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Niladri Basu
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada
| | - Jianguo Xia
- Faculty of Agricultural and Environmental Sciences, McGill University, Ste-Anne-de-Bellevue, Canada.
|
29
|
Chowdhury MAA, Islam MR, Amin A, Mou SN, Ullah KN, Baten A, Shoyaib M, Ali AA, Chowdhury FT, Rahi ML, Khan H, Amin MA, Islam MR. Integrated transcriptome catalog of Tenualosa ilisha as a resource for gene discovery and expression profiling. Sci Data 2023; 10:214. [PMID: 37062771 PMCID: PMC10106452 DOI: 10.1038/s41597-023-02132-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 04/03/2023] [Indexed: 04/18/2023] Open
Abstract
The migratory shad Tenualosa ilisha (Hilsa), the silver pride of Bangladesh, makes the largest contribution to the total fish production of Bangladesh. Despite this noteworthy contribution, well-annotated transcriptome data are not available. Here we report a transcriptomic catalog of Hilsa, constructed by assembling RNA-Seq reads from different tissues of the fish including brain, gill, kidney, liver, and muscle. Hilsa fish were collected from different aquatic habitats (fresh, brackish, and sea water) and the sequencing was performed on a next-generation sequencing (NGS) platform. De novo assembly of the sequences obtained from 46 cDNA libraries revealed 462,085 transcript isoforms that were subsequently annotated using the Universal Protein Resource Knowledgebase (UniProtKB) as a reference. All steps, from sampling to final annotation, are reported here along with the workflow. This study will provide a significant resource for ongoing and future research on Hilsa for transcriptome-based expression profiling and identification of candidate genes.
Affiliation(s)
- Md Arko Ayon Chowdhury
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, 1000, Bangladesh
- Center for Computational and Data Sciences (CCDS), Independent University, Bangladesh (IUB), Dhaka, Bangladesh
| | - Md Rakibul Islam
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, 1000, Bangladesh
- Center for Computational and Data Sciences (CCDS), Independent University, Bangladesh (IUB), Dhaka, Bangladesh
| | - Al Amin
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, 1000, Bangladesh
- Center for Computational and Data Sciences (CCDS), Independent University, Bangladesh (IUB), Dhaka, Bangladesh
| | - Sadia Noor Mou
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, 1000, Bangladesh
- Center for Computational and Data Sciences (CCDS), Independent University, Bangladesh (IUB), Dhaka, Bangladesh
| | - Kazi Newaz Ullah
- Center for Computational and Data Sciences (CCDS), Independent University, Bangladesh (IUB), Dhaka, Bangladesh
- Department of Zoology, Jagannath University, Dhaka, 1100, Bangladesh
| | - Abdul Baten
- Institute of Precision Medicine and Bioinformatics, Sydney Local Health District, Royal Prince Alfred Hospital, Camperdown, Australia
| | - Mohammad Shoyaib
- Institute of Information Technology (IIT), University of Dhaka, Dhaka, 1000, Bangladesh
| | - Amin Ahsan Ali
- Center for Computational and Data Sciences (CCDS), Independent University, Bangladesh (IUB), Dhaka, Bangladesh
| | - Farhana Tasnim Chowdhury
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, 1000, Bangladesh
| | - Md Lifat Rahi
- Fisheries and Marine Resource Technology (FMRT) Discipline, Khulna University, Khulna, 9208, Bangladesh
| | - Haseena Khan
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, 1000, Bangladesh
| | - M Ashraful Amin
- Center for Computational and Data Sciences (CCDS), Independent University, Bangladesh (IUB), Dhaka, Bangladesh.
| | - Mohammad Riazul Islam
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, 1000, Bangladesh.
|
30
|
Fallon TR, Čalounová T, Mokrejš M, Weng JK, Pluskal T. transXpress: a Snakemake pipeline for streamlined de novo transcriptome assembly and annotation. BMC Bioinformatics 2023; 24:133. [PMID: 37016291 PMCID: PMC10074830 DOI: 10.1186/s12859-023-05254-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Accepted: 03/24/2023] [Indexed: 04/06/2023] Open
Abstract
BACKGROUND RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. RESULTS Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. CONCLUSIONS transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.
Affiliation(s)
- Timothy R Fallon
- Scripps Institution of Oceanography, UC San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | - Tereza Čalounová
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16000, Prague 6, Czech Republic
| | - Martin Mokrejš
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16000, Prague 6, Czech Republic
| | - Jing-Ke Weng
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA, 02142, USA.
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
| | - Tomáš Pluskal
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16000, Prague 6, Czech Republic.
|
31
|
Camacho C, Boratyn GM, Joukov V, Vera Alvarez R, Madden TL. ElasticBLAST: accelerating sequence search via cloud computing. BMC Bioinformatics 2023; 24:117. [PMID: 36967390 PMCID: PMC10040096 DOI: 10.1186/s12859-023-05245-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 01/04/2023] [Accepted: 03/21/2023] [Indexed: 03/28/2023] Open
Abstract
BACKGROUND Biomedical researchers use alignments produced by BLAST (Basic Local Alignment Search Tool) to categorize their query sequences. Producing such alignments is an essential bioinformatics task that is well suited for the cloud. The cloud can perform many calculations quickly as well as store and access large volumes of data. Bioinformaticians can also use it to collaborate with other researchers, sharing their results, datasets and even their pipelines on a common platform. RESULTS We present ElasticBLAST, a cloud native application to perform BLAST alignments in the cloud. ElasticBLAST can handle anywhere from a few to many thousands of queries and run the searches on thousands of virtual CPUs (if desired), deleting resources when it is done. It uses cloud native tools for orchestration and can request discounted instances, lowering cloud costs for users. It is supported on Amazon Web Services and Google Cloud Platform. It can search BLAST databases that are user provided or from the National Center for Biotechnology Information. CONCLUSION We show that ElasticBLAST is a useful application that can efficiently perform BLAST searches for the user in the cloud, demonstrating that with two examples. At the same time, it hides much of the complexity of working in the cloud, lowering the threshold to move work to the cloud.
Affiliation(s)
- Christiam Camacho
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Grzegorz M. Boratyn
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Victor Joukov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Roberto Vera Alvarez
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Thomas L. Madden
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
|
32
|
Krinos AI, Cohen NR, Follows MJ, Alexander H. Reverse engineering environmental metatranscriptomes clarifies best practices for eukaryotic assembly. BMC Bioinformatics 2023; 24:74. [PMID: 36869298 PMCID: PMC9983209 DOI: 10.1186/s12859-022-05121-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Accepted: 12/21/2022] [Indexed: 03/05/2023] Open
Abstract
BACKGROUND Diverse communities of microbial eukaryotes in the global ocean provide a variety of essential ecosystem services, from primary production and carbon flow through trophic transfer to cooperation via symbioses. Increasingly, these communities are being understood through the lens of omics tools, which enable high-throughput processing of diverse communities. Metatranscriptomics offers an understanding of near real-time gene expression in microbial eukaryotic communities, providing a window into community metabolic activity. RESULTS Here we present a workflow for eukaryotic metatranscriptome assembly, and validate the ability of the pipeline to recapitulate real and manufactured eukaryotic community-level expression data. We also include an open-source tool for simulating environmental metatranscriptomes for testing and validation purposes. We reanalyze previously published metatranscriptomic datasets using our metatranscriptome analysis approach. CONCLUSION We determined that a multi-assembler approach improves eukaryotic metatranscriptome assembly based on recapitulated taxonomic and functional annotations from an in-silico mock community. The systematic validation of metatranscriptome assembly and annotation methods provided here is a necessary step to assess the fidelity of our community composition measurements and functional content assignments from eukaryotic metatranscriptomes.
Affiliation(s)
- Arianna I Krinos
- MIT-WHOI Joint Program in Oceanography and Applied Ocean Science and Engineering, Cambridge and Woods Hole, MA, USA
- Department of Biology, Woods Hole Oceanographic Institution, Woods Hole, MA, USA
- Department of Earth, Atmospheric, and Planetary Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Natalie R Cohen
- Skidaway Institute of Oceanography, University of Georgia, Savannah, GA, USA
| | - Michael J Follows
- Department of Earth, Atmospheric, and Planetary Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Harriet Alexander
- Department of Biology, Woods Hole Oceanographic Institution, Woods Hole, MA, USA.
|
33
|
Acebal MC, Dalgaard LT, Jørgensen TS, Hansen BW. Embryogenesis of a calanoid copepod analyzed by transcriptomics. Comparative Biochemistry and Physiology. Part D, Genomics & Proteomics 2023; 45:101054. [PMID: 36565589 DOI: 10.1016/j.cbd.2022.101054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 11/22/2022] [Accepted: 12/06/2022] [Indexed: 12/14/2022]
Abstract
The calanoid copepod Acartia tonsa (Dana) has attracted interest because of its use as a copepod model organism as well as its potential economic role as live fish larval feed. While the adult genome and transcriptome of A. tonsa have been investigated, no studies have examined the genome-wide transcriptional changes during normal subitaneous embryogenesis. Thus, the aim of the current study was to investigate these transcriptional changes throughout A. tonsa embryonic development. RNA extraction and de novo transcriptome assembly were conducted for the subitaneous embryogenesis of the copepod. The assembly includes, for the first time, samples describing quiescent development and overall helps establish a framework for future studies on the molecular biology of this species. Among the findings reported, sequences annotated to well-known developmental genes were identified. We also describe the molecular changes and gene expression levels throughout the entire 42 h of embryonic development. In conclusion, here we present the most complete genome-wide transcriptional map of early copepod embryonic development to date, enabling further use of A. tonsa as a model organism for crustacean development. Keywords: enrichment of pathways; subitaneous embryogenesis; comparative genomics; transcriptome assembly; invertebrate genomics.
Affiliation(s)
- Miguel Cifuentes Acebal
- Department of Science and Environment, Roskilde University, Universitetsvej 1, DK-4000 Roskilde, Denmark
| | - Louise Torp Dalgaard
- Department of Science and Environment, Roskilde University, Universitetsvej 1, DK-4000 Roskilde, Denmark
| | - Tue Sparholt Jørgensen
- Department of Science and Environment, Roskilde University, Universitetsvej 1, DK-4000 Roskilde, Denmark
- Department of Environmental Science - Environmental Microbiology and Biotechnology, Aarhus University, Frederiksborgvej 399, DK-4000 Roskilde, Denmark
- The Novo Nordisk Foundation Center for Biosustainability (DTU Biosustain) at the Technical University of Denmark, Building 220, Kemitorvet, DK-2800 Kgs. Lyngby, Denmark
| | - Benni Winding Hansen
- Department of Science and Environment, Roskilde University, Universitetsvej 1, DK-4000 Roskilde, Denmark.
|
34
|
Abstract
Polyploidizations, or whole-genome duplications (WGDs), in plants have increased biological complexity, facilitated evolutionary innovation, and likely enabled adaptation under harsh conditions. Besides genomic data, transcriptome data have been widely employed to detect WGDs, due to their efficient accessibility to the gene space of a species. Age distributions based on synonymous substitutions (so-called KS age distributions) for paralogs assembled from transcriptome data have identified numerous WGDs in plants, paving the way for further studies on the importance of WGDs for the evolution of seed and flowering plants. However, it is still unclear how transcriptome-based age distributions compare to those based on genomic data. In this chapter, we implemented three different de novo transcriptome assembly pipelines with two popular assemblers, i.e., Trinity and SOAPdenovo-Trans. We selected six plant species with published genomes and transcriptomes to evaluate how assembled transcripts from different pipelines perform when using KS distributions to detect previously documented WGDs in the six species. Further, using genes predicted in each genome as references, we evaluated the effects of missing genes, gene family clustering, and de novo assembled transcripts on the transcriptome-based KS distributions. Our results show that, although the transcriptome-based KS distributions differ from the genome-based ones with respect to their shapes and scales, they are still reasonably reliable for unveiling WGDs, except in species where most duplicates originated from a recent WGD. We also discuss how to overcome some possible pitfalls when using transcriptome data to identify WGDs.
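As a rough illustration of the KS age distribution idea described in this abstract, the sketch below bins KS values for paralog pairs and flags bins that stand out from both neighbours. It is a toy example only: the function names, the bin width, the `min_count` threshold, and the input values are all assumptions made for illustration, and this naive peak rule is not the method used in the chapter, which the abstract does not specify.

```python
def ks_histogram(ks_values, bin_width=0.1, max_ks=3.0):
    """Count Ks values in fixed-width bins covering [0, max_ks)."""
    n_bins = round(max_ks / bin_width)
    counts = [0] * n_bins
    for ks in ks_values:
        if 0 <= ks < max_ks:
            # Clamp the index so floating-point rounding cannot overflow the list.
            idx = min(int(ks / bin_width), n_bins - 1)
            counts[idx] += 1
    return counts


def local_peaks(counts, min_count=2):
    """Return indices of bins that are strictly higher than both neighbours
    and hold at least min_count values (a crude filter against noise)."""
    return [
        i
        for i in range(1, len(counts) - 1)
        if counts[i] >= min_count
        and counts[i] > counts[i - 1]
        and counts[i] > counts[i + 1]
    ]


# Hypothetical Ks values for paralog pairs (not data from the chapter).
toy_ks = [0.05, 0.15, 0.65, 0.72, 0.75, 0.78, 0.85, 1.55]
counts = ks_histogram(toy_ks)
peaks = local_peaks(counts, min_count=2)
print("candidate peak bins:", peaks)
```

In practice a peak in such a histogram is only a candidate signal; as the abstract notes, the shape and scale of transcriptome-based distributions can differ from genome-based ones.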
Affiliation(s)
- Jia Li
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; VIB Center for Plant Systems Biology, VIB, Ghent, Belgium
| | - Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
| | - Zhen Li
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
| |
|
35
|
Tao F, Fan C, Liu Y, Sivakumar S, Kowalski KP, Golenberg EM. Optimization and application of non-native Phragmites australis transcriptome assemblies. PLoS One 2023; 18:e0280354. [PMID: 36689482 PMCID: PMC9870158 DOI: 10.1371/journal.pone.0280354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2022] [Accepted: 12/27/2022] [Indexed: 01/24/2023] Open
Abstract
Phragmites australis (common reed) has a cosmopolitan distribution and has been suggested as a model organism for the study of invasive plant species. In North America, the non-native subspecies (ssp. australis) is widely distributed across the contiguous 48 states in the United States and large parts of Canada. Even though millions of dollars are spent annually on Phragmites management, insufficient knowledge of P. australis has impeded the efficiency of management. To solve this problem, transcriptomic information generated from multiple types of tissue could be a valuable resource for future studies. Here, we constructed forty-nine P. australis transcriptome assemblies via different assembly tools and multiple parameter settings. The optimal transcriptome assembly for functional annotation and downstream analyses was selected among these transcriptome assemblies by comprehensive assessments. Of the 422,589 transcripts in this transcriptome assembly, 319,046 transcripts (75.5%) have at least one functional annotation. Within the transcriptome assembly, we further identified 1,495 transcripts showing tissue-specific expression patterns, 10,828 putative transcription factors, and 72,165 candidate simple sequence repeat markers. The identification and analyses of predicted transcripts related to herbicide- and salinity-resistance genes are presented as two applications of the transcriptomic information to facilitate further research on P. australis. Transcriptome assembly and selection are important for transcriptome annotation. With this optimal transcriptome assembly and all related information from downstream analyses, we have helped to establish foundations for future studies on the mechanisms underlying the invasiveness of non-native P. australis subspecies.
Affiliation(s)
- Feng Tao
- Department of Biological Sciences, Wayne State University, Detroit, MI, United States of America
| | - Chuanzhu Fan
- Department of Biological Sciences, Wayne State University, Detroit, MI, United States of America
| | - Yimin Liu
- Department of Biological Sciences, Wayne State University, Detroit, MI, United States of America
| | - Subashini Sivakumar
- Department of Biological Sciences, Wayne State University, Detroit, MI, United States of America
| | - Kurt P. Kowalski
- U.S. Geological Survey-Great Lakes Science Center, Ann Arbor, MI, United States of America
| | - Edward M. Golenberg
- Department of Biological Sciences, Wayne State University, Detroit, MI, United States of America
| |
|
36
|
Camacho C, Boratyn GM, Joukov V, Alvarez RV, Madden TL. ElasticBLAST: Accelerating Sequence Search via Cloud Computing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.04.522777. [PMID: 36789435 PMCID: PMC9928022 DOI: 10.1101/2023.01.04.522777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Abstract
Background: Biomedical researchers use alignments produced by BLAST (Basic Local Alignment Search Tool) to categorize their query sequences. Producing such alignments is an essential bioinformatics task that is well suited for the cloud. The cloud can perform many calculations quickly as well as store and access large volumes of data. Bioinformaticians can also use it to collaborate with other researchers, sharing their results, datasets and even their pipelines on a common platform. Results: We present ElasticBLAST, a cloud native application to perform BLAST alignments in the cloud. ElasticBLAST can handle anywhere from a few to many thousands of queries and run the searches on thousands of virtual CPUs (if desired), deleting resources when it is done. It uses cloud native tools for orchestration and can request discounted instances, lowering cloud costs for users. It is supported on Amazon Web Services and Google Cloud Platform. It can search BLAST databases that are user provided or from the National Center for Biotechnology Information. Conclusion: We show that ElasticBLAST is a useful application that can efficiently perform BLAST searches for the user in the cloud, demonstrating that with two examples. At the same time, it hides much of the complexity of working in the cloud, lowering the threshold to move work to the cloud.
Affiliation(s)
- Christiam Camacho
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, 20894, USA
| | - Grzegorz M. Boratyn
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, 20894, USA
| | - Victor Joukov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, 20894, USA
| | - Roberto Vera Alvarez
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, 20894, USA
| | | |
|
37
|
Miranda S, Lagrèze J, Knoll AS, Angeli A, Espley RV, Dare AP, Malnoy M, Martens S. De novo transcriptome assembly and functional analysis reveal a dihydrochalcone 3-hydroxylase (DHC3H) of wild Malus species that produces sieboldin in vivo. FRONTIERS IN PLANT SCIENCE 2022; 13:1072765. [PMID: 36589107 PMCID: PMC9800874 DOI: 10.3389/fpls.2022.1072765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Accepted: 11/23/2022] [Indexed: 06/17/2023]
Abstract
Sieboldin is a specialised secondary metabolite of the group of dihydrochalcones (DHC), found in high concentrations only in some wild Malus species, closely related to the domesticated apple (Malus × domestica L.). To date, the first committed step towards the biosynthesis of sieboldin remains unknown. In this study, we combined transcriptomic analysis and a de novo transcriptome assembly to identify two putative 3-hydroxylases in two wild Malus species (Malus toringo (K. Koch) Carriere syn. sieboldii Rehder, Malus micromalus Makino) whose DHC profile is dominated by sieboldin. We assessed the in vivo activity of putative candidates to produce 3-hydroxyphloretin and sieboldin by de novo production in Saccharomyces cerevisiae. We found that CYP98A proteins of wild Malus accessions (CYP98A195, M. toringo and CYP98A196, M. micromalus) were able to produce 3-hydroxyphloretin, ultimately leading to sieboldin accumulation by co-expression with PGT2. CYP98A197-198 genes of M. × domestica, however, were unable to hydroxylate phloretin in vivo. CYP98A195-196 proteins exerting 3-hydroxylase activity co-localised with an endoplasmic reticulum marker. CYP98A protein model from wild accessions showed mutations in key residues close to the ligand pocket predicted using phloretin for protein docking modelling. These mutations are located within known substrate recognition sites of cytochrome P450s, which could explain the acceptance of phloretin in CYP98A protein of wild accessions. Screening a Malus germplasm collection by HRM marker analysis for CYP98A genes identified three clusters that correspond to the alleles of domesticated and wild species. Moreover, CYP98A isoforms identified in M. toringo and M. micromalus correlate with the accumulation of sieboldin in other wild and hybrid Malus genotypes. 
Taken together, we provide the first evidence of an enzyme producing sieboldin in vivo that could be involved in the key hydroxylation step towards the synthesis of sieboldin in Malus species.
Affiliation(s)
- Simón Miranda
- Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Center Agriculture Food and Environment (C3A), University of Trento, Trento, Italy
- The New Zealand Institute for Plant and Food Research Limited, Auckland, New Zealand
| | - Jorge Lagrèze
- Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
- Center Agriculture Food and Environment (C3A), University of Trento, Trento, Italy
| | - Anne-Sophie Knoll
- Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
| | - Andrea Angeli
- Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
| | - Richard V. Espley
- The New Zealand Institute for Plant and Food Research Limited, Auckland, New Zealand
| | - Andrew P. Dare
- The New Zealand Institute for Plant and Food Research Limited, Auckland, New Zealand
| | - Mickael Malnoy
- Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
| | - Stefan Martens
- Research and Innovation Center, Fondazione Edmund Mach, San Michele all’Adige, Trento, Italy
| |
|
38
|
Walter M, Puniamoorthy N. Discovering novel reproductive genes in a non-model fly using de novo GridION transcriptomics. Front Genet 2022; 13:1003771. [PMID: 36568389 PMCID: PMC9768217 DOI: 10.3389/fgene.2022.1003771] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2022] [Accepted: 11/16/2022] [Indexed: 12/12/2022] Open
Abstract
Gene discovery has important implications for investigating phenotypic trait evolution, adaptation, and speciation. Male reproductive tissues, such as accessory glands (AGs), are hotspots for recruitment of novel genes that diverge rapidly even among closely related species/populations. These genes synthesize seminal fluid proteins that often affect post-copulatory sexual selection: they can mediate male-male sperm competition and ejaculate-female interactions that modify female remating, and can even influence reproductive incompatibilities among diverging species/populations. Although de novo transcriptomics has facilitated gene discovery in non-model organisms, reproductive gene discovery is still challenging without a reference database as they are often novel and bear no homology to known proteins. Here, we use reference-free GridION long-read transcriptomics, from Oxford Nanopore Technologies (ONT), to discover novel AG genes and characterize their expression in the widespread dung fly, Sepsis punctum. Despite stark population differences in male reproductive traits (e.g., body size, testes size, and sperm length) as well as female remating, the male AG genes and their secretions of S. punctum are still unknown. We implement a de novo ONT transcriptome pipeline incorporating quality-filtering and rigorous error-correction procedures, and we evaluate gene sequence and gene expression results against high-quality Illumina short-read data. We discover highly-expressed reproductive genes in AG transcriptomes of S. punctum consisting of 40 high-quality and high-confidence ONT genes that cross-verify against Illumina genes, among which 26 are novel and specific to S. punctum. Novel genes account for an average of 81% of total gene expression and may be functionally relevant in seminal fluid protein production. For instance, 80% of genes encoding secretory proteins account for 74% of total gene expression.
In addition, median sequence similarities of ONT nucleotide and protein sequences match within-Illumina sequence similarities. Read-count based expression quantification in ONT is congruent with Illumina's Transcript per Million (TPM), both in overall pattern and within functional categories. Rapid genomic innovation followed by recruitment of de novo genes for high expression in S. punctum AG tissue, a pattern observed in other insects, could be a likely mechanism of evolution of these genes. The study also demonstrates the feasibility of adapting ONT transcriptomics for gene discovery in non-model systems.
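For readers unfamiliar with the TPM unit mentioned in this abstract, the following is a minimal sketch of the standard Transcripts per Million calculation from raw read counts and transcript lengths. It is an illustration only and not the authors' pipeline: the function name `tpm` and the example numbers are hypothetical, and effective-length corrections used by real quantifiers are ignored.

```python
def tpm(counts, lengths):
    """Convert raw read counts to Transcripts per Million (TPM).

    counts  -- reads assigned to each transcript (hypothetical input)
    lengths -- transcript lengths in bases, same order as counts
    """
    # Length-normalised rate: reads per base of transcript.
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        # No reads anywhere: report zero expression for every transcript.
        return [0.0] * len(rates)
    # Scale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rates]


if __name__ == "__main__":
    # Two transcripts with the same reads-per-base rate share the total equally.
    print(tpm([10, 20], [1000, 2000]))
```

Because each sample is rescaled to the same total, TPM values describe relative abundance within a sample, which is why the abstract compares overall expression patterns rather than absolute read counts across the two platforms.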
|
39
|
Lotterhos KE, Fitzpatrick MC, Blackmon H. Simulation Tests of Methods in Evolution, Ecology, and Systematics: Pitfalls, Progress, and Principles. ANNUAL REVIEW OF ECOLOGY, EVOLUTION, AND SYSTEMATICS 2022; 53:113-136. [PMID: 38107485 PMCID: PMC10723108 DOI: 10.1146/annurev-ecolsys-102320-093722] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/19/2023]
Abstract
Complex statistical methods are continuously developed across the fields of ecology, evolution, and systematics (EES). These fields, however, lack standardized principles for evaluating methods, which has led to high variability in the rigor with which methods are tested, a lack of clarity regarding their limitations, and the potential for misapplication. In this review, we illustrate the common pitfalls of method evaluations in EES, the advantages of testing methods with simulated data, and best practices for method evaluations. We highlight the difference between method evaluation and validation and review how simulations, when appropriately designed, can refine the domain in which a method can be reliably applied. We also discuss the strengths and limitations of different evaluation metrics. The potential for misapplication of methods would be greatly reduced if funding agencies, reviewers, and journals required principled method evaluation.
Affiliation(s)
- Katie E Lotterhos
- Department of Marine and Environmental Sciences, Northeastern University, Nahant, Massachusetts, USA
| | - Matthew C Fitzpatrick
- Appalachian Lab, University of Maryland Center for Environmental Science, Frostburg, Maryland, USA
| | - Heath Blackmon
- Department of Biology, Texas A&M University, College Station, Texas, USA
| |
|
40
|
De novo assembly and annotation of the transcriptome of the endangered seagrass Zostera capensis: Insights from differential gene expression under thermal stress. Mar Genomics 2022; 66:100984. [PMID: 36116404 DOI: 10.1016/j.margen.2022.100984] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2022] [Revised: 09/05/2022] [Accepted: 09/06/2022] [Indexed: 11/23/2022]
Abstract
Seagrasses are important marine ecosystem engineers, but anthropogenic impacts and climate change have led to numerous population declines globally. In South Africa, Zostera capensis is endangered due to fragmented populations and heavy anthropogenic pressures on estuarine ecosystems that house the core of the populations. Addressing questions of how pressures such as climate change affect foundational species, including Z. capensis, is crucial to supporting their conservation and underpinning restoration efforts. Here we use ecological transcriptomics to study key functional responses of Z. capensis through quantification of gene expression after thermal stress and present the first reference transcriptome of Z. capensis. Four de novo reference assemblies (Trinity, IDBA-tran, RNAspades, SOAPdenovo) filtered through the EvidentialGene pipeline resulted in 153,755 transcripts with a BUSCO score of 66.1% for completeness. Differential expression analysis between heat stressed (32 °C for three days) and pre-warming plants identified genes involved in photosynthesis, oxidative stress, translation, metabolic and biosynthetic processes in the Z. capensis thermal stress response. This reference transcriptome is a significant contribution to the limited available genomic resources for Z. capensis and represents a vital tool for addressing questions around the species' restoration and potential functional responses to warming marine environments.
|
41
|
Sheikh-Assadi M, Naderi R, Salami SA, Kafi M, Fatahi R, Shariati V, Martinelli F, Cicatelli A, Triassi M, Guarino F, Improta G, Claros MG. Normalized Workflow to Optimize Hybrid De Novo Transcriptome Assembly for Non-Model Species: A Case Study in Lilium ledebourii (Baker) Boiss. PLANTS 2022; 11:plants11182365. [PMID: 36145766 PMCID: PMC9503428 DOI: 10.3390/plants11182365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/23/2022] [Revised: 08/21/2022] [Accepted: 09/07/2022] [Indexed: 11/16/2022]
Abstract
A high-quality transcriptome is required to advance numerous bioinformatics workflows. Nevertheless, the effectiveness of de novo assembly tools and the actual accuracy of assembled transcriptomes remain somewhat unexplored, particularly for non-model organisms with complicated (very long, heterozygous, polyploid) genomes. To assess the performance of various transcriptome assembly programs, this study built 11 single assemblies and analyzed their performance on several important reference-free and reference-based criteria. To reconfirm the benchmark outputs, 55 BLAST searches were performed and compared using the 11 constructed transcriptomes. In brief, normalized benchmarking demonstrated that Velvet–Oases gave the worst results, while the EvidentialGene strategy provided the most comprehensive and accurate transcriptome of Lilium ledebourii (Baker) Boiss. The BLAST results also confirmed the superiority of EvidentialGene, which captured up to 59% more unique gene hits than Velvet–Oases. To promote assembly optimization, and with the help of normalized benchmarking, PCA and AHC, we emphasize that each metric can only describe part of the transcriptome status, and one should never settle for just a few evaluation criteria. This study supplies a framework for benchmarking and optimizing the efficiency of assembly approaches to analyze RNA-Seq data and reveals that selecting an inefficient assembly strategy might result in the identification of fewer unique gene hits.
Affiliation(s)
- Morteza Sheikh-Assadi
- Department of Horticultural Science, Faculty of Agricultural Science and Engineering, University of Tehran, Karaj 31587-77871, Iran
- Correspondence: (M.S.-A.); (R.N.)
| | - Roohangiz Naderi
- Department of Horticultural Science, Faculty of Agricultural Science and Engineering, University of Tehran, Karaj 31587-77871, Iran
- Correspondence: (M.S.-A.); (R.N.)
| | - Seyed Alireza Salami
- Department of Horticultural Science, Faculty of Agricultural Science and Engineering, University of Tehran, Karaj 31587-77871, Iran
| | - Mohsen Kafi
- Department of Horticultural Science, Faculty of Agricultural Science and Engineering, University of Tehran, Karaj 31587-77871, Iran
| | - Reza Fatahi
- Department of Horticultural Science, Faculty of Agricultural Science and Engineering, University of Tehran, Karaj 31587-77871, Iran
| | - Vahid Shariati
- NIGEB Genome Center, National Institute of Genetic Engineering and Biotechnology, Tehran 14965/161, Iran
| | - Federico Martinelli
- Department of Biology, University of Florence, 50019 Sesto Fiorentino, Italy
| | - Angela Cicatelli
- Department of Chemistry and Biology “A. Zambelli”, University of Salerno, 84084 Fisciano, Italy
| | - Maria Triassi
- Department of Public Health, University of Naples “Federico II”, 80131 Naples, Italy
| | - Francesco Guarino
- Department of Chemistry and Biology “A. Zambelli”, University of Salerno, 84084 Fisciano, Italy
| | - Giovanni Improta
- Department of Public Health, University of Naples “Federico II”, 80131 Naples, Italy
| | - Manuel Gonzalo Claros
- Molecular Biology and Biochemistry Department, University of Málaga, 29071 Málaga, Spain
- CIBER de Enfermedades Raras (CIBERER), 29071 Málaga, Spain
- Institute of Biomedical Research in Málaga (IBIMA), IBIMA-RARE, 29010 Málaga, Spain
- Instituto de Hortofruticultura Subtropical y Mediterránea (IHSM-UMA-CSIC), 29010 Málaga, Spain
| |
|
42
|
Hempel CA, Wright N, Harvie J, Hleap JS, Adamowicz S, Steinke D. Metagenomics versus total RNA sequencing: most accurate data-processing tools, microbial identification accuracy and perspectives for ecological assessments. Nucleic Acids Res 2022; 50:9279-9293. [PMID: 35979944 PMCID: PMC9458450 DOI: 10.1093/nar/gkac689] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Revised: 07/05/2022] [Accepted: 07/29/2022] [Indexed: 12/24/2022] Open
Abstract
Metagenomics and total RNA sequencing (total RNA-Seq) have the potential to improve the taxonomic identification of diverse microbial communities, which could allow for the incorporation of microbes into routine ecological assessments. However, these target-PCR-free techniques require more testing and optimization. In this study, we processed metagenomics and total RNA-Seq data from a commercially available microbial mock community using 672 data-processing workflows, identified the most accurate data-processing tools, and compared their microbial identification accuracy at equal and increasing sequencing depths. The accuracy of data-processing tools substantially varied among replicates. Total RNA-Seq was more accurate than metagenomics at equal sequencing depths and even at sequencing depths almost one order of magnitude lower than those of metagenomics. We show that while data-processing tools require further exploration, total RNA-Seq might be a favorable alternative to metagenomics for target-PCR-free taxonomic identifications of microbial communities and might enable a substantial reduction in sequencing costs while maintaining accuracy. This could be a particular advantage for routine ecological assessments, which require cost-effective yet accurate methods, and might allow for the incorporation of microbes into ecological assessments.
Affiliation(s)
- Christopher A Hempel
- To whom correspondence should be addressed. Tel: +1 519 824 4120; Fax: +1 519 824 5703
| | - Natalie Wright
- Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
| | - Julia Harvie
- Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
| | - Jose S Hleap
- SHARCNET, University of Guelph, Guelph, ON N1G 2W1, Canada
| | - Sarah J Adamowicz
- Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
| | - Dirk Steinke
- Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada; Centre for Biodiversity Genomics, University of Guelph, Guelph, ON N1G 2W1, Canada
| |
|
43
|
Proteotranscriptomics - A facilitator in omics research. Comput Struct Biotechnol J 2022; 20:3667-3675. [PMID: 35891789 PMCID: PMC9293588 DOI: 10.1016/j.csbj.2022.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2022] [Revised: 07/04/2022] [Accepted: 07/04/2022] [Indexed: 11/26/2022] Open
Abstract
Applications in omics research, such as comparative transcriptomics and proteomics, require the knowledge of the species-specific gene sequence and benefit from a comprehensive high-quality annotation of the coding genes to achieve high coverage. While protein-coding genes can in simple cases be detected by scanning the genome for open reading frames, in more complex genomes exonic sequences are separated by introns. Despite advances in sequencing technologies that allow for ever-growing numbers of genomes, the quality of many of the provided genome assemblies does not reach reference quality. These non-contiguous assemblies with gaps and the necessity to predict splice sites limit accurate gene annotation from solely genomic data. In contrast, the transcriptome only contains transcribed gene regions, is devoid of introns and thus provides the optimal basis for the identification of open reading frames. The additional integration of proteomics data to validate predicted protein-coding genes further enriches for accurate gene models. This review outlines the principles of the proteotranscriptomics approach, discusses common challenges and suggests methods for improvement.
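The open reading frame scan that this abstract refers to can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a method from the review: the function name `find_orfs` and the `min_codons` parameter are hypothetical, only the forward strand is scanned, and the standard stop codons TAA, TAG and TGA are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return orfs


if __name__ == "__main__":
    print(find_orfs("ATGAAATAG"))  # [(0, 9, 0)]
```

On an intron-free transcript such a scan can recover the coding region directly, whereas on genomic sequence from intron-containing genes the same scan is interrupted at exon boundaries, which is the contrast the abstract draws.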
|
44
|
Hufsky F, Abecasis A, Agudelo-Romero P, Bletsa M, Brown K, Claus C, Deinhardt-Emmer S, Deng L, Friedel CC, Gismondi MI, Kostaki EG, Kühnert D, Kulkarni-Kale U, Metzner KJ, Meyer IM, Miozzi L, Nishimura L, Paraskevopoulou S, Pérez-Cataluña A, Rahlff J, Thomson E, Tumescheit C, van der Hoek L, Van Espen L, Vandamme AM, Zaheri M, Zuckerman N, Marz M. Women in the European Virus Bioinformatics Center. Viruses 2022; 14:1522. [PMID: 35891501 PMCID: PMC9319252 DOI: 10.3390/v14071522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2022] [Revised: 07/05/2022] [Accepted: 07/07/2022] [Indexed: 02/01/2023] Open
Abstract
Viruses are the cause of a considerable burden to human, animal and plant health, while on the other hand playing an important role in regulating entire ecosystems. The power of new sequencing technologies combined with new tools for processing "Big Data" offers unprecedented opportunities to answer fundamental questions in virology. Virologists have an urgent need for virus-specific bioinformatics tools. These developments have led to the formation of the European Virus Bioinformatics Center, a network of experts in virology and bioinformatics who are joining forces to enable extensive exchange and collaboration between these research areas. The EVBC strives to provide talented researchers with a supportive environment free of gender bias, but the gender gap in science, especially in math-intensive fields such as computer science, persists. To bring more talented women into research and keep them there, we need to highlight role models to spark their interest, and we need to ensure that female scientists are not kept at lower levels but are given the opportunity to lead the field. Here we showcase the work of the EVBC and highlight the achievements of some outstanding women experts in virology and viral bioinformatics.
Affiliation(s)
- Franziska Hufsky
- European Virus Bioinformatics Center, 07743 Jena, Germany
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
| | - Ana Abecasis
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Global Health and Tropical Medicine, Institute of Hygiene and Tropical Medicine, New University of Lisbon, 1349-008 Lisbon, Portugal
| | - Patricia Agudelo-Romero
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Wal-Yan Respiratory Research Centre, Telethon Kids Institute, University of Western Australia, Nedlands, WA 6009, Australia
| | - Magda Bletsa
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
- Department of Microbiology, Immunology and Transplantation, Rega Institute, Katholieke Universiteit Leuven, B-3000 Leuven, Belgium
| | - Katherine Brown
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 1TN, UK
| | - Claudia Claus
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Medical Microbiology and Virology, Medical Faculty, Leipzig University, 04103 Leipzig, Germany
| | - Stefanie Deinhardt-Emmer
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Medical Microbiology, Jena University Hospital, 07747 Jena, Germany
| | - Li Deng
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Virology, Helmholtz Centre Munich-German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Microbial Disease Prevention, School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
| | - Caroline C. Friedel
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Informatics, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
| | - María Inés Gismondi
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Agrobiotechnology and Molecular Biology (IABIMO), National Institute for Agriculture Technology (INTA), National Research Council (CONICET), Hurlingham B1686IGC, Argentina
- Department of Basic Sciences, National University of Luján, Luján B6702MZP, Argentina
| | - Evangelia Georgia Kostaki
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
| | - Denise Kühnert
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Transmission, Infection, Diversification and Evolution Group, Max Planck Institute for the Science of Human History, 07745 Jena, Germany
| | - Urmila Kulkarni-Kale
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Bioinformatics Centre, Savitribai Phule Pune University, Pune 411007, India
| | - Karin J. Metzner
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Department of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, 8091 Zurich, Switzerland
- Institute of Medical Virology, University of Zurich, 8057 Zurich, Switzerland
| | - Irmtraud M. Meyer
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, 10115 Berlin, Germany
- Institute of Chemistry and Biochemistry, Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, 14195 Berlin, Germany
- Faculty of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
| | - Laura Miozzi
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Institute for Sustainable Plant Protection, National Research Council of Italy, 10135 Torino, Italy
| | - Luca Nishimura
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Department of Genetics, School of Life Science, The Graduate University for Advanced Studies (SOKENDAI), Mishima 411-8540, Japan
- Human Genetics Laboratory, National Institute of Genetics, Mishima 411-8540, Japan
| | - Sofia Paraskevopoulou
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Methods Development and Research Infrastructure, Bioinformatics and Systems Biology, Robert Koch Institute, 13353 Berlin, Germany
| | - Alba Pérez-Cataluña
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- VISAFELab, Department of Preservation and Food Safety Technologies, Institute of Agrochemistry and Food Technology, IATA-CSIC, 46980 Valencia, Spain
| | - Janina Rahlff
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Department of Biology and Environmental Science, Linnaeus University, 391 82 Kalmar, Sweden
| | - Emma Thomson
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Queen Elizabeth University Hospital, NHS Greater Glasgow and Clyde, Glasgow G51 4TF, UK
- MRC-University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK
| | - Charlotte Tumescheit
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- School of Biological Sciences, Seoul National University, Seoul 08826, Korea
| | - Lia van der Hoek
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Laboratory of Experimental Virology, Department of Medical Microbiology and Infection Prevention, Amsterdam UMC, University of Amsterdam, 1012 WX Amsterdam, The Netherlands
- Amsterdam Institute for Infection and Immunity, 1100 DD Amsterdam, The Netherlands
| | - Lore Van Espen
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Department of Microbiology, Immunology and Transplantation, Rega Institute, Katholieke Universiteit Leuven, B-3000 Leuven, Belgium
| | - Anne-Mieke Vandamme
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Department of Microbiology, Immunology and Transplantation, Rega Institute, Katholieke Universiteit Leuven, B-3000 Leuven, Belgium
- Global Health and Tropical Medicine, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, 1349-008 Lisbon, Portugal
- Institute for the Future, Katholieke Universiteit Leuven, B-3000 Leuven, Belgium
| | - Maryam Zaheri
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Institute of Medical Virology, University of Zurich, 8057 Zurich, Switzerland
| | - Neta Zuckerman
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- Central Virology Laboratory, Public Health Services, Ministry of Health and Sheba Medical Center, Ramat Gan 52621, Israel
| | - Manja Marz
- European Virus Bioinformatics Center, 07743 Jena, Germany; (A.A.); (P.A.-R.); (M.B.); (K.B.); (C.C.); (S.D.-E.); (L.D.); (C.C.F.); (M.I.G.); (E.G.K.); (D.K.); (U.K.-K.); (K.J.M.); (I.M.M.); (L.M.); (L.N.); (S.P.); (A.P.-C.); (J.R.); (E.T.); (C.T.); (L.v.d.H.); (L.V.E.); (A.-M.V.); (M.Z.); (N.Z.)
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
| |
|
45
|
A thorough annotation of the krill transcriptome offers new insights for the study of physiological processes. Sci Rep 2022; 12:11415. [PMID: 35794144 PMCID: PMC9259678 DOI: 10.1038/s41598-022-15320-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/10/2022] [Accepted: 06/22/2022] [Indexed: 11/09/2022] Open
Abstract
The krill species Euphausia superba plays a critical role in the food chain of the Antarctic ecosystem. Significant changes in climate conditions observed in the Antarctic Peninsula region in recent decades have already altered the distribution of krill and its reproductive dynamics. A deeper understanding of the adaptation capabilities of this species is urgently needed. The availability of a large body of RNA-seq assays allowed us to extend the current knowledge of the krill transcriptome. Our study covered the entire developmental process, providing information of central relevance for ecological studies. Here we identified a series of genes involved in different steps of the krill moulting cycle, in the reproductive process, and in sexual maturation, in accordance with what was described in previous works. Furthermore, the new transcriptome highlighted the presence of previously unknown differentially expressed genes that play important roles in cuticle development as well as in energy storage during the krill life cycle. The discovery of new opsin sequences, specifically rhabdomeric opsins, one onychopsin, and one non-visual arthropsin, expands our knowledge of the krill opsin repertoire. We have collected all these results into the KrillDB2 database, a resource combining the latest annotation of the krill transcriptome with a series of analyses targeting genes relevant to krill physiology. KrillDB2 provides in a single resource: a comprehensive catalog of krill genes; an atlas of their expression profiles over all publicly available RNA-seq datasets; and a study of differential expression across multiple conditions. Finally, it provides initial indications about the expression of microRNA precursors, whose contribution to krill physiology has never been reported before.
|
46
|
Ross CJ, Ulitsky I. Discovering functional motifs in long noncoding RNAs. WILEY INTERDISCIPLINARY REVIEWS. RNA 2022; 13:e1708. [PMID: 34981665 DOI: 10.1002/wrna.1708] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/01/2021] [Revised: 11/19/2021] [Accepted: 12/04/2021] [Indexed: 12/27/2022]
Abstract
Long noncoding RNAs (lncRNAs) are products of pervasive transcription that closely resemble messenger RNAs on the molecular level, yet function through largely unknown modes of action. The current model is that the function of lncRNAs often relies on specific, typically short, conserved elements, connected by linkers in which specific sequences and/or structures are less important. This notion has fueled the development of both computational and experimental methods focused on the discovery of functional elements within lncRNA genes, based on diverse signals such as evolutionary conservation, predicted structural elements, or the ability to rescue loss-of-function phenotypes. In this review, we outline the main challenges that the different methods need to overcome, describe the recently developed approaches, and discuss their respective limitations. This article is categorized under: RNA Evolution and Genomics > Computational Analyses of RNA; RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications; Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs.
Affiliation(s)
- Caroline Jane Ross
- Biological Regulation and Molecular Neuroscience, Weizmann Institute of Science, Rehovot, Israel
| | - Igor Ulitsky
- Biological Regulation and Molecular Neuroscience, Weizmann Institute of Science, Rehovot, Israel
| |
|
47
|
Salinas-Restrepo C, Misas E, Estrada-Gómez S, Quintana-Castillo JC, Guzman F, Calderón JC, Giraldo MA, Segura C. Improving the Annotation of the Venom Gland Transcriptome of Pamphobeteus verdolaga, Prospecting Novel Bioactive Peptides. Toxins (Basel) 2022; 14:408. [PMID: 35737069 PMCID: PMC9228390 DOI: 10.3390/toxins14060408] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/30/2022] [Revised: 06/06/2022] [Accepted: 06/07/2022] [Indexed: 02/01/2023] Open
Abstract
Spider venoms constitute a trove of novel peptides with biotechnological interest. Because little next-generation sequencing (NGS) data has been generated, less than 1% of these peptides have been described. Increasing evidence suggests that a single transcriptome assembler underestimates the number of genes that can be assembled. Here, the transcriptome of the venom gland of the spider Pamphobeteus verdolaga was re-assembled using three free-access algorithms, Trinity, SOAPdenovo-Trans, and SPAdes, to obtain a more complete annotation. Assembler performance was evaluated by contig number, N50, read representation on the assembly, and retrieval of BUSCO terms against the arthropod dataset. Of all the sequences assembled by all software, 39.26% were common to the three assemblers, 27.88% were uniquely assembled by Trinity, and 27.65% were uniquely assembled by SPAdes. The non-redundant merging of the output of all three assemblies permitted the annotation of 9232 sequences, which was 23% more than with each individual software and 28% more than in the previous P. verdolaga annotation; moreover, the description of 65 novel theraphotoxins was possible. In the generation of data for non-model organisms, as well as in the search for novel peptides with biotechnological interest, it is highly recommended to employ at least two different transcriptome assemblers.
Affiliation(s)
- Cristian Salinas-Restrepo
- Grupo Toxinología, Alternativas Terapéuticas y Alimentarias, Facultad de Ciencias Farmacéuticas y Alimentarias, Universidad de Antioquia, Medellín 050012, Colombia; (C.S.-R.); (S.E.-G.)
| | - Elizabeth Misas
- Corporación para Investigaciones Biológicas, Medellín 050012, Colombia;
| | - Sebastian Estrada-Gómez
- Grupo Toxinología, Alternativas Terapéuticas y Alimentarias, Facultad de Ciencias Farmacéuticas y Alimentarias, Universidad de Antioquia, Medellín 050012, Colombia; (C.S.-R.); (S.E.-G.)
- Centro de Investigación en Recursos Naturales y Sustentabilidad, Universidad Bernardo O’Higgins, Avenida Viel 1497, Santiago 7750000, Chile
| | | | - Fanny Guzman
- Núcleo Biotecnología Curauma (NBC), Pontificia Universidad Católica de Valparaíso, Valparaíso 2374631, Chile;
| | - Juan C. Calderón
- Physiology and Biochemistry Research Group-PHYSIS, Faculty of Medicine, University of Antioquia, Medellín 050012, Colombia;
| | - Marco A. Giraldo
- Biophysics Group, Institute of Physics, University of Antioquia, Medellín 050012, Colombia;
| | - Cesar Segura
- Grupo Malaria, Facultad de Medicina, Universidad de Antioquia, Medellín 050012, Colombia
| |
|
48
|
Gerbracht JV, Harding T, Simpson AGB, Roger AJ, Hess S. Comparative transcriptomics reveals the molecular toolkit used by an algivorous protist for cell wall perforation. Curr Biol 2022; 32:3374-3384.e5. [PMID: 35700733 DOI: 10.1016/j.cub.2022.05.049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2021] [Revised: 04/11/2022] [Accepted: 05/18/2022] [Indexed: 10/18/2022]
Abstract
Microbial eukaryotes display a stunning diversity of feeding strategies, ranging from generalist predators to highly specialized parasites. The unicellular "protoplast feeders" represent a fascinating mechanistic intermediate, as they penetrate other eukaryotic cells (algae and fungi) like some parasites but then devour their cell contents by phagocytosis.1 Besides prey recognition and attachment, this complex behavior involves the local, pre-phagocytotic dissolution of the prey cell wall, which results in well-defined perforations of species-specific size and structure.2 Yet the molecular processes that enable protoplast feeders to overcome cell walls of diverse biochemical composition remain unknown. We used the flagellate Orciraptor agilis (Viridiraptoridae, Rhizaria) as a model protoplast feeder and applied differential gene expression analysis to examine its penetration of green algal cell walls. Besides distinct expression changes that reflect major cellular processes (e.g., locomotion and cell division), we found lytic carbohydrate-active enzymes that are highly expressed and upregulated during the attack on the alga. A putative endocellulase (family GH5_5) with a secretion signal is most prominent, and a potential key factor for cell wall dissolution. Other candidate enzymes (e.g., lytic polysaccharide monooxygenases) belong to families that are largely uncharacterized, emphasizing the potential of non-fungal microeukaryotes for enzyme exploration. Unexpectedly, we discovered various chitin-related factors that point to an unknown chitin metabolism in Orciraptor agilis, potentially also involved in the feeding process. Our findings provide first molecular insights into an important microbial feeding behavior and new directions for cell biology research on non-model eukaryotes.
Affiliation(s)
- Jennifer V Gerbracht
- Institute for Zoology, University of Cologne, Zülpicher Str. 47b, 50674 Cologne, Germany
| | - Tommy Harding
- Department of Biochemistry and Molecular Biology, Dalhousie University, 5850 College Street, Halifax, NS B3H 4R2, Canada
| | - Alastair G B Simpson
- Department of Biology, Dalhousie University, 1355 Oxford Street, Halifax, NS B3H 4R2, Canada
| | - Andrew J Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, 5850 College Street, Halifax, NS B3H 4R2, Canada
| | - Sebastian Hess
- Institute for Zoology, University of Cologne, Zülpicher Str. 47b, 50674 Cologne, Germany; Department of Biochemistry and Molecular Biology, Dalhousie University, 5850 College Street, Halifax, NS B3H 4R2, Canada; Department of Biology, Dalhousie University, 1355 Oxford Street, Halifax, NS B3H 4R2, Canada.
| |
|
49
|
Pulido-Quetglas C, Johnson R. Designing libraries for pooled CRISPR functional screens of long noncoding RNAs. Mamm Genome 2022; 33:312-327. [PMID: 34533605 PMCID: PMC9114037 DOI: 10.1007/s00335-021-09918-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/07/2021] [Accepted: 09/09/2021] [Indexed: 02/01/2023]
Abstract
Human and other genomes encode tens of thousands of long noncoding RNAs (lncRNAs), the vast majority of which remain uncharacterised. High-throughput functional screening methods, notably those based on pooled CRISPR-Cas perturbations, promise to unlock the biological significance and biomedical potential of lncRNAs. Such screens are based on libraries of single guide RNAs (sgRNAs) whose design is critical for success. Few off-the-shelf libraries are presently available, and lncRNAs tend to have cell-type-specific expression profiles, meaning that library design remains in the hands of researchers. Here we introduce the topic of pooled CRISPR screens for lncRNAs and guide readers through the three key steps of library design: accurate annotation of transcript structures, curation of optimal candidate sets, and design of sgRNAs. This review is a starting point and reference for researchers seeking to design custom CRISPR screening libraries for lncRNAs.
Affiliation(s)
- Carlos Pulido-Quetglas
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, 3010, Bern, Switzerland
- Department for BioMedical Research, University of Bern, 3008, Bern, Switzerland
- Graduate School of Cellular and Biomedical Sciences, University of Bern, 3012, Bern, Switzerland
| | - Rory Johnson
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, 3010, Bern, Switzerland.
- Department for BioMedical Research, University of Bern, 3008, Bern, Switzerland.
- School of Biology and Environmental Science, University College Dublin, Dublin, D04 V1W8, Ireland.
- Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Dublin, D04 V1W8, Ireland.
| |
|
50
|
von Reumont BM, Anderluh G, Antunes A, Ayvazyan N, Beis D, Caliskan F, Crnković A, Damm M, Dutertre S, Ellgaard L, Gajski G, German H, Halassy B, Hempel BF, Hucho T, Igci N, Ikonomopoulou MP, Karbat I, Klapa MI, Koludarov I, Kool J, Lüddecke T, Ben Mansour R, Vittoria Modica M, Moran Y, Nalbantsoy A, Ibáñez MEP, Panagiotopoulos A, Reuveny E, Céspedes JS, Sombke A, Surm JM, Undheim EAB, Verdes A, Zancolli G. Modern venomics-Current insights, novel methods, and future perspectives in biological and applied animal venom research. Gigascience 2022; 11:giac048. [PMID: 35640874 PMCID: PMC9155608 DOI: 10.1093/gigascience/giac048] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/30/2022] [Revised: 04/10/2022] [Accepted: 04/12/2022] [Indexed: 12/11/2022] Open
Abstract
Venoms have evolved >100 times in all major animal groups, and their components, known as toxins, have been fine-tuned over millions of years into highly effective biochemical weapons. There are many outstanding questions on the evolution of toxin arsenals, such as how venom genes originate, how venom contributes to the fitness of venomous species, and which modifications at the genomic, transcriptomic, and protein level drive their evolution. These questions have received particularly little attention outside of snakes, cone snails, spiders, and scorpions. Venom compounds have further become a source of inspiration for translational research using their diverse bioactivities for various applications. We highlight here recent advances and new strategies in modern venomics and discuss how recent technological innovations and multi-omic methods dramatically improve research on venomous animals. The study of genomes and their modifications through CRISPR and knockdown technologies will increase our understanding of how toxins evolve and which functions they have in the different ontogenetic stages during the development of venomous animals. Mass spectrometry imaging combined with spatial transcriptomics, in situ hybridization techniques, and modern computer tomography gives us further insights into the spatial distribution of toxins in the venom system and the function of the venom apparatus. All these evolutionary and biological insights help to identify venom compounds more efficiently, which can then be synthesized or produced in adapted expression systems to test their bioactivity. Finally, we critically discuss recent agrochemical, pharmaceutical, therapeutic, and diagnostic (so-called translational) aspects of venoms from which humans benefit.
Affiliation(s)
- Bjoern M von Reumont
- Goethe University Frankfurt, Institute for Cell Biology and Neuroscience, Department for Applied Bioinformatics, 60438 Frankfurt am Main, Germany
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Frankfurt, Senckenberganlage 25, 60235 Frankfurt, Germany
- Justus Liebig University Giessen, Institute for Insectbiotechnology, Heinrich Buff Ring 26-32, 35396 Giessen, Germany
| | - Gregor Anderluh
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, 1000 Ljubljana, Slovenia
| | - Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Naira Ayvazyan
- Orbeli Institute of Physiology of NAS RA, Orbeli ave. 22, 0028 Yerevan, Armenia
| | - Dimitris Beis
- Developmental Biology, Centre for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation Academy of Athens, Athens 11527, Greece
| | - Figen Caliskan
- Department of Biology, Faculty of Science and Letters, Eskisehir Osmangazi University, TR-26040 Eskisehir, Turkey
| | - Ana Crnković
- Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, 1000 Ljubljana, Slovenia
| | - Maik Damm
- Technische Universität Berlin, Department of Chemistry, Straße des 17. Juni 135, 10623 Berlin, Germany
| | | | - Lars Ellgaard
- Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark
| | - Goran Gajski
- Institute for Medical Research and Occupational Health, Mutagenesis Unit, Ksaverska cesta 2, 10000 Zagreb, Croatia
| | - Hannah German
- Amsterdam Institute of Molecular and Life Sciences, Division of BioAnalytical Chemistry, Faculty of Science, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081HV Amsterdam, The Netherlands
| | - Beata Halassy
- University of Zagreb, Centre for Research and Knowledge Transfer in Biotechnology, Trg Republike Hrvatske 14, 10000 Zagreb, Croatia
| | - Benjamin-Florian Hempel
- BIH Center for Regenerative Therapies BCRT, Charité - Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
| | - Tim Hucho
- Translational Pain Research, Department of Anesthesiology and Intensive Care Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, 50931 Cologne, Germany
| | - Nasit Igci
- Nevsehir Haci Bektas Veli University, Faculty of Arts and Sciences, Department of Molecular Biology and Genetics, 50300 Nevsehir, Turkey
| | - Maria P Ikonomopoulou
- Madrid Institute for Advanced Studies in Food, Madrid, E28049, Spain
- The University of Queensland, St Lucia, QLD 4072, Australia
| | - Izhar Karbat
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Maria I Klapa
- Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering Sciences, Foundation for Research & Technology Hellas (FORTH/ICE-HT), Patras GR-26504, Greece
| | - Ivan Koludarov
- Justus Liebig University Giessen, Institute for Insectbiotechnology, Heinrich Buff Ring 26-32, 35396 Giessen, Germany
| | - Jeroen Kool
- Amsterdam Institute of Molecular and Life Sciences, Division of BioAnalytical Chemistry, Faculty of Science, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081HV Amsterdam, The Netherlands
| | - Tim Lüddecke
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Frankfurt, Senckenberganlage 25, 60235 Frankfurt, Germany
- Department of Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology, 35392 Gießen, Germany
| | - Riadh Ben Mansour
- Department of Life Sciences, Faculty of Sciences, Gafsa University, Campus Universitaire Sidi Ahmed Zarrouk, 2112 Gafsa, Tunisia
| | - Maria Vittoria Modica
- Dept. of Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Via Po 25c, I-00198 Roma, Italy
| | - Yehu Moran
- Department of Ecology, Evolution and Behavior, Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
| | - Ayse Nalbantsoy
- Department of Bioengineering, Faculty of Engineering, Ege University, 35100 Bornova, Izmir, Turkey
| | - María Eugenia Pachón Ibáñez
- Unit of Infectious Diseases, Microbiology, and Preventive Medicine, Virgen del Rocío University Hospital, Institute of Biomedicine of Seville, 41013 Sevilla, Spain
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
| | - Alexios Panagiotopoulos
- Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering Sciences, Foundation for Research & Technology Hellas (FORTH/ICE-HT), Patras GR-26504, Greece
- Animal Biology Division, Department of Biology, University of Patras, Patras, GR-26500, Greece
| | - Eitan Reuveny
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Javier Sánchez Céspedes
- Unit of Infectious Diseases, Microbiology, and Preventive Medicine, Virgen del Rocío University Hospital, Institute of Biomedicine of Seville, 41013 Sevilla, Spain
- CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, Madrid, Spain
| | - Andy Sombke
- Department of Evolutionary Biology, University of Vienna, Djerassiplatz 1, 1030 Vienna, Austria
| | - Joachim M Surm
- Department of Ecology, Evolution and Behavior, Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
| | - Eivind A B Undheim
- University of Oslo, Centre for Ecological and Evolutionary Synthesis, Postboks 1066 Blindern 0316 Oslo, Norway
| | - Aida Verdes
- Department of Biodiversity and Evolutionary Biology, Museo Nacional de Ciencias Naturales, José Gutiérrez Abascal 2, 28006 Madrid, Spain
| | - Giulia Zancolli
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| |
|