1
Singh KP, Kumari P, Yadava DK. Development of de-novo transcriptome assembly and SSRs in allohexaploid Brassica with functional annotations and identification of heat-shock proteins for thermotolerance. Front Genet 2022; 13:958217. [PMID: 36186472 PMCID: PMC9524822 DOI: 10.3389/fgene.2022.958217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 08/23/2022] [Indexed: 11/20/2022]
Abstract
Crop Brassicas contain monogenomic and digenomic species, with no evidence of a trigenomic Brassica in nature. Through somatic fusion (Sinapis alba + B. juncea), a novel allohexaploid trigenomic Brassica (H1 = AABBSS; 2n = 60) was produced and used for transcriptome analysis to uncover genes for thermotolerance, annotations, and microsatellite markers for future molecular breeding. Illumina Novaseq 6000 generated a total of 76,055,546 paired-end raw reads, which were used for de-novo assembly, resulting in the development of 486,066 transcripts. A total of 133,167 coding sequences (CDSs) were predicted from transcripts with a mean length of 507.12 bp and 46.15% GC content. The BLASTX search of CDSs against public protein databases showed a maximum of 126,131 (94.72%) and a minimum of 29,810 (22.39%) positive hits. Furthermore, 953,773 gene ontology (GO) terms were found in 77,613 (58.28%) CDSs, which were divided into biological processes (49.06%), cellular components (31.67%), and molecular functions (19.27%). CDSs were assigned to 144 pathways by a pathway study using the KEGG database and 1,551 pathways by a similar analysis using the Reactome database. Further investigation led to the discovery of genes encoding over 2,000 heat shock proteins (HSPs). The discovery of a large number of HSPs in allohexaploid Brassica validated our earlier findings for heat tolerance at seed maturity. A total of 15,736 SSRs have been found in 13,595 CDSs, with an average of one SSR per 4.29 kb length and an SSR frequency of 11.82%. The first transcriptome assembly of a meiotically stable allohexaploid Brassica has been given in this article, along with functional annotations and the presence of SSRs, which could aid future genetic and genomic studies.
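The summary percentages in this abstract can be cross-checked with simple arithmetic. The sketch below is not the authors' pipeline. It assumes that the SSR frequency is the SSR count divided by the CDS count, and that the SSR density is the total CDS length (CDS count times mean length) divided by the SSR count. All variable names are illustrative.

```python
# Figures as reported in the abstract above.
n_cds = 133_167            # predicted coding sequences (CDSs)
mean_cds_len_bp = 507.12   # mean CDS length in bp
n_ssr = 15_736             # SSRs detected
n_cds_with_go = 77_613     # CDSs carrying at least one GO term

# Assumption: frequency = SSRs per CDS, expressed as a percentage.
ssr_frequency_pct = 100 * n_ssr / n_cds

# Assumption: density = total CDS length divided by the number of SSRs.
total_len_bp = n_cds * mean_cds_len_bp
kb_per_ssr = total_len_bp / n_ssr / 1000

# Share of CDSs with GO annotation.
go_annotated_pct = 100 * n_cds_with_go / n_cds

print(f"SSR frequency: {ssr_frequency_pct:.2f}%")
print(f"One SSR per {kb_per_ssr:.2f} kb")
print(f"GO-annotated CDSs: {go_annotated_pct:.2f}%")
```

Under these assumptions the script reproduces the reported 11.82%, 4.29 kb, and 58.28%, which suggests the frequency was computed from the 15,736 SSRs and not from the 13,595 SSR-containing CDSs.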
Affiliation(s)
- Preetesh Kumari
- Genetics Division, ICAR—Indian Agricultural Research Institute, New Delhi, India
- *Correspondence: Preetesh Kumari,
2
Naranpanawa DNU, Chandrasekara CHWMRB, Bandaranayake PCG, Bandaranayake AU. Raw transcriptomics data to gene specific SSRs: a validated free bioinformatics workflow for biologists. Sci Rep 2020; 10:18236. [PMID: 33106560 PMCID: PMC7588437 DOI: 10.1038/s41598-020-75270-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/08/2019] [Accepted: 09/21/2020] [Indexed: 02/07/2023]
Abstract
Recent advances in next-generation sequencing technologies have paved the path for a considerable amount of sequencing data at a relatively low cost. This has revolutionized the genomics and transcriptomics studies. However, different challenges are now created in handling such data with available bioinformatics platforms both in assembly and downstream analysis performed in order to infer correct biological meaning. Though there are a handful of commercial software and tools for some of the procedures, cost of such tools has made them prohibitive for most research laboratories. While individual open-source or free software tools are available for most of the bioinformatics applications, those components usually operate standalone and are not combined for a user-friendly workflow. Therefore, beginners in bioinformatics might find analysis procedures starting from raw sequence data too complicated and time-consuming with the associated learning-curve. Here, we outline a procedure for de novo transcriptome assembly and Simple Sequence Repeats (SSR) primer design solely based on tools that are available online for free use. For validation of the developed workflow, we used Illumina HiSeq reads of different tissue samples of Santalum album (sandalwood), generated from a previous transcriptomics project. A portion of the designed primers were tested in the lab with relevant samples and all of them successfully amplified the targeted regions. The presented bioinformatics workflow can accurately assemble quality transcriptomes and develop gene specific SSRs. Beginner biologists and researchers in bioinformatics can easily utilize this workflow for research purposes.
Affiliation(s)
- D N U Naranpanawa
- Agricultural Biotechnology Centre, Faculty of Agriculture, University of Peradeniya, Peradeniya, 20400, Sri Lanka
- Postgraduate Institute of Science, University of Peradeniya, Peradeniya, 20400, Sri Lanka
- C H W M R B Chandrasekara
- Agricultural Biotechnology Centre, Faculty of Agriculture, University of Peradeniya, Peradeniya, 20400, Sri Lanka
- P C G Bandaranayake
- Agricultural Biotechnology Centre, Faculty of Agriculture, University of Peradeniya, Peradeniya, 20400, Sri Lanka
- A U Bandaranayake
- Department of Computer Engineering, Faculty of Engineering, University of Peradeniya, Peradeniya, 20400, Sri Lanka.
3
Kim KW, Allen DW, Briese T, Couper JJ, Barry SC, Colman PG, Cotterill AM, Davis EA, Giles LC, Harrison LC, Harris M, Haynes A, Horton JL, Isaacs SR, Jain K, Lipkin WI, McGorm K, Morahan G, Morbey C, Pang ICN, Papenfuss AT, Penno MAS, Sinnott RO, Soldatos G, Thomson RL, Vuillermin P, Wentworth JM, Wilkins MR, Rawlinson WD, Craig ME. Higher frequency of vertebrate-infecting viruses in the gut of infants born to mothers with type 1 diabetes. Pediatr Diabetes 2020; 21:271-279. [PMID: 31800147 DOI: 10.1111/pedi.12952] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/26/2019] [Revised: 11/07/2019] [Accepted: 11/10/2019] [Indexed: 12/18/2022]
Abstract
BACKGROUND Microbial exposures in utero and early life shape the infant microbiome, which can profoundly impact on health. Compared to the bacterial microbiome, very little is known about the virome. We set out to characterize longitudinal changes in the gut virome of healthy infants born to mothers with or without type 1 diabetes using comprehensive virome capture sequencing. METHODS Healthy infants were selected from Environmental Determinants of Islet Autoimmunity (ENDIA), a prospective cohort of Australian children with a first-degree relative with type 1 diabetes, followed from pregnancy. Fecal specimens were collected three-monthly in the first year of life. RESULTS Among 25 infants (44% born to mothers with type 1 diabetes) at least one virus was detected in 65% (65/100) of samples and 96% (24/25) of infants during the first year of life. In total, 26 genera of viruses were identified and >150 viruses were differentially abundant between the gut of infants with a mother with type 1 diabetes vs without. Positivity for any virus was associated with maternal type 1 diabetes and older infant age. Enterovirus was associated with older infant age and maternal smoking. CONCLUSIONS We demonstrate a distinct gut virome profile in infants of mothers with type 1 diabetes, which may influence health outcomes later in life. Higher prevalence and greater number of viruses observed compared to previous studies suggests significant underrepresentation in existing virome datasets, arising most likely from less sensitive techniques used in data acquisition.
Affiliation(s)
- Ki Wook Kim
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia
- Digby W Allen
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia
- Thomas Briese
- Center for Infection and Immunity and Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York
- Jennifer J Couper
- Robinson Research Institute and Adelaide Medical School, University of Adelaide, Adelaide, South Australia, Australia
- Simon C Barry
- Robinson Research Institute and Adelaide Medical School, University of Adelaide, Adelaide, South Australia, Australia
- Peter G Colman
- Department of Diabetes and Endocrinology, The Royal Melbourne Hospital Victoria, Melbourne, Victoria, Australia
- Andrew M Cotterill
- Department of Endocrinology, Queensland Children's Hospital, South Brisbane, Queensland, Australia
- Elizabeth A Davis
- Telethon Kids Institute, The University of Western Australia, Perth, Western Australia, Australia
- Lynne C Giles
- School of Public Health, University of Adelaide, Adelaide, South Australia, Australia
- Leonard C Harrison
- Walter and Eliza Hall Institute and Royal Melbourne Hospital, Melbourne, Victoria, Australia
- Mark Harris
- Department of Endocrinology, Queensland Children's Hospital, South Brisbane, Queensland, Australia
- Aveni Haynes
- Telethon Kids Institute, The University of Western Australia, Perth, Western Australia, Australia
- Jessica L Horton
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia
- Sonia R Isaacs
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia
- Komal Jain
- Center for Infection and Immunity and Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York
- Walter I Lipkin
- Center for Infection and Immunity and Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York
- Kelly McGorm
- Robinson Research Institute and Adelaide Medical School, University of Adelaide, Adelaide, South Australia, Australia
- Grant Morahan
- Centre for Diabetes Research, Harry Perkins Institute of Medical Research, Perth, Western Australia, Australia
- Claire Morbey
- Hunter Diabetes Centre, Newcastle, New South Wales, Australia
- Ignatius C N Pang
- School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, New South Wales, Australia
- Anthony T Papenfuss
- Walter and Eliza Hall Institute and Royal Melbourne Hospital, Melbourne, Victoria, Australia
- Megan A S Penno
- Robinson Research Institute and Adelaide Medical School, University of Adelaide, Adelaide, South Australia, Australia
- Richard O Sinnott
- Department of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- Georgia Soldatos
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Rebecca L Thomson
- Robinson Research Institute and Adelaide Medical School, University of Adelaide, Adelaide, South Australia, Australia
- Peter Vuillermin
- School of Medicine, Deakin University, Geelong, Victoria, Australia
- John M Wentworth
- Walter and Eliza Hall Institute and Royal Melbourne Hospital, Melbourne, Victoria, Australia
- Marc R Wilkins
- School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, New South Wales, Australia
- William D Rawlinson
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia; Serology and Virology Division, SEALS Microbiology, Prince of Wales Hospital, Sydney, New South Wales, Australia
- Maria E Craig
- School of Women's and Children's Health, University of New South Wales, Sydney, New South Wales, Australia; Institute of Endocrinology and Diabetes, The Children's Hospital at Westmead, Sydney, New South Wales, Australia
4
Luo Y, Liao X, Wu FX, Wang J. Computational Approaches for Transcriptome Assembly Based on Sequencing Technologies. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190410155603] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/22/2022]
Abstract
Transcriptome assembly plays a critical role in studying biological properties and examining the expression levels of genomes in specific cells. It is also the basis of many downstream analyses. With the increase of speed and the decrease in cost, massive sequencing data continues to accumulate. A large number of assembly strategies based on different computational methods and experiments have been developed. How to efficiently perform transcriptome assembly with high sensitivity and accuracy becomes a key issue. In this work, the issues with transcriptome assembly are explored based on different sequencing technologies. Specifically, transcriptome assemblies with next-generation sequencing reads are divided into reference-based assemblies and de novo assemblies. The examples of different species are used to illustrate that long reads produced by the third-generation sequencing technologies can cover full-length transcripts without assemblies. In addition, different transcriptome assemblies using the Hybrid-seq methods and other tools are also summarized. Finally, we discuss the future directions of transcriptome assemblies.
Affiliation(s)
- Yuwen Luo
- School of Computer Science and Engineering, Central South University, Changsha, China
- Xingyu Liao
- School of Computer Science and Engineering, Central South University, Changsha, China
- Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatchewan, Canada
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha, China
5
Carvajal-Lopez P, Von Borstel FD, Torres A, Rustici G, Gutierrez J, Romero-Vivas E. Microarray-Based Quality Assessment as a Supporting Criterion for de novo Transcriptome Assembly Selection. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:198-206. [PMID: 30059314 DOI: 10.1109/tcbb.2018.2860997] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
RNA-Sequencing and de novo assembly have enabled the analysis of species with non-available reference transcriptomes, although intrinsic features (biological and technical) induce errors in the reconstruction. A strategy to resolve these errors consists of varying assembling process parameters to generate multiple reconstructions. However, the best assembly selection remains a challenge. Quantitative metrics for quality assessment have been inconsistent when compared with pertinent references. In this paper, a criterion for supporting assembly selection based on mapping DNA microarray hybridized probes to assembly sets is proposed. Mouse and fruit fly RNA-Seq datasets were assembled with standard de novo procedures. Quality assessment was estimated using quantitative metrics and the proposed criterion. The assembly that best mapped to the available reference transcriptomes of these model species provided the highest quality assembly. The hybridized probes identified the best assemblies, whereas quantitative metrics remained inconsistent. For example, subtle probe mapping difference of 0.25 percent, but statistically significant (ANOVA, p < 0.05), enabled the assembly selection that led to identify 3,719 more contigs and led to 1,049 further mapped contigs to the mouse reference transcriptome. The microarray data availability for non-model species makes the proposed criterion suitable for quality assessment of multiple de novo assembly strategies.
6
Tremblay ÉD, Kimoto T, Bérubé JA, Bilodeau GJ. High-Throughput Sequencing to Investigate Phytopathogenic Fungal Propagules Caught in Baited Insect Traps. J Fungi (Basel) 2019; 5:E15. [PMID: 30759800 PMCID: PMC6463110 DOI: 10.3390/jof5010015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/14/2019] [Revised: 02/02/2019] [Accepted: 02/04/2019] [Indexed: 12/20/2022]
Abstract
Studying the means of dispersal of plant pathogens is crucial to better understand the dynamic interactions involved in plant infections. On the one hand, entomologists rely mostly on traditional molecular methods and morphological characteristics to identify pests. On the other hand, high-throughput sequencing (HTS) is becoming the go-to avenue for scientists studying phytopathogens. These organisms sometimes infect plants, together with insects. Considering the growing number of exotic insect introductions in Canada, forest pest-management efforts would benefit from the development of a high-throughput strategy to investigate the phytopathogenic fungal and oomycete species interacting with wood-boring insects. We recycled formerly discarded preservative fluids from the Canadian Food Inspection Agency annual survey using insect traps and analysed more than one hundred samples originating from across Canada. Using the Ion Torrent Personal Genome Machine (PGM) HTS technology and fusion primers, we performed metabarcoding to screen for unwanted fungal and oomycete species, including Phytophthora spp. Community profiling was conducted on the four different semiochemicals that attract wood-boring insects, although the preservative (which contained ethanol) also attracted other insects. Phytopathogenic fungi (e.g., Leptographium spp. and Meria laricis in the pine sawyer semiochemical) and oomycetes (mainly Peronospora spp. and Pythium aff. hypogynum in the General Longhorn semiochemical), solely associated with one of the four types of semiochemicals, were detected. This project demonstrated that the insect traps' semiochemical microbiome represents a new and powerful matrix for screening phytopathogens. Compared to traditional diagnostic techniques, the fluids allowed for a faster and higher throughput assessment of the biodiversity contained within. Additionally, minimal modifications to this approach would allow it to be used in other phytopathology fields.
Affiliation(s)
- Émilie D Tremblay
- Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, ON, K2H 8P9, Canada.
- Troy Kimoto
- Canadian Food Inspection Agency, 4321 Still Creek Dr, Burnaby, BC, V5C 6S7, Canada.
- Jean A Bérubé
- Natural Resources Canada, Laurentian Forestry Centre, 1055 Du P.E.P.S. Street, P.O. Box 10380 Québec, QC, G1V 4C7, Canada.
- Guillaume J Bilodeau
- Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, ON, K2H 8P9, Canada.
7
Kim KW, Allen DW, Briese T, Couper JJ, Barry SC, Colman PG, Cotterill AM, Davis EA, Giles LC, Harrison LC, Harris M, Haynes A, Horton JL, Isaacs SR, Jain K, Lipkin WI, Morahan G, Morbey C, Pang ICN, Papenfuss AT, Penno MAS, Sinnott RO, Soldatos G, Thomson RL, Vuillermin PJ, Wentworth JM, Wilkins MR, Rawlinson WD, Craig ME. Distinct Gut Virome Profile of Pregnant Women With Type 1 Diabetes in the ENDIA Study. Open Forum Infect Dis 2019; 6:ofz025. [PMID: 30815502 PMCID: PMC6386807 DOI: 10.1093/ofid/ofz025] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/12/2018] [Revised: 01/05/2019] [Accepted: 01/15/2019] [Indexed: 12/11/2022]
Abstract
Background The importance of gut bacteria in human physiology, immune regulation, and disease pathogenesis is well established. In contrast, the composition and dynamics of the gut virome are largely unknown; particularly lacking are studies in pregnancy. We used comprehensive virome capture sequencing to characterize the gut virome of pregnant women with and without type 1 diabetes (T1D), longitudinally followed in the Environmental Determinants of Islet Autoimmunity study. Methods In total, 61 pregnant women (35 with T1D and 26 without) from Australia were examined. Nucleic acid was extracted from serial fecal specimens obtained at prenatal visits, and viral genomes were sequenced by virome capture enrichment. The frequency, richness, and abundance of viruses were compared between women with and without T1D. Results Two viruses were more prevalent in pregnant women with T1D: picobirnaviruses (odds ratio [OR], 4.2; 95% confidence interval [CI], 1.0–17.1; P = .046) and tobamoviruses (OR, 3.2; 95% CI, 1.1–9.3; P = .037). The abundance of 77 viruses significantly differed between the 2 maternal groups (≥2-fold difference; P < .02), including 8 Enterovirus B types present at a higher abundance in women with T1D. Conclusions These findings provide novel insight into the composition of the gut virome during pregnancy and demonstrate a distinct profile of viruses in women with T1D.
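The abstract reports odds ratios with 95% confidence intervals but does not say how they were computed. As a hedged illustration of what such a figure means, the sketch below computes an odds ratio and a Wald interval from a 2x2 table. The Wald method is an assumption (the study may have used logistic regression or an exact method), the function name is hypothetical, and the counts are invented for illustration.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: group 1, virus detected      b: group 1, virus not detected
    c: group 2, virus detected      d: group 2, virus not detected
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only; NOT taken from the study.
or_, lo, hi = odds_ratio_wald_ci(a=20, b=30, c=10, d=40)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

An interval whose lower bound sits close to 1, as with the reported picobirnavirus estimate (1.0 to 17.1), indicates an association that is only just distinguishable from no effect at the 5% level.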
Affiliation(s)
- Ki Wook Kim
- School of Women's and Children's Health, University of New South Wales, Sydney, Australia
- Digby W Allen
- School of Women's and Children's Health, University of New South Wales, Sydney, Australia
- Thomas Briese
- Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York
- Jennifer J Couper
- Adelaide Medical School, Faculty and Health and Medical Sciences, University of Adelaide Robinson Research Institute, Australia
- Simon C Barry
- Adelaide Medical School, Faculty and Health and Medical Sciences, University of Adelaide Robinson Research Institute, Australia
- Peter G Colman
- Department of Diabetes and Endocrinology, The Royal Melbourne Hospital Victoria, Australia
- Lynne C Giles
- School of Public Health, University of Adelaide, Australia
- Leonard C Harrison
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Mark Harris
- Children's Health Queensland Hospital and Health Service, Australia
- Aveni Haynes
- Telethon Kids Institute, The University of Western Australia, Perth
- Jessica L Horton
- School of Women's and Children's Health, University of New South Wales, Sydney, Australia
- Sonia R Isaacs
- School of Women's and Children's Health, University of New South Wales, Sydney, Australia
- Komal Jain
- Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York
- Walter Ian Lipkin
- Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York
- Grant Morahan
- Centre for Diabetes Research, The Harry Perkins Institute for Medical Research, Perth, Australia
- Ignatius C N Pang
- School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, Australia
- Anthony T Papenfuss
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Megan A S Penno
- Adelaide Medical School, Faculty and Health and Medical Sciences, University of Adelaide Robinson Research Institute, Australia
- Richard O Sinnott
- Department of Computing and Information Systems, University of Melbourne, Australia
- Georgia Soldatos
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Rebecca L Thomson
- Adelaide Medical School, Faculty and Health and Medical Sciences, University of Adelaide Robinson Research Institute, Australia
- John M Wentworth
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Marc R Wilkins
- School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, Australia
- William D Rawlinson
- Serology and Virology Division, SEALS Microbiology, Prince of Wales Hospital, Sydney, Australia
- Maria E Craig
- School of Women's and Children's Health, University of New South Wales, Sydney, Australia; Institute of Endocrinology and Diabetes, The Children's Hospital at Westmead, Sydney, Australia
8
Tremblay ÉD, Duceppe MO, Bérubé JA, Kimoto T, Lemieux C, Bilodeau GJ. Screening for Exotic Forest Pathogens to Increase Survey Capacity Using Metagenomics. Phytopathology 2018; 108:1509-1521. [PMID: 29923801 DOI: 10.1094/phyto-02-18-0028-r] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 06/08/2023]
Abstract
Anthropogenic activities have a major impact on the global environment. Canada's natural resources are threatened by the spread of fungal pathogens, which is facilitated by agricultural practices and international trade. Fungi are introduced to new environments and sometimes become established, in which case they can cause disease outbreaks resulting in extensive forest decline. Here, we describe how a nationwide sample collection strategy coupled to next-generation sequencing (NGS) (i.e., metagenomics) can achieve fast and comprehensive screening for exotic invasive species. This methodology can help provide guidance to phytopathology stakeholders such as regulatory agencies. Several regulated invasive species were monitored by processing field samples collected over 3 years (2013 to 2015) near high-risk areas across Canada. Fifteen sequencing runs were required on the Ion Torrent platform to process 398 samples that yielded 45 million reads. High-throughput screening of fungal and oomycete operational taxonomic units using customized fungi-specific ribosomal internal transcribed spacer 1 barcoded primers was performed. Likewise, Phytophthora-specific barcoded primers were used to amplify the adenosine triphosphate synthase subunit 9-nicotinamide adenine dinucleotide dehydrogenase subunit 9 spacer. Several Phytophthora spp. were detected by NGS and confirmed by species-specific quantitative polymerase chain reaction (qPCR) assays. The target species Heterobasidion annosum sensu stricto could be detected only through metagenomics. We demonstrated that screening target species using a variety of sampling techniques and NGS (the results of which were validated by qPCR) has the potential to increase survey capacity and detection sensitivity, reduce hands-on time and costs, and assist regulatory agencies to identify ports of entry. Considering that early detection and prevention are the keys in mitigating invasive species damage, our method represents a substantial asset in plant pathology management.
Affiliation(s)
- Émilie D Tremblay
- Marc-Olivier Duceppe
- Jean A Bérubé
- Troy Kimoto
- Claude Lemieux
- Guillaume J Bilodeau
- First, second, and sixth authors: Canadian Food Inspection Agency (CFIA), 3851 Fallowfield Road, Ottawa, Ontario, K2H 8P9, Canada; third author: Natural Resources Canada, Laurentian Forestry Centre, 1055 Du P.E.P.S. Street, P.O. Box 10380 Québec, Québec, G1V 4C7, Canada; fourth author: CFIA, 4321 Still Creek Dr, Burnaby, British Columbia, V5C 6S7, Canada; and fifth author: Institut de biologie intégrative et des systèmes, 1030 avenue de la Médecine, Québec, Québec, G1V 0A6, Canada
9
Bains S, Thakur V, Kaur J, Singh K, Kaur R. Elucidating genes involved in sesquiterpenoid and flavonoid biosynthetic pathways in Saussurea lappa by de novo leaf transcriptome analysis. Genomics 2018; 111:1474-1482. [PMID: 30343181 DOI: 10.1016/j.ygeno.2018.09.022] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/26/2018] [Revised: 09/16/2018] [Accepted: 09/30/2018] [Indexed: 12/13/2022]
Abstract
Saussurea lappa (family Asteraceae) possesses immense pharmacological potential mainly due to the presence of sesquiterpene lactones. In spite of its medicinal importance, S. lappa has been poorly explored at the molecular level. We initiated leaf transcriptome sequencing of S. lappa using the Illumina HiSeq 2000 platform and generated 62,039,614 raw reads. The Trinity assembler generated 122,434 contigs with an N50 value of 1,053 bp. The assembled transcripts were compared against the non-redundant protein database at NCBI. The Blast2GO analysis assigned gene ontology (GO) terms, categorized into molecular functions (3,132), biological processes (4,477) and cellular components (1,927). Using KEGG, around 476 contigs were assigned to 39 pathways. For secondary metabolic pathways, we identified transcripts encoding genes involved in sesquiterpenoid and flavonoid biosynthesis. A relatively low number of transcripts encoding genes involved in the alkaloid pathway was also found. Our data will contribute to functional genomics and metabolic engineering studies in this plant.
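The N50 quoted in this abstract is a standard assembly contiguity statistic: the contig length L such that contigs of length L or longer together contain at least half of all assembled bases. The sketch below shows one common way to compute it. The function name and the toy lengths are illustrative assumptions and are not taken from the study or from the Trinity toolkit.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk contigs from longest to shortest, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the total assembled bases defines the N50.
        if running >= half_total:
            return length

# Toy lengths for illustration only; not data from the study.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

Because N50 weights long contigs heavily, it is usually reported alongside the contig count, as the abstract does, so that a few very long contigs do not mask a fragmented assembly.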
Affiliation(s)
- Savita Bains
- Department of Biotechnology, Panjab University, BMS Block I, Sector 25, Chandigarh 160014, India
- Vasundhara Thakur
- Department of Biotechnology, Panjab University, BMS Block I, Sector 25, Chandigarh 160014, India
- Jagdeep Kaur
- Department of Biotechnology, Panjab University, BMS Block I, Sector 25, Chandigarh 160014, India
- Kashmir Singh
- Department of Biotechnology, Panjab University, BMS Block I, Sector 25, Chandigarh 160014, India
- Ravneet Kaur
- Department of Biotechnology, Panjab University, BMS Block I, Sector 25, Chandigarh 160014, India.
10
|
Comparative immunological study of the snail Physella acuta (Hygrophila, Pulmonata) reveals shared and unique aspects of gastropod immunobiology. Mol Immunol 2018; 101:108-119. [PMID: 29920433 DOI: 10.1016/j.molimm.2018.05.029] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 12/11/2017] [Revised: 05/24/2018] [Accepted: 05/30/2018] [Indexed: 12/22/2022]
Abstract
The freshwater snail Physella acuta was selected to expand the perspective of comparative snail immunology. Analysis of Physella acuta, belonging to the Physidae, taxonomic sister family to Planorbidae, affords family-level comparison of immune features characterized from Biomphalaria glabrata, the model snail often used to interpret general gastropod immunity. To capture constitutive and induced immune sequences, transcriptomes of an individual Physella acuta snail, 12 h post injection with bacteria (Gram -/+) and one sham-exposed snail were recorded with 454 pyrosequencing. Assembly yielded a combined reference transcriptome containing 24,288 transcripts. Additionally, genomic Illumina reads were obtained (∼15-fold coverage). Recovery of transcripts for two macin-like antimicrobial peptides (AMPs), 12 aplysianins, four LBP/BPIs and three physalysins indicated that Physella acuta shares a similar organization of antimicrobial defenses with Biomphalaria glabrata, contrasting a modest AMP arsenal with a diverse set of antimicrobial proteins. The lack of predicted transmembrane domains in all seven Physella acuta PGRP transcripts supports the notion that gastropods do not employ cell-bound PGRP receptors, different from ecdysozoan invertebrates yet similar to mammals (vertebrate deuterostomes). The well-documented sequence diversification by Biomphalaria glabrata FREPs (immune lectins comprising immunoglobulin superfamily domains and fibrinogen domains), resulting from somatic mutations of a large FREP gene family, is hypothesized to be unique to Planorbidae; Physella acuta revealed just two bona fide FREP genes and these were not diversified. Furthermore, the flatworm parasite Echinostoma paraensei, confirmed here to infect both snail species, did not evoke from Physella acuta the abundant expression of FREP proteins at 2, 4 and 8 days post exposure that was previously observed from Biomphalaria glabrata.
The Physella acuta reference transcriptome also revealed 24 unique transcripts encoding proteins consisting of a single fibrinogen-related domain (FReDs), with a short N-terminal sequence encoding either a signal peptide, transmembrane domain or no predicted features. The Physella acuta FReDs are candidate immune genes based on implication of similar sequences in immunity of bivalve molluscs. Overall, comparative analysis of snails of sister families elucidated the potential for taxon-specific immune features and investigation of strategically selected species will provide a more comprehensive view of gastropod immunity.
|
11
|
Reducing the number of artifactual repeats in de novo assembly of RNA-Seq data by optimizing the assembly pipeline. Gene Reports 2017. [DOI: 10.1016/j.genrep.2017.08.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
|
12
|
De novo transcriptome sequencing and assembly from apomictic and sexual Eragrostis curvula genotypes. PLoS One 2017; 12:e0185595. [PMID: 29091722 PMCID: PMC5665505 DOI: 10.1371/journal.pone.0185595] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 03/06/2017] [Accepted: 09/15/2017] [Indexed: 11/19/2022]
Abstract
A long-standing goal in plant breeding has been the ability to confer apomixis to agriculturally relevant species, which would require a deeper comprehension of the molecular basis of apomictic regulatory mechanisms. Eragrostis curvula (Schrad.) Nees is a perennial grass that includes both sexual and apomictic cytotypes. The availability of a reference transcriptome for this species would constitute a very important tool toward the identification of genes controlling key steps of the apomictic pathway. Here, we used Roche/454 sequencing technologies to generate reads from inflorescences of E. curvula apomictic and sexual genotypes that were de novo assembled into a reference transcriptome. Nearly 90% of the 49,568 assembled isotigs showed sequence similarity to sequences deposited in the public databases. A gene ontology analysis categorized 27,448 isotigs into at least one of the three main GO categories. We identified 11,475 SSRs, and several of them were assayed in E. curvula germplasm using SSR-based primers, providing a valuable set of molecular markers that could allow direct allele selection. The differential contribution to each library of the spliced forms of several transcripts revealed the existence of several isotigs produced via alternative splicing of single genes. The reference transcriptome presented and validated in this work will be useful for the identification of a wide range of gene(s) related to agronomic traits of E. curvula, including those controlling key steps of the apomictic pathway in this species, allowing the extrapolation of the findings to other plant species.
|
13
|
Schultz JH, Adema CM. Comparative immunogenomics of molluscs. Developmental and Comparative Immunology 2017; 75:3-15. [PMID: 28322934 PMCID: PMC5494275 DOI: 10.1016/j.dci.2017.03.013] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 02/08/2017] [Revised: 03/10/2017] [Accepted: 03/15/2017] [Indexed: 05/22/2023]
Abstract
Comparative immunology, studying both vertebrates and invertebrates, provided the earliest descriptions of phagocytosis as a general immune mechanism. However, the large scale of animal diversity challenges all-inclusive investigations and the field of immunology has developed by mostly emphasizing study of a few vertebrate species. In addressing the lack of comprehensive understanding of animal immunity, especially that of invertebrates, comparative immunology helps toward management of invertebrates that are food sources, agricultural pests, pathogens, or transmit diseases, and helps interpret the evolution of animal immunity. Initial studies showed that the Mollusca (second largest animal phylum), and invertebrates in general, possess innate defenses but lack the lymphocytic immune system that characterizes vertebrate immunology. Recognizing the reality of both common and taxon-specific immune features, and applying up-to-date cell and molecular research capabilities, in-depth studies of a select number of bivalve and gastropod species continue to reveal novel aspects of molluscan immunity. The genomics era heralded a new stage of comparative immunology; large-scale efforts yielded an initial set of full molluscan genome sequences that is available for analyses of full complements of immune genes and regulatory sequences. Next-generation sequencing (NGS), due to lower cost and effort required, allows individual researchers to generate large sequence datasets for growing numbers of molluscs. RNAseq provides expression profiles that enable discovery of immune genes and genome sequences reveal distribution and diversity of immune factors across molluscan phylogeny. Although computational de novo sequence assembly will benefit from continued development and automated annotation may require some experimental validation, NGS is a powerful tool for comparative immunology, especially increasing coverage of the extensive molluscan diversity. 
To date, immunogenomics revealed new levels of complexity of molluscan defense by indicating sequence heterogeneity in individual snails and bivalves, and members of expanded immune gene families are expressed differentially to generate pathogen-specific defense responses.
Affiliation(s)
- Jonathan H Schultz
  - Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- Coen M Adema
  - Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
|
14
|
Ma T, Zhang A. Omics Informatics: From Scattered Individual Software Tools to Integrated Workflow Management Systems. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:926-946. [PMID: 26930689 DOI: 10.1109/tcbb.2016.2535251] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Omic data analyses pose great informatics challenges. As an emerging subfield of bioinformatics, omics informatics focuses on analyzing multi-omic data efficiently and effectively, and is gaining momentum. There are two underlying trends in the expansion of omics informatics landscape: the explosion of scattered individual omics informatics tools with each of which focuses on a specific task in both single- and multi- omic settings, and the fast-evolving integrated software platforms such as workflow management systems that can assemble multiple tools into pipelines and streamline integrative analysis for complicated tasks. In this survey, we give a holistic view of omics informatics, from scattered individual informatics tools to integrated workflow management systems. We not only outline the landscape and challenges of omics informatics, but also sample a number of widely used and cutting-edge algorithms in omics data analysis to give readers a fine-grained view. We survey various workflow management systems (WMSs), classify them into three levels of WMSs from simple software toolkits to integrated multi-omic analytical platforms, and point out the emerging needs for developing intelligent workflow management systems. We also discuss the challenges, strategies and some existing work in systematic evaluation of omics informatics tools. We conclude by providing future perspectives of emerging fields and new frontiers in omics informatics.
|
15
|
Zhu C, Sun B, Liu T, Zheng H, Gu W, He W, Sun F, Wang Y, Yang M, Bei W, Peng X, She Q, Xie L, Chen L. Genomic and transcriptomic analyses reveal distinct biological functions for cold shock proteins (VpaCspA and VpaCspD) in Vibrio parahaemolyticus CHN25 during low-temperature survival. BMC Genomics 2017; 18:436. [PMID: 28583064 PMCID: PMC5460551 DOI: 10.1186/s12864-017-3784-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 08/09/2016] [Accepted: 05/10/2017] [Indexed: 11/24/2022]
Abstract
Background Vibrio parahaemolyticus causes serious seafood-borne gastroenteritis and death in humans. Raw seafood is often subjected to post-harvest processing and low-temperature storage. To date, very little information is available regarding the biological functions of cold shock proteins (CSPs) in the low-temperature survival of the bacterium. In this study, we determined the complete genome sequence of V. parahaemolyticus CHN25 (serotype: O5:KUT). The two main CSP-encoding genes (VpacspA and VpacspD) were deleted from the bacterial genome, and comparative transcriptomic analysis between the mutant and wild-type strains was performed to dissect the possible molecular mechanisms that underlie low-temperature adaptation by V. parahaemolyticus. Results The 5,443,401-bp V. parahaemolyticus CHN25 genome (45.2% G + C) consisted of two circular chromosomes and three plasmids with 4,724 predicted protein-encoding genes. One dual-gene and two single-gene deletion mutants were generated for VpacspA and VpacspD by homologous recombination. The growth of the ΔVpacspA mutant was strongly inhibited at 10 °C, whereas the VpacspD gene deletion strongly stimulated bacterial growth at this low temperature compared with the wild-type strain. The complementary phenotypes were observed in the reverse mutants (ΔVpacspA-com, and ΔVpacspD-com). The transcriptome data revealed that 12.4% of the expressed genes in V. parahaemolyticus CHN25 were significantly altered in the ΔVpacspA mutant when it was grown at 10 °C. These included genes that were involved in amino acid degradation, secretion systems, sulphur metabolism and glycerophospholipid metabolism along with ATP-binding cassette transporters. However, a low temperature elicited significant expression changes for 10.0% of the genes in the ΔVpacspD mutant, including those involved in the phosphotransferase system and in the metabolism of nitrogen and amino acids. 
The major metabolic pathways that were altered by the dual-gene deletion mutant (ΔVpacspAD) radically differed from those that were altered by single-gene mutants. Comparison of the transcriptome profiles further revealed numerous differentially expressed genes that were shared among the three mutants and regulators that were specifically, coordinately or antagonistically modulated by VpaCspA and VpaCspD. Our data also revealed several possible molecular coping strategies for low-temperature adaptation by the bacterium. Conclusions This study is the first to describe the complete genome sequence of V. parahaemolyticus (serotype: O5:KUT). The gene deletions, complementary insertions, and comparative transcriptomics demonstrate that VpaCspA is a primary CSP in the bacterium, while VpaCspD functions as a growth inhibitor at 10 °C. These results have improved our understanding of the genetic basis for low-temperature survival by the most common seafood-borne pathogen worldwide.
Affiliation(s)
- Chunhua Zhu
  - Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture; College of Food Science and Technology, Shanghai Ocean University, 999 Hu Cheng Huan Road, Shanghai, 201306, People's Republic of China
- Boyi Sun
  - Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture; College of Food Science and Technology, Shanghai Ocean University, 999 Hu Cheng Huan Road, Shanghai, 201306, People's Republic of China
- Taigang Liu
  - College of Information Technology, Shanghai Ocean University, 999 Hu Cheng Huan Road, Shanghai, 201306, People's Republic of China
- Huajun Zheng
  - Shanghai-MOST Key Laboratory of Disease and Health Genomics, Chinese National Human Genome Centre at Shanghai, Shanghai, 201203, People's Republic of China
- Wenyi Gu
  - Shanghai-MOST Key Laboratory of Disease and Health Genomics, Chinese National Human Genome Centre at Shanghai, Shanghai, 201203, People's Republic of China
- Wei He
  - Shanghai Hanyu Bio-lab, 151 Ke Yuan Road, Shanghai, 201203, People's Republic of China
- Fengjiao Sun
  - Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture; College of Food Science and Technology, Shanghai Ocean University, 999 Hu Cheng Huan Road, Shanghai, 201306, People's Republic of China
- Yaping Wang
  - Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture; College of Food Science and Technology, Shanghai Ocean University, 999 Hu Cheng Huan Road, Shanghai, 201306, People's Republic of China
- Meicheng Yang
  - Shanghai Institute for Food and Drug Control, 1500 Zhang Heng Road, Shanghai, 201203, People's Republic of China
- Weicheng Bei
  - State Key Laboratory of Agricultural Microbiology, Laboratory of Animal Infectious Diseases, College of Animal Science & Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, 430070, People's Republic of China
- Xu Peng
  - Archaea Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK2200, Copenhagen N, Denmark
- Qunxin She
  - Archaea Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK2200, Copenhagen N, Denmark
- Lu Xie
  - Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China
- Lanming Chen
  - Key Laboratory of Quality and Safety Risk Assessment for Aquatic Products on Storage and Preservation (Shanghai), China Ministry of Agriculture; College of Food Science and Technology, Shanghai Ocean University, 999 Hu Cheng Huan Road, Shanghai, 201306, People's Republic of China
|
16
|
Exploring the heat-responsive chaperones and microsatellite markers associated with terminal heat stress tolerance in developing wheat. Funct Integr Genomics 2017; 17:621-640. [PMID: 28573536 DOI: 10.1007/s10142-017-0560-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/03/2017] [Revised: 04/06/2017] [Accepted: 04/17/2017] [Indexed: 10/19/2022]
Abstract
Global warming is a major threat to agriculture and food security, and in many cases the negative impacts are already apparent. Wheat is one of the most important staple food crops and is highly sensitive to heat stress (HS) during the reproductive and grain-filling stages. Here, whole transcriptome analysis of thermotolerant wheat cv. HD2985 was carried out at the post-anthesis stage under control (22 ± 3 °C) and HS-treated (42 °C, 2 h) conditions using Illumina HiSeq and Roche GS-FLX 454 platforms. We assembled ~24 million (control) and ~23 million (HS-treated) high-quality trimmed reads using different assemblers with optimal parameters. De novo assembly yielded 52,567 (control) and 59,658 (HS-treated) unigenes. We observed 785 transcripts to be upregulated and 431 transcripts to be downregulated under HS; 78 transcripts, including HSPs and metabolic pathway-related genes, showed >10-fold upregulation. The largest number of upregulated genes was associated with processes such as HS response, protein folding, oxidation-reduction and photosynthesis. We identified 2,008 and 2,483 simple sequence repeat (SSR) markers from control and HS-treated samples, respectively; 243 SSRs overlapped stress-associated genes. A polymorphism study validated four SSRs as heat-responsive. Expression analysis of the identified differentially expressed transcripts (DETs) showed a very high fold increase in the expression of catalytic chaperones (HSP26, HSP17, and Rca) in the contrasting wheat cvs. HD2985 and HD2329 under HS. We observed a positive correlation between RNA-seq and qRT-PCR expression data. The present study provides a greater understanding of the heat response of a tolerant genotype and good candidate genes for marker development and screening of wheat germplasm for thermotolerance.
|
17
|
Armero A, Baudouin L, Bocs S, This D. Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut. PLoS One 2017; 12:e0173300. [PMID: 28334050 PMCID: PMC5363918 DOI: 10.1371/journal.pone.0173300] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/31/2016] [Accepted: 02/17/2017] [Indexed: 01/20/2023]
Abstract
The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of which derived from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and a substantial number of metabolic pathways related to secondary metabolism. The new sequences found with the BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).
Affiliation(s)
- Alix Armero
  - Montpellier SupAgro, UMR AGAP, Montpellier, France
- Stéphanie Bocs
  - CIRAD, UMR AGAP, Montpellier, France
  - South Green Bioinformatics Platform, Montpellier, France
|
18
|
Loke KK, Rahnamaie-Tajadod R, Yeoh CC, Goh HH, Mohamed-Hussein ZA, Zainal Z, Ismail I, Mohd Noor N. Transcriptome analysis of Polygonum minus reveals candidate genes involved in important secondary metabolic pathways of phenylpropanoids and flavonoids. PeerJ 2017; 5:e2938. [PMID: 28265493 PMCID: PMC5333554 DOI: 10.7717/peerj.2938] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 06/15/2016] [Accepted: 12/23/2016] [Indexed: 12/20/2022]
Abstract
BACKGROUND Polygonum minus is an herbal plant in the Polygonaceae family, which is rich in ethnomedicinal plants. The chemical composition and characteristic pungent fragrance of Polygonum minus have been extensively studied due to its culinary and medicinal properties. There are only a few transcriptome sequences available for species from this important family of medicinal plants. The limited genetic information from the public expressed sequence tag (EST) library hinders further study on molecular mechanisms underlying secondary metabolite production. METHODS In this study, we performed a hybrid assembly of 454 and Illumina sequencing reads from Polygonum minus root and leaf tissues, respectively, to generate a combined transcriptome library as a reference. RESULTS A total of 34.37 million filtered and normalized reads were assembled into 188,735 transcripts with a total length of 136.67 Mbp. We performed a similarity search against all the publicly available genome sequences and found similarity matches for 163,200 (86.5%) of Polygonum minus transcripts, largely from Arabidopsis thaliana (58.9%). Transcript abundance in the leaf and root tissues was estimated and validated through RT-qPCR of seven selected transcripts involved in the biosynthesis of phenylpropanoids and flavonoids. All the transcripts were annotated against KEGG pathways to profile transcripts related to the biosynthesis of secondary metabolites. DISCUSSION This comprehensive transcriptome profile will serve as a useful sequence resource for molecular genetics and evolutionary research on secondary metabolite biosynthesis in the Polygonaceae family. The transcriptome assembly of Polygonum minus can be accessed at http://prims.researchfrontier.org/index.php/dataset/transcriptome.
Affiliation(s)
- Kok-Keong Loke
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Chean-Chean Yeoh
  - School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Hoe-Han Goh
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
  - School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Zamri Zainal
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
  - School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Ismanizan Ismail
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
  - School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Normah Mohd Noor
  - Institute of Systems Biology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
|
19
|
De Oliveira AL, Wollesen T, Kristof A, Scherholz M, Redl E, Todt C, Bleidorn C, Wanninger A. Comparative transcriptomics enlarges the toolkit of known developmental genes in mollusks. BMC Genomics 2016; 17:905. [PMID: 27832738 PMCID: PMC5103448 DOI: 10.1186/s12864-016-3080-9] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 03/02/2016] [Accepted: 09/08/2016] [Indexed: 01/11/2023]
Abstract
BACKGROUND Mollusks display a striking morphological disparity, including, among others, worm-like animals (the aplacophorans), snails and slugs, bivalves, and cephalopods. This phenotypic diversity renders them ideal for studies into animal evolution. Despite being one of the most species-rich phyla, molecular and in silico studies concerning specific key developmental gene families are still scarce, thus hampering deeper insights into the molecular machinery that governs the development and evolution of the various molluscan class-level taxa. RESULTS Next-generation sequencing was used to retrieve transcriptomes of representatives of seven out of the eight recent class-level taxa of mollusks. Similarity searches, phylogenetic inferences, and a detailed manual curation were used to identify and confirm the orthology of numerous molluscan Hox and ParaHox genes, which resulted in a comprehensive catalog that highlights the evolution of these genes in Mollusca and other metazoans. The identification of a specific molluscan motif in the Hox paralog group 5 and a lophotrochozoan ParaHox motif in the Gsx gene is described. Functional analyses using KEGG and GO tools enabled a detailed description of key developmental genes expressed in important pathways such as Hedgehog, Wnt, and Notch during development of the respective species. The KEGG analysis revealed Wnt8, Wnt11, and Wnt16 as Wnt genes hitherto not reported for mollusks, thereby enlarging the known Wnt complement of the phylum. In addition, novel Hedgehog (Hh)-related genes were identified in the gastropod Lottia cf. kogamogai, demonstrating a more complex gene content in this species than in other mollusks. CONCLUSIONS The use of de novo transcriptome assembly and well-designed in silico protocols proved to be a robust approach for surveying and mining large sequence data in a wide range of non-model mollusks. 
The data presented herein constitute only a small fraction of the information retrieved from the analysed molluscan transcriptomes, which can be promptly employed in the identification of novel genes and gene families, phylogenetic inferences, and other studies using molecular tools. As such, our study provides an important framework for understanding some of the underlying molecular mechanisms involved in molluscan body plan diversification and hints towards functions of key developmental genes in molluscan morphogenesis.
Affiliation(s)
- A. L. De Oliveira
  - Department of Integrative Zoology, Faculty of Life Sciences, University of Vienna, Althanstraße 14, Vienna, 1090 Austria
- T. Wollesen
  - Department of Integrative Zoology, Faculty of Life Sciences, University of Vienna, Althanstraße 14, Vienna, 1090 Austria
- A. Kristof
  - Department of Integrative Zoology, Faculty of Life Sciences, University of Vienna, Althanstraße 14, Vienna, 1090 Austria
- M. Scherholz
  - Department of Integrative Zoology, Faculty of Life Sciences, University of Vienna, Althanstraße 14, Vienna, 1090 Austria
- E. Redl
  - Department of Integrative Zoology, Faculty of Life Sciences, University of Vienna, Althanstraße 14, Vienna, 1090 Austria
- C. Todt
  - University of Bergen, University Museum, The Natural History Collections, Allégaten 41, 5007 Bergen, Norway
- C. Bleidorn
  - Museo Nacional de Ciencias Naturales, Spanish National Research Council (CSIC), José Gutiérrez Abascal 2, Madrid, 28006 Spain
  - Institute of Biology, University of Leipzig, Leipzig, 04103 Germany
- A. Wanninger
  - Department of Integrative Zoology, Faculty of Life Sciences, University of Vienna, Althanstraße 14, Vienna, 1090 Austria
|
20
|
Greenwood JM, Ezquerra AL, Behrens S, Branca A, Mallet L. Current analysis of host–parasite interactions with a focus on next generation sequencing data. Zoology 2016; 119:298-306. [DOI: 10.1016/j.zool.2016.06.010] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/07/2015] [Revised: 06/22/2016] [Accepted: 06/22/2016] [Indexed: 01/21/2023]
|
21
|
Faber-Hammond JJ, Brown KH. Anchored pseudo-de novo assembly of human genomes identifies extensive sequence variation from unmapped sequence reads. Hum Genet 2016; 135:727-40. [PMID: 27061184 PMCID: PMC4899208 DOI: 10.1007/s00439-016-1667-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/19/2016] [Accepted: 03/29/2016] [Indexed: 01/08/2023]
Abstract
The completion of the human genome reference (HGR) marked the beginning of the genomics era, yet despite its utility, its universal application is limited by the small number of individuals used in its development. This is highlighted by the presence of high-quality sequence reads that fail to map within the HGR. Sequences failing to map generally represent 2-5% of total reads and may harbor regions that would enhance our understanding of population variation, evolution, and disease. Alternatively, complete de novo assemblies can be created, but these effectively ignore the groundwork of the HGR. In an effort to find a middle ground, we developed a bioinformatic pipeline that maps paired-end reads to the HGR as separate single reads, exports unmappable reads, de novo assembles these reads per individual and then combines the assemblies into a secondary reference assembly used for comparative analysis. Using 45 diverse 1000 Genomes Project individuals, we identified 351,361 contigs covering 195.5 Mb of sequence unincorporated in GRCh38. Of these, 30,879 contigs are represented in multiple individuals, with ~40% showing high sequence complexity. Genomic coordinates were generated for 99.9%, with 52.5% exhibiting high-quality mapping scores. Comparative genomic analyses with archaic humans and primates revealed significant sequence alignments, and comparisons with model organism RefSeq gene datasets identified novel human genes. If incorporated, these sequences will expand the HGR, but more importantly, our data highlight that with this method low-coverage (~10-20×) next-generation sequencing can still be used to identify novel unmapped sequences to explore biological functions contributing to human phenotypic variation, disease and functionality for personal genomic medicine.
Affiliation(s)
- Joshua J Faber-Hammond
  - Department of Biology, Portland State University, 1719 SW 10th Ave., SRTC 246, Portland, 97207-0751, USA
- Kim H Brown
  - Department of Biology, Portland State University, 1719 SW 10th Ave., SRTC 246, Portland, 97207-0751, USA
|
22
|
Vatanparast M, Shetty P, Chopra R, Doyle JJ, Sathyanarayana N, Egan AN. Transcriptome sequencing and marker development in winged bean (Psophocarpus tetragonolobus; Leguminosae). Sci Rep 2016; 6:29070. [PMID: 27356763 PMCID: PMC4928180 DOI: 10.1038/srep29070] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 04/04/2016] [Accepted: 06/14/2016] [Indexed: 01/08/2023] Open
Abstract
Winged bean, Psophocarpus tetragonolobus (L.) DC., is similar to soybean in yield and nutritional value but more viable in tropical conditions. Here, we strengthen genetic resources for this orphan crop by producing a de novo transcriptome assembly and annotation of two Sri Lankan accessions (denoted herein as CPP34 [PI 491423] and CPP37 [PI 639033]), developing simple sequence repeat (SSR) markers, and identifying single nucleotide polymorphisms (SNPs) between geographically separated genotypes. A combined assembly based on 804,757 reads from two accessions produced 16,115 contigs with an N50 of 889 bp, over 90% of which have significant sequence similarity to other legumes. Combining contigs with singletons produced 97,241 transcripts. We identified 12,956 SSRs, including 2,594 repeats for which primers were designed, and 5,190 high-confidence SNPs between Sri Lankan and Nigerian genotypes. The transcriptomic data sets generated here provide new resources for gene discovery and marker development in this orphan crop, and will be vital for future plant breeding efforts. We also analyzed the soybean trypsin inhibitor (STI) gene family, important plant defense genes, in the context of related legumes and found evidence for radiation of the Kunitz trypsin inhibitor (KTI) gene family within winged bean.
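The N50 statistic quoted in this abstract is not defined there. Under the standard definition, it is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. A minimal sketch follows; the function name `n50` and the toy lengths are illustrative assumptions, not data from the cited study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    contig taken is the N50.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    # Toy example: total length is 20, and 6 + 5 = 11 covers half of it.
    print(n50([2, 3, 4, 5, 6]))  # prints 5
```

Because N50 depends only on the length distribution, it says nothing about assembly correctness, which is why assemblies are usually also checked by similarity searches such as those reported in the abstract.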
Affiliation(s)
- Mohammad Vatanparast
  - US National Herbarium (US), Department of Botany, Smithsonian Institution-NMNH, 10th and Constitution Ave, Washington DC, 20013, USA
- Prateek Shetty
  - Department of Plant Biology, Michigan State University, 612 Wilson Road, Room 166, East Lansing, MI, 48824, USA
- Ratan Chopra
  - United States Department of Agriculture, Agriculture Research Service, 3810 4th St., Lubbock, TX, 79415, USA
- Jeff J Doyle
  - Section of Plant Breeding & Genetics, School of Integrative Plant Science, Cornell University, 412 Mann Library, Ithaca, NY, 14853, USA
- N Sathyanarayana
  - Department of Botany, Sikkim University, 5th Mile, Tadong, Gangtok, Sikkim, 737102, India
- Ashley N Egan
  - US National Herbarium (US), Department of Botany, Smithsonian Institution-NMNH, 10th and Constitution Ave, Washington DC, 20013, USA
|
23
|
|
24
|
Bushmanova E, Antipov D, Lapidus A, Suvorov V, Prjibelski AD. rnaQUAST: a quality assessment tool for de novo transcriptome assemblies. Bioinformatics 2016; 32:2210-2. [PMID: 27153654 DOI: 10.1093/bioinformatics/btw218] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Received: 06/02/2015] [Accepted: 04/18/2016] [Indexed: 11/14/2022] Open
Abstract
The ability to generate large RNA-Seq datasets created a demand for both de novo and reference-based transcriptome assemblers. However, while many transcriptome assemblers are now available, there is still no unified quality assessment tool for RNA-Seq assemblies. We present rnaQUAST, a tool for evaluating RNA-Seq assembly quality and benchmarking transcriptome assemblers using a reference genome and gene database. rnaQUAST calculates various metrics that demonstrate completeness and correctness levels of the assembled transcripts, and outputs them in a user-friendly report. Availability and implementation: rnaQUAST is implemented in Python and is freely available at http://bioinf.spbau.ru/en/rnaquast. Contact: ap@bioinf.spbau.ru. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Elena Bushmanova
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Dmitry Antipov
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Alla Lapidus
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
  - Algorithmic Biology Lab, St. Petersburg Academic University, St. Petersburg, Russia
- Vladimir Suvorov
  - Research and Development Department, EMC, St. Petersburg, Russia
- Andrey D Prjibelski
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
  - Algorithmic Biology Lab, St. Petersburg Academic University, St. Petersburg, Russia
|
25
|
Faber-Hammond JJ, Brown KH. Pseudo-De Novo Assembly and Analysis of Unmapped Genome Sequence Reads in Wild Zebrafish Reveal Novel Gene Content. Zebrafish 2016; 13:95-102. [PMID: 26886859 DOI: 10.1089/zeb.2015.1154] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022] Open
Abstract
Zebrafish represents the third vertebrate with an officially completed genome, yet it remains incomplete with additions and corrections continuing with the current release, GRCz10, having 13% of zebrafish cDNA sequences unmapped. This disparity may result from population differences, given that the genome reference was generated from clonal individuals with limited genetic diversity. This is supported by the recent analysis of a single wild zebrafish, which identified over 5.2 million SNPs and 1.6 million in/dels in the previous genome build, zv9. Re-examination of this sequence data set indicated that 13.8% of quality sequence reads failed to align to GRCz10. Using a novel bioinformatics de novo assembly pipeline on these unmappable reads, we identified 1,514,491 novel contigs covering ∼224 Mb of genomic sequence. Among these, 1083 contigs were found to contain a potential gene coding sequence. RNA-seq data comparison confirmed that 362 contigs contained a transcribed DNA sequence, suggesting that a large amount of functional genomic sequence remains unannotated in the zebrafish reference genome. By utilizing the bioinformatics pipeline developed in this study, the zebrafish genome will be bolstered as a model for human disease research. Adaptation of the pipeline described here also offers a cost-efficient and effective method to identify and map novel genetic content across any genome and will ultimately aid in the completion of additional genomes for a broad range of species.
Affiliation(s)
- Kim H Brown
  - Department of Biology, Portland State University, Portland, Oregon
|
26
|
Perera OP, Walsh TK, Luttrell RG. Complete Mitochondrial Genome of Helicoverpa zea (Lepidoptera: Noctuidae) and Expression Profiles of Mitochondrial-Encoded Genes in Early and Late Embryos. J Insect Sci 2016; 16:iew023. [PMID: 27126963 PMCID: PMC4864584 DOI: 10.1093/jisesa/iew023] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/19/2015] [Accepted: 03/06/2016] [Indexed: 05/06/2023]
Abstract
The mitochondrial genome (mitogenome) of the bollworm, Helicoverpa zea (Boddie), was assembled using paired-end nucleotide sequence reads generated with a next-generation sequencing platform. Assembly resulted in a mitogenome of 15,348 bp with greater than 17,000-fold average coverage. Organization of the H. zea mitogenome (gene order and orientation) was identical to other known lepidopteran mitogenome sequences. Compared with Helicoverpa armigera (Hübner) mitogenome, there were a few differences in the lengths of gaps between genes, but the lengths of nucleotide overlaps were essentially conserved between the two species. Nucleotide composition of the H. zea mitochondrial genome was very similar to those of the related species H. armigera and Helicoverpa punctigera Wallengren. Mapping of RNA-Seq reads obtained from 2-h eggs and 48-h embryos to protein coding genes (PCG) revealed that all H. zea PCGs were processed as single mature gene transcripts except for the bicistronic atp8 + atp6 transcript. A tRNA-like sequence predicted to form a hammer-head-like secondary structure that may play a role in transcription start and mitogenome replication was identified within the control region of the H. zea mitogenome. Similar structures were also found within the control regions of several other lepidopteran species. Expression analysis revealed significant differences in levels of expression of PCGs within each developmental stage, but the pattern of variation was similar in both developmental stages analyzed in this study. Mapping of RNA-Seq reads to PCG transcripts also identified transcription termination and polyadenylation sites that differed from the sites described in other lepidopteran species.
Affiliation(s)
- Omaththage P Perera
  - Southern Insect Management Research Unit, USDA-ARS, Stoneville, MS 38776
- Thomas K Walsh
  - Land and Water Flagship, Commonwealth Scientific and Industrial Research Organization, Clunies Ross Street, GPO Box 1700, Canberra, ACT 2601, Australia
- Randall G Luttrell
  - Southern Insect Management Research Unit, USDA-ARS, Stoneville, MS 38776
|
27
|
Feldmeyer B, Greshake B, Funke E, Ebersberger I, Pfenninger M. Positive selection in development and growth rate regulation genes involved in species divergence of the genus Radix. BMC Evol Biol 2015; 15:164. [PMID: 26281847 PMCID: PMC4539673 DOI: 10.1186/s12862-015-0434-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/27/2015] [Accepted: 07/24/2015] [Indexed: 01/09/2023] Open
Abstract
Background: Life history traits like developmental time, age and size at maturity are directly related to fitness in all organisms and play a major role in adaptive evolution and speciation processes. Comparative genomic or transcriptomic approaches to identify positively selected genes involved in species divergence can help to generate hypotheses on the driving forces behind speciation. Here we use a bottom-up approach to investigate this hypothesis by comparative analysis of orthologous transcripts of four closely related European Radix species. Results: Snails of the genus Radix occupy species specific distribution ranges with distinct climatic niches, indicating a potential for natural selection driven speciation based on ecological niche differentiation. We then inferred phylogenetic relationships among the four Radix species based on whole mt-genomes plus 23 nuclear loci. Three different tests to infer selection and changes in amino acid properties yielded a total of 134 genes with signatures of positive selection. The majority of these genes belonged to the functional gene ontology categories “reproduction” and “genitalia” with an overrepresentation of the functions “development” and “growth rate”. Conclusions: We show here that Radix species divergence may be primarily enforced by selection on life history traits such as (larval-) development and growth rate. We thus hypothesise that life history differences may confer advantages under the according climate regimes, e.g., species occupying warmer and dryer habitats might have a fitness advantage with fast developing susceptible life stages, which are more tolerant to habitat desiccation. Electronic supplementary material: The online version of this article (doi:10.1186/s12862-015-0434-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Barbara Feldmeyer
  - Molecular Ecology Group, Senckenberg Biodiversity and Climate Research Centre (BiK-F), Georg-Voigt-Str. 14-16, Frankfurt am Main, 60325, Germany
  - Evolutionary Biology, Johannes Gutenberg University Mainz, Müllerweg 6, Mainz, 55099, Germany
- Bastian Greshake
  - Molecular Ecology Group, Senckenberg Biodiversity and Climate Research Centre (BiK-F), Georg-Voigt-Str. 14-16, Frankfurt am Main, 60325, Germany
  - Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Max-von-Laue-Str. 13, Frankfurt am Main, 60438, Germany
- Elisabeth Funke
  - Molecular Ecology Group, Senckenberg Biodiversity and Climate Research Centre (BiK-F), Georg-Voigt-Str. 14-16, Frankfurt am Main, 60325, Germany
- Ingo Ebersberger
  - Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Max-von-Laue-Str. 13, Frankfurt am Main, 60438, Germany
- Markus Pfenninger
  - Molecular Ecology Group, Senckenberg Biodiversity and Climate Research Centre (BiK-F), Georg-Voigt-Str. 14-16, Frankfurt am Main, 60325, Germany
|
28
|
Miguel A, de Vega-Bartol J, Marum L, Chaves I, Santo T, Leitão J, Varela MC, Miguel CM. Characterization of the cork oak transcriptome dynamics during acorn development. BMC Plant Biol 2015; 15:158. [PMID: 26109289 PMCID: PMC4479327 DOI: 10.1186/s12870-015-0534-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/20/2015] [Accepted: 05/26/2015] [Indexed: 05/11/2023]
Abstract
Background: Cork oak (Quercus suber L.) has a natural distribution across western Mediterranean regions and is a keystone forest tree species in these ecosystems. The fruiting phase is especially critical for its regeneration but the molecular mechanisms underlying the biochemical and physiological changes during cork oak acorn development are poorly understood. In this study, the transcriptome of the cork oak acorn, including the seed, was characterized in five stages of development, from early development to acorn maturation, to identify the dominant processes in each stage and reveal transcripts with important functions in gene expression regulation and response to water. Results: A total of 80,357 expressed sequence tags (ESTs) were de novo assembled from RNA-Seq libraries representative of the several acorn developmental stages. Approximately 7.6% of the total number of transcripts present in Q. suber transcriptome was identified as acorn specific. The analysis of expression profiles during development returned 2,285 differentially expressed (DE) transcripts, which were clustered into six groups. The stage of development corresponding to the mature acorn exhibited an expression profile markedly different from other stages. Approximately 22% of the DE transcripts putatively code for transcription factors (TF) or transcriptional regulators, and were found almost equally distributed among the several expression profile clusters, highlighting their major roles in controlling the whole developmental process. On the other hand, carbohydrate metabolism, the biological pathway most represented during acorn development, was especially prevalent in mid to late stages as evidenced by enrichment analysis. We further show that genes related to response to water, water deprivation and transport were mostly represented during the early (S2) and the last stage (S8) of acorn development, when tolerance to water desiccation is possibly critical for acorn viability.
Conclusions: To our knowledge this work represents the first report of acorn development transcriptomics in oaks. The obtained results provide novel insights into the developmental biology of cork oak acorns, highlighting transcripts putatively involved in the regulation of the gene expression program and in specific processes likely essential for adaptation. It is expected that this knowledge can be transferred to other oak species of great ecological value.
Affiliation(s)
- Andreia Miguel
  - Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Avenida da República, 2780-157, Oeiras, Portugal
- José de Vega-Bartol
  - Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Avenida da República, 2780-157, Oeiras, Portugal
  - The Genome Analysis Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- Liliana Marum
  - Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Avenida da República, 2780-157, Oeiras, Portugal
  - KLÓN, Innovative Technologies from Cloning, Biocant Park, Núcleo 4, Lote 4A, 3060-197, Cantanhede, Portugal
- Inês Chaves
  - Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Avenida da República, 2780-157, Oeiras, Portugal
- Tatiana Santo
  - Laboratory of Genomics and Genetic Improvement, BioFIG, FCT, Universidade do Algarve, E.8, Campus de Gambelas, Faro, 8300, Portugal
- José Leitão
  - Laboratory of Genomics and Genetic Improvement, BioFIG, FCT, Universidade do Algarve, E.8, Campus de Gambelas, Faro, 8300, Portugal
- Maria Carolina Varela
  - INIAV - Instituto Nacional de Investigação Agrária e Veterinária, IP, Quinta do Marquês, Oeiras, 2780-159, Portugal
- Célia M Miguel
  - Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Avenida da República, 2780-157, Oeiras, Portugal
|
29
|
Gorson J, Ramrattan G, Verdes A, Wright EM, Kantor Y, Rajaram Srinivasan R, Musunuri R, Packer D, Albano G, Qiu WG, Holford M. Molecular Diversity and Gene Evolution of the Venom Arsenal of Terebridae Predatory Marine Snails. Genome Biol Evol 2015; 7:1761-78. [PMID: 26025559 PMCID: PMC4494067 DOI: 10.1093/gbe/evv104] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 11/16/2022] Open
Abstract
Venom peptides from predatory organisms are a resource for investigating evolutionary processes such as adaptive radiation or diversification, and exemplify promising targets for biomedical drug development. Terebridae are an understudied lineage of conoidean snails, which also includes cone snails and turrids. Characterization of cone snail venom peptides, conotoxins, has revealed a cocktail of bioactive compounds used to investigate physiological cellular function, predator-prey interactions, and to develop novel therapeutics. However, venom diversity of other conoidean snails remains poorly understood. The present research applies a venomics approach to characterize novel terebrid venom peptides, teretoxins, from the venom gland transcriptomes of Triplostephanus anilis and Terebra subulata. Next-generation sequencing and de novo assembly identified 139 putative teretoxins that were analyzed for the presence of canonical peptide features as identified in conotoxins. To meet the challenges of de novo assembly, multiple approaches for cross validation of findings were performed to achieve reliable assemblies of venom duct transcriptomes and to obtain a robust portrait of Terebridae venom. Phylogenetic methodology was used to identify 14 teretoxin gene superfamilies for the first time, 13 of which are unique to the Terebridae. Additionally, Basic Local Alignment Search Tool (BLAST) homology-based searches against venom-related genes and posttranslational modification enzymes identified a convergence of certain venom proteins, such as actinoporin, commonly found in venoms. This research provides novel insights into venom evolution and recruitment in conoidean predatory marine snails and identifies a plethora of terebrid venom peptides that can be used to investigate fundamental questions pertaining to gene evolution.
Affiliation(s)
- Juliette Gorson
  - Hunter College and The Graduate Center, City University of New York
  - Invertebrate Zoology, Sackler Institute for Comparative Genomics, American Museum of Natural History, New York
- Girish Ramrattan
  - Hunter College and The Graduate Center, City University of New York
- Aida Verdes
  - Hunter College and The Graduate Center, City University of New York
  - Invertebrate Zoology, Sackler Institute for Comparative Genomics, American Museum of Natural History, New York
- Elizabeth M Wright
  - Hunter College and The Graduate Center, City University of New York
  - Invertebrate Zoology, Sackler Institute for Comparative Genomics, American Museum of Natural History, New York
- Yuri Kantor
  - A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russia
  - Visiting Professor, Muséum National d'Histoire Naturelle, Paris, France
- Raj Musunuri
  - Department of Bioinformatics, New York University Polytechnic School of Engineering
- Daniel Packer
  - Hunter College and The Graduate Center, City University of New York
- Gabriel Albano
  - Estação de Biologia Marítima da Inhaca (EBMI), Faculdade de Ciencias, Universidade Eduardo Mondlane, Distrito Municipal KaNyaka, Maputo, Mozambique
- Wei-Gang Qiu
  - Hunter College and The Graduate Center, City University of New York
- Mandë Holford
  - Hunter College and The Graduate Center, City University of New York
  - Invertebrate Zoology, Sackler Institute for Comparative Genomics, American Museum of Natural History, New York
|
30
|
Powell D, Knibb W, Remilton C, Elizur A. De-novo transcriptome analysis of the banana shrimp (Fenneropenaeus merguiensis) and identification of genes associated with reproduction and development. Mar Genomics 2015; 22:71-8. [PMID: 25936497 DOI: 10.1016/j.margen.2015.04.006] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 02/19/2015] [Revised: 04/20/2015] [Accepted: 04/20/2015] [Indexed: 01/07/2023]
Abstract
The banana shrimp Fenneropenaeus merguiensis is a commercially important marine crustacean for world aquaculture and fisheries. Despite this, limited genetic information is available for it and many other penaeid shrimp species. Here we present the first in-depth analysis of the transcriptional content of 8 different tissues from the banana shrimp using RNA-Seq technologies. A total of over 1 million single-end and over 49 million paired-end reads were obtained from Roche 454FLX and Illumina sequencing platforms, respectively, resulting in an assembly of 124,631 transcripts with an N50 of 1,332 and mean length of 514 nt. A total of 59,179 putative protein sequences obtained from the assembled transcripts were annotated using public protein sequence databases and assigned 20,430 BLAST hits, 16,866 GO terms and 13,304 KOG categories. Further analysis revealed a rich set of transcript sequences exhibiting homology with genes associated with reproduction, sex determination and development and distinguished the tissues responsible for this expression. This report adds a substantial contribution to the sequence data currently available for F. merguiensis, providing valuable resources for further research.
Affiliation(s)
- Daniel Powell
  - Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore, QLD 4558, Australia
- Wayne Knibb
  - Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore, QLD 4558, Australia
- Abigail Elizur
  - Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore, QLD 4558, Australia
|
31
|
Hebert FO, Phelps L, Samonte I, Panchal M, Grambauer S, Barber I, Kalbe M, Landry CR, Aubin-Horth N. Identification of candidate mimicry proteins involved in parasite-driven phenotypic changes. Parasit Vectors 2015; 8:225. [PMID: 25888917 PMCID: PMC4407394 DOI: 10.1186/s13071-015-0834-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 03/10/2015] [Accepted: 03/29/2015] [Indexed: 11/10/2022] Open
Abstract
Background: Endoparasites with complex life cycles are faced with several biological challenges, as they need to occupy various ecological niches throughout their development. Host phenotypes that increase the parasite's transmission rate to the next host have been extensively described, but few mechanistic explanations have been proposed to describe their proximate causes. In this study we explore the possibility that host phenotypic changes are triggered by the production of mimicry proteins from the parasite by using an ecological model system consisting of the infection of the threespine stickleback (Gasterosteus aculeatus) by the cestode Schistocephalus solidus. Method: Using RNA-seq data, we assembled 9,093 protein-coding genes from which ORFs were predicted to generate a reference proteome. Based on a previously published method, we built two complementary analysis pipelines to i) establish a general classification of protein similarity among various species (pipeline A) and ii) identify candidate mimicry proteins showing specific host-parasite similarities (pipeline B), a key feature underlying the possibility of molecular mimicry. Results: Ninety-four tapeworm proteins showed high local sequence homology with stickleback proteins. Four of these candidates correspond to secreted or membrane proteins that could be produced by the parasite and eventually be released in or be in contact with the host to modulate physiological pathways involved in various phenotypes (e.g. behaviors). One of these candidates belongs to the Wnt family, a large group of signaling molecules involved in cell-to-cell interactions and various developmental pathways. The three other candidates are involved in ion transport and post-translational protein modifications. We further confirmed that these four candidates are expressed in three different developmental stages of the cestode by RT-PCR, including the stages found in the host.
Conclusion: In this study, we identified mimicry candidate peptides from a behavior-altering cestode showing specific sequence similarity with host proteins. Despite their potential role in modulating host pathways that could lead to parasite-induced phenotypic changes and despite our confirmation that they are expressed in the developmental stage corresponding to the altered host behavior, further investigations will be needed to confirm their mechanistic role in the molecular cross-talk taking place between S. solidus and the threespine stickleback.
Affiliation(s)
- Francois Olivier Hebert
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Département de Biologie, Université Laval, Pavillon Charles-Eugène-Marchand, Québec, G1V 0A6, Canada
- Luke Phelps
  - Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str 2, 24306, Ploen, Germany
- Irene Samonte
  - Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str 2, 24306, Ploen, Germany
- Mahesh Panchal
  - Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str 2, 24306, Ploen, Germany
- Stephan Grambauer
  - Department of Biology, Adrian Building, Leicester University, University Road, Leicester, LE1 7RH, UK
- Iain Barber
  - Department of Biology, Adrian Building, Leicester University, University Road, Leicester, LE1 7RH, UK
- Martin Kalbe
  - Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str 2, 24306, Ploen, Germany
- Christian R Landry
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Département de Biologie, Université Laval, Pavillon Charles-Eugène-Marchand, Québec, G1V 0A6, Canada
- Nadia Aubin-Horth
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Département de Biologie, Université Laval, Pavillon Charles-Eugène-Marchand, Québec, G1V 0A6, Canada
|
32
|
Meyer B, Martini P, Biscontin A, De Pittà C, Romualdi C, Teschke M, Frickenhaus S, Harms L, Freier U, Jarman S, Kawaguchi S. Pyrosequencing and de novo assembly of Antarctic krill (Euphausia superba) transcriptome to study the adaptability of krill to climate-induced environmental changes. Mol Ecol Resour 2015; 15:1460-71. [PMID: 25818178 PMCID: PMC4672718 DOI: 10.1111/1755-0998.12408] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 06/16/2014] [Revised: 03/13/2015] [Accepted: 03/18/2015] [Indexed: 11/28/2022]
Abstract
The Antarctic krill, Euphausia superba, has a key position in the Southern Ocean food web by serving as direct link between primary producers and apex predators. The south-west Atlantic sector of the Southern Ocean, where the majority of the krill population is located, is experiencing one of the most profound environmental changes worldwide. Up to now, we have only cursory information about krill’s genomic plasticity to cope with the ongoing environmental changes induced by anthropogenic CO2 emission. The genome of krill is not yet available due to its large size (about 48 Gbp). Here, we present two cDNA normalized libraries from whole krill and krill heads sampled in different seasons that were combined with two data sets of krill transcriptome projects, already published, to produce the first knowledgebase krill ‘master’ transcriptome. The new library produced 25% more E. superba transcripts and now includes nearly all the enzymes involved in the primary oxidative metabolism (Glycolysis, Krebs cycle and oxidative phosphorylation) as well as all genes involved in glycogenesis, glycogen breakdown, gluconeogenesis, fatty acid synthesis and fatty acids β-oxidation. With these features, the ‘master’ transcriptome provides the most complete picture of metabolic pathways in Antarctic krill and will provide a major resource for future physiological and molecular studies. This will be particularly valuable for characterizing the molecular networks that respond to stressors caused by the anthropogenic CO2 emissions and krill’s capacity to cope with the ongoing environmental changes in the Atlantic sector of the Southern Ocean.
Affiliation(s)
- B Meyer
  - Section Polar Biological Oceanography, Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Am Handelshafen 12, 27570, Bremerhaven, Germany
  - Institute for Chemistry and Biology of the Marine Environment, Carl von Ossietzky University of Oldenburg, Carl-von-Ossietzky-Straße 9-11, 26111, Oldenburg, Germany
- P Martini
  - Dipartimento di Biologia, Università degli Studi di Padova, via U. Bassi, 58/B, 35131, Padova, Italy
- A Biscontin
  - Dipartimento di Biologia, Università degli Studi di Padova, via U. Bassi, 58/B, 35131, Padova, Italy
- C De Pittà
  - Dipartimento di Biologia, Università degli Studi di Padova, via U. Bassi, 58/B, 35131, Padova, Italy
- C Romualdi
  - Dipartimento di Biologia, Università degli Studi di Padova, via U. Bassi, 58/B, 35131, Padova, Italy
- M Teschke
  - Section Polar Biological Oceanography, Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Am Handelshafen 12, 27570, Bremerhaven, Germany
- S Frickenhaus
  - Section Scientific Computing, Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Am Handelshafen 12, 27570, Bremerhaven, Germany
  - Hochschule Bremerhaven, An der Karlstadt 8, 27568, Bremerhaven, Germany
- L Harms
  - Section Scientific Computing, Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Am Handelshafen 12, 27570, Bremerhaven, Germany
- U Freier
  - SC-Scientific Consulting, Münchener Str. 41a, D-41472, Neuss, Germany
- S Jarman
  - Australian Antarctic Division, Kingston, Tas., 7050, Australia
- S Kawaguchi
  - Australian Antarctic Division, Kingston, Tas., 7050, Australia
|
33
|
Foissac S, Sammeth M. Analysis of alternative splicing events in custom gene datasets by AStalavista. Methods Mol Biol 2015; 1269:379-92. [PMID: 25577392 DOI: 10.1007/978-1-4939-2291-8_24] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 05/21/2023]
Abstract
Alternative splicing (AS) is a eukaryotic principle to derive more than one RNA product from transcribed genes by removing distinct subsets of introns from a premature polymer. We know today that this process is highly regulated and makes up a large part of the differences between species, cell types, and states. The key to comparing AS across different genes or organisms is to tokenize the AS phenomenon into atomic units, so-called AS events. These events are then usually grouped by common patterns to investigate the underlying molecular mechanisms that drive their regulation. However, attempts to decompose loci with AS observations into events are often hampered by applying a limited set of a priori defined event patterns, which are incapable of describing all AS configurations and therefore cannot decompose the phenomenon exhaustively. In this chapter, we describe working scenarios of AStalavista, a computational method that reports all AS events reflected by transcript annotations. We show how to practically employ AStalavista to study AS variation in complex transcriptomes, as characterized by the human GENCODE annotation. Our examples demonstrate how the inherent and universal AStalavista paradigm allows for an automatic delineation of AS events in custom gene datasets. Additionally, we sketch an example of an AStalavista use case including next-generation sequencing data (RNA-Seq) to enrich the landscape of discovered AS events.
Affiliation(s)
- Sylvain Foissac
- UMR1388 GenPhySE, French National Institute for Agricultural Research (INRA), Chemin de Borde Rouge, 31326, Castanet Tolosan, France
|
34
|
Li B, Fillmore N, Bai Y, Collins M, Thomson JA, Stewart R, Dewey CN. Evaluation of de novo transcriptome assemblies from RNA-Seq data. Genome Biol 2014. [PMID: 25608678 DOI: 10.1101/006338] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/08/2023] Open
Abstract
De novo RNA-Seq assembly facilitates the study of transcriptomes for species without sequenced genomes, but it is challenging to select the most accurate assembly in this context. To address this challenge, we developed a model-based score, RSEM-EVAL, for evaluating assemblies when the ground truth is unknown. We show that RSEM-EVAL correctly reflects assembly accuracy, as measured by REF-EVAL, a refined set of ground-truth-based scores that we also developed. Guided by RSEM-EVAL, we assembled the transcriptome of the regenerating axolotl limb; this assembly compares favorably to a previous assembly. A software package implementing our methods, DETONATE, is freely available at http://deweylab.biostat.wisc.edu/detonate.
|
35
|
Li B, Fillmore N, Bai Y, Collins M, Thomson JA, Stewart R, Dewey CN. Evaluation of de novo transcriptome assemblies from RNA-Seq data. Genome Biol 2014; 15:553. [PMID: 25608678 PMCID: PMC4298084 DOI: 10.1186/s13059-014-0553-5] [Citation(s) in RCA: 196] [Impact Index Per Article: 19.6] [Received: 07/06/2014] [Accepted: 10/30/2014] [Indexed: 01/16/2023] Open
Abstract
De novo RNA-Seq assembly facilitates the study of transcriptomes for species without sequenced genomes, but it is challenging to select the most accurate assembly in this context. To address this challenge, we developed a model-based score, RSEM-EVAL, for evaluating assemblies when the ground truth is unknown. We show that RSEM-EVAL correctly reflects assembly accuracy, as measured by REF-EVAL, a refined set of ground-truth-based scores that we also developed. Guided by RSEM-EVAL, we assembled the transcriptome of the regenerating axolotl limb; this assembly compares favorably to a previous assembly. A software package implementing our methods, DETONATE, is freely available at http://deweylab.biostat.wisc.edu/detonate.
|
36
|
Archer J, Whiteley G, Casewell NR, Harrison RA, Wagstaff SC. VTBuilder: a tool for the assembly of multi isoform transcriptomes. BMC Bioinformatics 2014; 15:389. [PMID: 25465054 PMCID: PMC4260244 DOI: 10.1186/s12859-014-0389-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 03/03/2014] [Accepted: 11/19/2014] [Indexed: 01/10/2023] Open
Abstract
Background Within many research areas, such as transcriptomics, the millions of short DNA fragments (reads) produced by current sequencing platforms need to be assembled into transcript sequences before they can be utilized. Despite recent advances in assembly software, creating such transcripts from read data harboring isoform variation remains challenging. This is because current approaches fail to identify all variants present or they create chimeric transcripts within which relationships between co-evolving sites and other evolutionary factors are disrupted. We present VTBuilder, a tool for constructing non-chimeric transcripts from read data that has been sequenced from sources containing isoform complexity.
Results We validated VTBuilder using reads simulated from 54 Sanger sequenced transcripts (SSTs) expressed in the venom gland of the saw-scaled viper, Echis ocellatus. The SSTs were selected to represent genes from major co-expressed toxin groups known to harbor isoform variants. From the simulated reads, VTBuilder constructed 55 transcripts, 50 of which had a greater than 99% sequence similarity to 48 of the SSTs. In contrast, using the popular assembler tool Trinity (r2013-02-25), only 14 transcripts were constructed with a similar level of sequence identity to just 11 SSTs. Furthermore, VTBuilder produced transcripts with a similar length distribution to the SSTs, while those produced by Trinity were considerably shorter. To demonstrate that our approach can be scaled to real-world data, we assembled the venom gland transcriptome of the African puff adder Bitis arietans using paired-end reads sequenced on Illumina’s MiSeq platform. VTBuilder constructed 1481 transcripts from 5 million reads and, following annotation, all major toxin genes were recovered, demonstrating reconstruction of complex underlying sequence and isoform diversity.
Conclusion Unlike other approaches, VTBuilder strives to maintain the relationships between co-evolving sites within the constructed transcripts, and thus increases transcript utility for a wide range of research areas ranging from transcriptomics to phylogenetics and including the monitoring of drug resistant parasite populations. Additionally, improving the quality of transcripts assembled from read data will have an impact on future studies that query these data. VTBuilder has been implemented in Java and is available, under the GPL GPU V0.3 license, from http://www.lstmed.ac.uk/vtbuilder.
Affiliation(s)
- John Archer
- Department of Parasitology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA2, UK.
- Gareth Whiteley
- Department of Parasitology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA2, UK.
- Nicholas R Casewell
- Department of Parasitology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA2, UK.
- Robert A Harrison
- Department of Parasitology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA2, UK.
- Simon C Wagstaff
- Department of Parasitology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA2, UK.
|
37
|
Zhang W, Tian D, Huang X, Xu Y, Mo H, Liu Y, Meng J, Zhang D. Characterization of flower-bud transcriptome and development of genic SSR markers in Asian lotus (Nelumbo nucifera Gaertn.). PLoS One 2014; 9:e112223. [PMID: 25379700 PMCID: PMC4224446 DOI: 10.1371/journal.pone.0112223] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/27/2014] [Accepted: 10/10/2014] [Indexed: 01/01/2023] Open
Abstract
Background Asian lotus (Nelumbo nucifera Gaertn.) is the national flower of India and Vietnam, and one of the top ten traditional Chinese flowers. Although lotus is highly valued for its ornamental, economic and cultural uses, genomic information, particularly on expressed-sequence-based (genic) markers, is limited. High-throughput transcriptome sequencing provides large amounts of transcriptome data for promoting gene discovery and development of molecular markers.
Results In this study, 68,593 unigenes were assembled from 1.34 million 454 GS-FLX sequence reads of a mixed flower-bud cDNA pool derived from three accessions of N. nucifera. A total of 5,226 SSR loci were identified, and 3,059 primer pairs were designed for marker development. Di-nucleotide repeat motifs were the most abundant type identified, with a frequency of 65.2%, followed by tri- (31.7%), tetra- (2.1%), penta- (0.5%) and hexa-nucleotide repeats (0.5%). A total of 575 primer pairs were synthesized, of which 514 (89.4%) yielded PCR amplification products. In eight Nelumbo accessions, 109 markers were polymorphic. They were used to genotype a sample of 44 accessions representing diverse wild and cultivated genotypes of Nelumbo. The number of alleles per locus varied from 2 to 9, and the polymorphism information content values ranged from 0.6 to 0.9. We performed genetic diversity analysis using the 109 polymorphic markers. A UPGMA dendrogram was constructed based on Jaccard’s similarity coefficients, revealing distinct clusters among the 44 accessions.
Conclusions Deep transcriptome sequencing of lotus flower buds developed 3,059 genic SSRs, making a significant addition to the existing SSR markers in lotus. Among them, 109 polymorphic markers were successfully validated in 44 accessions of Nelumbo. This comprehensive set of genic SSR markers developed in our study will facilitate analyses of genetic diversity, construction of linkage maps, gene mapping, and marker-assisted selection breeding for lotus.
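The repeat-class frequencies reported in this abstract (di- through hexa-nucleotide motifs) come from scanning assembled sequences for tandem repeats. The Python sketch below illustrates one way such a scan could look. It is not the authors' pipeline: the names `find_ssrs` and `class_percentages`, the `MIN_REPEATS` thresholds and the example sequence are all assumptions made for illustration, since the abstract does not state the search criteria.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per motif length (hypothetical thresholds;
# the abstract does not give the criteria actually used).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT, AAA).
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


def class_percentages(hits):
    """Percentage of SSRs in each motif-length class (2 = di, 3 = tri, ...)."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in counts.items()}


# Hypothetical example sequence containing one (AT)7 and one (CAG)5 repeat.
example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "T"
ssrs = find_ssrs(example)
print(ssrs)
print(class_percentages(ssrs))
```

Dedicated SSR-mining tools also handle compound and imperfect repeats, which this sketch deliberately ignores.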
Affiliation(s)
- Weiwei Zhang
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
- Daike Tian
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
- Xiu Huang
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
- Yuxian Xu
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
- College of Life and Environmental Sciences, Shanghai Normal University, Shanghai, China
- Haibo Mo
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
- Yanbo Liu
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
- College of Horticulture, Northeast Agricultural University, Harbin, China
- Jing Meng
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
- Dasheng Zhang
- Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Chenshan Botanical Garden, Shanghai, China
|
38
|
Marchant A, Mougel F, Almeida C, Jacquin-Joly E, Costa J, Harry M. De novo transcriptome assembly for a non-model species, the blood-sucking bug Triatoma brasiliensis, a vector of Chagas disease. Genetica 2014; 143:225-39. [PMID: 25233990 DOI: 10.1007/s10709-014-9790-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 04/30/2014] [Accepted: 09/01/2014] [Indexed: 11/29/2022]
Abstract
High throughput sequencing (HTS) provides new research opportunities for work on non-model organisms, such as differential expression studies between populations exposed to different environmental conditions. However, such transcriptomic studies first require the production of a reference assembly. The choice of sampling procedure, sequencing strategy and assembly workflow is crucial. To develop a reliable reference transcriptome for Triatoma brasiliensis, the major Chagas disease vector in Northeastern Brazil, different de novo assembly protocols were generated using various datasets and software. Both 454 and Illumina sequencing technologies were applied to RNA extracted from antennae and mouthparts from single or pooled individuals. The 454 library yielded 278 Mb. Fifteen Illumina libraries were constructed and yielded nearly 360 million RNA-seq single reads and 46 million RNA-seq paired-end reads for nearly 45 Gb. For the 454 reads, we used three assemblers, Newbler, CAP3 and/or MIRA, and for the Illumina reads, the Trinity assembler. Ten assembly workflows were compared using these programs separately or in combination. To compare the assemblies obtained, quantitative and qualitative criteria were used, including contig length, N50, contig number and the percentage of chimeric contigs. Completeness of the assemblies was estimated using the CEGMA pipeline. The best assembly (57,657 contigs, completeness of 80%, <1% chimeric contigs) was a hybrid assembly, leading us to recommend (1) using a single individual with a large representation of biological tissues, (2) merging long reads with short paired-end Illumina reads, and (3) using several assemblers in order to combine the specific advantages of each.
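Of the comparison criteria listed in this abstract, N50 is the least self-explanatory: it is the contig length at which contigs of that length or longer contain at least half of the total assembled bases. The Python sketch below shows the standard calculation. The function name `n50` and the example lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in base pairs.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

N50 rewards long contigs but does not measure correctness, which is presumably why the abstract pairs it with chimera rate and CEGMA completeness.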
Affiliation(s)
- A Marchant
- Laboratoire Evolution, Génomes et Spéciation LEGS, UPR 9034, CNRS, Avenue de la Terrasse, Bâtiment 13, BP1, 91198, Gif-sur-Yvette, France
|
39
|
Hook SE, Twine NA, Simpson SL, Spadaro DA, Moncuquet P, Wilkins MR. 454 pyrosequencing-based analysis of gene expression profiles in the amphipod Melita plumulosa: transcriptome assembly and toxicant induced changes. Aquat Toxicol 2014; 153:73-88. [PMID: 24434169 DOI: 10.1016/j.aquatox.2013.11.022] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 06/23/2013] [Revised: 11/26/2013] [Accepted: 11/28/2013] [Indexed: 05/20/2023]
Abstract
Next generation sequencing using Roche's 454 pyrosequencing platform can be used to generate genomic information for non-model organisms, although there are bioinformatic challenges associated with these studies. These challenges are compounded by a lack of a standardized protocol to either assemble data or to evaluate the quality of a de novo transcriptome. This study presents an assembly of the control and toxicant responsive transcriptome of Melita plumulosa, an Australian amphipod commonly used in ecotoxicological studies. RNA was harvested from control amphipods, juvenile amphipods, and from amphipods exposed to either metal or diesel contaminated sediments. This RNA was used as the basis for a 454 based transcriptome sequencing effort. Sequencing generated 1.3 million reads from control, juvenile, metal-exposed and diesel-exposed amphipods. Different read filtering and assembly protocols were evaluated to generate an assembly that (i) had an optimal number of contigs; (ii) had long contigs; (iii) contained a suitable representation of conserved genes; and (iv) had long ortholog alignment lengths relative to the length of each contig. A final assembly, generated using fixed-length trimming based on the sequence quality scores, followed by assembly using the MIRA algorithm, produced the best results. The 26,625 contigs generated via this approach were annotated using Blast2GO, and the differential expression between treatments and control was determined by mapping with BWA followed by DESeq. Although the mapping generated low coverage, many differentially expressed contigs, including some with known developmental or toxicological function, were identified. This study demonstrated that 454 pyrosequencing is an effective means of generating reference transcriptome information for organisms, such as the amphipod M. plumulosa, that have no genomic information available in databases or in closely related sequenced species. 
It also demonstrated how optimization of read filtering protocols and assembly approaches changes the utility of results obtained from next generation sequencing studies, and establishes criteria to determine the quality of a de novo assembly in species lacking a reference genome. This new transcriptomic knowledge provides the genomic foundation for the creation of microarray and qPCR assays, serving as a reference transcriptome in future RNAseq studies, and allowing both the biology and ecotoxicology of this organism to be better understood. This approach will allow genomics-based methodology to be applied to a wider range of environmentally relevant species.
Affiliation(s)
- Sharon E Hook
- CSIRO Land and Water, Locked Bag 2007, Kirrawee, NSW 2232, Australia.
- Natalie A Twine
- NSW Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Stuart L Simpson
- CSIRO Land and Water, Locked Bag 2007, Kirrawee, NSW 2232, Australia
- David A Spadaro
- CSIRO Land and Water, Locked Bag 2007, Kirrawee, NSW 2232, Australia
- Philippe Moncuquet
- CSIRO Mathematics, Informatics, and Statistics, Acton, ACT, 2601, Australia
- Marc R Wilkins
- NSW Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
|
40
|
Moreton J, Dunham SP, Emes RD. A consensus approach to vertebrate de novo transcriptome assembly from RNA-seq data: assembly of the duck (Anas platyrhynchos) transcriptome. Front Genet 2014; 5:190. [PMID: 25009556 PMCID: PMC4070175 DOI: 10.3389/fgene.2014.00190] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 04/17/2014] [Accepted: 06/09/2014] [Indexed: 11/17/2022] Open
Abstract
For vertebrate organisms where a reference genome is not available, de novo transcriptome assembly enables a cost effective insight into the identification of tissue specific or differentially expressed genes and variation of the coding part of the genome. However, since there are a number of different tools and parameters that can be used to reconstruct transcripts, it is difficult to determine an optimal method. Here we suggest a pipeline based on (1) assessing the performance of three different assembly tools (2) using both single and multiple k-mer (MK) approaches (3) examining the influence of the number of reads used in the assembly (4) merging assemblies from different tools. We use an example dataset from the vertebrate Anas platyrhynchos domestica (Pekin duck). We find that taking a subset of data enables a robust assembly to be produced by multiple methods without the need for very high memory capacity. The use of reads mapped back to transcripts (RMBT) and CEGMA (Core Eukaryotic Genes Mapping Approach) provides useful metrics to determine the completeness of assembly obtained. For this dataset the use of MK in the assembly generated a more complete assembly as measured by greater number of RMBT and CEGMA score. Merged single k-mer assemblies are generally smaller but consist of longer transcripts, suggesting an assembly consisting of fewer fragmented transcripts. We suggest that the use of a subset of reads during assembly allows the relatively rapid investigation of assembly characteristics and can guide the user to the most appropriate transcriptome for particular downstream use. Transcriptomes generated by the compared assembly methods and the final merged assembly are freely available for download at http://dx.doi.org/10.6084/m9.figshare.1032613.
Affiliation(s)
- Joanna Moreton
- Advanced Data Analysis Centre, University of Nottingham, Leicestershire, UK; School of Veterinary Medicine and Science, University of Nottingham, Leicestershire, UK
- Stephen P Dunham
- School of Veterinary Medicine and Science, University of Nottingham, Leicestershire, UK
- Richard D Emes
- Advanced Data Analysis Centre, University of Nottingham, Leicestershire, UK; School of Veterinary Medicine and Science, University of Nottingham, Leicestershire, UK
|
41
|
Abrahão JS, Boratto P, Dornas FP, Silva LC, Campos RK, Almeida GMF, Kroon EG, La Scola B. Growing a giant: evaluation of the virological parameters for mimivirus production. J Virol Methods 2014; 207:6-11. [PMID: 24972367 DOI: 10.1016/j.jviromet.2014.06.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/15/2014] [Revised: 05/28/2014] [Accepted: 06/03/2014] [Indexed: 11/16/2022]
Abstract
Acanthamoeba polyphaga mimivirus (APMV) was described in 2003, and due to its unique structural and genetic complexity, the viral family Mimiviridae was created. APMV prompted the creation of an open field of study on the function of hundreds of never-before-seen open reading frames (ORFs) and their roles in virus-host interactions. In recent years, several giant viruses have been isolated from different environments and specimens. Although the scientific community has experienced a remarkable advancement in the comprehension of the mimivirus replication cycle in the last years, few studies have been devoted to the investigation of the methodological features and conditions for mimivirus cultivation. In this work, conditions for the cultivation of mimivirus isolates were investigated to obtain relevant information about the production of infectious particles, total viral particles and viral DNA. The results suggest that low viral doses are more efficient for the production of infectious particles, yielding up to 5000 TCID50 for each inoculated TCID50. Besides methodological information, these data also reveal, for the first time, the ratio between total and infectious particles (in TCID50) that are produced during mimivirus cultivation in laboratory conditions. All of this information can be used as a worldwide guide for the production of mimiviruses and can help prompt mimivirological studies in different fields.
Affiliation(s)
- Jônatas S Abrahão
- Universidade Federal de Minas Gerais, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, Brazil; Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes (URMITE), UM63 CNRS 7278 IRD 198 INSERM U1095, Faculté de Médecine, Aix-Marseille Université, Marseille, France.
- Paulo Boratto
- Universidade Federal de Minas Gerais, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, Brazil
- Fábio P Dornas
- Universidade Federal de Minas Gerais, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, Brazil
- Lorena C Silva
- Universidade Federal de Minas Gerais, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, Brazil
- Rafael K Campos
- Universidade Federal de Minas Gerais, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, Brazil
- Gabriel M F Almeida
- Universidade Federal de Minas Gerais, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, Brazil
- Erna G Kroon
- Universidade Federal de Minas Gerais, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, Brazil
- Bernard La Scola
- Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes (URMITE), UM63 CNRS 7278 IRD 198 INSERM U1095, Faculté de Médecine, Aix-Marseille Université, Marseille, France
|
42
|
Genome-wide transcriptomic responses of the seagrasses Zostera marina and Nanozostera noltii under a simulated heatwave confirm functional types. Mar Genomics 2014; 15:65-73. [DOI: 10.1016/j.margen.2014.03.004] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 01/26/2014] [Revised: 03/18/2014] [Accepted: 03/19/2014] [Indexed: 12/25/2022]
|
43
|
De novo transcriptome hybrid assembly and validation in the European earwig (Dermaptera, Forficula auricularia). PLoS One 2014; 9:e94098. [PMID: 24722757 PMCID: PMC3983118 DOI: 10.1371/journal.pone.0094098] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 12/19/2013] [Accepted: 03/10/2014] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND The European earwig (Forficula auricularia) is an established system for studies of sexual selection, social interactions and the evolution of parental care. Despite its scientific interest, little knowledge exists about the species at the genomic level, limiting the scope of molecular studies and expression analyses of genes of interest. To overcome these limitations, we sequenced and validated the transcriptome of the European earwig. METHODOLOGY AND PRINCIPAL FINDINGS To obtain a comprehensive transcriptome, we sequenced mRNA from various tissues and developmental stages of female and male earwigs using Roche 454 pyrosequencing and Illumina HiSeq. The reads were de novo assembled independently and screened for possible microbial contamination and repeated elements. The remaining contigs were combined into a hybrid assembly and clustered to reduce redundancy. A comparison with the eukaryotic core gene dataset indicates that we sequenced a substantial part of the earwig transcriptome with a low level of fragmentation. In addition, a comparative analysis revealed that more than 8,800 contigs of the hybrid assembly show significant similarity to insect-specific proteins and those were assigned for Gene Ontology terms. Finally, we established a quantitative PCR test for expression stability using commonly used housekeeping genes and applied the method to five homologs of known sex-biased genes of the honeybee. The qPCR pilot study confirmed sex specific expression and also revealed significant expression differences between the brain and antenna tissue samples. CONCLUSIONS By employing two different sequencing approaches and including samples obtained from different tissues, developmental stages, and sexes, we were able to assemble a comprehensive transcriptome of F. auricularia. The transcriptome presented here offers new opportunities to study the molecular bases and evolution of parental care and sociality in arthropods.
|
44
|
Melicher D, Torson AS, Dworkin I, Bowsher JH. A pipeline for the de novo assembly of the Themira biloba (Sepsidae: Diptera) transcriptome using a multiple k-mer length approach. BMC Genomics 2014; 15:188. [PMID: 24621177 PMCID: PMC4008362 DOI: 10.1186/1471-2164-15-188] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 11/04/2013] [Accepted: 03/03/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. RESULTS Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. CONCLUSIONS The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
Affiliation(s)
- Dacotah Melicher
- Department of Biological Sciences, North Dakota State University, 1340 Bolley Drive, 218 Stevens Hall, Fargo, ND 58102 USA
- Alex S Torson
- Department of Biological Sciences, North Dakota State University, 1340 Bolley Drive, 218 Stevens Hall, Fargo, ND 58102 USA
- Ian Dworkin
- Department of Zoology, Michigan State University, 328 Giltner Hall, East Lansing, MI 48823 USA
- Julia H Bowsher
- Department of Biological Sciences, North Dakota State University, 1340 Bolley Drive, 218 Stevens Hall, Fargo, ND 58102 USA
|
45
|
Robinson SD, Safavi-Hemami H, McIntosh LD, Purcell AW, Norton RS, Papenfuss AT. Diversity of conotoxin gene superfamilies in the venomous snail, Conus victoriae. PLoS One 2014; 9:e87648. [PMID: 24505301 PMCID: PMC3914837 DOI: 10.1371/journal.pone.0087648] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.9] [Received: 09/05/2013] [Accepted: 12/28/2013] [Indexed: 12/31/2022] Open
Abstract
Animal venoms represent a vast library of bioactive peptides and proteins with proven potential, not only as research tools but also as drug leads and therapeutics. This is illustrated clearly by marine cone snails (genus Conus), whose venoms consist of mixtures of hundreds of peptides (conotoxins) with a diverse array of molecular targets, including voltage- and ligand-gated ion channels, G-protein coupled receptors and neurotransmitter transporters. Several conotoxins have found applications as research tools, with some being used or developed as therapeutics. The primary objective of this study was the large-scale discovery of conotoxin sequences from the venom gland of an Australian cone snail species, Conus victoriae. Using cDNA library normalization, high-throughput 454 sequencing, de novo transcriptome assembly and annotation with BLASTX and profile hidden Markov models, we discovered over 100 unique conotoxin sequences from 20 gene superfamilies, the highest diversity of conotoxins so far reported in a single study. Many of the sequences identified are new members of known conotoxin superfamilies, some help to redefine these superfamilies and others represent altogether new classes of conotoxins. In addition, we have demonstrated an efficient combination of methods to mine an animal venom gland and generate a library of sequences encoding bioactive peptides.
Affiliation(s)
- Samuel D. Robinson
  - Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, VIC, Australia
  - * E-mail: (SDR); (HSH)
- Helena Safavi-Hemami
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
  - * E-mail: (SDR); (HSH)
- Lachlan D. McIntosh
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Melbourne, VIC, Australia
- Anthony W. Purcell
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
- Raymond S. Norton
  - Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, VIC, Australia
- Anthony T. Papenfuss
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Melbourne, VIC, Australia
|
46
|
Chow KS, Ghazali AK, Hoh CC, Mohd-Zainuddin Z. RNA sequencing read depth requirement for optimal transcriptome coverage in Hevea brasiliensis. BMC Res Notes 2014; 7:69. [PMID: 24484543 PMCID: PMC3926681 DOI: 10.1186/1756-0500-7-69] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 09/11/2013] [Accepted: 01/17/2014] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND: One of the concerns of assembling de novo transcriptomes is determining the amount of read sequences required to ensure a comprehensive coverage of genes expressed in a particular sample. In this report, we describe the use of Illumina paired-end RNA-Seq (PE RNA-Seq) reads from Hevea brasiliensis (rubber tree) bark to devise a transcript mapping approach for the estimation of the read amount needed for deep transcriptome coverage. FINDINGS: We optimized the assembly of a Hevea bark transcriptome based on 16 Gb Illumina PE RNA-Seq reads using the Oases assembler across a range of k-mer sizes. We then assessed assembly quality based on transcript N50 length and transcript mapping statistics in relation to (a) known Hevea cDNAs with complete open reading frames, (b) a set of core eukaryotic genes and (c) Hevea genome scaffolds. This was followed by a systematic transcript mapping process where sub-assemblies from a series of incremental amounts of bark transcripts were aligned to transcripts from the entire bark transcriptome assembly. The exercise served to relate read amounts to the degree of transcript mapping level, the latter being an indicator of the coverage of gene transcripts expressed in the sample. As read amounts or datasize increased toward 16 Gb, the number of transcripts mapped to the entire bark assembly approached saturation. A colour matrix was subsequently generated to illustrate sequencing depth requirement in relation to the degree of coverage of total sample transcripts. CONCLUSIONS: We devised a procedure, the "transcript mapping saturation test", to estimate the amount of RNA-Seq reads needed for deep coverage of transcriptomes. For Hevea de novo assembly, we propose generating between 5-8 Gb reads, whereby around 90% transcript coverage could be achieved with optimized k-mers and transcript N50 length. The principle behind this methodology may also be applied to other non-model plants, or with reads from other second generation sequencing platforms.
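The abstract above uses transcript N50 length as an assembly quality measure. As a minimal sketch of how that statistic is usually computed (the function name `n50` and the example lengths are assumptions for illustration, not taken from the paper), N50 is the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to half of the
        # total assembled bases is the N50.
        if running >= half_total:
            return length


# Invented transcript lengths (bp) for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is presumably why the study pairs it with mapping statistics against known cDNAs, core eukaryotic genes, and genome scaffolds.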
MESH Headings
- Databases, Genetic
- Gene Expression Profiling/methods
- Gene Expression Regulation, Plant
- Gene Library
- Genes, Plant
- Hevea/chemistry
- Hevea/genetics
- High-Throughput Nucleotide Sequencing
- Open Reading Frames
- Plant Bark/metabolism
- Plant Leaves/metabolism
- Plant Proteins/genetics
- RNA, Messenger/biosynthesis
- RNA, Messenger/chemistry
- RNA, Messenger/genetics
- RNA, Messenger/isolation & purification
- RNA, Plant/biosynthesis
- RNA, Plant/chemistry
- RNA, Plant/genetics
- RNA, Plant/isolation & purification
- Reproducibility of Results
- Transcriptome
Affiliation(s)
- Keng-See Chow
  - Biotechnology Unit, Malaysian Rubber Board, Rubber Research Institute of Malaysia, Experiment Station, Kuala Lumpur 47000, Sungai Buloh, Selangor, Malaysia
- Ahmad-Kamal Ghazali
  - Codon Genomics SB, No. 26, Jalan Dutamas 7, Taman Dutamas, Balakong 43200, Seri Kembangan Balakong, Selangor, Malaysia
- Chee-Choong Hoh
  - Codon Genomics SB, No. 26, Jalan Dutamas 7, Taman Dutamas, Balakong 43200, Seri Kembangan Balakong, Selangor, Malaysia
- Zainorlina Mohd-Zainuddin
  - Biotechnology Unit, Malaysian Rubber Board, Rubber Research Institute of Malaysia, Experiment Station, Kuala Lumpur 47000, Sungai Buloh, Selangor, Malaysia
|
47
|
Wei L, Xiao M, Hayward A, Fu D. Applications and challenges of next-generation sequencing in Brassica species. PLANTA 2013; 238:1005-24. [PMID: 24062086 DOI: 10.1007/s00425-013-1961-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 09/07/2013] [Accepted: 09/12/2013] [Indexed: 05/09/2023]
Abstract
Next-generation sequencing (NGS) produces numerous (often millions) short DNA sequence reads, typically varying between 25 and 400 bp in length, at a relatively low cost and in a short time. This revolutionary technology is being increasingly applied in whole-genome, transcriptome, epigenome and small RNA sequencing, molecular marker and gene discovery, comparative and evolutionary genomics, and association studies. The Brassica genus comprises some of the most agro-economically important crops, providing abundant vegetables, condiments, fodder, oil and medicinal products. Many Brassica species have undergone the process of polyploidization, which makes their genomes exceptionally complex and can create difficulties in genomics research. NGS injects new vigor into Brassica research, yet also faces specific challenges in the analysis of complex crop genomes and traits. In this article, we review the advantages and limitations of different NGS technologies and their applications and challenges, using Brassica as an advanced model system for agronomically important, polyploid crops. Specifically, we focus on the use of NGS for genome resequencing, transcriptome sequencing, development of single-nucleotide polymorphism markers, and identification of novel microRNAs and their targets. We present trends and advances in NGS technology in relation to Brassica crop improvement, with wide application for sophisticated genomics research into agronomically important polyploid crops.
Affiliation(s)
- Lijuan Wei
  - Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang, 330045, China
  - Chongqing Engineering Research Center for Rapeseed, College of Agronomy and Biotechnology, Southwest University, Chongqing, 400716, China
- Meili Xiao
  - Chongqing Engineering Research Center for Rapeseed, College of Agronomy and Biotechnology, Southwest University, Chongqing, 400716, China
- Alice Hayward
  - Centre for Integrative Legume Research, School of Agriculture and Food Sciences, The University of Queensland, St Lucia, 4072, Australia
- Donghui Fu
  - Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang, 330045, China
|
48
|
Xie J, Strobel GA, Mends MT, Hilmer J, Nigg J, Geary B. Collophora aceris, a novel antimycotic producing endophyte associated with Douglas Maple. MICROBIAL ECOLOGY 2013; 66:784-795. [PMID: 23996143 DOI: 10.1007/s00248-013-0281-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/02/2013] [Accepted: 08/14/2013] [Indexed: 06/02/2023]
Abstract
A novel endophyte, designated Collophora aceris, was obtained from stem tissues of Douglas Maple (Acer glabrum var. douglasii) in a Pacific Northwest temperate rainforest. Colonies were slow growing, white, creamy, moist, and translucent to opaque on potato dextrose agar and other media, with few aerial hyphae. It also produced solid, dark sclerotia (200-400 μm) on oatmeal agar and no evidence of pseudopycnidia as per other Collophora spp. Conidia were rod-like, ranging in size from 2.2-8.4 × 0.8-1.8 μm, and were produced holoblastically on conidiogenous cells by budding, with no collarette at the budding site. Phylogenetic analyses, based on 18S rDNA sequence data, showed that C. aceris possessed 99% similarity to other Collophora spp. However, ITS-5.8S rDNA sequence data indicated that the organism was potentially related to Allantophomopsis spp. Finally, combined morphological, physiological, and molecular genetics data indicated that this organism is most like Collophora spp. but is distinct from all other fungi in this group. This is the first report of any member of this genus existing as an endophyte. This fungus makes a wide-spectrum antimycotic agent (Collophorin) with biological activity against such pathogenic fungi as Pythium ultimum, Phytophthora cinnamomi, Phytophthora palmivora, and Rhizoctonia solani. Collophorin was purified to homogeneity and shown to have a unique mass of 120.0639, an empirical formula of C8H8O1, and UV absorption bands at 260 and 378 nm. This work also indicates that C. aceris possesses the biological potential to protect its host against an array of common plant pathogens.
Affiliation(s)
- Jie Xie
  - State Key Laboratory of Silkworm Genome Biology, College of Biotechnology, Southwest University, Chongqing, 400715, China
|
49
|
Jones M, Blaxter M. afterParty: turning raw transcriptomes into permanent resources. BMC Bioinformatics 2013; 14:301. [PMID: 24093729 PMCID: PMC3856601 DOI: 10.1186/1471-2105-14-301] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/07/2012] [Accepted: 10/03/2013] [Indexed: 01/30/2023] Open
Abstract
Background: Next-generation DNA sequencing technologies have made it possible to generate transcriptome data for novel organisms quickly and cheaply, to the extent that the effort required to annotate and publish a new transcriptome is greater than the effort required to sequence it. Often, following publication, details of the annotation effort are only available in summary form, hindering subsequent exploitation of the data. To promote best-practice in annotation and to ensure that data remain accessible, we have written afterParty, a web application that allows users to assemble, annotate and publish novel transcriptomes using only a web browser. Results: afterParty is a robust web application that implements best-practice transcriptome assembly, annotation, browsing, searching, and visualization. Users can turn a collection of reads (from Roche 454 chemistry) or assembled contigs (from any sequencing chemistry, including Illumina Solexa RNA-Seq) into a searchable, browsable transcriptome resource and quickly make it publicly available. Contigs are functionally annotated based on similarity to known sequences and protein domains. Once assembled and annotated, transcriptomes derived from multiple species or libraries can be compared and searched. afterParty datasets can either be created using the existing afterParty server, or using local instances that can be built easily using a virtual machine. afterParty includes powerful visualization tools for transcriptome dataset exploration and uses a flexible annotation architecture which will allow additional types of annotation to be added in the future. Conclusions: afterParty's main use case scenario is one in which a working biologist has generated a large volume of transcribed sequence data and wishes to turn it into a useful resource that has some durability. By reducing the effort, bioinformatics skills, and computational resources needed to annotate and publish a transcriptome, afterParty will facilitate the annotation and sharing of sequence data that would otherwise remain unavailable. A typical metazoan transcriptome containing several tens of thousands of contigs can be annotated in a few minutes of interactive time and a few days of computational time.
Affiliation(s)
- Martin Jones
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
|
50
|
Jia B, Xuan L, Cai K, Hu Z, Ma L, Wei C. NeSSM: a Next-generation Sequencing Simulator for Metagenomics. PLoS One 2013; 8:e75448. [PMID: 24124490 PMCID: PMC3790878 DOI: 10.1371/journal.pone.0075448] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 12/29/2012] [Accepted: 08/19/2013] [Indexed: 12/21/2022] Open
Abstract
Background: Metagenomics can reveal the vast majority of microbes that have been missed by traditional cultivation-based methods. Due to its extremely wide range of application areas, fast metagenome sequencing simulation systems with high fidelity are in great demand to facilitate the development and comparison of metagenomics analysis tools. Results: We present here a customizable metagenome simulation system: NeSSM (Next-generation Sequencing Simulator for Metagenomics). Combining complete genomes currently available, a community composition table, and sequencing parameters, it can simulate metagenome sequencing better than existing systems. Sequencing error models based on the explicit distribution of errors at each base and sequencing coverage bias are incorporated in the simulation. In order to improve the fidelity of simulation, tools are provided by NeSSM to estimate the sequencing error models, sequencing coverage bias and the community composition directly from existing metagenome sequencing data. Currently, NeSSM supports single-end and paired-end sequencing for both 454 and Illumina platforms. In addition, a GPU (graphics processing unit) version of NeSSM has been developed to accelerate the simulation. By comparing the simulated sequencing data from NeSSM with experimental metagenome sequencing data, we have demonstrated that NeSSM performs better in many aspects than existing popular metagenome simulators, such as MetaSim, GemSIM and Grinder. The GPU version of NeSSM is more than one order of magnitude faster than MetaSim. Conclusions: NeSSM is a fast simulation system for high-throughput metagenome sequencing. It can be helpful to develop tools and evaluate strategies for metagenomics analysis and it is freely available for academic users at http://cbb.sjtu.edu.cn/~ccwei/pub/software/NeSSM.php.
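The inputs named in the abstract above (reference genomes, a community composition table, and sequencing parameters) suggest the general shape of a metagenome read simulator. The sketch below is a deliberately simplified illustration and not NeSSM's implementation or interface: it assumes a single uniform substitution rate instead of a per-position error model, ignores coverage bias, and uses invented names (`simulate_reads`, `genomes`, `abundance`) and toy sequences.

```python
import random


def simulate_reads(genomes, abundance, n_reads, read_len, error_rate, seed=0):
    """Sample single-end reads from genomes in proportion to abundance.

    genomes:   dict mapping a genome name to its sequence (hypothetical input)
    abundance: dict mapping a genome name to its relative abundance
    """
    rng = random.Random(seed)  # fixed seed keeps the toy run reproducible
    names = list(abundance)
    weights = [abundance[name] for name in names]
    reads = []
    for _ in range(n_reads):
        # Pick the source genome according to the community composition.
        name = rng.choices(names, weights=weights, k=1)[0]
        genome = genomes[name]
        start = rng.randrange(len(genome) - read_len + 1)
        bases = list(genome[start:start + read_len])
        # Uniform substitution errors: a simplifying assumption.
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((name, "".join(bases)))
    return reads


# Toy community, invented for illustration.
toy_genomes = {"sp_a": "ACGTACGTTGCAACGT", "sp_b": "TTTTGGGGCCCCAAAA"}
toy_abundance = {"sp_a": 0.8, "sp_b": 0.2}
toy_reads = simulate_reads(toy_genomes, toy_abundance, n_reads=20,
                           read_len=8, error_rate=0.0)
```

A more faithful simulator would replace the uniform `error_rate` with error probabilities that depend on the position within the read, which is the kind of model the abstract says NeSSM estimates from existing sequencing data.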
Affiliation(s)
- Ben Jia
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Liming Xuan
  - School of Bioengineering, East China University of Science and Technology, Shanghai, China
  - Shanghai Center for Bioinformation Technology, Shanghai, China
- Kaiye Cai
  - Shanghai Center for Bioinformation Technology, Shanghai, China
- Zhiqiang Hu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Shanghai Center for Bioinformation Technology, Shanghai, China
- Liangxiao Ma
  - Shanghai Center for Bioinformation Technology, Shanghai, China
- Chaochun Wei
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Shanghai Center for Bioinformation Technology, Shanghai, China
|