151
Qian X, Ba Y, Zhuang Q, Zhong G. RNA-Seq technology and its application in fish transcriptomics. OMICS: A Journal of Integrative Biology 2013; 18:98-110. [PMID: 24380445 DOI: 10.1089/omi.2013.0110] [Citation(s) in RCA: 177] [Impact Index Per Article: 16.1] [Indexed: 12/19/2022]
Abstract
High-throughput sequencing technologies, also known as next-generation sequencing (NGS) technologies, have revolutionized the way that genomic research is advancing. In addition to the static genome, these state-of-the-art technologies have been recently exploited to analyze the dynamic transcriptome, and the resulting technology is termed RNA sequencing (RNA-seq). RNA-seq is free from many limitations of other transcriptomic approaches, such as microarray and tag-based sequencing methods. Although RNA-seq has only been available for a short time, studies using this method have completely changed our perspective of the breadth and depth of eukaryotic transcriptomes. In terms of the transcriptomics of teleost fishes, both model and non-model species have benefited from the RNA-seq approach and have undergone tremendous advances in the past several years. RNA-seq has helped not only in mapping and annotating fish transcriptomes but also in our understanding of many biological processes in fish, such as development, adaptive evolution, host immune response, and stress response. In this review, we first provide an overview of each step of RNA-seq from library construction to the bioinformatic analysis of the data. We then summarize and discuss the recent biological insights obtained from the RNA-seq studies in a variety of fish species.
Affiliation(s)
- Xi Qian
  - Department of Animal Science, University of Vermont, Burlington, Vermont
152
Peng Y, Leung HCM, Yiu SM, Lv MJ, Zhu XG, Chin FYL. IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels. Bioinformatics 2013; 29:i326-34. [PMID: 23813001 PMCID: PMC3694675 DOI: 10.1093/bioinformatics/btt219] [Citation(s) in RCA: 124] [Impact Index Per Article: 11.3] [Indexed: 01/27/2023]
Abstract
Motivation: RNA sequencing based on next-generation sequencing technology is effective for analyzing transcriptomes. Like de novo genome assembly, de novo transcriptome assembly does not rely on any reference genome or additional annotation information, but is more difficult. In particular, isoforms can have very uneven expression levels (e.g. 1:100), which makes it very difficult to identify low-expressed isoforms. One challenge is to remove erroneous vertices/edges with high multiplicity (produced by high-expressed isoforms) in the de Bruijn graph without removing correct ones with not-so-high multiplicity from low-expressed isoforms. Failing to do so will result in the loss of low-expressed isoforms or in complicated subgraphs with transcripts of different genes mixed together due to erroneous vertices/edges. Contributions: Unlike existing tools, which remove erroneous vertices/edges with multiplicities lower than a global threshold, we use a probabilistic progressive approach to iteratively remove them with local thresholds. This enables us to decompose the graph into disconnected components, each containing a few genes, if not a single gene, while retaining many correct vertices/edges of low-expressed isoforms. Combined with existing techniques, IDBA-Tran is able to assemble both high-expressed and low-expressed transcripts and outperform existing assemblers in terms of sensitivity and specificity for both simulated and real data. Availability: http://www.cs.hku.hk/~alse/idba_tran. Contact: chin@cs.hku.hk. Supplementary information: Supplementary data are available at Bioinformatics online.
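The contrast this abstract draws between a global and a local multiplicity threshold can be illustrated with a toy example. The sketch below is not IDBA-tran's algorithm (which the abstract describes as probabilistic, progressive, and iterative); it is a deliberately simplified single-pass filter, and the graph, the multiplicities, the function names, and the `ratio` parameter are all hypothetical.

```python
from collections import defaultdict

# Toy de Bruijn-like graph: (source vertex, target vertex) -> edge multiplicity.
# All names and numbers are made up for illustration.
EDGES = {
    ("A1", "A2"): 100,  # highly expressed isoform
    ("A2", "A3"): 100,
    ("A2", "AX"): 5,    # erroneous branch off the highly expressed isoform
    ("B1", "B2"): 3,    # lowly expressed isoform (correct edges)
    ("B2", "B3"): 3,
}


def filter_global(edges, threshold):
    """Keep edges whose multiplicity reaches one graph-wide threshold."""
    return {e: m for e, m in edges.items() if m >= threshold}


def filter_local(edges, ratio):
    """Keep edges whose multiplicity reaches `ratio` times the strongest
    multiplicity observed at either endpoint (a local threshold)."""
    strongest = defaultdict(int)
    for (u, v), m in edges.items():
        strongest[u] = max(strongest[u], m)
        strongest[v] = max(strongest[v], m)
    kept = {}
    for (u, v), m in edges.items():
        local_threshold = ratio * max(strongest[u], strongest[v])
        if m >= local_threshold:
            kept[(u, v)] = m
    return kept


if __name__ == "__main__":
    print(sorted(filter_global(EDGES, threshold=4)))
    print(sorted(filter_local(EDGES, ratio=0.1)))
```

With these made-up numbers, the global threshold keeps the erroneous branch (multiplicity 5) while discarding the correct low-expression edges (multiplicity 3). The local filter does the opposite, because each edge is judged only against the coverage of its own neighbourhood.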
Affiliation(s)
- Yu Peng
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
153
Barling A, Swaminathan K, Mitros T, James BT, Morris J, Ngamboma O, Hall MC, Kirkpatrick J, Alabady M, Spence AK, Hudson ME, Rokhsar DS, Moose SP. A detailed gene expression study of the Miscanthus genus reveals changes in the transcriptome associated with the rejuvenation of spring rhizomes. BMC Genomics 2013; 14:864. [PMID: 24320546 PMCID: PMC4046694 DOI: 10.1186/1471-2164-14-864] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 01/25/2013] [Accepted: 12/04/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND The Miscanthus genus of perennial C4 grasses contains promising biofuel crops for temperate climates. However, few genomic resources exist for Miscanthus, which limits understanding of its interesting biology and future genetic improvement. A comprehensive catalog of expressed sequences was generated from a variety of Miscanthus species and tissue types, with an emphasis on characterizing gene expression changes in spring compared to fall rhizomes. RESULTS Illumina short read sequencing technology was used to produce transcriptome sequences from different tissues and organs during distinct developmental stages for multiple Miscanthus species, including Miscanthus sinensis, Miscanthus sacchariflorus, and their interspecific hybrid Miscanthus × giganteus. More than fifty billion base-pairs of Miscanthus transcript sequence were produced. Overall, 26,230 Sorghum gene models (i.e., ~96% of predicted Sorghum genes) had at least five Miscanthus reads mapped to them, suggesting that a large portion of the Miscanthus transcriptome is represented in this dataset. The Miscanthus × giganteus data was used to identify genes preferentially expressed in a single tissue, such as the spring rhizome, using Sorghum bicolor as a reference. Quantitative real-time PCR was used to verify examples of preferential expression predicted via RNA-Seq. Contiguous consensus transcript sequences were assembled for each species and annotated using InterProScan. Sequences from the assembled transcriptome were used to amplify genomic segments from a doubled haploid Miscanthus sinensis and from Miscanthus × giganteus to further disentangle the allelic and paralogous variations in genes. CONCLUSIONS This large expressed sequence tag collection creates a valuable resource for the study of Miscanthus biology by providing detailed gene sequence information and tissue preferred expression patterns.
We have successfully generated a database of transcriptome assemblies and demonstrated its use in the study of genes of interest. Analysis of gene expression profiles revealed biological pathways that exhibit altered regulation in spring compared to fall rhizomes, which are consistent with their different physiological functions. The expression profiles of the subterranean rhizome provide a better understanding of the biological activities of the underground stem structures that are essential for perenniality and the storage or remobilization of carbon and nutrient resources.
Affiliation(s)
- Stephen P Moose
  - Energy Biosciences Institute, Institute for Genomic Biology, University of Illinois Urbana, 1206 West Gregory Drive, Urbana, IL 61801, USA.
154
Chiara M, Horner DS, Spada A. De novo assembly of the transcriptome of the non-model plant Streptocarpus rexii employing a novel heuristic to recover locus-specific transcript clusters. PLoS One 2013; 8:e80961. [PMID: 24324652 PMCID: PMC3855653 DOI: 10.1371/journal.pone.0080961] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 03/07/2013] [Accepted: 10/08/2013] [Indexed: 11/20/2022]
Abstract
De novo transcriptome characterization from Next Generation Sequencing data has become an important approach in the study of non-model plants. Despite notable advances in the assembly of short reads, the clustering of transcripts into unigene-like (locus-specific) clusters remains a somewhat neglected subject. Indeed, closely related paralogous transcripts are often merged into single clusters by current approaches. Here, a novel heuristic method for locus-specific clustering is compared to that implemented in the de novo assembler Oases, using the same initial transcript collections, derived from Arabidopsis thaliana and the developmental model Streptocarpus rexii. We show that the proposed approach improves cluster specificity in the A. thaliana dataset for which the reference genome is available. Furthermore, for the S. rexii data our filtered transcript collection matches a larger number of distinct annotated loci in reference genomes than the Oases set, while containing a reduced overall number of loci. A detailed discussion of advantages and limitations of our approach in processing de novo transcriptome reconstructions is presented. The proposed method should be widely applicable to other organisms, irrespective of the transcript assembly method employed. The S. rexii transcriptome is available as a sophisticated and augmented publicly available online database.
Affiliation(s)
- Matteo Chiara
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milano, Italia
- David S. Horner
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milano, Italia
- Alberto Spada
  - Dipartimento Di Scienze Agrarie E Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milano, Italia
155
Lu B, Yang W, Dai Q, Fu J. Using genes as characters and a parsimony analysis to explore the phylogenetic position of turtles. PLoS One 2013; 8:e79348. [PMID: 24278129 PMCID: PMC3836853 DOI: 10.1371/journal.pone.0079348] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 07/18/2013] [Accepted: 09/26/2013] [Indexed: 11/18/2022]
Abstract
The phylogenetic position of turtles within the vertebrate tree of life remains controversial. Conflicting conclusions from different studies are likely a consequence of systematic error in the tree construction process, rather than random error from small amounts of data. Using genomic data, we evaluate the phylogenetic position of turtles with both conventional concatenated data analysis and a "genes as characters" approach. Two datasets were constructed, one with seven species (human, opossum, zebra finch, chicken, green anole, Chinese pond turtle, and western clawed frog) and 4584 orthologous genes, and the second with four additional species (soft-shelled turtle, Nile crocodile, royal python, and tuatara) but only 1638 genes. Our concatenated data analysis strongly supported turtle as the sister-group to archosaurs (the archosaur hypothesis), similar to several recent genomic data based studies using similar methods. When using genes as characters and gene trees as character-state trees with equal weighting for each gene, however, our parsimony analysis suggested that turtles are possibly sister-group to diapsids, archosaurs, or lepidosaurs. None of these resolutions were strongly supported by bootstraps. Furthermore, our incongruence analysis clearly demonstrated that there is a large amount of inconsistency among genes and most of the conflict relates to the placement of turtles. We conclude that the uncertain placement of turtles is a reflection of the true state of nature. Concatenated data analysis of large and heterogeneous datasets likely suffers from systematic error and over-estimates of confidence as a consequence of a large number of characters. Using genes as characters offers an alternative for phylogenomic analysis. It has potential to reduce systematic error, such as data heterogeneity and long-branch attraction, and it can also avoid problems associated with computation time and model selection. 
Finally, treating genes as characters provides a convenient method for examining gene and genome evolution.
Affiliation(s)
- Bin Lu
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Weizhao Yang
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Qiang Dai
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Jinzhong Fu
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada
156
Wong ESW, Nicol S, Warren WC, Belov K. Echidna venom gland transcriptome provides insights into the evolution of monotreme venom. PLoS One 2013; 8:e79092. [PMID: 24265746 PMCID: PMC3827146 DOI: 10.1371/journal.pone.0079092] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 04/09/2013] [Accepted: 09/18/2013] [Indexed: 11/18/2022]
Abstract
Monotremes (echidna and platypus) are egg-laying mammals. One of their most distinctive characteristics is that males have venom/crural glands that are seasonally active. Male platypuses produce venom during the breeding season, delivered via spurs, to aid in competition against other males. Echidnas are not able to erect their spurs, but a milky secretion is produced by the gland during the breeding season. The function and molecular composition of echidna venom are as yet unknown. Hence, we compared the deeply sequenced transcriptome of an in-season echidna crural gland to that of a platypus and searched for putative venom genes to provide clues into the function of echidna venom and the evolutionary history of monotreme venom. We found that the echidna venom gland transcriptome was markedly different from the platypus with no correlation between the top 50 most highly expressed genes. Four peptides found in the venom of the platypus were detected in the echidna transcriptome. However, these genes were not highly expressed in echidna, suggesting that they are the remnants of the evolutionary history of the ancestral venom gland. Gene ontology terms associated with the top 100 most highly expressed genes in echidna showed functional terms associated with steroidal and fatty acid production, suggesting that echidna “venom” may play a role in scent communication during the breeding season. The loss of the ability to erect the spur and other unknown evolutionary forces acting in the echidna lineage resulted in the gradual decay of venom components and the evolution of a new role for the crural gland.
Affiliation(s)
- Emily S. W. Wong
  - Institute for Molecular Bioscience, University of Queensland, QLD, Australia
- Stewart Nicol
  - School of Zoology, University of Tasmania, TAS, Australia
- Wesley C. Warren
  - The Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Katherine Belov
  - Faculty of Veterinary Science, The University of Sydney, NSW, Australia
157
Safikhani Z, Sadeghi M, Pezeshk H, Eslahchi C. SSP: an interval integer linear programming for de novo transcriptome assembly and isoform discovery of RNA-seq reads. Genomics 2013; 102:507-14. [PMID: 24161398 DOI: 10.1016/j.ygeno.2013.10.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/10/2013] [Revised: 09/28/2013] [Accepted: 10/16/2013] [Indexed: 11/19/2022]
Abstract
Recent advances in sequencing technologies have provided a handful of RNA-seq datasets for transcriptome analysis. However, reconstructing full-length isoforms and estimating transcript expression levels at low cost remain challenging tasks. We propose a novel de novo method named SSP that incorporates interval integer linear programming to resolve alternatively spliced isoforms and reconstruct the whole transcriptome from short reads. Experimental results show that SSP is fast and precise in determining different alternatively spliced isoforms along with the estimation of reconstructed transcript abundances. The SSP software package is available at http://www.bioinf.cs.ipm.ir/software/ssp.
Affiliation(s)
- Zhaleh Safikhani
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Mehdi Sadeghi
  - National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran.
- Hamid Pezeshk
  - School of Mathematics, Statistics and Computer Sciences, Center of Excellence in Biomathematics, College of Science, University of Tehran, Tehran, Iran
- Changiz Eslahchi
  - Department of Computer Science, Shahid Beheshti University, GC., Tehran, Iran; School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
158
Combining different mRNA capture methods to analyze the transcriptome: analysis of the Xenopus laevis transcriptome. PLoS One 2013; 8:e77700. [PMID: 24143257 PMCID: PMC3797054 DOI: 10.1371/journal.pone.0077700] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 03/08/2013] [Accepted: 09/13/2013] [Indexed: 11/19/2022]
Abstract
mRNA sequencing (mRNA-seq) is a commonly used technique to survey gene expression from organisms with fully sequenced genomes. Successful mRNA-seq requires purification of mRNA away from the much more abundant ribosomal RNA, which is typically accomplished by oligo-dT selection. However, mRNAs with short poly-A tails are captured poorly by oligo-dT based methods. We demonstrate that combining mRNA capture via oligo-dT with mRNA capture by the 5’ 7-methyl guanosine cap provides a more complete view of the transcriptome and can be used to assay changes in mRNA poly-A tail length on a genome-wide scale. We also show that using mRNA-seq reads from both capture methods as input for de novo assemblers provides a more complete reconstruction of the transcriptome than either method used alone. We apply these methods of mRNA capture and de novo assembly to the transcriptome of Xenopus laevis, a well-studied frog that currently lacks a finished sequenced genome, to discover transcript sequences for thousands of mRNAs that are currently absent from public databases. The methods we describe here will be broadly applicable to many organisms and will provide insight into the transcriptomes of organisms with sequenced and unsequenced genomes.
159
Bester-Van Der Merwe A, Blaauw S, Du Plessis J, Roodt-Wilding R. Transcriptome-wide single nucleotide polymorphisms (SNPs) for abalone (Haliotis midae): validation and application using GoldenGate medium-throughput genotyping assays. Int J Mol Sci 2013; 14:19341-60. [PMID: 24065109 PMCID: PMC3794836 DOI: 10.3390/ijms140919341] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 07/26/2013] [Revised: 08/26/2013] [Accepted: 09/05/2013] [Indexed: 01/08/2023]
Abstract
Haliotis midae is one of the most valuable commercial abalone species in the world, but is highly vulnerable, due to exploitation, habitat destruction and predation. In order to preserve wild and cultured stocks, genetic management and improvement of the species has become crucial. Fundamental to this is the availability and employment of molecular markers, such as microsatellites and single nucleotide polymorphisms (SNPs). Transcriptome sequences generated through sequencing-by-synthesis technology were utilized for the in vitro and in silico identification of 505 putative SNPs from a total of 316 selected contigs. A subset of 234 SNPs was further validated and characterized in wild and cultured abalone using two Illumina GoldenGate genotyping assays. Combined with VeraCode technology, this genotyping platform yielded a 65%–69% conversion rate (percentage polymorphic markers) with a global genotyping success rate of 76%–85% and provided a viable means for validating SNP markers in a non-model species. The utility of 31 of the validated SNPs in population structure analysis was confirmed, while a large number of SNPs (174) were shown to be informative and are, thus, good candidates for linkage map construction. The non-synonymous SNPs (50) located in coding regions of genes that showed similarities with known proteins will also be useful for genetic applications, such as the marker-assisted selection of genes of relevance to abalone aquaculture.
Affiliation(s)
- Aletta Bester-Van Der Merwe
  - Molecular Breeding and Biodiversity Group, Department of Genetics, Faculty of Agrisciences, Stellenbosch University, Private Bag X1, Matieland 7602, South Africa.
160
Qiao L, Yang W, Fu J, Song Z. Transcriptome profile of the green odorous frog (Odorrana margaretae). PLoS One 2013; 8:e75211. [PMID: 24073255 PMCID: PMC3779193 DOI: 10.1371/journal.pone.0075211] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 07/03/2013] [Accepted: 08/13/2013] [Indexed: 12/02/2022]
Abstract
Transcriptome profiles provide a practical and inexpensive alternative to explore genomic data in non-model organisms, particularly in amphibians where the genomes are very large and complex. The odorous frog Odorrana margaretae (Anura: Ranidae) is a dominant species in the mountain stream ecosystem of western China. Limited knowledge of its genetic background has hindered research on this species, despite its importance in the ecosystem and as a biological resource. Here we report the transcriptome of O. margaretae in order to establish the foundation for genetic research. Using an Illumina sequencing platform, 62,321,166 raw reads were acquired. After a de novo assembly, 37,906 transcripts were obtained, and 18,933 transcripts were annotated to 14,628 genes. We functionally classified these transcripts by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). A total of 11,457 unique transcripts were assigned to 52 GO terms, and 1,438 transcripts were assigned to 128 KEGG pathways. Furthermore, we identified 27 potential antimicrobial peptides (AMPs), 50,351 single nucleotide polymorphism (SNP) sites, and 2,574 microsatellite DNA loci. The transcriptome profile of this species will shed more light on its genetic background and provide useful tools for future studies of this species, as well as other species in the genus Odorrana. It will also contribute to the accumulation of amphibian genomic data.
Affiliation(s)
- Liang Qiao
  - Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, China
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
- Weizhao Yang
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
- Jinzhong Fu
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada
- Zhaobin Song
  - Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, China
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
161
An D, Caffrey SM, Soh J, Agrawal A, Brown D, Budwill K, Dong X, Dunfield P, Foght J, Gieg LM, Hallam SJ, Hanson NW, He Z, Jack TR, Klassen J, Konwar KM, Kuatsjah E, Li C, Larter S, Leopatra V, Nesbø CL, Oldenburg T, Pagé A, Ramos-Padron E, Rochman FF, Saidi-Mehrabad A, Sensen CW, Sipahimalani P, Song YC, Wilson S, Wolbring G, Wong ML, Voordouw G. Metagenomics of hydrocarbon resource environments indicates aerobic taxa and genes to be unexpectedly common. Environmental Science & Technology 2013; 47:10708-17. [PMID: 23889694 PMCID: PMC3864245 DOI: 10.1021/es4020184] [Citation(s) in RCA: 114] [Impact Index Per Article: 10.4] [Received: 05/06/2013] [Revised: 07/23/2013] [Accepted: 07/26/2013] [Indexed: 05/29/2023]
Abstract
Oil in subsurface reservoirs is biodegraded by resident microbial communities. Water-mediated, anaerobic conversion of hydrocarbons to methane and CO2, catalyzed by syntrophic bacteria and methanogenic archaea, is thought to be one of the dominant processes. We compared 160 microbial community compositions in ten hydrocarbon resource environments (HREs) and sequenced twelve metagenomes to characterize their metabolic potential. Although anaerobic communities were common, cores from oil sands and coal beds had unexpectedly high proportions of aerobic hydrocarbon-degrading bacteria. Likewise, most metagenomes had high proportions of genes for enzymes involved in aerobic hydrocarbon metabolism. Hence, although HREs may have been strictly anaerobic and typically methanogenic for much of their history, this may not hold today for coal beds and for the Alberta oil sands, one of the largest remaining oil reservoirs in the world. This finding may influence strategies to recover energy or chemicals from these HREs by in situ microbial processes.
Affiliation(s)
- Dongshan An
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Sean M. Caffrey
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Jung Soh
  - Visual Genomics Centre, Faculty of Medicine, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Akhil Agrawal
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Damon Brown
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Karen Budwill
  - Environment and Carbon Management Division, Alberta Innovates−Technology Futures, Edmonton, Alberta, T6N 1E4, Canada
- Xiaoli Dong
  - Visual Genomics Centre, Faculty of Medicine, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Peter F. Dunfield
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Julia Foght
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, P6G 2M7, Canada
- Lisa M. Gieg
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Steven J. Hallam
  - Department of Microbiology & Immunology, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
  - Michael Smith Genome Sciences Centre, Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Niels W. Hanson
  - Genome Sciences and Technology Training Program, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
- Zhiguo He
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Thomas R. Jack
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Jonathan Klassen
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, P6G 2M7, Canada
- Kishori M. Konwar
  - Department of Microbiology & Immunology, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
- Eugene Kuatsjah
  - Genome Sciences and Technology Training Program, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
- Carmen Li
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, P6G 2M7, Canada
- Steve Larter
  - Department of Geosciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Verlyn Leopatra
  - Department of Community Health Sciences, University of Calgary, Alberta, T2N 1N4, Canada
- Camilla L. Nesbø
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, P6G 2M7, Canada
  - Department of Biology, University of Oslo, 0313 Oslo, Norway
- Thomas Oldenburg
  - Department of Geosciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Antoine P. Pagé
  - Department of Microbiology & Immunology, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
- Esther Ramos-Padron
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Fauziah F. Rochman
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Christoph W. Sensen
  - Visual Genomics Centre, Faculty of Medicine, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Payal Sipahimalani
  - Michael Smith Genome Sciences Centre, Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Young C. Song
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
- Sandra Wilson
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Gregor Wolbring
  - Department of Community Health Sciences, University of Calgary, Alberta, T2N 1N4, Canada
- Man-Ling Wong
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
- Gerrit Voordouw
  - Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4, Canada
162
Abstract
BACKGROUND Buffalograss [Buchloë dactyloides (Nutt.) Engel. syn. Bouteloua dactyloides (Nutt.) Columbus] is a United States native turfgrass species that requires less irrigation, fungicide and pesticide than more commonly used turfgrass species. In areas where water is limited, interest in this grass species for lawns is increasing. While several buffalograss cultivars have been developed through buffalograss breeding, the timeframe for new cultivar development is long and is limited by a lack of useful genetic resources. Two high-throughput next-generation sequencing techniques were used to increase the genomic resources available for buffalograss. RESULTS Total RNA was extracted and purified from leaf samples of two buffalograss cultivars. '378' and 'Prestige' cDNA libraries were subjected to high-throughput sequencing on the Illumina GA and Roche 454 Titanium FLX sequencing platforms. The 454 platform (3 samples) produced 1,300,885 reads and the Illumina platform (12 samples) generated approximately 332 million reads. The multiple k-mer technique for de novo assembly using Velvet and Oases was applied. A total of 121,288 contigs were assembled that were similar to previously reported Ensembl commelinid sequences. Original Illumina reads were also mapped to the high-quality assembly to estimate expression levels of buffalograss transcripts. There were a total of 325 differentially expressed genes between the two buffalograss cultivars. A glycosyl transferase, a serine threonine kinase, and NB-ARC domain-containing transcripts were among those differentially expressed between the two cultivars. These genes have been previously implicated in defense response pathways and may in part explain some of the performance differences between 'Prestige' and '378'. CONCLUSIONS To date, this is the first high-throughput sequencing experiment conducted on buffalograss. In total, 121,288 high-quality transcripts were assembled, significantly expanding the limited genetic resources available for buffalograss genetic studies. Additionally, 325 differentially expressed sequences were identified, which may contribute to performance or morphological differences between the 'Prestige' and '378' buffalograss cultivars.
|
163
|
Brykczynska U, Tzika AC, Rodriguez I, Milinkovitch MC. Contrasted evolution of the vomeronasal receptor repertoires in mammals and squamate reptiles. Genome Biol Evol 2013; 5:389-401. [PMID: 23348039 PMCID: PMC3590772 DOI: 10.1093/gbe/evt013] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 01/10/2023] Open
Abstract
The vomeronasal organ (VNO) is an olfactory structure that detects pheromones and environmental cues. It consists of sensory neurons that express evolutionarily unrelated groups of transmembrane chemoreceptors. The predominant V1R and V2R receptor repertoires are believed to detect airborne and water-soluble molecules, respectively. It has been suggested that the shift in habitat of early tetrapods from water to land is reflected by an increase in the ratio of V1R/V2R genes. Snakes, which have a very large VNO associated with a sophisticated tongue delivery system, are missing from this analysis. Here, we use RNA-seq and RNA in situ hybridization to study the diversity, evolution, and expression pattern of the corn snake vomeronasal receptor repertoires. Our analyses indicate that snakes and lizards retain an extremely limited number of V1R genes but exhibit a large number of V2R genes, including multiple lineages of reptile-specific and snake-specific expansions. Finally, we show that the peculiar bigenic pattern of V2R vomeronasal receptor gene transcription observed in mammals is conserved in squamate reptiles, hinting at an important but unknown functional role played by this expression strategy. Our results do not support the hypothesis that the shift to a vomeronasal receptor repertoire dominated by V1Rs in mammals reflects the evolutionary transition of early tetrapods from water to land. This study sheds light on the evolutionary dynamics of the vomeronasal receptor families in vertebrates and reveals how mammals and squamates differentially adapted the same ancestral vomeronasal repertoire to succeed in a terrestrial environment.
Affiliation(s)
- Urszula Brykczynska
- Laboratory of Artificial & Natural Evolution (LANE), Department of Genetics & Evolution, University of Geneva, Sciences III, Geneva, Switzerland
|
164
|
The developmental transcriptome of the mosquito Aedes aegypti, an invasive species and major arbovirus vector. G3-GENES GENOMES GENETICS 2013; 3:1493-509. [PMID: 23833213 PMCID: PMC3755910 DOI: 10.1534/g3.113.006742] [Citation(s) in RCA: 148] [Impact Index Per Article: 13.5] [Indexed: 11/18/2022]
Abstract
Mosquitoes are vectors of a number of important human and animal diseases. The development of novel vector control strategies requires a thorough understanding of mosquito biology. To facilitate this, we used RNA-seq to identify novel genes and provide the first high-resolution view of the transcriptome throughout development and in response to blood feeding in a mosquito vector of human disease, Aedes aegypti, the primary vector for dengue and yellow fever. We characterized mRNA expression at 34 distinct time points throughout Aedes development, including adult somatic and germline tissues, by using polyA+ RNA-seq. We identified a total of 14,238 new transcribed regions corresponding to 12,597 new loci, as well as many novel transcript isoforms of previously annotated genes. Altogether these results increase the annotated fraction of the genome transcribed into long polyA+ RNAs by more than twofold. We also identified a number of patterns of shared gene expression, as well as genes and/or exons expressed sex-specifically or sex-differentially. Expression profiles of small RNAs in ovaries, early embryos, testes, and adult male and female somatic tissues also were determined, resulting in the identification of 38 new Aedes-specific miRNAs and ~291,000 small RNA new transcribed regions, many of which are likely to be endogenous small-interfering RNAs and Piwi-interacting RNAs. Genes of potential interest for transgene-based vector control strategies also are highlighted. Our data have been incorporated into a user-friendly genome browser located at www.Aedes.caltech.edu, with relevant links to Vectorbase (www.vectorbase.org).
|
165
|
Forconi M, Chalopin D, Barucca M, Biscotti MA, De Moro G, Galiana D, Gerdol M, Pallavicini A, Canapa A, Olmo E, Volff JN. Transcriptional activity of transposable elements in coelacanth. JOURNAL OF EXPERIMENTAL ZOOLOGY PART B-MOLECULAR AND DEVELOPMENTAL EVOLUTION 2013; 322:379-89. [PMID: 24038780 DOI: 10.1002/jez.b.22527] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 03/29/2013] [Revised: 06/04/2013] [Accepted: 07/14/2013] [Indexed: 01/22/2023]
Abstract
The morphological stasis of coelacanths has long suggested a slow evolutionary rate. General genomic stasis might also imply a decrease in transposable element activity. To evaluate the potential activity of transposable elements (TEs) in "living fossil" species, transcriptomic data of Latimeria chalumnae and its Indonesian congener Latimeria menadoensis were compared through RNA-sequencing mapping procedures in three different organs (liver, testis, and muscle). The analysis of coelacanth transcriptomes highlights a significant percentage of transcribed TEs in both species. Major contributors are LINE retrotransposons, especially from the CR1 family. Furthermore, some particular elements, such as an LF-SINE and a LINE2 sequence, seem to be more highly expressed than other elements. The amount of TEs expressed in testis suggests a possible transposition burst in incoming generations. Moreover, a significant amount of TEs was also observed in liver and muscle transcriptomes. Analyses of elements displaying marked organ-specific expression gave us the opportunity to highlight exaptation cases, that is, the recruitment of TEs as new cellular genes, but also to identify a new Latimeria-specific family of Short Interspersed Nuclear Elements called CoeG-SINEs. Overall, transcriptome results do not seem to be in line with a slow-evolving genome with poor TE activity.
Affiliation(s)
- Mariko Forconi
- Dipartimento di Scienze della Vita e dell'Ambiente, Università Politecnica delle Marche, Ancona, Italy; Institut de Génomique Fonctionnelle de Lyon, ENS Lyon, France
|
166
|
Verma P, Shah N, Bhatia S. Development of an expressed gene catalogue and molecular markers from the de novo assembly of short sequence reads of the lentil (Lens culinaris Medik.) transcriptome. PLANT BIOTECHNOLOGY JOURNAL 2013; 11:894-905. [PMID: 23759076 DOI: 10.1111/pbi.12082] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 11/09/2012] [Revised: 04/10/2013] [Accepted: 04/15/2013] [Indexed: 05/23/2023]
Abstract
Genomic resources such as ESTs, molecular markers and linkage maps are essential for crop improvement. However, these resources are still limited in important legumes such as lentil (Lens culinaris Medik.), which is valued worldwide as a rich source of dietary protein. In this study, the de novo transcriptome assembly of 119,855,798 short reads, generated by Illumina paired-end sequencing, was performed using various assembly programs. This resulted in 42,196 nonredundant high-quality transcripts with an average length of 810 bases, an N50 value of 1,432 and an average expression per transcript of 26.21 reads per kilobase per million (RPKM). Similarity search with the unigenes and protein sequences of other plants resulted in maximum similarity with soybean. A total of 20,009 nonredundant transcripts showed similarity with the UniProtKB database and, of these, 18,064 transcripts were grouped into three main GO categories, that is, biological process (15,126), molecular function (15,505) and cellular component (9,434). Annotated transcripts were mapped to 289 predicted Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and 8,893 transcripts were classified into 24 functional categories based on Cluster of Orthologous Groups (COG) of proteins. Mining the data set for the presence of SSRs resulted in 8,722 SSRs with a frequency of one SSR per 3.92 kb. From these, 5,673 SSR primer pairs were designed, and a subset of these was utilized for diversity analysis. This study, which provides a large data set of annotated transcripts and gene-based SSR markers, would serve as a foundation for various applications in lentil breeding and genetics.
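The RPKM unit quoted in this abstract can be made concrete with a short sketch. The function below uses the standard definition (reads divided by transcript length in kilobases and by library size in millions); the read count and library size in the example are hypothetical and are not taken from the paper. The last lines cross-check the reported SSR frequency from the abstract's own figures, treating the rounded mean length of 810 bases as exact, which is an assumption.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # reads / (length / 1e3) / (library / 1e6) == reads * 1e9 / (length * library)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical transcript: 500 reads on an 810-base transcript, 20 million mapped reads.
example_rpkm = rpkm(500, 810, 20_000_000)

# Cross-check of the SSR frequency using numbers stated in the abstract:
# 42,196 transcripts x 810 bases (rounded mean) spread over 8,722 SSRs.
total_bases = 42_196 * 810
ssr_spacing_kb = total_bases / 8_722 / 1_000
print(f"example RPKM = {example_rpkm:.2f}, one SSR per {ssr_spacing_kb:.2f} kb")
```

Under these assumptions the spacing comes out close to the 3.92 kb stated in the abstract, which suggests the frequency was computed over the total assembled transcript length.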
Affiliation(s)
- Priyanka Verma
- National Institute of Plant Genome Research, New Delhi, India
|
167
|
Ghangal R, Chaudhary S, Jain M, Purty RS, Chand Sharma P. Optimization of de novo short read assembly of seabuckthorn (Hippophae rhamnoides L.) transcriptome. PLoS One 2013; 8:e72516. [PMID: 23991119 PMCID: PMC3749127 DOI: 10.1371/journal.pone.0072516] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 04/02/2013] [Accepted: 07/09/2013] [Indexed: 11/18/2022] Open
Abstract
Seabuckthorn (Hippophae rhamnoides L.) has been known for its medicinal, nutritional and environmental importance since ancient times. However, very limited efforts have been made to characterize the genome and transcriptome of this wonder plant. Here, we report the use of next generation massively parallel sequencing technology (Illumina platform) and de novo assembly to gain a comprehensive view of the seabuckthorn transcriptome. We assembled 86,253,874 high quality short reads using six assembly tools. In our hands, assembly of non-redundant short reads following a two-step procedure was found to be the best considering various assembly quality parameters. Initially, the ABySS tool was used following an additive k-mer approach. The assembled transcripts were subsequently subjected to the TGICL suite. Finally, de novo short read assembly yielded 88,297 transcripts (> 100 bp), representing about 53 Mb of seabuckthorn transcriptome. The average length of transcripts was 610 bp, the N50 length was 1198 bp, and 91% of the short reads uniquely mapped back to the seabuckthorn transcriptome. A total of 41,340 (46.8%) transcripts showed significant similarity with sequences present in nr protein databases of NCBI (E-value < 1E-06). We also screened the assembled transcripts for the presence of transcription factors and simple sequence repeats. Our strategy involving the use of a short read assembler (ABySS) followed by TGICL will be useful for researchers working with a non-model organism’s transcriptome in terms of saving time and reducing complexity in data management. The seabuckthorn transcriptome data generated here provide a valuable resource for gene discovery and development of functional molecular markers.
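The abstract reports both a mean transcript length (610 bp) and a much larger N50 (1198 bp). A minimal sketch of the usual N50 definition shows why the two can differ so much: N50 is the length of the contig at which the running total of lengths, sorted longest first, reaches half of the assembly. The contig lengths below are invented for illustration, and the paper's own script is not described in the abstract.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the assembled bases
            return length


# Hypothetical assembly: many short contigs and two long ones.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_mean = sum(toy_lengths) / len(toy_lengths)
print(f"mean = {toy_mean}, N50 = {n50(toy_lengths)}")
```

In the toy assembly the mean is 4 while the N50 is 8, because the two longest contigs alone hold half of the bases. A skewed length distribution of this kind is the usual reason an N50 exceeds the mean.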
Affiliation(s)
- Rajesh Ghangal
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
| | - Saurabh Chaudhary
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
| | - Mukesh Jain
- National Institute of Plant Genome Research, New Delhi, India
| | - Ram Singh Purty
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
| | - Prakash Chand Sharma
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
|
168
|
Li C, Wang Y, Huang X, Li J, Wang H, Li J. De novo assembly and characterization of fruit transcriptome in Litchi chinensis Sonn and analysis of differentially regulated genes in fruit in response to shading. BMC Genomics 2013; 14:552. [PMID: 23941440 PMCID: PMC3751308 DOI: 10.1186/1471-2164-14-552] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Received: 01/07/2013] [Accepted: 08/09/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Litchi (Litchi chinensis Sonn.) is one of the most important fruit trees cultivated in tropical and subtropical areas. However, a lack of transcriptomic and genomic information hinders our understanding of the molecular mechanisms underlying fruit set and fruit development in litchi. Shading during early fruit development decreases fruit growth and induces fruit abscission. Here, high-throughput RNA sequencing (RNA-Seq) was employed for the de novo assembly and characterization of the fruit transcriptome in litchi, and differentially regulated genes, which are responsive to shading, were also investigated using digital transcript abundance (DTA) profiling. RESULTS More than 53 million paired-end reads were generated and assembled into 57,050 unigenes with an average length of 601 bp. These unigenes were annotated by querying against various public databases, with 34,029 unigenes found to be homologous to genes in the NCBI GenBank database and 22,945 unigenes annotated based on known proteins in the Swiss-Prot database. In further orthologous analyses, 5,885 unigenes were assigned one or more Gene Ontology terms, 10,234 hits were aligned to the 24 Clusters of Orthologous Groups classifications and 15,330 unigenes were classified into 266 Kyoto Encyclopedia of Genes and Genomes pathways. Based on the newly assembled transcriptome, the DTA profiling approach was applied to investigate the differentially expressed genes related to shading stress. A total of 3.6 million and 3.5 million high-quality tags were generated from shaded and non-shaded libraries, respectively. As many as 1,039 unigenes were shown to be significantly differentially regulated. Eleven of the 14 differentially regulated unigenes, which were randomly selected for more detailed expression comparison during the course of shading treatment, were identified as being likely to be involved in the process of fruitlet abscission in litchi.
CONCLUSIONS The assembled transcriptome of litchi fruit provides a global description of expressed genes in litchi fruit development, and could serve as an ideal repository for future functional characterization of specific genes. The DTA analysis revealed that more than 1000 differentially regulated unigenes respond to the shading signal, some of which might be involved in the fruitlet abscission process in litchi, shedding new light on the molecular mechanisms underlying organ abscission.
Affiliation(s)
- Caiqin Li
- China Litchi Research Center, South China Agricultural University, Guangzhou 510642, China
| | - Yan Wang
- China Litchi Research Center, South China Agricultural University, Guangzhou 510642, China
| | - Xuming Huang
- China Litchi Research Center, South China Agricultural University, Guangzhou 510642, China
| | - Jiang Li
- Beijing Genomics Institute at Shenzhen, Shenzhen 518083, China
| | - Huicong Wang
- China Litchi Research Center, South China Agricultural University, Guangzhou 510642, China
| | - Jianguo Li
- China Litchi Research Center, South China Agricultural University, Guangzhou 510642, China
|
169
|
Li S, Dong X, Su Z. Directional RNA-seq reveals highly complex condition-dependent transcriptomes in E. coli K12 through accurate full-length transcripts assembling. BMC Genomics 2013; 14:520. [PMID: 23899370 PMCID: PMC3734233 DOI: 10.1186/1471-2164-14-520] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 04/22/2013] [Accepted: 07/27/2013] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Although prokaryotic gene transcription has been studied for decades, many aspects of the process remain poorly understood. In particular, recent studies have revealed that transcriptomes in many prokaryotes are far more complex than previously thought. Genes in an operon are often alternatively and dynamically transcribed under different conditions, and a large portion of genes and intergenic regions have antisense RNA (asRNA) and non-coding RNA (ncRNA) transcripts, respectively. Ironically, similar studies have not been conducted in the model bacterium E. coli K12, so it is unknown whether the bacterium possesses similarly complex transcriptomes. Furthermore, although RNA-seq has become the major method for analyzing the complexity of prokaryotic transcriptomes, it is still a challenging task to accurately assemble full-length transcripts using short RNA-seq reads. RESULTS To fill these gaps, we have profiled the transcriptomes of E. coli K12 under different culture conditions and growth phases using a highly specific directional RNA-seq technique that can capture various types of transcripts in the bacterial cells, combined with a highly accurate and robust algorithm and tool, TruHMM (http://bioinfolab.uncc.edu/TruHmm_package/), for assembling full-length transcripts. We found that 46.9 to 63.4% of expressed operons were utilized in their putative alternative forms, 72.23 to 89.54% of genes had putative asRNA transcripts and 51.37 to 72.74% of intergenic regions had putative ncRNA transcripts under different culture conditions and growth phases. CONCLUSIONS As has been demonstrated in many other prokaryotes, E. coli K12 also has highly complex and dynamic transcriptomes under different culture conditions and growth phases. Such complex and dynamic transcriptomes might play important roles in the physiology of the bacterium. TruHMM is a highly accurate and robust algorithm for assembling full-length transcripts in prokaryotes using directional RNA-seq short reads.
Affiliation(s)
- Shan Li
- Department of Bioinformatics and Genomics, College of Computing and Informatics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
|
170
|
Zhou ZC, Dong Y, Sun HJ, Yang AF, Chen Z, Gao S, Jiang JW, Guan XY, Jiang B, Wang B. Transcriptome sequencing of sea cucumber (Apostichopus japonicus) and the identification of gene-associated markers. Mol Ecol Resour 2013; 14:127-38. [PMID: 23855518 DOI: 10.1111/1755-0998.12147] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 05/09/2013] [Revised: 06/23/2013] [Accepted: 06/25/2013] [Indexed: 11/30/2022]
Abstract
Sea cucumber (Apostichopus japonicus) is an ecologically and economically important species in East and South-East Asia. This project aimed to identify large numbers of gene-associated markers and differentially expressed genes (DEGs) after lipopolysaccharide (LPS) challenge in A. japonicus using high-throughput transcriptome sequencing. A total of 162 million high-quality reads out of 174 million raw reads were obtained by deep sequencing on the Illumina HiSeq™ 2000 platform. Assembly of these reads generated 94 704 unigenes, with lengths ranging from 200 to 16 153 bp (average length of 810 bp). A total of 36 005 were identified as coding sequences (CDSs), 32 479 of which were successfully annotated. Based on the assembled transcriptome, we identified 142 511 high-quality single nucleotide polymorphisms (SNPs). Among them, 33 775, 63 120 and 45 616 were located in sequences without predicted CDS (non-CDSs), CDSs and untranslated regions (UTRs), respectively. These putative SNPs included 82 664 transitions and 59 847 transversions. In total, 89 375 (59.1%) were distributed in 15 473 known genes. A total of 6417 microsatellites were detected in 5970 unigenes, 3216 of which were annotated and 2481 of which were successfully used for primer design. The numbers of simple sequence repeats (SSRs) identified in non-CDSs, CDSs and UTRs were 2367, 2316 and 1734, respectively. These potential SNPs and SSRs are expected to provide abundant resources for genetic, evolutionary and ecological studies in sea cucumber. Transcriptome comparison revealed 1330, 1347 and 1291 DEGs in the coelomocytes of A. japonicus at 4 h, 24 h and 72 h after LPS challenge, respectively. Approximately 58.4% (1802) of total DEGs have been successfully annotated.
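The split of SNPs into transitions and transversions follows from base chemistry: a transition swaps a purine for a purine (A/G) or a pyrimidine for a pyrimidine (C/T), and every other substitution is a transversion. The sketch below is an illustration only (the function name is hypothetical and the authors' pipeline is not described in the abstract); it classifies a substitution and derives the transition/transversion ratio from the two counts reported above.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES


def classify_snp(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# Counts as stated in the abstract.
transitions, transversions = 82_664, 59_847
ts_tv_ratio = transitions / transversions
print(classify_snp("A", "G"), classify_snp("A", "C"), round(ts_tv_ratio, 2))
```

The two counts add up to the 142 511 SNPs reported, and the resulting ratio is about 1.38.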
Affiliation(s)
- Z C Zhou
- Liaoning Key Lab of Marine Fishery Molecular Biology, Liaoning Ocean and Fisheries Science Research Institute, Dalian, Liaoning, 116023, China
|
172
|
O'Neil ST, Emrich SJ. Assessing De Novo transcriptome assembly metrics for consistency and utility. BMC Genomics 2013; 14:465. [PMID: 23837739 PMCID: PMC3733778 DOI: 10.1186/1471-2164-14-465] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.7] [Received: 08/08/2012] [Accepted: 06/21/2013] [Indexed: 11/10/2022] Open
Abstract
Background Transcriptome sequencing and assembly represent a great resource for the study of non-model species, and many metrics have been used to evaluate and compare these assemblies. Unfortunately, it is still unclear which of these metrics accurately reflect assembly quality. Results We simulated sequencing transcripts of Drosophila melanogaster. By assembling these simulated reads using both a “perfect” and a modern transcriptome assembler while varying read length and sequencing depth, we evaluated quality metrics to determine whether they 1) revealed perfect assemblies to be of higher quality, and 2) revealed perfect assemblies to be more complete as data quantity increased. Several commonly used metrics were not consistent with these expectations, including average contig coverage and length, though they became consistent when singletons were included in the analysis. We found several annotation-based metrics to be consistent and informative, including contig reciprocal best hit count and contig unique annotation count. Finally, we evaluated a number of novel metrics such as reverse annotation count, contig collapse factor, and the ortholog hit ratio, discovering that each assesses assembly quality in a unique way. Conclusions Although much attention has been given to transcriptome assembly, little research has focused on determining how best to evaluate assemblies, particularly in light of the variety of options available for read length and sequencing depth. Our results provide an important review of these metrics and give researchers tools to produce the highest quality transcriptome assemblies.
Affiliation(s)
- Shawn T O'Neil
- Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR 97333, USA
|
173
|
Xu Z, Zhang C, Zhang X, Liu C, Wu Z, Yang Z, Zhou K, Yang X, Li F. Transcriptome profiling reveals auxin and cytokinin regulating somatic embryogenesis in different sister lines of cotton cultivar CCRI24. JOURNAL OF INTEGRATIVE PLANT BIOLOGY 2013; 55:631-42. [PMID: 23710882 DOI: 10.1111/jipb.12073] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 03/21/2013] [Accepted: 05/15/2013] [Indexed: 05/22/2023]
Abstract
To get a broader view of the molecular mechanisms underlying somatic embryogenesis (SE) in cotton (Gossypium hirsutum L.), a global analysis of cotton transcriptome dynamics during SE in different sister lines was performed using RNA-Seq. A total of 204 349 unigenes were detected by de novo assembly of the 214 977 462 Illumina reads. The quantitative reverse transcription-polymerase chain reaction (qRT-PCR) measurements were positively correlated with the RNA-Seq results for almost all the tested genes (R² = 0.841; the correlation was significant at the 0.01 level). Different phytohormone (auxin and cytokinin) concentration ratios in the medium and the endogenous content changes of these two phytohormones at two stages in different sister lines suggested roles for auxin and cytokinin during cotton SE. On the basis of global regulation of phytohormone-related genes, numerous genes among the differentially expressed transcripts were involved in auxin and cytokinin biosynthesis and signal transduction pathways. Analyses of differentially expressed genes that were involved in these pathways revealed substantial changes in gene type and abundance between two sister lines. Isolating, cloning, and silencing or overexpressing the genes that showed marked up- or down-regulation during cotton SE is therefore important. Furthermore, auxin and cytokinin play a primary role in SE, but their potential cross-talk with each other or with other factors remains unclear.
Affiliation(s)
- Zhenzhen Xu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agriculture Sciences, Anyang, 455000, China
|
174
|
Smith S, Bernatchez L, Beheregaray LB. RNA-seq analysis reveals extensive transcriptional plasticity to temperature stress in a freshwater fish species. BMC Genomics 2013; 14:375. [PMID: 23738713 PMCID: PMC3680095 DOI: 10.1186/1471-2164-14-375] [Citation(s) in RCA: 127] [Impact Index Per Article: 11.5] [Received: 01/29/2013] [Accepted: 05/27/2013] [Indexed: 11/21/2022] Open
Abstract
Background Identifying genes of adaptive significance in a changing environment is a major focus of ecological genomics. Such efforts were restricted, until recently, to researchers studying a small group of model organisms or closely related taxa. With the advent of next generation sequencing (NGS), genomes and transcriptomes of virtually any species are now available for studies of adaptive evolution. We experimentally manipulated temperature conditions for two groups of crimson spotted rainbowfish (Melanotaenia duboulayi) and measured differences in RNA transcription between them. This non-migratory species is found across a latitudinal thermal gradient in eastern Australia and is predicted to be negatively impacted by ongoing environmental and climatic change. Results Using next generation RNA-seq technologies on an Illumina HiSeq2000 platform, we assembled a de novo transcriptome and tested for differential expression across the treatment groups. Quality of the assembly was high, with an N50 length of 1856 bases. Of the 107,749 assembled contigs, we identified 4251 that were differentially expressed according to a consensus of four different mapping and significance testing approaches. Once duplicate isoforms were removed, we were able to annotate 614 up-regulated transfrags and 349 that showed reduced expression in the higher temperature group. Conclusions Annotated BLAST matches reveal that differentially expressed genes correspond to critical metabolic pathways previously shown to be important for temperature tolerance in other fish species. Our results indicate that rainbowfish exhibit predictable plastic regulatory responses to temperature stress, and the genes we identified provide excellent candidates for further investigations of population adaptation to increasing temperatures.
Affiliation(s)
- Steve Smith
- Molecular Ecology Laboratory, School of Biological Sciences, Flinders University, Adelaide, SA 5001, Australia
|
175
|
Ruttink T, Sterck L, Rohde A, Bendixen C, Rouzé P, Asp T, Van de Peer Y, Roldan-Ruiz I. Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne. PLANT BIOTECHNOLOGY JOURNAL 2013; 11:605-17. [PMID: 23433242 DOI: 10.1111/pbi.12051] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 11/16/2012] [Revised: 01/05/2013] [Accepted: 01/11/2013] [Indexed: 05/09/2023]
Abstract
Despite current advances in next-generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper de Bruijn graph-based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene-by-gene basis. Thus, we simultaneously annotate putative orthologues for each protein of the model species, resolve allelic redundancy and fragmentation, and create a de novo transcript sequence representing the consensus of all alleles present in the sequenced genotypes. We demonstrate the procedure using RNA-seq data from 14 genotypes of Lolium perenne to generate a reference transcriptome for gene discovery and translational research, to reveal the transcriptome-wide distribution and density of SNPs in an outbreeding crop and to illustrate the effect of polymorphisms on the assembly procedure. The results presented here illustrate that constructing a non-redundant reference sequence is essential not only for comparative genomics, orthology-based annotation and candidate gene selection but also for read mapping and subsequent polymorphism discovery and/or read count-based gene expression analysis.
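The claim that polymorphism hampers de Bruijn graph assembly can be illustrated with a toy graph. In the sketch below, nodes are (k-1)-mers and each k-mer contributes one edge; the two allele sequences, the choice of k = 4, and the function names are all invented for illustration and do not come from the paper. A single SNP between the alleles creates a branch point where an assembler must either pick one path, keep both as redundant allelic contigs, or break the contig.

```python
from collections import defaultdict


def de_bruijn_graph(sequences, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Nodes with more than one successor, where a simple path cannot continue."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Two hypothetical alleles that differ by one SNP (T vs A at position 7).
allele_a = "ATGGCGTACCTGA"
allele_b = "ATGGCGAACCTGA"

single = de_bruijn_graph([allele_a], k=4)
mixed = de_bruijn_graph([allele_a, allele_b], k=4)
print(branching_nodes(single), branching_nodes(mixed))
```

With one allele the graph is a single unbranched path; adding the second allele introduces one branch at the node just before the SNP, and the two paths rejoin k-1 bases later. In a highly heterozygous species such bubbles occur throughout the graph, which is consistent with the abstract's recommendation to assemble each genotype separately.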
Affiliation(s)
- Tom Ruttink
- Plant Sciences Unit--Growth and Development, Institute for Agricultural and Fisheries Research-ILVO, Melle, Belgium.
|
176
|
Poelchau MF, Reynolds JA, Denlinger DL, Elsik CG, Armbruster PA. Transcriptome sequencing as a platform to elucidate molecular components of the diapause response in the Asian tiger mosquito, Aedes albopictus. PHYSIOLOGICAL ENTOMOLOGY 2013; 38:173-181. [PMID: 23833391 PMCID: PMC3700550 DOI: 10.1111/phen.12016] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 01/03/2013] [Accepted: 03/18/2013] [Indexed: 05/26/2023]
Abstract
Diapause has long been recognized as a crucial ecological adaptation to spatio-temporal environmental variation. More recently, rapid evolution of the diapause response has been implicated in response to contemporary global warming and during the range expansion of invasive species. Although the molecular regulation of diapause remains largely unresolved, rapidly emerging next-generation sequencing (NGS) technologies provide exciting opportunities to address this longstanding question. Herein, a new assembly from life-history stages relevant to diapause in the Asian tiger mosquito, Aedes albopictus (Skuse) is presented, along with unique methods for the analysis of NGS data and transcriptome assembly. A digital normalization procedure that significantly reduces computational resources required for transcriptome assembly is evaluated. Additionally, a method for protein reference-based and genomic reference-based merged assembly of 454 and Illumina reads is described. Finally, a gene ontology analysis is presented, which creates a platform to identify physiological processes associated with diapause. Taken together, these methods provide valuable tools for analyzing the transcriptional underpinnings of many complex phenotypes, including diapause, and provide a basis for determining the molecular regulation of diapause in Ae. albopictus.
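The abstract does not say how the digital normalization was implemented. The following is a generic sketch of the usual idea: a read is kept only if the median abundance of its k-mers, among the reads kept so far, is still below a coverage cutoff. The function names, the exact dictionary-based counting, and the tiny `k` and `cutoff` values are assumptions chosen to keep the example readable; practical tools use much larger k and memory-efficient counting structures.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping k-mers of a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def digital_normalize(reads, k=4, cutoff=2):
    """Keep a read only while its median k-mer count is below `cutoff`."""
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads add to coverage
    return kept


# Toy input: one highly redundant read and one rare read.
toy_reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
print(digital_normalize(toy_reads))
```

Because redundant reads from highly expressed transcripts are discarded once coverage is reached, the assembler sees far fewer reads, which is the source of the resource savings the abstract mentions.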
|
177
|
Pérez-Portela R, Riesgo A. Optimizing preservation protocols to extract high-quality RNA from different tissues of echinoderms for next-generation sequencing. Mol Ecol Resour 2013; 13:884-9. [PMID: 23683108 DOI: 10.1111/1755-0998.12122] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 01/28/2013] [Revised: 04/02/2013] [Accepted: 04/23/2013] [Indexed: 11/26/2022]
Abstract
Transcriptomic information provides fundamental insights into biological processes. Extraction of quality RNA is a challenging step, and preservation and extraction protocols need to be adjusted in many cases. Our objectives were to optimize preservation protocols for isolation of high-quality RNA from diverse echinoderm tissues and to compare the utility of parameters such as absorbance ratios and RIN values to assess RNA quality. Three different tissues (gonad, oesophagus and coelomocytes) were selected from the sea urchin Arbacia lixula. Solid tissues were flash-frozen and stored at -80 °C until processed. Four preservation treatments were applied to coelomocytes: flash freezing and storage at -80 °C, RNAlater and storage at -20 °C, preservation in TRIzol reagent and storage at -80 °C and direct extraction with TRIzol from fresh cells. Extractions of total RNA were performed with a modified TRIzol protocol for all tissues. Our results showed high values of RNA quantity and quality for all tissues, with nonsignificant differences among them. However, while flash freezing was effective for solid tissues, it was inadequate for coelomocytes because of the low quality of the RNA extractions. Coelomocytes preserved in RNAlater displayed large variability in RNA integrity and an insufficient amount of RNA for further isolation of mRNA. TRIzol was the most efficient system for stabilizing RNA, which resulted in high RNA quality and quantity. We did not detect a correlation between absorbance ratios and RNA integrity. The best strategies for assessing RNA integrity were the visualization of 18S rRNA and 28S rRNA bands in agarose gels and estimation of RIN values with Agilent Bioanalyzer chips.
Affiliation(s)
- Rocío Pérez-Portela
- Center for Advanced Studies of Blanes (CEAB-CSIC), Acces a la Cala St. Francesc 14, Blanes, Girona, 17300, Spain.
|
178
|
Góngora-Castillo E, Buell CR. Bioinformatics challenges in de novo transcriptome assembly using short read sequences in the absence of a reference genome sequence. Nat Prod Rep 2013; 30:490-500. [PMID: 23377493 DOI: 10.1039/c3np20099j] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Indexed: 11/21/2022]
Abstract
Plant natural product research can be facilitated through genome and transcriptome sequencing approaches that generate informative sequence and expression datasets that enable characterization of biochemical pathways of interest. As the overwhelming majority of plant-derived natural products come from species with little, if any, sequence and/or genomic resources, the ability to perform whole genome shotgun sequencing and assembly has been and will continue to be transformative, as access to a genome sequence provides molecular resources and a context for discovery and characterization of biosynthetic pathways. Due to the reduced size and complexity of the transcriptome relative to the genome, transcriptome sequencing provides a rapid, inexpensive approach to access gene sequences, gene expression abundances, and gene expression patterns in any species, including those that lack a reference genome sequence. To date, successful applications of RNA sequencing in conjunction with de novo transcriptome assembly have enabled identification of new genes in an array of biochemical pathways in plants. While sequencing technologies are well developed, challenges remain in the handling and analysis of transcriptome sequences. In this Highlight article, we provide an overview of the bioinformatics challenges associated with transcriptome analyses using short read sequences and how to address these issues in plant species that lack a reference genome.
|
179
|
Haegeman A, Bauters L, Kyndt T, Rahman MM, Gheysen G. Identification of candidate effector genes in the transcriptome of the rice root knot nematode Meloidogyne graminicola. MOLECULAR PLANT PATHOLOGY 2013; 14:379-90. [PMID: 23279209 PMCID: PMC6638898 DOI: 10.1111/mpp.12014] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 05/05/2023]
Abstract
Plant-parasitic nematodes secrete so-called effectors into their host plant which are able to suppress the plant's defence responses, alter plant signalling pathways and, in the case of root knot nematodes, induce the formation of giant cells. Putative effectors have been successfully identified by genomics, transcriptomics and proteomics approaches. In this study, we investigated the transcriptome of the rice root knot nematode Meloidogyne graminicola by 454 sequencing of second-stage juveniles as well as mRNA-seq of rice infected tissue. Over 350 000 reads derived from M. graminicola preparasitic juveniles were assembled, annotated and checked for homologues in different databases. From infected rice tissue, 1.4% of all reads generated were identified as being derived from the nematode. Using multiple strategies, several putative effector genes were identified, both pioneer genes and genes corresponding to already known effectors. To check whether these genes could be involved in the interaction with the plant, in situ hybridization was performed on a selection of genes to localize their expression in the nematode. Most were expressed in the gland cells or amphids of the nematode, confirming possible secretion of the proteins and hence a role in infection. Other putative effectors showed a different expression pattern, potentially linked with the excretory/secretory system. This transcriptome study is a good starting point to functionally investigate novel effectors derived from M. graminicola. This will lead to better insights into the interaction between these nematodes and the model plant rice. Moreover, the transcriptome can be used to identify possible target genes for RNA interference (RNAi)-based control strategies. Four genes proved to be interesting targets by showing up to 40% higher mortality relative to the control treatment when soaked in gene-specific small interfering RNAs (siRNAs).
Affiliation(s)
- Annelies Haegeman
- Department of Molecular Biotechnology, Ghent University, B-9000, Ghent, Belgium
|
180
|
Lee BR, Cho S, Song Y, Kim SC, Cho BK. Emerging tools for synthetic genome design. Mol Cells 2013; 35:359-70. [PMID: 23708771 PMCID: PMC3887862 DOI: 10.1007/s10059-013-0127-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/23/2013] [Accepted: 04/26/2013] [Indexed: 12/29/2022] Open
Abstract
Synthetic biology is an emerging discipline for designing and synthesizing predictable, measurable, controllable, and transformable biological systems. These newly designed biological systems have great potential for the development of cheaper drugs, green fuels, biodegradable plastics, and targeted cancer therapies over the coming years. Fortunately, our ability to quickly and accurately engineer biological systems that behave predictably has been dramatically expanded by significant advances in DNA-sequencing, DNA-synthesis, and DNA-editing technologies. Here, we review emerging technologies and methodologies in the field of building designed biological systems, and we discuss their future perspectives.
Affiliation(s)
- Bo-Rahm Lee
- Intelligent Synthetic Biology Center, Daejeon 305-701, Korea
- Suhyung Cho
- Intelligent Synthetic Biology Center, Daejeon 305-701, Korea
- Department of Biological Sciences and Korea Advanced Institute of Science and Technology Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea
- Yoseb Song
- Intelligent Synthetic Biology Center, Daejeon 305-701, Korea
- Department of Biological Sciences and Korea Advanced Institute of Science and Technology Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea
- Sun Chang Kim
- Intelligent Synthetic Biology Center, Daejeon 305-701, Korea
- Department of Biological Sciences and Korea Advanced Institute of Science and Technology Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea
- Byung-Kwan Cho
- Intelligent Synthetic Biology Center, Daejeon 305-701, Korea
- Department of Biological Sciences and Korea Advanced Institute of Science and Technology Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea
|
181
|
Haegeman A, Bauters L, Kyndt T, Rahman MM, Gheysen G. Identification of candidate effector genes in the transcriptome of the rice root knot nematode Meloidogyne graminicola. MOLECULAR PLANT PATHOLOGY 2013; 14:379-390. [PMID: 23279209 DOI: 10.1111/mpp.12014 [epub ahead of print]] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/28/2023]
|
182
|
Liu S, Wang X, Sun F, Zhang J, Feng J, Liu H, Rajendran KV, Sun L, Zhang Y, Jiang Y, Peatman E, Kaltenboeck L, Kucuktas H, Liu Z. RNA-Seq reveals expression signatures of genes involved in oxygen transport, protein synthesis, folding, and degradation in response to heat stress in catfish. Physiol Genomics 2013; 45:462-76. [PMID: 23632418 DOI: 10.1152/physiolgenomics.00026.2013] [Citation(s) in RCA: 113] [Impact Index Per Article: 10.3] [Indexed: 12/20/2022] Open
Abstract
Temperature is one of the most prominent abiotic factors affecting ectotherms. Most fish species, as ectotherms, have an extraordinary ability to deal with a wide range of temperature changes. While the molecular mechanism underlying temperature adaptation has long been of interest, it is still largely unexplored in fish. Understanding the fundamental mechanisms conferring tolerance to temperature fluctuations is a topic of increasing interest, as temperature may continue to rise as a result of global climate change. Catfish have a wide natural habitat and possess great plasticity in dealing with environmental variations in temperature. However, no studies have been conducted at the transcriptomic level to determine heat stress-induced gene expression. In the present study, we conducted an RNA-Seq analysis to identify heat stress-induced genes in catfish at the transcriptome level. Expression analysis identified a total of 2,260 differentially expressed genes with a cutoff of twofold change. qRT-PCR validation suggested the high reliability of the RNA-Seq results. Gene ontology, enrichment, and pathway analyses were conducted to gain insight into physiological and gene pathways. Specifically, genes involved in oxygen transport, protein folding and degradation, and metabolic process were highly induced, while general protein synthesis was dramatically repressed in response to the lethal temperature stress. This is the first RNA-Seq-based expression study in catfish in response to heat stress. The candidate genes identified should be valuable for further targeted studies on heat tolerance, thereby assisting the development of heat-tolerant catfish lines for aquaculture.
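A twofold-change cutoff such as the one mentioned in this abstract corresponds to an absolute log2 fold change of at least 1. The sketch below shows only that filtering step. The gene labels, the expression values and the pseudocount are invented for illustration, and the abstract does not state which normalization or statistical test accompanied the cutoff.

```python
import math


def twofold_changed(control, treated, pseudocount=1.0):
    """Return genes whose expression changed at least twofold.

    `control` and `treated` map gene names to normalized expression values.
    The pseudocount (an assumption) avoids division by zero.
    """
    hits = {}
    for gene in control.keys() & treated.keys():
        ratio = (treated[gene] + pseudocount) / (control[gene] + pseudocount)
        log2_fc = math.log2(ratio)
        if abs(log2_fc) >= 1.0:  # twofold up or down
            hits[gene] = log2_fc
    return hits


# Hypothetical expression values, not taken from the study.
control = {"geneA": 9, "geneB": 39, "geneC": 100}
treated = {"geneA": 79, "geneB": 9, "geneC": 110}
print(twofold_changed(control, treated))
```

Working on the log2 scale treats induction and repression symmetrically, so a gene halved in expression passes the same threshold as a gene doubled.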
Affiliation(s)
- Shikai Liu
- The Fish Molecular Genetics and Biotechnology Laboratory, Department of Fisheries and Allied Aquacultures and Program of Cell and Molecular Biosciences, Aquatic Genomics Unit, Auburn University, Auburn, Alabama, USA
|
183
|
Chen G, Wang C, Shi L, Tong W, Qu X, Chen J, Yang J, Shi C, Chen L, Zhou P, Lu B, Shi T. Comprehensively identifying and characterizing the missing gene sequences in human reference genome with integrated analytic approaches. Hum Genet 2013; 132:899-911. [PMID: 23572138 DOI: 10.1007/s00439-013-1300-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 08/19/2012] [Accepted: 03/25/2013] [Indexed: 11/25/2022]
Abstract
The human reference genome is still incomplete and a number of gene sequences are missing from it. The approaches to uncover them, the reasons for their absence and their functions remain little explored. Here, we comprehensively identified and characterized the missing genes of the human reference genome with RNA-Seq data from 16 different human tissues. By using a combined approach of genome-guided transcriptome reconstruction coupled with genome-wide comparison, we uncovered 3.78 and 2.37 Mb transcribed regions in the human genome assemblies of Celera and HuRef that are either missing from their homologous chromosomes of NCBI human reference genome build 37.2 or partially or entirely absent from the reference. We further identified a significant number of novel transcript contigs in each tissue from de novo transcriptome assembly that are unalignable to NCBI build 37.2 but can be aligned to at least one of the genomes from Celera, HuRef, chimpanzee, macaque or mouse. Our analyses indicate that the missing genes could result from genome misassembly, transposition, copy number variation, translocation and other structural variations. Moreover, our results further suggest that a large portion of these missing genes are conserved between human and other mammals, implying that they have important biological functions. In total, 1,233 functional protein domains were detected in these missing genes. Collectively, our study not only provides approaches for uncovering the missing genes of a genome, but also proposes potential reasons why genes are missing from the genome and highlights the importance of uncovering the missing genes of incomplete genomes.
Affiliation(s)
- Geng Chen
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, China
|
184
|
Functional Annotation and Comparative Analysis of a Zygopteran Transcriptome. G3-GENES GENOMES GENETICS 2013; 3:763-770. [PMID: 23550132 PMCID: PMC3618363 DOI: 10.1534/g3.113.005637] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
In this paper we present a de novo assembly of the transcriptome of the damselfly (Enallagma hageni) through the use of 454 pyrosequencing. E. hageni is a member of the suborder Zygoptera, in the order Odonata, and Odonata organisms form the basal lineage of the winged insects (Pterygota). To date, sequence data used in phylogenetic analysis of Enallagma species have been derived from either mitochondrial DNA or ribosomal nuclear DNA. This Enallagma transcriptome contained 31,661 contigs that were assembled and translated into 14,813 individual open reading frames. Using these data, we constructed an extensive dataset of 634 orthologous nuclear protein-encoding genes across 11 species of Arthropoda and used Bayesian techniques to elucidate the position of Enallagma in the arthropod phylogenetic tree. Additionally, we demonstrated that the Enallagma transcriptome contains 169 genes that are evolving at rates that differ relative to those of the rest of the transcriptome (29 accelerated and 140 decreased), and, through multiple Gene Ontology searches and clustering methods, we present the first functional annotation of any palaeopteran’s transcriptome in the literature.
|
185
|
Werner GDA, Gemmell P, Grosser S, Hamer R, Shimeld SM. Analysis of a deep transcriptome from the mantle tissue of Patella vulgata Linnaeus (Mollusca: Gastropoda: Patellidae) reveals candidate biomineralising genes. MARINE BIOTECHNOLOGY (NEW YORK, N.Y.) 2013; 15:230-243. [PMID: 22865210 DOI: 10.1007/s10126-012-9481-0] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 04/24/2012] [Accepted: 07/17/2012] [Indexed: 06/01/2023]
Abstract
The gastropod Patella vulgata is abundant on rocky shores in Northern Europe and a significant grazer of intertidal algae. Here we report the application of Illumina sequencing to develop a transcriptome from the adult mantle tissue of P. vulgata. We obtained 47,237,104 paired-end reads of 51 bp, trialled de novo assembly methods and settled on the additive multiple K method followed by redundancy removal as resulting in the most comprehensive assembly. This yielded 29,489 contigs of at least 500 bp in length. We then used three methods to search for candidate genes relevant to biomineralisation: searches via BLAST and Hidden Markov Models for homologues of biomineralising genes from other molluscs, searches for predicted proteins containing tandem repeats and searches for secreted proteins that lacked a transmembrane domain. From the results of these searches we selected 15 contigs for verification by RT-PCR, of which 14 were successfully amplified and cloned. These included homologues of Pif-177/BSMP, Perlustrin, SPARC, AP24, Follistatin-like and Carbonic anhydrase, as well as three containing extensive G-X-Y repeats as found in nacrein. We selected two for further verification by in situ hybridisation, demonstrating expression in the larval shell field. We conclude that de novo assembly of Illumina data offers a cheap and rapid route to a predicted transcriptome that can be used as a resource for further biological study.
Affiliation(s)
- Gijsbert D A Werner
- Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK
|
186
|
Hughes GM, Gang L, Murphy WJ, Higgins DG, Teeling EC. Using Illumina next generation sequencing technologies to sequence multigene families in de novo species. Mol Ecol Resour 2013; 13:510-21. [PMID: 23480365 DOI: 10.1111/1755-0998.12087] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 07/26/2012] [Revised: 01/27/2013] [Accepted: 01/29/2013] [Indexed: 11/27/2022]
Abstract
The advent of Next Generation Sequencing Technology (NGST) has revolutionized molecular biology research, allowing for rapid gene/genome sequencing from a multitude of diverse species. As high throughput sequencing becomes more accessible, more efficient workflows must be developed to deal with the amounts of data produced and better assemble the genomes of de novo lineages. We combine traditional laboratory methods with Illumina NGST to amplify and sequence the largest mammalian multigene family, the Olfactory Receptor gene family, for species with and without a reference genome. We develop novel assembly methods to annotate and filter these data, which can be utilized for any gene family or any species. We find no significant difference between the ratio of genes within their respective gene families of our data compared with available genomic data. Using simulated data we explore the limitations of short-read sequence data and our assembly in recovering this gene family. We highlight the benefits and shortcomings of these methods. Compared with data generated from traditional polymerase chain reaction, cloning and Sanger sequencing methodologies, sequence data generated using our pipeline increases yield and sequencing efficiency without reducing the number of unique genes amplified. A cloning step is not required, therefore shortening data generation time. The novel downstream methodologies and workflows described provide a tool to be utilized by many fields of biology, to access and analyze the vast quantities of data generated. By combining laboratory and in silico methods, we provide a means of extracting genomic information for multigene families without complete genome sequencing.
Affiliation(s)
- Graham M Hughes
- UCD School of Biological and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
|
187
|
Pérez-Porro AR, Navarro-Gómez D, Uriz MJ, Giribet G. A NGS approach to the encrusting Mediterranean sponge Crella elegans (Porifera, Demospongiae, Poecilosclerida): transcriptome sequencing, characterization and overview of the gene expression along three life cycle stages. Mol Ecol Resour 2013; 13:494-509. [PMID: 23437888 DOI: 10.1111/1755-0998.12085] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 08/08/2012] [Revised: 01/16/2013] [Accepted: 01/18/2013] [Indexed: 11/28/2022]
Abstract
Sponges can be dominant organisms in many marine and freshwater habitats where they play essential ecological roles. They also represent a key group to address important questions in early metazoan evolution. Recent approaches for improving knowledge on sponge biological and ecological functions as well as on animal evolution have focused on the genetic toolkits involved in ecological responses to environmental changes (biotic and abiotic), development and reproduction. These approaches are possible thanks to newly available, massive sequencing technologies, such as the Illumina platform, which facilitate genome and transcriptome sequencing in a cost-effective manner. Here we present the first NGS (next-generation sequencing) approach to understanding the life cycle of an encrusting marine sponge. For this we sequenced libraries of three different life cycle stages of the Mediterranean sponge Crella elegans and generated de novo transcriptome assemblies. Three assemblies were each based on sponge tissue of a particular life cycle stage: non-reproductive tissue, tissue with sperm cysts and tissue with larvae. The fourth assembly pooled the data from all three stages. By aggregating data from all the different life cycle stages we obtained a higher total number of contigs, contigs with BLAST hits and annotated contigs than from the single-stage assemblies. In that multi-stage assembly we obtained a larger number of the developmental regulatory genes known for metazoans than in any other assembly. We also present the differential expression of selected genes in the three life cycle stages to explore the potential of RNA-seq for improving knowledge on functional processes along the sponge life cycle.
Affiliation(s)
- A R Pérez-Porro
- Center for Advanced Studies of Blanes (CEAB-CSIC), Girona, Blanes 17300, Spain.
|
188
|
Long Y, Li Q, Zhou B, Song G, Li T, Cui Z. De novo assembly of mud loach (Misgurnus anguillicaudatus) skin transcriptome to identify putative genes involved in immunity and epidermal mucus secretion. PLoS One 2013; 8:e56998. [PMID: 23437293 PMCID: PMC3577766 DOI: 10.1371/journal.pone.0056998] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 10/30/2012] [Accepted: 01/16/2013] [Indexed: 12/17/2022] Open
Abstract
Fish skin serves as the first line of defense against a wide variety of chemical, physical and biological stressors. Secretion of mucus is among the most prominent characteristics of fish skin and numerous innate immune factors have been identified in the epidermal mucus. However, molecular mechanisms underlying the mucus secretion and immune activities of fish skin remain largely unclear due to the lack of genomic and transcriptomic data for most economically important fish species. In this study, we characterized the skin transcriptome of mud loach using Illumina paired-end sequencing. A total of 40364 unigenes were assembled from 86.6 million (3.07 gigabases) filtered reads. The mean length, N50 size and maximum length of assembled transcripts were 387, 611 and 8670 bp, respectively. A total of 17336 (43.76%) unigenes were annotated by blast searches against the NCBI non-redundant protein database. Gene ontology mapping assigned a total of 108513 GO terms to 15369 (38.08%) unigenes. KEGG orthology mapping annotated 9337 (23.23%) unigenes. Among the identified KO categories, immune system is the largest category that contains various components of multiple immune pathways such as chemokine signaling, leukocyte transendothelial migration and T cell receptor signaling, suggesting the complexity of immune mechanisms in fish skin. As for mucin biosynthesis, 37 unigenes were mapped to 7 enzymes of the mucin type O-glycan biosynthesis pathway and 8 members of the polypeptide N-acetylgalactosaminyltransferase family were identified. Additionally, 38 unigenes were mapped to 23 factors of the SNARE interactions in vesicular transport pathway, indicating that the activity of this pathway is required for the processes of epidermal mucus storage and release. Moreover, 1754 simple sequence repeats (SSRs) were detected in 1564 unigenes and dinucleotide repeats represented the most abundant type. These findings have laid the foundation for further understanding the secretory processes and immune functions of loach skin mucus.
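Several abstracts in this list report an N50 size for an assembly. As a reminder of what that statistic measures, here is a minimal sketch using the common definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled bases. The contig lengths are toy values, not data from any of the studies listed.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # prints 400
```

Unlike the mean length, N50 is weighted toward long contigs, which is why the abstract above can report a mean of 387 bp alongside a larger N50.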
Affiliation(s)
- Yong Long
- The Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, P. R. China
- Qing Li
- The Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, P. R. China
- Bolan Zhou
- The Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, P. R. China
- University of the Chinese Academy of Sciences, Beijing, P. R. China
- Guili Song
- The Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, P. R. China
- Tao Li
- The Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, P. R. China
- Zongbin Cui
- The Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, P. R. China
|
189
|
Looso M, Preussner J, Sousounis K, Bruckskotten M, Michel CS, Lignelli E, Reinhardt R, Höffner S, Krüger M, Tsonis PA, Borchardt T, Braun T. A de novo assembly of the newt transcriptome combined with proteomic validation identifies new protein families expressed during tissue regeneration. Genome Biol 2013; 14:R16. [PMID: 23425577 PMCID: PMC4054090 DOI: 10.1186/gb-2013-14-2-r16] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 09/27/2012] [Accepted: 02/20/2013] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND Notophthalmus viridescens, a urodelian amphibian, represents an excellent model organism to study regenerative processes, but mechanistic insights into molecular processes driving regeneration have been hindered by a paucity and poor annotation of coding nucleotide sequences. The enormous genome size and the lack of a closely related reference genome have so far prevented assembly of the urodelian genome. RESULTS We describe the de novo assembly of the transcriptome of the newt Notophthalmus viridescens and its experimental validation. RNA pools covering embryonic and larval development, different stages of heart, appendage and lens regeneration, as well as a collection of different undamaged tissues were used to generate sequencing datasets on Sanger, Illumina and 454 platforms. Through a sequential de novo assembly strategy, hybrid datasets were converged into one comprehensive transcriptome comprising 120,922 non-redundant transcripts with an N50 of 975. From this, 38,384 putative transcripts were annotated and around 15,000 transcripts were experimentally validated as protein coding by mass spectrometry-based proteomics. Bioinformatic analysis of coding transcripts identified 826 proteins specific for urodeles. Several newly identified proteins establish novel protein families based on the presence of new sequence motifs without counterparts in public databases, while others containing known protein domains extend already existing families and also constitute new ones. CONCLUSIONS We demonstrate that our multistep assembly approach allows de novo assembly of the newt transcriptome with an annotation grade comparable to well characterized organisms. Our data provide the groundwork for mechanistic experiments to answer the question whether urodeles utilize proprietary sets of genes for tissue regeneration.
|
190
|
Singhal S. De novo transcriptomic analyses for non‐model organisms: an evaluation of methods across a multi‐species data set. Mol Ecol Resour 2013; 13:403-16. [DOI: 10.1111/1755-0998.12077] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 11/07/2012] [Revised: 12/13/2012] [Accepted: 12/22/2012] [Indexed: 01/09/2023]
Affiliation(s)
- Sonal Singhal
- Museum of Vertebrate Zoology University of California, Berkeley 3101 Valley Life Sciences Building Berkeley CA 94720‐3160 USA
- Department of Integrative Biology University of California, Berkeley 1005 Valley Life Sciences Building Berkeley CA 94720‐3140 USA
|
191
|
Su CL, Chao YT, Yen SH, Chen CY, Chen WC, Chang YCA, Shih MC. Orchidstra: an integrated orchid functional genomics database. PLANT & CELL PHYSIOLOGY 2013; 54:e11. [PMID: 23324169 PMCID: PMC3583029 DOI: 10.1093/pcp/pct004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 05/05/2023]
Abstract
A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species.
Affiliation(s)
- Chun-lin Su
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- These authors contributed equally to this work
| | - Ya-Ting Chao
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- These authors contributed equally to this work
| | - Shao-Hua Yen
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Chun-Yi Chen
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Wan-Chieh Chen
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Yao-Chien Alex Chang
- Department of Horticulture and Landscape Architecture, National Taiwan University, Taipei 10617, Taiwan
| | - Ming-Che Shih
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- *Corresponding author: E-mail: ; Fax, +886-2-26515693
|
192
|
Abstract
Advances in sequencing technologies and increased access to sequencing services have led to renewed interest in sequence and genome assembly. Concurrently, new applications for sequencing have emerged, including gene expression analysis, discovery of genomic variants and metagenomics, and each of these has different needs and challenges in terms of assembly. We survey the theoretical foundations that underlie modern assembly and highlight the options and practical trade-offs that need to be considered, focusing on how individual features address the needs of specific applications. We also review key software and the interplay between experimental design and efficacy of assembly.
Affiliation(s)
- Niranjan Nagarajan
- Computational and Systems Biology, Genome Institute of Singapore, 138672 Singapore
|
193
|
Venturini L, Ferrarini A, Zenoni S, Tornielli GB, Fasoli M, Santo SD, Minio A, Buson G, Tononi P, Zago ED, Zamperin G, Bellin D, Pezzotti M, Delledonne M. De novo transcriptome characterization of Vitis vinifera cv. Corvina unveils varietal diversity. BMC Genomics 2013; 14:41. [PMID: 23331995 PMCID: PMC3556335 DOI: 10.1186/1471-2164-14-41] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.6] [Received: 10/11/2012] [Accepted: 01/11/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Plants such as grapevine (Vitis spp.) display significant inter-cultivar genetic and phenotypic variation. The genetic components underlying phenotypic diversity in grapevine must be understood in order to disentangle genetic and environmental factors. RESULTS We have shown that cDNA sequencing by RNA-seq is a robust approach for the characterization of varietal diversity between a local grapevine cultivar (Corvina) and the PN40024 reference genome. We detected 15,161 known genes including 9463 with novel splice isoforms, and identified 2321 potentially novel protein-coding genes in non-annotated or unassembled regions of the reference genome. We also discovered 180 apparent private genes in the Corvina genome which were missing from the reference genome. CONCLUSIONS The de novo assembly approach allowed a substantial amount of the Corvina transcriptome to be reconstructed, improving known gene annotations by robustly defining gene structures, annotating splice isoforms and detecting genes without annotations. The private genes we discovered are likely to be nonessential but could influence certain cultivar-specific characteristics. Therefore, the application of de novo transcriptome assembly should not be restricted to species lacking a reference genome because it can also improve existing reference genome annotations and identify novel, cultivar-specific genes.
Affiliation(s)
- Luca Venturini
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Alberto Ferrarini
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Sara Zenoni
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | | | - Marianna Fasoli
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Silvia Dal Santo
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Andrea Minio
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Genny Buson
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Paola Tononi
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Elisa Debora Zago
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Gianpiero Zamperin
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Diana Bellin
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Mario Pezzotti
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
| | - Massimo Delledonne
- Biotechnology Department, University of Verona, Strada Le Grazie 15, I-37134, Verona, Italy
|
194
|
SymGRASS: a database of sugarcane orthologous genes involved in arbuscular mycorrhiza and root nodule symbiosis. BMC Bioinformatics 2013; 14 Suppl 1:S2. [PMID: 23368899 PMCID: PMC3548678 DOI: 10.1186/1471-2105-14-s1-s2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND Applying synthetic nitrogen fertilizers is highly energy demanding, which motivates gathering information, for sustainable biofuel production, on plants that procure nitrogen through symbiotic interactions controlled by a common genetic program. We curated sequence information publicly available for the biofuel plant sugarcane, performed an analysis of the common SYM pathway known to control symbiosis in other plants, and provide results, sequences and literature links as an online database. METHODS Sugarcane sequences and information were downloaded from the nucEST database, cleaned and trimmed with seqclean, assembled with TGICL plus a translating mapping method, and annotated. The annotation is based on BLAST searches against a locally formatted plant Uniprot90 generated with CD-HIT for functional assignment, rpsBLAST against the CDD database for conserved domain analysis, and a BLAST search against sorghum for Gene Ontology (GO) assignment. Gene expression was normalized according to the Unigene standard and is presented as ESTs/100 kb. Protein sequences known in the SYM pathway were used as queries to search the SymGRASS sequence database. Additionally, antimicrobial peptides described in the PhytAMP database served as queries to retrieve and generate expression profiles of these defense genes in the libraries compared to the libraries obtained under symbiotic interactions. RESULTS We describe SymGRASS, a database of sugarcane orthologous genes involved in arbuscular mycorrhiza (AM) and root nodule (RN) symbiosis. The database aggregates knowledge about sequences, tissues, organs, developmental stages and experimental conditions, and provides annotation and level of gene expression for sugarcane transcripts and SYM orthologous genes in sugarcane through a web interface. Several candidate genes were found for all nodes in the pathway, and interestingly a set of symbiosis-specific genes was found.
CONCLUSIONS The knowledge integrated in SymGRASS may guide studies on molecular, cellular and physiological mechanisms by which sugarcane controls the establishment and efficiency of endophytic associations. We believe that the candidate sequences for the SYM pathway together with the pool of exclusively expressed tentative consensus (TC) sequences are crucial for the design of molecular studies to unravel the mechanisms controlling the establishment of symbioses in sugarcane, ultimately serving as a basis for the improvement of grass crops.
|
195
|
Schafleitner R, Kumar S, Lin CY, Hegde SG, Ebert A. The okra (Abelmoschus esculentus) transcriptome as a source for gene sequence information and molecular markers for diversity analysis. Gene 2013; 517:27-36. [PMID: 23299025 DOI: 10.1016/j.gene.2012.12.098] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 07/02/2012] [Revised: 11/26/2012] [Accepted: 12/19/2012] [Indexed: 01/01/2023]
Abstract
A combined leaf and pod transcriptome of okra (Abelmoschus esculentus (L.) Moench) has been produced by RNA sequencing and short read assembly. More than 150,000 unigenes were obtained, comprising some 46 million base pairs of sequence information. More than 55% of the unigenes were annotated through sequence comparison with databases. The okra transcriptome sequences were mined for simple sequence repeat (SSR) markers. From 935 non-redundant SSR motifs identified in the unigene set, 199 were chosen for testing in a germplasm set, resulting in 161 polymorphic SSR markers. From this set, 19 markers were selected for a diversity analysis on 65 okra accessions comprising three different species, revealing 58 different genotypes and resulting in clustering of the accessions according to species and geographic origin. The okra gene sequence information and the marker resource are made available to the research community for functional genomics and breeding research.
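The SSR mining step mentioned in the abstract above can be illustrated with a small regular-expression scan for perfect di-, tri- and tetranucleotide repeats. This is only a sketch of the general technique in Python: the function name `find_ssrs`, the minimum repeat counts in `MIN_REPEATS` and the example sequences are hypothetical choices for illustration, and the abstract does not say which tool or thresholds the authors used.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip motifs that are themselves repeats of a shorter motif,
            # e.g. "AA" or "ATAT", so each locus is reported at most once.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence containing seven tandem copies of "AT".
    print(find_ssrs("GGC" + "AT" * 7 + "CCG"))
```

On the toy sequence, the scan reports a single dinucleotide repeat, `(3, 'AT', 7)`. A real marker pipeline would also handle compound and imperfect repeats and would design primers in the flanking sequence, which this sketch does not attempt.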
Affiliation(s)
- Roland Schafleitner
- AVRDC - The World Vegetable Center, P.O. Box 42, Shanhua, Tainan 74199, Taiwan.
|
196
|
Abstract
Sequencing of mRNA using next-generation sequencing (NGS) technologies (RNA-seq) has the potential to reveal unprecedented complexity of the transcriptomes. The transcriptome sequencing of an organism provides quick insights into the gene space and enables the isolation of genes of interest, development of functional markers, quantitation of gene expression, and comparative genomic studies. Although becoming cheaper, transcriptome sequencing still remains an expensive endeavor. Further, the assembly of millions to billions of RNA-seq reads to construct the complete transcriptome poses great informatics challenges. Here, we first outline various important issues from experimental design to data analysis, including various strategies of transcriptome assembly, which need substantial consideration for a successful RNA-seq experiment. Further, we describe a method for using RNA-seq to characterize the transcriptome of a plant species, taking the example of a legume crop plant, chickpea. Our aim is to provide a quick start guide to nonexpert researchers for NGS-based transcriptome analysis.
Affiliation(s)
- Rohini Garg
- National Institute of Plant Genome Research, New Delhi, India
|
197
|
Saidi-Mehrabad A, He Z, Tamas I, Sharp CE, Brady AL, Rochman FF, Bodrossy L, Abell GC, Penner T, Dong X, Sensen CW, Dunfield PF. Methanotrophic bacteria in oilsands tailings ponds of northern Alberta. ISME JOURNAL 2012; 7:908-21. [PMID: 23254511 DOI: 10.1038/ismej.2012.163] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Indexed: 12/28/2022]
Abstract
We investigated methanotrophic bacteria in slightly alkaline surface water (pH 7.4-8.7) of oilsands tailings ponds in Fort McMurray, Canada. These large lakes (up to 10 km²) contain water, silt, clay and residual hydrocarbons that are not recovered in oilsands mining. They are primarily anoxic and produce methane but have an aerobic surface layer. Aerobic methane oxidation was measured in the surface water at rates up to 152 nmol CH₄ ml⁻¹ water d⁻¹. Microbial diversity was investigated via pyrotag sequencing of amplified 16S rRNA genes, as well as by analysis of methanotroph-specific pmoA genes using both pyrosequencing and microarray analysis. The predominantly detected methanotroph in surface waters at all sampling times was an uncultured species related to the gammaproteobacterial genus Methylocaldum, although a few other methanotrophs were also detected, including Methylomonas spp. Active species were identified via ¹³CH₄ stable isotope probing (SIP) of DNA, combined with pyrotag sequencing and shotgun metagenomic sequencing of heavy ¹³C-DNA. The SIP-PCR results demonstrated that the Methylocaldum and Methylomonas spp. actively consumed methane in fresh tailings pond water. Metagenomic analysis of DNA from the heavy SIP fraction verified the PCR-based results and identified additional pmoA genes not detected via PCR. The metagenome indicated that the overall methylotrophic community possessed known pathways for formaldehyde oxidation, carbon fixation and detoxification of nitrogenous compounds but appeared to possess only particulate methane monooxygenase, not soluble methane monooxygenase.
|
198
|
Hassan MA, Melo MB, Haas B, Jensen KDC, Saeij JPJ. De novo reconstruction of the Toxoplasma gondii transcriptome improves on the current genome annotation and reveals alternatively spliced transcripts and putative long non-coding RNAs. BMC Genomics 2012; 13:696. [PMID: 23231500 PMCID: PMC3543268 DOI: 10.1186/1471-2164-13-696] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 06/22/2012] [Accepted: 12/04/2012] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Accurate gene model predictions and annotation of alternative splicing events are imperative for genomic studies in organisms that contain genes with multiple exons. Currently most gene models for the intracellular parasite Toxoplasma gondii are based on computer model predictions without cDNA sequence verification. Additionally, the nature and extent of alternative splicing in Toxoplasma gondii is unknown. In this study, we used de novo transcript assembly and the published type II (ME49) genomic sequence to quantify the extent of alternative splicing in Toxoplasma and to improve the current Toxoplasma gene annotations. RESULTS We used high-throughput RNA-sequencing data to assemble full-length transcripts, independently of a reference genome, followed by gene annotation based on the ME49 genome. We assembled 13,533 transcripts overlapping with known ME49 genes in ToxoDB and then used this set to: a) improve the annotation in the untranslated regions of ToxoDB genes, b) identify novel exons within protein-coding ToxoDB genes, and c) report on 50 previously unidentified alternatively spliced transcripts. Additionally, we assembled a set of 2,930 transcripts not overlapping with any known ME49 genes in ToxoDB. From this set, we have identified 118 new ME49 genes, 18 novel Toxoplasma genes, and putative non-coding RNAs. CONCLUSION RNA-seq data and de novo transcript assembly provide a robust way to update incompletely annotated genomes, like the Toxoplasma genome. We have used RNA-seq to improve the annotation of several Toxoplasma genes and to identify alternatively spliced genes, novel genes, novel exons, and putative non-coding RNAs.
Affiliation(s)
- Musa A Hassan
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
|
199
|
Yang QS, Wu JH, Li CY, Wei YR, Sheng O, Hu CH, Kuang RB, Huang YH, Peng XX, McCardle JA, Chen W, Yang Y, Rose JKC, Zhang S, Yi GJ. Quantitative proteomic analysis reveals that antioxidation mechanisms contribute to cold tolerance in plantain (Musa paradisiaca L.; ABB Group) seedlings. Mol Cell Proteomics 2012; 11:1853-69. [PMID: 22982374 PMCID: PMC3518116 DOI: 10.1074/mcp.m112.022079] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.0] [Received: 07/09/2012] [Revised: 08/30/2012] [Indexed: 12/22/2022] Open
Abstract
Banana and its close relative, plantain, are globally important crops and there is considerable interest in optimizing their cultivation. Plantain has superior cold tolerance compared with banana and a thorough understanding of the molecular mechanisms and responses of plantain to cold stress has great potential value for developing cold tolerant banana cultivars. In this study, we used iTRAQ-based comparative proteomic analysis to investigate the temporal responses of plantain to cold stress. Plantain seedlings were exposed to cold stress at 8 °C for 0, 6, and 24 h and subsequently allowed to recover for 24 h at 28 °C. A total of 3477 plantain proteins were identified, of which 809 showed differential expression from the three treatments. The majority of differentially expressed proteins were predicted to be involved in oxidation-reduction, including oxylipin biosynthesis, whereas others were associated with photosynthesis, photorespiration, and several primary metabolic processes, such as carbohydrate metabolic process and fatty acid beta-oxidation. Western blot analysis and enzyme activity assays were performed on seven differentially expressed, cold-response candidate plantain proteins to validate the proteomics data. Similar analyses of the seven candidate proteins were performed in cold-sensitive banana to examine possible functional conservation, and to compare the results to equivalent responses between the two species. Consistent results were achieved by Western blot and enzyme activity assays, demonstrating that the quantitative proteomics data collected in this study are reliable. Our results suggest that an increase of antioxidant capacity through adapted ROS scavenging capability, reduced production of ROS, and decreased lipid peroxidation contribute to molecular mechanisms for the increased cold tolerance in plantain. To the best of our knowledge, this is the first report of a global investigation on molecular responses of plantain to cold stress by proteomic analysis.
Affiliation(s)
- Qiao-Song Yang
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
| | - Jun-Hua Wu
- ¶State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, South China Agricultural University, Guangzhou, China
| | - Chun-Yu Li
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
| | - Yue-Rong Wei
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
| | - Ou Sheng
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
| | - Chun-Hua Hu
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
| | - Rui-Bin Kuang
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
| | - Yong-Hong Huang
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
| | - Xin-Xiang Peng
- ¶State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, South China Agricultural University, Guangzhou, China
| | | | - Wei Chen
- ‖Institute for Biotechnology and Life Science Technologies
| | - Yong Yang
- **Robert W. Holley Center for Agriculture and Health, USDA-ARS
| | | | - Sheng Zhang
- ‖Institute for Biotechnology and Life Science Technologies
| | - Gan-Jun Yi
- From the ‡Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences
- §Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture, Guangzhou, China
|
200
|
Riesgo A, Andrade SCS, Sharma PP, Novo M, Pérez-Porro AR, Vahtera V, González VL, Kawauchi GY, Giribet G. Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa. Front Zool 2012; 9:33. [PMID: 23190771 PMCID: PMC3538665 DOI: 10.1186/1742-9994-9-33] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Received: 07/31/2012] [Accepted: 11/08/2012] [Indexed: 12/28/2022] Open
Abstract
INTRODUCTION Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. RESULTS cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved BLAST hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local BLAST searches against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature.
CONCLUSIONS We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene fragments) and protein families for ten newly sequenced non-model organisms, some of commercial importance (i.e., Octopus vulgaris). These comprehensive sets of genes can be readily used for phylogenetic analysis, gene expression profiling, developmental analysis, and can also be a powerful resource for gene discovery. The characterization of the transcriptomes of such a diverse array of animal species permitted the comparison of sequencing depth, functional annotation, and efficiency of genomic sampling using the same pipelines, which proved to be similar for all considered species. In addition, the datasets revealed their potential as a resource for paralogue detection, a recurrent concern in various aspects of biological inquiry, including phylogenetics, molecular evolution, development, and cellular biochemistry.
Affiliation(s)
- Ana Riesgo
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
- Centro de Estudios Avanzados de Blanes, CSIC, c/ Accés a la Cala St. Francesc 14, Blanes, Girona, 17300, Spain
| | - Sónia C S Andrade
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
| | - Prashant P Sharma
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
| | - Marta Novo
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
- Current address: Cardiff School of Biosciences, Cardiff University, BIOSI 1, Museum Avenue, Cardiff, CF10 3TL, UK
| | - Alicia R Pérez-Porro
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
- Centro de Estudios Avanzados de Blanes, CSIC, c/ Accés a la Cala St. Francesc 14, Blanes, Girona, 17300, Spain
| | - Varpu Vahtera
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
- Current address: Finnish Museum of Natural History, Zoology Unit, Pohjoinen Rautatiekatu 13, 00014 University of Helsinki, Helsinki, Finland
| | - Vanessa L González
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
| | - Gisele Y Kawauchi
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
| | - Gonzalo Giribet
- Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
|