Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Liang F, Holt I, Pertea G, Karamycheva S, Salzberg SL, Quackenbush J. An optimized protocol for analysis of EST sequences. Nucleic Acids Res 2000;28:3657-65. [PMID: 10982889 PMCID: PMC110731 DOI: 10.1093/nar/28.18.3657] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022] Open

Cited by Other Article(s)

Shityakov S, Bencurova E, Förster C, Dandekar T. Modeling of shotgun sequencing of DNA plasmids using experimental and theoretical approaches. BMC Bioinformatics 2020;21:132. [PMID: 32245400 PMCID: PMC7126183 DOI: 10.1186/s12859-020-3461-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/22/2019] [Accepted: 03/19/2020] [Indexed: 01/02/2023] Open

Carmona R, Zafra A, Seoane P, Castro AJ, Guerrero-Fernández D, Castillo-Castillo T, Medina-García A, Cánovas FM, Aldana-Montes JF, Navas-Delgado I, Alché JDD, Claros MG. ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome. Frontiers in Plant Science 2015;6:625. [PMID: 26322066 PMCID: PMC4531244 DOI: 10.3389/fpls.2015.00625] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/21/2015] [Accepted: 07/28/2015] [Indexed: 05/18/2023]

Abstract

Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we present an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. The result is a reproductive transcriptome comprising 72,846 contigs with an average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs were identified, and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive includes a semantic conceptualisation by means of a Resource Description Framework (RDF), allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species.

Affiliation(s)

Rosario Carmona: Department of Biochemistry, Cell and Molecular Biology of Plants, Estación Experimental del Zaidín, Consejo Superior de Investigaciones Científicas, Granada, Spain; Plataforma Andaluza de Bioinformática, Edificio de Bioinnovación, Universidad de Málaga, Málaga, Spain
Adoración Zafra: Department of Biochemistry, Cell and Molecular Biology of Plants, Estación Experimental del Zaidín, Consejo Superior de Investigaciones Científicas, Granada, Spain
Pedro Seoane: Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Málaga, Spain
Antonio J. Castro: Department of Biochemistry, Cell and Molecular Biology of Plants, Estación Experimental del Zaidín, Consejo Superior de Investigaciones Científicas, Granada, Spain
Darío Guerrero-Fernández: Plataforma Andaluza de Bioinformática, Edificio de Bioinnovación, Universidad de Málaga, Málaga, Spain
Trinidad Castillo-Castillo: Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Málaga, Spain
Ana Medina-García: Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Málaga, Spain
Francisco M. Cánovas: Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Málaga, Spain
José F. Aldana-Montes: Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Málaga, Spain
Ismael Navas-Delgado: Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Málaga, Spain
Juan de Dios Alché: Department of Biochemistry, Cell and Molecular Biology of Plants, Estación Experimental del Zaidín, Consejo Superior de Investigaciones Científicas, Granada, Spain
M. Gonzalo Claros: Plataforma Andaluza de Bioinformática, Edificio de Bioinnovación, Universidad de Málaga, Málaga, Spain; Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Málaga, Spain
*Correspondence: M. Gonzalo Claros, Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos, 29071 Málaga, Spain

Canales J, Bautista R, Label P, Gómez-Maldonado J, Lesur I, Fernández-Pozo N, Rueda-López M, Guerrero-Fernández D, Castro-Rodríguez V, Benzekri H, Cañas RA, Guevara MA, Rodrigues A, Seoane P, Teyssier C, Morel A, Ehrenmann F, Le Provost G, Lalanne C, Noirot C, Klopp C, Reymond I, García-Gutiérrez A, Trontin JF, Lelu-Walter MA, Miguel C, Cervera MT, Cantón FR, Plomion C, Harvengt L, Avila C, Claros MG, Cánovas FM. De novo assembly of maritime pine transcriptome: implications for forest breeding and biotechnology. Plant Biotechnology Journal 2014;12:286-99. [PMID: 24256179 DOI: 10.1111/pbi.12136] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 07/20/2013] [Revised: 09/24/2013] [Accepted: 09/26/2013] [Indexed: 05/21/2023]

Lee EJ, Kamli MR, Pokharel S, Malik A, Tareq KMA, Roouf Bhat A, Park HB, Lee YS, Kim S, Yang B, Young Chung K, Choi I. Expressed sequence tags for bovine muscle satellite cells, myotube formed-cells and adipocyte-like cells. PLoS One 2013;8:e79780. [PMID: 24224006 PMCID: PMC3818215 DOI: 10.1371/journal.pone.0079780] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 06/08/2013] [Accepted: 09/25/2013] [Indexed: 12/25/2022] Open

Abstract

Background

Muscle satellite cells (MSCs) represent a devoted stem cell population that is responsible for postnatal muscle growth and skeletal muscle regeneration. An important characteristic of MSCs is that they encompass multipotential mesenchymal stem cell activity and are able to differentiate into myocytes and adipocytes. To achieve a global view of the genes differentially expressed in MSCs, myotube formed-cells (MFCs) and adipocyte-like cells (ALCs), we performed large-scale EST sequencing of normalized cDNA libraries developed from bovine MSCs.

Results

A total of 24,192 clones were assembled into 3,333 clusters, 5,517 singletons and 3,842 contigs. Functional annotation of these unigenes revealed that a large portion of the differentially expressed genes are involved in cellular and signaling processes. Database for Annotation, Visualization and Integrated Discovery (DAVID) functional analysis of three subsets of highly expressed gene lists (MSC233, MFC258, and ALC248) highlighted some common and unique biological processes among MSC, MFC and ALC. Additionally, genes that may be specific to MSC, MFC and ALC are reported here, and the role of dimethylarginine dimethylaminohydrolase 2 (DDAH2) during myogenesis and hemoglobin subunit alpha 2 (HBA2) during transdifferentiation in C2C12 were assayed as a case study. DDAH2 was up-regulated during myogenesis, and knockdown of DDAH2 by siRNA significantly decreased myogenin (MYOG) expression, corresponding with a slight change in cell morphology. In contrast, HBA2 was up-regulated during ALC formation, and its knockdown resulted in decreased intracellular lipid accumulation and CD36 mRNA expression.

Conclusion

In this study, a large number of EST sequences were generated from the MSC, MFC and ALC. Overall, the collection of ESTs generated in this study provides a starting point for the identification of novel genes involved in MFC and ALC formation, which in turn offers a fundamental resource to enable better understanding of the mechanism of muscle differentiation and transdifferentiation.

Alnemer LM, Seetan RI, Bassi FM, Chitraranjan C, Helsene A, Loree P, Goshn SB, Gu YQ, Luo MC, Iqbal MJ, Lazo GR, Denton AM, Kianian SF. Wheat Zapper: a flexible online tool for colinearity studies in grass genomes. Funct Integr Genomics 2013;13:11-7. [PMID: 23474942 DOI: 10.1007/s10142-013-0317-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/12/2012] [Revised: 02/08/2013] [Accepted: 02/12/2013] [Indexed: 10/27/2022]

Dhandapani V, Choi SR, Paul P, Kim YK, Ramchiary N, Hur Y, Lim YP. Development of EST database and transcriptome analysis in the leaves of Brassica rapa using a newly developed pipeline. Genes Genomics 2012. [DOI: 10.1007/s13258-012-0015-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/28/2022]

Zhao R, Cao Y, Xu H, Lv L, Qiao D, Cao Y. Analysis of expressed sequence tags from the green alga Dunaliella salina (Chlorophyta). Journal of Phycology 2011;47:1454-1460. [PMID: 27020369 DOI: 10.1111/j.1529-8817.2011.01071.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]

Marconi TG, Costa EA, Miranda HR, Mancini MC, Cardoso-Silva CB, Oliveira KM, Pinto LR, Mollinari M, Garcia AA, Souza AP. Functional markers for gene mapping and genetic diversity studies in sugarcane. BMC Res Notes 2011;4:264. [PMID: 21798036 PMCID: PMC3158763 DOI: 10.1186/1756-0500-4-264] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 03/28/2011] [Accepted: 07/28/2011] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The database of sugarcane expressed sequence tags (EST) offers a great opportunity for developing molecular markers that are directly associated with important agronomic traits. The development of new EST-SSR markers represents an important tool for genetic analysis. In sugarcane breeding programs, functional markers can be used to accelerate the process and select important agronomic traits, especially in the mapping of quantitative trait loci (QTL) and plant resistant pathogens or qualitative resistance loci (QRL). The aim of this work was to develop new simple sequence repeat (SSR) markers in sugarcane using the sugarcane expressed sequence tag (SUCEST) database.

FINDINGS

A total of 365 EST-SSR molecular markers with trinucleotide motifs were developed and evaluated in a collection of 18 genotypes of sugarcane (15 varieties and 3 species). In total, 287 of the EST-SSR markers amplified fragments of the expected size and were polymorphic in the analyzed sugarcane varieties. The number of alleles ranged from 2 to 18, with an average of 6 alleles per locus, while polymorphism information content values ranged from 0.21 to 0.92, with an average of 0.69. The discrimination power was high for the majority of the EST-SSRs, with an average value of 0.80. Among the markers characterized in this study, some are of particular interest: those related to bacterial defense responses, to the generation of precursor metabolites and energy, and to carbohydrate metabolic processes.

CONCLUSIONS

The EST-SSR markers presented in this work can be efficiently used for genetic mapping studies of segregating sugarcane populations. Their high Polymorphism Information Content (PIC) and Discriminant Power (DP) facilitate QTL identification and marker-assisted selection. Because of their association with functional regions of the genome, these markers are an important tool for the sugarcane breeding program.
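
As an illustration of the PIC values reported in this abstract, the sketch below computes polymorphism information content for one locus from allele counts. The abstract does not say which estimator the authors used; the formula here is the widely cited Botstein et al. (1980) definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², and is an assumption, as is the function name `pic`.

```python
from itertools import combinations
from typing import Sequence


def pic(allele_counts: Sequence[float]) -> float:
    """Polymorphism information content for a single locus.

    Assumed estimator (Botstein et al., 1980), not confirmed by the abstract:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    `allele_counts` may be raw counts or frequencies; they are normalized here.
    """
    if any(c < 0 for c in allele_counts):
        raise ValueError("allele counts must be non-negative")
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("at least one allele must be observed")

    freqs = [c / total for c in allele_counts]
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings between identical heterozygotes, which are uninformative.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical locus with six alleles at unequal frequencies.
    print(round(pic([30, 25, 20, 15, 6, 4]), 3))
```

A locus with a single allele gives a PIC of 0, and the value approaches 1 as the number of equally frequent alleles grows, which is why an average of 0.69 over loci with about 6 alleles each indicates informative markers.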

Fernández-Pozo N, Canales J, Guerrero-Fernández D, Villalobos DP, Díaz-Moreno SM, Bautista R, Flores-Monterroso A, Guevara MÁ, Perdiguero P, Collada C, Cervera MT, Soto A, Ordás R, Cantón FR, Avila C, Cánovas FM, Claros MG. EuroPineDB: a high-coverage web database for maritime pine transcriptome. BMC Genomics 2011;12:366. [PMID: 21762488 PMCID: PMC3152544 DOI: 10.1186/1471-2164-12-366] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 05/26/2011] [Accepted: 07/15/2011] [Indexed: 11/30/2022] Open

Abstract

Background

Pinus pinaster is an economically and ecologically important species that is becoming a woody gymnosperm model. Its enormous genome size makes whole-genome sequencing approaches hard to apply. Therefore, the expressed portion of the genome has to be characterised and the results and annotations have to be stored in dedicated databases.

Description

EuroPineDB is the largest sequence collection available for a single pine species, Pinus pinaster (maritime pine), since it comprises 951 641 raw sequence reads obtained from non-normalised cDNA libraries and high-throughput sequencing from adult (xylem, phloem, roots, stem, needles, cones, strobili) and embryonic (germinated embryos, buds, callus) maritime pine tissues. Using open-source tools, sequences were optimally pre-processed, assembled, and extensively annotated (GO, EC and KEGG terms, descriptions, SNPs, SSRs, ORFs and InterPro codes). As a result, a 10.5× P. pinaster genome was covered and assembled in 55 322 UniGenes. A total of 32 919 (59.5%) of P. pinaster UniGenes were annotated with at least one description, revealing at least 18 466 different genes. The complete database, which is designed to be scalable, maintainable, and expandable, is freely available at: http://www.scbi.uma.es/pindb/. It can be retrieved by gene libraries, pine species, annotations, UniGenes and microarrays (i.e., the sequences are distributed in two-colour microarrays; this is the only conifer database that provides this information) and will be periodically updated. Small assemblies can be viewed using a dedicated visualisation tool that connects them with SNPs. Any sequence or annotation set shown on-screen can be downloaded. Retrieval mechanisms for sequences and gene annotations are provided.

Conclusions

The EuroPineDB with its integrated information can be used to reveal new knowledge, offers an easy-to-use collection of information to directly support experimental work (including microarray hybridisation), and provides deeper knowledge on the maritime pine transcriptome.

Choi HK, Goes da Silva F, Lim HJ, Iandolino A, Seo YS, Lee SW, Cook DR. Diagnosis of Pierce's disease using biomarkers specific to Xylella fastidiosa rRNA and Vitis vinifera gene expression. Phytopathology 2010;100:1089-99. [PMID: 20839944 DOI: 10.1094/phyto-01-10-0014] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/21/2023]

Antonescu C, Antonescu V, Sultana R, Quackenbush J. Using the DFCI gene index databases for biological discovery. Curr Protoc Bioinformatics 2010;Chapter 1:1.6.1-1.6.36. [PMID: 20205187 DOI: 10.1002/0471250953.bi0106s29] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]

A gene family-based method for interspecies comparisons of sequencing-based transcriptomes and its use in environmental adaptation analysis. J Genet Genomics 2010;37:205-18. [PMID: 20347830 DOI: 10.1016/s1673-8527(09)60039-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/09/2009] [Revised: 01/20/2010] [Accepted: 02/03/2010] [Indexed: 11/21/2022]

O'Neil ST, Dzurisin JDK, Carmichael RD, Lobo NF, Emrich SJ, Hellmann JJ. Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon. BMC Genomics 2010;11:310. [PMID: 20478048 PMCID: PMC2887415 DOI: 10.1186/1471-2164-11-310] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.0] [Received: 08/24/2009] [Accepted: 05/17/2010] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Several recent studies have demonstrated the use of Roche 454 sequencing technology for de novo transcriptome analysis. Low error rates and high coverage also allow for effective SNP discovery and genetic diversity estimates. However, genetically diverse datasets, such as those sourced from natural populations, pose challenges for assembly programs and subsequent analysis. Further, estimating the effectiveness of transcript discovery using Roche 454 transcriptome data is still a difficult task.

RESULTS

Using the Roche 454 FLX Titanium platform, we sequenced and assembled larval transcriptomes for two butterfly species: the Propertius duskywing, Erynnis propertius (Lepidoptera: Hesperiidae) and the Anise swallowtail, Papilio zelicaon (Lepidoptera: Papilionidae). The Expressed Sequence Tags (ESTs) generated represent a diverse sample drawn from multiple populations, developmental stages, and stress treatments. Despite this diversity, > 95% of the ESTs assembled into long (> 714 bp on average) and highly covered (> 9.6x on average) contigs. To estimate the effectiveness of transcript discovery, we compared the number of bases in the hit region of unigenes (contigs and singletons) to the length of the best match silkworm (Bombyx mori) protein; this "ortholog hit ratio" gives a close estimate of the amount of the transcript discovered relative to a model lepidopteran genome. For each species, we tested two assembly programs and two parameter sets; although CAP3 is commonly used for such data, the assemblies produced by Celera Assembler with modified parameters were chosen over those produced by CAP3 based on contig and singleton counts as well as ortholog hit ratio analysis. In the final assemblies, 1,413 E. propertius and 1,940 P. zelicaon unigenes had a ratio > 0.8; 2,866 E. propertius and 4,015 P. zelicaon unigenes had a ratio > 0.5.

CONCLUSIONS

Ultimately, these assemblies and SNP data will be used to generate microarrays for ecoinformatics examining climate change tolerance of different natural populations. These studies will benefit from high quality assemblies with few singletons (less than 26% of bases for each assembled transcriptome are present in unassembled singleton ESTs) and effective transcript discovery (over 6,500 of our putative orthologs cover at least 50% of the corresponding model silkworm gene).
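
The "ortholog hit ratio" described in this abstract can be sketched as a simple quotient. The abstract only says that the number of bases in the hit region of a unigene is compared with the length of the best-match protein; the sketch below assumes the protein length is converted to nucleotides by multiplying by 3, and the function names and coordinate convention (1-based, inclusive) are hypothetical.

```python
from typing import Iterable


def ortholog_hit_ratio(hit_start: int, hit_end: int, protein_length_aa: int) -> float:
    """Fraction of the ortholog's coding length covered by the unigene's hit region.

    Assumptions (not stated in the abstract): coordinates are 1-based and
    inclusive on the unigene, and the protein length in amino acids is
    converted to bases by multiplying by 3.
    """
    if protein_length_aa <= 0:
        raise ValueError("protein length must be positive")
    # abs() handles hits reported on the reverse strand (start > end).
    hit_bases = abs(hit_end - hit_start) + 1
    return hit_bases / (3 * protein_length_aa)


def count_above(ratios: Iterable[float], threshold: float) -> int:
    """Count unigenes whose ratio exceeds a threshold, e.g. 0.5 or 0.8."""
    return sum(1 for r in ratios if r > threshold)


if __name__ == "__main__":
    # Hypothetical hits: (start, end, best-match protein length in aa).
    hits = [(101, 1000, 400), (1, 300, 500), (50, 1249, 410)]
    ratios = [ortholog_hit_ratio(*h) for h in hits]
    print(ratios, count_above(ratios, 0.5), count_above(ratios, 0.8))
```

Under this reading, a ratio near 1 means the unigene spans almost the full length of the putative ortholog, which is how the counts at the 0.5 and 0.8 cut-offs above can be interpreted.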

SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read. BMC Bioinformatics 2010. [PMID: 20089148 DOI: 10.1186/1471-2105-11-38] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open

Falgueras J, Lara AJ, Fernández-Pozo N, Cantón FR, Pérez-Trabado G, Claros MG. SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read. BMC Bioinformatics 2010;11:38. [PMID: 20089148 PMCID: PMC2832897 DOI: 10.1186/1471-2105-11-38] [Citation(s) in RCA: 142] [Impact Index Per Article: 10.1] [Received: 06/04/2009] [Accepted: 01/20/2010] [Indexed: 12/05/2022] Open

Mochida K, Yoshida T, Sakurai T, Ogihara Y, Shinozaki K. TriFLDB: a database of clustered full-length coding sequences from Triticeae with applications to comparative grass genomics. Plant Physiology 2009;150:1135-46. [PMID: 19448038 PMCID: PMC2705016 DOI: 10.1104/pp.109.138214] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Received: 03/07/2009] [Accepted: 05/08/2009] [Indexed: 05/19/2023]

Abstract

The Triticeae Full-Length CDS Database (TriFLDB) contains available information regarding full-length coding sequences (CDSs) of the Triticeae crops wheat (Triticum aestivum) and barley (Hordeum vulgare) and includes functional annotations and comparative genomics features. TriFLDB provides a search interface using keywords for gene function and related Gene Ontology terms and a similarity search for DNA and deduced translated amino acid sequences to access annotations of Triticeae full-length CDS (TriFLCDS) entries. Annotations consist of similarity search results against several sequence databases and domain structure predictions by InterProScan. The deduced amino acid sequences in TriFLDB are grouped with the proteome datasets for Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and sorghum (Sorghum bicolor) by hierarchical clustering in stepwise thresholds of sequence identity, providing hierarchical clustering results based on full-length protein sequences. The database also provides sequence similarity results based on comparative mapping of TriFLCDSs onto the rice and sorghum genome sequences, which together with current annotations can be used to predict gene structures for TriFLCDS entries. To provide the possible genetic locations of full-length CDSs, TriFLCDS entries are also assigned to the genetically mapped cDNA sequences of barley and diploid wheat, which are currently accommodated in the Triticeae Mapped EST Database. These relational data are searchable from the search interfaces of both databases. The current TriFLDB contains 15,871 full-length CDSs from barley and wheat and includes putative full-length cDNAs for barley and wheat, which are publicly accessible. This informative content provides an informatics gateway for Triticeae genomics and grass comparative genomics. TriFLDB is publicly available at http://TriFLDB.psc.riken.jp/.

Bekel T, Henckel K, Küster H, Meyer F, Mittard Runte V, Neuweger H, Paarmann D, Rupp O, Zakrzewski M, Pühler A, Stoye J, Goesmann A. The Sequence Analysis and Management System – SAMS-2.0: Data management and sequence analysis adapted to changing requirements from traditional Sanger sequencing to ultrafast sequencing technologies. J Biotechnol 2009;140:3-12. [DOI: 10.1016/j.jbiotec.2009.01.006] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]

Scheibye-Alsing K, Hoffmann S, Frankel A, Jensen P, Stadler PF, Mang Y, Tommerup N, Gilchrist MJ, Nygård AB, Cirera S, Jørgensen CB, Fredholm M, Gorodkin J. Sequence assembly. Comput Biol Chem 2008;33:121-36. [PMID: 19152793 DOI: 10.1016/j.compbiolchem.2008.11.003] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Received: 10/24/2008] [Revised: 11/28/2008] [Accepted: 11/28/2008] [Indexed: 01/20/2023]

Argout X, Fouet O, Wincker P, Gramacho K, Legavre T, Sabau X, Risterucci AM, Da Silva C, Cascardo J, Allegre M, Kuhn D, Verica J, Courtois B, Loor G, Babin R, Sounigo O, Ducamp M, Guiltinan MJ, Ruiz M, Alemanno L, Machado R, Phillips W, Schnell R, Gilmour M, Rosenquist E, Butler D, Maximova S, Lanaud C. Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions. BMC Genomics 2008;9:512. [PMID: 18973681 PMCID: PMC2642826 DOI: 10.1186/1471-2164-9-512] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Received: 06/04/2008] [Accepted: 10/30/2008] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Theobroma cacao L. is a tree that originated in the tropical rainforest of South America. It is one of the major cash crops for many tropical countries. T. cacao is mainly produced on smallholdings, providing resources for 14 million farmers. Disease resistance and T. cacao quality improvement are two important challenges for all actors of cocoa and chocolate production. T. cacao is seriously affected by pests and fungal diseases, which are responsible for yield losses of more than 40%; improvement of nutritional and organoleptic quality is also important for consumers. An international collaboration was formed to develop an EST genomic resource database for cacao.

RESULTS

Fifty-six cDNA libraries were constructed from different organs, different genotypes and different environmental conditions. A total of 149,650 valid EST sequences were generated corresponding to 48,594 unigenes, 12,692 contigs and 35,902 singletons. A total of 29,849 unigenes shared significant homology with public sequences from other species. Gene Ontology (GO) annotation was applied to distribute the ESTs among the main GO categories. A specific information system (ESTtik) was constructed to process, store and manage this EST collection, allowing the user to query a database. To check the representativeness of our EST collection, we looked for the genes known to be involved in two different metabolic pathways extensively studied in other plant species and important for T. cacao qualities: the flavonoid and the terpene pathways. Most of the enzymes described in other crops for these two metabolic pathways were found in our EST collection. A large collection of new genetic markers was provided by this EST collection.

CONCLUSION

This EST collection displays a good representation of the T. cacao transcriptome, suitable for analysis of biochemical pathways based on oligonucleotide microarrays derived from these ESTs. It will provide numerous genetic markers that will allow the construction of a high density gene map of T. cacao. This EST collection represents a unique and important molecular resource for T. cacao study and improvement, facilitating the discovery of candidate genes for important T. cacao trait variation.

Meyer F, Paarmann D, D'Souza M, Olson R, Glass EM, Kubal M, Paczian T, Rodriguez A, Stevens R, Wilke A, Wilkening J, Edwards RA. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 2008;9:386. [PMID: 18803844 PMCID: PMC2563014 DOI: 10.1186/1471-2105-9-386] [Citation(s) in RCA: 2348] [Impact Index Per Article: 146.8] [Received: 02/14/2008] [Accepted: 09/19/2008] [Indexed: 02/01/2023] Open

Chojnowski JL, Braun EL. Turtle isochore structure is intermediate between amphibians and other amniotes. Integr Comp Biol 2008;48:454-62. [PMID: 21669806 DOI: 10.1093/icb/icn062] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open

Abstract

Vertebrate genomes are comprised of isochores that are relatively long (>100 kb) regions with a relatively homogenous (either GC-rich or AT-rich) base composition and with rather sharp boundaries with neighboring isochores. Mammals and living archosaurs (birds and crocodilians) have heterogeneous genomes that include very GC-rich isochores. In sharp contrast, the genomes of amphibians and fishes are more homogeneous and they have a lower overall GC content. Because DNA with higher GC content is more thermostable, the elevated GC content of mammalian and archosaurian DNA has been hypothesized to be an adaptation to higher body temperatures. This hypothesis can be tested by examining the structure of isochores across the reptilian clade, which includes the archosaurs, testudines (turtles), and lepidosaurs (lizards and snakes), because reptiles exhibit diverse body sizes, metabolic rates, and patterns of thermoregulation. This study focuses on a comparative analysis of a new set of expressed genes of the red-eared slider turtle and orthologs of the turtle genes in mammalian (human, mouse, dog, and opossum), archosaurian (chicken and alligator), and amphibian (western clawed frog) genomes. EST (expressed sequence tag) data from a turtle cDNA library enriched for genes that have specialized functions (developmental genes) revealed that using the GC content of the third codon position to examine isochore structure requires careful consideration of the types of genes examined. The more highly expressed genes (e.g., housekeeping genes) are more likely to be GC-rich than are genes with specialized functions. However, the set of highly expressed turtle genes demonstrated that the turtle genome has a GC content that is intermediate between the GC-poor amphibians and the GC-rich mammals and archosaurs. There was a strong correlation between the GC content of all turtle genes and the GC content of other vertebrate genes, with the slope of the line describing this relationship also indicating that the isochore structure of turtles is intermediate between that of amphibians and other amniotes. These data are consistent with some thermal hypotheses of isochore evolution, but we believe that the credible set of models for isochore evolution still includes a variety of models. These data expand the amount of genomic data available from reptiles upon which future studies of reptilian genomics can build.
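
The third-codon-position GC content mentioned in this abstract can be illustrated with a short sketch. It assumes the input is an in-frame coding sequence that starts at the first base of a codon, and it ignores ambiguous bases such as N; the function name `gc3_content` is hypothetical and the authors' actual procedure is not described in the abstract.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C among the third codon positions of a coding sequence.

    Assumptions: `cds` is in frame (position 1 is the first base of a codon);
    a trailing partial codon is dropped; bases other than A, C, G, T are skipped.
    """
    seq = cds.upper()
    usable = len(seq) - (len(seq) % 3)  # drop any incomplete final codon
    third_bases = [base for base in seq[2:usable:3] if base in "ACGT"]
    if not third_bases:
        raise ValueError("sequence contains no complete, unambiguous codons")
    gc_count = sum(1 for base in third_bases if base in "GC")
    return gc_count / len(third_bases)


if __name__ == "__main__":
    # Hypothetical short coding fragments, for illustration only.
    for fragment in ("ATGGCCAAA", "ATGGCGGCCTGA"):
        print(fragment, gc3_content(fragment))
```

Because third positions are largely synonymous, GC3 is often used as a proxy for the base composition of the surrounding region, which is the sense in which the abstract uses it to probe isochore structure.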

Laney SJ, Buttaro CJ, Visconti S, Pilotte N, Ramzy RMR, Weil GJ, Williams SA. A reverse transcriptase-PCR assay for detecting filarial infective larvae in mosquitoes. PLoS Negl Trop Dis 2008;2:e251. [PMID: 18560545 PMCID: PMC2413423 DOI: 10.1371/journal.pntd.0000251] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 03/14/2008] [Accepted: 05/20/2008] [Indexed: 11/19/2022] Open

Abstract

Background

Existing molecular assays for filarial parasite DNA in mosquitoes cannot distinguish between infected mosquitoes that contain any stage of the parasite and infective mosquitoes that harbor third stage larvae (L3) capable of establishing new infections in humans. We now report development of a molecular L3-detection assay for Brugia malayi in vectors based on RT-PCR detection of an L3-activated gene transcript.

Methodology/Principal Findings

Candidate genes identified by bioinformatics analysis of EST datasets across the B. malayi life cycle were initially screened by PCR using cDNA libraries as templates. Stage-specificity was confirmed using RNA isolated from infected mosquitoes. Mosquitoes were collected daily for 14 days after feeding on microfilaremic cat blood. RT-PCR was performed with primer sets that were specific for individual candidate genes. Many promising candidates with strong expression in the L3 stage were excluded because of low-level transcription in less mature larvae. One transcript (TC8100, which encodes a particular form of collagen) was only detected in mosquitoes that contained L3 larvae. This assay detects a single L3 in a pool of 25 mosquitoes.

Conclusions/Significance

This L3-activated gene transcript, combined with a control transcript (tph-1, accession # U80971) that is constitutively expressed by all vector-stage filarial larvae, can be used to detect filarial infectivity in pools of mosquito vectors. This general approach (detection of stage-specific gene transcripts from eukaryotic pathogens) may also be useful for detecting infective stages of other vector-borne parasites.

The Global Programme for the Elimination of Lymphatic Filariasis (GPELF) was launched in 1998 with the goal of eliminating lymphatic filariasis by 2020. As the success of mass drug administration (MDA) in the global program drives the rates of infection in endemic populations to very low levels, new, highly sensitive methods are required for monitoring transmission by screening mosquitoes for the presence of L3 infective larvae. The current method of mosquito dissection to identify L3 larvae is laborious and insensitive and is not amenable to screening large numbers of mosquitoes. Existing molecular assays for the detection of filarial parasite DNA in mosquitoes are sensitive and can easily screen large numbers of vectors. However, current PCR-based methods cannot distinguish between infected mosquitoes that contain any stage of the parasite and infective mosquitoes that harbor third stage larvae (L3) capable of establishing new infections in humans. This paper reports the first development of a molecular L3-detection assay for a filarial parasite in mosquitoes based on RT-PCR detection of an L3-activated gene transcript. This strategy of detecting stage-specific messenger RNA from filarial parasites may also prove useful for detecting infective stages of other vector-borne pathogens.

Kim CK, Choi JW, Park D, Kang MJ, Seol YJ, Hyun DY, Hahn JH. PlantGI: a database for searching gene indices in agricultural plants developed at NIAB, Korea. Bioinformation 2008;2:344-5. [PMID: 18685722 PMCID: PMC2478734 DOI: 10.6026/97320630002344] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2008] [Revised: 04/22/2008] [Accepted: 04/28/2008] [Indexed: 11/23/2022] Open

Lee Y, Quackenbush J. Using the TIGR gene index databases for biological discovery. ACTA ACUST UNITED AC 2008;Chapter 1:Unit 1.6. [PMID: 18428690 DOI: 10.1002/0471250953.bi0106s03] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Mukesh M, Kataria RS, Kumar V, Pandey D, Sodhi M, Ahlawat SP, Sobti RC, Mishra BP. Construction and Evaluation of Directionally Cloned cDNA Libraries from Lactating and Non-lactating Mammary Gland of River Buffalo (Bubalus bubalis): A Resource for Gene Identification in Bubaline Genome. JOURNAL OF APPLIED ANIMAL RESEARCH 2008. [DOI: 10.1080/09712119.2008.9706902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/15/2022]

Florea L. Bioinformatics of alternative splicing and its regulation. Brief Bioinform 2008;7:55-69. [PMID: 16761365 DOI: 10.1093/bib/bbk005] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Extraction and annotation of SAGE tags using sequence quality values. Methods Mol Biol 2008;387:123-32. [PMID: 18287627 DOI: 10.1007/978-1-59745-454-4_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]

Bioinformatics detection of alternative splicing. Methods Mol Biol 2008;452:179-97. [PMID: 18566765 DOI: 10.1007/978-1-60327-159-2_9] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Chojnowski JL, Franklin J, Katsu Y, Iguchi T, Guillette LJ, Kimball RT, Braun EL. Patterns of Vertebrate Isochore Evolution Revealed by Comparison of Expressed Mammalian, Avian, and Crocodilian Genes. J Mol Evol 2007;65:259-66. [PMID: 17674077 DOI: 10.1007/s00239-007-9003-2] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2006] [Accepted: 05/18/2007] [Indexed: 10/23/2022]

Frishman D. Protein annotation at genomic scale: the current status. Chem Rev 2007;107:3448-66. [PMID: 17658902 DOI: 10.1021/cr068303k] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]

Masoudi-Nejad A, Goto S, Jauregui R, Ito M, Kawashima S, Moriya Y, Endo TR, Kanehisa M. EGENES: transcriptome-based plant database of genes with metabolic pathway information and expressed sequence tag indices in KEGG. PLANT PHYSIOLOGY 2007;144:857-66. [PMID: 17468225 PMCID: PMC1914165 DOI: 10.1104/pp.106.095059] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/19/2006] [Accepted: 04/18/2007] [Indexed: 05/15/2023]

Borges JC, Cagliari TC, Ramos CHI. Expression and variability of molecular chaperones in the sugarcane expressome. JOURNAL OF PLANT PHYSIOLOGY 2007;164:505-13. [PMID: 16687190 DOI: 10.1016/j.jplph.2006.03.013] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/05/2005] [Accepted: 03/19/2006] [Indexed: 05/09/2023]

Longhorn SJ, Foster PG, Vogler AP. The nematode–arthropod clade revisited: phylogenomic analyses from ribosomal protein genes misled by shared evolutionary biases. Cladistics 2007;23:130-144. [DOI: 10.1111/j.1096-0031.2006.00132.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022] Open

Chen FC, Wang SS, Chaw SM, Huang YT, Chuang TJ. Plant Gene and Alternatively Spliced Variant Annotator. A plant genome annotation pipeline for rice gene and alternatively spliced variant identification with cross-species expressed sequence tag conservation from seven plant species. PLANT PHYSIOLOGY 2007;143:1086-95. [PMID: 17220363 PMCID: PMC1820933 DOI: 10.1104/pp.106.092460] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]

Abstract

The completion of the rice (Oryza sativa) genome draft has brought unprecedented opportunities for genomic studies of the world's most important food crop. Previous rice gene annotations have relied mainly on ab initio methods, which usually yield a high rate of false-positive predictions and give only limited information regarding alternative splicing in rice genes. Comparative approaches based on expressed sequence tags (ESTs) can compensate for the drawbacks of ab initio methods because they can simultaneously identify experimental data-supported genes and alternatively spliced transcripts. Furthermore, cross-species EST information can be used to not only offset the insufficiency of same-species ESTs but also derive evolutionary implications. In this study, we used ESTs from seven plant species, rice, wheat (Triticum aestivum), maize (Zea mays), barley (Hordeum vulgare), sorghum (Sorghum bicolor), soybean (Glycine max), and Arabidopsis (Arabidopsis thaliana), to annotate the rice genome. We developed a plant genome annotation pipeline, Plant Gene and Alternatively Spliced Variant Annotator (PGAA). Using this approach, we identified 852 genes (931 isoforms) not annotated in other widely used databases (i.e. the Institute for Genomic Research, National Center for Biotechnology Information, and Rice Annotation Project) and found 87% of them supported by both rice and nonrice EST evidence. PGAA also identified more than 44,000 alternatively spliced events, of which approximately 20% are not observed in the other three annotations. These novel annotations represent rich opportunities for rice genome research, because the functions of most of our annotated genes are currently unknown. Also, in the PGAA annotation, the isoforms with non-rice-EST-supported exons are significantly enriched in transporter activity but significantly underrepresented in transcription regulator activity. We have also identified potential lineage-specific and conserved isoforms, which are important markers in evolutionary studies. The data and the Web-based interface, RiceViewer, are available for public access at http://RiceViewer.genomics.sinica.edu.tw/.

SeqTrim — A Validation and Trimming Tool for All Purpose Sequence Reads. ACTA ACUST UNITED AC 2007. [DOI: 10.1007/978-3-540-74972-1_46] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]

Pickart MA, Klee EW, Nielsen AL, Sivasubbu S, Mendenhall EM, Bill BR, Chen E, Eckfeldt CE, Knowlton M, Robu ME, Larson JD, Deng Y, Schimmenti LA, Ellis LB, Verfaillie CM, Hammerschmidt M, Farber SA, Ekker SC. Genome-wide reverse genetics framework to identify novel functions of the vertebrate secretome. PLoS One 2006;1:e104. [PMID: 17218990 PMCID: PMC1766371 DOI: 10.1371/journal.pone.0000104] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2006] [Accepted: 11/12/2006] [Indexed: 11/18/2022] Open

Affiliation(s)

Michael A. Pickart Department of Oral Sciences and Minnesota Craniofacial Research Training Program MinnCResT, University of Minnesota, Minneapolis, Minnesota, United States of America
Eric W. Klee Laboratory Medicine and Pathology and Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America
Aubrey L. Nielsen Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America
Sridhar Sivasubbu Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America
Eric M. Mendenhall Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America Department of Medicine, Division of Hematology, Oncology, and Transplantation, and Stem Cell Institute, University of Minnesota, Minneapolis, Minnesota, United States of America
Brent R. Bill Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America Department of Pediatrics, Genetics and Metabolism and Department of Ophthalmology, University of Minnesota, Minneapolis, Minnesota, United States of America
Eleanor Chen Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America
Craig E. Eckfeldt Department of Medicine, Division of Hematology, Oncology, and Transplantation, and Stem Cell Institute, University of Minnesota, Minneapolis, Minnesota, United States of America
Michelle Knowlton Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America
Mara E. Robu Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America Department of Oral Sciences and Minnesota Craniofacial Research Training Program MinnCResT, University of Minnesota, Minneapolis, Minnesota, United States of America
Jon D. Larson Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America
Yun Deng Carnegie Institute of Washington, Baltimore, Maryland, United States of America
Lisa A. Schimmenti Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Department of Pediatrics, Genetics and Metabolism and Department of Ophthalmology, University of Minnesota, Minneapolis, Minnesota, United States of America
Lynda B.M. Ellis Laboratory Medicine and Pathology and Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America
Catherine M. Verfaillie Department of Medicine, Division of Hematology, Oncology, and Transplantation, and Stem Cell Institute, University of Minnesota, Minneapolis, Minnesota, United States of America
Matthias Hammerschmidt Max Planck Institute Immunbiologie, Freiburg, Germany
Steven A. Farber Carnegie Institute of Washington, Baltimore, Maryland, United States of America
Stephen C. Ekker Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota, United States of America Arnold and Mabel Beckman Center for Transposon Research, University of Minnesota, Minneapolis, Minnesota, United States of America * To whom correspondence should be addressed.

Venier P, De Pittà C, Pallavicini A, Marsano F, Varotto L, Romualdi C, Dondero F, Viarengo A, Lanfranchi G. Development of mussel mRNA profiling: Can gene expression trends reveal coastal water pollution? Mutat Res 2006;602:121-34. [PMID: 17010391 DOI: 10.1016/j.mrfmmm.2006.08.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2006] [Revised: 08/21/2006] [Accepted: 08/21/2006] [Indexed: 05/12/2023]

Abstract

Marine bivalves of the genus Mytilus are intertidal filter-feeders commonly used as biosensors of coastal pollution. Mussels adjust their functions to ordinary environmental changes, e.g. temperature fluctuations and emersion-related hypoxia, and react to various contaminants, accumulated from the surrounding water and defining a potential health risk for sea-food consumers. Despite the increasing use of mussels in environmental monitoring, their genome and gene functions are largely unexplored. Hence, we started the systematic identification of expressed sequence tags and prepared a cDNA microarray of Mytilus galloprovincialis including 1714 mussel probes (76% singletons, approximately 50% putatively identified transcripts) plus unrelated controls. To assess the potential use of the gene set represented in MytArray 1.0, we tested different tissues and groups of mussels. The resulting data highlighted the transcriptional specificity of the mussel tissues. Further testing of the most responsive digestive gland allowed correct classification of mussels treated with mixtures of heavy metals or organic contaminants (expression changes of specific genes discriminated the two pollutant cocktails). Similar analyses made a distinction possible between mussels living in the Venice lagoon (Italy) at the petrochemical district and mussels close to the open sea. The suggestive presence of gene markers tracing organic contaminants more than heavy metals in mussels from the industrial district is consistent with reported trends of chemical contamination. Further study is necessary in order to understand how much gene expression profiles can disclose the signatures of pollutants in mussel cells and tissues. Nevertheless, the gene expression patterns described in this paper support a wider characterization of the mussel transcriptome and point to the development of novel environmental metrics.

Bouck A, Vision T. The molecular ecologist's guide to expressed sequence tags. Mol Ecol 2006;16:907-24. [PMID: 17305850 DOI: 10.1111/j.1365-294x.2006.03195.x] [Citation(s) in RCA: 283] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

JUICE: a data management system that facilitates the analysis of large volumes of information in an EST project workflow. BMC Bioinformatics 2006;7:513. [PMID: 17123449 PMCID: PMC1676024 DOI: 10.1186/1471-2105-7-513] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2006] [Accepted: 11/23/2006] [Indexed: 11/25/2022] Open

Abstract

Background

Expressed sequence tag (EST) analyses provide a rapid and economical means to identify candidate genes that may be involved in a particular biological process. These ESTs are useful in many functional genomics studies. However, the large quantity and complexity of the data generated during an EST sequencing project can make the analysis of this information a daunting task.

Results

In an attempt to make this task friendlier, we have developed JUICE, an open source data management system (Apache + PHP + MySQL on Linux), which enables the user to easily upload, organize, visualize and search the different types of data generated in an EST project pipeline. In contrast to other systems, the JUICE data management system allows a branched pipeline to be established, modified and expanded, during the course of an EST project.

The web interfaces and tools in JUICE enable the users to visualize the information in a graphical, user-friendly manner. The user may browse or search for sequences and/or sequence information within all the branches of the pipeline. The user can search using terms associated with the sequence name, annotation or other characteristics stored in JUICE and associated with sequences or sequence groups. Groups of sequences can be created by the user, stored in a clipboard and/or downloaded for further analyses.

Different user profiles restrict the access of each user depending upon their role in the project. The user may have access exclusively to visualize sequence information, access to annotate sequences and sequence information, or administrative access.

Conclusion

JUICE is an open source data management system that has been developed to aid users in organizing and analyzing the large amount of data generated in an EST Project workflow. JUICE has been used in one of the first functional genomics projects in Chile, entitled "Functional Genomics in nectarines: Platform to potentiate the competitiveness of Chile in fruit exportation". However, due to its ability to organize and visualize data from external pipelines, JUICE is a flexible data management system that should be useful for other EST/Genome projects. The JUICE data management system is released under the Open Source GNU Lesser General Public License (LGPL). JUICE may be downloaded from or .

Arhondakis S, Clay O, Bernardi G. Compositional properties of human cDNA libraries: practical implications. FEBS Lett 2006;580:5772-8. [PMID: 17022979 DOI: 10.1016/j.febslet.2006.09.034] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2006] [Revised: 09/12/2006] [Accepted: 09/19/2006] [Indexed: 01/28/2023]

Turner JD, Schote AB, Macedo JA, Pelascini LPL, Muller CP. Tissue specific glucocorticoid receptor expression, a role for alternative first exon usage? Biochem Pharmacol 2006;72:1529-37. [PMID: 16930562 DOI: 10.1016/j.bcp.2006.07.005] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2006] [Revised: 07/04/2006] [Accepted: 07/11/2006] [Indexed: 01/28/2023]

Malde K, Schneeberger K, Coward E, Jonassen I. RBR: library-less repeat detection for ESTs. Bioinformatics 2006;22:2232-6. [PMID: 16837527 DOI: 10.1093/bioinformatics/btl368] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Lorenzini DM, da Silva PI, Soares MB, Arruda P, Setubal J, Daffre S. Discovery of immune-related genes expressed in hemocytes of the tarantula spider Acanthoscurria gomesiana. DEVELOPMENTAL AND COMPARATIVE IMMUNOLOGY 2006;30:545-56. [PMID: 16386302 DOI: 10.1016/j.dci.2005.09.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/05/2005] [Revised: 08/28/2005] [Accepted: 09/02/2005] [Indexed: 05/05/2023]

Wang JPZ, Lindsay BG, Cui L, Wall PK, Marion J, Zhang J, dePamphilis CW. Gene capture prediction and overlap estimation in EST sequencing from one or multiple libraries. BMC Bioinformatics 2005;6:300. [PMID: 16351717 PMCID: PMC1369009 DOI: 10.1186/1471-2105-6-300] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2004] [Accepted: 12/13/2005] [Indexed: 11/10/2022] Open

Wu Y, Rozenfeld S, Defferrard A, Ruggiero K, Udall JA, Kim H, Llewellyn DJ, Dennis ES. Cycloheximide treatment of cotton ovules alters the abundance of specific classes of mRNAs and generates novel ESTs for microarray expression profiling. Mol Genet Genomics 2005;274:477-93. [PMID: 16208490 DOI: 10.1007/s00438-005-0049-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2005] [Accepted: 08/19/2005] [Indexed: 10/25/2022]

Firnhaber C, Pühler A, Küster H. EST sequencing and time course microarray hybridizations identify more than 700 Medicago truncatula genes with developmental expression regulation in flowers and pods. PLANTA 2005;222:269-83. [PMID: 15968508 DOI: 10.1007/s00425-005-1543-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/11/2004] [Accepted: 02/25/2005] [Indexed: 05/03/2023]

da Silva FG, Iandolino A, Al-Kayal F, Bohlmann MC, Cushman MA, Lim H, Ergul A, Figueroa R, Kabuloglu EK, Osborne C, Rowe J, Tattersall E, Leslie A, Xu J, Baek J, Cramer GR, Cushman JC, Cook DR. Characterizing the grape transcriptome. Analysis of expressed sequence tags from multiple Vitis species and development of a compendium of gene expression during berry development. PLANT PHYSIOLOGY 2005;139:574-97. [PMID: 16219919 PMCID: PMC1255978 DOI: 10.1104/pp.105.065748] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/13/2005] [Revised: 07/28/2005] [Accepted: 08/04/2005] [Indexed: 05/04/2023]

Abstract

We report the analysis and annotation of 146,075 expressed sequence tags from Vitis species. The majority of these sequences were derived from different cultivars of Vitis vinifera, comprising an estimated 25,746 unique contig and singleton sequences that survey transcription in various tissues and developmental stages and during biotic and abiotic stress. Putatively homologous proteins were identified for over 17,752 of the transcripts, with 1,962 transcripts further subdivided into one or more Gene Ontology categories. A simple structured vocabulary, with modules for plant genotype, plant development, and stress, was developed to describe the relationship between individual expressed sequence tags and cDNA libraries; the resulting vocabulary provides query terms to facilitate data mining within the context of a relational database. As a measure of the extent to which characterized metabolic pathways were encompassed by the data set, we searched for homologs of the enzymes leading from glycolysis, through the oxidative/nonoxidative pentose phosphate pathway, and into the general phenylpropanoid pathway. Homologs were identified for 65 of these 77 enzymes, with 86% of enzymatic steps represented by paralogous genes. Differentially expressed transcripts were identified by means of a stringent believability index cutoff of ≥98.4%. Correlation analysis and two-dimensional hierarchical clustering grouped these transcripts according to similarity of expression. In the broadest analysis, 665 differentially expressed transcripts were identified across 29 cDNA libraries, representing a range of developmental and stress conditions. The groupings revealed expected associations between plant developmental stages and tissue types, with the notable exception of abiotic stress treatments. A more focused analysis of flower and berry development identified 87 differentially expressed transcripts and provides the basis for a compendium that relates gene expression and annotation to previously characterized aspects of berry development and physiology. Comparison with published results for select genes, as well as correlation analysis between independent data sets, suggests that the inferred in silico patterns of expression are likely to be an accurate representation of transcript abundance for the conditions surveyed. Thus, the combined data set reveals the in silico expression patterns for hundreds of genes in V. vinifera, the majority of which have not been previously studied within this species.

Sczyrba A, Beckstette M, Brivanlou AH, Giegerich R, Altmann CR. XenDB: full length cDNA prediction and cross species mapping in Xenopus laevis. BMC Genomics 2005;6:123. [PMID: 16162280 PMCID: PMC1261260 DOI: 10.1186/1471-2164-6-123] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2005] [Accepted: 09/14/2005] [Indexed: 11/23/2022] Open

Abstract

Background

Research using the model system Xenopus laevis has provided critical insights into the mechanisms of early vertebrate development and cell biology. Large scale sequencing efforts have provided an increasingly important resource for researchers. To provide full advantage of the available sequence, we have analyzed 350,468 Xenopus laevis Expressed Sequence Tags (ESTs) both to identify full length protein encoding sequences and to develop a unique database system to support comparative approaches between X. laevis and other model systems.

Description

Using a suffix array based clustering approach, we have identified 25,971 clusters and 40,877 singleton sequences. Generation of a consensus sequence for each cluster resulted in 31,353 tentative contig and 4,801 singleton sequences. Using both BLASTX and FASTY comparison to five model organisms and the NR protein database, more than 15,000 sequences are predicted to encode full length proteins and these have been matched to publicly available IMAGE clones when available. Each sequence has been compared to the KOG database and ~67% of the sequences have been assigned a putative functional category. Based on sequence homology to mouse and human, putative GO annotations have been determined.

Conclusion

The results of the analysis have been stored in a publicly available database, XenDB. A unique capability of the database is the ability to batch upload cross species queries to identify potential Xenopus homologues and their associated full length clones. Examples are provided including mapping of microarray results and application of 'in silico' analysis. The ability to quickly translate the results of various species into 'Xenopus-centric' information should greatly enhance comparative embryological approaches.

Supplementary material can be found at .

Claverie JM. Fewer genes, more noncoding RNA. Science 2005;309:1529-30. [PMID: 16141064 DOI: 10.1126/science.1116800] [Citation(s) in RCA: 133] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]

Min XJ, Butler G, Storms R, Tsang A. TargetIdentifier: a webserver for identifying full-length cDNAs from EST sequences. Nucleic Acids Res 2005;33:W669-72. [PMID: 15980559 PMCID: PMC1160197 DOI: 10.1093/nar/gki436] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open