1
In-depth global analysis of transcript abundance levels in porcine alveolar macrophages following infection with porcine reproductive and respiratory syndrome virus. Adv Virol 2011; 2010:864181. [PMID: 22331987 PMCID: PMC3275998 DOI: 10.1155/2010/864181] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 09/08/2010] [Accepted: 12/12/2010] [Indexed: 01/30/2023] Open
Abstract
Porcine reproductive and respiratory syndrome virus (PRRSV) is a major pathogen of swine worldwide and causes considerable economic loss. Identifying specific cell signaling or activation pathways that associate with variation in PRRSV replication and macrophage function may lead to identification of novel gene targets for the control of PRRSV infection. Serial Analysis of Gene Expression (SAGE) was used to create and survey the transcriptome of in vitro mock-infected and PRRSV strain VR-2332-infected porcine alveolar macrophages (PAM) at 0, 6, 12, 16, and 24 hours after infection. The transcriptome data indicated changes in transcript abundance occurring in PRRSV-infected PAMs over time after infection with more than 590 unique tags with significantly altered transcript abundance levels identified (P < .01). Strikingly, innate immune genes (whose transcript abundances are typically altered in response to other pathogens or insults including IL-8, CCL4, and IL-1β) showed no or very little change at any time point following infection.
2
Harhay GP, Smith TP, Alexander LJ, Haudenschild CD, Keele JW, Matukumalli LK, Schroeder SG, Van Tassell CP, Gresham CR, Bridges SM, Burgess SC, Sonstegard TS. An atlas of bovine gene expression reveals novel distinctive tissue characteristics and evidence for improving genome annotation. Genome Biol 2010; 11:R102. [PMID: 20961407 PMCID: PMC3218658 DOI: 10.1186/gb-2010-11-10-r102] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Received: 04/20/2010] [Revised: 07/22/2010] [Accepted: 10/20/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A comprehensive transcriptome survey, or gene atlas, provides information essential for a complete understanding of the genomic biology of an organism. We present an atlas of RNA abundance for 92 adult, juvenile and fetal cattle tissues and three cattle cell lines. RESULTS The Bovine Gene Atlas was generated from 7.2 million unique digital gene expression tag sequences (300.2 million total raw tag sequences), from which 1.59 million unique tag sequences were identified that mapped to the draft bovine genome accounting for 85% of the total raw tag abundance. Filtering these tags yielded 87,764 unique tag sequences that unambiguously mapped to 16,517 annotated protein-coding loci in the draft genome accounting for 45% of the total raw tag abundance. Clustering of tissues based on tag abundance profiles generally confirmed ontology classification based on anatomy. There were 5,429 constitutively expressed loci and 3,445 constitutively expressed unique tag sequences mapping outside annotated gene boundaries that represent a resource for enhancing current gene models. Physical measures such as inferred transcript length or antisense tag abundance identified tissues with atypical transcriptional tag profiles. We report for the first time the tissue-specific variation in the proportion of mitochondrial transcriptional tag abundance. CONCLUSIONS The Bovine Gene Atlas is the deepest and broadest transcriptome survey of any livestock genome to date. Commonalities and variation in sense and antisense transcript tag profiles identified in different tissues facilitate the examination of the relationship between gene expression, tissue, and gene function.
Affiliation(s)
- Gregory P Harhay
- USDA-ARS US Meat Animal Research Center, State Spur 18 D, Clay Center, NE 68901, USA.
3
Pinto PI, Matsumura H, Thorne MA, Power DM, Terauchi R, Reinhardt R, Canário AV. Gill transcriptome response to changes in environmental calcium in the green spotted puffer fish. BMC Genomics 2010; 11:476. [PMID: 20716350 PMCID: PMC3091672 DOI: 10.1186/1471-2164-11-476] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 02/04/2010] [Accepted: 08/17/2010] [Indexed: 12/13/2022] Open
Abstract
Background Calcium ion is tightly regulated in body fluids and for euryhaline fish, which are exposed to rapid changes in environmental [Ca2+], homeostasis is especially challenging. The gill is the main organ of active calcium uptake and therefore plays a crucial role in the maintenance of calcium ion homeostasis. To study the molecular basis of the short-term responses to changing calcium availability, the whole gill transcriptome obtained by Super Serial Analysis of Gene Expression (SuperSAGE) of the euryhaline teleost green spotted puffer fish, Tetraodon nigroviridis, exposed to water with altered [Ca2+] was analysed. Results Transfer of T. nigroviridis from 10 ppt water salinity containing 2.9 mM Ca2+ to high (10 mM Ca2+ ) and low (0.01 mM Ca2+) calcium water of similar salinity for 2-12 h resulted in 1,339 differentially expressed SuperSAGE tags (26-bp transcript identifiers) in gills. Of these 869 tags (65%) were mapped to T. nigroviridis cDNAs or genomic DNA and 497 (57%) were assigned to known proteins. Thirteen percent of the genes matched multiple tags indicating alternative RNA transcripts. The main enriched gene ontology groups belong to Ca2+ signaling/homeostasis but also muscle contraction, cytoskeleton, energy production/homeostasis and tissue remodeling. K-means clustering identified co-expressed transcripts with distinct patterns in response to water [Ca2+] and exposure time. Conclusions The generated transcript expression patterns provide a framework of novel water calcium-responsive genes in the gill during the initial response after transfer to different [Ca2+]. This molecular response entails initial perception of alterations, activation of signaling networks and effectors and suggests active remodeling of cytoskeletal proteins during the initial acclimation process. Genes related to energy production and energy homeostasis are also up-regulated, probably reflecting the increased energetic needs of the acclimation response. 
This study is the first genome-wide transcriptome analysis of fish gills and is an important resource for future research on the short-term mechanisms involved in the gill acclimation responses to environmental Ca2+ changes and osmoregulation.
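The mapping percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the variable names are illustrative and not taken from the paper. Note that the 57% figure is relative to the mapped tags, not to all differentially expressed tags.

```python
# Counts taken from the abstract above; variable names are illustrative.
differential_tags = 1339   # differentially expressed SuperSAGE tags
mapped_tags = 869          # tags mapped to T. nigroviridis cDNAs or genomic DNA
annotated_tags = 497       # mapped tags assigned to known proteins

# Percentages rounded to the nearest integer, as in the abstract.
pct_mapped = round(100 * mapped_tags / differential_tags)
pct_annotated = round(100 * annotated_tags / mapped_tags)

print(f"mapped: {pct_mapped}% of differentially expressed tags")
print(f"annotated: {pct_annotated}% of mapped tags")
```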
Affiliation(s)
- Patrícia IS Pinto
- Centro de Ciências do Mar, CIMAR-Laboratório Associado, University of Algarve, Campus de Gambelas, 8005-139 Faro, Portugal.
4
Obermeier C, Hosseini B, Friedt W, Snowdon R. Gene expression profiling via LongSAGE in a non-model plant species: a case study in seeds of Brassica napus. BMC Genomics 2009; 10:295. [PMID: 19575793 PMCID: PMC2719671 DOI: 10.1186/1471-2164-10-295] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 02/16/2009] [Accepted: 07/03/2009] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Serial analysis of gene expression (LongSAGE) was applied for gene expression profiling in seeds of oilseed rape (Brassica napus ssp. napus). The usefulness of this technique for detailed expression profiling in a non-model organism was demonstrated for the highly complex, neither fully sequenced nor annotated genome of B. napus by applying a tag-to-gene matching strategy based on Brassica ESTs and the annotated proteome of the closely related model crucifer A. thaliana. RESULTS Transcripts from 3,094 genes were detected at two time-points of seed development, 23 days and 35 days after pollination (DAP). Differential expression showed a shift from gene expression involved in diverse developmental processes including cell proliferation and seed coat formation at 23 DAP to more focussed metabolic processes including storage protein accumulation and lipid deposition at 35 DAP. The most abundant transcripts at 23 DAP were coding for diverse protease inhibitor proteins and proteases, including cysteine proteases involved in seed coat formation and a number of lipid transfer proteins involved in embryo pattern formation. At 35 DAP, transcripts encoding napin, cruciferin and oleosin storage proteins were most abundant. Over both time-points, 18.6% of the detected genes were matched by Brassica ESTs identified by LongSAGE tags in antisense orientation. This suggests a strong involvement of antisense transcript expression in regulatory processes during B. napus seed development. CONCLUSION This study underlines the potential of transcript tagging approaches for gene expression profiling in Brassica crop species via EST matching to annotated A. thaliana genes. Limits of tag detection for low-abundance transcripts can today be overcome by ultra-high throughput sequencing approaches, so that tag-based gene expression profiling may soon become the method of choice for global expression profiling in non-model species.
Affiliation(s)
- Christian Obermeier
- Justus Liebig University Giessen, Department of Plant Breeding, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany.
5
Pinheiro DG, Galante PAF, de Souza SJ, Zago MA, Silva WA. A score system for quality evaluation of RNA sequence tags: an improvement for gene expression profiling. BMC Bioinformatics 2009; 10:170. [PMID: 19500384 PMCID: PMC2701951 DOI: 10.1186/1471-2105-10-170] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/16/2008] [Accepted: 06/06/2009] [Indexed: 12/01/2022] Open
Abstract
Background High-throughput molecular approaches for gene expression profiling, such as Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS) or Sequencing-by-Synthesis (SBS), are powerful techniques that provide global transcription profiles of different cell types by sequencing short fragments of transcripts, termed sequence tags. These techniques have improved our understanding of the relationships between these expression profiles and cellular phenotypes. Despite this, more reliable datasets are still necessary. In this work, we present a web-based tool named S3T: Score System for Sequence Tags, to index sequenced tags in accordance with their reliability. This is done through a series of evaluations based on a defined rule set. S3T allows the identification and selection of tags considered more reliable for further gene expression analysis. Results This methodology was applied to a public SAGE dataset. To compare data before and after filtering, a hierarchical clustering analysis was performed on samples from the same type of tissue, in distinct biological conditions, using these two datasets. Our results provide evidence suggesting that more congruous clusters can be found after applying the S3T scoring system. Conclusion These results substantiate the proposed application to generate more reliable data. This is a significant contribution to the determination of global gene expression profiles. The library analysis with S3T is freely available at . S3T source code and datasets can also be downloaded from the aforementioned website.
Affiliation(s)
- Daniel G Pinheiro
- Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP, Brazil.
6
Leyritz J, Schicklin S, Blachon S, Keime C, Robardet C, Boulicaut JF, Besson J, Pensa RG, Gandrillon O. SQUAT: A web tool to mine human, murine and avian SAGE data. BMC Bioinformatics 2008; 9:378. [PMID: 18801154 PMCID: PMC2567996 DOI: 10.1186/1471-2105-9-378] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 02/12/2008] [Accepted: 09/18/2008] [Indexed: 01/17/2023] Open
Abstract
Background There is an increasing need in transcriptome research for gene expression data and pattern warehouses. It is important to integrate into these warehouses both raw transcriptomic data and some properties encoded in these data, such as local patterns. Description We have developed an application called SQUAT (SAGE Querying and Analysis Tools) which is available at: . This database gives access to both raw SAGE data and patterns mined from these data, for three species (human, mouse and chicken). The database supports simple queries such as "In which biological situations is my favorite gene expressed?" as well as much more complex queries such as "What are the genes that are frequently co-over-expressed with my gene of interest in given biological situations?". Connections with external web databases enrich biological interpretations and enable sophisticated queries. To illustrate the power of SQUAT, we show and analyze the results of three different queries, one of which led to a biological hypothesis that was experimentally validated. Conclusion SQUAT is a user-friendly information retrieval platform, which aims at bringing some of the state-of-the-art mining tools to biologists.
Affiliation(s)
- Johan Leyritz
- Equipe Bases Moléculaires de l'Autorenouvellement et de ses Altérations, Université de Lyon, F-69622, Université Lyon 1, Villeurbanne, CNRS, UMR5534, Centre de Génétique Moléculaire et Cellulaire, Lyon, France.
7
Wang H, Zheng H, Azuaje F. Clustering-based approaches to SAGE data mining. BioData Min 2008; 1:5. [PMID: 18822151 PMCID: PMC2553774 DOI: 10.1186/1756-0381-1-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/25/2008] [Accepted: 07/17/2008] [Indexed: 11/12/2022] Open
Abstract
Serial analysis of gene expression (SAGE) is one of the most powerful tools for global gene expression profiling. It has led to several biological discoveries and biomedical applications, such as the prediction of new gene functions and the identification of biomarkers in human cancer research. Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It places an emphasis on current limitations and opportunities in this area for supporting biologically-meaningful data mining and visualisation.
Affiliation(s)
- Haiying Wang
- School of Computing and Mathematics, University of Ulster, Newtownabbey, BT37 0QB, Co. Antrim, Northern Ireland, UK.
8
Bresson C, Keime C, Faure C, Letrillard Y, Barbado M, Sanfilippo S, Benhra N, Gandrillon O, Gonin-Giraud S. Large-scale analysis by SAGE reveals new mechanisms of v-erbA oncogene action. BMC Genomics 2007; 8:390. [PMID: 17961265 PMCID: PMC2194726 DOI: 10.1186/1471-2164-8-390] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 05/21/2007] [Accepted: 10/26/2007] [Indexed: 11/10/2022] Open
Abstract
Background: The v-erbA oncogene, carried by the Avian Erythroblastosis Virus, derives from the c-erbAα proto-oncogene that encodes the nuclear receptor for triiodothyronine (T3R). v-ErbA transforms erythroid progenitors in vitro by blocking their differentiation, supposedly by interference with T3R and RAR (Retinoic Acid Receptor). However, v-ErbA target genes involved in its transforming activity still remain to be identified. Results: By using Serial Analysis of Gene Expression (SAGE), we identified 110 genes deregulated by v-ErbA and potentially implicated in the transformation process. Bioinformatic analysis of promoter sequences and transcriptional assays point to a potential role of c-Myb in the v-ErbA effect. Furthermore, grouping of newly identified target genes by function revealed both expected (chromatin/transcription) and unexpected (protein metabolism) functions potentially deregulated by v-ErbA. We then focused our study on 15 of the new v-ErbA target genes and demonstrated by real-time PCR that, for the majority, expression was activated neither by T3, nor by RA, nor during differentiation. This was unexpected based upon the previously known role of v-ErbA. Conclusion: This paper suggests the involvement of a wealth of new unanticipated mechanisms of v-ErbA action.
9
Norambuena T, Malig R, Melo F. SAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation. Nucleic Acids Res 2007; 35:W163-8. [PMID: 17626053 PMCID: PMC1933165 DOI: 10.1093/nar/gkm429] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
Abstract
We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html.
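To make the notion of "genomic virtual tags" concrete, here is a minimal sketch of how candidate tags and their ambiguity could be enumerated from a sequence. It is not the SAGExplore method: it assumes the common SAGE setup of an NlaIII anchoring site (CATG) followed by a 10-bp tag, scans the forward strand only, and omits the confidence ranking described in the abstract. The function names and the toy sequence are hypothetical.

```python
import re
from collections import defaultdict

ANCHOR = "CATG"  # NlaIII site, a common SAGE anchoring enzyme (assumption)
TAG_LEN = 10     # classic SAGE tag length downstream of the anchor (assumption)

def virtual_tags(sequence, tag_len=TAG_LEN):
    """Map each virtual tag to the anchor positions (0-based) where it occurs."""
    seq = sequence.upper()
    tags = defaultdict(list)
    # A lookahead pattern reports every anchor site, including overlapping ones.
    for match in re.finditer(f"(?={ANCHOR})", seq):
        start = match.start() + len(ANCHOR)
        tag = seq[start:start + tag_len]
        if len(tag) == tag_len:  # skip anchors too close to the sequence end
            tags[tag].append(match.start())
    return dict(tags)

def ambiguous(tags):
    """Keep only tags found at more than one position."""
    return {tag: pos for tag, pos in tags.items() if len(pos) > 1}

# Toy sequence with one repeated tag and one unique tag.
toy_genome = "AACATGAAAAACCCCCTTCATGAAAAACCCCCGGCATGTTTTTGGGGGAA"
tags = virtual_tags(toy_genome)
print(tags)
print(ambiguous(tags))
```

Tags reported by `ambiguous` are the ones that would need extra evidence before being assigned to a single locus.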
Affiliation(s)
- Francisco Melo
- *To whom correspondence should be addressed. +56 2 686 2279; +56 2 222 5515
10
Keime C, Sémon M, Mouchiroud D, Duret L, Gandrillon O. Unexpected observations after mapping LongSAGE tags to the human genome. BMC Bioinformatics 2007; 8:154. [PMID: 17504516 PMCID: PMC1884178 DOI: 10.1186/1471-2105-8-154] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 01/04/2007] [Accepted: 05/15/2007] [Indexed: 01/15/2023] Open
Abstract
Background SAGE has been used widely to study the expression of known transcripts, but much less to annotate new transcribed regions. LongSAGE produces tags that are sufficiently long to be reliably mapped to a whole-genome sequence. Here we used this property to study the position of human LongSAGE tags obtained from all public libraries. We focused mainly on tags that do not map to known transcripts. Results Using a published error rate in SAGE libraries, we first removed the tags likely to result from sequencing errors. We then observed that an unexpectedly large number of the remaining tags still did not match the genome sequence. Some of these correspond to parts of human mRNAs, such as polyA tails, junctions between two exons and polymorphic regions of transcripts. Another non-negligible proportion can be attributed to contamination by murine transcripts and to residual sequencing errors. After filtering out our data with these screens to ensure that our dataset is highly reliable, we studied the tags that map once to the genome. 31% of these tags correspond to unannotated transcripts. The others map to known transcribed regions, but many of them (nearly half) are located either in antisense or in new variants of these known transcripts. Conclusion We performed a comprehensive study of all publicly available human LongSAGE tags, and carefully verified the reliability of these data. We found the potential origin of many tags that did not match the human genome sequence. The properties of the remaining tags imply that the level of sequencing error may have been under-estimated. The frequency of tags matching once the genome sequence but not in an annotated exon suggests that the human transcriptome is much more complex than shown by the current human genome annotations, with many new splicing variants and antisense transcripts. 
SAGE data is appropriate to map new transcripts to the genome, as demonstrated by the high rate of cross-validation of the corresponding tags using other methods.
Affiliation(s)
- Céline Keime
- Université de Lyon, Lyon, F-69003, France ; Université Lyon 1, Lyon, F-69003, France, CNRS, UMR5534, Centre de génétique moléculaire et cellulaire, Villeurbanne, F-69622, France
- Université de Lyon, Lyon, F-69003, France ; Université Lyon 1, Lyon, F-69003, France, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, F-69622, France
- Marie Sémon
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
- Dominique Mouchiroud
- Université de Lyon, Lyon, F-69003, France ; Université Lyon 1, Lyon, F-69003, France, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, F-69622, France
- Laurent Duret
- Université de Lyon, Lyon, F-69003, France ; Université Lyon 1, Lyon, F-69003, France, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, F-69622, France
- Olivier Gandrillon
- Université de Lyon, Lyon, F-69003, France ; Université Lyon 1, Lyon, F-69003, France, CNRS, UMR5534, Centre de génétique moléculaire et cellulaire, Villeurbanne, F-69622, France
11
Rosinski-Chupin I, Briolay J, Brouilly P, Perrot S, Gomez SM, Chertemps T, Roth CW, Keime C, Gandrillon O, Couble P, Brey PT. SAGE analysis of mosquito salivary gland transcriptomes during Plasmodium invasion. Cell Microbiol 2006; 9:708-24. [PMID: 17054438 DOI: 10.1111/j.1462-5822.2006.00822.x] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 01/22/2023]
Abstract
Invasion of the vector salivary glands by Plasmodium is a critical step for malaria transmission. To describe salivary gland cellular responses to sporozoite invasion, we have undertaken the analysis of the Anopheles gambiae salivary gland transcriptome using Serial Analysis of Gene Expression (SAGE). Statistical analysis of the more than 160,000 sequenced tags generated from four libraries, two from glands infected by Plasmodium berghei and two from glands of controls, revealed that at least 57 Anopheles genes are differentially expressed in infected salivary glands. Among the 37 immune-related genes identified by SAGE tags, four (Defensin1, GNBP, Serpin6 and Cecropin2) were found to be upregulated during salivary gland invasion, while five genes encoding small secreted proteins display induction patterns strongly reminiscent of that of Cecropin2. Invasion by Plasmodium also has an impact on the expression of genes involved in transport, lipid and energy metabolism, suggesting that the sporozoite may exploit the metabolism of its host. In contrast, the protein composition of saliva is predicted to be only slightly modified after infection. This study, which is the first transcriptome analysis of the salivary gland response to Plasmodium infection, provides a basis for a better understanding of Plasmodium/Anopheles salivary gland interactions.
Affiliation(s)
- Isabelle Rosinski-Chupin
- Unité de Biochimie et Biologie Moléculaire des Insectes, Institut Pasteur, 28 rue du Dr Roux, 75724, Paris, France.
12
Kuo BYL, Chen Y, Bohacec S, Johansson Ö, Wasserman WW, Simpson EM. SAGE2Splice: unmapped SAGE tags reveal novel splice junctions. PLoS Comput Biol 2006; 2:e34. [PMID: 16683015 PMCID: PMC1447652 DOI: 10.1371/journal.pcbi.0020034] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 08/09/2005] [Accepted: 03/08/2006] [Indexed: 11/18/2022] Open
Abstract
Serial analysis of gene expression (SAGE) not only is a method for profiling the global expression of genes, but also offers the opportunity for the discovery of novel transcripts. SAGE tags are mapped to known transcripts to determine the gene of origin. Tags that map neither to a known transcript nor to the genome were hypothesized to span a splice junction, for which the exon combination or exon(s) are unknown. To test this hypothesis, we have developed an algorithm, SAGE2Splice, to efficiently map SAGE tags to potential splice junctions in a genome. The algorithm consists of three search levels. A scoring scheme was designed based on position weight matrices to assess the quality of candidates. Using optimized parameters for SAGE2Splice analysis and two sets of SAGE data, candidate junctions were discovered for 5%-6% of unmapped tags. Candidates were classified into three categories, reflecting the previous annotations of the putative splice junctions. Analysis of predicted tags extracted from EST sequences demonstrated that candidate junctions having the splice junction located closer to the center of the tags are more reliable. Nine of these 12 candidates were validated by RT-PCR and sequencing, and among these, four revealed previously uncharacterized exons. Thus, SAGE2Splice provides a new functionality for the identification of novel transcripts and exons. SAGE2Splice is available online at http://www.cisreg.ca.
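The core idea, that a tag matching neither a transcript nor the genome may straddle an exon-exon junction, can be illustrated with a small check. This is only a sketch under simplifying assumptions: it tests a single pair of candidate exons that are already known, whereas SAGE2Splice searches a genome in three levels and scores candidates with position weight matrices. The function names and sequences are hypothetical.

```python
def junction_split(tag, upstream_exon, downstream_exon):
    """Return the first k for which the tag equals the last k bases of the
    upstream exon followed by the first len(tag) - k bases of the downstream
    exon, or None if the tag does not span this junction."""
    for k in range(1, len(tag)):
        left, right = tag[:k], tag[k:]
        if upstream_exon.endswith(left) and downstream_exon.startswith(right):
            return k
    return None

def offset_from_center(tag, k):
    """Distance of the junction from the tag midpoint. The abstract reports
    that junctions nearer the center of the tag are more reliable."""
    return abs(k - len(tag) / 2)

# Toy example: a 14-base tag built from 6 upstream and 8 downstream bases.
upstream = "GGGTTTACGTACGA"
downstream = "TTGCAGGCCCAAA"
tag = "GTACGATTGCAGGC"

k = junction_split(tag, upstream, downstream)
print(k, offset_from_center(tag, k))
```

Because neither exon alone contains the full tag, a plain genome search would miss it, which is the situation the abstract describes.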
Affiliation(s)
- Byron Yu-Lin Kuo
- Genetics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Ying Chen
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Slavita Bohacec
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Öjvind Johansson
- Stockholm Bioinformatics Center, Kungliga Tekniska Högskolan, Albanova, Stockholm, Sweden
- Wyeth W Wasserman
- Genetics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Elizabeth M Simpson
- Genetics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- * To whom correspondence should be addressed. E-mail: