Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Lee S, Clark T, Chen J, Zhou G, Scott LR, Rowley JD, Wang SM. Correct identification of genes from serial analysis of gene expression tag sequences. Genomics 2002;79:598-602. [PMID: 11944993 DOI: 10.1006/geno.2002.6730] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 11/22/2022]

Cited by Other Article(s)

Gunaratne PH. Advances in Genome Biology and Technology. Expert Rev Mol Diagn 2014;4:757-60. [PMID: 15525218 DOI: 10.1586/14737159.4.6.757] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]

Smandi S, Guerfali FZ, Farhat M, Ben-Aissa K, Laouini D, Guizani-Tabbane L, Dellagi K, Benkahla A. Methodology optimizing SAGE library tag-to-gene mapping: application to Leishmania. BMC Res Notes 2012;5:74. [PMID: 22283878 PMCID: PMC3292834 DOI: 10.1186/1756-0500-5-74] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/23/2011] [Accepted: 01/27/2012] [Indexed: 11/10/2022]

Xiang Y, Wang Y, Luo Y, Zhang B, Xin J, Zheng D. Molecular biocompatibility evaluation of poly(d,l-lactic acid)-modified biomaterials based on long serial analysis of gene expression. Colloids Surf B Biointerfaces 2011;85:248-61. [DOI: 10.1016/j.colsurfb.2011.02.036] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/01/2010] [Revised: 02/23/2011] [Accepted: 02/25/2011] [Indexed: 10/18/2022]

Wu Q, Kim YC, Lu J, Xuan Z, Chen J, Zheng Y, Zhou T, Zhang MQ, Wu CI, Wang SM. Poly A- transcripts expressed in HeLa cells. PLoS One 2008;3:e2803. [PMID: 18665230 PMCID: PMC2481391 DOI: 10.1371/journal.pone.0002803] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.6] [Received: 03/21/2008] [Accepted: 07/04/2008] [Indexed: 12/20/2022]

Affiliation(s)

Qingfa Wu: Center for Functional Genomics, Division of Medical Genetics, Department of Medicine, ENH Research Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
Yeong C. Kim: Center for Functional Genomics, Division of Medical Genetics, Department of Medicine, ENH Research Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
Jian Lu: Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
Zhenyu Xuan: Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
Jun Chen: Center for Functional Genomics, Division of Medical Genetics, Department of Medicine, ENH Research Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
Yonglan Zheng: Center for Functional Genomics, Division of Medical Genetics, Department of Medicine, ENH Research Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
Tom Zhou: Center for Functional Genomics, Division of Medical Genetics, Department of Medicine, ENH Research Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
Michael Q. Zhang: Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
Chung-I Wu: Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
San Ming Wang: Center for Functional Genomics, Division of Medical Genetics, Department of Medicine, ENH Research Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America; Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America

Zhu J, He F, Wang J, Yu J. Modeling transcriptome based on transcript-sampling data. PLoS One 2008;3:e1659. [PMID: 18286206 PMCID: PMC2243018 DOI: 10.1371/journal.pone.0001659] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 12/10/2007] [Accepted: 01/21/2008] [Indexed: 01/10/2023]

Abstract

Background

Newly evolved multiplex sequencing technology has been bringing transcriptome sequencing to an unprecedented depth. Millions of transcript tags can now be acquired in a single experiment through parallelization. The significant increase in throughput and reduction in cost require us to address some fundamental questions: How many transcript tags do we have to sequence for a given transcriptome? How can we estimate the total number of unique transcripts for different cell types (transcriptome diversity) and the distribution of their copy numbers (transcriptome dynamics)? What is the probability that a transcript with a given expression level will be detected at a certain sampling depth?

Methodology/Principal Findings

We developed a statistical model to evaluate these parameters based on transcriptome-sampling data. Three mixture models were explored for their potential to model the sampling frequencies. We demonstrated that the relative abundances of all transcripts in a transcriptome follow the generalized inverse Gaussian distribution. The widely known beta and gamma distributions failed to capture the singular characteristics of the relative abundance distribution, i.e., highly skewed toward zero and with a long tail. An estimator of transcriptome diversity and an analytical form of the sampling growth curve were proposed in a coherent framework. Experimental data fitted this model very well, and Monte Carlo simulations based on this model replicated sampling experiments with remarkable precision.

Conclusions

Taking human embryonic stem cell as a prototype, we demonstrated that sequencing tens of thousands of transcript tags in an ordinary EST/SAGE experiment was far from sufficient. In order to fully characterize a human transcriptome, millions of transcript tags had to be sequenced. This model lays a statistical basis for transcriptome-sampling experiments and in essence can be used in all sampling-based data.
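The detection question raised in this abstract can be illustrated with a much simpler calculation than the authors' mixture model. The sketch below assumes that tags are drawn independently and that a transcript has a fixed, known relative abundance p, so the chance of seeing it at least once among n tags is 1 - (1 - p)^n. This is not the generalized inverse Gaussian model described above; the function names and the example abundance of one in a million are hypothetical.

```python
import math


def detection_probability(relative_abundance: float, n_tags: int) -> float:
    """Probability of sampling a transcript at least once in n_tags draws.

    Assumes independent draws with a fixed relative abundance p, giving
    1 - (1 - p)**n. log1p/expm1 keep the result accurate for very small p.
    """
    if not 0.0 < relative_abundance < 1.0:
        raise ValueError("relative_abundance must lie strictly between 0 and 1")
    if n_tags < 0:
        raise ValueError("n_tags must be non-negative")
    return -math.expm1(n_tags * math.log1p(-relative_abundance))


def tags_needed(relative_abundance: float, target: float = 0.95) -> int:
    """Smallest sampling depth whose detection probability reaches `target`."""
    if not 0.0 < relative_abundance < 1.0:
        raise ValueError("relative_abundance must lie strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)**n >= target for n.
    return math.ceil(math.log1p(-target) / math.log1p(-relative_abundance))


if __name__ == "__main__":
    p = 1e-6  # hypothetical transcript: one copy per million tags
    for depth in (50_000, 5_000_000):
        print(depth, round(detection_probability(p, depth), 4))
    print("tags needed for 95% detection:", tags_needed(p))
```

Under these assumptions a transcript at one part per million is rarely seen in a library of tens of thousands of tags and is almost always seen in one of several million, which is in line with the qualitative conclusion of the abstract.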

Ge X, Wang SM. Identifying nonspecific SAGE tags by context of gene expression. Methods Mol Biol 2008;387:199-204. [PMID: 18287633 DOI: 10.1007/978-1-59745-454-4_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/25/2023]

Wang SM. Understanding SAGE data. Trends Genet 2006;23:42-50. [PMID: 17109989 DOI: 10.1016/j.tig.2006.11.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Received: 04/12/2006] [Revised: 10/05/2006] [Accepted: 11/01/2006] [Indexed: 02/08/2023]

Accurate and unambiguous tag-to-gene mapping in serial analysis of gene expression. BMC Bioinformatics 2006;7:487. [PMID: 17083742 PMCID: PMC1637119 DOI: 10.1186/1471-2105-7-487] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 07/11/2006] [Accepted: 11/04/2006] [Indexed: 12/24/2022]

Abstract

BACKGROUND

In this study, we present a robust and reliable computational method for tag-to-gene assignment in serial analysis of gene expression (SAGE). The method relies on current genome information and annotation, incorporation of several new features, and key improvements over alternative methods, all of which are important to determine gene expression levels more accurately. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome.

RESULTS

We applied this method to the Saccharomyces cerevisiae genome, producing the most thorough and accurate annotation of potential virtual SAGE tags that is available today for this organism. The usefulness of this method is exemplified by the significant reduction of ambiguous cases in existing experimental SAGE data. In addition, we report new insights from the analysis of existing SAGE data. First, we found that experimental SAGE tags mapping onto introns, intron-exon boundaries, and non-coding RNA elements are observed in all available SAGE data. Second, a significant fraction of experimental SAGE tags was found to map onto genomic regions currently annotated as intergenic. Third, a significant number of existing experimental SAGE tags for yeast has been derived from truncated cDNAs, which are synthesized through oligo-d(T) priming to internal poly-(A) regions during reverse transcription.

CONCLUSION

We conclude that an accurate and unambiguous tag mapping process is essential to increase the quality and the amount of information that can be extracted from SAGE experiments. This is supported by the results obtained here and also by the large impact that the erroneous interpretation of these data could have on downstream applications.

Wang SM. Applying the SAGE technique to study the effects of electromagnetic field on biological systems. Proteomics 2006;6:4765-8. [PMID: 16897688 DOI: 10.1002/pmic.200500881] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]

Friedland DR, Popper P, Eernisse R, Ringger B, Cioffi JA. Differential expression of cytoskeletal genes in the cochlear nucleus. Anat Rec A Discov Mol Cell Evol Biol 2006;288:447-65. [PMID: 16550590 PMCID: PMC2570442 DOI: 10.1002/ar.a.20303] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]

Ge X, Jung YC, Wu Q, Kibbe WA, Wang SM. Annotating nonspecific SAGE tags with microarray data. Genomics 2006;87:173-80. [PMID: 16314072 DOI: 10.1016/j.ygeno.2005.08.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 03/21/2005] [Revised: 08/09/2005] [Accepted: 08/27/2005] [Indexed: 11/26/2022]

Poroyko V, Hejlek LG, Spollen WG, Springer GK, Nguyen HT, Sharp RE, Bohnert HJ. The maize root transcriptome by serial analysis of gene expression. Plant Physiol 2005;138:1700-10. [PMID: 15965024 PMCID: PMC1176439 DOI: 10.1104/pp.104.057638] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 05/03/2023]

Ibrahim AFM, Hedley PE, Cardle L, Kruger W, Marshall DF, Muehlbauer GJ, Waugh R. A comparative analysis of transcript abundance using SAGE and Affymetrix arrays. Funct Integr Genomics 2005;5:163-74. [PMID: 15714318 DOI: 10.1007/s10142-005-0135-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Received: 10/14/2004] [Revised: 12/13/2004] [Accepted: 12/22/2004] [Indexed: 12/18/2022]

Fujishima N, Hirokawa M, Aiba N, Ichikawa Y, Fujishima M, Komatsuda A, Suzuki Y, Kawabata Y, Miura I, Sawada KI. Gene Expression Profiling of Human Erythroid Progenitors by Micro-Serial Analysis of Gene Expression. Int J Hematol 2004;80:239-45. [PMID: 15540898 DOI: 10.1532/ijh97.04053] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022]

Cekan SZ. Methods to find out the expression of activated genes. Reprod Biol Endocrinol 2004;2:68. [PMID: 15385048 PMCID: PMC524190 DOI: 10.1186/1477-7827-2-68] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/08/2004] [Accepted: 09/23/2004] [Indexed: 11/23/2022]

Maloney JP. Expression of thrombospondins in skeletal muscle. FASEB J 2004;18:1049. [PMID: 15226264 DOI: 10.1096/fj.04-1647lte] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]

Sun M, Zhou G, Lee S, Chen J, Shi RZ, Wang SM. SAGE is far more sensitive than EST for detecting low-abundance transcripts. BMC Genomics 2004;5:1. [PMID: 14704093 PMCID: PMC317289 DOI: 10.1186/1471-2164-5-1] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Received: 09/03/2003] [Accepted: 01/05/2004] [Indexed: 11/10/2022]

Fizames C, Muños S, Cazettes C, Nacry P, Boucherez J, Gaymard F, Piquemal D, Delorme V, Commes T, Doumas P, Cooke R, Marti J, Sentenac H, Gojon A. The Arabidopsis root transcriptome by serial analysis of gene expression. Gene identification using the genome sequence. Plant Physiol 2004;134:67-80. [PMID: 14730065 PMCID: PMC316288 DOI: 10.1104/pp.103.030536] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Received: 07/18/2003] [Revised: 09/07/2003] [Accepted: 10/22/2003] [Indexed: 05/18/2023]

Hayden PS, El-Meanawy A, Schelling JR, Sedor JR. DNA expression analysis: serial analysis of gene expression, microarrays and kidney disease. Curr Opin Nephrol Hypertens 2003;12:407-14. [PMID: 12815337 DOI: 10.1097/00041552-200307000-00009] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/27/2022]

Pleasance ED, Marra MA, Jones SJM. Assessment of SAGE in transcript identification. Genome Res 2003;13:1203-15. [PMID: 12743019 PMCID: PMC403648 DOI: 10.1101/gr.873003] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]

Unneberg P, Wennborg A, Larsson M. Transcript identification by analysis of short sequence tags--influence of tag length, restriction site and transcript database. Nucleic Acids Res 2003;31:2217-26. [PMID: 12682372 PMCID: PMC153741 DOI: 10.1093/nar/gkg313] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]

Abstract

There exist a number of gene expression profiling techniques that utilize restriction enzymes for generation of short expressed sequence tags. We have studied how the choice of restriction enzyme influences various characteristics of tags generated in an experiment. We have also investigated various aspects of in silico transcript identification that these profiling methods rely on. First, analysis of 14 248 mRNA sequences derived from the RefSeq transcript database showed that 1-30% of the sequences lack a given restriction enzyme recognition site. Moreover, 1-5% of the transcripts have recognition sites located less than 10 bases from the poly(A) tail. The uniqueness of 10 bp tags lies in the range 90-95%, which increases only slightly with longer tags, due to the existence of closely related transcripts. Furthermore, 3-30% of upstream 10 bp tags are identical to 3' tags, introducing a risk of misclassification if upstream tags are present in a sample. Second, we found that a sequence length of 16-17 bp, including the recognition site, is sufficient for unique transcript identification by BLAST based sequence alignment to the UniGene Human non-redundant database. Third, we constructed a tag-to-gene mapping for UniGene and compared it to an existing mapping database. The mappings agreed to 79-83%, where the selection of representative sequences in the UniGene clusters is the main cause of the disagreement. The results of this study may serve to improve the interpretation of sequence-based expression studies and the design of hybridization arrays, by identifying short tags that have a high reliability and separating them from tags that carry an inherent ambiguity in their capacity to discriminate between genes. To this end, supplementary information in the form of a web companion to this paper is located at http://biobase.biotech.kth.se/tagseq.
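The in silico step this abstract describes (locating the 3'-most recognition site in each transcript, taking the bases immediately downstream as the tag, and asking how many tags point to a single transcript) can be sketched as follows. This is an illustration only and not the authors' pipeline: it assumes the NlaIII recognition site CATG commonly used as the SAGE anchoring enzyme, a 10 bp tag, transcript sequences supplied without their poly(A) tail, and a handful of made-up toy sequences. All function and variable names are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def three_prime_tag(transcript: str, site: str = "CATG", tag_len: int = 10) -> Optional[str]:
    """Return the tag downstream of the 3'-most recognition site, or None.

    None is returned when the site is absent, or when fewer than tag_len
    bases follow it (the "site too close to the 3' end" case).
    """
    seq = transcript.upper()
    pos = seq.rfind(site)  # rfind gives the 3'-most occurrence
    if pos == -1:
        return None
    start = pos + len(site)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def tag_uniqueness(transcripts: Dict[str, str], site: str = "CATG", tag_len: int = 10) -> float:
    """Fraction of tagged transcripts whose tag is shared with no other transcript."""
    tags = [three_prime_tag(seq, site, tag_len) for seq in transcripts.values()]
    tagged = [t for t in tags if t is not None]
    if not tagged:
        return 0.0
    counts = Counter(tagged)
    return sum(1 for t in tagged if counts[t] == 1) / len(tagged)


if __name__ == "__main__":
    # Toy sequences invented for illustration; they are not real transcripts.
    toy = {
        "tx_a": "GGGCATGAAACCCGGGTTTAC",  # shares its tag with tx_b
        "tx_b": "TTCATGAAACCCGGGTAA",
        "tx_c": "ACGTCATGTTTTGGGGCCAA",   # unique tag
        "tx_d": "ACGTACGTACGT",           # no recognition site
        "tx_e": "AAACATGCC",              # site too close to the 3' end
    }
    for name, seq in toy.items():
        print(name, three_prime_tag(seq))
    print("uniqueness:", tag_uniqueness(toy))
```

In the toy set, two of the three tagged transcripts share a tag, so only one tag identifies its transcript unambiguously. Passing a larger tag_len shows how longer tags change that fraction on a given sequence set.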

Zhang MQ. Computational prediction of eukaryotic protein-coding genes. Nat Rev Genet 2002;3:698-709. [PMID: 12209144 DOI: 10.1038/nrg890] [Citation(s) in RCA: 124] [Impact Index Per Article: 5.6] [Indexed: 11/09/2022]

Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2002. [PMCID: PMC2447335 DOI: 10.1002/cfg.120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]