251
|
Ghedin E, Pumfery A, de la Fuente C, Yao K, Miller N, Lacoste V, Quackenbush J, Jacobson S, Kashanchi F. Use of a multi-virus array for the study of human viral and retroviral pathogens: gene expression studies and ChIP-chip analysis. Retrovirology 2004; 1:10. [PMID: 15169557 PMCID: PMC442135 DOI: 10.1186/1742-4690-1-10] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 04/16/2004] [Accepted: 05/25/2004] [Indexed: 11/10/2022] Open
Abstract
Background: Since the discovery of human immunodeficiency virus (HIV-1) twenty years ago, AIDS has become one of the most studied diseases. A number of viruses have subsequently been identified to contribute to the pathogenesis of HIV and its opportunistic infections and cancers. Therefore, a multi-virus array containing eight human viruses implicated in AIDS pathogenesis was developed and its efficacy in various applications was characterized. Results: The amplified open reading frames (ORFs) of human immunodeficiency virus type 1, human T cell leukemia virus types 1 and 2, hepatitis C virus, Epstein-Barr virus, human herpesvirus 6A and 6B, and Kaposi's sarcoma-associated herpesvirus were spotted on glass slides and hybridized to DNA and RNA samples. Using a random priming method for labeling genomic DNA or cDNA probes, we show specific detection of genomic viral DNA from cells infected with the human herpesviruses, and effectively demonstrate the inhibitory effects of a cellular cyclin-dependent kinase inhibitor on viral gene expression in HIV-1 and KSHV latently infected cells. In addition, we coupled chromatin immunoprecipitation with the virus chip (ChIP-chip) to study cellular protein and DNA binding. Conclusions: An amplicon-based virus chip representing eight human viruses was successfully used to identify each virus with little cross-hybridization. Furthermore, the identity of both viruses was correctly determined in co-infected cells. The utility of the virus chip was demonstrated by a variety of expression studies. Additionally, this is the first demonstrated use of ChIP-chip analysis to show specific binding of proteins to viral DNA, which, importantly, did not require further amplification for detection.
|
252
|
Agrawal D, Chen T, Irby R, Quackenbush J, Chambers AF, Szabo M, Cantor A, Coppola D, Yeatman TJ. Osteopontin identified as colon cancer tumor progression marker. C R Biol 2004; 326:1041-3. [PMID: 14744111 DOI: 10.1016/j.crvi.2003.09.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Indexed: 12/29/2022]
Abstract
Identifying molecular markers for colon cancer is a top priority. Using a pooled sample approach with Affymetrix GeneChip technology, we assayed colon cancers derived from a series of clinical stages to identify molecular markers of potential prognostic value. Of 12000 genes assessed, osteopontin emerged as the leading candidate tumor progression marker. Osteopontin is a secreted glycoprotein known to bind integrins and CD44. Its actual molecular function remains elusive but its increased expression correlates strongly with tumor progression.
|
253
|
|
254
|
Grigoryev DN, Ma SF, Irizarry RA, Ye SQ, Quackenbush J, Garcia JGN. Orthologous gene-expression profiling in multi-species models: search for candidate genes. Genome Biol 2004; 5:R34. [PMID: 15128448 PMCID: PMC416470 DOI: 10.1186/gb-2004-5-5-r34] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.9] [Received: 11/30/2003] [Revised: 01/26/2004] [Accepted: 03/16/2004] [Indexed: 12/15/2022] Open
Abstract
Microarray-driven gene-expression profiles are generally produced and analyzed for a single specific experimental model. We have assessed an analytical approach that simultaneously evaluates multi-species experimental models within a particular biological condition using orthologous genes as linkers for the various Affymetrix microarray platforms on multi-species models of ventilator-associated lung injury. The results suggest that this approach may be a useful tool in the evaluation of biological processes of interest and selection of process-related candidate genes.
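The idea of using orthologous genes as linkers between species-specific platforms can be sketched as follows. This is only an illustration under stated assumptions: the gene identifiers, the ortholog group IDs, and the selection rule (same direction of change in every species) are hypothetical and are not taken from the paper.

```python
def consistent_orthologs(profiles, orthologs):
    """Return ortholog groups that change in the same direction in every species.

    profiles:  {species: {gene_id: log2_fold_change}}
    orthologs: {(species, gene_id): ortholog_group_id}
    The "same sign in all species" rule is an assumed, simplified criterion.
    """
    by_group = {}
    for species, genes in profiles.items():
        for gene, lfc in genes.items():
            group = orthologs.get((species, gene))
            if group is not None:  # genes without a known ortholog are skipped
                by_group.setdefault(group, {})[species] = lfc

    n_species = len(profiles)
    selected = []
    for group, values in by_group.items():
        changes = list(values.values())
        in_all_species = len(changes) == n_species
        same_direction = all(v > 0 for v in changes) or all(v < 0 for v in changes)
        if in_all_species and same_direction:
            selected.append(group)
    return sorted(selected)


# Hypothetical two-species example.
profiles = {
    "mouse": {"m_geneA": 1.8, "m_geneB": -0.4},
    "rat": {"r_geneA": 1.1, "r_geneB": 0.6},
}
orthologs = {
    ("mouse", "m_geneA"): "OG1", ("rat", "r_geneA"): "OG1",
    ("mouse", "m_geneB"): "OG2", ("rat", "r_geneB"): "OG2",
}
print(consistent_orthologs(profiles, orthologs))  # ['OG1']
```

Keying the ortholog table on (species, gene) pairs lets each platform keep its own identifiers while still being compared through a shared group ID.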
|
255
|
Imanishi T, Itoh T, Suzuki Y, O'Donovan C, Fukuchi S, Koyanagi KO, Barrero RA, Tamura T, Yamaguchi-Kabata Y, Tanino M, Yura K, Miyazaki S, Ikeo K, Homma K, Kasprzyk A, Nishikawa T, Hirakawa M, Thierry-Mieg J, Thierry-Mieg D, Ashurst J, Jia L, Nakao M, Thomas MA, Mulder N, Karavidopoulou Y, Jin L, Kim S, Yasuda T, Lenhard B, Eveno E, Suzuki Y, Yamasaki C, Takeda JI, Gough C, Hilton P, Fujii Y, Sakai H, Tanaka S, Amid C, Bellgard M, Bonaldo MDF, Bono H, Bromberg SK, Brookes AJ, Bruford E, Carninci P, Chelala C, Couillault C, de Souza SJ, Debily MA, Devignes MD, Dubchak I, Endo T, Estreicher A, Eyras E, Fukami-Kobayashi K, R. Gopinath G, Graudens E, Hahn Y, Han M, Han ZG, Hanada K, Hanaoka H, Harada E, Hashimoto K, Hinz U, Hirai M, Hishiki T, Hopkinson I, Imbeaud S, Inoko H, Kanapin A, Kaneko Y, Kasukawa T, Kelso J, Kersey P, Kikuno R, Kimura K, Korn B, Kuryshev V, Makalowska I, Makino T, Mano S, Mariage-Samson R, Mashima J, Matsuda H, Mewes HW, Minoshima S, Nagai K, Nagasaki H, Nagata N, Nigam R, Ogasawara O, Ohara O, Ohtsubo M, Okada N, Okido T, Oota S, Ota M, Ota T, Otsuki T, Piatier-Tonneau D, Poustka A, Ren SX, Saitou N, Sakai K, Sakamoto S, Sakate R, Schupp I, Servant F, Sherry S, Shiba R, Shimizu N, Shimoyama M, Simpson AJ, Soares B, Steward C, Suwa M, Suzuki M, Takahashi A, Tamiya G, Tanaka H, Taylor T, Terwilliger JD, Unneberg P, Veeramachaneni V, Watanabe S, Wilming L, Yasuda N, Yoo HS, Stodolsky M, Makalowski W, Go M, Nakai K, Takagi T, Kanehisa M, Sakaki Y, Quackenbush J, Okazaki Y, Hayashizaki Y, Hide W, Chakraborty R, Nishikawa K, Sugawara H, Tateno Y, Chen Z, Oishi M, Tonellato P, Apweiler R, Okubo K, Wagner L, Wiemann S, Strausberg RL, Isogai T, Auffray C, Nomura N, Gojobori T, Sugano S. Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol 2004; 2:e162. 
[PMID: 15103394 PMCID: PMC393292 DOI: 10.1371/journal.pbio.0020162] [Citation(s) in RCA: 267] [Impact Index Per Article: 13.4] [Received: 12/19/2003] [Accepted: 04/01/2004] [Indexed: 01/08/2023] Open
Abstract
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
|
256
|
Sharov V, Kwong KY, Frank B, Chen E, Hasseman J, Gaspard R, Yu Y, Yang I, Quackenbush J. The limits of log-ratios. BMC Biotechnol 2004; 4:3. [PMID: 15113428 PMCID: PMC400743 DOI: 10.1186/1472-6750-4-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 11/15/2003] [Accepted: 03/08/2004] [Indexed: 12/04/2022] Open
Abstract
Background: DNA microarray assays typically compare two biological samples and present the results of those comparisons gene-by-gene as the logarithm base two of the ratio of the measured expression levels for the two samples. Results: Because of the fixed dynamic range of fluorescence and other detection systems, there is a limit to the range of comparisons that can be made using any array technology, and this must be taken into account when interpreting the results of any such analysis. Conclusions: The dynamic range of microarray data collection systems results in limits in the comparative analyses that can be derived from such measurements and suggests that optimal results can be obtained by making measurements that avoid the boundaries of that dynamic range.
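The bound described in this abstract can be made concrete with a short calculation. The sketch below assumes a detector whose usable signal lies between a background floor of 64 and a 16-bit ceiling of 65535; both values are illustrative assumptions, not figures from the paper.

```python
import math


def log_ratio_limits(intensity, floor=64.0, ceiling=65535.0):
    """Return the (lowest, highest) measurable log2(query / reference).

    `intensity` is the reference-channel signal for one spot. `floor` and
    `ceiling` are assumed detector limits chosen only for illustration.
    """
    if not floor <= intensity <= ceiling:
        raise ValueError("reference intensity is outside the detector range")
    return math.log2(floor / intensity), math.log2(ceiling / intensity)


# A spot near either boundary can only report changes in one direction.
for reference in (64.0, 2048.0, 65535.0):
    low, high = log_ratio_limits(reference)
    print(f"reference={reference:8.0f}  log2 ratio range: [{low:6.2f}, {high:5.2f}]")
```

Under these assumptions a reference signal in the middle of the range (2048) leaves roughly five log2 units in each direction, whereas a signal at the floor or the ceiling can only register change one way, which is the reason for avoiding the boundaries of the dynamic range.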
|
257
|
Rexroad CE, Lee Y, Keele JW, Karamycheva S, Brown G, Koop B, Gahr SA, Palti Y, Quackenbush J. Sequence analysis of a rainbow trout cDNA library and creation of a gene index. Cytogenet Genome Res 2004; 102:347-54. [PMID: 14970727 DOI: 10.1159/000075773] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.0] [Received: 05/06/2003] [Accepted: 07/30/2003] [Indexed: 11/19/2022] Open
Abstract
Expressed sequence tag (EST) projects have produced extremely valuable resources for identifying genes affecting phenotypes of interest. A large-scale EST sequencing project for rainbow trout was initiated to identify and functionally annotate as many unique transcripts as possible. Over 45,000 5' ESTs were obtained by sequencing clones from a single normalized library constructed using mRNA from six tissues. The production of this sequence data and creation of a rainbow trout Gene Index eliminating redundancy and providing annotation for these sequences will facilitate research in this species.
|
258
|
Bloom G, Yang IV, Boulware D, Kwong KY, Coppola D, Eschrich S, Quackenbush J, Yeatman TJ. Multi-platform, multi-site, microarray-based human tumor classification. Am J Pathol 2004; 164:9-16. [PMID: 14695313 PMCID: PMC1602228 DOI: 10.1016/s0002-9440(10)63090-8] [Citation(s) in RCA: 144] [Impact Index Per Article: 7.2] [Indexed: 11/24/2022]
Abstract
The introduction of gene expression profiling has resulted in the production of rich human data sets with potential for deciphering tumor diagnosis, prognosis, and therapy. Here we demonstrate how artificial neural networks (ANNs) can be applied to two completely different microarray platforms (cDNA and oligonucleotide), or a combination of both, to build tumor classifiers capable of deciphering the identity of most human cancers. First, 78 tumors representing eight different types of histologically similar adenocarcinoma were evaluated with a 32k cDNA microarray and correctly classified by a cDNA-based ANN, using independent training and test sets, with a mean accuracy of 83%. To expand our approach, oligonucleotide data derived from six independent performance sites, representing 463 tumors and 21 tumor types, were assembled, normalized, and scaled. An oligonucleotide-based ANN, trained on a random fraction of the tumors (n = 343), was 88% accurate in predicting known pathological origin of the remaining fraction of tumors (n = 120) not exposed to the training algorithm. Finally, a mixed-platform classifier using a combination of both cDNA and oligonucleotide microarray data from seven performance sites, normalized and scaled from a large and diverse tumor set (n = 539), produced similar results (85% accuracy) on independent test sets. Further validation of our classifiers was achieved by accurately (84%) predicting the known primary site of origin for an independent set of metastatic lesions (n = 50), resected from brain, lung, and liver, potentially addressing the vexing classification problems imposed by unknown primary cancers. These cDNA- and oligonucleotide-based classifiers provide a first proof of principle that data derived from multiple platforms and performance sites can be exploited to build multi-tissue tumor classifiers.
|
259
|
Yang I, Eschrich S, Bloom G, Quackenbush J, Yeatman TJ. Molecular profiling predicts colon cancer survival better than Dukes staging. Ann Surg Oncol 2004. [DOI: 10.1007/bf02523978] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]
|
260
|
Whitelaw CA, Barbazuk WB, Pertea G, Chan AP, Cheung F, Lee Y, Zheng L, van Heeringen S, Karamycheva S, Bennetzen JL, SanMiguel P, Lakey N, Bedell J, Yuan Y, Budiman MA, Resnick A, Van Aken S, Utterback T, Riedmuller S, Williams M, Feldblyum T, Schubert K, Beachy R, Fraser CM, Quackenbush J. Enrichment of gene-coding sequences in maize by genome filtration. Science 2004; 302:2118-20. [PMID: 14684821 DOI: 10.1126/science.1090047] [Citation(s) in RCA: 171] [Impact Index Per Article: 8.6] [Indexed: 11/02/2022]
Abstract
Approximately 80% of the maize genome comprises highly repetitive sequences interspersed with single-copy, gene-rich sequences, and standard genome sequencing strategies are not readily adaptable to this type of genome. Methodologies that enrich for genic sequences might more rapidly generate useful results from complex genomes. Equivalent numbers of clones from maize selected by techniques called methylation filtering and High C0t selection were sequenced to generate approximately 200,000 reads (approximately 132 megabases), which were assembled into contigs. Combination of the two techniques resulted in a sixfold reduction in the effective genome size and a fourfold increase in the gene identification rate in comparison to a nonenriched library.
|
261
|
Abstract
The genomic revolution, manifested by the sequencing of the complete genome of many organisms, along with technological advances, such as DNA microarrays and developments in high-throughput analysis of proteins, metabolites, and isotopic tracer distribution patterns, challenged the conventional ways in which questions are approached in the biological sciences: (a) rather than examining a small number of genes and/or reactions at any one time, we can now analyze gene expression and protein activity in the context of systems of interacting genes and gene products; (b) comprehensive analysis of biological systems requires the integration of all cellular fingerprints: genome sequence, maps of gene expression, protein expression, metabolic output, and in vivo enzymatic activity; and (c) collecting, managing, and analyzing comparable data from various cellular profiles requires expertise from several fields that transcend traditional discipline boundaries. While researchers in systems biology have still to address difficult challenges in both experimental and computational arenas, they possess, for the first time, the opportunity to unravel the mechanisms of life. The enormous impact of these discoveries in diverse areas, such as metabolic engineering, strain selection, drug screening and development, bioprocess development, disease prognosis and diagnosis, gene and other medical therapies, is an obvious motivation for pursuing integrated analyses of cellular systems.
|
262
|
Brentani H, Caballero OL, Camargo AA, da Silva AM, da Silva WA, Dias Neto E, Grivet M, Gruber A, Guimaraes PEM, Hide W, Iseli C, Jongeneel CV, Kelso J, Nagai MA, Ojopi EPB, Osorio EC, Reis EMR, Riggins GJ, Simpson AJG, de Souza S, Stevenson BJ, Strausberg RL, Tajara EH, Verjovski-Almeida S, Acencio ML, Bengtson MH, Bettoni F, Bodmer WF, Briones MRS, Camargo LP, Cavenee W, Cerutti JM, Coelho Andrade LE, Costa dos Santos PC, Ramos Costa MC, da Silva IT, Estécio MRH, Sa Ferreira K, Furnari FB, Faria M, Galante PAF, Guimaraes GS, Holanda AJ, Kimura ET, Leerkes MR, Lu X, Maciel RMB, Martins EAL, Massirer KB, Melo ASA, Mestriner CA, Miracca EC, Miranda LL, Nobrega FG, Oliveira PS, Paquola ACM, Pandolfi JRC, Campos Pardini MIDM, Passetti F, Quackenbush J, Schnabel B, Sogayar MC, Souza JE, Valentini SR, Zaiats AC, Amaral EJ, Arnaldi LAT, de Araújo AG, de Bessa SA, Bicknell DC, Ribeiro de Camaro ME, Carraro DM, Carrer H, Carvalho AF, Colin C, Costa F, Curcio C, Guerreiro da Silva IDC, Pereira da Silva N, Dellamano M, El-Dorry H, Espreafico EM, Scattone Ferreira AJ, Ayres Ferreira C, Fortes MAHZ, Gama AH, Giannella-Neto D, Giannella MLCC, Giorgi RR, Goldman GH, Goldman MHS, Hackel C, Ho PL, Kimura EM, Kowalski LP, Krieger JE, Leite LCC, Lopes A, Luna AMSC, Mackay A, Mari SKN, Marques AA, Martins WK, Montagnini A, Mourão Neto M, Nascimento ALTO, Neville AM, Nobrega MP, O'Hare MJ, Otsuka AY, Ruas de Melo AI, Paco-Larson ML, Guimarães Pereira G, Pereira da Silva N, Pesquero JB, Pessoa JG, Rahal P, Rainho CA, Rodrigues V, Rogatto SR, Romano CM, Romeiro JG, Rossi BM, Rusticci M, Guerra de Sá R, Sant' Anna SC, Sarmazo ML, Silva TCDLE, Soares FA, Sonati MDF, de Freitas Sousa J, Queiroz D, Valente V, Vettore AL, Villanova FE, Zago MA, Zalcberg H. The generation and utilization of a cancer-oriented representation of the human transcriptome by using expressed sequence tags. Proc Natl Acad Sci U S A 2003; 100:13418-23. 
[PMID: 14593198 PMCID: PMC263829 DOI: 10.1073/pnas.1233632100] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.1] [Indexed: 11/18/2022] Open
Abstract
Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define approximately 23,500 genes, of which only approximately 1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes reveals that <1% do not have corresponding ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body. More than 100,000 ESTs are available for seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, and that may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants (although the one-pass nature of the data necessitates careful validation) and many alternatively spliced transcripts. Although widely exploited by the scientific community, vindicating our totally open source policy, the EST data generated still provide extensive information that remains to be systematically explored, and that may further facilitate progress toward both the understanding and treatment of human cancers.
|
263
|
Abstract
DNA microarray analysis has provided a wealth of data on global patterns of gene expression but has yet to deliver on its early promise of identifying networks of interacting gene products. In his Perspective, Quackenbush discusses new work (Stuart et al.) that uses evolutionary conservation of gene expression patterns in yeast, worm, fruit fly, and human in an attempt to identify functionally related groups of genes.
|
264
|
Abulencia JP, Gaspard R, Healy ZR, Gaarde WA, Quackenbush J, Konstantopoulos K. Shear-induced cyclooxygenase-2 via a JNK2/c-Jun-dependent pathway regulates prostaglandin receptor expression in chondrocytic cells. J Biol Chem 2003; 278:28388-94. [PMID: 12743126 DOI: 10.1074/jbc.m301378200] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022] Open
Abstract
Using cDNA microarrays coupled with bioinformatics tools, we elucidated a signaling cascade regulating cyclooxygenase-2 (COX-2), a pivotal pro-inflammatory enzyme expressed in rheumatic and osteoarthritic, but not normal, cartilage. Exposure of T/C-28a2 chondrocytic cells to fluid shear results in co-regulation of c-Jun N-terminal kinase2 (JNK2), c-Jun, and COX-2 as well as concomitant downstream expression of prostaglandin receptors EP2 and EP3a1. JNK2 transcript inhibition abrogated shear-induced COX-2, EP2, and EP3a1 mRNA up-regulation as well as c-Jun phosphorylation. Functional knock-out experiments using an antisense c-Jun oligonucleotide revealed the abolition of shear-induced COX-2, EP2, and EP3a1, but not JNK2, transcripts. Moreover, inhibition of COX-2 activity eliminated mRNA upregulation of EP2 and EP3a1 induced by shear. Hence, a biochemical pathway exists wherein fluid shear activates COX-2, via a JNK2/c-Jun-dependent pathway, which in turn elicits downstream EP2 and EP3a1 mRNA synthesis.
|
265
|
Chen T, Yang I, Irby R, Shain KH, Wang HG, Quackenbush J, Coppola D, Cheng JQ, Yeatman TJ. Regulation of caspase expression and apoptosis by adenomatous polyposis coli. Cancer Res 2003; 63:4368-74. [PMID: 12907606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
Abstract
The adenomatous polyposis coli (APC) gene, a member of the WNT pathway, has been shown to assign intestinal epithelial cells to a program of proliferation or differentiation through regulation of the beta-catenin/TCF-4 complex. Wild-type APC, in certain cellular contexts, appears to induce differentiation and apoptosis, although mutant forms of APC, known to produce polyps and ultimately cancers, may suppress these events. Here, we show that mutant forms of APC can induce repression of select terminal caspases as a potential means of attenuating responses to apoptotic stimuli. Using gene expression profiling to interrogate the intact intestines of Apc(+/min) mice harboring numerous polyps, we identified a reduction in the mRNA expression of both caspases 3 and 7. We additionally identified a reduction in protein levels of caspase-3, caspase-7, and caspase-9 in human colon cancer specimens known to harbor APC mutations. A reduction in caspase protein levels resulted in resistance to apoptotic-inducing agents and restoration of caspase levels reinstated apoptotic capacities. Consistent with Wnt pathway involvement, dominant negative TCF/LEF induced caspase protein expression. These data provide support for the hypothesis that one of the functions of APC is the regulation of caspase activity and other apoptotic proteins by controlling their expression levels in the cell.
|
266
|
Kasukawa T, Furuno M, Nikaido I, Bono H, Hume DA, Bult C, Hill DP, Baldarelli R, Gough J, Kanapin A, Matsuda H, Schriml LM, Hayashizaki Y, Okazaki Y, Quackenbush J. Development and evaluation of an automated annotation pipeline and cDNA annotation system. Genome Res 2003; 13:1542-51. [PMID: 12819153 PMCID: PMC403710 DOI: 10.1101/gr.992803] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.3] [Indexed: 11/25/2022]
Abstract
Manual curation has long been held to be the "gold standard" for functional annotation of DNA sequence. Our experience with the annotation of more than 20,000 full-length cDNA sequences revealed problems with this approach, including inaccurate and inconsistent assignment of gene names, as well as many good assignments that were difficult to reproduce using only computational methods. For the FANTOM2 annotation of more than 60,000 cDNA clones, we developed a number of methods and tools to circumvent some of these problems, including an automated annotation pipeline that provides high-quality preliminary annotation for each sequence by introducing an "uninformative filter" that eliminates uninformative annotations, controlled vocabularies to accurately reflect both the functional assignments and the evidence supporting them, and a highly refined, Web-based manual annotation tool that allows users to view a wide array of sequence analyses and to assign gene names and putative functions using a consistent nomenclature. The ultimate utility of our approach is reflected in the low rate of reassignment of automated assignments by manual curation. Based on these results, we propose a new standard for large-scale annotation, in which the initial automated annotations are manually investigated and then computational methods are iteratively modified and improved based on the results of manual curation.
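The "uninformative filter" step lends itself to a small illustration. The sketch below is an assumption about how such a filter might work, not the FANTOM2 implementation: the list of uninformative phrases, the function names, and the (description, score) hit format are all hypothetical.

```python
# Hypothetical phrases treated as carrying no functional information.
UNINFORMATIVE_TERMS = (
    "hypothetical protein",
    "unnamed protein product",
    "unknown",
)


def is_informative(description):
    """True when a hit description contains none of the assumed uninformative phrases."""
    text = description.strip().lower()
    return bool(text) and not any(term in text for term in UNINFORMATIVE_TERMS)


def best_annotation(hits):
    """Pick the highest-scoring informative description from (description, score) pairs.

    Returns None when every hit is uninformative, leaving the sequence
    for manual curation.
    """
    informative = [hit for hit in hits if is_informative(hit[0])]
    if not informative:
        return None
    return max(informative, key=lambda hit: hit[1])[0]


hits = [
    ("hypothetical protein", 950),
    ("unnamed protein product", 820),
    ("cyclin-dependent kinase inhibitor 1A", 700),
]
print(best_annotation(hits))  # cyclin-dependent kinase inhibitor 1A
```

The point of the design is that the top-scoring hit is not automatically the best name: a lower-scoring but descriptive hit is preferred over a higher-scoring placeholder.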
|
267
|
Wells CA, Ravasi T, Sultana R, Yagi K, Carninci P, Bono H, Faulkner G, Okazaki Y, Quackenbush J, Hume DA, Lyons PA. Continued discovery of transcriptional units expressed in cells of the mouse mononuclear phagocyte lineage. Genome Res 2003; 13:1360-5. [PMID: 12819134 PMCID: PMC403663 DOI: 10.1101/gr.1056103] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.8] [Received: 12/11/2002] [Accepted: 02/25/2003] [Indexed: 11/24/2022]
Abstract
The current RIKEN transcript set represents a significant proportion of the mouse transcriptome but transcripts expressed in the innate and acquired immune systems are poorly represented. In the present study we have assessed the complexity of the transcriptome expressed in mouse macrophages before and after treatment with lipopolysaccharide, a global regulator of macrophage gene expression, using existing RIKEN 19K arrays. By comparison to array profiles of other cells and tissues, we identify a large set of macrophage-enriched genes, many of which have obvious functions in endocytosis and phagocytosis. In addition, a significant number of LPS-inducible genes were identified. The data suggest that macrophages are a complex source of mRNA for transcriptome studies. To assess complexity and identify additional macrophage expressed genes, cDNA libraries were created from purified populations of macrophage and dendritic cells, a functionally related cell type. Sequence analysis revealed a high incidence of novel mRNAs within these cDNA libraries. These studies provide insights into the depths of transcriptional complexity still untapped amongst products of inducible genes, and identify macrophage and dendritic cell populations as a starting point for sampling the inducible mammalian transcriptome.
|
268
|
Merrick JM, Osman A, Tsai J, Quackenbush J, LoVerde PT, Lee NH. The Schistosoma mansoni gene index: gene discovery and biology by reconstruction and analysis of expressed gene sequences. J Parasitol 2003; 89:261-9. [PMID: 12760639 DOI: 10.1645/0022-3395(2003)089[0261:tsmgig]2.0.co;2] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022] Open
Abstract
Expressed sequence tag (EST) sequencing and analysis is a primary research tool to identify and characterize the Schistosoma mansoni transcriptome. As part of our gene discovery effort, a total of 5,793 ESTs have been generated from clones selected randomly from complementary DNA (cDNA) libraries constructed from male and female adult worms. Assembly analysis of all the 16,813 public S. mansoni ESTs has identified 1,920 distinct tentative consensus sequences (TCs) and 5,571 nonoverlapping ESTs (singletons). Of these, 376 TCs (20%) and 1,449 singletons (26%) are unique to the SUNY/TIGR sequencing effort. Tentative consensus sequences and singletons were distributed into various categories of biological roles associated with cell structure, metabolism, protein fate, signal transduction, transcription, protein synthesis, transporters, and cell growth. The TCs and singletons represent transcripts that can be used as a resource for functional annotation of genomic sequence data, comparative sequence analysis, and cDNA clone selection for microarray projects. The utility of EST analysis is demonstrated by identifying new protease genes, which may be involved in hemoglobin degradation.
|
269
|
Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, Tsai J, Quackenbush J. TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics 2003; 19:651-2. [PMID: 12651724 DOI: 10.1093/bioinformatics/btg034] [Citation(s) in RCA: 1332] [Impact Index Per Article: 63.4] [Indexed: 11/14/2022] Open
Abstract
TGICL is a pipeline for analysis of large Expressed Sequence Tags (EST) and mRNA databases in which the sequences are first clustered based on pairwise sequence similarity, and then assembled by individual clusters (optionally with quality values) to produce longer, more complete consensus sequences. The system can run on multi-CPU architectures including SMP and PVM.
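The first stage described here, grouping sequences by pairwise similarity before assembling each cluster separately, can be sketched with single-linkage clustering over a union-find structure. This is a toy illustration and not the TGICL code: a shared k-mer count stands in for real pairwise alignment, and the function names, `k`, and `min_shared` threshold are assumptions.

```python
from itertools import combinations


def kmers(seq, k=8):
    """Set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster(seqs, k=8, min_shared=3):
    """Single-linkage clusters of sequences sharing at least `min_shared` k-mers.

    seqs: {name: sequence}. Shared k-mers are a crude stand-in for the
    pairwise similarity search a real pipeline would run.
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    index = {name: kmers(seq, k) for name, seq in seqs.items()}
    for a, b in combinations(seqs, 2):
        if len(index[a] & index[b]) >= min_shared:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Two overlapping fragments of one made-up transcript plus an unrelated read.
transcript = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
ests = {
    "est1": transcript[:25],
    "est2": transcript[10:35],
    "est3": "TTTTTTTTTTCCCCCCCCCCAAAAAAAAAA",
}
print(cluster(ests))  # [['est1', 'est2'], ['est3']]
```

Each resulting cluster could then be handed to an assembler independently, which is what makes the approach easy to spread across multiple CPUs.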
|
270
|
Kim H, Snesrud EC, Haas B, Cheung F, Town CD, Quackenbush J. Gene expression analyses of Arabidopsis chromosome 2 using a genomic DNA amplicon microarray. Genome Res 2003; 13:327-40. [PMID: 12618363 PMCID: PMC430289 DOI: 10.1101/gr.552003] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Received: 07/03/2002] [Accepted: 12/20/2002] [Indexed: 11/24/2022]
Abstract
The gene predictions and accompanying functional assignments resulting from the sequencing and annotation of a genome represent hypotheses that can be tested and used to develop a more complete understanding of the organism and its biology. In the model plant Arabidopsis thaliana, we developed a novel approach to constructing whole-genome microarrays based on PCR amplification of the 3' ends of each predicted gene from genomic DNA, and constructed an array representing more than 94% of the predicted genes and pseudogenes on chromosome 2. With this array, we examined various tissues and physiological conditions, providing expression-based validation for 84% of the gene predictions and providing clues as to the functions of many predicted genes. Further, by examining the distribution of expression along the physical chromosome, we were able to identify a region of repressed transcription that may represent a previously undescribed heterochromatic region.
|
271
|
Dudoit S, Gentleman RC, Quackenbush J. Open source software for the analysis of microarray data. Biotechniques 2003; Suppl:45-51. [PMID: 12664684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023] Open
Abstract
DNA microarray assays represent the first widely used application that attempts to build upon the information provided by genome projects in the study of biological questions. One of the greatest challenges with working with microarrays is collecting, managing, and analyzing data. Although several commercial and noncommercial solutions exist, there is a growing body of freely available, open source software that allows users to analyze data using a host of existing techniques and to develop their own and integrate them within the system. Here we review three of the most widely used and comprehensive systems, the statistical analysis tools written in R through the Bioconductor project (http://www.bioconductor.org), the Java-based TM4 software system available from The Institute for Genomic Research (http://www.tigr.org/software), and BASE, the Web-based system developed at Lund University (http://base.thep.lu.se).
|
272
|
Dudoit S, Gentleman RC, Quackenbush J. Open Source Software for the Analysis of Microarray Data. Biotechniques 2003. [DOI: 10.2144/mar03dudoit] [Citation(s) in RCA: 167] [Impact Index Per Article: 8.0] [Indexed: 11/23/2022] Open
Abstract
DNA microarray assays represent the first widely used application that attempts to build upon the information provided by genome projects in the study of biological questions. One of the greatest challenges with working with microarrays is collecting, managing, and analyzing data. Although several commercial and noncommercial solutions exist, there is a growing body of freely available, open source software that allows users to analyze data using a host of existing techniques and to develop their own and integrate them within the system. Here we review three of the most widely used and comprehensive systems, the statistical analysis tools written in R through the Bioconductor project (http://www.bioconductor.org), the Java®-based TM4 software system available from The Institute for Genomic Research (http://www.tigr.org/software), and BASE, the Web-based system developed at Lund University (http://base.thep.lu.se).
|
273
|
Cook DN, Wang S, Howles GP, Speer M, Churchhill G, Quackenbush J, Schwartz DA. The genetics of innate immunity in the lung. Chest 2003; 123:369S. [PMID: 12628980 DOI: 10.1378/chest.123.3_suppl.369s] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/01/2022] Open
|
274
|
Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, Sturn A, Snuffin M, Rezantsev A, Popov D, Ryltsov A, Kostukovich E, Borisovsky I, Liu Z, Vinsavich A, Trush V, Quackenbush J. TM4: a free, open-source system for microarray data management and analysis. Biotechniques 2003; 34:374-8. [PMID: 12613259 DOI: 10.2144/03342mt01] [Citation(s) in RCA: 3697] [Impact Index Per Article: 176.0] [Indexed: 12/26/2022] Open
|
275
|
Ronning CM, Stegalkina SS, Ascenzi RA, Bougri O, Hart AL, Utterbach TR, Vanaken SE, Riedmuller SB, White JA, Cho J, Pertea GM, Lee Y, Karamycheva S, Sultana R, Tsai J, Quackenbush J, Griffiths HM, Restrepo S, Smart CD, Fry WE, Van Der Hoeven R, Tanksley S, Zhang P, Jin H, Yamamoto ML, Baker BJ, Buell CR. Comparative analyses of potato expressed sequence tag libraries. Plant Physiol 2003; 131:419-29. [PMID: 12586867 PMCID: PMC166819 DOI: 10.1104/pp.013581] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.3] [Received: 08/26/2002] [Revised: 10/21/2002] [Accepted: 11/14/2002] [Indexed: 05/18/2023]
Abstract
The cultivated potato (Solanum tuberosum) shares similar biology with other members of the Solanaceae, yet has features unique within the family, such as modified stems (stolons) that develop into edible tubers. To better understand potato biology, we have undertaken a survey of the potato transcriptome using expressed sequence tags (ESTs) from diverse tissues. A total of 61,940 ESTs were generated from aerial tissues, below-ground tissues, and tissues challenged with the late-blight pathogen (Phytophthora infestans). Clustering and assembly of these ESTs resulted in a total of 19,892 unique sequences with 8,741 tentative consensus sequences and 11,151 singleton ESTs. We were able to identify a putative function for 43.7% of these sequences. A number of sequences (48) were expressed throughout the libraries sampled, representing constitutively expressed sequences. Other sequences (13,068, 21%) were uniquely expressed and were detected only in a single library. Using hierarchical and k-means clustering of the EST sequences, we were able to correlate changes in gene expression with major physiological events in potato biology. Using pair-wise comparisons of tuber-related tissues, we were able to associate genes with tuber initiation, dormancy, and sprouting. We also were able to identify a number of characterized as well as novel sequences that were unique to the incompatible interaction with the late-blight pathogen, thereby providing a foundation for further understanding the mechanism of resistance.
|