Reference Citation Analysis

Total Articles: 83 (from Reference Citation Analysis)
Article PDFs: 20
Cited by > 0: 46
Searched Name: Robert Gentleman

Results Analysis

Number	Citation Analysis
26	Landsman D, Gentleman R, Kelso J, Francis Ouellette BF. DATABASE: A new forum for biological databases and curation. Database: The Journal of Biological Databases and Curation 2009;2009:bap002. [PMID: 20157475 PMCID: PMC2790300 DOI: 10.1093/database/bap002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
27	Carey VJ, Gentleman R. Interpreting genetics of gene expression: integrative architecture in Bioconductor. Pacific Symposium on Biocomputing 2009:380-90. [PMID: 19209716 PMCID: PMC3378382 DOI: 10.1142/9789812836939_0036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023] Abstract: Several influential studies of genotypic determinants of gene expression in humans have now been published based on various populations including HapMap cohorts. The magnitude of the analytic task (transcriptome vs. SNP-genome) is a hindrance to dissemination of efficient, thorough, and auditable inference methods for this project. We describe the structure and use of Bioconductor facilities for inference in genetics of gene expression, with simultaneous application to multiple HapMap cohorts. Tools distributed for this purpose are readily adapted for the structure and analysis of privately-generated data in expression genetics. MeSH Headings: Biometry; Carrier Proteins/genetics; Cohort Studies; Databases, Genetic; Forkhead Transcription Factors/genetics; Gene Expression Profiling/statistics & numerical data; Genetics, Population; HLA-DR Antigens/genetics; HLA-DRB1 Chains; Humans; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Regulatory Elements, Transcriptional; Software; Urotensins/genetics. Grants: R01 HL086601-01 NHLBI NIH HHS; R01 HL086601 NHLBI NIH HHS; P41 HG004059-01 NHGRI NIH HHS; P41 HG004059 NHGRI NIH HHS; R01 HG003646-01 NHGRI NIH HHS; R01 HG003646 NHGRI NIH HHS.
28	Kauffmann A, Gentleman R, Huber W. arrayQualityMetrics--a bioconductor package for quality assessment of microarray data. Bioinformatics 2008;25:415-6. [PMID: 19106121 PMCID: PMC2639074 DOI: 10.1093/bioinformatics/btn647] [Citation(s) in RCA: 651] [Impact Index Per Article: 40.7] [Indexed: 11/14/2022] Abstract: SUMMARY: The assessment of data quality is a major concern in microarray analysis. arrayQualityMetrics is a Bioconductor package that provides a report with diagnostic plots for one or two colour microarray data. The quality metrics assess reproducibility, identify apparent outlier arrays and compute measures of signal-to-noise ratio. The tool handles most current microarray technologies and is amenable to use in automated analysis pipelines or for automatic report generation, as well as for use by individuals. The diagnosis of quality remains, in principle, a context-dependent judgement, but our tool provides powerful, automated, objective and comprehensive instruments on which to base a decision. AVAILABILITY: arrayQualityMetrics is a free and open source package, under LGPL license, available from the Bioconductor project at www.bioconductor.org. A users guide and examples are provided with the package. Some examples of HTML reports generated by arrayQualityMetrics can be found at http://www.microarray-quality.org
29	Sarkar D, Parkin R, Wyman S, Bendoraite A, Sather C, Delrow J, Godwin AK, Drescher C, Huber W, Gentleman R, Tewari M. Quality assessment and data analysis for microRNA expression arrays. Nucleic Acids Res 2008;37:e17. [PMID: 19103660 PMCID: PMC2632898 DOI: 10.1093/nar/gkn932] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Indexed: 11/14/2022] Abstract: MicroRNAs are small (approximately 22 nt) RNAs that regulate gene expression and play important roles in both normal and disease physiology. The use of microarrays for global characterization of microRNA expression is becoming increasingly popular and has the potential to be a widely used and valuable research tool. However, microarray profiling of microRNA expression raises a number of data analytic challenges that must be addressed in order to obtain reliable results. We introduce here a universal reference microRNA reagent set as well as a series of nonhuman spiked-in synthetic microRNA controls, and demonstrate their use for quality control and between-array normalization of microRNA expression data. We also introduce diagnostic plots designed to assess and compare various normalization methods. We anticipate that the reagents and analytic approach presented here will be useful for improving the reliability of microRNA microarray experiments.
30	Gentleman R. Some perspectives on statistical computing. Can J Stat 2008. [DOI: 10.2307/3315925] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
31	Le Meur N, Gentleman R. Modeling synthetic lethality. Genome Biol 2008;9:R135. [PMID: 18789146 PMCID: PMC2592713 DOI: 10.1186/gb-2008-9-9-r135] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Received: 07/11/2008] [Accepted: 09/12/2008] [Indexed: 11/10/2022] Abstract: BACKGROUND: Synthetic lethality defines a genetic interaction where the combination of mutations in two or more genes leads to cell death. The implications of synthetic lethal screens have been discussed in the context of drug development as synthetic lethal pairs could be used to selectively kill cancer cells, but leave normal cells relatively unharmed. A challenge is to assess genome-wide experimental data and integrate the results to better understand the underlying biological processes. We propose statistical and computational tools that can be used to find relationships between synthetic lethality and cellular organizational units. RESULTS: In Saccharomyces cerevisiae, we identified multi-protein complexes and pairs of multi-protein complexes that share an unusually high number of synthetic genetic interactions. As previously predicted, we found that synthetic lethality can arise from subunits of an essential multi-protein complex or between pairs of multi-protein complexes. Finally, using multi-protein complexes allowed us to take into account the pleiotropic nature of the gene products. CONCLUSIONS: Modeling synthetic lethality using current estimates of the yeast interactome is an efficient approach to disentangle some of the complex molecular interactions that drive a cell. Our model in conjunction with applied statistical methods and computational methods provides new tools to better characterize synthetic genetic interactions.
32	Oron AP, Jiang Z, Gentleman R. Gene set enrichment analysis using linear models and diagnostics. Bioinformatics 2008;24:2586-91. [PMID: 18790795 DOI: 10.1093/bioinformatics/btn465] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022] Abstract: MOTIVATION: Gene-set enrichment analysis (GSEA) can be greatly enhanced by linear model (regression) diagnostic techniques. Diagnostics can be used to identify outlying or influential samples, and also to evaluate model fit and explore model expansion. RESULTS: We demonstrate this methodology on an adult acute lymphoblastic leukemia (ALL) dataset, using GSEA based on chromosome-band mapping of genes. Individual residuals, grouped or aggregated by chromosomal loci, indicate problematic samples and potential data-entry errors, and help identify hyperdiploidy as a factor playing a key role in expression for this dataset. Subsequent analysis pinpoints suspected DNA copy number abnormalities of specific samples and chromosomes (most prevalent are chromosomes X, 21 and 14), and also reveals significant expression differences between the hyperdiploid and diploid groups on other chromosomes (most prominently 19, 22, 3 and 13)--differences which are apparently not associated with copy number. AVAILABILITY: Software for the statistical tools demonstrated in this article is available as Bioconductor package GSEAlm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
33	Bar M, Wyman SK, Fritz BR, Qi J, Garg KS, Parkin RK, Kroh EM, Bendoraite A, Mitchell PS, Nelson AM, Ruzzo WL, Ware C, Radich JP, Gentleman R, Ruohola-Baker H, Tewari M. MicroRNA discovery and profiling in human embryonic stem cells by deep sequencing of small RNA libraries. Stem Cells 2008;26:2496-505. [PMID: 18583537 DOI: 10.1634/stemcells.2008-0356] [Citation(s) in RCA: 235] [Impact Index Per Article: 14.7] [Indexed: 12/23/2022] Abstract: We used massively parallel pyrosequencing to discover and characterize microRNAs (miRNAs) expressed in human embryonic stem cells (hESC). Sequencing of small RNA cDNA libraries derived from undifferentiated hESC and from isogenic differentiating cultures yielded a total of 425,505 high-quality sequence reads. A custom data analysis pipeline delineated expression profiles for 191 previously annotated miRNAs, 13 novel miRNAs, and 56 candidate miRNAs. Further characterization of a subset of the novel miRNAs in Dicer-knockdown hESC demonstrated Dicer-dependent expression, providing additional validation of our results. A set of 14 miRNAs (9 known and 5 novel) was noted to be expressed in undifferentiated hESC and then strongly downregulated with differentiation. Functional annotation analysis of predicted targets of these miRNAs and comparison with a null model using non-hESC-expressed miRNAs identified statistically enriched functional categories, including chromatin remodeling and lineage-specific differentiation annotations. Finally, integration of our data with genome-wide chromatin immunoprecipitation data on OCT4, SOX2, and NANOG binding sites implicates these transcription factors in the regulation of nine of the novel/candidate miRNAs identified here. Comparison of our results with those of recent deep sequencing studies in mouse and human ESC shows that most of the novel/candidate miRNAs found here were not identified in the other studies. The data indicate that hESC express a larger complement of miRNAs than previously appreciated, and they provide a resource for additional studies of miRNA regulation of hESC physiology. MeSH Headings: Base Sequence; Cell Differentiation; Cell Line; Databases, Genetic; Embryonic Stem Cells/cytology; Embryonic Stem Cells/enzymology; Embryonic Stem Cells/metabolism; Expressed Sequence Tags; Gene Expression Profiling; Gene Expression Regulation, Developmental; Gene Library; Humans; MicroRNAs/chemistry; MicroRNAs/genetics; Molecular Sequence Data; Multipotent Stem Cells/cytology; Multipotent Stem Cells/metabolism; Nucleic Acid Conformation; Pluripotent Stem Cells/cytology; Pluripotent Stem Cells/metabolism; RNA, Small Interfering/chemistry; RNA, Small Interfering/genetics; Reproducibility of Results; Reverse Transcriptase Polymerase Chain Reaction; Ribonuclease III/metabolism; Sequence Analysis, RNA; Transcription Factors/metabolism. Grants: P41 HG004059-01 NHGRI NIH HHS; P20 GM069983-01 NIGMS NIH HHS; T32 CA080416-10 NCI NIH HHS; K12 CA076930 NCI NIH HHS; P30 ES007033-14 NIEHS NIH HHS; T32 CA009515-21 NCI NIH HHS; P30 ES007033 NIEHS NIH HHS; R01 GM083867 NIGMS NIH HHS; P30 CA015704-34 NCI NIH HHS; T32 CA009657 NCI NIH HHS; T32 CA009515-22 NCI NIH HHS; 5 K12 CA076930 NCI NIH HHS; T32 CA080416 NCI NIH HHS; R01 GM083867-01A2 NIGMS NIH HHS; 5 T32 CA009515-21/22 NCI NIH HHS; P30 CA015704 NCI NIH HHS; P30 ES07033 NIEHS NIH HHS; T32 CA009657-16 NCI NIH HHS; CA80416 NCI NIH HHS; P20 GM069983 NIGMS NIH HHS; 5 P30 CA015704 NCI NIH HHS; K12 CA076930-10 NCI NIH HHS; P41 HG004059 NHGRI NIH HHS; T32 CA009515 NCI NIH HHS; 5 T32 CA09657-16 NCI NIH HHS; P01 GM081619 NIGMS NIH HHS; P01 GM081619-01 NIGMS NIH HHS.
34	Chiang T, Scholtens D, Sarkar D, Gentleman R, Huber W. Coverage and error models of protein-protein interaction data by directed graph analysis. Genome Biol 2008;8:R186. [PMID: 17845715 PMCID: PMC2375024 DOI: 10.1186/gb-2007-8-9-r186] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Received: 03/12/2007] [Revised: 05/26/2007] [Accepted: 09/10/2007] [Indexed: 01/10/2023] Abstract: Using a directed graph model for bait to prey systems and a multinomial error model, we assessed the error statistics in all published large-scale datasets for Saccharomyces cerevisiae and characterized them by three traits: the set of tested interactions, artifacts that lead to false-positive or false-negative observations, and estimates of the stochastic error rates that affect the data. These traits provide a prerequisite for the estimation of the protein interactome and its modules. MeSH Headings: Algorithms; Computational Biology/methods; Computer Graphics; Databases, Protein; Fungal Proteins; Gene Expression Regulation, Fungal; Genomics/methods; Models, Biological; Protein Interaction Mapping; Proteome; Proteomics/methods; Regression Analysis; Reproducibility of Results; Saccharomyces cerevisiae/genetics; Stochastic Processes.
35	Risk M, Coleman I, Dumpit R, Gentleman R, Kristal AR, Knudsen BS, Nelson PS, Lin DW. Differential gene expression in normal prostate epithelium of men with and without prostate cancer. J Clin Oncol 2008. [DOI: 10.1200/jco.2008.26.15_suppl.5142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
36	Gentleman R, Huber W. Making the most of high-throughput protein-interaction data. Genome Biol 2008;8:112. [PMID: 18001486 PMCID: PMC2246275 DOI: 10.1186/gb-2007-8-10-112] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 01/27/2023] Abstract: We review the estimation of coverage and error rate in high-throughput protein-protein interaction datasets and argue that reports of the low quality of such data are to a substantial extent based on misinterpretations. Probabilistic statistical models and methods can be used to estimate properties of interest and to make the best use of the available data.
37	Sarkar D, Le Meur N, Gentleman R. Using flowViz to visualize flow cytometry data. Bioinformatics 2008;24:878-9. [PMID: 18245128 DOI: 10.1093/bioinformatics/btn021] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Indexed: 11/14/2022] Abstract: Automated analysis of flow cytometry (FCM) data is essential for it to become successful as a high throughput technology. We believe that the principles of Trellis graphics can be adapted to provide useful visualizations that can aid such automation. In this article, we describe the R/Bioconductor package flowViz that implements such visualizations. AVAILABILITY: flowViz is available as an R package from the Bioconductor project: http://bioconductor.org
38	Scholtens D, Chiang T, Huber W, Gentleman R. Estimating node degree in bait-prey graphs. Bioinformatics 2007;24:218-24. [PMID: 18025006 DOI: 10.1093/bioinformatics/btm565] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022] Abstract: MOTIVATION: Proteins work together to drive biological processes in cellular machines. Summarizing global and local properties of the set of protein interactions, the interactome, is necessary for describing cellular systems. We consider a relatively simple per-protein feature of the interactome: the number of interaction partners for a protein, which in graph terminology is the degree of the protein. RESULTS: Using data subject to both stochastic and systematic sources of false positive and false negative observations, we develop an explicit probability model and resultant likelihood method to estimate node degree on portions of the interactome assayed by bait-prey technologies. This approach yields substantial improvement in degree estimation over the current practice that naively sums observed edges. Accurate modeling of observed data in relation to true but unknown parameters of interest gives a formal point of reference from which to draw conclusions about the system under study. AVAILABILITY: All analyses discussed in this text can be performed using the ppiStats and ppiData packages available through the Bioconductor project (http://www.bioconductor.org).
39	Chiang T, Li N, Orchard S, Kerrien S, Hermjakob H, Gentleman R, Huber W. Rintact: enabling computational analysis of molecular interaction data from the IntAct repository. Bioinformatics 2007;24:1100-1. [PMID: 17989096 DOI: 10.1093/bioinformatics/btm518] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Abstract: MOTIVATION: The IntAct repository is one of the largest and most widely used databases for the curation and storage of molecular interaction data. These datasets need to be analyzed by computational methods. Software packages in the statistical environment R provide powerful tools for conducting such analyses. RESULTS: We introduce Rintact, a Bioconductor package that allows users to transform PSI-MI XML2.5 interaction data files from IntAct into R graph objects. On these, they can use methods from R and Bioconductor for a variety of tasks: determining cohesive subgraphs, computing summary statistics, fitting mathematical models to the data or rendering graphical layouts. Rintact provides a programmatic interface to the IntAct repository and allows the use of the analytic methods provided by R and Bioconductor. AVAILABILITY: Rintact is freely available at http://bioconductor.org
40	Huber W, Carey VJ, Long L, Falcon S, Gentleman R. Graphs in molecular biology. BMC Bioinformatics 2007;8 Suppl 6:S8. [PMID: 17903289 PMCID: PMC1995545 DOI: 10.1186/1471-2105-8-s6-s8] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.2] [Indexed: 11/30/2022] Abstract: Graph theoretical concepts are useful for the description and analysis of interactions and relationships in biological systems. We give a brief introduction into some of the concepts and their areas of application in molecular biology. We discuss software that is available through the Bioconductor project and present a simple example application to the integration of a protein-protein interaction and a co-expression network. MeSH Headings: Algorithms; Computational Biology/methods; Computational Biology/trends; Computer Graphics; Computer Simulation; Gene Expression Profiling/methods; Gene Expression Regulation/physiology; Models, Biological; Oligonucleotide Array Sequence Analysis/methods; Proteome/metabolism; Signal Transduction/physiology; Transcription, Genetic/physiology.
41	Le Meur N, Rossini A, Gasparetto M, Smith C, Brinkman RR, Gentleman R. Data quality assessment of ungated flow cytometry data in high throughput experiments. Cytometry A 2007;71:393-403. [PMID: 17366638 PMCID: PMC2768034 DOI: 10.1002/cyto.a.20396] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Abstract: BACKGROUND: The recent development of semiautomated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. Especially, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. METHODS: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. RESULTS: We found that graphical representations can reveal substantial nonbiological differences in samples. Empirical Cumulative Distribution Function and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review. CONCLUSIONS: Graphical exploratory data analytic tools are quick and useful means of assessing data quality. We propose that the described visualizations should be used as quality assessment tools and where possible, be used for quality control. MeSH Headings: Algorithms; Antibodies, Monoclonal/pharmacology; Antibodies, Monoclonal, Murine-Derived; Antigens, CD/analysis; Antineoplastic Agents/pharmacology; Artifacts; Biomarkers/analysis; Cell Line, Tumor; Cell Separation/methods; Cell Separation/standards; Cell Survival/drug effects; Cluster Analysis; Computer Graphics; Data Interpretation, Statistical; Flow Cytometry/methods; Flow Cytometry/standards; Graft vs Host Disease/diagnosis; Graft vs Host Disease/immunology; Humans; Miniaturization/standards; Quality Control; Reproducibility of Results; Rituximab; Software; Time Factors. Grants: R01 EB005034 NIBIB NIH HHS; R01 EB005034-03 NIBIB NIH HHS.
42	Gentleman R, Temple Lang D. Statistical Analyses and Reproducible Research. J Comput Graph Stat 2007. [DOI: 10.1198/106186007x178663] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.8] [Indexed: 11/21/2022]
43	Carey VJ, Morgan M, Falcon S, Lazarus R, Gentleman R. GGtools: analysis of genetics of gene expression in bioconductor. Bioinformatics 2006;23:522-3. [PMID: 17158513 DOI: 10.1093/bioinformatics/btl628] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022] Abstract: This paper reviews the central concepts and implementation of data structures and methods for studying genetics of gene expression with the GGtools package of Bioconductor. Illustration with a HapMap+expression dataset is provided. AVAILABILITY: Package GGtools is part of Bioconductor 1.9 (http://bioconductor.org). Open source with Artistic License. MeSH Headings: Chromosome Mapping/methods; Computer Simulation; DNA Mutational Analysis/methods; Gene Expression Profiling/methods; Genetic Variation/genetics; Genetics, Population; Models, Genetic; Polymorphism, Single Nucleotide/genetics; Software; Software Design; User-Computer Interface. Grants: HG002708 NHGRI NIH HHS; HG003646 NHGRI NIH HHS.
44	Jiang Z, Gentleman R. Extensions to gene set enrichment. Bioinformatics 2006;23:306-13. [PMID: 17127676 DOI: 10.1093/bioinformatics/btl599] [Citation(s) in RCA: 172] [Impact Index Per Article: 9.6] [Indexed: 11/13/2022] Abstract: MOTIVATION: Gene Set Enrichment Analysis (GSEA) has been developed recently to capture changes in the expression of pre-defined sets of genes. We propose a number of extensions to GSEA, including the use of different statistics to describe the association between genes and phenotypes of interest. We make use of dimension reduction procedures, such as principal component analysis, to identify gene sets with correlated expression. We also address issues that arise when gene sets overlap. RESULTS: Our proposals extend the range of applicability of GSEA and allow for adjustments based on other covariates. We have provided a well-defined procedure to address interpretation issues that can arise when gene sets have substantial overlap. We have shown how standard dimension reduction methods, such as PCA, can be used to help further interpret GSEA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. MeSH Headings: Algorithms; Cell Physiological Phenomena; Gene Expression/physiology; Gene Expression Profiling/methods; Gene Expression Regulation/physiology; Oligonucleotide Array Sequence Analysis/methods; Proteome/metabolism.
45	Falcon S, Gentleman R. Using GOstats to test gene lists for GO term association. Bioinformatics 2006;23:257-8. [PMID: 17098774 DOI: 10.1093/bioinformatics/btl567] [Citation(s) in RCA: 1426] [Impact Index Per Article: 79.2] [Indexed: 12/19/2022] Abstract: MOTIVATION: Functional analyses based on the association of Gene Ontology (GO) terms to genes in a selected gene list are useful bioinformatic tools and the GOstats package has been widely used to perform such computations. In this paper we report significant improvements and extensions such as support for conditional testing. RESULTS: We discuss the capabilities of GOstats, a Bioconductor package written in R, that allows users to test GO terms for over or under-representation using either a classical hypergeometric test or a conditional hypergeometric that uses the relationships among GO terms to decorrelate the results. AVAILABILITY: GOstats is available as an R package from the Bioconductor project: http://bioconductor.org MeSH Headings: Algorithms; Data Interpretation, Statistical; Database Management Systems; Databases, Protein; Gene Expression Profiling/methods; Information Storage and Retrieval/methods; Natural Language Processing; Proteins/chemistry; Proteins/classification; Proteins/metabolism; Software; Terminology as Topic.
46	Shi Q, Harris LN, Lu X, Li X, Hwang J, Gentleman R, Iglehart JD, Miron A. Declining Plasma Fibrinogen Alpha Fragment Identifies HER2-Positive Breast Cancer Patients and Reverts to Normal Levels after Surgery. J Proteome Res 2006;5:2947-55. [PMID: 17081046 DOI: 10.1021/pr060099u] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022] Abstract: Breast cancer is the most common nonskin malignancy affecting women. Currently, no simple, blood-based diagnostic test exists to complement radiological screening and increase sensitivity of detection. To screen plasma specimens and identify biomarkers that detect HER2-positive breast cancer, automated robotic sample processing followed by surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF) mass spectroscopy was used. Multiple statistical algorithms were used to select biomarkers that segregate cancer patients versus controls and produced average CV rates ranging from 20% to 29%. A set of seven biomarkers were validated on an independent test data set and achieved the best error rate of 19.1%. A permutation test indicated a p-value for CV error less than 0.002. Moreover, a ROC curve using these biomarkers achieved an area-under-the-curve value of 0.95 on an independent test data set. The marker responsible for most of the resolving power was identified as a fragment of Fibrinogen Alpha (FGA) encompassing residues 605-629. This marker was present at lower levels in cancer patients as compared to controls. The importance of this biomarker was validated in a longitudinal study comparing pre- and post-operative levels and was shown to revert to normal levels after surgery. This fragment may serve as a useful diagnostic and treatment-monitoring marker. MeSH Headings: Amino Acid Sequence; Automation; Biomarkers, Tumor/blood; Breast Neoplasms/blood; Breast Neoplasms/genetics; Breast Neoplasms/surgery; False Positive Reactions; Female; Fibrinogen/chemistry; Humans; Longitudinal Studies; Molecular Sequence Data; Peptide Fragments/chemistry; Protein Array Analysis; Receptor, ErbB-2/blood; Receptor, ErbB-2/genetics; Reference Values; Reproducibility of Results; Robotics. Grants: 5 P50 CA089393 NCI NIH HHS.
47	Quackenbush J, Stoeckert C, Ball C, Brazma A, Gentleman R, Huber W, Irizarry R, Salit M, Sherlock G, Spellman P, Winegarden N. Top-down standards will not serve systems biology. Nature 2006;440:24. [PMID: 16511469 DOI: 10.1038/440024a] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/09/2022] MeSH Headings: Databases, Factual/standards; Databases, Factual/trends; Software/standards; Software/trends; Systems Biology/methods; Systems Biology/standards; Systems Biology/trends.
48	Gentleman R. Developing Statistical Software in FORTRAN 95. J Stat Softw 2006. [DOI: 10.18637/jss.v017.b02] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
49	Chiaretti S, Li X, Gentleman R, Vitale A, Wang KS, Mandelli F, Foà R, Ritz J. Gene Expression Profiles of B-lineage Adult Acute Lymphocytic Leukemia Reveal Genetic Patterns that Identify Lineage Derivation and Distinct Mechanisms of Transformation. Clin Cancer Res 2005;11:7209-19. [PMID: 16243790 DOI: 10.1158/1078-0432.ccr-04-2165] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.6] [Indexed: 11/16/2022] Abstract: PURPOSE: To characterize gene expression signatures in acute lymphocytic leukemia (ALL) cells associated with known genotypic abnormalities in adult patients. EXPERIMENTAL DESIGN: Gene expression profiles from 128 adult patients with newly diagnosed ALL were characterized using high-density oligonucleotide microarrays. All patients were enrolled in the Italian GIMEMA multicenter clinical trial 0496 and samples had >90% leukemic cells. Uniform phenotypic, cytogenetic, and molecular data were also available for all cases. RESULTS: T-lineage ALL was characterized by a homogeneous gene expression pattern, whereas several subgroups of B-lineage ALL were evident. Within B-lineage ALL, distinct signatures were associated with ALL1/AF4 and E2A/PBX1 gene rearrangements. Expression profiles associated with ALL1/AF4 and E2A/PBX1 are similar in adults and children. BCR/ABL+ gene expression pattern was more heterogeneous and was most similar to ALL without known molecular rearrangements. We also identified a set of 83 genes that were highly expressed in leukemia blasts from patients without known molecular abnormalities who subsequently relapsed following therapy. Supervised analysis of kinase genes revealed a high-level FLT3 expression in a subset of cases without molecular rearrangements. Two other kinases (PRKCB1 and DDR1) were highly expressed in cases without molecular rearrangements, as well as in BCR/ABL-positive ALL.
CONCLUSIONS: Genomic signatures are associated with phenotypically and molecularly well defined subgroups of adult ALL. Genomic profiling also identifies genes associated with poor outcome in cases without molecular aberrations and specific genes that may be new therapeutic targets in adult ALL. MeSH Headings: Adolescent; Adult; Burkitt Lymphoma/genetics; Burkitt Lymphoma/immunology; Burkitt Lymphoma/pathology; Cluster Analysis; Cytogenetic Analysis; Female; Flow Cytometry/methods; Gene Expression Profiling; Gene Expression Regulation, Leukemic/genetics; Humans; Immunophenotyping; Italy; Karyotyping; Male; Middle Aged; Oligonucleotide Array Sequence Analysis/methods; Oncogene Proteins, Fusion/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/immunology; Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology. Grants: CA66996 NCI NIH HHS.
50	Tadesse MG, Ibrahim JG, Gentleman R, Chiaretti S, Ritz J, Foa R. Bayesian error-in-variable survival model for the analysis of GeneChip arrays. Biometrics 2005;61:488-97. [PMID: 16011696 DOI: 10.1111/j.1541-0420.2005.00313.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023] Abstract: DNA microarrays in conjunction with statistical models may help gain a deeper understanding of the molecular basis for specific diseases. An intense area of research is concerned with the identification of genes related to particular phenotypes. The technology, however, is subject to various sources of error that may lead to expression readings that are substantially different from the true transcript levels. Few methods for microarray data analysis have accounted for measurement error in a substantial way and that is the purpose of this investigation. We describe a Bayesian error-in-variable model for the analysis of microarray data from a clinical study of patients with acute lymphoblastic leukemia. We focus in particular on the problem of identifying genes whose expression patterns are associated with duration of remission. This is a question of great practical interest since relapse is a major concern in the treatment of this disease. We explore the effects of ignoring the uncertainty in the expression estimates on the selection and ranking of genes.
MeSH Headings: Bayes Theorem; Biometry; Computational Biology/methods; Data Interpretation, Statistical; Gene Expression Profiling; Humans; Microarray Analysis; Models, Statistical; Oligonucleotide Array Sequence Analysis/methods; Phenotype; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Proportional Hazards Models; RNA, Messenger/metabolism; Regression Analysis; Remission Induction; Sequence Analysis, DNA; Software. Grants: CA66996 NCI NIH HHS; CA70101 NCI NIH HHS; CA74015 NCI NIH HHS; CA90301 NCI NIH HHS.