Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

231
(from Reference Citation Analysis)

Article PDFs (71)

Cited by > 0 (150)

Searched Name

Casey S. Greene

Number

Citation Analysis

101

Harrington LX, Way GP, Doherty JA, Greene CS. Functional network community detection can disaggregate and filter multiple underlying pathways in enrichment analyses. Pac Symp Biocomput 2018;23:157-167. [PMID: 29218878 PMCID: PMC5760988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

102

Allaway RJ, Fischer DA, de Abreu FB, Gardner TB, Gordon SR, Barth RJ, Colacchio TA, Wood M, Kacsoh BZ, Bouley SJ, Cui J, Hamilton J, Choi JA, Lange JT, Peterson JD, Padmanabhan V, Tomlinson CR, Tsongalis GJ, Suriawinata AA, Greene CS, Sanchez Y, Smith KD. Genomic characterization of patient-derived xenograft models established from fine needle aspirate biopsies of a primary pancreatic ductal adenocarcinoma and from patient-matched metastatic sites. Oncotarget 2017;7:17087-102. [PMID: 26934555 PMCID: PMC4941373 DOI: 10.18632/oncotarget.7718] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 07/03/2015] [Accepted: 01/13/2016] [Indexed: 12/12/2022] Open

Affiliation(s)

Robert J Allaway Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
Dawn A Fischer Department of Surgery, Division of Surgical Oncology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Francine B de Abreu Department of Pathology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Timothy B Gardner Department of Medicine, Section of Gastroenterology and Hepatology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Stuart R Gordon Department of Medicine, Section of Gastroenterology and Hepatology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Richard J Barth Department of Surgery, Division of Surgical Oncology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, NH 03756, USA
Thomas A Colacchio Department of Surgery, Division of Surgical Oncology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, NH 03756, USA
Matthew Wood Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA; Current location: Department of Pathology, University of California, San Francisco, CA 94143, USA
Balint Z Kacsoh Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH 03756, USA
Stephanie J Bouley Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
Jingxuan Cui Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH 03756, USA
Joanna Hamilton Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA; Department of Medicine, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Jungbin A Choi Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
Joshua T Lange Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
Jason D Peterson Department of Pathology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Vijayalakshmi Padmanabhan Department of Pathology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Craig R Tomlinson Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, NH 03756, USA; Department of Medicine, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Gregory J Tsongalis Department of Pathology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, NH 03756, USA
Arief A Suriawinata Department of Pathology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA
Casey S Greene Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, NH 03756, USA; Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH 03756, USA; Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH 03755, USA
Yolanda Sanchez Department of Pharmacology and Toxicology, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, NH 03756, USA
Kerrington D Smith Department of Surgery, Division of Surgical Oncology, Dartmouth-Hitchcock Medical Center, Lebanon, NH 03756, USA; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, NH 03756, USA

103

Tan J, Huyck M, Hu D, Zelaya RA, Hogan DA, Greene CS. ADAGE signature analysis: differential expression analysis with data-defined gene sets. BMC Bioinformatics 2017;18:512. [PMID: 29166858 PMCID: PMC5700673 DOI: 10.1186/s12859-017-1905-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/27/2017] [Accepted: 11/01/2017] [Indexed: 12/18/2022] Open

Abstract

BACKGROUND

Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologically relevant gene sets, which are currently curated manually, limiting their availability and accuracy in many organisms without extensively curated resources. New feature learning approaches can now be paired with existing data collections to directly extract functional gene sets from big data.

RESULTS

Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are directly learned from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server (http://adage.greenelab.com) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and ∆anr mutant cells were grown as biofilms on the Cystic Fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr.

CONCLUSIONS

We designed ADAGE signature analysis to perform gene set analysis using data-defined functional gene signatures. This approach addresses an important gap for biologists studying non-traditional model organisms and those without extensive curated resources available. We built both an R package and web server to provide ADAGE signature analysis to the community.

104

Doherty JA, Peres LC, Wang C, Way GP, Greene CS, Schildkraut JM. Challenges and Opportunities in Studying the Epidemiology of Ovarian Cancer Subtypes. CURR EPIDEMIOL REP 2017;4:211-220. [PMID: 29226065 PMCID: PMC5718213 DOI: 10.1007/s40471-017-0115-y] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Indexed: 12/11/2022]

105

Tan J, Doing G, Lewis KA, Price CE, Chen KM, Cady KC, Perchuk B, Laub MT, Hogan DA, Greene CS. Unsupervised Extraction of Stable Expression Signatures from Public Compendia with an Ensemble of Neural Networks. Cell Syst 2017;5:63-71.e6. [PMID: 28711280 PMCID: PMC5532071 DOI: 10.1016/j.cels.2017.06.003] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 12/02/2016] [Revised: 04/11/2017] [Accepted: 06/08/2017] [Indexed: 01/18/2023]

106

Greene CS, Garmire LX, Gilbert JA, Ritchie MD, Hunter LE. Celebrating parasites. Nat Genet 2017;49:483-484. [PMID: 28358134 PMCID: PMC5710834 DOI: 10.1038/ng.3830] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 01/02/2023]

107

Taroni JN, Greene CS, Martyanov V, Wood TA, Christmann RB, Farber HW, Lafyatis RA, Denton CP, Hinchcliff ME, Pioli PA, Mahoney JM, Whitfield ML. A novel multi-network approach reveals tissue-specific cellular modulators of fibrosis in systemic sclerosis. Genome Med 2017;9:27. [PMID: 28330499 PMCID: PMC5363043 DOI: 10.1186/s13073-017-0417-1] [Citation(s) in RCA: 67] [Impact Index Per Article: 9.6] [Received: 08/05/2016] [Accepted: 02/23/2017] [Indexed: 12/22/2022] Open

Abstract

Background

Systemic sclerosis (SSc) is a multi-organ autoimmune disease characterized by skin fibrosis. Internal organ involvement is heterogeneous. It is unknown whether disease mechanisms are common across all involved affected tissues or if each manifestation has a distinct underlying pathology.

Methods

We used consensus clustering to compare gene expression profiles of biopsies from four SSc-affected tissues (skin, lung, esophagus, and peripheral blood) from patients with SSc, and the related conditions pulmonary fibrosis (PF) and pulmonary arterial hypertension, and derived a consensus disease-associated signature across all tissues. We used this signature to query tissue-specific functional genomic networks. We performed novel network analyses to contrast the skin and lung microenvironments and to assess the functional role of the inflammatory and fibrotic genes in each organ. Lastly, we tested the expression of macrophage activation state-associated gene sets for enrichment in skin and lung using a Wilcoxon rank sum test.

Results

We identified a common pathogenic gene expression signature—an immune–fibrotic axis—indicative of pro-fibrotic macrophages (MØs) in multiple tissues (skin, lung, esophagus, and peripheral blood mononuclear cells) affected by SSc. While the co-expression of these genes is common to all tissues, the functional consequences of this upregulation differ by organ. We used this disease-associated signature to query tissue-specific functional genomic networks to identify common and tissue-specific pathologies of SSc and related conditions. In contrast to skin, in the lung-specific functional network we identify a distinct lung-resident MØ signature associated with lipid stimulation and alternative activation. In keeping with our network results, we find distinct MØ alternative activation transcriptional programs in SSc-associated PF lung and in the skin of patients with an “inflammatory” SSc gene expression signature.

Conclusions

Our results suggest that the innate immune system is central to SSc disease processes but that subtle distinctions exist between tissues. Our approach provides a framework for examining molecular signatures of disease in fibrosis and autoimmune diseases and for leveraging publicly available data to understand common and tissue-specific disease processes in complex human diseases.

Electronic supplementary material

The online version of this article (doi:10.1186/s13073-017-0417-1) contains supplementary material, which is available to authorized users.

108

Beaulieu-Jones BK, Greene CS. Reproducibility of computational workflows is automated using continuous analysis. Nat Biotechnol 2017. [PMID: 28288103 DOI: 10.1038/nbt.3780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

109

Greene CS. Tell me your neighbors, and I will tell you what you are. Sci Transl Med 2017;9(376):eaam6058. [DOI: 10.1126/scitranslmed.aam6058] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/02/2022]

110

Way GP, Allaway RJ, Bouley SJ, Fadul CE, Sanchez Y, Greene CS. A machine learning classifier trained on cancer transcriptomes detects NF1 inactivation signal in glioblastoma. BMC Genomics 2017;18:127. [PMID: 28166733 PMCID: PMC5292791 DOI: 10.1186/s12864-017-3519-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 09/17/2016] [Accepted: 01/26/2017] [Indexed: 12/14/2022] Open

Abstract

BACKGROUND

We have identified molecules that exhibit synthetic lethality in cells with loss of the neurofibromin 1 (NF1) tumor suppressor gene. However, recognizing tumors that have inactivation of the NF1 tumor suppressor function is challenging because the loss may occur via mechanisms that do not involve mutation of the genomic locus. Degradation of the NF1 protein, independent of NF1 mutation status, phenocopies inactivating mutations to drive tumors in human glioma cell lines. NF1 inactivation may alter the transcriptional landscape of a tumor and allow a machine learning classifier to detect which tumors will benefit from synthetic lethal molecules.

RESULTS

We developed a strategy to predict tumors with low NF1 activity and hence tumors that may respond to treatments that target cells lacking NF1. Using RNAseq data from The Cancer Genome Atlas (TCGA), we trained an ensemble of 500 logistic regression classifiers that integrates mutation status with whole transcriptomes to predict NF1 inactivation in glioblastoma (GBM). On TCGA data, the classifier detected NF1 mutated tumors (test set area under the receiver operating characteristic curve (AUROC) mean = 0.77, 95% quantile = 0.53 - 0.95) over 50 random initializations. On RNA-Seq data transformed into the space of gene expression microarrays, this method produced a classifier with similar performance (test set AUROC mean = 0.77, 95% quantile = 0.53 - 0.96). We applied our ensemble classifier trained on the transformed TCGA data to a microarray validation set of 12 samples with matched RNA and NF1 protein-level measurements. The classifier's NF1 score was associated with NF1 protein concentration in these samples.

CONCLUSIONS

We demonstrate that TCGA can be used to train accurate predictors of NF1 inactivation in GBM. The ensemble classifier performed well for samples with very high or very low NF1 protein concentrations but had mixed performance in samples with intermediate NF1 concentrations. Nevertheless, high-performing and validated predictors have the potential to be paired with targeted therapies and personalized medicine.

111

Greene CS, Himmelstein DS. Genetic Association-Guided Analysis of Gene Networks for the Study of Complex Traits. Circ Cardiovasc Genet 2017;9:179-84. [PMID: 27094199 DOI: 10.1161/circgenetics.115.001181] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/11/2015] [Accepted: 03/08/2016] [Indexed: 12/29/2022]

112

Moore JH, Jennings SF, Greene CS, Hunter LE, Perkins AD, Williams-Devane C, Wunsch DC, Zhao Z, Huang X. NO-BOUNDARY THINKING IN BIOINFORMATICS. Pac Symp Biocomput 2017;22:646-648. [PMID: 27897015 DOI: 10.1142/9789813207813_0060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]

113

Greene CS. Cheap-seq. Sci Transl Med 2016;8:370ec203. [PMID: 28003542 DOI: 10.1126/scitranslmed.aal3701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

114

Greene CS. How to know what we don’t. Sci Transl Med 2016. [DOI: 10.1126/scitranslmed.aal0067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

115

Beaulieu-Jones BK, Greene CS. Semi-supervised learning of the electronic health record for phenotype stratification. J Biomed Inform 2016. [PMID: 27744022 DOI: 10.1016/j.jbi.2016.10.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

116

Beaulieu-Jones BK, Greene CS. Semi-supervised learning of the electronic health record for phenotype stratification. J Biomed Inform 2016;64:168-178. [PMID: 27744022 DOI: 10.1016/j.jbi.2016.10.007] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Received: 02/18/2016] [Revised: 10/05/2016] [Accepted: 10/08/2016] [Indexed: 12/12/2022]

117

Greene CS. A stromal focus reveals tumor immune signatures. Sci Transl Med 2016. [DOI: 10.1126/scitranslmed.aai8224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

118

Krishnan A, Taroni JN, Greene CS. Integrative Networks Illuminate Biological Factors Underlying Gene–Disease Associations. Curr Genet Med Rep 2016. [DOI: 10.1007/s40142-016-0102-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022]

119

Jiang Y, Oron TR, Clark WT, Bankapur AR, D'Andrea D, Lepore R, Funk CS, Kahanda I, Verspoor KM, Ben-Hur A, Koo DCE, Penfold-Brown D, Shasha D, Youngs N, Bonneau R, Lin A, Sahraeian SME, Martelli PL, Profiti G, Casadio R, Cao R, Zhong Z, Cheng J, Altenhoff A, Skunca N, Dessimoz C, Dogan T, Hakala K, Kaewphan S, Mehryary F, Salakoski T, Ginter F, Fang H, Smithers B, Oates M, Gough J, Törönen P, Koskinen P, Holm L, Chen CT, Hsu WL, Bryson K, Cozzetto D, Minneci F, Jones DT, Chapman S, Bkc D, Khan IK, Kihara D, Ofer D, Rappoport N, Stern A, Cibrian-Uhalte E, Denny P, Foulger RE, Hieta R, Legge D, Lovering RC, Magrane M, Melidoni AN, Mutowo-Meullenet P, Pichler K, Shypitsyna A, Li B, Zakeri P, ElShal S, Tranchevent LC, Das S, Dawson NL, Lee D, Lees JG, Sillitoe I, Bhat P, Nepusz T, Romero AE, Sasidharan R, Yang H, Paccanaro A, Gillis J, Sedeño-Cortés AE, Pavlidis P, Feng S, Cejuela JM, Goldberg T, Hamp T, Richter L, Salamov A, Gabaldon T, Marcet-Houben M, Supek F, Gong Q, Ning W, Zhou Y, Tian W, Falda M, Fontana P, Lavezzo E, Toppo S, Ferrari C, Giollo M, Piovesan D, Tosatto SCE, Del Pozo A, Fernández JM, Maietta P, Valencia A, Tress ML, Benso A, Di Carlo S, Politano G, Savino A, Rehman HU, Re M, Mesiti M, Valentini G, Bargsten JW, van Dijk ADJ, Gemovic B, Glisic S, Perovic V, Veljkovic V, Veljkovic N, Almeida-E-Silva DC, Vencio RZN, Sharan M, Vogel J, Kansakar L, Zhang S, Vucetic S, Wang Z, Sternberg MJE, Wass MN, Huntley RP, Martin MJ, O'Donovan C, Robinson PN, Moreau Y, Tramontano A, Babbitt PC, Brenner SE, Linial M, Orengo CA, Rost B, Greene CS, Mooney SD, Friedberg I, Radivojac P. An expanded evaluation of protein function prediction methods shows an improvement in accuracy. Genome Biol 2016;17:184. [PMID: 27604469 PMCID: PMC5015320 DOI: 10.1186/s13059-016-1037-6] [Citation(s) in RCA: 252] [Impact Index Per Article: 31.5] [Received: 10/26/2015] [Accepted: 08/04/2016] [Indexed: 12/02/2022] Open

Affiliation(s)

Yuxiang Jiang Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
Tal Ronnen Oron Buck Institute for Research on Aging, Novato, CA, USA
Wyatt T Clark Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
Asma R Bankapur Department of Microbiology, Miami University, Oxford, OH, USA
Daniel D'Andrea University of Rome, La Sapienza, Rome, Italy
Rosalba Lepore University of Rome, La Sapienza, Rome, Italy
Christopher S Funk Computational Bioscience Program, University of Colorado School of Medicine, Aurora, CO, USA
Indika Kahanda Department of Computer Science, Colorado State University, Fort Collins, CO, USA
Karin M Verspoor Department of Computing and Information Systems, University of Melbourne, Parkville, Victoria, Australia; Health and Biomedical Informatics Centre, University of Melbourne, Parkville, Victoria, Australia
Asa Ben-Hur Department of Computer Science, Colorado State University, Fort Collins, CO, USA
Da Chen Emily Koo Department of Biology, New York University, New York, NY, USA
Duncan Penfold-Brown Social Media and Political Participation Lab, New York University, New York, NY, USA; CY Data Science, New York, NY, USA
Dennis Shasha Department of Computer Science, New York University, New York, NY, USA
Noah Youngs CY Data Science, New York, NY, USA; Department of Computer Science, New York University, New York, NY, USA; Simons Center for Data Analysis, New York, NY, USA
Richard Bonneau Department of Computer Science, New York University, New York, NY, USA; Simons Center for Data Analysis, New York, NY, USA; Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
Alexandra Lin Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA, USA
Sayed M E Sahraeian Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
Pier Luigi Martelli Biocomputing Group, BiGeA, University of Bologna, Bologna, Italy
Giuseppe Profiti Biocomputing Group, BiGeA, University of Bologna, Bologna, Italy
Rita Casadio Biocomputing Group, BiGeA, University of Bologna, Bologna, Italy
Renzhi Cao Computer Science Department, University of Missouri, Columbia, MO, USA
Zhaolong Zhong Computer Science Department, University of Missouri, Columbia, MO, USA
Jianlin Cheng Computer Science Department, University of Missouri, Columbia, MO, USA
Adrian Altenhoff ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Zurich, Switzerland
Nives Skunca ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Zurich, Switzerland
Christophe Dessimoz Bioinformatics Group, Department of Computer Science, University College London, London, UK; University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
Tunca Dogan European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Kai Hakala Department of Information Technology, University of Turku, Turku, Finland; University of Turku Graduate School, University of Turku, Turku, Finland
Suwisa Kaewphan Department of Information Technology, University of Turku, Turku, Finland; University of Turku Graduate School, University of Turku, Turku, Finland; Turku Centre for Computer Science, Turku, Finland
Farrokh Mehryary Department of Information Technology, University of Turku, Turku, Finland; University of Turku Graduate School, University of Turku, Turku, Finland
Tapio Salakoski Department of Information Technology, University of Turku, Turku, Finland; Turku Centre for Computer Science, Turku, Finland
Filip Ginter Department of Information Technology, University of Turku, Turku, Finland
Hai Fang University of Bristol, Bristol, UK
Ben Smithers University of Bristol, Bristol, UK
Matt Oates University of Bristol, Bristol, UK
Julian Gough University of Bristol, Bristol, UK
Petri Törönen Institute of Biotechnology, University of Helsinki, Helsinki, Finland
Patrik Koskinen Institute of Biotechnology, University of Helsinki, Helsinki, Finland
Liisa Holm Institute of Biotechnology, University of Helsinki, Helsinki, Finland; Department of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
Ching-Tai Chen Institute of Information Science, Academia Sinica, Taipei, Taiwan
Wen-Lian Hsu Institute of Information Science, Academia Sinica, Taipei, Taiwan
Kevin Bryson Bioinformatics Group, Department of Computer Science, University College London, London, UK
Domenico Cozzetto Bioinformatics Group, Department of Computer Science, University College London, London, UK
Federico Minneci Bioinformatics Group, Department of Computer Science, University College London, London, UK
David T Jones Bioinformatics Group, Department of Computer Science, University College London, London, UK
Samuel Chapman Department of Computational Science and Engineering, North Carolina A&T State University, Greensboro, NC, USA
Dukka Bkc Department of Computational Science and Engineering, North Carolina A&T State University, Greensboro, NC, USA
Ishita K Khan Department of Computer Science, Purdue University, West Lafayette, IN, USA
Daisuke Kihara Department of Computer Science, Purdue University, West Lafayette, IN, USA; Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
Dan Ofer Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
Nadav Rappoport Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel; School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
Amos Stern Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel; School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
Elena Cibrian-Uhalte European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Paul Denny Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
Rebecca E Foulger Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
Reija Hieta European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Duncan Legge European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Ruth C Lovering Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
Michele Magrane European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Anna N Melidoni Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
Prudence Mutowo-Meullenet European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Klemens Pichler European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Aleksandra Shypitsyna European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Biao Li Buck Institute for Research on Aging, Novato, CA, USA
Pooya Zakeri Department of Electrical Engineering, STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium; iMinds Department Medical Information Technologies, Leuven, Belgium
Sarah ElShal Department of Electrical Engineering, STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium; iMinds Department Medical Information Technologies, Leuven, Belgium
Léon-Charles Tranchevent Inserm UMR-S1052, CNRS UMR5286, Cancer Research Centre of Lyon, Lyon, France; Université de Lyon 1, Villeurbanne, France; Centre Léon Bérard, Lyon, France
Sayoni Das Institute of Structural and Molecular Biology, University College London, London, UK
Natalie L Dawson Institute of Structural and Molecular Biology, University College London, London, UK
David Lee Institute of Structural and Molecular Biology, University College London, London, UK
Jonathan G Lees Institute of Structural and Molecular Biology, University College London, London, UK
Ian Sillitoe Institute of Structural and Molecular Biology, University College London, London, UK
Prajwal Bhat Cerenode Inc., Boston, MA, USA
Tamás Nepusz Molde University College, Molde, Norway
Alfonso E Romero Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway University of London, Egham, UK
Rajkumar Sasidharan Department of Molecular, Cell and Developmental Biology, University of California at Los Angeles, Los Angeles, CA, USA
Haixuan Yang School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway, Ireland
Alberto Paccanaro Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway University of London, Egham, UK
Jesse Gillis Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, New York, NY, USA
Adriana E Sedeño-Cortés Graduate Program in Bioinformatics, University of British Columbia, Vancouver, Canada
Paul Pavlidis Department of Psychiatry and Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
Shou Feng Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
Juan M Cejuela Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
Tatyana Goldberg Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
Tobias Hamp Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
Lothar Richter Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
Asaf Salamov DOE Joint Genome Institute, Walnut Creek, CA, USA
Toni Gabaldon Bioinformatics and Genomics, Centre for Genomic Regulation, Barcelona, Spain Universitat Pompeu Fabra, Barcelona, Spain Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
Marina Marcet-Houben Bioinformatics and Genomics, Centre for Genomic Regulation, Barcelona, Spain Universitat Pompeu Fabra, Barcelona, Spain
Fran Supek Universitat Pompeu Fabra, Barcelona, Spain Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
Qingtian Gong State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China Children's Hospital of Fudan University, Shanghai, China
Wei Ning State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China Children's Hospital of Fudan University, Shanghai, China
Yuanpeng Zhou State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China Children's Hospital of Fudan University, Shanghai, China
Weidong Tian State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China Children's Hospital of Fudan University, Shanghai, China
Marco Falda Department of Molecular Medicine, University of Padua, Padua, Italy
Paolo Fontana Research and Innovation Center, Edmund Mach Foundation, San Michele all'Adige, Italy
Enrico Lavezzo Department of Molecular Medicine, University of Padua, Padua, Italy
Stefano Toppo Department of Molecular Medicine, University of Padua, Padua, Italy
Carlo Ferrari Department of Information Engineering, University of Padua, Padova, Italy
Manuel Giollo Department of Information Engineering, University of Padua, Padova, Italy Department of Biomedical Sciences, University of Padua, Padova, Italy
Damiano Piovesan Department of Information Engineering, University of Padua, Padova, Italy
Silvio C E Tosatto Department of Information Engineering, University of Padua, Padova, Italy
Angela Del Pozo Instituto De Genetica Medica y Molecular, Hospital Universitario de La Paz, Madrid, Spain
José M Fernández Spanish National Bioinformatics Institute, Spanish National Cancer Research Institute, Madrid, Spain
Paolo Maietta Structural and Computational Biology Programme, Spanish National Cancer Research Institute, Madrid, Spain
Alfonso Valencia Structural and Computational Biology Programme, Spanish National Cancer Research Institute, Madrid, Spain
Michael L Tress Structural and Computational Biology Programme, Spanish National Cancer Research Institute, Madrid, Spain
Alfredo Benso Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
Stefano Di Carlo Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
Gianfranco Politano Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
Alessandro Savino Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
Hafeez Ur Rehman National University of Computer & Emerging Sciences, Islamabad, Pakistan
Matteo Re Anacleto Lab, Dipartimento di informatica, Università degli Studi di Milano, Milan, Italy
Marco Mesiti Anacleto Lab, Dipartimento di informatica, Università degli Studi di Milano, Milan, Italy
Giorgio Valentini Anacleto Lab, Dipartimento di informatica, Università degli Studi di Milano, Milan, Italy
Joachim W Bargsten Applied Bioinformatics, Bioscience, Wageningen University and Research Centre, Wageningen, Netherlands
Aalt D J van Dijk Applied Bioinformatics, Bioscience, Wageningen University and Research Centre, Wageningen, Netherlands Biometris, Wageningen University, Wageningen, Netherlands
Branislava Gemovic Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
Sanja Glisic Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
Vladimir Perovic Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
Veljko Veljkovic Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
Nevena Veljkovic Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
Danillo C Almeida-E-Silva Department of Computing and Mathematics FFCLRP-USP, University of Sao Paulo, Ribeirao Preto, Brazil
Ricardo Z N Vencio Department of Computing and Mathematics FFCLRP-USP, University of Sao Paulo, Ribeirao Preto, Brazil
Malvika Sharan Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
Jörg Vogel Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
Lakesh Kansakar Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
Shanshan Zhang Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
Slobodan Vucetic Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
Zheng Wang University of Southern Mississippi, Hattiesburg, MS, USA
Michael J E Sternberg Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, UK
Mark N Wass School of Biosciences, University of Kent, Canterbury, Kent, UK
Rachael P Huntley European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Maria J Martin European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Claire O'Donovan European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
Peter N Robinson Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin, Berlin, Germany
Yves Moreau Department of Electrical Engineering ESAT-SCD and IBBT-KU Leuven Future Health Department, Katholieke Universiteit Leuven, Leuven, Belgium
Anna Tramontano University of Rome, La Sapienza, Rome, Italy
Patricia C Babbitt California Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, CA, USA
Steven E Brenner Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
Michal Linial Department of Chemical Biology, The Hebrew University of Jerusalem, Jerusalem, Israel
Christine A Orengo Institute of Structural and Molecular Biology, University College London, London, UK
Burkhard Rost Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
Casey S Greene Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
Sean D Mooney Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
Iddo Friedberg Department of Microbiology, Miami University, Oxford, OH, USA. Department of Computer Science, Miami University, Oxford, OH, USA.
Predrag Radivojac Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA.

120

Greene CS. Gut check. Sci Transl Med 2016. [DOI: 10.1126/scitranslmed.aah5494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

121

Doherty JA, Greene CS, Rudd JE, Tafe LJ, Alberg AJ, Bandera EV, Barnholtz-Sloan J, Bondy M, Cote ML, Funkhouser E, Moorman PG, Peters ES, Schwartz AG, Terry P, Bentley R, Berchuck A, Marks JR, Schildkraut JM. Abstract 3407: Gene expression subtypes of high grade serous ovarian cancer in African American women. Cancer Res 2016. [DOI: 10.1158/1538-7445.am2016-3407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

Abstract

Ovarian cancer accounts for 5% of cancer deaths and is the fifth leading cause of cancer death in women in the United States. While incidence is higher in European American (EA) than African American (AA) women, five-year survival is worse for AA women (36%) than EA women (44%). Access to appropriate surgery and treatment is a major contributor but does not completely explain this disparity. The Cancer Genome Atlas (TCGA) identified four gene expression-based subtypes of the most common and lethal histotype, high grade serous carcinoma (HGSC): mesenchymal, proliferative, differentiated, and immunoreactive. We sought to characterize similarities and differences in gene expression-based subtypes arising in AA and EA women to determine whether there are underlying biologic features that may influence survival. We performed two distinct analyses, first using TCGA data and second using cases from the population-based African American Cancer Epidemiology Study (AACES). For both we summarized differential expression patterns for each subtype with moderated t statistic vectors for >10,000 genes using Significance Analysis of Microarrays. We calculated Pearson's correlations of these vectors to determine concordance of expression patterns between subtypes across EA and AA women. In TCGA, we observed correlations of subtype-specific expression patterns between the 24 AA and 475 EA tumors of 0.52-0.60 for each of the four subtypes. Thus, while analogous subtypes can be identified in AA and EA women, the magnitude of these correlations suggests that there are potential differences in gene expression patterns between AA and EA tumors that are assigned to the same subtype. We generated additional data from 58 AACES HGSC cases using the Affymetrix Human Transcriptome Array 2.0. Instead of assigning these tumors to previously defined subtypes, we clustered samples to identify four subtypes de novo. We observed concordance with two of the TCGA subtypes; correlations for the mesenchymal-like and proliferative-like subtypes were 0.56-0.65. The mesenchymal-like subtype was more common in these AA women than in the TCGA EA women (33% versus 25%), and the proliferative-like subtype was marginally less common (14% versus 19%). Concordance for the differentiated-like subtype was considerably lower, at 0.21, and this subtype was less common in AA than EA women (19% versus 34%). Another subtype comprising 34% of the AA samples was only weakly correlated (-0.21 to 0.10) with any of the TCGA subtypes, suggesting that it is a novel subtype. The limited data available on HGSC in AA women suggest that at least two subtypes are comparable to those in EA women but differ in prevalence, and that there may be a novel subtype in AA women that does not strongly correspond to those described in EA women.

Citation Format: Jennifer A. Doherty, Casey S. Greene, James E. Rudd, Laura J. Tafe, Anthony J. Alberg, Elisa V. Bandera, Jill Barnholtz-Sloan, Melissa Bondy, Michele L. Cote, Ellen Funkhouser, Patricia G. Moorman, Edward S. Peters, Ann G. Schwartz, Paul Terry, Rex Bentley, Andrew Berchuck, Jeffrey R. Marks, Joellen M. Schildkraut. Gene expression subtypes of high grade serous ovarian cancer in African American women. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 3407.
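
The concordance measure described in this abstract (Pearson's correlation between per-subtype vectors of gene-level statistics from two populations) can be illustrated with a minimal sketch. The function names and the five-gene toy vectors below are assumptions made up for illustration; they are not data or code from the study, and the computation of the moderated t statistics themselves is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 entries")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def subtype_concordance(stats_a, stats_b):
    """Correlate every subtype vector in one population with every one in the other.

    Each argument maps a subtype name to a list of per-gene statistics,
    with genes listed in the same order in both populations.
    """
    return {
        (name_a, name_b): pearson(vec_a, vec_b)
        for name_a, vec_a in stats_a.items()
        for name_b, vec_b in stats_b.items()
    }


# Hypothetical statistics for five genes (illustrative values only).
ea_stats = {
    "mesenchymal": [2.1, -0.5, 1.8, -1.2, 0.3],
    "proliferative": [-1.9, 0.7, -1.5, 1.1, 0.2],
}
aa_stats = {"cluster_1": [1.7, -0.1, 2.0, -0.9, -0.4]}

concordance = subtype_concordance(ea_stats, aa_stats)
best_match = max(concordance, key=concordance.get)  # pair with the highest correlation
```

In this toy case the de novo cluster is matched to the subtype whose statistic vector it correlates with most strongly; a cluster with only weak correlations to every reference subtype would be a candidate for a novel subtype, as the abstract suggests.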

122

Rudd J, Shea EK, Way GP, Greene CS, Doherty JA. Abstract 815: Patterns of metagene activation in ovarian cancer subtypes. Cancer Res 2016. [DOI: 10.1158/1538-7445.am2016-815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

Abstract

High grade serous ovarian cancer (HGSC) is a complex and aggressive disease. Recently, three or four gene expression-based subtypes, which may be differentially associated with survival, have been reported in several populations. To identify the biological functions that define the subtypes, we determined the extent to which metagenes (linear combinations of gene expression vectors) were differentially activated across subtypes, could be reliably identified across populations, and showed consistent associations with survival. We previously clustered HGSC samples using gene expression data from TCGA, Tothill (GSE9891), Yoshihara (GSE32062), and Mayo (GSE74357) to identify subtypes across populations. We found subtype-specific genes within each population through differential expression analysis (p < 4.6 × 10^-6). Using the intersection of differentially expressed genes for parallel subtypes across these four populations, we applied non-negative matrix factorization to identify metagenes. To determine whether the metagenes were consistently observed across populations, we performed leave-one-dataset-out cross validation. For each metagene, we performed gene set enrichment analysis against the National Cancer Institute pathway interaction database to annotate metagene pathways. We examined whether increasing tertiles of metagene activity, which we termed low, medium, and high activity, were associated with survival using a random effects meta-analysis of Cox regression estimates adjusting for age at diagnosis, tumor stage, tumor grade, and debulking status. Five metagenes were consistently identified and significantly associated with HGSC subtypes (p < 0.0001). Of these, a metagene weakly enriched for the CMYB pathway was associated with subtype 1; three metagenes (one significantly enriched with the IL12 pathway and the others weakly enriched with the FCER1 and CXCR4 pathways) were associated with subtype 2; and a metagene weakly enriched with the AVB3 Integrin pathway distinguished between all 3 subtypes. Neither the CMYB metagene nor the IL12 metagene was significantly associated with survival. High activity of the CXCR4 and AVB3 metagenes was associated with poorer survival (hazard ratios (HR) and 95% confidence intervals (CI) are, respectively: 1.21, 0.99-1.48 and 1.34, 1.09-1.64). In contrast, high activity of the FCER1 metagene was associated with improved survival (HR 0.76, 95% CI 0.62-0.93). Metagenes that are consistently and statistically significantly associated with subtype may be indicative of functional differences between HGSC subtypes. The contrast in hazard estimates for metagenes associated with subtype 2 may indicate that the metagenes capture survival signal distinct from the subtype association. Future work associating metagene activity with subtype uncertainty may better enable the refinement of subtype definitions and the development of subtype specific treatment strategies.

Citation Format: James Rudd, Emily K. Shea, Gregory P. Way, Casey S. Greene, Jennifer A. Doherty. Patterns of metagene activation in ovarian cancer subtypes. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 815.
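
The abstract groups samples into low, medium, and high tertiles of metagene activity before survival modeling. A minimal sketch of one way to assign such labels is shown below. The abstract does not state how the tertile cut points were chosen or whether they were set per dataset, so the rank-based rule, the function name, and the toy activity values are all assumptions.

```python
def tertile_labels(activity):
    """Label each value "low", "medium", or "high" by its rank.

    Samples are split into three groups that are as equal in size as
    possible; tied values are ordered by their original position
    (an assumption, since the abstract gives no tie rule).
    """
    n = len(activity)
    order = sorted(range(n), key=lambda i: activity[i])  # indices from lowest to highest
    names = ("low", "medium", "high")
    labels = [None] * n
    for rank, index in enumerate(order):
        labels[index] = names[rank * 3 // n]  # rank < n, so the group index is 0, 1, or 2
    return labels


# Hypothetical metagene activity scores for six samples.
scores = [0.9, 0.1, 0.5, 0.3, 0.7, 1.1]
groups = tertile_labels(scores)
```

The resulting labels could then enter a Cox regression as a categorical covariate alongside the adjustment variables the abstract lists; that modeling step is not sketched here.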

123

Greene CS. The future is unsupervised. Sci Transl Med 2016. [DOI: 10.1126/scitranslmed.aag3101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

124

Greene CS, Voight BF. Pathway and network-based strategies to translate genetic discoveries into effective therapies. Hum Mol Genet 2016;25:R94-R98. [PMID: 27340225 DOI: 10.1093/hmg/ddw160] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 04/27/2016] [Accepted: 05/19/2016] [Indexed: 11/13/2022] Open

125

Greene CS. Nothing but a hound dog. Sci Transl Med 2016. [DOI: 10.1126/scitranslmed.aaf9196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

126

Greene CS. CoINcIDE: All together now. Sci Transl Med 2016. [DOI: 10.1126/scitranslmed.aaf6940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

127

Himmelstein DS, Greene CS, Moore JH. Erratum to: Evolving hard problems: generating human genetics datasets with a complex etiology. BioData Min 2016;9:9. [PMID: 26848312 PMCID: PMC4740998 DOI: 10.1186/s13040-016-0085-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2016] [Accepted: 01/18/2016] [Indexed: 12/02/2022] Open

128

Thompson JA, Tan J, Greene CS. Cross-platform normalization of microarray and RNA-seq data for machine learning applications. PeerJ 2016;4:e1621. [PMID: 26844019 PMCID: PMC4736986 DOI: 10.7717/peerj.1621] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 10/30/2015] [Accepted: 01/02/2016] [Indexed: 01/08/2023] Open

129

Song A, Yan J, Kim S, Risacher SL, Wong AK, Saykin AJ, Shen L, Greene CS. Network-based analysis of genetic variants associated with hippocampal volume in Alzheimer's disease: a study of ADNI cohorts. BioData Min 2016;9:3. [PMID: 26788126 PMCID: PMC4717572 DOI: 10.1186/s13040-016-0082-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 07/07/2015] [Accepted: 01/14/2016] [Indexed: 12/25/2022] Open

Abstract

Background

Alzheimer’s disease (AD) is a neurodegenerative disease that causes dementia. While the molecular basis of AD is not fully understood, genetic factors are expected to participate in the development and progression of the disease. Our goal was to uncover novel genetic underpinnings of Alzheimer’s disease with a bioinformatics approach that accounts for tissue specificity.

Findings

We performed genome-wide association studies (GWAS) for hippocampal volume in two Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohorts. We used these GWAS in a subsequent tissue-specific network-wide association study (NetWAS), which applied nominally significant associations in the initial GWAS to identify disease-relevant patterns in a functional network for the hippocampus. We compared prioritized gene lists from NetWAS and GWAS with literature-curated AD-associated genes from the Online Mendelian Inheritance in Man (OMIM) database. In the ADNI-1 GWAS, where we also observed an enrichment of low p-values, NetWAS prioritized disease-gene associations in accordance with OMIM annotations. This was not observed in the ADNI-2 dataset. We provide source code to replicate these analyses as well as complete results under permissive licenses.

Conclusions

We performed the first analysis of hippocampal volume using NetWAS, which uses machine learning algorithms applied to a tissue-specific functional interaction network to prioritize GWAS results. Our findings support the idea that tissue-specific networks may provide helpful context for understanding the etiology of common human diseases and reveal challenges that network-based approaches encounter in some datasets. Our source code and intermediate results files can facilitate the development of methods to address these challenges.

Electronic supplementary material

The online version of this article (doi:10.1186/s13040-016-0082-8) contains supplementary material, which is available to authorized users.

Affiliation(s)

Ailin Song Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire USA ; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, New Hampshire USA ; Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, New Hampshire USA
Jingwen Yan Center for Neuroimaging, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, Indiana USA ; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana USA ; School of Informatics and Computing, Indiana University Indianapolis, Indianapolis, Indiana USA
Sungeun Kim Center for Neuroimaging, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, Indiana USA ; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana USA ; Indiana Alzheimer Disease Center, Indiana University School of Medicine, Indianapolis, Indiana USA
Shannon Leigh Risacher Center for Neuroimaging, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, Indiana USA ; Indiana Alzheimer Disease Center, Indiana University School of Medicine, Indianapolis, Indiana USA
Aaron K Wong Simons Center for Data Analysis, Simons Foundation, New York, NY USA
Andrew J Saykin Center for Neuroimaging, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, Indiana USA ; Indiana Alzheimer Disease Center, Indiana University School of Medicine, Indianapolis, Indiana USA
Li Shen Center for Neuroimaging, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, Indiana USA ; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana USA ; School of Informatics and Computing, Indiana University Indianapolis, Indianapolis, Indiana USA ; Indiana Alzheimer Disease Center, Indiana University School of Medicine, Indianapolis, Indiana USA
Casey S Greene Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire USA ; Dartmouth-Hitchcock Norris Cotton Cancer Center, Lebanon, New Hampshire USA ; Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, New Hampshire USA ; Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania USA

130

Greene CS, Foster JA, Stanton BA, Hogan DA, Bromberg Y. COMPUTATIONAL APPROACHES TO STUDY MICROBES AND MICROBIOMES. Pac Symp Biocomput 2016;21:557-567. [PMID: 26776218 PMCID: PMC4832978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]

131

Qian DC, Byun J, Han Y, Greene CS, Field JK, Hung RJ, Brhane Y, Mclaughlin JR, Fehringer G, Landi MT, Rosenberger A, Bickeböller H, Malhotra J, Risch A, Heinrich J, Hunter DJ, Henderson BE, Haiman CA, Schumacher FR, Eeles RA, Easton DF, Seminara D, Amos CI. Identification of shared and unique susceptibility pathways among cancers of the lung, breast, and prostate from genome-wide association studies and tissue-specific protein interactions. Hum Mol Genet 2015;24:7406-20. [PMID: 26483192 PMCID: PMC4664175 DOI: 10.1093/hmg/ddv440] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 07/15/2015] [Revised: 09/11/2015] [Accepted: 10/12/2015] [Indexed: 12/18/2022] Open

Affiliation(s)

David C Qian Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA
Jinyoung Byun Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA
Younghun Han Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA
Casey S Greene Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
John K Field Department of Molecular and Clinical Cancer Medicine, University of Liverpool Cancer Research Centre, Liverpool L69 3GA, UK
Rayjean J Hung Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada
Yonathan Brhane Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada
John R Mclaughlin Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
Gordon Fehringer Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada
Maria Teresa Landi National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
Albert Rosenberger Department of Genetic Epidemiology, University Medical Centre Göttingen, 37099 Göttingen, Germany
Heike Bickeböller Department of Genetic Epidemiology, University Medical Centre Göttingen, 37099 Göttingen, Germany
Jyoti Malhotra Division of Hematology and Oncology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Angela Risch Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center, 69120 Heidelberg, Germany
Joachim Heinrich Institute of Epidemiology I, German Research Center for Environmental Health, 85764 Neuherberg, Germany
David J Hunter Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
Brian E Henderson Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
Christopher A Haiman Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
Fredrick R Schumacher Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
Rosalind A Eeles Department of Cancer Genetics, Institute of Cancer Research, London SW7 3RP, UK
Douglas F Easton Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK
Daniela Seminara National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
Christopher I Amos Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA

132

Rudd J, Zelaya RA, Demidenko E, Goode EL, Greene CS, Doherty JA. Leveraging global gene expression patterns to predict expression of unmeasured genes. BMC Genomics 2015;16:1065. [PMID: 26666289 PMCID: PMC4678722 DOI: 10.1186/s12864-015-2250-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/18/2015] [Accepted: 11/27/2015] [Indexed: 12/31/2022] Open

Abstract

Background

Large collections of paraffin-embedded tissue represent a rich resource to test hypotheses based on gene expression patterns; however, measurement of genome-wide expression is cost-prohibitive on a large scale. Using the known expression correlation structure within a given disease type (in this case, high grade serous ovarian cancer; HGSC), we sought to identify reduced sets of directly measured (DM) genes which could accurately predict the expression of a maximized number of unmeasured genes.

Results

We developed a greedy gene set selection (GGS) algorithm which returns a DM set of user specified size based on a specific correlation threshold (|r_P|) and minimum number of DM genes that must be correlated to an unmeasured gene in order to infer the value of the unmeasured gene (redundancy). We evaluated GGS in the Cancer Genome Atlas (TCGA) HGSC data across 144 combinations of DM size, redundancy (1–3), and |r_P| (0.60, 0.65, 0.70). Across the parameter sweep, GGS allows on average 9 times more gene expression information to be captured compared to the DM set alone. GGS successfully augments prognostic HGSC gene sets; the addition of 20 GGS selected genes more than doubles the number of genes whose expression is predictable. Moreover, the expression prediction is highly accurate. After training regression models for the predictable gene set using 2/3 of the TCGA data, the average accuracy (ranked correlation of true and predicted values) in the 1/3 testing partition and four independent populations is above 0.65 and approaches 0.8 for conservative parameter sets. We observe similar accuracies in the TCGA HGSC RNA-sequencing data. Specifically, the prediction accuracy increases with increasing redundancy and increasing |r_P|.

Conclusions

GGS-selected genes, which maximize expression information about unmeasured genes, can be combined with candidate gene sets as a cost-effective way to increase the amount of gene expression information obtained in large studies. This method can be applied to any organism, model system, disease, or tissue type for which whole genome gene expression data exist.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-2250-5) contains supplementary material, which is available to authorized users.
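
The greedy selection idea summarized in this abstract can be sketched as follows, using the three parameters it names: the size of the directly measured (DM) set, the correlation threshold |r_P|, and the redundancy. This is an illustrative sketch only. The gain function, the tie-breaking rule, the function name, and the toy correlation matrix are assumptions, and the published GGS algorithm may differ in all of them.

```python
def greedy_gene_selection(corr, n_select, threshold=0.6, redundancy=1):
    """Greedily choose directly measured (DM) genes from a correlation matrix.

    corr: square list of lists of pairwise correlations between genes.
    Returns (dm, predictable): the chosen gene indices in selection order,
    and the set of unmeasured genes that have at least `redundancy`
    DM genes correlated with them at |r| >= threshold.
    """
    n = len(corr)
    # Neighbors of g: other genes whose absolute correlation with g meets the threshold.
    neighbors = [
        {h for h in range(n) if h != g and abs(corr[g][h]) >= threshold}
        for g in range(n)
    ]
    dm = []
    support = [0] * n  # number of selected DM genes correlated with each gene

    def gain(candidate):
        # Assumed scoring rule: count unmeasured neighbors that still need
        # more correlated DM genes to reach the redundancy target.
        return sum(
            1 for h in neighbors[candidate]
            if h not in dm and support[h] < redundancy
        )

    while len(dm) < min(n_select, n):
        candidates = [g for g in range(n) if g not in dm]
        best = max(candidates, key=gain)  # ties resolve to the lowest index
        dm.append(best)
        for h in neighbors[best]:
            support[h] += 1

    predictable = {g for g in range(n) if g not in dm and support[g] >= redundancy}
    return dm, predictable


# Hypothetical correlations among five genes (illustrative values only).
toy_corr = [
    [1.0, 0.8, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1, 0.1],
    [0.8, 0.1, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.1, 0.7, 1.0],
]
dm, predictable = greedy_gene_selection(toy_corr, n_select=2, threshold=0.6, redundancy=1)
```

In the toy matrix, measuring genes 0 and 3 leaves genes 1, 2, and 4 each correlated with one measured gene, so three unmeasured genes become predictable from two measured ones. Raising `redundancy` makes the criterion stricter, which is consistent with the abstract's observation that prediction accuracy increases with redundancy.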

133

Gonzalez GH, Tahsin T, Goodale BC, Greene AC, Greene CS. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery. Brief Bioinform 2015;17:33-42. [PMID: 26420781 PMCID: PMC4719073 DOI: 10.1093/bib/bbv087] [Citation(s) in RCA: 103] [Impact Index Per Article: 11.4] [Received: 02/17/2015] [Indexed: 02/06/2023] Open

134

Rudd J, Zelaya RA, Demidenko E, Greene CS, Doherty JA. Abstract 2171: Leveraging global gene expression patterns to identify gene sets that predict expression of large numbers of unmeasured genes. Cancer Res 2015. [DOI: 10.1158/1538-7445.am2015-2171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

Abstract

Large collections of formalin-fixed, paraffin-embedded tissue are a rich resource to test hypotheses based on gene expression patterns; however, measurement of genome-wide expression is cost-prohibitive on a large scale, and a reduced set of “candidate” genes must be selected and assayed with platforms such as NanoString nCounter®. Using the known expression correlation structure within a given tissue (high grade serous ovarian cancer; HGSC), we sought to determine whether reduced sets of directly measured genes could accurately predict a maximized number of unmeasured genes. To maximize the number of unmeasured genes that can be inferred from reduced set assays, we developed an algorithm with three key parameters: the number of genes to directly measure; Pearson correlation thresholds between genes (rP); and the number of directly measured genes that must be correlated with the unmeasured genes (i.e., redundancy). We evaluated this algorithm across a range of parameter values: 10-400 directly assayed genes, redundancy of 1-3, and |rP| of 0.60, 0.65, and 0.70. In a training partition of the Cancer Genome Atlas (TCGA) HGSC Affymetrix U133 Plus 2.0 gene expression data (n = 386), we used the selected directly measured genes to build linear models of predicted gene expression. We then evaluated the predicted expression values using true expression values from the following HGSC datasets: TCGA testing partition (n = 159); GSE9891 (Australian, U133 Plus 2.0 array, n = 264); and GSE32062 (Japanese, Agilent 4×44k microarray, n = 258). After restricting to genes with median absolute deviation (MAD) > 0.5 and using our most conservative parameters (|rP| = 0.7; redundancy = 3), 400 directly measured genes predicted an additional 198 unmeasured genes, with average Spearman rank coefficients (rS) and bootstrapped standard errors between predicted and true expression values of 0.854 (0.005) in the testing partition of TCGA, 0.871 (0.006) in the Australian data, and 0.832 (0.010) in the Japanese data. Removing MAD filtering predicted 332 unmeasured genes but lowered accuracy, with respective average rS values of 0.800 (0.009), 0.816 (0.011), and 0.750 (0.015). Relaxing redundancy to 2 and |rP| to 0.65 predicted 701 unmeasured genes, but respective average rS values decreased to 0.732 (0.007), 0.733 (0.008), and 0.686 (0.009). The number of predicted genes increases as the parameters become less conservative, with a concomitant decrease in accuracy. In summary, we show that for a given disease type a minimal set of directly measured genes can be used to maximize the amount of gene expression information captured in data sets across populations and assay platforms. Genes selected using this method can be combined with candidate gene sets as a cost-effective way to increase the amount of gene expression information obtained in large studies where using a genome-wide measurement platform is not feasible.

Citation Format: James Rudd, Rene A. Zelaya, Eugene Demidenko, Casey S. Greene, Jennifer A. Doherty. Leveraging global gene expression patterns to identify gene sets that predict expression of large numbers of unmeasured genes. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 2171. doi:10.1158/1538-7445.AM2015-2171
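The evaluation described in this abstract (linear models fit on a training partition, then scored by Spearman rank correlation on held-out data, with an optional MAD filter) might be sketched as follows. This is an illustration only: the helper names are hypothetical, ordinary least squares is assumed as the linear model, and the rank computation assumes no tied values.

```python
import numpy as np

def fit_linear(X_train, y_train):
    """Least-squares coefficients (intercept first) for y ~ X."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predicted expression of one unmeasured gene from measured genes X."""
    return coef[0] + X @ coef[1:]

def spearman(a, b):
    """Spearman rank correlation; assumes no tied values."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

def mad_filter(expr, cutoff=0.5):
    """Indices of genes (columns) whose median absolute deviation exceeds cutoff."""
    med = np.median(expr, axis=0)
    mad = np.median(np.abs(expr - med), axis=0)
    return np.flatnonzero(mad > cutoff)
```

In use, `fit_linear` would be called once per predictable gene on the training partition, and `spearman` would compare `predict_linear` output with the true values in each test population.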

135

Way GP, Rudd J, Greene CS, Doherty JA. Abstract 1928: High-grade serous ovarian cancer subtypes are similar across diverse populations. Cancer Res 2015. [DOI: 10.1158/1538-7445.am2015-1928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

Abstract

The most common and lethal type of invasive epithelial ovarian cancer is high grade serous (HGSC). Three to four gene expression-based HGSC subtypes have been identified in prior studies. In contrast to most previous studies, which have assessed the performance of survival classifiers in validation sets, we sought to determine the degree of similarity of gene expression patterns in subtypes between populations using systematic unsupervised clustering within populations. We analyzed publicly available mRNA expression data from studies with >200 HGSC tumors: The Cancer Genome Atlas (TCGA, US, n = 519, Affymetrix HT U133a), Tothill et al. (GSE9891, Australia, n = 242, Affymetrix U133 Plus 2.0) and Yoshihara et al. (GSE32062, Japan, n = 258, Agilent G4112a). We restricted analyses to the 12,249 genes shared across all datasets and selected from these the union of the 1,500 most variant genes per population (2,824). Using these datasets, we performed k-means clustering within each population for k = 3 and k = 4. We compared each cluster to all other clusters using Significance Analysis of Microarrays, which outputs an F score for all 12,249 genes, measuring cluster-specific differential expression. We then calculated the correlation of the resulting F score vectors across populations and within populations across both numbers of centroids (k = 3 or k = 4). We identified analogous clusters by high F score correlations and determined each cluster's similarity to the TCGA subtypes based on cluster-specific differentially expressed genes. We observed high concordance of gene expression patterns for clusters across populations and across k-means runs, suggesting that analogous clusters exist in most analyses. For k = 3, F score correlations across populations for clusters 1, 2 and 3, respectively, ranged between 0.77-0.85, 0.80-0.90, and 0.66-0.72. For k = 4, F score correlations for clusters 1-4 were, respectively: 0.76-0.85, 0.82-0.85, 0.65-0.78, and 0.52-0.78. Across k = 3 and k = 4, correlations for cluster 1 within TCGA, Tothill, and Yoshihara were 0.99, 1.00 and 1.00, and correlations for cluster 2 were 0.96, 0.98, and 0.95, respectively. Correlations for cluster 3 were less strong: 0.56, 0.88, and 0.60, respectively. For k = 4, cluster 4 was composed mainly of samples that belonged to cluster 3 for k = 3; 88% for TCGA, 54% for Tothill, and 95% for Yoshihara. When compared to TCGA subtypes, cluster 1 corresponded most strongly to mesenchymal, cluster 2 to proliferative, cluster 3 to differentiated, and cluster 4 to immunoreactive. Our observation of highly correlated gene expression patterns between clusters across populations, across platforms, and across the number of centroids provides strong evidence that at least three biological HGSC subtypes exist. The mesenchymal-like and proliferative-like subtypes are particularly consistent across populations, and could be uniquely targeted for treatment.

Citation Format: Gregory P. Way, James Rudd, Casey S. Greene, Jennifer A. Doherty. High-grade serous ovarian cancer subtypes are similar across diverse populations. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 1928. doi:10.1158/1538-7445.AM2015-1928
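The cross-population comparison described in this abstract, in which each cluster gets a per-gene score vector and analogous clusters are identified by correlating those vectors, can be sketched roughly as below. Two simplifications are assumptions of this sketch: cluster labels are taken as given (for instance from a k-means run within each population), and a Welch t-like statistic stands in for the Significance Analysis of Microarrays score, which is not reimplemented here. Function names are hypothetical.

```python
import numpy as np

def cluster_scores(expr, labels, k):
    """Per-cluster, per-gene score of each cluster versus all other samples.

    expr is samples x genes; labels holds integers 0..k-1. A Welch t-like
    statistic is used as a stand-in for the SAM score.
    """
    scores = np.empty((k, expr.shape[1]))
    for c in range(k):
        inside, outside = expr[labels == c], expr[labels != c]
        se = np.sqrt(inside.var(axis=0, ddof=1) / len(inside)
                     + outside.var(axis=0, ddof=1) / len(outside))
        scores[c] = (inside.mean(axis=0) - outside.mean(axis=0)) / se
    return scores

def match_clusters(scores_a, scores_b):
    """Correlate score vectors across two populations.

    Returns the cross-correlation matrix and, for each cluster in A, the
    index of the most strongly correlated cluster in B.
    """
    k_a = scores_a.shape[0]
    cross = np.corrcoef(scores_a, scores_b)[:k_a, k_a:]
    return cross, cross.argmax(axis=1)
```

Both populations must be restricted to the same genes in the same column order before scoring, mirroring the shared-gene restriction described in the abstract.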

136

Gui J, Greene CS, Sullivan C, Taylor W, Moore JH, Kim C. Testing multiple hypotheses through IMP weighted FDR based on a genetic functional network with application to a new zebrafish transcriptome study. BioData Min 2015;8:17. [PMID: 26097506 PMCID: PMC4474579 DOI: 10.1186/s13040-015-0050-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/03/2014] [Accepted: 06/08/2015] [Indexed: 11/10/2022] Open

137

Greene AC, Giffin KA, Greene CS, Moore JH. Adapting bioinformatics curricula for big data. Brief Bioinform 2015;17:43-50. [PMID: 25829469 PMCID: PMC4719066 DOI: 10.1093/bib/bbv018] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 12/02/2014] [Indexed: 12/16/2022] Open

138

Mahoney JM, Taroni J, Martyanov V, Wood TA, Greene CS, Pioli PA, Hinchcliff ME, Whitfield ML. Systems level analysis of systemic sclerosis shows a network of immune and profibrotic pathways connected with genetic polymorphisms. PLoS Comput Biol 2015;11:e1004005. [PMID: 25569146 PMCID: PMC4288710 DOI: 10.1371/journal.pcbi.1004005] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Received: 02/13/2014] [Accepted: 10/27/2014] [Indexed: 12/15/2022] Open

Abstract

Systemic sclerosis (SSc) is a rare systemic autoimmune disease characterized by skin and organ fibrosis. The pathogenesis of SSc and its progression are poorly understood. The SSc intrinsic gene expression subsets (inflammatory, fibroproliferative, normal-like, and limited) are observed in multiple clinical cohorts of patients with SSc. Analysis of longitudinal skin biopsies suggests that a patient's subset assignment is stable over 6–12 months. Genetically, SSc is multi-factorial with many genetic risk loci for SSc generally and for specific clinical manifestations. Here we identify the genes consistently associated with the intrinsic subsets across three independent cohorts, show the relationship between these genes using a gene-gene interaction network, and place the genetic risk loci in the context of the intrinsic subsets. To identify gene expression modules common to three independent datasets from three different clinical centers, we developed a consensus clustering procedure based on mutual information of partitions, an information theory concept, and performed a meta-analysis of these genome-wide gene expression datasets. We created a gene-gene interaction network of the conserved molecular features across the intrinsic subsets and analyzed their connections with SSc-associated genetic polymorphisms. The network is composed of distinct, but interconnected, components related to interferon activation, M2 macrophages, adaptive immunity, extracellular matrix remodeling, and cell proliferation. The network shows extensive connections between the inflammatory- and fibroproliferative-specific genes. The network also shows connections between these subset-specific genes and 30 SSc-associated polymorphic genes including STAT4, BLK, IRF7, NOTCH4, PLAUR, CSK, IRAK1, and several human leukocyte antigen (HLA) genes. 
Our analyses suggest that the gene expression changes underlying the SSc subsets may be long-lived, but mechanistically interconnected and related to a patient's underlying genetic risk.

Systemic sclerosis (SSc) is a rare autoimmune disease characterized by skin thickening (fibrosis) and progressive organ failure. Previous studies of SSc skin biopsies have identified molecular subsets of SSc based upon gene expression termed the inflammatory, fibroproliferative, normal-like, and limited intrinsic subsets. These gene expression signatures are large and although the biological processes are conserved, the exact list of genes can vary across datasets due to random variation, as well as minor differences in the composition of the study cohorts (e.g. early vs. late disease). We developed a computational tool to identify the consensus genes underlying the subsets across heterogeneous data and characterized the biological role of the consensus genes in SSc in order to obtain a systems level perspective of the SSc subsets. Our analysis reveals a complex network of genes connecting two of the major SSc intrinsic subsets, inflammatory and fibroproliferative. Many genetic loci associated with SSc risk show connections with the consensus genes of the intrinsic subsets, indicating that differential expression of genes defining the subsets may be related to genetic risk for SSc, thus for the first time placing the genetic risk factors in the context of, and showing putative relationships with, the intrinsic gene expression subsets.
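The abstract above mentions a consensus clustering procedure based on mutual information of partitions. The procedure itself is not described here, so the sketch below shows only the underlying quantity: the mutual information between two partitions (label vectors) of the same items. The function name is hypothetical.

```python
import numpy as np

def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two partitions of the same items."""
    _, a = np.unique(labels_a, return_inverse=True)
    _, b = np.unique(labels_b, return_inverse=True)
    a, b = a.ravel(), b.ravel()
    # Joint distribution of cluster memberships, from the contingency table.
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                      # skip empty cells to avoid log(0)
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())
```

Identical partitions give the partition's entropy, and statistically independent partitions give zero, which is what makes the quantity usable as an agreement score between clusterings of different datasets.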

139

Tan J, Ung M, Cheng C, Greene CS. Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders. Pac Symp Biocomput 2015;20:132-143. [PMID: 25592575 DOI: 10.1142/9789814644730_0014] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 05/23/2023]

140

Greene CS, Tan J, Ung M, Moore JH, Cheng C. Big data bioinformatics. J Cell Physiol 2014;229:1896-900. [PMID: 24799088 DOI: 10.1002/jcp.24662] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.8] [Received: 04/30/2014] [Accepted: 05/01/2014] [Indexed: 12/17/2022]

141

Zieselman AL, Fisher JM, Hu T, Andrews PC, Greene CS, Shen L, Saykin AJ, Moore JH. Computational genetics analysis of grey matter density in Alzheimer's disease. BioData Min 2014;7:17. [PMID: 25165488 PMCID: PMC4145360 DOI: 10.1186/1756-0381-7-17] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 01/28/2014] [Accepted: 08/18/2014] [Indexed: 12/24/2022] Open

142

Bogenberger JM, Rudd JE, Chow D, Kassner M, Yin H, Greene CS, Tibes R. Abstract A28: Identification of HDAC inhibitor potentiating targets in acute myeloid leukemia cells by large-scale RNA-interference. Mol Cancer Ther 2014. [DOI: 10.1158/1535-7163.pms-a28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

Abstract

The lysine deacetylase inhibitor suberoylanilide hydroxamic acid (SAHA) has shown promising but limited activity in the treatment of acute myeloid leukemia (AML). To identify potential targets for rational combination therapies that increase the efficacy of SAHA in AML, we used a functional RNA-interference (RNAi) screening approach to identify genes that, when inhibited, potentiate the in vitro anti-leukemic activity of SAHA. A total of 901 kinase, phosphatase and associated signaling genes were silenced, with four different siRNA sequences per gene, in combination with SAHA treatment. Log2 values of the ratio [(siRNA + SAHA)/(siRNA alone)] were calculated, with median and standard deviation determined on a per-plate basis. Hits were defined as ≥ 2 standard deviations from the log2 ratio median. Hit lists for each cell line were overlaid on an integrated functional relationship network. We applied a community detection algorithm to this sub-network and identified siRNA-sensitive modules. Each module represents a highly connected set of genes in the integrated network. To identify the pathways represented by each module, we evaluated enrichment using the National Cancer Institute (NCI) Protein Interaction Database (PID) pathways. Several novel sensitizing targets, grouped into a small number of pathways, emerged from these screens. Some hits exhibit little to no anti-leukemic activity when silenced alone, indicative of synthetic lethal interaction with SAHA treatment. Initial validation experiments with siRNA and novel small molecule inhibitors confirm RNAi screen results and pharmacological sensitization is observed. The first reported large-scale HDAC inhibitor RNAi screen in leukemias has identified a novel rational combination that can be translated into design of a clinical trial.

Citation Format: James M. Bogenberger, James E. Rudd, Donald Chow, Michelle Kassner, Holly Yin, Casey S. Greene, Raoul Tibes. Identification of HDAC inhibitor potentiating targets in acute myeloid leukemia cells by large-scale RNA-interference. [abstract]. In: Proceedings of the AACR Precision Medicine Series: Synthetic Lethal Approaches to Cancer Vulnerabilities; May 17-20, 2013; Bellevue, WA. Philadelphia (PA): AACR; Mol Cancer Ther 2013;12(5 Suppl):Abstract nr A28.
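The hit-calling rule stated in this abstract (log2 ratio of combination to siRNA-alone signal, with median and standard deviation computed per plate, and hits at least 2 standard deviations from the median) could look like the sketch below. The abstract does not say whether hits are one-sided, so flagging deviations in either direction is an assumption, as are the function and argument names.

```python
import numpy as np

def call_hits(combo, alone, plate, n_sd=2.0):
    """Flag wells whose log2 ratio lies >= n_sd plate SDs from the plate median.

    combo: signal for siRNA + drug; alone: signal for siRNA alone;
    plate: plate identifier per well. Returns (log_ratio, z, is_hit).
    """
    log_ratio = np.log2(np.asarray(combo, float) / np.asarray(alone, float))
    plate = np.asarray(plate)
    z = np.zeros_like(log_ratio)
    for p in np.unique(plate):
        on_plate = plate == p
        values = log_ratio[on_plate]
        sd = values.std(ddof=1)
        if sd > 0:                      # leave z at 0 for degenerate plates
            z[on_plate] = (values - np.median(values)) / sd
    # Assumption: deviations in either direction count as hits.
    return log_ratio, z, np.abs(z) >= n_sd
```

Sensitizing targets would sit on the negative side of `z`, since a lower combination signal relative to siRNA alone gives a negative log2 ratio.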

143

Penrod NM, Greene CS, Moore JH. Predicting targeted drug combinations based on Pareto optimal patterns of coexpression network connectivity. Genome Med 2014;6:33. [PMID: 24944582 PMCID: PMC4062052 DOI: 10.1186/gm550] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/30/2013] [Accepted: 04/22/2014] [Indexed: 01/05/2023] Open

Abstract

Background

Molecularly targeted drugs promise a safer and more effective treatment modality than conventional chemotherapy for cancer patients. However, tumors are dynamic systems that readily adapt to these agents, activating alternative survival pathways as they evolve resistant phenotypes. Combination therapies can overcome resistance, but finding the optimal combinations efficiently presents a formidable challenge. Here we introduce a new paradigm for the design of combination therapy treatment strategies that exploits the tumor adaptive process to identify context-dependent essential genes as druggable targets.

Methods

We have developed a framework to mine high-throughput transcriptomic data, based on differential coexpression and Pareto optimization, to investigate drug-induced tumor adaptation. We use this approach to identify tumor-essential genes as druggable candidates. We apply our method to a set of ER⁺ breast tumor samples, collected before (n = 58) and after (n = 60) neoadjuvant treatment with the aromatase inhibitor letrozole, to prioritize genes as targets for combination therapy with letrozole treatment. We validate letrozole-induced tumor adaptation through coexpression and pathway analyses in an independent data set (n = 18).
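The two ingredients named in the Methods, differential coexpression and Pareto optimization, can be illustrated with a minimal sketch. The abstract does not specify which per-gene objectives are optimized, so `pareto_front` below simply takes an arbitrary matrix of per-gene objective scores (to be maximized), and both function names are hypothetical.

```python
import numpy as np

def differential_coexpression(before, after):
    """Change in gene-gene Pearson correlation between two conditions.

    before and after are samples x genes arrays over the same genes.
    """
    return np.corrcoef(after, rowvar=False) - np.corrcoef(before, rowvar=False)

def pareto_front(scores):
    """Indices of rows not dominated by any other row (all objectives maximized).

    Row j dominates row i if it is at least as good on every objective and
    strictly better on at least one.
    """
    scores = np.asarray(scores, float)
    front = []
    for i, s in enumerate(scores):
        dominated = np.any(np.all(scores >= s, axis=1) & np.any(scores > s, axis=1))
        if not dominated:
            front.append(i)
    return front
```

For example, with rows `[1, 5], [2, 4], [3, 3], [2, 2]` the first three form the front, while `[2, 2]` is dominated by `[2, 4]`.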

Results

We find pervasive differential coexpression between the untreated and letrozole-treated tumor samples as evidence of letrozole-induced tumor adaptation. Based on patterns of coexpression, we identify ten genes as potential candidates for combination therapy with letrozole including EPCAM, a letrozole-induced essential gene and a target to which drugs have already been developed as cancer therapeutics. Through replication, we validate six letrozole-induced coexpression relationships and confirm the epithelial-to-mesenchymal transition as a process that is upregulated in the residual tumor samples following letrozole treatment.

Conclusions

To derive the greatest benefit from molecularly targeted drugs it is critical to design combination treatment strategies rationally. Incorporating knowledge of the tumor adaptation process into the design provides an opportunity to match targeted drugs to the evolving tumor phenotype and surmount resistance.

144

Greene CS, Himmelstein DS, Nelson HH, Kelsey KT, Williams SM, Andrew AS, Karagas MR, Moore JH. Enabling personal genomics with an explicit test of epistasis. Pac Symp Biocomput 2013:327-36. [PMID: 19908385 DOI: 10.1142/9789814295291_0035] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022]

Abstract

One goal of personal genomics is to use information about genomic variation to predict who is at risk for various common diseases. Technological advances in genotyping have spawned several personal genetic testing services that market genotyping services directly to the consumer. An important goal of consumer genetic testing is to provide health information along with the genotyping results. This has the potential to integrate detailed personal genetic and genomic information into healthcare decision making. Despite the potential importance of these advances, there are some important limitations. One concern is that much of the literature that is used to formulate personal genetics reports is based on genetic association studies that consider each genetic variant independently of the others. It is our working hypothesis that the true value of personal genomics will only be realized when the complexity of the genotype-to-phenotype mapping relationship is embraced, rather than ignored. We focus here on complexity in genetic architecture due to epistasis or nonlinear gene-gene interaction. We have previously developed a multifactor dimensionality reduction (MDR) algorithm and software package for detecting nonlinear interactions in genetic association studies. In most prior MDR analyses, the permutation testing strategy used to assess statistical significance was unable to differentiate MDR models that captured only interaction effects from those that also detected independent main effects. Statistical interpretation of MDR models required post-hoc analysis using entropy-based measures of interaction information. We introduce here a novel permutation test that allows the effects of nonlinear interactions between multiple genetic variants to be specifically tested in a manner that is not confounded by linear additive effects. 
We show using simulated nonlinear interactions that the power of the explicit test of epistasis is no different from that of a standard permutation test. We also show that the test has the appropriate size or type I error rate of approximately 0.05. We then apply MDR with the new explicit test of epistasis to a large genetic study of bladder cancer and show that a previously reported nonlinear interaction between two XPD gene polymorphisms is indeed significant, even after considering the strong additive effect of smoking in the model. Finally, we evaluated the power of the explicit test of epistasis to detect the nonlinear interaction between two XPD gene polymorphisms by simulating data from the MDR model of bladder cancer susceptibility. The results of this study provide for the first time a simple method for explicitly testing epistasis or gene-gene interaction effects in genetic association studies. Although we demonstrated the method with MDR, an important advantage is that it can be combined with any modeling approach. The explicit test of epistasis brings us a step closer to the type of routine gene-gene interaction analysis that is needed if we are to enable personal genomics.
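The abstract does not spell out how the permutation null is constructed. One construction consistent with its description, and an assumption of this sketch, is to shuffle each variant's genotypes independently within cases and within controls: per-class genotype frequencies (main effects) are preserved while the joint pattern across variants (interaction) is randomized. The two-locus accuracy statistic is a simplified stand-in for an MDR model, and all names are hypothetical.

```python
import numpy as np

def permute_within_class(genotypes, labels, rng):
    """Shuffle each SNP column independently within each class.

    genotypes: samples x SNPs integer array (0/1/2); labels: 0/1 per sample.
    """
    permuted = genotypes.copy()
    for cls in np.unique(labels):
        rows = np.flatnonzero(labels == cls)
        for snp in range(genotypes.shape[1]):
            permuted[rows, snp] = genotypes[rng.permutation(rows), snp]
    return permuted

def two_locus_accuracy(genotypes, labels):
    """Accuracy of a majority-vote lookup table over two-locus genotype cells."""
    cells = genotypes[:, 0] * 3 + genotypes[:, 1]
    correct = 0
    for cell in np.unique(cells):
        in_cell = labels[cells == cell]
        cases = int(in_cell.sum())
        correct += max(cases, len(in_cell) - cases)
    return correct / len(labels)

def explicit_test_p_value(genotypes, labels, n_perm=100, seed=0):
    """Permutation p-value for interaction under the within-class null."""
    rng = np.random.default_rng(seed)
    observed = two_locus_accuracy(genotypes, labels)
    null = [two_locus_accuracy(permute_within_class(genotypes, labels, rng), labels)
            for _ in range(n_perm)]
    return (1 + sum(s >= observed for s in null)) / (n_perm + 1)
```

Because the statistic is passed the same labels in every permutation, any signal that survives in the null comes from main effects alone, which is the property the abstract attributes to the explicit test.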

145

Ju W, Greene CS, Eichinger F, Nair V, Hodgin JB, Bitzer M, Lee YS, Zhu Q, Kehata M, Li M, Jiang S, Rastaldi MP, Cohen CD, Troyanskaya OG, Kretzler M. Defining cell-type specificity at the transcriptional level in human disease. Genome Res 2013;23:1862-73. [PMID: 23950145 PMCID: PMC3814886 DOI: 10.1101/gr.155697.113] [Citation(s) in RCA: 171] [Impact Index Per Article: 15.5] [Indexed: 12/12/2022]

Abstract

Cell-lineage–specific transcripts are essential for differentiated tissue function, implicated in hereditary organ failure, and mediate acquired chronic diseases. However, experimental identification of cell-lineage–specific genes in a genome-scale manner is infeasible for most solid human tissues. We developed the first genome-scale method to identify genes with cell-lineage–specific expression, even in lineages not separable by experimental microdissection. Our machine-learning–based approach leverages high-throughput data from tissue homogenates in a novel iterative statistical framework. We applied this method to chronic kidney disease and identified transcripts specific to podocytes, key cells in the glomerular filter responsible for hereditary and most acquired glomerular kidney disease. In a systematic evaluation of our predictions by immunohistochemistry, our in silico approach was significantly more accurate (65% accuracy in human) than predictions based on direct measurement of in vivo fluorescence-tagged murine podocytes (23%). Our method identified genes implicated as causal in hereditary glomerular disease and involved in molecular pathways of acquired and chronic renal diseases. Furthermore, based on expression analysis of human kidney disease biopsies, we demonstrated that expression of the podocyte genes identified by our approach is significantly related to the degree of renal impairment in patients. Our approach is broadly applicable to define lineage specificity in both cell physiology and human disease contexts. We provide a user-friendly website that enables researchers to apply this method to any cell-lineage or tissue of interest. Identified cell-lineage–specific transcripts are expected to play essential tissue-specific roles in organogenesis and disease and can provide starting points for the development of organ-specific diagnostics and therapies.

146

Park CY, Wong AK, Greene CS, Rowland J, Guan Y, Bongo LA, Burdine RD, Troyanskaya OG. Functional knowledge transfer for high-accuracy prediction of under-studied biological processes. PLoS Comput Biol 2013;9:e1002957. [PMID: 23516347 PMCID: PMC3597527 DOI: 10.1371/journal.pcbi.1002957] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 05/29/2012] [Accepted: 01/15/2013] [Indexed: 11/19/2022] Open

Abstract

A key challenge in genetics is identifying the functional roles of genes in pathways. Numerous functional genomics techniques (e.g. machine learning) that predict protein function have been developed to address this question. These methods generally build from existing annotations of genes to pathways and thus are often unable to identify additional genes participating in processes that are not already well studied. Many of these processes are well studied in some organism, but not necessarily in an investigator's organism of interest. Sequence-based search methods (e.g. BLAST) have been used to transfer such annotation information between organisms. We demonstrate that functional genomics can complement traditional sequence similarity to improve the transfer of gene annotations between organisms. Our method transfers annotations only when functionally appropriate as determined by genomic data and can be used with any prediction algorithm to combine transferred gene function knowledge with organism-specific high-throughput data to enable accurate function prediction.

We show that diverse state-of-the-art machine learning algorithms leveraging functional knowledge transfer (FKT) dramatically improve their accuracy in predicting gene-pathway membership, particularly for processes with little experimental knowledge in an organism. We also show that our method compares favorably to annotation transfer by sequence similarity. Next, we deploy FKT with a state-of-the-art SVM classifier to predict novel genes for 11,000 biological processes across six diverse organisms and expand the coverage of accurate function predictions to processes that are often ignored because of a dearth of annotated genes in an organism. Finally, we perform in vivo experimental investigation in Danio rerio and confirm the regulatory role of our top predicted novel gene, wnt5b, in leftward cell migration during heart development. FKT is immediately applicable to many bioinformatics techniques and will help biologists systematically integrate prior knowledge from diverse systems to direct targeted experiments in their organism of study.

Due to technical and ethical challenges many human diseases or biological processes are studied in model organisms. Discoveries in these organisms are then transferred back to human or other model organisms. Traditional methods for transferring novel gene function annotations have relied on finding genes with high sequence similarity believed to share evolutionary ancestry. However, sequence similarity does not guarantee a shared functional role in molecular pathways. In this study, we show that functional genomics can complement traditional sequence similarity measures to improve the transfer of gene annotations between organisms. We coupled our knowledge transfer method with current state-of-the-art machine learning algorithms and predicted gene function for 11,000 biological processes across six organisms. We experimentally validated our prediction of wnt5b's involvement in the determination of left-right heart asymmetry in zebrafish. Our results show that functional knowledge transfer can improve the coverage and accuracy of machine learning methods used for gene function prediction in a diverse set of organisms. Such an approach can be applied to additional organisms, and will be especially beneficial in organisms that have high-throughput genomic data with sparse annotations.

147

Gonzalez G, Cohen KB, Greene CS, Hahn U, Kann MG, Leaman R, Shah N, Ye J. Text and data mining for biomedical discovery. Pac Symp Biocomput 2013;2013:368-372. [PMID: 23424141 PMCID: PMC6230431 DOI: 10.1142/9789814583220_0030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/01/2023]

148

Greene CS, Troyanskaya OG. Chapter 2: Data-driven view of disease biology. PLoS Comput Biol 2012;8:e1002816. [PMID: 23300408 PMCID: PMC3531282 DOI: 10.1371/journal.pcbi.1002816] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022] Open

149

Wong AK, Park CY, Greene CS, Bongo LA, Guan Y, Troyanskaya OG. IMP: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks. Nucleic Acids Res 2012;40:W484-90. [PMID: 22684505 PMCID: PMC3394282 DOI: 10.1093/nar/gks458] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.3] [Indexed: 12/20/2022] Open

150

Greene CS, Troyanskaya OG. Accurate evaluation and analysis of functional genomics data and methods. Ann N Y Acad Sci 2012;1260:95-100. [PMID: 22268703 DOI: 10.1111/j.1749-6632.2011.06383.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/27/2022]