76
|
Kusko RL, Brothers JF, Tedrow J, Pandit K, Huleihel L, Perdomo C, Liu G, Juan-Guardela B, Kass D, Zhang S, Lenburg M, Martinez F, Quackenbush J, Sciurba F, Limper A, Geraci M, Yang I, Schwartz DA, Beane J, Spira A, Kaminski N. Integrated Genomics Reveals Convergent Transcriptomic Networks Underlying Chronic Obstructive Pulmonary Disease and Idiopathic Pulmonary Fibrosis. Am J Respir Crit Care Med 2016; 194:948-960. [PMID: 27104832 PMCID: PMC5067817 DOI: 10.1164/rccm.201510-2026oc] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Received: 10/18/2015] [Accepted: 03/27/2016] [Indexed: 12/18/2022] Open
Abstract
RATIONALE Despite shared environmental exposures, idiopathic pulmonary fibrosis (IPF) and chronic obstructive pulmonary disease are usually studied in isolation, and the presence of shared molecular mechanisms is unknown. OBJECTIVES We applied an integrative genomic approach to identify convergent transcriptomic pathways in emphysema and IPF. METHODS We defined the transcriptional repertoire of chronic obstructive pulmonary disease, IPF, or normal histology lungs using RNA-seq (n = 87). MEASUREMENTS AND MAIN RESULTS Genes increased in both emphysema and IPF relative to control were enriched for the p53/hypoxia pathway, a finding confirmed in an independent cohort using both gene expression arrays and the nCounter Analysis System (n = 193). Immunohistochemistry confirmed overexpression of HIF1A, MDM2, and NFKBIB members of this pathway in tissues from patients with emphysema or IPF. Using reads aligned across splice junctions, we determined that alternative splicing of p53/hypoxia pathway-associated molecules NUMB and PDGFA occurred more frequently in IPF or emphysema compared with control and validated these findings by quantitative polymerase chain reaction and the nCounter Analysis System on an independent sample set (n = 193). Finally, by integrating parallel microRNA and mRNA-Seq data on the same samples, we identified MIR96 as a key novel regulatory hub in the p53/hypoxia gene-expression network and confirmed that modulation of MIR96 in vitro recapitulates the disease-associated gene-expression network. CONCLUSIONS Our results suggest convergent transcriptional regulatory hubs in diseases as varied phenotypically as chronic obstructive pulmonary disease and IPF and suggest that these hubs may represent shared key responses of the lung to environmental stresses.
|
77
|
Morrow JD, Cho MH, Hersh CP, Pinto-Plata V, Celli B, Marchetti N, Criner G, Bueno R, Washko G, Glass K, Choi AMK, Quackenbush J, Silverman EK, DeMeo DL. DNA methylation profiling in human lung tissue identifies genes associated with COPD. Epigenetics 2016; 11:730-739. [PMID: 27564456 PMCID: PMC5094634 DOI: 10.1080/15592294.2016.1226451] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Received: 07/06/2016] [Revised: 08/05/2016] [Accepted: 08/10/2016] [Indexed: 10/21/2022] Open
Abstract
Chronic obstructive pulmonary disease (COPD) is a smoking-related disease characterized by genetic and phenotypic heterogeneity. Although association studies have identified multiple genomic regions with replicated associations to COPD, genetic variation only partially explains the susceptibility to lung disease, and suggests the relevance of epigenetic investigations. We performed genome-wide DNA methylation profiling in homogenized lung tissue samples from 46 control subjects with normal lung function and 114 subjects with COPD, all former smokers. The differentially methylated loci were integrated with previous genome-wide association study results. The top 535 differentially methylated sites, filtered for a minimum mean methylation difference of 5% between cases and controls, were enriched for CpG shelves and shores. Pathway analysis revealed enrichment for transcription factors. The top differentially methylated sites from the intersection with previous GWAS were in CHRM1, GLT1D1, and C10orf11; sorted by GWAS P-value, the top sites included FRMD4A, THSD4, and C10orf11. Epigenetic association studies complement genetic association studies to identify genes potentially involved in COPD pathogenesis. Enrichment for genes implicated in asthma and lung function and for transcription factors suggests the potential pathogenic relevance of genes identified through differential methylation and the intersection with a broader range of GWAS associations.
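The effect-size filter mentioned in this abstract can be illustrated with a minimal sketch. It assumes methylation is recorded as beta values between 0 and 1, so that "5%" corresponds to an absolute difference of 0.05 in group means; the site identifiers, the sample values, and the function name `filter_sites` are hypothetical and not taken from the paper.

```python
from statistics import mean

def filter_sites(case_betas, control_betas, min_diff=0.05):
    """Keep sites whose mean beta differs between groups by at least min_diff.

    case_betas / control_betas map a site ID to a list of beta values
    (assumed to lie in [0, 1]); the result maps kept sites to the
    signed difference (cases minus controls).
    """
    kept = {}
    for site, cases in case_betas.items():
        diff = mean(cases) - mean(control_betas[site])
        if abs(diff) >= min_diff:
            kept[site] = diff
    return kept

# Hypothetical toy data: two CpG sites, two samples per group.
case_betas = {"cg_a": [0.60, 0.70], "cg_b": [0.30, 0.32]}
control_betas = {"cg_a": [0.50, 0.52], "cg_b": [0.30, 0.30]}

kept = filter_sites(case_betas, control_betas)
print(kept)  # only cg_a passes the 0.05 threshold
```

A filter of this kind is usually applied after significance testing, so that the retained sites are both statistically supported and large enough in effect to be of biological interest.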
|
78
|
Vargas AJ, Quackenbush J, Glass K. Diet-induced weight loss leads to a switch in gene regulatory network control in the rectal mucosa. Genomics 2016; 108:126-133. [PMID: 27524493 PMCID: PMC5121035 DOI: 10.1016/j.ygeno.2016.08.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 04/07/2016] [Revised: 08/09/2016] [Accepted: 08/10/2016] [Indexed: 12/15/2022]
Abstract
BACKGROUND Weight loss may decrease risk of colorectal cancer in obese individuals, yet its effect in the colorectum is not well understood. We used integrative network modeling, Passing Attributes between Networks for Data Assimilation, to estimate transcriptional regulatory network models from mRNA expression levels from rectal mucosa biopsies measured pre- and post-weight loss in 10 obese, pre-menopausal women. RESULTS We identified significantly greater regulatory targeting of glucose transport pathways in the post-weight loss regulatory network, including "regulation of glucose transport" (FDR=0.02), "hexose transport" (FDR=0.06), "glucose transport" (FDR=0.06) and "monosaccharide transport" (FDR=0.08). These findings were not evident by gene expression analysis alone. Network analysis also suggested a regulatory switch from NFKB1 to MAX control of MYC post-weight loss. CONCLUSIONS These network-based results expand upon standard gene expression analysis by providing evidence for a potential mechanistic alteration caused by weight loss.
|
79
|
Safikhani Z, Smirnov P, Freeman M, El-Hachem N, She A, Rene Q, Goldenberg A, Birkbak NJ, Hatzis C, Shi L, Beck AH, Aerts HJ, Quackenbush J, Haibe-Kains B. Revisiting inconsistency in large pharmacogenomic studies. F1000Res 2016; 5:2333. [PMID: 28928933 PMCID: PMC5580432 DOI: 10.12688/f1000research.9611.3] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Accepted: 08/11/2017] [Indexed: 01/30/2023] Open
Abstract
In 2013, we published a comparative analysis of mutation and gene expression profiles and drug sensitivity measurements for 15 drugs characterized in the 471 cancer cell lines screened in the Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE). While we found good concordance in gene expression profiles, there was substantial inconsistency in the drug responses reported by the GDSC and CCLE projects. We received extensive feedback on the comparisons that we performed. This feedback, along with the release of new data, prompted us to revisit our initial analysis. We present a new analysis using these expanded data, where we address the most significant suggestions for improvements on our published analysis - that targeted therapies and broad cytotoxic drugs should have been treated differently in assessing consistency, that consistency of both molecular profiles and drug sensitivity measurements should be compared across cell lines, and that the software analysis tools provided should have been easier to run, particularly as the GDSC and CCLE released additional data. Our re-analysis supports our previous finding that gene expression data are significantly more consistent than drug sensitivity measurements. Using new statistics to assess data consistency allowed identification of two broad effect drugs and three targeted drugs with moderate to good consistency in drug sensitivity data between GDSC and CCLE. For three other targeted drugs, there were not enough sensitive cell lines to assess the consistency of the pharmacological profiles. We found evidence of inconsistencies in pharmacological phenotypes for the remaining eight drugs. Overall, our findings suggest that the drug sensitivity data in GDSC and CCLE continue to present challenges for robust biomarker discovery. 
This re-analysis provides additional support for the argument that experimental standardization and validation of pharmacogenomic response will be necessary to advance the broad use of large pharmacogenomic screens.
|
80
|
Safikhani Z, Smirnov P, Freeman M, El-Hachem N, She A, Rene Q, Goldenberg A, Birkbak NJ, Hatzis C, Shi L, Beck AH, Aerts HJ, Quackenbush J, Haibe-Kains B. Revisiting inconsistency in large pharmacogenomic studies. F1000Res 2016; 5:2333. [PMID: 28928933 PMCID: PMC5580432 DOI: 10.12688/f1000research.9611.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Accepted: 07/21/2017] [Indexed: 11/13/2023] Open
Abstract
In 2013, we published a comparative analysis of mutation and gene expression profiles and drug sensitivity measurements for 15 drugs characterized in the 471 cancer cell lines screened in the Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE). While we found good concordance in gene expression profiles, there was substantial inconsistency in the drug responses reported by the GDSC and CCLE projects. We received extensive feedback on the comparisons that we performed. This feedback, along with the release of new data, prompted us to revisit our initial analysis. We present a new analysis using these expanded data, where we address the most significant suggestions for improvements on our published analysis - that targeted therapies and broad cytotoxic drugs should have been treated differently in assessing consistency, that consistency of both molecular profiles and drug sensitivity measurements should be compared across cell lines, and that the software analysis tools provided should have been easier to run, particularly as the GDSC and CCLE released additional data. Our re-analysis supports our previous finding that gene expression data are significantly more consistent than drug sensitivity measurements. Using new statistics to assess data consistency allowed identification of two broad effect drugs and three targeted drugs with moderate to good consistency in drug sensitivity data between GDSC and CCLE. For three other targeted drugs, there were not enough sensitive cell lines to assess the consistency of the pharmacological profiles. We found evidence of inconsistencies in pharmacological phenotypes for the remaining eight drugs. Overall, our findings suggest that the drug sensitivity data in GDSC and CCLE continue to present challenges for robust biomarker discovery. 
This re-analysis provides additional support for the argument that experimental standardization and validation of pharmacogenomic response will be necessary to advance the broad use of large pharmacogenomic screens.
|
81
|
Safikhani Z, Smirnov P, Freeman M, El-Hachem N, She A, Rene Q, Goldenberg A, Birkbak NJ, Hatzis C, Shi L, Beck AH, Aerts HJ, Quackenbush J, Haibe-Kains B. Revisiting inconsistency in large pharmacogenomic studies. F1000Res 2016; 5:2333. [PMID: 28928933 PMCID: PMC5580432 DOI: 10.12688/f1000research.9611.1] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Accepted: 09/15/2016] [Indexed: 01/22/2023] Open
Abstract
In 2013, we published a comparative analysis of mutation and gene expression profiles and drug sensitivity measurements for 15 drugs characterized in the 471 cancer cell lines screened in the Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE). While we found good concordance in gene expression profiles, there was substantial inconsistency in the drug responses reported by the GDSC and CCLE projects. We received extensive feedback on the comparisons that we performed. This feedback, along with the release of new data, prompted us to revisit our initial analysis. Here we present a new analysis using these expanded data in which we address the most significant suggestions for improvements on our published analysis - that targeted therapies and broad cytotoxic drugs should have been treated differently in assessing consistency, that consistency of both molecular profiles and drug sensitivity measurements should be compared across cell lines, and that the software analysis tools we provided should have been easier to run, particularly as the GDSC and CCLE released additional data. Our re-analysis supports our previous finding that gene expression data are significantly more consistent than drug sensitivity measurements. The use of new statistics to assess data consistency allowed us to identify two broad effect drugs and three targeted drugs with moderate to good consistency in drug sensitivity data between GDSC and CCLE. For three other targeted drugs, there were not enough sensitive cell lines to assess the consistency of the pharmacological profiles. We found evidence of inconsistencies in pharmacological phenotypes for the remaining eight drugs. Overall, our findings suggest that the drug sensitivity data in GDSC and CCLE continue to present challenges for robust biomarker discovery. 
This re-analysis provides additional support for the argument that experimental standardization and validation of pharmacogenomic response will be necessary to advance the broad use of large pharmacogenomic screens.
|
82
|
Platig J, Castaldi PJ, DeMeo D, Quackenbush J. Bipartite Community Structure of eQTLs. PLoS Comput Biol 2016; 12:e1005033. [PMID: 27618581 PMCID: PMC5019382 DOI: 10.1371/journal.pcbi.1005033] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 01/08/2016] [Accepted: 06/23/2016] [Indexed: 11/18/2022] Open
Abstract
Genome Wide Association Studies (GWAS) and expression quantitative trait locus (eQTL) analyses have identified genetic associations with a wide range of human phenotypes. However, many of these variants have weak effects and understanding their combined effect remains a challenge. One hypothesis is that multiple SNPs interact in complex networks to influence functional processes that ultimately lead to complex phenotypes, including disease states. Here we present CONDOR, a method that represents both cis- and trans-acting SNPs and the genes with which they are associated as a bipartite graph and then uses the modular structure of that graph to place SNPs into a functional context. In applying CONDOR to eQTLs in chronic obstructive pulmonary disease (COPD), we found the global network “hub” SNPs were devoid of disease associations through GWAS. However, the network was organized into 52 communities of SNPs and genes, many of which were enriched for genes in specific functional classes. We identified local hubs within each community (“core SNPs”) and these were enriched for GWAS SNPs for COPD and many other diseases. These results speak to our intuition: rather than single SNPs influencing single genes, we see groups of SNPs associated with the expression of families of functionally related genes and that disease SNPs are associated with the perturbation of those functions. These methods are not limited in their application to COPD and can be used in the analysis of a wide variety of disease processes and other phenotypic traits.
Large-scale studies have identified thousands of genetic variants associated with different phenotypes without explaining their function. Expression quantitative trait locus analysis associates the compendium of genetic variants with expression levels of individual genes, providing the opportunity to link those variants to functions. But the complexity of those associations has caused most analyses to focus solely on genetic variants immediately adjacent to the genes they may influence. We describe a method that embraces the complexity, representing all variant-gene associations as a bipartite graph. The graph contains highly modular, functional communities in which disease-associated variants emerge as those likely to perturb the structure of the network and the function of the genes in these communities.
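To make the idea of modular structure in a SNP-gene bipartite graph concrete, here is a minimal sketch. It assumes Barber's bipartite modularity as the score and treats each SNP's share of that score as a rough stand-in for a "core" ranking; the abstract does not specify either choice, and the function name, identifiers, and community labels below are hypothetical. Community assignments are taken as given rather than optimized.

```python
from collections import defaultdict

def bipartite_modularity(edges, snp_comm, gene_comm):
    """Return (Q, per-SNP contributions) under Barber's bipartite modularity.

    edges is a list of unique (snp, gene) pairs; snp_comm and gene_comm map
    each node to a community label. For SNP i and gene j in the same
    community, the term is A_ij - k_i * d_j / m, where k and d are degrees
    and m is the number of edges.
    """
    m = len(edges)
    snp_deg, gene_deg = defaultdict(int), defaultdict(int)
    for snp, gene in edges:
        snp_deg[snp] += 1
        gene_deg[gene] += 1
    edge_set = set(edges)

    contrib = {}
    for snp, k in snp_deg.items():
        total = 0.0
        for gene, d in gene_deg.items():
            if snp_comm[snp] == gene_comm[gene]:
                observed = 1.0 if (snp, gene) in edge_set else 0.0
                total += observed - k * d / m
        contrib[snp] = total / m
    return sum(contrib.values()), contrib

# Hypothetical toy network: two well-separated communities.
edges = [("rs1", "G1"), ("rs1", "G2"), ("rs2", "G1"),
         ("rs2", "G2"), ("rs3", "G3"), ("rs4", "G3")]
snp_comm = {"rs1": 0, "rs2": 0, "rs3": 1, "rs4": 1}
gene_comm = {"G1": 0, "G2": 0, "G3": 1}

q, core = bipartite_modularity(edges, snp_comm, gene_comm)
print(round(q, 4), core)
```

In practice the community labels would come from a modularity-maximization step, and SNPs with the largest contributions within a community would be the candidates for local hubs.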
|
83
|
Elias KM, Emori MM, Westerling T, Long H, Budina-Kolomets A, Li F, MacDuffie E, Davis MR, Holman A, Lawney B, Freedman ML, Quackenbush J, Brown M, Drapkin R. Epigenetic remodeling regulates transcriptional changes between ovarian cancer and benign precursors. JCI Insight 2016; 1. [PMID: 27617304 DOI: 10.1172/jci.insight.87988] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Indexed: 12/19/2022] Open
Abstract
Regulation of lineage-restricted transcription factors has been shown to influence malignant transformation in several types of cancer. Whether similar mechanisms are involved in ovarian cancer pathogenesis is unknown. PAX8 is a nuclear transcription factor that controls the embryologic development of the Müllerian system, including the fallopian tubes. Recent studies have shown that fallopian tube secretory epithelial cells (FTSECs) give rise to the most common form of ovarian cancer, high-grade serous ovarian carcinomas (HGSOCs). We designed the present study in order to understand whether changes in gene expression between FTSECs and HGSOCs relate to alterations in PAX8 binding to chromatin. Using whole transcriptome shotgun sequencing (RNA-Seq) after PAX8 knockdown and ChIP-Seq, we show that FTSECs and HGSOCs are distinguished by marked reprogramming of the PAX8 cistrome. Genes that are significantly altered between FTSECs and HGSOCs are enriched near PAX8 binding sites. These sites are also near TEAD binding sites, and these transcriptional changes may be related to PAX8 interactions with the TEAD/YAP1 signaling pathway. These data suggest that transcriptional changes after transformation in ovarian cancer are closely related to epigenetic remodeling in lineage-specific transcription factors.
|
84
|
Manimaran S, Selby HM, Okrah K, Ruberman C, Leek JT, Quackenbush J, Haibe-Kains B, Bravo HC, Johnson WE. BatchQC: interactive software for evaluating sample and batch effects in genomic data. Bioinformatics 2016; 32:3836-3838. [PMID: 27540268 PMCID: PMC5167063 DOI: 10.1093/bioinformatics/btw538] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 04/03/2016] [Revised: 08/10/2016] [Accepted: 08/10/2016] [Indexed: 12/02/2022] Open
Abstract
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. There are several existing batch adjustment tools for ‘-omics’ data, but they do not indicate a priori whether adjustment needs to be conducted or how correction should be applied. We present a software pipeline, BatchQC, which addresses these issues using interactive visualizations and statistics that evaluate the impact of batch effects in a genomic dataset. BatchQC can also apply existing adjustment tools and allow users to evaluate their benefits interactively. We used the BatchQC pipeline on both simulated and real data to demonstrate the effectiveness of this software toolkit. Availability and Implementation: BatchQC is available through Bioconductor: http://bioconductor.org/packages/BatchQC and GitHub: https://github.com/mani2012/BatchQC. Contact: wej@bu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
|
85
|
De Nicolo A, Haibe-Kains B, Taşan M, Cusick ME, Vidal M, Quackenbush J, Joukov V, Livingston DM. Analysis of BRCA1-related functional associations in sporadic triple negative breast cancer: A network-based approach. J Clin Oncol 2016. [DOI: 10.1200/jco.2016.34.15_suppl.1070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
|
86
|
Safikhani Z, El-Hachem N, Quevedo R, Smirnov P, Goldenberg A, Juul Birkbak N, Mason C, Hatzis C, Shi L, Aerts HJWL, Quackenbush J, Haibe-Kains B. Assessment of pharmacogenomic agreement. F1000Res 2016; 5:825. [PMID: 27408686 PMCID: PMC4926729 DOI: 10.12688/f1000research.8705.1] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Accepted: 05/03/2016] [Indexed: 11/20/2022] Open
Abstract
In 2013 we published an analysis demonstrating that drug response data and gene-drug associations reported in two independent large-scale pharmacogenomic screens, Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE), were inconsistent. The GDSC and CCLE investigators recently reported that their respective studies exhibit reasonable agreement and yield similar molecular predictors of drug response, seemingly contradicting our previous findings. Reanalyzing the authors' published methods and results, we found that their analysis failed to account for variability in the genomic data and more importantly compared different drug sensitivity measures from each study, which substantially deviate from our more stringent consistency assessment. Our comparison of the most updated genomic and pharmacological data from the GDSC and CCLE confirms our published findings that the measures of drug response reported by these two groups are not consistent. We believe that a principled approach to assess the reproducibility of drug sensitivity predictors is necessary before envisioning their translation into clinical settings.
|
87
|
Wu W, Parmar C, Grossmann P, Quackenbush J, Lambin P, Bussink J, Mak R, Aerts HJWL. Exploratory Study to Identify Radiomics Classifiers for Lung Cancer Histology. Front Oncol 2016; 6:71. [PMID: 27064691 PMCID: PMC4811956 DOI: 10.3389/fonc.2016.00071] [Citation(s) in RCA: 236] [Impact Index Per Article: 29.5] [Received: 11/10/2015] [Accepted: 03/14/2016] [Indexed: 01/05/2023] Open
Abstract
Background Radiomics can quantify tumor phenotypic characteristics non-invasively by applying feature algorithms to medical imaging data. In this study of lung cancer patients, we investigated the association between radiomic features and the tumor histologic subtypes (adenocarcinoma and squamous cell carcinoma). Furthermore, in order to predict histologic subtypes, we employed machine-learning methods and independently evaluated their prediction performance. Methods Two independent radiomic cohorts with a combined size of 350 patients were included in our analysis. A total of 440 radiomic features were extracted from the segmented tumor volumes of pretreatment CT images. These radiomic features quantify tumor phenotypic characteristics on medical images using tumor shape and size, intensity statistics, and texture. Univariate analysis was performed to assess each feature’s association with the histological subtypes. In our multivariate analysis, we investigated 24 feature selection methods and 3 classification methods for histology prediction. Multivariate models were trained on the training cohort and their performance was evaluated on the independent validation cohort using the area under the ROC curve (AUC). Histology was determined from surgical specimens. Results In our univariate analysis, we observed that fifty-three radiomic features were significantly associated with tumor histology. In multivariate analysis, the feature selection method ReliefF and its variants showed higher prediction accuracy than other methods. We found that the naive Bayes classifier outperformed other classifiers and achieved the highest AUC (0.72; p-value = 2.3 × 10⁻⁷) with five features: Stats_min, Wavelet_HLL_rlgl_lowGrayLevelRunEmphasis, Wavelet_HHL_stats_median, Wavelet_HLL_stats_skewness, and Wavelet_HLH_glcm_clusShade. Conclusion Histological subtypes can influence the choice of a treatment/therapy for lung cancer patients. 
We observed that radiomic features show significant association with the lung tumor histology. Moreover, radiomics-based multivariate classifiers were independently validated for the prediction of histological subtypes. Despite achieving lower than optimal prediction accuracy (AUC 0.72), our analysis highlights the impressive potential of non-invasive and cost-effective radiomics for precision medicine. Further research in this direction could lead us to optimal performance and therefore to clinical applicability, which could enhance the efficiency and efficacy of cancer care.
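The AUC used above to score classifiers on the validation cohort can be illustrated with a short sketch. It uses the rank-based definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half); the labels and scores are made-up values, and the function name `roc_auc` is a hypothetical helper, not code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels are 0/1 class indicators; scores are classifier outputs where a
    higher value means "more likely positive". Ties contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical validation-set predictions (1 = one histologic subtype, 0 = the other).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.72 should be read.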
|
88
|
Delaney SK, Hultner ML, Jacob HJ, Ledbetter DH, McCarthy JJ, Ball M, Beckman KB, Belmont JW, Bloss CS, Christman MF, Cosgrove A, Damiani SA, Danis T, Delledonne M, Dougherty MJ, Dudley JT, Faucett WA, Friedman JR, Haase DH, Hays TS, Heilsberg S, Huber J, Kaminsky L, Ledbetter N, Lee WH, Levin E, Libiger O, Linderman M, Love RL, Magnus DC, Martland A, McClure SL, Megill SE, Messier H, Nussbaum RL, Palaniappan L, Patay BA, Popovich BW, Quackenbush J, Savant MJ, Su MM, Terry SF, Tucker S, Wong WT, Green RC. Toward clinical genomics in everyday medicine: perspectives and recommendations. Expert Rev Mol Diagn 2016; 16:521-32. [PMID: 26810587 PMCID: PMC4841021 DOI: 10.1586/14737159.2016.1146593] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 11/08/2022]
Abstract
Precision or personalized medicine through clinical genome and exome sequencing has been described by some as a revolution that could transform healthcare delivery, yet it is currently used in only a small fraction of patients, principally for the diagnosis of suspected Mendelian conditions and for targeting cancer treatments. Given the burden of illness in our society, it is of interest to ask how clinical genome and exome sequencing can be constructively integrated more broadly into the routine practice of medicine for the betterment of public health. In November 2014, 46 experts from academia, industry, policy and patient advocacy gathered in a conference sponsored by Illumina, Inc. to discuss this question, share viewpoints and propose recommendations. This perspective summarizes that work and identifies some of the obstacles and opportunities that must be considered in translating advances in genomics more widely into the practice of medicine.
|
89
|
Cloonan SM, Glass K, Laucho-Contreras ME, Bhashyam AR, Cervo M, Pabón MA, Konrad C, Polverino F, Siempos II, Perez E, Mizumura K, Ghosh MC, Parameswaran H, Williams NC, Rooney KT, Chen ZH, Goldklang MP, Yuan GC, Moore SC, Demeo DL, Rouault TA, D’Armiento JM, Schon EA, Manfredi G, Quackenbush J, Mahmood A, Silverman EK, Owen CA, Choi AM. Mitochondrial iron chelation ameliorates cigarette smoke-induced bronchitis and emphysema in mice. Nat Med 2016; 22:163-74. [PMID: 26752519 PMCID: PMC4742374 DOI: 10.1038/nm.4021] [Citation(s) in RCA: 168] [Impact Index Per Article: 21.0] [Received: 10/07/2015] [Accepted: 12/01/2015] [Indexed: 12/20/2022]
Abstract
Chronic obstructive pulmonary disease (COPD) is linked to both cigarette smoking and genetic determinants. We have previously identified iron-responsive element-binding protein 2 (IRP2) as an important COPD susceptibility gene and have shown that IRP2 protein is increased in the lungs of individuals with COPD. Here we demonstrate that mice deficient in Irp2 were protected from cigarette smoke (CS)-induced experimental COPD. By integrating RNA immunoprecipitation followed by sequencing (RIP-seq), RNA sequencing (RNA-seq), and gene expression and functional enrichment clustering analysis, we identified Irp2 as a regulator of mitochondrial function in the lungs of mice. Irp2 increased mitochondrial iron loading and levels of cytochrome c oxidase (COX), which led to mitochondrial dysfunction and subsequent experimental COPD. Frataxin-deficient mice, which had higher mitochondrial iron loading, showed impaired airway mucociliary clearance (MCC) and higher pulmonary inflammation at baseline, whereas mice deficient in the synthesis of cytochrome c oxidase, which have reduced COX, were protected from CS-induced pulmonary inflammation and impairment of MCC. Mice treated with a mitochondrial iron chelator or mice fed a low-iron diet were protected from CS-induced COPD. Mitochondrial iron chelation also alleviated CS-induced impairment of MCC, CS-induced pulmonary inflammation and CS-associated lung injury in mice with established COPD, suggesting a critical functional role and potential therapeutic intervention for the mitochondrial-iron axis in COPD.
MESH Headings
- Aged
- Aged, 80 and over
- Airway Remodeling
- Animals
- Bronchitis/etiology
- Bronchitis/genetics
- Disease Models, Animal
- Electron Transport Complex IV/metabolism
- Electrophoretic Mobility Shift Assay
- Enzyme-Linked Immunosorbent Assay
- Flow Cytometry
- Gene Expression Profiling
- Humans
- Immunoblotting
- Immunohistochemistry
- Immunoprecipitation
- Iron/metabolism
- Iron Chelating Agents/pharmacology
- Iron Regulatory Protein 2/genetics
- Iron Regulatory Protein 2/metabolism
- Iron, Dietary
- Iron-Binding Proteins/genetics
- Lung/drug effects
- Lung/metabolism
- Lung Injury/etiology
- Lung Injury/genetics
- Membrane Potential, Mitochondrial
- Mice
- Mice, Knockout
- Microscopy, Confocal
- Microscopy, Electron, Transmission
- Microscopy, Fluorescence
- Mitochondria/drug effects
- Mitochondria/metabolism
- Mucociliary Clearance/genetics
- Pneumonia/etiology
- Pneumonia/genetics
- Pulmonary Disease, Chronic Obstructive/etiology
- Pulmonary Disease, Chronic Obstructive/genetics
- Pulmonary Disease, Chronic Obstructive/metabolism
- Pulmonary Emphysema/etiology
- Pulmonary Emphysema/genetics
- Real-Time Polymerase Chain Reaction
- Smoke/adverse effects
- Smoking/adverse effects
- Nicotiana
- Frataxin
|
90
|
Malleshaiah M, Padi M, Rué P, Quackenbush J, Martinez-Arias A, Gunawardena J. Nac1 Coordinates a Sub-network of Pluripotency Factors to Regulate Embryonic Stem Cell Differentiation. Cell Rep 2016; 14:1181-1194. [PMID: 26832399 DOI: 10.1016/j.celrep.2015.12.101] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 06/19/2015] [Revised: 10/19/2015] [Accepted: 12/23/2015] [Indexed: 12/15/2022] Open
Abstract
Pluripotent cells give rise to distinct cell types during development and are regulated by often self-reinforcing molecular networks. How such networks allow cells to differentiate is less well understood. Here, we use integrative methods to show that external signals induce reorganization of the mouse embryonic stem cell pluripotency network and that a sub-network of four factors, Nac1, Oct4, Tcf3, and Sox2, regulates their differentiation into the alternative mesendodermal and neuroectodermal fates. In the mesendodermal fate, Nac1 and Oct4 were constrained within quantitative windows, whereas Sox2 and Tcf3 were repressed. In contrast, in the neuroectodermal fate, Sox2 and Tcf3 were constrained while Nac1 and Oct4 were repressed. In addition, we show that Nac1 coordinates differentiation by activating Oct4 and inhibiting both Sox2 and Tcf3. Reorganization of progenitor cell networks around shared factors might be a common differentiation strategy and our integrative approach provides a general methodology for delineating such networks.
|
91
|
Kuijjer ML, Glass K, Quackenbush J. Abstract B1-20: Gene regulation by transcription factors and microRNAs in ovarian cancer. Cancer Res 2015. [DOI: 10.1158/1538-7445.compsysbio-b1-20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022]
Abstract
Conventional methods to analyze genomic data do not make use of the connectivity between different data types, such as transcriptional regulation and gene expression, and therefore often fail to identify the cellular processes that are unique to cancer and cancer subtypes. An example is the four recently characterized high-grade serous ovarian cancer transcriptomic subtypes: differentiated, immunoreactive, mesenchymal, and proliferative ovarian cancer. These subtypes have not been associated with significant differences in survival, and their discovery has not led to the identification of subtype-specific therapies. Uncovering the regulatory mechanisms mediating differences in expression between these subtypes may identify new therapeutic interventions, and ultimately help cancer patients.
We modeled regulatory networks of the four ovarian cancer subtypes using PANDA, a network inference approach that uses genomic data to search for an optimal network by modeling information flow between regulators and target genes. Because not only transcription factors, but also microRNAs play an important role in gene regulation, we modified the PANDA algorithm to account for the regulatory effects of microRNAs in addition to transcription factors (miR-PANDA, in preparation). We compared the networks defining gene regulation in each subtype using different network comparison metrics. We observed a very striking pattern in out-degree differences that suggests transcription factors and microRNAs play a major role in driving the different subtypes. Using gene set enrichment analysis on in-degree differences of target genes, we identified several cancer-related pathways that are highly targeted in specific subtypes, such as regulation of Notch signaling by microRNAs in the immunoreactive subtype, and regulation of Wnt signaling by transcription factors in the proliferative subtype. These results may point to new therapeutic interventions and advance personalized treatments for ovarian cancer patients.
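The degree comparison described above can be illustrated with a minimal sketch. It is not the PANDA or miR-PANDA implementation: the edge-weight dictionaries, node names, and helper functions below are hypothetical, and the sketch only assumes that each subtype network is available as weighted regulator-to-target edges.

```python
# Toy illustration of in-degree / out-degree differences between two networks.
# All names and weights are hypothetical; this does not run PANDA itself.

def out_degree(edges):
    """Sum of outgoing edge weights per regulator (TF or microRNA)."""
    deg = {}
    for (regulator, _target), weight in edges.items():
        deg[regulator] = deg.get(regulator, 0.0) + weight
    return deg

def in_degree(edges):
    """Sum of incoming edge weights per target gene."""
    deg = {}
    for (_regulator, target), weight in edges.items():
        deg[target] = deg.get(target, 0.0) + weight
    return deg

def degree_difference(deg_a, deg_b):
    """Per-node degree in network A minus degree in network B."""
    nodes = set(deg_a) | set(deg_b)
    return {n: deg_a.get(n, 0.0) - deg_b.get(n, 0.0) for n in nodes}

# Hypothetical networks for two subtypes: (regulator, target) -> edge weight.
net_a = {("TF1", "G1"): 1.0, ("TF1", "G2"): 0.5, ("miR1", "G2"): 0.25}
net_b = {("TF1", "G1"): 0.5, ("miR1", "G1"): 1.0, ("miR1", "G2"): 0.75}

out_diff = degree_difference(out_degree(net_a), out_degree(net_b))
in_diff = degree_difference(in_degree(net_a), in_degree(net_b))
print(out_diff)
print(in_diff)
```

In an analysis of this kind, the in-degree differences per target gene would be the ranking passed to gene set enrichment analysis, while out-degree differences highlight regulators whose targeting changes between subtypes.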
Citation Format: Marieke Lydia Kuijjer, Kimberly Glass, John Quackenbush. Gene regulation by transcription factors and microRNAs in ovarian cancer. [abstract]. In: Proceedings of the AACR Special Conference on Computational and Systems Biology of Cancer; Feb 8-11 2015; San Francisco, CA. Philadelphia (PA): AACR; Cancer Res 2015;75(22 Suppl 2):Abstract nr B1-20.
|
92
|
Padi M, Quackenbush J. Integrating transcriptional and protein interaction networks to prioritize condition-specific master regulators. BMC SYSTEMS BIOLOGY 2015; 9:80. [PMID: 26576632 PMCID: PMC4650867 DOI: 10.1186/s12918-015-0228-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 07/23/2015] [Accepted: 11/03/2015] [Indexed: 12/20/2022]
Abstract
BACKGROUND Genome-wide libraries of yeast deletion strains have been used to screen for genes that drive phenotypes such as stress response. A surprising observation emerging from these studies is that the genes with the largest changes in mRNA expression during a state transition are not those that drive that transition. Here, we show that integrating gene expression data with context-independent protein interaction networks can help prioritize master regulators that drive biological phenotypes. RESULTS Genes essential for survival had previously been shown to exhibit high centrality in protein interaction networks. However, the set of genes that drive growth in any specific condition is highly context-dependent. We inferred regulatory networks from gene expression data and transcription factor binding motifs in Saccharomyces cerevisiae, and found that high-degree nodes in regulatory networks are enriched for transcription factors that drive the corresponding phenotypes. We then found that using a metric combining protein interaction and transcriptional networks improved the enrichment for drivers in many of the contexts we examined. We applied this principle to a dataset of gene expression in normal human fibroblasts expressing a panel of viral oncogenes. We integrated regulatory interactions inferred from this data with a database of yeast two-hybrid protein interactions and ranked 571 human transcription factors by their combined network score. The ranked list was significantly enriched in known cancer genes that could not be found by standard differential expression or enrichment analyses. CONCLUSIONS There has been increasing recognition that network-based approaches can provide insight into critical cellular elements that help define phenotypic state. Our analysis suggests that no one network, based on a single data type, captures the full spectrum of interactions. Greater insight can instead be gained by exploring multiple independent networks and by choosing an appropriate metric on each network. Moreover, we can improve our ability to rank phenotypic drivers by combining the information from individual networks. We propose that such integrative network analysis could be used to combine clinical gene expression data with interaction databases to prioritize patient- and disease-specific therapeutic targets.
|
93
|
De Rienzo A, Archer MA, Yeap BY, Dao N, Sciaranghella D, Sideris AC, Zheng Y, Holman AG, Wang YE, Dal Cin PS, Fletcher JA, Rubio R, Croft L, Quackenbush J, Sugarbaker PE, Munir KJ, Battilana JR, Gustafson CE, Chirieac LR, Ching SM, Wong J, Tay LC, Rudd S, Hercus R, Sugarbaker DJ, Richards WG, Bueno R. Gender-Specific Molecular and Clinical Features Underlie Malignant Pleural Mesothelioma. Cancer Res 2015; 76:319-28. [PMID: 26554828 DOI: 10.1158/0008-5472.can-15-0751] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 03/17/2015] [Accepted: 10/19/2015] [Indexed: 12/29/2022]
Abstract
Malignant pleural mesothelioma (MPM) is an aggressive cancer that occurs more frequently in men, but is associated with longer survival in women. Insight into the survival advantage of female patients may advance the molecular understanding of MPM and identify therapeutic interventions that will improve the prognosis for all MPM patients. In this study, we performed whole-genome sequencing of tumor specimens from 10 MPM patients and matched control samples to identify potential driver mutations underlying MPM. We identified molecular differences associated with gender and histology. Specifically, single-nucleotide variants of BAP1 were observed in 21% of cases, with lower mutation rates observed in sarcomatoid MPM (P < 0.001). Chromosome 22q loss was more frequently associated with the epithelioid than with the nonepithelioid histology (P = 0.037), whereas CDKN2A deletions occurred more frequently in nonepithelioid subtypes among men (P = 0.021) and were correlated with shorter overall survival for the entire cohort (P = 0.002) and for men (P = 0.012). Furthermore, women were more likely to harbor TP53 mutations (P = 0.004). Novel mutations were found in genes associated with the integrin-linked kinase pathway, including MYH9 and RHOA. Moreover, expression levels of BAP1, MYH9, and RHOA were significantly higher in nonepithelioid tumors, and were associated with significant reduction in survival of the entire cohort and across gender subgroups. Collectively, our findings indicate that diverse mechanisms highly related to gender and histology appear to drive MPM.
|
94
|
Jirawatnotai S, Sharma S, Michowski W, Suktitipat B, Geng Y, Quackenbush J, Elias JE, Gygi SP, Wang YE, Sicinski P. The cyclin D1-CDK4 oncogenic interactome enables identification of potential novel oncogenes and clinical prognosis. Cell Cycle 2015; 13:2889-900. [PMID: 25486477 DOI: 10.4161/15384101.2014.946850] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/19/2022] Open
Abstract
Overexpression of cyclin D1 and its catalytic partner, CDK4, is frequently seen in human cancers. We constructed a cyclin D1 and CDK4 protein interaction network in the human breast cancer cell line MCF7, and identified novel CDK4 protein partners. Among CDK4 interactors we observed several proteins functioning in protein folding and in complex assembly. One of the novel partners of CDK4 is FKBP5, which we found to be required to maintain CDK4 levels in cancer cells. An integrative analysis of the extended cyclin D1 cancer interactome and somatic copy number alterations in human cancers identified BAIAP2L1 as a potential novel human oncogene. We observed that in several human tumor types BAIAP2L1 is expressed at higher levels than in normal tissue. Forced overexpression of BAIAP2L1 augmented anchorage-independent growth, increased colony formation by cancer cells, and strongly enhanced the ability of cells to form tumors in vivo. Lastly, we derived an Aggregate Expression Score (AES), which quantifies the expression of all cyclin D1 interactors in a given tumor. We observed that AES has prognostic value among patients with ER-positive breast cancers. These studies illustrate the utility of analyzing the interactomes of proteins involved in cancer to uncover potential oncogenes, or to allow better cancer prognosis.
Key Words
- ACN, acetonitrile
- AES, aggregate expression score
- ATCC, American type culture collection
- CDK4
- DMEM, Dulbecco's Modified Eagle's medium
- FBS, fetal bovine serum
- LC-MS/MS, liquid chromatography-tandem mass spectrometry
- PPI, protein-protein interaction
- RPMI, Roswell Park Memorial Institute medium
- SCNA, somatic copy-number variation
- TCGA, the cancer genome atlas
- WB, immunoblotting
- breast cancer
- cyclin D1
- interactome
- oncogenes
- oncogenic signature
- siFKBP4, FKBP4-specific small interfering RNA
- siFKBP5, FKBP5-specific small interfering RNA
- siRNA, small interfering RNA
- sicont, control small interfering RNA
- sicyclin D1, cyclin D1-specific small interfering RNA
|
95
|
Scherer D, Toth R, Kelemen L, Risch A, Hazra A, Issa JP, Moreno V, Eeles RA, Quackenbush J, Goode EL, Ogino S, Hung R, Ulrich CM. Abstract 4612: Genetic variants in epigenetic pathways and risk of multiple cancer types in the GAME-ON consortium. Cancer Res 2015. [DOI: 10.1158/1538-7445.am2015-4612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Introduction
Epigenetic changes are reversible features of the genome that regulate gene transcription and protein expression on several levels including DNA methylation, histone modification or miRNA expression. We investigated the association between inherited variation in genes of key epigenetic processes and risk of multiple cancers within the GAME-ON consortium.
Methods
We performed a pathway-based meta-analysis using genotypes from more than 50,000 cases of breast, lung, prostate, ovarian, and colorectal cancer and more than 60,000 controls from various genome-wide association studies participating in the GAME-ON consortium to estimate associations with cancer risk. Using the 1000 Genomes Project database, we selected 505,702 genotyped and imputed single-nucleotide polymorphisms in 551 genes (flanking region ±250 kb) related to DNA methylation, histone modification, or chromatin remodeling based on the GO and GeneCards databases. To allow variants to be associated with only a subset of traits, we used subset-based meta-analysis. False-discovery rate (FDR)-corrected p-values (q-values) lower than 0.05 were considered significant.
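As an illustration of the significance rule in this paragraph, the sketch below converts p-values to FDR-adjusted q-values and applies the 0.05 cutoff. The abstract does not state which FDR procedure was used; the Benjamini-Hochberg step-up adjustment here is an assumption, and the function name and p-values are hypothetical.

```python
# Benjamini-Hochberg q-values (assumed procedure; the abstract only says
# "FDR corrected"). Input p-values below are made up for illustration.

def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

pvals = [0.01, 0.04, 0.03, 0.5]
qvals = bh_qvalues(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
print(qvals)
print(significant)
```

Only the first variant passes the threshold in this toy input: the second and third have raw p-values below 0.05 but adjusted q-values above it.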
Results and Discussion
582 SNPs were significantly associated with risk of at least one cancer. We identified nine major regions that showed significant associations with more than one cancer type.
Among the most interesting regions was the region around PHC3 (3q36), which showed associations with prostate and colorectal cancer and clear cell ovarian carcinomas. PHC3 is involved in chromatin remodeling and plays a role in epithelial neoplasms. Significant odds ratios (ORs) ranged from 0.80 to 1.31. Risk and protective alleles in this region were similar in number (19 and 18, respectively). One of the strongest associations was observed for rs76925190 (intronic in PRKC1), which increased the risk of colorectal and prostate cancer (q-value 4.28 × 10⁻¹⁰). Variants in this region were previously associated with prostate cancer.
Polymorphisms in the region (19q13) around BABAM1 (BRISC and BRCA1 A complex member 1) were associated with lung, breast, ovarian, and prostate cancer. BABAM1 is associated with the BRCA1 complex. Its function in histone modification and DNA repair emphasizes its importance in carcinogenesis. Significant ORs ranged from 0.88 to 1.14, with risk and protective alleles in this region similar in number (19 and 17, respectively). The strongest association was observed for rs4808076 (intronic in ANKLE1), which increased the risk of squamous lung, serous ovarian, and ER-negative breast cancer (q-value 2.40 × 10⁻⁶). Variants in this region were previously associated with risk of breast and ovarian cancer.
Conclusions
This study emphasizes the importance of variants in genes of epigenetic processes on cancer risk and further provides insights into novel, pleiotropic epigenetic mechanisms of cancer development.
Citation Format: Dominique Scherer, Reka Toth, Linda Kelemen, Angela Risch, Aditi Hazra, Jean Pierre Issa, Victor Moreno, Rosalind A. Eeles, John Quackenbush, Ellen L. Goode, Shuji Ogino, Rayjean Hung, Cornelia M. Ulrich. Genetic variants in epigenetic pathways and risk of multiple cancer types in the GAME-ON consortium. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 4612. doi:10.1158/1538-7445.AM2015-4612
|
96
|
Olsen C, Fleming K, Prendergast N, Rubio R, Emmert-Streib F, Bontempi G, Quackenbush J, Haibe-Kains B. Using shRNA experiments to validate gene regulatory networks. GENOMICS DATA 2015; 4:123-6. [PMID: 26484195 PMCID: PMC4535466 DOI: 10.1016/j.gdata.2015.03.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2015] [Revised: 03/23/2015] [Accepted: 03/23/2015] [Indexed: 11/26/2022]
Abstract
Quantitative validation of gene regulatory networks (GRNs) inferred from observational expression data is a difficult task usually involving time intensive and costly laboratory experiments. We were able to show that gene knock-down experiments can be used to quantitatively assess the quality of large-scale GRNs via a purely data-driven approach (Olsen et al. 2014). Our new validation framework also enables the statistical comparison of multiple network inference techniques, which was a long-standing challenge in the field. In this Data in Brief we detail the contents and quality controls for the gene expression data (available from NCBI Gene Expression Omnibus repository with accession number GSE53091) associated with our study published in Genomics (Olsen et al. 2014). We also provide R code to access the data and reproduce the analysis presented in this article.
|
97
|
Glass K, Quackenbush J, Spentzos D, Haibe-Kains B, Yuan GC. A network model for angiogenesis in ovarian cancer. BMC Bioinformatics 2015; 16:115. [PMID: 25888305 PMCID: PMC4408593 DOI: 10.1186/s12859-015-0551-y] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 10/27/2014] [Accepted: 03/25/2015] [Indexed: 12/31/2022] Open
Abstract
Background We recently identified two robust ovarian cancer subtypes, defined by the expression of genes involved in angiogenesis, with significant differences in clinical outcome. To identify potential regulatory mechanisms that distinguish the subtypes we applied PANDA, a method that uses an integrative approach to model information flow in gene regulatory networks. Results We find distinct differences between networks that are active in the angiogenic and non-angiogenic subtypes, largely defined by a set of key transcription factors that, although previously reported to play a role in angiogenesis, are not strongly differentially-expressed between the subtypes. Our network analysis indicates that these factors are involved in the activation (or repression) of different genes in the two subtypes, resulting in differential expression of their network targets. Mechanisms mediating differences between subtypes include a previously unrecognized pro-angiogenic role for increased genome-wide DNA methylation and complex patterns of combinatorial regulation. Conclusions The models we develop require a shift in our interpretation of the driving factors in biological networks away from the genes themselves and toward their interactions. The observed regulatory changes between subtypes suggest therapeutic interventions that may help in the treatment of ovarian cancer.
|
98
|
Vargas A, Quackenbush J, Glass K. Diet‐induced Weight Loss Changes in Gene Regulatory Networks in the Rectum: Network Analysis as a Compliment to Traditional Gene Expression Analysis. FASEB J 2015. [DOI: 10.1096/fasebj.29.1_supplement.275.5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
99
|
Campbell JD, Liu G, Luo L, Xiao J, Gerrein J, Juan-Guardela B, Tedrow J, Alekseyev YO, Yang IV, Correll M, Geraci M, Quackenbush J, Sciurba F, Schwartz DA, Kaminski N, Johnson WE, Monti S, Spira A, Beane J, Lenburg ME. Assessment of microRNA differential expression and detection in multiplexed small RNA sequencing data. RNA (NEW YORK, N.Y.) 2015; 21:164-71. [PMID: 25519487 PMCID: PMC4338344 DOI: 10.1261/rna.046060.114] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 05/02/2023]
Abstract
Small RNA sequencing can be used to gain an unprecedented amount of detail into the microRNA transcriptome. The relatively high cost and low throughput of sequencing-based technologies can potentially be offset by the use of multiplexing. However, multiplexing involves a trade-off between increased number of sequenced samples and reduced number of reads per sample (i.e., lower depth of coverage). To assess the effect of different sequencing depths owing to multiplexing on microRNA differential expression and detection, we sequenced the small RNA of lung tissue samples collected in a clinical setting by multiplexing one, three, six, nine, or 12 samples per lane using the Illumina HiSeq 2000. As expected, the numbers of reads obtained per sample decreased as the number of samples in a multiplex increased. Furthermore, after normalization, replicate samples included in distinct multiplexes were highly correlated (R > 0.97). When detecting differential microRNA expression between groups of samples, microRNAs with average expression >1 reads per million (RPM) had reproducible fold change estimates (signal to noise) independent of the degree of multiplexing. The number of microRNAs detected was strongly correlated with the log2 number of reads aligning to microRNA loci (R = 0.96). However, most additional microRNAs detected in samples with greater sequencing depth were in the range of expression which had lower fold change reproducibility. These findings elucidate the trade-off between increasing the number of samples in a multiplex and decreasing sequencing depth and will aid in the design of large-scale clinical studies exploring microRNA expression and its role in disease.
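The reads-per-million threshold mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the microRNA names and counts are hypothetical, and the sketch assumes RPM is computed against the total reads assigned to microRNAs in each sample.

```python
# Reads-per-million (RPM) normalization and the ">1 RPM on average" filter.
# Sample counts and names are hypothetical.

def reads_per_million(counts):
    """Scale raw counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no aligned reads")
    return {name: count * 1_000_000 / total for name, count in counts.items()}

def mean_rpm(samples):
    """Average RPM per microRNA across samples (missing entries count as 0)."""
    rpms = [reads_per_million(sample) for sample in samples]
    names = set().union(*rpms)
    return {n: sum(r.get(n, 0.0) for r in rpms) / len(rpms) for n in names}

samples = [
    {"miR-a": 1_500_000, "miR-b": 499_999, "miR-c": 1},
    {"miR-a": 1_000_000, "miR-b": 999_999, "miR-c": 1},
]

average = mean_rpm(samples)
# Keep microRNAs above the 1 RPM average-expression threshold.
reliable = sorted(name for name, rpm in average.items() if rpm > 1)
print(reliable)
```

Here the hypothetical miR-c averages 0.5 RPM and is filtered out, which mirrors the abstract's point that low-abundance microRNAs are the ones whose fold-change estimates depend on sequencing depth.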
|
100
|
Yang IV, Pedersen BS, Rabinovich E, Hennessy CE, Davidson EJ, Murphy E, Guardela BJ, Tedrow JR, Zhang Y, Singh MK, Correll M, Schwarz MI, Geraci M, Sciurba FC, Quackenbush J, Spira A, Kaminski N, Schwartz DA. Relationship of DNA methylation and gene expression in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med 2014; 190:1263-72. [PMID: 25333685 PMCID: PMC4315819 DOI: 10.1164/rccm.201408-1452oc] [Citation(s) in RCA: 124] [Impact Index Per Article: 12.4] [Received: 08/11/2014] [Accepted: 10/17/2014] [Indexed: 11/16/2022] Open
Abstract
RATIONALE Idiopathic pulmonary fibrosis (IPF) is an untreatable and often fatal lung disease that is increasing in prevalence and is caused by complex interactions between genetic and environmental factors. Epigenetic mechanisms control gene expression and are likely to regulate the IPF transcriptome. OBJECTIVES To identify methylation marks that modify gene expression in IPF lung. METHODS We assessed DNA methylation (comprehensive high-throughput arrays for relative methylation arrays [CHARM]) and gene expression (Agilent gene expression arrays) in 94 patients with IPF and 67 control subjects, and performed integrative genomic analyses to define methylation-gene expression relationships in IPF lung. We validated methylation changes by a targeted analysis (Epityper), and performed functional validation of one of the genes identified by our analysis. MEASUREMENTS AND MAIN RESULTS We identified 2,130 differentially methylated regions (DMRs; <5% false discovery rate), of which 738 are associated with significant changes in gene expression and enriched for expected inverse relationship between methylation and expression (P < 2.2 × 10⁻¹⁶). We validated 13/15 DMRs by targeted analysis of methylation. Methylation-expression quantitative trait loci (methyl-eQTL) identified methylation marks that control cis and trans gene expression, with an enrichment for cis relationships (P < 2.2 × 10⁻¹⁶). We found five trans methyl-eQTLs where a methylation change at a single DMR is associated with transcriptional changes in a substantial number of genes; four of these DMRs are near transcription factors (castor zinc finger 1 [CASZ1], FOXC1, MXD4, and ZDHHC4). We studied the in vitro effects of change in CASZ1 expression and validated its role in regulation of target genes in the methyl-eQTL. CONCLUSIONS These results suggest that DNA methylation may be involved in the pathogenesis of IPF.
|