1
He QE, Zhu JX, Wang LY, Ding EC, Song K. DNA methylation loci identification for pan-cancer early-stage diagnosis and prognosis using a new distributed parallel partial least squares method. Front Genet 2022; 13:940214. [PMID: 36338981 PMCID: PMC9626520 DOI: 10.3389/fgene.2022.940214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 09/30/2022] [Indexed: 11/17/2022] Open
Abstract
Aberrant methylation is one of the early detectable events in many tumors, which is very promising for pan-cancer early-stage diagnosis and prognosis. To efficiently analyze the big pan-cancer methylation data and to overcome the co-methylation phenomenon, a MapReduce-based distributed and parallel-designed partial least squares approach was proposed. The large-scale high-dimensional methylation data were first decomposed into distributed blocks according to their genome locations. A distributed and parallel data processing strategy was proposed based on the framework of MapReduce, and then latent variables were further extracted for each distributed block. A set of pan-cancer signatures through a differential co-expression network followed by statistical tests was further identified based on their gene expression profiles. In total, 15 TCGA and 3 GEO datasets were used as the training and testing data, respectively, to verify our method. As a result, 22,000 potential methylation loci were selected as highly related loci with early-stage pan-cancer diagnosis. Of these, 67 methylation loci were further identified as pan-cancer signatures considering their gene expression as well. The survival analysis as well as pathway enrichment analysis on them shows that not only these loci may serve as potential drug targets, but also the proposed method may serve as a uniform framework for signature identification with big data.
Affiliation(s)
- Qi-en He
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Jun-xuan Zhu
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Li-yan Wang
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- En-ci Ding
- Tianjin First Central Hospital, Tianjin, China
- Kai Song
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- *Correspondence: Kai Song,
2
Yue R, Dutta A. Computational systems biology in disease modeling and control, review and perspectives. NPJ Syst Biol Appl 2022; 8:37. [PMID: 36192551 PMCID: PMC9528884 DOI: 10.1038/s41540-022-00247-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/30/2022] [Accepted: 09/05/2022] [Indexed: 02/02/2023] Open
Abstract
Omics-based approaches have become increasingly influential in identifying disease mechanisms and drug responses. Considering that diseases and drug responses are co-expressed and regulated in the relevant omics data interactions, the traditional way of grabbing omics data from single isolated layers cannot always obtain valuable inference. Also, drugs have adverse effects that may impair patients, and launching new medicines for diseases is costly. To resolve the above difficulties, systems biology is applied to predict potential molecular interactions by integrating omics data from genomic, proteomic, transcriptional, and metabolic layers. Combined with known drug reactions, the resulting models improve medicines' therapeutical performance by re-purposing the existing drugs and combining drug molecules without off-target effects. Based on the identified computational models, drug administration control laws are designed to balance toxicity and efficacy. This review introduces biomedical applications and analyses of interactions among gene, protein and drug molecules for modeling disease mechanisms and drug responses. The therapeutical performance can be improved by combining the predictive and computational models with drug administration designed by control laws. The challenges are also discussed for its clinical uses in this work.
Affiliation(s)
- Rongting Yue
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA.
- Abhishek Dutta
- Department of Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT, 06269, USA
3
Li YK, Hsu HM, Lin MC, Chang CW, Chu CM, Chang YJ, Yu JC, Chen CT, Jian CE, Sun CA, Chen KH, Kuo MH, Cheng CS, Chang YT, Wu YS, Wu HY, Yang YT, Lin C, Lin HC, Hu JM, Chang YT. Genetic co-expression networks contribute to creating predictive model and exploring novel biomarkers for the prognosis of breast cancer. Sci Rep 2021; 11:7268. [PMID: 33790307 PMCID: PMC8012617 DOI: 10.1038/s41598-021-84995-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/01/2020] [Accepted: 02/02/2021] [Indexed: 12/14/2022] Open
Abstract
Genetic co-expression network (GCN) analysis augments the understanding of breast cancer (BC). We aimed to propose GCN-based modeling for BC relapse-free survival (RFS) prediction and to discover novel biomarkers. We used GCN and Cox proportional hazard regression to create various prediction models using mRNA microarray of 920 tumors and conduct external validation using independent data of 1056 tumors. GCNs of 34 identified candidate genes were plotted in various sizes. Compared to the reference model, the genetic predictors selected from bigger GCNs composed better prediction models. The prediction accuracy and AUC of 3 ~ 15-year RFS are 71.0-81.4% and 74.6-78% respectively (rfm, ACC 63.2-65.5%, AUC 61.9-74.9%). The hazard ratios of risk scores of developing relapse ranged from 1.89 ~ 3.32 (p < 10⁻⁸) over all models under the control of the node status. External validation showed the consistent finding. We found top 12 co-expressed genes are relative new or novel biomarkers that have not been explored in BC prognosis or other cancers until this decade. GCN-based modeling creates better prediction models and facilitates novel genes exploration on BC prognosis.
Affiliation(s)
- Yuan-Kuei Li
- Division of Colorectal Surgery, Department of Surgery, Taoyuan Armed Forces General Hospital, Taoyuan, Taiwan; Department of Biomedical Sciences and Engineering, National Central University, Taoyuan, Taiwan
- Huan-Ming Hsu
- Division of General Surgery, Department of Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; Department of Surgery, Songshan Branch of Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, 11490, Taiwan
- Meng-Chiung Lin
- Division of Gastroenterology, Department of Medicine, Taichung Armed Forces General Hospital, Taichung, Taiwan
- Chi-Wen Chang
- School of Nursing, College of Medicine, Chang Gung University, Taoyuan, Taiwan; Department of Pediatrics, Chang Gung Memorial Hospital, Taoyuan, Taiwan; Department of Nursing, Chang Gung Memorial Hospital, Tao-Yuan, Taiwan
- Chi-Ming Chu
- Division of Medical Informatics, Department of Epidemiology, School of Public Health, National Defense Medical Center, Taipei, Taiwan; Big Data Research Center, College of Medicine, Fu-Jen Catholic University, New Taipei City, Taiwan; Department of Public Health, College of Medicine, Fu-Jen Catholic University, New Taipei City, Taiwan; Department of Public Health, China Medical University, Taichung City, Taiwan; Department of Healthcare Administration and Medical Informatics College of Health Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan
- Yu-Jia Chang
- Graduate Institute of Clinical Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Cell Physiology and Molecular Image Research Center, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
- Jyh-Cherng Yu
- Division of General Surgery, Department of Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Chien-Ting Chen
- Division of Medical Informatics, Department of Epidemiology, School of Public Health, National Defense Medical Center, Taipei, Taiwan
- Chen-En Jian
- Division of Medical Informatics, Department of Epidemiology, School of Public Health, National Defense Medical Center, Taipei, Taiwan
- Chien-An Sun
- Big Data Research Center, College of Medicine, Fu-Jen Catholic University, New Taipei City, Taiwan
- Kang-Hua Chen
- School of Nursing, College of Medicine, Chang Gung University, Taoyuan, Taiwan; Department of Nursing, Chang Gung Memorial Hospital, Tao-Yuan, Taiwan
- Ming-Hao Kuo
- Graduate Institute of Medical Sciences, National Defense Medical Center, Taipei, Taiwan
- Chia-Shiang Cheng
- Graduate Institute of Life Sciences, National Defense Medical Center, Taipei, Taiwan
- Ya-Ting Chang
- Division of Medical Informatics, Department of Epidemiology, School of Public Health, National Defense Medical Center, Taipei, Taiwan
- Yi-Syuan Wu
- Graduate Institute of Life Sciences, National Defense Medical Center, Taipei, Taiwan
- Hao-Yi Wu
- Division of Medical Informatics, Department of Epidemiology, School of Public Health, National Defense Medical Center, Taipei, Taiwan
- Ya-Ting Yang
- Division of Medical Informatics, Department of Epidemiology, School of Public Health, National Defense Medical Center, Taipei, Taiwan
- Chen Lin
- Department of Biomedical Sciences and Engineering, National Central University, Taoyuan, Taiwan; Center for Biotechnology and Biomedical Engineering, National Central University, Taoyuan, Taiwan
- Hung-Che Lin
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, 11490, Taiwan; Graduate Institute of Medical Sciences, National Defense Medical Center, Taipei, Taiwan; Hualien Armed Forces General Hospital, Xincheng, Hualien, 97144, Taiwan
- Je-Ming Hu
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, 11490, Taiwan; Graduate Institute of Medical Sciences, National Defense Medical Center, Taipei, Taiwan; Division of Colorectal Surgery, Department of Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei City, Taiwan; School of Medicine, National Defense Medical Center, Taipei City, Taiwan
- Yu-Tien Chang
- Division of Medical Informatics, Department of Epidemiology, School of Public Health, National Defense Medical Center, Taipei, Taiwan; Big Data Research Center, College of Medicine, Fu-Jen Catholic University, New Taipei City, Taiwan
4
Huang Z, Han Z, Wang T, Shao W, Xiang S, Salama P, Rizkalla M, Huang K, Zhang J. TSUNAMI: Translational Bioinformatics Tool Suite for Network Analysis and Mining. Genomics, Proteomics & Bioinformatics 2021; 19:1023-1031. [PMID: 33705981 PMCID: PMC9403021 DOI: 10.1016/j.gpb.2019.05.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/16/2018] [Revised: 04/03/2019] [Accepted: 05/31/2019] [Indexed: 11/15/2022]
Abstract
Gene co-expression network (GCN) mining identifies gene modules with highly correlated expression profiles across samples/conditions. It enables researchers to discover latent gene/molecule interactions, identify novel gene functions, and extract molecular features from certain disease/condition groups, thus helping to identify disease biomarkers. However, there lacks an easy-to-use tool package for users to mine GCN modules that are relatively small in size with tightly connected genes that can be convenient for downstream gene set enrichment analysis, as well as modules that may share common members. To address this need, we developed an online GCN mining tool package: TSUNAMI (Tools SUite for Network Analysis and MIning). TSUNAMI incorporates our state-of-the-art lmQCM algorithm to mine GCN modules for both public and user-input data (microarray, RNA-seq, or any other numerical omics data), and then performs downstream gene set enrichment analysis for the identified modules. It has several features and advantages: 1) a user-friendly interface and real-time co-expression network mining through a web server; 2) direct access and search of NCBI Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases, as well as user-input gene expression matrices for GCN module mining; 3) multiple co-expression analysis tools to choose from, all of which are highly flexible in regards to parameter selection options; 4) identified GCN modules are summarized to eigengenes, which are convenient for users to check their correlation with other clinical traits; 5) integrated downstream Enrichr enrichment analysis and links to other gene set enrichment tools; and 6) visualization of gene loci by Circos plot in any step of the process. The web service is freely accessible through URL: https://biolearns.medicine.iu.edu/. Source code is available at https://github.com/huangzhii/TSUNAMI/.
Affiliation(s)
- Zhi Huang
- School of Electrical and Computer Engineering, Purdue University, West Lafayette IN 47907, USA; Department of Electrical and Computer Engineering, Indiana University - Purdue University Indianapolis, Indianapolis IN 46202, USA
- Zhi Han
- Department of Medicine, Indiana University School of Medicine, Indianapolis IN 46202, USA
- Wei Shao
- Department of Medicine, Indiana University School of Medicine, Indianapolis IN 46202, USA
- Shunian Xiang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis IN 46202, USA
- Paul Salama
- Department of Electrical and Computer Engineering, Indiana University - Purdue University Indianapolis, Indianapolis IN 46202, USA
- Maher Rizkalla
- Department of Electrical and Computer Engineering, Indiana University - Purdue University Indianapolis, Indianapolis IN 46202, USA
- Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis IN 46202, USA
- Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis IN 46202, USA
5
Johnson TS, Xiang S, Dong T, Huang Z, Cheng M, Wang T, Yang K, Ni D, Huang K, Zhang J. Combinatorial analyses reveal cellular composition changes have different impacts on transcriptomic changes of cell type specific genes in Alzheimer's Disease. Sci Rep 2021; 11:353. [PMID: 33432017 PMCID: PMC7801680 DOI: 10.1038/s41598-020-79740-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/10/2020] [Accepted: 12/09/2020] [Indexed: 11/09/2022] Open
Abstract
Alzheimer's disease (AD) brains are characterized by progressive neuron loss and gliosis. Previous studies of gene expression using bulk tissue samples often fail to consider changes in cell-type composition when comparing AD versus control, which can lead to differences in expression levels that are not due to transcriptional regulation. We mined five large transcriptomic AD datasets for conserved gene co-expression module, then analyzed differential expression and differential co-expression within the modules between AD samples and controls. We performed cell-type deconvolution analysis to determine whether the observed differential expression was due to changes in cell-type proportions in the samples or to transcriptional regulation. Our findings were validated using four additional datasets. We discovered that the increased expression of microglia modules in the AD samples can be explained by increased microglia proportions in the AD samples. In contrast, decreased expression and perturbed co-expression within neuron modules in the AD samples was likely due in part to altered regulation of neuronal pathways. Several transcription factors that are differentially expressed in AD might account for such altered gene regulation. Similarly, changes in gene expression and co-expression within astrocyte modules could be attributed to combined effects of astrogliosis and astrocyte gene activation. Gene expression in the astrocyte modules was also strongly correlated with clinicopathological biomarkers. Through this work, we demonstrated that combinatorial analysis can delineate the origins of transcriptomic changes in bulk tissue data and shed light on key genes and pathways involved in AD.
Affiliation(s)
- Travis S Johnson
- Department of Biostatistics, Indiana University, School of Medicine, Indianapolis, IN, 46202, USA
- Shunian Xiang
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN, 46202, USA
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University, Shenzhen, 518060, China
- Tianhan Dong
- Department of Pharmacology, Indiana University, School of Medicine, Indianapolis, IN, 46202, USA
- Zhi Huang
- Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 47907, USA
- Michael Cheng
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN, 46202, USA
- Tianfu Wang
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University, Shenzhen, 518060, China
- Kai Yang
- Department of Pediatrics, Indiana University, School of Medicine, Indianapolis, IN, 46202, USA
- Dong Ni
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University, Shenzhen, 518060, China
- Kun Huang
- Department of Medicine, Indiana University, School of Medicine, Indianapolis, IN, 46202, USA
- Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University, School of Medicine, Indianapolis, IN, 46202, USA
6
Iliopoulos A, Beis G, Apostolou P, Papasotiriou I. Complex Networks, Gene Expression and Cancer Complexity: A Brief Review of Methodology and Applications. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017093504] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022]
Abstract
In this brief survey, various aspects of cancer complexity and how this complexity can be confronted using modern complex networks’ theory and gene expression datasets, are described. In particular, the causes and the basic features of cancer complexity, as well as the challenges it brought are underlined, while the importance of gene expression data in cancer research and in reverse engineering of gene co-expression networks is highlighted. In addition, an introduction to the corresponding theoretical and mathematical framework of graph theory and complex networks is provided. The basics of network reconstruction along with the limitations of gene network inference, the enrichment and survival analysis, evolution, robustness-resilience and cascades in complex networks, are described. Finally, an indicative and suggestive example of a cancer gene co-expression network inference and analysis is given.
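As a rough illustration of the kind of co-expression network inference this review surveys, the Python sketch below links gene pairs whose expression profiles are strongly correlated. It is a generic Pearson-threshold construction and not the review's own example; the gene names, the expression values, the 0.8 cutoff and the function names `pearson` and `coexpression_edges` are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.8):
    """Return (gene_a, gene_b, r) for every gene pair with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, r))
    return edges


# Hypothetical toy data: one expression profile per gene across five samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.0],  # scales with GENE_A
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to the others
}

edges = coexpression_edges(toy_expression, threshold=0.8)
```

With these toy values only the GENE_A/GENE_B pair passes the cutoff. Real analyses typically use many more samples and a more careful choice of similarity measure and threshold.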
Affiliation(s)
- A.C. Iliopoulos
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
- G. Beis
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
- P. Apostolou
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
- I. Papasotiriou
- Research Genetic Cancer Centre International GmbH, Zug, Switzerland
7
Sun L, Zhang J, Chen W, Chen Y, Zhang X, Yang M, Xiao M, Ma F, Yao Y, Ye M, Zhang Z, Chen K, Chen F, Ren Y, Ni S, Zhang X, Yan Z, Sun Z, Zhou H, Yang H, Xie S, Haque ME, Huang K, Yang Y. Attenuation of epigenetic regulator SMARCA4 and ERK-ETS signaling suppresses aging-related dopaminergic degeneration. Aging Cell 2020; 19:e13210. [PMID: 32749068 PMCID: PMC7511865 DOI: 10.1111/acel.13210] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 12/18/2019] [Revised: 06/16/2020] [Accepted: 07/12/2020] [Indexed: 11/27/2022] Open
Abstract
How complex interactions of genetic, environmental factors and aging jointly contribute to dopaminergic degeneration in Parkinson's disease (PD) is largely unclear. Here, we applied frequent gene co‐expression analysis on human patient substantia nigra‐specific microarray datasets to identify potential novel disease‐related genes. In vivo Drosophila studies validated two of 32 candidate genes, a chromatin‐remodeling factor SMARCA4 and a biliverdin reductase BLVRA. Inhibition of SMARCA4 was able to prevent aging‐dependent dopaminergic degeneration not only caused by overexpression of BLVRA but also in four most common Drosophila PD models. Furthermore, down‐regulation of SMARCA4 specifically in the dopaminergic neurons prevented shortening of life span caused by α‐synuclein and LRRK2. Mechanistically, aberrant SMARCA4 and BLVRA converged on elevated ERK‐ETS activity, attenuation of which by either genetic or pharmacological manipulation effectively suppressed dopaminergic degeneration in Drosophila in vivo. Down‐regulation of SMARCA4 or drug inhibition of MEK/ERK also mitigated mitochondrial defects in PINK1 (a PD‐associated gene)‐deficient human cells. Our findings underscore the important role of epigenetic regulators and implicate a common signaling axis for therapeutic intervention in normal aging and a broad range of age‐related disorders including PD.
Affiliation(s)
- Ling Sun
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Jie Zhang
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University, Indianapolis, IN, USA
- Wenfeng Chen
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Yun Chen
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Xiaohui Zhang
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Mingjuan Yang
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Min Xiao
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Fujun Ma
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Yizhou Yao
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Meina Ye
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Zhenkun Zhang
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Kai Chen
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Fei Chen
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Yujun Ren
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Shiwei Ni
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Xi Zhang
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Zhangming Yan
- MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Zhi‐Rong Sun
- MOE Key Lab of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Hai‐Meng Zhou
- Zhejiang Provincial Key Laboratory of Applied Enzymology, Yangtze Delta Region Institute of Tsinghua University, Jiaxing, China
- Hongqin Yang
- Key Laboratory of Optoelectronic Science and Technology for Medicine, Ministry of Education, Fujian Normal University, Fuzhou, China
- Shusen Xie
- Key Laboratory of Optoelectronic Science and Technology for Medicine, Ministry of Education, Fujian Normal University, Fuzhou, China
- M. Emdadul Haque
- Department of Biochemistry, College of Medicine and Health Sciences, United Arab Emirates University, Al‐Ain, United Arab Emirates
- Kun Huang
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Department of Hematology and Oncology, School of Medicine, Indiana University, Indianapolis, IN, USA
- Yufeng Yang
- Institute of Life Sciences, Fuzhou University, Fuzhou, Fujian, China
- Key Laboratory of Optoelectronic Science and Technology for Medicine, Ministry of Education, Fujian Normal University, Fuzhou, China
8
Beklen H, Gulfidan G, Arga KY, Mardinoglu A, Turanli B. Drug Repositioning for P-Glycoprotein Mediated Co-Expression Networks in Colorectal Cancer. Front Oncol 2020; 10:1273. [PMID: 32903699 PMCID: PMC7438820 DOI: 10.3389/fonc.2020.01273] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/16/2020] [Accepted: 06/19/2020] [Indexed: 12/24/2022] Open
Abstract
Colorectal cancer (CRC) is one of the most fatal types of cancers that is seen in both men and women. CRC is the third most common type of cancer worldwide. Over the years, several drugs are developed for the treatment of CRC; however, patients with advanced CRC can be resistant to some drugs. P-glycoprotein (P-gp) (also known as Multidrug Resistance 1, MDR1) is a well-identified membrane transporter protein expressed by ABCB1 gene. The high expression of MDR1 protein found in several cancer types causes chemotherapy failure owing to efflux drug molecules out of the cancer cell, decreases the drug concentration, and causes drug resistance. As same as other cancers, drug-resistant CRC is one of the major obstacles for effective therapy and novel therapeutic strategies are urgently needed. Network-based approaches can be used to determine specific biomarkers, potential drug targets, or repurposing approved drugs in drug-resistant cancers. Drug repositioning is the approach for using existing drugs for a new therapeutic purpose; it is a highly efficient and low-cost process. To improve current understanding of the MDR-1-related drug resistance in CRC, we explored gene co-expression networks around ABCB1 gene with different network sizes (50, 100, 150, 200 edges) and repurposed candidate drugs targeting the ABCB1 gene and its co-expression network by using drug repositioning approach for the treatment of CRC. The candidate drugs were also assessed by using molecular docking for determining the potential of physical interactions between the drug and MDR1 protein as a drug target. We also evaluated these four networks whether they are diagnostic or prognostic features in CRC besides biological function determined by functional enrichment analysis. Lastly, differentially expressed genes of drug-resistant (i.e., oxaliplatin, methotrexate, SN38) HT29 cell lines were found and used for repurposing drugs with reversal gene expressions. 
As a result, it is shown that all networks exhibited high diagnostic and prognostic performance besides the identification of various drug candidates for drug-resistant patients with CRC. All these results can shed light on the development of effective diagnosis, prognosis, and treatment strategies for drug resistance in CRC.
Affiliation(s)
- Hande Beklen
- Department of Bioengineering, Marmara University, Istanbul, Turkey
- Gizem Gulfidan
- Department of Bioengineering, Marmara University, Istanbul, Turkey
- Adil Mardinoglu
- Centre for Host-Microbiome Interactions, Faculty of Dentistry, Oral & Craniofacial Sciences, King's College London, London, United Kingdom; Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm, Sweden
- Beste Turanli
- Department of Bioengineering, Istanbul Medeniyet University, Istanbul, Turkey
9
Schubert M, Colomé-Tatché M, Foijer F. Gene networks in cancer are biased by aneuploidies and sample impurities. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2019; 1863:194444. [PMID: 31654805 DOI: 10.1016/j.bbagrm.2019.194444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/31/2019] [Revised: 09/05/2019] [Accepted: 10/14/2019] [Indexed: 12/14/2022]
Abstract
Gene regulatory network inference is a standard technique for obtaining structured regulatory information from, for instance, gene expression measurements. Methods performing this task have been extensively evaluated on synthetic, and to a lesser extent real data sets. In contrast to these test evaluations, applications to gene expression data of human cancers are often limited by fewer samples and more potential regulatory links, and are biased by copy number aberrations as well as cell mixtures and sample impurities. Here, we take networks inferred from TCGA cohorts as an example to show that (1) transcription factor annotations are essential to obtain reliable networks, and (2) even for state of the art methods, we expect that between 20 and 80% of edges are caused by copy number changes and cell mixtures rather than transcription factor regulation.
Affiliation(s)
- Michael Schubert
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands; Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany.
- Maria Colomé-Tatché
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands; Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany; TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Floris Foijer
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands
10
Kalamohan K, Gunasekaran P, Ibrahim S. Gene coexpression network analysis of multiple cancers discovers the varying stem cell features between gastric and breast cancer. Meta Gene 2019. [DOI: 10.1016/j.mgene.2019.100576] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022] Open
11
Helm BR, Zhan X, Pandya PH, Murray ME, Pollok KE, Renbarger JL, Ferguson MJ, Han Z, Ni D, Zhang J, Huang K. Gene Co-Expression Networks Restructured Gene Fusion in Rhabdomyosarcoma Cancers. Genes (Basel) 2019; 10:665. [PMID: 31480361 PMCID: PMC6770752 DOI: 10.3390/genes10090665] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/17/2019] [Revised: 08/07/2019] [Accepted: 08/19/2019] [Indexed: 01/28/2023] Open
Abstract
Rhabdomyosarcoma is subclassified by the presence or absence of a recurrent chromosome translocation that fuses the FOXO1 and PAX3 or PAX7 genes. The fusion protein (FOXO1-PAX3/7) retains both binding domains and becomes a novel and potent transcriptional regulator in rhabdomyosarcoma subtypes. Many studies have characterized and integrated genomic, transcriptomic, and epigenomic differences among rhabdomyosarcoma subtypes that contain the FOXO1-PAX3/7 gene fusion and those that do not; however, few have investigated how gene co-expression networks are altered by FOXO1-PAX3/7. Although transcriptional data offer insight into one level of functional regulation, gene co-expression networks have the potential to identify biological interactions and pathways that underpin oncogenesis and tumorigenicity. Thus, we examined gene co-expression networks for rhabdomyosarcoma that were FOXO1-PAX3 positive, FOXO1-PAX7 positive, or fusion negative. Gene co-expression networks were mined using local maximum Quasi-Clique Merger (lmQCM) and analyzed for co-expression differences among rhabdomyosarcoma subtypes. This analysis identified 41 co-expression modules that were shared between fusion-negative and fusion-positive samples, of which 17/41 showed significant up- or down-regulation with respect to fusion status. Fusion-positive and fusion-negative rhabdomyosarcoma showed differing modularity of co-expression networks, with fusion negative (n = 109) having significantly more individual modules than fusion positive (n = 53). Subsequent analysis of gene co-expression networks for PAX3 and PAX7 type fusions found that 17/53 were differentially expressed between the two subtypes. Gene list enrichment analysis found that gene ontology terms were poorly matched with biological processes and molecular function for most co-expression modules identified in this study; however, co-expressed modules were frequently localized to cytobands on chromosomes 8 and 11. Overall, we observed substantial restructuring of co-expression networks relative to fusion status and fusion type in rhabdomyosarcoma and identified previously overlooked genes and pathways that may be targeted in this pernicious disease.
Affiliation(s)
- Bryan R Helm
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA
| | - Xiaohui Zhan
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
| | - Pankita H Pandya
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA
| | - Mary E Murray
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA
| | - Karen E Pollok
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA
- Department of Pharmacology and Toxicology, Indiana University, Indianapolis, IN 46202-3082, USA
| | - Jamie L Renbarger
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA
| | - Michael J Ferguson
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA
| | - Zhi Han
- Department of Pharmacology and Toxicology, Indiana University, Indianapolis, IN 46202-3082, USA
| | - Dong Ni
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
| | - Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA.
| | - Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202-3082, USA.
- Regenstrief Institute, Indianapolis, IN 46202, USA.
|
12
|
Han Y, Ye X, Wang C, Liu Y, Zhang S, Feng W, Huang K, Zhang J. Integration of molecular features with clinical information for predicting outcomes for neuroblastoma patients. Biol Direct 2019; 14:16. [PMID: 31443736 PMCID: PMC6706887 DOI: 10.1186/s13062-019-0244-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 10/27/2017] [Accepted: 08/06/2019] [Indexed: 01/14/2023] Open
Abstract
Background Neuroblastoma is one of the most common types of pediatric cancer. In current neuroblastoma prognosis, patients can be stratified into high- and low-risk groups. Generally, more than 90% of the patients in the low-risk group will survive, while fewer than 50% of those with high-risk disease will survive. Since the so-called “high-risk” group still contains patients with mixed good and poor outcomes, more refined stratification is needed so that patients with poor outcomes can receive prompt and individualized treatment to improve their long-term survival rate, while patients with good outcomes can avoid unnecessary overtreatment. Methods We first mined co-expressed gene modules from microarray and RNA-seq data of neuroblastoma samples using the weighted network mining algorithm lmQCM, and summarized the resulting modules into eigengenes. Then a patient similarity weight matrix was constructed from the module eigengenes using two different approaches. In the last step, a consensus clustering method called Molecular Regularized Consensus Patient Stratification (MRCPS) was applied to aggregate both clinical information (clinical stage and clinical risk level) and multiple eigengene data for refined patient stratification. Results The integrative method MRCPS demonstrated superior performance to clinical staging or transcriptomic features alone for the NB cohort stratification. It successfully identified the worst prognosis group within the clinical high-risk group, with fewer than 40% of patients surviving the first 50 months after diagnosis. It also identified highly differentially expressed genes between the best and worst prognosis groups, which are potential gene biomarkers for clinical testing. Conclusions To address the need for better prognosis and to facilitate personalized treatment of neuroblastoma, we modified the recently developed bioinformatics workflow MRCPS for refined patient prognosis. It integrates clinical information and molecular features such as gene co-expression for prognosis. This clustering workflow is flexible, allowing the integration of both categorical and numerical data. The results demonstrate the power of survival prognosis with this integrative analysis workflow, with prognostic performance superior to using transcriptomic data or clinical staging/risk information alone. Reviewers This article was reviewed by Lan Hu, Haibo Liu, Julie Zhu and Aleksandra Gruca. Electronic supplementary material The online version of this article (10.1186/s13062-019-0244-y) contains supplementary material, which is available to authorized users.
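The abstract above summarizes each co-expressed module into an eigengene. As a minimal sketch of that step, assuming the common definition of an eigengene as the first principal component of the standardized module expression matrix (the abstract does not spell out its exact computation), the Python below uses plain power iteration. The function names `standardize` and `module_eigengene` and the toy matrix are hypothetical illustrations, not code from lmQCM or MRCPS.

```python
import math


def standardize(row):
    """Z-score one gene's expression values across samples.

    Assumes the gene is not constant (non-zero standard deviation).
    """
    n = len(row)
    mean = sum(row) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / (n - 1))
    return [(v - mean) / sd for v in row]


def module_eigengene(expr, iterations=200):
    """Return a unit-length per-sample profile approximating the module's
    first principal component.

    expr: list of gene rows, each a list of per-sample expression values.
    """
    z = [standardize(row) for row in expr]
    n_genes, n_samples = len(z), len(z[0])
    # Start from the mean standardized profile, so the result keeps the
    # same orientation as the module's average expression.
    v = [sum(row[j] for row in z) / n_genes for j in range(n_samples)]
    for _ in range(iterations):
        # One power-iteration step on Z^T Z: u = Z v, then w = Z^T u.
        u = [sum(row[j] * v[j] for j in range(n_samples)) for row in z]
        w = [sum(z[i][j] * u[i] for i in range(n_genes)) for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("module has no shared signal to summarize")
        v = [x / norm for x in w]
    return v


# Toy module: three genes measured in four samples (made-up values).
toy_module = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.1, 2.0, 2.9, 4.2],
]
print([round(x, 3) for x in module_eigengene(toy_module)])
```

Each sample receives one eigengene value per module, so a cohort can be described by a small samples-by-modules matrix rather than thousands of individual genes before any similarity or clustering step.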
Affiliation(s)
- Yatong Han
- Department of Automation, Harbin Engineering University, Harbin, China; Department of Neurosurgery, Stanford University, California, USA
| | - Xiufen Ye
- Department of Automation, Harbin Engineering University, Harbin, China
| | - Chao Wang
- Thermo Fisher Scientific, Waltham, MA, USA
| | - Yusong Liu
- Department of Automation, Harbin Engineering University, Harbin, China
| | - Siyuan Zhang
- Department of Automation, Harbin Engineering University, Harbin, China
| | - Weixing Feng
- Department of Automation, Harbin Engineering University, Harbin, China
| | - Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, USA.
| | - Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, USA.
|
13
|
Wei Y, Dong S, Zhu Y, Zhao Y, Wu C, Zhu Y, Li K, Xu Y. DNA co-methylation analysis of lincRNAs across nine cancer types reveals novel potential epigenetic biomarkers in cancer. Epigenomics 2019; 11:1177-1190. [PMID: 31347388 DOI: 10.2217/epi-2018-0138] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/14/2022] Open
Abstract
Aim: The potential functions and prognostic value of lincRNAs with co-methylation events are explored in 9 cancer types. Materials & methods: Here, we evaluated the co-methylation events in promoter and gene-body regions between two lincRNAs across 9 cancer types by constructing a systematic biological framework. Results: The co-methylation events in both promoter and gene-body regions tended to be highly cancer specific. Patient samples could be separated into tumor and normal types according to the eigengenes of universal co-methylation clusters. Functional enrichment results revealed that the lincRNAs with promoter and gene-body co-methylation events affected cancer progression by participating in different pathways and could serve as potential prognostic biomarkers. Conclusion: The study provides new insight into epigenetic regulation in cancer and points to a potential new direction for epigenetic biomarker discovery.
Affiliation(s)
- Yunzhen Wei
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China; School of Life Science, Faculty of Science, The Chinese University of Hong Kong, PR China
| | - Siyao Dong
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China
| | - Yanjiao Zhu
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China
| | - Yichuan Zhao
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China
| | - Cheng Wu
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China
| | - Yinling Zhu
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China
| | - Kun Li
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China
| | - Yan Xu
- College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, PR China
|
14
|
Yu CY, Xiang S, Huang Z, Johnson TS, Zhan X, Han Z, Abu Zaid M, Huang K. Gene Co-expression Network and Copy Number Variation Analyses Identify Transcription Factors Associated With Multiple Myeloma Progression. Front Genet 2019; 10:468. [PMID: 31156714 PMCID: PMC6533571 DOI: 10.3389/fgene.2019.00468] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/01/2018] [Accepted: 05/01/2019] [Indexed: 11/29/2022] Open
Abstract
Multiple myeloma (MM) has two clinical precursor stages of disease: monoclonal gammopathy of undetermined significance (MGUS) and smoldering multiple myeloma (SMM). However, the mechanism of progression is not well understood. Because gene co-expression network analysis is a well-known method for discovering new gene functions and regulatory relationships, we utilized this framework to conduct differential co-expression analysis to identify interesting transcription factors (TFs) in two publicly available datasets. We then used copy number variation (CNV) data from a third public dataset to validate these TFs. First, we identified co-expressed gene modules in two publicly available datasets each containing three conditions: normal, MGUS, and SMM. These modules were assessed for condition-specific gene expression, and then enrichment analysis was conducted on condition-specific modules to identify their biological function and upstream TFs. TFs were assessed for differential gene expression between normal and MM precursors, then validated with CNV analysis to identify candidate genes. Functional enrichment analysis reaffirmed known functional categories in MM pathology, the main one relating to immune function. Enrichment analysis revealed a handful of differentially expressed TFs between normal and either MGUS or SMM in gene expression and/or CNV. Overall, we identified four genes of interest (MAX, TCF4, ZNF148, and ZNF281) that aid in our understanding of MM initiation and progression.
Affiliation(s)
- Christina Y Yu
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States; Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
| | - Shunian Xiang
- Department of Medical and Molecular Genetics, Indiana University, Indianapolis, IN, United States; National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
| | - Zhi Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States
| | - Travis S Johnson
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States; Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
| | - Xiaohui Zhan
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
| | - Zhi Han
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; Regenstrief Institute, Indianapolis, IN, United States
| | - Mohammad Abu Zaid
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
| | - Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; Regenstrief Institute, Indianapolis, IN, United States
|
15
|
Wang T, Zhang J, Huang K. Generalized gene co-expression analysis via subspace clustering using low-rank representation. BMC Bioinformatics 2019; 20:196. [PMID: 31074376 PMCID: PMC6509871 DOI: 10.1186/s12859-019-2733-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022] Open
Abstract
Background Gene Co-expression Network Analysis (GCNA) helps identify gene modules with potential biological functions and has become a popular method in bioinformatics and biomedical research. However, most current GCNA algorithms use correlation to build gene co-expression networks and identify modules with highly correlated genes. There is a need to look beyond correlation and identify gene modules using other similarity measures for finding novel biologically meaningful modules. Results We propose a new generalized gene co-expression analysis algorithm via subspace clustering that can identify biologically meaningful gene co-expression modules with genes that are not all highly correlated. We use low-rank representation to construct gene co-expression networks and local maximal quasi-clique merger to identify gene co-expression modules. We applied our method on three large microarray datasets and a single-cell RNA sequencing dataset. We demonstrate that our method can identify gene modules with different biological functions than current GCNA methods and find gene modules with prognostic values. Conclusions The presented method takes advantage of subspace clustering to generate gene co-expression networks rather than using correlation as the similarity measure between genes. Our generalized GCNA method can provide new insights from gene expression datasets and serve as a complement to current GCNA algorithms.
Affiliation(s)
- Tongxin Wang
- Department of Computer Science, Indiana University Bloomington, Bloomington, 47408, IN, USA
| | - Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, 46202, IN, USA
| | - Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, 46202, IN, USA; Regenstrief Institute, Indianapolis, 46202, IN, USA.
|
16
|
Huang Z, Zhan X, Xiang S, Johnson TS, Helm B, Yu CY, Zhang J, Salama P, Rizkalla M, Han Z, Huang K. SALMON: Survival Analysis Learning With Multi-Omics Neural Networks on Breast Cancer. Front Genet 2019; 10:166. [PMID: 30906311 PMCID: PMC6419526 DOI: 10.3389/fgene.2019.00166] [Citation(s) in RCA: 119] [Impact Index Per Article: 23.8] [Received: 12/01/2018] [Accepted: 02/14/2019] [Indexed: 12/22/2022] Open
Abstract
Improved cancer prognosis is a central goal for precision health medicine. Though many models can predict differential survival from data, there is a strong need for sophisticated algorithms that can aggregate and filter relevant predictors from increasingly complex data inputs. In turn, these models should provide deeper insight into which types of data are most relevant to improve prognosis. Deep Learning-based neural networks offer a potential solution for both problems because they are highly flexible and account for data complexity in a non-linear fashion. In this study, we implement Deep Learning-based networks to determine how gene expression data predict Cox regression-based survival in breast cancer. We accomplish this through an algorithm called SALMON (Survival Analysis Learning with Multi-Omics Neural Networks), which aggregates and simplifies gene expression data and cancer biomarkers to enable prognosis prediction. The results revealed improved performance when more omics data were used in model construction. Rather than using raw gene expression values as model inputs, we use eigengene modules obtained from gene co-expression network analysis. The corresponding high-impact co-expression modules and other omics data are identified by a feature selection technique and then examined through enrichment analysis of their biological functions, which raises the interpretation of input features from the gene level to the co-expression module level. Our study shows the feasibility of discovering breast cancer-related co-expression modules and sketches a blueprint for future work on Deep Learning-based survival analysis. SALMON source code is available at https://github.com/huangzhii/SALMON/.
Affiliation(s)
- Zhi Huang
- School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States; Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
| | - Xiaohui Zhan
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
| | - Shunian Xiang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
| | - Travis S Johnson
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
| | - Bryan Helm
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
| | - Christina Y Yu
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
| | - Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
| | - Paul Salama
- Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
| | - Maher Rizkalla
- Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
| | - Zhi Han
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; Regenstrief Institute, Indianapolis, IN, United States
| | - Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States; Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States; Regenstrief Institute, Indianapolis, IN, United States
|
17
|
Han Y, Ye X, Cheng J, Zhang S, Feng W, Han Z, Zhang J, Huang K. Integrative analysis based on survival associated co-expression gene modules for predicting Neuroblastoma patients' survival time. Biol Direct 2019; 14:4. [PMID: 30760313 PMCID: PMC6375203 DOI: 10.1186/s13062-018-0229-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/28/2017] [Accepted: 11/20/2018] [Indexed: 12/03/2022] Open
Abstract
Background More than 90% of neuroblastoma patients in the low-risk group are cured, while fewer than 50% of those with high-risk disease can be cured. Since the high-risk patients still have poor outcomes, we need more accurate stratification to establish an individualized, precise treatment plan that improves the long-term survival rate. Results We focus on extracting features and providing a workflow to improve survival prediction for neuroblastoma patients. With a workflow for gene co-expression network (GCN) mining in microarray and RNA-Seq datasets, we extracted molecular features from each co-expressed module and summarized them into eigengenes. Then we adopted the lasso-regularized Cox proportional hazards model to select the most informative eigengene features regarding association to the risk of metastasis. Nine eigengenes were selected that show strong association with patient survival prognosis. All nine corresponding gene modules also have highly enriched biological functions or cytoband locations. Three of them are modules unique to RNA-Seq data, which complement the modules from microarray data in terms of survival prognosis. We then merged all eigengenes from these unique modules and used an integrative method called Similarity Network Fusion to test their prognostic power. The prognostic accuracies are significantly improved compared with using all eigengenes, and a subgroup of patients with a very poor survival rate was identified. Conclusions We first compared GCNs mined from microarray and RNA-seq data. We discovered that each data modality yields unique GCNs, which are enriched with clear biological functions. We then performed unique-module analysis and used a lasso-Cox model to select survival-associated eigengenes. Integration of unique and survival-associated eigengenes from both data types provides complementary information that leads to more accurate survival prognosis. Reviewers Reviewed by Susmita Datta, Marco Chierici and Dimitar Vassilev. Electronic supplementary material The online version of this article (10.1186/s13062-018-0229-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Yatong Han
- Department of Automation, Harbin Engineering University, Harbin, China; Department of Neurosurgery, Stanford University, California, USA
| | - Xiufen Ye
- Department of Automation, Harbin Engineering University, Harbin, China
| | - Jun Cheng
- Department of Medicine, Indiana University School of Medicine, Indianapolis, USA; School of Biomedical Engineering, Shenzhen University, Shenzhen, China
| | - Siyuan Zhang
- Department of Automation, Harbin Engineering University, Harbin, China
| | - Weixing Feng
- Department of Automation, Harbin Engineering University, Harbin, China
| | - Zhi Han
- Department of Medicine, Indiana University School of Medicine, Indianapolis, USA
| | - Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, USA
| | - Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, USA; Regenstrief Institute, Indianapolis, USA.
|
18
|
Xiang S, Huang Z, Wang T, Han Z, Yu CY, Ni D, Huang K, Zhang J. Condition-specific gene co-expression network mining identifies key pathways and regulators in the brain tissue of Alzheimer's disease patients. BMC Med Genomics 2018; 11:115. [PMID: 30598117 PMCID: PMC6311927 DOI: 10.1186/s12920-018-0431-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/29/2022] Open
Abstract
Background Gene co-expression network (GCN) mining is a systematic approach to efficiently identify novel disease pathways, predict novel gene functions and search for potential disease biomarkers. However, few studies have systematically identified GCNs in multiple brain transcriptomic datasets of Alzheimer’s disease (AD) patients and looked for their specific functions. Methods In this study, we first mined GCN modules from AD and normal brain samples in multiple datasets separately; we then identified gene modules that are specific to AD or normal samples; lastly, condition-specific modules with similar functional enrichments were merged, and enriched, differentially expressed upstream transcription factors were further examined for the AD/normal-specific modules. Results Using the network mining tool lmQCM, we obtained 30 AD-specific modules, which showed gain of correlation in AD samples, and 31 normal-specific modules, which showed loss of correlation in AD samples compared to normal ones. Functional and pathway enrichment analysis not only confirmed known gene functional categories related to AD, but also identified novel regulatory factors and pathways. Remarkably, pathway analysis suggested that a variety of viral, bacterial, and parasitic infection pathways are activated in AD samples. Furthermore, upstream transcription factor analysis identified differentially expressed upstream regulators such as ZFHX3 for several modules, which can be potential driver genes for AD etiology and pathology. Conclusions Through our state-of-the-art network-based approach, AD/normal-specific GCN modules were identified using multiple transcriptomic datasets from multiple regions of the brain. Bacterial and viral infectious disease-related pathways are the most frequently enriched in modules across datasets. Transcription factor ZFHX3 was identified as a potential driver regulator targeting the infectious disease pathways in AD-specific modules. Our results provide new directions for understanding the mechanism of AD as well as new candidates for drug targets. Electronic supplementary material The online version of this article (10.1186/s12920-018-0431-1) contains supplementary material, which is available to authorized users.
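The abstract above characterizes AD-specific modules by a gain of correlation relative to normal samples. One simple way to quantify such a gain, offered here only as an illustrative sketch, is to compare the mean absolute pairwise Pearson correlation of a gene set between two conditions. This is an assumed simplification, not the lmQCM procedure or the authors' specificity criterion; the function names and toy expression values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_abs_correlation(expr):
    """Average |r| over all gene pairs; expr is a list of gene rows."""
    pairs = list(combinations(range(len(expr)), 2))
    return sum(abs(pearson(expr[i], expr[j])) for i, j in pairs) / len(pairs)


def correlation_gain(condition_a, condition_b):
    """Positive when the gene set is more tightly correlated in condition A."""
    return mean_abs_correlation(condition_a) - mean_abs_correlation(condition_b)


# Toy data: the same three genes measured in four samples per condition.
disease = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
control = [[1, 2, 3, 4], [1, -1, 1, -1], [1, 1, -1, -1]]
print(round(correlation_gain(disease, control), 3))
```

A real analysis would also need a significance assessment, for example by permuting sample labels, because a difference in mean correlation alone says nothing about whether it exceeds chance.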
Affiliation(s)
- Shunian Xiang
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University, Shenzhen, 518060, China; Department of Medical & Molecular Genetics, Indiana University, Indianapolis, IN, 46202, USA
| | - Zhi Huang
- Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 47907, USA
| | - Tianfu Wang
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University, Shenzhen, 518060, China
| | - Zhi Han
- Department of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Christina Y Yu
- Department of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
| | - Dong Ni
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University, Shenzhen, 518060, China.
| | - Kun Huang
- Department of Medicine, Indiana University, Indianapolis, IN, 46202, USA.
| | - Jie Zhang
- Department of Medical & Molecular Genetics, Indiana University, Indianapolis, IN, 46202, USA.
|
19
|
Wang P, Gao L, Hu Y, Li F. Feature related multi-view nonnegative matrix factorization for identifying conserved functional modules in multiple biological networks. BMC Bioinformatics 2018; 19:394. [PMID: 30373534 PMCID: PMC6206826 DOI: 10.1186/s12859-018-2434-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/08/2018] [Accepted: 10/15/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Comprehensively analyzing multi-omics biological data under different conditions is important for understanding biological mechanisms at the system level. Multiple or multi-layer network models give us new insight into analyzing these data simultaneously, for instance, to identify conserved functional modules in multiple biological networks. However, because multiple networks are larger in scale and more complicated in structure than a single network, how to accurately and efficiently detect conserved functional biological modules remains a significant challenge. RESULTS: Here, we propose an efficient method, named ConMod, to discover conserved functional modules in multiple biological networks. We introduce two features to characterize multiple networks, so that all networks are compressed into two feature matrices. Module detection is performed only on the feature matrices by using multi-view non-negative matrix factorization (NMF), which makes it independent of the number of input networks. Experimental results on both synthetic and real biological networks demonstrate that our method is promising for identifying conserved modules in multiple networks, since it improves accuracy and efficiency compared with state-of-the-art methods. Furthermore, applying ConMod to co-expression networks of different cancers, we find gene modules shared across cancers, the majority of which have significant functional implications, such as ribosome biogenesis and immune response. In addition, analyzing brain tissue-specific protein interaction networks, we detect conserved modules related to nervous system development, mRNA processing, etc. CONCLUSIONS: ConMod facilitates finding conserved modules in any number of networks with low time and space complexity, and thereby serves as a valuable tool for inferring shared traits and biological functions of multiple biological systems.
Affiliation(s)
- Peizhuo Wang
  - School of Computer Science and Technology, Xidian University, Xi’an, 710071 China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi’an, 710071 China
- Yuxuan Hu
  - School of Computer Science and Technology, Xidian University, Xi’an, 710071 China
- Feng Li
  - School of Computer Science and Technology, Xidian University, Xi’an, 710071 China
|
20
|
Castillo-Morales A, Monzón-Sandoval J, Urrutia AO, Gutiérrez H. Postmitotic cell longevity-associated genes: a transcriptional signature of postmitotic maintenance in neural tissues. Neurobiol Aging 2018; 74:147-160. [PMID: 30448614 DOI: 10.1016/j.neurobiolaging.2018.10.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/11/2018] [Revised: 10/03/2018] [Accepted: 10/11/2018] [Indexed: 12/24/2022]
Abstract
Different cell types have different postmitotic maintenance requirements. Nerve cells, however, are unique in this respect as they need to survive and preserve their functional complexity for the entire lifetime of the organism, and failure at any level of their supporting mechanisms leads to a wide range of neurodegenerative conditions. Whether these differences across tissues arise from the activation of distinct cell type-specific maintenance mechanisms or the differential activation of a common molecular repertoire is not known. To identify the transcriptional signature of postmitotic cellular longevity (PMCL), we compared whole-genome transcriptome data from human tissues ranging in longevity from 120 days to over 70 years and found a set of 81 genes whose expression levels are closely associated with increased cell longevity. Using expression data from 10 independent sources, we found that these genes are more highly coexpressed in longer-living tissues and are enriched in specific biological processes and transcription factor targets compared with randomly selected gene samples. Crucially, we found that PMCL-associated genes are downregulated in the cerebral cortex and substantia nigra of patients with Alzheimer's and Parkinson's disease, respectively, as well as Hutchinson-Gilford progeria-derived fibroblasts, and that this downregulation is specifically linked to their underlying association with cellular longevity. Moreover, we found that sexually dimorphic brain expression of PMCL-associated genes reflects sexual differences in lifespan in humans and macaques. Taken together, our results suggest that PMCL-associated genes are part of a generalized machinery of postmitotic maintenance and functional stability in both neural and non-neural cells and support the notion of a common molecular repertoire differentially engaged in different cell types with different survival requirements.
Affiliation(s)
- Atahualpa Castillo-Morales
  - School of Life Sciences, University of Lincoln, Lincoln, UK; Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Jimena Monzón-Sandoval
  - School of Life Sciences, University of Lincoln, Lincoln, UK; Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Araxi O Urrutia
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK; Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, Mexico
|
21
|
Hashemikhabir S, Xia R, Xiang Y, Janga SC. A Framework for Identifying Genotypic Information from Clinical Records: Exploiting Integrated Ontology Structures to Transfer Annotations between ICD Codes and Gene Ontologies. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:1259-1269. [PMID: 26394433 DOI: 10.1109/tcbb.2015.2480056] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/05/2023]
Abstract
Although some methods are proposed for automatic ontology generation, none of them address the issue of integrating large-scale heterogeneous biomedical ontologies. We propose a novel approach for integrating various types of ontologies efficiently and apply it to integrate International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9CM), and Gene Ontologies. This approach is one of the early attempts to quantify the associations among clinical terms (e.g., ICD9 codes) based on their corresponding genomic relationships. We reconstructed a merged tree for a partial set of GO and ICD9 codes and measured the performance of this tree in terms of associations' relevance by comparing them with two well-known disease-gene datasets (i.e., MalaCards and Disease Ontology). Furthermore, we compared the genomic-based ICD9 associations to temporal relationships between them from electronic health records. Our analysis shows promising associations supported by both comparisons suggesting a high reliability. We also manually analyzed several significant associations and found promising support from literature.
|
22
|
Ordinal Multi-modal Feature Selection for Survival Analysis of Early-Stage Renal Cancer. Medical Image Computing and Computer Assisted Intervention – MICCAI 2018 2018. [DOI: 10.1007/978-3-030-00934-2_72] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/28/2023]
|
23
|
Yu W, Zhao S, Wang Y, Zhao BN, Zhao W, Zhou X. Identification of cancer prognosis-associated functional modules using differential co-expression networks. Oncotarget 2017; 8:112928-112941. [PMID: 29348878 PMCID: PMC5762563 DOI: 10.18632/oncotarget.22878] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/12/2017] [Accepted: 11/15/2017] [Indexed: 01/23/2023] Open
Abstract
The rapid accumulation of cancer-related data owing to high-throughput technologies has provided unprecedented choices to understand the progression of cancer and discover functional networks in multiple cancers. Establishment of co-expression networks will help us to discover the systemic properties of carcinogenesis features and regulatory mechanisms of multiple cancers. Here, we proposed a computational workflow to identify differentially co-expressed gene modules across 8 cancer types by using combined gene differential expression analysis methods and a higher-order generalized singular value decomposition. Four co-expression modules were identified; and oncogenes and tumor suppressors were significantly enriched in these modules. Functional enrichment analysis demonstrated the significantly enriched pathways in these modules, including ECM-receptor interaction, focal adhesion and PI3K-Akt signaling pathway. The top-ranked miRNAs (mir-199, mir-29, mir-200) and transcription factors (FOXO4, E2A, NFAT, and MAZ) were identified, which play an important role in deregulating cellular energetics; and regulating angiogenesis and cancer immune system. The clinical significance of the co-expressed gene clusters was assessed by evaluating their predictability of cancer patients’ survival. The predictive power of different clusters and subclusters was demonstrated. Our results will be valuable in cancer-related gene function annotation and for the evaluation of cancer patients’ prognosis.
Affiliation(s)
- Wenshuai Yu
  - Key Laboratory of Embedded System and Service Computing, College of Electronics and Information Engineering, The Ministry of Education, Tongji University, Shanghai, China
- Shengjie Zhao
  - Key Laboratory of Embedded System and Service Computing, College of Electronics and Information Engineering, The Ministry of Education, Tongji University, Shanghai, China; College of Software Engineering, Tongji University, Shanghai, China
- Yongcui Wang
  - Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining, China
- Weiling Zhao
  - Department of Radiology and Comprehensive Cancer Center, Wake Forest University School of Medicine, Winston Salem, NC, USA
- Xiaobo Zhou
  - College of Electronics and Information Engineering, Tongji University, Shanghai, China; Center for Big Data Sciences and Network Security, Tongji University, Shanghai, China; Center for Bioinformatics and System Biology, Wake Forest University School of Medicine, Winston Salem, NC, USA
|
24
|
Cheng J, Zhang J, Han Y, Wang X, Ye X, Meng Y, Parwani A, Han Z, Feng Q, Huang K. Integrative Analysis of Histopathological Images and Genomic Data Predicts Clear Cell Renal Cell Carcinoma Prognosis. Cancer Res 2017; 77:e91-e100. [PMID: 29092949 DOI: 10.1158/0008-5472.can-17-0313] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Received: 01/31/2017] [Revised: 02/13/2017] [Accepted: 06/29/2017] [Indexed: 12/17/2022]
Abstract
In cancer, both histopathologic images and genomic signatures are used for diagnosis, prognosis, and subtyping. However, combining histopathologic images with genomic data for predicting prognosis, as well as the relationships between them, has rarely been explored. In this study, we present an integrative genomics framework for constructing a prognostic model for clear cell renal cell carcinoma. We used patient data from The Cancer Genome Atlas (n = 410), extracting hundreds of cellular morphologic features from digitized whole-slide images and eigengenes from functional genomics data to predict patient outcome. The risk index generated by our model correlated strongly with survival, outperforming predictions based on considering morphologic features or eigengenes separately. The predicted risk index also effectively stratified patients in early-stage (stage I and stage II) tumors, whereas no significant survival difference was observed using staging alone. The prognostic value of our model was independent of other known clinical and molecular prognostic factors for patients with clear cell renal cell carcinoma. Overall, this workflow and the shared software code provide building blocks for applying similar approaches in other cancers. Cancer Res; 77(21); e91-100. ©2017 AACR.
Affiliation(s)
- Jun Cheng
  - Guangdong Province Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou, China
- Jie Zhang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio; Department of Medicine, Indiana University School of Medicine, Indianapolis, Indiana
- Yatong Han
  - College of Automation, Harbin Engineering University, Harbin, Heilongjiang, China
- Xusheng Wang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio
- Xiufen Ye
  - College of Automation, Harbin Engineering University, Harbin, Heilongjiang, China
- Yuebo Meng
  - College of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an, China
- Anil Parwani
  - Department of Pathology, The Ohio State University, Columbus, Ohio
- Zhi Han
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio; Department of Medicine, Indiana University School of Medicine, Indianapolis, Indiana; Department of Pathology, The Ohio State University, Columbus, Ohio
- Qianjin Feng
  - Guangdong Province Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou, China
- Kun Huang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio; Department of Medicine, Indiana University School of Medicine, Indianapolis, Indiana
|
25
|
Di Salle P, Incerti G, Colantuono C, Chiusano ML. Gene co-expression analyses: an overview from microarray collections in Arabidopsis thaliana. Brief Bioinform 2017; 18:215-225. [PMID: 26891982 DOI: 10.1093/bib/bbw002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/02/2015] [Indexed: 01/08/2023] Open
Abstract
Bioinformatics web-based resources and databases are precious references for most biological laboratories worldwide. However, the quality and reliability of the information they provide depends on them being used in an appropriate way that takes into account their specific features. Huge collections of gene expression data are currently publicly available, ready to support the understanding of gene and genome functionalities. In this context, tools and resources for gene co-expression analyses have flourished to exploit the 'guilty by association' principle, which assumes that genes with correlated expression profiles are functionally related. In the case of Arabidopsis thaliana, the reference species in plant biology, the resources available mainly consist of microarray results. After a general overview of such resources, we tested and compared the results they offer for gene co-expression analysis. We also discuss the effect on the results when using different data sets, as well as different data normalization approaches and parameter settings, which often consider different metrics for establishing co-expression. A dedicated example analysis of different gene pools, implemented by including/excluding mutant samples in a reference data set, showed significant variation of gene co-expression occurrence, magnitude and direction. We conclude that, as the heterogeneity of the resources and methods may produce different results for the same query genes, the exploration of more than one of the available resources is strongly recommended. The aim of this article is to show how best to integrate data sources and/or merge outputs to achieve robust analyses and reliable interpretations, thereby making use of diverse data resources an opportunity for added value.
Affiliation(s)
- Pasquale Di Salle
  - Department of Agriculture, University of Naples Federico II, Portici, Italy
- Guido Incerti
  - Dipartimento di Agraria, University of Naples Federico II, via Università, Portici (NA), Italy
- Chiara Colantuono
  - Department of Agriculture, University of Naples Federico II, Portici, Italy
|
26
|
Sonawane AR, Platig J, Fagny M, Chen CY, Paulson JN, Lopes-Ramos CM, DeMeo DL, Quackenbush J, Glass K, Kuijjer ML. Understanding Tissue-Specific Gene Regulation. Cell Rep 2017; 21:1077-1088. [PMID: 29069589 PMCID: PMC5828531 DOI: 10.1016/j.celrep.2017.10.001] [Citation(s) in RCA: 225] [Impact Index Per Article: 32.1] [Received: 05/08/2017] [Revised: 08/09/2017] [Accepted: 09/28/2017] [Indexed: 12/20/2022] Open
Abstract
Although all human tissues carry out common processes, tissues are distinguished by gene expression patterns, implying that distinct regulatory programs control tissue specificity. In this study, we investigate gene expression and regulation across 38 tissues profiled in the Genotype-Tissue Expression project. We find that network edges (transcription factor to target gene connections) have higher tissue specificity than network nodes (genes) and that regulating nodes (transcription factors) are less likely to be expressed in a tissue-specific manner as compared to their targets (genes). Gene set enrichment analysis of network targeting also indicates that the regulation of tissue-specific function is largely independent of transcription factor expression. In addition, tissue-specific genes are not highly targeted in their corresponding tissue network. However, they do assume bottleneck positions due to variability in transcription factor targeting and the influence of non-canonical regulatory interactions. These results suggest that tissue specificity is driven by context-dependent regulatory paths, providing transcriptional control of tissue-specific processes.
Affiliation(s)
- Abhijeet Rajendra Sonawane
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- John Platig
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Maud Fagny
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cho-Yi Chen
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Joseph Nathaniel Paulson
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Camila Miranda Lopes-Ramos
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Dawn Lisa DeMeo
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA
- John Quackenbush
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Kimberly Glass
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- Marieke Lydia Kuijjer
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
|
27
|
Lopes-Ramos CM, Paulson JN, Chen CY, Kuijjer ML, Fagny M, Platig J, Sonawane AR, DeMeo DL, Quackenbush J, Glass K. Regulatory network changes between cell lines and their tissues of origin. BMC Genomics 2017; 18:723. [PMID: 28899340 PMCID: PMC5596945 DOI: 10.1186/s12864-017-4111-x] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 04/05/2017] [Accepted: 09/01/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Cell lines are an indispensable tool in biomedical research and often used as surrogates for tissues. Although there are recognized important cellular and transcriptomic differences between cell lines and tissues, a systematic overview of the differences between the regulatory processes of a cell line and those of its tissue of origin has not been conducted. The RNA-Seq data generated by the GTEx project is the first available data resource in which it is possible to perform a large-scale transcriptional and regulatory network analysis comparing cell lines with their tissues of origin. RESULTS: We compared 127 paired Epstein-Barr virus transformed lymphoblastoid cell lines (LCLs) and whole blood samples, and 244 paired primary fibroblast cell lines and skin samples. While gene expression analysis confirms that these cell lines carry the expression signatures of their primary tissues, albeit at reduced levels, network analysis indicates that expression changes are the cumulative result of many previously unreported alterations in transcription factor (TF) regulation. More specifically, cell cycle genes are over-expressed in cell lines compared to primary tissues, and this alteration in expression is a result of less repressive TF targeting. We confirmed these regulatory changes for four TFs, including SMAD5, using independent ChIP-seq data from ENCODE. CONCLUSIONS: Our results provide novel insights into the regulatory mechanisms controlling the expression differences between cell lines and tissues. The strong changes in TF regulation that we observe suggest that network changes, in addition to transcriptional levels, should be considered when using cell lines as models for tissues.
Affiliation(s)
- Camila M. Lopes-Ramos
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
- Joseph N. Paulson
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
- Cho-Yi Chen
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
- Marieke L. Kuijjer
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
- Maud Fagny
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
- John Platig
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
- Abhijeet R. Sonawane
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA USA
- Dawn L. DeMeo
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA USA
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, MA USA
- John Quackenbush
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215 USA
- Kimberly Glass
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA USA
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA USA
|
28
|
Functional Virtual Flow Cytometry: A Visual Analytic Approach for Characterizing Single-Cell Gene Expression Patterns. Biomed Res Int 2017; 2017:3035481. [PMID: 28798928 PMCID: PMC5536134 DOI: 10.1155/2017/3035481] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/03/2017] [Accepted: 05/22/2017] [Indexed: 01/13/2023]
Abstract
We presented a novel workflow for detecting distribution patterns in cell populations based on single-cell transcriptome study. With the fast adoption of single-cell analysis, a challenge to researchers is how to effectively extract gene features to meaningfully separate the cell population. Considering that coexpressed genes are often functionally or structurally related and the number of coexpressed modules is much smaller than the number of genes, our workflow uses gene coexpression modules as features instead of individual genes. Thus, when the coexpressed modules are summarized into eigengenes, not only can we interactively explore the distribution of cells but also we can promptly interpret the gene features. The interactive visualization is aided by a novel application of spatial statistical analysis to the scatter plots using a clustering index parameter. This parameter helps to highlight interesting 2D patterns in the scatter plot matrix (SPLOM). We demonstrated the effectiveness of the workflow using two large single-cell studies. In the Allen Brain scRNA-seq dataset, the visual analytics suggested a new hypothesis such as the involvement of glutamate metabolism in the separation of the brain cells. In a large glioblastoma study, a sample with a unique cell migration related signature was identified.
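The step of summarizing a coexpression module into an eigengene can be illustrated with a minimal sketch. It assumes the common definition of a module eigengene as the first principal component of the module's standardized expression matrix; the function name `eigengene`, the toy data, and the use of power iteration are illustrative assumptions, not details taken from the cited paper.

```python
from math import sqrt

def eigengene(module_expr, n_iter=200):
    """Approximate a module eigengene (hypothetical helper, not from the paper).

    module_expr: list of per-gene expression lists (genes x samples).
    Each gene is centered and scaled to unit variance; power iteration on
    Z^T Z then yields the leading sample-space principal direction.
    """
    z = []
    for row in module_expr:
        mean = sum(row) / len(row)
        sd = sqrt(sum((v - mean) ** 2 for v in row) / (len(row) - 1))
        z.append([(v - mean) / sd for v in row])
    n_samples = len(z[0])
    # Start from the first standardized gene so the sign follows that gene.
    v = z[0][:]
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(row, v)) for row in z]  # Z v
        w = [sum(s * row[j] for s, row in zip(scores, z))           # Z^T (Z v)
             for j in range(n_samples)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy module: three genes following a shared increasing trend over 5 cells.
module = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.1, 2.9, 4.2, 4.8],
    [0.9, 1.8, 3.2, 3.9, 5.1],
]
eg = eigengene(module)
print([round(x, 3) for x in eg])
```

The result is one value per cell, so a module of many genes collapses into a single feature that can be plotted against other module eigengenes.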
|
29
|
Costa RL, Gadelha L, Ribeiro-Alves M, Porto F. GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis. PeerJ 2017; 5:e3509. [PMID: 28695067 PMCID: PMC5501156 DOI: 10.7717/peerj.3509] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/26/2017] [Accepted: 06/06/2017] [Indexed: 12/28/2022] Open
Abstract
There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, imposing difficulties, either for posterior inspection of results, or for meta-analysis by the incorporation of new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its respective metadata are challenging tasks. Additionally, a great amount of effort is equally required to run in-silico experiments to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clusterization and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were analyzed. 
The results are integrated into GeNNet-DB, a database about genes, clusters, experiments and their properties and relationships. The resulting graph database is explored with queries that demonstrate the expressiveness of this data model for reasoning about gene interaction networks. GeNNet is the first platform to integrate the analytical process of transcriptome data with graph databases. It provides a comprehensive set of tools that would otherwise be challenging for non-expert users to install and use. Developers can add new functionality to components of GeNNet. The derived data allows for testing previous hypotheses about an experiment and exploring new ones through the interactive graph database environment. It enables the analysis of different data on humans, rhesus, mice and rat coming from Affymetrix platforms. GeNNet is available as an open source platform at https://github.com/raquele/GeNNet and can be retrieved as a software container with the command docker pull quelopes/gennet.
Affiliation(s)
- Raquel L. Costa
  - DEXL Lab, National Laboratory for Scientific Computing (LNCC), Petrópolis, Rio de Janeiro, Brazil
  - National Institute of Cancer (INCA), Rio de Janeiro, RJ, Brazil
- Luiz Gadelha
  - DEXL Lab, National Laboratory for Scientific Computing (LNCC), Petrópolis, Rio de Janeiro, Brazil
- Marcelo Ribeiro-Alves
  - Laboratory of Clinical Research in DST- AIDS, National Institute of Infectology Evandro Chagas, Oswaldo Cruz Foundation, Rio de Janeiro, Brazil
- Fábio Porto
  - DEXL Lab, National Laboratory for Scientific Computing (LNCC), Petrópolis, Rio de Janeiro, Brazil
|
30
|
Zhang J, Huang K. Pan-cancer analysis of frequent DNA co-methylation patterns reveals consistent epigenetic landscape changes in multiple cancers. BMC Genomics 2017; 18:1045. [PMID: 28198667 PMCID: PMC5310283 DOI: 10.1186/s12864-016-3259-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND: DNA methylation is the major form of epigenetic modification through which the cell regulates gene expression and silencing. There have been extensive studies on the roles of DNA methylation in cancers, and several cancer drugs have been developed targeting this process. However, DNA co-methylation clusters have not been examined in depth, and co-methylation across multiple cancer types has not been studied previously. RESULTS: In this study, we applied the newly developed lmQCM algorithm to mine co-methylation clusters using methylome data from 11 cancer types in the TCGA database, and found that frequent co-methylated gene clusters exist in these cancer types. Of the four frequent clusters identified, two separate tumor samples from normal samples in 10 out of 11 cancer types, which indicates that consistent epigenetic landscape changes exist in multiple cancer types. CONCLUSION: This discovery provides new insight into epigenetic regulation in cancers and points to potential new directions for epigenetic biomarker and cancer drug discovery. We also found that genes commonly believed to be silenced via hypermethylation in cancers may still display highly variable methylation levels among cancer cells, which should be considered when using them as epigenetic biomarkers.
Affiliation(s)
- Jie Zhang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Kun Huang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
|
31
|
Ivliev AE, ’t Hoen PAC, Borisevich D, Nikolsky Y, Sergeeva MG. Drug Repositioning through Systematic Mining of Gene Coexpression Networks in Cancer. PLoS One 2016; 11:e0165059. [PMID: 27824868 PMCID: PMC5100910 DOI: 10.1371/journal.pone.0165059] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 08/16/2016] [Accepted: 09/12/2016] [Indexed: 11/18/2022] Open
Abstract
Gene coexpression network analysis is a powerful "data-driven" approach essential for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate co-expression data from scattered sources in a concise "meta-analysis" framework. We generated such a resource by exploring gene coexpression networks in 82 microarray datasets from 9 major human cancer types. The analysis was conducted using an elaborate weighted gene coexpression network (WGCNA) methodology and identified over 3,000 robust gene coexpression modules. The modules covered a range of known tumor features, such as proliferation, extracellular matrix remodeling, hypoxia, inflammation, angiogenesis, tumor differentiation programs, specific signaling pathways, genomic alterations, and biomarkers of individual tumor subtypes. To prioritize genes with respect to those tumor features, we ranked genes within each module by connectivity, leading to identification of module-specific functionally prominent hub genes. To showcase the utility of this network information, we positioned known cancer drug targets within the coexpression networks and predicted that Anakinra, an anti-rheumatoid therapeutic agent, may be promising for development in colorectal cancer. We offer a comprehensive, normalized and well documented collection of >3000 gene coexpression modules in a variety of cancers as a rich data resource to facilitate further progress in cancer research.
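The connectivity-based ranking of genes within a module, as described in this abstract, can be illustrated with a minimal sketch. It assumes the common WGCNA convention of a soft-thresholded adjacency |cor|^β and takes a gene's connectivity to be the sum of its adjacencies to the other genes. The gene names, expression values, and β = 6 are illustrative assumptions and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_by_connectivity(expr, beta=6):
    """Rank genes by weighted connectivity k_i = sum_j |cor(i, j)| ** beta.

    beta is an assumed soft-thresholding power, not a value from the study.
    """
    genes = list(expr)
    k = {
        g: sum(abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g)
        for g in genes
    }
    return sorted(genes, key=lambda g: k[g], reverse=True), k


# Toy module: hypothetical gene names with made-up expression values.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 5.1],
    "geneC": [0.9, 2.2, 3.1, 3.9, 4.8],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
ranking, connectivity = rank_by_connectivity(expr)
for gene in ranking:
    print(gene, round(connectivity[gene], 3))
```

In this toy module the three tightly co-expressed genes come out on top as hub candidates, while the weakly correlated `geneD` ranks last.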
Affiliation(s)
- Alexander E. Ivliev
  - A.N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow, Russia
  - IP & Science, Thomson Reuters, Boston, Massachusetts, United States of America
- Peter A. C. ’t Hoen
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Dmitrii Borisevich
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Yuri Nikolsky
  - Institute for General Genetics, Moscow, Russia
  - George Mason University, Fairfax, VA, United States of America
  - Prosapia Genetics, Solana Beach, California, United States of America
- Marina G. Sergeeva
  - A.N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow, Russia
|
32
|
Cao Z, Zhang S. An integrative and comparative study of pan-cancer transcriptomes reveals distinct cancer common and specific signatures. Sci Rep 2016; 6:33398. [PMID: 27633916 PMCID: PMC5025752 DOI: 10.1038/srep33398] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/27/2016] [Accepted: 08/24/2016] [Indexed: 12/11/2022] Open
Abstract
To investigate the commonalities and specificities across tumor lineages, we perform a systematic pan-cancer transcriptomic study across 6744 specimens. We find six pan-cancer subnetwork signatures which relate to cell cycle, immune response, Sp1 regulation, collagen, muscle system and angiogenesis. Moreover, four pan-cancer subnetwork signatures demonstrate strong prognostic potential. We also characterize 16 cancer type-specific subnetwork signatures which show diverse implications to somatic mutations, somatic copy number aberrations, DNA methylation alterations and clinical outcomes. Furthermore, some of them are strongly correlated with histological or molecular subtypes, indicating their implications with tumor heterogeneity. In summary, we systematically explore the pan-cancer common and cancer type-specific gene subnetwork signatures across multiple cancers, and reveal distinct commonalities and specificities among cancers at transcriptomic level.
Affiliation(s)
- Zhen Cao
  - National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Shihua Zhang
  - National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
|
33
|
Cheng F, Liu C, Shen B, Zhao Z. Investigating cellular network heterogeneity and modularity in cancer: a network entropy and unbalanced motif approach. BMC Systems Biology 2016; 10 Suppl 3:65. [PMID: 27585651 PMCID: PMC5009528 DOI: 10.1186/s12918-016-0309-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Cancer is increasingly recognized as a cellular system phenomenon that is attributed to the accumulation of genetic or epigenetic alterations leading to the perturbation of the molecular network architecture. Elucidation of network properties that can characterize tumor initiation and progression, or pinpoint the molecular targets related to drug sensitivity or resistance, is therefore of critical importance for providing systems-level insights into tumorigenesis and clinical outcome in molecularly targeted cancer therapy. RESULTS: In this study, we developed a network-based framework to quantitatively examine cellular network heterogeneity and modularity in cancer. Specifically, we constructed gene co-expressed protein interaction networks derived from large-scale RNA-Seq data across 8 cancer types generated in The Cancer Genome Atlas (TCGA) project. We performed gene network entropy and balanced versus unbalanced motif analysis to investigate cellular network heterogeneity and modularity in tumor versus normal tissues, different stages of progression, and drug resistant versus sensitive cancer cell lines. We found that tumorigenesis could be characterized by a significant increase of gene network entropy in all of the 8 cancer types. The ratio of balanced motifs in normal tissues is higher than that in tumors, while the ratio of unbalanced motifs in tumors is higher than that in normal tissues in all of the 8 cancer types. Furthermore, we showed that network entropy could be used to characterize tumor progression and anticancer drug responses. For example, we found that kinase inhibitor resistant cancer cell lines had higher entropy than sensitive cell lines using the integrative analysis of microarray gene expression and drug pharmacological data collected from the Genomics of Drug Sensitivity in Cancer database. In addition, we provided potential network-level evidence that smoking might increase cancer cellular network heterogeneity and further contribute to tyrosine kinase inhibitor (e.g., gefitinib) resistance. CONCLUSION: In summary, we demonstrated that network properties such as network entropy and unbalanced motifs are associated with tumor initiation, progression, and anticancer drug responses, suggesting new potential network-based prognostic and predictive measures in cancer.
Affiliation(s)
- Feixiong Cheng
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- Chuang Liu
  - Alibaba Research Center for Complexity Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China
- Bairong Shen
  - Center for Systems Biology, Soochow University, Suzhou, China
- Zhongming Zhao
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
  - Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
  - Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
|
34
|
Zhang MH, Shen QH, Qin ZM, Wang QL, Chen X. Systematic tracking of disrupted modules identifies significant genes and pathways in hepatocellular carcinoma. Oncol Lett 2016; 12:3285-3295. [PMID: 27899995 PMCID: PMC5103943 DOI: 10.3892/ol.2016.5039] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/06/2015] [Accepted: 07/12/2016] [Indexed: 12/17/2022] Open
Abstract
The objective of the present study is to identify significant genes and pathways associated with hepatocellular carcinoma (HCC) by systematically tracking the dysregulated modules of re-weighted protein-protein interaction (PPI) networks. Firstly, normal and HCC PPI networks were inferred and re-weighted based on Pearson correlation coefficient. Next, modules in the PPI networks were explored by a clique-merging algorithm, and disrupted modules were identified utilizing a maximum weight bipartite matching in non-increasing order. Then, the gene compositions of the disrupted modules were studied and compared with differentially expressed (DE) genes, and pathway enrichment analysis for these genes was performed based on Expression Analysis Systematic Explorer. Finally, validations of significant genes in HCC were conducted using reverse transcription-quantitative polymerase chain reaction (RT-qPCR) analysis. The present study evaluated 394 disrupted module pairs, which comprised 236 dysregulated genes. When the dysregulated genes were compared with 211 DE genes, a total of 26 common genes [including phospholipase C beta 1, cytochrome P450 (CYP) 2C8 and CYP2B6] were obtained. Furthermore, 6 of these 26 common genes were validated by RT-qPCR. Pathway enrichment analysis of dysregulated genes demonstrated that neuroactive ligand-receptor interaction, purine and drug metabolism, and metabolism of xenobiotics mediated by CYP were significantly disrupted pathways. In conclusion, the present study greatly improved the understanding of HCC in a systematic manner and provided potential biomarkers for early detection and novel therapeutic methods.
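The first step in this abstract, re-weighting PPI edges by the Pearson correlation coefficient in each condition, can be sketched as follows. This is only an illustration under stated assumptions: the edge weight is taken to be the absolute PCC, the gene names and expression values are hypothetical, and the clique-merging and bipartite-matching steps of the study are not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def reweight(edges, expr):
    """Weight each PPI edge by |PCC| of its two genes in one condition.

    Taking the absolute value is an assumption made for this sketch.
    """
    return {(u, v): abs(pearson(expr[u], expr[v])) for u, v in edges}


# Hypothetical PPI edges and made-up expression profiles per condition.
edges = [("g1", "g2"), ("g2", "g3")]
normal = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, 3, 2, 4]}
tumor = {"g1": [1, 2, 3, 4], "g2": [4, 1, 3, 2], "g3": [1, 3, 2, 4]}

w_normal = reweight(edges, normal)
w_tumor = reweight(edges, tumor)

# Edges whose weight changes most between conditions are candidates
# for belonging to a disrupted module.
change = {e: abs(w_normal[e] - w_tumor[e]) for e in edges}
most_disrupted = max(change, key=change.get)
print(most_disrupted, round(change[most_disrupted], 2))
```

Because the sketch uses absolute correlations, an edge whose correlation only flips sign between conditions shows no change in weight. A signed weight would be the alternative if that distinction matters.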
Affiliation(s)
- Meng-Hui Zhang
  - Department of General Surgery, The Fourth Hospital of Jinan, Jinan, Shandong 250031, P.R. China
- Qin-Hai Shen
  - Department of Medicine, Shandong Medical College, Jinan, Shandong 250002, P.R. China
- Zhao-Min Qin
  - Department of Nursing, Shandong Medical College, Jinan, Shandong 250002, P.R. China
- Qiao-Ling Wang
  - Department of Ophthalmology, The Second Hospital of Jinan, Jinan, Shandong 250022, P.R. China
- Xi Chen
  - Department of Ophthalmology, The Ninth Hospital of Chongqing, Chongqing 400700, P.R. China
|
35
|
Han Z, Zhang J, Sun G, Liu G, Huang K. A matrix rank based concordance index for evaluating and detecting conditional specific co-expressed gene modules. BMC Genomics 2016; 17 Suppl 7:519. [PMID: 27556416 PMCID: PMC5001231 DOI: 10.1186/s12864-016-2912-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: Gene co-expression network analysis (GCNA) is widely adopted in bioinformatics and biomedical research with applications such as gene function prediction, protein-protein interaction inference, disease marker identification, and copy number variance discovery. Currently there is a lack of rigorous analysis of the mathematical conditions that a co-expressed gene module should satisfy. METHODS: In this paper, we present a linear algebraic based Centralized Concordance Index (CCI) for evaluating the concordance of co-expressed gene modules from gene co-expression network analysis. The CCI can be used to evaluate the performance of co-expression network analysis algorithms as well as to detect condition specific co-expression modules. We applied CCI in detecting lung tumor specific gene modules. RESULTS AND DISCUSSION: Simulation showed that CCI is a robust indicator for evaluating the concordance of a group of co-expressed genes. The application to lung cancer datasets revealed interesting potential tumor specific genetic alterations including CNVs and even hints of gene fusion. Deeper analysis is required for understanding the molecular mechanisms of all such condition specific co-expression relationships. CONCLUSION: The CCI can be used to evaluate the performance of co-expression network analysis algorithms as well as to detect condition specific co-expression modules. It is shown to be more robust to outliers and interfering modules than density based on Pearson correlation coefficients.
Affiliation(s)
- Zhi Han
  - College of Computer and Control Engineering, Nankai University, Tianjin, China
  - College of Software, Nankai University, Tianjin, China
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH USA
- Jie Zhang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH USA
  - The CCC Biomedical Informatics Shared Resource, The Ohio State University, Columbus, OH USA
- Guoyuan Sun
  - College of Computer and Control Engineering, Nankai University, Tianjin, China
  - College of Software, Nankai University, Tianjin, China
- Gang Liu
  - College of Computer and Control Engineering, Nankai University, Tianjin, China
  - College of Software, Nankai University, Tianjin, China
- Kun Huang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH USA
  - The CCC Biomedical Informatics Shared Resource, The Ohio State University, Columbus, OH USA
|
36
|
Zhang J, Huang K. Normalized lmQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers. Cancer Inform 2016; 13:137-46. [PMID: 27486298 PMCID: PMC4962959 DOI: 10.4137/cin.s14021] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 08/19/2014] [Revised: 04/20/2015] [Accepted: 04/28/2015] [Indexed: 01/14/2023] Open
Abstract
In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications such as social network study and biomedicine. Our approach has two major improvements upon previous work. The first is the use of local maximum edges to initialize the search in order to avoid excessive overlaps among the modules, thereby greatly reducing the computing time. The second is the inclusion of a weight normalization procedure to enable discovery of "subtle" modules with more balanced sizes. We carried out careful tests on multiple parameters and settings using two large cancer datasets. This approach allowed us to identify a large number of gene modules enriched in both biological functions and chromosomal bands in cancer data, suggesting potential roles of copy number variations (CNVs) in cancer development. We then tested the genes in selected modules with enriched chromosomal bands using The Cancer Genome Atlas data, and the results strongly support our hypothesis that the coexpression in these modules is associated with CNVs. While gene coexpression network analyses have been widely adopted in disease studies, most of them focus on the functional relationships of coexpressed genes. The relationship between coexpression gene modules and CNVs is much less investigated despite the potential advantage that we can infer from such a relationship without genotyping data. Our new approach thus provides a means to carry out deep mining of the gene coexpression network to obtain both functional and genetic information from the expression data.
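The first improvement named in this abstract, seeding the search from local maximum edges, can be sketched as below. The sketch assumes that an edge counts as a local maximum when its weight is the largest among all edges touching either of its endpoints. That reading, the function name, and the toy graph are assumptions made for illustration; the expansion of seeds into quasi-cliques and the weight normalization step are not shown.

```python
def local_maximum_edges(weights):
    """Return edges whose weight is maximal among all edges at both endpoints.

    weights: dict mapping an edge (u, v) to its weight in an undirected graph.
    """
    best = {}  # largest edge weight seen at each node
    for edge, w in weights.items():
        for node in edge:
            best[node] = max(best.get(node, float("-inf")), w)
    return [e for e, w in weights.items() if all(w == best[n] for n in e)]


# Toy weighted graph (hypothetical): two dense groups joined by a weak edge.
weights = {
    ("a", "b"): 0.90,
    ("a", "c"): 0.80,
    ("b", "c"): 0.85,
    ("c", "d"): 0.20,
    ("d", "e"): 0.70,
    ("e", "f"): 0.75,
    ("d", "f"): 0.60,
}
seeds = local_maximum_edges(weights)
print(seeds)
```

Each dense group contributes a single seed edge here, which is how this kind of seeding limits overlapping searches that start from many edges of the same module.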
Affiliation(s)
- Jie Zhang
  - Department of Biomedical Informatics and Biomedical Informatics Shared Resource, The Ohio State University, Columbus, USA
- Kun Huang
  - Department of Biomedical Informatics and Biomedical Informatics Shared Resource, The Ohio State University, Columbus, USA
|
37
|
Shroff S, Zhang J, Huang K. Gene Co-Expression Analysis Predicts Genetic Variants Associated with Drug Responsiveness in Lung Cancer. AMIA Joint Summits on Translational Science Proceedings 2016; 2016:32-41. [PMID: 27570645 PMCID: PMC5001757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
Abstract
Responsiveness to drugs is an important concern in designing personalized treatment for cancer patients. Currently, genetic markers are often used to guide targeted therapy. However, a deeper understanding of the molecular basis for drug responses and the discovery of new predictive biomarkers for drug sensitivity are much needed. In this paper, we present a workflow that combines network mining and statistical analysis to identify condition-specific gene co-expression networks associated with responses to the tyrosine kinase inhibitor Erlotinib in lung adenocarcinoma cell lines, using data from the Cancer Cell Line Encyclopedia. In particular, we have identified multiple gene modules specifically co-expressed in the drug responsive cell lines but not in the unresponsive group. Interestingly, most of these modules are enriched on specific cytobands, suggesting potential copy number variation events at these loci. Our results therefore imply that there are multiple genetic loci with copy number variations associated with the Erlotinib responses. The existence of CNVs at these loci is also confirmed in lung cancer tissue samples using the TCGA data. Since these structural variations are inferred from functional genomics data, these CNVs are functional variations. These results suggest that the condition-specific gene co-expression network mining approach is effective in predicting candidate biomarkers for drug responses.
Affiliation(s)
- Sanaya Shroff
  - Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY 14853 (USA)
- Jie Zhang
  - Biomedical Informatics, The Ohio State University, Columbus, OH 43210 (USA)
- Kun Huang
  - Biomedical Informatics, The Ohio State University, Columbus, OH 43210 (USA)
|
38
|
Monzón-Sandoval J, Castillo-Morales A, Urrutia AO, Gutierrez H. Modular reorganization of the global network of gene regulatory interactions during perinatal human brain development. BMC Developmental Biology 2016; 16:13. [PMID: 27175727 PMCID: PMC4866393 DOI: 10.1186/s12861-016-0111-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/15/2015] [Accepted: 04/25/2016] [Indexed: 12/02/2022]
Abstract
Background: During early development of the nervous system, gene expression patterns are known to vary widely depending on the specific developmental trajectories of different structures. Observable changes in gene expression profiles throughout development are determined by an underlying network of precise regulatory interactions between individual genes. Elucidating the organizing principles that shape this gene regulatory network is one of the central goals of developmental biology. Whether the developmental programme is the result of a dynamic driven by a fixed architecture of regulatory interactions or, alternatively, the result of waves of regulatory reorganization is not known. Results: Here we contrast these two alternative models by examining existing expression data derived from the developing human brain in prenatal and postnatal stages. We reveal a sharp change in gene expression profiles at birth across brain areas. This sharp division between foetal and postnatal profiles is not the result of pronounced changes in the level of expression of existing gene networks. Instead, we demonstrate that the perinatal transition is marked by widespread regulatory rearrangement within and across existing gene clusters, leading to the emergence of new functional groups. This rearrangement is itself organized into discrete blocks of genes, each targeted by a distinct set of transcriptional regulators and associated with specific biological functions. Conclusions: Our results provide evidence of an acute modular reorganization of the regulatory architecture of the brain transcriptome occurring at birth, reflecting the reassembly of new functional associations required for the normal transition from prenatal to postnatal brain development.
Affiliation(s)
- Jimena Monzón-Sandoval
  - School of Life Sciences, University of Lincoln, Lincoln, LN6 7DL, UK
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Atahualpa Castillo-Morales
  - School of Life Sciences, University of Lincoln, Lincoln, LN6 7DL, UK
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Araxi O Urrutia
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
  - Milner Centre for Evolution, University of Bath, Bath, BA2 7AY, UK
|
39
|
Discovering gene re-ranking efficiency and conserved gene-gene relationships derived from gene co-expression network analysis on breast cancer data. Sci Rep 2016; 6:20518. [PMID: 26892392 PMCID: PMC4759568 DOI: 10.1038/srep20518] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 10/16/2015] [Accepted: 01/05/2016] [Indexed: 12/18/2022] Open
Abstract
Systemic approaches are essential in the discovery of disease-specific genes, offering a different perspective and new tools on the analysis of several types of molecular relationships, such as gene co-expression or protein-protein interactions. However, due to lack of experimental information, this analysis is not fully applicable. The aim of this study is to reveal the multi-potent contribution of statistical network inference methods in highlighting significant genes and interactions. We have investigated the ability of statistical co-expression networks to highlight and prioritize genes for breast cancer subtypes and stages in terms of: (i) classification efficiency, (ii) gene network pattern conservation, (iii) indication of involved molecular mechanisms and (iv) systems level momentum to drug repurposing pipelines. We have found that statistical network inference methods are advantageous in gene prioritization, are capable to contribute to meaningful network signature discovery, give insights regarding the disease-related mechanisms and boost drug discovery pipelines from a systems point of view.
|
40
|
Chen D, Zhang Z, Meng Y. Systematic Tracking of Disrupted Modules Identifies Altered Pathways Associated with Congenital Heart Defects in Down Syndrome. Med Sci Monit 2015; 21:3334-42. [PMID: 26524729 PMCID: PMC4635630 DOI: 10.12659/msm.896001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022] Open
Abstract
BACKGROUND: This work aimed to identify altered pathways in congenital heart defects (CHD) in Down syndrome (DS) by systematically tracking the dysregulated modules of reweighted protein-protein interaction (PPI) networks. MATERIAL AND METHODS: We performed systematic identification and comparison of modules across normal and disease conditions by integrating PPI and gene-expression data. Based on the Pearson correlation coefficient (PCC), normal and disease PPI networks were inferred and reweighted. Then, modules in the PPI networks were explored by a clique-merging algorithm; altered modules were identified via maximum weight bipartite matching and ranked in non-increasing order. Finally, pathway enrichment analysis of genes in altered modules was carried out based on the Database for Annotation, Visualization, and Integrated Discovery (DAVID) to study the biological pathways in CHD in DS. RESULTS: Our analyses identified 348 altered modules by comparing modules in normal and disease PPI networks. Pathway functional enrichment analysis of disrupted module genes showed that the 4 most significantly altered pathways were: ECM-receptor interaction, purine metabolism, focal adhesion, and dilated cardiomyopathy. CONCLUSIONS: We successfully identified 4 altered pathways and we predicted that these pathways would be good indicators for CHD in DS.
Affiliation(s)
- Denghong Chen
  - Department of Obstetrics, Jining No. 1 People's Hospital, Jining, Shandong, China (mainland)
- Zhenhua Zhang
  - Department of Children's Health Prevention, Jining No. 1 People's Hospital, Jining, Shandong, China (mainland)
- Yuxiu Meng
  - Department of Neonatology, Jining No. 1 People's Hospital, Jining, Shandong, China (mainland)
|
41
|
Monzón-Sandoval J, Castillo-Morales A, Crampton S, McKelvey L, Nolan A, O'Keeffe G, Gutierrez H. Modular and coordinated expression of immune system regulatory and signaling components in the developing and adult nervous system. Front Cell Neurosci 2015; 9:337. [PMID: 26379506 PMCID: PMC4551857 DOI: 10.3389/fncel.2015.00337] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/25/2015] [Accepted: 08/14/2015] [Indexed: 12/14/2022] Open
Abstract
During development, the nervous system (NS) is assembled and sculpted through a concerted series of neurodevelopmental events orchestrated by a complex genetic programme. While neural-specific gene expression plays a critical part in this process, in recent years, a number of immune-related signaling and regulatory components have also been shown to play key physiological roles in the developing and adult NS. While the involvement of individual immune-related signaling components in neural functions may reflect their ubiquitous character, it may also reflect a much wider, as yet undescribed, genetic network of immune-related molecules acting as an intrinsic component of the neural-specific regulatory machinery that ultimately shapes the NS. In order to gain insights into the scale and wider functional organization of immune-related genetic networks in the NS, we examined the large scale pattern of expression of these genes in the brain. Our results show a highly significant correlated expression and transcriptional clustering among immune-related genes in the developing and adult brain, and this correlation was the highest in the brain when compared to muscle, liver, kidney and endothelial cells. We experimentally tested the regulatory clustering of immune system (IS) genes by using microarray expression profiling in cultures of dissociated neurons stimulated with the pro-inflammatory cytokine TNF-alpha, and found a highly significant enrichment of immune system-related genes among the resulting differentially expressed genes. Our findings strongly suggest a coherent recruitment of entire immune-related genetic regulatory modules by the neural-specific genetic programme that shapes the NS.
Affiliation(s)
- Jimena Monzón-Sandoval
  - School of Life Sciences, University of Lincoln, Lincoln, UK
  - Department of Biology and Biochemistry, University of Bath, Bath, UK
- Atahualpa Castillo-Morales
  - School of Life Sciences, University of Lincoln, Lincoln, UK
  - Department of Biology and Biochemistry, University of Bath, Bath, UK
- Sean Crampton
  - Department of Anatomy and Neuroscience, Biosciences Institute, University College Cork, Cork, Ireland
- Laura McKelvey
  - Department of Anatomy and Neuroscience, Biosciences Institute, University College Cork, Cork, Ireland
- Aoife Nolan
  - Department of Anatomy and Neuroscience, Biosciences Institute, University College Cork, Cork, Ireland
- Gerard O'Keeffe
  - Department of Anatomy and Neuroscience, Biosciences Institute, University College Cork, Cork, Ireland
  - Irish Centre for Fetal and Neonatal Translational Research (INFANT), Cork University Maternity Hospital, Cork, Ireland
|
42
|
Abstract
Gene coexpression networks inferred by correlation from high-throughput profiling such as microarray data represent simple but effective structures for discovering and interpreting linear gene relationships. In recent years, several approaches have been proposed to tackle the problem of deciding when the resulting correlation values are statistically significant. This is most crucial when the number of samples is small, yielding a non-negligible chance that even high correlation values are due to random effects. Here we introduce a novel hard thresholding solution based on the assumption that a coexpression network inferred by randomly generated data is expected to be empty. The threshold is theoretically derived by means of an analytic approach and, as a deterministic independent null model, it depends only on the dimensions of the starting data matrix, with assumptions on the skewness of the data distribution compatible with the structure of gene expression levels data. We show, on synthetic and array datasets, that the proposed threshold is effective in eliminating all false positive links, with an offsetting cost in terms of false negative detected edges.
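The motivation stated in this abstract, that small sample sizes let pure noise produce high correlations, can be checked with a short simulation. This is a Monte Carlo illustration only and is not the analytic threshold derived in the study. The Gaussian noise model, the matrix dimensions, and the function names are assumptions made for the sketch.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def max_null_correlation(n_genes, n_samples, rng):
    """Largest |r| over all gene pairs in a matrix of pure Gaussian noise."""
    data = [
        [rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_genes)
    ]
    return max(abs(pearson(a, b)) for a, b in combinations(data, 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
null_small = max_null_correlation(n_genes=30, n_samples=5, rng=rng)
null_large = max_null_correlation(n_genes=30, n_samples=50, rng=rng)
print(f"max |r| under noise: n=5 -> {null_small:.2f}, n=50 -> {null_large:.2f}")
```

A hard threshold set above the largest correlation expected from noise of the same dimensions leaves a network built from random data empty, which is the design goal the abstract describes. The price is that some true but weaker links are also discarded.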
|
43
|
Differential expression of inflammasomes in lung cancer cell lines and tissues. Tumour Biol 2015; 36:7501-13. [PMID: 25910707 DOI: 10.1007/s13277-015-3473-4] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Received: 03/05/2015] [Accepted: 04/15/2015] [Indexed: 12/22/2022] Open
Abstract
As pivotal elements involved in inflammation, inflammasomes represent a group of multiprotein complexes triggering the maturation of proinflammatory cytokine interleukin (IL)-1β and IL-18. Although the importance of the inflammasomes in inflammatory diseases is well appreciated, a precise characterization of their expressions in lung cancer remains obscure. This study aimed to determine the expressions of inflammasomes in various lung cancer cell lines and tissues to understand their potential roles in lung cancer. Our findings showed that inflammasome components were markedly upregulated in lung cancer and elicited the maturation of IL-1β and IL-18. In addition, enormous variations in subtypes and levels of inflammasomes were detected in lung cancers depending on their histological type and grading, invasion ability, as well as chemoresistance. Generally, AIM2 inflammasome was overexpressed in nonsmall cell lung cancer (NSCLC), while NLRP3 inflammasome was upregulated in lung adenocarcinoma (ADC) and small cell lung cancer (SCLC). The high-metastatic or cisplatin-sensitive NSCLC cells expressed more inflammasome components and products than their counterpart low-metastatic or cisplatin-resistant NSCLC cells, respectively. In resected lung cancer tissues, high-grade ADC expressed more inflammasome components and products than low-grade ADC. Together, these findings suggest that inflammasomes may be crucial biomarkers for lung cancer as well as potential modulators of the biological behaviors of lung cancer. Further, pharmacotherapeutics targeting inflammasomes might be novel adjuvant therapy strategies for lung cancer.
|
44
|
Bo V, Tucker A. Integrating Gene Regulatory Networks to identify cancer-specific genes. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2015; 2015:21-5. [PMID: 26306224 PMCID: PMC4525222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
Consensus approaches have been widely used to identify Gene Regulatory Networks (GRNs) that are common to multiple studies. However, in this research we develop an application that semi-automatically identifies key mechanisms that are specific to a particular set of conditions. We analyse four different types of cancer to identify gene pathways unique to each of them. To support the reliability of the results, we calculate the prediction accuracy of each gene for the specified conditions and compare it to predictions on other conditions. The most predictive genes are validated using the GeneCards encyclopaedia coupled with a statistical test for validating clusters. Finally, we implement an interface that allows the user to identify unique subnetworks of any selected combination of studies using AND and NOT logic operators. Results show that unique genes and sub-networks can be reliably identified and that they reflect key mechanisms that are fundamental to the cancer types under study.
Affiliation(s)
- Valeria Bo
  - Department of Computer Science, Brunel University, London, UK
- Allan Tucker
  - Department of Computer Science, Brunel University, London, UK
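The AND/NOT selection of unique subnetworks mentioned in the Bo and Tucker abstract above can be illustrated with plain set operations. This is only a minimal sketch, not the authors' application: it assumes each study's network is available as a set of undirected gene-gene edges, and the names `edge`, `unique_subnetwork`, and the toy studies and genes are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set

# An undirected edge is modelled as a frozenset of two gene names (assumption).
Edge = FrozenSet[str]


def edge(a: str, b: str) -> Edge:
    """Build an undirected edge so that (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


def unique_subnetwork(
    networks: Dict[str, Set[Edge]],
    include: Iterable[str],
    exclude: Iterable[str],
) -> Set[Edge]:
    """Edges present in every 'include' study (AND) and in no 'exclude' study (NOT)."""
    include = list(include)
    if not include:
        return set()
    # AND: intersect the edge sets of all selected studies (returns a new set).
    common = set.intersection(*(networks[name] for name in include))
    # NOT: drop any edge that also appears in an excluded study.
    for name in exclude:
        common -= networks[name]
    return common


# Hypothetical toy networks, for illustration only.
networks = {
    "study_a": {edge("G1", "G2"), edge("G2", "G3"), edge("G3", "G4")},
    "study_b": {edge("G1", "G2"), edge("G2", "G3")},
    "study_c": {edge("G1", "G2"), edge("G4", "G5")},
}

result = unique_subnetwork(networks, include=["study_a", "study_b"], exclude=["study_c"])
print(sorted(sorted(e) for e in result))  # [['G2', 'G3']]
```

Representing edges as frozensets keeps the comparison order-independent, which suits undirected networks; a directed network would need ordered tuples instead.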
|
45
|
Bo V, Curtis T, Lysenko A, Saqi M, Swift S, Tucker A. Discovering study-specific gene regulatory networks. PLoS One 2014; 9:e106524. [PMID: 25191999 PMCID: PMC4156366 DOI: 10.1371/journal.pone.0106524] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 04/05/2014] [Accepted: 08/01/2014] [Indexed: 11/18/2022] Open
Abstract
Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Due to their structure, typically containing a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches have been recently used that combine multiple microarray studies in order to find networks that are more robust. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso, which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (the inter prediction accuracy). Finally, we compare our method's results to related state-of-the-art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks can be identified reliably that are specific to subsets of studies and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets.
Affiliation(s)
- Valeria Bo
  - Department of Information System and Computing, Brunel University, London, United Kingdom
- Stephen Swift
  - Department of Information System and Computing, Brunel University, London, United Kingdom
- Allan Tucker
  - Department of Information System and Computing, Brunel University, London, United Kingdom
|
46
|
Ding H, Wang C, Huang K, Machiraju R. iGPSe: a visual analytic system for integrative genomic based cancer patient stratification. BMC Bioinformatics 2014; 15:203. [PMID: 25000928 PMCID: PMC4227100 DOI: 10.1186/1471-2105-15-203] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 04/24/2014] [Accepted: 06/10/2014] [Indexed: 12/21/2022] Open
Abstract
Background: Cancers are highly heterogeneous with different subtypes. These subtypes often possess different genetic variants, present different pathological phenotypes, and most importantly, show various clinical outcomes such as varied prognosis, response to treatment, and likelihood of recurrence and metastasis. Recently, integrative genomics (or panomics) approaches have often been adopted with the goal of combining multiple types of omics data to identify integrative biomarkers for stratification of patients into groups with different clinical outcomes. Results: In this paper we present a visual analytic system called Interactive Genomics Patient Stratification explorer (iGPSe) which significantly reduces the computing burden for biomedical researchers in the process of exploring complicated integrative genomics data. Our system integrates unsupervised clustering with graph and parallel sets visualization and allows direct comparison of clinical outcomes via survival analysis. Using a breast cancer dataset obtained from The Cancer Genome Atlas (TCGA) project, we are able to quickly explore different combinations of gene expression (mRNA) and microRNA features and identify potential combined markers for survival prediction. Conclusions: Visualization plays an important role in the process of stratifying a given patient population. Visual tools allowed for the selection of possible features across various datasets for the given patient population. We essentially made a case for visualization for a very important problem in translational informatics.
Affiliation(s)
- Kun Huang
  - Department of Computer Science and Engineering and Biomedical Informatics, The Ohio State University, 43210 Columbus, OH, USA.
|
47
|
Puniya BL, Kulshreshtha D, Verma SP, Kumar S, Ramachandran S. Integrated gene co-expression network analysis in the growth phase of Mycobacterium tuberculosis reveals new potential drug targets. MOLECULAR BIOSYSTEMS 2014; 9:2798-815. [PMID: 24056838 DOI: 10.1039/c3mb70278b] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
We have carried out weighted gene co-expression network analysis of Mycobacterium tuberculosis to gain insights into gene expression architecture during log phase growth. The differentially expressed genes between at least one pair of 11 different M. tuberculosis strains as a source of biological variability were used for co-expression network analysis. These data included genes with the highest coefficient of variation in expression. Five distinct modules were identified using topological overlap based clustering. All the modules together showed significant enrichment in biological processes: fatty acid biosynthesis, cell membrane, intracellular membrane bound organelle, DNA replication, quinone biosynthesis, cell shape and peptidoglycan biosynthesis, ribosome and structural constituents of ribosome and transposition. We then extracted the co-expressed connections which were supported either by transcriptional regulatory network or STRING database or high edge weight of topological overlap. The genes trpC, nadC, pitA, Rv3404c, atpA, pknA, Rv0996, purB, Rv2106 and Rv0796 emerged as top hub genes. After overlaying this network on the iNJ661 metabolic network, the reactions catalyzed by 15 highly connected metabolic genes were knocked down in silico and evaluated by Flux Balance Analysis. The results showed that in 12 out of 15 cases, in 11 more than 50% of reactions catalyzed by genes connected through co-expressed connections also had altered fluxes. The modules 'Turquoise', 'Blue' and 'Red' also showed enrichment in essential genes. We could map 152 of the previously known or proposed drug targets in these modules and identified 15 new potential drug targets based on their high degree of co-expressed connections and strong correlation with module eigengenes.
Affiliation(s)
- Bhanwar Lal Puniya
  - G N Ramachandran Knowledge Centre for Genome Informatics, CSIR - Institute of Genomics and Integrative Biology, Mall Road, Delhi 110007, India.
|
48
|
Liu W, Li L, Li W. Gene co-expression analysis identifies common modules related to prognosis and drug resistance in cancer cell lines. Int J Cancer 2014; 135:2795-803. [PMID: 24771271 DOI: 10.1002/ijc.28935] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 01/07/2014] [Revised: 04/12/2014] [Accepted: 04/16/2014] [Indexed: 11/08/2022]
Abstract
To discover a common gene co-expression network in cancer cells, we applied weighted gene co-expression network analysis to transcriptional profiles of 917 cancer cell lines. Fourteen biologically meaningful modules were identified, including cytoskeleton, cell cycle, RNA splicing, signaling pathway, transcription, translation and others. These modules were robust in an independent human cancer microarray dataset. Furthermore, we collected 11 independent cancer microarray datasets, and correlated these modules with clinical outcome. Most of these modules could predict patient survival in one or more cancer types. Some modules were predictive of relapse, metastasis and drug resistance. Novel regulatory mechanisms were also implicated. In summary, our findings, for the first time, provide a modular map for cancer cell lines, new targets for therapy and modules for the regulatory mechanisms of cancer development and drug resistance.
Affiliation(s)
- Wei Liu
  - Department of Pathology, Human Centrifuge Medical Training Center, Institute of Aviation Medicine of Chinese PLA Air Force, Beijing, China
|
49
|
Abstract
Most fungal genomes are poorly annotated, and many fungal traits of industrial and biomedical relevance are not well suited to classical genetic screens. Assigning genes to phenotypes on a genomic scale thus remains an urgent need in the field. We developed an approach to infer gene function from expression profiles of wild fungal isolates, and we applied our strategy to the filamentous fungus Neurospora crassa. Using transcriptome measurements in 70 strains from two well-defined clades of this microbe, we first identified 2,247 cases in which the expression of an unannotated gene rose and fell across N. crassa strains in parallel with the expression of well-characterized genes. We then used image analysis of hyphal morphologies, quantitative growth assays, and expression profiling to test the functions of four genes predicted from our population analyses. The results revealed two factors that influenced regulation of metabolism of nonpreferred carbon and nitrogen sources, a gene that governed hyphal architecture, and a gene that mediated amino acid starvation resistance. These findings validate the power of our population-transcriptomic approach for inference of novel gene function, and we suggest that this strategy will be of broad utility for genome-scale annotation in many fungal systems.
IMPORTANCE Some fungal species cause deadly infections in humans or crop plants, and other fungi are workhorses of industrial chemistry, including the production of biofuels. Advances in medical and industrial mycology require an understanding of the genes that control fungal traits. We developed a method to infer functions of uncharacterized genes by observing correlated expression of their mRNAs with those of known genes across wild fungal isolates. We applied this strategy to a filamentous fungus and predicted functions for thousands of unknown genes. In four cases, we experimentally validated the predictions from our method, discovering novel genes involved in the metabolism of nutrient sources relevant for biofuel production, as well as colony morphology and starvation resistance. Our strategy is straightforward, inexpensive, and applicable for predicting gene function in many fungal species.
|
50
|
Kotian S, Banerjee T, Lockhart A, Huang K, Catalyurek UV, Parvin JD. NUSAP1 influences the DNA damage response by controlling BRCA1 protein levels. Cancer Biol Ther 2014; 15:533-43. [PMID: 24521615 DOI: 10.4161/cbt.28019] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 02/06/2023] Open
Abstract
NUSAP1 has been reported to function in mitotic spindle assembly, chromosome segregation, and regulation of cytokinesis. In this study, we find that NUSAP1 has hitherto unknown functions in the key BRCA1-regulated pathways of double strand DNA break repair and centrosome duplication. Both these pathways are important for maintenance of genomic stability, and any defects in these pathways can cause tumorigenesis. Depletion of NUSAP1 from cells led to the suppression of double strand DNA break repair via the homologous recombination and single-strand annealing pathways. The presence of NUSAP1 was also found to be important for the control of centrosome numbers. We have found evidence that NUSAP1 plays a role in these processes through regulation of BRCA1 protein levels, and BRCA1 overexpression from a plasmid mitigates the defective phenotypes seen upon NUSAP1 depletion. We found that after NUSAP1 depletion there is a decrease in BRCA1 recruitment to ionizing radiation-induced foci. Results from this study reveal a novel association between BRCA1 and NUSAP1 and suggest a mechanism whereby NUSAP1 is involved in carcinogenesis.
Affiliation(s)
- Shweta Kotian
  - Department of Biomedical Informatics; The Ohio State University Comprehensive Cancer Center; The Ohio State University; Columbus, OH USA
- Tapahsama Banerjee
  - Department of Biomedical Informatics; The Ohio State University Comprehensive Cancer Center; The Ohio State University; Columbus, OH USA
- Ainsley Lockhart
  - Department of Biomedical Informatics; The Ohio State University Comprehensive Cancer Center; The Ohio State University; Columbus, OH USA
- Kun Huang
  - Department of Biomedical Informatics; The Ohio State University Comprehensive Cancer Center; The Ohio State University; Columbus, OH USA
- Umit V Catalyurek
  - Department of Biomedical Informatics; The Ohio State University Comprehensive Cancer Center; The Ohio State University; Columbus, OH USA
- Jeffrey D Parvin
  - Department of Biomedical Informatics; The Ohio State University Comprehensive Cancer Center; The Ohio State University; Columbus, OH USA
|