1
Wang Y, Wang T, Cao Y, Qiao X, Han X, Liu ZP. TopMarker: Computational screening biomarkers of hepatocellular carcinoma from transcriptome and interactome based on differential network topological parameters. Comput Biol Chem 2024; 112:108166. [PMID: 39111022 DOI: 10.1016/j.compbiolchem.2024.108166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 07/23/2024] [Accepted: 07/31/2024] [Indexed: 09/13/2024]
Abstract
Identifying diagnostic biomarkers for cancer is crucial in the field of personalized medicine. The available transcriptome and interactome provide unprecedented opportunities and challenges for biomarker screening. From a systematic perspective, network-based medicine methods provide alternative approaches to organizing the available high-throughput omics data for deciphering molecular interactions and their associations with phenotypic states. In this work, we propose a bioinformatics strategy named TopMarker for discovering diagnostic biomarkers by comparing the network topology differences in control and disease samples. Specifically, we build up gene-gene interaction networks in the two states of control and disease respectively. The network rewiring status across the two networks results in differential network topologies reflecting dynamics and changes in normal samples when compared with those in disease. Thus, we identify the potential biomarker genes with differential network topological parameters between the control and disease gene networks. For a proof-of-concept study, we introduce the computational pipeline of biomarker discovery in hepatocellular carcinoma (HCC). We prove the effectiveness of the proposed TopMarker method using these candidate biomarkers in classifying HCC samples and validate its signature capability across numerous independent datasets. We also compare the discriminant power of biomarker genes identified by TopMarker with those identified by other baseline methods. The higher classification performances and functional implications indicate the advantages of our proposed method for discovering biomarkers from differential network topology.
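The comparison of network topology between control and disease states can be illustrated with a minimal sketch. It assumes node degree as the single topological parameter and uses hypothetical toy edge lists with placeholder gene names; the abstract does not say which parameters TopMarker actually uses, so this is an illustration of the general idea rather than the authors' pipeline.

```python
from collections import defaultdict


def degree(edges):
    """Return {gene: degree} for an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def differential_degree(control_edges, disease_edges):
    """Rank genes by |degree in disease network - degree in control network|."""
    d_ctrl = degree(control_edges)
    d_dis = degree(disease_edges)
    genes = set(d_ctrl) | set(d_dis)
    diff = {g: abs(d_dis.get(g, 0) - d_ctrl.get(g, 0)) for g in genes}
    # Sort by decreasing difference, then by gene name for a stable order.
    return sorted(diff.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy networks; gene names are placeholders, not results from the paper.
control = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
disease = [("A", "B"), ("C", "D"), ("D", "E"), ("D", "A")]

ranking = differential_degree(control, disease)
for gene, delta in ranking:
    print(gene, delta)
```

Genes at the top of such a ranking are the ones whose connectivity changes most between the two states; in practice other parameters (for example betweenness or clustering coefficient) could be substituted for degree.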
Affiliation(s)
- Yanqiu Wang
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Tong Wang
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Yi Cao
- Center for Biomedical Engineering, University of Science and Technology of China, Hefei, Anhui 230027, China
- Xu Qiao
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Xianhua Han
- Faculty of Science, Yamaguchi University, Yamaguchi 753-8511, Japan
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
2
Liu ZP, Wang T. Network biology approach unveils transcriptomic alterations triggered by particle radiation. MOLECULAR THERAPY. NUCLEIC ACIDS 2024; 35:102294. [PMID: 39252875 PMCID: PMC11382101 DOI: 10.1016/j.omtn.2024.102294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Affiliation(s)
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Tong Wang
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
3
Shen C, Cao Y, Qi GQ, Huang J, Liu ZP. Discovering pathway biomarkers of hepatocellular carcinoma occurrence and development by dynamic network entropy analysis. Gene 2023; 873:147467. [PMID: 37164125 DOI: 10.1016/j.gene.2023.147467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 04/26/2023] [Accepted: 05/03/2023] [Indexed: 05/12/2023]
Abstract
OBJECTIVE Gene expression profiling techniques measure the transcription of thousands of genes in a parallel manner. With more and more hepatocellular carcinoma (HCC) transcriptomic data becoming available, the high-throughput data provides an unprecedented opportunity to discover HCC diagnostic biomarkers. In this work, we propose a bioinformatics method based on dynamic network entropy analysis, called DNEA, to identify potential pathway biomarkers for HCC occurrence and development by integrating transcriptome and interactome. METHODS We firstly collect the pathways documented in different knowledge-bases and then impose the genome-wide human transcriptomic data of multistage cancerous tissues during the development and progression of HCC. After linking the gene sets of pathways into individual connected networks, we map the corresponding gene expression information onto these pathways. The dynamic network entropy of individual pathways is calculated to evaluate its activities and dysfunctionalities during the disease occurrence and development. We use the overall significant difference in the entropic dynamics during the time course to prioritize distinctive pathways during disease progression. Then machine learning classification methods are employed to screen out pathway biomarkers with the classification ability to distinguish different-stage samples of HCC progression. RESULTS Pathway biomarkers discovered based on DNEA demonstrate good classification performance in measuring HCC progression. The classification accuracy is as follows: DNA replication pathway (mean AUC= 0.82, 20 genes) from KEGG, FMLP pathway (mean AUC=0.84, 14 genes) from BioCarta, and downstream signaling of activated FGFR pathway (mean AUC =0.80, 15 genes) from Reactome. At the same time, previous studies have shown that these genes and pathways screened are closely related to the occurrence and development of HCC in terms of oncogenesis dysfunctions. 
CONCLUSIONS Our method for cancer biomarker discovery based on dynamic network entropy analysis is effective and efficient in identifying pathway biomarkers related to the progression of complex diseases.
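Tracking a pathway's entropy across disease stages can be sketched as follows. The abstract does not give the DNEA entropy formula, so this sketch substitutes a simplified stand-in: the Shannon entropy of a pathway's normalized expression profile, computed per stage. The stage names and expression values are hypothetical.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (natural log) of non-negative weights after normalization."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)


# Hypothetical expression levels of four genes in one pathway at three stages.
stages = {
    "normal": [5.0, 5.0, 5.0, 5.0],
    "early": [8.0, 6.0, 4.0, 2.0],
    "advanced": [17.0, 1.0, 1.0, 1.0],
}

entropy_by_stage = {stage: shannon_entropy(x) for stage, x in stages.items()}
for stage, h in entropy_by_stage.items():
    print(f"{stage}: {h:.3f}")
```

In this toy case the entropy falls as expression concentrates on one gene; a pathway whose entropy trajectory differs strongly between stages would be a candidate for further screening with a classifier.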
Affiliation(s)
- Chen Shen
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China; Department of Data and Information, The Children's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310052, China; Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, Zhejiang 310052, China
- Yi Cao
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China; Center for Biomedical Engineering, University of Science and Technology of China, Hefei, Anhui 230027, China
- Guo-Qiang Qi
- Department of Data and Information, The Children's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310052, China; Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, Zhejiang 310052, China
- Jian Huang
- Department of Data and Information, The Children's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310052, China; Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, Zhejiang 310052, China
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
4
Li L, Ching WK, Liu ZP. Robust biomarker screening from gene expression data by stable machine learning-recursive feature elimination methods. Comput Biol Chem 2022; 100:107747. [DOI: 10.1016/j.compbiolchem.2022.107747] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2022] [Revised: 06/17/2022] [Accepted: 07/25/2022] [Indexed: 11/03/2022]
5
Qiao X, Zhang X, Chen W, Xu X, Chen YW, Liu ZP. tensorGSEA: Detecting Differential Pathways in Type 2 Diabetes via Tensor-Based Data Reconstruction. Interdiscip Sci 2022; 14:520-531. [PMID: 35195883 DOI: 10.1007/s12539-022-00506-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2021] [Revised: 01/24/2022] [Accepted: 02/07/2022] [Indexed: 06/14/2023]
Abstract
Detecting significant signaling pathways in disease progression highlights the dysfunctions and pathogenic mechanisms of complex disease development. Since tensor decomposition has been proven effective for multi-dimensional data representation and reconstruction, differences between original and tensor-processed data are expected to extract crucial information and differential indication. This paper provides a tensor-based gene set enrichment analysis, called tensorGSEA, based on a data reconstruction method to identify relevant significant pathways during disease development. As a proof-of-concept study, we identify the differential pathways of diabetes in rats. Specifically, we first arrange gene expression profiles of each documented pathway as tensors with three dimensions: genes, samples, and periods. Then we compress tensors into core tensors with lower ranks. The pathways with lower reconstruction rates are obtained after reconstructing gene expression profiles in another state via these cores. Thus, differences underlying pathways are extracted by cross-state data reconstruction between controls and diseases. The experiments reveal several critical pathways with diabetes-specific functions which otherwise cannot be identified by alternative methods. Our proposed tensorGSEA is efficient in evaluating pathways by achieving their empirical statistical significance, respectively. The classification experiments demonstrate that the selected pathways can be implemented as biomarkers to identify the diabetic state. The code of tensorGSEA is available at https://github.com/zhxr37/tensorGSEA .
Affiliation(s)
- Xu Qiao
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong, China
- Xianru Zhang
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong, China
- Wei Chen
- Shandong Provincial Key Laboratory of Oral Tissue Regeneration, School of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
- Xin Xu
- Shandong Provincial Key Laboratory of Oral Tissue Regeneration, School of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
- Yen-Wei Chen
- Graduate School of Information Science and Engineering, Ritsumeikan University, Shiga, 525-8577, Japan
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong, China
6
Bajo-Morales J, Galvez JM, Prieto-Prieto JC, Herrera LJ, Rojas I, Castillo-Secilla D. Heterogeneous Gene Expression Cross-Evaluation of Robust Biomarkers Using Machine Learning Techniques Applied to Lung Cancer. Curr Bioinform 2022. [DOI: 10.2174/1574893616666211005114934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Background:
Nowadays, gene expression analysis is one of the most promising pillars for understanding and uncovering the mechanisms underlying the development and spread of cancer. In this sense, Next Generation Sequencing technologies, such as RNA-Seq, are currently leading the market due to their precision and cost. Nevertheless, there is still an enormous amount of non-analyzed data obtained from older technologies, such as Microarray, which could still be useful to extract relevant knowledge.
Methods:
Throughout this research, a complete machine learning methodology to cross-evaluate the compatibility between both RNA-Seq and Microarray sequencing technologies is described and implemented. In order to show a real application of the designed pipeline, a lung cancer case study is addressed by considering two detected subtypes: adenocarcinoma and squamous cell carcinoma. Transcriptomic datasets considered for our study have been obtained from the public repositories NCBI/GEO, ArrayExpress and GDC-Portal. From them, several gene experiments have been carried out with the aim of finding gene signatures for these lung cancer subtypes, linked to both transcriptomic technologies. With these DEGs selected, intelligent predictive models capable of classifying new samples belonging to these cancer subtypes have been developed.
Results:
The predictive models built using one technology are capable of discerning samples from a different technology. The classification results are evaluated in terms of accuracy, F1-score and ROC curves along with AUC. Finally, the biological information of the gene sets obtained and their relationship with lung cancer are reviewed, encountering strong biological evidence linking them to the disease.
Conclusion:
Our method has the capability of finding strong gene signatures which are also independent of the transcriptomic technology used to develop the analysis. In addition, our article highlights the potential of using heterogeneous transcriptomic data to increase the amount of samples for the studies, increasing the statistical significance of the results.
Affiliation(s)
- Javier Bajo-Morales
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Juan Manuel Galvez
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Juan Carlos Prieto-Prieto
- Nuclear Medicine Department, IMIBIC, University Hospital Reina Sofia, Menéndez Pidal Avenue, 14004, Córdoba, Spain
- Luis Javier Herrera
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Ignacio Rojas
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Daniel Castillo-Secilla
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
7
Li L, Liu ZP. A connected network-regularized logistic regression model for feature selection. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02877-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
8
Wang Y, Liu ZP. Identifying biomarkers for breast cancer by gene regulatory network rewiring. BMC Bioinformatics 2022; 22:308. [PMID: 35045805 PMCID: PMC8772043 DOI: 10.1186/s12859-021-04225-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2021] [Accepted: 06/01/2021] [Indexed: 12/09/2022] Open
Abstract
Background Mining gene regulatory networks (GRNs) is an important avenue for addressing cancer mechanisms. Mutations in the cancer genome perturb the GRN and cause rewiring in an orchestrated network. Hence, exploring gene regulatory network rewiring is significant for discovering potential biomarkers and indicators that discriminate cancer phenotypes. Results Here, we propose a new bioinformatics method for identifying biomarkers based on network rewiring in different states. It first reconstructs GRNs in different phenotypic conditions from gene expression data with an a priori background network. We employ an algorithm based on the path consistency algorithm and conditional mutual information to delete false-positive regulatory interactions between independent nodes/genes or gene pairs that are not closely related. Then a differential gene regulatory network (D-GRN) is constructed from the rewired parts of the two phenotype-specific GRNs. A community detection technique is then applied to the D-GRN to detect functional modules. Finally, we apply a logistic regression classifier with recursive feature elimination to select biomarker genes in each module individually. The extracted feature genes result in a gene set of biomarkers with an impressive ability to distinguish disease samples from controls. We verify the identified biomarkers in external independent validation datasets. For a proof-of-concept study, we apply the framework to identify diagnostic biomarkers of breast cancer. The identified biomarkers obtain a maximum AUC of 0.985 in the internal sample classification experiments and a maximum AUC of 0.989 in the external validations. Conclusion In conclusion, network rewiring reveals significant differences between phenotypes, indicating dysfunctional mechanisms in cancer. With the development of sequencing technology, gene expression data of increasing amount and quality are becoming available. Condition-specific gene regulatory networks that are close to the real regulations in different states will be established. Revealing the network rewiring will greatly benefit the discovery of biomarkers or signatures for phenotypes. D-GRN is a general method to meet this demand of deciphering high-throughput data for biomarker discovery. It can also be easily extended to identify biomarkers of other complex diseases beyond breast cancer.
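The construction of a differential network from two phenotype-specific GRNs can be sketched as a set difference over directed edges. This sketch assumes the two GRNs are already available as edge lists; the network reconstruction step (path consistency with conditional mutual information), community detection, and feature selection described in the abstract are not reproduced here. All regulator and target names are hypothetical.

```python
def differential_grn(grn_a, grn_b):
    """Split the rewired edges of two directed GRNs into lost and gained sets.

    Edges are (regulator, target) tuples. "lost" edges exist only in grn_a,
    "gained" edges exist only in grn_b.
    """
    a, b = set(grn_a), set(grn_b)
    return {"lost": a - b, "gained": b - a}


def rewired_genes(diff):
    """Return the sorted genes that touch at least one rewired edge."""
    genes = set()
    for edges in diff.values():
        for regulator, target in edges:
            genes.update((regulator, target))
    return sorted(genes)


# Hypothetical toy GRNs for two phenotypes.
normal_grn = [("TF1", "G1"), ("TF1", "G2"), ("TF2", "G3")]
tumor_grn = [("TF1", "G1"), ("TF2", "G3"), ("TF2", "G4")]

diff = differential_grn(normal_grn, tumor_grn)
candidates = rewired_genes(diff)
print(diff)
print(candidates)
```

The genes returned by `rewired_genes` would be the input to any downstream module detection and classifier-based selection.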
Affiliation(s)
- Yijuan Wang
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong, China
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong, China
9
Network-based prioritization of cancer biomarkers by phenotype-driven module detection and ranking. Comput Struct Biotechnol J 2022; 20:206-217. [PMID: 35024093 PMCID: PMC8715301 DOI: 10.1016/j.csbj.2021.12.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/01/2021] [Revised: 11/13/2021] [Accepted: 12/04/2021] [Indexed: 12/23/2022] Open
Abstract
This paper describes an ensemble method with supervised module detection and further module prioritization for reliable network-based biomarker discovery. We design a module detection and ranking method called mRank to discover reliable network modules as cancer diagnostic biomarkers, with two procedures: (1) an iterative supervised module detection guided by phenotypic states in a specific network, and (2) a block-based module ranking, locally and globally, via network topological centrality. We validate its effectiveness and efficiency by identifying hepatocellular carcinoma (HCC) network modules on a comprehensive gene regulatory network whose gene interactions are specified by HCC RNA-seq data from The Cancer Genome Atlas (TCGA). The modules top-ranked by mRank reach a mean AUC of 0.995 on the TCGA HCC dataset, with 371 tumor samples and 50 controls, using a cross-validated SVM. Based on the prior knowledge of cancer dysfunctions enriched in top-ranked modules, 69 genes are identified as HCC candidate biomarkers. They are further validated in independent cohorts with a classifier trained on the TCGA HCC dataset. A mean AUC of 0.846 is achieved in distinguishing 976 disease samples from 827 controls. Moreover, some known HCC signatures such as AFP and SPP1 are also included in our identified biomarkers. mRank enables us to find more reliable network modules for cancer diagnosis. For a proof-of-concept study, we validate it in identifying HCC network biomarkers, and it is generalizable to other cancers or complex diseases. The overall results demonstrate that mRank can find effective network biomarkers for cancer diagnosis that result in fewer false positives.
10
Wei B, Yu M, Yao J, Jiang M, An J, Yang J, Lin J, Zhao Y, Zhu Y. Multidimensional Analyses of Tumor Immune Microenvironment Reveal the Possible Rationality of Immunotherapy and Identify High Immunotherapy Response Subtypes for Renal Papillary Cell Carcinoma. Front Immunol 2021; 12:657951. [PMID: 34531849 PMCID: PMC8438207 DOI: 10.3389/fimmu.2021.657951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2021] [Accepted: 08/10/2021] [Indexed: 11/13/2022] Open
Abstract
Kidney renal papillary cell carcinoma (KIRP), the second most common subtype of renal cell carcinoma, still lacks effective treatment regimens for individualized immunotherapy because of the heterogeneity of its elusive immune microenvironment. Therefore, we aimed to comprehensively evaluate the immune microenvironment of KIRP by using the computational biology strategy to analyze the expression profile data of 289 KIRP patients obtained from The Cancer Genome Atlas database. Based on multidimensional, multi-omics bioinformatics analysis, we found that the tumor of patients with KIRP exhibited “hot” tumor characteristics but the CD8+ T cells in the tumor tissues did not limit tumor progression. Thus, patients with KIRP may realize higher clinical benefits by receiving treatment that can reverse CD8+ T-cell exhaustion. Among them, C1 and C3 immune subtypes could realize the best efficacy of reversing CD8+ T-cell exhaustion. Moreover, CCL5 and FASLG expression may be related to the formation of the immunosuppressive microenvironment in the tumors of patients with KIRP. In conclusion, the immune microenvironment landscape presented in this study provides a novel insight for further experimental and clinical exploration of tailored immunotherapy for patients with KIRP.
Affiliation(s)
- Baojun Wei
- Department of Urology, The First Hospital of China Medical University, Shenyang, China
- Meng Yu
- Department of Laboratory Animal Science, China Medical University, Shenyang, China; Key Laboratory of Transgenic Animal Research, China Medical University, Shenyang, China
- Jihang Yao
- Department of Gynecology, The First Hospital of China Medical University, Shenyang, China
- Mingzhe Jiang
- Department of Urology, The First Hospital of China Medical University, Shenyang, China
- Jun An
- Department of Urology, The First Hospital of China Medical University, Shenyang, China
- Jieping Yang
- Department of Urology, The First Hospital of China Medical University, Shenyang, China
- Jiaxing Lin
- Department of Urology, The First Hospital of China Medical University, Shenyang, China
- Yongkang Zhao
- National Institute of Health and Medical Big Data, China Medical University, Shenyang, China; Joint Laboratory of Artificial Intelligence and Precision Medicine of China Medical University and Northeastern University, Northeastern University, Shenyang, China
- Yuyan Zhu
- Department of Urology, The First Hospital of China Medical University, Shenyang, China; Joint Laboratory of Artificial Intelligence and Precision Medicine of China Medical University and Northeastern University, Northeastern University, Shenyang, China
11
Coleto-Alcudia V, Vega-Rodríguez MA. A metaheuristic multi-objective optimization method for dynamical network biomarker identification as pre-disease stage signal. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
12
Zhang Z, Liu ZP. Robust biomarker discovery for hepatocellular carcinoma from high-throughput data by multiple feature selection methods. BMC Med Genomics 2021; 14:112. [PMID: 34433487 PMCID: PMC8386074 DOI: 10.1186/s12920-021-00957-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/29/2020] [Accepted: 04/08/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Hepatocellular carcinoma (HCC) is one of the most common cancers. The discovery of specific genes serving as biomarkers is of paramount significance for cancer diagnosis and prognosis. The high-throughput omics data generated by The Cancer Genome Atlas (TCGA) consortium provide a valuable resource for the discovery of HCC biomarker genes. Numerous methods have been proposed to select cancer biomarkers. However, these methods have not investigated the robustness of identification with different feature selection techniques. METHODS We use six different recursive feature elimination methods to select the gene signatures of HCC from TCGA liver cancer data. The genes shared by the six selected subsets are proposed as robust biomarkers. The Akaike information criterion (AIC) is employed to explain the optimization process of feature selection, which provides a statistical interpretation for feature selection in machine learning methods. We use several methods to validate the screened biomarkers. RESULTS In this paper, we propose a robust method for discovering biomarker genes for HCC from gene expression data. Specifically, we implement recursive feature elimination with cross-validation (RFE-CV) based on six different classification algorithms. The overlaps among the gene sets discovered by the different methods are taken as the identified biomarkers. We give an interpretation of the machine-learning feature selection process using the AIC from statistics. Furthermore, the features selected by backward stepwise logistic regression via AIC minimization are completely contained in the identified biomarkers. The classification results verify the superiority of this interpretable, robust biomarker discovery method. CONCLUSIONS It is found that overlaps among gene subsets contain different quantitative features selected by the RFE-CV of 6 classifiers. The AIC values in the model selection provide a theoretical foundation for the feature selection process of biomarker discovery via machine learning. Moreover, genes contained in more of the optimally selected subsets make better biological sense. The quality of feature selection is improved by intersecting the biomarkers selected by different classifiers. This is a general method suitable for screening biomarkers of complex diseases from high-throughput data.
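The overlap step described above, keeping only the genes selected by every classifier, can be sketched with plain set operations. The sketch assumes the six RFE-CV runs have already produced gene subsets; the classifier labels and gene names below are hypothetical placeholders, and the RFE-CV and AIC analyses themselves are not shown.

```python
from collections import Counter


def robust_genes(subsets):
    """Return the genes present in every selected subset."""
    return set.intersection(*(set(s) for s in subsets))


def selection_frequency(subsets):
    """Count how many subsets each gene appears in."""
    return Counter(g for s in subsets for g in set(s))


# Hypothetical gene subsets, one per classifier-specific RFE-CV run.
selected = {
    "clf_1": ["g1", "g2", "g3", "g4"],
    "clf_2": ["g1", "g2", "g5"],
    "clf_3": ["g1", "g2", "g3"],
    "clf_4": ["g1", "g2", "g6"],
    "clf_5": ["g1", "g2", "g3", "g7"],
    "clf_6": ["g1", "g2", "g4"],
}

shared = robust_genes(selected.values())
freq = selection_frequency(selected.values())
print(sorted(shared))
print(freq.most_common(3))
```

The frequency count is an optional extra: it shows which genes narrowly miss the strict intersection, which can help when the intersection of all subsets turns out to be empty.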
Affiliation(s)
- Zishuang Zhang
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong, China
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong, China
- Center for Intelligent Medicine, Shandong University, Jinan, 250061, Shandong, China
13
Jiang X, Pan W, Chen M, Wang W, Song W, Lin GN. Integrative enrichment analysis of gene expression based on an artificial neuron. BMC Med Genomics 2021; 14:173. [PMID: 34433483 PMCID: PMC8386081 DOI: 10.1186/s12920-021-00988-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2020] [Accepted: 05/18/2021] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND Huntington's disease is a chronic progressive neurodegenerative disease with complex pathogenic mechanisms. To date, the pathogenesis of Huntington's disease is still not fully understood, and there is no effective treatment. The rapid development of high-throughput sequencing technologies makes it possible to explore the molecular mechanisms at the transcriptome level. Our previous studies on Huntington's disease have shown that it is difficult to distinguish disease-associated genes from non-disease genes. Meanwhile, recent progress in biomedicine shows that the molecular origin of chronic complex diseases may not lie in the diseased tissue, and that genes differentially expressed between tissues may help reveal the molecular origin of chronic diseases. Therefore, developing integrative computational methods for multi-tissue gene expression data, and exploring the relationship between genes differentially expressed across tissues and the disease, can greatly accelerate the molecular discovery process. METHODS To analyze intra- and inter-tissue differentially expressed genes, we designed an integrative enrichment analysis method based on an artificial neuron (IEAAN). First, we calculated the differential expression scores of genes, which serve as the features of the corresponding gene, using a fold-change approach with intra- and inter-tissue gene expression data. Then, we passed the weighted sum of all the differential expression scores through a sigmoid function to obtain a differential expression enrichment score. Finally, we ranked the genes according to the enrichment score. Top-ranking genes are taken to be the potential disease-associated genes. RESULTS In this study, we conducted a large number of experiments to analyze intra- and inter-tissue differentially expressed genes. Experimental results showed that genes differentially expressed between different tissues are more likely to be Huntington's disease-associated genes. Five disease-associated genes were selected in this study, two of which have been reported to be implicated in Huntington's disease. CONCLUSIONS We proposed a novel integrative enrichment analysis method based on an artificial neuron (IEAAN), which displays better prediction precision for disease-associated genes than state-of-the-art statistics-based methods. Our comprehensive evaluation suggests that genes differentially expressed between the striatum and liver tissues of healthy individuals are more likely to be Huntington's disease-associated genes.
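The scoring step in the METHODS, a weighted sum of differential expression scores passed through a sigmoid, can be sketched as a single artificial neuron. The abstract does not state how the weights are chosen, so the equal weights, the gene names, and the fold-change scores below are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def enrichment_score(scores, weights):
    """Sigmoid of the weighted sum of a gene's differential expression scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sigmoid(sum(w * s for w, s in zip(weights, scores)))


# Hypothetical scores per gene: [intra-tissue score, inter-tissue score].
gene_scores = {
    "geneA": [2.0, 1.5],
    "geneB": [0.2, 0.1],
    "geneC": [1.0, 3.0],
}
weights = [0.5, 0.5]  # assumed equal weighting, for illustration only

enrichment = {g: enrichment_score(s, weights) for g, s in gene_scores.items()}
ranked = sorted(enrichment, key=enrichment.get, reverse=True)
print(ranked)
```

Because the sigmoid is monotonic, the ranking depends only on the weighted sum; the sigmoid mainly bounds the score so that genes from different comparisons sit on a common scale.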
Affiliation(s)
- Xue Jiang
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
| | - Weihao Pan
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
| | - Miao Chen
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
| | - Weidi Wang
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
| | - Weichen Song
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
| | - Guan Ning Lin
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030 China
- Shanghai Key Laboratory of Psychotic Disorders, Shanghai, 200030 China
| |
|
14
|
Shojaie A. Differential Network Analysis: A Statistical Perspective. WILEY INTERDISCIPLINARY REVIEWS. COMPUTATIONAL STATISTICS 2021; 13:e1508. [PMID: 37050915 PMCID: PMC10088462 DOI: 10.1002/wics.1508] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/09/2019] [Accepted: 03/03/2020] [Indexed: 11/06/2022]
Abstract
Networks effectively capture interactions among components of complex systems, and have thus become a mainstay in many scientific disciplines. Growing evidence, especially from biology, suggests that networks undergo changes over time and in response to external stimuli. In biology and medicine, these changes have been found to be predictive of complex diseases. They have also been used to gain insight into mechanisms of disease initiation and progression. Primarily motivated by biological applications, this article provides a review of recent statistical machine learning methods for inferring networks and identifying changes in their structures.
Affiliation(s)
- Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle WA
| |
|
15
|
Shang H, Liu ZP. Prioritizing Type 2 Diabetes Genes by Weighted PageRank on Bilayer Heterogeneous Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:336-346. [PMID: 31095494 DOI: 10.1109/tcbb.2019.2917190] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/09/2023]
Abstract
The prevalence of diabetes mellitus has been increasing rapidly in recent years. Type 2 diabetes makes up about 90 percent of diabetes cases. The mixed, interacting effects of genetics and environment offer a possible interpretation of its pathogenesis. Thus, finding the causal disease genes is crucial for clinical diagnosis and medical treatment. Currently, network-based computational methods have become a powerful tool for systematically analyzing complex diseases, for example by identifying candidate disease genes from networks. In this paper, we propose a bioinformatics framework for prioritizing type 2 diabetes genes by applying a modified PageRank algorithm to bilayer biomolecular networks consisting of an ensemble gene-gene regulatory network and an integrative protein-protein interaction network. We specifically weight the networks by differential mutual information, which measures the context specificity between genes and between proteins from transcriptomic and proteomic datasets, respectively. After dividing the network into two components, the known disease genes and the other normal healthy genes, we rank the diabetes genes and the others by their order in the bilayer network via an improved PageRank algorithm. We conclude that the known disease genes achieve significantly higher ranks than randomly selected normal genes, and that the ranks are robust and consistent in multiple validation scenarios. In functional analysis, the high-ranked genes are found to be related to the risks and dysfunctions of type 2 diabetes.
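As a rough illustration of ranking genes by PageRank on a weighted network, the sketch below runs plain power-iteration PageRank on a single weighted undirected graph. It is not the modified bilayer algorithm of the paper: the function name, the damping value, and the toy edge weights (standing in for differential mutual information) are assumptions made for the example.

```python
def weighted_pagerank(edges, damping=0.85, tol=1e-12, max_iter=1000):
    """Power-iteration PageRank on a weighted undirected graph.

    edges: dict mapping (u, v) node pairs to non-negative edge weights.
    Returns a dict of node -> rank; the ranks sum to 1.
    """
    nodes = sorted({node for pair in edges for node in pair})
    nbrs = {node: {} for node in nodes}
    for (u, v), w in edges.items():
        nbrs[u][v] = nbrs[u].get(v, 0.0) + w
        nbrs[v][u] = nbrs[v].get(u, 0.0) + w
    strength = {node: sum(nbrs[node].values()) for node in nodes}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with zero total weight is spread uniformly.
        dangling = sum(rank[u] for u in nodes if strength[u] == 0.0)
        new = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for u in nodes:
            if strength[u] == 0.0:
                continue
            for v, w in nbrs[u].items():
                # Each node passes rank to neighbours in proportion to edge weight.
                new[v] += damping * rank[u] * w / strength[u]
        delta = sum(abs(new[k] - rank[k]) for k in nodes)
        rank = new
        if delta < tol:
            break
    return rank

# Hypothetical weighted gene network; weights are illustrative only.
toy_edges = {
    ("G1", "G2"): 1.0,
    ("G1", "G3"): 1.0,
    ("G1", "G4"): 1.0,
    ("G2", "G3"): 0.5,
}
ranks = weighted_pagerank(toy_edges)
top_gene = max(ranks, key=ranks.get)
print(top_gene, ranks)
```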
|
16
|
Li L, Liu ZP. Biomarker discovery for predicting spontaneous preterm birth from gene expression data by regularized logistic regression. Comput Struct Biotechnol J 2020; 18:3434-3446. [PMID: 33294138 PMCID: PMC7689379 DOI: 10.1016/j.csbj.2020.10.028] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/28/2020] [Revised: 10/24/2020] [Accepted: 10/25/2020] [Indexed: 01/23/2023] Open
Abstract
In this work, we provide a computational method based on regularized logistic regression for discovering biomarkers of spontaneous preterm birth (SPTB) from gene expression data. The successful identification of SPTB biomarkers will greatly benefit the interference of infant gestational age for reducing the risks to pregnant women and preemies. In recent years, various feature selection approaches have been proposed for identifying the subset of meaningful genes that can accurately classify disease samples from controls. Here, we comprehensively summarize regularized logistic regression with seven effective penalties developed for selecting strongly indicative genes of SPTB from microarray data. We compare their properties and assess their classification performance on multiple datasets. The results show that the elastic net, lasso, L1/2, and SCAD penalties perform better than the others and can be successfully used to identify biomarkers of SPTB. In particular, we perform a functional enrichment analysis on these biomarkers and construct a logistic regression classifier based on them. The classifier generates a preterm risk score (PRS) as an indicator for predicting SPTB. Based on the trained predictor, we verify the identified biomarkers on an independent dataset. The biomarkers achieve an AUC value of 0.933 in SPTB classification. The results demonstrate the effectiveness and efficiency of the proposed strategy of biomarker discovery with regularized logistic regression. The proposed method of discovering biomarkers for SPTB can be readily extended to other complex diseases.
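To make the idea of penalty-driven gene selection concrete, here is a minimal sketch of logistic regression with the lasso (L1) penalty only, fitted by proximal gradient descent. It is an assumption-laden toy, not the paper's implementation: the function names, the step size, the penalty strength `lam`, and the six-sample dataset are hypothetical, and the other penalties mentioned in the abstract are not covered.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink a coefficient toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(X, y, lam=0.1, lr=0.5, n_iter=500):
    """Proximal gradient descent for L1-penalized logistic regression.

    The intercept b is left unpenalized; coefficients that end at exactly
    zero correspond to features (genes) dropped by the penalty.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        residuals = [
            sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - yi
            for row, yi in zip(X, y)
        ]
        grad_w = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)]
        grad_b = sum(residuals) / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: feature 0 separates the classes, feature 1 is small noise.
X = [[2.0, 0.1], [1.5, -0.1], [1.8, 0.1], [-2.0, -0.1], [-1.5, 0.1], [-1.7, -0.1]]
y = [1, 1, 1, 0, 0, 0]
w, b = fit_lasso_logistic(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(w, b, selected)
```

The features with nonzero coefficients play the role of candidate biomarkers; a risk score for a new sample would then be the sigmoid of the fitted linear combination.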
Affiliation(s)
- Lingyu Li
- Center for Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
| | - Zhi-Ping Liu
- Center for Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
| |
|
17
|
Shang H, Liu ZP. Network-based prioritization of cancer genes by integrative ranks from multi-omics data. Comput Biol Med 2020; 119:103692. [PMID: 32339126 DOI: 10.1016/j.compbiomed.2020.103692] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/12/2019] [Revised: 02/10/2020] [Accepted: 02/29/2020] [Indexed: 10/24/2022]
Abstract
Finding disease genes related to cancer is of great importance for diagnosis and treatment. With the development of high-throughput technologies, more and more multiple-level omics data have become available. Thus, it is urgent to develop computational methods to identify cancer genes by integrating these data. We propose an integrative rank-based method called iRank to prioritize cancer genes by integrating multi-omics data in a unified network-based framework. The method was used to identify the disease genes of hepatocellular carcinoma (HCC) in humans using the multi-omics data for HCC from TCGA after building up integrated networks in the corresponding molecular levels. The kernel of iRank is based on an improved PageRank algorithm with constraints. To demonstrate the validity and the effectiveness of the method, we performed experiments for comparison between single-level omics data and multiple omics data as well as with other algorithms: random walk (RW), random walk with restart on heterogeneous network (RWH), PRINCE and PhenoRank. We also performed a case study on another cancer, prostate adenocarcinoma (PRAD). The results indicate the effectiveness and efficiency of iRank which demonstrates the significance of integrating multi-omics data and multiplex networks in cancer gene prioritization.
Affiliation(s)
- Haixia Shang
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
| | - Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China; Center of Intelligent Medicine, Shandong University, Jinan, Shandong 250061, China.
| |
|
18
|
Kudryavtseva AV, Lukyanova EN, Kharitonov SL, Nyushko KM, Krasheninnikov AA, Pudova EA, Guvatova ZG, Alekseev BY, Kiseleva MV, Kaprin AD, Dmitriev AA, Snezhkina AV, Krasnov GS. Bioinformatic identification of differentially expressed genes associated with prognosis of locally advanced lymph node-positive prostate cancer. J Bioinform Comput Biol 2020; 17:1950003. [PMID: 30866732 DOI: 10.1142/s0219720019500033] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 02/06/2023]
Abstract
Prostate cancer (PCa) is one of the primary causes of cancer-related mortality in men worldwide. Patients with locally advanced PCa with metastases in regional lymph nodes are usually marked as a high-risk group. One of the chief concerns for this group is to make an informed decision about the necessity of conducting adjuvant androgen deprivation therapy after radical surgical treatment. During the oncogenic transformation and progression of the disease, the expression of many genes is altered. Some of these genes can serve as markers for diagnosis, predicting the prognosis or effectiveness of drug therapy, as well as possible therapeutic targets. We undertook bioinformatic analysis of the RNA-seq data deposited in The Cancer Genome Atlas consortium database to identify possible prognostic markers. We compared the groups with favorable and unfavorable prognosis for the cohort of patients with PCa showing lymph node metastasis (pT2N1M0, pT3N1M0, and pT4N1M0) and for the most common molecular type carrying the fusion transcript TMPRSS2-ERG. For the entire cohort, we revealed at least six potential markers (IDO1, UGT2B15, IFNG, MUC6, CXCL11, and GBP1). Most of these genes are involved in the positive regulation of immune response. For the TMPRSS2-ERG subtype, we also identified six genes, the expression of which may be associated with prognosis: TOB1, GALNT7, INAFM1, APELA, RAC3, and NNMT. The identified genes, after additional studies and validation in an extended cohort, could serve as prognostic markers of locally advanced lymph node-positive PCa.
Affiliation(s)
- Anna V Kudryavtseva
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Elena N Lukyanova
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Sergey L Kharitonov
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Kirill M Nyushko
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Alexey A Krasheninnikov
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Elena A Pudova
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Zulfiya G Guvatova
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Boris Y Alekseev
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Marina V Kiseleva
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Andrey D Kaprin
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Alexey A Dmitriev
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Anastasiya V Snezhkina
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - George S Krasnov
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| |
|
19
|
Peedicayil J. Identification of Biomarkers in Neuropsychiatric Disorders Based on Systems Biology and Epigenetics. Front Genet 2019; 10:985. [PMID: 31681422 PMCID: PMC6801306 DOI: 10.3389/fgene.2019.00985] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/02/2019] [Accepted: 09/17/2019] [Indexed: 12/30/2022] Open
Abstract
Clinically useful biomarkers are available for some neuropsychiatric disorders like fragile X syndrome, Rett syndrome, and Huntington’s disease. Despite many decades of research on the pathogenesis of neuropsychiatric disorders like schizophrenia (SZ), bipolar disorder (BD), and major depressive disorder (MDD), the exact pathogenesis of these disorders remains unclear, and there are no clinically useful biomarkers for these disorders. However, there is increasing evidence that abnormal epigenetic mechanisms of gene expression contribute to the pathogenesis of SZ, BD, and MDD. Both systems (or network) biology and epigenetics (a component of systems biology) attempt to make sense of biological systems that are highly dynamic and multi-compartmental. This article suggests that systems biology, emphasizing the epigenetic component of systems biology, could help identify clinically useful biomarkers in neuropsychiatric disorders like SZ, BD, and MDD.
Affiliation(s)
- Jacob Peedicayil
- Department of Pharmacology and Clinical Pharmacology, Christian Medical College, Vellore, India
| |
|
20
|
Chen J, Qian X, He Y, Han X, Pan Y. Novel key genes in triple‐negative breast cancer identified by weighted gene co‐expression network analysis. J Cell Biochem 2019; 120:16900-16912. [PMID: 31081967 DOI: 10.1002/jcb.28948] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/28/2018] [Revised: 04/15/2019] [Accepted: 04/18/2019] [Indexed: 12/28/2022]
Affiliation(s)
- Jian Chen
- Department of Oncology The First Affiliated Hospital of University of Science and Technology of China Hefei China
| | - Xiaojun Qian
- Department of Oncology The First Affiliated Hospital of University of Science and Technology of China Hefei China
| | - Yifu He
- Department of Oncology The First Affiliated Hospital of University of Science and Technology of China Hefei China
| | - Xinghua Han
- Department of Oncology The First Affiliated Hospital of University of Science and Technology of China Hefei China
| | - Yueyin Pan
- Department of Oncology The First Affiliated Hospital of University of Science and Technology of China Hefei China
| |
|
21
|
Wang L, Liu ZP. Detecting Diagnostic Biomarkers of Alzheimer's Disease by Integrating Gene Expression Data in Six Brain Regions. Front Genet 2019; 10:157. [PMID: 30915100 PMCID: PMC6422912 DOI: 10.3389/fgene.2019.00157] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/25/2018] [Accepted: 02/13/2019] [Indexed: 01/24/2023] Open
Abstract
Alzheimer's disease (AD) is a progressive neurodegenerative disease, which often causes irreversible damage to the cerebrum. The pathogenesis of AD is far from being fully understood, although there are some popular hypotheses. So far, the diagnosis of AD relies only on clinical screening in the form of imaging techniques or cerebrospinal fluid analysis, which may lead to inaccurate evaluation and delay suitable treatment. Molecular biomarkers, by contrast, provide promising alternatives for establishing correct relationships between genotypes and the phenotypes of clinical symptoms. In this paper, we propose a machine-learning-based method for identifying potential diagnostic biomarkers of AD based on a gene coexpression network by integrating gene expression profiles in six brain regions. After building an integrated gene coexpression network of multiple brain regions, we decompose the differential network into subnetwork modules. The module candidates from these coexpressed gene communities are then identified by screening their power to discriminate control from disease samples. The potential biomarkers are then validated by multiple cross-validations and functional enrichment analyses. If the biomarkers pass clinical significance tests, they can be used as a reference for clinical diagnosis after wet-lab experimental validation.
Affiliation(s)
| | - Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, China
| |
|
22
|
Analysis of Topological Parameters of Complex Disease Genes Reveals the Importance of Location in a Biomolecular Network. Genes (Basel) 2019; 10:genes10020143. [PMID: 30769902 PMCID: PMC6409865 DOI: 10.3390/genes10020143] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/12/2019] [Revised: 02/09/2019] [Accepted: 02/11/2019] [Indexed: 12/24/2022] Open
Abstract
Network biology and medicine provide unprecedented opportunities and challenges for deciphering disease mechanisms from integrative viewpoints. The disease genes and their products perform their dysfunctions via physical and biochemical interactions in the form of a molecular network. The topological parameters of these disease genes in the interactome are of prominent interest to the understanding of their functionality from a systematic perspective. In this work, we provide a systems biology analysis of the topological features of complex disease genes in an integrated biomolecular network. Firstly, we identify the characteristics of four network parameters in the ten most frequently studied disease genes and identify several specific patterns of their topologies. Then, we confirm our findings in the other disease genes of three complex disorders (i.e., Alzheimer’s disease, diabetes mellitus, and hepatocellular carcinoma). The results reveal that the disease genes tend to have a higher betweenness centrality, a smaller average shortest path length, and a smaller clustering coefficient when compared to normal genes, whereas they have no significant degree prominence. The features highlight the importance of gene location in the integrated functional linkages.
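Three of the four topological parameters named in this abstract (degree, clustering coefficient, and average shortest path length) can be computed per gene with a few lines of standard-library Python, as sketched below. The helper names and the five-node toy network are hypothetical, and betweenness centrality is left out to keep the example short.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj, node):
    """Number of direct interaction partners of a node."""
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return 2.0 * links / (k * (k - 1))

def average_shortest_path_length(adj, node):
    """Mean breadth-first-search distance from a node to every reachable node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others) if others else 0.0

# Hypothetical toy interactome.
adj = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")])
for gene in sorted(adj):
    print(gene, degree(adj, gene),
          round(clustering_coefficient(adj, gene), 3),
          average_shortest_path_length(adj, gene))
```

Comparing such per-gene values between known disease genes and the remaining genes is the kind of contrast the abstract describes.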
|
23
|
Anand R, Sarmah DT, Chatterjee S. Extracting proteins involved in disease progression using temporally connected networks. BMC SYSTEMS BIOLOGY 2018; 12:78. [PMID: 30045727 PMCID: PMC6060549 DOI: 10.1186/s12918-018-0600-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/06/2017] [Accepted: 07/09/2018] [Indexed: 12/13/2022]
Abstract
BACKGROUND Metabolic disorders such as obesity and diabetes are diseases that develop gradually over time in an individual through the perturbation of genes. Systematic experiments tracking disease progression at the gene level usually yield temporal microarray data. There is a need for methods to analyze such complex data and extract important proteins that could be involved in the temporal progression of the data and hence in the progression of the disease. RESULTS In the present study, we considered temporal microarray data from an experiment conducted to study the development of obesity and diabetes in mice. We used these data along with an available protein-protein interaction network to find a network of interactions between proteins that reproduces the data at the next time point from the data at the previous time point. We show that the resulting network can be mined to identify critical nodes involved in the temporal progression of perturbations. We further show that published algorithms can be applied to such a connected network to mine important proteins, and that the outputs of the published algorithms overlap with those of our algorithms. The importance of the identified set of proteins was supported by the literature and was further validated by comparison with the positive gene dataset from the OMIM database, which showed significant overlap. CONCLUSIONS The critical proteins identified by the algorithms can be hypothesized to play an important role in the temporal progression of the data.
Affiliation(s)
- Rajat Anand
- Drug Discovery Research Centre, Translational Health Science and Technology Institute, NCR Biotech science cluster, 3rd milestone, Faridabad-Gurgaon Expressway, Faridabad, 121001, India
| | - Dipanka Tanu Sarmah
- Drug Discovery Research Centre, Translational Health Science and Technology Institute, NCR Biotech science cluster, 3rd milestone, Faridabad-Gurgaon Expressway, Faridabad, 121001, India
| | - Samrat Chatterjee
- Drug Discovery Research Centre, Translational Health Science and Technology Institute, NCR Biotech science cluster, 3rd milestone, Faridabad-Gurgaon Expressway, Faridabad, 121001, India.
| |
|
24
|
Liu ZP, Gao R. Detecting pathway biomarkers of diabetic progression with differential entropy. J Biomed Inform 2018; 82:143-153. [DOI: 10.1016/j.jbi.2018.05.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 11/26/2017] [Revised: 04/22/2018] [Accepted: 05/12/2018] [Indexed: 12/20/2022]
|
25
|
Chen J, Wang X, Hu B, He Y, Qian X, Wang W. Candidate genes in gastric cancer identified by constructing a weighted gene co-expression network. PeerJ 2018; 6:e4692. [PMID: 29740513 PMCID: PMC5937478 DOI: 10.7717/peerj.4692] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 01/23/2018] [Accepted: 04/11/2018] [Indexed: 12/12/2022] Open
Abstract
Background Gastric cancer (GC) is one of the most common cancers with high mortality globally. However, the molecular mechanisms of GC are unclear, and the prognosis of GC is poor. Therefore, it is important to explore the underlying mechanisms and screen for novel prognostic markers and treatment targets. Methods The genetic and clinical data of GC patients in The Cancer Genome Atlas (TCGA) were analyzed by weighted gene co-expression network analysis (WGCNA). Modules with clinical significance and preservation were distinguished, and gene ontology and pathway enrichment analyses were performed. Hub genes of these modules were validated in the TCGA dataset and another independent dataset from the Gene Expression Omnibus (GEO) database by t-test. Furthermore, the significance of these genes was confirmed via survival analysis. Results We found that a preserved module consisting of 506 genes was associated with clinical traits including pathologic T stage and histologic grade. PDGFRB, COL8A1, EFEMP2, FBN1, EMILIN1, FSTL1 and KIRREL were identified as candidate genes in the module. Their expression levels were correlated with pathologic T stage and histologic grade, and also affected overall survival of GC patients. Conclusion These candidate genes may be involved in proliferation and differentiation of GC cells. They may serve as novel prognostic markers and treatment targets. Moreover, most of them are reported here in GC for the first time and deserve further research.
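The core construction behind a weighted co-expression network can be sketched as below: pairwise Pearson correlations are raised to a soft-threshold power to give edge weights, and each gene's summed weights give its connectivity, by which hub genes are judged. This is a simplified standard-library illustration, not the WGCNA package itself; the function names, the power `beta=6`, and the toy expression profiles are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = sorted(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = a
            adj[h][g] = a
    return adj

def connectivity(adj):
    """Sum of edge weights per gene; highly connected genes are hub candidates."""
    return {g: sum(nbrs.values()) for g, nbrs in adj.items()}

# Hypothetical expression profiles over four samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 2.0, 4.0],
}
adj = soft_threshold_adjacency(expr)
conn = connectivity(adj)
print(conn)
```

Raising correlations to a power suppresses weak links while keeping the network weighted, which is why no hard cutoff appears in the sketch.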
Affiliation(s)
- Jian Chen
- Department of Chemotherapy, Qilu Hospital, Shandong University, Jinan, Shandong, China; Department of Chemotherapy, Anhui Provincial Hospital, Hefei, Anhui, China
| | - Xiuwen Wang
- Department of Chemotherapy, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Bing Hu
- Department of Chemotherapy, Anhui Provincial Hospital, Hefei, Anhui, China
| | - Yifu He
- Department of Chemotherapy, Anhui Provincial Hospital, Hefei, Anhui, China
| | - Xiaojun Qian
- Department of Chemotherapy, Anhui Provincial Hospital, Hefei, Anhui, China
| | - Wei Wang
- Department of Chemotherapy, Anhui Provincial Hospital, Hefei, Anhui, China
| |
|
26
|
Detecting Early Warning Signal of Influenza A Disease Using Sample-Specific Dynamical Network Biomarkers. BIOMED RESEARCH INTERNATIONAL 2018; 2018:6807059. [PMID: 29662893 PMCID: PMC5831949 DOI: 10.1155/2018/6807059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2017] [Revised: 11/30/2017] [Accepted: 12/25/2017] [Indexed: 11/17/2022]
Abstract
Aims/Introduction. Evidence has shown that disease deterioration is not a smooth change with time and conditions; rather, a critical transition point, denoted the predisease state, drives the state from normal to disease. Considering individual differences, this paper provides a sample-specific method that constructs an index from individual-specific dynamical network biomarkers (DNB), defined as the early warning index (EWI), for detecting the predisease state of an individual sample. Based on microarray data of influenza A disease, 144 genes are selected as DNB and the 7th time period is defined as the predisease state. In addition, functional analysis shows that the discovered DNB are relevant to experimental data, which illustrates the effectiveness of our sample-specific method.
|
27
|
Jiang L, Sui D, Qiao K, Dong HM, Chen L, Han Y. Impaired Functional Criticality of Human Brain during Alzheimer's Disease Progression. Sci Rep 2018; 8:1324. [PMID: 29358749 PMCID: PMC5778032 DOI: 10.1038/s41598-018-19674-7] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 10/11/2017] [Accepted: 01/05/2018] [Indexed: 12/11/2022] Open
Abstract
The progression of Alzheimer’s Disease (AD) has been proposed to comprise three stages: subjective cognitive decline (SCD), mild cognitive impairment (MCI), and AD. Were brain dynamics across the three stages smooth? Was there a critical transition? How could we characterize and study the functional criticality of the human brain? Based on the dynamical characteristics of critical transitions from nonlinear dynamics, we proposed a vertex-wise Index of Functional Criticality (vIFC) of fMRI time series in this study. Using 42 SCD, 67 amnestic MCI (aMCI), and 34 AD patients as well as 54 normal controls (NC) matched for age, sex, and years of education, our new method vIFC successfully detected significant patient-normal differences for SCD and aMCI, as well as significant negative correlations of vIFC in the right middle temporal gyrus with total scores of the Montreal Cognitive Assessment (MoCA) in SCD. In comparison, the standard deviation of fMRI time series only detected significant differences between AD patients and normal controls. As an index of functional criticality of the human brain derived from nonlinear dynamics, vIFC could serve as a sensitive neuroimaging marker for future studies; considering the much greater vIFC impairments in aMCI compared to SCD and AD, our study indicated aMCI as a critical stage in AD progression.
Affiliation(s)
- Lili Jiang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, 100101, China; Lifespan Connectomics and Behavior Team, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08544, USA
| | - Danyang Sui
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, 100101, China; Lifespan Connectomics and Behavior Team, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Shijingshan, Beijing, 100049, China
| | - Kaini Qiao
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, 100101, China; Lifespan Connectomics and Behavior Team, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Shijingshan, Beijing, 100049, China
| | - Hao-Ming Dong
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, 100101, China; Lifespan Connectomics and Behavior Team, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Shijingshan, Beijing, 100049, China
| | - Luonan Chen
- Key Laboratory of Systems Biology, Innovation Center for Cell Signaling Network, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai, 200031, China
| | - Ying Han
- Department of Neurology, XuanWu Hospital of Capital Medical University, Beijing, 100053, China; Center of Alzheimer's Disease, Beijing; Institute for Brain Disorders, Beijing, 100053, China; Beijing Institute of Geriatrics, Beijing, 100053, China; National Clinical Research Center for Geriatric Disorders, Beijing, 100053, China; PKU Care Rehabilitation Hospital, Beijing, 100053, China
| |
|
28
|
Zheng H, Li P, Kwok JG, Korrapati A, Li WT, Qu Y, Wang XQ, Kisseleva T, Wang-Rodriguez J, Ongkeko WM. Alcohol and hepatitis virus-dysregulated lncRNAs as potential biomarkers for hepatocellular carcinoma. Oncotarget 2017; 9:224-235. [PMID: 29416609 PMCID: PMC5787460 DOI: 10.18632/oncotarget.22921] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 08/06/2017] [Accepted: 11/09/2017] [Indexed: 12/28/2022] Open
Abstract
Hepatocellular carcinoma (HCC) is one of the leading causes of cancer-related deaths because of frequent late detection and poor therapeutic outcomes, making it necessary to identify effective biomarkers for early diagnosis and new therapeutic targets for effective treatment. Long noncoding RNAs (lncRNAs) have emerged as promising molecular markers for diagnosis and treatment. Through analysis of patient samples from The Cancer Genome Atlas database, we identified putative lncRNAs dysregulated in HCC and by its risk factors, hepatitis infection and alcohol consumption. We identified 184 lncRNAs dysregulated in HCC tumors versus paired normal samples, 53 lncRNAs dysregulated in alcohol-drinking patients with hepatitis B, and 5,456 lncRNAs dysregulated in patients with hepatitis infection. The expression of a panel of these candidate lncRNAs correlated significantly with patient survival, clinical variables, and known genomic alterations in HCC. The two most significantly dysregulated lncRNAs in our computational analysis, lnc-CFP-1:1 and lnc-CD164L2-1:1, were validated in vitro to be dysregulated by alcohol. Our findings suggest that lncRNAs dysregulated by different etiologies of HCC serve as potential disease markers and can be further investigated to develop personalized prevention, diagnosis, and treatment strategies.
Affiliation(s)
- Hao Zheng
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
- Pinxue Li
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
- James G Kwok
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
- Avinaash Korrapati
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
- Wei Tse Li
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
- Yuanhao Qu
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
- Xiao Qi Wang
- Department of Surgery, The University of Hong Kong, Pokfulam, Hong Kong, China
- Tatiana Kisseleva
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
- Jessica Wang-Rodriguez
- Veterans Administration Medical Center and Department of Pathology, University of California, San Diego, La Jolla, California, USA
- Weg M Ongkeko
- Department of Surgery, University of California, San Diego, La Jolla, California, USA
|
29
|
Large-scale gene network analysis reveals the significance of extracellular matrix pathway and homeobox genes in acute myeloid leukemia: an introduction to the Pigengene package and its applications. BMC Med Genomics 2017; 10:16. [PMID: 28298217 PMCID: PMC5353782 DOI: 10.1186/s12920-017-0253-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 08/18/2016] [Accepted: 03/08/2017] [Indexed: 12/26/2022]
Abstract
BACKGROUND: The distinct types of hematological malignancies have different biological mechanisms and prognoses. For instance, myelodysplastic syndrome (MDS) is generally indolent and low risk; however, it may transform into acute myeloid leukemia (AML), which is much more aggressive. METHODS: We develop a novel network analysis approach that uses expression of eigengenes to delineate the biological differences between these two diseases. RESULTS: We find that specific genes in the extracellular matrix pathway are underexpressed in AML. We validate this finding in three ways: (a) We train our model on a microarray dataset of 364 cases and test it on an RNA-Seq dataset of 74 cases. Our model showed 95% sensitivity and 86% specificity in the training dataset and 98% sensitivity and 91% specificity in the test dataset. This confirms that the identified biological signatures are independent of the expression profiling technology and of the training dataset. (b) Immunocytochemistry confirms that MMP9, an exemplar protein in the extracellular matrix, is underexpressed in AML. (c) MMP9 is hypermethylated in the majority of AML cases (n=194, Welch's t-test p-value < 10^-138), which is consistent with its low expression in AML. Our novel network analysis approach is generalizable and useful in studying other complex diseases (e.g., breast cancer prognosis). We implement our methodology in the Pigengene software package, which is publicly available through Bioconductor. CONCLUSIONS: Eigengenes define informative biological signatures that are robust with respect to expression profiling technology. These signatures provide valuable information about the underlying biology of diseases, and they are useful in predicting diagnosis and prognosis.
|
30
|
Chang HT. Biomarker discovery using dry-lab technologies and high-throughput screening. Biomark Med 2016; 10:559-61. [PMID: 27278686 DOI: 10.2217/bmm-2016-0111] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]
Affiliation(s)
- Hao-Teng Chang
- Graduate Institute of Basic Medical Science, China Medical University, Taichung City, Taiwan; Department of Computer Science & Information Engineering, Asia University, Taichung City, Taiwan
|