1
|
Wei W, Li Y, Huang T. Using Machine Learning Methods to Study Colorectal Cancer Tumor Micro-Environment and Its Biomarkers. Int J Mol Sci 2023; 24:11133. [PMID: 37446311 DOI: 10.3390/ijms241311133] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/07/2023] [Revised: 06/25/2023] [Accepted: 06/26/2023] [Indexed: 07/15/2023]
Abstract
Colorectal cancer (CRC) is a leading cause of cancer deaths worldwide, and the identification of biomarkers can improve early detection and personalized treatment. In this study, RNA-seq data and gene chip data from TCGA and GEO were used to explore potential biomarkers for CRC. The SMOTE method was used to address class imbalance, and four feature selection algorithms (MCFS, Boruta, mRMR, and LightGBM) were used to select genes from the gene expression matrix. Four machine learning algorithms (SVM, XGBoost, RF, and kNN) were then employed to obtain the optimal number of genes for model construction. Through interpretable machine learning (IML), co-predictive networks were generated to identify rules and uncover underlying relationships among the selected genes. Survival analysis revealed that INHBA, FNBP1, PDE9A, HIST1H2BG, and CADM3 were significantly correlated with prognosis in CRC patients. In addition, the CIBERSORT algorithm was used to investigate the proportion of immune cells in CRC tissues, and gene mutation rates for the five selected biomarkers were explored. The biomarkers identified in this study have significant implications for the development of personalized therapies and could ultimately lead to improved clinical outcomes for CRC patients.
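The class-imbalance step mentioned in this abstract can be illustrated with a minimal pure-Python sketch of the SMOTE idea: each synthetic sample is an interpolation between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' pipeline; the function name `smote`, the parameter defaults, and the toy expression values are all assumptions made for illustration.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        others = [s for s in minority if s is not base]
        neighbours = sorted(others, key=lambda s: sq_dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical two-gene expression vectors for the minority class.
minority_samples = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
new_samples = smote(minority_samples, n_new=4)
```

Because every synthetic point is a convex combination of two real minority samples, it always lies within the range spanned by the original minority data. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into evaluation.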
Affiliation(s)
- Wei Wei
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yixue Li
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Guangzhou Laboratory, Guangzhou 510005, China
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai 200433, China
- Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
|
2
|
Adinew GM, Messeha S, Taka E, Ahmed SA, Soliman KFA. The Role of Apoptotic Genes and Protein-Protein Interactions in Triple-negative Breast Cancer. Cancer Genomics Proteomics 2023; 20:247-272. [PMID: 37093683 PMCID: PMC10148064 DOI: 10.21873/cgp.20379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 02/09/2023] [Accepted: 02/19/2023] [Indexed: 04/25/2023]
Abstract
BACKGROUND/AIM Compared to other breast cancer types, triple-negative breast cancer (TNBC) has historically had few treatment alternatives. Therefore, exploring and pinpointing potentially implicated genes could be used for treating and managing TNBC. By doing this, we will provide essential data to comprehend how the genes are involved in the apoptotic pathways of the cancer cells to identify potential therapeutic targets. Analysis of a single genetic alteration may not reveal the pathogenicity driving TNBC due to the high genomic complexity and heterogeneity of TNBC. Therefore, searching through a large variety of gene interactions enabled the identification of molecular therapeutic genes. MATERIALS AND METHODS This study used integrated bioinformatics methods such as UALCAN, TNM plotter, PANTHER, GO-KEGG and PPIs to assess the gene expression, protein-protein interaction (PPI), and transcription factor interaction of apoptosis-regulated genes. RESULTS Compared to normal breast tissue, gene expressions of BNIP3, TNFRSF10B, MCL1, and CASP4 were downregulated in UALCAN. At the same time, BIK, AKT1, BAD, FADD, DIABLO, and CASP9 were down-regulated in bc-GeneExMiner v4.5 mRNA expression (BCGM) databases. Based on GO term enrichment analysis, the cellular process (GO:0009987), which has about 21 apoptosis-regulated genes, is the top category in the biological processes (BP), followed by biological regulation (GO:0065007). We identified 29 differentially regulated pathways, including the p53 pathway, angiogenesis, apoptosis signaling pathway, and the Alzheimer's disease presenilin pathway. We examined the PPIs between the genes that regulate apoptosis; CASP3 and CASP9 interact with FADD, MCL1, TNF, TNFRSF10A, and TNFRSF10; additionally, CASP3 significantly forms PPIs with CASP9, DFFA, and TP53, and CASP9 with DIABLO. In the top 10 transcription factors, the androgen receptor (AR) interacts with five apoptosis-regulated genes (p<0.0001; q<0.01), followed by retinoic acid receptor alpha (RARA) (p<0.0001; q<0.01) and ring finger protein (RNF2) (p<0.0001; q<0.01). Overall, the gene expression profile, PPIs, and the apoptosis-TF interaction findings suggest that the 27 apoptosis-regulated genes might be used as promising targets in treating and managing TNBC. Furthermore, from a total of 27 key genes, CASP2, CASP3, DAPK1, TNF, TRAF2, and TRAF3 were significantly correlated with poor overall survival in TNBC (p-value <0.05); they could play important roles in the progression of TNBC and provide attractive therapeutic targets that may offer new candidate molecules for targeted therapy. CONCLUSION Our findings demonstrate that, of the 27 genes examined in this study, CASP2, CASP3, DAPK1, TNF, TRAF2, and TRAF3 were substantially associated with differences in the overall survival (OS) of TNBC patients; these genes may play crucial roles in the development of TNBC and offer promising therapeutic interventions.
Affiliation(s)
- Getinet M Adinew
- Division of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, U.S.A
- Samia Messeha
- Division of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, U.S.A
- Equar Taka
- Division of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, U.S.A
- Shade A Ahmed
- Division of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, U.S.A
- Karam F A Soliman
- Division of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Institute of Public Health, Florida A&M University, Tallahassee, FL, U.S.A.
|
3
|
Shah E, Maji P. Multi-View Kernel Learning for Identification of Disease Genes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2278-2290. [PMID: 37027602 DOI: 10.1109/tcbb.2023.3247033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Gene expression data sets and protein-protein interaction (PPI) networks are two heterogeneous data sources that have been extensively studied, due to their ability to capture the co-expression patterns among genes and their topological connections. Although they depict different traits of the data, both of them tend to group co-functional genes together. This phenomenon agrees with the basic assumption of multi-view kernel learning, according to which different views of the data contain a similar inherent cluster structure. Based on this inference, a new multi-view kernel learning based disease gene identification algorithm, termed as DiGId, is put forward. A novel multi-view kernel learning approach is proposed that aims to learn a consensus kernel, which efficiently captures the heterogeneous information of individual views as well as depicts the underlying inherent cluster structure. Some low-rank constraints are imposed on the learned multi-view kernel, so that it can effectively be partitioned into k or fewer clusters. The learned joint cluster structure is used to curate a set of potential disease genes. Moreover, a novel approach is put forward to quantify the importance of each view. In order to demonstrate the effectiveness of the proposed approach in capturing the relevant information depicted by individual views, an extensive analysis is performed on four different cancer-related gene expression data sets and PPI network, considering different similarity measures.
|
4
|
Jeon J, Han EY, Jung I. MOPA: An integrative multi-omics pathway analysis method for measuring omics activity. PLoS One 2023; 18:e0278272. [PMID: 36928437 PMCID: PMC10019735 DOI: 10.1371/journal.pone.0278272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 11/13/2022] [Indexed: 03/18/2023]
Abstract
Pathways are composed of proteins forming a network to represent specific biological mechanisms and are often used to compute enrichment scores from a list of genes as a means of measuring their biological activity. Pathway analysis is a de facto standard downstream analysis procedure in most genomic and transcriptomic studies. Here, we present MOPA (Multi-Omics Pathway Analysis), a multi-omics integrative method that scores individual pathways in a sample-wise manner in terms of enriched multi-omics regulatory activity, which we refer to as the mES (multi-omics Enrichment Score). The mES reflects the strength of regulatory relations between multi-omics in units of pathways. In addition, MOPA is able to measure how much each omics contributes to the mES, which may be used to observe what kind of omics are active in a pathway within a sample group (e.g., subtype, gender); we refer to this as the OCR (Omics Contribution Rate). Using nine different cancer types, 93 clinical features and three types of omics (i.e., gene expression, miRNA and methylation), MOPA was used to search for clinical features that were explainable in the context of multi-omics. By evaluating the performance of MOPA, we showed that it yielded higher or at least equal performance compared to previous single and multi-omics pathway analysis tools. We find that the advantage of MOPA is the ability to explain pathways in terms of omics relations using mES and OCR. As one of the results, the TGF-beta signaling pathway was captured as an important pathway that showed distinct mES and OCR values specific to the CMS4 subtype in colon adenocarcinoma. The mES and OCR metrics suggested that the mRNA and miRNA expressions were significantly different from the other subtypes, which was concordant with previous studies. The MOPA software is available at https://github.com/jaeminjj/MOPA.
Affiliation(s)
- Jaemin Jeon
- Interdisciplinary Program in Bioinformatics, Seoul National University, Gwanak-Gu, Seoul, Republic of Korea
- Eon Yong Han
- School of Computer Science and Engineering, Kyungpook National University, Buk-gu, Daegu, Republic of Korea
- Inuk Jung
- School of Computer Science and Engineering, Kyungpook National University, Buk-gu, Daegu, Republic of Korea
|
5
|
Ershov P, Poyarkov S, Konstantinova Y, Veselovsky E, Makarova A. Transcriptomic Signatures in Colorectal Cancer Progression. Curr Mol Med 2023; 23:239-249. [PMID: 35490318 DOI: 10.2174/1566524022666220427102048] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/27/2021] [Revised: 11/05/2021] [Accepted: 03/09/2022] [Indexed: 02/08/2023]
Abstract
AIMS Due to a large number of identified hub-genes encoding key molecular regulators, which are involved in signal transduction and metabolic pathways in cancers, it is relevant to systemize and update these findings. BACKGROUND Colorectal cancer (CRC) is the third leading cause of cancer death in the world, with high metastatic potential. Elucidating the pathogenic mechanisms and selection of novel biomarkers in CRC is of great clinical significance. OBJECTIVE This analytical review aims at the systematization of bioinformatics and experimental identification of hub-genes associated with CRC for a more consolidated understanding of common features in networks and pathways in CRC progression as well as hub-genes selection. RESULTS In total, 301 hub-genes were derived from 40 articles. The "core" consisted of 28 hub-genes (CCNB1, LPAR1, BGN, CXCL3, COL1A2, UBE2C, NMU, COL1A1, CXCL2, CXCL11, CDK1, TOP2A, AURKA, SST, CXCL5, MMP3, CCND1, TIMP1, CXCL8, CXCL1, CXCL12, MYC, CCNA2, GCG, GUCA2A, PAICS, PYY and THBS2) mentioned in not less than three articles and having clinical significance in cancer-associated pathways. Of them, there were two discrete clusters enriched in chemokine signaling and cell cycle regulatory genes. High expression levels of BGN and TIMP1 and low expression levels of CCNB1, CXCL3, CXCL2, CXCL2 and PAICS were associated with unfavorable overall survival of patients with CRC. Differentially expressed genes such as LPAR1, SST, CXCL12, GUCA2A, and PYY were shown to be down-regulated, whereas BGN, CXCL3, UBE2C, NMU, CXCL11, CDK1, TOP2A, AURKA, MMP3, CCND1, CXCL1, MYC, CCNA2 and PAICS were up-regulated in CRC. It was also found that MMP3, THBS2, TIMP1 and CXCL12 genes were associated with metastatic CRC. Network analysis in ONCO.IO showed that upstream master regulators RELA, STAT3, SOX2, FOXM1, SMAD3 and NF-kB were connected with "core" hub-genes. CONCLUSION The results provide useful fundamental information for revealing the mechanisms of pathogenicity, selecting cellular targets to optimize therapeutic interventions, and developing transcriptomic prognostic and predictive biomarkers.
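The "core" criterion described in this abstract (hub-genes mentioned in not less than three articles) amounts to a per-article frequency count. The sketch below shows one way to express it; the function name `core_hub_genes` and the toy article lists are hypothetical, and the gene memberships are invented for illustration rather than taken from the review.

```python
from collections import Counter

def core_hub_genes(article_gene_lists, min_articles=3):
    """Return genes reported as hub-genes in at least `min_articles` articles."""
    counts = Counter()
    for genes in article_gene_lists:
        counts.update(set(genes))  # count each gene at most once per article
    return sorted(g for g, c in counts.items() if c >= min_articles)

# Hypothetical hub-gene lists from four articles.
articles = [
    ["CDK1", "MYC", "TOP2A"],
    ["CDK1", "MYC"],
    ["CDK1", "TOP2A", "MYC", "MYC"],
    ["CDK1", "AURKA"],
]
core = core_hub_genes(articles)
```

Converting each list to a set before counting prevents a gene that is repeated within one article from being counted more than once.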
Affiliation(s)
- Pavel Ershov
- Department of Analysis and Forecasting of Medical and Biological Health Risks, Federal State Budgetary Institution "Centre for Strategic Planning and Management of Biomedical Health Risks" of the Federal Medical Biological Agency, Moscow, Russia
- Stanislav Poyarkov
- Department of Analysis and Forecasting of Medical and Biological Health Risks, Federal State Budgetary Institution "Centre for Strategic Planning and Management of Biomedical Health Risks" of the Federal Medical Biological Agency, Moscow, Russia
- Yulia Konstantinova
- Oncology Department, Federal Research and Clinical Center of Specialized Kinds of Medical Care and Medical Technology of the Federal Medical Biological Agency, Moscow, Russia
- Egor Veselovsky
- Department of Analysis and Forecasting of Medical and Biological Health Risks, Federal State Budgetary Institution "Centre for Strategic Planning and Management of Biomedical Health Risks" of the Federal Medical Biological Agency, Moscow, Russia
- Anna Makarova
- Department of Analysis and Forecasting of Medical and Biological Health Risks, Federal State Budgetary Institution "Centre for Strategic Planning and Management of Biomedical Health Risks" of the Federal Medical Biological Agency, Moscow, Russia
|
6
|
Artificial intelligence in cancer target identification and drug discovery. Signal Transduct Target Ther 2022; 7:156. [PMID: 35538061 PMCID: PMC9090746 DOI: 10.1038/s41392-022-00994-0] [Citation(s) in RCA: 63] [Impact Index Per Article: 31.5] [Received: 07/04/2021] [Revised: 03/14/2022] [Accepted: 04/05/2022] [Indexed: 02/08/2023]
Abstract
Artificial intelligence is an advanced method to identify novel anticancer targets and discover novel drugs from biology networks because the networks can effectively preserve and quantify the interaction between components of cell systems underlying human diseases such as cancer. Here, we review and discuss how to employ artificial intelligence approaches to identify novel anticancer targets and discover drugs. First, we describe the scope of artificial intelligence biology analysis for novel anticancer target investigations. Second, we review and discuss the basic principles and theory of commonly used network-based and machine learning-based artificial intelligence algorithms. Finally, we showcase the applications of artificial intelligence approaches in cancer target identification and drug discovery. Taken together, the artificial intelligence models have provided us with a quantitative framework to study the relationship between network characteristics and cancer, thereby leading to the identification of potential anticancer targets and the discovery of novel drug candidates.
|
7
|
Zhu L, Li W. Roles of Physicochemical and Structural Properties of RNA-Binding Proteins in Predicting the Activities of Trans-Acting Splicing Factors with Machine Learning. Int J Mol Sci 2022; 23:4426. [PMID: 35457243 PMCID: PMC9030803 DOI: 10.3390/ijms23084426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Revised: 04/13/2022] [Accepted: 04/14/2022] [Indexed: 02/06/2023]
Abstract
Trans-acting splicing factors play a pivotal role in modulating alternative splicing by specifically binding to cis-elements in pre-mRNAs. There are approximately 1500 RNA-binding proteins (RBPs) in the human genome, but the activities of these RBPs in alternative splicing are unknown. Since determining RBP activities through experimental methods is expensive and time consuming, the development of an efficient computational method for predicting the activities of RBPs in alternative splicing from their sequences is of great practical importance. Recently, a machine learning model for predicting the activities of splicing factors was built based on features of single and dual amino acid compositions. Here, we explored the role of physicochemical and structural properties in predicting their activities in alternative splicing using machine learning approaches and found that the prediction performance is significantly improved by including these properties. By combining the minimum redundancy–maximum relevance (mRMR) method and forward feature searching strategy, a promising feature subset with 24 features was obtained to predict the activities of RBPs. The feature subset consists of 16 dual amino acid compositions, 5 physicochemical features, and 3 structural features. The physicochemical and structural properties were as important as the sequence composition features for an accurate prediction of the activities of splicing factors. The hydrophobicity and distribution of coil are suggested to be the key physicochemical and structural features, respectively.
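The combination of mRMR ranking with a forward search described in this abstract can be sketched as a greedy loop: at each step, pick the feature whose relevance to the target, minus its mean redundancy with the features already chosen, is highest. The sketch below is an illustration only. It substitutes absolute Pearson correlation for the mutual information that mRMR normally uses, and the function names, feature names, and toy values are all assumptions.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def mrmr(features, target, n_select):
    """Greedy forward selection: maximise relevance minus mean redundancy.
    `features` maps a feature name to its list of values."""
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < n_select:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: f_b nearly duplicates f_a, while f_c carries different noise.
target = [0, 0, 0, 1, 1, 1]
features = {
    "f_a": [0.1, 0.2, 0.1, 0.9, 1.0, 0.8],
    "f_b": [0.1, 0.3, 0.1, 0.8, 1.0, 0.7],
    "f_c": [0.4, 0.0, 0.4, 0.8, 0.4, 1.2],
}
chosen = mrmr(features, target, n_select=2)
```

On this toy data the redundancy penalty causes `f_c` to be chosen ahead of the more relevant but nearly duplicate `f_b`, which is the behaviour that distinguishes mRMR from ranking by relevance alone.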
Affiliation(s)
- Wenjin Li
- Correspondence: Tel.: +86-0755-26942336
|
8
|
Wang Y, Gao X, Ru X, Sun P, Wang J. A hybrid feature selection algorithm and its application in bioinformatics. PeerJ Comput Sci 2022; 8:e933. [PMID: 35494789 PMCID: PMC9044222 DOI: 10.7717/peerj-cs.933] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/21/2022] [Accepted: 03/03/2022] [Indexed: 06/14/2023]
Abstract
Feature selection is an independent technology for high-dimensional datasets that has been widely applied in a variety of fields. With the vast expansion of information, such as bioinformatics data, there has been an urgent need to investigate more effective and accurate methods involving feature selection in recent decades. Here, we proposed the hybrid MMPSO method, by combining the feature ranking method and the heuristic search method, to obtain an optimal subset that can be used for higher classification accuracy. In this study, ten datasets obtained from the UCI Machine Learning Repository were analyzed to demonstrate the superiority of our method. The MMPSO algorithm outperformed other algorithms in terms of classification accuracy while utilizing the same number of features. Then we applied the method to a biological dataset containing gene expression information about liver hepatocellular carcinoma (LIHC) samples obtained from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx). On the basis of the MMPSO algorithm, we identified an 18-gene signature that performed well in distinguishing normal samples from tumours. Nine of the 18 differentially expressed genes were significantly up-regulated in LIHC tumour samples, and the area under the curve (AUC) of the combination of seven genes (ADRA2B, ERAP2, NPC1L1, PLVAP, POMC, PYROXD2, TRIM29) in distinguishing tumours from normal samples was greater than 0.99. Six genes (ADRA2B, PYROXD2, CACHD1, FKBP1B, PRKD1 and RPL7AP6) were significantly correlated with survival time. The MMPSO algorithm can be used to effectively extract features from a high-dimensional dataset, which will provide new clues for identifying biomarkers or therapeutic targets from biological data and more perspectives in tumor research.
Affiliation(s)
- Yangyang Wang
- School of Electronics and Information, Northwestern Polytechnical University, Xi’an, Shaanxi, China
- Xiaoguang Gao
- School of Electronics and Information, Northwestern Polytechnical University, Xi’an, Shaanxi, China
- Xinxin Ru
- School of Electronics and Information, Northwestern Polytechnical University, Xi’an, Shaanxi, China
- Pengzhan Sun
- School of Electronics and Information, Northwestern Polytechnical University, Xi’an, Shaanxi, China
- Jihan Wang
- Institute of Medical Research, Northwestern Polytechnical University, Xi’an, Shaanxi, China
|
9
|
Wang C, Liao S, Wang Y, Hu X, Xu J. Computational Identification of Guillain-Barré Syndrome-Related Genes by an mRNA Gene Expression Profile and a Protein–Protein Interaction Network. Front Mol Neurosci 2022; 15:850209. [PMID: 35370550 PMCID: PMC8968047 DOI: 10.3389/fnmol.2022.850209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Accepted: 02/24/2022] [Indexed: 11/22/2022]
Abstract
Background In the present study, we used a computational method to identify Guillain–Barré syndrome (GBS) related genes based on (i) a gene expression profile, and (ii) the shortest path analysis in a protein–protein interaction (PPI) network. Materials and Methods mRNA microarray analyses were performed on the peripheral blood mononuclear cells (PBMCs) of four GBS patients and four age- and gender-matched healthy controls. Results In total, 30 GBS-related genes were screened out, of which 20 were retrieved from PPI analysis of upregulated genes and 23 from downregulated genes (13 overlapping genes). Gene ontology (GO) enrichment and KEGG enrichment analyses were performed. The results showed that some GO terms and KEGG pathway terms overlapped between the upregulated and downregulated analyses, including positive regulation of macromolecule metabolic process, intracellular signaling cascade, cell surface receptor linked signal transduction, intracellular non-membrane-bounded organelle, non-membrane-bounded organelle, plasma membrane, ErbB signaling pathway, focal adhesion, neurotrophin signaling pathway and Wnt signaling pathway, which indicated that these terms may play a critical role in the GBS process. Discussion These results provided basic information about the genetic and molecular pathogenesis of GBS, which may improve the development of effective genetic strategies for GBS treatment in the future.
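The shortest path analysis mentioned in this abstract can be illustrated with a breadth-first search on an unweighted, undirected PPI graph. This is a generic sketch, not the authors' implementation: the function names, the protein labels `P1` to `P5`, and the edges are hypothetical. The intermediate nodes on a shortest path between two seed genes are the kind of candidates such an analysis would surface.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph

def shortest_path(graph, source, target):
    """Return one shortest path from source to target, or None if unreachable."""
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                if neighbour == target:
                    # Walk the parent links back to the source.
                    path = [neighbour]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None

# Hypothetical interaction edges between placeholder proteins.
ppi = build_graph([("P1", "P2"), ("P2", "P3"), ("P3", "P4"),
                   ("P1", "P5"), ("P5", "P4")])
path = shortest_path(ppi, "P1", "P4")
intermediates = path[1:-1]  # candidate genes lying between the two seeds
```

Breadth-first search suffices here because every edge is treated as having the same length; a weighted network (for example, one using interaction confidence scores) would need Dijkstra's algorithm instead.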
Affiliation(s)
- Chunyang Wang
- Department of Neurology, Tianjin Medical University General Hospital, Tianjin, China
- Shiwei Liao
- Tianjin Key Laboratory of Cerebral Vascular and Neurodegenerative Diseases, Department of Neurorehabilitation and Neurology, Tianjin Huanhu Hospital, Tianjin Neurosurgical Institute, Tianjin, China
- Yiyi Wang
- Department of Neurology, Tianjin Haihe Hospital, Tianjin, China
- Xiaowei Hu
- Department of Neurology, Tianjin Medical University General Hospital, Tianjin, China
- Jing Xu
- Department of Neurology, Tianjin Medical University General Hospital, Tianjin, China
- *Correspondence: Jing Xu
|
10
|
Shah E, Maji P. Scalable Non-Linear Graph Fusion for Prioritizing Cancer-Causing Genes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1130-1143. [PMID: 32966220 DOI: 10.1109/tcbb.2020.3026219] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
In the past few decades, both gene expression data and protein-protein interaction (PPI) networks have been extensively studied, due to their ability to depict important characteristics of disease-associated genes. In this regard, the paper presents a new gene prioritization algorithm to identify and prioritize cancer-causing genes, integrating judiciously the complementary information obtained from two data sources. The proposed algorithm selects disease-causing genes by maximizing the importance of selected genes and functional similarity among them. A new quantitative index is introduced to evaluate the importance of a gene. It considers whether a gene exhibits a differential expression pattern across sick and healthy individuals, and has a strong connectivity in the PPI network, which are the important characteristics of a potential biomarker. As disease-associated genes are expected to have similar expression profiles and topological structures, a scalable non-linear graph fusion technique, termed ScaNGraF, is proposed to learn a disease-dependent functional similarity network from the co-expression and common neighbor based similarity networks. The proposed ScaNGraF, which is based on a message passing algorithm, efficiently combines the shared and complementary information provided by different data sources with significantly lower computational cost. A new measure, termed DiCoIN, is introduced to evaluate the quality of a learned affinity network. The performance of the proposed graph fusion technique and gene selection algorithm is extensively compared with that of some existing methods, using several cancer data sets.
|
11
|
Rout RK, Umer S, Sheikh S, Sindhwani S, Pati S. EightyDVec: a method for protein sequence similarity analysis using physicochemical properties of amino acids. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2022. [DOI: 10.1080/21681163.2021.1956369] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]
Affiliation(s)
- Ranjeet Kumar Rout
- Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, India
- Saiyed Umer
- Computer Science & Engineering, Aliah University, West Bengal, India
- Sabha Sheikh
- Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, India
- Sanchit Sindhwani
- Dr. B. R. Ambedkar National Institute of Technology, Jalandhar, Punjab, India
- Smitarani Pati
- Dr. B. R. Ambedkar National Institute of Technology, Jalandhar, Punjab, India
|
12
|
Sliheet E, Robinson M, Morand S, Choucair K, Willoughby D, Stanbery L, Aaron P, Bognar E, Nemunaitis J. Network based analysis identifies TP53m-BRCA1/2wt-homologous recombination proficient (HRP) population with enhanced susceptibility to Vigil immunotherapy. Cancer Gene Ther 2022; 29:993-1000. [PMID: 34785763 PMCID: PMC9293751 DOI: 10.1038/s41417-021-00400-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/24/2021] [Revised: 09/21/2021] [Accepted: 10/11/2021] [Indexed: 02/06/2023]
Abstract
Thus far immunotherapy has had limited impact on ovarian cancer. Vigil (a novel DNA-based multifunctional immune-therapeutic) has shown clinical benefit to prolong relapse-free survival (RFS) and overall survival (OS) in the BRCA wild type and HRP populations. We further analyzed molecular signals related to sensitivity to Vigil treatment. Tissue from patients enrolled in the randomized double-blind trial of Vigil vs. placebo as maintenance in frontline management of advanced resectable ovarian cancer underwent DNA polymorphism analysis. Data were generated from a 981-gene panel to determine the tumor mutation burden and classify variants using Ingenuity Variant Analysis software (Qiagen) or NIH ClinVar. Only variants classified as pathogenic or likely pathogenic were included. STRING application (version 1.5.1) was used to create a protein-protein interaction network. Topological distance and probability of co-mutation were used to calculate the C-score and cumulative C-score (cumC-score). Kaplan-Meier analysis was used to determine the relationship between gene pairs with a high cumC-score and clinical parameters. Improved relapse-free survival in Vigil-treated patients was found for the TP53m-BRCAwt-HRP group compared to placebo (21.1 months versus 5.6 months, p = 0.0013). Analysis of tumor mutation burden did not reveal statistical benefit in patients receiving Vigil versus placebo. Results suggest a subset of ovarian cancer patients with enhanced susceptibility to Vigil immunotherapy. The hypothesis-generating data presented invite a validation study of Vigil in target-identified populations, and support clinical consideration of STRING-generated network application to biomarker characterization with other cancer patients targeted with Vigil.
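The Kaplan-Meier analysis referred to in this abstract estimates a survival curve from follow-up times that may be censored. A minimal sketch of the product-limit estimator is shown below; the function name and the toy follow-up data are assumptions for illustration and have no connection to the trial data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.
    events[i] is 1 if the event (e.g. relapse) was observed at times[i],
    or 0 if the subject was censored at that time.
    Returns a list of (time, survival probability) at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored subjects both leave the risk set
    return curve

# Hypothetical follow-up times in months; 0 marks a censored observation.
months = [2, 3, 3, 5, 8]
relapsed = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, relapsed)
```

With this toy input the estimate steps down at months 2, 3 and 5, to roughly 0.8, 0.6 and 0.3. Censored subjects do not cause a step, but they reduce the number at risk for later time points. Comparing two such curves (for example, treatment versus placebo) would additionally require a test such as the log-rank test, which is not shown here.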
Affiliation(s)
- Elyssa Sliheet
- Southern Methodist University, Department of Mathematics, Dallas, TX USA (grid.263864.d, 0000 0004 1936 7929)
- Molly Robinson
- Southern Methodist University, Department of Mathematics, Dallas, TX USA (grid.263864.d, 0000 0004 1936 7929)
- Susan Morand
- University of Toledo, Department of Medicine, Toledo, OH USA (grid.267337.4, 0000 0001 2184 944X)
- Khalil Choucair
- University of Kansas School of Medicine, Wichita, KS USA (grid.266515.3, 0000 0001 2106 0692)
|
13
|
Gouda G, Gupta MK, Donde R, Behera L, Vadde R. Metabolic pathway-based target therapy to hepatocellular carcinoma: a computational approach. Theranostics and Precision Medicine for the Management of Hepatocellular Carcinoma, Volume 2 2022:83-103. [DOI: 10.1016/b978-0-323-98807-0.00003-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2023]
|
14
|
Chen L, Li Z, Zeng T, Zhang YH, Zhang S, Huang T, Cai YD. Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features. Front Genet 2021; 12:783128. [PMID: 34804131 PMCID: PMC8603309 DOI: 10.3389/fgene.2021.783128] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/25/2021] [Accepted: 10/22/2021] [Indexed: 12/12/2022] Open
Abstract
Given the limitations of current technologies, the subcellular localizations of proteins are difficult to identify. Predicting the subcellular localization and the intercellular distribution patterns of proteins in accordance with their specific biological roles, including validated functions, relationships with other proteins, and even their specific sequence characteristics, is necessary. The computational prediction of protein subcellular localizations can be performed on the basis of the sequence and the functional characteristics. In this study, the protein-protein interaction network, functional annotation of proteins and a group of direct proteins with known subcellular localization were used to construct models. To build efficient models, several powerful machine learning algorithms, including two feature selection methods and four classification algorithms, were employed. Some key proteins and functional terms were discovered, which may provide important contributions for determining protein subcellular locations. Furthermore, some quantitative rules were established to identify the potential subcellular localizations of proteins. As the first prediction model that uses direct protein annotation information (i.e., functional features) and a STRING-based protein-protein interaction network (i.e., network features), our computational model can help promote the development of predictive technologies for subcellular localization and provide a new approach for exploring protein subcellular localization patterns and their potential biological importance.
Affiliation(s)
- Lei Chen
- School of Life Sciences, Shanghai University, Shanghai, China
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
| | - ZhanDong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, China
| | - Tao Zeng
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Hang Zhang
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - ShiQi Zhang
- Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
|
15
|
Hozhabri H, Lashkari A, Razavi SM, Mohammadian A. Integration of gene expression data identifies key genes and pathways in colorectal cancer. Med Oncol 2021; 38:7. [PMID: 33411100 DOI: 10.1007/s12032-020-01448-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/23/2020] [Accepted: 11/21/2020] [Indexed: 12/16/2022]
Abstract
Colorectal cancer (CRC) is one of the most common malignant tumors and a prevalent cause of cancer-related death worldwide. In this study, we analyzed the gene expression profiles of patients with CRC with the aim of better understanding the molecular mechanism and key genes in CRC. Four gene expression profiles, GSE9348, GSE41328, GSE41657, and GSE113513, were downloaded from the GEO database. The data were processed using the R programming language, and 319 common differentially expressed genes (DEGs), including 94 up-regulated and 225 down-regulated genes, were identified. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted to find the most significantly enriched pathways in CRC. Based on the GO and KEGG pathway analysis, the most important dysregulated pathways were regulation of cell proliferation, bicarbonate transport, Wnt and IL-17 signaling pathways, and nitrogen metabolism. The protein-protein interaction (PPI) network of the DEGs was constructed using Cytoscape software, and MYC, CXCL1, CD44, MMP1, and CXCL12 were identified as the most critical hub genes. The present study enhances our understanding of the molecular mechanisms of CRC, and the identified genes might potentially be applied in CRC treatment strategies as molecular targets and diagnostic biomarkers.
Affiliation(s)
- Hossein Hozhabri
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.
| | - Ali Lashkari
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Seyed-Morteza Razavi
- Department of Cell and Molecular Biology, Faculty of Biological Sciences, Kharazmi University, Tehran, Iran.,Salari Institute of Cognitive and Behavioral Disorders (SICBD), Karaj, Alborz, Iran.,Systems Biology Research Lab, Bioinformatics Group, Systems Biology of Next Generation Company (SBNGC), Qom, Iran
| | - Ali Mohammadian
- Department of Medical Biotechnology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran.
|
16
|
Qian G, Ho JWK. Challenges and emerging systems biology approaches to discover how the human gut microbiome impact host physiology. Biophys Rev 2020; 12:851-863. [PMID: 32638331 PMCID: PMC7429608 DOI: 10.1007/s12551-020-00724-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/29/2020] [Accepted: 07/02/2020] [Indexed: 02/07/2023] Open
Abstract
Research on the human gut microbiome has bloomed with advances in next generation sequencing (NGS) and other high-throughput molecular profiling technologies. This has enabled the generation of multi-omics datasets, which hold promise for big data-enabled knowledge acquisition in the form of understanding the normal physiological and pathological involvement of gut microbiomes. Ample evidence suggests that distinct microbial compositions in the human gut are associated with different diseases. However, the biological mechanisms underlying these associations are often unclear. There is a need to move beyond statistical associations to discover how changes in the gut microbiota mechanistically affect host physiology and disease development. This review summarises state-of-the-art big data and systems biology approaches for mechanism discovery.
Affiliation(s)
- Gordon Qian
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Joshua W K Ho
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong.
|
17
|
Wang Q, Ye J, Fang D, Lv L, Wu W, Shi D, Li Y, Yang L, Bian X, Wu J, Jiang X, Wang K, Wang Q, Hodson MP, Thibaut LM, Ho JWK, Giannoulatou E, Li L. Multi-omic profiling reveals associations between the gut mucosal microbiome, the metabolome, and host DNA methylation associated gene expression in patients with colorectal cancer. BMC Microbiol 2020; 20:83. [PMID: 32321427 PMCID: PMC7178946 DOI: 10.1186/s12866-020-01762-2] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 02/24/2020] [Accepted: 03/23/2020] [Indexed: 12/24/2022] Open
Abstract
Background The human gut microbiome plays a critical role in the carcinogenesis of colorectal cancer (CRC). However, a comprehensive analysis of the interaction between the host and microbiome is still lacking. Results We found correlations between the change in abundance of microbial taxa, butyrate-related colonic metabolites, and methylation-associated host gene expression in colonic tumour mucosa tissues compared with the adjacent normal mucosa tissues. The increase of genus Fusobacterium abundance was correlated with a decrease in the level of 4-hydroxybutyric acid (4-HB) and expression of immune-related peptidase inhibitor 16 (PI16), Fc Receptor Like A (FCRLA) and Lymphocyte Specific Protein 1 (LSP1). The decrease in the abundance of another potentially 4-HB-associated genus, Prevotella 2, was also found to be correlated with the down-regulated expression of metallothionein 1 M (MT1M). Additionally, the increase of glutamic acid-related family Halomonadaceae was correlated with the decreased expression of reelin (RELN). The decreased abundance of genus Paeniclostridium and genus Enterococcus were correlated with increased lactic acid level, and were also linked to the expression change of Phospholipase C Beta 1 (PLCB1) and Immunoglobulin Superfamily Member 9 (IGSF9) respectively. Interestingly, 4-HB, glutamic acid and lactic acid are all butyrate precursors, which may modify gene expression by epigenetic regulation such as DNA methylation. Conclusions Our study identified associations between previously reported CRC-related microbial taxa, butyrate-related metabolites and DNA methylation-associated gene expression in tumour and normal colonic mucosa tissues from CRC patients, which uncovered a possible mechanism of the role of microbiome in the carcinogenesis of CRC. In addition, these findings offer insight into potential new biomarkers, therapeutic and/or prevention strategies for CRC.
Affiliation(s)
- Qing Wang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China.,Computational Genomics Laboratory, Victor Chang Cardiac Research Institute, Sydney, Australia
| | - Jianzhong Ye
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Daiqiong Fang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Longxian Lv
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Wenrui Wu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Ding Shi
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Yating Li
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Liya Yang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Xiaoyuan Bian
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Jingjing Wu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Xianwan Jiang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Kaicen Wang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Qiangqiang Wang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China
| | - Mark P Hodson
- Freedman Foundation Metabolomics Facility, Victor Chang Innovation Centre, Victor Chang Cardiac Research Institute, Sydney, Australia.,School of Pharmacy, University of Queensland, Woolloongabba, QLD 4102, Australia
| | - Loïc M Thibaut
- Computational Genomics Laboratory, Victor Chang Cardiac Research Institute, Sydney, Australia.,School of Mathematics and Statistics, UNSW Sydney, Sydney, Australia
| | - Joshua W K Ho
- Bioinformatics and Systems Medicine Laboratory, Victor Chang Cardiac Research Institute, Sydney, Australia.,School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Eleni Giannoulatou
- Computational Genomics Laboratory, Victor Chang Cardiac Research Institute, Sydney, Australia. .,St Vincent's Clinical School, UNSW Sydney, Sydney, Australia.
| | - Lanjuan Li
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China. .,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases, Hangzhou, China.
|
18
|
Jin Y, Qin X. Comprehensive analysis of transcriptome data for identifying biomarkers and therapeutic targets in head and neck squamous cell carcinoma. Annals of Translational Medicine 2020; 8:282. [PMID: 32355726 PMCID: PMC7186651 DOI: 10.21037/atm.2020.03.30] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Background Head and neck squamous cell carcinoma (HNSCC) is one of the most common malignancies worldwide. Accumulating evidence has highlighted the importance of transcriptome data during HNSCC tumorigenesis. The aim of this study was to identify significant genes as effective biomarkers for HNSCC and to construct a miRNA-mRNA regulatory network for a more comprehensive understanding of the underlying molecular mechanisms. Methods A total of four independent microarrays conducted on HNSCC samples were downloaded from the Gene Expression Omnibus (GEO) and analyzed with R software. FunRich was applied to predict potential transcription factors and target genes of miRNAs. A protein-protein interaction (PPI) network and a miRNA-mRNA regulatory network were constructed in Cytoscape. Additionally, the Database for Annotation, Visualization, and Integrated Discovery (DAVID) was utilized to perform GO and KEGG pathway enrichment analyses. Validation of gene expression levels was conducted using online databases and qPCR experiments. Results A total of 35 differentially expressed miRNAs (DEMs) and 193 differentially expressed mRNAs (DEGs) were screened out by the limma package in R. The interactive network of the overlapping DEGs presented three significant modules and ten hub genes (FN1, MMP3, SPP1, STAT1, LOX, CXCL5, CXCL11, ISG15, IFIT3, and RSAD2). Predicted target genes of DEMs were visualized in Cytoscape and six miRNA-mRNA regulatory pairs were identified. Further validation demonstrated the upregulation of SLC16A1 and COL4A1 in HNSCC. Conclusions We performed an integrated and comprehensive bioinformatics analysis of miRNAs and mRNAs in HNSCC, which helps to explore the underlying regulatory mechanisms and to identify genetic biomarkers and therapeutic targets for HNSCC.
Affiliation(s)
- Yu Jin
- Department of General Dentistry, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China.,Shanghai Key Laboratory of Stomatology and Shanghai Research Institute of Stomatology, National Clinical Research Center of Stomatology, Shanghai 200000, China
| | - Xing Qin
- Shanghai Key Laboratory of Stomatology and Shanghai Research Institute of Stomatology, National Clinical Research Center of Stomatology, Shanghai 200000, China.,Department of Oral and Maxillofacial-Head & Neck Oncology, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
|
19
|
Kudryavtseva AV, Lukyanova EN, Kharitonov SL, Nyushko KM, Krasheninnikov AA, Pudova EA, Guvatova ZG, Alekseev BY, Kiseleva MV, Kaprin AD, Dmitriev AA, Snezhkina AV, Krasnov GS. Bioinformatic identification of differentially expressed genes associated with prognosis of locally advanced lymph node-positive prostate cancer. J Bioinform Comput Biol 2020; 17:1950003. [PMID: 30866732 DOI: 10.1142/s0219720019500033] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 02/06/2023]
Abstract
Prostate cancer (PCa) is one of the primary causes of cancer-related mortality in men worldwide. Patients with locally advanced PCa with metastases in regional lymph nodes are usually marked as a high-risk group. One of the chief concerns for this group is to make an informed decision about the necessity of conducting adjuvant androgen deprivation therapy after radical surgical treatment. During the oncogenic transformation and progression of the disease, the expression of many genes is altered. Some of these genes can serve as markers for diagnosis, predicting the prognosis or effectiveness of drug therapy, as well as possible therapeutic targets. We undertook bioinformatic analysis of the RNA-seq data deposited in The Cancer Genome Atlas consortium database to identify possible prognostic markers. We compared the groups with favorable and unfavorable prognosis for the cohort of patients with PCa showing lymph node metastasis (pT2N1M0, pT3N1M0, and pT4N1M0) and for the most common molecular type carrying the fusion transcript TMPRSS2-ERG. For the entire cohort, we revealed at least six potential markers (IDO1, UGT2B15, IFNG, MUC6, CXCL11, and GBP1). Most of these genes are involved in the positive regulation of immune response. For the TMPRSS2-ERG subtype, we also identified six genes, the expression of which may be associated with prognosis: TOB1, GALNT7, INAFM1, APELA, RAC3, and NNMT. The identified genes, after additional studies and validation in the extended cohort, could serve as a prognostic marker of locally advanced lymph node-positive PCa.
Affiliation(s)
- Anna V Kudryavtseva
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Elena N Lukyanova
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Sergey L Kharitonov
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Kirill M Nyushko
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Alexey A Krasheninnikov
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Elena A Pudova
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Zulfiya G Guvatova
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Boris Y Alekseev
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Marina V Kiseleva
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Andrey D Kaprin
- † Federal State Budgetary Institution, National Medical Research Radiological Center of the Ministry of Health of the Russian Federation, 4 Korolev Str., Obninsk 249036, Russian Federation
| | - Alexey A Dmitriev
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - Anastasiya V Snezhkina
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
| | - George S Krasnov
- * Laboratory of Postgenomic Research, Engelhardt Institute of Molecular Biology Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russian Federation
|
20
|
Chen Y, Zhou ZF, Wang Y. Prediction and analysis of weighted genes in isoflurane induced general anesthesia based on network analysis. Int J Neurosci 2019; 130:610-620. [PMID: 31801399 DOI: 10.1080/00207454.2019.1701452] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/25/2023]
Abstract
Purpose: Isoflurane is still widely used in developing countries, and isoflurane-induced general anesthesia gives rise to serious side effects. The aim of the present study was to investigate the molecular mechanism of isoflurane-induced general anesthesia. Materials and methods: The microarray data of the GSE64617 dataset were downloaded from the Gene Expression Omnibus (GEO) database. A total of 755 DEGs were identified using the limma package in the R programming language. Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were conducted for the DEGs. A protein-protein interaction (PPI) network was constructed for the DEGs and sensory perception related genes. A global miRNA-mRNA regulatory network was constructed to reveal the interactions between miRNAs and mRNAs in isoflurane-treated samples. Degree was used to evaluate the importance of a gene in the PPI network and the miRNA-mRNA regulatory network. Results and conclusions: HMBOX1, CSNK2A1, PNN, SRRM1, PRPF40A, APCNTRK1, MAPK1, hsa-miR-16-5p, hsa-miR-424-5p, hsa-miR-497-5p and hsa-miR-17-5p were selected as weighted genes. The expression changes were further verified in rat models by quantitative real-time PCR (qPCR) analysis. In conclusion, we found several weighted mRNAs and miRNAs involved in isoflurane-induced general anesthesia through bioinformatics analysis.
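The abstract above ranks genes by degree, i.e. the number of distinct interaction partners a node has in the network. A minimal sketch of that ranking follows, assuming an undirected edge list as input; the function name `rank_by_degree` and the placeholder nodes `G1` to `G4` are hypothetical and do not correspond to genes from the study.

```python
from collections import defaultdict


def rank_by_degree(edges):
    """Rank nodes of an undirected network by degree (number of distinct neighbors).

    edges : iterable of (node_a, node_b) pairs; duplicates and self-loops are ignored.
    Returns a list of (node, degree), highest degree first, ties broken by name.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a self-loop adds no interaction partner
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(
        ((node, len(partners)) for node, partners in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy network with one duplicated edge (G2-G1).
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G2", "G1")]
toy_ranking = rank_by_degree(toy_edges)
```

Storing neighbors in sets means a duplicated or reversed edge does not inflate a node's degree.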
Affiliation(s)
- Yue Chen
- Department of Anesthesiology, Zhejiang Provincial People's Hospital, Hangzhou, China.,People's Hospital of Hangzhou Medical College, Hangzhou, China
| | - Zhen-Feng Zhou
- Department of Anesthesiology, Zhejiang Provincial People's Hospital, Hangzhou, China.,People's Hospital of Hangzhou Medical College, Hangzhou, China
| | - Yu Wang
- Department of Anesthesiology, Zhejiang Provincial People's Hospital, Hangzhou, China.,People's Hospital of Hangzhou Medical College, Hangzhou, China
|
21
|
Das D, Krishnan SR, Roy A, Bulusu G. A network-based approach reveals novel invasion and Maurer's clefts-related proteins in Plasmodium falciparum. Mol Omics 2019; 15:431-441. [PMID: 31631203 DOI: 10.1039/c9mo00124g] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/21/2023]
Abstract
Malaria continues to be a major concern in developing countries despite continuous efforts to find a cure for the disease. Understanding the pathogenesis mechanism is necessary to identify more effective drug targets against malaria. Many years of experimental research have generated a large amount of data for the malarial parasite, Plasmodium falciparum. These data are useful for understanding the importance of certain parasite proteins, but it often remains unclear how these proteins come together, interact with other proteins and carry out their function. Identification of all proteins involved in pathogenesis is an important step towards understanding the molecular mechanism of pathogenesis. In this study, dynamic stage-specific protein-protein interaction networks were created based on gene expression data during the parasite's intra-erythrocytic stages and static protein-protein interaction data. Using previously known proteins of a biological event as seed proteins, the random walk with restart (RWR) method was used on the dynamic protein-protein interaction networks to identify novel proteins related to that event. Two screening procedures, namely a permutation test and a GO enrichment test, were performed to increase the reliability of the RWR predictions. The proposed method was first validated on Plasmodium falciparum proteins related to invasion, where it could reproduce the existing knowledge from a small set of seed proteins. It was then used to identify novel Maurer's clefts resident proteins, where it could identify 152 parasite proteins. We show that the current approach can annotate conserved proteins with unknown function. The predicted proteins can help build a mechanistic model for disease pathogenesis, which will be useful in identifying new drug targets.
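The random walk with restart (RWR) step described above can be illustrated generically. The sketch below iterates p ← r·p0 + (1 − r)·W·p on a small undirected graph until the scores stop changing, where p0 puts equal mass on the seed nodes and W spreads each node's score evenly over its neighbors. It is an assumption-laden illustration, not the authors' implementation: the restart probability of 0.5, the function name, and the four-node path graph are all hypothetical, and the stage-specific network construction and screening tests from the study are not reproduced.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency : dict mapping node -> list of neighbor nodes (undirected graph,
                every node assumed to have at least one neighbor)
    seeds     : collection of seed nodes; the walker restarts uniformly among them
    Returns a dict node -> steady-state visiting probability.
    """
    nodes = list(adjacency)
    seed_set = set(seeds)
    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart component: jump back to the seeds with probability `restart`.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk component: each node passes its score evenly to its neighbors.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical path graph A - B - C - D with A as the only seed.
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_scores = random_walk_with_restart(toy_graph, seeds=["A"])
```

On this toy graph the scores fall off with distance from the seed, which is the property that lets RWR prioritize candidate proteins close to known ones.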
Affiliation(s)
- Dibyajyoti Das
- TCS Innovation Labs - Hyderabad (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, India.
|
22
|
Su ZD, Huang Y, Zhang ZY, Zhao YW, Wang D, Chen W, Chou KC, Lin H. iLoc-lncRNA: predict the subcellular location of lncRNAs by incorporating octamer composition into general PseKNC. Bioinformatics 2019; 34:4196-4204. [PMID: 29931187 DOI: 10.1093/bioinformatics/bty508] [Citation(s) in RCA: 124] [Impact Index Per Article: 24.8] [Received: 05/02/2018] [Accepted: 06/19/2018] [Indexed: 12/20/2022] Open
Abstract
Motivation Long non-coding RNAs (lncRNAs) are a class of RNA molecules with more than 200 nucleotides. They have important functions in cell development and metabolism, such as genetic markers, genome rearrangements, chromatin modifications, cell cycle regulation, transcription and translation. Their functions are generally closely related to their localization in the cell. Therefore, knowledge about their subcellular locations can provide very useful clues or preliminary insight into their biological functions. Although biochemical experiments could determine the localization of lncRNAs in a cell, they are both time-consuming and expensive. Therefore, it is highly desirable to develop bioinformatics tools for fast and effective identification of their subcellular locations. Results We developed a sequence-based bioinformatics tool called 'iLoc-lncRNA' to predict the subcellular locations of lncRNAs by incorporating the 8-tuple nucleotide features into the general PseKNC (Pseudo K-tuple Nucleotide Composition) via the binomial distribution approach. Rigorous jackknife tests have shown that the overall accuracy achieved by the new predictor on a stringent benchmark dataset is 86.72%, which is over 20% higher than that of the existing state-of-the-art predictor evaluated on the same tests. Availability and implementation A user-friendly webserver has been established at http://lin-group.cn/server/iLoc-LncRNA, by which users can easily obtain their desired results. Supplementary information Supplementary data are available at Bioinformatics online.
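The 8-tuple (octamer) nucleotide features mentioned above are, at their core, normalized counts of overlapping k-mers in a sequence. The sketch below shows only that counting step under stated assumptions: the function name `kmer_composition` and the short toy sequence are hypothetical, a DNA-style A/C/G/T alphabet is assumed, and the binomial-distribution feature selection used by iLoc-lncRNA is not reproduced.

```python
from collections import Counter


def kmer_composition(sequence, k=8):
    """Relative frequency of each overlapping k-mer in a nucleotide sequence.

    Windows containing characters outside A/C/G/T (e.g. N) are skipped.
    Returns a dict k-mer -> frequency; frequencies sum to 1 when any valid
    window exists, otherwise the dict is empty.
    """
    sequence = sequence.upper()
    valid = set("ACGT")
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= valid
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical toy sequence, using k=2 so the result is easy to inspect.
toy_composition = kmer_composition("ACGTAC", k=2)
```

With k = 8 there are 4^8 = 65,536 possible octamers, so the resulting feature vector is very sparse for any single transcript.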
Affiliation(s)
- Zhen-Dong Su
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Yan Huang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Zhao-Yue Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Ya-Wei Zhao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Dong Wang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.,College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Wei Chen
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.,Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, China.,Gordon Life Science Institute, Boston, MA, USA
| | - Kuo-Chen Chou
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.,Gordon Life Science Institute, Boston, MA, USA
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.,Gordon Life Science Institute, Boston, MA, USA
|
23
|
Donnio LM, Miquel C, Vermeulen W, Giglia-Mari G, Mari PO. Cell-type specific concentration regulation of the basal transcription factor TFIIH in XPB y/y mice model. Cancer Cell Int 2019; 19:237. [PMID: 31516394 PMCID: PMC6734240 DOI: 10.1186/s12935-019-0945-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/28/2019] [Accepted: 08/18/2019] [Indexed: 11/15/2022] Open
Abstract
Background The basal transcription/repair factor TFIIH is a ten-subunit complex essential for RNA polymerase II (RNAP2) transcription initiation and DNA repair. In both processes TFIIH acts as a DNA helix opener, required for promoter escape of RNAP2 in transcription initiation and to set the stage for strand incision within the nucleotide excision repair (NER) pathway. Methods We used a knock-in mouse model that we generated and that endogenously expresses a fluorescent version of XPB (XPB-YFP). Using different microscopy, cell biology and biochemistry approaches, we quantified the steady-state levels of this protein in different cells and in cells embedded in tissues. Results Here we demonstrate, via confocal imaging of ex vivo tissues and cells derived from this mouse model, that TFIIH steady-state levels are tightly regulated at the single-cell level, thus keeping nuclear TFIIH concentrations remarkably constant in a cell-type-dependent manner. Moreover, we show that individual cellular TFIIH levels are proportional to the speed of mRNA production, hence to a cell’s transcriptional activity, which we can correlate to proliferation status. Importantly, cancer tissue presents a higher TFIIH concentration than normal healthy tissue. Conclusion This study shows that TFIIH cellular concentration can be used as a bona fide quantitative marker of transcriptional activity and cellular proliferation.
Affiliation(s)
- Lise-Marie Donnio
- Institut NeuroMyoGène (INMG), CNRS, UMR 5310, INSERM U1217, Faculté de Médecine, Université Claude Bernard Lyon 1, 8 Avenue Rockefeller, 69008 Lyon, France
| | - Catherine Miquel
- Pathology Department, Saint-Louis Hospital, Université de Paris, 1 Avenue Claude Vellefaux, 75010 Paris, France
| | - Wim Vermeulen
- Department of Genetics, Erasmus MC, Dr Molewaterplein 50, 3015 GE Rotterdam, The Netherlands
| | - Giuseppina Giglia-Mari
- Institut NeuroMyoGène (INMG), CNRS, UMR 5310, INSERM U1217, Faculté de Médecine, Université Claude Bernard Lyon 1, 8 Avenue Rockefeller, 69008 Lyon, France
| | - Pierre-Olivier Mari
- Institut NeuroMyoGène (INMG), CNRS, UMR 5310, INSERM U1217, Faculté de Médecine, Université Claude Bernard Lyon 1, 8 Avenue Rockefeller, 69008 Lyon, France
| |
|
24
|
Lu S, Zhu ZG, Lu WC. Inferring novel genes related to colorectal cancer via random walk with restart algorithm. Gene Ther 2019; 26:373-385. [PMID: 31308477 DOI: 10.1038/s41434-019-0090-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/25/2018] [Revised: 05/20/2019] [Accepted: 06/11/2019] [Indexed: 12/12/2022]
Abstract
Colorectal cancer (CRC) is the third most common type of cancer. In recent decades, genomic analysis has played an increasingly important role in understanding the molecular mechanisms of CRC. However, its pathogenesis has not been fully uncovered. Identifying the genes related to CRC as completely as possible is an important way to investigate its pathogenesis. Therefore, we proposed a new computational method for the identification of novel CRC-associated genes. The proposed method is based on existing proven CRC-associated genes, human protein-protein interaction networks, and the random walk with restart algorithm. The utility of the method is indicated by comparing it with methods based on guilt-by-association or the shortest path algorithm. Using the proposed method, we successfully identified 298 novel CRC-associated genes. Previous studies have validated the involvement of the majority of these 298 novel genes in CRC-associated biological processes, thus suggesting the efficacy and accuracy of our method.
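The random walk with restart step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the node names, and the restart probability of 0.7 are assumptions chosen only for illustration, and the graph is assumed to be undirected with every node listed as a key.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-12, max_iter=10_000):
    """Return steady-state visiting probabilities for a walk that restarts at the seeds.

    adjacency: dict mapping each node to a list of its neighbours (hypothetical toy input).
    seeds: set of known disease-associated nodes; the walk restarts uniformly among them.
    restart: probability of jumping back to a seed at each step (0.7 is an assumed value).
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction `restart` of the mass always returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads the remaining mass evenly over its neighbours.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue  # isolated nodes pass nothing on in this simple sketch
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network A - B - C - D with a single seed gene "A".
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy_network, {"A"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

Nodes that are not seeds but receive high scores are the candidates such a method would propose; in the toy network the ranking simply follows distance from the seed.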
Affiliation(s)
- Sheng Lu
- Department of General Surgery, Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Institute of Digestive Surgery, Shanghai, 200025, China
| | - Zheng-Gang Zhu
- Department of General Surgery, Rui Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Institute of Digestive Surgery, Shanghai, 200025, China
| | - Wen-Cong Lu
- Department of Chemistry, College of Sciences, Shanghai University, Shanghai, 200444, China.
| |
|
25
|
Li M, Guo Y, Feng YM, Zhang N. Identification of Triple-Negative Breast Cancer Genes and a Novel High-Risk Breast Cancer Prediction Model Development Based on PPI Data and Support Vector Machines. Front Genet 2019; 10:180. [PMID: 30930932 PMCID: PMC6428707 DOI: 10.3389/fgene.2019.00180] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/31/2018] [Accepted: 02/19/2019] [Indexed: 12/20/2022] Open
Abstract
Triple-negative breast cancer (TNBC) is a special subtype of breast cancer that is difficult to treat. It is crucial to identify breast cancer-related genes that could provide new biomarkers for breast cancer diagnosis and potential treatment targets. In the development of our new high-risk breast cancer prediction model, seven raw gene expression datasets from the NCBI Gene Expression Omnibus (GEO) database (GSE31519, GSE9574, GSE20194, GSE20271, GSE32646, GSE45255, and GSE15852) were used. Using the maximum relevance minimum redundancy (mRMR) method, we selected significant genes. Then, we mapped transcripts of the genes onto the protein-protein interaction (PPI) network from the Search Tool for the Retrieval of Interacting Genes (STRING) database and traced the shortest path between each pair of proteins. Genes with higher betweenness values were selected from the shortest-path proteins. To ensure validity and precision, a permutation test was performed: we randomly selected 248 proteins from the PPI network for shortest-path tracing and repeated the procedure 100 times. We also removed genes that appeared more frequently in the randomized results. As a result, 54 genes were selected as potential TNBC-related genes. Using 14 of the 54 genes as input features to a support vector machine (SVM), a novel model was trained to predict high-risk breast cancer. The prediction accuracy for normal tissues and TNBC tissues reached 95.394%, and that for Stage II and Stage III TNBC reached 86.598%, indicating that such genes play important roles in distinguishing breast cancers and that the method could be promising in practical use. Some of the 54 genes we identified from the PPI network have been reported in the literature to be associated with breast cancer. Several other genes have not yet been reported but functionally resemble known cancer genes. These may be novel breast cancer-related genes and need further experimental validation. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed to appraise the 54 genes. The results indicated that cellular response to organic cyclic compounds has an influence in breast cancer, and that most genes may be related to viral carcinogenesis.
Affiliation(s)
- Ming Li
- Department of Biomedical Engineering, Tianjin Key Lab of BME Measurement, Tianjin University, Tianjin, China
| | - Yu Guo
- Department of Biomedical Engineering, Tianjin Key Lab of BME Measurement, Tianjin University, Tianjin, China
| | - Yuan-Ming Feng
- Department of Biomedical Engineering, Tianjin Key Lab of BME Measurement, Tianjin University, Tianjin, China
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, Tianjin, China
| | - Ning Zhang
- Department of Biomedical Engineering, Tianjin Key Lab of BME Measurement, Tianjin University, Tianjin, China
| |
|
26
|
Zhao D, Liu H, Zheng Y, He Y, Lu D, Lyu C. Whale optimized mixed kernel function of support vector machine for colorectal cancer diagnosis. J Biomed Inform 2019; 92:103124. [PMID: 30796977 DOI: 10.1016/j.jbi.2019.103124] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/17/2018] [Revised: 01/15/2019] [Accepted: 02/04/2019] [Indexed: 12/17/2022]
Abstract
Microarray analysis is a prevalent technique for the classification and prediction of colorectal cancer (CRC). Nevertheless, microarray data suffer from the curse of dimensionality when feature genes of the disease are selected from imbalanced samples, which causes low prediction accuracy. Hence, it is of vital significance to build proper models that avoid these problems and predict CRC more accurately. In this paper, we use an ensemble model to classify samples into healthy and CRC groups and improve prediction performance. The proposed model is composed of three functional modules. The first module removes redundant genes: the main feature genes are selected using the minimum redundancy maximum relevance (mRMR) method to reduce the dimensionality of the features, thereby improving prediction results. The second module addresses the problem caused by imbalanced data using the hybrid sampling algorithm RUSBoost. The third module focuses on optimizing the classification algorithm. We use a mixed kernel function (MKF) based support vector machine (SVM) model to classify an unknown sample as a healthy individual or a CRC patient, and the Whale Optimization Algorithm (WOA) is then applied to find the optimal parameters of the proposed MKF-SVM. The final results show that the proposed model achieves higher G-means than other comparable models. We conclude that the RUSBoost wrapping WOA + MKF-SVM model can be applied to improve the predictive performance for colorectal cancer on imbalanced data.
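The abstract reports G-means rather than plain accuracy because the classes are imbalanced. A minimal sketch of that metric follows, under the usual definition (geometric mean of sensitivity and specificity); the labels are made-up toy data and the function name is hypothetical.

```python
from math import sqrt


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for a binary classifier."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)


# Toy imbalanced data: 2 positive samples, 8 negative samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10                        # 80% accurate, but finds no positives
mixed_predictions = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

g_trivial = g_mean(y_true, always_negative)       # sensitivity is 0, so the G-mean is 0
g_mixed = g_mean(y_true, mixed_predictions)       # sqrt(0.5 * 0.875)
```

A classifier that always predicts the majority class scores well on accuracy but gets a G-mean of zero, which is why the metric suits imbalanced data.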
Affiliation(s)
- Dandan Zhao
- School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
| | - Hong Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China.
| | - Yuanjie Zheng
- School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
| | - Yanlin He
- School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
| | - Dianjie Lu
- School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
| | - Chen Lyu
- School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
| |
|
27
|
Inferring Drug-Protein–Side Effect Relationships from Biomedical Text. Genes (Basel) 2019; 10:genes10020159. [PMID: 30791472 PMCID: PMC6409686 DOI: 10.3390/genes10020159] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/24/2019] [Revised: 02/13/2019] [Accepted: 02/14/2019] [Indexed: 11/16/2022] Open
Abstract
Background: Although there are many studies of drugs and their side effects, the underlying mechanisms of these side effects are not well understood. It is also difficult to understand the specific pathways between drugs and side effects. Objective: The present study seeks to construct putative paths between drugs and their side effects by applying text-mining techniques to the free text of biomedical studies, and to develop ranking metrics that could identify the most likely paths. Materials and Methods: We extracted three types of relationships (drug-protein, protein-protein, and protein–side effect) from biomedical texts by using text mining and predefined relation-extraction rules. Based on the extracted relationships, we constructed whole drug-protein–side effect paths. For each path, we calculated its ranking score with a new ranking function that combines corpus- and ontology-based semantic similarity as well as co-occurrence frequency. Results: We extracted 13 plausible biomedical paths connecting drugs and their side effects from cancer-related abstracts in the PubMed database. The top 20 paths were examined, and the proposed ranking function outperformed the other methods tested, including co-occurrence, COALS, and UMLS, by P@5-P@20. In addition, we confirmed that the paths are novel hypotheses that are worth investigating further. Discussion: The risk of side effects has been an important issue for the US Food and Drug Administration (FDA). However, the causes and mechanisms of such side effects have not been fully elucidated. This study extends previous research on understanding drug side effects by using various techniques such as Named Entity Recognition (NER), Relation Extraction (RE), and semantic similarity. Conclusion: It is not easy to reveal the biomedical mechanisms of side effects because of the huge number of possible paths. However, we automatically generated predictable paths using the proposed approach, which could provide meaningful information to biomedical researchers to generate plausible hypotheses for the understanding of such mechanisms.
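The P@5-P@20 figures in this abstract refer to precision at rank k. A minimal sketch of that measure follows, assuming the standard definition (fraction of the top k ranked items judged relevant); the path identifiers and relevance judgments are hypothetical.

```python
def precision_at_k(ranked_items, relevant, k):
    """Fraction of the first k ranked items that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Dividing by k (not len(top_k)) penalises rankings shorter than k.
    return hits / k


# Hypothetical ranking of five candidate paths, three of which were judged plausible.
ranking = ["path1", "path2", "path3", "path4", "path5"]
judged_relevant = {"path1", "path3", "path4"}

p_at_1 = precision_at_k(ranking, judged_relevant, 1)
p_at_3 = precision_at_k(ranking, judged_relevant, 3)
p_at_5 = precision_at_k(ranking, judged_relevant, 5)
```

Comparing two ranking functions then amounts to computing this value for each at the same cut-offs, for example k = 5, 10, 15, 20.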
|
28
|
Tian B, Wu X, Chen C, Qiu W, Ma Q, Yu B. Predicting protein–protein interactions by fusing various Chou's pseudo components and using wavelet denoising approach. J Theor Biol 2019; 462:329-346. [DOI: 10.1016/j.jtbi.2018.11.011] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/13/2018] [Revised: 11/08/2018] [Accepted: 11/15/2018] [Indexed: 12/26/2022]
|
29
|
Rezaei-Tavirani S, Rostami-Nejad M, Montazar F. Highlighted role of VEGFA in follow up of celiac disease. Gastroenterology and Hepatology from Bed to Bench 2019; 12:254-259. [PMID: 31528310 PMCID: PMC6668760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
AIM Evolution of gene expression change of intestine tissue in celiac patients to find a new molecular perspective on the disease is the aim of this study. BACKGROUND Celiac disease (CD) is an autoimmune disease known as an immune reaction to gluten in patients. It is reported that genetic and environmental conditions are important in the onset and progress of CD. METHODS Gene expression profiles of intestinal tissue in 12 celiac patients and 12 healthy controls were downloaded from the Gene Expression Omnibus (GEO) and verified by boxplot analysis. The significant, selected differentially expressed genes (DEGs) were included in a protein-protein interaction (PPI) network analysis. The central nodes were identified by a network analyzer. RESULTS The network was constructed from 161 query DEGs and 50 additional neighbors. GTF2H1, VEGFA, SUMO1, RAD51, MED21, BBP4, LEP, and MAP2K7 as potent hub nodes and LRP5, RABGEF1, BCAS2, DYRK1B, AOC3, RABL2A, CRTAP, VEGFA, and SPOPL as potent bottlenecks are introduced as crucial nodes. CONCLUSION Among the crucial DEGs, vascular endothelial growth factor A (VEGFA) was highlighted as an important biomarker candidate for follow-up of celiac patients.
Affiliation(s)
- Sina Rezaei-Tavirani
- Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mohammad Rostami-Nejad
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Fatemeh Montazar
- Firoozabadi Clinical Research Development Unit (FACRDU), Iran University of Medical Sciences, Tehran, Iran
| |
|
30
|
Jia J, Li X, Qiu W, Xiao X, Chou KC. iPPI-PseAAC(CGR): Identify protein-protein interactions by incorporating chaos game representation into PseAAC. J Theor Biol 2019; 460:195-203. [DOI: 10.1016/j.jtbi.2018.10.021] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 08/05/2018] [Revised: 09/16/2018] [Accepted: 10/08/2018] [Indexed: 01/11/2023]
|
31
|
Zhu Y, Bian Y, Zhang Q, Hu J, Li L, Yang M, Qian H, Yu L, Liu B, Qian X. Construction and analysis of dysregulated lncRNA-associated ceRNA network in colorectal cancer. J Cell Biochem 2018; 120:9250-9263. [PMID: 30525245 DOI: 10.1002/jcb.28201] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 10/15/2018] [Accepted: 11/15/2018] [Indexed: 12/26/2022]
Abstract
Colorectal cancer (CRC) is one of the most frequently diagnosed digestive system cancers. The aim of the present study was to investigate the interactions among messenger RNAs (mRNAs), microRNAs (miRNAs), and long noncoding RNAs (lncRNAs) in CRC to reveal the mechanisms of CRC. Differentially expressed genes (DEGs) were identified from public gene expression data sets. One thousand eighty-one common dysregulated mRNAs in two data sets were identified. Gene function analysis and protein-protein interaction network analysis indicated that these DEGs might play important roles in CRC. LINC00365 was selected through coding-noncoding network analysis, and its upregulation was validated in 22 paired clinical samples and four CRC cell lines. A competing endogenous RNA network composed of 70 miRNAs, nine mRNAs, and LINC00365 was constructed. Eight of the nine mRNAs were validated as upregulated in The Cancer Genome Atlas data set. Our results suggested that LINC00365 was an oncogene in CRC and that it could regulate the expression of several mRNAs by sponging miRNAs.
Affiliation(s)
- Yiping Zhu
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Clinical College of Nanjing Medical University, Nanjing, Jiangsu, China; Department of Oncology, Yijishan Hospital, Wannan Medical College, Wuhu, Anhui, China
| | - Yinzhu Bian
- Department of Oncology, The First People's Hospital of Yancheng, Yancheng, Jiangsu, China
| | - Qun Zhang
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| | - Jing Hu
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| | - Li Li
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| | - Mi Yang
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| | - Hanqing Qian
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| | - Lixia Yu
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| | - Baorui Liu
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Clinical College of Nanjing Medical University, Nanjing, Jiangsu, China; Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| | - Xiaoping Qian
- Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Clinical College of Nanjing Medical University, Nanjing, Jiangsu, China; Comprehensive Cancer Center, Nanjing Drum Tower Hospital, Medical School of Nanjing University, Clinical Cancer Institute of Nanjing University, Nanjing, Jiangsu, China
| |
|
32
|
Ju Z, Wang SY. Prediction of S-sulfenylation sites using mRMR feature selection and fuzzy support vector machine algorithm. J Theor Biol 2018; 457:6-13. [DOI: 10.1016/j.jtbi.2018.08.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/17/2018] [Revised: 08/07/2018] [Accepted: 08/15/2018] [Indexed: 11/29/2022]
|
33
|
Elliott A, Leicht E, Whitmore A, Reinert G, Reed-Tsochas F. A nonparametric significance test for sampled networks. Bioinformatics 2018; 34:64-71. [PMID: 29036452 PMCID: PMC5870844 DOI: 10.1093/bioinformatics/btx419] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/01/2016] [Accepted: 06/30/2017] [Indexed: 12/31/2022] Open
Abstract
Motivation Our work is motivated by an interest in constructing a protein–protein interaction network that captures key features associated with Parkinson’s disease. While there is an abundance of subnetwork construction methods available, it is often far from obvious which subnetwork is the most suitable starting point for further investigation. Results We provide a method to assess whether a subnetwork constructed from a seed list (a list of nodes known to be important in the area of interest) differs significantly from a randomly generated subnetwork. The proposed method uses a Monte Carlo approach. As different seed lists can give rise to the same subnetwork, we control for redundancy by constructing a minimal seed list as the starting point for the significance test. The null model is based on random seed lists of the same length as a minimum seed list that generates the subnetwork; in this random seed list the nodes have (approximately) the same degree distribution as the nodes in the minimum seed list. We use this null model to select subnetworks which deviate significantly from random on an appropriate set of statistics and might capture useful information for a real-world protein–protein interaction network. Availability and implementation The software used in this paper is available for download at https://sites.google.com/site/elliottande/. The software is written in Python and uses the NetworkX library. Supplementary information Supplementary data are available at Bioinformatics online.
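The last step of a Monte Carlo significance test like the one described here is an empirical p-value. The sketch below shows only that step and is not the authors' software: the function names are hypothetical, the test is assumed to be one-sided (large statistics count as extreme), and the null sampler is a placeholder for drawing a degree-matched random seed list and computing a subnetwork statistic.

```python
import random


def empirical_p_value(observed, null_values):
    """One-sided Monte Carlo p-value with the usual +1 correction."""
    at_least_as_extreme = sum(1 for value in null_values if value >= observed)
    return (1 + at_least_as_extreme) / (1 + len(null_values))


def monte_carlo_test(observed, sample_null_statistic, n_draws=999, seed=0):
    """Draw n_draws null statistics and compare them with the observed one.

    sample_null_statistic: callable taking a random.Random instance and returning
    one statistic computed under the null model (a stand-in for a statistic of a
    subnetwork built from a random seed list).
    """
    rng = random.Random(seed)
    null_values = [sample_null_statistic(rng) for _ in range(n_draws)]
    return empirical_p_value(observed, null_values)


# Deterministic illustration: 5 of the 9 null values are >= 5, so p = (1 + 5) / (1 + 9).
p_example = empirical_p_value(5, [1, 2, 3, 4, 5, 6, 7, 8, 9])
```

The +1 in numerator and denominator keeps the estimate from ever being exactly zero, so the smallest reportable p-value is 1 / (n_draws + 1).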
Affiliation(s)
- Andrew Elliott
- CABDyN Complexity Centre, Saïd Business School, University of Oxford, Oxford OX1 1HP, UK
| | - Elizabeth Leicht
- CABDyN Complexity Centre, Saïd Business School, University of Oxford, Oxford OX1 1HP, UK
| | - Gesine Reinert
- Department of Statistics, University of Oxford, Oxford, UK
| | - Felix Reed-Tsochas
- CABDyN Complexity Centre, Saïd Business School, University of Oxford, Oxford OX1 1HP, UK; Oxford Martin School, University of Oxford, Oxford, UK
| |
|
34
|
Cai L, Huang T, Su J, Zhang X, Chen W, Zhang F, He L, Chou KC. Implications of Newly Identified Brain eQTL Genes and Their Interactors in Schizophrenia. Molecular Therapy. Nucleic Acids 2018; 12:433-442. [PMID: 30195780 PMCID: PMC6041437 DOI: 10.1016/j.omtn.2018.05.026] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 03/11/2018] [Revised: 05/19/2018] [Accepted: 05/30/2018] [Indexed: 12/21/2022]
Abstract
Schizophrenia (SCZ) is a devastating genetic mental disorder. Identification of the SCZ risk genes in the brain is helpful for understanding this disease. Thus, we first used the minimum Redundancy-Maximum Relevance (mRMR) approach to integrate the genome-wide sequence analysis results on SCZ and the expression quantitative trait locus (eQTL) data from ten brain tissues to identify the genes related to SCZ. Second, we adopted the variance inflation factor regression algorithm to identify their interacting genes in the brain. Third, using multiple analysis methods, we explored and validated their roles. By means of the aforementioned procedures, we have found that (1) the cerebellum may play a crucial role in the pathogenesis of SCZ and (2) ITIH4 may be utilized as a clinical biomarker for the diagnosis of SCZ. These interesting findings may stimulate novel strategies for developing new drugs against SCZ. It has not escaped our notice that the approach reported here is of use for studying many other genome diseases as well.
Affiliation(s)
- Lei Cai
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Genetics and Development, Shanghai Mental Health Center, Shanghai Jiaotong University, Shanghai 200240, China; Gordon Life Science Institute, Boston, MA 02478, USA; Shanghai Center for Women and Children's Health, Shanghai 200062, China.
| | - Tao Huang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Genetics and Development, Shanghai Mental Health Center, Shanghai Jiaotong University, Shanghai 200240, China; Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Jingjing Su
- Department of Neurology, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, Shanghai 200011, China
| | - Xinxin Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Genetics and Development, Shanghai Mental Health Center, Shanghai Jiaotong University, Shanghai 200240, China
| | - Wenzhong Chen
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Genetics and Development, Shanghai Mental Health Center, Shanghai Jiaotong University, Shanghai 200240, China
| | - Fuquan Zhang
- Department of Psychiatry, Wuxi Mental Health Center, Nanjing Medical University, Wuxi 214015, China
| | - Lin He
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Genetics and Development, Shanghai Mental Health Center, Shanghai Jiaotong University, Shanghai 200240, China; Shanghai Center for Women and Children's Health, Shanghai 200062, China.
| | - Kuo-Chen Chou
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Collaborative Innovation Center for Genetics and Development, Shanghai Mental Health Center, Shanghai Jiaotong University, Shanghai 200240, China; Gordon Life Science Institute, Boston, MA 02478, USA; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| |
|
35
|
More TH, Taware R, Taunk K, Chanukuppa V, Naik V, Mane A, Rapole S. Investigation of altered urinary metabolomic profiles of invasive ductal carcinoma of breast using targeted and untargeted approaches. Metabolomics 2018; 14:107. [PMID: 30830381 DOI: 10.1007/s11306-018-1405-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/14/2017] [Accepted: 08/01/2018] [Indexed: 01/22/2023]
Abstract
INTRODUCTION Invasive ductal carcinoma (IDC) is a type of breast cancer that is usually detected in advanced stages because of its asymptomatic nature, which ultimately leads to a low survival rate. Identifying the urinary metabolic adaptations induced by IDC, in order to understand the disease pathophysiology and monitor therapy response, would be a helpful approach in clinical settings. Moreover, it is a non-invasive and cost-effective strategy, better suited to minimizing apprehension among the high-risk population. OBJECTIVE This study aims to investigate the urinary metabolic alterations of IDC by targeted (LC-MRM/MS) and untargeted (GC-MS) approaches for a better understanding of the disease pathophysiology and for monitoring therapy response. METHODS Urinary metabolic alterations of IDC subjects (63) and control subjects (63) were explored by targeted (LC-MRM/MS) and untargeted (GC-MS) approaches. An IDC-specific urinary metabolomics signature was extracted by applying both univariate and multivariate statistical tools. RESULTS Statistical analysis identified 39 urinary metabolites with the highest contribution to metabolomic alterations specific to IDC. Of these, 19 metabolites were identified from the targeted LC-MRM/MS analysis, while 20 were identified from the untargeted GC-MS analysis. Receiver operating characteristic (ROC) curve analysis identified the 6 most discriminatory metabolites from each type of approach that could differentiate between IDC subjects and controls with higher sensitivity and specificity. Furthermore, metabolic pathway analysis revealed several dysregulated pathways in IDC, including sugar, amino acid, and nucleotide metabolism and the TCA cycle. CONCLUSIONS Overall, this study provides valuable inputs regarding altered urinary metabolites, which improved our knowledge of urinary metabolomic alterations induced by IDC. Moreover, this study identified several dysregulated metabolic pathways, which offer further insight into the disease pathophysiology.
Affiliation(s)
- Tushar H More
- Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, 411007, MH, India
- Savitribai Phule Pune University, Ganeshkhind, Pune, 411007, MH, India
| | - Ravindra Taware
- Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, 411007, MH, India
| | - Khushman Taunk
- Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, 411007, MH, India
| | - Venkatesh Chanukuppa
- Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, 411007, MH, India
- Savitribai Phule Pune University, Ganeshkhind, Pune, 411007, MH, India
| | - Venkateshwarlu Naik
- Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, 411007, MH, India
| | - Anupama Mane
- Grant Medical Foundation, Ruby Hall Clinic, Pune, 411001, MH, India
| | - Srikanth Rapole
- Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, 411007, MH, India.
| |
|
36
|
Seal A, Wild DJ. Netpredictor: R and Shiny package to perform drug-target network analysis and prediction of missing links. BMC Bioinformatics 2018; 19:265. [PMID: 30012095 PMCID: PMC6047136 DOI: 10.1186/s12859-018-2254-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/25/2018] [Accepted: 06/18/2018] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Netpredictor is an R package for prediction of missing links in any given unipartite or bipartite network. The package provides utilities to compute missing links in bipartite as well as unipartite networks using the Random Walk with Restart and Network inference algorithms and a combination of both. The package also allows computation of bipartite network properties, visualization of communities for two different sets of nodes, and calculation of significant interactions between two sets of nodes using permutation-based testing. The application can also be used to search for top-K shortest paths in an interactome and to run enrichment analysis for disease, pathway and ontology. The R standalone package (including detailed introductory vignettes) and the associated R Shiny web application are available under the GPL-2 Open Source license and are free to download. RESULTS We compared the performance of different algorithms on different small datasets and found that random walk outperforms the rest of the algorithms. The package was developed to perform network-based prediction on unipartite and bipartite networks and to use the results to understand the functionality of proteins in an interactome using enrichment analysis. CONCLUSION A rapid application development environment like Shiny helps non-programmers to develop fast, rich visualization apps, and we believe it will continue to grow in the future with further enhancements. We plan to update the algorithms in the package in the near future and to help scientists analyse data in a much more streamlined fashion.
Affiliation(s)
- Abhik Seal
- School of Informatics and Computing, Indiana University Bloomington, Informatics West, Bloomington, 47408, Indiana, USA
| | - David J Wild
- School of Informatics and Computing, Indiana University Bloomington, Informatics West, Bloomington, 47408, Indiana, USA.
| |
|
37
|
Zhang TM, Huang T, Wang RF. Cross talk of chromosome instability, CpG island methylator phenotype and mismatch repair in colorectal cancer. Oncol Lett 2018; 16:1736-1746. [PMID: 30008861 PMCID: PMC6036478 DOI: 10.3892/ol.2018.8860] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/12/2017] [Accepted: 05/22/2018] [Indexed: 12/20/2022] Open
Abstract
Colorectal cancer is a severe cancer associated with high prevalence and fatality rates. There are three major mechanisms for colorectal cancer: (1) chromosome instability (CIN), (2) CpG island methylator phenotype (CIMP) and (3) mismatch repair (MMR), of which CIN is the most common type. However, these subtypes are not mutually exclusive and overlap. To investigate their biological mechanisms and cross talk, the gene expression profiles of 585 colorectal cancer patients with CIN, CIMP and MMR status records were collected. By comparing the CIN+ and CIN- samples, CIMP+ and CIMP- samples, and MMR+ and MMR- samples with the minimal redundancy maximal relevance (mRMR) and incremental feature selection (IFS) methods, the CIN-, CIMP- and MMR-associated genes were selected. Unfortunately, there was little direct overlap among them. To investigate their indirect interactions, downstream genes of CIN, CIMP and MMR were identified using the random walk with restart (RWR) method, and a greater overlap of downstream genes was found. The common downstream genes were involved in biosynthetic and metabolic pathways. These findings were consistent with the clinical observation of wide-ranging metabolite aberrations in colorectal cancer. To conclude, the present study not only gave a gene-level explanation of CIN, CIMP and MMR, but also showed the network-level cross talk among them. The common genes of CIN, CIMP and MMR may be useful for cross-subtype general colorectal cancer drug development.
Affiliation(s)
- Tian-Ming Zhang
- Department of Colorectal and Anal Surgery, Jinhua Hospital of Zhejiang University, Jinhua, Zhejiang 321000, P.R. China
| | - Tao Huang
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, P.R. China
| | - Rong-Fei Wang
- Department of Colorectal and Anal Surgery, Jinhua People's Hospital, Jinhua, Zhejiang 321000, P.R. China
| |
|
38
|
Prediction of the aquatic toxicity of aromatic compounds to tetrahymena pyriformis through support vector regression. Oncotarget 2018; 8:49359-49369. [PMID: 28467816 PMCID: PMC5564774 DOI: 10.18632/oncotarget.17210] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 03/10/2017] [Accepted: 03/30/2017] [Indexed: 01/24/2023] Open
Abstract
Toxicity evaluation is an extremely important process during drug development. It usually starts with animal experiments, which are time-consuming and costly. To speed up this process, a quantitative structure-activity relationship (QSAR) study was performed to develop a computational model correlating the structures of 581 aromatic compounds with their aquatic toxicity to Tetrahymena pyriformis. A set of 68 molecular descriptors derived solely from the structures of the aromatic compounds was calculated using Gaussian 03, HyperChem 7.5, and TSAR V3.3. A comprehensive feature selection method, the minimum Redundancy Maximum Relevance (mRMR)-genetic algorithm (GA)-support vector regression (SVR) method, was applied to select the best descriptor subset for the QSAR analysis. The SVR method was employed to model toxicity potency from a training set of 500 compounds. Five-fold cross-validation was used to optimize the parameters of the SVR model. The new SVR model was tested on an independent dataset of 81 compounds. High internal consistency and high external predictive rates were both obtained, indicating that the SVR model is a very promising tool for rapid toxicity detection.
|
39
|
iACP: a sequence-based tool for identifying anticancer peptides. Oncotarget 2017; 7:16895-909. [PMID: 26942877 PMCID: PMC4941358 DOI: 10.18632/oncotarget.7815] [Citation(s) in RCA: 300] [Impact Index Per Article: 42.9] [Received: 01/06/2016] [Accepted: 02/11/2016] [Indexed: 02/07/2023] Open
Abstract
Cancer remains a major killer worldwide. Traditional methods of cancer treatment are expensive and have some deleterious side effects on normal cells. Fortunately, the discovery of anticancer peptides (ACPs) has paved a new way for cancer treatment. With the explosive growth of peptide sequences generated in the post-genomic age, it is highly desirable to develop computational methods for rapidly and effectively identifying ACPs, so as to speed up their application in treating cancer. Here we report a sequence-based predictor called iACP, developed by optimizing the g-gap dipeptide components. Rigorous cross-validations demonstrated that the new predictor remarkably outperformed the existing predictors for the same purpose in both overall accuracy and stability. For the convenience of most experimental scientists, a publicly accessible web-server for iACP has been established at http://lin.uestc.edu.cn/server/iACP, by which users can easily obtain their desired results.
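To make the "g-gap dipeptide" feature concrete, here is a minimal Python sketch of how such a composition vector is commonly computed: for a gap g, count every residue pair separated by g intervening positions and divide by the number of such pairs, L - g - 1. This is an illustrative assumption about the feature encoding only; it is not the iACP code, and the function name, the handling of non-standard residues, and the example peptide are hypothetical.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(seq, g=0):
    """Return a dict of 400 frequencies, one per ordered residue pair.

    A pair (a, b) is counted whenever residue b occurs g + 1 positions
    after residue a, so g = 0 gives the ordinary adjacent dipeptides.
    """
    seq = seq.upper()
    n_pairs = len(seq) - g - 1
    if n_pairs <= 0:
        raise ValueError("sequence is too short for the requested gap")
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(n_pairs):
        pair = seq[i] + seq[i + g + 1]
        # Assumption: pairs containing non-standard residues are skipped.
        if pair in counts:
            counts[pair] += 1
    return {pair: c / n_pairs for pair, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical peptide used only to show the call.
    features = g_gap_dipeptide_composition("ACAC", g=1)
    print({k: v for k, v in features.items() if v > 0})
```

The resulting 400-dimensional vector can then be fed to any standard classifier; choosing g and selecting the most informative pairs is the part a predictor would tune.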
|
40
|
Maji P, Shah E. Significance and Functional Similarity for Identification of Disease Genes. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:1419-1433. [PMID: 28113633 DOI: 10.1109/tcbb.2016.2598163] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/06/2023]
Abstract
One of the most significant research issues in functional genomics is the in silico identification of disease-related genes. In this regard, the paper presents a new gene selection algorithm, termed SiFS, for identification of disease genes. It integrates the information obtained from the protein interaction network and gene expression profiles. The proposed SiFS algorithm selects a subset of genes from microarray data as disease genes by maximizing both the significance and the functional similarity of the selected gene subset. Based on the gene expression profiles, the significance of a gene with respect to another gene is computed using mutual information. In addition, a new measure of similarity is introduced to compute the functional similarity between two genes. Information derived from the protein-protein interaction network forms the basis of the proposed SiFS algorithm. The performance of the proposed gene selection algorithm and the new similarity measure is compared with that of other related methods and similarity measures, using several cancer microarray data sets.
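The mutual information mentioned in this abstract can be sketched as follows. This is a generic Python illustration for two already-discretized expression profiles, not the SiFS significance measure itself; the discretization into labels, the function name, and the example profiles are assumptions introduced here.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length label sequences.

    I(X; Y) = sum over (a, b) of p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


if __name__ == "__main__":
    # Hypothetical profiles, discretized to 0 (low) and 1 (high) expression.
    gene_a = [0, 0, 1, 1]
    gene_b = [0, 0, 1, 1]
    gene_c = [0, 1, 0, 1]
    print(mutual_information(gene_a, gene_b))
    print(mutual_information(gene_a, gene_c))
```

Continuous expression values would need to be binned (or a density-based estimator used) before a count-based estimate like this applies.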
|
41
|
Chen L, Pan H, Zhang YH, Feng K, Kong X, Huang T, Cai YD. Network-Based Method for Identifying Co-Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues. Genes (Basel) 2017; 8:genes8100252. [PMID: 28974058 PMCID: PMC5664102 DOI: 10.3390/genes8100252] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/28/2017] [Accepted: 09/28/2017] [Indexed: 12/26/2022] Open
Abstract
Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein–protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
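The "searching" procedure described in this abstract can be sketched in Python as below. This is a simplified illustration, not the authors' pipeline: it uses an unweighted breadth-first search (the study may have used interaction-score weights), it keeps only one shortest path per gene pair, it omits the permutation test and screening rules, and the function names and toy network are hypothetical.

```python
from collections import deque


def shortest_path(adj, source, target):
    """Return one shortest path from source to target, or None if unreachable."""
    prev = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        # Sorted for a deterministic choice among equally short paths.
        for neighbour in sorted(adj.get(node, ())):
            if neighbour not in prev:
                prev[neighbour] = node
                queue.append(neighbour)
    return None


def candidate_genes(adj, genes_a, genes_b):
    """Collect interior nodes of shortest paths linking the two gene sets."""
    known = set(genes_a) | set(genes_b)
    found = set()
    for a in genes_a:
        for b in genes_b:
            path = shortest_path(adj, a, b)
            if path:
                found.update(n for n in path[1:-1] if n not in known)
    return found


# Hypothetical toy interaction network (undirected, stored symmetrically).
TOY_PPI = {
    "A1": {"X", "Y"},
    "A2": {"X"},
    "X": {"A1", "A2", "B1"},
    "Y": {"A1", "Z"},
    "Z": {"Y", "B1"},
    "B1": {"X", "Z"},
}

if __name__ == "__main__":
    print(candidate_genes(TOY_PPI, ["A1", "A2"], ["B1"]))
```

In the toy network only X lies on a shortest path between the two gene sets; the longer route through Y and Z is ignored, which is the behaviour the searching step relies on.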
Affiliation(s)
- Lei Chen
- School of Life Sciences, Shanghai University, Shanghai 200444, China.
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China.
| | - Hongying Pan
- Department of Oral Medicine, Infection and Immunity, Harvard School of Dental Medicine, Harvard University, Boston, MA 02115, USA.
- Department of Orthopedic Surgery, Brigham and Women's Hospital, Harvard University, Boston, MA 02115, USA.
| | - Yu-Hang Zhang
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.
| | - Kaiyan Feng
- Department of Computer Science, Guangdong AIB Polytechnic, Guangzhou 510507, Guangdong, China.
| | - XiangYin Kong
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.
| | - Tao Huang
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai 200444, China.
| |
|
42
|
Dutta P, Saha S. Fusion of expression values and protein interaction information using multi-objective optimization for improving gene clustering. Comput Biol Med 2017; 89:31-43. [PMID: 28783536 DOI: 10.1016/j.compbiomed.2017.07.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/05/2017] [Revised: 07/28/2017] [Accepted: 07/28/2017] [Indexed: 11/29/2022]
Abstract
One of the crucial problems in the field of functional genomics is to identify a set of genes that are responsible for a particular cellular mechanism. The current work explores the use of a multi-objective optimization based genetic clustering technique to classify genes into groups with respect to their functional similarities and biological relevance. Our contribution is two-fold. First, a new quality measure for the goodness of gene clusters, the protein-protein interaction confidence score, is developed. It uses the confidence scores of protein-protein interaction networks to measure the similarity between genes of a particular cluster with respect to their biochemical protein products. Second, a multi-objective clustering approach is developed that integrates the expression values of microarray datasets with protein-protein interaction confidence scores to select genes that are both statistically and biologically relevant. For this purpose, two biological cluster validity indices, the biological homogeneity index and the protein-protein interaction confidence score, along with two traditional internal cluster validity indices, the fuzzy partition coefficient and the Pakhira-Bandyopadhyay-Maulik index, are simultaneously optimized during the clustering process. Experimental results on three real-life gene expression datasets show that adding a new objective capturing protein-protein interaction information improves gene clustering compared with existing techniques. The observations are further supported by biological and statistical significance tests.
Affiliation(s)
- Pratik Dutta
- Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihar, India.
| | - Sriparna Saha
- Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihar, India.
| |
|
43
|
Abstract
A logistic regression model (LRM) and artificial neural networks (ANNs), two nonlinear models, were used to establish a novel two-stage hybrid modeling procedure for prediction of metastasis in advanced colorectal carcinomas. Two different datasets were used in the training and testing procedures. In the first stage of the hybrid modeling procedure, the LRM was used to evaluate the contribution of DNA sequence copy number aberrations detected by Comparative Genomic Hybridization in advanced colorectal carcinoma and its metastasis. The most effective parameters were then selected by the LRM. The effective parameters selected among 565 detected chromosomal gains and losses were as follows: gain of 20q11.2, loss of 1q42, loss of 13q34, gain of 5q12, gain of 17p13, loss of 2q22, loss of 11q24 and gain of 2p11.2. Neural network models were then constructed and fed with the parameters selected by the LRM to build hybrid predictors on the two databases during self-consistency and jackknife tests, and the performance of the hybrid model was verified. The results showed that our two-stage hybrid model approach is very promising for prediction of metastasis in advanced colorectal carcinomas.
|
44
|
Gov E, Kori M, Arga KY. RNA-based ovarian cancer research from 'a gene to systems biomedicine' perspective. Syst Biol Reprod Med 2017; 63:219-238. [PMID: 28574782 DOI: 10.1080/19396368.2017.1330368] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/03/2023]
Abstract
Ovarian cancer remains the leading cause of death from a gynecologic malignancy, and treatment of this disease is harder than any other type of female reproductive cancer. Improvements in the diagnosis and development of novel and effective treatment strategies for complex pathophysiologies, such as ovarian cancer, require a better understanding of disease emergence and mechanisms of progression through systems medicine approaches. RNA-level analyses generate new information that can help in understanding the mechanisms behind disease pathogenesis, to identify new biomarkers and therapeutic targets and in new drug discovery. Whole RNA sequencing and coding and non-coding RNA expression array datasets have shed light on the mechanisms underlying disease progression and have identified mRNAs, miRNAs, and lncRNAs involved in ovarian cancer progression. In addition, the results from these analyses indicate that various signalling pathways and biological processes are associated with ovarian cancer. Here, we present a comprehensive literature review on RNA-based ovarian cancer research and highlight the benefits of integrative approaches within the systems biomedicine concept for future ovarian cancer research. We invite the ovarian cancer and systems biomedicine research fields to join forces to achieve the interdisciplinary caliber and rigor required to find real-life solutions to common, devastating, and complex diseases such as ovarian cancer. 
ABBREVIATIONS: CAF: cancer-associated fibroblasts; COG: Cluster of Orthologous Groups; DEA: disease enrichment analysis; EOC: epithelial ovarian carcinoma; ESCC: oesophageal squamous cell carcinoma; GSI: gamma secretase inhibitor; GO: Gene Ontology; GSEA: gene set enrichment analysis; HAS: Hungarian Academy of Sciences; lncRNAs: long non-coding RNAs; MAPK/ERK: mitogen-activated protein kinase/extracellular signal-regulated kinases; NGS: next-generation sequencing; ncRNAs: non-coding RNAs; OvC: ovarian cancer; PI3K/Akt/mTOR: phosphatidylinositol-3-kinase/protein kinase B/mammalian target of rapamycin; RT-PCR: real-time polymerase chain reaction; SNP: single nucleotide polymorphism; TF: transcription factor; TGF-β: transforming growth factor-β.
Affiliation(s)
- Esra Gov
- Department of Bioengineering, Marmara University, Istanbul, Turkey
- Department of Bioengineering, Adana Science and Technology University, Adana, Turkey
| | - Medi Kori
- Department of Bioengineering, Marmara University, Istanbul, Turkey
| | - Kazim Yalcin Arga
- Department of Bioengineering, Marmara University, Istanbul, Turkey
| |
|
45
|
Maji P, Shah E, Paul S. RelSim: An integrated method to identify disease genes using gene expression profiles and PPIN based similarity measure. Inf Sci (N Y) 2017. [DOI: 10.1016/j.ins.2016.06.034] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 02/07/2023]
|
46
|
Rough Hypercuboid and Modified Kulczynski Coefficient for Disease Gene Identification. ACTA ACUST UNITED AC 2017. [DOI: 10.1007/978-3-319-54430-4_45] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/04/2023]
|
47
|
Paul S, Lakatos P, Hartmann A, Schneider-Stock R, Vera J. Identification of miRNA-mRNA Modules in Colorectal Cancer Using Rough Hypercuboid Based Supervised Clustering. Sci Rep 2017; 7:42809. [PMID: 28220871 PMCID: PMC5318911 DOI: 10.1038/srep42809] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/24/2016] [Accepted: 01/13/2017] [Indexed: 02/06/2023] Open
Abstract
Differences in the expression profiles of miRNAs and mRNAs have been reported in colorectal cancer. Nevertheless, information on important miRNA-mRNA regulatory modules in colorectal cancer is still lacking. In this regard, this study presents an application of the RH-SAC algorithm to miRNA and mRNA expression data for identification of potential miRNA-mRNA modules. First, a set of miRNA rules was generated using the RH-SAC algorithm. The mRNA targets of the selected miRNAs were identified using the miRTarBase database. Next, the expression values of the target mRNAs were used to generate mRNA rules using RH-SAC. All miRNA-mRNA rules were then integrated to generate networks. Unlike other existing methods, the RH-SAC algorithm selects a group of co-expressed miRNAs and mRNAs that are also differentially expressed. In total, 17 miRNAs and 141 mRNAs were selected. Enrichment analysis of the selected mRNAs revealed that our method selected mRNAs that are significantly associated with colorectal cancer. We identified novel miRNA/mRNA interactions in colorectal cancer. Experimentally, we confirmed that one of the discovered miRNAs, hsa-miR-93-5p, was significantly up-regulated in 75.8% of CRC samples in comparison to their corresponding non-tumor samples. The approach could have the potential to examine unique miRNA/mRNA interactions specific to colorectal cancer subtypes.
Affiliation(s)
- Sushmita Paul
- Department of Bioscience & Bioengineering, Indian Institute of Technology Jodhpur, India
| | - Petra Lakatos
- Experimental Tumorpathology, Institute of Pathology, University Hospital of Friedrich-Alexander-University Erlangen-Nürnberg, Germany
| | - Arndt Hartmann
- Institute of Pathology, University Hospital of Friedrich-Alexander-University Erlangen-Nürnberg, Germany
| | - Regine Schneider-Stock
- Experimental Tumorpathology, Institute of Pathology, University Hospital of Friedrich-Alexander-University Erlangen-Nürnberg, Germany
| | - Julio Vera
- Laboratory of Systems Tumor Immunology, Department of Dermatology, Erlangen University Hospital and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| |
|
48
|
OOgenesis_Pred: A sequence-based method for predicting oogenesis proteins by six different modes of Chou's pseudo amino acid composition. J Theor Biol 2017; 414:128-136. [DOI: 10.1016/j.jtbi.2016.11.028] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Received: 09/17/2016] [Revised: 11/25/2016] [Accepted: 11/29/2016] [Indexed: 12/22/2022]
|
49
|
Hassanzadeh HR, Phan JH, Wang MD. A Multi-Modal Graph-Based Semi-Supervised Pipeline for Predicting Cancer Survival. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2016; 2016:184-189. [PMID: 32655981 DOI: 10.1109/bibm.2016.7822516] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
Cancer survival prediction is an active area of research that can help prevent unnecessary therapies and improve patients' quality of life. Gene expression profiling is widely used in cancer studies to discover informative biomarkers that aid the prediction of different clinical endpoints. We use multiple modalities of data derived from RNA deep-sequencing (RNA-seq) to predict the survival of cancer patients. Despite the wealth of information available in the expression profiles of tumors, this objective remains a big challenge, for the most part because of the paucity of data samples compared to the high dimension of the expression profiles. As such, analysis of transcriptomic data modalities calls for state-of-the-art big-data analytics techniques that can make maximal use of all the available data to discover the relevant information hidden within a significant amount of noise. In this paper, we propose a pipeline that predicts cancer patients' survival by exploiting the structure of the input (manifold learning) and by leveraging the unlabeled samples using Laplacian support vector machines, a graph-based semi-supervised learning (GSSL) paradigm. We show that under certain circumstances no single modality by itself yields the best accuracy, and that fusing different models together via a stacked generalization strategy may boost the accuracy synergistically. We apply our approach to two cancer datasets and present promising results. We maintain that a similar pipeline can be used for predictive tasks where labeled samples are expensive to acquire.
Affiliation(s)
- Hamid Reza Hassanzadeh
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332
| | - John H Phan
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, Georgia 30332
| | - May D Wang
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, Georgia 30332
| |
|
50
|
Yang L, Wang S, Zhou M, Chen X, Zuo Y, Lv Y. Characterization of BioPlex network by topological properties. J Theor Biol 2016; 409:148-154. [PMID: 27552850 DOI: 10.1016/j.jtbi.2016.08.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2016] [Revised: 07/28/2016] [Accepted: 08/20/2016] [Indexed: 11/16/2022]
Abstract
Protein-protein interaction (PPI) networks are emerging as valuable prototypes for studying important problems in molecular cellular biology and systems biomedicine. An analysis of the topological properties of a PPI network is very helpful for understanding the function and structure of networks. In this study, we analyzed the topological patterns in the BioPlex network, which contains interactions among 10,961 proteins; most of these interactions were previously undocumented. The BioPlex network is a comprehensive map of human protein interactions and represents the first phase of a long-term effort to profile the entire human ORFeome collection. We observed that the BioPlex network shares several topological properties with other biological networks. We also quantified correlation profiles for the BioPlex network and compared them to randomized versions of the same network. We found that in the BioPlex network, edges between proteins with intermediate degrees were strongly suppressed, whereas edges between low-connected proteins were favored. Finally, the degrees of essential genes were compared with the degrees of non-essential genes and randomly selected proteins. There were no significant differences between the groups.
Affiliation(s)
- Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Meng Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Xiaowen Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Yongchun Zuo
- The National Research Center for Animal Transgenic Biotechnology, Inner Mongolia University, Hohhot 010021, China
| | - Yingli Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China.
| |
|