1
|
Kasavi C. Gene co-expression network analysis revealed novel biomarkers for ovarian cancer. Front Genet 2022; 13:971845. [PMID: 36338962 PMCID: PMC9627302 DOI: 10.3389/fgene.2022.971845] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/17/2022] [Accepted: 10/10/2022] [Indexed: 09/18/2023] [Open Access]
Abstract
Ovarian cancer is the second most common gynecologic cancer and remains the leading cause of death among gynecologic cancers. Therefore, understanding the molecular mechanisms underlying the disease and identifying effective, predictive biomarkers are invaluable for the development of diagnostic and treatment strategies. In the present study, a differential co-expression network analysis was performed via meta-analysis of three transcriptome datasets of serous ovarian adenocarcinoma to identify novel candidate biomarker signatures, i.e., genes and miRNAs. We identified 439 common differentially expressed genes (DEGs) and reconstructed differential co-expression networks from these common DEGs under two conditions, i.e., healthy ovarian surface epithelia samples and serous ovarian adenocarcinoma epithelia samples. Modular analysis of the constructed networks indicated a co-expressed gene module consisting of 17 genes. A total of 11 biomarker candidates were determined through receiver operating characteristic (ROC) curves of the expression of module genes, and miRNAs targeting these genes were identified. As a result, six genes (CDT1, CNIH4, CRLS1, LIMCH1, POC1A, and SNX13) and two miRNAs (mir-147a and mir-103a-3p) were suggested as novel candidate prognostic biomarkers for ovarian cancer. Further experimental and clinical validation of the proposed biomarkers could help future development of diagnostic and therapeutic innovations in ovarian cancer.
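The ROC-based screening of module genes mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the curves were computed or which cutoff was applied, so the function below simply uses the rank-based definition of the area under the ROC curve, and the expression values are hypothetical numbers chosen for illustration only.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical expression values of one gene (not taken from the study).
tumor = [8.1, 7.9, 9.2, 8.8]
healthy = [6.5, 7.0, 8.0, 6.9]

auc = roc_auc(tumor, healthy)
print(f"AUC = {auc:.4f}")
```

A gene whose AUC is close to 1 separates the two conditions well on its own; a value near 0.5 means its expression carries little discriminative information.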
Affiliation(s)
- Ceyda Kasavi
  - Department of Bioengineering, Faculty of Engineering, Marmara University, Istanbul, Turkey
|
2
|
A new feature extraction technique based on improved owl search algorithm: a case study in copper electrorefining plant. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06881-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
3
|
Mirsadeghi L, Haji Hosseini R, Banaei-Moghaddam AM, Kavousi K. EARN: an ensemble machine learning algorithm to predict driver genes in metastatic breast cancer. BMC Med Genomics 2021; 14:122. [PMID: 33962648 PMCID: PMC8105935 DOI: 10.1186/s12920-021-00974-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/17/2020] [Accepted: 04/27/2021] [Indexed: 12/27/2022] [Open Access]
Abstract
BACKGROUND Today, many markers exist for the prognosis and diagnosis of complex diseases such as primary breast cancer. However, our understanding of the drivers that influence cancer aggression is limited.
METHODS In this work, we study somatic mutation data consisting of 450 metastatic breast tumor samples from the cBio Cancer Genomics Portal. We use four software tools to extract features from this data. Then, an ensemble classifier (EC) learning algorithm called EARN (Ensemble of Artificial Neural Network, Random Forest, and non-linear Support Vector Machine) is proposed to evaluate plausible driver genes for metastatic breast cancer (MBCA). The decision-making strategy of the proposed ensemble aggregates the predicted scores obtained from the individual classifiers in order to prioritize Homo sapiens genes annotated as protein-coding in NCBI.
RESULTS This study focuses on several aspects of MBCA prognosis and diagnosis. First, drivers and passengers predicted by SVM, ANN, RF, and EARN are introduced. Second, biological inferences of the predictions are discussed based on gene set enrichment analysis. Third, statistical validation and comparison of all learning methods are performed using several evaluation metrics. Finally, the pathway enrichment analysis (PEA) using the ReactomeFIViz tool (FDR < 0.03) for the top 100 genes predicted by EARN leads us to propose a new gene set panel for MBCA. It includes HDAC3, ABAT, GRIN1, PLCB1, and KPNA2 as well as NCOR1, TBL1XR1, SIRT4, KRAS, CACNA1E, PRKCG, GPS2, SIN3A, ACTB, KDM6B, and PRMT1. Furthermore, we compare results for MBCA with outputs for 983 primary tumor samples of breast invasive carcinoma (BRCA) obtained from The Cancer Genome Atlas (TCGA). The comparison shows that ROC-AUC reaches 99.24% using EARN for MBCA and 99.79% for BRCA. In each case, this result is better than that of each of the three individual classifiers.
CONCLUSIONS This research, using an integrative approach, helps precision oncologists design compact targeted panels that eliminate the need for whole-genome/exome sequencing. The schematic representation of the proposed model is presented as the graphical abstract.
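The score aggregation described in the METHODS section can be sketched as follows. The abstract states only that scores from the individual classifiers are aggregated; the unweighted mean used here is an assumption, and the model names, gene labels, and scores are hypothetical placeholders rather than values from the study.

```python
from statistics import mean


def aggregate_scores(per_model_scores):
    """Average each gene's predicted score across models.

    Only genes scored by every model are kept. The unweighted mean is
    an assumed aggregation rule, not necessarily the one used by EARN.
    """
    common = set.intersection(*(set(s) for s in per_model_scores.values()))
    return {g: mean(s[g] for s in per_model_scores.values()) for g in common}


def rank_genes(aggregated):
    """Sort genes by descending aggregated score, ties broken by name."""
    return sorted(aggregated, key=lambda g: (-aggregated[g], g))


# Hypothetical driver-likelihood scores from three classifiers.
scores = {
    "ann": {"GENE_A": 0.90, "GENE_B": 0.40, "GENE_C": 0.65},
    "rf": {"GENE_A": 0.80, "GENE_B": 0.55, "GENE_C": 0.70},
    "svm": {"GENE_A": 0.85, "GENE_B": 0.25, "GENE_C": 0.60},
}

ensemble = aggregate_scores(scores)
ranking = rank_genes(ensemble)
print(ranking)
```

Averaging is the simplest choice; weighted means or rank-based aggregation would slot into `aggregate_scores` without changing the ranking step.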
Affiliation(s)
- Leila Mirsadeghi
  - Department of Biology, Faculty of Science, Payame Noor University, Tehran, Iran
- Reza Haji Hosseini
  - Department of Biology, Faculty of Science, Payame Noor University, Tehran, Iran
- Ali Mohammad Banaei-Moghaddam
  - Laboratory of Genomics and Epigenomics (LGE), Department of Biochemistry, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Kaveh Kavousi
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
|
4
|
Rangaswamy U, Dharshini SAP, Yesudhas D, Gromiha MM. VEPAD - Predicting the effect of variants associated with Alzheimer's disease using machine learning. Comput Biol Med 2020; 124:103933. [PMID: 32828070 DOI: 10.1016/j.compbiomed.2020.103933] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/03/2020] [Revised: 07/25/2020] [Accepted: 07/25/2020] [Indexed: 12/26/2022]
Abstract
INTRODUCTION Alzheimer's disease (AD) is a complex and heterogeneous disease that affects neuronal cells over time and it is prevalent among all neurodegenerative diseases. Next Generation Sequencing (NGS) techniques are widely used for developing high-throughput screening methods to identify biomarkers and variants, which help early diagnosis and treatment.
OBJECTIVE The primary purpose of this study is to develop a classification model using machine learning for predicting the deleterious effect of variants with respect to AD.
METHODS We constructed a set of 20,401 deleterious and 37,452 control variants from the Genome-Wide Association Study (GWAS) and Genotype-Tissue Expression (GTEx) portals, respectively. Recursive feature elimination with cross-validation (RFECV) followed by a forward feature selection method was used to select the important features, and a random forest classifier was used to distinguish between deleterious and neutral variants.
RESULTS Our method showed an accuracy of 81.21% on 10-fold cross-validation and 70.63% on a test set of 5785 variants. The same test set was used to evaluate CADD and FATHMM, whose accuracies are in the range of 54%-62%.
CONCLUSION Our model is freely available as the Variant Effect Predictor for Alzheimer's Disease (VEPAD) at http://web.iitm.ac.in/bioinfo2/vepad/. VEPAD can be used to predict the effect of new variants associated with AD.
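The 10-fold cross-validation reported in the RESULTS can be illustrated with a small standard-library sketch of how samples are partitioned into folds and how accuracy is scored. This is not the VEPAD pipeline: the abstract does not describe shuffling or stratification, so the contiguous, unshuffled split below is an assumption, and the labels are made up.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one sample. No shuffling or
    stratification is done; a real experiment would usually add both.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


folds = list(k_fold_indices(23, k=10))
print(len(folds), [len(test) for _, test in folds])

# Hypothetical labels: 1 = deleterious, 0 = neutral.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

The cross-validated accuracy is then the mean of `accuracy` over the ten held-out folds, each scored by a model trained on the other nine.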
Affiliation(s)
- Uday Rangaswamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, India
- S Akila Parvathy Dharshini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, India
- Dhanusha Yesudhas
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, India
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, India; School of Computing, Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, Midori-ku, Kanagawa, 226-8503, Yokohama, Japan
|
5
|
Mukherjee S, Perumal TM, Daily K, Sieberts SK, Omberg L, Preuss C, Carter GW, Mangravite LM, Logsdon BA. Identifying and ranking potential driver genes of Alzheimer's disease using multiview evidence aggregation. Bioinformatics 2019; 35:i568-i576. [PMID: 31510680 PMCID: PMC6612835 DOI: 10.1093/bioinformatics/btz365] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/24/2022] [Open Access]
Abstract
MOTIVATION Late-onset Alzheimer's disease currently has no known effective treatment options. To better understand the disease, new multi-omic datasets have recently been generated with the goal of identifying its molecular causes. However, most analytic studies using these datasets focus on uni-modal analysis of the data. Here, we propose a data-driven approach that integrates multiple data types and analytic outcomes to aggregate evidence supporting the hypothesis that a gene is a genetic driver of the disease. The main algorithmic contributions of our article are: (i) a general machine learning framework that learns the key characteristics of a few known driver genes from multiple feature sets and identifies other potential driver genes with similar feature representations, and (ii) a flexible ranking scheme with the ability to integrate external validation in the form of Genome Wide Association Study summary statistics. While we currently focus on demonstrating the effectiveness of the approach using different analytic outcomes from RNA-Seq studies, this method is easily generalizable to other data modalities and analysis types.
RESULTS We demonstrate the utility of our machine learning algorithm on two benchmark multiview datasets by significantly outperforming the baseline approaches in predicting missing labels. We then use the algorithm to predict and rank potential drivers of Alzheimer's. We show that our ranked genes are significantly enriched for single nucleotide polymorphisms associated with Alzheimer's and are enriched in pathways that have been previously associated with the disease.
AVAILABILITY AND IMPLEMENTATION Source code and links to all feature sets are available at https://github.com/Sage-Bionetworks/EvidenceAggregatedDriverRanking.
Affiliation(s)
- Christoph Preuss
  - The Jackson Laboratory for Mammalian Genetics, Bar Harbor, ME, USA
- Gregory W Carter
  - The Jackson Laboratory for Mammalian Genetics, Bar Harbor, ME, USA
- Benjamin A Logsdon
  - Sage Bionetworks, Seattle, WA, USA (to whom correspondence should be addressed)
|
6
|
Silvestrov P, Maier SJ, Fang M, Cisneros GA. DNArCdb: A database of cancer biomarkers in DNA repair genes that includes variants related to multiple cancer phenotypes. DNA Repair (Amst) 2018; 70:10-17. [PMID: 30098577 PMCID: PMC6151283 DOI: 10.1016/j.dnarep.2018.07.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/06/2018] [Revised: 07/30/2018] [Accepted: 07/30/2018] [Indexed: 02/04/2023]
Abstract
Functioning DNA repair capabilities are vital for organisms to ensure that biological information is preserved and correctly propagated. Disruptions in DNA repair pathways can result in the accumulation of DNA mutations, which may lead to the onset of complex diseases such as cancer. The discovery and characterization of cancer-related biomarkers may allow early diagnosis and targeted treatment, which could significantly improve the survival rates of cancer patients. To this end, we have applied a hypothesis-driven bioinformatics approach to identify biomarkers related to 25 different DNA repair enzymes, in combination with structural analysis of six selected missense mutations of newly discovered SNPs that are associated with cancer phenotypes. Our search of 8 distinct cancer databases uncovered 43 missense SNPs that are statistically significantly associated with at least one phenotype. Moreover, nine of these missense SNPs are statistically significantly associated with two or more cancers. In addition, we have performed classical molecular dynamics to characterize the impact of rs10018786 on POLN, which results in the M310L Pol ν variant, and rs3218784 on POLI, which results in the I236M Pol ι variant. Our results suggest that both of these cancer-associated variants result in noticeable structural and dynamical changes compared with their respective wild-type proteins.
Affiliation(s)
- Pavel Silvestrov
  - Department of Chemistry, University of North Texas, Denton, TX, 76201, United States
- Sarah J Maier
  - Department of Chemistry, University of North Texas, Denton, TX, 76201, United States
- Michelle Fang
  - Department of Chemistry, University of North Texas, Denton, TX, 76201, United States
- G Andrés Cisneros
  - Department of Chemistry, University of North Texas, Denton, TX, 76201, United States
|
7
|
Przytycki PF, Singh M. Differential analysis between somatic mutation and germline variation profiles reveals cancer-related genes. Genome Med 2017; 9:79. [PMID: 28841835 PMCID: PMC5574113 DOI: 10.1186/s13073-017-0465-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 02/02/2017] [Accepted: 08/07/2017] [Indexed: 12/30/2022] [Open Access]
Abstract
A major aim of cancer genomics is to pinpoint which somatically mutated genes are involved in tumor initiation and progression. We introduce a new framework for uncovering cancer genes, differential mutation analysis, which compares the mutational profiles of genes across cancer genomes with their natural germline variation across healthy individuals. We present DiffMut, a fast and simple approach for differential mutational analysis, and demonstrate that it is more effective in discovering cancer genes than considerably more sophisticated approaches. We conclude that germline variation across healthy human genomes provides a powerful means for characterizing somatic mutation frequency and identifying cancer driver genes. DiffMut is available at https://github.com/Singh-Lab/Differential-Mutation-Analysis.
Affiliation(s)
- Pawel F Przytycki
  - Department of Computer Science, Princeton University, Princeton, NJ, 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 08544, USA
- Mona Singh
  - Department of Computer Science, Princeton University, Princeton, NJ, 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 08544, USA
|
8
|
Zhang T, Zhang D. Integrating omics data and protein interaction networks to prioritize driver genes in cancer. Oncotarget 2017; 8:58050-58060. [PMID: 28938536 PMCID: PMC5601632 DOI: 10.18632/oncotarget.19481] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 04/19/2017] [Accepted: 06/19/2017] [Indexed: 11/25/2022] [Open Access]
Abstract
Although numerous approaches have been proposed to discern drivers from passengers, identification of driver genes remains a critical challenge in the cancer genomics field. Driver genes with low mutation frequency tend to be filtered out in cancer research. In addition, the accumulation of different omics data necessitates the development of algorithmic frameworks for nominating putative driver genes. In this study, we present a novel framework to identify driver genes by integrating multi-omics data such as somatic mutation, gene expression, and copy number alterations. We developed a computational approach to detect potential driver genes by virtue of their effect on their neighbors in the network. Applied to three datasets from The Cancer Genome Atlas (TCGA), namely head and neck squamous cell carcinoma (HNSC), thyroid carcinoma (THCA) and kidney renal clear cell carcinoma (KIRC), our method outperformed DriverNet and MUFFINN in Precision, Recall and F1 score on all three datasets. In addition, our method was less affected by protein length than DriverNet. Lastly, our method not only identified known cancer genes but also detected potential rare drivers (PTPN6 in THCA; SRC, GRB2 and PTPN6 in KIRC; MAPK1 and SMAD2 in HNSC).
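The Precision, Recall and F1 comparison mentioned in this abstract can be made concrete with a short sketch that scores a predicted driver-gene list against a reference set. The abstract does not name the reference gene set or the list cutoffs used, so the gene labels below are hypothetical placeholders.

```python
def precision_recall_f1(predicted, reference):
    """Score a predicted gene list against a reference gene set.

    precision = TP / |predicted|, recall = TP / |reference|,
    F1 = harmonic mean of precision and recall.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gene labels, for illustration only.
predicted = ["G1", "G2", "G3", "G4"]
reference = ["G2", "G4", "G5"]

p, r, f = precision_recall_f1(predicted, reference)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Because driver-gene methods return ranked lists, such scores are usually reported at a fixed cutoff (for example the top N genes) so that methods are compared on equal list sizes.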
Affiliation(s)
- Tiejun Zhang
  - GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University, Guangzhou, Guangdong 511436, China
- Di Zhang
  - School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
|
9
|
Amar D, Izraeli S, Shamir R. Utilizing somatic mutation data from numerous studies for cancer research: proof of concept and applications. Oncogene 2017; 36:3375-3383. [PMID: 28092680 PMCID: PMC5485176 DOI: 10.1038/onc.2016.489] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/12/2016] [Revised: 11/20/2016] [Accepted: 11/22/2016] [Indexed: 02/07/2023]
Abstract
Large cancer projects measure somatic mutations in thousands of samples, gradually assembling a catalog of recurring mutations in cancer. Many methods analyze these data jointly with auxiliary information with the aim of identifying subtype-specific results. Here, we show that somatic gene mutations alone can reliably and specifically predict cancer subtypes. Interpretation of the classifiers provides useful insights for several biomedical applications. We analyze the COSMIC database, which collects somatic mutations from The Cancer Genome Atlas (TCGA) as well as from many smaller-scale studies. We use multi-label classification techniques and the Disease Ontology hierarchy to identify cancer subtype-specific biomarkers. Cancer subtype classifiers based on TCGA and on the smaller studies have comparable performance, and the smaller studies add substantial value in terms of validation, coverage of additional subtypes, and improved classification. The gene sets of the classifiers contribute in three ways. First, we refine the associations of genes to cancer subtypes and identify novel compelling candidate driver genes. Second, using our classifiers we successfully predict the primary site of metastatic samples. Third, we provide novel hypotheses regarding detection of subtype-specific synthetic lethality interactions. From the cancer research community perspective, our results suggest that curation efforts, such as COSMIC, have great added and complementary value even in the era of large international cancer projects.
Affiliation(s)
- D Amar
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- S Izraeli
  - Department of Pediatric Hematology-Oncology, Safra Children’s Hospital, Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
  - Sackler School of Medicine, Tel Aviv University, Tel-Aviv, Israel
- R Shamir
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
|
10
|
Xi J, Wang M, Li A. Discovering potential driver genes through an integrated model of somatic mutation profiles and gene functional information. Mol Biosyst 2017; 13:2135-2144. [DOI: 10.1039/c7mb00303j] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 02/01/2023]
Abstract
An integrated approach to identify driver genes based on information of somatic mutations, the interaction network and Gene Ontology similarity.
Affiliation(s)
- Jianing Xi
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH 230027, People’s Republic of China
- Minghui Wang
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH 230027, People’s Republic of China
  - Centers for Biomedical Engineering
- Ao Li
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH 230027, People’s Republic of China
  - Centers for Biomedical Engineering
|
11
|
Dimitrakopoulos GN, Balomenos P, Vrahatis AG, Sgarbas K, Bezerianos A. Identifying disease network perturbations through regression on gene expression and pathway topology analysis. Annu Int Conf IEEE Eng Med Biol Soc 2016; 2016:5969-5972. [PMID: 28269612 DOI: 10.1109/embc.2016.7592088] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
In Systems Biology, network-based approaches have been extensively used to effectively study complex diseases. An important challenge is the detection of network perturbations which disrupt regular biological functions as a result of a disease. In this regard, we introduce a network-based pathway analysis method which isolates causal interactions with significant regulatory roles within disease-perturbed pathways. Specifically, we use gene expression data with Random Forest regression models to assess the interaction strengths of genes within disease-perturbed networks, using KEGG pathway maps as a source of prior knowledge about pathway topology. We deliver as output a network with imprinted perturbations corresponding to the biological phenomena arising in a disease-oriented experiment. The efficacy of our approach is demonstrated on a serous papillary ovarian cancer experiment, and the results highlight the functional roles of high-impact interactions and key gene regulators which cause strong perturbations on pathway networks, in accordance with experimentally validated knowledge from recent literature.
|
12
|
Shi K, Gao L, Wang B. Discovering potential cancer driver genes by an integrated network-based approach. Mol Biosyst 2016; 12:2921-31. [DOI: 10.1039/c6mb00274a] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 12/20/2022]
Abstract
An integrated network-based approach is proposed to nominate driver genes. It is composed of two steps including a network diffusion step and an aggregated ranking step, which fuses the correlation between the gene mutations and gene expression, the relationship between the mutated genes and the heterogeneous characteristic of the patient mutation.
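The network diffusion step named in this abstract can be illustrated with a random walk with restart on a toy interaction graph. The abstract does not specify which diffusion scheme the authors use, so the choice of random walk with restart, the restart probability, and the three-node graph below are all assumptions made for illustration. The sketch also assumes every node has at least one neighbor.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Diffuse seed scores over an undirected graph.

    adjacency maps each node to a list of neighbors. At every step a
    node keeps `restart` of its seed mass and spreads the rest of its
    current score evenly to its neighbors.
    """
    nodes = sorted(adjacency)
    initial = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    current = dict(initial)
    for _ in range(max_iter):
        updated = {n: restart * initial[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * current[u] / len(adjacency[u])
            for v in adjacency[u]:
                updated[v] += share
        change = sum(abs(updated[n] - current[n]) for n in nodes)
        current = updated
        if change < tol:
            break
    return current


# Hypothetical path graph A - B - C with a single mutated seed gene A.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
diffused = random_walk_with_restart(graph, seeds={"A"})
print({n: round(s, 4) for n, s in diffused.items()})
```

Genes close to the seed in the network end up with higher diffused scores than distant ones, which is what lets a later ranking step favor genes in the neighborhood of observed mutations.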
Affiliation(s)
- Kai Shi
  - School of Computer Science and Technology, Xidian University, Xi'an, China
  - College of Science
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Bingbo Wang
  - School of Computer Science and Technology, Xidian University, Xi'an, China
|
13
|
Liu Y, Hu Z, DeLisi C. Mutated Pathways as a Guide to Adjuvant Therapy Treatments for Breast Cancer. Mol Cancer Ther 2015; 15:184-9. [PMID: 26625895 DOI: 10.1158/1535-7163.mct-15-0601] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/22/2015] [Accepted: 10/10/2015] [Indexed: 12/18/2022]
Abstract
Adjuvant therapy following breast cancer surgery generally consists of a course of chemotherapy if the cancer lacks hormone receptors, or a course of hormonal therapy otherwise. Here, we report a correlation between adjuvant strategy and mutated pathway patterns. In particular, we find that for breast cancer patients, pathways enriched in nonsynonymous mutations in the chemotherapy group are distinct from those of the hormonal therapy group. We apply a recently developed method that identifies collaborative pathway groups for hormone and chemotherapy patients. A collaborative group of pathways is one in which each member is altered in the same (generally large) number of samples. In particular, we find the following: (i) a chemotherapy group consisting of three pathways and a hormone therapy group consisting of 20, the members of the two groups being mutually exclusive; (ii) each group is highly enriched in breast cancer drivers; and (iii) the pathway groups are correlates of subtype-based therapeutic recommendations. These results suggest that patient profiling using these pathway groups can potentially enable the development of personalized treatment plans that may be more accurate and specific than those currently available.
Affiliation(s)
- Yang Liu
  - Bioinformatics Graduate Program and Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Zhenjun Hu
  - Bioinformatics Graduate Program and Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Charles DeLisi
  - Bioinformatics Graduate Program and Department of Biomedical Engineering, Boston University, Boston, Massachusetts
|
14
|
An O, Dall'Olio GM, Mourikis TP, Ciccarelli FD. NCG 5.0: updates of a manually curated repository of cancer genes and associated properties from cancer mutational screenings. Nucleic Acids Res 2015; 44:D992-9. [PMID: 26516186 PMCID: PMC4702816 DOI: 10.1093/nar/gkv1123] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 09/14/2015] [Accepted: 10/14/2015] [Indexed: 12/21/2022] [Open Access]
Abstract
The Network of Cancer Genes (NCG, http://ncg.kcl.ac.uk/) is a manually curated repository of cancer genes derived from the scientific literature. Due to the increasing amount of cancer genomic data, we have introduced a more robust procedure to extract cancer genes from published cancer mutational screenings and two curators independently reviewed each publication. NCG release 5.0 (August 2015) collects 1571 cancer genes from 175 published studies that describe 188 mutational screenings of 13 315 cancer samples from 49 cancer types and 24 primary sites. In addition to collecting cancer genes, NCG also provides information on the experimental validation that supports the role of these genes in cancer and annotates their properties (duplicability, evolutionary origin, expression profile, function and interactions with proteins and miRNAs).
Affiliation(s)
- Omer An
  - Division of Cancer Studies, King's College London, London SE11UL, UK
- Thanos P Mourikis
  - Division of Cancer Studies, King's College London, London SE11UL, UK
|