1
Böttcher J, Fuchs JE, Mayer M, Kahmann J, Zak KM, Wunberg T, Woehrle S, Kessler D. Ligandability assessment of the C-terminal Rel-homology domain of NFAT1. Arch Pharm (Weinheim) 2024; 357:e2300649. [PMID: 38396281 DOI: 10.1002/ardp.202300649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 02/01/2024] [Accepted: 02/02/2024] [Indexed: 02/25/2024]
Abstract
Transcription factors are generally considered challenging, if not "undruggable", targets but they promise new therapeutic options due to their fundamental involvement in many diseases. In this study, we aim to assess the ligandability of the C-terminal Rel-homology domain of nuclear factor of activated T cells 1 (NFAT1), a TF implicated in T-cell regulation. Using a combination of experimental and computational approaches, we demonstrate that small molecule fragments can indeed bind to this protein domain. The newly identified binder is the first small molecule binder to NFAT1 validated with biophysical methods and an elucidated binding mode by X-ray crystallography. The reported eutomer/distomer pair provides a strong basis for potential exploration of higher potency binders on the path toward degrader or glue modalities.
Affiliation(s)
- Jark Böttcher
- Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria
- Moriz Mayer
- Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria
- Simon Woehrle
- Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria
- Dirk Kessler
- Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria
2
Müller T, Reichlmeir M, Hau AC, Wittig I, Schulte D. The neuronal transcription factor MEIS2 is a calpain-2 protease target. J Cell Sci 2024; 137:jcs261482. [PMID: 38305737 PMCID: PMC10941658 DOI: 10.1242/jcs.261482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 01/25/2024] [Indexed: 02/03/2024] Open
Abstract
Tight control over transcription factor activity is necessary for a sensible balance between cellular proliferation and differentiation in the embryo and during tissue homeostasis by adult stem cells, but mechanistic details have remained incomplete. The homeodomain transcription factor MEIS2 is an important regulator of neurogenesis in the ventricular-subventricular zone (V-SVZ) adult stem cell niche in mice. Here, we identify MEIS2 as a direct target of the intracellular protease calpain-2 (composed of the catalytic subunit CAPN2 and the regulatory subunit CAPNS1). Phosphorylation at conserved serine and/or threonine residues, or dimerization with PBX1, reduced the sensitivity of MEIS2 to cleavage by calpain-2. In the adult V-SVZ, calpain-2 activity is high in stem and progenitor cells but rapidly declines during neuronal differentiation, which is accompanied by increased stability of full-length MEIS2 protein. In accordance with this, blocking calpain-2 activity in stem and progenitor cells, or overexpression of a cleavage-insensitive form of MEIS2, increased the production of neurons, whereas overexpression of a catalytically active CAPN2 reduced it. Collectively, our results support a key role for calpain-2 in controlling the output of adult V-SVZ neural stem and progenitor cells through cleavage of the neuronal fate determinant MEIS2.
Affiliation(s)
- Tanja Müller
- Goethe University, Faculty of Medicine, University Hospital Frankfurt, Institute of Neurology (Edinger Institute), 60528 Frankfurt, Germany
- Goethe University, University Hospital Frankfurt, Dr. Senckenberg Institute of Neurooncology and Institute of Neurology (Edinger Institute), Frankfurt Cancer Institute (FCI), University Cancer Center Frankfurt (UCT), MSNZ Junior Group Translational Neurooncology, 60528 Frankfurt, Germany
- Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), Luxembourg Centre of Neuropathology (LCNP), 1445 Luxembourg, Luxembourg
- Marina Reichlmeir
- Goethe University, Faculty of Medicine, University Hospital Frankfurt, Institute of Neurology (Edinger Institute), 60528 Frankfurt, Germany
- Ann-Christin Hau
- Goethe University, University Hospital Frankfurt, Dr. Senckenberg Institute of Neurooncology and Institute of Neurology (Edinger Institute), Frankfurt Cancer Institute (FCI), University Cancer Center Frankfurt (UCT), MSNZ Junior Group Translational Neurooncology, 60528 Frankfurt, Germany
- Ilka Wittig
- Goethe University, Faculty of Medicine, Institute for Cardiovascular Physiology, Functional Proteomics, 60590, Frankfurt, Germany
- Dorothea Schulte
- Goethe University, Faculty of Medicine, University Hospital Frankfurt, Institute of Neurology (Edinger Institute), 60528 Frankfurt, Germany
3
Geng C, Wang Z, Tang Y. Machine learning in Alzheimer's disease drug discovery and target identification. Ageing Res Rev 2024; 93:102172. [PMID: 38104638 DOI: 10.1016/j.arr.2023.102172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/28/2023] [Accepted: 12/13/2023] [Indexed: 12/19/2023]
Abstract
Alzheimer's disease (AD) stands as a formidable neurodegenerative ailment that poses a substantial threat to the elderly population, with no known curative or disease-slowing drugs in existence. Among the vital and time-consuming stages in the drug discovery process, disease modeling and target identification hold particular significance. Disease modeling allows for a deeper comprehension of disease progression mechanisms and potential therapeutic avenues. On the other hand, target identification serves as the foundational step in drug development, exerting a profound influence on all subsequent phases and ultimately determining the success rate of drug development endeavors. Machine learning (ML) techniques have ushered in transformative breakthroughs in the realm of target discovery. Leveraging the strengths of large dataset analysis, multifaceted data processing, and the exploration of intricate biological mechanisms, ML has become instrumental in the quest for effective AD treatments. In this comprehensive review, we offer an account of how ML methodologies are being deployed in the pursuit of drug discovery for AD. Furthermore, we provide an overview of the utilization of ML in uncovering potential intervention strategies and prospective therapeutic targets for AD. Finally, we discuss the principal challenges and limitations currently faced by these approaches. We also explore the avenues for future research that hold promise in addressing these challenges.
Affiliation(s)
- Chaofan Geng
- Department of Neurology & Innovation Center for Neurological Disorders, Xuanwu Hospital, Capital Medical University, National Center for Neurological Disorders, Beijing, China
- ZhiBin Wang
- Department of Neurology & Innovation Center for Neurological Disorders, Xuanwu Hospital, Capital Medical University, National Center for Neurological Disorders, Beijing, China
- Yi Tang
- Department of Neurology & Innovation Center for Neurological Disorders, Xuanwu Hospital, Capital Medical University, National Center for Neurological Disorders, Beijing, China; Neurodegenerative Laboratory of Ministry of Education of the People's Republic of China, Beijing, China.
4
Spreitzer E, Alderson TR, Bourgeois B, Eggenreich L, Habacher H, Brahmersdorfer G, Pritišanac I, Sánchez-Murcia PA, Madl T. FOXO transcription factors differ in their dynamics and intra/intermolecular interactions. Curr Res Struct Biol 2022; 4:118-133. [PMID: 35573459 PMCID: PMC9097636 DOI: 10.1016/j.crstbi.2022.04.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/25/2021] [Revised: 03/19/2022] [Accepted: 04/07/2022] [Indexed: 11/19/2022] Open
Affiliation(s)
- Emil Spreitzer
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- T. Reid Alderson
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- Benjamin Bourgeois
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- Loretta Eggenreich
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- Hermann Habacher
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- Greta Brahmersdorfer
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- Iva Pritišanac
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- Pedro A. Sánchez-Murcia
- Division of Physiological Chemistry, Otto-Loewi Research Center, Medical University of Graz, Graz, Austria
- Tobias Madl
- Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria
- BioTechMed-Graz, Graz, Austria
- Corresponding author. Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Molecular Biology and Biochemistry, Medical University of Graz, Graz, Austria.
5
van Roey R, Brabletz T, Stemmler MP, Armstark I. Deregulation of Transcription Factor Networks Driving Cell Plasticity and Metastasis in Pancreatic Cancer. Front Cell Dev Biol 2021; 9:753456. [PMID: 34888306 PMCID: PMC8650502 DOI: 10.3389/fcell.2021.753456] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/04/2021] [Accepted: 10/27/2021] [Indexed: 12/15/2022] Open
Abstract
Pancreatic cancer is a very aggressive disease with 5-year survival rates of less than 10%. The constantly increasing incidence and stagnant patient outcomes despite changes in treatment regimens emphasize the requirement of a better understanding of the disease mechanisms. Challenges in treating pancreatic cancer include diagnosis at already progressed disease states due to the lack of early detection methods, rapid acquisition of therapy resistance, and high metastatic competence. Pancreatic ductal adenocarcinoma, the most prevalent type of pancreatic cancer, frequently shows dominant-active mutations in KRAS and TP53 as well as inactivation of genes involved in differentiation and cell-cycle regulation (e.g. SMAD4 and CDKN2A). Besides somatic mutations, deregulated transcription factor activities strongly contribute to disease progression. Specifically, transcriptional regulatory networks essential for proper lineage specification and differentiation during pancreas development are reactivated or become deregulated in the context of cancer and exacerbate progression towards an aggressive phenotype. This review summarizes the recent literature on transcription factor networks and epigenetic gene regulation that play a crucial role during tumorigenesis.
Affiliation(s)
- Ruthger van Roey
- Department of Experimental Medicine 1, Nikolaus-Fiebiger Center for Molecular Medicine, Friedrich-Alexander University of Erlangen-Nürnberg, Erlangen, Germany
- Thomas Brabletz
- Department of Experimental Medicine 1, Nikolaus-Fiebiger Center for Molecular Medicine, Friedrich-Alexander University of Erlangen-Nürnberg, Erlangen, Germany
- Marc P Stemmler
- Department of Experimental Medicine 1, Nikolaus-Fiebiger Center for Molecular Medicine, Friedrich-Alexander University of Erlangen-Nürnberg, Erlangen, Germany
- Isabell Armstark
- Department of Experimental Medicine 1, Nikolaus-Fiebiger Center for Molecular Medicine, Friedrich-Alexander University of Erlangen-Nürnberg, Erlangen, Germany
6
Vancura A, Lanzós A, Bosch-Guiteras N, Esteban MT, Gutierrez AH, Haefliger S, Johnson R. Cancer LncRNA Census 2 (CLC2): an enhanced resource reveals clinical features of cancer lncRNAs. NAR Cancer 2021; 3:zcab013. [PMID: 34316704 PMCID: PMC8210278 DOI: 10.1093/narcan/zcab013] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/15/2020] [Revised: 03/12/2021] [Accepted: 03/17/2021] [Indexed: 01/28/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) play key roles in cancer and are at the vanguard of precision therapeutic development. These efforts depend on large and high-confidence collections of cancer lncRNAs. Here, we present the Cancer LncRNA Census 2 (CLC2). With 492 cancer lncRNAs, CLC2 is 4-fold greater in size than its predecessor, without compromising on strict criteria of confident functional/genetic roles and inclusion in the GENCODE annotation scheme. This increase was enabled by leveraging high-throughput transposon insertional mutagenesis screening data, yielding 92 novel cancer lncRNAs. CLC2 makes a valuable addition to existing collections: it is amongst the largest, contains numerous unique genes (not found in other databases) and carries functional labels (oncogene/tumour suppressor). Analysis of this dataset reveals that cancer lncRNAs are impacted by germline variants, somatic mutations and changes in expression consistent with inferred disease functions. Furthermore, we show how clinical/genomic features can be used to vet prospective gene sets from high-throughput sources. The combination of size and quality makes CLC2 a foundation for precision medicine, demonstrating cancer lncRNAs’ evolutionary and clinical significance.
Affiliation(s)
- Adrienne Vancura
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Andrés Lanzós
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Núria Bosch-Guiteras
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Mònica Torres Esteban
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Alejandro H Gutierrez
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Simon Haefliger
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Rory Johnson
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
7
Discovering novel driver mutations from pan-cancer analysis of mutational and gene expression profiles. PLoS One 2020; 15:e0242780. [PMID: 33232371 PMCID: PMC7685479 DOI: 10.1371/journal.pone.0242780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2020] [Accepted: 11/10/2020] [Indexed: 11/19/2022] Open
Abstract
As the genomic profile across cancers varies from person to person, patient prognosis and treatment may differ based on the mutational signature of each tumour. Thus, it is critical to understand genomic drivers of cancer and identify potential mutational commonalities across tumors originating at diverse anatomical sites. Large-scale cancer genomics initiatives, such as TCGA, ICGC and GENIE have enabled the analysis of thousands of tumour genomes. Our goal was to identify new cancer-causing mutations that may be common across tumour sites using mutational and gene expression profiles. Genomic and transcriptomic data from breast, ovarian, and prostate cancers were aggregated and analysed using differential gene expression methods to identify the effect of specific mutations on the expression of multiple genes. Mutated genes associated with the most differentially expressed genes were considered to be novel candidates for driver mutations, and were validated through literature mining, pathway analysis and clinical data investigation. Our driver selection method successfully identified 116 probable novel cancer-causing genes, with 4 discovered in patients having no alterations in any known driver genes: MXRA5, OBSCN, RYR1, and TG. The candidate genes previously not officially classified as cancer-causing showed enrichment in cancer pathways and in cancer diseases. They also matched expectations pertaining to properties of cancer genes, for instance, showing larger gene and protein lengths, and having mutation patterns suggesting oncogenic or tumor suppressor properties. Our approach allows for the identification of novel putative driver genes that are common across cancer sites using an unbiased approach without any a priori knowledge on pathways or gene interactions and is therefore an agnostic approach to the identification of putative common driver genes acting at multiple cancer sites.
8
Romdhane L, Bouhamed H, Ghedira K, Ben Hamda C, Louhichi A, Jmel H, Romdhane S, Charfeddine C, Mokni M, Abdelhak S, Rebai A. The morbid cutaneous anatomy of the human genome revealed by a bioinformatic approach. Genomics 2020; 112:4232-4241. [PMID: 32650097 DOI: 10.1016/j.ygeno.2020.07.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/13/2019] [Revised: 03/28/2020] [Accepted: 07/02/2020] [Indexed: 01/05/2023]
Abstract
Computational approaches have been developed to prioritize candidate genes in disease gene identification. They are based on different pieces of evidence associating each gene with the given disease. In this study, 648 genes underlying genodermatoses have been compared to 1808 genes involved in other genetic diseases using a bioinformatic approach. These genes were studied at the structural, evolutionary and functional levels. Results show that genes underlying genodermatoses have longer CDS and more exons. Significant differences were observed in nucleotide motif and amino-acid compositions. Evolutionary conservation analysis revealed that genodermatosis genes have fewer paralogs, have more orthologs in mouse and dog, and are less conserved. Functional analysis revealed that genodermatosis genes seem to be involved in the immune system and skin layers. The Bayesian network model returned a correct classification rate of around 80%. This computational approach could help investigators working in the field of dermatology by prioritizing positional candidate genes for mutation screening.
Affiliation(s)
- Lilia Romdhane
- Biomedical Genomics and Oncogenetics Laboratory LR11IPT05, LR16IPT05, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia; Department of Biology, Faculty of Sciences of Bizerte, Jarzouna, Université Tunis Carthage, Tunis, Tunisia.
- Heni Bouhamed
- Molecular and Cellular Screening Process Laboratory, Centre of Biotechnology of Sfax, Sfax, Tunisia
- Kais Ghedira
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR16IPT09), Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Cherif Ben Hamda
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR16IPT09), Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Amel Louhichi
- Molecular and Cellular Screening Process Laboratory, Centre of Biotechnology of Sfax, Sfax, Tunisia
- Haifa Jmel
- Biomedical Genomics and Oncogenetics Laboratory LR11IPT05, LR16IPT05, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Safa Romdhane
- Biomedical Genomics and Oncogenetics Laboratory LR11IPT05, LR16IPT05, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Chérine Charfeddine
- Biomedical Genomics and Oncogenetics Laboratory LR11IPT05, LR16IPT05, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia; High Institut of Biotechnology of Sidi Thabet, University of Manouba, BiotechPole of Sidi Thabet, Ariana, Tunisia
- Mourad Mokni
- Department of Dermatology, CHU La Rabta Tunis, Tunis, Tunisia; Public health and infection Research Laboratory, La Rabta Hospital, Tunis, Tunisia
- Sonia Abdelhak
- Biomedical Genomics and Oncogenetics Laboratory LR11IPT05, LR16IPT05, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Ahmed Rebai
- Molecular and Cellular Screening Process Laboratory, Centre of Biotechnology of Sfax, Sfax, Tunisia
9
Gene Expression Signature of BRAF Inhibitor Resistant Melanoma Spheroids. Pathol Oncol Res 2020; 26:2557-2566. [PMID: 32613561 PMCID: PMC7471197 DOI: 10.1007/s12253-020-00837-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/14/2020] [Accepted: 06/08/2020] [Indexed: 02/06/2023]
Abstract
In vitro cell cultures are frequently used to define the molecular background of drug resistance. The majority of currently available data have been obtained from 2D in vitro cultures; however, 3D cell culture systems (spheroids) are more likely to behave similarly to in vivo conditions. Our major aim was to compare the gene expression signature of 2D and 3D cultured BRAFV600E mutant melanoma cell lines. We successfully developed BRAF-drug resistant cell lines from paired primary/metastatic melanoma cell lines in both 2D and 3D in vitro cultures. Using Affymetrix Human Gene 1.0 ST arrays, we determined the gene expression pattern of all cell lines. Our analysis revealed 1049 genes (562 upregulated and 487 downregulated) that were differentially expressed between drug-sensitive cells grown under different culture conditions. Pathway analysis showed that the differentially expressed genes were mainly associated with the cell cycle, p53, and other cancer-related pathways. The number of upregulated genes (72 genes) was remarkably lower when comparing the resistant adherent cells to cells grown in 3D, and these genes were associated with cell adhesion molecules and IGF1R signalling. Only 1% of the upregulated and 5.6% of the downregulated genes were commonly altered between the sensitive and the resistant spheroids. Interestingly, we found several genes (BNIP3, RING1 and ABHD4) with inverse expression signatures between sensitive and resistant spheroids, which are involved in anoikis resistance and cell cycle regulation. In summary, our study highlights gene expression alterations that might help to understand the development of acquired resistance in melanoma cells in tumour tissue.
10
Carlevaro-Fita J, Liu L, Zhou Y, Zhang S, Chouvardas P, Johnson R, Li J. LnCompare: gene set feature analysis for human long non-coding RNAs. Nucleic Acids Res 2020; 47:W523-W529. [PMID: 31147707 PMCID: PMC6602513 DOI: 10.1093/nar/gkz410] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/05/2019] [Revised: 05/02/2019] [Accepted: 05/06/2019] [Indexed: 02/01/2023] Open
Abstract
Interest in the biological roles of long noncoding RNAs (lncRNAs) has resulted in growing numbers of studies that produce large sets of candidate genes, for example, differentially expressed between two conditions. For sets of protein-coding genes, ontology and pathway analyses are powerful tools for generating new insights from statistical enrichment of gene features. Here we present the LnCompare web server, an equivalent resource for studying the properties of lncRNA gene sets. The Gene Set Feature Comparison mode tests for enrichment amongst a panel of quantitative and categorical features, spanning gene structure, evolutionary conservation, expression, subcellular localization, repetitive sequences and disease association. Moreover, in Similar Gene Identification mode, users may identify other lncRNAs by similarity across a defined range of features. Comprehensive results may be downloaded in tabular and graphical formats, in addition to the entire feature resource. LnCompare will empower researchers to extract useful hypotheses and candidates from lncRNA gene sets.
Affiliation(s)
- Joana Carlevaro-Fita
- Department of BioMedical Research (DBMR), University of Bern, Bern 3008, Switzerland; Department of Medical Oncology, Inselspital, University Hospital and University of Bern 3010, Switzerland
- Leibo Liu
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Yuan Zhou
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing 100191, China
- Shan Zhang
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Panagiotis Chouvardas
- Department of BioMedical Research (DBMR), University of Bern, Bern 3008, Switzerland; Department of Medical Oncology, Inselspital, University Hospital and University of Bern 3010, Switzerland
- Rory Johnson
- Department of BioMedical Research (DBMR), University of Bern, Bern 3008, Switzerland; Department of Medical Oncology, Inselspital, University Hospital and University of Bern 3010, Switzerland
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
11
Carlevaro-Fita J, Lanzós A, Feuerbach L, Hong C, Mas-Ponte D, Pedersen JS, Johnson R. Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis. Commun Biol 2020; 3:56. [PMID: 32024996 PMCID: PMC7002399 DOI: 10.1038/s42003-019-0741-7] [Citation(s) in RCA: 127] [Impact Index Per Article: 31.8] [Received: 03/23/2018] [Accepted: 08/31/2018] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.
Affiliation(s)
- Joana Carlevaro-Fita
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland
- Department of Biomedical Research, University of Bern, 3008, Bern, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, 3012, Bern, Switzerland
- Andrés Lanzós
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland
- Department of Biomedical Research, University of Bern, 3008, Bern, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, 3012, Bern, Switzerland
- Lars Feuerbach
- Applied Bioinformatics, Deutsches Krebsforschungszentrum, 69120, Heidelberg, Germany
- Chen Hong
- Applied Bioinformatics, Deutsches Krebsforschungszentrum, 69120, Heidelberg, Germany
- David Mas-Ponte
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), Dr. Aiguader 88, 08003, Barcelona, Spain
- Jakob Skou Pedersen
- Department for Molecular Medicine, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, 8200, Aarhus N, Denmark
- Rory Johnson
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland.
- Department of Biomedical Research, University of Bern, 3008, Bern, Switzerland.
- Graduate School for Cellular and Biomedical Sciences, University of Bern, 3012, Bern, Switzerland.
12
Computational characterization and identification of human polycystic ovary syndrome genes. Sci Rep 2018; 8:12949. [PMID: 30154492 PMCID: PMC6113217 DOI: 10.1038/s41598-018-31110-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/21/2018] [Accepted: 08/10/2018] [Indexed: 12/30/2022] Open
Abstract
Human polycystic ovary syndrome (PCOS) is a highly heritable disease regulated by genetic and environmental factors. Identifying PCOS genes in the wet lab is time-consuming and costly, so an algorithm that predicts PCOS candidate genes would be helpful. In this study, for the first time, we systematically analyzed properties of human PCOS genes. Compared with genes not yet known to be involved in PCOS regulation, known PCOS genes display distinguishing characteristics: (i) they tend to be located at the network center; (ii) they tend to interact with each other; (iii) they tend to be enriched in certain biological processes. Based on these features, we developed a machine-learning algorithm to predict new PCOS genes. In total, 233 PCOS candidates were predicted with a posterior probability >0.9. Evidence supporting 7 of the top 10 predictions has been found.
13
Modelling the evolution of transcription factor binding preferences in complex eukaryotes. Sci Rep 2017; 7:7596. [PMID: 28790414 PMCID: PMC5548724 DOI: 10.1038/s41598-017-07761-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 03/08/2017] [Accepted: 06/30/2017] [Indexed: 12/27/2022] Open
Abstract
Transcription factors (TFs) exert their regulatory action by binding to DNA with specific sequence preferences. However, different TFs can partially share their binding sequences due to their common evolutionary origin. This "redundancy" of binding defines a way of organizing TFs in "motif families" by grouping TFs with similar binding preferences. Since these ultimately define the TF target genes, the motif family organization entails information about the structure of transcriptional regulation as it has been shaped by evolution. Focusing on the human TF repertoire, we show that a one-parameter evolutionary model of the Birth-Death-Innovation type can explain the empirical distribution of TFs across motif families, and makes it possible to highlight the relevant evolutionary forces at the origin of this organization. Moreover, the model pinpoints a few deviations from the neutral scenario it assumes: three over-expanded families (including HOX and FOX genes), a set of "singleton" TFs for which duplication seems to be selected against, and a higher-than-average rate of diversification of the binding preferences of TFs with a Zinc Finger DNA binding domain. Finally, a comparison of the TF motif family organization in different eukaryotic species suggests that redundancy of binding increases with organism complexity.
|
14
|
Discovery of Cancer Driver Long Noncoding RNAs across 1112 Tumour Genomes: New Candidates and Distinguishing Features. Sci Rep 2017; 7:41544. [PMID: 28128360 PMCID: PMC5269722 DOI: 10.1038/srep41544] [Citation(s) in RCA: 79] [Impact Index Per Article: 11.3] [Received: 07/11/2016] [Accepted: 12/22/2016] [Indexed: 12/20/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) represent a vast unexplored genetic space that may hold missing drivers of tumourigenesis, but few such "driver lncRNAs" are known. Until now, they have been discovered through changes in expression, leading to problems in distinguishing between causative roles and passenger effects. We here present a different approach for driver lncRNA discovery using mutational patterns in tumour DNA. Our pipeline, ExInAtor, identifies genes with excess load of somatic single nucleotide variants (SNVs) across panels of tumour genomes. Heterogeneity in mutational signatures between cancer types and individuals is accounted for using a simple local trinucleotide background model, which yields high precision and low computational demands. We use ExInAtor to predict drivers from the GENCODE annotation across 1112 entire genomes from 23 cancer types. Using a stratified approach, we identify 15 high-confidence candidates: 9 novel and 6 known cancer-related genes, including MALAT1, NEAT1 and SAMMSON. Both known and novel driver lncRNAs are distinguished by elevated gene length, evolutionary conservation and expression. We have presented a first catalogue of mutated lncRNA genes driving cancer, which will grow and improve with the application of ExInAtor to future tumour genome projects.
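The idea of a local trinucleotide background can be sketched as follows. This is not ExInAtor's implementation: the function names, the toy sequences, and the choice to skip contexts that never occur in the background are all assumptions made for illustration. The sketch estimates a per-context mutation rate from a background region and uses it to compute the number of mutations expected in a target region, which an observed count could then be compared against.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def trinucleotide_counts(seq):
    """Count every overlapping trinucleotide made only of A, C, G, T."""
    seq = seq.upper()
    return Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= VALID_BASES
    )


def expected_mutations(target_seq, background_seq, background_mutation_positions):
    """Expected mutation count in target_seq under per-context background rates.

    background_mutation_positions holds 0-based indices of mutated bases in
    background_seq; each mutation is assigned to the trinucleotide centred on it.
    """
    bg_contexts = trinucleotide_counts(background_seq)
    bg_mutated = Counter()
    for pos in background_mutation_positions:
        if 1 <= pos < len(background_seq) - 1:
            context = background_seq[pos - 1:pos + 2].upper()
            if context in bg_contexts:
                bg_mutated[context] += 1

    expected = 0.0
    for context, n_sites in trinucleotide_counts(target_seq).items():
        # Assumption: contexts absent from the background contribute nothing.
        if bg_contexts[context]:
            expected += n_sites * bg_mutated[context] / bg_contexts[context]
    return expected


# Hypothetical toy data: two mutations, both in an ACG context.
background = "ACGACGACG"
expected = expected_mutations("ACGT", background, [1, 4])
```

A gene whose observed exonic mutation count is well above such an expectation would be a candidate for closer inspection; a real pipeline would also need a significance test and multiple-testing correction.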
|
15
|
Abstract
Dysregulation of the normal gene expression program is the cause of a broad range of diseases, including cancer. Detecting the specific perturbed regulators that affect the onset and development of the disease is crucial for understanding the disease mechanism and for making decisions on efficient preventive and curative therapies. Moreover, detecting such perturbations at the patient level is even more important from the perspective of personalized medicine. We applied the Transcription Factor Target Enrichment Analysis, a method that detects the activity of transcription factors based on the quantification of the collective transcriptional activation of their targets, to a large collection of 5607 cancer samples covering eleven cancer types. We produced for the first time a comprehensive catalogue of altered transcription factor activities in cancer, a considerable number of them significantly associated with patient survival. Moreover, we described several interesting TFs whose activity does not change substantially in cancer with respect to normal tissue but which ultimately play an important role in determining patient prognosis, which suggests they might be promising therapeutic targets. An additional advantage of this method is that it allows personalized TF activity estimates to be obtained for individual patients.
|
16
|
Li YH, Zhang GG, Wang N. Systematic Characterization and Prediction of Human Hypertension Genes. Hypertension 2016; 69:349-355. [PMID: 27895194 DOI: 10.1161/hypertensionaha.116.08573] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/12/2016] [Revised: 10/19/2016] [Accepted: 11/09/2016] [Indexed: 01/25/2023]
Abstract
Hypertension is a major cardiovascular risk factor and accounts for a large part of cardiovascular mortality. In this work, we analyzed the properties of hypertension genes and found that, when compared with genes not yet known to be involved in hypertension regulation, known hypertension genes display distinguishing features: (1) hypertension genes tend to be located at the network center; (2) hypertension genes tend to interact with each other; and (3) hypertension genes tend to be enriched in certain biological processes and show certain phenotypes. Based on these features, we developed a machine-learning algorithm to predict new hypertension genes. One hundred and seventy-seven candidates were predicted with a posterior probability >0.9. Evidence supporting 17 of the predictions has been found.
Affiliation(s)
- Yan-Hui Li
- From the Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, People's Republic of China (Y.-H.L., N.W.); Special Medical Ward (Geratology Department), First Hospital of Tsinghua University Beijing, People's Republic of China (G.-G.Z.); and The Advanced Institute for Medical Sciences, Dalian Medical University, China (N.W.).
| | - Gai-Gai Zhang
- From the Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, People's Republic of China (Y.-H.L., N.W.); Special Medical Ward (Geratology Department), First Hospital of Tsinghua University Beijing, People's Republic of China (G.-G.Z.); and The Advanced Institute for Medical Sciences, Dalian Medical University, China (N.W.)
| | - Nanping Wang
- From the Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Ministry of Education, Peking University Health Science Center, Beijing, People's Republic of China (Y.-H.L., N.W.); Special Medical Ward (Geratology Department), First Hospital of Tsinghua University Beijing, People's Republic of China (G.-G.Z.); and The Advanced Institute for Medical Sciences, Dalian Medical University, China (N.W.).
| |
|
17
|
Ehsani R, Bahrami S, Drabløs F. Feature-based classification of human transcription factors into hypothetical sub-classes related to regulatory function. BMC Bioinformatics 2016; 17:459. [PMID: 27842491 PMCID: PMC5109715 DOI: 10.1186/s12859-016-1349-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/19/2016] [Accepted: 11/10/2016] [Indexed: 12/15/2022] Open
Abstract
Background: Transcription factors are key proteins in the regulation of gene transcription. An important step in this process is the opening of chromatin in order to make genomic regions available for transcription. Data on DNase I hypersensitivity has previously been used to label a subset of transcription factors as Pioneers, Settlers and Migrants to describe their potential role in this process. These labels represent an interesting hypothesis on gene regulation and possibly a useful approach for data analysis, and therefore we wanted to expand the set of labeled transcription factors to include as many known factors as possible. We have used a well-annotated dataset of 1175 transcription factors as input to supervised machine learning methods, using the subset with previously assigned labels as training set. We then used the final classifier to label the additional transcription factors according to their potential role as Pioneers, Settlers and Migrants. The full set of labeled transcription factors was used to investigate associated properties and functions of each class, including an analysis of interaction data for transcription factors based on DNA co-binding and protein-protein interactions. We also used the assigned labels to analyze a previously published set of gene lists associated with a time course experiment on cell differentiation. Results: The analysis showed that the classification of transcription factors with respect to their potential role in chromatin opening was largely determined by how they bind to DNA. Each subclass of transcription factors was enriched for properties that seemed to characterize the subclass relative to its role in gene regulation, with very general functions for Pioneers, whereas Migrants to a larger extent were associated with specific processes. Further analysis showed that the expanded classification is a useful resource for analyzing other datasets on transcription factors with respect to their potential role in gene regulation. The analysis of transcription factor interaction data showed complementary differences between the subclasses, where transcription factors labeled as Pioneers often interact with other transcription factors through DNA co-binding, whereas Migrants to a larger extent use protein-protein interactions. The analysis of time course data on cell differentiation indicated a shift in the regulatory program associated with Pioneer-like transcription factors during differentiation. Conclusions: The expanded classification is an interesting resource for analyzing data on gene regulation, as illustrated here on transcription factor interaction data and data from a time course experiment. The potential regulatory function of transcription factors seems largely to be determined by how they bind DNA, but is also influenced by how they interact with each other through cooperativity and protein-protein interactions. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1349-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Rezvan Ehsani
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, PO Box 8905, NO-7491, Trondheim, Norway.,Department of Mathematics, University of Zabol, Zabol, Iran
| | - Shahram Bahrami
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, PO Box 8905, NO-7491, Trondheim, Norway.,St. Olavs Hospital, Trondheim University Hospital, NO-7006, Trondheim, Norway
| | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, PO Box 8905, NO-7491, Trondheim, Norway.
| |
|
18
|
Jamal S, Goyal S, Shanker A, Grover A. Integrating network, sequence and functional features using machine learning approaches towards identification of novel Alzheimer genes. BMC Genomics 2016; 17:807. [PMID: 27756223 PMCID: PMC5070370 DOI: 10.1186/s12864-016-3108-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 04/29/2016] [Accepted: 09/20/2016] [Indexed: 01/01/2023] Open
Abstract
Background: Alzheimer’s disease (AD) is a complex progressive neurodegenerative disorder commonly characterized by short-term memory loss. At present, no therapeutic treatment can completely cure the disease. The cause of Alzheimer’s is still unclear; however, genetic factors are among the major factors involved in AD pathogenesis, and around 70% of the risk of the disease is assumed to be due to the large number of genes involved. Although genetic association studies have revealed a number of potential AD susceptibility genes, there is still a need to identify further AD-associated genes and therapeutic targets in order to better understand the disease-causing mechanisms of Alzheimer’s and to develop effective AD therapeutics. Results: In the present study, we used a machine learning approach to identify candidate AD-associated genes by integrating topological properties of the genes from protein-protein interaction networks, sequence features and functional annotations. We also used a molecular docking approach to screen known anti-Alzheimer drugs against the newly predicted probable targets of AD and observed that an investigational drug, AL-108, had high affinity for the majority of the possible therapeutic targets. Furthermore, we performed molecular dynamics simulations and MM/GBSA calculations on the docked complexes to validate our preliminary findings. Conclusions: To the best of our knowledge, this is the first comprehensive study of its kind to identify putative Alzheimer-associated genes using machine learning approaches, and we propose that such computational studies can improve our understanding of the core etiology of AD, which could lead to the development of effective anti-Alzheimer drugs. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3108-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Salma Jamal
- School of Biotechnology, Jawaharlal Nehru University, New Delhi, 110067, India.,Department of Bioscience and Biotechnology, Banasthali University, Tonk, Rajasthan, 304022, India
| | - Sukriti Goyal
- School of Biotechnology, Jawaharlal Nehru University, New Delhi, 110067, India.,Department of Bioscience and Biotechnology, Banasthali University, Tonk, Rajasthan, 304022, India
| | - Asheesh Shanker
- Bioinformatics Programme, Centre for Biological Sciences, Central University of South Bihar, BIT Campus, Patna, Bihar, India
| | - Abhinav Grover
- School of Biotechnology, Jawaharlal Nehru University, New Delhi, 110067, India.
| |
|
19
|
Levati E, Sartini S, Ottonello S, Montanini B. Dry and wet approaches for genome-wide functional annotation of conventional and unconventional transcriptional activators. Comput Struct Biotechnol J 2016; 14:262-70. [PMID: 27453771 PMCID: PMC4941109 DOI: 10.1016/j.csbj.2016.06.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/31/2016] [Revised: 06/21/2016] [Accepted: 06/23/2016] [Indexed: 02/06/2023] Open
Abstract
Transcription factors (TFs) are master gene products that regulate gene expression in response to a variety of stimuli. They interact with DNA in a sequence-specific manner using a variety of DNA-binding domain (DBD) modules. This allows proper positioning of their second domain, called the "effector domain", to directly or indirectly recruit positively or negatively acting co-regulators, including chromatin modifiers, thus modulating preinitiation complex formation as well as transcription elongation. In contrast to the DBDs, which comprise well-defined and easily recognizable DNA-binding motifs, effector domains are usually much less conserved and thus considerably more difficult to predict. Also not so easy to identify are the DNA-binding sites of TFs, especially on a genome-wide basis and in the case of overlapping binding regions. Another emerging issue, with many potential regulatory implications, is that of so-called "moonlighting" transcription factors, i.e., proteins with an annotated function unrelated to transcription and lacking any recognizable DBD or effector domain, that play a role in gene regulation as their second job. Starting from bioinformatic and experimental high-throughput tools for an unbiased, genome-wide identification and functional characterization of TFs (especially transcriptional activators), we describe both established (and usually affordable) and newly developed platforms for DNA-binding site identification. Selected combinations of these search tools, some of which rely on next-generation sequencing approaches, allow delineation of the entire repertoire of TFs and unconventional regulators encoded by any sequenced genome.
Affiliation(s)
| | | | - Simone Ottonello
- Corresponding author at: Department of Life Sciences, University of Parma, Parco Area delle Scienze 23/A, 43124 Parma, Italy.
| | | |
|
20
|
Human Genes Encoding Transcription Factors and Chromatin-Modifying Proteins Have Low Levels of Promoter Polymorphism: A Study of 1000 Genomes Project Data. Int J Genomics 2015; 2015:260159. [PMID: 26417590 PMCID: PMC4568383 DOI: 10.1155/2015/260159] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/13/2015] [Accepted: 07/29/2015] [Indexed: 12/15/2022] Open
Abstract
The expression level of each gene is controlled by its regulatory regions, which determine the precise regulation in a tissue-specific manner, according to the developmental stage of the body and the necessity of a response to external stimuli. Nucleotide substitutions in regulatory gene regions may modify the affinity of transcription factors for their specific DNA binding sites, affecting the transcription rates of genes. In our previous research, we found that genes controlling the sensory perception of smell and genes involved in antigen processing and presentation were significantly overrepresented among genes with high SNP contents in their promoter regions. The goal of our study was to reveal functional features of human genes containing extremely small numbers of SNPs in promoter regions. Two functional groups were found to be overrepresented among genes whose promoters did not contain SNPs: (1) genes involved in gene-specific transcription and (2) genes controlling chromatin organization. We revealed that the 5′-regulatory regions of genes encoding transcription factors and chromatin-modifying proteins were characterized by reduced genetic variability. One important exception to this rule is the set of genes encoding transcription factors with zinc-coordinating DNA-binding domains (DBDs), which underwent extensive expansion in vertebrates, particularly in primate evolution. Hence, we obtained new evidence for evolutionary forces shaping variability in 5′-regulatory regions of genes.
|
21
|
Casado-Vela J, Fuentes M, Franco-Zorrilla JM. Screening of Protein–Protein and Protein–DNA Interactions Using Microarrays. Adv Protein Chem Struct Biol 2014; 95:231-81. [DOI: 10.1016/b978-0-12-800453-1.00008-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/14/2022]
|
22
|
Predicting potential cancer genes by integrating network properties, sequence features and functional annotations. Sci China Life Sci 2013; 56:751-7. [PMID: 23838808 DOI: 10.1007/s11427-013-4500-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/25/2012] [Accepted: 05/14/2013] [Indexed: 10/26/2022]
Abstract
The discovery of novel cancer genes is one of the main goals in cancer research. Bioinformatics methods can be used to accelerate cancer gene discovery, which may help in the understanding of cancer and the development of drug targets. In this paper, we describe a classifier to predict potential cancer genes that we have developed by integrating multiple lines of biological evidence, including protein-protein interaction network properties, and sequence and functional features. We detected 55 features that were significantly different between cancer genes and non-cancer genes. Fourteen cancer-associated features were chosen to train the classifier. Four machine learning methods, logistic regression, support vector machines (SVMs), BayesNet and decision tree, were explored in the classifier models to distinguish cancer genes from non-cancer genes. The prediction power of the different models was evaluated by 5-fold cross-validation. The area under the receiver operating characteristic curve for the logistic regression, SVM, BayesNet and J48 tree models was 0.834, 0.740, 0.800 and 0.782, respectively. Finally, the logistic regression classifier with multiple biological features was applied to the genes in the Entrez database, and 1976 cancer gene candidates were identified. We found that the integrated prediction model performed much better than the models based on the individual biological evidence, and that the network and functional features had stronger predictive power than the sequence features in predicting cancer genes.
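The evaluation described in this abstract (5-fold cross-validation scored by the area under the ROC curve) can be sketched in a few lines. This is a generic illustration, not the authors' code: the function names and the toy scores are assumptions, and the classifier itself is left out so that any model producing a score per gene could be plugged in.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute an AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical classifier scores for four genes (1 = cancer gene, 0 = not).
toy_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
folds = list(k_fold_indices(12, k=5))
```

In practice the samples would be shuffled (ideally stratified by class) before splitting, the model would be fitted on each training fold, and the per-fold AUC values would be averaged.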
|
23
|
Reva B. Revealing selection in cancer using the predicted functional impact of cancer mutations. Application to nomination of cancer drivers. BMC Genomics 2013; 14 Suppl 3:S8. [PMID: 23819556 PMCID: PMC3665576 DOI: 10.1186/1471-2164-14-s3-s8] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/25/2022] Open
Abstract
Every malignant tumor has a unique spectrum of genomic alterations including numerous protein mutations. There are also hundreds of personal germline variants to be taken into account. The combinatorial diversity of potential cancer-driving events limits the applicability of statistical methods to determine tumor-specific "driver" alterations among an overwhelming majority of "passengers". An alternative approach to determining driver mutations is to assess the functional impact of mutations in a given tumor and predict drivers based on a numerical value of the mutation impact in a particular context of genomic alterations. Recently, we introduced a functional impact score, which assesses the mutation impact by the value of entropic disordering of the evolutionary conservation patterns in proteins. The functional impact score separates disease-associated variants from benign polymorphisms with an accuracy of ~80%. Can the score be used to identify functionally important non-recurrent cancer-driver mutations? Assuming that cancer drivers are positively selected in tumor evolution, we investigated how the functional impact score correlates with key features of natural selection in cancer, such as the non-uniformity of distribution of mutations, the frequency of affected tumor suppressors and oncogenes, and the frequency of concurrent alterations in regions of heterozygous deletions and copy gain; as a control, we used presumably non-selected silent mutations. Using mutations of six cancers studied in TCGA projects, we found that predicted high-scoring functional mutations as well as truncating mutations tend to be evolutionarily selected as compared to low-scoring and silent mutations. This result justifies prediction of driver mutations using a shorter list of predicted high-scoring functional mutations, rather than the "long tail" of all mutations.
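The notion of "entropic disordering" of a conservation pattern can be illustrated with a simple Shannon-entropy calculation on one alignment column. This is only a sketch under stated assumptions: the functions, the toy columns, and the idea of scoring a mutation by the entropy change it causes when added to the column are illustrative and do not reproduce the functional impact score of the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_shift(column, mutant_residue):
    """Change in column entropy when the mutant residue is added to the column."""
    return column_entropy(column + mutant_residue) - column_entropy(column)


# Hypothetical columns: one fully conserved, one already variable.
conserved_shift = entropy_shift("AAAAAAAA", "V")
variable_shift = entropy_shift("AVLIAVLI", "V")
```

Under this toy measure, a substitution at the conserved position disturbs the column more than the same substitution at the variable position, which matches the intuition that mutations at conserved sites are more likely to be functional.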
Affiliation(s)
- B Reva
- Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, NY 10065, USA.
| |
|
24
|
Santana-Codina N, Carretero R, Sanz-Pamplona R, Cabrera T, Guney E, Oliva B, Clezardin P, Olarte OE, Loza-Alvarez P, Méndez-Lucas A, Perales JC, Sierra A. A transcriptome-proteome integrated network identifies endoplasmic reticulum thiol oxidoreductase (ERp57) as a hub that mediates bone metastasis. Mol Cell Proteomics 2013; 12:2111-25. [PMID: 23625662 DOI: 10.1074/mcp.m112.022772] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022] Open
Abstract
Bone metastasis is the most common distant relapse in breast cancer. The identification of key proteins involved in the osteotropic phenotype would represent a major step toward the development of new prognostic markers and therapeutic improvements. The aim of this study was to characterize functional phenotypes that favor bone metastasis in human breast cancer. We used the human breast cancer cell line MDA-MB-231 and its osteotropic BO2 subclone to identify crucial proteins in bone metastatic growth. We identified 31 proteins, 15 underexpressed and 16 overexpressed, in BO2 cells compared with parental cells. We employed a network-modeling approach in which these 31 candidate proteins were prioritized with respect to their potential in metastasis formation, based on the topology of the protein-protein interaction network and differential expression. The protein-protein interaction network provided a framework to study the functional relationships between biological molecules by attributing functions to genes whose functions had not been characterized. The combination of expression profiles and protein interactions revealed an endoplasmic reticulum-thiol oxidoreductase, ERp57, functioning as a hub that retained four down-regulated nodes involved in antigen presentation associated with the human major histocompatibility complex class I molecules, including HLA-A, HLA-B, HLA-E, and HLA-F. Further analysis of the interaction network revealed an inverse correlation between ERp57 and vimentin, which influences cytoskeleton reorganization. Moreover, knockdown of ERp57 in BO2 cells confirmed its bone organ-specific prometastatic role. Altogether, ERp57 appears as a multifunctional chaperone that can regulate diverse biological processes to maintain the homeostasis of breast cancer cells and promote the development of bone metastasis.
Affiliation(s)
- Naiara Santana-Codina
- Biological Clues of the Invasive and Metastatic Phenotype Group, Bellvitge Biomedical Research Institute IDIBELL, L'Hospitalet de Llobregat, Barcelona E-08908, Spain
| | | | | | | | | | | | | | | | | | | | | | | |
|
25
|
Shihab HA, Gough J, Cooper DN, Day INM, Gaunt TR. Predicting the functional consequences of cancer-associated amino acid substitutions. Bioinformatics 2013; 29:1504-10. [PMID: 23620363 PMCID: PMC3673218 DOI: 10.1093/bioinformatics/btt182] [Citation(s) in RCA: 171] [Impact Index Per Article: 15.5] [Indexed: 11/25/2022]
Abstract
Motivation: The number of missense mutations being identified in cancer genomes has greatly increased as a consequence of technological advances and the reduced cost of whole-genome/whole-exome sequencing methods. However, a high proportion of the amino acid substitutions detected in cancer genomes have little or no effect on tumour progression (passenger mutations). Therefore, accurate automated methods capable of discriminating between driver (cancer-promoting) and passenger mutations are becoming increasingly important. In our previous work, we developed the Functional Analysis through Hidden Markov Models (FATHMM) software and, using a model weighted for inherited disease mutations, observed improved performances over alternative computational prediction algorithms. Here, we describe an adaptation of our original algorithm that incorporates a cancer-specific model to potentiate the functional analysis of driver mutations. Results: The performance of our algorithm was evaluated using two separate benchmarks. In our analysis, we observed improved performances when distinguishing between driver mutations and other germ line variants (both disease-causing and putatively neutral mutations). In addition, when discriminating between somatic driver and passenger mutations, we observed performances comparable with the leading computational prediction algorithms: SPF-Cancer and TransFIC. Availability and implementation: A web-based implementation of our cancer-specific model, including a downloadable stand-alone package, is available at http://fathmm.biocompute.org.uk. Contact: fathmm@biocompute.org.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hashem A Shihab
- Bristol Centre for Systems Biomedicine and MRC CAiTE Centre, School of Social and Community Medicine, University of Bristol, Bristol BS8 2BN, UK
| | | | | | | | | |
|
26
|
Ramani RG, Jacob SG. Improved classification of lung cancer tumors based on structural and physicochemical properties of proteins using data mining models. PLoS One 2013; 8:e58772. [PMID: 23505559 PMCID: PMC3591381 DOI: 10.1371/journal.pone.0058772] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 12/22/2012] [Accepted: 02/06/2013] [Indexed: 11/22/2022] Open
Abstract
Detecting divergence between oncogenic tumors plays a pivotal role in cancer diagnosis and therapy. This research work was focused on designing a computational strategy to predict the class of lung cancer tumors from the structural and physicochemical properties (1497 attributes) of protein sequences obtained from genes defined by microarray analysis. The proposed methodology involved the use of hybrid feature selection techniques (gain ratio and correlation based subset evaluators with Incremental Feature Selection) followed by Bayesian Network prediction to discriminate lung cancer tumors as Small Cell Lung Cancer (SCLC), Non-Small Cell Lung Cancer (NSCLC) and the COMMON classes. Moreover, this methodology eliminated the need for extensive data cleansing strategies on the protein properties and revealed the optimal and minimal set of features that contributed to lung cancer tumor classification with an improved accuracy compared to previous work. We also attempted to predict via supervised clustering the possible clusters in the lung tumor data. Our results revealed that supervised clustering algorithms exhibited poor performance in differentiating the lung tumor classes. Hybrid feature selection identified the distribution of solvent accessibility, polarizability and hydrophobicity as the highest ranked features with Incremental feature selection and Bayesian Network prediction generating the optimal Jack-knife cross validation accuracy of 87.6%. Precise categorization of oncogenic genes causing SCLC and NSCLC based on the structural and physicochemical properties of their protein sequences is expected to unravel the functionality of proteins that are essential in maintaining the genomic integrity of a cell and also act as an informative source for drug design, targeting essential protein properties and their composition that are found to exist in lung cancer tumors.
Affiliation(s)
- R. Geetha Ramani
- Department of Information Science and Technology, College of Engineering, Guindy, Anna University, Chennai, Tamilnadu, India
| | - Shomona Gracia Jacob
- Faculty of Information and Communication Engineering, Anna University, Chennai, Tamilnadu, India
| |
|
27
|
Garcia-Bassets I, Wang D. Cistrome plasticity and mechanisms of cistrome reprogramming. Cell Cycle 2012; 11:3199-210. [PMID: 22895178 DOI: 10.4161/cc.21281] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022] Open
Abstract
Mammalian genomes contain thousands of cis-regulatory elements for each transcription factor (TF), but TFs only occupy a relatively small subset referred to as cistrome. Recent studies demonstrate that a TF cistrome might differ among different organisms, tissue types and individuals. In a cell, a TF cistrome might differ among different physiological states, pathological stages and between physiological and pathological conditions. It is, therefore, remarkable how highly plastic these binding profiles are, and how massively they can be reprogrammed in rapid response to intra/extracellular variations and during cell identity transitions and evolution. Biologically, cistrome reprogramming events tend to be followed by changes in transcriptional outputs, thus serving as transformative mechanisms to synchronically alter the biology of the cell. In this review, we discuss the molecular basis of cistrome plasticity and attempt to integrate the different mechanisms and biological conditions associated with cistrome reprogramming. Emerging data suggest that, when altered, these reprogramming events might be linked to tumor development and/or progression, which is a radical conceptual change in our mechanistic understanding of cancer and, potentially, other diseases.
Affiliation(s)
- Ivan Garcia-Bassets
- Department of Medicine, School of Medicine, University of California, San Diego, La Jolla, CA, USA.
|
28
|
Hosseinzadeh F, Ebrahimi M, Goliaei B, Shamabadi N. Classification of lung cancer tumors based on structural and physicochemical properties of proteins by bioinformatics models. PLoS One 2012; 7:e40017. [PMID: 22829872 PMCID: PMC3400626 DOI: 10.1371/journal.pone.0040017] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 03/27/2012] [Accepted: 05/30/2012] [Indexed: 12/03/2022] Open
Abstract
Rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important in the diagnosis of this disease. Sequence-derived structural and physicochemical descriptors are also very useful for machine learning prediction of protein structural and functional classes and for classifying proteins. In this study, the classification of lung tumors based on 1497 attributes derived from structural and physicochemical properties of protein sequences (based on genes defined by microarray analysis) was investigated through a combination of attribute weighting, supervised and unsupervised clustering algorithms. Eighty percent of the weighting methods selected features such as autocorrelation, dipeptide composition and distribution of hydrophobicity as the most important protein attributes in the classification of the SCLC, NSCLC and COMMON classes of lung tumors. The same results were observed with most tree induction algorithms. Descriptors of hydrophobicity distribution were high in protein sequences COMMON to both groups, while the distribution of charge in these proteins was very low, showing that COMMON proteins were very hydrophobic. Furthermore, the composition of polar dipeptides was higher in SCLC proteins than in NSCLC proteins. Some clustering models (alone or in combination with attribute weighting algorithms) were able to nearly classify SCLC and NSCLC proteins. The Random Forest tree induction algorithm (evaluated with leave-one-out and 10-fold cross validation) showed more than 86% accuracy in clustering and predicting the three different lung cancer tumor classes. This is the first report of the application of data mining tools to effectively classify three classes of lung cancer tumors, highlighting the importance of dipeptide composition, autocorrelation and distribution descriptors.
Affiliation(s)
- Faezeh Hosseinzadeh
- Student at Laboratory of Biophysics and Molecular Biology, Institute of Biophysics and Biochemistry, University of Tehran, Tehran, Iran
- Mansour Ebrahimi
- Department of Biology at Basic science School & Bioinformatics Research Group, Green Research Center, University of Qom, Qom, Iran
- Bahram Goliaei
- Department of Medical Physics, Iran University of Medical Science, Tehran, Iran
- Narges Shamabadi
- Bioinformatics Research Group, Green Research Center, University of Qom, Qom, Iran
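The abstract above reports accuracy under leave-one-out and 10-fold cross validation. As a rough illustration of the leave-one-out procedure only, the sketch below holds out each sample in turn, trains on the rest, and counts correct predictions. Everything in it is an assumption made for illustration: the data are synthetic two-dimensional points, the class centers are arbitrary, and a simple nearest-centroid classifier stands in for the Random Forest used in the study.

```python
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean)."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = sum((c - v) ** 2 for c, v in zip(centroid, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the remainder, and score the held-out sample."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


def make_toy_data(seed=0, per_class=20):
    """Synthetic stand-in for protein descriptors; the class centers are hypothetical."""
    rng = random.Random(seed)
    centers = {"SCLC": (0.0, 0.0), "NSCLC": (5.0, 0.0), "COMMON": (0.0, 5.0)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(per_class):
            X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            y.append(label)
    return X, y


X, y = make_toy_data()
print(f"leave-one-out accuracy: {leave_one_out_accuracy(X, y):.3f}")
```

A 10-fold variant would hold out one tenth of the samples per round instead of a single sample, which trades some bias for a much lower computational cost on large data sets.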
|
29
|
Bleda M, Medina I, Alonso R, De Maria A, Salavert F, Dopazo J. Inferring the regulatory network behind a gene expression experiment. Nucleic Acids Res 2012; 40:W168-72. [PMID: 22693210 PMCID: PMC3394273 DOI: 10.1093/nar/gks573] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
Transcription factors (TFs) and miRNAs are the most important dynamic regulators in the control of gene expression in multicellular organisms. These regulatory elements play crucial roles in development, cell cycling and cell signaling, and they have also been associated with many diseases. The Regulatory Network Analysis Tool (RENATO) web server makes the exploration of regulatory networks easy, enabling a better understanding of functional modularity and network integrity under specific perturbations. RENATO is suitable for the analysis of the results of expression profiling experiments. The program analyses lists of genes and searches for the regulators compatible with their activation or deactivation. Tests of single enrichment or gene set enrichment allow the selection of the subset of TFs or miRNAs significantly involved in the regulation of the query genes. RENATO also offers an interactive advanced graphical interface that allows exploring the regulatory network found. RENATO is available at: http://renato.bioinfo.cipf.es/.
Affiliation(s)
- Marta Bleda
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
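The "single enrichment" idea mentioned in the abstract can be illustrated with a one-sided hypergeometric test, a common choice for asking whether the targets of one regulator are over-represented in a query gene list. Whether RENATO uses exactly this test is an assumption; the function name and all counts below are hypothetical and serve only as a minimal sketch.

```python
from math import comb


def hypergeom_enrichment_p(universe, annotated, query, overlap):
    """One-sided p-value P(X >= overlap) under the hypergeometric null.

    universe:  total number of genes considered
    annotated: genes in the universe that are targets of the regulator
    query:     size of the query gene list
    overlap:   query genes that are targets of the regulator
    """
    upper = min(annotated, query)
    total = comb(universe, query)
    tail = sum(
        comb(annotated, i) * comb(universe - annotated, query - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 genes, 300 targets of one TF,
# a query list of 100 genes, 8 of which are targets.
p = hypergeom_enrichment_p(20000, 300, 100, 8)
print(f"enrichment p-value: {p:.3g}")
```

In practice one such test is run per regulator, so the resulting p-values would normally be corrected for multiple testing before selecting significant TFs or miRNAs.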
|
30
|
Furney SJ, Gundem G, Lopez-Bigas N. Oncogenomics methods and resources. Cold Spring Harb Protoc 2012; 2012:2012/5/pdb.top069229. [PMID: 22550293 DOI: 10.1101/pdb.top069229] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 05/31/2023]
Abstract
Today, cancer is viewed as a genetic disease and many genetic mechanisms of oncogenesis are known. The progression from normal tissue to invasive cancer is thought to occur over a timescale of 5-20 years. This transformation is driven by both inherited genetic factors and somatic genetic alterations and mutations, and it results in uncontrolled cell growth and, in many cases, death. In this article, we review the main types of genomic and genetic alterations involved in cancer, namely copy-number changes, genomic rearrangements, somatic mutations, polymorphisms, and epigenomic alterations in cancer. We then discuss the transcriptomic consequences of these alterations in tumor cells. The use of "next-generation" sequencing methods in cancer research is described in the relevant sections. Finally, we discuss different approaches for candidate prioritization and integration and analysis of these complex data.
|
31
|
Normanno D, Dahan M, Darzacq X. Intra-nuclear mobility and target search mechanisms of transcription factors: a single-molecule perspective on gene expression. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2012; 1819:482-93. [PMID: 22342464 DOI: 10.1016/j.bbagrm.2012.02.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 11/28/2011] [Revised: 01/26/2012] [Accepted: 02/03/2012] [Indexed: 12/26/2022]
Abstract
Precise expression of specific genes in time and space is at the basis of cellular viability as well as correct development of organisms. Understanding the mechanisms of gene regulation is fundamental and still one of the great challenges for biology. Gene expression is also regulated by specific transcription factors that recognize and bind to specific DNA sequences. Transcription factor dynamics, and especially the way transcription factors sample the nucleoplasmic space during the search for their specific target in the genome, are a key aspect of regulation and have puzzled researchers for forty years. The scope of this review is to give a state-of-the-art perspective on the intra-nuclear mobility and the target search mechanisms of specific transcription factors at the molecular level. Going through the seminal biochemical experiments that raised the first questions about target localization and the theoretical grounds concerning target search processes, we describe the most recent experimental achievements and current challenges in understanding transcription factor dynamics and interactions with DNA using in vitro assays as well as live prokaryotic and eukaryotic cells. This article is part of a Special Issue entitled: Nuclear Transport and RNA Processing.
Affiliation(s)
- Davide Normanno
- Institut de Biologie de l'Ecole normale supérieure (IBENS), CNRS UMR 8197, Ecole normale supérieure, 46, Rue d'Ulm, 75005 Paris, France.
|
32
|
Koh JLY, Brown KR, Sayad A, Kasimer D, Ketela T, Moffat J. COLT-Cancer: functional genetic screening resource for essential genes in human cancer cell lines. Nucleic Acids Res 2011; 40:D957-63. [PMID: 22102578 PMCID: PMC3245009 DOI: 10.1093/nar/gkr959] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 12/14/2022] Open
Abstract
Genome-wide pooled shRNA screens enable global identification of the genes essential for cancer cell survival and proliferation and provide a ‘functional genetic’ map of human cancer to complement genomic studies. Using a lentiviral shRNA library targeting approximately 16 000 human genes and a newly developed scoring approach, we identified essential gene profiles in more than 70 breast, pancreatic and ovarian cancer cell lines. We developed a web-accessible database system for capturing information from each step in our standardized screening pipeline and a gene-centric search tool for exploring shRNA activities within a given cell line or across multiple cell lines. The database consists of a laboratory information and management system for tracking each step of a pooled shRNA screen as well as a web interface for querying and visualization of shRNA and gene-level performance across multiple cancer cell lines. COLT-Cancer Version 1.0 is currently accessible at http://colt.ccbr.utoronto.ca/cancer.
Affiliation(s)
- Judice L Y Koh
- Banting and Best Department of Medical Research, Donnelly Centre, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada M5G 0A3
|
33
|
Jiang D, Jia Y, Jarrett HW. Transcription factor proteomics: identification by a novel gel mobility shift-three-dimensional electrophoresis method coupled with southwestern blot and high-performance liquid chromatography-electrospray-mass spectrometry analysis. J Chromatogr A 2011; 1218:7003-15. [PMID: 21880322 PMCID: PMC3174475 DOI: 10.1016/j.chroma.2011.08.023] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 05/31/2011] [Revised: 08/05/2011] [Accepted: 08/09/2011] [Indexed: 11/15/2022]
Abstract
Transcription factor (TF) purification and identification is an important step in elucidating gene regulatory mechanisms. In this study, we present two new electrophoretic mobility shift assay (EMSA)-based multi-dimensional electrophoresis approaches to isolate and characterize TFs, using detection with either southwestern or western blotting and HPLC-nanoESI-MS/MS analysis for identification. These new techniques involve several major steps. First, EMSA is performed with agents that diminish non-specific DNA-binding and the DNA-protein complex is separated by native PAGE gel. The gel is then electrotransferred to PVDF membrane and visualized by autoradiography. Next, the DNA-protein complex, which has been transferred onto the blot, is extracted using a detergent-containing elution buffer. Following detergent removal, concentrated extract is separated by SDS-PAGE (EMSA-2DE), followed by in-gel trypsin digestion and HPLC-nanoESI-MS/MS analysis, or the concentrated extract is separated by two-dimensional gel electrophoresis (EMSA-3DE), followed by southwestern or western blot analysis to localize DNA binding proteins on blot which are further identified by on-blot trypsin digestion and HPLC-nanoESI-MS/MS analysis. Finally, the identified DNA binding proteins are further validated by EMSA-immunoblotting or EMSA antibody supershift assay. This approach is used to purify and identify GFP-C/EBP fusion protein from bacterial crude extract, as well as purifying AP1 and CEBP DNA binding proteins from a human embryonic kidney cell line (HEK293) nuclear extract. AP1 components, c-Jun, Jun-D, c-Fos, CREB, ATF1 and ATF2 were successfully identified from 1.5 mg of nuclear extract (equivalent to 3×10^7 HEK293 cells) with AP1 binding activity of 750 fmol.
In conclusion, this new strategy of combining EMSA with additional dimensions of electrophoresis and using southwestern blotting for detection proves to be a valuable approach in the identification of transcriptional complexes by proteomic methods.
Affiliation(s)
- Daifeng Jiang
- Department of Chemistry, University of Texas San Antonio, San Antonio, TX 28249
- Yinshan Jia
- Department of Chemistry, University of Texas San Antonio, San Antonio, TX 28249
- Harry W. Jarrett
- Department of Chemistry, University of Texas San Antonio, San Antonio, TX 28249
|
34
|
Pajkos M, Mészáros B, Simon I, Dosztányi Z. Is there a biological cost of protein disorder? Analysis of cancer-associated mutations. Molecular BioSystems 2011; 8:296-307. [PMID: 21918772 DOI: 10.1039/c1mb05246b] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Indexed: 12/11/2022]
Abstract
As many diseases can be traced back to altered protein function, studying the effect of genetic variations at the level of proteins can provide a clue to understand how changes at the DNA level lead to various diseases. Cellular processes rely not only on proteins with well-defined structure but can also involve intrinsically disordered proteins (IDPs) that exist as highly flexible ensembles of conformations. Disordered proteins are mostly involved in signaling and regulatory processes, and their functional repertoire largely complements that of globular proteins. However, it was also suggested that protein disorder entails an increased biological cost. This notion was supported by a set of individual IDPs involved in various diseases, especially in cancer, and the increased amount of disorder observed among disease-associated proteins. In this work, we tested if there is any biological risk associated with protein disorder at the level of single nucleotide mutations. Specifically, we analyzed the distribution of mutations within ordered and disordered segments. Our results demonstrated that while neutral polymorphisms were more likely to occur within disordered segments, cancer-associated mutations had a preference for ordered regions. Additionally, we proposed an alternative explanation for the association of protein disorder and the involvement in cancer with the consideration of functional annotations. Individual examples also suggested that although disordered segments are fundamental functional elements, their presence is not necessarily accompanied with an increased mutation rate in cancer. The presented study can help to understand how the different structural properties of proteins influence the consequences of genetic mutations.
Affiliation(s)
- Mátyás Pajkos
- Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
|
35
|
Zia A, Moses AM. Ranking insertion, deletion and nonsense mutations based on their effect on genetic information. BMC Bioinformatics 2011; 12:299. [PMID: 21781308 PMCID: PMC3155974 DOI: 10.1186/1471-2105-12-299] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 03/24/2011] [Accepted: 07/22/2011] [Indexed: 11/10/2022] Open
Abstract
Background Genetic variations contribute to normal phenotypic differences as well as diseases, and new sequencing technologies are greatly increasing the capacity to identify these variations. Given the large number of variations now being discovered, computational methods to prioritize the functional importance of genetic variations are of growing interest. Thus far, the focus of computational tools has been mainly on the prediction of the effects of amino acid changing single nucleotide polymorphisms (SNPs) and little attention has been paid to indels or nonsense SNPs that result in premature stop codons. Results We propose computational methods to rank insertion-deletion mutations in the coding as well as non-coding regions and nonsense mutations. We rank these variations by measuring the extent of their effect on biological function, based on the assumption that evolutionary conservation reflects function. Using sequence data from budding yeast and human, we show that variations that we predict to have larger effects segregate at significantly lower allele frequencies, and occur less frequently than expected by chance, indicating stronger purifying selection. Furthermore, we find that insertions, deletions and premature stop codons associated with disease in the human have significantly larger predicted effects than those not associated with disease. Interestingly, the large-effect mutations associated with disease show a similar distribution of predicted effects to that expected for completely random mutations. Conclusions This demonstrates that the evolutionary conservation context of the sequences that harbour insertions, deletions and nonsense mutations can be used to predict and rank the effects of the mutations.
Affiliation(s)
- Amin Zia
- Department of Cell & Systems Biology, University of Toronto, 25 Willcocks Street, Toronto, Ontario, M5S 3B2, Canada
|
36
|
Bonifaci N, Górski B, Masojć B, Wokołorczyk D, Jakubowska A, Dębniak T, Berenguer A, Serra Musach J, Brunet J, Dopazo J, Narod SA, Lubiński J, Lázaro C, Cybulski C, Pujana MA. Exploring the link between germline and somatic genetic alterations in breast carcinogenesis. PLoS One 2010; 5:e14078. [PMID: 21124932 PMCID: PMC2989917 DOI: 10.1371/journal.pone.0014078] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 07/02/2010] [Accepted: 11/02/2010] [Indexed: 12/19/2022] Open
Abstract
Recent genome-wide association studies (GWASs) have identified candidate genes contributing to cancer risk through low-penetrance mutations. Many of these genes were unexpected and, intriguingly, included well-known players in carcinogenesis at the somatic level. To assess the hypothesis of a germline-somatic link in carcinogenesis, we evaluated the distribution of somatic gene labels within the ordered results of a breast cancer risk GWAS. This analysis suggested frequent influence on risk of genetic variation in loci encoding for "driver kinases" (i.e., kinases encoded by genes that showed higher somatic mutation rates than expected by chance and, therefore, whose deregulation may contribute to cancer development and/or progression). Assessment of these predictions using a population-based case-control study in Poland replicated the association for rs3732568 in EPHB1 (odds ratio (OR) = 0.79; 95% confidence interval (CI): 0.63-0.98; P(trend) = 0.031). Analyses by early age at diagnosis and by estrogen receptor α (ERα) tumor status indicated potential associations for rs6852678 in CDKL2 (OR = 0.32, 95% CI: 0.10-1.00; P(recessive) = 0.044) and rs10878640 in DYRK2 (OR = 2.39, 95% CI: 1.32-4.30; P(dominant) = 0.003), and for rs12765929, rs9836340, rs4707795 in BMPR1A, EPHA3 and EPHA7, respectively (ERα tumor status P(interaction)<0.05). The identification of three novel candidates as EPH receptor genes might indicate a link between perturbed compartmentalization of early neoplastic lesions and breast cancer risk and progression. Together, these data may lay the foundations for replication in additional populations and could potentially increase our knowledge of the underlying molecular mechanisms of breast carcinogenesis.
Affiliation(s)
- Núria Bonifaci
- Biomarkers and Susceptibility Unit, Spanish Biomedical Research Centre Network for Epidemiology and Public Health, Catalan Institute of Oncology, L'Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), L'Hospitalet, Barcelona, Spain
- Bohdan Górski
- Department of Genetics and Pathology, International Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Bartlomiej Masojć
- Department of Genetics and Pathology, International Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Dominika Wokołorczyk
- Department of Genetics and Pathology, International Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Anna Jakubowska
- Department of Genetics and Pathology, International Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Tadeusz Dębniak
- Department of Genetics and Pathology, International Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Antoni Berenguer
- Biomarkers and Susceptibility Unit, Spanish Biomedical Research Centre Network for Epidemiology and Public Health, Catalan Institute of Oncology, L'Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), L'Hospitalet, Barcelona, Spain
- Jordi Serra Musach
- Biomarkers and Susceptibility Unit, Spanish Biomedical Research Centre Network for Epidemiology and Public Health, Catalan Institute of Oncology, L'Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), L'Hospitalet, Barcelona, Spain
- Joan Brunet
- Hereditary Cancer Programme, Catalan Institute of Oncology, IdIBGi, Girona, Spain
- Joaquín Dopazo
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe, Functional Genomics Node and Spanish Biomedical Research Centre Network for Rare Diseases, Valencia, Spain
- Steven A. Narod
- Womens College Research Institute, University of Toronto and Women's College Hospital, Toronto, Ontario, Canada
- Jan Lubiński
- Department of Genetics and Pathology, International Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Conxi Lázaro
- Hereditary Cancer Programme, Catalan Institute of Oncology, IDIBELL, L'Hospitalet, Barcelona, Spain
- Cezary Cybulski
- Department of Genetics and Pathology, International Hereditary Cancer Center, Pomeranian Medical University, Szczecin, Poland
- Miguel Angel Pujana
- Biomarkers and Susceptibility Unit, Spanish Biomedical Research Centre Network for Epidemiology and Public Health, Catalan Institute of Oncology, L'Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), L'Hospitalet, Barcelona, Spain
- Translational Research Laboratory, Catalan Institute of Oncology, IDIBELL, L'Hospitalet, Barcelona, Spain
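The abstract above reports associations as odds ratios (OR) with 95% confidence intervals. For readers unfamiliar with how such figures arise from case-control counts, the sketch below computes an OR and a Wald-type interval on the log scale from a 2×2 table. The counts are invented for illustration and are not taken from the study, and the study's own estimates may come from a different model (for example, adjusted logistic regression), so this is only an assumption-laden approximation of the idea.

```python
import math


def odds_ratio_with_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 case-control table.

    z=1.96 gives an approximate 95% interval. All four counts must be positive.
    """
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )
    # Standard error of log(OR) is the square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_carriers
        + 1 / case_noncarriers
        + 1 / control_carriers
        + 1 / control_noncarriers
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level, which is how figures such as "OR = 0.79; 95% CI: 0.63-0.98" are usually read.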
|
37
|
Li YH, Dong MQ, Guo Z. Systematic analysis and prediction of longevity genes in Caenorhabditis elegans. Mech Ageing Dev 2010; 131:700-9. [DOI: 10.1016/j.mad.2010.10.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 06/18/2010] [Revised: 09/14/2010] [Accepted: 10/01/2010] [Indexed: 10/19/2022]
|
38
|
Neklason DW, Tuohy TM, Stevens J, Otterud B, Baird L, Kerber RA, Samowitz WS, Kuwada SK, Leppert MF, Burt RW. Colorectal adenomas and cancer link to chromosome 13q22.1-13q31.3 in a large family with excess colorectal cancer. J Med Genet 2010; 47:692-9. [PMID: 20522424 PMCID: PMC3050714 DOI: 10.1136/jmg.2009.076091] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 12/18/2022]
Abstract
BACKGROUND Colorectal cancer is the fourth most common type of cancer and the second most common cause of cancer death. Fewer than 5% of colon cancers arise in the presence of a clear hereditary cancer condition; however, current estimates suggest that an additional 15-25% of colorectal cancers arise on the basis of unknown inherited factors. AIM To identify additional genetic factors responsible for colon cancer. METHODS A large kindred with excess colorectal cancer was identified through the Utah Population Database and evaluated clinically and genetically for inherited susceptibility. RESULTS A major genetic locus segregating with colonic polyps and cancer in this kindred was identified on chromosome 13q with a non-parametric linkage score of 24 (LOD score of 2.99 and p=0.001). The genetic region spans 21 Mbp and contains 27 RefSeq genes. Sequencing of all candidate genes in this region failed to identify a clearly deleterious mutation; however, polymorphisms segregating with the phenotype were identified. Chromosome 13q is commonly gained and overexpressed in colon cancers and correlates with metastasis, suggesting the presence of an important cancer progression gene. Evaluation of tumours from the kindred revealed a gain of 13q as well. CONCLUSIONS This identified region may contain a novel gene responsible for colon cancer progression in a significant proportion of sporadic cancers. Identification of the precise gene and causative genetic change in the kindred will be an important next step to understanding cancer progression and metastasis.
Affiliation(s)
- Deborah W Neklason
- Department of Oncological Sciences, University of Utah, Salt Lake City, Utah, USA.
|
39
|
Helwa R, Hoheisel JD. Analysis of DNA–protein interactions: from nitrocellulose filter binding assays to microarray studies. Anal Bioanal Chem 2010; 398:2551-61. [DOI: 10.1007/s00216-010-4096-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 07/09/2010] [Accepted: 08/03/2010] [Indexed: 10/19/2022]
|
40
|
Zhu J, Xiao H, Shen X, Wang J, Zou J, Zhang L, Yang D, Ma W, Yao C, Gong X, Zhang M, Zhang Y, Guo Z. Viewing cancer genes from co-evolving gene modules. Bioinformatics 2010; 26:919-24. [DOI: 10.1093/bioinformatics/btq055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
|
41
|
Gong X, Wu R, Zhang Y, Zhao W, Cheng L, Gu Y, Zhang L, Wang J, Zhu J, Guo Z. Extracting consistent knowledge from highly inconsistent cancer gene data sources. BMC Bioinformatics 2010; 11:76. [PMID: 20137077 PMCID: PMC2832783 DOI: 10.1186/1471-2105-11-76] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 09/15/2009] [Accepted: 02/05/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Hundreds of genes that are causally implicated in oncogenesis have been found and collected in various databases. For efficient application of these abundant but diverse data sources, it is of fundamental importance to evaluate their consistency. RESULTS First, we showed that the lists of cancer genes from some major data sources were highly inconsistent in terms of overlapping genes. In particular, most cancer genes accumulated in previous small-scale studies could not be rediscovered in current high-throughput genome screening studies. Then, based on a metric proposed in this study, we showed that most cancer gene lists from different data sources were highly functionally consistent. Finally, we extracted functionally consistent cancer genes from various data sources and collected them in our database F-Census. CONCLUSIONS Although they have very low gene overlapping, most cancer gene data sources are highly consistent at the functional level, which indicates that they can separately capture partial genes in a few key pathways associated with cancer. Our results suggest that the sample sizes currently used for cancer studies might be inadequate for consistently capturing individual cancer genes, but could be sufficient for finding a number of cancer genes that could represent functionally most cancer genes. The F-Census database provides biologists with a useful tool for browsing and extracting functionally consistent cancer genes from various data sources.
Affiliation(s)
- Xue Gong
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150086, China
|
42
|
Kar G, Gursoy A, Keskin O. Human cancer protein-protein interaction network: a structural perspective. PLoS Comput Biol 2009; 5:e1000601. [PMID: 20011507 PMCID: PMC2785480 DOI: 10.1371/journal.pcbi.1000601] [Citation(s) in RCA: 144] [Impact Index Per Article: 9.6] [Received: 06/03/2009] [Accepted: 11/05/2009] [Indexed: 01/12/2023] Open
Abstract
Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with an accuracy of 71%, 67%, 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). 
We illustrate the interface related affinity properties of two cancer-related hub proteins: Erbb3, a multi interface, and Raf1, a single interface hub. The results reveal that affinity of interactions of the multi-interface hub tends to be higher than that of the single-interface hub. These findings might be important in obtaining new targets in cancer as well as finding the details of specific binding regions of putative cancer drug candidates. Protein-protein interaction networks provide a global picture of cellular function and biological processes. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. The structural details of interfaces are immensely useful in efforts to answer some fundamental questions such as: (i) what features of cancer-related protein interfaces make them act as hubs; (ii) how hub protein interfaces can interact with tens of other proteins with varying affinities; and (iii) which interactions can occur simultaneously and which are mutually exclusive. Addressing these questions, we propose a method to characterize interactions in a human protein-protein interaction network using three-dimensional protein structures and interfaces. Protein interface analysis shows that the strength and specificity of the interactions of hub proteins and cancer proteins are different than the interactions of non-hub and non-cancer proteins, respectively. In addition, distinguishing overlapping from non-overlapping interfaces, we illustrate how a fourth dimension, that of the sequence of processes, is integrated into the network with case studies. We believe that such an approach should be useful in structural systems biology.
Affiliation(s)
- Gozde Kar
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumeli Feneri Yolu, Sariyer Istanbul, Turkey
|
43
|
Hegyi H, Buday L, Tompa P. Intrinsic structural disorder confers cellular viability on oncogenic fusion proteins. PLoS Comput Biol 2009; 5:e1000552. [PMID: 19888473 PMCID: PMC2768585 DOI: 10.1371/journal.pcbi.1000552] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Received: 02/05/2009] [Accepted: 09/30/2009] [Indexed: 12/22/2022] Open
Abstract
Chromosomal translocations, which often generate chimeric proteins by fusing segments of two distinct genes, represent the single major genetic aberration leading to cancer. We suggest that the unifying theme of these events is a high level of intrinsic structural disorder, enabling fusion proteins to evade cellular surveillance mechanisms that eliminate misfolded proteins. Predictions in 406 translocation-related human proteins show that they are significantly enriched in disorder (43.3% vs. 20.7% in all human proteins), they have fewer Pfam domains, and their translocation breakpoints tend to avoid domain splitting. The vicinity of the breakpoint is significantly more disordered than the rest of these already highly disordered fusion proteins. In the unlikely event of domain splitting in fusion it usually spares much of the domain or splits at locations where the newly exposed hydrophobic surface area approximates that of an intact domain. The mechanisms of action of fusion proteins suggest that in most cases their structural disorder is also essential to the acquired oncogenic function, enabling the long-range structural communication of remote binding and/or catalytic elements. In this respect, there are three major mechanisms that contribute to generating an oncogenic signal: (i) a phosphorylation site and a tyrosine-kinase domain are fused, and structural disorder of the intervening region enables intramolecular phosphorylation (e.g., BCR-ABL); (ii) a dimerisation domain fuses with a tyrosine kinase domain and disorder enables the two subunits within the homodimer to engage in permanent intermolecular phosphorylations (e.g., TFG-ALK); (iii) the fusion of a DNA-binding element to a transactivator domain results in an aberrant transcription factor that causes severe misregulation of transcription (e.g. EWS-ATF). Our findings also suggest novel strategies of intervention against the ensuing neoplastic transformations. 
Chromosomal translocations generate chimeric proteins by fusing segments of two distinct genes and are frequently associated with cancer. The proteins involved are large and fairly heterogeneous in sequence and typically have only a few dispersed structural domains connected by long uncharacterized regions. It has never been studied from a structural perspective how these chimeras survive losing significant portions of the original proteins and acquire new oncogenic functions. By analyzing a collection of 406 human translocation proteins we show here that the answer to both questions lies to a large extent in the high level of structural disorder in the fusion partner proteins (on average, they are twice as disordered as all human proteins). The translocation breakpoints usually avoid globular domains. In the rare cases when a globular domain is truncated by the fusion, the split occurs at a location in the domain where the hydrophobicity it exposes is favorable (i.e., not too high). Disorder is, on average, significantly higher in the vicinity of the breakpoint than in the rest of the fusion proteins. Disorder also plays a pivotal role in the acquired oncogenic function by bringing distant or disparate fusion segments together, which enables novel intra- and/or intermolecular interactions.
Affiliation(s)
- Hedi Hegyi
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
- László Buday
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
- Department of Medical Chemistry, Semmelweis University Medical School, Budapest, Hungary
- Peter Tompa
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
|
44
|
Li L, Zhang K, Lee J, Cordes S, Davis DP, Tang Z. Discovering cancer genes by integrating network and functional properties. BMC Med Genomics 2009; 2:61. [PMID: 19765316 PMCID: PMC2758898 DOI: 10.1186/1755-8794-2-61] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 10/14/2008] [Accepted: 09/19/2009] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: Identification of novel cancer-causing genes is one of the main goals in cancer research. The rapid accumulation of genome-wide protein-protein interaction (PPI) data in humans has provided a new basis for studying the topological features of cancer genes in cellular networks. It is important to integrate multiple genomic data sources, including PPI networks, protein domains and Gene Ontology (GO) annotations, to facilitate the identification of cancer genes.
METHODS: Topological features of the PPI network, as well as protein domain compositions, enrichment of GO categories, and sequence and evolutionary conservation features, were extracted and compared between cancer genes and other genes. The predictive power of various classifiers for identification of cancer genes was evaluated by cross validation. Experimental validation of a subset of the prediction results was conducted using siRNA knockdown and viability assays in the human colon cancer cell line DLD-1.
RESULTS: Cross validation demonstrated advantageous performance of classifiers based on support vector machines (SVMs) with the inclusion of the topological features from the PPI network, protein domain compositions and GO annotations. We then applied the trained SVM classifier to human genes to prioritize putative cancer genes. siRNA knockdown of several SVM-predicted cancer genes greatly reduced cell viability in the human colon cancer cell line DLD-1.
CONCLUSION: Topological features of PPI networks, protein domain compositions and GO annotations are good predictors of cancer genes. The SVM classifier integrates multiple features and as such is useful for prioritizing candidate cancer genes for experimental validation.
Affiliation(s)
- Li Li
- Department of Bioinformatics, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080, USA.
|
45
|
A census of human transcription factors: function, expression and evolution. Nat Rev Genet 2009; 10:252-63. [PMID: 19274049 DOI: 10.1038/nrg2538] [Citation(s) in RCA: 1095] [Impact Index Per Article: 73.0] [Indexed: 12/18/2022]
Abstract
Transcription factors are key cellular components that control gene expression: their activities determine how cells function and respond to the environment. Currently, there is great interest in research into human transcriptional regulation. However, surprisingly little is known about these regulators themselves. For example, how many transcription factors does the human genome contain? How are they expressed in different tissues? Are they evolutionarily conserved? Here, we present an analysis of 1,391 manually curated sequence-specific DNA-binding transcription factors, their functions, genomic organization and evolutionary conservation. Much remains to be explored, but this study provides a solid foundation for future investigations to elucidate regulatory mechanisms underlying diverse mammalian biological processes.
|
46
|
From cancer genomes to cancer models: bridging the gaps. EMBO Rep 2009; 10:359-66. [PMID: 19305388 DOI: 10.1038/embor.2009.46] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 11/13/2008] [Accepted: 02/23/2009] [Indexed: 11/08/2022] Open
Abstract
Cancer genome projects are now being expanded in an attempt to provide complete landscapes of the mutations that exist in tumours. Although the importance of cataloguing genome variations is well recognized, there are obvious difficulties in bridging the gaps between high-throughput resequencing information and the molecular mechanisms of cancer evolution. Here, we describe the current status of the high-throughput genomic technologies, and the current limitations of the associated computational analysis and experimental validation of cancer genetic variants. We emphasize how the current cancer-evolution models will be influenced by the high-throughput approaches, in particular through efforts devoted to monitoring tumour progression, and how, in turn, the integration of data and models will be translated into mechanistic knowledge and clinical applications.
|
47
|
Care M, Bradford J, Needham C, Bulpitt A, Westhead D. Combining the interactome and deleterious SNP predictions to improve disease gene identification. Hum Mutat 2009; 30:485-92. [DOI: 10.1002/humu.20917] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/09/2022]
|
48
|
Neklason DW, Kerber RA, Nilson DB, Anton-Culver H, Schwartz AG, Griffin CA, Lowery JT, Schildkraut JM, Evans JP, Tomlinson GE, Strong LC, Miller AR, Stopfer JE, Finkelstein DM, Nadkarni PM, Kasten CH, Mineau GP, Burt RW. Common familial colorectal cancer linked to chromosome 7q31: a genome-wide analysis. Cancer Res 2008; 68:8993-7. [PMID: 18974144 PMCID: PMC2927856 DOI: 10.1158/0008-5472.can-08-1376] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 12/28/2022]
Abstract
Present investigations suggest that approximately 30% of colorectal cancer cases arise on the basis of inherited factors. We hypothesize that the majority of inherited factors are moderately penetrant genes, common in the population. We use an affected sibling pair approach to identify genetic regions that are coinherited by siblings with colorectal cancer. Individuals from families with at least two siblings diagnosed with colorectal adenocarcinoma or high-grade dysplasia were enrolled. Known familial colorectal cancer syndromes were excluded. A genome-wide scan on 151 DNA samples from 70 kindreds was completed using the deCODE 1100 short tandem repeat marker set at an average 4-cM density. Fine mapping on a total of 184 DNAs from 83 kindreds was done in regions suggesting linkage. Linkage analysis was accomplished with the Merlin analysis package. Nonparametric linkage analysis revealed three genetic regions with logarithm of the odds (LOD) scores ≥2.0: Ch. 3q29, LOD 2.61 (P = 0.0003); Ch. 4q31.3, LOD 2.13 (P = 0.0009); and Ch. 7q31.31, LOD 3.08 (P = 0.00008). Affected siblings with increased sharing at the 7q31 locus have a 3.8-year (± 3.5) earlier age of colorectal cancer onset, although this is not statistically significant (P = 0.11). No significant linkage was found near genes causing known syndromes or regions previously reported (8q24, 9q22, and 11q23). The chromosome 3q21-q24 region reported to be linked in colorectal cancer relative pairs is supported by our study, albeit as a minor peak (LOD 0.9; P = 0.02). No known familial cancer genes reside in the 7q31 locus, and thus the identified region may contain a novel susceptibility gene responsible for common familial colorectal cancer.
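The pairing of LOD scores and P values in this abstract can be sanity-checked with a short sketch. It assumes the common large-sample approximation in which 2·ln(10)·LOD follows a chi-square distribution with one degree of freedom and the test is one-sided; the abstract does not say how its P values were derived, so this is an illustration and not the authors' procedure. The function name `lod_to_pvalue` is hypothetical.

```python
import math


def lod_to_pvalue(lod: float) -> float:
    """Approximate one-sided p-value for a LOD score.

    Assumption (not stated in the abstract): 2 * ln(10) * LOD is treated as
    chi-square with 1 degree of freedom, and the one-sided p-value is half
    of the chi-square tail probability.
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative for this approximation")
    chi2 = 2.0 * math.log(10.0) * lod
    # Tail probability of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    tail = math.erfc(math.sqrt(chi2 / 2.0))
    return 0.5 * tail


if __name__ == "__main__":
    # LOD scores quoted in the abstract above.
    for region, lod in [("3q29", 2.61), ("4q31.3", 2.13), ("7q31.31", 3.08)]:
        print(f"Ch. {region}: LOD {lod:.2f} -> approx. P = {lod_to_pvalue(lod):.1e}")
```

Under these assumptions the three computed values agree, after rounding, with the P values quoted for 3q29, 4q31.3 and 7q31.31.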
Affiliation(s)
- Deborah W Neklason
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112-5550, USA.
|
49
|
Furney SJ, Calvo B, Larrañaga P, Lozano JA, Lopez-Bigas N. Prioritization of candidate cancer genes--an aid to oncogenomic studies. Nucleic Acids Res 2008; 36:e115. [PMID: 18710882 PMCID: PMC2566894 DOI: 10.1093/nar/gkn482] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022] Open
Abstract
Techniques for oncogenomic analyses, such as array comparative genomic hybridization, messenger RNA expression arrays and mutational screens, have come to the fore in modern cancer research. Studies utilizing these techniques are able to highlight panels of genes that are altered in cancer. However, these candidate cancer genes must then be scrutinized to reveal whether they contribute to oncogenesis or are coincidental and non-causative. We present a computational method for the prioritization of candidate (i) proto-oncogenes and (ii) tumour suppressor genes from oncogenomic experiments. We constructed computational classifiers using different combinations of sequence and functional data including sequence conservation, protein domains and interactions, and regulatory data. We found that these classifiers are able to distinguish between known cancer genes and other human genes. Furthermore, the classifiers also discriminate candidate cancer genes from a recent mutational screen from other human genes. We provide a web-based facility through which cancer biologists may access our results and we propose computational cancer gene classification as a useful method of prioritizing candidate cancer genes identified in oncogenomic studies.
Affiliation(s)
- Simon J Furney
- Research Unit on Biomedical Informatics, Experimental and Health Science Department, Universitat Pompeu Fabra, Barcelona 08080, Spain
|
50
|
Li Y, Guo Z, Peng C, Liu Q, Ma W, Wang J, Yao C, Zhang M, Zhu J. Identifying cancer genes from cancer mutation profiles by cancer functions. Sci China C Life Sci 2008; 51:569-74. [PMID: 18488178 DOI: 10.1007/s11427-008-0072-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/13/2008] [Accepted: 04/09/2008] [Indexed: 11/27/2022]
Abstract
It is of great importance to identify new cancer genes from the data of large-scale genome screenings of gene mutations in cancers. Considering that alterations of some essential functions are indispensable for oncogenesis, we define them as cancer functions and select, as their approximations, a group of detailed functions in GO (Gene Ontology) highly enriched with known cancer genes. To evaluate the efficiency of using cancer functions as features to identify cancer genes, we define, among the screened genes, the known protein kinase cancer genes as gold standard positives and the other kinase genes as gold standard negatives. The results show that cancer-associated functions are more efficient in identifying cancer genes than the selection pressure feature. Furthermore, combining cancer functions with the number of non-silent mutations can generate more reliable positive predictions. Finally, with a precision of 0.42, we suggest a list of 46 kinase genes as candidate cancer genes which are annotated to cancer functions and carry at least 3 non-silent mutations.
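To make the closing figures concrete, the sketch below shows how precision is computed from gold-standard counts and what a precision of 0.42 would imply for a list of 46 candidates. The counts passed to `precision` are hypothetical, chosen only to reproduce 0.42, and the expected-hit estimate assumes the gold-standard precision carries over to the candidate list, which the abstract does not claim. Both function names are illustrative.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of positive predictions that are gold-standard positives."""
    predicted = true_positives + false_positives
    if predicted == 0:
        raise ValueError("precision is undefined when nothing is predicted positive")
    return true_positives / predicted


def expected_true_positives(precision_value: float, n_predicted: int) -> float:
    """Expected number of true hits among n_predicted candidates.

    Assumption: the precision measured on the gold standard also applies
    to the candidate list.
    """
    if not 0.0 <= precision_value <= 1.0:
        raise ValueError("precision must lie in [0, 1]")
    return precision_value * n_predicted


if __name__ == "__main__":
    # Hypothetical gold-standard counts that give a precision of 0.42.
    print(precision(true_positives=21, false_positives=29))
    # Precision and list size quoted in the abstract above.
    print(expected_true_positives(0.42, 46))
```

Read this way, roughly 19 of the 46 suggested kinase genes would be expected to be true cancer genes, provided the stated assumption holds.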
Affiliation(s)
- YanHui Li
- Bioinformatics Centre, School of Life Science, University of Electronic Science and Technology of China, Chengdu 610054, China
|