1
|
Singh AK, Talseth-Palmer B, Xavier A, Scott RJ, Drabløs F, Sjursen W. Detection of germline variants with pathogenic potential in 48 patients with familial colorectal cancer by using whole exome sequencing. BMC Med Genomics 2023; 16:126. [PMID: 37296477 PMCID: PMC10257304 DOI: 10.1186/s12920-023-01562-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Accepted: 05/30/2023] [Indexed: 06/12/2023] Open
Abstract
BACKGROUND Hereditary genetic mutations causing predisposition to colorectal cancer account for approximately 30% of all colorectal cancer cases. However, only a small fraction of these are high-penetrant mutations occurring in DNA mismatch repair genes, causing one of several types of familial colorectal cancer (CRC) syndromes. Most of the mutations are low-penetrant variants, contributing to an increased risk of familial colorectal cancer, and they are often found in additional genes and pathways not previously associated with CRC. The aim of this study was to identify such variants, both high-penetrant and low-penetrant ones. METHODS We performed whole exome sequencing on constitutional DNA extracted from blood of 48 patients suspected of familial colorectal cancer and used multiple in silico prediction tools and available literature-based evidence to detect and investigate genetic variants. RESULTS We identified several causative and some potentially causative germline variants in genes known for their association with colorectal cancer. In addition, we identified several variants in genes not typically included in relevant gene panels for colorectal cancer, including CFTR, PABPC1 and TYRO3, which may be associated with an increased risk for cancer. CONCLUSIONS Identification of variants in additional genes that can potentially be associated with familial colorectal cancer indicates a larger genetic spectrum of this disease, not limited only to mismatch repair genes. Use of multiple in silico tools based on different methods and combined through a consensus approach increases the sensitivity of predictions and narrows down a large list of variants to the ones that are most likely to be significant.
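The consensus idea mentioned in the conclusions can be illustrated with a minimal sketch. It assumes a simple majority vote over per-tool calls; the tool names (`tool_a`, `tool_b`, `tool_c`), the `min_fraction` threshold, and the function `consensus_call` are hypothetical and are not taken from the paper, which may combine predictions differently.

```python
def consensus_call(predictions, min_fraction=0.5):
    """Return True if more than `min_fraction` of the tools that gave a
    prediction call the variant damaging.

    predictions: dict mapping a (hypothetical) tool name to True (damaging),
    False (tolerated), or None (tool produced no prediction).
    """
    votes = [call for call in predictions.values() if call is not None]
    if not votes:
        # No tool could score the variant, so there is no consensus.
        return False
    return sum(votes) / len(votes) > min_fraction


# Hypothetical example: two of three tools call the variant damaging.
variant_calls = {"tool_a": True, "tool_b": True, "tool_c": False}
is_candidate = consensus_call(variant_calls)
```

Tools that return no prediction are ignored rather than counted as "tolerated", so a variant scored by only a few tools is judged on those tools alone; whether that is appropriate depends on the filtering strategy actually used.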
Affiliation(s)
- Ashish Kumar Singh
  - Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Bente Talseth-Palmer
  - School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
  - Møre and Romsdal Hospital Trust, Research Unit, Ålesund, Norway
  - NSW Health Pathology, Newcastle, Australia
- Alexandre Xavier
  - School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
- Rodney J Scott
  - School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
  - NSW Health Pathology, Newcastle, Australia
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Wenche Sjursen
  - Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
|
2
|
Rocque MJ, Leipart V, Kumar Singh A, Mur P, Olsen MF, Engebretsen LF, Martin-Ramos E, Aligué R, Sætrom P, Valle L, Drabløs F, Otterlei M, Sjursen W. Characterization of POLE c.1373A > T p.(Tyr458Phe), causing high cancer risk. Mol Genet Genomics 2023; 298:555-566. [PMID: 36856825 PMCID: PMC10133059 DOI: 10.1007/s00438-023-02000-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Accepted: 02/15/2023] [Indexed: 03/02/2023]
Abstract
The cancer syndrome polymerase proofreading-associated polyposis (PPAP) results from germline mutations in the POLE and POLD1 genes. Mutations in the exonuclease domain of these genes are associated with hyper- and ultra-mutated tumors with a predominance of base substitutions resulting from faulty proofreading during DNA replication. When a new variant is identified by gene testing of POLE and POLD1, it is important to verify whether the variant is associated with PPAP or not, to guide genetic counseling of mutation carriers. In 2015, we reported the likely pathogenic (class 4) germline POLE c.1373A > T p.(Tyr458Phe) variant and we have now characterized this variant to verify that it is a class 5 pathogenic variant. For this purpose, we investigated (1) mutator phenotype in tumors from two carriers, (2) mutation frequency in cell-based mutagenesis assays, and (3) structural consequences based on protein modeling. Whole-exome sequencing of two tumors identified an ultra-mutator phenotype with a predominance of base substitutions, the majority of which are C > T. A SupF mutagenesis assay revealed increased mutation frequency in cells overexpressing the variant of interest as well as in isogenic cells encoding the variant. Moreover, a yeast-based exonuclease repair assay supported a defect in proofreading activity. Lastly, we present a homology model of human POLE to demonstrate the structural consequences leading to the pathogenic impact of the p.(Tyr458Phe) mutation. The three lines of evidence, taken together with updated co-segregation and previously published data, allow the germline variant POLE c.1373A > T p.(Tyr458Phe) to be reclassified as a class 5 variant, meaning that the variant is associated with PPAP.
Affiliation(s)
- Mariève J Rocque
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
  - Department of Medical Genetics, St. Olavs Hospital, 7030, Trondheim, Norway
- Vilde Leipart
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, NMBU, 1432, Ås, Norway
- Ashish Kumar Singh
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
  - Department of Medical Genetics, St. Olavs Hospital, 7030, Trondheim, Norway
- Pilar Mur
  - Hereditary Cancer Program, Catalan Institute of Oncology, Oncobell Program, Bellvitge Biomedical Research Institute (IDIBELL), Hospitalet de Llobregat, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Maren F Olsen
  - Department of Medical Genetics, St. Olavs Hospital, 7030, Trondheim, Norway
- Lars F Engebretsen
  - Department of Medical Genetics, St. Olavs Hospital, 7030, Trondheim, Norway
- Edgar Martin-Ramos
  - Department of Biomedical Sciences, School of Medicine, University of Barcelona, IDIBAPS, Barcelona, Spain
- Rosa Aligué
  - Department of Biomedical Sciences, School of Medicine, University of Barcelona, IDIBAPS, Barcelona, Spain
- Pål Sætrom
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
  - Department of Computer and Information Science, NTNU - Norwegian University of Science and Technology, 7491, Trondheim, Norway
  - Bioinformatics Core Facility - BioCore, NTNU - Norwegian University of Science and Technology, 7491, Trondheim, Norway
  - K.G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
- Laura Valle
  - Hereditary Cancer Program, Catalan Institute of Oncology, Oncobell Program, Bellvitge Biomedical Research Institute (IDIBELL), Hospitalet de Llobregat, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
- Marit Otterlei
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
- Wenche Sjursen
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, 7030, Trondheim, Norway
  - Department of Medical Genetics, St. Olavs Hospital, 7030, Trondheim, Norway
|
3
|
Leipart V, Enger Ø, Turcu DC, Dobrovolska O, Drabløs F, Halskau Ø, Amdam GV. Resolving the zinc binding capacity of honey bee vitellogenin and locating its putative binding sites. Insect Mol Biol 2022; 31:810-820. [PMID: 36054587 PMCID: PMC9804912 DOI: 10.1111/imb.12807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Accepted: 08/08/2022] [Indexed: 06/15/2023]
Abstract
The protein vitellogenin (Vg) plays a central role in lipid transportation in most egg-laying animals. High Vg levels correlate with stress resistance and lifespan potential in honey bees (Apis mellifera). Vg is the primary circulating zinc-carrying protein in honey bees. Zinc is an essential metal ion in numerous biological processes, including the function and structure of many proteins. Measurements of Zn2+ suggest a variable number of ions per Vg molecule in different animal species, but the molecular implications of zinc binding by this protein are not well understood. We used inductively coupled plasma mass spectrometry to determine that, on average, each honey bee Vg molecule binds 3 Zn2+ ions. Our full-length protein structure and sequence analysis revealed seven potential zinc-binding sites. These are located in the β-barrel and α-helical subdomains of the N-terminal domain, the lipid binding site, and the cysteine-rich C-terminal region of unknown function. Interestingly, two potential zinc-binding sites in the β-barrel can support a proposed role for this structure in DNA binding. Overall, our findings suggest that honey bee Vg binds zinc at several functional regions, indicating that Zn2+ ions are important for many of the activities of this protein. In addition to being potentially relevant for other egg-laying species, these insights provide a platform for studies of metal ions in bee health, which is of global interest due to recent declines in pollinator numbers.
Affiliation(s)
- Vilde Leipart
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Aas, Norway
- Øyvind Enger
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Aas, Norway
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Øyvind Halskau
  - Department of Biological Sciences, University of Bergen, Bergen, Norway
- Gro V. Amdam
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Aas, Norway
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
|
4
|
Marakulina D, Vorontsov IE, Kulakovskiy IV, Lennartsson A, Drabløs F, Medvedeva Y. EpiFactors 2022: expansion and enhancement of a curated database of human epigenetic factors and complexes. Nucleic Acids Res 2022; 51:D564-D570. [PMID: 36350659 PMCID: PMC9825597 DOI: 10.1093/nar/gkac989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2022] [Revised: 09/30/2022] [Accepted: 10/24/2022] [Indexed: 11/11/2022] Open
Abstract
We present an update of EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets, and products, which is openly accessible at http://epifactors.autosome.org. The updated version of EpiFactors contains information on 902 proteins, including 101 histones and protamines, and, as a main update, a newly curated collection of 124 lncRNAs involved in epigenetic regulation. The number of publications concerning the role of lncRNA in epigenetics is rapidly growing. Yet a resource that compiles, integrates, organizes, and presents curated information on lncRNAs in epigenetics has been missing. EpiFactors fills this gap and provides data on epigenetic regulators in an accessible and user-friendly form. For 820 of the genes in EpiFactors, we include expression estimates across multiple cell types assessed by CAGE-Seq in the FANTOM5 project. In addition, the updated EpiFactors contains information on 73 protein complexes involved in epigenetic regulation. Our resource is practical for a wide range of users, including biologists, bioinformaticians and molecular/systems biologists.
Affiliation(s)
- Daria Marakulina
  - Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, 141701, Dolgoprudny, Moscow Region, Russia
- Ilya E Vorontsov
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
- Ivan V Kulakovskiy
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino 142290, Russia
- Andreas Lennartsson
  - Department of Biosciences and Nutrition, NEO, Karolinska Institutet, 14157, Huddinge, Sweden
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, PO Box 8905, NO-7491 Trondheim, Norway
|
5
|
Rise K, Tessem MB, Drabløs F, Rye MB. FunHoP analysis reveals upregulation of mitochondrial genes in prostate cancer. PLoS One 2022; 17:e0275621. [PMID: 36282866 PMCID: PMC9595552 DOI: 10.1371/journal.pone.0275621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Accepted: 09/20/2022] [Indexed: 11/19/2022] Open
Abstract
Mitochondrial activity in cancer cells has been central to cancer research since Otto Warburg first published his thesis on the topic in 1956. Although Warburg proposed that oxidative phosphorylation in the tricarboxylic acid (TCA) cycle was perturbed in cancer, later research has shown that oxidative phosphorylation is activated in most cancers, including prostate cancer (PCa). However, more detailed knowledge on mitochondrial metabolism and metabolic pathways in cancers is still lacking. In this study we expand our previously developed method for analyzing functional homologous proteins (FunHoP), which can provide a more detailed view of metabolic pathways. FunHoP uses results from differential expression analysis of RNA-Seq data to improve pathway analysis. By adding information on subcellular localization based on experimental data and computational predictions we can use FunHoP to differentiate between mitochondrial and non-mitochondrial processes in cancerous and normal prostate cell lines. Our results show that mitochondrial pathways are upregulated in PCa and that splitting metabolic pathways into mitochondrial and non-mitochondrial counterparts using FunHoP adds to the interpretation of the metabolic properties of PCa cells.
Affiliation(s)
- Kjersti Rise
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - * E-mail: (MBR); (KR)
- May-Britt Tessem
  - Department of Circulation and Medical Imaging, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Morten Beck Rye
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway
  - Clinic of Laboratory Medicine, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway
  - BioCore - Bioinformatics Core Facility, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - * E-mail: (MBR); (KR)
|
6
|
Rye MB, Krossa S, Hall M, van Mourik C, Bathen TF, Drabløs F, Tessem MB, Bertilsson H. The genes controlling normal function of citrate and spermine secretion are lost in aggressive prostate cancer and prostate model systems. iScience 2022; 25:104451. [PMID: 35707723 PMCID: PMC9189124 DOI: 10.1016/j.isci.2022.104451] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/04/2022] [Revised: 04/12/2022] [Accepted: 05/17/2022] [Indexed: 11/22/2022] Open
Abstract
High secretion of the metabolites citrate and spermine is a unique hallmark of normal prostate epithelial cells, and is reduced in aggressive prostate cancer. However, the identity of the genes controlling this biological process is mostly unknown. In this study, we have created a gene signature of 150 genes connected to citrate and spermine secretion in the prostate. We have computationally integrated metabolic measurements with multiple transcriptomics datasets from the public domain, including 3826 tissue samples from prostate and prostate cancer. The accuracy of the signature is validated by its unique enrichment in prostate samples and prostate epithelial tissue compartments. The signature highlights the genes AZGP1 and ANPEP, as well as metallothioneins with zinc-binding properties not previously studied in the prostate, and the expression of these genes is reduced in more aggressive cancer lesions. However, the absence of signature enrichment in common prostate model systems can make it challenging to study these genes mechanistically.
- Novel 150 gene signature reflecting prostatic citrate and spermine secretion
- Identified several zinc-binding proteins not previously investigated in the prostate
- The signature is absent in prostate model systems
Affiliation(s)
- Morten Beck Rye
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, 7491 Trondheim, Norway
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
  - Clinic of Laboratory Medicine, St. Olavs Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
  - BioCore - Bioinformatics Core Facility, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, 7491 Trondheim, Norway
- Sebastian Krossa
  - Department of Circulation and Medical Imaging, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Martina Hall
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - K. G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Casper van Mourik
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, 7491 Trondheim, Norway
  - Institute for Life Science & Technology, Hanze University of Applied Sciences, 9747 AS Groningen, the Netherlands
- Tone F Bathen
  - Department of Circulation and Medical Imaging, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, 7491 Trondheim, Norway
- May-Britt Tessem
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
  - Department of Circulation and Medical Imaging, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Helena Bertilsson
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, 7491 Trondheim, Norway
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
|
7
|
Rise K, Tessem MB, Drabløs F, Rye MB. FunHoP: Enhanced Visualization and Analysis of Functionally Homologous Proteins in Complex Metabolic Networks. Genomics Proteomics Bioinformatics 2021; 19:848-859. [PMID: 33741524 PMCID: PMC9170767 DOI: 10.1016/j.gpb.2021.03.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/03/2018] [Revised: 05/08/2019] [Accepted: 08/18/2019] [Indexed: 11/28/2022]
Abstract
Cytoscape is often used for visualization and analysis of metabolic pathways. For example, based on KEGG data, a reader for KEGG Markup Language (KGML) is used to load files into Cytoscape. However, although multiple genes can be responsible for the same reaction, the KGML reader KEGGScape only presents the first listed gene in a network node for a given reaction. This can lead to incorrect interpretations of the pathways. Our new method, FunHoP, shows all possible genes in each node, making the pathways more complete. FunHoP collapses all genes in a node into one measurement using read counts from RNA-seq. Assuming that activity for an enzymatic reaction mainly depends upon the gene with the highest number of reads, and weighting the reads on gene length and ratio, a new expression value is calculated for the node as a whole. Differential expression at node level is then applied to the networks. Using prostate cancer as a model, we integrate RNA-seq data from two patient cohorts with metabolism data from the literature. Here we show that FunHoP gives more consistent pathways that are easier to interpret biologically. Code and documentation for running FunHoP can be found at https://github.com/kjerstirise/FunHoP.
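The abstract does not give the exact weighting formula, so the following is only an illustrative sketch of how several genes in one node might be collapsed into a single value. It assumes reads are first normalized to reads per kilobase of gene length and then weighted by each gene's share of the node total, so that the most highly expressed gene dominates. The function name `collapse_node` and this particular formula are assumptions, not the published FunHoP implementation (see the linked repository for that).

```python
def collapse_node(read_counts, gene_lengths):
    """Collapse all genes in one pathway node into a single expression value.

    read_counts:  dict gene -> raw RNA-seq read count
    gene_lengths: dict gene -> gene length in bases
    Assumed scheme: length-normalize each gene, then weight each gene by its
    fraction of the node's total normalized reads.
    """
    # Reads per kilobase, so long genes are not favored by length alone.
    rpk = {g: read_counts[g] / (gene_lengths[g] / 1000.0) for g in read_counts}
    total = sum(rpk.values())
    if total == 0:
        return 0.0
    # Each gene contributes in proportion to its share of the node total.
    return sum(value * (value / total) for value in rpk.values())


# Hypothetical node with two functionally homologous genes of equal length.
node_value = collapse_node({"GENE_A": 900, "GENE_B": 100},
                           {"GENE_A": 1000, "GENE_B": 1000})
```

With these made-up counts the dominant gene accounts for most of the node value, which matches the stated assumption that reaction activity mainly depends on the gene with the most reads.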
Affiliation(s)
- Kjersti Rise
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, Trondheim NO-7491, Norway
- May-Britt Tessem
  - Department of Circulation and Medical Imaging, NTNU - Norwegian University of Science and Technology, Trondheim NO-7491, Norway
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, Trondheim NO-7491, Norway
- Morten B Rye
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, Trondheim NO-7491, Norway
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, Trondheim NO-7491, Norway
|
8
|
Singh AK, Olsen MF, Lavik LAS, Vold T, Drabløs F, Sjursen W. Detecting copy number variation in next generation sequencing data from diagnostic gene panels. BMC Med Genomics 2021; 14:214. [PMID: 34465341 PMCID: PMC8406611 DOI: 10.1186/s12920-021-01059-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/14/2021] [Accepted: 08/16/2021] [Indexed: 01/21/2023] Open
Abstract
Background Detection of copy number variation (CNV) in genes associated with disease is important in genetic diagnostics, and next generation sequencing (NGS) technology provides data that can be used for CNV detection. However, CNV detection based on NGS data is in general not often used in diagnostic labs as the data analysis is challenging, especially with data from targeted gene panels. Wet lab methods like MLPA (MRC Holland) are widely used, but are expensive, time consuming and have gene-specific limitations. Our aim has been to develop a bioinformatic tool for CNV detection from NGS data in medical genetic diagnostic samples. Results Our computational pipeline for detection of CNVs in NGS data from targeted gene panels utilizes coverage depth of the captured regions and calculates a copy number ratio score for each region. This is computed by comparing the mean coverage of the sample with the mean coverage of the same region in other samples, defined as a pool. The pipeline selects pools for comparison dynamically from previously sequenced samples, using the pool with an average coverage depth that is nearest to that of the sample. A sliding window-based approach is used to analyze each region, where the length of the sliding window and the sliding distance can be chosen dynamically to increase or decrease the resolution. This helps in detecting CNVs in small or partial exons. With this pipeline we have correctly identified the CNVs in 36 positive control samples, with sensitivity of 100% and specificity of 91%. We have detected whole gene level deletion/duplication, single/multi exonic level deletion/duplication, partial exonic deletion and mosaic deletion. Since its implementation in mid-2018 it has proven its diagnostic value with more than 45 CNV findings in routine tests. Conclusions With this pipeline as part of our diagnostic practices it is now possible to detect partial, single or multi-exonic, and intragenic CNVs in all genes in our target panel. This has helped our diagnostic lab to expand the portfolio of genes where we offer CNV detection, which previously was limited by the availability of MLPA kits. Supplementary Information The online version contains supplementary material available at 10.1186/s12920-021-01059-x.
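The coverage-ratio and sliding-window steps described in the Results can be sketched roughly as follows. This is a simplified illustration, not the authors' pipeline: the function names, the per-base coverage lists, and the way the last window is aligned to the region end are all assumptions, and real pipelines typically add normalization and statistical thresholds that the abstract does not describe.

```python
from statistics import mean


def nearest_pool(sample_depth, pool_depths):
    """Pick the pool whose average coverage depth is closest to the sample's.

    pool_depths: dict pool name -> average coverage depth (hypothetical input).
    """
    return min(pool_depths, key=lambda name: abs(pool_depths[name] - sample_depth))


def sliding_windows(length, window, step):
    """Yield (start, end) index pairs covering a region of `length` bases."""
    if length <= window:
        yield 0, length
        return
    start = 0
    while start + window < length:
        yield start, start + window
        start += step
    # Final window is aligned to the region end so no bases are skipped.
    yield length - window, length


def ratio_scores(sample_cov, pool_covs, window, step):
    """Copy number ratio per window: sample mean coverage / pool mean coverage."""
    scores = []
    for start, end in sliding_windows(len(sample_cov), window, step):
        sample_mean = mean(sample_cov[start:end])
        pool_mean = mean(mean(cov[start:end]) for cov in pool_covs)
        ratio = sample_mean / pool_mean if pool_mean else float("nan")
        scores.append((start, end, ratio))
    return scores


# Made-up 20-base region: the second half has half the coverage of the pool,
# as expected for a heterozygous deletion covering part of an exon.
sample = [100] * 10 + [50] * 10
pool = [[100] * 20, [100] * 20]
scores = ratio_scores(sample, pool, window=10, step=5)
```

A smaller `window` and `step` give finer resolution at the cost of noisier ratios, which is the trade-off the abstract refers to when it says these values can be chosen dynamically.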
Affiliation(s)
- Ashish Kumar Singh
  - Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Trine Vold
  - Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Wenche Sjursen
  - Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
|
9
|
Gundersen S, Boddu S, Capella-Gutierrez S, Drabløs F, Fernández JM, Kompova R, Taylor K, Titov D, Zerbino D, Hovig E. Recommendations for the FAIRification of genomic track metadata. F1000Res 2021; 10. [PMID: 34249331 PMCID: PMC8226415 DOI: 10.12688/f1000research.28449.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 03/17/2021] [Indexed: 01/25/2023] Open
Abstract
Background: Many types of data from genomic analyses can be represented as genomic tracks, i.e. features linked to the genomic coordinates of a reference genome. Examples of such data are epigenetic DNA methylation data, ChIP-seq peaks, germline or somatic DNA variants, as well as RNA-seq expression levels. Researchers often face difficulties in locating, accessing and combining relevant tracks from external sources, as well as locating the raw data, reducing the value of the generated information. Description of work: We propose to advance the application of FAIR data principles (Findable, Accessible, Interoperable, and Reusable) to produce searchable metadata for genomic tracks. Findability and Accessibility of metadata can then be ensured by a track search service that integrates globally identifiable metadata from various track hubs in the Track Hub Registry and other relevant repositories. Interoperability and Reusability need to be ensured by the specification and implementation of a basic set of recommendations for metadata. We have tested this concept by developing such a specification in a JSON Schema, called FAIRtracks, and have integrated it into a novel track search service, called TrackFind. We demonstrate practical usage by importing datasets through TrackFind into existing examples of relevant analytical tools for genomic tracks: EPICO and the GSuite HyperBrowser. Conclusion: We here provide a first iteration of a draft standard for genomic track metadata, as well as the accompanying software ecosystem. It can easily be adapted or extended to future needs of the research community regarding data, methods and tools, balancing the requirements of both data submitters and analytical end-users.
Affiliation(s)
- Sanjay Boddu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- José M Fernández
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Radmila Kompova
  - Center for Bioinformatics, University of Oslo (UiO), Oslo, Norway
- Kieron Taylor
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Dmytro Titov
  - Center for Bioinformatics, University of Oslo (UiO), Oslo, Norway
- Daniel Zerbino
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Eivind Hovig
  - Center for Bioinformatics, University of Oslo (UiO), Oslo, Norway
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital (OUH), Oslo, Norway
|
10
|
Abstract
The k-Nearest Neighbor (kNN) classifier represents a simple and very general approach to classification. Still, the performance of kNN classifiers can often compete with more complex machine-learning algorithms. The core of kNN depends on a "guilt by association" principle where classification is performed by measuring the similarity between a query and a set of training patterns, often computed as distances. The relative performance of kNN classifiers is closely linked to the choice of distance or similarity measure, and it is therefore relevant to investigate the effect of using different distance measures when comparing biomedical data. In this study on classification of cancer data sets, we have used both common and novel distance measures, including the novel distance measures Sobolev and Fisher, and we have evaluated the performance of kNN with these distances on four cancer data sets of different types. We find that the performance when using the novel distance measures is comparable to the performance with more well-established measures, in particular for the Sobolev distance. We define a robust ranking of all the distance measures according to overall performance. Several distance measures show robust performance in kNN over several data sets, in particular the Hassanat, Sobolev, and Manhattan measures. Some of the other measures show good performance on selected data sets but seem to be more sensitive to the nature of the classification data. It is therefore important to benchmark distance measures on similar data prior to classification to identify the most suitable measure in each case.
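To make the role of the distance measure concrete, here is a minimal kNN sketch in which the distance is a pluggable function. The Manhattan form is standard; the Hassanat form follows the commonly cited per-dimension definition and should be checked against the original publication before use. The Sobolev and Fisher measures are not shown because the abstract does not define them. The function names and the toy data are illustrative assumptions, not material from the study.

```python
from collections import Counter


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hassanat(a, b):
    """Hassanat distance (commonly cited form): each dimension contributes a
    bounded value in [0, 1), which limits the influence of outliers."""
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        shift = abs(lo) if lo < 0 else 0.0  # shift negative values to zero
        total += 1.0 - (1.0 + lo + shift) / (1.0 + hi + shift)
    return total


def knn_predict(query, train_x, train_y, k, distance):
    """Majority vote among the k training patterns closest to the query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: distance(query, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature data set with made-up values and labels.
train_x = [(0, 0), (0, 1), (5, 5), (6, 5)]
train_y = ["normal", "normal", "tumor", "tumor"]
label_manhattan = knn_predict((1, 0), train_x, train_y, k=3, distance=manhattan)
label_hassanat = knn_predict((1, 0), train_x, train_y, k=3, distance=hassanat)
```

Benchmarking in the sense of the abstract would amount to running the same classifier with each candidate `distance` function under cross-validation on data resembling the target data set and comparing the resulting accuracies.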
Affiliation(s)
- Rezvan Ehsani
- Department of Mathematics, University of Zabol, Zabol, Iran
- Department of Bioinformatics, University of Zabol, Zabol, Iran
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, NTNU – Norwegian University of Science and Technology, Trondheim, Norway
11
|
Zwiggelaar RT, Lindholm HT, Fosslie M, Terndrup Pedersen M, Ohta Y, Díez-Sánchez A, Martín-Alonso M, Ostrop J, Matano M, Parmar N, Kvaløy E, Spanjers RR, Nazmi K, Rye M, Drabløs F, Arrowsmith C, Arne Dahl J, Jensen KB, Sato T, Oudhoff MJ. LSD1 represses a neonatal/reparative gene program in adult intestinal epithelium. Sci Adv 2020; 6:6/37/eabc0367. [PMID: 32917713 PMCID: PMC7486101 DOI: 10.1126/sciadv.abc0367] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/02/2020] [Accepted: 07/29/2020] [Indexed: 05/08/2023]
Abstract
Intestinal epithelial homeostasis is maintained by adult intestinal stem cells, which, alongside Paneth cells, appear after birth in the neonatal period. We aimed to identify regulators of neonatal intestinal epithelial development by testing a small library of epigenetic modifier inhibitors in Paneth cell-skewed organoid cultures. We found that lysine-specific demethylase 1A (Kdm1a/Lsd1) is absolutely required for Paneth cell differentiation. Lsd1-deficient crypts, devoid of Paneth cells, are still able to form organoids without a requirement of exogenous or endogenous Wnt. Mechanistically, we find that LSD1 enzymatically represses genes that are normally expressed only in fetal and neonatal epithelium. This gene profile is similar to what is seen in repairing epithelium, and we find that Lsd1-deficient epithelium has superior regenerative capacities after irradiation injury. In summary, we found an important regulator of neonatal intestinal development and identified a druggable target to reprogram intestinal epithelium toward a reparative state.
Affiliation(s)
- Rosalie T Zwiggelaar
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Håvard T Lindholm
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Madeleine Fosslie
- Department of Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027 Oslo, Norway
| | - Marianne Terndrup Pedersen
- BRIC-Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Stem Cell Biology, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2200 Copenhagen N, Denmark
| | - Yuki Ohta
- Department of Gastroenterology, Keio University School of Medicine, Tokyo 160-8582, Japan
- Department of Organoid Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
| | - Alberto Díez-Sánchez
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Mara Martín-Alonso
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Jenny Ostrop
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Mami Matano
- Department of Gastroenterology, Keio University School of Medicine, Tokyo 160-8582, Japan
- Department of Organoid Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
| | - Naveen Parmar
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Emilie Kvaløy
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Roos R Spanjers
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Kamran Nazmi
- Department of Oral Biochemistry, Academic Centre for Dentistry (ACTA), 1081LA Amsterdam, Netherlands
| | - Morten Rye
- Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Clinic of Surgery, St. Olav's Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
- Clinic of Laboratory Medicine, St. Olavs Hospital, Trondheim University Hospital, NO-7030 Trondheim, Norway
- BioCore-Bioinformatics Core Facility, NTNU-Norwegian University of Science and Technology, NO-7491, Trondheim, Norway
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Cheryl Arrowsmith
- Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 2M9, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
| | - John Arne Dahl
- Department of Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027 Oslo, Norway
| | - Kim B Jensen
- BRIC-Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Stem Cell Biology, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2200 Copenhagen N, Denmark
| | - Toshiro Sato
- Department of Gastroenterology, Keio University School of Medicine, Tokyo 160-8582, Japan
- Department of Organoid Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
| | - Menno J Oudhoff
- CEMIR-Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, NTNU-Norwegian University of Science and Technology, 7491 Trondheim, Norway.
12
|
Singh AK, Talseth-Palmer B, McPhillips M, Lavik LAS, Xavier A, Drabløs F, Sjursen W. Targeted sequencing of genes associated with the mismatch repair pathway in patients with endometrial cancer. PLoS One 2020; 15:e0235613. [PMID: 32634176 PMCID: PMC7340288 DOI: 10.1371/journal.pone.0235613] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/31/2019] [Accepted: 06/19/2020] [Indexed: 01/28/2023] Open
Abstract
Germline variants inactivating the mismatch repair (MMR) genes MLH1, MSH2, MSH6 and PMS2 cause Lynch syndrome, which confers an increased cancer risk, with colon and endometrial cancer being the most frequent. Identification of these pathogenic variants is important for finding endometrial cancer patients with an inherited increased risk of new cancers, so that they can be offered lifesaving surveillance. However, several other genes are also part of the MMR pathway. It is therefore relevant to search for variants in additional genes that may be associated with cancer risk by including all known genes involved in the MMR pathway. Next-generation sequencing was used to screen 22 genes involved in the MMR pathway in constitutional DNA extracted from whole blood from 199 unselected endometrial cancer patients. Bioinformatic pipelines were developed for identification and functional annotation of variants, using several different software tools and custom programs. This facilitated identification of 22 exonic, 4 UTR and 9 intronic variants that could be classified according to pathogenicity. This study has identified several germline variants in genes of the MMR pathway that may be associated with an increased risk for cancer, in particular endometrial cancer, and therefore are relevant for further investigation. We have also developed bioinformatics strategies to analyse targeted sequencing data, including low quality data and genomic regions outside of the protein coding exons of the relevant genes.
Affiliation(s)
- Ashish Kumar Singh
- Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU—Norwegian University of Science and Technology, Trondheim, Norway
| | - Bente Talseth-Palmer
- Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
- School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
- Department of Research and Development, Møre og Romsdal Hospital Trust, Molde, Norway
| | - Mary McPhillips
- NSW Health Pathology, Molecular Medicine, John Hunter Hospital, Newcastle, NSW, Australia
| | | | - Alexandre Xavier
- School of Biomedical Science and Pharmacy, Faculty of Health and Medicine, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU—Norwegian University of Science and Technology, Trondheim, Norway
| | - Wenche Sjursen
- Department of Medical Genetics, St. Olavs Hospital, Trondheim, Norway
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, NTNU—Norwegian University of Science and Technology, Trondheim, Norway
13
|
Vardaxis I, Drabløs F, Rye MB, Lindqvist BH. MACPET: model-based analysis for ChIA-PET. Biostatistics 2020; 21:625-639. [PMID: 30698663 PMCID: PMC7308020 DOI: 10.1093/biostatistics/kxy084] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/18/2018] [Revised: 12/13/2018] [Accepted: 12/16/2018] [Indexed: 11/16/2022] Open
Abstract
We present model-based analysis for ChIA-PET (MACPET), which analyzes paired-end read sequences provided by ChIA-PET for finding binding sites of a protein of interest. MACPET uses information from both tags of each PET and searches for binding sites in a two-dimensional space, while taking into account different noise levels in different genomic regions. MACPET shows favorable results compared with MACS in terms of motif occurrence and spatial resolution. Furthermore, significant binding sites discovered by MACPET are involved in a higher number of significant three-dimensional interactions than those discovered by MACS. MACPET is freely available on Bioconductor.
Key Words: ChIA-PET; MACPET; Model-based clustering; Paired-end tags; Peak-calling algorithm.
Affiliation(s)
- Ioannis Vardaxis
- Department of Mathematical Sciences, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
| | - Morten B Rye
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, N-7491 Trondheim, Norway and Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, N-7030 Trondheim, Norway
| | - Bo Henry Lindqvist
- Department of Mathematical Sciences, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
14
|
Abstract
Background Diseases like cancer lead to changes in gene expression, and it is relevant to identify key regulatory genes that can be linked directly to these changes. This can be done by computing a Regulatory Impact Factor (RIF) score for relevant regulators. However, this computation is based on estimating correlated patterns of gene expression, often with Pearson correlation, and on an assumption about a set of specific regulators, normally transcription factors. This study explores alternative measures of correlation, using the Fisher and Sobolev metrics, and an extended set of regulators, including epigenetic regulators and long non-coding RNAs (lncRNAs). Data on prostate cancer have been used to explore the effect of these modifications. Results A tool for computation of RIF scores with alternative correlation measures and extended sets of regulators was developed and tested on gene expression data for prostate cancer. The study showed that the Fisher and Sobolev metrics lead to improved identification of well-documented regulators of gene expression in prostate cancer, and the sets of identified key regulators showed improved overlap with previously defined gene sets of relevance to cancer. The extended set of regulators led to identification of several interesting candidates for further studies, including lncRNAs. Several key processes were identified as important, including spindle assembly and the epithelial-mesenchymal transition (EMT). Conclusions The study has shown that using alternative metrics of correlation can improve the performance of tools based on correlation of gene expression in genomic data. The Fisher and Sobolev metrics should also be considered in other correlation-based applications.
Affiliation(s)
- Rezvan Ehsani
- Department of Mathematics, University of Zabol, Zabol, Iran
- Department of Bioinformatics, University of Zabol, Zabol, Iran
| | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, NTNU - Norwegian University of Science and Technology, NO-7491, Trondheim, Norway.
15
|
Rauluseviciute I, Drabløs F, Rye MB. DNA hypermethylation associated with upregulated gene expression in prostate cancer demonstrates the diversity of epigenetic regulation. BMC Med Genomics 2020; 13:6. [PMID: 31914996 PMCID: PMC6950795 DOI: 10.1186/s12920-020-0657-6] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 04/04/2019] [Accepted: 12/31/2019] [Indexed: 01/15/2023] Open
Abstract
Background Prostate cancer (PCa) has the highest incidence rate among cancers in men in western countries. Unlike several other types of cancer, PCa has few genetic drivers, which has led researchers to look for additional epigenetic and transcriptomic contributors to PCa development and progression. In particular, DNA methylation, the most commonly studied epigenetic marker, has recently been measured and analysed in several PCa patient cohorts. DNA methylation is most commonly associated with downregulation of gene expression. However, positive associations of DNA methylation to gene expression have also been reported, suggesting a more diverse mechanism of epigenetic regulation. Such additional complexity could have important implications for understanding prostate cancer development but has not been studied at a genome-wide scale. Results In this study, we have compared three sets of genome-wide single-site DNA methylation data from 870 PCa and normal tissue samples with multi-cohort gene expression data from 1117 samples, including 532 samples where DNA methylation and gene expression have been measured on the exact same samples. Genes were classified according to their corresponding methylation and expression profiles. A large group of hypermethylated genes was robustly associated with increased gene expression (UPUP group) in all three methylation datasets. These genes demonstrated distinct patterns of correlation between DNA methylation and gene expression compared to the genes showing the canonical negative association between methylation and expression (UPDOWN group). This indicates a more diversified role of DNA methylation in regulating gene expression than previously appreciated. Moreover, UPUP and UPDOWN genes were associated with different compartments: UPUP genes were related to structures in the nucleus, while UPDOWN genes were linked to extracellular features.
Conclusion We identified a robust association between hypermethylation and upregulation of gene expression when comparing samples from prostate cancer and normal tissue. These results challenge the classical view where DNA methylation is always associated with suppression of gene expression, which underlines the importance of considering corresponding expression data when assessing the downstream regulatory effect of DNA methylation.
Affiliation(s)
- Ieva Rauluseviciute
- Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway.
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway
| | - Morten Beck Rye
- Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway.,Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, NO-7030, Trondheim, Norway
16
|
Rauluseviciute I, Drabløs F, Rye MB. DNA methylation data by sequencing: experimental approaches and recommendations for tools and pipelines for data analysis. Clin Epigenetics 2019; 11:193. [PMID: 31831061 PMCID: PMC6909609 DOI: 10.1186/s13148-019-0795-x] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 10/21/2019] [Accepted: 12/04/2019] [Indexed: 02/06/2023] Open
Abstract
Sequencing technologies have changed not only our approaches to classical genetics, but also the field of epigenetics. Specific methods allow scientists to identify novel genome-wide epigenetic patterns of DNA methylation down to single-nucleotide resolution. DNA methylation is the most researched epigenetic mark involved in various processes in the human cell, including gene regulation and development of diseases, such as cancer. Increasing numbers of DNA methylation sequencing datasets from the human genome are produced using various platforms, from methylated DNA precipitation to whole-genome bisulfite sequencing. Many of those datasets are fully accessible for repeated analyses. Sequencing experiments have become routine in laboratories around the world, while analysis of the resulting data is still a challenge for the majority of scientists, since in many cases it requires advanced computational skills. Even though various tools are being created and published, guidelines for their selection are often not clear, especially to non-bioinformaticians with limited experience in computational analyses. Separate tools are often used for individual steps in the analysis, and these can be challenging to manage and integrate. However, in some instances, tools are combined into pipelines that are capable of completing all the essential steps to achieve the result. In the case of DNA methylation sequencing analysis, the goal of such a pipeline is to map sequencing reads, calculate methylation levels, and distinguish differentially methylated positions and/or regions. The objective of this review is to describe basic principles and steps in the analysis of DNA methylation sequencing data that in particular have been used for mammalian genomes, and more importantly to present and discuss the most prominent computational pipelines that can be used to analyze such data.
We aim to provide a good starting point for scientists with limited experience in computational analyses of DNA methylation and hydroxymethylation data, and recommend a few tools that are powerful, but still easy enough to use for their own data analysis.
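The last two pipeline steps named in the abstract, calculating methylation levels and flagging differentially methylated positions, can be sketched as below. This is a simplified illustration, not one of the reviewed pipelines: it assumes reads are already mapped and counted per position, it defines the methylation level as the fraction of methylated reads, and it replaces the statistical tests used by real tools with a plain difference threshold. The function names, cutoffs and counts are hypothetical.

```python
def methylation_level(methylated, unmethylated):
    """Fraction of reads supporting methylation at one position (None if uncovered)."""
    total = methylated + unmethylated
    return methylated / total if total else None


def differential_positions(sample_a, sample_b, min_coverage=10, min_delta=0.25):
    """Positions whose methylation level differs by at least min_delta.

    Each sample maps position -> (methylated_reads, unmethylated_reads).
    Positions with fewer than min_coverage reads in either sample are skipped.
    """
    hits = []
    for pos in sorted(sample_a.keys() & sample_b.keys()):
        (ma, ua), (mb, ub) = sample_a[pos], sample_b[pos]
        if ma + ua < min_coverage or mb + ub < min_coverage:
            continue
        delta = methylation_level(ma, ua) - methylation_level(mb, ub)
        if abs(delta) >= min_delta:
            hits.append((pos, round(delta, 3)))
    return hits


# Hypothetical read counts at three positions.
tumor = {101: (18, 2), 205: (5, 15), 310: (3, 1)}
normal = {101: (4, 16), 205: (6, 14), 310: (0, 4)}

print(differential_positions(tumor, normal))
```

In this toy input, position 310 is excluded by the coverage filter even though its levels differ strongly, which is the kind of low-depth artefact a coverage cutoff is meant to guard against.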
Affiliation(s)
- Ieva Rauluseviciute
- Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway.
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway
| | - Morten Beck Rye
- Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway.,Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, NO-7030, Trondheim, Norway
17
|
Abstract
BACKGROUND Almost 16,000 human long non-coding RNA (lncRNA) genes have been identified in the GENCODE project. However, the function of most of them remains to be discovered. The function of lncRNAs and other novel genes can be predicted by identifying significantly enriched annotation terms in already annotated genes that are co-expressed with the lncRNAs. However, such approaches are sensitive to the methods that are used to estimate the level of co-expression. RESULTS We have tested and compared two well-known statistical metrics (Pearson and Spearman) and two geometrical metrics (Sobolev and Fisher) for identification of the co-expressed genes, using experimental expression data across 19 normal human tissues. We have also used a benchmarking approach based on semantic similarity to evaluate how well these methods are able to predict annotation terms, using a well-annotated set of protein-coding genes. CONCLUSION This work shows that geometrical metrics, in particular in combination with the statistical metrics, will predict annotation terms more efficiently than traditional approaches. Tests on selected lncRNAs confirm that it is possible to predict the function of these genes given a reliable set of expression data. The software used for this investigation is freely available.
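To make the co-expression step concrete, the sketch below computes the two statistical metrics named in the abstract, Pearson and Spearman, and selects the annotated genes whose expression profile across tissues correlates with that of a query lncRNA. It is an assumption-laden illustration rather than the published software: the helper names, the 0.9 threshold and the five-tissue profiles are hypothetical, and the geometrical Sobolev and Fisher metrics are omitted because the abstract does not define them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def coexpressed(target, profiles, metric=pearson, threshold=0.9):
    """Names of genes whose profile reaches the threshold against the target."""
    return sorted(g for g, p in profiles.items() if metric(target, p) >= threshold)


# Hypothetical expression values across five tissues.
lncrna = [1, 2, 3, 4, 5]
profiles = {
    "geneA": [2, 4, 6, 8, 10],          # linear relationship
    "geneB": [1, 1.1, 1.2, 1.3, 100],   # monotone but strongly non-linear
    "geneC": [5, 4, 3, 2, 1],           # anti-correlated
}

print(coexpressed(lncrna, profiles, metric=pearson))
print(coexpressed(lncrna, profiles, metric=spearman))
```

The toy profile geneB is monotone but far from linear, so it passes the Spearman threshold and fails the Pearson one. This illustrates the abstract's point that the set of co-expressed genes, and hence the predicted annotation, depends on the metric.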
Affiliation(s)
- Rezvan Ehsani
- Department of Mathematics, University of Zabol, Zabol, Iran
- Department of Bioinformatics, University of Zabol, Zabol, Iran
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, NTNU - Norwegian University of Science and Technology, NO-7491, Trondheim, Norway.
18
|
Tekle KM, Gundersen S, Klepper K, Bongo LA, Raknes IA, Li X, Zhang W, Andreetta C, Mulugeta TD, Kalaš M, Rye MB, Hjerde E, Antony Samy JK, Fornous G, Azab A, Våge DI, Hovig E, Willassen NP, Drabløs F, Nygård S, Petersen K, Jonassen I. Norwegian e-Infrastructure for Life Sciences (NeLS). F1000Res 2018; 7:ELIXIR-968. [PMID: 30271575 PMCID: PMC6137412 DOI: 10.12688/f1000research.15119.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Accepted: 06/13/2018] [Indexed: 12/26/2022] Open
Abstract
The Norwegian e-Infrastructure for Life Sciences (NeLS) has been developed by ELIXIR Norway to provide its users with a system enabling data storage, sharing, and analysis in a project-oriented fashion. The system is available through easy-to-use web interfaces, including the Galaxy workbench for data analysis and workflow execution. Users confident with a command-line interface and programming may also access it through Secure Shell (SSH) and application programming interfaces (APIs). NeLS has been in production since 2015, with training and support provided by the help desk of ELIXIR Norway. Through collaboration with NorSeq, the national consortium for high-throughput sequencing, an integrated service is offered so that sequencing data generated in a research project is provided to the involved researchers through NeLS. Sensitive data, such as individual genomic sequencing data, are handled using the TSD (Services for Sensitive Data) platform provided by Sigma2 and the University of Oslo. NeLS integrates national e-infrastructure storage and computing resources, and is also integrated with the SEEK platform in order to store large data files produced by experiments described in SEEK. In this article, we outline the architecture of NeLS and discuss possible directions for further development.
Affiliation(s)
- Kidane M. Tekle
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
| | | | - Kjetil Klepper
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
| | - Lars Ailo Bongo
- University of Tromsø - The Arctic University of Norway, Tromsø, Norway
| | | | - Xiaxi Li
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
| | - Wei Zhang
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
| | - Christian Andreetta
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
| | - Teshome Dagne Mulugeta
- Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
| | - Matúš Kalaš
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
| | - Morten B. Rye
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
| | - Erik Hjerde
- University of Tromsø - The Arctic University of Norway, Tromsø, Norway
| | - Jeevan Karloss Antony Samy
- Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
| | | | | | - Dag Inge Våge
- Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
| | | | | | - Finn Drabløs
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
| | | | - Kjell Petersen
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
| | - Inge Jonassen
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
19
|
Aas CG, Drabløs F, Haugum K, Afset JE. Comparative Transcriptome Profiling Reveals a Potential Role of Type VI Secretion System and Fimbriae in Virulence of Non-O157 Shiga Toxin-Producing Escherichia coli. Front Microbiol 2018; 9:1416. [PMID: 30008706 PMCID: PMC6033998 DOI: 10.3389/fmicb.2018.01416] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/06/2018] [Accepted: 06/08/2018] [Indexed: 01/18/2023] Open
Abstract
Shiga toxin-producing Escherichia coli (STEC) cause both sporadic infections and outbreaks of enteric disease in humans, with symptoms ranging from asymptomatic carriage to severe disease like haemolytic uremic syndrome (HUS). Bacterial virulence factors like subtypes of the Shiga toxin (Stx) and the locus of enterocyte effacement (LEE) pathogenicity island, as well as host factors like young age, are strongly associated with development of HUS. However, these factors alone do not accurately differentiate between strains that cause HUS and those that do not cause severe disease, which is important in the context of diagnosis, treatment, as well as infection control. We have used RNA sequencing to compare transcriptomes of 30 stx2a and eae positive STEC strains of non-O157 serogroups isolated from children <5 years of age. The strains were from children with HUS (HUS group, n = 15), and children with asymptomatic or mild disease (non-HUS group, n = 15), either induced with mitomycin C or non-induced, to reveal potential differences in gene expression levels between groups. When the HUS and non-HUS group were compared for differential expression of protein-encoding gene families, 399 of 6,119 gene families were differentially expressed (log2 fold change ≥ 1, FDR < 0.05) in the non-induced condition, whereas only one gene family was differentially expressed in the induced condition. Gene ontology and cluster analysis showed that several fimbrial operons, as well as a putative type VI secretion system (T6SS) were more highly expressed in the HUS group than in the non-HUS group, indicating a role of these in the virulence of STEC strains causing severe disease.
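The selection criterion quoted in the abstract (log2 fold change ≥ 1, FDR < 0.05) can be illustrated with the short sketch below. It is not the analysis code from the study: it assumes the fold-change cutoff applies to the absolute value, it assumes the FDR is obtained by Benjamini-Hochberg adjustment of per-family p-values (the abstract does not name the method), and the function names and numbers are hypothetical.

```python
from math import log2


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(records, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Names passing both the fold-change and the FDR cutoff.

    records is a list of (name, mean_group_a, mean_group_b, pvalue) tuples.
    """
    fdr = benjamini_hochberg([r[3] for r in records])
    hits = []
    for (name, mean_a, mean_b, _), q in zip(records, fdr):
        lfc = log2(mean_a / mean_b)
        if abs(lfc) >= lfc_cutoff and q < fdr_cutoff:
            hits.append(name)
    return hits


# Hypothetical gene families: (name, mean in HUS, mean in non-HUS, raw p-value).
records = [
    ("fam1", 80.0, 20.0, 0.001),
    ("fam2", 30.0, 25.0, 0.002),
    ("fam3", 10.0, 40.0, 0.04),
    ("fam4", 50.0, 50.0, 0.9),
]

print(differentially_expressed(records))
```

In the toy input, fam3 has a raw p-value below 0.05 and a four-fold change, yet it is rejected once the p-values are adjusted for multiple testing, while fam2 is significant but changes too little.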
Affiliation(s)
- Christina G Aas
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway.,Clinic of Laboratory Medicine, Department of Medical Microbiology, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway
| | - Finn Drabløs
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Kjersti Haugum
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway.,Clinic of Laboratory Medicine, Department of Medical Microbiology, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway
| | - Jan E Afset
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway.,Clinic of Laboratory Medicine, Department of Medical Microbiology, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway
20
|
Olaisen C, Kvitvang HFN, Lee S, Almaas E, Bruheim P, Drabløs F, Otterlei M. The role of PCNA as a scaffold protein in cellular signaling is functionally conserved between yeast and humans. FEBS Open Bio 2018; 8:1135-1145. [PMID: 29988559 PMCID: PMC6026702 DOI: 10.1002/2211-5463.12442] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 01/25/2018] [Revised: 02/19/2018] [Accepted: 05/01/2018] [Indexed: 12/11/2022] Open
Abstract
Proliferating cell nuclear antigen (PCNA), a member of the highly conserved DNA sliding clamp family, is an essential protein for cellular processes including DNA replication and repair. A large number of proteins from higher eukaryotes contain one of two PCNA-interacting motifs: PCNA-interacting protein box (PIP box) and AlkB homologue 2 PCNA-interacting motif (APIM). APIM has been shown to be especially important during cellular stress. PIP box is known to be functionally conserved in yeast, and here, we show that this is also the case for APIM. Several of the 84 APIM-containing yeast proteins are associated with cellular signaling as hub proteins, which are able to interact with a large number of other proteins. Cellular signaling is highly conserved throughout evolution, and we recently suggested a novel role for PCNA as a scaffold protein in cellular signaling in human cells. A cell-penetrating peptide containing the APIM sequence increases the sensitivity toward the chemotherapeutic agent cisplatin in both yeast and human cells, and both yeast and human cells become hypersensitive when the Hog1/p38 MAPK pathway is blocked. These results suggest that the interactions between APIM-containing signaling proteins and PCNA during the DNA damage response are evolutionarily conserved between yeast and mammals and that PCNA has a role in cellular signaling also in yeast.
Affiliation(s)
- Camilla Olaisen
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Hans Fredrik N. Kvitvang
  - Department of Biotechnology and Food Science, Faculty of Natural Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Sungmin Lee
  - Department of Biotechnology and Food Science, Faculty of Natural Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Eivind Almaas
  - Department of Biotechnology and Food Science, Faculty of Natural Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Per Bruheim
  - Department of Biotechnology and Food Science, Faculty of Natural Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Finn Drabløs
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Marit Otterlei
  - Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
|
21
|
Rye MB, Bertilsson H, Andersen MK, Rise K, Bathen TF, Drabløs F, Tessem MB. Cholesterol synthesis pathway genes in prostate cancer are transcriptionally downregulated when tissue confounding is minimized. BMC Cancer 2018; 18:478. [PMID: 29703166 PMCID: PMC5922022 DOI: 10.1186/s12885-018-4373-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/27/2017] [Accepted: 04/15/2018] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND The relationship between cholesterol and prostate cancer has been extensively studied for decades, where high levels of cellular cholesterol are generally associated with cancer progression and less favorable outcomes. However, the role of in vivo cellular cholesterol synthesis in this process is unclear, and data on the transcriptional activity of cholesterol synthesis pathway genes in tissue from prostate cancer patients are inconsistent. METHODS A common problem with cancer tissue data from patient cohorts is the presence of heterogeneous tissue which confounds molecular analysis of the samples. In this study we present a general method to minimize systematic confounding from stroma tissue in any prostate cancer cohort comparing prostate cancer and normal samples. In particular we use samples assessed by histopathology to identify genes enriched and depleted in prostate stroma. These genes are then used to assess stroma content in tissue samples from other prostate cancer cohorts where no histopathology is available. Differential expression analysis is performed by comparing cancer and normal samples where the average stroma content has been balanced between the sample groups. In total we analyzed seven patient cohorts with prostate cancer consisting of 1713 prostate cancer and 230 normal tissue samples. RESULTS When stroma confounding was minimized, differential gene expression analysis over all cohorts showed robust and consistent downregulation of nearly all genes in the cholesterol synthesis pathway. Additional Gene Ontology analysis also identified cholesterol synthesis as the most significantly altered metabolic pathway in prostate cancer at the transcriptional level. CONCLUSION The surprising observation that cholesterol synthesis genes are downregulated in prostate cancer is important for our understanding of how prostate cancer cells regulate cholesterol levels in vivo. 
Moreover, we show that tissue heterogeneity explains the lack of consistency in previous expression analysis of cholesterol synthesis genes in prostate cancer.
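The balancing step described under METHODS can be illustrated with a minimal sketch. It is not the authors' procedure: the greedy trimming rule, the function name `balance_stroma`, the tolerance, and the sample scores are all assumptions made for illustration. The sketch trims two groups of per-sample stroma estimates until their averages roughly agree, so that a later cancer-versus-normal comparison is less confounded by stroma content.

```python
from statistics import mean


def balance_stroma(cancer, normal, tol=0.06, min_size=2):
    """Trim two sample groups until their mean stroma scores differ by <= tol.

    cancer, normal: dicts mapping a sample id to an estimated stroma
    fraction (hypothetical values in 0..1). Both must be non-empty.
    Returns (kept_cancer_ids, kept_normal_ids).
    """
    c = sorted(cancer.items(), key=lambda kv: kv[1])  # ascending stroma
    n = sorted(normal.items(), key=lambda kv: kv[1])
    while True:
        diff = mean(v for _, v in c) - mean(v for _, v in n)
        if abs(diff) <= tol:
            break
        # "high" is the group with the larger average stroma content.
        high, low = (c, n) if diff > 0 else (n, c)
        # Prefer trimming the larger group so neither shrinks too fast.
        if len(high) >= len(low) and len(high) > min_size:
            high.pop()      # drop the highest-stroma sample of the high group
        elif len(low) > min_size:
            low.pop(0)      # drop the lowest-stroma sample of the low group
        elif len(high) > min_size:
            high.pop()
        else:
            break           # cannot trim further without going below min_size
    return [k for k, _ in c], [k for k, _ in n]


# Hypothetical stroma estimates for five cancer and four normal samples.
cancer_scores = {"C1": 0.1, "C2": 0.2, "C3": 0.3, "C4": 0.4, "C5": 0.5}
normal_scores = {"N1": 0.3, "N2": 0.4, "N3": 0.5, "N4": 0.9}
kept_cancer, kept_normal = balance_stroma(cancer_scores, normal_scores)
print(kept_cancer, kept_normal)
```

The greedy rule is only one possible choice; it may stop before reaching the tolerance if either group would fall below `min_size`, in which case the remaining imbalance should be reported rather than ignored.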
Affiliation(s)
- Morten Beck Rye
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491 Trondheim, Norway
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
- Helena Bertilsson
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491 Trondheim, Norway
  - Department of Urology, St. Olavs Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
- Maria K. Andersen
  - MI Lab, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
- Kjersti Rise
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491 Trondheim, Norway
- Tone F. Bathen
  - MI Lab, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
- Finn Drabløs
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491 Trondheim, Norway
- May-Britt Tessem
  - Clinic of Surgery, St. Olavs Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
  - MI Lab, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
|
22
|
Ribicic D, Netzer R, Hazen TC, Techtmann SM, Drabløs F, Brakstad OG. Microbial community and metagenome dynamics during biodegradation of dispersed oil reveals potential key-players in cold Norwegian seawater. Mar Pollut Bull 2018; 129:370-378. [PMID: 29680562 DOI: 10.1016/j.marpolbul.2018.02.034] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 08/17/2017] [Revised: 01/30/2018] [Accepted: 02/19/2018] [Indexed: 06/08/2023]
Abstract
Oil biodegradation as a weathering process has been extensively investigated over the years, especially after the Deepwater Horizon blowout. In this study, we performed microcosm experiments at 5 °C with chemically dispersed oil in non-amended seawater. We link biodegradation processes with microbial community and metagenome dynamics and explain the succession based on substrate specialization. Reconstructed genomes and 16S rRNA gene analysis revealed that Bermanella and Zhongshania were the main contributors to initial n-alkane breakdown, while subsequent abundances of Colwellia and microorganisms closely related to Porticoccaceae were involved in secondary n-alkane breakdown and beta-oxidation. Cycloclasticus, Porticoccaceae and Spongiibacteraceae were associated with degradation of mono- and poly-cyclic aromatics. The successional pattern of genes coding for hydrocarbon-degrading enzymes at the metagenome level, and the reconstructed genomic content, revealed a high differentiation of bacteria involved in hydrocarbon biodegradation. Cooperation among oil-degrading microorganisms is thus needed for complete substrate transformation.
Affiliation(s)
- Deni Ribicic
  - NTNU Norwegian University of Science and Technology, Department of Clinical and Molecular Medicine, Trondheim, Norway
- Terry C Hazen
  - University of Tennessee Knoxville, Department of Civil and Environmental Engineering, Knoxville, TN, USA
- Stephen M Techtmann
  - Michigan Technological University, Department of Biological Sciences, Houghton, MI, USA
- Finn Drabløs
  - NTNU Norwegian University of Science and Technology, Department of Clinical and Molecular Medicine, Trondheim, Norway
|
23
|
Simovski B, Vodák D, Gundersen S, Domanska D, Azab A, Holden L, Holden M, Grytten I, Rand K, Drabløs F, Johansen M, Mora A, Lund-Andersen C, Fromm B, Eskeland R, Gabrielsen OS, Ferkingstad E, Nakken S, Bengtsen M, Nederbragt AJ, Thorarensen HS, Akse JA, Glad I, Hovig E, Sandve GK. GSuite HyperBrowser: integrative analysis of dataset collections across the genome and epigenome. Gigascience 2017; 6:1-12. [PMID: 28459977 PMCID: PMC5493745 DOI: 10.1093/gigascience/gix032] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/20/2016] [Revised: 01/17/2017] [Accepted: 04/24/2017] [Indexed: 12/01/2022] Open
Abstract
Background Recent large-scale undertakings such as ENCODE and Roadmap Epigenomics have generated experimental data mapped to the human reference genome (as genomic tracks) representing a variety of functional elements across a large number of cell types. Despite the high potential value of these publicly available data for a broad variety of investigations, little attention has been given to the analytical methodology necessary for their widespread utilisation. Findings We here present a first principled treatment of the analysis of collections of genomic tracks. We have developed novel computational and statistical methodology to permit comparative and confirmatory analyses across multiple and disparate data sources. We delineate a set of generic questions that are useful across a broad range of investigations and discuss the implications of choosing different statistical measures and null models. Examples include contrasting analyses across different tissues or diseases. The methodology has been implemented in a comprehensive open-source software system, the GSuite HyperBrowser. To make the functionality accessible to biologists, and to facilitate reproducible analysis, we have also developed a web-based interface providing an expertly guided and customizable way of utilizing the methodology. With this system, many novel biological questions can flexibly be posed and rapidly answered. Conclusions Through a combination of streamlined data acquisition, interoperable representation of dataset collections, and customizable statistical analysis with guided setup and interpretation, the GSuite HyperBrowser represents a first comprehensive solution for integrative analysis of track collections across the genome and epigenome. The software is available at: https://hyperbrowser.uio.no.
Affiliation(s)
- Boris Simovski
  - Department of Informatics, University of Oslo, Oslo, Norway
- Daniel Vodák
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Diana Domanska
  - Department of Informatics, University of Oslo, Oslo, Norway
- Abdulrahman Azab
  - Department of Informatics, University of Oslo, Oslo, Norway
  - Research Support Services Group, University Center for Information Technology, Oslo, Norway
- Lars Holden
  - Statistics For Innovation, Norwegian Computing Center, Oslo, Norway
- Marit Holden
  - Statistics For Innovation, Norwegian Computing Center, Oslo, Norway
- Ivar Grytten
  - Department of Informatics, University of Oslo, Oslo, Norway
- Knut Rand
  - Department of Mathematics, University of Oslo, Oslo, Norway
- Finn Drabløs
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Morten Johansen
  - Institute for Medical Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
- Antonio Mora
  - Department of Informatics, University of Oslo, Oslo, Norway
  - Department of Biosciences, University of Oslo, Oslo, Norway
- Christin Lund-Andersen
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Bastian Fromm
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Ragnhild Eskeland
  - Department of Biosciences, University of Oslo, Oslo, Norway
  - Norwegian Center for Stem Cell Research, Department of Immunology, Oslo University Hospital, Oslo, Norway
- Sigve Nakken
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Mads Bengtsen
  - Department of Biosciences, University of Oslo, Oslo, Norway
- Alexander Johan Nederbragt
  - Department of Informatics, University of Oslo, Oslo, Norway
  - Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
- Ingrid Glad
  - Department of Mathematics, University of Oslo, Oslo, Norway
- Eivind Hovig
  - Department of Informatics, University of Oslo, Oslo, Norway
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - Statistics For Innovation, Norwegian Computing Center, Oslo, Norway
  - Institute for Medical Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
|
24
|
Hansen MF, Johansen J, Sylvander AE, Bjørnevoll I, Talseth-Palmer BA, Lavik LAS, Xavier A, Engebretsen LF, Scott RJ, Drabløs F, Sjursen W. Use of multigene-panel identifies pathogenic variants in several CRC-predisposing genes in patients previously tested for Lynch Syndrome. Clin Genet 2017; 92:405-414. [PMID: 28195393 DOI: 10.1111/cge.12994] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 12/16/2016] [Revised: 02/06/2017] [Accepted: 02/08/2017] [Indexed: 12/28/2022]
Abstract
BACKGROUND Many families with a high burden of colorectal cancer fulfil the clinical criteria for Lynch Syndrome. However, in about half of these families, no germline mutation in the mismatch repair genes known to be associated with this disease can be identified. The aim of this study was to find the genetic cause for the increased colorectal cancer risk in these unsolved cases. MATERIALS AND METHODS We designed a gene panel targeting 112 previously known or candidate colorectal cancer susceptibility genes and used it to screen 274 patient samples for mutations. Mutations were validated by Sanger sequencing and, where possible, segregation analysis was performed. RESULTS We identified 73 interesting variants, of which 17 were pathogenic and 19 were variants of unknown clinical significance in well-established cancer susceptibility genes. In addition, 37 potentially pathogenic variants in candidate colorectal cancer susceptibility genes were detected. CONCLUSION We found a promising DNA variant in more than 25% of the patients, which shows that gene panel testing is a more effective method for identifying germline variants in CRC patients than a single-gene approach.
Affiliation(s)
- Maren F Hansen
  - Department of Laboratory Medicine, Children's and Women's Health, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
  - Department of Pathology and Medical Genetics, St. Olavs University Hospital, Trondheim, Norway
- Jostein Johansen
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Anna E Sylvander
  - Department of Pathology and Medical Genetics, St. Olavs University Hospital, Trondheim, Norway
- Inga Bjørnevoll
  - Department of Pathology and Medical Genetics, St. Olavs University Hospital, Trondheim, Norway
- Bente A Talseth-Palmer
  - Department of Laboratory Medicine, Children's and Women's Health, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
  - School of Biomedical Science and Pharmacy, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
  - Clinic for Medicine, Møre and Romsdal Hospital Trust, Molde, Norway
- Liss A S Lavik
  - Department of Pathology and Medical Genetics, St. Olavs University Hospital, Trondheim, Norway
- Alexandre Xavier
  - School of Biomedical Science and Pharmacy, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
- Lars F Engebretsen
  - Department of Pathology and Medical Genetics, St. Olavs University Hospital, Trondheim, Norway
- Rodney J Scott
  - School of Biomedical Science and Pharmacy, University of Newcastle and Hunter Medical Research Institute, Newcastle, Australia
  - Division of Molecular Medicine Pathology North, NSW Pathology, Newcastle, Australia
- Finn Drabløs
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Wenche Sjursen
  - Department of Laboratory Medicine, Children's and Women's Health, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
  - Department of Pathology and Medical Genetics, St. Olavs University Hospital, Trondheim, Norway
|
25
|
Langkilde A, Olsen LC, Sætrom P, Drabløs F, Besenbacher S, Raaby L, Johansen C, Iversen L. Pathway Analysis of Skin from Psoriasis Patients after Adalimumab Treatment Reveals New Early Events in the Anti-Inflammatory Mechanism of Anti-TNF-α. PLoS One 2016; 11:e0167437. [PMID: 28005985 PMCID: PMC5179238 DOI: 10.1371/journal.pone.0167437] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 07/22/2016] [Accepted: 11/14/2016] [Indexed: 01/09/2023] Open
Abstract
Psoriasis is a chronic cutaneous inflammatory disease. The immunopathogenesis is a complex interplay between T cells, dendritic cells and the epidermis in which T cells and dendritic cells maintain skin inflammation. Anti-tumour necrosis factor (anti-TNF)-α agents have been approved for therapeutic use across a range of inflammatory disorders including psoriasis, but the anti-inflammatory mechanisms of anti-TNF-α in lesional psoriatic skin are not fully understood. We investigated early events in skin from psoriasis patients after treatment with anti-TNF-α antibodies by use of bioinformatics tools. We used the Human Gene 1.0 ST Array to analyse gene expression in punch biopsies taken from psoriatic patients before and also 4 and 14 days after initiation of treatment with the anti-TNF-α agent adalimumab. The gene expression was analysed by gene set enrichment analysis using the Functional Annotation Tool from DAVID Bioinformatics Resources. The most enriched pathway was visualised by the Pathview Package on Kyoto Encyclopedia of Genes and Genomes (KEGG) graphs. The analysis revealed new very early events in psoriasis after adalimumab treatment. Some of these events have been described after longer periods of anti-TNF-α treatment when clinical and histological changes appear, suggesting that effects of anti-TNF-α treatment on gene expression appear very early before clinical and histological changes. Combining microarray data on biopsies from psoriasis patients with pathway analysis allowed us to integrate in vitro findings into the identification of mechanisms that may be important in vivo. Furthermore, these results may reflect primary effect of anti-TNF-α treatment in contrast to studies of gene expression changes following clinical and histological changes, which may reflect secondary changes correlated to the healing of the skin.
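The enrichment step mentioned in the abstract can be illustrated with a plain one-sided hypergeometric (Fisher-type) test. This is a minimal sketch and not DAVID's own scoring, which uses a modified, more conservative variant; the function name `enrichment_p_value` and all counts below are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_universe: number of genes measured in total
    n_in_set:   number of those genes that belong to the pathway / gene set
    n_selected: number of differentially expressed genes
    n_overlap:  number of selected genes that belong to the gene set
    """
    total = comb(n_universe, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 genes on the array, 150 in a pathway,
# 300 differentially expressed, 12 of which fall in the pathway.
p = enrichment_p_value(20000, 150, 300, 12)
print(f"enrichment p-value: {p:.3g}")
```

In practice the p-values from many gene sets would also need multiple-testing correction, which this sketch leaves out.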
Affiliation(s)
- Ane Langkilde
  - Department of Dermatology, Aarhus University Hospital, Aarhus C, Denmark
- Lene C. Olsen
  - Department of Cancer Research and Molecular Medicine, Faculty of Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Pål Sætrom
  - Department of Cancer Research and Molecular Medicine, Faculty of Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
  - Department of Computer and Information Science, Faculty of Information Technology, Mathematics and Electrical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Finn Drabløs
  - Department of Cancer Research and Molecular Medicine, Faculty of Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Line Raaby
  - Department of Dermatology, Aarhus University Hospital, Aarhus C, Denmark
- Claus Johansen
  - Department of Dermatology, Aarhus University Hospital, Aarhus C, Denmark
- Lars Iversen
  - Department of Dermatology, Aarhus University Hospital, Aarhus C, Denmark
|
26
|
Ehsani R, Bahrami S, Drabløs F. Feature-based classification of human transcription factors into hypothetical sub-classes related to regulatory function. BMC Bioinformatics 2016; 17:459. [PMID: 27842491 PMCID: PMC5109715 DOI: 10.1186/s12859-016-1349-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 03/19/2016] [Accepted: 11/10/2016] [Indexed: 12/15/2022] Open
Abstract
Background Transcription factors are key proteins in the regulation of gene transcription. An important step in this process is the opening of chromatin in order to make genomic regions available for transcription. Data on DNase I hypersensitivity has previously been used to label a subset of transcription factors as Pioneers, Settlers and Migrants to describe their potential role in this process. These labels represent an interesting hypothesis on gene regulation and possibly a useful approach for data analysis, and therefore we wanted to expand the set of labeled transcription factors to include as many known factors as possible. We have used a well-annotated dataset of 1175 transcription factors as input to supervised machine learning methods, using the subset with previously assigned labels as training set. We then used the final classifier to label the additional transcription factors according to their potential role as Pioneers, Settlers and Migrants. The full set of labeled transcription factors was used to investigate associated properties and functions of each class, including an analysis of interaction data for transcription factors based on DNA co-binding and protein-protein interactions. We also used the assigned labels to analyze a previously published set of gene lists associated with a time course experiment on cell differentiation. Results The analysis showed that the classification of transcription factors with respect to their potential role in chromatin opening largely was determined by how they bind to DNA. Each subclass of transcription factors was enriched for properties that seemed to characterize the subclass relative to its role in gene regulation, with very general functions for Pioneers, whereas Migrants to a larger extent were associated with specific processes. 
Further analysis showed that the expanded classification is a useful resource for analyzing other datasets on transcription factors with respect to their potential role in gene regulation. The analysis of transcription factor interaction data showed complementary differences between the subclasses, where transcription factors labeled as Pioneers often interact with other transcription factors through DNA co-binding, whereas Migrants to a larger extent use protein-protein interactions. The analysis of time course data on cell differentiation indicated a shift in the regulatory program associated with Pioneer-like transcription factors during differentiation. Conclusions The expanded classification is an interesting resource for analyzing data on gene regulation, as illustrated here on transcription factor interaction data and data from a time course experiment. The potential regulatory function of transcription factors seems largely to be determined by how they bind DNA, but is also influenced by how they interact with each other through cooperativity and protein-protein interactions. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1349-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Rezvan Ehsani
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, PO Box 8905, NO-7491, Trondheim, Norway
  - Department of Mathematics, University of Zabol, Zabol, Iran
- Shahram Bahrami
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, PO Box 8905, NO-7491, Trondheim, Norway
  - St. Olavs Hospital, Trondheim University Hospital, NO-7006, Trondheim, Norway
- Finn Drabløs
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, PO Box 8905, NO-7491, Trondheim, Norway
|
27
|
Lizio M, Harshbarger J, Abugessaisa I, Noguchi S, Kondo A, Severin J, Mungall C, Arenillas D, Mathelier A, Medvedeva YA, Lennartsson A, Drabløs F, Ramilowski JA, Rackham O, Gough J, Andersson R, Sandelin A, Ienasescu H, Ono H, Bono H, Hayashizaki Y, Carninci P, Forrest ARR, Kasukawa T, Kawaji H. Update of the FANTOM web resource: high resolution transcriptome of diverse cell types in mammals. Nucleic Acids Res 2016; 45:D737-D743. [PMID: 27794045 PMCID: PMC5210666 DOI: 10.1093/nar/gkw995] [Citation(s) in RCA: 86] [Impact Index Per Article: 10.8] [Received: 09/30/2016] [Accepted: 10/17/2016] [Indexed: 12/26/2022] Open
Abstract
Upon the first publication of the fifth iteration of the Functional Annotation of Mammalian Genomes collaborative project, FANTOM5, we gathered a series of primary data and database systems into the FANTOM web resource (http://fantom.gsc.riken.jp) to help researchers explore transcriptional regulation and cellular states. In the course of the collaboration, primary data and analysis results have been expanded, and functionalities of the database systems enhanced. We believe that our data and web systems are invaluable resources, and we think the scientific community will benefit from this recent update to deepen their understanding of mammalian cellular organization. We introduce the contents of FANTOM5 here, report recent updates in the web resource and provide future perspectives.
Affiliation(s)
- Marina Lizio
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Jayson Harshbarger
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Imad Abugessaisa
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Shuei Noguchi
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Atsushi Kondo
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Jessica Severin
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Chris Mungall
  - Genomics Division, Lawrence Berkeley National Laboratory, 84R01, 1 Cyclotron Road, Berkeley, CA 94720, USA
- David Arenillas
  - Centre for Molecular Medicine and Therapeutics at BC Children's Hospital Research, Department of Medical Genetics, University of British Columbia, 950 West 28th Avenue, Vancouver, BC, V5Z 4H4, Canada
- Anthony Mathelier
  - Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, 0372 Oslo, Norway
- Yulia A Medvedeva
  - Institute of Bioengineering, Research Center of Biotechnology, Russian Academy of Science, Leninsky prospect, 33, build. 2, 119071 Moscow, Russia
  - Vavilov Institute of General Genetics, Russian Academy of Science, Gubkina str. 3, Moscow 119991, Russia
- Andreas Lennartsson
  - Department of Biosciences and Nutrition, Karolinska Institutet, Hälsovägen 7-9, 14183 Huddinge, Sweden
- Finn Drabløs
  - Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), P.O. Box 8905, NO-7491 Trondheim, Norway
- Jordan A Ramilowski
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Owen Rackham
  - Program in Cardiovascular and Metabolic Disorders, Duke's National University of Singapore Medical School, 8 College Road, Singapore 169857, Singapore
- Julian Gough
  - Department of Computer Science, University of Bristol, Merchant Venturers Building, Woodland Road, Bristol BS8 1UB, UK
- Robin Andersson
  - The Bioinformatics Centre, Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen, Denmark
- Albin Sandelin
  - Section for Computational and RNA Biology, Department of Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen, Denmark
- Hans Ienasescu
  - Section for Computational and RNA Biology, Department of Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen, Denmark
- Hiromasa Ono
  - Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS), 1111 Yata, Mishima 411-8540, Japan
- Hidemasa Bono
  - Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS), 1111 Yata, Mishima 411-8540, Japan
- Yoshihide Hayashizaki
  - Preventive medicine and applied genomics unit, RIKEN Advanced Center for Computing and Communication, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
  - Systems biology and Genomics, Harry Perkins Institute of Medical Research, PO Box 7214, 6 Verdun Street, Nedlands, Perth, Western Australia 6008, Australia
- Piero Carninci
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Alistair R R Forrest
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Takeya Kasukawa
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Hideya Kawaji
  - Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
  - Preventive medicine and applied genomics unit, RIKEN Advanced Center for Computing and Communication, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
28
|
Abstract
Immediate-early genes (IEGs) can be activated and transcribed within minutes after stimulation, without the need for de novo protein synthesis, and they are stimulated in response to both cell-extrinsic and cell-intrinsic signals. Extracellular signals are transduced from the cell surface, through receptors activating a chain of proteins in the cell, in particular extracellular-signal-regulated kinases (ERKs), mitogen-activated protein kinases (MAPKs) and members of the RhoA-actin pathway. These communicate through a signaling cascade by adding phosphate groups to neighboring proteins, which eventually activates transcription factors (TFs) and translocates them to the nucleus, thereby inducing gene expression. The gene activation also involves proximal and distal enhancers that interact with promoters to stimulate gene expression. The immediate-early genes have essential biological roles, in particular in stress responses, such as those of the immune system, and in differentiation. Therefore they also have important roles in various diseases, including cancer development. In this paper we summarize some recent advances on key aspects of the activation and regulation of immediate-early genes.
Affiliation(s)
- Shahram Bahrami
- Department of Cancer Research and Molecular Medicine, NTNU - Norwegian University of Science and Technology, NO-7491 Trondheim, Norway; St. Olavs Hospital, Trondheim University Hospital, NO-7006 Trondheim, Norway.
| | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, NTNU - Norwegian University of Science and Technology, NO-7491 Trondheim, Norway.
|
29
|
Abstract
Background The Gene Ontology (GO) is a dynamic, controlled vocabulary that describes the cellular function of genes and proteins according to three major categories: biological process, molecular function and cellular component. It has become widely used in many bioinformatics applications for annotating genes and measuring their semantic similarity, rather than their sequence similarity. Generally speaking, semantic similarity measures involve the GO tree topology, information content of GO terms, or a combination of both. Results Here we present a new semantic similarity measure called TopoICSim (Topological Information Content Similarity) which uses information on the specific paths between GO terms based on the topology of the GO tree, and the distribution of information content along these paths. The TopoICSim algorithm was evaluated on two human benchmark datasets based on KEGG pathways and Pfam domains grouped as clans, using GO terms from either the biological process or molecular function. The performance of the TopoICSim measure compared favorably to five existing methods. Furthermore, the TopoICSim similarity was also tested on gene/protein sets defined by correlated gene expression, using three human datasets, and showed improved performance compared to two previously published similarity measures. Finally we used an online benchmarking resource which evaluates any similarity measure against a set of 11 similarity measures in three tests, using gene/protein sets based on sequence similarity, Pfam domains, and enzyme classifications. The results for TopoICSim showed improved performance relative to most of the measures included in the benchmarking, and in particular a very robust performance throughout the different tests. Conclusions The TopoICSim similarity measure provides a competitive method with robust performance for quantification of semantic similarity between genes and proteins based on GO annotations.
An R script for TopoICSim is available at http://bigr.medisin.ntnu.no/tools/TopoICSim.R.
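The notion of "information content of GO terms" mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the TopoICSim algorithm, and it does not reproduce the R script above. It shows a simpler, well-known Resnik-style measure on a toy ontology, where the information content of a term is IC(t) = -log p(t) and the similarity of two terms is the highest IC among their shared ancestors. The term names, the graph and the annotation counts are all hypothetical.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents (a DAG, as in GO).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "kinase_signaling": {"signaling", "metabolism"},
}

# Hypothetical number of genes annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "metabolism": 2,
    "signaling": 2,
    "lipid_metabolism": 3,
    "kinase_signaling": 1,
}


def ancestors(term):
    """Return the term itself plus every term reachable through parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """Compute IC(t) = -log p(t), where p(t) is the share of annotations
    made to t or to any of its descendants."""
    cumulative = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        # An annotation to a term also counts for all of its ancestors.
        for anc in ancestors(term):
            cumulative[anc] += count
    total = cumulative["root"]
    return {term: -math.log(count / total) for term, count in cumulative.items()}


def resnik_similarity(term_a, term_b, ic):
    """Resnik-style similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic[term] for term in common)


if __name__ == "__main__":
    ic = information_content()
    print(resnik_similarity("lipid_metabolism", "kinase_signaling", ic))
```

Specific terms with few annotations receive a high information content, so two terms that share only the root score zero. According to the abstract, TopoICSim additionally uses the specific paths between terms and the distribution of information content along those paths, which this sketch does not attempt.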
Affiliation(s)
- Rezvan Ehsani
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway; Department of Mathematics, University of Zabol, Zabol, Iran
| | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway.
|
30
|
Lewin A, Strand TA, Haugen T, Klinkenberg G, Kotlar HK, Valla S, Drabløs F, Wentzel A. Discovery and Characterization of a Thermostable Esterase from an Oil Reservoir Metagenome. Adv Enzyme Res 2016. [DOI: 10.4236/aer.2016.42008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/20/2022]
|
31
|
Medvedeva YA, Lennartsson A, Ehsani R, Kulakovskiy IV, Vorontsov IE, Panahandeh P, Khimulya G, Kasukawa T, Drabløs F. EpiFactors: a comprehensive database of human epigenetic factors and complexes. Database (Oxford) 2015; 2015:bav067. [PMID: 26153137 PMCID: PMC4494013 DOI: 10.1093/database/bav067] [Citation(s) in RCA: 164] [Impact Index Per Article: 18.2] [Received: 03/24/2015] [Accepted: 06/15/2015] [Indexed: 12/22/2022]
Abstract
Epigenetics refers to stable and long-term alterations of cellular traits that are
not caused by changes in the DNA sequence per se. Rather, covalent
modifications of DNA and histones affect gene expression and genome stability
via proteins that recognize and act upon such modifications. Many
enzymes that catalyse epigenetic modifications or are critical for enzymatic
complexes have been discovered, and this is encouraging investigators to study the
role of these proteins in diverse normal and pathological processes. Rapidly growing
knowledge in the area has resulted in the need for a resource that compiles,
organizes and presents curated information to the researchers in an easily accessible
and user-friendly form. Here we present EpiFactors, a manually curated database
providing information about epigenetic regulators, their complexes, targets and
products. EpiFactors contains information on 815 proteins, including 95 histones and
protamines. For 789 of these genes, we include expression values across several
samples, in particular a collection of 458 human primary cell samples (for
approximately 200 cell types, in many cases from three individual donors), covering
most mammalian cell steady states, 255 different cancer cell lines (representing
approximately 150 cancer subtypes) and 134 human postmortem tissues. Expression
values were obtained by the FANTOM5 consortium using Cap Analysis of Gene Expression
technique. EpiFactors also contains information on 69 protein complexes that are
involved in epigenetic regulation. The resource is practical for a wide range of
users, including biologists, pharmacologists and clinicians. Database URL: http://epifactors.autosome.ru
Affiliation(s)
- Yulia A Medvedeva
- Institute of Personal and Predictive Medicine of Cancer, 08916 Badalona, Spain; Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Andreas Lennartsson
- Department of Biosciences and Nutrition, Karolinska Institutet, 14183 Huddinge, Sweden
| | - Rezvan Ehsani
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
| | - Ivan V Kulakovskiy
- Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia; Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Ilya E Vorontsov
- Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Pouda Panahandeh
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
| | - Grigory Khimulya
- Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Takeya Kasukawa
- Division of Genomic Technologies (DGT), RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-Cho, Tsurumi-Ku, Yokohama 230-0045, Kanagawa, Japan
| | | | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
|
32
|
Abstract
Background Transcription factors are essential proteins for regulating gene expression. This regulation depends upon specific features of the transcription factors, including how they interact with DNA, how they interact with each other, and how they are post-translationally modified. Reliable information about key properties associated with transcription factors will therefore be useful for data analysis, in particular of data from high-throughput experiments. Results We have used an existing list of 1978 human proteins described as transcription factors to make a well-annotated data set, which includes information on Pfam domains, DNA-binding domains, post-translational modifications and protein–protein interactions. We have then used this data set for enrichment analysis. We have investigated correlations within this set of features, and between the features and more general protein properties. We have also used the data set to analyze previously published gene lists associated with cell differentiation, cancer, and tissue distribution. Conclusions The study shows that a well-annotated feature list for transcription factors is a useful resource for extensive data analysis, both of transcription factor properties in general and of properties associated with specific processes. However, the study also shows that such analyses are easily biased by incomplete coverage in experimental data, and by how gene sets are defined. Electronic supplementary material The online version of this article (doi:10.1186/s13104-015-1039-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Shahram Bahrami
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway; St. Olavs Hospital, NO-7006, Trondheim, Norway.
| | - Rezvan Ehsani
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway.
| | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491, Trondheim, Norway.
|
33
|
Baglo Y, Peng Q, Hagen L, Berg K, Høgset A, Drabløs F, Gederaas OA. Studies of the photosensitizer disulfonated meso-tetraphenyl chlorin in an orthotopic rat bladder tumor model. Photodiagnosis Photodyn Ther 2015; 12:58-66. [DOI: 10.1016/j.pdpdt.2014.12.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 08/22/2014] [Revised: 12/18/2014] [Accepted: 12/19/2014] [Indexed: 12/21/2022]
|
34
|
Arner E, Daub CO, Vitting-Seerup K, Andersson R, Lilje B, Drabløs F, Lennartsson A, Rönnerblad M, Hrydziuszko O, Vitezic M, Freeman TC, Alhendi AMN, Arner P, Axton R, Baillie JK, Beckhouse A, Bodega B, Briggs J, Brombacher F, Davis M, Detmar M, Ehrlund A, Endoh M, Eslami A, Fagiolini M, Fairbairn L, Faulkner GJ, Ferrai C, Fisher ME, Forrester L, Goldowitz D, Guler R, Ha T, Hara M, Herlyn M, Ikawa T, Kai C, Kawamoto H, Khachigian LM, Klinken SP, Kojima S, Koseki H, Klein S, Mejhert N, Miyaguchi K, Mizuno Y, Morimoto M, Morris KJ, Mummery C, Nakachi Y, Ogishima S, Okada-Hatakeyama M, Okazaki Y, Orlando V, Ovchinnikov D, Passier R, Patrikakis M, Pombo A, Qin XY, Roy S, Sato H, Savvi S, Saxena A, Schwegmann A, Sugiyama D, Swoboda R, Tanaka H, Tomoiu A, Winteringham LN, Wolvetang E, Yanagi-Mizuochi C, Yoneda M, Zabierowski S, Zhang P, Abugessaisa I, Bertin N, Diehl AD, Fukuda S, Furuno M, Harshbarger J, Hasegawa A, Hori F, Ishikawa-Kato S, Ishizu Y, Itoh M, Kawashima T, Kojima M, Kondo N, Lizio M, Meehan TF, Mungall CJ, Murata M, Nishiyori-Sueki H, Sahin S, Nagao-Sato S, Severin J, de Hoon MJL, Kawai J, Kasukawa T, Lassmann T, Suzuki H, Kawaji H, Summers KM, Wells C, Hume DA, Forrest ARR, Sandelin A, Carninci P, Hayashizaki Y. Transcribed enhancers lead waves of coordinated transcription in transitioning mammalian cells. Science 2015; 347:1010-4. [PMID: 25678556 PMCID: PMC4681433 DOI: 10.1126/science.1259418] [Citation(s) in RCA: 405] [Impact Index Per Article: 45.0] [Indexed: 01/09/2023]
Abstract
Although it is generally accepted that cellular differentiation requires changes to transcriptional networks, dynamic regulation of promoters and enhancers at specific sets of genes has not been previously studied en masse. Exploiting the fact that active promoters and enhancers are transcribed, we simultaneously measured their activity in 19 human and 14 mouse time courses covering a wide range of cell types and biological stimuli. Enhancer RNAs, then messenger RNAs encoding transcription factors, dominated the earliest responses. Binding sites for key lineage transcription factors were simultaneously overrepresented in enhancers and promoters active in each cellular system. Our data support a highly generalizable model in which enhancer transcription is the earliest event in successive waves of transcriptional change during cellular differentiation or activation.
Affiliation(s)
| | - David A. Hume
- Corresponding author. (D.A.H.); (A.R.R.F.); (A.S.); (P.C.); (Y.H.)
| | | | - Albin Sandelin
- Corresponding author. (D.A.H.); (A.R.R.F.); (A.S.); (P.C.); (Y.H.)
| | - Piero Carninci
- Corresponding author. (D.A.H.); (A.R.R.F.); (A.S.); (P.C.); (Y.H.)
|
35
|
Haugum K, Johansen J, Gabrielsen C, Brandal LT, Bergh K, Ussery DW, Drabløs F, Afset JE. Comparative genomics to delineate pathogenic potential in non-O157 Shiga toxin-producing Escherichia coli (STEC) from patients with and without haemolytic uremic syndrome (HUS) in Norway. PLoS One 2014; 9:e111788. [PMID: 25360710 PMCID: PMC4216125 DOI: 10.1371/journal.pone.0111788] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 07/17/2014] [Accepted: 09/30/2014] [Indexed: 11/19/2022] Open
Abstract
Shiga toxin-producing Escherichia coli (STEC) cause infections in humans ranging from asymptomatic carriage to bloody diarrhoea and haemolytic uremic syndrome (HUS). Here we present a whole genome comparison of Norwegian non-O157 STEC strains with the aim of distinguishing between strains with the potential to cause HUS and less virulent strains. Whole genome sequencing and comparisons were performed across 95 non-O157 STEC strains. Twenty-three of these were classified as HUS-associated, including strains from patients with HUS (n = 19) and persons with an epidemiological link to a HUS-case (n = 4). Genomic comparison revealed considerable heterogeneity in gene content across the 95 STEC strains. A clear difference in gene profile was observed between strains with and without the Locus of Enterocyte Effacement (LEE) pathogenicity island. Phylogenetic analysis of the core genome showed a high degree of diversity among the STEC strains, but all HUS-associated STEC strains were distributed in two distinct clusters within phylogroup B1. However, non-HUS strains were also found in these clusters. A number of accessory genes were found to be significantly overrepresented among HUS-associated STEC, but none of them were unique to this group of strains, suggesting that different sets of genes may contribute to the pathogenic potential in different phylogenetic STEC lineages. In this study we were not able to clearly distinguish between HUS-associated and non-HUS non-O157 STEC by extensive genome comparisons. Our results indicate that STECs from different phylogenetic backgrounds have independently acquired virulence genes that determine pathogenic potential, and that the content of such genes is overlapping between HUS-associated and non-HUS strains.
Affiliation(s)
- Kjersti Haugum
- Department of Laboratory Medicine, Children’s and Women’s Health, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway
| | - Jostein Johansen
- Department of Cancer Research and Molecular Medicine, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway
| | - Christina Gabrielsen
- Department of Laboratory Medicine, Children’s and Women’s Health, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway
| | - Lin T. Brandal
- Department of Foodborne Infections, Norwegian Institute of Public Health, Oslo, Norway
| | - Kåre Bergh
- Department of Laboratory Medicine, Children’s and Women’s Health, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Medical Microbiology, St. Olavs University Hospital, Trondheim, Norway
| | - David W. Ussery
- Biosciences Division, Oak Ridge National Labs, Oak Ridge, Tennessee, United States of America
| | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway
| | - Jan Egil Afset
- Department of Laboratory Medicine, Children’s and Women’s Health, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Medical Microbiology, St. Olavs University Hospital, Trondheim, Norway
|
36
|
Rye MB, Bertilsson H, Drabløs F, Angelsen A, Bathen TF, Tessem MB. Gene signatures ESC, MYC and ERG-fusion are early markers of a potentially dangerous subtype of prostate cancer. BMC Med Genomics 2014; 7:50. [PMID: 25115192 PMCID: PMC4147934 DOI: 10.1186/1755-8794-7-50] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 02/24/2014] [Accepted: 07/04/2014] [Indexed: 01/08/2023] Open
Abstract
Background Good prognostic tools for predicting disease progression in early stage prostate cancer (PCa) are still missing. Detection of molecular subtypes, for instance by using microarray gene technology, can give new prognostic information which can assist personalized treatment planning. The detection of new subtypes with validation across additional and larger patient cohorts is important for bringing a potential prognostic tool into the clinic. Methods We used fresh frozen prostatectomy tissue of high molecular quality to further explore four molecular subtype signatures of PCa based on Gene Set Enrichment Analysis (GSEA) of 15 selected gene sets published in a previous study. For this analysis we used a statistical test of dependent correlations to compare reference signatures to signatures in new normal and PCa samples, and also to explore signatures within and between sample subgroups in the new samples. Results An important finding was the consistent signatures observed for samples from the same patient independent of Gleason score. This proves that the signatures are robust and can surpass a normally high tumor heterogeneity within each patient. Our data did not distinguish between four different subtypes of PCa as previously published, but rather highlighted two groups of samples which could be related to good and poor prognosis based on survival data from the previous study. The poor prognosis group highlighted a set of samples characterized by enrichment of ESC, ERG-fusion and MYC + rich signatures in patients diagnosed with low Gleason score. The other group consisted of PCa samples showing good prognosis as well as normal samples. Accounting for sample composition (the amount of benign structures such as stroma and epithelial cells in addition to the cancer component) was important to improve subtype assignments and should also be considered in future studies.
Conclusion Our study validates a previous molecular subtyping of PCa in a new patient cohort, and identifies a subgroup of PCa samples highly interesting for detecting high risk PCa at an early stage. The importance of taking sample tissue composition into account when assigning subtype is emphasized.
Affiliation(s)
- Morten Beck Rye
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), P.O. Box 8905, N-7491 Trondheim, Norway.
|
37
|
Razick S, Močnik R, Thomas LF, Ryeng E, Drabløs F, Sætrom P. The eGenVar data management system--cataloguing and sharing sensitive data and metadata for the life sciences. Database (Oxford) 2014; 2014:bau027. [PMID: 24682735 PMCID: PMC4030636 DOI: 10.1093/database/bau027] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/31/2022]
Abstract
Systematic data management and controlled data sharing aim at increasing reproducibility,
reducing redundancy in work, and providing a way to efficiently locate complementing or
contradicting information. One method of achieving this is collecting data in a central
repository or in a location that is part of a federated system and providing interfaces to
the data. However, certain data, such as data from biobanks or clinical studies, may, for
legal and privacy reasons, often not be stored in public repositories. Instead, we
describe a metadata cataloguing system and a software suite for reporting the presence of
data from the life sciences domain. The system stores three types of metadata: file
information, file provenance and data lineage, and content descriptions. Our software
suite includes both graphical and command line interfaces that allow users to report and
tag files with these different metadata types. Importantly, the files remain in their
original locations with their existing access-control mechanisms in place, while our
system provides descriptions of their contents and relationships. Our system and software
suite thereby provide a common framework for cataloguing and sharing both public and
private data. Database URL: http://bigr.medisin.ntnu.no/data/eGenVar/
Affiliation(s)
- Sabry Razick
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Prinsesse Kristinasgt. 1, NO-7491 Trondheim, Norway and Department of Computer and Information Science, Norwegian University of Science and Technology, Sem Sælands vei 9, NO-7491 Trondheim, Norway
|
38
|
Rye M, Sandve GK, Daub CO, Kawaji H, Carninci P, Forrest ARR, Drabløs F. Chromatin states reveal functional associations for globally defined transcription start sites in four human cell lines. BMC Genomics 2014; 15:120. [PMID: 24669905 PMCID: PMC3986914 DOI: 10.1186/1471-2164-15-120] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/21/2013] [Accepted: 12/07/2013] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND Deciphering the most common modes by which chromatin regulates transcription, and how this is related to cellular status and processes is an important task for improving our understanding of human cellular biology. The FANTOM5 and ENCODE projects represent two independent large scale efforts to map regulatory and transcriptional features to the human genome. Here we investigate chromatin features around a comprehensive set of transcription start sites in four cell lines by integrating data from these two projects. RESULTS Transcription start sites can be distinguished by chromatin states defined by specific combinations of both chromatin mark enrichment and the profile shapes of these chromatin marks. The observed patterns can be associated with cellular functions and processes, and they also show association with expression level, location relative to nearby genes, and CpG content. In particular we find a substantial number of repressed inter- and intra-genic transcription start sites enriched for active chromatin marks and Pol II, and these sites are strongly associated with immediate-early response processes and cell signaling. Associations between start sites with similar chromatin patterns are validated by significant correlations in their global expression profiles. CONCLUSIONS The results confirm the link between chromatin state and cellular function for expressed transcripts, and also indicate that active chromatin states at repressed transcripts may poise transcripts for rapid activation during immune response.
Affiliation(s)
- Morten Rye
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491 Trondheim, Norway
- St. Olavs Hospital, Postboks 3250, Sluppen 7006, Trondheim
| | | | - Carsten O Daub
- RIKEN Omics Science Center (OSC), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Hideya Kawaji
- RIKEN Omics Science Center (OSC), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa 230-0045, Japan
- RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Saitama 351-0198, Japan
| | - Piero Carninci
- RIKEN Omics Science Center (OSC), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Alistair RR Forrest
- RIKEN Omics Science Center (OSC), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, P.O. Box 8905, NO-7491 Trondheim, Norway
|
39
|
Mærk M, Johansen J, Ertesvåg H, Drabløs F, Valla S. Safety in numbers: multiple occurrences of highly similar homologs among Azotobacter vinelandii carbohydrate metabolism proteins probably confer adaptive benefits. BMC Genomics 2014; 15:192. [PMID: 24625193 PMCID: PMC4022178 DOI: 10.1186/1471-2164-15-192] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/19/2013] [Accepted: 03/05/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Gene duplication and horizontal gene transfer are common processes in bacterial and archaeal genomes, and are generally assumed to result in either diversification or loss of the redundant gene copies. However, a recent analysis of the genome of the soil bacterium Azotobacter vinelandii DJ revealed an abundance of highly similar homologs among carbohydrate metabolism genes. In many cases these multiple genes did not appear to be the result of recent duplications, or to function only as a means of stimulating expression by increasing gene dosage, as the homologs were located in varying functional genetic contexts. Based on these initial findings we here report in-depth bioinformatic analyses focusing specifically on highly similar intra-genome homologs, or synologs, among carbohydrate metabolism genes, as well as an analysis of the general occurrence of very similar synologs in prokaryotes. RESULTS Approximately 900 bacterial and archaeal genomes were analysed for the occurrence of synologs, both in general and among carbohydrate metabolism genes specifically. This showed that large numbers of highly similar synologs among carbohydrate metabolism genes are very rare in bacterial and archaeal genomes, and that the A. vinelandii DJ genome contains an unusually large amount of such synologs. The majority of these synologs were found to be non-tandemly organized and localized in varying but metabolically relevant genomic contexts. The same observation was made for other genomes harbouring high levels of such synologs. It was also shown that highly similar synologs generally constitute a very small fraction of the protein-coding genes in prokaryotic genomes. The overall synolog fraction of the A. vinelandii DJ genome was well above the data set average, but not nearly as remarkable as the levels observed when only carbohydrate metabolism synologs were considered. 
CONCLUSIONS Large numbers of highly similar synologs are rare in bacterial and archaeal genomes, both in general and among carbohydrate metabolism genes. However, A. vinelandii and several other soil bacteria harbour large numbers of highly similar carbohydrate metabolism synologs which seem not to result from recent duplication or transfer events. These genes may confer adaptive benefits with respect to certain lifestyles and environmental factors, most likely due to increased regulatory flexibility and/or increased gene dosage.
Affiliation(s)
| | | | | | - Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7491, Trondheim, Norway.
|
40
|
Handel AE, Sandve GK, Disanto G, Berlanga-Taylor AJ, Gallone G, Hanwell H, Drabløs F, Giovannoni G, Ebers GC, Ramagopalan SV. Vitamin D receptor ChIP-seq in primary CD4+ cells: relationship to serum 25-hydroxyvitamin D levels and autoimmune disease. BMC Med 2013; 11:163. [PMID: 23849224 PMCID: PMC3710212 DOI: 10.1186/1741-7015-11-163] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 01/08/2013] [Accepted: 06/20/2013] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Vitamin D insufficiency has been implicated in autoimmunity. ChIP-seq experiments using immune cell lines have shown that vitamin D receptor (VDR) binding sites are enriched near regions of the genome associated with autoimmune diseases. We aimed to investigate VDR binding in primary CD4+ cells from healthy volunteers. METHODS We extracted CD4+ cells from nine healthy volunteers. Each sample underwent VDR ChIP-seq. Our results were analyzed in relation to published ChIP-seq and RNA-seq data in the Genomic HyperBrowser. We used MEME-ChIP for de novo motif discovery. 25-Hydroxyvitamin D levels were measured using liquid chromatography-tandem mass spectrometry and samples were divided into vitamin D sufficient (25(OH)D ≥75 nmol/L) and insufficient/deficient (25(OH)D <75 nmol/L) groups. RESULTS We found that the amount of VDR binding is correlated with the serum level of 25-hydroxyvitamin D (r = 0.92, P = 0.0005). In vivo VDR binding sites are enriched for autoimmune disease associated loci, especially when 25-hydroxyvitamin D levels (25(OH)D) were sufficient (25(OH)D ≥75: 3.13-fold, P < 0.0001; 25(OH)D <75: 2.76-fold, P < 0.0001; 25(OH)D ≥75 enrichment versus 25(OH)D <75 enrichment: P = 0.0002). VDR binding was also enriched near genes associated specifically with T-regulatory and T-helper cells in the 25(OH)D ≥75 group. MEME-ChIP did not identify any VDR-like motifs underlying our VDR ChIP-seq peaks. CONCLUSION Our results show a direct correlation between in vivo 25-hydroxyvitamin D levels and the number of VDR binding sites, although our sample size is relatively small. Our study further implicates VDR binding as important in gene-environment interactions underlying the development of autoimmunity and provides a biological rationale for 25-hydroxyvitamin D sufficiency being based at 75 nmol/L. Our results also suggest that VDR binding in response to physiological levels of vitamin D occurs predominantly in a VDR motif-independent manner.
Affiliation(s)
- Adam E Handel
- Medical Research Council Functional Genomics Unit and Department of Physiology, Anatomy and Genetics, University of Oxford, Parks Road, Oxford OX1 3PT, UK
41
Lewin A, Johansen J, Wentzel A, Kotlar HK, Drabløs F, Valla S. The microbial communities in two apparently physically separated deep subsurface oil reservoirs show extensive DNA sequence similarities. Environ Microbiol 2013; 16:545-58. [DOI: 10.1111/1462-2920.12181] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 03/07/2013] [Revised: 05/20/2013] [Accepted: 06/02/2013] [Indexed: 01/28/2023]
Affiliation(s)
- Anna Lewin
- Department of Biotechnology; Norwegian University of Science and Technology; Trondheim N-7491 Norway
- Jostein Johansen
- Department of Cancer Research and Molecular Medicine; Norwegian University of Science and Technology; Trondheim N-7491 Norway
- Alexander Wentzel
- Department of Biotechnology; SINTEF Materials and Chemistry; Trondheim N-7465 Norway
- Finn Drabløs
- Department of Cancer Research and Molecular Medicine; Norwegian University of Science and Technology; Trondheim N-7491 Norway
- Svein Valla
- Department of Biotechnology; Norwegian University of Science and Technology; Trondheim N-7491 Norway
42
Sandve GK, Gundersen S, Johansen M, Glad IK, Gunathasan K, Holden L, Holden M, Liestøl K, Nygård S, Nygaard V, Paulsen J, Rydbeck H, Trengereid K, Clancy T, Drabløs F, Ferkingstad E, Kalaš M, Lien T, Rye MB, Frigessi A, Hovig E. The Genomic HyperBrowser: an analysis web server for genome-scale data. Nucleic Acids Res 2013; 41:W133-41. [PMID: 23632163 PMCID: PMC3692097 DOI: 10.1093/nar/gkt342] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 01/30/2013] [Revised: 03/27/2013] [Accepted: 04/10/2013] [Indexed: 11/14/2022] Open
Abstract
The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens up a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.
Affiliation(s)
- Geir K. Sandve, Sveinung Gundersen, Morten Johansen, Ingrid K. Glad, Krishanthi Gunathasan, Lars Holden, Marit Holden, Knut Liestøl, Ståle Nygård, Vegard Nygaard, Jonas Paulsen, Halfdan Rydbeck, Kai Trengereid, Trevor Clancy, Finn Drabløs, Egil Ferkingstad, Matúš Kalaš, Tonje Lien, Morten B. Rye, Arnoldo Frigessi, Eivind Hovig
- Department of Informatics, University of Oslo, PO Box 1080, Blindern, 0316 Oslo, Norway, Centre for Cancer Biomedicine, Faculty of Medicine, University of Oslo, PO Box 4950, Nydalen, 0424 Oslo, Norway, Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, PO Box 4950 Nydalen, 0424 Oslo, Norway, Institute for Medical Informatics, The Norwegian Radium Hospital, Oslo University Hospital, PO Box 4950, Nydalen, N-0424 Oslo, Norway, Department of Mathematics, University of Oslo, PO Box 1053, Blindern, 0316 Oslo, Norway, Department of Medical Biology, Faculty of Health Science, University of Tromsø, 9037 Tromsø, Norway, Statistics For Innovation, Norwegian Computing Center, 0314 Oslo, Norway, Bioinformatics Core Facility, Oslo University Hospital and University of Oslo, PO Box 4950 Nydalen, N-0424 Oslo, Norway, Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway, Department of Informatics, University of Bergen, PO Box 7803, 5020 Bergen, Norway, Computational Biology Unit, Uni Computing, Uni Research AS, 5020 Bergen, Norway and Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, PO Box 1122 Blindern, 0317 Oslo, Norway
43
Klepper K, Drabløs F. MotifLab: a tools and data integration workbench for motif discovery and regulatory sequence analysis. BMC Bioinformatics 2013; 14:9. [PMID: 23323883 PMCID: PMC3556059 DOI: 10.1186/1471-2105-14-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 11/04/2012] [Accepted: 01/10/2013] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Traditional methods for computational motif discovery often suffer from poor performance. In particular, methods that search for sequence matches to known binding motifs tend to predict many non-functional binding sites because they fail to take into consideration the biological state of the cell. In recent years, genome-wide studies have generated a lot of data that has the potential to improve our ability to identify functional motifs and binding sites, such as information about chromatin accessibility and epigenetic states in different cell types. However, it is not always trivial to make use of this data in combination with existing motif discovery tools, especially for researchers who are not skilled in bioinformatics programming. RESULTS Here we present MotifLab, a general workbench for analysing regulatory sequence regions and discovering transcription factor binding sites and cis-regulatory modules. MotifLab supports comprehensive motif discovery and analysis by allowing users to integrate several popular motif discovery tools as well as different kinds of additional information, including phylogenetic conservation, epigenetic marks, DNase hypersensitive sites, ChIP-Seq data, positional binding preferences of transcription factors, transcription factor interactions and gene expression. MotifLab offers several data-processing operations that can be used to create, manipulate and analyse data objects, and complete analysis workflows can be constructed and automatically executed within MotifLab, including graphical presentation of the results. CONCLUSIONS We have developed MotifLab as a flexible workbench for motif analysis in a genomic context. The flexibility and effectiveness of this workbench have been demonstrated on selected test cases, in particular two previously published benchmark data sets for single motifs and modules, and a realistic example of genes responding to treatment with forskolin. MotifLab is freely available at http://www.motiflab.org.
Affiliation(s)
- Kjetil Klepper
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway.
44
Peña-Diaz J, Hegre SA, Anderssen E, Aas PA, Mjelle R, Gilfillan GD, Lyle R, Drabløs F, Krokan HE, Sætrom P. Transcription profiling during the cell cycle shows that a subset of Polycomb-targeted genes is upregulated during DNA replication. Nucleic Acids Res 2013; 41:2846-56. [PMID: 23325852 PMCID: PMC3597645 DOI: 10.1093/nar/gks1336] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 12/20/2022] Open
Abstract
Genome-wide gene expression analyses of the human somatic cell cycle have indicated that the set of cycling genes differs between primary and cancer cells. By identifying genes that have cell cycle-dependent expression in HaCaT human keratinocytes and comparing these with previously identified cell cycle genes, we have identified three distinct groups of cell cycle genes: first, housekeeping genes enriched for known cell cycle functions; second, cell type-specific genes enriched for HaCaT-specific functions; and third, Polycomb-regulated genes. These Polycomb-regulated genes are specifically upregulated during DNA replication, and consistent with being epigenetically silenced in other cell cycle phases, these genes have lower expression than other cell cycle genes. We also find similar patterns in foreskin fibroblasts, indicating that replication-dependent expression of Polycomb-silenced genes is a prevalent but unrecognized regulatory mechanism.
Affiliation(s)
- Javier Peña-Diaz
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
45
Håndstad T, Rye M, Močnik R, Drabløs F, Sætrom P. Cell-type specificity of ChIP-predicted transcription factor binding sites. BMC Genomics 2012; 13:372. [PMID: 22863112 PMCID: PMC3574057 DOI: 10.1186/1471-2164-13-372] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 02/12/2012] [Accepted: 07/06/2012] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Context-dependent transcription factor (TF) binding is one reason for differences in gene expression patterns between different cellular states. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) identifies genome-wide TF binding sites for one particular context: the cells used in the experiment. But can such ChIP-seq data predict TF binding in other cellular contexts, and is it possible to distinguish context-dependent from ubiquitous TF binding? RESULTS We compared ChIP-seq data on TF binding for multiple TFs in two different cell types and found that on average only a third of ChIP-seq peak regions are common to both cell types. As expected, common peaks occur more frequently in certain genomic contexts, such as CpG-rich promoters, whereas chromatin differences characterize cell-type-specific TF binding. We also find, however, that genotype differences between the cell types can explain differences in binding. Moreover, ChIP-seq signal intensity and peak clustering are the strongest predictors of common peaks. Compared with strong peaks located in regions containing peaks for multiple transcription factors, weak and isolated peaks are less common between the cell types and are less associated with data that indicate regulatory activity. CONCLUSIONS Together, the results suggest that experimental noise is prevalent among weak peaks, whereas strong and clustered peaks represent high-confidence binding events that often occur in other cellular contexts. Nevertheless, 30-40% of the strongest and most clustered peaks show context-dependent regulation. We show that by combining signal intensity with additional data, ranging from context-independent information such as binding site conservation and position weight matrix scores to context-dependent chromatin structure, we can predict whether a ChIP-seq peak is likely to be present in other cellular contexts.
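The notion of peaks being "common to both cell types" can be illustrated with a small interval-overlap sketch. This is only an assumed, simplified reading: peaks are taken as half-open `(chromosome, start, end)` intervals, any overlap of at least one base pair counts as "common", and the function names and example coordinates are hypothetical rather than taken from the article.

```python
from typing import List, Tuple

# Hypothetical peak representation: (chromosome, start, end), half-open.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks share at least one base pair."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_common(peaks_a: List[Peak], peaks_b: List[Peak]) -> float:
    """Fraction of peaks in peaks_a that overlap any peak in peaks_b."""
    if not peaks_a:
        return 0.0
    shared = sum(1 for p in peaks_a if any(overlaps(p, q) for q in peaks_b))
    return shared / len(peaks_a)


# Made-up example data for two cell types.
cell_type_1 = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 250)]
cell_type_2 = [("chr1", 250, 400), ("chr2", 5000, 5200)]
print(fraction_common(cell_type_1, cell_type_2))
```

The quadratic scan is adequate for a toy example; for genome-scale peak lists a sorted sweep or an interval tree would be the usual choice.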
Affiliation(s)
- Tony Håndstad
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Morten Rye
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Rok Močnik
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Finn Drabløs
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Pål Sætrom
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
- Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, NO-7491, Norway
46
Kornacker K, Rye MB, Håndstad T, Drabløs F. The Triform algorithm: improved sensitivity and specificity in ChIP-Seq peak finding. BMC Bioinformatics 2012; 13:176. [PMID: 22827163 PMCID: PMC3480842 DOI: 10.1186/1471-2105-13-176] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/03/2012] [Accepted: 06/21/2012] [Indexed: 11/10/2022] Open
Abstract
Background Chromatin immunoprecipitation combined with high-throughput sequencing (ChIP-Seq) is the most frequently used method to identify the binding sites of transcription factors. Active binding sites can be seen as peaks in enrichment profiles when the sequencing reads are mapped to a reference genome. However, the profiles are normally noisy, making it challenging to identify all significantly enriched regions in a reliable way and with an acceptable false discovery rate. Results We present the Triform algorithm, an improved approach to automatic peak finding in ChIP-Seq enrichment profiles for transcription factors. The method uses model-free statistics to identify peak-like distributions of sequencing reads, taking advantage of improved peak definition in combination with known characteristics of ChIP-Seq data. Conclusions Triform outperforms several existing methods in the identification of representative peak profiles in curated benchmark data sets. We also show that Triform in many cases is able to identify peaks that are more consistent with biological function, compared with other methods. Finally, we show that Triform can be used to generate novel information on transcription factor binding in repeat regions, which represents a particular challenge in many ChIP-Seq experiments. The Triform algorithm has been implemented in R, and is available via http://tare.medisin.ntnu.no/triform.
Affiliation(s)
- Karl Kornacker
- Division of Sensory Biophysics, Ohio State University, Columbus, OH, USA
47
Kotlar HK, Lewin A, Johansen J, Throne-Holst M, Haverkamp T, Markussen S, Winnberg A, Ringrose P, Aakvik T, Ryeng E, Jakobsen K, Drabløs F, Valla S. High coverage sequencing of DNA from microorganisms living in an oil reservoir 2.5 kilometres subsurface. Environ Microbiol Rep 2011; 3:674-681. [PMID: 23761356 DOI: 10.1111/j.1758-2229.2011.00279.x] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 06/02/2023]
Abstract
Microorganisms colonize a variety of extreme environments, and based on cultivation studies and analyses of PCR-amplified 16S rDNA sequences, microbial life appears to extend deep into the Earth's crust. However, none of these studies involved comprehensive characterizations of total DNA. Here we report results of high-coverage DNA pyrosequencing of an apparently representative and uncontaminated sample from a deep sea oil reservoir located 2.5 km subsurface, with a pressure and temperature of 250 bar and 85°C, respectively. Bioinformatic analyses of the DNA sequences indicate that the reservoir harbours a rich microbial community dominated by a smaller number of taxa. Comparison of the metagenome with sequences in databases indicated that there may have been contact between the oil reservoir and surface communities late in the sequence of geological events leading to oil reservoir formation. One specific gene, encoding a putative enolase, was synthesized and expressed in Escherichia coli. Enolase activity was confirmed and was found to be much more thermotolerant than that of a corresponding E. coli enzyme, consistent with the conditions in the oil reservoir.
Affiliation(s)
- Hans K Kotlar
- Statoil ASA, 7053 Ranheim, Norway; Department of Biotechnology, Norwegian University of Science and Technology, 7491 Trondheim, Norway; Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, 7491 Trondheim, Norway; SINTEF Materials and Chemistry, Department of Biotechnology, 7465 Trondheim, Norway; CEES and MERG, Department of Biology, University of Oslo, 0316 Oslo, Norway
48
Rye M, Sætrom P, Håndstad T, Drabløs F. Clustered ChIP-Seq-defined transcription factor binding sites and histone modifications map distinct classes of regulatory elements. BMC Biol 2011; 9:80. [PMID: 22115494 PMCID: PMC3239327 DOI: 10.1186/1741-7007-9-80] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 11/11/2011] [Accepted: 11/24/2011] [Indexed: 12/16/2022] Open
Abstract
Background Transcription factor binding to DNA requires both an appropriate binding element and suitably open chromatin, which together help to define regulatory elements within the genome. Current methods of identifying regulatory elements, such as promoters or enhancers, typically rely on sequence conservation, existing gene annotations or specific marks, such as histone modifications and p300 binding methods, each of which has its own biases. Results Herein we show that an approach based on clustering of transcription factor peaks from high-throughput sequencing coupled with chromatin immunoprecipitation (ChIP-Seq) can be used to evaluate markers for regulatory elements. We used 67 data sets for 54 unique transcription factors distributed over two cell lines to create regulatory element clusters. By integrating the clusters from our approach with histone modifications and data for open chromatin, we identified general methylation of lysine 4 on histone H3 (H3K4me) as the most specific marker for transcription factor clusters. Clusters mapping to annotated genes showed distinct patterns in cluster composition related to gene expression and histone modifications. Clusters mapping to intergenic regions fall into two groups, either directly involved in transcription, including miRNAs and long noncoding RNAs, or facilitating transcription by long-range interactions. The latter clusters were specifically enriched with H3K4me1, but less with acetylation of lysine 27 on histone 3 or p300 binding. Conclusion By integrating genome-wide data of transcription factor binding and chromatin structure and using our data-driven approach, we pinpointed the chromatin marks that best explain transcription factor association with different regulatory elements. Our results also indicate that a modest selection of transcription factors may be sufficient to map most regulatory elements in the human genome.
Affiliation(s)
- Morten Rye
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway.
49
Sandve GK, Gundersen S, Rydbeck H, Glad IK, Holden L, Holden M, Liestøl K, Clancy T, Drabløs F, Ferkingstad E, Johansen M, Nygaard V, Tøstesen E, Frigessi A, Hovig E. The differential disease regulome. BMC Genomics 2011; 12:353. [PMID: 21736759 PMCID: PMC3160420 DOI: 10.1186/1471-2164-12-353] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 01/28/2011] [Accepted: 07/07/2011] [Indexed: 12/11/2022] Open
Abstract
Background Transcription factors in disease-relevant pathways represent potential drug targets, by impacting a distinct set of pathways that may be modulated through gene regulation. The influence of transcription factors is typically studied on a per-disease basis, and no current resources provide a global overview of the relations between transcription factors and disease. Furthermore, existing pipelines for related large-scale analysis are tailored for particular sources of input data, and there is a need for generic methodology for integrating complementary sources of genomic information. Results We here present a large-scale analysis of multiple diseases versus multiple transcription factors, with a global map of over- and under-representation of 446 transcription factors in 1010 diseases. This map, referred to as the differential disease regulome, provides a first global statistical overview of the complex interrelationships between diseases, genes and controlling elements. The map is visualized using the Google map engine, due to its very large size, and provides a range of detailed information in a dynamic presentation format. The analysis is achieved through a novel methodology that performs a pairwise, genome-wide comparison on the Cartesian product of two distinct sets of annotation tracks, e.g. all combinations of one disease and one TF. The methodology was also used to create additional maps using alternative data sets related to transcription and disease, as well as data sets related to Gene Ontology classification and histone modifications. We provide a web-based interface that allows users to generate other custom maps, which could be based on precisely specified subsets of transcription factors and diseases, or, in general, on any categorical genome annotation tracks as they are improved or become available. Conclusion We have created a first resource that provides a global overview of the complex relations between transcription factors and disease. As the accuracy of the disease regulome depends mainly on the quality of the input data, forthcoming ChIP-seq based binding data for many TFs will provide improved maps. We further believe our approach to genome analysis could allow an advance from the current typical situation of one-time integrative efforts to reproducible and upgradable integrative analysis. The differential disease regulome and its associated methodology are available at http://hyperbrowser.uio.no.
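The idea of a pairwise comparison over the Cartesian product of two sets of annotation tracks can be sketched as follows. This is a minimal illustration under stated assumptions, not the published methodology: the article reports statistical over- and under-representation, whereas this sketch substitutes a raw base-pair overlap count as a stand-in score. The track names, coordinates and function names are hypothetical, and each track is assumed to contain non-overlapping half-open `(chromosome, start, end)` intervals.

```python
from itertools import product
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open
Track = List[Interval]


def overlap_bp(track_a: Track, track_b: Track) -> int:
    """Total base pairs shared by two tracks (intervals within a track
    are assumed not to overlap each other)."""
    total = 0
    for chrom_a, start_a, end_a in track_a:
        for chrom_b, start_b, end_b in track_b:
            if chrom_a == chrom_b:
                total += max(0, min(end_a, end_b) - max(start_a, start_b))
    return total


def pairwise_map(tracks_x: Dict[str, Track],
                 tracks_y: Dict[str, Track]) -> Dict[Tuple[str, str], int]:
    """Score every (x, y) combination from the Cartesian product of two
    sets of tracks. The raw overlap score is a placeholder for a proper
    statistical test."""
    return {
        (x, y): overlap_bp(tracks_x[x], tracks_y[y])
        for x, y in product(tracks_x, tracks_y)
    }


# Made-up example tracks.
diseases = {
    "disease_A": [("chr1", 0, 100)],
    "disease_B": [("chr1", 500, 600)],
}
tfs = {
    "TF_1": [("chr1", 50, 150)],
    "TF_2": [("chr1", 550, 650), ("chr2", 0, 10)],
}
for pair, score in pairwise_map(diseases, tfs).items():
    print(pair, score)
```

Keeping the scoring function separate from the product loop means a different comparison statistic could be swapped in without changing how the map is built.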
Affiliation(s)
- Geir K Sandve
- Department of Informatics, University of Oslo, Blindern, 0316 Oslo, Norway
50
Abstract
Chromatin immunoprecipitation (ChIP) followed by high throughput sequencing (ChIP-seq) is rapidly becoming the method of choice for discovering cell-specific transcription factor binding locations genome wide. By aligning sequenced tags to the genome, binding locations appear as peaks in the tag profile. Several programs have been designed to identify such peaks, but program evaluation has been difficult due to the lack of benchmark data sets. We have created benchmark data sets for three transcription factors by manually evaluating a selection of potential binding regions that cover typical variation in peak size and appearance. Performance of five programs on this benchmark showed, first, that external control or background data was essential to limit the number of false positive peaks from the programs. However, >80% of these peaks could be manually filtered out by visual inspection alone, without using additional background data, showing that peak shape information is not fully exploited in the evaluated programs. Second, none of the programs returned peak-regions that corresponded to the actual resolution in ChIP-seq data. Our results showed that ChIP-seq peaks should be narrowed down to 100–400 bp, which is sufficient to identify unique peaks and binding sites. Based on these results, we propose a meta-approach that gives improved peak definitions.
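The recommendation to narrow peaks to 100–400 bp can be pictured with a simple summit-centred window. This is an assumed illustration only and not the meta-approach proposed in the abstract: the function name, the default width of 200 bp and the example tag profile are all hypothetical.

```python
from typing import List, Tuple


def narrow_peak(start: int, coverage: List[int], width: int = 200) -> Tuple[int, int]:
    """Return a (start, end) window of at most `width` bp around the summit.

    coverage[i] is the tag count at genomic position start + i. The window
    is clipped to the original region, so it can be shorter than `width`
    when the summit lies close to the region's end.
    """
    if not coverage:
        raise ValueError("coverage must not be empty")
    # Offset of the first position with the highest tag count.
    summit_offset = max(range(len(coverage)), key=coverage.__getitem__)
    summit = start + summit_offset
    region_end = start + len(coverage)
    new_start = max(start, summit - width // 2)
    new_end = min(region_end, new_start + width)
    return new_start, new_end


# Made-up 1000 bp region starting at position 1000, with a summit at 1400.
coverage = [1] * 300 + [5] * 100 + [9] + [5] * 100 + [1] * 499
print(narrow_peak(1000, coverage))
```

Centring on the highest tag count is the simplest possible choice; a real peak finder would typically also use strand information and background data, as the abstract notes.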
Affiliation(s)
- Morten Beck Rye
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway.