351
Wang YY, Chen WH, Xiao PP, Xie WB, Luo Q, Bork P, Zhao XM. GEAR: A database of Genomic Elements Associated with drug Resistance. Sci Rep 2017; 7:44085. [PMID: 28294141 PMCID: PMC5353689 DOI: 10.1038/srep44085] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 11/08/2016] [Accepted: 02/02/2017] [Indexed: 12/28/2022] Open
Abstract
Drug resistance, which generally develops because of genetic mutations in certain molecules, is becoming a serious problem that leads to the failure of standard treatments. Here, we present GEAR (A database of Genomic Elements Associated with drug Resistance), which aims to provide comprehensive information about genomic elements (including genes, single-nucleotide polymorphisms and microRNAs) that are responsible for drug resistance. Currently, GEAR contains 1631 associations between 201 human drugs and 758 genes, 106 associations between 29 human drugs and 66 miRNAs, and 44 associations between 17 human drugs and 22 SNPs. These relationships are first extracted from the primary literature with text mining and then manually curated. The drug resistome deposited in GEAR provides insights into the genetic factors underlying drug resistance. In addition, new indications and potential drug combinations can be identified based on the resistome. The GEAR database can be freely accessed through http://gear.comp-sysbio.org.
Affiliation(s)
- Yin-Ying Wang
  - Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
  - Department of Electronic Engineering, City University of Hong Kong, Kowloon 999077, Hong Kong
- Wei-Hua Chen
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology (HUST), Wuhan, Hubei 430074, China
- Pei-Pei Xiao
  - Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
- Wen-Bin Xie
  - Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
- Qibin Luo
  - Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- Peer Bork
  - European Molecular Biology Laboratory (EMBL), Heidelberg, 69117, Germany
- Xing-Ming Zhao
  - Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
352
Roumpeka DD, Wallace RJ, Escalettes F, Fotheringham I, Watson M. A Review of Bioinformatics Tools for Bio-Prospecting from Metagenomic Sequence Data. Front Genet 2017; 8:23. [PMID: 28321234 PMCID: PMC5337752 DOI: 10.3389/fgene.2017.00023] [Citation(s) in RCA: 103] [Impact Index Per Article: 14.7] [Received: 10/26/2016] [Accepted: 02/16/2017] [Indexed: 12/21/2022] Open
Abstract
The microbiome can be defined as the community of microorganisms that live in a particular environment. Metagenomics is the practice of sequencing DNA from the genomes of all organisms present in a particular sample, and has become a common method for the study of microbiome population structure and function. Increasingly, researchers are finding novel genes encoded within metagenomes, many of which may be of interest to the biotechnology and pharmaceutical industries. However, such “bioprospecting” requires a suite of sophisticated bioinformatics tools to make sense of the data. This review summarizes the most commonly used bioinformatics tools for the assembly and annotation of metagenomic sequence data with the aim of discovering novel genes.
Affiliation(s)
- Despoina D Roumpeka
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, UK
- R John Wallace
  - The Rowett Institute of Nutrition and Health, Department of Life Sciences and Medicine, University of Aberdeen, Aberdeen, UK
- Mick Watson
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, UK
353
Rashid I, Nagpure NS, Srivastava P, Kumar R, Pathak AK, Singh M, Kushwaha B. HRGFish: A database of hypoxia responsive genes in fishes. Sci Rep 2017; 7:42346. [PMID: 28205556 PMCID: PMC5304231 DOI: 10.1038/srep42346] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 09/30/2016] [Accepted: 01/04/2017] [Indexed: 11/09/2022] Open
Abstract
Several studies have highlighted changes in gene expression due to the hypoxia response in fishes, but a systematic organization of this information and an analytical platform for such genes are lacking. In the present study, an attempt was made to develop a database of hypoxia responsive genes in fishes (HRGFish), integrated with analytical tools, using LAMPP technology. Genes reported in the hypoxia response of fishes were compiled through a literature survey, and the database presently covers 818 gene sequences and 35 gene types from 38 fishes. The upstream fragments (3,000 bp) covered in this database enable computation of CG dinucleotide frequencies, motif finding for the hypoxia response element, identification of CpG islands, and mapping to the reference promoter of zebrafish. The database also includes functional annotation of genes and provides tools for analyzing sequences and designing primers for selected gene fragments. This may be the first database on hypoxia response genes in fishes that provides a workbench to the scientific community involved in studying the evolution and ecological adaptation of fish species in relation to hypoxia.
Affiliation(s)
- Iliyas Rashid
  - Molecular Biology and Biotechnology Division, ICAR-National Bureau of Fish Genetic Resources, Lucknow-226002, Uttar Pradesh, India
  - AMITY Institute of Biotechnology, AMITY University Uttar Pradesh, Lucknow-226028, Uttar Pradesh, India
- Naresh Sahebrao Nagpure
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Fisheries Education, Mumbai-400 061, Maharashtra, India
- Prachi Srivastava
  - AMITY Institute of Biotechnology, AMITY University Uttar Pradesh, Lucknow-226028, Uttar Pradesh, India
- Ravindra Kumar
  - Molecular Biology and Biotechnology Division, ICAR-National Bureau of Fish Genetic Resources, Lucknow-226002, Uttar Pradesh, India
- Ajey Kumar Pathak
  - Molecular Biology and Biotechnology Division, ICAR-National Bureau of Fish Genetic Resources, Lucknow-226002, Uttar Pradesh, India
- Mahender Singh
  - Molecular Biology and Biotechnology Division, ICAR-National Bureau of Fish Genetic Resources, Lucknow-226002, Uttar Pradesh, India
- Basdeo Kushwaha
  - Molecular Biology and Biotechnology Division, ICAR-National Bureau of Fish Genetic Resources, Lucknow-226002, Uttar Pradesh, India
354
Warren WC, Hillier LW, Tomlinson C, Minx P, Kremitzki M, Graves T, Markovic C, Bouk N, Pruitt KD, Thibaud-Nissen F, Schneider V, Mansour TA, Brown CT, Zimin A, Hawken R, Abrahamsen M, Pyrkosz AB, Morisson M, Fillon V, Vignal A, Chow W, Howe K, Fulton JE, Miller MM, Lovell P, Mello CV, Wirthlin M, Mason AS, Kuo R, Burt DW, Dodgson JB, Cheng HH. A New Chicken Genome Assembly Provides Insight into Avian Genome Structure. G3 (Bethesda, Md.) 2017; 7:109-117. [PMID: 27852011 PMCID: PMC5217101 DOI: 10.1534/g3.116.035923] [Citation(s) in RCA: 157] [Impact Index Per Article: 22.4] [Received: 09/28/2016] [Accepted: 10/27/2016] [Indexed: 12/18/2022]
Abstract
The importance of the Gallus gallus (chicken) as a model organism and agricultural animal merits a continuation of sequence assembly improvement efforts. We present a new version of the chicken genome assembly (Gallus_gallus-5.0; GCA_000002315.3), built from combined long single molecule sequencing technology, finished BACs, and improved physical maps. In overall assembled bases, we see a gain of 183 Mb, including 16.4 Mb in placed chromosomes with a corresponding gain in the percentage of intact repeat elements characterized. Of the 1.21 Gb genome, we include three previously missing autosomes, GGA30, 31, and 33, and improve sequence contig length 10-fold over the previous Gallus_gallus-4.0. Despite the significant base representation improvements made, 138 Mb of sequence is not yet located to chromosomes. When annotated for gene content, Gallus_gallus-5.0 shows an increase of 4679 annotated genes (2768 noncoding and 1911 protein-coding) over those in Gallus_gallus-4.0. We also revisited the question of what genes are missing in the avian lineage, as assessed by the highest quality avian genome assembly to date, and found that a large fraction of the original set of missing genes are still absent in sequenced bird species. Finally, our new data support a detailed map of MHC-B, encompassing two segments: one with a highly stable gene copy number and another in which the gene copy number is highly variable. The chicken model has been a critical resource for many other fields of study, and this new reference assembly will substantially further these efforts.
Affiliation(s)
- Wesley C Warren
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108
- LaDeana W Hillier
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108
- Chad Tomlinson
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108
- Patrick Minx
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108
- Milinn Kremitzki
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108
- Tina Graves
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108
- Chris Markovic
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108
- Nathan Bouk
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Kim D Pruitt
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Francoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Valerie Schneider
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Aleksey Zimin
  - Institute for Physical Sciences and Technology, University of Maryland, College Park, Maryland 20742
- Rachel Hawken
  - Cobb-Vantress Inc., Siloam Springs, Arkansas 72761-1030
- Alexis B Pyrkosz
  - United States Department of Agriculture-Agricultural Research Service, Avian Disease and Oncology, East Lansing, Michigan 48823
- Mireille Morisson
  - Génétique Physiologie et Systèmes d'Elevage, Université de Toulouse, Institut National de la Recherche Agronomique, Auzeville Castanet Tolosan, France
- Valerie Fillon
  - Génétique Physiologie et Systèmes d'Elevage, Université de Toulouse, Institut National de la Recherche Agronomique, Auzeville Castanet Tolosan, France
- Alain Vignal
  - Génétique Physiologie et Systèmes d'Elevage, Université de Toulouse, Institut National de la Recherche Agronomique, Auzeville Castanet Tolosan, France
- William Chow
  - Wellcome Trust Sanger Institute, Cambridgeshire CB10 1SA, United Kingdom
- Kerstin Howe
  - Wellcome Trust Sanger Institute, Cambridgeshire CB10 1SA, United Kingdom
- Peter Lovell
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, Oregon 97239-3098
- Claudio V Mello
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, Oregon 97239-3098
- Morgan Wirthlin
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, Oregon 97239-3098
- Andrew S Mason
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, United Kingdom
- Richard Kuo
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, United Kingdom
- David W Burt
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, United Kingdom
- Jerry B Dodgson
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan 48824
- Hans H Cheng
  - United States Department of Agriculture-Agricultural Research Service, Avian Disease and Oncology, East Lansing, Michigan 48823
355
Jin J, Tian F, Yang DC, Meng YQ, Kong L, Luo J, Gao G. PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res 2017; 45:D1040-D1045. [PMID: 27924042 DOI: 10.1093/nar/gkw98] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/30/2016] [Accepted: 10/12/2016] [Indexed: 05/22/2023] Open
Abstract
With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded the plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting more comprehensive genomic TF repertoires of green plants. Besides updating the pre-existing abundant functional and evolutionary annotation for identified TFs, we generated three new types of annotation that provide more direct clues for investigating the underlying functional mechanisms: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from the literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded the previous TF prediction server and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal, PlantRegMap (http://plantregmap.cbi.pku.edu.cn), for users to access the regulation resource and analysis tools conveniently.
Affiliation(s)
- Jinpu Jin
  - State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
- Feng Tian
  - State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P.R. China
- De-Chang Yang
  - State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
- Yu-Qi Meng
  - State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
- Lei Kong
  - State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
- Jingchu Luo
  - State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
- Ge Gao
  - State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
356
Abstract
An in-depth evaluation of target safety is an invaluable resource throughout drug discovery and development. The goal of a target safety evaluation is to identify potential unintended adverse consequences of target modulation, and to propose a risk evaluation and mitigation strategy to shepherd compounds through the discovery and development pipeline, to confirm and characterize unavoidable on-target toxicities in a timely manner to assist in early program advancement decisions, and to anticipate, monitor, and manage potential clinical adverse events. The role of an experienced discovery toxicologist in synthesizing the available information into an actionable set of recommendations for a safety evaluation strategy is critical to its successful application in early discovery programs. This chapter presents a summary of some of the information types and sources that should be investigated, and approaches that can be taken to generate an early assessment of potential safety liabilities.
357
Li T, Wernersson R, Hansen RB, Horn H, Mercer J, Slodkowicz G, Workman CT, Rigina O, Rapacki K, Stærfeldt HH, Brunak S, Jensen TS, Lage K. A scored human protein-protein interaction network to catalyze genomic interpretation. Nat Methods 2017; 14:61-64. [PMID: 27892958 PMCID: PMC5839635 DOI: 10.1038/nmeth.4083] [Citation(s) in RCA: 386] [Impact Index Per Article: 55.1] [Received: 03/04/2016] [Accepted: 10/20/2016] [Indexed: 02/07/2023]
Abstract
Genome-scale human protein-protein interaction networks are critical to understanding cell biology and interpreting genomic data, but challenging to produce experimentally. Through data integration and quality control, we provide a scored human protein-protein interaction network (InWeb_InBioMap, or InWeb_IM) with severalfold more interactions (>500,000) and better functional biological relevance than comparable resources. We illustrate that InWeb_InBioMap enables functional interpretation of >4,700 cancer genomes and genes involved in autism.
Affiliation(s)
- Taibo Li
  - Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Rasmus Wernersson
  - Intomics A/S, Lyngby, Denmark
  - Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Heiko Horn
  - Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Harvard Medical School, Boston, Massachusetts, USA
- Johnathan Mercer
  - Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Greg Slodkowicz
  - Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Christopher T Workman
  - Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Olga Rigina
  - Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Kristoffer Rapacki
  - Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Hans H Stærfeldt
  - Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Søren Brunak
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Kasper Lage
  - Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Harvard Medical School, Boston, Massachusetts, USA
  - Institute for Biological Psychiatry, Mental Health Center Sct. Hans, University of Copenhagen, Roskilde, Denmark
358
Yamaguchi YL, Suzuki R, Cabrera J, Nakagami S, Sagara T, Ejima C, Sano R, Aoki Y, Olmo R, Kurata T, Obayashi T, Demura T, Ishida T, Escobar C, Sawa S. Root-Knot and Cyst Nematodes Activate Procambium-Associated Genes in Arabidopsis Roots. Frontiers in Plant Science 2017; 8:1195. [PMID: 28747918 PMCID: PMC5506325 DOI: 10.3389/fpls.2017.01195] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 04/12/2017] [Accepted: 06/23/2017] [Indexed: 05/03/2023]
Abstract
Developmental plasticity is one of the most striking features of plant morphogenesis, as plants are able to vary their shapes in response to environmental cues. Biotic or abiotic stimuli often promote organogenesis events in plants that are not observed under normal growth conditions. Root-knot nematodes (RKNs) are known to parasitize multiple species of rooting plants and to induce characteristic tissue expansions called galls or root-knots on the roots of their hosts by perturbing the plant cellular machinery. Galls contain giant cells (GCs) and neighboring cells, and the GCs are a source of nutrients for the parasitizing nematode. Highly active cell proliferation was observed in galls. However, the underlying mechanisms that regulate the symptoms triggered by the plant-nematode interaction have not yet been elucidated. In this study, we deciphered the molecular mechanism of gall formation with an in vitro infection assay system using the RKN Meloidogyne incognita and the model plant Arabidopsis thaliana. By taking advantage of this system, we performed next-generation sequencing-based transcriptome profiling and found that the expression of procambium identity-associated genes was enriched during gall formation. Clustering analyses with artificial xylogenic systems, together with the results of expression analyses of the candidate genes, showed a significant correlation between the induction of gall cells and procambium-associated cells. Furthermore, the promoters of several procambial marker genes such as ATHB8, TDR and WOX4 were activated not only in M. incognita-induced galls, but similarly in M. javanica-induced galls and Heterodera schachtii-induced syncytia. Our findings suggest that phytoparasitic nematodes modulate the host's developmental regulation of the vascular stem cells during gall formation.
Affiliation(s)
- Yasuka L. Yamaguchi
  - Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan
- Reira Suzuki
  - Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan
- Javier Cabrera
  - Facultad de Ciencias Ambientales y Bioquímica, Universidad de Castilla – La Mancha, Toledo, Spain
- Satoru Nakagami
  - Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan
- Tomomi Sagara
  - Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan
- Chika Ejima
  - Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan
- Ryosuke Sano
  - Graduate School of Biological Science, Nara Institute of Science and Technology, Ikoma, Japan
- Yuichi Aoki
  - Graduate School of Information Sciences, Tohoku University, Sendai, Japan
- Rocio Olmo
  - Facultad de Ciencias Ambientales y Bioquímica, Universidad de Castilla – La Mancha, Toledo, Spain
- Tetsuya Kurata
  - Plant Global Education Project, Graduate School of Biological Science, Nara Institute of Science and Technology, Ikoma, Japan
- Takeshi Obayashi
  - Graduate School of Information Sciences, Tohoku University, Sendai, Japan
- Taku Demura
  - Graduate School of Biological Science, Nara Institute of Science and Technology, Ikoma, Japan
- Takashi Ishida
  - Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan
- Carolina Escobar
  - Facultad de Ciencias Ambientales y Bioquímica, Universidad de Castilla – La Mancha, Toledo, Spain
- Shinichiro Sawa
  - Graduate School of Science and Technology, Kumamoto University, Kumamoto, Japan
  - *Correspondence: Shinichiro Sawa,
359
Ienasescu H, Li K, Andersson R, Vitezic M, Rennie S, Chen Y, Vitting-Seerup K, Lagoni E, Boyd M, Bornholdt J, de Hoon MJL, Kawaji H, Lassmann T, Hayashizaki Y, Forrest ARR, Carninci P, Sandelin A. On-the-fly selection of cell-specific enhancers, genes, miRNAs and proteins across the human body using SlideBase. Database: The Journal of Biological Databases and Curation 2016; 2016:baw144. [PMID: 28025337 PMCID: PMC5199134 DOI: 10.1093/database/baw144] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 08/15/2016] [Revised: 10/06/2016] [Accepted: 10/17/2016] [Indexed: 12/19/2022]
Abstract
Genomics consortia have produced large datasets profiling the expression of genes, micro-RNAs, enhancers and more across human tissues or cells. There is a need for intuitive tools to select the subsets of such data that are most relevant for specific studies. To this end, we present SlideBase, a web tool which offers a new way of selecting genes, promoters, enhancers and microRNAs that are preferentially expressed/used in a specified set of cells/tissues, based on the use of interactive sliders. With the help of sliders, SlideBase enables users to define custom expression thresholds for individual cell types/tissues, producing sets of genes, enhancers etc. which satisfy these constraints. Changes in slider settings result in simultaneous changes in the selected sets, updated in real time. SlideBase is linked to major databases from genomics consortia, including FANTOM, GTEx, The Human Protein Atlas and BioGPS. Database URL: http://slidebase.binf.ku.dk.
Affiliation(s)
- Hans Ienasescu
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Kang Li
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Department of Mathematical Sciences, University of Copenhagen, Universitetsparken 5, Copenhagen Ø, DK2100, Denmark
- Robin Andersson
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Morana Vitezic
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Sarah Rennie
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Yun Chen
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Kristoffer Vitting-Seerup
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Emil Lagoni
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Mette Boyd
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Jette Bornholdt
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
- Michiel J L de Hoon
  - RIKEN Center for Life Science Technologies (Division of Genomic Technologies), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Hideya Kawaji
  - RIKEN Center for Life Science Technologies (Division of Genomic Technologies), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Timo Lassmann
  - RIKEN Center for Life Science Technologies (Division of Genomic Technologies), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
  - Telethon Kids Institute, The University of Western Australia, 100 Roberts Road, Subiaco, Western Australia 6008, Australia
- Yoshihide Hayashizaki
  - RIKEN Center for Life Science Technologies (Division of Genomic Technologies), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Alistair R R Forrest
  - RIKEN Center for Life Science Technologies (Division of Genomic Technologies), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
  - Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, the University of Western Australia, Nedlands, Western Australia, Australia
- Piero Carninci
  - RIKEN Center for Life Science Technologies (Division of Genomic Technologies), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Albin Sandelin
  - Department of Biology, The Bioinformatics Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen N, DK2200, Denmark
360
Gligorijević V, Malod-Dognin N, Pržulj N. Integrative methods for analyzing big data in precision medicine. Proteomics 2016; 16:741-58. [PMID: 26677817 DOI: 10.1002/pmic.201500396] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Received: 10/08/2015] [Revised: 11/16/2015] [Accepted: 12/09/2015] [Indexed: 12/19/2022]
Abstract
We provide an overview of recent developments in big data analyses in the context of precision medicine and health informatics. With the advance in technologies capturing molecular and medical data, we have entered the era of "Big Data" in biology and medicine. These data offer many opportunities to advance precision medicine. We outline key challenges in precision medicine and present recent advances in data integration-based methods to uncover personalized information from big data produced by various omics studies. We survey recent integrative methods for disease subtyping, biomarker discovery, and drug repurposing, and list the tools that are available to domain scientists. Given the ever-growing nature of these big data, we highlight key issues that big data integration methods will face.
Affiliation(s)
- Nataša Pržulj
  - Department of Computing, Imperial College London, London, UK
361
Zdobnov EM, Tegenfeldt F, Kuznetsov D, Waterhouse RM, Simão FA, Ioannidis P, Seppey M, Loetscher A, Kriventseva EV. OrthoDB v9.1: cataloging evolutionary and functional annotations for animal, fungal, plant, archaeal, bacterial and viral orthologs. Nucleic Acids Res 2016; 45:D744-D749. [PMID: 27899580 PMCID: PMC5210582 DOI: 10.1093/nar/gkw1119] [Citation(s) in RCA: 302] [Impact Index Per Article: 37.8] [Received: 09/30/2016] [Revised: 10/26/2016] [Accepted: 11/08/2016] [Indexed: 11/25/2022] Open
Abstract
OrthoDB is a comprehensive catalog of orthologs, genes inherited by extant species from a single gene in their last common ancestor. In 2016 OrthoDB reached its 9th release, growing to over 22 million genes from over 5000 species, now adding plants, archaea and viruses. In this update we focused on usability of this fast-growing wealth of data: updating the user and programmatic interfaces to browse and query the data, and further enhancing the already extensive integration of available gene functional annotations. Collating functional annotations from over 100 resources enabled us to propose descriptive titles for 87% of ortholog groups. Additionally, OrthoDB continues to provide computed evolutionary annotations and to allow user queries by sequence homology. The OrthoDB resource now enables users to generate publication-quality comparative genomics charts, as well as to upload, analyze and interactively explore their own private data. OrthoDB is available from http://orthodb.org.
Affiliation(s)
- Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Fredrik Tegenfeldt
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Dmitry Kuznetsov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Robert M Waterhouse
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Felipe A Simão
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Panagiotis Ioannidis
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Mathieu Seppey
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Alexis Loetscher
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Evgenia V Kriventseva
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| |
|
362
|
Yates B, Braschi B, Gray KA, Seal RL, Tweedie S, Bruford EA. Genenames.org: the HGNC and VGNC resources in 2017. Nucleic Acids Res 2016; 45:D619-D625. [PMID: 27799471 PMCID: PMC5210531 DOI: 10.1093/nar/gkw1033] [Citation(s) in RCA: 235] [Impact Index Per Article: 29.4] [Received: 10/14/2016] [Revised: 10/18/2016] [Accepted: 10/20/2016] [Indexed: 12/02/2022] Open
Abstract
The HUGO Gene Nomenclature Committee (HGNC) based at the European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. Currently the HGNC database contains almost 40 000 approved gene symbols, over 19 000 of which represent protein-coding genes. In addition to naming genomic loci we manually curate genes into family sets based on shared characteristics such as homology, function or phenotype. We have recently updated our gene family resources and introduced new improved visualizations which can be seen alongside our gene symbol reports on our primary website http://www.genenames.org. In 2016 we expanded our remit and formed the Vertebrate Gene Nomenclature Committee (VGNC) which is responsible for assigning names to vertebrate species lacking a dedicated nomenclature group. Using the chimpanzee genome as a pilot project we have approved symbols and names for over 14 500 protein-coding genes in chimpanzee, and have developed a new website http://vertebrate.genenames.org to distribute these data. Here, we review our online data and resources, focusing particularly on the improvements and new developments made during the last two years.
Affiliation(s)
- Bethan Yates
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Bryony Braschi
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Kristian A Gray
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Ruth L Seal
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Susan Tweedie
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Elspeth A Bruford
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
363
|
Jin J, Tian F, Yang DC, Meng YQ, Kong L, Luo J, Gao G. PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res 2016; 45:D1040-D1045. [PMID: 27924042 PMCID: PMC5210657 DOI: 10.1093/nar/gkw982] [Citation(s) in RCA: 1166] [Impact Index Per Article: 145.8] [Received: 09/30/2016] [Accepted: 10/12/2016] [Indexed: 12/12/2022] Open
Abstract
With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded the plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting a more comprehensive genomic TF repertoire of green plants. Besides updating the pre-existing abundant functional and evolutionary annotation for identified TFs, we generated three new types of annotation that provide more direct clues for investigating the underlying functional mechanisms: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded the previous TF prediction server, and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal PlantRegMap (http://plantregmap.cbi.pku.edu.cn) for users to access the regulation resource and analysis tools conveniently.
Affiliation(s)
- Jinpu Jin
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
| | - Feng Tian
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China.,Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P.R. China
| | - De-Chang Yang
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
| | - Yu-Qi Meng
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
| | - Lei Kong
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
| | - Jingchu Luo
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
| | - Ge Gao
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Beijing 100871, P.R. China
| |
|
364
|
Madsen MB, Kogelman LJA, Kadarmideen HN, Rasmussen HB. Systems genetics analysis of pharmacogenomics variation during antidepressant treatment. The Pharmacogenomics Journal 2016; 18:144-152. [PMID: 27752142 DOI: 10.1038/tpj.2016.68] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/03/2016] [Revised: 06/17/2016] [Accepted: 08/25/2016] [Indexed: 12/24/2022]
Abstract
Selective serotonin reuptake inhibitors (SSRIs) are the most widely used antidepressants, but the efficacy of the treatment varies significantly among individuals. It is believed that complex genetic mechanisms play a part in this variation. We have used a network-based approach to unravel the genetic components involved. Moreover, we investigated the potential difference in the genetic interaction networks underlying SSRI treatment response over time. We found four hub genes (ASCC3, PPARGC1B, SCHIP1 and TMTC2) with different connectivity in the initial SSRI treatment period (baseline to week 4) compared with the subsequent period (4-8 weeks after initiation), suggesting that different genetic networks are important at different times during SSRI treatment. The strongest interactions in the initial SSRI treatment period involved genes encoding transcription factors, and in the subsequent period genes involved in calcium homeostasis. In conclusion, we suggest a difference in genetic interaction networks between initial and subsequent SSRI response.
Affiliation(s)
- M B Madsen
- Institute of Biological Psychiatry, Mental Health Centre Sct. Hans, Capital Region of Denmark, Roskilde, Denmark.,iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Denmark
| | - L J A Kogelman
- Department of Large Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - H N Kadarmideen
- Department of Large Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - H B Rasmussen
- Institute of Biological Psychiatry, Mental Health Centre Sct. Hans, Capital Region of Denmark, Roskilde, Denmark.,iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Denmark
| |
|
365
|
Wang D, Yang L, Zhang P, LaBaer J, Hermjakob H, Li D, Yu X. AAgAtlas 1.0: a human autoantigen database. Nucleic Acids Res 2016; 45:D769-D776. [PMID: 27924021 PMCID: PMC5210642 DOI: 10.1093/nar/gkw946] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 08/21/2016] [Revised: 09/22/2016] [Accepted: 10/11/2016] [Indexed: 12/25/2022] Open
Abstract
Autoantibodies are antibodies that target self-antigens, which can play pivotal roles in maintaining homeostasis, distinguishing normal from tumor tissue and triggering autoimmune diseases. In the last three decades, tremendous efforts have been devoted to elucidating the generation, evolution and functions of autoantibodies, as well as their target autoantigens. However, reports of these countless previously identified autoantigens are randomly dispersed in the literature. Here, we constructed an AAgAtlas database 1.0 using text-mining and manual curation. We extracted 45 830 autoantigen-related abstracts and 94 313 sentences from PubMed using the keywords of either ‘autoantigen’ or ‘autoantibody’ or their lexical variants, which were further refined to 25 520 abstracts, 43 253 sentences and 3984 candidates by our bio-entity recognizer based on the Protein Ontology. Finally, we identified 1126 genes as human autoantigens and 1071 related human diseases, with which we constructed a human autoantigen database (AAgAtlas database 1.0). The database provides a user-friendly interface to conveniently browse, retrieve and download human autoantigens as well as their associated diseases. The database is freely accessible at http://biokb.ncpsb.org/aagatlas/. We believe this database will be a valuable resource to track and understand human autoantigens as well as to investigate their functions in basic and translational research.
Affiliation(s)
- Dan Wang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Liuhui Yang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Ping Zhang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Joshua LaBaer
- The Virginia G. Piper Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
| | - Henning Hermjakob
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China.,European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dong Li
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Xiaobo Yu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
| |
|
366
|
Network diffusion-based analysis of high-throughput data for the detection of differentially enriched modules. Sci Rep 2016; 6:34841. [PMID: 27731320 PMCID: PMC5059623 DOI: 10.1038/srep34841] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 01/07/2016] [Accepted: 08/19/2016] [Indexed: 11/08/2022] Open
Abstract
A relation exists between network proximity of molecular entities in interaction networks, functional similarity and association with diseases. The identification of network regions associated with biological functions and pathologies is a major goal in systems biology. We describe a network diffusion-based pipeline for the interpretation of different types of omics in the context of molecular interaction networks. We introduce the network smoothing index, a network-based quantity that allows joint quantification of the amount of omics information in genes and in their network neighbourhood, using network diffusion to define network proximity. The approach is applicable to both descriptive and inferential statistics calculated on omics data. We also show that network resampling, applied to gene lists ranked by quantities derived from the network smoothing index, indicates the presence of significantly connected genes. As a proof of principle, we identified gene modules enriched in somatic mutations and transcriptional variations observed in samples of prostate adenocarcinoma (PRAD). In line with the local hypothesis, network smoothing index and network resampling underlined the existence of a connected component of genes harbouring molecular alterations in PRAD.
|
367
|
Gawron P, Ostaszewski M, Satagopam V, Gebel S, Mazein A, Kuzma M, Zorzan S, McGee F, Otjacques B, Balling R, Schneider R. MINERVA-a platform for visualization and curation of molecular interaction networks. NPJ Syst Biol Appl 2016; 2:16020. [PMID: 28725475 PMCID: PMC5516855 DOI: 10.1038/npjsba.2016.20] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 01/14/2016] [Revised: 06/15/2016] [Accepted: 06/24/2016] [Indexed: 12/11/2022] Open
Abstract
Our growing knowledge about various molecular mechanisms is becoming increasingly structured and accessible. Different repositories of molecular interactions and available literature enable construction of focused and high-quality molecular interaction networks. Novel tools for curation and exploration of such networks are needed, in order to foster the development of a systems biology environment. In particular, solutions for visualization, annotation and data cross-linking will facilitate usage of network-encoded knowledge in biomedical research. To this end we developed the MINERVA (Molecular Interaction NEtwoRks VisuAlization) platform, a standalone webservice supporting curation, annotation and visualization of molecular interaction networks in Systems Biology Graphical Notation (SBGN)-compliant format. MINERVA provides automated content annotation and verification for improved quality control. The end users can explore and interact with hosted networks, and provide direct feedback to content curators. MINERVA enables mapping drug targets or overlaying experimental data on the visualized networks. Extensive export functions enable downloading areas of the visualized networks as SBGN-compliant models for efficient reuse of hosted networks. The software is available under Affero GPL 3.0 as a Virtual Machine snapshot, Debian package and Docker instance at http://r3lab.uni.lu/web/minerva-website/. We believe that MINERVA is an important contribution to the systems biology community, as its architecture enables set-up of locally or globally accessible SBGN-oriented repositories of molecular interaction networks. Its functionalities allow overlay of multiple information layers, facilitating exploration of content and interpretation of data. Moreover, annotation and verification workflows of MINERVA improve the efficiency of curation of networks, allowing life-science researchers to better engage in development and use of biomedical knowledge repositories.
Affiliation(s)
- Piotr Gawron
- Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Marek Ostaszewski
- Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Venkata Satagopam
- Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Stephan Gebel
- Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Alexander Mazein
- European Institute for Systems Biology and Medicine, Université de Lyon, eTRIKS Consortium, Lyon, France
| | - Michal Kuzma
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Simone Zorzan
- Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
| | - Fintan McGee
- Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
| | - Benoît Otjacques
- Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
| | - Rudi Balling
- Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Esch-sur-Alzette, Luxembourg
| |
|
368
|
Mariani E, Frabetti F, Tarozzi A, Pelleri MC, Pizzetti F, Casadei R. Meta-Analysis of Parkinson's Disease Transcriptome Data Using TRAM Software: Whole Substantia Nigra Tissue and Single Dopamine Neuron Differential Gene Expression. PLoS One 2016; 11:e0161567. [PMID: 27611585 PMCID: PMC5017670 DOI: 10.1371/journal.pone.0161567] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 01/28/2016] [Accepted: 08/08/2016] [Indexed: 01/21/2023] Open
Abstract
Understanding of the genetic basis of Parkinson's disease (PD) and of the correlation between genotype and phenotype has revolutionized our knowledge about the pathogenetic mechanisms of neurodegeneration, opening up exciting new therapeutic and neuroprotective perspectives. Genomic knowledge of PD is still in its early stages and can provide a good start for studies of the molecular mechanisms that underlie the gene expression variations and the epigenetic mechanisms that may contribute to the complex and characteristic phenotype of PD. In this study we used the software TRAM (Transcriptome Mapper) to analyse publicly available microarray data from substantia nigra (SN) samples of a total of 151 PD patients and 130 healthy controls, to identify differentially expressed chromosomal segments and gene loci. In particular, we separately analyzed data for PD patients and controls from post-mortem snap-frozen SN whole tissue and from laser microdissected midbrain dopamine (DA) neurons, to better characterize the specific DA neuronal expression profile associated with the late-stage Parkinson's condition. The default "Map" mode analysis resulted in 10 significantly over/under-expressed segments, mapping on 8 different chromosomes for SN whole tissue and in 4 segments mapping on 4 different chromosomes for DA neurons. In conclusion, TRAM software allowed us to confirm the deregulation of some genomic regions and loci involved in key molecular pathways related to neurodegeneration, as well as to provide new insights about genes and non-coding RNA transcripts not yet associated with the disease.
Affiliation(s)
- Elisa Mariani
- Department for Life Quality Studies, University of Bologna, Rimini, Italy
| | - Flavia Frabetti
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Andrea Tarozzi
- Department for Life Quality Studies, University of Bologna, Rimini, Italy
| | - Maria Chiara Pelleri
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Fabrizio Pizzetti
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Raffaella Casadei
- Department for Life Quality Studies, University of Bologna, Rimini, Italy
| |
|
369
|
Mohanty B, Helder S, Silva APG, Mackay JP, Ryan DP. The Chromatin Remodelling Protein CHD1 Contains a Previously Unrecognised C-Terminal Helical Domain. J Mol Biol 2016; 428:4298-4314. [PMID: 27591891 DOI: 10.1016/j.jmb.2016.08.028] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/24/2016] [Revised: 08/25/2016] [Accepted: 08/26/2016] [Indexed: 10/21/2022]
Abstract
The packaging of eukaryotic DNA into nucleosomes, and the organisation of these nucleosomes into chromatin, plays a critical role in regulating all DNA-associated processes. Chromodomain helicase DNA-binding protein 1 (CHD1) is an ATP-dependent chromatin remodelling protein that is conserved throughout eukaryotes and has an ability to assemble and organise nucleosomes both in vitro and in vivo. This activity is involved in the regulation of transcription and is implicated in mammalian development and stem cell biology. CHD1 is classically depicted as possessing a pair of tandem chromodomains that directly precede a core catalytic helicase-like domain that is then followed by a SANT-SLIDE DNA-binding domain. Here, we have identified an additional conserved domain C-terminal to the SANT-SLIDE domain and determined its structure by multidimensional heteronuclear NMR spectroscopy. We have termed this domain the CHD1 helical C-terminal (CHCT) domain as it is comprised of five α-helices arranged in a variant helical bundle topology. CHCT has a conserved, positively charged surface and is able to bind DNA and nucleosomes. In addition, we have identified another group of proteins, the as yet uncharacterised C17orf64 proteins, as also containing a conserved CHCT domain. Our data provide new structural insights into the CHD1 enzyme family.
Affiliation(s)
- Biswaranjan Mohanty
- School of Life and Environmental Sciences, The University of Sydney, Building G08, Corner Butlin Avenue and Maze Crescent, Sydney, New South Wales, 2006, Australia; Faculty of Pharmacy and Pharmaceutical Sciences, Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, Victoria, 3052, Australia
| | - Stephanie Helder
- School of Life and Environmental Sciences, The University of Sydney, Building G08, Corner Butlin Avenue and Maze Crescent, Sydney, New South Wales, 2006, Australia
| | - Ana P G Silva
- School of Life and Environmental Sciences, The University of Sydney, Building G08, Corner Butlin Avenue and Maze Crescent, Sydney, New South Wales, 2006, Australia
| | - Joel P Mackay
- School of Life and Environmental Sciences, The University of Sydney, Building G08, Corner Butlin Avenue and Maze Crescent, Sydney, New South Wales, 2006, Australia.
| | - Daniel P Ryan
- Department of Genome Sciences, The John Curtin School of Medical Research, Building 131, Garran Road, The Australian National University, Canberra, Australian Capital Territory, 2601, Australia.
| |
|
370
|
Zolfaghari Emameh R, Barker HR, Syrjänen L, Urbański L, Supuran CT, Parkkila S. Identification and inhibition of carbonic anhydrases from nematodes. J Enzyme Inhib Med Chem 2016; 31:176-184. [DOI: 10.1080/14756366.2016.1221826] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 10/21/2022] Open
Affiliation(s)
- Reza Zolfaghari Emameh
- School of Medicine, University of Tampere, Tampere, Finland
- BioMediTech, University of Tampere, Tampere, Finland
- Fimlab Laboratories Ltd and Tampere University Hospital, Tampere, Finland
| | - Leo Syrjänen
- School of Medicine, University of Tampere, Tampere, Finland
- Department of Otorhinolaryngology, Central Finland Central Hospital, Jyväskylä, Finland
| | - Linda Urbański
- School of Medicine, University of Tampere, Tampere, Finland
| | - Claudiu T. Supuran
- Neurofarba Department, Sezione di Scienza Farmaceutiche e Nutraceutiche, Università degli Studi di Firenze, Firenze, Italy
| | - Seppo Parkkila
- School of Medicine, University of Tampere, Tampere, Finland
- Fimlab Laboratories Ltd and Tampere University Hospital, Tampere, Finland
| |
|
371
|
Fluck J, Madan S, Ansari S, Kodamullil AT, Karki R, Rastegar-Mojarad M, Catlett NL, Hayes W, Szostak J, Hoeng J, Peitsch M. Training and evaluation corpora for the extraction of causal relationships encoded in biological expression language (BEL). Database: The Journal of Biological Databases and Curation 2016; 2016:baw113. [PMID: 27554092 PMCID: PMC4995071 DOI: 10.1093/database/baw113] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/23/2015] [Accepted: 07/07/2016] [Indexed: 01/21/2023]
Abstract
Success in extracting biological relationships is mainly dependent on the complexity of the task as well as the availability of high-quality training data. Here, we describe the new corpora in the systems biology modeling language BEL for training and testing biological relationship extraction systems that we prepared for the BioCreative V BEL track. BEL was designed to capture relationships not only between proteins or chemicals, but also complex events such as biological processes or disease states. A BEL nanopub is the smallest unit of information and represents a biological relationship with its provenance. In BEL relationships (called BEL statements), the entities are normalized to defined namespaces mainly derived from public repositories, such as sequence databases, MeSH or publicly available ontologies. In the BEL nanopubs, the BEL statements are associated with citation information and supportive evidence such as a text excerpt. To enable the training of extraction tools, we prepared BEL resources and made them available to the community. We selected a subset of these resources focusing on a reduced set of namespaces, namely, human and mouse genes, ChEBI chemicals, MeSH diseases and GO biological processes, as well as relationship types ‘increases’ and ‘decreases’. The published training corpus contains 11 000 BEL statements from over 6000 supportive text excerpts. For method evaluation, we selected and re-annotated two smaller subcorpora containing 100 text excerpts. For this re-annotation, the inter-annotator agreement was measured by the BEL track evaluation environment and resulted in a maximal F-score of 91.18% for full statement agreement. In addition, for a set of 100 BEL statements, we provide not only the gold standard expert annotations, but also text excerpts pre-selected by two automated systems. Those text excerpts were evaluated and manually annotated as true or false supportive in the course of the BioCreative V BEL track task.
Database URL: http://wiki.openbel.org/display/BIOC/Datasets
Affiliation(s)
- Juliane Fluck
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
| | - Sumit Madan
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
| | - Sam Ansari
- Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
| | - Alpha T Kodamullil
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
| | - Reagon Karki
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
| | - William Hayes
- Selventa, One Alewife Center, Cambridge, MA 02140, USA
| | - Justyna Szostak
- Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
| | - Julia Hoeng
- Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
| | - Manuel Peitsch
- Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
| |
|
372
|
Archer NP, Perez-Andreu V, Scheurer ME, Rabin KR, Peckham-Gregory EC, Plon SE, Zabriskie RC, De Alarcon PA, Fernandez KS, Najera CR, Yang JJ, Antillon-Klussmann F, Lupo PJ. Family-based exome-wide assessment of maternal genetic effects on susceptibility to childhood B-cell acute lymphoblastic leukemia in hispanics. Cancer 2016; 122:3697-3704. [PMID: 27529658 DOI: 10.1002/cncr.30241] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/18/2016] [Revised: 05/31/2016] [Accepted: 07/06/2016] [Indexed: 11/11/2022]
Abstract
BACKGROUND Children of Hispanic ancestry have a higher incidence of acute lymphoblastic leukemia (ALL) compared with other ethnic groups, but to the authors' knowledge, the genetic basis for these racial disparities remains incompletely understood. Genome-wide association studies of childhood ALL to date have focused on inherited genetic effects; however, maternal genetic effects (the role of the maternal genotype on phenotype development in the offspring) also may play a role in ALL susceptibility. METHODS The authors conducted a family-based exome-wide association study of maternal genetic effects among Hispanics with childhood B-cell ALL using the Illumina Infinium HumanExome BeadChip. A discovery cohort of 312 Guatemalan and Hispanic American families and an independent replication cohort of 152 Hispanic American families were used. RESULTS Three maternal single-nucleotide polymorphisms (SNPs) approached the study threshold for significance after correction for multiple testing (P < 1.0 × 10⁻⁶): MTL5 rs12365708 (testis expressed metallothionein-like protein [tesmin]) (relative risk [RR], 2.62; 95% confidence interval [95% CI], 1.61-4.27 [P = 1.8 × 10⁻⁵]); ALKBH1 rs6494 (AlkB homolog 1, histone H2A dioxygenase) (RR, 3.77; 95% CI, 1.84-7.74 [P = 3.7 × 10⁻⁵]); and NEUROG3 rs4536103 (neurogenin 3) (RR, 1.75; 95% CI, 1.30-2.37 [P = 1.2 × 10⁻⁴]). Although effect sizes were similar, these SNPs were not nominally significant in the replication cohort in the current study. In a meta-analysis comprising the discovery cohort and the replication cohort, these SNPs were still not found to be statistically significant after correction for multiple comparisons (rs12365708: pooled RR, 2.27 [95% CI, 1.48-3.50], P = 1.99 × 10⁻⁴; rs6494: pooled RR, 2.31 [95% CI, 1.38-3.85], P = .001; and rs4536103: pooled RR, 1.67 [95% CI, 1.29-2.16], P = 9.23 × 10⁻⁵).
CONCLUSIONS In what to the authors' knowledge is the first family-based exome-wide association study to investigate maternal genotype effects associated with childhood ALL, the results did not implicate a strong role of maternal genotype on disease risk among Hispanics; however, 3 maternal SNPs were identified that may play a modest role in susceptibility. Cancer 2016;122:3697-704. © 2016 American Cancer Society.
Affiliation(s)
- Natalie P Archer
- Austin Regional Campus, University of Texas School of Public Health, Austin, Texas
| | - Virginia Perez-Andreu
- Department of Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, Tennessee.,Hematologic Malignancies Program, Comprehensive Cancer Center, St. Jude Children's Research Hospital, Memphis, Tennessee
| | - Michael E Scheurer
- Section of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, Texas
| | - Karen R Rabin
- Section of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, Texas
| | - Erin C Peckham-Gregory
- Section of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, Texas
| | - Sharon E Plon
- Section of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, Texas
| | - Ryan C Zabriskie
- Section of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, Texas
| | - Pedro A De Alarcon
- Department of Pediatrics, University of Illinois College of Medicine at Peoria, Peoria, Illinois
| | - Karen S Fernandez
- Department of Pediatrics, University of Illinois College of Medicine at Peoria, Peoria, Illinois
| | - Cesar R Najera
- National Pediatric Oncology Unit, Guatemala City, Guatemala
| | - Jun J Yang
- Department of Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, Tennessee.,Hematologic Malignancies Program, Comprehensive Cancer Center, St. Jude Children's Research Hospital, Memphis, Tennessee
| | - Federico Antillon-Klussmann
- National Pediatric Oncology Unit, Guatemala City, Guatemala.,School of Medicine, Francisco Marroquin University, Guatemala City, Guatemala
| | - Philip J Lupo
- Section of Hematology-Oncology, Department of Pediatrics, Baylor College of Medicine, Houston, Texas
| |
|
373
|
Tsur E, Friger M, Menashe I. The Unique Evolutionary Signature of Genes Associated with Autism Spectrum Disorder. Behav Genet 2016; 46:754-762. [PMID: 27515661 DOI: 10.1007/s10519-016-9804-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 03/16/2016] [Accepted: 08/04/2016] [Indexed: 11/29/2022]
Abstract
Autism spectrum disorder (ASD) is a common heritable neurodevelopmental disorder, which is characterized by communication and social deficits that reduce the reproductive fitness of individuals with the disorder. Here, we studied the genomic characteristics of 651 ASD genes in a whole-exome sequencing dataset, to search for traces of the evolutionary forces that helped maintain ASD in the human population. We show that ASD genes are ~65 % longer and ~20 % less variable than non-ASD genes. The mutational shortage in ASD genes was particularly prominent when considering only deleterious genetic variations, which is a hallmark of negative selection. We further show that these genomic characteristics are unique to ASD genes, as compared with brain-specific genes or with genes of other diseases. Our findings suggest that ASD genes have evolved under complex evolutionary forces, which have left a unique signature that can be used to identify new candidate ASD genes.
Affiliation(s)
- Erez Tsur
- Department of Public Health, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beersheba, Israel.,Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev, Beersheba, Israel
| | - Michael Friger
- Department of Public Health, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beersheba, Israel
| | - Idan Menashe
- Department of Public Health, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beersheba, Israel.,Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev, Beersheba, Israel
| |
|
374
|
Xu D, Zhang M, Xie Y, Wang F, Chen M, Zhu KQ, Wei J. DTMiner: identification of potential disease targets through biomedical literature mining. Bioinformatics 2016; 32:3619-3626. [PMID: 27506226 PMCID: PMC5181534 DOI: 10.1093/bioinformatics/btw503] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 04/21/2016] [Revised: 06/07/2016] [Accepted: 07/19/2016] [Indexed: 11/12/2022] Open
Abstract
Motivation: Biomedical researchers often search through massive catalogues of literature to look for potential relationships between genes and diseases. Given the rapid growth of biomedical literature, automatic relation extraction, a crucial technology in biomedical literature mining, has shown great potential to support research of gene-related diseases. Existing work in this field has produced datasets that are limited both in scale and accuracy. Results: In this study, we propose a reliable and efficient framework that takes large biomedical literature repositories as inputs, identifies credible relationships between diseases and genes, and presents possible genes related to a given disease and possible diseases related to a given gene. The framework incorporates named entity recognition (NER), which identifies occurrences of genes and diseases in texts, association detection whereby we extract and evaluate features from gene–disease pairs, and ranking algorithms that estimate how closely the pairs are related. The F1-score of the NER phase is 0.87, which is higher than that of existing studies. The association detection phase takes drastically less time than previous work while maintaining a comparable F1-score of 0.86. The end-to-end result achieves a 0.259 F1-score for the top 50 genes associated with a disease, which is better than previous work. In addition, we released a web service for public use of the dataset. Availability and Implementation: The implementation of the proposed algorithms is publicly available at http://gdr-web.rwebox.com/public_html/index.php?page=download.php. The web service is available at http://gdr-web.rwebox.com/public_html/index.php. Contact: jenny.wei@astrazeneca.com or kzhu@cs.sjtu.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
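The F1-scores quoted in this abstract are harmonic means of precision and recall. The sketch below shows the standard formula and one plausible way to score a ranked gene list against a gold-standard set at a cutoff of 50. The function names are hypothetical, and the exact evaluation protocol used by the DTMiner authors is not described here, so the top-k variant is an assumption.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_at_k(ranked_genes: list, gold_genes: set, k: int = 50) -> float:
    """Score the first k distinct predictions against a gold-standard gene set."""
    # dict.fromkeys removes duplicates while preserving rank order.
    top_k = list(dict.fromkeys(ranked_genes))[:k]
    tp = sum(1 for gene in top_k if gene in gold_genes)
    fp = len(top_k) - tp
    fn = len(gold_genes) - tp
    return f1_score(tp, fp, fn)
```

With a fixed cutoff, recall is bounded by k divided by the size of the gold set, which is one reason end-to-end top-k scores tend to sit well below per-phase scores.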
Affiliation(s)
- Dong Xu
- Department of CSE, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Meizhuo Zhang
- R&D Information, Innovation Center China, AstraZeneca, Pudong, Shanghai 201203, China
| | - Yanping Xie
- Department of CSE, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Fan Wang
- Department of CSE, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ming Chen
- R&D Information, Innovation Center China, AstraZeneca, Pudong, Shanghai 201203, China
| | - Kenny Q Zhu
- Department of CSE, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Jia Wei
- R&D Information, Innovation Center China, AstraZeneca, Pudong, Shanghai 201203, China
| |
|
375
|
A gene browser of colorectal cancer with literature evidence and pre-computed regulatory information to identify key tumor suppressors and oncogenes. Sci Rep 2016; 6:30624. [PMID: 27477450 PMCID: PMC4967895 DOI: 10.1038/srep30624] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/02/2016] [Accepted: 07/06/2016] [Indexed: 02/07/2023] Open
Abstract
Colorectal cancer (CRC) is a cancer of growing incidence that is associated with a high mortality rate worldwide. There is a poor understanding of the heterogeneity of CRC with regard to causative genetic mutations and gene regulatory mechanisms. Previous studies have identified several susceptibility genes in small-scale experiments. However, the information has not been comprehensively and systematically compiled and interpreted. In this study, we constructed the gbCRC, the first literature-based gene resource for investigating CRC-related human genes. The features of our database include: (i) manual curation of experimentally-verified genes reported in the literature; (ii) comprehensive integration of five reliable data sources; and (iii) pre-computed regulatory patterns involving transcription factors, microRNAs and long non-coding RNAs. In total, 2067 genes associated with 2819 PubMed abstracts were compiled. Comprehensive functional annotations are associated with all the genes, including gene expression profiles, homologous genes in other model species, protein-protein interactions, somatic mutations, and potential methylation sites. These comprehensive annotations and this pre-computed regulatory information highlight the importance of the gbCRC with regard to the unexplored regulatory network of CRC. This information is available in a plain text format that is free to download.
|
376
|
Laprairie RB, Denovan-Wright EM, Wright JM. Subfunctionalization of peroxisome proliferator response elements accounts for retention of duplicated fabp1 genes in zebrafish. BMC Evol Biol 2016; 16:147. [PMID: 27421266 PMCID: PMC4947323 DOI: 10.1186/s12862-016-0717-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 02/05/2016] [Accepted: 06/30/2016] [Indexed: 01/01/2023] Open
Abstract
Background In the duplication-degeneration-complementation (DDC) model, a duplicated gene has three possible fates: it may lose functionality through the accumulation of mutations (nonfunctionalization), acquire a new function (neofunctionalization), or each duplicate gene may retain a subset of functions of the ancestral gene (subfunctionalization). The role that promoter evolution plays in retention of duplicated genes in eukaryotic genomes is not well understood. Fatty acid-binding proteins (Fabp) belong to a multigene family whose members are highly conserved in sequence and function, but differ in their gene regulation, suggesting selective pressure is exerted via regulatory elements in the promoter. Results In this study, we describe the PPAR regulation of zebrafish fabp1a, fabp1b.1, and fabp1b.2 promoters and compare them to the PPAR regulation of the spotted gar fabp1 promoter, representative of the ancestral fabp1 gene. Evolution of the fabp1 promoter was inferred by sequence analysis, and differential PPAR-agonist activation of fabp1 promoter activity in zebrafish liver and intestine explant cells, and in HEK293A cells transiently transfected with wild-type and mutated fabp1 promoter-reporter gene constructs. The promoter activity of spotted gar fabp1, representative of the ancestral fabp1, was induced by both PPARα- and PPARγ-specific agonists, but displayed a biphasic response to PPARα activation. Zebrafish fabp1a was PPARα-selective, fabp1b.1 was PPARγ-selective, and fabp1b.2 was not regulated by PPAR. Conclusions The zebrafish fabp1 promoters underwent two successive rounds of subfunctionalization with respect to PPAR regulation leading to retention of three zebrafish fabp1 genes with stimuli-specific regulation. Using a pharmacological approach, we demonstrated here the divergent regulation of the zebrafish fabp1a, fabp1b.1, and fabp1b.2 with regard to subfunctionalization of PPAR regulation following two rounds of gene duplication.
Electronic supplementary material The online version of this article (doi:10.1186/s12862-016-0717-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Robert B Laprairie
- Department of Pharmacology, Dalhousie University, 5850 College St, Halifax, NS, B3H 4R2, Canada
| | - Eileen M Denovan-Wright
- Department of Pharmacology, Dalhousie University, 5850 College St, Halifax, NS, B3H 4R2, Canada
| | - Jonathan M Wright
- Department of Biology, Dalhousie University, 31355 Oxford St, PO Box 15000, Halifax, NS, B3H 4R2, Canada.
| |
|
377
|
Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, McDermott MG, Monteiro CD, Gundersen GW, Ma'ayan A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res 2016; 44. [PMID: 27141961 PMCID: PMC4987924 DOI: 10.1093/nar/gkw377] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/05/2022] Open
Abstract
Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180 184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.
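As a rough illustration of what enrichment analysis computes, the sketch below scores the overlap between a query gene list and one library gene set with a hypergeometric tail probability (equivalent to a one-sided Fisher exact test). This is a generic over-representation test under stated assumptions; the abstract does not say which statistics Enrichr uses, and the function name and the choice of background size are hypothetical.

```python
from math import comb


def enrichment_p_value(query: set, library_set: set, background_size: int) -> float:
    """P(overlap >= observed) when len(query) genes are drawn at random
    from a background of background_size genes, of which len(library_set)
    belong to the library gene set. Both sets are assumed to be subsets
    of the background."""
    overlap = len(query & library_set)
    n = len(query)
    set_size = len(library_set)
    upper = min(n, set_size)
    total = comb(background_size, n)
    # Sum the hypergeometric probabilities for every overlap at least as large
    # as the observed one; comb() returns 0 for impossible draws.
    tail = sum(
        comb(set_size, i) * comb(background_size - set_size, n - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

When a query is tested against many gene sets, as with the 102 libraries mentioned in the abstract, the resulting p-values would normally need a multiple-testing correction before interpretation.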
Affiliation(s)
- Maxim V. Kuleshov
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Matthew R. Jones
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Andrew D. Rouillard
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Nicolas F. Fernandez
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Qiaonan Duan
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Zichen Wang
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Simon Koplev
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Sherry L. Jenkins
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Kathleen M. Jagodnik
- Fluid Physics and Transport Processes Branch, NASA Glenn Research Center, 21000 Brookpark Rd., Cleveland, OH 44135, USA
| | - Alexander Lachmann
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Michael G. McDermott
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Caroline D. Monteiro
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Gregory W. Gundersen
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA. To whom correspondence should be addressed. Tel: +1 212 241 1153; Fax: +1 212 996 7214
| |
|
379
|
Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, Ma'ayan A. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database (Oxford) 2016; 2016:baw100. [PMID: 27374120 PMCID: PMC4930834 DOI: 10.1093/database/baw100] [Citation(s) in RCA: 874] [Impact Index Per Article: 109.3] [Received: 02/13/2016] [Revised: 05/15/2016] [Accepted: 05/31/2016] [Indexed: 12/18/2022]
Abstract
Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ∼72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene-gene and attribute-attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. 
The Harmonizome is a comprehensive resource of knowledge about genes and proteins, and as such, it enables researchers to discover novel relationships between biological entities, as well as form novel data-driven hypotheses for experimental validation. Database URL: http://amp.pharm.mssm.edu/Harmonizome.
Affiliation(s)
- Andrew D Rouillard
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Gregory W Gundersen
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Nicolas F Fernandez
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Zichen Wang
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Caroline D Monteiro
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael G McDermott
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
|
380
|
Körber I, Katayama S, Einarsdottir E, Krjutškov K, Hakala P, Kere J, Lehesjoki AE, Joensuu T. Gene-Expression Profiling Suggests Impaired Signaling via the Interferon Pathway in Cstb-/- Microglia. PLoS One 2016; 11:e0158195. [PMID: 27355630 PMCID: PMC4927094 DOI: 10.1371/journal.pone.0158195] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 02/18/2016] [Accepted: 06/13/2016] [Indexed: 01/26/2023] Open
Abstract
Progressive myoclonus epilepsy of Unverricht-Lundborg type (EPM1, OMIM254800) is an autosomal recessive neurodegenerative disorder characterized by stimulus-sensitive and action-activated myoclonus, tonic-clonic epileptic seizures, and ataxia. Loss-of-function mutations in the gene encoding the cysteine protease inhibitor cystatin B (CSTB) underlie EPM1. The deficiency of CSTB in mice (Cstb-/- mice) generates a phenotype resembling the symptoms of EPM1 patients and is accompanied by microglial activation at two weeks of age and an upregulation of immune system-associated genes in the cerebellum at one month of age. To shed light on molecular pathways and processes linked to CSTB deficiency in microglia we characterized the transcriptome of cultured Cstb-/- mouse microglia using microarray hybridization and RNA sequencing (RNA-seq). The gene expression profiles obtained with these two techniques were in good accordance and not polarized to either pro- or anti-inflammatory status. In Cstb-/- microglia, altogether 184 genes were differentially expressed. Of these, 33 genes were identified by both methods. Several interferon-regulated genes were more weakly expressed in Cstb-/- microglia compared to control. This was confirmed by quantitative real-time PCR of the transcripts Irf7 and Stat1. Subsequently, we explored the biological context of CSTB deficiency in microglia more deeply by functional enrichment and canonical pathway analysis. This uncovered a potential role for CSTB in chemotaxis, antigen-presentation, and in immune- and defense response-associated processes by altering JAK-STAT pathway signaling. These data support and expand the previously suggested involvement of inflammatory processes in the disease pathogenesis of EPM1 and connect CSTB deficiency in microglia to altered expression of interferon-regulated genes.
Affiliation(s)
- Inken Körber
- Folkhälsan Institute of Genetics, Helsinki, Finland
- Research Program’s Unit, Molecular Neurology, University of Helsinki, Helsinki, Finland
- Neuroscience Center, University of Helsinki, Helsinki, Finland
| | - Shintaro Katayama
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
| | - Elisabet Einarsdottir
- Folkhälsan Institute of Genetics, Helsinki, Finland
- Research Program’s Unit, Molecular Neurology, University of Helsinki, Helsinki, Finland
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
| | - Kaarel Krjutškov
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
- Competence Centre on Health Technologies, Tartu, Estonia
| | - Paula Hakala
- Folkhälsan Institute of Genetics, Helsinki, Finland
- Research Program’s Unit, Molecular Neurology, University of Helsinki, Helsinki, Finland
- Neuroscience Center, University of Helsinki, Helsinki, Finland
| | - Juha Kere
- Folkhälsan Institute of Genetics, Helsinki, Finland
- Research Program’s Unit, Molecular Neurology, University of Helsinki, Helsinki, Finland
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
| | - Anna-Elina Lehesjoki
- Folkhälsan Institute of Genetics, Helsinki, Finland
- Research Program’s Unit, Molecular Neurology, University of Helsinki, Helsinki, Finland
- Neuroscience Center, University of Helsinki, Helsinki, Finland
| | - Tarja Joensuu
- Folkhälsan Institute of Genetics, Helsinki, Finland
- Research Program’s Unit, Molecular Neurology, University of Helsinki, Helsinki, Finland
- Neuroscience Center, University of Helsinki, Helsinki, Finland
| |
|
381
|
QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks. PLoS Comput Biol 2016; 12:e1004809. [PMID: 27336171 PMCID: PMC4919057 DOI: 10.1371/journal.pcbi.1004809] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 02/05/2016] [Accepted: 05/12/2016] [Indexed: 01/30/2023] Open
Abstract
Recent studies of the human genome have indicated that regulatory elements (e.g., promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network-based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. AVAILABILITY: QuIN’s web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and a MySQL database, and the source code is available on GitHub under the GPLv3 license: https://github.com/UcarLab/QuIN/.
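The network operations listed in this abstract (building a network, querying by chromosome location, ranking by a network measure) can be illustrated with a small sketch. This is not QuIN's code, which is written in Java and JavaScript; it is a minimal Python illustration in which the anchor coordinates, the function names, and the choice of degree as the ranking measure are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical interaction pairs: each anchor is (chromosome, start, end).
INTERACTIONS = [
    (("chr1", 1000, 2000), ("chr1", 50000, 51000)),
    (("chr1", 50000, 51000), ("chr1", 90000, 91000)),
    (("chr2", 300, 900), ("chr2", 7000, 8000)),
]


def build_network(interactions):
    """Return an undirected adjacency map: anchor -> set of interacting anchors."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def query_by_location(graph, chrom, start, end):
    """Return anchors overlapping the half-open region [start, end) on chrom."""
    return sorted(
        node for node in graph
        if node[0] == chrom and node[1] < end and start < node[2]
    )


def degree_ranking(graph):
    """Rank anchors by degree (number of direct partners), highest first."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))


network = build_network(INTERACTIONS)
hits = query_by_location(network, "chr1", 49000, 50500)
top = degree_ranking(network)[0]
```

Degree is only the simplest possible measure; the abstract does not say which network measures QuIN itself uses.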
|
382
|
Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, Stein TI, Nudel R, Lieder I, Mazor Y, Kaplan S, Dahary D, Warshawsky D, Guan-Golan Y, Kohn A, Rappaport N, Safran M, Lancet D. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Curr Protoc Bioinformatics 2016; 54:1.30.1-1.30.33. [PMID: 27322403 DOI: 10.1002/cpbi.5] [Citation(s) in RCA: 2104] [Impact Index Per Article: 263.0] [Indexed: 02/06/2023]
Abstract
GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Gil Stelzer
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.,These authors contributed equally to the paper
| | - Naomi Rosen
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.,These authors contributed equally to the paper
| | - Inbar Plaschkes
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.,LifeMap Sciences Ltd, Tel Aviv, Israel
| | - Shahar Zimmerman
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Michal Twik
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Simon Fishilevich
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Tsippi Iny Stein
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Ron Nudel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Dvir Dahary
- LifeMap Sciences Ltd, Tel Aviv, Israel.,Toldot Genetics Ltd, Hod Hasharon, Israel
| | - Asher Kohn
- LifeMap Sciences Inc, Marshfield, Massachusetts
| | - Noa Rappaport
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Marilyn Safran
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Doron Lancet
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.,Corresponding author
| |
|
383
|
Zhao M, Chen L, Liu Y, Qu H. GCGene: a gene resource for gastric cancer with literature evidence. Oncotarget 2016; 7:33983-93. [PMID: 27127885 PMCID: PMC5085132 DOI: 10.18632/oncotarget.9030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/23/2015] [Accepted: 04/16/2016] [Indexed: 12/31/2022] Open
Abstract
Gastric cancer (GC) is the fifth most common cancer and third leading cause of cancer-related deaths worldwide. Its lethality primarily stems from a lack of detection strategies for early stages of GC and a lack of noninvasive detection strategies for advanced stages. The development of early diagnostic biomarkers largely depends on understanding the biological pathways and regulatory mechanisms associated with putative GC genes. Unfortunately, the GC-implicated genes that have been identified thus far are scattered among thousands of published studies, and no systematic summary is available, which hinders the development of a large-scale genetic screen. To provide a publicly accessible resource to meet this need, we constructed a literature-based database, GCGene (Gastric Cancer Gene database), with comprehensive annotations supported by a user-friendly website. In the current release, we have collected 1,815 unique human genes, including 1,678 protein-coding and 137 non-coding genes, curated from extensive examination of 3,142 PubMed abstracts. The resulting database has a convenient web-based interface to facilitate both textual and sequence-based searches. All curated genes in GCGene are downloadable for advanced bioinformatics data mining. Gene prioritization was performed to rank the relative relevance of these genes in GC development. The 100 top-ranked genes are highly mutated according to the cohort of published studies we reviewed. By conducting a network analysis of these top-ranked GC-associated genes in the human interactome, we were able to identify strong links between 8 highly connected genes with low expression and patient survival time. GCGene is freely available to academic users at http://gcgene.bioinfo-minzhao.org/.
Affiliation(s)
- Min Zhao
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of The Sunshine Coast, Maroochydore DC, Queensland, Australia
| | - Luming Chen
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Peking University, Beijing, P.R. China
| | - Yining Liu
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of The Sunshine Coast, Maroochydore DC, Queensland, Australia
| | - Hong Qu
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Peking University, Beijing, P.R. China
| |
|
384
|
Xi D, Zhao J, Lai W, Guo Z. Systematic analysis of the molecular mechanism underlying atherosclerosis using a text mining approach. Hum Genomics 2016; 10:14. [PMID: 27251057 PMCID: PMC4890502 DOI: 10.1186/s40246-016-0075-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 02/10/2016] [Accepted: 04/25/2016] [Indexed: 12/24/2022] Open
Abstract
Background: Atherosclerosis is one of the common health threats all over the world. It is a complex heritable disease that affects arterial blood vessels. Chronic inflammatory response plays an important role in atherogenesis. There has been little success in fully identifying functionally important genes in the pathogenesis of atherosclerosis. Results: In the present study, we performed a systematic analysis of atherosclerosis-related genes using text mining. We identified a total of 1312 genes. Gene ontology (GO) analysis revealed that a total of 35 terms exhibited significance (p < 0.05) as overrepresented terms, indicating that atherosclerosis invokes many genes with a wide range of different functions. Pathway analysis demonstrated that the most highly enriched pathway is the Toll-like receptor signaling pathway. Finally, through gene network analysis, we prioritized 48 genes using the hub gene method. Conclusions: Our study provides a valuable resource for the in-depth understanding of the mechanism underlying atherosclerosis. Electronic supplementary material: The online version of this article (doi:10.1186/s40246-016-0075-1) contains supplementary material, which is available to authorized users.
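The abstract does not define its "hub gene method". A common reading is that hubs are the genes with the most connections in an interaction network, and the sketch below illustrates that reading only. The edge list uses placeholder gene names and the function name is hypothetical, so none of it reflects the authors' data or code.

```python
from collections import Counter

# Hypothetical undirected gene-gene interactions (placeholder gene names).
EDGES = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G4", "G5"),
]


def hub_genes(edges, top_n):
    """Return the top_n (gene, degree) pairs; ties are broken alphabetically."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


hubs = hub_genes(EDGES, top_n=2)
```

Other centrality measures (betweenness, for example) could replace degree without changing the overall shape of the procedure.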
Affiliation(s)
- Dan Xi
- Division of Cardiology, Huiqiao Medical Center, Nanfang Hospital, Southern Medical University, 1838 North Guangzhou Avenue, Guangzhou, 510515, Guangdong, People's Republic of China
| | - Jinzhen Zhao
- Division of Cardiology, Huiqiao Medical Center, Nanfang Hospital, Southern Medical University, 1838 North Guangzhou Avenue, Guangzhou, 510515, Guangdong, People's Republic of China
| | - Wenyan Lai
- Laboratory of Department of Cardiology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, Guangdong, People's Republic of China.
| | - Zhigang Guo
- Division of Cardiology, Huiqiao Medical Center, Nanfang Hospital, Southern Medical University, 1838 North Guangzhou Avenue, Guangzhou, 510515, Guangdong, People's Republic of China.
| |
|
385
|
Ambrosino L, Bostan H, Ruggieri V, Chiusano ML. Bioinformatics resources for pollen. Plant Reprod 2016; 29:133-147. [PMID: 27271281 DOI: 10.1007/s00497-016-0284-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/10/2016] [Accepted: 05/19/2016] [Indexed: 06/06/2023]
Abstract
Bioinformatics for Pollen. Pollen plays a key role in crop production, and its development is the most delicate phase in reproduction. Different metabolic pathways are involved in pollen development, and changes in the level of some metabolites, as well as responses to stress, are correlated with a reduction in pollen viability and consequently with a decrease in fruit production. However, studies on pollen can be difficult because gamete development and fertilization are complex processes that occur during a short window of time. The rise of the so-called -omics sciences provided key strategies to promote molecular research in pollen tissues, starting from model organisms and moving to an increasing number of species. An integrated multi-level approach based on investigations from genomics, transcriptomics, proteomics and metabolomics now appears feasible to clarify key molecular processes in pollen development and viability. To this end, bioinformatics has a fundamental role in data production and analysis, contributing varied and ad hoc methodologies, endowed with different sensitivity and specificity, necessary for extracting added-value information from the large amount of molecular data achievable. Bioinformatics is also essential for data management, organization, distribution and integration in suitable resources. This is necessary to capture the biological features of the pollen tissues and to design effective approaches to identifying structural or functional properties, enabling the modeling of the major processes involved in normal or stress conditions. In this review, we provide an overview of the available bioinformatics resources for pollen, ranging from raw data collections to complete databases or platforms, when available, which include data and/or results from -omics efforts on the male gametophyte. Perspectives in the field will also be described.
Affiliation(s)
- Luca Ambrosino
- Department of Agricultural Sciences, University of Naples "Federico II", via Università 100, Portici (NA), 80055, Italy
| | - Hamed Bostan
- Department of Agricultural Sciences, University of Naples "Federico II", via Università 100, Portici (NA), 80055, Italy
| | - Valentino Ruggieri
- Department of Agricultural Sciences, University of Naples "Federico II", via Università 100, Portici (NA), 80055, Italy
| | - Maria Luisa Chiusano
- Department of Agricultural Sciences, University of Naples "Federico II", via Università 100, Portici (NA), 80055, Italy.
| |
|
386
|
Generating Gene Ontology-Disease Inferences to Explore Mechanisms of Human Disease at the Comparative Toxicogenomics Database. PLoS One 2016; 11:e0155530. [PMID: 27171405 PMCID: PMC4865041 DOI: 10.1371/journal.pone.0155530] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 01/07/2016] [Accepted: 04/29/2016] [Indexed: 12/20/2022] Open
Abstract
Strategies for discovering common molecular events among disparate diseases hold promise for improving understanding of disease etiology and expanding treatment options. One technique is to leverage curated datasets found in the public domain. The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) manually curates chemical-gene, chemical-disease, and gene-disease interactions from the scientific literature. The use of official gene symbols in CTD interactions enables this information to be combined with the Gene Ontology (GO) file from NCBI Gene. By integrating these GO-gene annotations with CTD’s gene-disease dataset, we produce 753,000 inferences between 15,700 GO terms and 4,200 diseases, providing opportunities to explore presumptive molecular underpinnings of diseases and identify biological similarities. Through a variety of applications, we demonstrate the utility of this novel resource. As a proof-of-concept, we first analyze known repositioned drugs (e.g., raloxifene and sildenafil) and see that their target diseases have a greater degree of similarity when comparing GO terms vs. genes. Next, a computational analysis predicts seemingly non-intuitive diseases (e.g., stomach ulcers and atherosclerosis) as being similar to bipolar disorder, and these are validated in the literature as reported co-diseases. Additionally, we leverage other CTD content to develop testable hypotheses about thalidomide-gene networks to treat seemingly disparate diseases. Finally, we illustrate how CTD tools can rank a series of drugs as potential candidates for repositioning against B-cell chronic lymphocytic leukemia and predict cisplatin and the small molecule inhibitor JQ1 as lead compounds. The CTD dataset is freely available for users to navigate pathologies within the context of extensive biological processes, molecular functions, and cellular components conferred by GO. 
This inference set should aid researchers, bioinformaticists, and pharmaceutical drug makers in finding commonalities in disease mechanisms, which in turn could help identify new therapeutics, new indications for existing pharmaceuticals, potential disease comorbidities, and alerts for side effects.
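The integration step described in this abstract, combining GO-gene annotations with gene-disease associations through shared gene symbols, amounts to a join on the gene column. The sketch below shows that idea on toy data. The identifiers, the function name, and the two-column input layout are assumptions for illustration and do not reproduce CTD's file formats or its pipeline.

```python
from collections import defaultdict

# Hypothetical toy inputs; real GO and CTD files carry many more columns.
GO_GENE = [
    ("GO_TERM_1", "GENE_A"),
    ("GO_TERM_1", "GENE_B"),
    ("GO_TERM_2", "GENE_B"),
]
GENE_DISEASE = [
    ("GENE_A", "Disease X"),
    ("GENE_B", "Disease X"),
    ("GENE_B", "Disease Y"),
]


def infer_go_disease(go_gene, gene_disease):
    """Map (GO term, disease) -> set of shared genes supporting the inference."""
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    inferences = defaultdict(set)
    for go_term, gene in go_gene:
        # Genes without any disease association contribute nothing.
        for disease in diseases_by_gene.get(gene, ()):
            inferences[(go_term, disease)].add(gene)
    return dict(inferences)


inferences = infer_go_disease(GO_GENE, GENE_DISEASE)
```

Keeping the supporting gene set for each inference, instead of only the pair, makes it possible to see how many genes back a given GO-disease link.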
|
387
|
Pelletier D, Wiegers TC, Enayetallah A, Kibbey C, Gosink M, Koza-Taylor P, Mattingly CJ, Lawton M. ToxEvaluator: an integrated computational platform to aid the interpretation of toxicology study-related findings. Database (Oxford) 2016; 2016:baw062. [PMID: 27161010 PMCID: PMC4860628 DOI: 10.1093/database/baw062] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/22/2015] [Accepted: 04/03/2016] [Indexed: 12/27/2022]
Abstract
Attempts are frequently made to investigate adverse findings from preclinical toxicology studies in order to better understand underlying toxicity mechanisms. These efforts often begin with limited information, including a description of the adverse finding, knowledge of the structure of the chemical associated with its cause and the intended pharmacological target. ToxEvaluator was developed jointly by Pfizer and the Comparative Toxicogenomics Database (http://ctdbase.org) team at North Carolina State University as an in silico platform to facilitate interpretation of toxicity findings in light of prior knowledge. Through the integration of a diverse set of in silico tools that leverage a number of public and proprietary databases, ToxEvaluator streamlines the process of aggregating and interrogating diverse sources of information. The user enters compound and target identifiers, and selects adverse event descriptors from a safety lexicon and mapped MeSH disease terms. ToxEvaluator provides a summary report with multiple distinct areas organized according to what target or structural aspects have been linked to the adverse finding, including primary pharmacology, structurally similar proprietary compounds, structurally similar public domain compounds, predicted secondary (i.e. off-target) pharmacology and known secondary pharmacology. Similar proprietary compounds and their associated in vivo toxicity findings are reported, along with a link to relevant supporting documents. For similar public domain compounds and interacting targets, ToxEvaluator integrates relationships curated in Comparative Toxicogenomics Database, returning all direct and inferred linkages between them. As an example of its utility, we demonstrate how ToxEvaluator rapidly identified direct (primary pharmacology) and indirect (secondary pharmacology) linkages between cerivastatin and myopathy.
Affiliation(s)
- D Pelletier
- Pfizer Worldwide Research & Development, Groton, CT 06340
| | - T C Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695
| | - C Kibbey
- Pfizer Worldwide Research & Development, Groton, CT 06340
| | - M Gosink
- Pfizer Worldwide Research & Development, Groton, CT 06340
| | - P Koza-Taylor
- Pfizer Worldwide Research & Development, Groton, CT 06340
| | - C J Mattingly
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695
| | - M Lawton
- Pfizer Worldwide Research & Development, Groton, CT 06340
| |
|
388
|
Lee J, Hong WY, Cho M, Sim M, Lee D, Ko Y, Kim J. Synteny Portal: a web-based application portal for synteny block analysis. Nucleic Acids Res 2016; 44:W35-40. [PMID: 27154270 PMCID: PMC4987893 DOI: 10.1093/nar/gkw310] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 02/03/2016] [Accepted: 04/12/2016] [Indexed: 11/12/2022] Open
Abstract
Recent advances in next-generation sequencing technologies and genome assembly algorithms have enabled the accumulation of a huge volume of genome sequences from various species. This has provided new opportunities for large-scale comparative genomics studies. Identifying and utilizing synteny blocks, which are genomic regions conserved among multiple species, is key to understanding genomic architecture and the evolutionary history of genomes. However, the construction and visualization of such synteny blocks from multiple species are very challenging, especially for biologists who lack computational skills. Here, we present Synteny Portal, a versatile web-based application portal for constructing, visualizing and browsing synteny blocks. With Synteny Portal, users can easily (i) construct synteny blocks among multiple species by using prebuilt alignments in the UCSC genome browser database, (ii) visualize and download syntenic relationships as high-quality images, (iii) browse synteny blocks with genetic information and (iv) download the details of synteny blocks to be used as input for downstream synteny-based analyses, all in an intuitive and easy-to-use web-based interface. We believe that Synteny Portal will serve as a highly valuable tool that will enable biologists to easily perform comparative genomics studies by compensating for the limitations of existing tools. Synteny Portal is freely available at http://bioinfo.konkuk.ac.kr/synteny_portal.
Affiliation(s)
- Jongin Lee
- Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
| | - Woon-Young Hong
- Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
| | - Minah Cho
- Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
| | - Mikang Sim
- Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
| | - Daehwan Lee
- Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
| | - Younhee Ko
- Department of Clinical Genetics, Department of Pediatrics, Yonsei University College of Medicine, Seoul 03722, South Korea
| | - Jaebum Kim
- Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
| |
|
389
|
Xin J, Mark A, Afrasiabi C, Tsueng G, Juchler M, Gopal N, Stupp GS, Putman TE, Ainscough BJ, Griffith OL, Torkamani A, Whetzel PL, Mungall CJ, Mooney SD, Su AI, Wu C. High-performance web services for querying gene and variant annotation. Genome Biol 2016; 17:91. [PMID: 27154141 PMCID: PMC4858870 DOI: 10.1186/s13059-016-0953-9] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Received: 01/07/2016] [Accepted: 04/14/2016] [Indexed: 01/18/2023] Open
Abstract
Efficient tools for data management and integration are essential for many aspects of high-throughput biology. In particular, annotations of genes and human genetic variants are commonly used but highly fragmented across many resources. Here, we describe MyGene.info and MyVariant.info, high-performance web services for querying gene and variant annotation information. These web services are currently accessed more than three million times per month. They also demonstrate a generalizable cloud-based model for organizing and querying biological annotation information. MyGene.info and MyVariant.info are provided as high-performance web services, accessible at http://mygene.info and http://myvariant.info. Both are offered free of charge to the research community.
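To make the idea of querying gene annotation over a web service concrete, the sketch below only builds a query URL and sends no request. The base URL comes from the abstract. The `/v3/query` path and the `q`, `species` and `fields` parameters are assumptions drawn from general knowledge of the service and are not stated here, so they should be checked against the service's current documentation before use.

```python
from urllib.parse import urlencode

BASE_URL = "http://mygene.info"  # given in the abstract


def gene_query_url(term, species="human", fields="symbol,name,entrezgene"):
    """Build a gene query URL; the path and parameter names are assumptions."""
    params = {"q": term, "species": species, "fields": fields}
    return f"{BASE_URL}/v3/query?{urlencode(params)}"


url = gene_query_url("symbol:CDK2")
```

The resulting string can be passed to any HTTP client. Requesting only the fields that are needed keeps responses small.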
Affiliation(s)
- Jiwen Xin
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Adam Mark
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA.,Current address: Avera Cancer Institute, 11099 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Cyrus Afrasiabi
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Ginger Tsueng
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Moritz Juchler
- Department of Biomedical Informatics and Medical Education, The University of Washington, Box SLU-BIME 358047, Seattle, WA, 98195, USA
| | - Nikhil Gopal
- Department of Biomedical Informatics and Medical Education, The University of Washington, Box SLU-BIME 358047, Seattle, WA, 98195, USA
| | - Gregory S Stupp
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Timothy E Putman
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Benjamin J Ainscough
- McDonnell Genome Institute, Washington University School of Medicine, 4444 Forest Park Ave, St. Louis, MO, 63108, USA
| | - Obi L Griffith
- McDonnell Genome Institute, Washington University School of Medicine, 4444 Forest Park Ave, St. Louis, MO, 63108, USA
| | - Ali Torkamani
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA.,The Scripps Translational Science Institute, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Patricia L Whetzel
- Center for Research in Biological Systems, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
| | - Sean D Mooney
- Department of Biomedical Informatics and Medical Education, The University of Washington, Box SLU-BIME 358047, Seattle, WA, 98195, USA
| | - Andrew I Su
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA.,The Scripps Translational Science Institute, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA.
| | - Chunlei Wu
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA.
| |
|
390
|
Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, McDermott MG, Monteiro CD, Gundersen GW, Ma'ayan A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res 2016; 44:W90-7. [PMID: 27141961 PMCID: PMC4987924 DOI: 10.1093/nar/gkw377] [Citation(s) in RCA: 5610] [Impact Index Per Article: 701.3] [Received: 01/29/2016] [Accepted: 04/25/2016] [Indexed: 12/11/2022] Open
Abstract
Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180,184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr, including the ability to submit fuzzy sets and upload BED files, an improved application programming interface, and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.
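For readers unfamiliar with enrichment analysis, the core question is whether a query gene list overlaps an annotated gene set more than chance would predict. The sketch below computes a generic hypergeometric over-representation p-value. It is offered as an illustration of the general technique, not as Enrichr's scoring method, which the abstract does not describe; the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a universe in which set_size genes belong to the annotated set."""
    upper = min(set_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000-gene universe, 100-gene annotated set,
# 50-gene query list, 5 genes shared between them.
p_value = hypergeom_enrichment_p(20000, 100, 50, 5)
```

In practice such a p-value is computed for every gene set in a library and then corrected for multiple testing.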
Affiliation(s)
- Maxim V Kuleshov
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Matthew R Jones
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Andrew D Rouillard
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Nicolas F Fernandez
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Qiaonan Duan
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Zichen Wang
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Simon Koplev
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Sherry L Jenkins
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Kathleen M Jagodnik
- Fluid Physics and Transport Processes Branch, NASA Glenn Research Center, 21000 Brookpark Rd., Cleveland, OH 44135, USA
| | - Alexander Lachmann
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Michael G McDermott
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Caroline D Monteiro
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Gregory W Gundersen
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| | - Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
| |
|
391
|
Laprairie RB, Denovan-Wright EM, Wright JM. Divergent evolution of cis-acting peroxisome proliferator-activated receptor elements that differentially control the tandemly duplicated fatty acid-binding protein genes, fabp1b.1 and fabp1b.2, in zebrafish. Genome 2016; 59:403-12. [PMID: 27228313 DOI: 10.1139/gen-2016-0033] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
Gene duplication is thought to facilitate increasing complexity in the evolution of life. The fate of most duplicated genes is nonfunctionalization: functional decay resulting from the accumulation of mutations. According to the duplication-degeneration-complementation (DDC) model, duplicated genes are retained by subfunctionalization, where the functions of the ancestral gene are sub-divided between duplicate genes, or by neofunctionalization, where one of the duplicates acquires a new function. Here, we report the differential regulation of the zebrafish tandemly duplicated fatty acid-binding protein genes, fabp1b.1 and fabp1b.2, by peroxisome proliferator-activated receptors (PPAR). fabp1b.1 mRNA levels were induced in tissue explants of liver, but not intestine, by PPAR agonists. fabp1b.1 promoter activity was induced to a greater extent by rosiglitazone (PPARγ-selective agonist) compared to WY 14,643 (PPARα-selective agonist) in HEK293A cells. Mutation of a peroxisome proliferator response element (PPRE) at -1232 bp in the fabp1b.1 promoter reduced PPAR-dependent activation. fabp1b.2 promoter activity was not affected by PPAR agonists. Differential regulation of the duplicated fabp1b promoters may be the result of PPRE loss in fabp1b.2 during a meiotic crossing-over event. Retention of PPAR inducibility in fabp1b.1 and not fabp1b.2 suggests unique regulation and function of the fabp1b duplicates.
|
392
|
Reimand J, Arak T, Adler P, Kolberg L, Reisberg S, Peterson H, Vilo J. g:Profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res 2016; 44:W83-9. [PMID: 27098042 PMCID: PMC4987867 DOI: 10.1093/nar/gkw199] [Citation(s) in RCA: 884] [Impact Index Per Article: 110.5] [Received: 02/01/2016] [Accepted: 03/13/2016] [Indexed: 12/13/2022] Open
Abstract
Functional enrichment analysis is a key step in interpreting gene lists discovered in diverse high-throughput experiments. g:Profiler studies flat and ranked gene lists and finds statistically significant Gene Ontology terms, pathways and other gene function related terms. Translation of hundreds of gene identifiers is another core feature of g:Profiler. Since its first publication in 2007, our web server has become a popular tool of choice among basic and translational researchers. Timeliness is a major advantage of g:Profiler as genome and pathway information is synchronized with the Ensembl database in quarterly updates. g:Profiler supports 213 species including mammals and other vertebrates, plants, insects and fungi. The 2016 update of g:Profiler introduces several novel features. We have added further functional datasets to interpret gene lists, including transcription factor binding site predictions, Mendelian disease annotations, information about protein expression and complexes and gene mappings of human genetic polymorphisms. Besides the interactive web interface, g:Profiler can be accessed in computational pipelines using our R package, Python interface and BioJS component. g:Profiler is freely available at http://biit.cs.ut.ee/gprofiler/.
Affiliation(s)
- Jüri Reimand
- Ontario Institute for Cancer Research, 661 University Avenue, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street, Toronto, ON M5G 1L7, Canada
| | - Tambet Arak
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Priit Adler
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Liis Kolberg
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Sulev Reisberg
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Hedi Peterson
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Jaak Vilo
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| |
|
393
|
Zhao M, Rotgans B, Wang T, Cummins SF. REGene: a literature-based knowledgebase of animal regeneration that bridge tissue regeneration and cancer. Sci Rep 2016; 6:23167. [PMID: 26975833 PMCID: PMC4791596 DOI: 10.1038/srep23167] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/29/2015] [Accepted: 02/18/2016] [Indexed: 12/13/2022] Open
Abstract
Regeneration is a common phenomenon across multiple animal phyla. Regeneration-related genes (REGs) are critical for fundamental cellular processes such as proliferation and differentiation. Identification of REGs and elucidating their functions may help to further develop effective treatment strategies in regenerative medicine. So far, REGs have been largely identified by small-scale experimental studies and a comprehensive characterization of the diverse biological processes regulated by REGs is lacking. Therefore, there is an ever-growing need to integrate REGs at the genomics, epigenetics, and transcriptome level to provide a reference list of REGs for regeneration and regenerative medicine research. Towards achieving this, we developed the first literature-based database called REGene (REgeneration Gene database). In the current release, REGene contains 948 human (929 protein-coding and 19 non-coding genes) and 8445 homologous genes curated from gene ontology and extensive literature examination. Additionally, the REGene database provides detailed annotations for each REG, including: gene expression, methylation sites, upstream transcription factors, and protein-protein interactions. An analysis of the collected REGs reveals strong links to a variety of cancers in terms of genetic mutation, protein domains, and cellular pathways. We have prepared a web interface to share these regeneration genes, supported by refined browsing and searching functions at http://REGene.bioinfo-minzhao.org/.
Affiliation(s)
- Min Zhao
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland, 4558, Australia
| | - Bronwyn Rotgans
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland, 4558, Australia
| | - Tianfang Wang
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland, 4558, Australia
| | - S F Cummins
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland, 4558, Australia
| |
|
394
|
Shirley MD, Frelin L, López JS, Jedlicka A, Dziedzic A, Frank-Crawford MA, Silverman W, Hagopian L, Pevsner J. Copy Number Variants Associated with 14 Cases of Self-Injurious Behavior. PLoS One 2016; 11:e0149646. [PMID: 26933844 PMCID: PMC4774994 DOI: 10.1371/journal.pone.0149646] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/04/2015] [Accepted: 02/03/2016] [Indexed: 11/18/2022] Open
Abstract
Copy number variants (CNVs) were detected and analyzed in 14 probands with autism and intellectual disability with self-injurious behavior (SIB) resulting in tissue damage. For each proband we obtained a clinical history and detailed behavioral descriptions. Genetic anomalies were observed in all probands, and likely clinical significance could be established in four cases. This included two cases having novel, de novo copy number variants and two cases having variants likely to have functional significance. These cases included segmental trisomy 14, segmental monosomy 21, and variants predicted to disrupt the function of ZEB2 (encoding a transcription factor) and HTR2C (encoding a serotonin receptor). Our results identify variants in regions previously implicated in intellectual disability and suggest candidate genes that could contribute to the etiology of SIB.
Affiliation(s)
- Matthew D. Shirley
- Program in Biochemistry, Cellular and Molecular Biology, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| | - Laurence Frelin
- Department of Neurology, Hugo W. Moser Research Institute at Kennedy Krieger, Baltimore, Maryland, United States of America
| | - José Soria López
- Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| | - Anne Jedlicka
- Genomic Analysis and Sequencing Core, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America
| | - Amanda Dziedzic
- Genomic Analysis and Sequencing Core, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America
| | - Michelle A. Frank-Crawford
- Department of Behavioral Psychology, Kennedy Krieger Institute, Baltimore, Maryland, United States of America
| | - Wayne Silverman
- Department of Behavioral Psychology, Kennedy Krieger Institute, Baltimore, Maryland, United States of America
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| | - Louis Hagopian
- Department of Behavioral Psychology, Kennedy Krieger Institute, Baltimore, Maryland, United States of America
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| | - Jonathan Pevsner
- Program in Biochemistry, Cellular and Molecular Biology, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Department of Neurology, Hugo W. Moser Research Institute at Kennedy Krieger, Baltimore, Maryland, United States of America
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| |
|
395
|
Whole-exome sequencing in obsessive-compulsive disorder identifies rare mutations in immunological and neurodevelopmental pathways. Transl Psychiatry 2016; 6:e764. [PMID: 27023170 PMCID: PMC4872454 DOI: 10.1038/tp.2016.30] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 02/19/2015] [Revised: 01/13/2016] [Accepted: 01/24/2016] [Indexed: 12/31/2022] Open
Abstract
Studies of rare genetic variation have identified molecular pathways conferring risk for developmental neuropsychiatric disorders. To date, no published whole-exome sequencing studies have been reported in obsessive-compulsive disorder (OCD). We sequenced all the genome coding regions in 20 sporadic OCD cases and their unaffected parents to identify rare de novo (DN) single-nucleotide variants (SNVs). The primary aim of this pilot study was to determine whether DN variation contributes to OCD risk. To this aim, we evaluated whether there is an elevated rate of DN mutations in OCD, which would justify this approach toward gene discovery in larger studies of the disorder. Furthermore, to explore functional molecular correlations among genes with nonsynonymous DN SNVs in OCD probands, a protein-protein interaction (PPI) network was generated based on databases of direct molecular interactions. We applied Degree-Aware Disease Gene Prioritization (DADA) to rank the PPI network genes based on their relatedness to a set of OCD candidate genes from two OCD genome-wide association studies (Stewart et al., 2013; Mattheisen et al., 2014). In addition, we performed a pathway analysis with genes from the PPI network. The rate of DN SNVs in OCD was 2.51 × 10⁻⁸ per base per generation, significantly higher than a previous estimated rate in unaffected subjects using the same sequencing platform and analytic pipeline. Several genes harboring DN SNVs in OCD were highly interconnected in the PPI network and ranked high in the DADA analysis. Nearly all the DN SNVs in this study are in genes expressed in the human brain, and a pathway analysis revealed enrichment in immunological and central nervous system functioning and development.
The results of this pilot study indicate that further investigation of DN variation in larger OCD cohorts is warranted to identify specific risk genes and to confirm our preliminary finding with regard to PPI network enrichment for particular biological pathways and functions.
|
396
|
Schaefer MH, Serrano L. Cell type-specific properties and environment shape tissue specificity of cancer genes. Sci Rep 2016; 6:20707. [PMID: 26856619 PMCID: PMC4746590 DOI: 10.1038/srep20707] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 09/24/2015] [Accepted: 01/11/2016] [Indexed: 12/21/2022] Open
Abstract
One of the biggest mysteries in cancer research remains why mutations in certain genes cause cancer only at specific sites in the human body. The poor correlation between the expression level of a cancer gene and the tissues in which it causes malignant transformations raises the question of which factors determine the tissue-specific effects of a mutation. Here, we explore why some cancer genes are associated only with few different cancer types (i.e., are specific), while others are found mutated in a large number of different types of cancer (i.e., are general). We do so by contrasting cellular functions of specific-cancer genes with those of general ones to identify properties that determine where in the body a gene mutation is causing malignant transformations. We identified different groups of cancer genes that did not behave as expected (i.e., DNA repair genes being tissue specific, immune response genes showing a bimodal specificity function or strong association of generally expressed genes to particular cancers). Analysis of these three groups demonstrates the importance of environmental impact for understanding why certain cancer genes are only involved in the development of some cancer types but are rarely found mutated in other types of cancer.
Affiliation(s)
- Martin H Schaefer
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, Spain.,Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, Barcelona, Spain
| | - Luis Serrano
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, Spain.,Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, Barcelona, Spain.,Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, Barcelona, Spain
| |
|
397
|
Patterson SE, Liu R, Statz CM, Durkin D, Lakshminarayana A, Mockus SM. The clinical trial landscape in oncology and connectivity of somatic mutational profiles to targeted therapies. Hum Genomics 2016; 10:4. [PMID: 26772741 PMCID: PMC4715272 DOI: 10.1186/s40246-016-0061-7] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 11/18/2015] [Accepted: 01/10/2016] [Indexed: 12/24/2022] Open
Abstract
Background: Precision medicine in oncology relies on rapid associations between patient-specific variations and targeted therapeutic efficacy. Due to the advancement of genomic analysis, a vast literature characterizing cancer-associated molecular aberrations and relative therapeutic relevance has been published. However, data are not uniformly reported or readily available, and accessing relevant information in a clinically acceptable time-frame is a daunting proposition, hampering connections between patients and appropriate therapeutic options. One important therapeutic avenue for oncology patients is through clinical trials. Accordingly, a global view into the availability of targeted clinical trials would provide insight into strengths and weaknesses and potentially enable research focus. However, data regarding the landscape of clinical trials in oncology are not readily available, and as a result, a comprehensive understanding of clinical trial availability is difficult.
Results: To support clinical decision-making, we have developed a data loader and mapper that connects sequence information from oncology patients to data stored in an in-house database, the JAX Clinical Knowledgebase (JAX-CKB), which can be queried readily to access comprehensive data for clinical reporting via customized reporting queries. JAX-CKB functions as a repository to house expertly curated clinically relevant data surrounding our 358-gene panel, the JAX Cancer Treatment Profile (JAX CTP), and supports annotation of functional significance of molecular variants. Through queries of data housed in JAX-CKB, we have analyzed the landscape of clinical trials relevant to our 358-gene targeted sequencing panel to evaluate strengths and weaknesses in current molecular targeting in oncology. Through this analysis, we have identified patient indications, molecular aberrations, and targeted therapy classes that have strong or weak representation in clinical trials.
Conclusions: Here, we describe the development of, and disseminate, system methods for associating patient genomic sequence data with clinically relevant information, facilitating interpretation and providing a mechanism for informing therapeutic decision-making. Additionally, through customized queries, we have the capability to rapidly analyze the landscape of targeted therapies in clinical trials, enabling a unique view into current therapeutic availability in oncology.
Affiliation(s)
- Sara E Patterson
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
| | - Rangjiao Liu
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
| | - Cara M Statz
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
| | - Daniel Durkin
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
| | - Susan M Mockus
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
| |
|
398
|
Zhao M, Liu Y, O'Mara TA. ECGene: A Literature-Based Knowledgebase of Endometrial Cancer Genes. Hum Mutat 2016; 37:337-43. [PMID: 26699919 PMCID: PMC5066700 DOI: 10.1002/humu.22950] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/19/2015] [Accepted: 12/16/2015] [Indexed: 12/14/2022]
Abstract
Endometrial cancer (EC) ranks as the sixth most common cancer for women worldwide. To better distinguish cancer subtypes and identify effective early diagnostic biomarkers, we need improved understanding of the biological mechanisms associated with EC dysregulated genes. Although there is a wealth of clinical and molecular information relevant to EC in the literature, there has been no systematic summary of EC-implicated genes. In this study, we developed a literature-based database ECGene (Endometrial Cancer Gene database) with comprehensive annotations. ECGene features manual curation of 414 genes from thousands of publications, results from eight EC gene expression datasets, precomputation of coexpressed long noncoding RNAs, and an EC-implicated gene interactome. In the current release, we generated and comprehensively annotated a list of 458 EC-implicated genes. We found the top-ranked EC-implicated genes are frequently mutated in The Cancer Genome Atlas (TCGA) tumor samples. Furthermore, systematic analysis of coexpressed lncRNAs provided insight into the important roles of lncRNA in EC development. ECGene has a user-friendly Web interface and is freely available at http://ecgene.bioinfo-minzhao.org/. As the first literature-based online resource for EC, ECGene serves as a useful gateway for researchers to explore EC genetics.
Affiliation(s)
- Min Zhao
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Queensland, 4558, Australia
| | - Yining Liu
- School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Queensland, 4558, Australia
| | - Tracy A O'Mara
- Genetics and Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, 4006, Australia
| |
|
399
|
Hakenberg J, Cheng WY, Thomas P, Wang YC, Uzilov AV, Chen R. Integrating 400 million variants from 80,000 human samples with extensive annotations: towards a knowledge base to analyze disease cohorts. BMC Bioinformatics 2016; 17:24. [PMID: 26746786 PMCID: PMC4706706 DOI: 10.1186/s12859-015-0865-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 07/11/2015] [Accepted: 12/17/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Data from a plethora of high-throughput sequencing studies is readily available to researchers, providing genetic variants detected in a variety of healthy and disease populations. While each individual cohort helps gain insights into polymorphic and disease-associated variants, a joint perspective can be more powerful in identifying polymorphisms, rare variants, disease-associations, genetic burden, somatic variants, and disease mechanisms.
DESCRIPTION: We have set up a Reference Variant Store (RVS) containing variants observed in a number of large-scale sequencing efforts, such as 1000 Genomes, ExAC, Scripps Wellderly, UK10K; various genotyping studies; and disease association databases. RVS holds extensive annotations pertaining to affected genes, functional impacts, disease associations, and population frequencies. RVS currently stores 400 million distinct variants observed in more than 80,000 human samples.
CONCLUSIONS: RVS facilitates cross-study analysis to discover novel genetic risk factors, gene-disease associations, potential disease mechanisms, and actionable variants. Due to its large reference populations, RVS can also be employed for variant filtration and gene prioritization.
AVAILABILITY: A web interface to public datasets and annotations in RVS is available at https://rvs.u.hpc.mssm.edu/.
Affiliation(s)
- Jörg Hakenberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, Box 1498, New York, 10029, USA.
| | - Wei-Yi Cheng
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, Box 1498, New York, 10029, USA.
- Current affiliation: Illumina, Inc., 451 El Camino Real, Suite 210, Santa Clara, 95050, USA.
| | - Philippe Thomas
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, Box 1498, New York, 10029, USA.
- Current affiliation: Roche Pharma Research and Early Development, Informatics, Roche Innovation Center New York, 430 East 29th St, New York, 10016, USA.
| | - Ying-Chih Wang
- Department of Computer Science, Humboldt-Universität zu Berlin, Unter den Linden 6, Berlin, 10099, Germany.
- Current affiliation: German Research Centre for Artificial Intelligence (DFKI), Alt Moabit 91c, Berlin, 10559, Germany.
| | - Andrew V Uzilov
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, Box 1498, New York, 10029, USA.
| | - Rong Chen
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, Box 1498, New York, 10029, USA.
| |
|
400
|
Kim M, Cooper BA, Venkat R, Phillips JB, Eidem HR, Hirbo J, Nutakki S, Williams SM, Muglia LJ, Capra JA, Petren K, Abbot P, Rokas A, McGary KL. GEneSTATION 1.0: a synthetic resource of diverse evolutionary and functional genomic data for studying the evolution of pregnancy-associated tissues and phenotypes. Nucleic Acids Res 2016; 44:D908-16. [PMID: 26567549 PMCID: PMC4702823 DOI: 10.1093/nar/gkv1137] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/13/2015] [Revised: 09/30/2015] [Accepted: 10/16/2015] [Indexed: 01/24/2023] Open
Abstract
Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy.
Affiliation(s)
- Mara Kim
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Brian A Cooper
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Rohit Venkat
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Julie B Phillips
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Haley R Eidem
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Jibril Hirbo
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Sashank Nutakki
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Scott M Williams
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
| | - Louis J Muglia
- Center for Prevention of Preterm Birth, Perinatal Institute, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - J Anthony Capra
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37235, USA
| | - Kenneth Petren
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH 45221, USA
| | - Patrick Abbot
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| | - Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37235, USA
| | - Kriston L McGary
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
| |
|