1101
Sebestyén E, Zawisza M, Eyras E. Detection of recurrent alternative splicing switches in tumor samples reveals novel signatures of cancer. Nucleic Acids Res 2015; 43:1345-56. [PMID: 25578962 PMCID: PMC4330360 DOI: 10.1093/nar/gku1392] [Citation(s) in RCA: 128] [Impact Index Per Article: 12.8] [Indexed: 02/07/2023] Open
Abstract
The determination of the alternative splicing isoforms expressed in cancer is fundamental for the development of tumor-specific molecular targets for prognosis and therapy, but it is hindered by the heterogeneity of tumors and the variability across patients. We developed a new computational method, robust to biological and technical variability, which identifies significant transcript isoform changes across multiple samples. We applied this method to more than 4000 samples from The Cancer Genome Atlas project to obtain novel splicing signatures that are predictive for nine different cancer types, and find a specific signature for basal-like breast tumors involving the tumor-driver CTNND1. Additionally, our method identifies 244 isoform switches, for which the change occurs in the most abundant transcript. Some of these switches occur in known tumor drivers, including PPARG, CCND3, RALGDS, MITF, PRDM1, ABI1 and MYH11, for which the switch implies a change in the protein product. Moreover, some of the switches cannot be described with simple splicing events. Surprisingly, isoform switches are independent of somatic mutations, except for the tumor-suppressor FBLN2 and the oncogene MYH11. Our method reveals novel signatures of cancer in terms of transcript isoforms specifically expressed in tumors, providing novel potential molecular targets for prognosis and therapy. Data and software are available at: http://dx.doi.org/10.6084/m9.figshare.1061917 and https://bitbucket.org/regulatorygenomicsupf/iso-ktsp.
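The idea of an isoform switch, a change in which transcript of a gene is the most abundant, can be illustrated with a minimal Python sketch. This is not the authors' statistical method: the function names, the use of the median, and the toy abundances below are all assumptions made for illustration only.

```python
from statistics import median

def most_abundant_isoform(samples):
    """Return the isoform with the highest median relative abundance.

    `samples` is a list of dicts mapping isoform id -> relative abundance.
    The isoform ids and values are hypothetical.
    """
    isoforms = sorted({iso for sample in samples for iso in sample})
    medians = {iso: median(s.get(iso, 0.0) for s in samples) for iso in isoforms}
    # Sorted ids make ties deterministic: max() returns the first maximum.
    return max(isoforms, key=lambda iso: medians[iso])

def is_switch(normal_samples, tumor_samples):
    """Report whether the most abundant isoform differs between conditions."""
    normal_top = most_abundant_isoform(normal_samples)
    tumor_top = most_abundant_isoform(tumor_samples)
    return normal_top != tumor_top, normal_top, tumor_top

# Toy data (invented): one gene with two isoforms, three samples per condition.
normal = [{"iso_A": 0.7, "iso_B": 0.3}, {"iso_A": 0.6, "iso_B": 0.4}, {"iso_A": 0.8, "iso_B": 0.2}]
tumor = [{"iso_A": 0.2, "iso_B": 0.8}, {"iso_A": 0.45, "iso_B": 0.55}, {"iso_A": 0.3, "iso_B": 0.7}]

print(is_switch(normal, tumor))
```

A real analysis would also need a significance test across patients, which this sketch deliberately omits.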
Affiliation(s)
- Endre Sebestyén
- Computational Genomics, Universitat Pompeu Fabra, Dr. Aiguader 88, E08003 Barcelona, Spain
- Michał Zawisza
- Universitat Politècnica de Catalunya, Jordi Girona 1-3, E08034 Barcelona, Spain
- Eduardo Eyras
- Computational Genomics, Universitat Pompeu Fabra, Dr. Aiguader 88, E08003 Barcelona, Spain; Catalan Institution for Research and Advanced Studies, Passeig Lluís Companys 23, E08010 Barcelona, Spain
1102
Wang Y, Liu S, Hu Y, Li P, Wan JB. Current state of the art of mass spectrometry-based metabolomics studies – a review focusing on wide coverage, high throughput and easy identification. RSC Adv 2015. [DOI: 10.1039/c5ra14058g] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Indexed: 11/21/2022] Open
Abstract
Metabolomics aims at the comprehensive assessment of a wide range of endogenous metabolites and attempts to identify and quantify the attractive metabolites in a given biological sample.
Affiliation(s)
- Yang Wang
- State Key Laboratory of Quality Research in Chinese Medicine
- Institute of Chinese Medical Sciences
- University of Macau
- Macao
- China
- Shuying Liu
- Jilin Ginseng Academy
- Changchun University of Chinese Medicine
- Changchun
- China
- Yuanjia Hu
- State Key Laboratory of Quality Research in Chinese Medicine
- Institute of Chinese Medical Sciences
- University of Macau
- Macao
- China
- Peng Li
- State Key Laboratory of Quality Research in Chinese Medicine
- Institute of Chinese Medical Sciences
- University of Macau
- Macao
- China
- Jian-Bo Wan
- State Key Laboratory of Quality Research in Chinese Medicine
- Institute of Chinese Medical Sciences
- University of Macau
- Macao
- China
1103
Carnielli CM, Winck FV, Paes Leme AF. Functional annotation and biological interpretation of proteomics data. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2015; 1854:46-54. [DOI: 10.1016/j.bbapap.2014.10.019] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 09/11/2014] [Revised: 10/07/2014] [Accepted: 10/21/2014] [Indexed: 12/22/2022]
1104
Banwait JK, Bastola DR. Contribution of bioinformatics prediction in microRNA-based cancer therapeutics. Adv Drug Deliv Rev 2015; 81:94-103. [PMID: 25450261 DOI: 10.1016/j.addr.2014.10.030] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 06/16/2014] [Revised: 10/13/2014] [Accepted: 10/30/2014] [Indexed: 12/15/2022]
Abstract
Despite enormous efforts, cancer remains one of the most lethal diseases in the world. With the advancement of high throughput technologies massive amounts of cancer data can be accessed and analyzed. Bioinformatics provides a platform to assist biologists in developing minimally invasive biomarkers to detect cancer, and in designing effective personalized therapies to treat cancer patients. Still, the early diagnosis, prognosis, and treatment of cancer are an open challenge for the research community. MicroRNAs (miRNAs) are small non-coding RNAs that serve to regulate gene expression. The discovery of deregulated miRNAs in cancer cells and tissues has led many to investigate the use of miRNAs as potential biomarkers for early detection, and as a therapeutic agent to treat cancer. Here we describe advancements in computational approaches to predict miRNAs and their targets, and discuss the role of bioinformatics in studying miRNAs in the context of human cancer.
Affiliation(s)
- Jasjit K Banwait
- College of Information Science and Technology, University of Nebraska at Omaha, 1110 South 67th Street, PKI 172, Omaha, NE 68106, USA.
- Dhundy R Bastola
- College of Information Science and Technology, University of Nebraska at Omaha, 1110 South 67th Street, PKI 172, Omaha, NE 68106, USA.
1105
Ogilvie LA, Wierling C, Kessler T, Lehrach H, Lange BMH. Article Commentary: Predictive Modeling of Drug Treatment in the Area of Personalized Medicine. Cancer Inform 2015. [PMID: 26692759 PMCID: PMC4671548 DOI: 10.4137/cin.s19330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Despite a growing body of knowledge on the mechanisms underlying the onset and progression of cancer, treatment success rates in oncology are at best modest. Current approaches use statistical methods that fail to embrace the inherent and expansive complexity of the tumor/patient/drug interaction. Computational modeling, in particular mechanistic modeling, has the power to resolve this complexity. Using fundamental knowledge on the interactions occurring between the components of a complex biological system, large-scale in silico models with predictive capabilities can be generated. Here, we describe how mechanistic virtual patient models, based on systematic molecular characterization of patients and their diseases, have the potential to shift the theranostic paradigm for oncology, both in the fields of personalized medicine and targeted drug development. In particular, we highlight the mechanistic modeling platform ModCell™ for individualized prediction of patient responses to treatment, emphasizing modeling techniques and avenues of application.
Affiliation(s)
- Christoph Wierling
- Alacris Theranostics GmbH, Berlin, Germany
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Thomas Kessler
- Alacris Theranostics GmbH, Berlin, Germany
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Hans Lehrach
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Dahlem Centre for Genome Research and Medical Systems Biology, Berlin, Germany
1106
Talikka M, Boue S, Schlage WK. Causal Biological Network Database: A Comprehensive Platform of Causal Biological Network Models Focused on the Pulmonary and Vascular Systems. METHODS IN PHARMACOLOGY AND TOXICOLOGY 2015. [DOI: 10.1007/978-1-4939-2778-4_3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 01/20/2023]
1107
Négyessy L, Györffy B, Hanics J, Bányai M, Fonta C, Bazsó F. Signal Transduction Pathways of TNAP: Molecular Network Analyses. Subcell Biochem 2015. [PMID: 26219713 DOI: 10.1007/978-94-017-7197-9_10] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]
Abstract
Despite the growing body of evidence pointing to the involvement of tissue non-specific alkaline phosphatase (TNAP) in brain function and diseases like epilepsy and Alzheimer's disease, our understanding of the role of TNAP in the regulation of neurotransmission is severely limited. The aim of our study was to integrate the fragmented knowledge into a comprehensive view regarding neuronal functions of TNAP using objective tools. As a model we used the signal transduction molecular network of a pyramidal neuron, complemented with TNAP-related data, and performed the analysis using graph theoretic tools. The analyses show that TNAP is at the crossroads of numerous pathways and therefore is one of the key players of the neuronal signal transduction network. Through many of its connections, most notably with molecules of the purinergic system, TNAP serves as a controller by funnelling signal flow towards a subset of molecules. TNAP also appears as a source of signal that spreads via interactions with molecules involved in, among others, neurodegeneration. Cluster analyses identified TNAP as part of the second messenger signalling cascade. However, TNAP also forms connections with other functional groups involved in neuronal signal transduction. The results indicate the distinct ways in which TNAP is involved in multiple neuronal functions and diseases.
Affiliation(s)
- László Négyessy
- Theoretical Neuroscience and Complex Systems Research Group, Wigner Research Center for Physics, Budapest, Hungary
1108
Abstract
Metabolites as an end product of metabolism possess a wealth of information about altered metabolic control and homeostasis that is dependent on numerous variables including age, sex, and environment. Studying significant changes in the metabolite patterns has been recognized as a tool to understand crucial aspects in drug development like drug efficacy and toxicity. The inclusion of metabonomics into the OMICS study platform brings us closer to define the phenotype and allows us to look at alternatives to improve the diagnosis of diseases. Advancements in the analytical strategies and statistical tools used to study metabonomics allow us to prevent drug failures at early stages of drug development and reduce financial losses during expensive phase II and III clinical trials. This chapter introduces metabonomics along with the instruments used in the study; in addition relevant examples of the usage of metabonomics in the drug development process are discussed along with an emphasis on future directions and the challenges it faces.
Affiliation(s)
- Pranov Ramana
- Pharmaceutical Analysis, Department of Pharmaceutical and Pharmacological Sciences, KU Leuven, O&N2 PB 923, Herestraat 49, 3000, Leuven, Belgium
1109
Liu Y, Wei Q, Yu G, Gai W, Li Y, Chen X. DCDB 2.0: a major update of the drug combination database. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau124. [PMID: 25539768 PMCID: PMC4275564 DOI: 10.1093/database/bau124] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Indexed: 01/23/2023]
Abstract
Experience in clinical practice and research in systems pharmacology suggested the limitations of the current one-drug-one-target paradigm in new drug discovery. Single-target drugs may not always produce desired physiological effects on the entire biological system, even if they have successfully regulated the activities of their designated targets. On the other hand, multicomponent therapy, in which two or more agents simultaneously interact with multiple targets, has attracted growing attention. Many drug combinations consisting of multiple agents have already entered clinical practice, especially in treating complex and refractory diseases. Drug combination database (DCDB), launched in 2010, is the first available database that collects and organizes information on drug combinations, with an aim to facilitate systems-oriented new drug discovery. Here, we report the second major release of DCDB (Version 2.0), which includes 866 new drug combinations (1363 in total), consisting of 904 distinctive components. These drug combinations are curated from ∼140,000 clinical studies and the food and drug administration (FDA) electronic orange book. In this update, DCDB collects 237 unsuccessful drug combinations, which may provide a contrast for systematic discovery of the patterns in successful drug combinations. Database URL: http://www.cls.zju.edu.cn/dcdb/
Affiliation(s)
- Yanbin Liu
- Department of Bioinformatics, College of Life Sciences and Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058, P.R. China
- Qiang Wei
- Department of Bioinformatics, College of Life Sciences and Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058, P.R. China
- Guisheng Yu
- Department of Bioinformatics, College of Life Sciences and Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058, P.R. China
- Wanxia Gai
- Department of Bioinformatics, College of Life Sciences and Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058, P.R. China
- Yongquan Li
- Department of Bioinformatics, College of Life Sciences and Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058, P.R. China
- Xin Chen
- Department of Bioinformatics, College of Life Sciences and Institute of Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058, P.R. China
1110
Hornbeck PV, Zhang B, Murray B, Kornhauser JM, Latham V, Skrzypek E. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res 2014; 43:D512-20. [PMID: 25514926 PMCID: PMC4383998 DOI: 10.1093/nar/gku1267] [Citation(s) in RCA: 2263] [Impact Index Per Article: 205.7] [Indexed: 12/17/2022] Open
Abstract
PhosphoSitePlus® (PSP, http://www.phosphosite.org/), a knowledgebase dedicated to mammalian post-translational modifications (PTMs), contains over 330 000 non-redundant PTMs, including phospho, acetyl, ubiquityl and methyl groups. Over 95% of the sites are from mass spectrometry (MS) experiments. In order to improve data reliability, early MS data have been reanalyzed, applying a common standard of analysis across over 1 000 000 spectra. Site assignments with P > 0.05 were filtered out. Two new downloads are available from PSP. The ‘Regulatory sites’ dataset includes curated information about modification sites that regulate downstream cellular processes, molecular functions and protein-protein interactions. The ‘PTMVar’ dataset, an intersect of missense mutations and PTMs from PSP, identifies over 25 000 PTMVars (PTMs Impacted by Variants) that can rewire signaling pathways. The PTMVar data include missense mutations from UniProtKB, TCGA and other sources that cause over 2000 diseases or syndromes (MIM) and polymorphisms, or are associated with hundreds of cancers. PTMVars include 18 548 phosphorylation sites, 3412 ubiquitylation sites, 2316 acetylation sites, 685 methylation sites and 245 succinylation sites.
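The two operations described above, filtering site assignments by P-value and intersecting PTM sites with missense mutations, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layouts, the function names, the placeholder protein ids and the rule that a mutation must hit exactly the modified residue are all hypothetical and do not reflect the PSP file formats or the actual PTMVar definition.

```python
def filter_sites(site_assignments, max_p=0.05):
    """Keep PTM site assignments whose P-value does not exceed max_p.

    Each assignment is a hypothetical tuple: (protein, position, ptm_type, p_value).
    """
    return {(prot, pos, ptm) for prot, pos, ptm, p in site_assignments if p <= max_p}

def ptm_variants(ptm_sites, missense_mutations):
    """Return PTM sites whose residue position coincides with a missense mutation.

    Each mutation is a hypothetical tuple: (protein, position, change).
    Assumption: only an exact position match counts as an overlap.
    """
    mutated = {(prot, pos) for prot, pos, _change in missense_mutations}
    return sorted(site for site in ptm_sites if (site[0], site[1]) in mutated)

# Invented records with placeholder protein ids.
assignments = [
    ("PROT1", 15, "phospho", 0.01),
    ("PROT1", 88, "acetyl", 0.20),    # dropped: P > 0.05
    ("PROT2", 42, "ubiquityl", 0.03),
]
mutations = [("PROT1", 15, "S>A"), ("PROT2", 7, "K>R"), ("PROT1", 88, "K>Q")]

sites = filter_sites(assignments)
print(ptm_variants(sites, mutations))
```

Representing sites as a set of tuples keeps the intersection a simple membership test, at the cost of ignoring any flanking-residue effects.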
Affiliation(s)
- Bin Zhang
- Cell Signaling Technology, 3 Trask Lane, Danvers, MA 01923, USA
- Beth Murray
- Cell Signaling Technology, 3 Trask Lane, Danvers, MA 01923, USA
- Vaughan Latham
- Cell Signaling Technology, 3 Trask Lane, Danvers, MA 01923, USA
1111
Iourov IY, Vorsanova SG, Yurov YB. In silico molecular cytogenetics: a bioinformatic approach to prioritization of candidate genes and copy number variations for basic and clinical genome research. Mol Cytogenet 2014; 7:98. [PMID: 25525469 PMCID: PMC4269961 DOI: 10.1186/s13039-014-0098-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 11/27/2014] [Accepted: 12/02/2014] [Indexed: 01/08/2023] Open
Abstract
Background: The availability of multiple in silico tools for prioritizing genetic variants widens the possibilities for converting genomic data into biological knowledge. However, in molecular cytogenetics, bioinformatic analyses are generally limited to result visualization or database mining for finding similar cytogenetic data. Obviously, the potential of bioinformatics might go beyond these applications. On the other hand, the requirements for performing successful in silico analyses (i.e. deep knowledge of computer science, statistics etc.) can hinder the implementation of bioinformatics in clinical and basic molecular cytogenetic research. Here, we propose a bioinformatic approach to prioritization of genomic variations that is able to solve these problems. Results: Selecting gene expression as an initial criterion, we have proposed a bioinformatic approach combining filtering and ranking prioritization strategies, which includes analyzing metabolome and interactome data on proteins encoded by candidate genes. To finalize the prioritization of genetic variants, genomic, epigenomic, interactomic and metabolomic data fusion has been made. Structural abnormalities and aneuploidy revealed by array CGH and FISH have been evaluated to test the approach through determining genotype-phenotype correlations, which have been found similar to those of previous studies. Additionally, we have been able to prioritize copy number variations (CNV) (i.e. differentiate between benign CNV and CNV with phenotypic outcome). Finally, the approach has been applied to prioritize genetic variants in cases of somatic mosaicism (including tissue-specific mosaicism). Conclusions: In order to provide for an in silico evaluation of molecular cytogenetic data, we have proposed a bioinformatic approach to prioritization of candidate genes and CNV. While having the disadvantage of possible unavailability of gene expression data or lack of expression variability between genes of interest, the approach provides several advantages. These are (i) the versatility due to independence from specific databases/tools or software, (ii) relative algorithm simplicity (possibility to avoid sophisticated computational/statistical methodology) and (iii) applicability to molecular cytogenetic data because of the chromosome-centric nature. In conclusion, the approach is able to become useful for increasing the yield of molecular cytogenetic techniques.
Affiliation(s)
- Ivan Y Iourov
- Mental Health Research Center, Russian Academy of Medical Sciences, 117152 Moscow, Russia ; Russian National Research Medical University named after N.I. Pirogov, Separated Structural Unit "Clinical Research Institute of Pediatrics", Ministry of Health of Russian Federation, 125412 Moscow, Russia ; Department of Medical Genetics, Russian Medical Academy of Postgraduate Education, Moscow, 123995 Russia
- Svetlana G Vorsanova
- Mental Health Research Center, Russian Academy of Medical Sciences, 117152 Moscow, Russia ; Russian National Research Medical University named after N.I. Pirogov, Separated Structural Unit "Clinical Research Institute of Pediatrics", Ministry of Health of Russian Federation, 125412 Moscow, Russia
- Yuri B Yurov
- Mental Health Research Center, Russian Academy of Medical Sciences, 117152 Moscow, Russia ; Russian National Research Medical University named after N.I. Pirogov, Separated Structural Unit "Clinical Research Institute of Pediatrics", Ministry of Health of Russian Federation, 125412 Moscow, Russia
1112
Kunz M, Xiao K, Liang C, Viereck J, Pachel C, Frantz S, Thum T, Dandekar T. Bioinformatics of cardiovascular miRNA biology. J Mol Cell Cardiol 2014; 89:3-10. [PMID: 25486579 DOI: 10.1016/j.yjmcc.2014.11.027] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 09/03/2014] [Revised: 11/05/2014] [Accepted: 11/29/2014] [Indexed: 12/16/2022]
Abstract
MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'.
Affiliation(s)
- Meik Kunz
- Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Würzburg, Germany; Institute for Molecular and Translational Therapeutic Strategies (IMTTS), Hannover Medical School, Hannover, Germany
- Ke Xiao
- Institute for Molecular and Translational Therapeutic Strategies (IMTTS), Hannover Medical School, Hannover, Germany; Plant Breeding Institute, Christian-Albrechts-University of Kiel, Olshausenstr. 40, 24098 Kiel, Germany
- Chunguang Liang
- Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Würzburg, Germany
- Janika Viereck
- Institute for Molecular and Translational Therapeutic Strategies (IMTTS), Hannover Medical School, Hannover, Germany
- Christina Pachel
- Department of Internal Medicine I, University Hospital Würzburg, Germany and Comprehensive Heart Failure Center, University of Würzburg, Germany
- Stefan Frantz
- Department of Internal Medicine I, University Hospital Würzburg, Germany and Comprehensive Heart Failure Center, University of Würzburg, Germany
- Thomas Thum
- Institute for Molecular and Translational Therapeutic Strategies (IMTTS), Hannover Medical School, Hannover, Germany; Excellence Cluster REBIRTH, Hannover Medical School, Hannover, Germany; National Heart and Lung Institute, Imperial College London, London, UK
- Thomas Dandekar
- Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Würzburg, Germany; EMBL Heidelberg, BioComputing Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany.
1113
Tsoi LC, Elder JT, Abecasis GR. Graphical algorithm for integration of genetic and biological data: proof of principle using psoriasis as a model. Bioinformatics 2014; 31:1243-9. [PMID: 25480373 DOI: 10.1093/bioinformatics/btu799] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/25/2014] [Accepted: 11/26/2014] [Indexed: 01/17/2023]
Abstract
MOTIVATION: Pathway analysis to reveal biological mechanisms for results from genetic association studies has great potential to improve our understanding of complex traits with major human disease impact. However, current approaches have not been optimized to maximize statistical power to identify enriched functions/pathways, especially when the genetic data derive from studies using platforms (e.g. Immunochip and Metabochip) customized to have pre-selected markers from previously identified top-rank loci. We present here a novel approach, called Minimum distance-based Enrichment Analysis for Genetic Association (MEAGA), with the potential to address both of these important concerns. RESULTS: MEAGA performs enrichment analysis using graphical algorithms to identify sub-graphs among genes and measure their closeness in an interaction database. It also incorporates a statistic summarizing the numbers and total distances of the sub-graphs, depicting the overlap between observed genetic signals and defined function/pathway gene-sets. MEAGA uses a sampling technique to approximate empirical and multiple testing-corrected P-values. We show in simulation studies that MEAGA is more powerful than count-based strategies in identifying disease-associated functions/pathways, and the increase in power is influenced by the shortest distances among associated genes in the interactome. We applied MEAGA to the results of a meta-analysis of psoriasis using Immunochip datasets, and showed that associated genes are significantly enriched in immune-related functions and closer to each other in the protein-protein interaction network. AVAILABILITY AND IMPLEMENTATION: http://genome.sph.umich.edu/wiki/MEAGA CONTACT: tsoi.teen@gmail.com or goncalo@umich.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
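The general idea of scoring how close a gene set sits in an interaction network, and of approximating an empirical P-value by sampling, can be sketched as follows. This is a simplified stand-in and not the MEAGA statistic itself: the sum of pairwise shortest-path distances, the unreachable-pair penalty, the helper names and the toy network are all assumptions introduced for illustration.

```python
import random
from collections import deque
from itertools import combinations

def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def bfs_distances(graph, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def total_pairwise_distance(graph, genes):
    """Sum of shortest-path distances over all pairs in the gene set."""
    genes = sorted(genes)
    dist_from = {g: bfs_distances(graph, g) for g in genes}
    unreachable = len(graph)  # assumed penalty, longer than any real path
    return sum(dist_from[a].get(b, unreachable) for a, b in combinations(genes, 2))

def empirical_p(graph, genes, n_samples=1000, seed=0):
    """Fraction of random gene sets of equal size that are at least as close."""
    rng = random.Random(seed)
    observed = total_pairwise_distance(graph, genes)
    nodes = sorted(graph)
    hits = sum(
        total_pairwise_distance(graph, rng.sample(nodes, len(genes))) <= observed
        for _ in range(n_samples)
    )
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_samples + 1)

# Toy network (invented): a triangle A-B-C attached to a chain C-D-E-F-G-H.
toy = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
                   ("D", "E"), ("E", "F"), ("F", "G"), ("G", "H")])
observed, p_value = empirical_p(toy, {"A", "B", "C"})
print(observed, p_value)
```

A count-based test would only ask how many associated genes fall in a gene set; a distance-based statistic like this one also rewards genes that are near each other in the network.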
Affiliation(s)
- Lam C Tsoi
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA, Department of Dermatology, University of Michigan, Ann Arbor, MI, USA, and Ann Arbor Veterans Affairs Hospital, Ann Arbor, MI, USA
- James T Elder
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA, Department of Dermatology, University of Michigan, Ann Arbor, MI, USA, and Ann Arbor Veterans Affairs Hospital, Ann Arbor, MI, USA
- Goncalo R Abecasis
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA, Department of Dermatology, University of Michigan, Ann Arbor, MI, USA, and Ann Arbor Veterans Affairs Hospital, Ann Arbor, MI, USA
1114
Cakır T, Khatibipour MJ. Metabolic network discovery by top-down and bottom-up approaches and paths for reconciliation. Front Bioeng Biotechnol 2014; 2:62. [PMID: 25520953 PMCID: PMC4253960 DOI: 10.3389/fbioe.2014.00062] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/24/2014] [Accepted: 11/14/2014] [Indexed: 11/13/2022] Open
Abstract
The primary focus in the network-centric analysis of cellular metabolism by systems biology approaches is to identify the active metabolic network for the condition of interest. Two major approaches are available for the discovery of the condition-specific metabolic networks. One approach starts from genome-scale metabolic networks, which cover all possible reactions known to occur in the related organism in a condition-independent manner, and applies methods such as the optimization-based Flux-Balance Analysis to elucidate the active network. The other approach starts from the condition-specific metabolome data, and processes the data with statistical or optimization-based methods to extract information content of the data such that the active network is inferred. These approaches, termed bottom-up and top-down, respectively, are currently employed independently. However, considering that both approaches have the same goal, they can both benefit from each other paving the way for the novel integrative analysis methods of metabolome data- and flux-analysis approaches in the post-genomic era. This study reviews the strengths of constraint-based analysis and network inference methods reported in the metabolic systems biology field; then elaborates on the potential paths to reconcile the two approaches to shed better light on how the metabolism functions.
Affiliation(s)
- Tunahan Cakır
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University (formerly known as Gebze Institute of Technology), Gebze, Turkey
- Mohammad Jafar Khatibipour
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University (formerly known as Gebze Institute of Technology), Gebze, Turkey; Department of Chemical Engineering, Gebze Technical University (formerly known as Gebze Institute of Technology), Gebze, Turkey
1115
Mosca E, Alfieri R, Milanesi L. Diffusion of information throughout the host interactome reveals gene expression variations in network proximity to target proteins of hepatitis C virus. PLoS One 2014; 9:e113660. [PMID: 25461596 PMCID: PMC4251971 DOI: 10.1371/journal.pone.0113660] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/01/2014] [Accepted: 10/27/2014] [Indexed: 12/22/2022] Open
Abstract
Hepatitis C virus (HCV) infection is one of the most common chronic infections in the world, and hepatitis associated with HCV infection is a major risk factor for the development of cirrhosis and hepatocellular carcinoma (HCC). The rapidly growing number of viral-host and host protein-protein interactions is enabling increasingly reliable network-based analyses of viral infection supported by omics data. The study of molecular interaction networks helps to elucidate the mechanistic pathways linking HCV molecular activities and the host response that modulates the stepwise hepatocarcinogenic process from preneoplastic lesions (cirrhosis and dysplasia) to HCC. Simulating the impact of HCV-host molecular interactions throughout the host protein-protein interaction (PPI) network, we ranked the host proteins in relation to their network proximity to viral targets. We observed that the set of proteins in the neighborhood of HCV targets in the host interactome is enriched in key players of the host response to HCV infection. In contrast to the HCV targets themselves, subnetworks of proteins in network proximity to HCV targets are significantly enriched in proteins reported as differentially expressed in preneoplastic and neoplastic liver samples by two independent studies. Using multi-objective optimization, we extracted subnetworks that are simultaneously “guilt-by-association” with HCV proteins and enriched in differentially expressed proteins. These subnetworks contain established, recently proposed and novel candidate proteins for the regulation of the mechanisms of the liver cell response to chronic HCV infection.
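Ranking host proteins by network proximity to viral targets can be illustrated with a random walk with restart, a common diffusion scheme. The abstract does not specify the diffusion model used, so this technique is swapped in purely as an illustration; the function name, the restart probability, the iteration count and the toy network are assumptions.

```python
def diffuse(graph, seeds, restart=0.3, n_iter=100):
    """Random walk with restart on an undirected graph.

    `graph` maps each node to a non-empty set of neighbors; `seeds` are the
    nodes where the signal starts (here, hypothetical viral target proteins).
    Returns a dict of node -> steady-state score; scores sum to 1.
    """
    nodes = sorted(graph)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(start)
    for _ in range(n_iter):
        # Part of the signal restarts at the seeds on every step.
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            # The rest is split equally among the neighbors of n.
            share = (1.0 - restart) * score[n] / len(graph[n])
            for neighbor in graph[n]:
                updated[neighbor] += share
        score = updated
    return score

# Toy chain (invented): target T1 interacts with A, A with B, B with C.
chain = {"T1": {"A"}, "A": {"T1", "B"}, "B": {"A", "C"}, "C": {"B"}}
scores = diffuse(chain, seeds={"T1"})

# Rank the non-seed host proteins by decreasing proximity score.
ranking = sorted((n for n in chain if n != "T1"), key=scores.get, reverse=True)
print(ranking)
```

Unlike a plain shortest-path distance, a diffusion score also accounts for how many alternative routes connect a protein to the seeds.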
Affiliation(s)
- Ettore Mosca
- Institute of Biomedical Technologies, National Research Council, Segrate, Milan, Italy
- * E-mail:
| | - Roberta Alfieri
- Institute of Biomedical Technologies, National Research Council, Segrate, Milan, Italy
| | - Luciano Milanesi
- Institute of Biomedical Technologies, National Research Council, Segrate, Milan, Italy
| |
|
1116
|
Emamjomeh A, Goliaei B, Zahiri J, Ebrahimpour R. Predicting protein-protein interactions between human and hepatitis C virus via an ensemble learning method. MOLECULAR BIOSYSTEMS 2014; 10:3147-3154. [PMID: 25230581 DOI: 10.1039/c4mb00410h] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Indexed: 01/03/2025]
Abstract
An estimated 170 million people, approximately 3% of the world population, are chronically infected with the hepatitis C virus (HCV), which causes more than 350,000 reported deaths annually. HCV, like a variety of other viruses, causes disease in humans by altering protein-protein interactions (PPIs) within the host cells. Experimental approaches for the detection of host-virus PPIs have many inherent limitations. Computational approaches to predict these interactions are therefore of significant importance. While many methods have been developed to predict intra-species PPIs in the last decade, predictions of inter-species PPIs such as human-HCV PPIs are rare. In this study, we developed an ensemble learning method to predict PPIs between human and HCV proteins. Our model utilises four well-established, diverse learners as base classifiers: random forest (RF), Naïve Bayes (NB), support vector machine (SVM) and multilayer perceptron (MLP). In addition, an MLP was used as a meta-learner to combine the base learners' predictions into the final prediction. To encode human and HCV proteins as feature vectors, we used six different descriptors: amino acid composition (AAC), pseudo amino acid composition (PAC), evolutionary information features, network centrality measures, tissue information and post-translational modification information. To assess the prediction power of the proposed method, we assembled a benchmark dataset composed of confident positive and negative PPIs. In a 10-fold cross-validation experiment, our prediction method achieved accuracy and specificity as high as 83% and 94%, respectively. Furthermore, in an independent test set the proposed method achieved an accuracy of 84% and a specificity of 92%. When compared with the existing method, our method showed better performance. These results show that our method is suitable for performing PPI prediction in a host-pathogen context.
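Of the descriptors listed in this abstract, amino acid composition is the simplest to show concretely: it is the relative frequency of each of the 20 standard residues in a sequence. The sketch below is a minimal illustration under that standard definition; the function names and the idea of concatenating the two proteins' vectors into one pair feature are assumptions, not details confirmed by the abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Non-standard characters (e.g. X or gaps) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def encode_pair(human_seq, viral_seq):
    """Hypothetical pair encoding: concatenate the two composition vectors."""
    return amino_acid_composition(human_seq) + amino_acid_composition(viral_seq)


vec = amino_acid_composition("AACD")      # 2 x A, 1 x C, 1 x D
pair = encode_pair("MKVLA", "ACDEF")      # 40 features for one protein pair
```

A vector like `pair` is the kind of fixed-length input that base classifiers such as those named in the abstract could be trained on; the remaining five descriptors would add further columns.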
Affiliation(s)
- Abbasali Emamjomeh
- Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
| | | | | | | |
|
1117
|
Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M. The BioGRID interaction database: 2015 update. Nucleic Acids Res 2014; 43:D470-8. [PMID: 25428363 PMCID: PMC4383984 DOI: 10.1093/nar/gku1204] [Citation(s) in RCA: 648] [Impact Index Per Article: 58.9] [Indexed: 12/18/2022] Open
Abstract
The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access database that houses genetic and protein interactions curated from the primary biomedical literature for all major model organism species and humans. As of September 2014, the BioGRID contains 749 912 interactions as drawn from 43 149 publications that represent 30 model organisms. This interaction count represents a 50% increase compared to our previous 2013 BioGRID update. BioGRID data are freely distributed through partner model organism databases and meta-databases and are directly downloadable in a variety of formats. In addition to general curation of the published literature for the major model species, BioGRID undertakes themed curation projects in areas of particular relevance for biomedical sciences, such as the ubiquitin-proteasome system and various human disease-associated interaction networks. BioGRID curation is coordinated through an Interaction Management System (IMS) that facilitates the compilation of interaction records through structured evidence codes, phenotype ontologies, and gene annotation. The BioGRID architecture has been improved in order to support a broader range of interaction and post-translational modification types, to allow the representation of more complex multi-gene/protein interactions, to account for cellular phenotypes through structured ontologies, to expedite curation through semi-automated text-mining approaches, and to enhance curation quality control.
Affiliation(s)
- Andrew Chatr-Aryamontri
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
| | - Bobby-Joe Breitkreutz
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Lorrie Boucher
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Sven Heinicke
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Daici Chen
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
| | - Chris Stark
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Ashton Breitkreutz
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Nadine Kolas
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Lara O'Donnell
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Teresa Reguly
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Julie Nixon
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| | - Lindsay Ramage
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| | - Andrew Winter
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| | - Adnane Sellam
- Centre Hospitalier de l'Université Laval (CHUL), Québec, Québec G1V 4G2, Canada
| | - Christie Chang
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Jodi Hirschman
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Chandra Theesfeld
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Jennifer Rust
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Michael S Livstone
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Mike Tyers
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada; The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada; School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| |
|
1118
|
Merico D, Costain G, Butcher NJ, Warnica W, Ogura L, Alfred SE, Brzustowicz LM, Bassett AS. MicroRNA Dysregulation, Gene Networks, and Risk for Schizophrenia in 22q11.2 Deletion Syndrome. Front Neurol 2014; 5:238. [PMID: 25484875 PMCID: PMC4240070 DOI: 10.3389/fneur.2014.00238] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 09/16/2014] [Accepted: 11/02/2014] [Indexed: 01/20/2023] Open
Abstract
The role of microRNAs (miRNAs) in the etiology of schizophrenia is increasingly recognized. Microdeletions at chromosome 22q11.2 are recurrent structural variants that impart a high risk for schizophrenia and are found in up to 1% of all patients with schizophrenia. The 22q11.2 deletion region overlaps gene DGCR8, encoding a subunit of the miRNA microprocessor complex. We identified miRNAs overlapped by the 22q11.2 microdeletion and for the first time investigated their predicted target genes, and those implicated by DGCR8, to identify targets that may be involved in the risk for schizophrenia. The 22q11.2 region encompasses seven validated or putative miRNA genes. Employing two standard prediction tools, we generated sets of predicted target genes. Functional enrichment profiles of the 22q11.2 region miRNA target genes suggested a role in neuronal processes and broader developmental pathways. We then constructed a protein interaction network of schizophrenia candidate genes and interaction partners relevant to brain function, independent of the 22q11.2 region miRNA mechanisms. We found that the predicted gene targets of the 22q11.2 deletion miRNAs, and targets of the genome-wide miRNAs predicted to be dysregulated by DGCR8 hemizygosity, were significantly represented in this schizophrenia network. The findings provide new insights into the pathway from 22q11.2 deletion to expression of schizophrenia, and suggest that hemizygosity of the 22q11.2 region may have downstream effects implicating genes elsewhere in the genome that are relevant to the general schizophrenia population. These data also provide further support for the notion that robust genetic findings in schizophrenia may converge on a reasonable number of final pathways.
Affiliation(s)
- Daniele Merico
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Gregory Costain
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
| | - Nancy J Butcher
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada; Institute of Medical Science, University of Toronto, Toronto, ON, Canada
| | - William Warnica
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
| | - Lucas Ogura
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
| | - Simon E Alfred
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
| | - Linda M Brzustowicz
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA
| | - Anne S Bassett
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada; Institute of Medical Science, University of Toronto, Toronto, ON, Canada; The Dalglish Family Hearts and Minds Clinic for 22q11.2 Deletion Syndrome, Toronto General Hospital, University Health Network, Toronto, ON, Canada; Department of Psychiatry, Toronto General Research Institute, University Health Network, Toronto, ON, Canada; Department of Psychiatry, University of Toronto, Toronto, ON, Canada
| |
|
1119
|
Liu Y, Xie D, Han L, Bai H, Li F, Wang S, Bo X. EHFPI: a database and analysis resource of essential host factors for pathogenic infection. Nucleic Acids Res 2014; 43:D946-55. [PMID: 25414353 PMCID: PMC4383917 DOI: 10.1093/nar/gku1086] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 12/28/2022] Open
Abstract
High-throughput screening and computational technology have greatly changed the face of microbiology by improving the understanding of pathogen–host interactions. Genome-wide RNA interference (RNAi) screens have given rise to a new class of host genes designated as Essential Host Factors (EHFs), whose knockdown effects significantly influence pathogenic infections. We therefore present the first release of a manually curated bioinformatics database and analysis resource, EHFPI (Essential Host Factors for Pathogenic Infection, http://biotech.bmi.ac.cn/ehfpi). EHFPI captures detailed article, screen, pathogen and phenotype annotation information for a total of 4634 EHF genes of 25 clinically important pathogenic species. Notably, EHFPI also provides six data-integrative analysis tools, i.e. EHF Overlap Analysis, EHF-pathogen Network Analysis, Gene Enrichment Analysis, Pathogen Interacting Proteins (PIPs) Analysis, Drug Target Analysis and GWAS Candidate Gene Analysis, which advance the comprehensive understanding of the biological roles of EHF genes from the diverse perspectives of protein–protein interaction networks, drug targets and diseases/traits. The EHFPI web interface provides tools that allow efficient querying of EHF data and visualization of custom-made analysis results. EHFPI data and tools will remain available without charge and serve the microbiology, biomedicine and pharmaceutics research communities, ultimately facilitating the development of diagnostics, prophylactics and therapeutics for human pathogens.
Affiliation(s)
- Yang Liu
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
| | - Dafei Xie
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
| | - Lu Han
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
| | - Hui Bai
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China; No. 451 Hospital of Chinese People's Liberation Army, Xi'an 710054, China
| | - Fei Li
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
| | - Shengqi Wang
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
| | - Xiaochen Bo
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
| |
|
1120
|
Montague E, Janko I, Stanberry L, Lee E, Choiniere J, Anderson N, Stewart E, Broomall W, Higdon R, Kolker N, Kolker E. Beyond protein expression, MOPED goes multi-omics. Nucleic Acids Res 2014; 43:D1145-51. [PMID: 25404128 PMCID: PMC4383969 DOI: 10.1093/nar/gku1175] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 12/27/2022] Open
Abstract
MOPED (Multi-Omics Profiling Expression Database; http://moped.proteinspire.org) has transitioned from solely a protein expression database to a multi-omics resource for human and model organisms. Through a web-based interface, MOPED presents consistently processed data for gene, protein and pathway expression. To improve data quality, consistency and use, MOPED includes metadata detailing experimental design and analysis methods. The multi-omics data are integrated through direct links between genes and proteins and further connected to pathways and experiments. MOPED now contains over 5 million records, information for approximately 75 000 genes and 50 000 proteins from four organisms (human, mouse, worm, yeast). These records correspond to 670 unique combinations of experiment, condition, localization and tissue. MOPED includes the following new features: pathway expression, Pathway Details pages, experimental metadata checklists, experiment summary statistics and more advanced searching tools. Advanced searching enables querying for genes, proteins, experiments, pathways and keywords of interest. The system is enhanced with visualizations for comparing across different data types. In the future MOPED will expand the number of organisms, increase integration with pathways and provide connections to disease.
Affiliation(s)
- Elizabeth Montague
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Imre Janko
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Larissa Stanberry
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Elaine Lee
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - John Choiniere
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Nathaniel Anderson
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Elizabeth Stewart
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - William Broomall
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Roger Higdon
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Natali Kolker
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
| | - Eugene Kolker
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101; Departments of Biomedical Informatics and Medical Education and Pediatrics, University of Washington, Seattle, WA, USA 98109; Department of Chemistry and Chemical Biology, College of Science, Northeastern University, Boston, MA 02115
| |
|
1121
|
Splicing mutation analysis reveals previously unrecognized pathways in lymph node-invasive breast cancer. Sci Rep 2014; 4:7063. [PMID: 25394353 PMCID: PMC4231324 DOI: 10.1038/srep07063] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 07/17/2014] [Accepted: 10/29/2014] [Indexed: 12/22/2022] Open
Abstract
Somatic mutations reported in large-scale breast cancer (BC) sequencing studies primarily consist of protein coding mutations. mRNA splicing mutation analyses have been limited in scope, despite their prevalence in Mendelian genetic disorders. We predicted splicing mutations in 442 BC tumour and matched normal exomes from The Cancer Genome Atlas Consortium (TCGA). These splicing defects were validated by abnormal expression changes in these tumours. Of the 5,206 putative mutations identified, exon skipping, leaky or cryptic splicing was confirmed for 988 variants. Pathway enrichment analysis of the mutated genes revealed mutations in 9 NCAM1-related pathways, which were significantly increased in samples with evidence of lymph node metastasis, but not in lymph node-negative tumours. We suggest that comprehensive reporting of DNA sequencing data should include non-trivial splicing analyses to avoid missing clinically-significant deleterious splicing mutations, which may reveal novel mutated pathways present in genetic disorders.
|
1122
|
Auerbach SS, Phadke DP, Mav D, Holmgren S, Gao Y, Xie B, Shin JH, Shah RR, Merrick BA, Tice RR. RNA-Seq-based toxicogenomic assessment of fresh frozen and formalin-fixed tissues yields similar mechanistic insights. J Appl Toxicol 2014; 35:766-80. [DOI: 10.1002/jat.3068] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 05/22/2014] [Revised: 07/22/2014] [Accepted: 07/26/2014] [Indexed: 12/13/2022]
Affiliation(s)
- Scott S. Auerbach
- Biomolecular Screening Branch, Division of the National Toxicology Program; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
| | | | | | - Stephanie Holmgren
- Library & Information Services Branch, Office of the Deputy Director; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
| | - Yuan Gao
- Department of Biomedical Engineering; Johns Hopkins University; Baltimore MD 21205 USA
| | - Bin Xie
- Department of Biomedical Engineering; Johns Hopkins University; Baltimore MD 21205 USA
| | - Joo Heon Shin
- Department of Biomedical Engineering; Johns Hopkins University; Baltimore MD 21205 USA
| | | | - B. Alex Merrick
- Biomolecular Screening Branch, Division of the National Toxicology Program; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
| | - Raymond R. Tice
- Biomolecular Screening Branch, Division of the National Toxicology Program; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
| |
|
1123
|
Morris JH, Knudsen GM, Verschueren E, Johnson JR, Cimermancic P, Greninger AL, Pico AR. Affinity purification-mass spectrometry and network analysis to understand protein-protein interactions. Nat Protoc 2014; 9:2539-54. [PMID: 25275790 PMCID: PMC4332878 DOI: 10.1038/nprot.2014.164] [Citation(s) in RCA: 134] [Impact Index Per Article: 12.2] [Indexed: 12/15/2022]
Abstract
By determining protein-protein interactions in normal, diseased and infected cells, we can improve our understanding of cellular systems and their reaction to various perturbations. In this protocol, we discuss how to use data obtained in affinity purification-mass spectrometry (AP-MS) experiments to generate meaningful interaction networks and effective figures. We begin with an overview of common epitope tagging, expression and AP practices, followed by liquid chromatography-MS (LC-MS) data collection. We then provide a detailed procedure covering a pipeline approach to (i) pre-processing the data by filtering against contaminant lists such as the Contaminant Repository for Affinity Purification (CRAPome) and normalization using the spectral index (SIN) or normalized spectral abundance factor (NSAF); (ii) scoring via methods such as MiST, SAInt and CompPASS; and (iii) testing the resulting scores. Data formats familiar to MS practitioners are then transformed to those most useful for network-based analyses. The protocol also explores methods available in Cytoscape to visualize and analyze these types of interaction data. The scoring pipeline can take anywhere from 1 d to 1 week, depending on one's familiarity with the tools and data peculiarities. Similarly, the network analysis and visualization protocol in Cytoscape takes 2-4 h to complete with the provided sample data, but we recommend taking days or even weeks to explore one's data and find the right questions.
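The normalization step named in step (i) can be made concrete with a short sketch of the normalized spectral abundance factor. It assumes the standard definition, in which each protein's spectral count is divided by its length and the result is scaled by the sum over all proteins in the sample; the function name and the toy numbers are hypothetical and are not drawn from the protocol itself.

```python
def nsaf(spectral_counts, lengths):
    """Compute NSAF values for one sample.

    spectral_counts: dict protein -> number of matched spectra
    lengths:         dict protein -> protein length in residues
    Returns a dict protein -> NSAF, with values summing to 1.
    """
    saf = {}
    for protein, count in spectral_counts.items():
        length = lengths[protein]
        if length <= 0:
            raise ValueError(f"non-positive length for {protein}")
        # Spectral abundance factor: counts corrected for protein length.
        saf[protein] = count / length
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra observed in this sample")
    # Normalize so that values are comparable across runs.
    return {protein: value / total for protein, value in saf.items()}


# Toy sample: prey B has more spectra but is four times longer than prey A.
result = nsaf({"preyA": 10, "preyB": 20}, {"preyA": 100, "preyB": 400})
```

In the toy sample the shorter protein ends up with the larger NSAF despite having fewer spectra, which is the length correction the normalization is intended to provide.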
Affiliation(s)
- John H Morris
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA
| | - Giselle M Knudsen
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA
| | - Erik Verschueren
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California, USA
| | - Jeffrey R Johnson
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California, USA
| | - Peter Cimermancic
- [1] Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California, USA. [2] Graduate Group in Bioinformatics, University of California, San Francisco, San Francisco, California, USA
| | - Alexander L Greninger
- School of Medicine, University of California, San Francisco, San Francisco, California, USA
| | - Alexander R Pico
- Gladstone Institutes, University of California, San Francisco, San Francisco, California, USA
| |
|
1124
|
Hao G, Jian L, Qi Z, Qiang L, Miao J, Ai-ping L, Cheng-ke C, Yun W. Based on bioinformatics approach to explore the novel targets and activity of multiple ingredients in Shuang-Huang-Lian (Using bioinformatics approach to explore the machanisms of a modern formula). 2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) 2014:33-34. [DOI: 10.1109/bibm.2014.6999318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|
1125
|
Brown GR, Hem V, Katz KS, Ovetsky M, Wallin C, Ermolaeva O, Tolstoy I, Tatusova T, Pruitt KD, Maglott DR, Murphy TD. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res 2014; 43:D36-42. [PMID: 25355515 DOI: 10.1093/nar/gku1055] [Citation(s) in RCA: 431] [Impact Index Per Article: 39.2] [Indexed: 01/20/2023] Open
Abstract
The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP.
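Programmatic access through the E-Utilities can be sketched as follows. The example only builds a request URL for the ESummary endpoint against the Gene database and does not contact NCBI; the helper name is hypothetical, the two Gene IDs are example inputs, and the choice of ESummary with JSON output is one reasonable option rather than a recommendation from the abstract.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_summary_url(gene_ids, retmode="json"):
    """Build an ESummary request URL for one or more Gene identifiers.

    gene_ids: iterable of integer Gene IDs (records in Gene are keyed
              by unique, tracked integers).
    """
    ids = ",".join(str(int(gene_id)) for gene_id in gene_ids)
    if not ids:
        raise ValueError("at least one Gene ID is required")
    query = urlencode({"db": "gene", "id": ids, "retmode": retmode})
    return f"{EUTILS_BASE}/esummary.fcgi?{query}"


# Example inputs: 7157 (TP53) and 672 (BRCA1).
url = gene_summary_url([7157, 672])
```

The resulting URL could then be fetched with `urllib.request.urlopen(url)`; anyone doing so at scale should follow NCBI's usage guidelines on request rates.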
Affiliation(s)
- Garth R Brown
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Vichet Hem
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Kenneth S Katz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Michael Ovetsky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Craig Wallin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Olga Ermolaeva
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Igor Tolstoy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Tatiana Tatusova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Donna R Maglott
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| |
|
1126
|
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, von Mering C. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 2014; 43:D447-52. [PMID: 25352553 PMCID: PMC4383874 DOI: 10.1093/nar/gku1003] [Citation(s) in RCA: 7448] [Impact Index Per Article: 677.1] [Indexed: 02/06/2023] Open
Abstract
The many functional partnerships and interactions that occur between proteins are at the core of cellular processing and their systematic characterization helps to provide context in molecular systems biology. However, known and predicted interactions are scattered over multiple resources, and the available data exhibit notable differences in terms of quality and completeness. The STRING database (http://string-db.org) aims to provide a critical assessment and integration of protein–protein interactions, including direct (physical) as well as indirect (functional) associations. The new version 10.0 of STRING covers more than 2000 organisms, which has necessitated novel, scalable algorithms for transferring interaction information between organisms. For this purpose, we have introduced hierarchical and self-consistent orthology annotations for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution. Further improvements in version 10.0 include a completely redesigned prediction pipeline for inferring protein–protein associations from co-expression data, an API interface for the R computing environment and improved statistical analysis for enrichment tests in user-provided networks.
Affiliation(s)
- Damian Szklarczyk
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Andrea Franceschini
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Stefan Wyder
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Davide Heller
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Milan Simonovic
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Alexander Roth
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Alberto Santos
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Kalliopi P Tsafou
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Michael Kuhn
- Biotechnology Center, Technische Universität Dresden, 01062 Dresden, Germany; Max Planck Institute of Molecular Cell Biology and Genetics, 01062 Dresden, Germany
- Peer Bork
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Lars J Jensen
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
1127
A guide for building biological pathways along with two case studies: hair and breast development. Methods 2014; 74:16-35. [PMID: 25449898 DOI: 10.1016/j.ymeth.2014.10.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/05/2014] [Revised: 08/26/2014] [Accepted: 10/03/2014] [Indexed: 11/23/2022]
Abstract
Genomic information is increasingly represented in the form of biological pathways. Building these pathways is an ongoing demand and benefits from text-mining tools that extract information from the biomedical literature. Here we guide the reader through building a customized pathway or chart representation of a system. Our manual is based on a group of software tools designed to look at biointeractions in a set of abstracts retrieved from PubMed. The tools aim to support someone with a biological background, who does not need to be an expert on the subject and who plays the role of manual curator while designing the representation of the system, the pathway. We illustrate the process with two challenging case studies, hair and breast development, chosen because they concern recent acquisitions of human evolution. We produced sub-pathways for each study, representing different phases of development. Unlike most charts in current databases, ours come with detailed descriptions, which additionally guide PESCADOR users through the process. Its implementation as a web interface makes PESCADOR a unique tool for guiding the user through the biointeractions that will constitute a novel pathway.
1128
Knaack SA, Siahpirani AF, Roy S. A pan-cancer modular regulatory network analysis to identify common and cancer-specific network components. Cancer Inform 2014; 13:69-84. [PMID: 25374456 PMCID: PMC4213198 DOI: 10.4137/cin.s14058] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 07/07/2014] [Revised: 09/22/2014] [Accepted: 09/24/2014] [Indexed: 12/19/2022]
Abstract
Many human diseases including cancer are the result of perturbations to transcriptional regulatory networks that control context-specific expression of genes. A comparative approach across multiple cancer types is a powerful approach to illuminate the common and specific network features of this family of diseases. Recent efforts from The Cancer Genome Atlas (TCGA) have generated large collections of functional genomic data sets for multiple types of cancers. An emerging challenge is to devise computational approaches that systematically compare these genomic data sets across different cancer types that identify common and cancer-specific network components. We present a module- and network-based characterization of transcriptional patterns in six different cancers being studied in TCGA: breast, colon, rectal, kidney, ovarian, and endometrial. Our approach uses a recently developed regulatory network reconstruction algorithm, modular regulatory network learning with per gene information (MERLIN), within a stability selection framework to predict regulators for individual genes and gene modules. Our module-based analysis identifies a common theme of immune system processes in each cancer study, with modules statistically enriched for immune response processes as well as targets of key immune response regulators from the interferon regulatory factor (IRF) and signal transducer and activator of transcription (STAT) families. Comparison of the inferred regulatory networks from each cancer type identified a core regulatory network that included genes involved in chromatin remodeling, cell cycle, and immune response. Regulatory network hubs included genes with known roles in specific cancer types as well as genes with potentially novel roles in different cancer types. 
Overall, our integrated module and network analysis recapitulated known themes in cancer biology and additionally revealed novel regulatory hubs that suggest a complex interplay of immune response, cell cycle, and chromatin remodeling across multiple cancers.
Affiliation(s)
- Sara A Knaack
- Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI, USA
- Alireza Fotuhi Siahpirani
- Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI, USA; Department of Computer Sciences, University of Wisconsin, Madison, WI, USA
- Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA
1129
Kibbe WA, Arze C, Felix V, Mitraka E, Bolton E, Fu G, Mungall CJ, Binder JX, Malone J, Vasant D, Parkinson H, Schriml LM. Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res 2014; 43:D1071-8. [PMID: 25348409 PMCID: PMC4383880 DOI: 10.1093/nar/gku1011] [Citation(s) in RCA: 373] [Impact Index Per Article: 33.9] [Indexed: 01/02/2023]
Abstract
The current version of the Human Disease Ontology (DO) (http://www.disease-ontology.org) database expands the utility of the ontology for the examination and comparison of genetic variation, phenotype, protein, drug and epitope data through the lens of human disease. DO is a biomedical resource of standardized common and rare disease concepts with stable identifiers organized by disease etiology. The content of DO has had 192 revisions since 2012, including the addition of 760 terms. Thirty-two percent of all terms now include definitions. DO has expanded the number and diversity of research communities and community members by 50+ during the past two years. These community members actively submit term requests, coordinate biomedical resource disease representation and provide expert curation guidance. Since the DO 2012 NAR paper, there have been hundreds of term requests and a steady increase in the number of DO listserv members, twitter followers and DO website usage. DO is moving to a multi-editor model utilizing Protégé to curate DO in web ontology language. This will enable closer collaboration with the Human Phenotype Ontology, EBI's Ontology Working Group, Mouse Genome Informatics and the Monarch Initiative among others, and enhance DO's current asserted view and multiple inferred views through reasoning.
Affiliation(s)
- Warren A Kibbe
- Center for Biomedical Informatics and Information Technology, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA
- Cesar Arze
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Victor Felix
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Elvira Mitraka
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Evan Bolton
- PubChem, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Gang Fu
- PubChem, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Janos X Binder
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, 69117, Germany; Bioinformatics Core Facility, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, 4362, Luxembourg
- James Malone
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Drashtti Vasant
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Lynn M Schriml
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA; Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, MD 21201, USA
1130
Abstract
Phosphatases are crucial enzymes in health and disease, but the knowledge of their biological roles is still limited. Identifying substrates continues to be a great challenge. To support the research on phosphatase-kinase-substrate networks we present here an update on the human DEPhOsphorylation Database: DEPOD (http://www.depod.org or http://www.koehn.embl.de/depod). DEPOD is a manually curated open access database providing human phosphatases, their protein and non-protein substrates, dephosphorylation sites, pathway involvements and external links to kinases and small molecule modulators. All internal data are fully searchable including a BLAST application. Since the first release, more human phosphatases and substrates, their associated signaling pathways (also from new sources), and interacting proteins for all phosphatases and protein substrates have been added into DEPOD. The user interface has been further optimized; for example, the interactive human phosphatase-substrate network contains now a 'highlight node' function for phosphatases, which includes the visualization of neighbors in the network.
Affiliation(s)
- Guangyou Duan
- European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Xun Li
- European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Maja Köhn
- European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
1131
Morgat A, Axelsen KB, Lombardot T, Alcántara R, Aimo L, Zerara M, Niknejad A, Belda E, Hyka-Nouspikel N, Coudert E, Redaschi N, Bougueleret L, Steinbeck C, Xenarios I, Bridge A. Updates in Rhea--a manually curated resource of biochemical reactions. Nucleic Acids Res 2014; 43:D459-64. [PMID: 25332395 PMCID: PMC4384025 DOI: 10.1093/nar/gku961] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 12/21/2022]
Abstract
Rhea (http://www.ebi.ac.uk/rhea) is a comprehensive and non-redundant resource of expert-curated biochemical reactions described using species from the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Rhea has been designed for the functional annotation of enzymes and the description of genome-scale metabolic networks, providing stoichiometrically balanced enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list and additional reactions), transport reactions and spontaneously occurring reactions. Rhea reactions are extensively curated with links to source literature and are mapped to other publicly available enzyme and pathway databases such as Reactome, BioCyc, KEGG and UniPathway, through manual curation and computational methods. Here we describe developments in Rhea since our last report in the 2012 database issue of Nucleic Acids Research. These include significant growth in the number of Rhea reactions and the inclusion of reactions involving complex macromolecules such as proteins, nucleic acids and other polymers that lie outside the scope of ChEBI. Together these developments will significantly increase the utility of Rhea as a tool for the description, analysis and reconciliation of genome-scale metabolic models.
Affiliation(s)
- Anne Morgat
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland; Genoscope-LABGeM, CEA, Evry, F-91057, France
- Kristian B Axelsen
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Thierry Lombardot
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Rafael Alcántara
- Equipe BAMBOO, INRIA Grenoble Rhône-Alpes, Montbonnot Saint-Martin, F-38330, France
- Lucila Aimo
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Mohamed Zerara
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Anne Niknejad
- Cheminformatics and Metabolism Team, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
- Eugeni Belda
- Department of Biochemistry, University of Geneva, Geneva, CH-1206, Switzerland
- Nevila Hyka-Nouspikel
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Elisabeth Coudert
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Nicole Redaschi
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Lydie Bougueleret
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Christoph Steinbeck
- Equipe BAMBOO, INRIA Grenoble Rhône-Alpes, Montbonnot Saint-Martin, F-38330, France
- Ioannis Xenarios
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland; Cheminformatics and Metabolism Team, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
- Alan Bridge
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
1132
Abstract
BioLayout Express (3D) is a network analysis tool designed for the visualisation and analysis of graphs derived from biological data. It has proved to be powerful in the analysis of gene expression data, biological pathways and in a range of other applications. In version 3.2 of the tool we have introduced the ability to import, merge and display pathways and protein interaction networks available in the BioPAX Level 3 standard exchange format. A graphical interface allows users to search for pathways or interaction data stored in the Pathway Commons database. Queries using either gene/protein or pathway names are made via the cPath2 client and users can also define the source and/or species of information that they wish to examine. Data matching a query are listed and individual records may be viewed in isolation or merged using an 'Advanced' query tab. A visualisation scheme has been defined by mapping BioPAX entity types to a range of glyphs. Graphs of these data can be viewed and explored within BioLayout as 2D or 3D graph layouts, where they can be edited and/or exported for visualisation and editing within other tools.
Affiliation(s)
- Derek W. Wright
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, Scotland, EH25 9RG, UK
- Tim Angus
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, Scotland, EH25 9RG, UK
- Anton J. Enright
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Tom C. Freeman
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, Scotland, EH25 9RG, UK
1133
Davis AP, Grondin CJ, Lennon-Hopkins K, Saraceni-Richards C, Sciaky D, King BL, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database's 10th year anniversary: update 2015. Nucleic Acids Res 2014; 43:D914-20. [PMID: 25326323 PMCID: PMC4384013 DOI: 10.1093/nar/gku935] [Citation(s) in RCA: 262] [Impact Index Per Article: 23.8] [Indexed: 12/21/2022]
Abstract
Ten years ago, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) was developed out of a need to formalize, harmonize and centralize the information on numerous genes and proteins responding to environmental toxic agents across diverse species. CTD's initial approach was to facilitate comparisons of nucleotide and protein sequences of toxicologically significant genes by curating these sequences and electronically annotating them with chemical terms from their associated references. Since then, however, CTD has vastly expanded its scope to robustly represent a triad of chemical–gene, chemical–disease and gene–disease interactions that are manually curated from the scientific literature by professional biocurators using controlled vocabularies, ontologies and structured notation. Today, CTD includes 24 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, phenotypes, Gene Ontology annotations, pathways and interaction modules. In this 10th year anniversary update, we outline the evolution of CTD, including our increased data content, new ‘Pathway View’ visualization tool, enhanced curation practices, pilot chemical–phenotype results and impending exposure data set. The prototype database originally described in our first report has transformed into a sophisticated resource used actively today to help scientists develop and test hypotheses about the etiologies of environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Cynthia J Grondin
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Kelley Lennon-Hopkins
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Daniela Sciaky
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Benjamin L King
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Thomas C Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Carolyn J Mattingly
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
1134
Wang Y, Fan X, Cai Y. A comparative study of improvements Pre-filter methods bring on feature selection using microarray data. Health Inf Sci Syst 2014; 2:7. [PMID: 25825671 PMCID: PMC4340279 DOI: 10.1186/2047-2501-2-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/18/2014] [Accepted: 10/03/2014] [Indexed: 12/13/2022]
Abstract
Background: Feature selection techniques have become a clear need in biomarker discovery with the development of microarrays. However, the high-dimensional nature of microarray data makes feature selection time-consuming. To overcome this difficulty, filtering data according to background knowledge before applying feature selection techniques has become a hot topic in microarray analysis. Different methods may affect the final results greatly, so it is important to evaluate these pre-filter methods systematically. Methods: We compared the performance of statistical-based and biological-based pre-filter methods, and their combination, on microRNA-mRNA parallel expression profiles, using L1 logistic regression as the feature selection technique. Four types of data were built for both microRNA and mRNA expression profiles. Results: Pre-filter methods greatly reduced the number of features for both mRNA and microRNA expression datasets. The features selected after pre-filter procedures were shown to be biologically significant at levels such as biological process and microRNA function. Analyses of classification performance based on precision showed that pre-filter methods were necessary when the number of raw features was much larger than the number of samples. Computing time was greatly shortened after pre-filter procedures. Conclusions: With similar or better classification performance and fewer but biologically significant features, pre-filter-based feature selection should be considered when researchers need fast results on complex computing problems in bioinformatics.
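The two-stage idea in this abstract (pre-filter first, then L1 logistic regression as the feature selector) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the abstract does not say which statistic the statistical pre-filter uses, so a variance threshold stands in for it, and the toy data, the function names, and the proximal-gradient solver are all hypothetical rather than the authors' implementation.

```python
import math
from statistics import pvariance

def variance_prefilter(rows, min_var):
    """Return indices of columns whose population variance exceeds min_var.

    Assumption: variance is used as a stand-in for the paper's
    unspecified statistical pre-filter.
    """
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns) if pvariance(col) > min_var]

def soft_threshold(value, amount):
    """Shrink value toward zero by amount (the proximal step for an L1 penalty)."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0

def l1_logistic(rows, labels, lam=0.1, lr=0.5, steps=200):
    """Fit logistic regression weights (no intercept) by proximal gradient descent."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(rows, labels):
            z = sum(wj * xj for wj, xj in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += (p - y) * x[j] / n
        # Gradient step followed by soft-thresholding, which sets weak weights to exactly 0.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy data: column 0 tracks the label, column 1 is constant,
# column 2 is balanced across both classes and so carries no signal.
rows = [(1, 5, 1), (1, 5, -1), (-1, 5, 1), (-1, 5, -1)]
labels = [1, 1, 0, 0]

kept = variance_prefilter(rows, 0.0)                 # pre-filter drops the constant column
filtered = [[row[j] for j in kept] for row in rows]
weights = l1_logistic(filtered, labels)              # L1 penalty zeroes the uninformative column
selected = [kept[j] for j, wj in enumerate(weights) if wj != 0.0]
print(kept, selected)
```

The pre-filter shrinks the input that the slower, iterative selector has to process, which is the source of the computing-time savings the abstract reports; on real expression data a library solver would normally replace the hand-written loop.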
Affiliation(s)
- Yingying Wang
- Research Center for Biomedical Information, Shenzhen Institutes of Advanced Technologies, Chinese Academy of Sciences, Shenzhen, China
- Xiaomao Fan
- Research Center for Biomedical Information, Shenzhen Institutes of Advanced Technologies, Chinese Academy of Sciences, Shenzhen, China
- Yunpeng Cai
- Research Center for Biomedical Information, Shenzhen Institutes of Advanced Technologies, Chinese Academy of Sciences, Shenzhen, China
1135
Meldal BHM, Forner-Martinez O, Costanzo MC, Dana J, Demeter J, Dumousseau M, Dwight SS, Gaulton A, Licata L, Melidoni AN, Ricard-Blum S, Roechert B, Skyzypek MS, Tiwari M, Velankar S, Wong ED, Hermjakob H, Orchard S. The complex portal--an encyclopaedia of macromolecular complexes. Nucleic Acids Res 2014; 43:D479-84. [PMID: 25313161 PMCID: PMC4384031 DOI: 10.1093/nar/gku975] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 01/25/2023]
Abstract
The IntAct molecular interaction database has created a new, free, open-source, manually curated resource, the Complex Portal (www.ebi.ac.uk/intact/complex), through which protein complexes from major model organisms are being collated and made available for search, viewing and download. It has been built in close collaboration with other bioinformatics services and populated with data from ChEMBL, MatrixDB, PDBe, Reactome and UniProtKB. Each entry contains information about the participating molecules (including small molecules and nucleic acids), their stoichiometry, topology and structural assembly. Complexes are annotated with details about their function, properties and complex-specific Gene Ontology (GO) terms. Consistent nomenclature is used throughout the resource with systematic names, recommended names and a list of synonyms all provided. The use of the Evidence Code Ontology allows us to indicate for which entries direct experimental evidence is available or if the complex has been inferred based on homology or orthology. The data are searchable using standard identifiers, such as UniProt, ChEBI and GO IDs, protein, gene and complex names or synonyms. This reference resource will be maintained and grow to encompass an increasing number of organisms. Input from groups and individuals with specific areas of expertise is welcome.
Affiliation(s)
- Birgit H M Meldal
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Oscar Forner-Martinez
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Maria C Costanzo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Jose Dana
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Janos Demeter
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Marine Dumousseau
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Selina S Dwight
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Anna Gaulton
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Luana Licata
- Department of Biology, University of Rome, Tor Vergata, Rome 00133, Italy
- Anna N Melidoni
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Sylvie Ricard-Blum
- UMR 5086 CNRS, Université Lyon1, Institut de Biologie et Chimie des Protéines, 7 passage du Vercors, 69367 Lyon Cedex 07, France
- Bernd Roechert
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland
- Marek S Skyzypek
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Manu Tiwari
- Stammzellbiologie, Institut für Anatomie und Zellbiologie, GZMB Universitätsmedizin Göttingen, Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany
- Sameer Velankar
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Edith D Wong
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Henning Hermjakob
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Sandra Orchard
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
1136
Kim I, Lee H, Han SK, Kim S. Linear motif-mediated interactions have contributed to the evolution of modularity in complex protein interaction networks. PLoS Comput Biol 2014; 10:e1003881. [PMID: 25299147 PMCID: PMC4191887 DOI: 10.1371/journal.pcbi.1003881] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/06/2014] [Accepted: 08/29/2014] [Indexed: 02/06/2023]
Abstract
The modular architecture of protein-protein interaction (PPI) networks is evident in diverse species with a wide range of complexity. However, the molecular components that lead to the evolution of modularity in PPI networks have not been clearly identified. Here, we show that weak domain-linear motif interactions (DLIs) are more likely to connect different biological modules than strong domain-domain interactions (DDIs). This molecular division of labor is essential for the evolution of modularity in the complex PPI networks of diverse eukaryotic species. In particular, DLIs may compensate for the reduction in module boundaries that originates from increased connections between different modules in complex PPI networks. In addition, we show that the identification of biological modules can be greatly improved by including molecular characteristics of protein interactions. Our findings suggest that transient interactions have played a unique role in shaping the architecture and modularity of biological networks over the course of evolution.
Modular architecture is important for the evolution of cellular systems. Modular rearrangements facilitate functional innovations and modular insulations provide robustness to perturbations. However, the mechanisms underlying modular network evolution are not well understood at the molecular level. Here we show that strong domain-domain interactions (DDIs) and weak domain-linear motif interactions (DLIs) made different contributions to the evolution of the modular architecture of PPI networks. In particular, DLIs mediate between-module interactions, and their relative abundance has dramatically increased in metazoan species. Linear motifs have been identified as evolutionary interaction switches since subtle amino acid changes can cause the short sequences in linear motifs to appear and disappear. Our results suggest that subtle changes in linear motifs have contributed to the rewiring of functional modules and, consequently, to functional innovations in metazoan species.
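As a rough illustration of the comparison this abstract describes, the sketch below computes, for each interaction type, the fraction of edges that connect proteins assigned to different modules. The toy network, the module labels, and the function name `between_module_fraction` are hypothetical and not taken from the paper; the authors' module detection and statistical testing are not reproduced here.

```python
from collections import defaultdict

def between_module_fraction(edges, module_of):
    """Return {interaction_type: fraction of edges linking different modules}."""
    total = defaultdict(int)
    crossing = defaultdict(int)
    for a, b, kind in edges:
        # Skip edges that involve a protein without a module assignment.
        if a not in module_of or b not in module_of:
            continue
        total[kind] += 1
        if module_of[a] != module_of[b]:
            crossing[kind] += 1
    return {kind: crossing[kind] / total[kind] for kind in total}

# Hypothetical toy network: five proteins in two modules, edges tagged by type.
module_of = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2}
edges = [
    ("A", "B", "DDI"),
    ("B", "C", "DDI"),
    ("D", "E", "DDI"),
    ("C", "D", "DLI"),
    ("A", "E", "DLI"),
    ("A", "C", "DLI"),
]

fractions = between_module_fraction(edges, module_of)
print(fractions)
```

In this toy case every DDI stays within a module while two of the three DLIs cross module boundaries; a higher between-module fraction for DLIs than for DDIs is the kind of pattern the abstract reports for real PPI networks.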
Affiliation(s)
- Inhae Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Heetak Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Seong Kyu Han
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Sanguk Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- School of Interdisciplinary Bioscience and Bioengineering, Pohang University of Science and Technology, Pohang, Korea
1137
Cicek AE, Qi X, Cakmak A, Johnson SR, Han X, Alshalwi S, Ozsoyoglu ZM, Ozsoyoglu G. An online system for metabolic network analysis. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau091. [PMID: 25267793 PMCID: PMC4178370 DOI: 10.1093/database/bau091] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]
Abstract
Metabolic networks have become one of the centers of attention in life sciences research with the advancements in the metabolomics field. A vast array of studies analyzes metabolites and their interrelations to seek explanations for various biological questions, and numerous genome-scale metabolic networks have been assembled to serve this purpose. The increasing focus on this topic comes with the need for software systems that store, query, browse, analyze and visualize metabolic networks. PathCase Metabolomics Analysis Workbench (PathCaseMAW) is built, released and runs on a manually created generic mammalian metabolic network. The PathCaseMAW system provides a database-enabled framework and Web-based computational tools for browsing, querying, analyzing and visualizing stored metabolic networks. The PathCaseMAW editor, with its user-friendly interface, can be used to create a new metabolic network and/or update an existing one. A network can also be created from an existing genome-scale reconstructed network using the PathCaseMAW SBML parser. The metabolic network can be accessed through a Web interface or an iPad application. For metabolomics analysis, the steady-state metabolic network dynamics analysis (SMDA) algorithm is implemented and integrated with the system. The SMDA tool is accessible through both the Web-based interface and the iPad application for metabolomics analysis based on a metabolic profile. PathCaseMAW is a comprehensive system with various data input and data access subsystems. It is easy to work with by design, and is a promising tool for metabolomics research and for educational purposes. Database URL: http://nashua.case.edu/PathwaysMAW/Web
Affiliation(s)
- Abdullah Ercument Cicek
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
- Xinjian Qi
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
- Ali Cakmak
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
- Stephen R Johnson
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
- Xu Han
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
- Sami Alshalwi
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
- Zehra Meral Ozsoyoglu
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
- Gultekin Ozsoyoglu
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
1138
|
Beck TN, Chikwem AJ, Solanki NR, Golemis EA. Bioinformatic approaches to augment study of epithelial-to-mesenchymal transition in lung cancer. Physiol Genomics 2014; 46:699-724. [PMID: 25096367 PMCID: PMC4187119 DOI: 10.1152/physiolgenomics.00062.2014] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 05/15/2014] [Accepted: 08/04/2014] [Indexed: 12/22/2022] Open
Abstract
Bioinformatic approaches are intended to provide systems level insight into the complex biological processes that underlie serious diseases such as cancer. In this review we describe current bioinformatic resources, and illustrate how they have been used to study a clinically important example: epithelial-to-mesenchymal transition (EMT) in lung cancer. Lung cancer is the leading cause of cancer-related deaths and is often diagnosed at advanced stages, leading to limited therapeutic success. While EMT is essential during development and wound healing, pathological reactivation of this program by cancer cells contributes to metastasis and drug resistance, both major causes of death from lung cancer. Challenges of studying EMT include its transient nature, its molecular and phenotypic heterogeneity, and the complicated networks of rewired signaling cascades. Given the biology of lung cancer and the role of EMT, it is critical to better align the two in order to advance the impact of precision oncology. This task relies heavily on the application of bioinformatic resources. Besides summarizing recent work in this area, we use four EMT-associated genes, TGF-β (TGFB1), NEDD9/HEF1, β-catenin (CTNNB1) and E-cadherin (CDH1), as exemplars to demonstrate the current capacities and limitations of probing bioinformatic resources to inform hypothesis-driven studies with therapeutic goals.
Affiliation(s)
- Tim N Beck
- Developmental Therapeutics Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Program in Molecular and Cell Biology and Genetics, Drexel University College of Medicine, Philadelphia, Pennsylvania
- Adaeze J Chikwem
- Developmental Therapeutics Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Temple University School of Medicine, Philadelphia, Pennsylvania
- Nehal R Solanki
- Immune Cell Development and Host Defense Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Program in Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, Pennsylvania
- Erica A Golemis
- Developmental Therapeutics Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Temple University School of Medicine, Philadelphia, Pennsylvania; and Program in Molecular and Cell Biology and Genetics, Drexel University College of Medicine, Philadelphia, Pennsylvania
1139
|
Zhang Q, Yang B, Chen X, Xu J, Mei C, Mao Z. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau092. [PMID: 25252782 PMCID: PMC4173636 DOI: 10.1093/database/bau092] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
Abstract
We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore specific gene profiles and the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: The website is implemented in PHP, R, MySQL and Nginx and is freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net.
Affiliation(s)
- Qingzhou Zhang
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Bo Yang
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Xujiao Chen
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Jing Xu
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Changlin Mei
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Zhiguo Mao
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
1140
|
Butler WE, Atai N, Carter B, Hochberg F. Informatic system for a global tissue-fluid biorepository with a graph theory-oriented graphical user interface. J Extracell Vesicles 2014; 3:24247. [PMID: 25317275 PMCID: PMC4172698 DOI: 10.3402/jev.v3.24247] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/03/2014] [Revised: 06/13/2014] [Accepted: 06/15/2014] [Indexed: 12/12/2022] Open
Abstract
The Richard Floor Biorepository supports collaborative studies of extracellular vesicles (EVs) found in human fluids and tissue specimens. The current emphasis is on biomarkers for central nervous system neoplasms but its structure may serve as a template for collaborative EV translational studies in other fields. The informatic system provides specimen inventory tracking with bar codes assigned to specimens and containers and projects, is hosted on globalized cloud computing resources, and embeds a suite of shared documents, calendars, and video-conferencing features. Clinical data are recorded in relation to molecular EV attributes and may be tagged with terms drawn from a network of externally maintained ontologies thus offering expansion of the system as the field matures. We fashioned the graphical user interface (GUI) around a web-based data visualization package. This system is now in an early stage of deployment, mainly focused on specimen tracking and clinical, laboratory, and imaging data capture in support of studies to optimize detection and analysis of brain tumour-specific mutations. It currently includes 4,392 specimens drawn from 611 subjects, the majority with brain tumours. As EV science evolves, we plan biorepository changes which may reflect multi-institutional collaborations, proteomic interfaces, additional biofluids, changes in operating procedures and kits for specimen handling, novel procedures for detection of tumour-specific EVs, and for RNA extraction and changes in the taxonomy of EVs. We have used an ontology-driven data model and web-based architecture with a graph theory-driven GUI to accommodate and stimulate the semantic web of EV science.
Affiliation(s)
- William E. Butler
- Neurosurgical Service, Massachusetts General Hospital, Boston, MA, USA
- Massachusetts General Hospital, Boston, MA, USA
- Nadia Atai
- Neurosurgical Service, Massachusetts General Hospital, Boston, MA, USA
- Massachusetts General Hospital, Boston, MA, USA
- Department of Cell Biology and Histology, University of Amsterdam, Amsterdam, The Netherlands
- Bob Carter
- Department of Neurosurgery, University of San Diego Medical School, San Diego, CA, USA
1141
|
Okada Y. From the era of genome analysis to the era of genomic drug discovery: a pioneering example of rheumatoid arthritis. Clin Genet 2014; 86:432-40. [PMID: 25060537 DOI: 10.1111/cge.12465] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 05/07/2014] [Revised: 07/19/2014] [Accepted: 07/21/2014] [Indexed: 01/18/2023]
Abstract
Although we have obtained comprehensive catalogs of genetic risk loci that are linked to human diseases, little is known regarding how to devise a systematic strategy to integrate genetic study results with diverse biological resources. Such strategies will be crucial for providing novel insights into disease biology and for aiding drug discovery as an ultimate goal. Here we describe the current progress in this field using a pioneering example of large-scale genetic association studies on rheumatoid arthritis (RA), an autoimmune disease characterized by inflammation and destruction of joints. Through functional and bioinformatic annotations of risk single nucleotide polymorphisms (SNPs) and genes from >100 RA risk loci identified by genome-wide association study (GWAS) meta-analysis, we found novel biological insights into RA pathogenicity. Further, by integrating RA genetic findings with the complete catalog of approved drugs for RA and other diseases, we provide empirical data to indicate that human genetic-based approaches may be useful for supporting 'genetics-driven genomic drug discovery' efforts in complex human traits and suggest that further development of integrative approaches should be undertaken.
Affiliation(s)
- Y Okada
- Department of Human Genetics and Disease Diversity, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan; Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
1142
|
Manunza A, Casellas J, Quintanilla R, González-Prendes R, Pena RN, Tibau J, Mercadé A, Castelló A, Aznárez N, Hernández-Sánchez J, Amills M. A genome-wide association analysis for porcine serum lipid traits reveals the existence of age-specific genetic determinants. BMC Genomics 2014; 15:758. [PMID: 25189197 PMCID: PMC4164741 DOI: 10.1186/1471-2164-15-758] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 09/12/2013] [Accepted: 07/25/2014] [Indexed: 01/07/2023] Open
Abstract
Background: The genetic determinism of blood lipid concentrations, the main risk factor for atherosclerosis, is practically unknown in species other than human and mouse. Even in model organisms, little is known about how the genetic determinants of lipid traits are modulated by age-specific factors. To gain new insights into this issue, we have carried out a genome-wide association study (GWAS) for cholesterol (CHOL), triglyceride (TRIG) and low (LDL) and high (HDL) density lipoprotein concentrations measured in Duroc pigs at two time points (45 and 190 days). Results: Analysis of data with mixed-model methods (EMMAX, GEMMA, GenABEL) and PLINK showed a low positional concordance between trait-associated regions (TARs) for serum lipids at 45 and 190 days. Besides, the proportion of phenotypic variance explained by SNPs at these two time points was also substantially different. The four analyses consistently detected two regions on SSC3 (124 Mb, CHOL and LDL at 190 days) and SSC6 (135 Mb, CHOL and TRIG at 190 days) with highly significant effects on the porcine blood lipid profile. Moreover, we have found that SNP variation within SSC3, SSC6, SSC10, SSC13 and SSC16 TARs is associated with the expression of several genes mapping to other chromosomes and related to lipid metabolism. Conclusions: Our data demonstrate that the effects of genomic determinants influencing lipid concentrations in pigs, as well as the amount of phenotypic variance they explain, are influenced by age-related factors. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-758) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Marcel Amills
- Department of Animal Genetics, Center for Research in Agricultural Genomics (CSIC-IRTA-UAB-UB), Universitat Autònoma de Barcelona, Bellaterra 08193, Spain.
1143
|
Bean DM, Heimbach J, Ficorella L, Micklem G, Oliver SG, Favrin G. esyN: network building, sharing and publishing. PLoS One 2014; 9:e106035. [PMID: 25181461 PMCID: PMC4152123 DOI: 10.1371/journal.pone.0106035] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 05/30/2014] [Accepted: 07/27/2014] [Indexed: 01/18/2023] Open
Abstract
The construction and analysis of networks is increasingly widespread in biological research. We have developed esyN ("easy networks") as a free and open source tool to facilitate the exchange of biological network models between researchers. esyN acts as a searchable database of user-created networks from any field. We have developed a simple companion web tool that enables users to view and edit networks using data from publicly available databases. Both normal interaction networks (graphs) and Petri nets can be created. In addition to its basic tools, esyN contains a number of logical templates that can be used to create models more easily. The ability to use previously published models as building blocks makes esyN a powerful tool for the construction of models and network graphs. Users are able to save their own projects online and share them either publicly or with a list of collaborators. The latter can be given the ability to edit the network themselves, allowing online collaboration on network construction. esyN is designed to facilitate unrestricted exchange of this increasingly important type of biological information. Ultimately, the aim of esyN is to bring the advantages of Open Source software development to the construction of biological networks.
Affiliation(s)
- Daniel M. Bean
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Joshua Heimbach
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Lorenzo Ficorella
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Dipartimento di Biochimica, Università degli Studi di Pisa, Pisa, Italy
- Gos Micklem
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Stephen G. Oliver
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Giorgio Favrin
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
1144
|
Do JH. Neurotoxin-induced pathway perturbation in human neuroblastoma SH-EP cells. Mol Cells 2014; 37:672-84. [PMID: 25234470 PMCID: PMC4179136 DOI: 10.14348/molcells.2014.0173] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/23/2014] [Revised: 08/09/2014] [Accepted: 08/11/2014] [Indexed: 01/20/2023] Open
Abstract
The exact causes of cell death in Parkinson's disease (PD) remain unknown despite extensive studies on PD. The identification of signaling and metabolic pathways involved in PD might provide insight into the molecular mechanisms underlying PD. The neurotoxin 1-methyl-4-phenylpyridinium (MPP(+)) induces cellular changes characteristic of PD, and MPP(+)-based models have been extensively used for PD studies. In this study, pathways that were significantly perturbed in MPP(+)-treated human neuroblastoma SH-EP cells were identified from genome-wide gene expression data for five time points (1.5, 3, 9, 12, and 24 h) after treatment. The mitogen-activated protein kinase (MAPK) signaling pathway and endoplasmic reticulum (ER) protein processing pathway showed significant perturbation at all time points. Perturbation of each of these pathways resulted in the common outcome of upregulation of DNA-damage-inducible transcript 3 (DDIT3). Genes involved in the ER protein processing pathway included ubiquitin ligase complex genes and ER-associated degradation (ERAD)-related genes. Additionally, overexpression of DDIT3 might induce oxidative stress via glutathione depletion as a result of overexpression of CHAC1. This study suggests that upregulation of DDIT3 caused by perturbation of the MAPK signaling pathway and ER protein processing pathway might play a key role in MPP(+)-induced neuronal cell death. Moreover, the toxicity signal of MPP(+) resulting from mitochondrial dysfunction through inhibition of complex I of the electron transport chain might feed back to the mitochondria via ER stress. This positive feedback could contribute to amplification of the death signal induced by MPP(+).
Affiliation(s)
- Jin Hwan Do
- Department of Biomolecular and Chemical Engineering, DongYang University, Yeongju 750-711, Korea
1145
|
Ma'ayan A, Rouillard AD, Clark NR, Wang Z, Duan Q, Kou Y. Lean Big Data integration in systems biology and systems pharmacology. Trends Pharmacol Sci 2014; 35:450-60. [PMID: 25109570 PMCID: PMC4153537 DOI: 10.1016/j.tips.2014.07.001] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.8] [Received: 05/08/2014] [Revised: 07/01/2014] [Accepted: 07/08/2014] [Indexed: 12/11/2022]
Abstract
Data sets from recent large-scale projects can be integrated into one unified puzzle that can provide new insights into how drugs and genetic perturbations applied to human cells are linked to whole-organism phenotypes. Data that report how drugs affect the phenotype of human cell lines and how drugs induce changes in gene and protein expression in human cell lines can be combined with knowledge about human disease, side effects induced by drugs, and mouse phenotypes. Such data integration efforts can be achieved through the conversion of data from the various resources into single-node-type networks, gene-set libraries, or multipartite graphs. This approach can lead us to the identification of more relationships between genes, drugs, and phenotypes as well as benchmark computational and experimental methods. Overall, this lean 'Big Data' integration strategy will bring us closer toward the goal of realizing personalized medicine.
Affiliation(s)
- Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA.
- Andrew D Rouillard
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Neil R Clark
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Zichen Wang
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Qiaonan Duan
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Yan Kou
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
1146
|
Altered gene transcription in human cells treated with Ludox® silica nanoparticles. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2014; 11:8867-90. [PMID: 25170680 PMCID: PMC4198995 DOI: 10.3390/ijerph110908867] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 04/30/2014] [Revised: 07/08/2014] [Accepted: 08/05/2014] [Indexed: 12/13/2022]
Abstract
Silica (SiO2) nanoparticles (NPs) have found extensive applications in industrial manufacturing, biomedical and biotechnological fields. Therefore, the increasing exposure to such ultrafine particles requires studies to characterize their potential cytotoxic effects in order to provide exhaustive information to assess the impact of nanomaterials on human health. The understanding of the biological processes involved in the development and maintenance of a variety of pathologies is improved by genome-wide approaches, and in this context, gene set analysis has emerged as a fundamental tool for the interpretation of the results. In this work we show how the use of a combination of gene-by-gene and gene set analyses can enhance the interpretation of results of in vitro treatment of A549 cells with Ludox® colloidal amorphous silica nanoparticles. By gene-by-gene and gene set analyses, we evidenced a specific cell response in relation to NPs size and elapsed time after treatment, with the smaller NPs (SM30) having higher impact on inflammatory and apoptosis processes than the bigger ones. Apoptotic process appeared to be activated by the up-regulation of the initiator genes TNFa and IL1b and by ATM. Moreover, our analyses evidenced that cell treatment with Ludox® silica nanoparticles activated the matrix metalloproteinase genes MMP1, MMP10 and MMP9. The information derived from this study can be informative about the cytotoxicity of Ludox® and other similar colloidal amorphous silica NPs prepared by solution processes.
1147
|
Holland A, Ohlendieck K. Comparative profiling of the sperm proteome. Proteomics 2014; 15:632-48. [DOI: 10.1002/pmic.201400032] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 01/28/2014] [Revised: 02/27/2014] [Accepted: 06/02/2014] [Indexed: 01/28/2023]
Affiliation(s)
- Ashling Holland
- Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland
- Kay Ohlendieck
- Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland
1148
|
Heinzel A, Perco P, Mayer G, Oberbauer R, Lukas A, Mayer B. From molecular signatures to predictive biomarkers: modeling disease pathophysiology and drug mechanism of action. Front Cell Dev Biol 2014; 2:37. [PMID: 25364744 PMCID: PMC4207010 DOI: 10.3389/fcell.2014.00037] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 06/24/2014] [Accepted: 07/29/2014] [Indexed: 12/31/2022] Open
Abstract
Omics profiling significantly expanded the molecular landscape describing clinical phenotypes. Association analysis resulted in first diagnostic and prognostic biomarker signatures entering clinical utility. However, utilizing Omics for deepening our understanding of disease pathophysiology, and further including specific interference with drug mechanism of action on a molecular process level still sees limited added value in the clinical setting. We exemplify a computational workflow for expanding from statistics-based association analysis toward deriving molecular pathway and process models for characterizing phenotypes and drug mechanism of action. Interference analysis on the molecular model level allows identification of predictive biomarker candidates for testing drug response. We discuss this strategy on diabetic nephropathy (DN), a complex clinical phenotype triggered by diabetes and presenting with renal as well as cardiovascular endpoints. A molecular pathway map indicates involvement of multiple molecular mechanisms, and selected biomarker candidates reported as associated with disease progression are identified for specific molecular processes. Selective interference of drug mechanism of action and disease-associated processes is identified for drug classes in clinical use, in turn providing precision medicine hypotheses utilizing predictive biomarkers.
Affiliation(s)
- Paul Perco
- emergentec biodevelopment GmbH, Vienna, Austria
- Gert Mayer
- Department of Internal Medicine IV, Medical University of Innsbruck, Innsbruck, Austria
- Rainer Oberbauer
- Department of Internal Medicine III, KH Elisabethinen Linz and Medical University of Vienna, Vienna, Austria
- Arno Lukas
- emergentec biodevelopment GmbH, Vienna, Austria
- Bernd Mayer
- emergentec biodevelopment GmbH, Vienna, Austria
1149
|
Mooney MA, Nigg JT, McWeeney SK, Wilmot B. Functional and genomic context in pathway analysis of GWAS data. Trends Genet 2014; 30:390-400. [PMID: 25154796 DOI: 10.1016/j.tig.2014.07.004] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Received: 06/12/2014] [Revised: 07/18/2014] [Accepted: 07/18/2014] [Indexed: 02/07/2023]
Abstract
Gene set analysis (GSA) is a promising tool for uncovering the polygenic effects associated with complex diseases. However, the available techniques reflect a wide variety of hypotheses about how genetic effects interact to contribute to disease susceptibility. The lack of consensus about the best way to perform GSA has led to confusion in the field and has made it difficult to compare results across methods. A clear understanding of the various choices made during GSA - such as how gene sets are defined, how single-nucleotide polymorphisms (SNPs) are assigned to genes, and how individual SNP-level effects are aggregated to produce gene- or pathway-level effects - will improve the interpretability and comparability of results across methods and studies. In this review we provide an overview of the various data sources used to construct gene sets and the statistical methods used to test for gene set association, as well as provide guidelines for ensuring the comparability of results.
Affiliation(s)
- Michael A Mooney
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
- Joel T Nigg
- Division of Psychology, Department of Psychiatry, Oregon Health & Science University, Portland, OR, USA; Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, USA
- Shannon K McWeeney
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA.
- Beth Wilmot
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
1150
|
Wimalaratne SM, Grenon P, Hermjakob H, Le Novère N, Laibe C. BioModels linked dataset. BMC SYSTEMS BIOLOGY 2014; 8:91. [PMID: 25182954 PMCID: PMC4423647 DOI: 10.1186/s12918-014-0091-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 05/06/2014] [Accepted: 07/18/2014] [Indexed: 11/17/2022]
Abstract
Background: BioModels Database is a reference repository of mathematical models used in biology. Models are stored as SBML files on a file system and metadata is provided in a relational database. Models can be retrieved through a web interface and programmatically via web services. In addition to those more traditional ways to access information, Linked Data using Semantic Web technologies (such as the Resource Description Framework, RDF) is becoming an increasingly popular means to describe and expose biologically relevant data. Results: We present the BioModels Linked Dataset, which exposes the models’ content as a dereferenceable interlinked dataset. BioModels Linked Dataset makes use of the wealth of annotations available within a large number of manually curated models to link and integrate data and models from other resources. Conclusions: The BioModels Linked Dataset provides users with a dataset interoperable with other semantic web resources. It supports powerful search queries, some of which were not previously available to users, and allows integration of data from multiple resources. This provides a distributed platform to find similar models for comparison, processing and enrichment.
Affiliation(s)
- Sarala M Wimalaratne
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Pierre Grenon
- CHIME, The Farr Institute of Health Informatics Research, London, NW1 2DA, UK.
- Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Nicolas Le Novère
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, UK.
- Camille Laibe
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.