1101
Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M. The BioGRID interaction database: 2015 update. Nucleic Acids Res 2014; 43:D470-8. [PMID: 25428363 PMCID: PMC4383984 DOI: 10.1093/nar/gku1204] [Citation(s) in RCA: 648] [Impact Index Per Article: 64.8] [Indexed: 12/18/2022]
Abstract
The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access database that houses genetic and protein interactions curated from the primary biomedical literature for all major model organism species and humans. As of September 2014, the BioGRID contains 749 912 interactions drawn from 43 149 publications that represent 30 model organisms. This interaction count represents a 50% increase compared to our previous 2013 BioGRID update. BioGRID data are freely distributed through partner model organism databases and meta-databases and are directly downloadable in a variety of formats. In addition to general curation of the published literature for the major model species, BioGRID undertakes themed curation projects in areas of particular relevance for biomedical sciences, such as the ubiquitin-proteasome system and various human disease-associated interaction networks. BioGRID curation is coordinated through an Interaction Management System (IMS) that facilitates the compilation of interaction records through structured evidence codes, phenotype ontologies, and gene annotation. The BioGRID architecture has been improved in order to support a broader range of interaction and post-translational modification types, to allow the representation of more complex multi-gene/protein interactions, to account for cellular phenotypes through structured ontologies, to expedite curation through semi-automated text-mining approaches, and to enhance curation quality control.
Affiliation(s)
- Andrew Chatr-Aryamontri
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
- Bobby-Joe Breitkreutz
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Lorrie Boucher
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Sven Heinicke
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Daici Chen
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
- Chris Stark
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Ashton Breitkreutz
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Nadine Kolas
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Lara O'Donnell
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Teresa Reguly
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Julie Nixon
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
- Lindsay Ramage
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
- Andrew Winter
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
- Adnane Sellam
- Centre Hospitalier de l'Université Laval (CHUL), Québec, Québec G1V 4G2, Canada
- Christie Chang
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Jodi Hirschman
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Chandra Theesfeld
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Jennifer Rust
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Michael S Livstone
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Mike Tyers
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada; The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada; School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
1102
Merico D, Costain G, Butcher NJ, Warnica W, Ogura L, Alfred SE, Brzustowicz LM, Bassett AS. MicroRNA Dysregulation, Gene Networks, and Risk for Schizophrenia in 22q11.2 Deletion Syndrome. Front Neurol 2014; 5:238. [PMID: 25484875 PMCID: PMC4240070 DOI: 10.3389/fneur.2014.00238] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 09/16/2014] [Accepted: 11/02/2014] [Indexed: 01/20/2023]
Abstract
The role of microRNAs (miRNAs) in the etiology of schizophrenia is increasingly recognized. Microdeletions at chromosome 22q11.2 are recurrent structural variants that impart a high risk for schizophrenia and are found in up to 1% of all patients with schizophrenia. The 22q11.2 deletion region overlaps gene DGCR8, encoding a subunit of the miRNA microprocessor complex. We identified miRNAs overlapped by the 22q11.2 microdeletion and for the first time investigated their predicted target genes, and those implicated by DGCR8, to identify targets that may be involved in the risk for schizophrenia. The 22q11.2 region encompasses seven validated or putative miRNA genes. Employing two standard prediction tools, we generated sets of predicted target genes. Functional enrichment profiles of the 22q11.2 region miRNA target genes suggested a role in neuronal processes and broader developmental pathways. We then constructed a protein interaction network of schizophrenia candidate genes and interaction partners relevant to brain function, independent of the 22q11.2 region miRNA mechanisms. We found that the predicted gene targets of the 22q11.2 deletion miRNAs, and targets of the genome-wide miRNAs predicted to be dysregulated by DGCR8 hemizygosity, were significantly represented in this schizophrenia network. The findings provide new insights into the pathway from 22q11.2 deletion to expression of schizophrenia, and suggest that hemizygosity of the 22q11.2 region may have downstream effects implicating genes elsewhere in the genome that are relevant to the general schizophrenia population. These data also provide further support for the notion that robust genetic findings in schizophrenia may converge on a reasonable number of final pathways.
Affiliation(s)
- Daniele Merico
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Gregory Costain
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Nancy J Butcher
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada; Institute of Medical Science, University of Toronto, Toronto, ON, Canada
- William Warnica
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Lucas Ogura
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Simon E Alfred
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Linda M Brzustowicz
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA
- Anne S Bassett
- Clinical Genetics Research Program, Centre for Addiction and Mental Health, Toronto, ON, Canada; Institute of Medical Science, University of Toronto, Toronto, ON, Canada; The Dalglish Family Hearts and Minds Clinic for 22q11.2 Deletion Syndrome, Toronto General Hospital, University Health Network, Toronto, ON, Canada; Department of Psychiatry, Toronto General Research Institute, University Health Network, Toronto, ON, Canada; Department of Psychiatry, University of Toronto, Toronto, ON, Canada
1103
Liu Y, Xie D, Han L, Bai H, Li F, Wang S, Bo X. EHFPI: a database and analysis resource of essential host factors for pathogenic infection. Nucleic Acids Res 2014; 43:D946-55. [PMID: 25414353 PMCID: PMC4383917 DOI: 10.1093/nar/gku1086] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/28/2022]
Abstract
High-throughput screening and computational technologies have greatly changed the face of microbiology by improving our understanding of pathogen–host interactions. Genome-wide RNA interference (RNAi) screens have given rise to a new class of host genes designated as Essential Host Factors (EHFs), whose knockdown significantly influences pathogenic infections. We therefore present the first release of a manually curated bioinformatics database and analysis resource, EHFPI (Essential Host Factors for Pathogenic Infection, http://biotech.bmi.ac.cn/ehfpi). EHFPI captures detailed article, screen, pathogen and phenotype annotation information for a total of 4634 EHF genes of 25 clinically important pathogenic species. Notably, EHFPI also provides six data-integrative analysis tools, i.e. EHF Overlap Analysis, EHF-pathogen Network Analysis, Gene Enrichment Analysis, Pathogen Interacting Proteins (PIPs) Analysis, Drug Target Analysis and GWAS Candidate Gene Analysis, which support a comprehensive understanding of the biological roles of EHF genes from the perspectives of protein–protein interaction networks, drug targets and diseases/traits. The EHFPI web interface provides tools for efficient querying of EHF data and visualization of custom analysis results. EHFPI data and tools will remain available without charge to serve the microbiology, biomedicine and pharmaceutics research communities and, ultimately, to facilitate the development of diagnostics, prophylactics and therapeutics for human pathogens.
Affiliation(s)
- Yang Liu
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
- Dafei Xie
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
- Lu Han
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
- Hui Bai
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China; No. 451 Hospital of Chinese People's Liberation Army, Xi'an 710054, China
- Fei Li
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
- Shengqi Wang
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
- Xiaochen Bo
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China
1104
Montague E, Janko I, Stanberry L, Lee E, Choiniere J, Anderson N, Stewart E, Broomall W, Higdon R, Kolker N, Kolker E. Beyond protein expression, MOPED goes multi-omics. Nucleic Acids Res 2014; 43:D1145-51. [PMID: 25404128 PMCID: PMC4383969 DOI: 10.1093/nar/gku1175] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/27/2022]
Abstract
MOPED (Multi-Omics Profiling Expression Database; http://moped.proteinspire.org) has transitioned from solely a protein expression database to a multi-omics resource for human and model organisms. Through a web-based interface, MOPED presents consistently processed data for gene, protein and pathway expression. To improve data quality, consistency and use, MOPED includes metadata detailing experimental design and analysis methods. The multi-omics data are integrated through direct links between genes and proteins and further connected to pathways and experiments. MOPED now contains over 5 million records, information for approximately 75 000 genes and 50 000 proteins from four organisms (human, mouse, worm, yeast). These records correspond to 670 unique combinations of experiment, condition, localization and tissue. MOPED includes the following new features: pathway expression, Pathway Details pages, experimental metadata checklists, experiment summary statistics and more advanced searching tools. Advanced searching enables querying for genes, proteins, experiments, pathways and keywords of interest. The system is enhanced with visualizations for comparing across different data types. In the future MOPED will expand the number of organisms, increase integration with pathways and provide connections to disease.
Affiliation(s)
- Elizabeth Montague
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Imre Janko
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Larissa Stanberry
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Elaine Lee
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- John Choiniere
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Nathaniel Anderson
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Elizabeth Stewart
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- William Broomall
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Roger Higdon
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Natali Kolker
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Eugene Kolker
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101; Departments of Biomedical Informatics and Medical Education and Pediatrics, University of Washington, Seattle, WA, USA 98109; Department of Chemistry and Chemical Biology, College of Science, Northeastern University, Boston, MA 02115
1105
Splicing mutation analysis reveals previously unrecognized pathways in lymph node-invasive breast cancer. Sci Rep 2014; 4:7063. [PMID: 25394353 PMCID: PMC4231324 DOI: 10.1038/srep07063] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 07/17/2014] [Accepted: 10/29/2014] [Indexed: 12/22/2022]
Abstract
Somatic mutations reported in large-scale breast cancer (BC) sequencing studies primarily consist of protein coding mutations. mRNA splicing mutation analyses have been limited in scope, despite their prevalence in Mendelian genetic disorders. We predicted splicing mutations in 442 BC tumour and matched normal exomes from The Cancer Genome Atlas Consortium (TCGA). These splicing defects were validated by abnormal expression changes in these tumours. Of the 5,206 putative mutations identified, exon skipping, leaky or cryptic splicing was confirmed for 988 variants. Pathway enrichment analysis of the mutated genes revealed mutations in 9 NCAM1-related pathways, which were significantly increased in samples with evidence of lymph node metastasis, but not in lymph node-negative tumours. We suggest that comprehensive reporting of DNA sequencing data should include non-trivial splicing analyses to avoid missing clinically-significant deleterious splicing mutations, which may reveal novel mutated pathways present in genetic disorders.
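The pathway enrichment step mentioned in this abstract can be illustrated with a generic over-representation test. The sketch below is an assumption for illustration only: the abstract does not say which statistical test the authors used, and the function name `hypergeom_enrichment_p` and its parameters are hypothetical. It computes the one-sided hypergeometric probability of seeing at least `k` pathway genes among `n` mutated genes, given `K` pathway genes in a background of `N` genes.

```python
from math import comb


def hypergeom_enrichment_p(N: int, K: int, n: int, k: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= k).

    N: number of genes in the background (hypothetical input)
    K: number of background genes annotated to the pathway
    n: number of mutated genes drawn from the background
    k: number of mutated genes that fall in the pathway
    """
    upper = min(K, n)          # largest overlap that is possible
    total = comb(N, n)         # number of ways to draw n genes from N
    # Sum the probability mass for overlaps k, k+1, ..., upper.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers, not taken from the study.
    print(hypergeom_enrichment_p(N=20000, K=150, n=500, k=12))
```

In practice such p-values would be corrected for multiple testing across all pathways examined; that step is omitted here.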
1106
Auerbach SS, Phadke DP, Mav D, Holmgren S, Gao Y, Xie B, Shin JH, Shah RR, Merrick BA, Tice RR. RNA-Seq-based toxicogenomic assessment of fresh frozen and formalin-fixed tissues yields similar mechanistic insights. J Appl Toxicol 2014; 35:766-80. [DOI: 10.1002/jat.3068] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 05/22/2014] [Revised: 07/22/2014] [Accepted: 07/26/2014] [Indexed: 12/13/2022]
Affiliation(s)
- Scott S. Auerbach
- Biomolecular Screening Branch, Division of the National Toxicology Program; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
- Stephanie Holmgren
- Library & Information Services Branch, Office of the Deputy Director; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
- Yuan Gao
- Department of Biomedical Engineering; Johns Hopkins University; Baltimore MD 21205 USA
- Bin Xie
- Department of Biomedical Engineering; Johns Hopkins University; Baltimore MD 21205 USA
- Joo Heon Shin
- Department of Biomedical Engineering; Johns Hopkins University; Baltimore MD 21205 USA
- B. Alex Merrick
- Biomolecular Screening Branch, Division of the National Toxicology Program; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
- Raymond R. Tice
- Biomolecular Screening Branch, Division of the National Toxicology Program; National Institute of Environmental Health Sciences; Research Triangle Park NC 27709 USA
1107
Morris JH, Knudsen GM, Verschueren E, Johnson JR, Cimermancic P, Greninger AL, Pico AR. Affinity purification-mass spectrometry and network analysis to understand protein-protein interactions. Nat Protoc 2014; 9:2539-54. [PMID: 25275790 PMCID: PMC4332878 DOI: 10.1038/nprot.2014.164] [Citation(s) in RCA: 130] [Impact Index Per Article: 13.0] [Indexed: 12/15/2022]
Abstract
By determining protein-protein interactions in normal, diseased and infected cells, we can improve our understanding of cellular systems and their reaction to various perturbations. In this protocol, we discuss how to use data obtained in affinity purification-mass spectrometry (AP-MS) experiments to generate meaningful interaction networks and effective figures. We begin with an overview of common epitope tagging, expression and AP practices, followed by liquid chromatography-MS (LC-MS) data collection. We then provide a detailed procedure covering a pipeline approach to (i) pre-processing the data by filtering against contaminant lists such as the Contaminant Repository for Affinity Purification (CRAPome) and normalization using the spectral index (SIN) or normalized spectral abundance factor (NSAF); (ii) scoring via methods such as MiST, SAInt and CompPASS; and (iii) testing the resulting scores. Data formats familiar to MS practitioners are then transformed to those most useful for network-based analyses. The protocol also explores methods available in Cytoscape to visualize and analyze these types of interaction data. The scoring pipeline can take anywhere from 1 d to 1 week, depending on one's familiarity with the tools and data peculiarities. Similarly, the network analysis and visualization protocol in Cytoscape takes 2-4 h to complete with the provided sample data, but we recommend taking days or even weeks to explore one's data and find the right questions.
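The pre-processing step (i) described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the protocol's own code: it uses the standard NSAF definition (spectral count divided by protein length, normalized by the sum of that ratio over all proteins in the run), and the function names, the contaminant set, and the toy values are all hypothetical.

```python
def filter_contaminants(spectral_counts, contaminants):
    """Drop proteins that appear in a contaminant list (e.g. one exported from CRAPome)."""
    return {p: c for p, c in spectral_counts.items() if p not in contaminants}


def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor per protein.

    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j)
    spectral_counts: protein -> spectral count in one purification
    lengths: protein -> protein length in residues
    """
    # Spectral abundance factor: counts scaled by protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    # Normalize so the values within one run sum to 1.
    return {p: value / total for p, value in saf.items()}


if __name__ == "__main__":
    # Toy values for illustration only.
    counts = {"BAIT": 40, "PREY1": 10, "KERATIN": 25}
    lengths = {"BAIT": 400, "PREY1": 250, "KERATIN": 600}
    cleaned = filter_contaminants(counts, contaminants={"KERATIN"})
    print(nsaf(cleaned, lengths))
```

Normalized values such as these would then be passed to a scoring method; the scoring tools named in the abstract have their own input formats, which are not reproduced here.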
Affiliation(s)
- John H Morris
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA
- Giselle M Knudsen
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA
- Erik Verschueren
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California, USA
- Jeffrey R Johnson
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California, USA
- Peter Cimermancic
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California, USA; Graduate Group in Bioinformatics, University of California, San Francisco, San Francisco, California, USA
- Alexander L Greninger
- School of Medicine, University of California, San Francisco, San Francisco, California, USA
- Alexander R Pico
- Gladstone Institutes, University of California, San Francisco, San Francisco, California, USA
1108
Brown GR, Hem V, Katz KS, Ovetsky M, Wallin C, Ermolaeva O, Tolstoy I, Tatusova T, Pruitt KD, Maglott DR, Murphy TD. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res 2014; 43:D36-42. [PMID: 25355515 DOI: 10.1093/nar/gku1055] [Citation(s) in RCA: 431] [Impact Index Per Article: 43.1] [Indexed: 01/20/2023]
Abstract
The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP.
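Programmatic access through E-Utilities, as mentioned above, can be sketched as follows. This is a minimal example rather than an official client: it assumes the public `esearch.fcgi` endpoint with the `db`, `term`, `retmode` and `retmax` parameters, the helper names are hypothetical, and the query string is purely illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that queries the Gene database and asks for JSON."""
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def search_gene_ids(term: str, retmax: int = 20) -> list:
    """Run the search and return matching Gene identifiers as integers.

    Requires network access; Gene identifiers are the tracked integers
    described in the abstract.
    """
    with urlopen(gene_esearch_url(term, retmax)) as response:
        payload = json.load(response)
    return [int(gene_id) for gene_id in payload["esearchresult"]["idlist"]]


if __name__ == "__main__":
    # Illustrative query; adjust the field tags to the search you need.
    print(gene_esearch_url("BRCA1[Gene Name] AND Homo sapiens[Organism]", retmax=5))
```

For bulk retrieval, the FTP route mentioned in the abstract is generally a better fit than issuing many individual requests.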
Affiliation(s)
- Garth R Brown
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Vichet Hem
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Kenneth S Katz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Michael Ovetsky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Craig Wallin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Olga Ermolaeva
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Igor Tolstoy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Tatiana Tatusova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Donna R Maglott
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
- Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
1109
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, von Mering C. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 2014; 43:D447-52. [PMID: 25352553 PMCID: PMC4383874 DOI: 10.1093/nar/gku1003] [Citation(s) in RCA: 7263] [Impact Index Per Article: 726.3] [Indexed: 02/06/2023]
Abstract
The many functional partnerships and interactions that occur between proteins are at the core of cellular processing and their systematic characterization helps to provide context in molecular systems biology. However, known and predicted interactions are scattered over multiple resources, and the available data exhibit notable differences in terms of quality and completeness. The STRING database (http://string-db.org) aims to provide a critical assessment and integration of protein–protein interactions, including direct (physical) as well as indirect (functional) associations. The new version 10.0 of STRING covers more than 2000 organisms, which has necessitated novel, scalable algorithms for transferring interaction information between organisms. For this purpose, we have introduced hierarchical and self-consistent orthology annotations for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution. Further improvements in version 10.0 include a completely redesigned prediction pipeline for inferring protein–protein associations from co-expression data, an API interface for the R computing environment and improved statistical analysis for enrichment tests in user-provided networks.
Affiliation(s)
- Damian Szklarczyk
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Andrea Franceschini
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Stefan Wyder
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Davide Heller
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Milan Simonovic
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Alexander Roth
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
- Alberto Santos
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Kalliopi P Tsafou
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Michael Kuhn
- Biotechnology Center, Technische Universität Dresden, 01062 Dresden, Germany; Max Planck Institute of Molecular Cell Biology and Genetics, 01062 Dresden, Germany
- Peer Bork
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Lars J Jensen
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
1110
Knaack SA, Siahpirani AF, Roy S. A pan-cancer modular regulatory network analysis to identify common and cancer-specific network components. Cancer Inform 2014; 13:69-84. [PMID: 25374456 PMCID: PMC4213198 DOI: 10.4137/cin.s14058] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/07/2014] [Revised: 09/22/2014] [Accepted: 09/24/2014] [Indexed: 12/19/2022]
Abstract
Many human diseases including cancer are the result of perturbations to transcriptional regulatory networks that control context-specific expression of genes. A comparative analysis across multiple cancer types is a powerful way to illuminate the common and specific network features of this family of diseases. Recent efforts from The Cancer Genome Atlas (TCGA) have generated large collections of functional genomic data sets for multiple types of cancers. An emerging challenge is to devise computational approaches that systematically compare these genomic data sets across different cancer types to identify common and cancer-specific network components. We present a module- and network-based characterization of transcriptional patterns in six different cancers being studied in TCGA: breast, colon, rectal, kidney, ovarian, and endometrial. Our approach uses a recently developed regulatory network reconstruction algorithm, modular regulatory network learning with per gene information (MERLIN), within a stability selection framework to predict regulators for individual genes and gene modules. Our module-based analysis identifies a common theme of immune system processes in each cancer study, with modules statistically enriched for immune response processes as well as targets of key immune response regulators from the interferon regulatory factor (IRF) and signal transducer and activator of transcription (STAT) families. Comparison of the inferred regulatory networks from each cancer type identified a core regulatory network that included genes involved in chromatin remodeling, cell cycle, and immune response. Regulatory network hubs included genes with known roles in specific cancer types as well as genes with potentially novel roles in different cancer types. Overall, our integrated module and network analysis recapitulated known themes in cancer biology and additionally revealed novel regulatory hubs that suggest a complex interplay of immune response, cell cycle, and chromatin remodeling across multiple cancers.
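The stability selection framework mentioned in this abstract can be illustrated with a minimal sketch: a base selector is run on many random subsamples, and only regulators chosen in a large fraction of runs are kept. This is not MERLIN. The base selector below is a simple absolute-correlation threshold used as a stand-in, and the function names, cutoff, subsample fraction, and frequency threshold are all assumptions made for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def corr_selector(target, regulators, idx, cutoff=0.8):
    """Stand-in base selector (NOT MERLIN): keep regulators whose absolute
    correlation with the target is at least `cutoff` on the samples in idx."""
    t = [target[i] for i in idx]
    return {name for name, vals in regulators.items()
            if abs(pearson([vals[i] for i in idx], t)) >= cutoff}


def stability_select(target, regulators, n_rounds=100, frac=0.5,
                     min_freq=0.6, seed=0):
    """Return {regulator: selection frequency} for regulators selected in
    at least `min_freq` of `n_rounds` random subsamples."""
    rng = random.Random(seed)
    n = len(target)
    k = min(n, max(3, int(n * frac)))  # subsample size
    counts = {name: 0 for name in regulators}
    for _ in range(n_rounds):
        idx = rng.sample(range(n), k)
        for name in corr_selector(target, regulators, idx):
            counts[name] += 1
    freqs = {name: c / n_rounds for name, c in counts.items()}
    return {name: f for name, f in freqs.items() if f >= min_freq}


# Hypothetical toy expression values for one target gene and two candidate regulators.
target_expr = [float(i) for i in range(1, 13)]
candidates = {
    "TF_A": [2.0 * v for v in target_expr],  # tracks the target exactly
    "TF_C": [1.0] * 12,                      # constant, carries no signal
}
stable = stability_select(target_expr, candidates)
```

Selection frequency, rather than a single fit, is what makes the resulting regulator set robust to the choice of samples; any base learner could be substituted for `corr_selector`.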
Affiliation(s)
- Sara A Knaack
  - Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI, USA
- Alireza Fotuhi Siahpirani
  - Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI, USA; Department of Computer Sciences, University of Wisconsin, Madison, WI, USA
- Sushmita Roy
  - Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA
|
1111
|
A guide for building biological pathways along with two case studies: hair and breast development. Methods 2014; 74:16-35. [PMID: 25449898 DOI: 10.1016/j.ymeth.2014.10.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/05/2014] [Revised: 08/26/2014] [Accepted: 10/03/2014] [Indexed: 11/23/2022]
Abstract
Genomic information is increasingly being presented in the format of biological pathways. Building these pathways is an ongoing demand and benefits from methods for extracting information from the biomedical literature with the aid of text-mining tools. Here we aim to guide the reader in building a customized pathway or chart representation of a system. Our manual is based on a suite of software designed to examine biointeractions in a set of abstracts retrieved from PubMed. The tools aim to support the work of someone with a biological background, who does not need to be an expert on the subject and who plays the role of manual curator while designing the representation of the system, the pathway. We illustrate the process with two challenging case studies: hair and breast development. They were chosen because they focus on recent acquisitions of human evolution. We produced sub-pathways for each study, representing different phases of development. Unlike most charts in current databases, ours come with detailed descriptions, which will additionally guide PESCADOR users through the process. Its implementation as a web interface makes PESCADOR a unique tool for guiding the user through the biointeractions that will constitute a novel pathway.
|
1112
|
Kibbe WA, Arze C, Felix V, Mitraka E, Bolton E, Fu G, Mungall CJ, Binder JX, Malone J, Vasant D, Parkinson H, Schriml LM. Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res 2014; 43:D1071-8. [PMID: 25348409 PMCID: PMC4383880 DOI: 10.1093/nar/gku1011] [Citation(s) in RCA: 373] [Impact Index Per Article: 37.3] [Indexed: 01/02/2023]
Abstract
The current version of the Human Disease Ontology (DO) (http://www.disease-ontology.org) database expands the utility of the ontology for the examination and comparison of genetic variation, phenotype, protein, drug and epitope data through the lens of human disease. DO is a biomedical resource of standardized common and rare disease concepts with stable identifiers organized by disease etiology. The content of DO has had 192 revisions since 2012, including the addition of 760 terms. Thirty-two percent of all terms now include definitions. DO has expanded the number and diversity of research communities and community members by 50+ during the past two years. These community members actively submit term requests, coordinate biomedical resource disease representation and provide expert curation guidance. Since the DO 2012 NAR paper, there have been hundreds of term requests and a steady increase in the number of DO listserv members, twitter followers and DO website usage. DO is moving to a multi-editor model utilizing Protégé to curate DO in web ontology language. This will enable closer collaboration with the Human Phenotype Ontology, EBI's Ontology Working Group, Mouse Genome Informatics and the Monarch Initiative among others, and enhance DO's current asserted view and multiple inferred views through reasoning.
Affiliation(s)
- Warren A Kibbe
  - Center for Biomedical Informatics and Information Technology, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA
- Cesar Arze
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Victor Felix
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Elvira Mitraka
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Evan Bolton
  - PubChem, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Gang Fu
  - PubChem, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Janos X Binder
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, 69117, Germany; Bioinformatics Core Facility, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, 4362, Luxembourg
- James Malone
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Drashtti Vasant
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Helen Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Lynn M Schriml
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA; Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, MD 21201, USA
|
1113
|
Abstract
Phosphatases are crucial enzymes in health and disease, but the knowledge of their biological roles is still limited. Identifying substrates continues to be a great challenge. To support the research on phosphatase-kinase-substrate networks we present here an update on the human DEPhOsphorylation Database: DEPOD (http://www.depod.org or http://www.koehn.embl.de/depod). DEPOD is a manually curated open access database providing human phosphatases, their protein and non-protein substrates, dephosphorylation sites, pathway involvements and external links to kinases and small molecule modulators. All internal data are fully searchable, including through a BLAST application. Since the first release, more human phosphatases and substrates, their associated signaling pathways (also from new sources), and interacting proteins for all phosphatases and protein substrates have been added to DEPOD. The user interface has been further optimized; for example, the interactive human phosphatase-substrate network now contains a 'highlight node' function for phosphatases, which includes the visualization of neighbors in the network.
Affiliation(s)
- Guangyou Duan
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Xun Li
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Maja Köhn
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
|
1114
|
Morgat A, Axelsen KB, Lombardot T, Alcántara R, Aimo L, Zerara M, Niknejad A, Belda E, Hyka-Nouspikel N, Coudert E, Redaschi N, Bougueleret L, Steinbeck C, Xenarios I, Bridge A. Updates in Rhea--a manually curated resource of biochemical reactions. Nucleic Acids Res 2014; 43:D459-64. [PMID: 25332395 PMCID: PMC4384025 DOI: 10.1093/nar/gku961] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 12/21/2022]
Abstract
Rhea (http://www.ebi.ac.uk/rhea) is a comprehensive and non-redundant resource of expert-curated biochemical reactions described using species from the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Rhea has been designed for the functional annotation of enzymes and the description of genome-scale metabolic networks, providing stoichiometrically balanced enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list and additional reactions), transport reactions and spontaneously occurring reactions. Rhea reactions are extensively curated with links to source literature and are mapped to other publicly available enzyme and pathway databases such as Reactome, BioCyc, KEGG and UniPathway, through manual curation and computational methods. Here we describe developments in Rhea since our last report in the 2012 database issue of Nucleic Acids Research. These include significant growth in the number of Rhea reactions and the inclusion of reactions involving complex macromolecules such as proteins, nucleic acids and other polymers that lie outside the scope of ChEBI. Together these developments will significantly increase the utility of Rhea as a tool for the description, analysis and reconciliation of genome-scale metabolic models.
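To make the phrase "stoichiometrically balanced" concrete, the sketch below checks that every element occurs equally often on both sides of a reaction. It is an illustration only: it does not use Rhea or ChEBI data or any of their interfaces, the function names and the hard-coded formulas are assumptions, formulas with parentheses are not supported, and charge balance is ignored.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple molecular formula such as 'C6H12O6'."""
    counts = Counter()
    for element, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(num) if num else 1
    return counts


def side_total(side):
    """Sum atom counts over one side, given (coefficient, formula) pairs."""
    total = Counter()
    for coeff, formula in side:
        for element, n in parse_formula(formula).items():
            total[element] += coeff * n
    return total


def is_balanced(left, right):
    """True if both sides of the reaction contain the same atoms."""
    return side_total(left) == side_total(right)


# Glucose oxidation: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
respiration_ok = is_balanced([(1, "C6H12O6"), (6, "O2")],
                             [(6, "CO2"), (6, "H2O")])
# Deliberately unbalanced: H2 + O2 -> H2O
water_ok = is_balanced([(1, "H2"), (1, "O2")], [(1, "H2O")])
```

A check of this kind is the sort of constraint that lets balanced reactions be assembled into a consistent genome-scale metabolic model.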
Affiliation(s)
- Anne Morgat
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland; Genoscope-LABGeM, CEA, Evry, F-91057, France
- Kristian B Axelsen
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Thierry Lombardot
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Rafael Alcántara
  - Equipe BAMBOO, INRIA Grenoble Rhône-Alpes, Montbonnot Saint-Martin, F-38330, France
- Lucila Aimo
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Mohamed Zerara
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Anne Niknejad
  - Cheminformatics and Metabolism Team, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
- Eugeni Belda
  - Department of Biochemistry, University of Geneva, Geneva, CH-1206, Switzerland
- Nevila Hyka-Nouspikel
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Elisabeth Coudert
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Nicole Redaschi
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Lydie Bougueleret
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
- Christoph Steinbeck
  - Equipe BAMBOO, INRIA Grenoble Rhône-Alpes, Montbonnot Saint-Martin, F-38330, France
- Ioannis Xenarios
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland; Cheminformatics and Metabolism Team, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
- Alan Bridge
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland
|
1115
|
Abstract
BioLayout Express (3D) is a network analysis tool designed for the visualisation and analysis of graphs derived from biological data. It has proved to be powerful in the analysis of gene expression data, biological pathways and in a range of other applications. In version 3.2 of the tool we have introduced the ability to import, merge and display pathways and protein interaction networks available in the BioPAX Level 3 standard exchange format. A graphical interface allows users to search for pathways or interaction data stored in the Pathway Commons database. Queries using either gene/protein or pathway names are made via the cPath2 client and users can also define the source and/or species of information that they wish to examine. Data matching a query are listed and individual records may be viewed in isolation or merged using an 'Advanced' query tab. A visualisation scheme has been defined by mapping BioPAX entity types to a range of glyphs. Graphs of these data can be viewed and explored within BioLayout as 2D or 3D graph layouts, where they can be edited and/or exported for visualisation and editing within other tools.
Affiliation(s)
- Derek W. Wright
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, Scotland, EH25 9RG, UK
- Tim Angus
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, Scotland, EH25 9RG, UK
- Anton J. Enright
  - EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Tom C. Freeman
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, Scotland, EH25 9RG, UK
|
1116
|
Davis AP, Grondin CJ, Lennon-Hopkins K, Saraceni-Richards C, Sciaky D, King BL, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database's 10th year anniversary: update 2015. Nucleic Acids Res 2014; 43:D914-20. [PMID: 25326323 PMCID: PMC4384013 DOI: 10.1093/nar/gku935] [Citation(s) in RCA: 262] [Impact Index Per Article: 26.2] [Indexed: 12/21/2022]
Abstract
Ten years ago, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) was developed out of a need to formalize, harmonize and centralize the information on numerous genes and proteins responding to environmental toxic agents across diverse species. CTD's initial approach was to facilitate comparisons of nucleotide and protein sequences of toxicologically significant genes by curating these sequences and electronically annotating them with chemical terms from their associated references. Since then, however, CTD has vastly expanded its scope to robustly represent a triad of chemical–gene, chemical–disease and gene–disease interactions that are manually curated from the scientific literature by professional biocurators using controlled vocabularies, ontologies and structured notation. Today, CTD includes 24 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, phenotypes, Gene Ontology annotations, pathways and interaction modules. In this 10th year anniversary update, we outline the evolution of CTD, including our increased data content, new ‘Pathway View’ visualization tool, enhanced curation practices, pilot chemical–phenotype results and impending exposure data set. The prototype database originally described in our first report has transformed into a sophisticated resource used actively today to help scientists develop and test hypotheses about the etiologies of environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Cynthia J Grondin
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Kelley Lennon-Hopkins
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Daniela Sciaky
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Benjamin L King
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
|
1117
|
Wang Y, Fan X, Cai Y. A comparative study of improvements Pre-filter methods bring on feature selection using microarray data. Health Inf Sci Syst 2014; 2:7. [PMID: 25825671 PMCID: PMC4340279 DOI: 10.1186/2047-2501-2-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/18/2014] [Accepted: 10/03/2014] [Indexed: 12/13/2022]
Abstract
Background: Feature selection techniques have become a clear need in biomarker discovery with the development of microarrays. However, the high dimensionality of microarray data makes feature selection time-consuming. To overcome this difficulty, filtering data according to background knowledge before applying feature selection techniques has become a hot topic in microarray analysis. Different methods may affect final results greatly, so it is important to evaluate these pre-filter methods systematically.
Methods: In this paper, we compared the performance of statistical-based and biological-based pre-filter methods, and their combination, on microRNA-mRNA parallel expression profiles using L1 logistic regression as the feature selection technique. Four types of data were built for both microRNA and mRNA expression profiles.
Results: Pre-filter methods greatly reduced the number of features for both mRNA and microRNA expression datasets. The features selected after pre-filter procedures were shown to be significant at biological levels such as biological process and microRNA functions. Analyses of classification performance based on precision showed that the pre-filter methods were necessary when the number of raw features was much larger than the number of samples. Computing time was greatly shortened after pre-filter procedures in all cases.
Conclusions: With similar or better classification performance and fewer but biologically significant features, pre-filter-based feature selection should be considered when researchers need fast results for complex computing problems in bioinformatics.
Electronic supplementary material: The online version of this article (doi:10.1186/2047-2501-2-7) contains supplementary material, which is available to authorized users.
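As an illustration of what a statistical pre-filter does before feature selection, the sketch below keeps only the most variable features of an expression matrix. The abstract does not say which statistic the authors used, so the variance criterion, the function name `variance_prefilter`, and the toy values are all assumptions; the subsequent L1 logistic regression step is not implemented here.

```python
from statistics import pvariance


def variance_prefilter(matrix, feature_names, keep=2):
    """Keep the `keep` features with the highest variance across samples.

    matrix: list of samples, each a list of feature values.
    Returns (kept_names, reduced_matrix), with columns in their original order.
    """
    columns = list(zip(*matrix))
    ranked = sorted(range(len(feature_names)),
                    key=lambda j: pvariance(columns[j]), reverse=True)
    kept_idx = sorted(ranked[:keep])  # restore original column order
    kept_names = [feature_names[j] for j in kept_idx]
    reduced = [[row[j] for j in kept_idx] for row in matrix]
    return kept_names, reduced


# Hypothetical expression values: 3 samples x 3 features.
expr = [
    [1.0, 5.0, 0.0],
    [1.1, 1.0, 10.0],
    [0.9, 9.0, 5.0],
]
names, filtered = variance_prefilter(expr, ["g1", "g2", "g3"], keep=2)
```

The reduced matrix would then be passed to the feature selection method, which is where the shorter computing time reported in the abstract comes from: the expensive step sees far fewer columns.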
Affiliation(s)
- Yingying Wang
  - Research Center for Biomedical Information, Shenzhen Institutes of Advanced Technologies, Chinese Academy of Sciences, Shenzhen, China
- Xiaomao Fan
  - Research Center for Biomedical Information, Shenzhen Institutes of Advanced Technologies, Chinese Academy of Sciences, Shenzhen, China
- Yunpeng Cai
  - Research Center for Biomedical Information, Shenzhen Institutes of Advanced Technologies, Chinese Academy of Sciences, Shenzhen, China
|
1118
|
Meldal BHM, Forner-Martinez O, Costanzo MC, Dana J, Demeter J, Dumousseau M, Dwight SS, Gaulton A, Licata L, Melidoni AN, Ricard-Blum S, Roechert B, Skyzypek MS, Tiwari M, Velankar S, Wong ED, Hermjakob H, Orchard S. The complex portal--an encyclopaedia of macromolecular complexes. Nucleic Acids Res 2014; 43:D479-84. [PMID: 25313161 PMCID: PMC4384031 DOI: 10.1093/nar/gku975] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Indexed: 01/25/2023]
Abstract
The IntAct molecular interaction database has created a new, free, open-source, manually curated resource, the Complex Portal (www.ebi.ac.uk/intact/complex), through which protein complexes from major model organisms are being collated and made available for search, viewing and download. It has been built in close collaboration with other bioinformatics services and populated with data from ChEMBL, MatrixDB, PDBe, Reactome and UniProtKB. Each entry contains information about the participating molecules (including small molecules and nucleic acids), their stoichiometry, topology and structural assembly. Complexes are annotated with details about their function, properties and complex-specific Gene Ontology (GO) terms. Consistent nomenclature is used throughout the resource with systematic names, recommended names and a list of synonyms all provided. The use of the Evidence Code Ontology allows us to indicate for which entries direct experimental evidence is available or if the complex has been inferred based on homology or orthology. The data are searchable using standard identifiers, such as UniProt, ChEBI and GO IDs, protein, gene and complex names or synonyms. This reference resource will be maintained and grow to encompass an increasing number of organisms. Input from groups and individuals with specific areas of expertise is welcome.
Affiliation(s)
- Birgit H M Meldal
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Oscar Forner-Martinez
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Maria C Costanzo
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Jose Dana
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Janos Demeter
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Marine Dumousseau
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Selina S Dwight
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Anna Gaulton
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Luana Licata
  - Department of Biology, University of Rome, Tor Vergata, Rome 00133, Italy
- Anna N Melidoni
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Sylvie Ricard-Blum
  - UMR 5086 CNRS, Université Lyon1, Institut de Biologie et Chimie des Protéines, 7 passage du Vercors, 69367 Lyon Cedex 07, France
- Bernd Roechert
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland
- Marek S Skyzypek
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Manu Tiwari
  - Stammzellbiologie, Institut für Anatomie und Zellbiologie, GZMB Universitätsmedizin Göttingen, Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany
- Sameer Velankar
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Edith D Wong
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
- Henning Hermjakob
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Sandra Orchard
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
|
1119
|
Kim I, Lee H, Han SK, Kim S. Linear motif-mediated interactions have contributed to the evolution of modularity in complex protein interaction networks. PLoS Comput Biol 2014; 10:e1003881. [PMID: 25299147 PMCID: PMC4191887 DOI: 10.1371/journal.pcbi.1003881] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 05/06/2014] [Accepted: 08/29/2014] [Indexed: 02/06/2023]
Abstract
The modular architecture of protein-protein interaction (PPI) networks is evident in diverse species with a wide range of complexity. However, the molecular components that lead to the evolution of modularity in PPI networks have not been clearly identified. Here, we show that weak domain-linear motif interactions (DLIs) are more likely to connect different biological modules than strong domain-domain interactions (DDIs). This molecular division of labor is essential for the evolution of modularity in the complex PPI networks of diverse eukaryotic species. In particular, DLIs may compensate for the reduction in module boundaries that originates from increased connections between different modules in complex PPI networks. In addition, we show that the identification of biological modules can be greatly improved by including molecular characteristics of protein interactions. Our findings suggest that transient interactions have played a unique role in shaping the architecture and modularity of biological networks over the course of evolution.
Modular architecture is important for the evolution of cellular systems. Modular rearrangements facilitate functional innovations and modular insulations provide robustness to perturbations. However, the mechanisms underlying modular network evolution are currently not well understood at the molecular level. Here we show that strong domain-domain interactions (DDIs) and weak domain-linear motif interactions (DLIs) made different contributions to the evolution of the modular architecture of PPI networks. In particular, DLIs mediate between-module interactions, and their relative abundance has dramatically increased in metazoan species. Linear motifs have been identified as evolutionary interaction switches since subtle amino acid changes can cause the short sequences in linear motifs to appear and disappear. Our results suggest that subtle changes in linear motifs have contributed to the rewiring of functional modules and, consequently, to functional innovations in metazoan species.
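The claim that DLIs connect different modules more often than DDIs can be read as a comparison of between-module edge fractions per interaction type. The sketch below computes that quantity on a toy network; the protein names, module assignments, edge list, and the function name `between_module_fraction` are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def between_module_fraction(edges, module_of):
    """For each interaction type, return the fraction of edges whose two
    endpoints belong to different modules.

    edges: iterable of (protein_a, protein_b, interaction_type).
    module_of: dict mapping protein -> module label.
    """
    totals = defaultdict(int)
    between = defaultdict(int)
    for a, b, kind in edges:
        totals[kind] += 1
        if module_of[a] != module_of[b]:
            between[kind] += 1
    return {kind: between[kind] / totals[kind] for kind in totals}


# Hypothetical toy network: two modules, five interactions.
modules = {"P1": "M1", "P2": "M1", "P3": "M2", "P4": "M2"}
interactions = [
    ("P1", "P2", "DDI"),  # within M1
    ("P3", "P4", "DDI"),  # within M2
    ("P1", "P3", "DDI"),  # between modules
    ("P2", "P3", "DLI"),  # between modules
    ("P1", "P4", "DLI"),  # between modules
]
fractions = between_module_fraction(interactions, modules)
```

On real data the module assignments would come from a separate module detection step, and the difference between the two fractions would need a statistical test rather than a direct comparison.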
Affiliation(s)
- Inhae Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Heetak Lee
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Seong Kyu Han
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Sanguk Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
  - School of Interdisciplinary Bioscience and Bioengineering, Pohang University of Science and Technology, Pohang, Korea
|
1120
|
Cicek AE, Qi X, Cakmak A, Johnson SR, Han X, Alshalwi S, Ozsoyoglu ZM, Ozsoyoglu G. An online system for metabolic network analysis. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau091. [PMID: 25267793 PMCID: PMC4178370 DOI: 10.1093/database/bau091] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]
Abstract
Metabolic networks have become one of the centers of attention in life sciences research with the advancements in the metabolomics field. A vast array of studies analyzes metabolites and their interrelations to seek explanations for various biological questions, and numerous genome-scale metabolic networks have been assembled to serve this purpose. The increasing focus on this topic comes with the need for software systems that store, query, browse, analyze and visualize metabolic networks. PathCase Metabolomics Analysis Workbench (PathCaseMAW) is built, released and runs on a manually created generic mammalian metabolic network. The PathCaseMAW system provides a database-enabled framework and Web-based computational tools for browsing, querying, analyzing and visualizing stored metabolic networks. The PathCaseMAW editor, with its user-friendly interface, can be used to create a new metabolic network and/or update an existing metabolic network. The network can also be created from an existing genome-scale reconstructed network using the PathCaseMAW SBML parser. The metabolic network can be accessed through a Web interface or an iPad application. For metabolomics analysis, the steady-state metabolic network dynamics analysis (SMDA) algorithm is implemented and integrated with the system. The SMDA tool is accessible through both the Web-based interface and the iPad application for metabolomics analysis based on a metabolic profile. PathCaseMAW is a comprehensive system with various data input and data access subsystems. It is easy to work with by design, and is a promising tool for metabolomics research and for educational purposes. Database URL: http://nashua.case.edu/PathwaysMAW/Web
Collapse
Affiliation(s)
- Abdullah Ercument Cicek
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
| | - Xinjian Qi
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
| | - Ali Cakmak
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
| | - Stephen R Johnson
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
| | - Xu Han
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
| | - Sami Alshalwi
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
| | - Zehra Meral Ozsoyoglu
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
| | - Gultekin Ozsoyoglu
- Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15222, USA, Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA and Department of Computer Science, Istanbul Sehir University, Istanbul 34662, Turkey
1121
Beck TN, Chikwem AJ, Solanki NR, Golemis EA. Bioinformatic approaches to augment study of epithelial-to-mesenchymal transition in lung cancer. Physiol Genomics 2014; 46:699-724. [PMID: 25096367 PMCID: PMC4187119 DOI: 10.1152/physiolgenomics.00062.2014] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 05/15/2014] [Accepted: 08/04/2014] [Indexed: 12/22/2022] Open
Abstract
Bioinformatic approaches are intended to provide systems level insight into the complex biological processes that underlie serious diseases such as cancer. In this review we describe current bioinformatic resources, and illustrate how they have been used to study a clinically important example: epithelial-to-mesenchymal transition (EMT) in lung cancer. Lung cancer is the leading cause of cancer-related deaths and is often diagnosed at advanced stages, leading to limited therapeutic success. While EMT is essential during development and wound healing, pathological reactivation of this program by cancer cells contributes to metastasis and drug resistance, both major causes of death from lung cancer. Challenges of studying EMT include its transient nature, its molecular and phenotypic heterogeneity, and the complicated networks of rewired signaling cascades. Given the biology of lung cancer and the role of EMT, it is critical to better align the two in order to advance the impact of precision oncology. This task relies heavily on the application of bioinformatic resources. Besides summarizing recent work in this area, we use four EMT-associated genes, TGF-β (TGFB1), NEDD9/HEF1, β-catenin (CTNNB1) and E-cadherin (CDH1), as exemplars to demonstrate the current capacities and limitations of probing bioinformatic resources to inform hypothesis-driven studies with therapeutic goals.
Affiliation(s)
- Tim N Beck
- Developmental Therapeutics Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Program in Molecular and Cell Biology and Genetics, Drexel University College of Medicine, Philadelphia, Pennsylvania
- Adaeze J Chikwem
- Developmental Therapeutics Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Temple University School of Medicine, Philadelphia, Pennsylvania
- Nehal R Solanki
- Immune Cell Development and Host Defense Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Program in Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, Pennsylvania
- Erica A Golemis
- Developmental Therapeutics Program, Fox Chase Cancer Center, Philadelphia, Pennsylvania; Temple University School of Medicine, Philadelphia, Pennsylvania; and Program in Molecular and Cell Biology and Genetics, Drexel University College of Medicine, Philadelphia, Pennsylvania
1122
Zhang Q, Yang B, Chen X, Xu J, Mei C, Mao Z. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease. Database: The Journal of Biological Databases and Curation 2014; 2014:bau092. [PMID: 25252782 PMCID: PMC4173636 DOI: 10.1093/database/bau092] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022]
Abstract
We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore specific gene profiles and the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: The website is implemented in PHP, R, MySQL and Nginx and is freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net.
Affiliation(s)
- Qingzhou Zhang
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Bo Yang
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Xujiao Chen
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Jing Xu
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Changlin Mei
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
- Zhiguo Mao
- Kidney Institute of CPLA, Division of Nephrology, Changzheng Hospital, Second Military Medical University, 415 Fengyang Road, Shanghai 200003, China
1123
Butler WE, Atai N, Carter B, Hochberg F. Informatic system for a global tissue-fluid biorepository with a graph theory-oriented graphical user interface. J Extracell Vesicles 2014; 3:24247. [PMID: 25317275 PMCID: PMC4172698 DOI: 10.3402/jev.v3.24247] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/03/2014] [Revised: 06/13/2014] [Accepted: 06/15/2014] [Indexed: 12/12/2022] Open
Abstract
The Richard Floor Biorepository supports collaborative studies of extracellular vesicles (EVs) found in human fluids and tissue specimens. The current emphasis is on biomarkers for central nervous system neoplasms, but its structure may serve as a template for collaborative EV translational studies in other fields. The informatic system provides specimen inventory tracking with bar codes assigned to specimens, containers, and projects; is hosted on globalized cloud computing resources; and embeds a suite of shared documents, calendars, and video-conferencing features. Clinical data are recorded in relation to molecular EV attributes and may be tagged with terms drawn from a network of externally maintained ontologies, thus offering expansion of the system as the field matures. We fashioned the graphical user interface (GUI) around a web-based data visualization package. This system is now in an early stage of deployment, mainly focused on specimen tracking and clinical, laboratory, and imaging data capture in support of studies to optimize detection and analysis of brain tumour-specific mutations. It currently includes 4,392 specimens drawn from 611 subjects, the majority with brain tumours. As EV science evolves, we plan biorepository changes which may reflect multi-institutional collaborations, proteomic interfaces, additional biofluids, changes in operating procedures and kits for specimen handling, novel procedures for detection of tumour-specific EVs and for RNA extraction, and changes in the taxonomy of EVs. We have used an ontology-driven data model and web-based architecture with a graph theory-driven GUI to accommodate and stimulate the semantic web of EV science.
Affiliation(s)
- William E. Butler
- Neurosurgical Service, Massachusetts General Hospital, Boston, MA, USA
- Massachusetts General Hospital, Boston, MA, USA
- Nadia Atai
- Neurosurgical Service, Massachusetts General Hospital, Boston, MA, USA
- Massachusetts General Hospital, Boston, MA, USA
- Department of Cell Biology and Histology, University of Amsterdam, Amsterdam, The Netherlands
- Bob Carter
- Department of Neurosurgery, University of San Diego Medical School, San Diego, CA, USA
1124
Okada Y. From the era of genome analysis to the era of genomic drug discovery: a pioneering example of rheumatoid arthritis. Clin Genet 2014; 86:432-40. [PMID: 25060537 DOI: 10.1111/cge.12465] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 05/07/2014] [Revised: 07/19/2014] [Accepted: 07/21/2014] [Indexed: 01/18/2023]
Abstract
Although we have obtained comprehensive catalogs of genetic risk loci that are linked to human diseases, little is known regarding how to devise a systematic strategy to integrate genetic study results with diverse biological resources. Such strategies will be crucial for providing novel insights into disease biology and for aiding drug discovery as an ultimate goal. Here we describe the current progress in this field using a pioneering example of large-scale genetic association studies on rheumatoid arthritis (RA), an autoimmune disease characterized by inflammation and destruction of joints. Through functional and bioinformatic annotations of risk single nucleotide polymorphisms (SNPs) and genes from >100 RA risk loci identified by genome-wide association study (GWAS) meta-analysis, we found novel biological insights into RA pathogenicity. Further, by integrating RA genetic findings with the complete catalog of approved drugs for RA and other diseases, we provide empirical data to indicate that human genetic-based approaches may be useful for supporting 'genetics-driven genomic drug discovery' efforts in complex human traits and suggest that further development of integrative approaches should be undertaken.
Affiliation(s)
- Y Okada
- Department of Human Genetics and Disease Diversity, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan; Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
1125
Manunza A, Casellas J, Quintanilla R, González-Prendes R, Pena RN, Tibau J, Mercadé A, Castelló A, Aznárez N, Hernández-Sánchez J, Amills M. A genome-wide association analysis for porcine serum lipid traits reveals the existence of age-specific genetic determinants. BMC Genomics 2014; 15:758. [PMID: 25189197 PMCID: PMC4164741 DOI: 10.1186/1471-2164-15-758] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 09/12/2013] [Accepted: 07/25/2014] [Indexed: 01/07/2023] Open
Abstract
Background: The genetic determinism of blood lipid concentrations, the main risk factor for atherosclerosis, is practically unknown in species other than human and mouse. Even in model organisms, little is known about how the genetic determinants of lipid traits are modulated by age-specific factors. To gain new insights into this issue, we have carried out a genome-wide association study (GWAS) for cholesterol (CHOL), triglyceride (TRIG) and low (LDL) and high (HDL) density lipoprotein concentrations measured in Duroc pigs at two time points (45 and 190 days). Results: Analysis of data with mixed-model methods (EMMAX, GEMMA, GenABEL) and PLINK showed a low positional concordance between trait-associated regions (TARs) for serum lipids at 45 and 190 days. Besides, the proportion of phenotypic variance explained by SNPs at these two time points was also substantially different. The four analyses consistently detected two regions on SSC3 (124 Mb, CHOL and LDL at 190 days) and SSC6 (135 Mb, CHOL and TRIG at 190 days) with highly significant effects on the porcine blood lipid profile. Moreover, we have found that SNP variation within SSC3, SSC6, SSC10, SSC13 and SSC16 TARs is associated with the expression of several genes mapping to other chromosomes and related to lipid metabolism. Conclusions: Our data demonstrate that the effects of genomic determinants influencing lipid concentrations in pigs, as well as the amount of phenotypic variance they explain, are influenced by age-related factors. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-758) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Marcel Amills
- Department of Animal Genetics, Center for Research in Agricultural Genomics (CSIC-IRTA-UAB-UB), Universitat Autònoma de Barcelona, Bellaterra 08193, Spain.
1126
Bean DM, Heimbach J, Ficorella L, Micklem G, Oliver SG, Favrin G. esyN: network building, sharing and publishing. PLoS One 2014; 9:e106035. [PMID: 25181461 PMCID: PMC4152123 DOI: 10.1371/journal.pone.0106035] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 05/30/2014] [Accepted: 07/27/2014] [Indexed: 01/18/2023] Open
Abstract
The construction and analysis of networks is increasingly widespread in biological research. We have developed esyN ("easy networks") as a free and open source tool to facilitate the exchange of biological network models between researchers. esyN acts as a searchable database of user-created networks from any field. We have developed a simple companion web tool that enables users to view and edit networks using data from publicly available databases. Both normal interaction networks (graphs) and Petri nets can be created. In addition to its basic tools, esyN contains a number of logical templates that can be used to create models more easily. The ability to use previously published models as building blocks makes esyN a powerful tool for the construction of models and network graphs. Users are able to save their own projects online and share them either publicly or with a list of collaborators. The latter can be given the ability to edit the network themselves, allowing online collaboration on network construction. esyN is designed to facilitate unrestricted exchange of this increasingly important type of biological information. Ultimately, the aim of esyN is to bring the advantages of Open Source software development to the construction of biological networks.
Affiliation(s)
- Daniel M. Bean
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Joshua Heimbach
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Lorenzo Ficorella
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Dipartimento di Biochimica, Università degli Studi di Pisa, Pisa, Italy
- Gos Micklem
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Stephen G. Oliver
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Giorgio Favrin
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
1127
Do JH. Neurotoxin-induced pathway perturbation in human neuroblastoma SH-EP cells. Mol Cells 2014; 37:672-84. [PMID: 25234470 PMCID: PMC4179136 DOI: 10.14348/molcells.2014.0173] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 06/23/2014] [Revised: 08/09/2014] [Accepted: 08/11/2014] [Indexed: 01/20/2023] Open
Abstract
The exact causes of cell death in Parkinson's disease (PD) remain unknown despite extensive studies on PD. The identification of signaling and metabolic pathways involved in PD might provide insight into the molecular mechanisms underlying PD. The neurotoxin 1-methyl-4-phenylpyridinium (MPP(+)) induces cellular changes characteristic of PD, and MPP(+)-based models have been extensively used for PD studies. In this study, pathways that were significantly perturbed in MPP(+)-treated human neuroblastoma SH-EP cells were identified from genome-wide gene expression data for five time points (1.5, 3, 9, 12, and 24 h) after treatment. The mitogen-activated protein kinase (MAPK) signaling pathway and endoplasmic reticulum (ER) protein processing pathway showed significant perturbation at all time points. Perturbation of each of these pathways resulted in the common outcome of upregulation of DNA-damage-inducible transcript 3 (DDIT3). Genes involved in the ER protein processing pathway included ubiquitin ligase complex genes and ER-associated degradation (ERAD)-related genes. Additionally, overexpression of DDIT3 might induce oxidative stress via glutathione depletion as a result of overexpression of CHAC1. This study suggests that upregulation of DDIT3 caused by perturbation of the MAPK signaling pathway and ER protein processing pathway might play a key role in MPP(+)-induced neuronal cell death. Moreover, the toxicity signal of MPP(+) resulting from mitochondrial dysfunction through inhibition of complex I of the electron transport chain might feed back to the mitochondria via ER stress. This positive feedback could contribute to amplification of the death signal induced by MPP(+).
Affiliation(s)
- Jin Hwan Do
- Department of Biomolecular and Chemical Engineering, DongYang University, Yeongju 750-711, Korea
1128
Ma'ayan A, Rouillard AD, Clark NR, Wang Z, Duan Q, Kou Y. Lean Big Data integration in systems biology and systems pharmacology. Trends Pharmacol Sci 2014; 35:450-60. [PMID: 25109570 PMCID: PMC4153537 DOI: 10.1016/j.tips.2014.07.001] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Received: 05/08/2014] [Revised: 07/01/2014] [Accepted: 07/08/2014] [Indexed: 12/11/2022]
Abstract
Data sets from recent large-scale projects can be integrated into one unified puzzle that can provide new insights into how drugs and genetic perturbations applied to human cells are linked to whole-organism phenotypes. Data that report how drugs affect the phenotype of human cell lines and how drugs induce changes in gene and protein expression in human cell lines can be combined with knowledge about human disease, side effects induced by drugs, and mouse phenotypes. Such data integration efforts can be achieved through the conversion of data from the various resources into single-node-type networks, gene-set libraries, or multipartite graphs. This approach can lead us to the identification of more relationships between genes, drugs, and phenotypes as well as benchmark computational and experimental methods. Overall, this lean 'Big Data' integration strategy will bring us closer toward the goal of realizing personalized medicine.
Affiliation(s)
- Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA.
- Andrew D Rouillard
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Neil R Clark
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Zichen Wang
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Qiaonan Duan
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
- Yan Kou
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
1129
Altered gene transcription in human cells treated with Ludox® silica nanoparticles. International Journal of Environmental Research and Public Health 2014; 11:8867-90. [PMID: 25170680 PMCID: PMC4198995 DOI: 10.3390/ijerph110908867] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/30/2014] [Revised: 07/08/2014] [Accepted: 08/05/2014] [Indexed: 12/13/2022]
Abstract
Silica (SiO2) nanoparticles (NPs) have found extensive applications in industrial manufacturing, biomedical and biotechnological fields. Therefore, the increasing exposure to such ultrafine particles requires studies to characterize their potential cytotoxic effects in order to provide exhaustive information to assess the impact of nanomaterials on human health. The understanding of the biological processes involved in the development and maintenance of a variety of pathologies is improved by genome-wide approaches, and in this context, gene set analysis has emerged as a fundamental tool for the interpretation of the results. In this work we show how the use of a combination of gene-by-gene and gene set analyses can enhance the interpretation of results of in vitro treatment of A549 cells with Ludox® colloidal amorphous silica nanoparticles. By gene-by-gene and gene set analyses, we evidenced a specific cell response in relation to NPs size and elapsed time after treatment, with the smaller NPs (SM30) having higher impact on inflammatory and apoptosis processes than the bigger ones. Apoptotic process appeared to be activated by the up-regulation of the initiator genes TNFa and IL1b and by ATM. Moreover, our analyses evidenced that cell treatment with Ludox® silica nanoparticles activated the matrix metalloproteinase genes MMP1, MMP10 and MMP9. The information derived from this study can be informative about the cytotoxicity of Ludox® and other similar colloidal amorphous silica NPs prepared by solution processes.
1130
Holland A, Ohlendieck K. Comparative profiling of the sperm proteome. Proteomics 2014; 15:632-48. [DOI: 10.1002/pmic.201400032] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 01/28/2014] [Revised: 02/27/2014] [Accepted: 06/02/2014] [Indexed: 01/28/2023]
Affiliation(s)
- Ashling Holland
- Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland
- Kay Ohlendieck
- Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland
1131
Heinzel A, Perco P, Mayer G, Oberbauer R, Lukas A, Mayer B. From molecular signatures to predictive biomarkers: modeling disease pathophysiology and drug mechanism of action. Front Cell Dev Biol 2014; 2:37. [PMID: 25364744 PMCID: PMC4207010 DOI: 10.3389/fcell.2014.00037] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 06/24/2014] [Accepted: 07/29/2014] [Indexed: 12/31/2022] Open
Abstract
Omics profiling significantly expanded the molecular landscape describing clinical phenotypes. Association analysis resulted in first diagnostic and prognostic biomarker signatures entering clinical utility. However, utilizing Omics for deepening our understanding of disease pathophysiology, and further including specific interference with drug mechanism of action on a molecular process level still sees limited added value in the clinical setting. We exemplify a computational workflow for expanding from statistics-based association analysis toward deriving molecular pathway and process models for characterizing phenotypes and drug mechanism of action. Interference analysis on the molecular model level allows identification of predictive biomarker candidates for testing drug response. We discuss this strategy on diabetic nephropathy (DN), a complex clinical phenotype triggered by diabetes and presenting with renal as well as cardiovascular endpoints. A molecular pathway map indicates involvement of multiple molecular mechanisms, and selected biomarker candidates reported as associated with disease progression are identified for specific molecular processes. Selective interference of drug mechanism of action and disease-associated processes is identified for drug classes in clinical use, in turn providing precision medicine hypotheses utilizing predictive biomarkers.
Affiliation(s)
- Paul Perco
- emergentec biodevelopment GmbH, Vienna, Austria
- Gert Mayer
- Department of Internal Medicine IV, Medical University of Innsbruck, Innsbruck, Austria
- Rainer Oberbauer
- Department of Internal Medicine III, KH Elisabethinen Linz and Medical University of Vienna, Vienna, Austria
- Arno Lukas
- emergentec biodevelopment GmbH, Vienna, Austria
- Bernd Mayer
- emergentec biodevelopment GmbH, Vienna, Austria
1132
Mooney MA, Nigg JT, McWeeney SK, Wilmot B. Functional and genomic context in pathway analysis of GWAS data. Trends Genet 2014; 30:390-400. [PMID: 25154796 DOI: 10.1016/j.tig.2014.07.004] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Received: 06/12/2014] [Revised: 07/18/2014] [Accepted: 07/18/2014] [Indexed: 02/07/2023]
Abstract
Gene set analysis (GSA) is a promising tool for uncovering the polygenic effects associated with complex diseases. However, the available techniques reflect a wide variety of hypotheses about how genetic effects interact to contribute to disease susceptibility. The lack of consensus about the best way to perform GSA has led to confusion in the field and has made it difficult to compare results across methods. A clear understanding of the various choices made during GSA - such as how gene sets are defined, how single-nucleotide polymorphisms (SNPs) are assigned to genes, and how individual SNP-level effects are aggregated to produce gene- or pathway-level effects - will improve the interpretability and comparability of results across methods and studies. In this review we provide an overview of the various data sources used to construct gene sets and the statistical methods used to test for gene set association, as well as provide guidelines for ensuring the comparability of results.
Affiliation(s)
- Michael A Mooney
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
- Joel T Nigg
- Division of Psychology, Department of Psychiatry, Oregon Health & Science University, Portland, OR, USA; Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, USA
- Shannon K McWeeney
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA.
- Beth Wilmot
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
1133
Wimalaratne SM, Grenon P, Hermjakob H, Le Novère N, Laibe C. BioModels linked dataset. BMC Systems Biology 2014; 8:91. [PMID: 25182954 PMCID: PMC4423647 DOI: 10.1186/s12918-014-0091-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 05/06/2014] [Accepted: 07/18/2014] [Indexed: 11/17/2022]
Abstract
Background: BioModels Database is a reference repository of mathematical models used in biology. Models are stored as SBML files on a file system and metadata is provided in a relational database. Models can be retrieved through a web interface and programmatically via web services. In addition to those more traditional ways to access information, Linked Data using Semantic Web technologies (such as the Resource Description Framework, RDF) is becoming an increasingly popular means to describe and expose biologically relevant data. Results: We present the BioModels Linked Dataset, which exposes the models’ content as a dereferenceable interlinked dataset. BioModels Linked Dataset makes use of the wealth of annotations available within a large number of manually curated models to link and integrate data and models from other resources. Conclusions: The BioModels Linked Dataset provides users with a dataset interoperable with other semantic web resources. It supports powerful search queries, some of which were not previously available to users, and allows integration of data from multiple resources. This provides a distributed platform to find similar models for comparison, processing and enrichment.
Affiliation(s)
- Sarala M Wimalaratne
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Pierre Grenon
- CHIME, The Farr Institute of Health Informatics Research, London, NW1 2DA, UK.
- Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Nicolas Le Novère
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, UK.
- Camille Laibe
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
1134
How to learn about gene function: text-mining or ontologies? Methods 2014; 74:3-15. [PMID: 25088781 DOI: 10.1016/j.ymeth.2014.07.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 04/03/2014] [Revised: 07/01/2014] [Accepted: 07/09/2014] [Indexed: 12/31/2022] Open
Abstract
As the amount of genome information increases rapidly, there is a correspondingly greater need for methods that provide accurate and automated annotation of gene function. For example, many high-throughput technologies (e.g., next-generation sequencing) are being used today to generate lists of genes associated with specific conditions. However, their functional interpretation remains a challenge, and many tools exist that try to characterize the function of gene-lists. Such systems typically rely on enrichment analysis and aim to give a quick insight into the underlying biology by presenting it in the form of a summary report. While the load of annotation may be alleviated by such computational approaches, the main challenge in modern annotation remains to develop a systems form of analysis in which a pipeline can effectively analyze gene-lists quickly and identify aggregated annotations through computerized resources. In this article we survey some of the many such tools and methods that have been developed to automatically interpret the biological functions underlying gene-lists. We overview current functional annotation aspects from the perspective of their epistemology (i.e., the underlying theories used to organize information about gene function into a body of verified and documented knowledge) and find that most of the currently used functional annotation methods fall broadly into one of two categories: they are based either on 'known' formally-structured ontology annotations created by 'experts' (e.g., the GO terms used to describe the function of Entrez Gene entries), or, perhaps more adventurously, on annotations inferred from literature (e.g., many text-mining methods use computer-aided reasoning to acquire knowledge represented in natural languages). Overall, however, deriving detailed and accurate insight from such gene lists remains a challenging task, and improved methods are called for.
In particular, future methods need to (1) provide more holistic insight into the underlying molecular systems; (2) provide better follow-up experimental testing and treatment options, and (3) better manage gene lists derived from organisms that are not well-studied. We discuss some promising approaches that may help achieve these advances, especially the use of extended dictionaries of biomedical concepts and molecular mechanisms, as well as greater use of annotation benchmarks.
|
1135
|
Titz B, Elamin A, Martin F, Schneider T, Dijon S, Ivanov NV, Hoeng J, Peitsch MC. Proteomics for systems toxicology. Comput Struct Biotechnol J 2014; 11:73-90. [PMID: 25379146 PMCID: PMC4212285 DOI: 10.1016/j.csbj.2014.08.004] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 12/26/2022] Open
Abstract
Current toxicology studies frequently lack measurements at molecular resolution to enable a more mechanism-based and predictive toxicological assessment. Recently, a systems toxicology assessment framework has been proposed, which combines conventional toxicological assessment strategies with system-wide measurement methods and computational analysis approaches from the field of systems biology. Proteomic measurements are an integral component of this integrative strategy because protein alterations closely mirror biological effects, such as biological stress responses or global tissue alterations. Here, we provide an overview of the technical foundations and highlight select applications of proteomics for systems toxicology studies. With a focus on mass spectrometry-based proteomics, we summarize the experimental methods for quantitative proteomics and describe the computational approaches used to derive biological/mechanistic insights from these datasets. To illustrate how proteomics has been successfully employed to address mechanistic questions in toxicology, we summarized several case studies. Overall, we provide the technical and conceptual foundation for the integration of proteomic measurements in a more comprehensive systems toxicology assessment framework. We conclude that, owing to the critical importance of protein-level measurements and recent technological advances, proteomics will be an integral part of integrative systems toxicology approaches in the future.
|
1136
|
Greco TM, Diner BA, Cristea IM. The Impact of Mass Spectrometry-Based Proteomics on Fundamental Discoveries in Virology. Annu Rev Virol 2014; 1:581-604. [PMID: 26958735 DOI: 10.1146/annurev-virology-031413-085527] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 12/20/2022]
Abstract
In recent years, mass spectrometry has emerged as a core component of fundamental discoveries in virology. As a consequence of their coevolution, viruses and host cells have established complex, dynamic interactions that function either in promoting virus replication and dissemination or in host defense against invading pathogens. Thus, viral infection triggers an impressive range of proteome changes. Alterations in protein abundances, interactions, posttranslational modifications, subcellular localizations, and secretion are temporally regulated during the progression of an infection. Consequently, understanding viral infection at the molecular level requires versatile approaches that afford both breadth and depth of analysis. Mass spectrometry is uniquely positioned to bridge this experimental dichotomy. Its application to both unbiased systems analyses and targeted, hypothesis-driven studies has accelerated discoveries in viral pathogenesis and host defense. Here, we review the contributions of mass spectrometry-based proteomic approaches to understanding viral morphogenesis, replication, and assembly and to characterizing host responses to infection.
Affiliation(s)
- Todd M Greco
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
- Benjamin A Diner
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
- Ileana M Cristea
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
|
1137
|
Wesołowska-Andersen A, Borst L, Dalgaard MD, Yadav R, Rasmussen KK, Wehner PS, Rasmussen M, Ørntoft TF, Nordentoft I, Koehler R, Bartram CR, Schrappe M, Sicheritz-Ponten T, Gautier L, Marquart H, Madsen HO, Brunak S, Stanulla M, Gupta R, Schmiegelow K. Genomic profiling of thousands of candidate polymorphisms predicts risk of relapse in 778 Danish and German childhood acute lymphoblastic leukemia patients. Leukemia 2014; 29:297-303. [PMID: 24990611 PMCID: PMC4320289 DOI: 10.1038/leu.2014.205] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 04/16/2014] [Revised: 06/14/2014] [Accepted: 06/17/2014] [Indexed: 12/27/2022]
Abstract
Childhood acute lymphoblastic leukemia survival approaches 90%. New strategies are needed to identify the 10-15% who evade cure. We applied targeted, sequencing-based genotyping of 25 000 to 34 000 preselected potentially clinically relevant single-nucleotide polymorphisms (SNPs) to identify host genome profiles associated with relapse risk in 352 patients from the Nordic ALL92/2000 protocols and 426 patients from the German Berlin-Frankfurt-Munster (BFM) ALL2000 protocol. Patients were enrolled between 1992 and 2008 (median follow-up: 7.6 years). Eleven cross-validated SNPs were significantly associated with risk of relapse across protocols. SNP and biologic pathway level analyses associated relapse risk with leukemia aggressiveness, glucocorticosteroid pharmacology/response and drug transport/metabolism pathways. Classification and regression tree analysis identified three distinct risk groups defined by end of induction residual leukemia, white blood cell count and variants in myeloperoxidase (MPO), estrogen receptor 1 (ESR1), lamin B1 (LMNB1) and matrix metalloproteinase-7 (MMP7) genes, ATP-binding cassette transporters and glucocorticosteroid transcription regulation pathways. Relapse rates ranged from 4% (95% confidence interval (CI): 1.6-6.3%) for the best group (72% of patients) to 76% (95% CI: 41-90%) for the worst group (5% of patients, P<0.001). Validation of these findings, together with similar approaches to identify SNPs associated with toxicities, may allow future individualized treatment adaptation based on relapse and toxicity risk.
Affiliation(s)
- A Wesołowska-Andersen
  - Center for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
- L Borst
  - Pediatrics and Adolescent Medicine, The Juliane Marie Centre, The University Hospital Rigshospitalet, Copenhagen, Denmark
- M D Dalgaard
  - Center for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
- R Yadav
  - Center for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
- K K Rasmussen
  - Pediatrics and Adolescent Medicine, The Juliane Marie Centre, The University Hospital Rigshospitalet, Copenhagen, Denmark
- P S Wehner
  - Department of Pediatric Hematology and Oncology, HC Andersen Children's Hospital, Odense University Hospital, Odense, Denmark
- M Rasmussen
  - Centre for GeoGenetics, Natural History Museum of Denmark, The University of Copenhagen, Copenhagen, Denmark
- T F Ørntoft
  - Institute of Clinical Medicine, Århus University Hospital, Århus, Denmark
- I Nordentoft
  - Institute of Clinical Medicine, Århus University Hospital, Århus, Denmark
- R Koehler
  - Department of Human Genetics, University of Heidelberg, Heidelberg, Germany
- C R Bartram
  - Department of Human Genetics, University of Heidelberg, Heidelberg, Germany
- M Schrappe
  - Department of General Pediatrics, University Medical Center Schleswig-Holstein, Kiel, Germany
- T Sicheritz-Ponten
  - Center for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
- L Gautier
  - Center for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
- H Marquart
  - Pediatric Hematology and Oncology, Hannover Medical School, Hannover, Germany
- H O Madsen
  - Pediatric Hematology and Oncology, Hannover Medical School, Hannover, Germany
- S Brunak
  - Center for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
- M Stanulla
  - Department of Clinical Immunology, Diagnostic Centre, The University Hospital Rigshospitalet, Copenhagen, Denmark
- R Gupta
  - Center for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
- K Schmiegelow
  - [1] Pediatrics and Adolescent Medicine, The Juliane Marie Centre, The University Hospital Rigshospitalet, Copenhagen, Denmark; [2] Institute of Clinical Medicine, Faculty of Health and Medical Sciences, The University of Copenhagen, Copenhagen, Denmark
|
1138
|
Zhang Y, Qiu Z, Wei L, Tang R, Lian B, Zhao Y, He X, Xie L. Integrated analysis of mutation data from various sources identifies key genes and signaling pathways in hepatocellular carcinoma. PLoS One 2014; 9:e100854. [PMID: 24988079 PMCID: PMC4079600 DOI: 10.1371/journal.pone.0100854] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 03/19/2014] [Accepted: 05/28/2014] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND: Recently, a number of studies have performed genome or exome sequencing of hepatocellular carcinoma (HCC) and identified hundreds or even thousands of mutations in protein-coding genes. However, these studies have only focused on a limited number of candidate genes, and many important mutation resources remain to be explored.
PRINCIPAL FINDINGS: In this study, we integrated mutation data obtained from various sources and performed pathway and network analysis. We identified 113 pathways that were significantly mutated in HCC samples and found that the mutated genes included in these pathways contained high percentages of known cancer genes and damaging genes, and also demonstrated high conservation scores, indicating their important roles in liver tumorigenesis. The five classes of pathways that were mutated most frequently were (a) proliferation and apoptosis related pathways, (b) tumor microenvironment related pathways, (c) neural signaling related pathways, (d) metabolic related pathways, and (e) circadian related pathways. Network analysis further revealed that the mutated genes with the highest betweenness coefficients, such as the well-known cancer genes TP53 and CTNNB1 and the recently identified novel mutated genes GNAL and the ADCY family, may play key roles in these significantly mutated pathways. Finally, we highlight several key genes (e.g., RPS6KA3 and PCLO) and pathways (e.g., axon guidance) in which the mutations were associated with clinical features.
CONCLUSIONS: Our workflow illustrates the increased statistical power of integrating multiple studies of the same subject, which can provide biological insights that would otherwise be masked under individual sample sets. This type of bioinformatics approach is consistent with the necessity of making the best use of the ever-increasing data provided in valuable databases, such as TCGA, to enhance the speed of deciphering human cancers.
Affiliation(s)
- Yuannv Zhang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zhaoping Qiu
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Lin Wei
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ruqi Tang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Baofeng Lian
  - Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai, China
- Yingjun Zhao
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xianghuo He
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - * E-mail: (XH); (LX)
- Lu Xie
  - Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai, China
  - * E-mail: (XH); (LX)
|
1139
|
Wu G, Dawson E, Duong A, Haw R, Stein L. ReactomeFIViz: a Cytoscape app for pathway and network-based data analysis. F1000Res 2014; 3:146. [PMID: 25309732 PMCID: PMC4184317 DOI: 10.12688/f1000research.4431.2] [Citation(s) in RCA: 134] [Impact Index Per Article: 13.4] [Accepted: 09/10/2014] [Indexed: 12/18/2022] Open
Abstract
High-throughput experiments are routinely performed in modern biological studies. However, extracting meaningful results from massive experimental data sets is a challenging task for biologists. Projecting data onto pathway and network contexts is a powerful way to unravel patterns embedded in seemingly scattered large data sets and assist knowledge discovery related to cancer and other complex diseases. We have developed a Cytoscape app called “ReactomeFIViz”, which utilizes a highly reliable gene functional interaction network combined with human curated pathways derived from Reactome and other pathway databases. This app provides a suite of features to assist biologists in performing pathway- and network-based data analysis in a biologically intuitive and user-friendly way. Biologists can use this app to uncover network and pathway patterns related to their studies, search for gene signatures from gene expression data sets, reveal pathways significantly enriched by genes in a list, and integrate multiple genomic data types into a pathway context using probabilistic graphical models. We believe our app will give researchers substantial power to analyze intrinsically noisy high-throughput experimental data to find biologically relevant information.
Affiliation(s)
- Guanming Wu
  - Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada; DMICE, Oregon Health & Science University, Portland, Oregon 97239, USA
- Eric Dawson
  - Section of Integrative Biology, Institute for Cellular and Molecular Biology, and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX 78712, USA
- Adrian Duong
  - Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada
- Robin Haw
  - Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada
- Lincoln Stein
  - Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
|
1140
|
Josset L, Tchitchek N, Gralinski LE, Ferris MT, Eisfeld AJ, Green RR, Thomas MJ, Tisoncik-Go J, Schroth GP, Kawaoka Y, Pardo-Manuel de Villena F, Baric RS, Heise MT, Peng X, Katze MG. Annotation of long non-coding RNAs expressed in collaborative cross founder mice in response to respiratory virus infection reveals a new class of interferon-stimulated transcripts. RNA Biol 2014; 11:875-90. [PMID: 24922324 PMCID: PMC4179962 DOI: 10.4161/rna.29442] [Citation(s) in RCA: 81] [Impact Index Per Article: 8.1] [Received: 03/26/2014] [Revised: 05/28/2014] [Accepted: 06/03/2014] [Indexed: 11/19/2022] Open
Abstract
The outcome of respiratory virus infection is determined by a complex interplay of viral and host factors. Some potentially important host factors for the antiviral response, whose functions remain largely unexplored, are long non-coding RNAs (lncRNAs). Here we systematically inferred the regulatory functions of host lncRNAs in response to influenza A virus and severe acute respiratory syndrome coronavirus (SARS-CoV) based on their similarity in expression with genes of known function. We performed total RNA-Seq on viral-infected lungs from eight mouse strains, yielding a large data set of transcriptional responses. Overall 5,329 lncRNAs were differentially expressed after infection. Most of the lncRNAs were co-expressed with coding genes in modules enriched in genes associated with lung homeostasis pathways or immune response processes. Each lncRNA was further individually annotated using a rank-based method, enabling us to associate 5,295 lncRNAs to at least one gene set and to predict their potential cis effects. We validated the lncRNAs predicted to be interferon-stimulated by profiling mouse responses after interferon-α treatment. Altogether, these results provide a broad categorization of potential lncRNA functions and identify subsets of lncRNAs with likely key roles in respiratory virus pathogenesis. These data are fully accessible through the MOuse NOn-Code Lung interactive database (MONOCLdb).
Affiliation(s)
- Laurence Josset
  - Department of Microbiology; School of Medicine; University of Washington; Seattle, WA USA
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
- Nicolas Tchitchek
  - Department of Microbiology; School of Medicine; University of Washington; Seattle, WA USA
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
- Lisa E Gralinski
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
  - Department of Epidemiology; University of North Carolina-Chapel Hill; Chapel Hill, NC USA
- Martin T Ferris
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
  - Department of Genetics; University of North Carolina-Chapel Hill; Chapel Hill, NC USA
- Amie J Eisfeld
  - Department of Pathobiological Sciences; Influenza Research Institute; University of Wisconsin-Madison; Madison, WI USA
- Richard R Green
  - Department of Microbiology; School of Medicine; University of Washington; Seattle, WA USA
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
- Matthew J Thomas
  - Department of Microbiology; School of Medicine; University of Washington; Seattle, WA USA
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
- Jennifer Tisoncik-Go
  - Department of Microbiology; School of Medicine; University of Washington; Seattle, WA USA
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
- Yoshihiro Kawaoka
  - Department of Pathobiological Sciences; Influenza Research Institute; University of Wisconsin-Madison; Madison, WI USA
- Ralph S Baric
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
  - Department of Epidemiology; University of North Carolina-Chapel Hill; Chapel Hill, NC USA
- Mark T Heise
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
  - Department of Genetics; University of North Carolina-Chapel Hill; Chapel Hill, NC USA
- Xinxia Peng
  - Department of Microbiology; School of Medicine; University of Washington; Seattle, WA USA
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
- Michael G Katze
  - Department of Microbiology; School of Medicine; University of Washington; Seattle, WA USA
  - Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research; Portland, OR USA
|
1141
|
Montague E, Stanberry L, Higdon R, Janko I, Lee E, Anderson N, Choiniere J, Stewart E, Yandl G, Broomall W, Kolker N, Kolker E. MOPED 2.5--an integrated multi-omics resource: multi-omics profiling expression database now includes transcriptomics data. OMICS 2014; 18:335-43. [PMID: 24910945 PMCID: PMC4048574 DOI: 10.1089/omi.2014.0061] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Indexed: 12/31/2022]
Abstract
Multi-omics data-driven scientific discovery crucially rests on high-throughput technologies and data sharing. Currently, data are scattered across single omics repositories, stored in varying raw and processed formats, and are often accompanied by limited or no metadata. The Multi-Omics Profiling Expression Database (MOPED, http://moped.proteinspire.org) version 2.5 is a freely accessible multi-omics expression database. Continual improvement and expansion of MOPED is driven by feedback from the Life Sciences Community. In order to meet the emergent need for an integrated multi-omics data resource, MOPED 2.5 now includes gene relative expression data in addition to protein absolute and relative expression data from over 250 large-scale experiments. To facilitate accurate integration of experiments and increase reproducibility, MOPED provides extensive metadata through the Data-Enabled Life Sciences Alliance (DELSA Global, http://delsaglobal.org) metadata checklist. MOPED 2.5 has greatly increased the number of proteomics absolute and relative expression records to over 500,000, in addition to adding more than four million transcriptomics relative expression records. MOPED has an intuitive user interface with tabs for querying different types of omics expression data and new tools for data visualization. Summary information including expression data, pathway mappings, and direct connection between proteins and genes can be viewed on Protein and Gene Details pages. These connections in MOPED provide a context for multi-omics expression data exploration. Researchers are encouraged to submit omics data which will be consistently processed into expression summaries. MOPED as a multi-omics data resource is a pivotal public database, interdisciplinary knowledge resource, and platform for multi-omics understanding.
Affiliation(s)
- Elizabeth Montague
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Larissa Stanberry
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Roger Higdon
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Imre Janko
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Elaine Lee
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Nathaniel Anderson
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- John Choiniere
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Elizabeth Stewart
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Gregory Yandl
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- William Broomall
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Natali Kolker
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
- Eugene Kolker
  - Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
  - High-throughput Analysis Core, Seattle Children's Research Institute, Seattle, Washington
  - Predictive Analytics, Seattle Children's, Seattle, Washington
  - Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
  - Departments of Biomedical Informatics and Medical Education and Pediatrics, University of Washington, Seattle, Washington
|
1142
|
Kawano S, Watanabe T, Mizuguchi S, Araki N, Katayama T, Yamaguchi A. TogoTable: cross-database annotation system using the Resource Description Framework (RDF) data model. Nucleic Acids Res 2014; 42:W442-8. [PMID: 24829452 PMCID: PMC4086138 DOI: 10.1093/nar/gku403] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022] Open
Abstract
TogoTable (http://togotable.dbcls.jp/) is a web tool that adds user-specified annotations to a table that a user uploads. Annotations are drawn from several biological databases that use the Resource Description Framework (RDF) data model. TogoTable uses database identifiers (IDs) in the table as a query key for searching. RDF data, which form a network called Linked Open Data (LOD), can be searched from SPARQL endpoints using the SPARQL query language. Because TogoTable uses RDF, it can integrate annotations not only from the reference database to which the IDs originally belong, but also from externally linked databases via the LOD network. For example, annotations in the Protein Data Bank can be retrieved using GeneID through links provided by the UniProt RDF. Because RDF has been standardized by the World Wide Web Consortium, any database with annotations based on the RDF data model can be easily incorporated into this tool. We believe that TogoTable is a valuable Web tool, particularly for experimental biologists who need to process huge amounts of data such as high-throughput experimental output.
Affiliation(s)
- Shin Kawano
  - Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Tsutomu Watanabe
  - CrossEdge Systems Inc., 2-14-42 Higashi Yamada, Tsuzuki-ku, Yokohama, Kanagawa 224-0023, Japan
- Sohei Mizuguchi
  - Department of Tumor Genetics and Biology, Graduate School of Medical Sciences, Kumamoto University, 1-1-1 Honjo, Chuo-ku, Kumamoto, Kumamoto 860-8556, Japan
- Norie Araki
  - Department of Tumor Genetics and Biology, Graduate School of Medical Sciences, Kumamoto University, 1-1-1 Honjo, Chuo-ku, Kumamoto, Kumamoto 860-8556, Japan
- Toshiaki Katayama
  - Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Atsuko Yamaguchi
  - Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
|
1143
|
Abstract
Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, "meta" information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally.
|
1144
|
Kongsbak K, Hadrup N, Audouze K, Vinggaard AM. Applicability of computational systems biology in toxicology. Basic Clin Pharmacol Toxicol 2014; 115:45-9. [PMID: 24528503 DOI: 10.1111/bcpt.12216] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 11/19/2013] [Accepted: 02/05/2014] [Indexed: 12/31/2022]
Abstract
Systems biology as a research field has emerged within the last few decades. Systems biology, often defined as the antithesis of the reductionist approach, integrates information about individual components of a biological system. In integrative systems biology, large data sets from various sources and databases are used to model and predict effects of chemicals on, for instance, human health. In toxicology, computational systems biology enables identification of important pathways and molecules from large data sets; tasks that can be extremely laborious when performed by a classical literature search. However, computational systems biology offers more advantages than providing a high-throughput literature search; it may form the basis for establishment of hypotheses on potential links between environmental chemicals and human diseases, which would be very difficult to establish experimentally. This is possible due to the existence of comprehensive databases containing information on networks of human protein-protein interactions and protein-disease associations. Experimentally determined targets of the specific chemical of interest can be fed into these networks to obtain additional information that can be used to establish hypotheses on links between the chemical and human diseases. Such information can also be applied for designing more intelligent animal/cell experiments that can test the established hypotheses. Here, we describe how and why to apply an integrative systems biology method in the hypothesis-generating phase of toxicological research.
Affiliation(s)
- Kristine Kongsbak
- Division of Toxicology and Risk Assessment, National Food Institute, Technical University of Denmark, Søborg, Denmark; Department for Systems Biology, Centre for Biological Sequence Analysis, Technical University of Denmark, Kgs. Lyngby, Denmark
|
1145
|
Hantavirus immunology of rodent reservoirs: current status and future directions. Viruses 2014; 6:1317-35. [PMID: 24638205 PMCID: PMC3970152 DOI: 10.3390/v6031317] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 01/10/2014] [Revised: 02/19/2014] [Accepted: 02/24/2014] [Indexed: 12/22/2022] Open
Abstract
Hantaviruses are hosted by rodents, insectivores and bats. Several rodent-borne hantaviruses cause two human diseases that share many features: hemorrhagic fever with renal syndrome in Eurasia and hantavirus cardiopulmonary syndrome in the Americas. It is thought that the immune response plays a significant contributory role in these diseases. However, in reservoir hosts that have been closely examined, little or no pathology occurs and infection is persistent despite evidence of adaptive immune responses. Because most hantavirus reservoirs are not model organisms, it is difficult to conduct meaningful experiments that might shed light on how the viruses evade sterilizing immune responses and why immunopathology does not occur. Despite these limitations, recent advances in instrumentation and bioinformatics, applied through a systems biology approach, are expected to have a dramatic impact on understanding reservoir host responses to hantaviruses by identifying important pathways that mediate virus/reservoir relationships.
|
1146
|
Gomez-Cabrero D, Abugessaisa I, Maier D, Teschendorff A, Merkenschlager M, Gisel A, Ballestar E, Bongcam-Rudloff E, Conesa A, Tegnér J. Data integration in the era of omics: current and future challenges. BMC Systems Biology 2014; 8 Suppl 2:I1. [PMID: 25032990 PMCID: PMC4101704 DOI: 10.1186/1752-0509-8-s2-i1] [Citation(s) in RCA: 209] [Impact Index Per Article: 20.9] [Indexed: 12/13/2022]
Abstract
Integrating heterogeneous and large omics data constitutes not only a conceptual challenge but also a practical hurdle in the daily analysis of omics data. With the rise of novel omics technologies and through large-scale consortia projects, biological systems are being investigated at an unprecedented scale, generating heterogeneous and often large data sets. These data sets encourage researchers to develop novel data integration methodologies. In this introduction we review the definition of data integration and characterize current efforts on data integration in the life sciences. We have used a web survey to assess current research projects on data integration and to tap into the views, needs and challenges as currently perceived by parts of the research community.
|
1147
|
Zheng CL, Kawane S, Bottomly D, Wilmot B. Analysis considerations for utilizing RNA-Seq to characterize the brain transcriptome. International Review of Neurobiology 2014; 116:21-54. [PMID: 25172470 DOI: 10.1016/b978-0-12-801105-8.00002-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023]
Abstract
RNA-Seq allows one to examine not only gene expression but also expression of noncoding RNAs, alternative splicing, and allele-specific expression. With this increased sensitivity and dynamic range come computational and statistical considerations, which are highly dependent on the biological question being asked. We highlight these to provide an overview of their importance and the impact they can have on downstream interpretation of the brain transcriptome.
Affiliation(s)
- Christina L Zheng
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon, USA
- Sunita Kawane
- Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Daniel Bottomly
- Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Beth Wilmot
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
|
1148
|
Emamjomeh A, Goliaei B, Zahiri J, Ebrahimpour R. Predicting protein–protein interactions between human and hepatitis C virus via an ensemble learning method. Molecular BioSystems 2014; 10:3147-54. [DOI: 10.1039/c4mb00410h] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 01/01/2023]
Abstract
We developed a novel method to predict human–HCV protein–protein interactions, the most comprehensive study of this type.
Affiliation(s)
- Abbasali Emamjomeh
- Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Bahram Goliaei
- Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Javad Zahiri
- Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran; Department of Mathematics, K.N. Toosi University of Technology
- Reza Ebrahimpour
- Brain and Intelligent Systems Research Lab, Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
|
1149
|
Fernández-Suárez XM, Rigden DJ, Galperin MY. The 2014 Nucleic Acids Research Database Issue and an updated NAR online Molecular Biology Database Collection. Nucleic Acids Res 2013; 42:D1-6. [PMID: 24316579 PMCID: PMC3965027 DOI: 10.1093/nar/gkt1282] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022] Open
Abstract
The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI’s MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
Affiliation(s)
- Xosé M Fernández-Suárez
- Life Technologies, Inchinnan Business Park, Paisley PA4 9RF, UK, Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK and National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|