1
Smith DA. Situating Wikipedia as a health information resource in various contexts: A scoping review. PLoS One 2020; 15:e0228786. [PMID: 32069322 PMCID: PMC7028268 DOI: 10.1371/journal.pone.0228786] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 10/17/2019] [Accepted: 01/22/2020] [Indexed: 11/25/2022]
Abstract
BACKGROUND Wikipedia's health content is the most frequently visited resource for health information on the internet. While the literature provides strong evidence for its high usage, a comprehensive literature review of Wikipedia's role within the health context has not yet been reported. OBJECTIVE To conduct a comprehensive review of peer-reviewed, published literature to learn what the existing body of literature says about Wikipedia as a health information resource and what publication trends exist, if any. METHODS A comprehensive literature search in OVID Medline, OVID Embase, CINAHL, LISTA, Wilson's Web, AMED, and Web of Science was performed. Through a two-stage screening process, records were excluded if: Wikipedia was not a major or exclusive focus of the article; Wikipedia was not discussed within the context of a health or medical topic; the article was not available in English; the article was irretrievable; or the article was a letter, commentary, editorial, or popular media article. RESULTS 89 articles and conference proceedings were selected for inclusion in the review. Four categories of literature emerged: 1) studies that situate Wikipedia as a health information resource; 2) investigations into the quality of Wikipedia; 3) explorations of the utility of Wikipedia in education; and 4) studies that demonstrate the utility of Wikipedia in research. CONCLUSION The literature positions Wikipedia as a prominent health information resource in various contexts for the public, patients, students, and practitioners seeking health information online. Wikipedia's health content is accessed frequently, and its pages regularly rank highly in Google search results. While Wikipedia itself is well into its second decade, the academic discourse around Wikipedia within the context of health is still young and the academic literature is limited when attempts are made to understand Wikipedia as a health information resource. Possibilities for future research will be discussed.
Affiliation(s)
- Denise A. Smith
- Health Sciences Library, McMaster University, Hamilton, Ontario, Canada
- Faculty of Information & Media Studies, Western University, London, Ontario, Canada
2
Zinovyev A, Czerwinska U, Cantini L, Barillot E, Frahm KM, Shepelyansky DL. Collective intelligence defines biological functions in Wikipedia as communities in the hidden protein connection network. PLoS Comput Biol 2020; 16:e1007652. [PMID: 32069277 PMCID: PMC7048313 DOI: 10.1371/journal.pcbi.1007652] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/08/2019] [Revised: 02/28/2020] [Accepted: 01/13/2020] [Indexed: 11/23/2022]
Abstract
English Wikipedia, containing more than five million articles, has approximately eleven thousand web pages devoted to proteins or genes, most of which were generated by the Gene Wiki project. These pages contain information about interactions between proteins and their functional relationships. At the same time, they are interconnected with other Wikipedia pages describing biological functions, diseases, drugs and other topics curated by independent, uncoordinated collective efforts. Therefore, Wikipedia contains a directed network of protein functional relations or physical interactions embedded into the global network of the encyclopedia terms, which defines hidden (indirect) functional proximity between proteins. We applied the recently developed reduced Google Matrix (REGOMAX) algorithm in order to extract the network of hidden functional connections between proteins in Wikipedia. In this network we discovered tight communities which reflect areas of interest in molecular biology or medicine and can be considered as definitions of biological functions shaped by collective intelligence. Moreover, by comparing two snapshots of the Wikipedia graph (from years 2013 and 2017), we studied the evolution of the network of direct and hidden protein connections. We concluded that the hidden connections are more dynamic compared to the direct ones and that the size of the hidden interaction communities grows with time. We recapitulate the results of Wikipedia protein community analysis and annotation in the form of an interactive online map, which can serve as a portal to the Gene Wiki project.
Affiliation(s)
- Andrei Zinovyev
- Institut Curie, PSL Research University, F-75005 Paris, France
- INSERM, U900, F-75005 Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Urszula Czerwinska
- Institut Curie, PSL Research University, F-75005 Paris, France
- INSERM, U900, F-75005 Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Laura Cantini
- Institut Curie, PSL Research University, F-75005 Paris, France
- INSERM, U900, F-75005 Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Computational Systems Biology Team, Institut de Biologie de l’Ecole Normale Supérieure, CNRS UMR8197, INSERM U1024, Ecole Normale Supérieure, PSL Research University, F-75005 Paris, France
- Emmanuel Barillot
- Institut Curie, PSL Research University, F-75005 Paris, France
- INSERM, U900, F-75005 Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Klaus M. Frahm
- Laboratoire de Physique Théorique, IRSAMC, Université de Toulouse, CNRS, UPS, F-31062 Toulouse, France
- Dima L. Shepelyansky
- Laboratoire de Physique Théorique, IRSAMC, Université de Toulouse, CNRS, UPS, F-31062 Toulouse, France
3
Ping P, Hermjakob H, Polson JS, Benos PV, Wang W. Biomedical Informatics on the Cloud: A Treasure Hunt for Advancing Cardiovascular Medicine. Circ Res 2018; 122:1290-1301. [PMID: 29700073 PMCID: PMC6192708 DOI: 10.1161/circresaha.117.310967] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 12/18/2022]
Abstract
In the digital age of cardiovascular medicine, the rate of biomedical discovery can be greatly accelerated by the guidance and resources required to unearth potential collections of knowledge. A unified computational platform leverages metadata to not only provide direction but also empower researchers to mine a wealth of biomedical information and forge novel mechanistic insights. This review takes the opportunity to present an overview of the cloud-based computational environment, including the functional roles of metadata, the architecture schema of indexing and search, and the practical scenarios of machine learning-supported molecular signature extraction. By introducing several established resources and state-of-the-art workflows, we share with our readers a broadly defined informatics framework to phenotype cardiovascular health and disease.
Affiliation(s)
- Peipei Ping
- From the NIH BD2K Center of Excellence for Biomedical Computing at UCLA (HeartBD2K), Los Angeles, CA (P.P., H.H., J.S.P., W.W.)
- Department of Physiology (P.P., J.S.P.)
- Department of Medicine (P.P.)
- UCLA School of Medicine, Los Angeles, CA; Department of Computer Science, Scalable Analytics Institute, UCLA School of Engineering, Los Angeles, CA (P.P., W.W.)
- Henning Hermjakob
- From the NIH BD2K Center of Excellence for Biomedical Computing at UCLA (HeartBD2K), Los Angeles, CA (P.P., H.H., J.S.P., W.W.)
- Molecular Systems Cluster, European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom (H.H.)
- Jennifer S Polson
- From the NIH BD2K Center of Excellence for Biomedical Computing at UCLA (HeartBD2K), Los Angeles, CA (P.P., H.H., J.S.P., W.W.)
- Department of Physiology (P.P., J.S.P.)
- Panagiotis V Benos
- Departments of Computational & Systems Biology, School of Medicine, University of Pittsburgh, PA (P.V.B.)
- NIH BD2K Center of Excellence for Biomedical Computing at University of Pittsburgh (Center for Causal Discovery), PA (P.V.B.)
- Wei Wang
- From the NIH BD2K Center of Excellence for Biomedical Computing at UCLA (HeartBD2K), Los Angeles, CA (P.P., H.H., J.S.P., W.W.)
- UCLA School of Medicine, Los Angeles, CA; Department of Computer Science, Scalable Analytics Institute, UCLA School of Engineering, Los Angeles, CA (P.P., W.W.)
4
Kimura M, Kulikowski CA, Murray PJ, Ohno-Machado L, Park HA, Haux R, Geissbuhler A. Confluence of Disciplines in Health Informatics: an International Perspective. Methods Inf Med 2018; 50:545-55. [DOI: 10.3414/me11-06-0005] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
Summary Objective: To discuss international aspects as they relate to the convergence of disciplines in health informatics. Method: A group of international experts was invited at a symposium to present and discuss their perspectives on this topic. These have been collated in a single manuscript. Results and Conclusions: Significant challenges, as well as opportunities, appear when cumulating the intrinsic multidisciplinary nature of health informatics interventions with the diversity of contexts at the global level, in particular when considered in the perspective of a confluence, i.e., the mixing of different waters and their merging into a new, stronger entity. Health informatics experts reflect on key issues such as collaborative software development and distributed knowledge sourcing, social media and mobile technologies, the evolutions of the discipline from an historical perspective, as well as examples of challenges for implementing ubiquitous healthcare or for supporting disaster situations when infrastructures get disrupted.
5
Chen T, Li M, He Q, Zou L, Li Y, Chang C, Zhao D, Zhu Y. LiverWiki: a wiki-based database for human liver. BMC Bioinformatics 2017; 18:452. [PMID: 29029599 PMCID: PMC5640914 DOI: 10.1186/s12859-017-1852-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/21/2016] [Accepted: 10/02/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Recent advances in omics technology have produced a large amount of liver-related data. A comprehensive and up-to-date source of liver-related data is needed to allow biologists to access the latest data. However, current liver-related data sources each cover only a specific part of the liver. It is difficult for them to keep pace with the rapid increase of liver-related data available at those data resources. Integrating diverse liver-related data is a critical yet formidable challenge, as it requires sustained human effort. RESULTS We present LiverWiki, a first wiki-based database that integrates liver-related genes, homolog genes, gene expressions in microarray datasets and RNA-Seq datasets, proteins, protein interactions, post-translational modifications, associated pathways, diseases, metabolites identified in the metabolomics datasets, and literatures into an easily accessible and searchable resource for community-driven sharing. LiverWiki houses information in a total of 141,897 content pages, including 19,787 liver-related gene pages, 17,077 homolog gene pages, 50,251 liver-related protein pages, 36,122 gene expression pages, 2067 metabolites identified in the metabolomics datasets, 16,366 disease-related molecules, and 227 liver disease pages. Other than assisting users in searching, browsing, reviewing, refining the contents on LiverWiki, the most important contribution of LiverWiki is to allow the community to create and update biological data of liver in visible and editable tables. This integrates newly produced data with existing knowledge. Implemented in mediawiki, LiverWiki provides powerful extensions to support community contributions. CONCLUSIONS The main goal of LiverWiki is to provide the research community with comprehensive liver-related data, as well as to allow the research community to share their liver-related data flexibly and efficiently. 
It also enables rapid sharing of new discoveries by allowing the discoveries to be integrated and shared immediately, rather than relying on expert curators. The database is available online at http://liverwiki.hupo.org.cn/.
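As a quick consistency check on the page counts reported in the abstract above, the seven category counts can be summed and compared with the stated total. This is only an arithmetic sketch; the dictionary keys are illustrative labels chosen here, not identifiers from LiverWiki itself.

```python
# Page counts as reported in the LiverWiki abstract.
# The key names are illustrative labels, not LiverWiki identifiers.
page_counts = {
    "liver_related_gene_pages": 19_787,
    "homolog_gene_pages": 17_077,
    "liver_related_protein_pages": 50_251,
    "gene_expression_pages": 36_122,
    "metabolite_pages": 2_067,
    "disease_related_molecule_pages": 16_366,
    "liver_disease_pages": 227,
}

reported_total = 141_897  # total number of content pages stated in the abstract

computed_total = sum(page_counts.values())
matches = computed_total == reported_total
print(f"computed total: {computed_total}, matches reported total: {matches}")
```

The sum of the listed categories agrees with the reported total, so the categories appear to partition the content pages.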
Affiliation(s)
- Tao Chen
- Beijing Institute of Life Omics, State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, 33 Life Science Park Rd, Changping District, Beijing, 102206, China
- Mansheng Li
- Beijing Institute of Life Omics, State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, 33 Life Science Park Rd, Changping District, Beijing, 102206, China
- Qiang He
- School of Software and Electrical Engineering, Swinburne University of Technology, Melbourne, Victoria, 3122, Australia
- Lei Zou
- Institute of Computer Science and Technology, Peking University, No.5 Yiheyuan Road Haidian District, Beijing, 100871, China
- Youhuan Li
- Institute of Computer Science and Technology, Peking University, No.5 Yiheyuan Road Haidian District, Beijing, 100871, China
- Cheng Chang
- Beijing Institute of Life Omics, State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, 33 Life Science Park Rd, Changping District, Beijing, 102206, China
- Dongyan Zhao
- Institute of Computer Science and Technology, Peking University, No.5 Yiheyuan Road Haidian District, Beijing, 100871, China
- Yunping Zhu
- Beijing Institute of Life Omics, State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, 33 Life Science Park Rd, Changping District, Beijing, 102206, China.
6
Burgstaller-Muehlbacher S, Waagmeester A, Mitraka E, Turner J, Putman T, Leong J, Naik C, Pavlidis P, Schriml L, Good BM, Su AI. Wikidata as a semantic framework for the Gene Wiki initiative. Database: The Journal of Biological Databases and Curation 2016; 2016:baw015. [PMID: 26989148 PMCID: PMC4795929 DOI: 10.1093/database/baw015] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 11/07/2015] [Accepted: 02/01/2016] [Indexed: 11/14/2022]
Abstract
Open biological data are distributed over many resources making them challenging to integrate, to update and to disseminate quickly. Wikidata is a growing, open community database which can serve this purpose and also provides tight integration with Wikipedia. In order to improve the state of biological data, facilitate data management and dissemination, we imported all human and mouse genes, and all human and mouse proteins into Wikidata. In total, 59,721 human genes and 73,355 mouse genes have been imported from NCBI and 27,306 human proteins and 16,728 mouse proteins have been imported from the Swissprot subset of UniProt. As Wikidata is open and can be edited by anybody, our corpus of imported data serves as the starting point for integration of further data by scientists, the Wikidata community and citizen scientists alike. The first use case for these data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata with the data integrated above. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata are modified. Although Gene Wiki pages are currently only on the English language version of Wikipedia, the multilingual nature of Wikidata allows for usage of the data we imported in all 280 different language Wikipedias. Apart from the Gene Wiki infobox use case, a SPARQL endpoint and exporting functionality to several standard formats (e.g. JSON, XML) enable use of the data by scientists. In summary, we created a fully open and extensible data resource for human and mouse molecular biology and biochemistry data. This resource enriches all the Wikipedias with structured information and serves as a new linking hub for the biological semantic web. Database URL: https://www.wikidata.org/.
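The SPARQL access mentioned in the abstract above can be illustrated with a minimal sketch that only builds a request URL and performs no network call. Everything specific here is an assumption not stated in the abstract: the public query service address `https://query.wikidata.org/sparql`, the property `P351` (Entrez Gene ID), the property `P703` (found in taxon), and the item `Q15978631` (Homo sapiens). The helper name `build_request_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of the public Wikidata SPARQL query service (not given in the abstract).
ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: a few human genes with their Entrez Gene IDs.
# The property and item identifiers below are assumptions for this sketch.
QUERY = """
SELECT ?gene ?entrezId WHERE {
  ?gene wdt:P351 ?entrezId ;      # P351: Entrez Gene ID (assumed)
        wdt:P703 wd:Q15978631 .   # P703: found in taxon; Q15978631: Homo sapiens (assumed)
}
LIMIT 5
"""


def build_request_url(query: str, fmt: str = "json") -> str:
    """Return a GET URL for the query; hypothetical helper, sends nothing."""
    return f"{ENDPOINT}?{urlencode({'query': query, 'format': fmt})}"


url = build_request_url(QUERY)
print(url)
```

The resulting URL could then be fetched with any HTTP client; the sketch stops short of that so it stays runnable offline.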
Affiliation(s)
- Julia Turner
- The Scripps Research Institute, La Jolla, CA, USA
- Tim Putman
- The Scripps Research Institute, La Jolla, CA, USA
- Justin Leong
- The University of British Columbia, Vancouver, British Columbia, Canada
- Chinmay Naik
- Bangalore Inst. Of Technology, Visvesvaraya Technological University, Bangalore, Karnataka
- Paul Pavlidis
- The University of British Columbia, Vancouver, British Columbia, Canada
- Lynn Schriml
- University of Maryland Baltimore, Baltimore, MD, USA
- Andrew I Su
- The Scripps Research Institute, La Jolla, CA, USA
7
Kanterakis A, Kuiper J, Potamias G, Swertz MA. PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols. Source Code for Biology and Medicine 2015; 10:14. [PMID: 26587054 PMCID: PMC4652372 DOI: 10.1186/s13029-015-0042-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/23/2015] [Accepted: 10/20/2015] [Indexed: 11/10/2022]
Abstract
Background Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. Results We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy to learn wiki syntax effectively lowers the barriers to bring expert programmers and less computer savvy researchers on the same page. Conclusions PyPedia demonstrates how wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complement existing resources, useful for local and multi-center research teams. Availability PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License. Electronic supplementary material The online version of this article (doi:10.1186/s13029-015-0042-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alexandros Kanterakis
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Postbus 30 001, Groningen, 9700 RB The Netherlands ; Institute of Computer Science, Foundation for Research and Technology Hellas (FORTH), Nikolaou Plastira 100, Heraklion, 71110 Greece
- Joël Kuiper
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Postbus 30 001, Groningen, 9700 RB The Netherlands
- George Potamias
- Institute of Computer Science, Foundation for Research and Technology Hellas (FORTH), Nikolaou Plastira 100, Heraklion, 71110 Greece
- Morris A Swertz
- University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Postbus 30 001, Groningen, 9700 RB The Netherlands
8
Weiner J, Kaufmann SHE, Maertzdorf J. High-throughput data analysis and data integration for vaccine trials. Vaccine 2015; 33:5249-55. [PMID: 25976544 DOI: 10.1016/j.vaccine.2015.04.096] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 01/20/2015] [Revised: 04/16/2015] [Accepted: 04/27/2015] [Indexed: 12/21/2022]
Abstract
Rational vaccine development can benefit from biomarker studies, which help to predict, optimize and evaluate the immunogenicity of vaccines and ultimately provide surrogate endpoints for vaccine trials. Systems biology approaches facilitate acquisition of both simple biomarkers and complex biosignatures. Yet, evaluation of high-throughput (HT) data requires a plethora of tools for data integration and analysis. In this review, we present an overview of methods for evaluation and integration of large amounts of data collected in vaccine trials from similar and divergent molecular HT techniques, such as transcriptomic, proteomic and metabolic profiling. We will describe a selection of relevant statistical and bioinformatic approaches that are frequently associated with systems biology. We will present data dimension reduction techniques, functional analysis approaches and methods of integrating heterogeneous HT data. Finally, we will provide a few examples of applications of these techniques in vaccine research and development.
Affiliation(s)
- January Weiner
- Department of Immunology, Max Planck Institute for Infection Biology, Charitéplatz 1, D-10117, Berlin, Germany.
- Stefan H E Kaufmann
- Department of Immunology, Max Planck Institute for Infection Biology, Charitéplatz 1, D-10117, Berlin, Germany.
- Jeroen Maertzdorf
- Department of Immunology, Max Planck Institute for Infection Biology, Charitéplatz 1, D-10117, Berlin, Germany
9
Pfundner A, Schönberg T, Horn J, Boyce RD, Samwald M. Utilizing the Wikidata system to improve the quality of medical content in Wikipedia in diverse languages: a pilot study. J Med Internet Res 2015; 17:e110. [PMID: 25944105 PMCID: PMC4468594 DOI: 10.2196/jmir.4163] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 12/21/2014] [Revised: 03/12/2015] [Accepted: 03/14/2015] [Indexed: 11/13/2022]
Abstract
Background Wikipedia is an important source of medical information for both patients and medical professionals. Given its wide reach, improving the quality, completeness, and accessibility of medical information on Wikipedia could have a positive impact on global health. Objective We created a prototypical implementation of an automated system for keeping drug-drug interaction (DDI) information in Wikipedia up to date with current evidence about clinically significant drug interactions. Our work is based on Wikidata, a novel, graph-based database backend of Wikipedia currently in development. Methods We set up an automated process for integrating data from the Office of the National Coordinator for Health Information Technology (ONC) high priority DDI list into Wikidata. We set up exemplary implementations demonstrating how the DDI data we introduced into Wikidata could be displayed in Wikipedia articles in diverse languages. Finally, we conducted a pilot analysis to explore if adding the ONC high priority data would substantially enhance the information currently available on Wikipedia. Results We derived 1150 unique interactions from the ONC high priority list. Integration of the potential DDI data from Wikidata into Wikipedia articles proved to be straightforward and yielded useful results. We found that even though the majority of current English Wikipedia articles about pharmaceuticals contained sections detailing contraindications, only a small fraction of articles explicitly mentioned interaction partners from the ONC high priority list. For 91.30% (1050/1150) of the interaction pairs we tested, none of the 2 articles corresponding to the interacting substances explicitly mentioned the interaction partner. For 7.21% (83/1150) of the pairs, only 1 of the 2 associated Wikipedia articles mentioned the interaction partner; for only 1.48% (17/1150) of the pairs, both articles contained explicit mentions of the interaction partner. 
Conclusions Our prototype demonstrated that automated updating of medical content in Wikipedia through Wikidata is a viable option, albeit further refinements and community-wide consensus building are required before integration into public Wikipedia is possible. A long-term endeavor to improve the medical information in Wikipedia through structured data representation and automated workflows might lead to a significant improvement of the quality of medical information in one of the world’s most popular Web resources.
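The three proportions reported in the Results above can be recomputed directly from the stated counts. The sketch below does only that arithmetic; the dictionary labels are illustrative wording chosen here, not terms from the study.

```python
# Counts of interaction pairs as reported in the abstract (out of 1150 pairs).
total_pairs = 1150
mention_counts = {
    "neither article mentions the partner": 1050,
    "one article mentions the partner": 83,
    "both articles mention the partner": 17,
}

# The three groups should account for every tested pair.
counts_sum = sum(mention_counts.values())

# Share of each group as a percentage of all tested pairs.
shares = {label: 100 * n / total_pairs for label, n in mention_counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct:.2f}%")
print(f"groups cover all pairs: {counts_sum == total_pairs}")
```

A recomputed value can differ from a published one in the last digit, depending on whether the figure was rounded or truncated.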
Affiliation(s)
- Alexander Pfundner
- Section for Medical Expert and Knowledge-Based Systems, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria
10
Good BM, Ainscough BJ, McMichael JF, Su AI, Griffith OL. Organizing knowledge to enable personalization of medicine in cancer. Genome Biol 2014; 15:438. [PMID: 25222080 PMCID: PMC4281950 DOI: 10.1186/s13059-014-0438-7] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 12/21/2022]
Abstract
Interpretation of the clinical significance of genomic alterations remains the most severe bottleneck preventing the realization of personalized medicine in cancer. We propose a knowledge commons to facilitate collaborative contributions and open discussion of clinical decision-making based on genomic events in cancer.
11
Abstract
The mission of the Universal Protein Resource (UniProt) (http://www.uniprot.org) is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequences and functional annotation. It integrates, interprets and standardizes data from literature and numerous resources to achieve the most comprehensive catalog possible of protein information. The central activities are the biocuration of the UniProt Knowledgebase and the dissemination of these data through our Web site and web services. UniProt is produced by the UniProt Consortium, which consists of groups from the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). UniProt is updated and distributed every 4 weeks and can be accessed online for searches or downloads.
Affiliation(s)
- The UniProt Consortium
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland, Protein Information Resource, Georgetown University Medical Center, 3300 Whitehaven Street North West, Suite 1200, Washington, DC 20007, USA and Protein Information Resource, University of Delaware, 15 Innovation Way, Suite 205, Newark, DE 19711, USA
12
Matsuoka Y, Funahashi A, Ghosh S, Kitano H. Modeling and simulation using CellDesigner. Methods Mol Biol 2014; 1164:121-45. [PMID: 24927840 DOI: 10.1007/978-1-4939-0805-9_11] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Indexed: 12/15/2022]
Abstract
In silico modeling and simulation are effective means to understand how the regulatory systems function in life. In this chapter, we explain how to build a model and run the simulation using CellDesigner, adopting the standards such as SBML and SBGN.
Affiliation(s)
- Yukiko Matsuoka
- The Systems Biology Institute, 5-6-9 Shirokanedai, Minato-ku, Tokyo, 108-0071, Japan
13
Brinkley JF, Borromeo C, Clarkson M, Cox TC, Cunningham MJ, Detwiler LT, Heike CL, Hochheiser H, Mejino JLV, Travillian RS, Shapiro LG. The ontology of craniofacial development and malformation for translational craniofacial research. American Journal of Medical Genetics Part C: Seminars in Medical Genetics 2013; 163C:232-45. [PMID: 24124010 DOI: 10.1002/ajmg.c.31377] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 01/04/2023]
Abstract
We introduce the Ontology of Craniofacial Development and Malformation (OCDM) as a mechanism for representing knowledge about craniofacial development and malformation, and for using that knowledge to facilitate integrating craniofacial data obtained via multiple techniques from multiple labs and at multiple levels of granularity. The OCDM is a project of the NIDCR-sponsored FaceBase Consortium, whose goal is to promote and enable research into the genetic and epigenetic causes of specific craniofacial abnormalities through the provision of publicly accessible, integrated craniofacial data. However, the OCDM should be usable for integrating any web-accessible craniofacial data, not just those data available through FaceBase. The OCDM is based on the Foundational Model of Anatomy (FMA), our comprehensive ontology of canonical human adult anatomy, and includes modules to represent adult and developmental craniofacial anatomy in both human and mouse, mappings between homologous structures in human and mouse, and associated malformations. We describe these modules, as well as prototype uses of the OCDM for integrating craniofacial data. By using the terms from the OCDM to annotate data, and by combining queries over the ontology with those over annotated data, it becomes possible to create "intelligent" queries that can, for example, find gene expression data obtained from mouse structures that are precursors to homologous human structures involved in malformations such as cleft lip. We suggest that the OCDM can be useful not only for integrating craniofacial data, but also for expressing new knowledge gained from analyzing the integrated data.
14
Loguercio S, Good BM, Su AI. Dizeez: an online game for human gene-disease annotation. PLoS One 2013; 8:e71171. [PMID: 23951102 PMCID: PMC3737187 DOI: 10.1371/journal.pone.0071171] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 01/07/2013] [Accepted: 07/01/2013] [Indexed: 12/29/2022]
Abstract
Structured gene annotations are a foundation upon which many bioinformatics and statistical analyses are built. However, the structured annotations available in public databases are a sparse representation of biological knowledge as a whole. The rate of biomedical data generation is such that centralized biocuration efforts struggle to keep up. New models for gene annotation need to be explored that increase the pace at which we are able to structure biomedical knowledge. Recently, online games have emerged as an effective way to recruit, engage and organize large numbers of volunteers to help address difficult biological challenges. For example, games have been successfully developed for protein folding (Foldit), multiple sequence alignment (Phylo) and RNA structure design (EteRNA). Here we present Dizeez, a simple online game built with the purpose of structuring knowledge of gene-disease associations. Preliminary results from game play online and at scientific conferences suggest that Dizeez is producing valid gene-disease annotations not yet present in any public database. These early results provide a basic proof of principle that online games can be successfully applied to the challenge of gene annotation. Dizeez is available at http://genegames.org.
Affiliation(s)
- Salvatore Loguercio
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, California, United States of America
- Benjamin M. Good
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, California, United States of America
- Andrew I. Su
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, California, United States of America
|
15
|
García Godoy MJ, López-Camacho E, Navas-Delgado I, Aldana-Montes JF. Sharing and executing linked data queries in a collaborative environment. Bioinformatics 2013; 29:1663-70. [PMID: 23620361 DOI: 10.1093/bioinformatics/btt192] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022]
Abstract
MOTIVATION Life Sciences have emerged as a key domain in the Linked Data community because of the diversity of data semantics and formats available through a great variety of databases and web technologies. Thus, they have served as an ideal domain for applications in the web of data. Unfortunately, bioinformaticians are not exploiting the full potential of this already available technology, and experts in Life Sciences have real difficulty discovering, understanding and devising how to take advantage of these interlinked (integrated) data. RESULTS In this article, we present Bioqueries, a wiki-based portal that is aimed at community building around biological Linked Data. This tool has been designed to aid bioinformaticians in developing SPARQL queries to access biological databases exposed as Linked Data, and also to help biologists gain a deeper insight into the potential use of this technology. This public space offers several services and a collaborative infrastructure to stimulate the consumption of biological Linked Data and, therefore, contribute to implementing the benefits of the web of data in this domain. Bioqueries currently contains 215 query entries grouped by database and theme, 230 registered users and 44 end points that contain biological Resource Description Framework information. AVAILABILITY The Bioqueries portal is freely accessible at http://bioqueries.uma.es. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- María Jesús García Godoy
- Lenguajes y Ciencias de la Computación, Universidad de Málaga, Bulevar Louis Pasteur 35, Málaga, Spain
|
16
|
Garcia Castro LJ, McLaughlin C, Garcia A. Biotea: RDFizing PubMed Central in support for the paper as an interface to the Web of Data. J Biomed Semantics 2013; 4 Suppl 1:S5. [PMID: 23734622 PMCID: PMC3804025 DOI: 10.1186/2041-1480-4-s1-s5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
Abstract
BACKGROUND The World Wide Web has become a dissemination platform for scientific and non-scientific publications. However, most of the information remains locked up in discrete documents that are not always interconnected or machine-readable. The connectivity tissue provided by RDF technology has not yet been widely used to support the generation of self-describing, machine-readable documents. RESULTS In this paper, we present our approach to the generation of self-describing machine-readable scholarly documents. We understand the scientific document as an entry point and interface to the Web of Data. We have semantically processed the full-text, open-access subset of PubMed Central. Our RDF model and resulting dataset make extensive use of existing ontologies and semantic enrichment services. We expose our model, services, prototype, and datasets at http://biotea.idiginfo.org/. CONCLUSIONS The semantic processing of biomedical literature presented in this paper embeds documents within the Web of Data and facilitates the execution of concept-based queries against the entire digital library. Our approach delivers a flexible and adaptable set of tools for metadata enrichment and semantic processing of biomedical documents. Our model delivers a semantically rich and highly interconnected dataset with self-describing content so that software can make effective use of it.
Affiliation(s)
- L Jael Garcia Castro
- Temporal Knowledge Bases Group, Department of Computer Languages and Systems, Universitat Jaume I, Castelló de la Plana, Valencia, 12071, Spain
- C McLaughlin
- Institute for Digital Information and Scientific Communication, College of Communication and Information, Florida State University, Tallahassee, Florida, 32306-2651, USA
- A Garcia
- Institute for Digital Information and Scientific Communication, College of Communication and Information, Florida State University, Tallahassee, Florida, 32306-2651, USA
|
17
|
Kiefer RC, Freimuth RR, Chute CG, Pathak J. Mining Genotype-Phenotype Associations from Public Knowledge Sources via Semantic Web Querying. AMIA Jt Summits Transl Sci Proc 2013; 2013:118-22. [PMID: 24303249 PMCID: PMC3845769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
Gene Wiki Plus (GeneWiki+) and the Online Mendelian Inheritance in Man (OMIM) are publicly available resources for sharing information about disease-gene and gene-SNP associations in humans. While immensely useful to the scientific community, both resources are manually curated, thereby making the data entry and publication process time-consuming, and to some degree, error-prone. To this end, this study investigates Semantic Web technologies to validate existing and potentially discover new genotype-phenotype associations in GeneWiki+ and OMIM. In particular, we demonstrate the applicability of SPARQL queries for identifying associations not explicitly stated for commonly occurring chronic diseases in GeneWiki+ and OMIM, and report our preliminary findings for coverage, completeness, and validity of the associations. Our results highlight the benefits of Semantic Web querying technology to validate existing disease-gene associations as well as identify novel associations, although further evaluation and analysis is required before such information can be applied and used effectively.
|
18
|
Brochu C, Cabrita MA, Melanson BD, Hamill JD, Lau R, Pratt MAC, McKay BC. NF-κB-dependent role for cold-inducible RNA binding protein in regulating interleukin 1β. PLoS One 2013; 8:e57426. [PMID: 23437386 PMCID: PMC3578848 DOI: 10.1371/journal.pone.0057426] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 08/30/2012] [Accepted: 01/21/2013] [Indexed: 12/31/2022]
Abstract
The cold-inducible RNA binding protein (CIRBP) responds to a wide array of cellular stresses, including short wavelength ultraviolet light (UVC), at the transcriptional and post-translational level. CIRBP can bind the 3' untranslated region of specific transcripts to stabilize them and facilitate their transport to ribosomes for translation. Here we used RNA interference and oligonucleotide microarrays to identify potential downstream targets of CIRBP induced in response to UVC. Twenty-eight transcripts were statistically increased in response to UVC and these exhibited a typical UVC response. Only 5 of the 28 UVC-induced transcripts exhibited a CIRBP-dependent pattern of expression. Surprisingly, 3 of the 5 transcripts (IL1B, IL8 and TNFAIP6) encoded proteins important in inflammation, with IL-1β apparently contributing to IL8 and TNFAIP6 expression in an autocrine fashion. UVC-induced IL1B expression could be inhibited by pharmacological inhibition of NF-κB, suggesting that CIRBP was affecting NF-κB signaling as opposed to IL1B mRNA stability directly. Bacterial lipopolysaccharide (LPS) was used as an activator of NF-κB to further study the potential link between CIRBP and NF-κB. Transfection of siRNAs against CIRBP reduced the extent of the LPS-induced phosphorylation of IκBα, NF-κB DNA binding activity and IL-1β expression. The present work firmly establishes a novel link between CIRBP and NF-κB signaling in response to agents with diverse modes of action. These results have potential implications for disease states associated with inflammation.
Affiliation(s)
- Christian Brochu
- Cancer Therapeutics Program, Ottawa Hospital Research Institute, Ottawa, Canada
- Miguel A. Cabrita
- Cancer Therapeutics Program, Ottawa Hospital Research Institute, Ottawa, Canada
- Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, Canada
- Brian D. Melanson
- Cancer Therapeutics Program, Ottawa Hospital Research Institute, Ottawa, Canada
- Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, Canada
- Jeffrey D. Hamill
- Cancer Therapeutics Program, Ottawa Hospital Research Institute, Ottawa, Canada
- Rosanna Lau
- Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, Canada
- Bruce C. McKay
- Cancer Therapeutics Program, Ottawa Hospital Research Institute, Ottawa, Canada
- Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, Canada
- Department of Biology, Carleton University, Ottawa, Canada
|
19
|
Pizarro A, Hayer K, Lahens NF, Hogenesch JB. CircaDB: a database of mammalian circadian gene expression profiles. Nucleic Acids Res 2012. [PMID: 23180795 PMCID: PMC3531170 DOI: 10.1093/nar/gks1161] [Citation(s) in RCA: 248] [Impact Index Per Article: 19.1] [Indexed: 12/23/2022]
Abstract
CircaDB (http://circadb.org) is a new database of circadian transcriptional profiles from time course expression experiments from mice and humans. Each transcript's expression was evaluated by three separate algorithms: JTK_Cycle, Lomb-Scargle and DeLichtenberg. Users can query the gene annotations using simple and powerful full text search terms, restrict results to specific data sets and provide probability thresholds for each algorithm. The data are visualized as intuitive charts that convey profile information more effectively than a table of probabilities. The CircaDB web application is open source and available at http://github.com/itmat/circadb.
Affiliation(s)
- Angel Pizarro
- The Institute for Translational Medicine and Therapeutics, University of Pennsylvania, 3400 Civic Center Boulevard, Building 421, Philadelphia, PA 19104, USA.
|
20
|
Wu C, Macleod I, Su AI. BioGPS and MyGene.info: organizing online, gene-centric information. Nucleic Acids Res 2012; 41:D561-5. [PMID: 23175613 PMCID: PMC3531157 DOI: 10.1093/nar/gks1114] [Citation(s) in RCA: 254] [Impact Index Per Article: 19.5] [Indexed: 12/15/2022]
Abstract
Fast-evolving technologies have enabled researchers to easily generate data at genome scale, and using these technologies to compare biological states typically results in a list of candidate genes. Researchers are then faced with the daunting task of prioritizing these candidate genes for follow-up studies. There are hundreds, possibly even thousands, of web-based gene annotation resources available, but it quickly becomes impractical to manually access and review all of these sites for each gene in a candidate gene list. BioGPS (http://biogps.org) was created as a centralized gene portal for aggregating distributed gene annotation resources, emphasizing community extensibility and user customizability. BioGPS serves as a convenient tool for users to access known gene-centric resources, as well as a mechanism to discover new resources that were previously unknown to the user. This article describes updates to BioGPS made after its initial release in 2008. We summarize recent additions of features and data, as well as the robust user activity that underlies this community intelligence application. Finally, we describe MyGene.info (http://mygene.info) and related web services that provide programmatic access to BioGPS.
Affiliation(s)
- Chunlei Wu
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA.
|
21
|
Sanseverino W, Hermoso A, D'Alessandro R, Vlasova A, Andolfo G, Frusciante L, Lowy E, Roma G, Ercolano MR. PRGdb 2.0: towards a community-based database model for the analysis of R-genes in plants. Nucleic Acids Res 2012; 41:D1167-71. [PMID: 23161682 PMCID: PMC3531111 DOI: 10.1093/nar/gks1183] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Indexed: 12/17/2022]
Abstract
The Plant Resistance Genes database (PRGdb; http://prgdb.org) is a comprehensive resource on resistance genes (R-genes), a major class of genes in plant genomes that convey disease resistance against pathogens. Initiated in 2009, the database has grown more than 6-fold and now includes annotation derived from recent plant genome sequencing projects. Release 2.0 currently hosts useful biological information on a set of 112 known and 104,310 putative R-genes present in 233 plant species and conferring resistance to 122 different pathogens. Moreover, the website has been completely redesigned with the implementation of Semantic MediaWiki technologies, which makes our repository freely accessible and easily editable by any scientist. To this end, we encourage expert plant biologists to join our annotation effort and share their knowledge on resistance-gene biology with the rest of the scientific community.
Affiliation(s)
- Walter Sanseverino
- Department of Soil, Plant, Environmental and Animal Production Sciences, University of Naples "Federico II", Via Università 100, 80055 Portici, Italy
|
22
|
Yusuf D, Butland SL, Swanson MI, Bolotin E, Ticoll A, Cheung WA, Zhang XYC, Dickman CTD, Fulton DL, Lim JS, Schnabl JM, Ramos OHP, Vasseur-Cognet M, de Leeuw CN, Simpson EM, Ryffel GU, Lam EWF, Kist R, Wilson MSC, Marco-Ferreres R, Brosens JJ, Beccari LL, Bovolenta P, Benayoun BA, Monteiro LJ, Schwenen HDC, Grontved L, Wederell E, Mandrup S, Veitia RA, Chakravarthy H, Hoodless PA, Mancarelli MM, Torbett BE, Banham AH, Reddy SP, Cullum RL, Liedtke M, Tschan MP, Vaz M, Rizzino A, Zannini M, Frietze S, Farnham PJ, Eijkelenboom A, Brown PJ, Laperrière D, Leprince D, de Cristofaro T, Prince KL, Putker M, del Peso L, Camenisch G, Wenger RH, Mikula M, Rozendaal M, Mader S, Ostrowski J, Rhodes SJ, Van Rechem C, Boulay G, Olechnowicz SWZ, Breslin MB, Lan MS, Nanan KK, Wegner M, Hou J, Mullen RD, Colvin SC, Noy PJ, Webb CF, Witek ME, Ferrell S, Daniel JM, Park J, Waldman SA, Peet DJ, Taggart M, Jayaraman PS, Karrich JJ, Blom B, Vesuna F, O'Geen H, Sun Y, Gronostajski RM, Woodcroft MW, Hough MR, Chen E, Europe-Finner GN, Karolczak-Bayatti M, Bailey J, Hankinson O, Raman V, LeBrun DP, Biswal S, Harvey CJ, DeBruyne JP, Hogenesch JB, Hevner RF, Héligon C, Luo XM, Blank MC, Millen KJ, Sharlin DS, Forrest D, Dahlman-Wright K, Zhao C, Mishima Y, Sinha S, Chakrabarti R, Portales-Casamar E, Sladek FM, Bradley PH, Wasserman WW. The transcription factor encyclopedia. Genome Biol 2012; 13:R24. [PMID: 22458515 PMCID: PMC3439975 DOI: 10.1186/gb-2012-13-3-r24] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Received: 02/02/2012] [Revised: 03/19/2012] [Accepted: 03/29/2012] [Indexed: 12/20/2022]
Abstract
Here we present the Transcription Factor Encyclopedia (TFe), a new web-based compendium of mini review articles on transcription factors (TFs) that is founded on the principles of open access and collaboration. Our consortium of over 100 researchers has collectively contributed over 130 mini review articles on pertinent human, mouse and rat TFs. Notable features of the TFe website include a high-quality PDF generator and web API for programmatic data retrieval. TFe aims to rapidly educate scientists about the TFs they encounter through the delivery of succinct summaries written and vetted by experts in the field. TFe is available at http://www.cisreg.ca/tfe.
Affiliation(s)
- Dimas Yusuf
- Department of Medical Genetics, Faculty of Medicine, Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, Canada
|
23
|
Glusman G, Cariaso M, Jimenez R, Swan D, Greshake B, Bhak J, Logan DW, Corpas M. Low budget analysis of Direct-To-Consumer genomic testing familial data. F1000Res 2012; 1:3. [PMID: 24627758 PMCID: PMC3941016 DOI: 10.12688/f1000research.1-3.v1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Accepted: 07/05/2012] [Indexed: 11/21/2022]
Abstract
Direct-to-consumer (DTC) genetic testing is a recent commercial endeavor that allows the general public to access personal genomic data. The growing availability of personal genomic data has in turn stimulated the development of non-commercial tools for DTC data analysis. Despite this new wealth of public resources, no systematic research has been carried out to assess these tools for interpretation of DTC data. Here, we provide an initial analysis benchmark in the context of a whole family, using single nucleotide polymorphism (SNP) data. Five blood-related DTC SNP chip data tests were analyzed in conjunction with one whole exome sequence. We report findings related to genomic similarity between individuals, genetic risks and an overall assessment of data quality; thus providing an evaluation of the current potential of public domain analysis tools for personal genomics. We envisage that as the use of personal genome tests spreads to the general population, publicly available tools will have a more prominent role in the interpretation of genomic data in the context of health risks and ancestry.
Affiliation(s)
- Gustavo Glusman
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109-5234, USA
- Rafael Jimenez
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Daniel Swan
- Oxford Gene Technology, Begbroke Science Park, Begbroke, Oxfordshire, OX5 1PF, UK
- Bastian Greshake
- Molecular Ecology Group, Biodiversity and Climate Research Centre, Frankfurt am Main, Senckenberganlage 25, D-60325, Germany
- Jong Bhak
- Theragen BiO Institute, TheragenEtex Inc, AICT building, Lui-dong, Youngtong-gu, Suwon 443-370, South Korea
- Darren W Logan
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Manuel Corpas
- The Genome Analysis Centre, Norwich Research Park, Norwich, NR4 7UH, UK
|
24
|
TP Atlas: integration and dissemination of advances in Targeted Proteins Research Program (TPRP)-structural biology project phase II in Japan. J Struct Funct Genomics 2012; 13:145-54. [PMID: 22644393 PMCID: PMC3414706 DOI: 10.1007/s10969-012-9139-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/13/2011] [Accepted: 05/12/2012] [Indexed: 10/29/2022]
Abstract
The Targeted Proteins Research Program (TPRP), promoted by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan, is phase II of the structural biology project (2007-2011), following the Protein 3000 Project (2002-2006) in Japan. While the phase I Protein 3000 Project put partial emphasis on the construction and maintenance of pipelines for structural analyses, the TPRP is dedicated to revealing the structures and functions of targeted proteins that are of great importance in both basic research and industrial applications. To pursue this objective, 35 Targeted Proteins (TP) Projects selected in the three areas of fundamental biology, medicine and pharmacology, and food and environment collaborate closely with 10 Advanced Technology (AT) Projects in the four fields of protein production, structural analyses, chemical library and screening, and information platform. Here, the outlines and achievements of the 35 TP Projects are summarized in the system named TP Atlas. Progress in the diversified areas is described in the modules of Graphical Summary, General Summary, Tabular Summary, and Structure Gallery of the TP Atlas in a standard and unified format. Advances in TP Projects owing to novel technologies stemming from AT Projects and to collaborative research among TP Projects are illustrated as a hallmark of the Program. The TP Atlas can be accessed at http://net.genes.nig.ac.jp/tpatlas/index_e.html.
|
25
|
Good BM, Clarke EL, Loguercio S, Su AI. Linking genes to diseases with a SNPedia-Gene Wiki mashup. J Biomed Semantics 2012; 3 Suppl 1:S6. [PMID: 22541597 PMCID: PMC3337266 DOI: 10.1186/2041-1480-3-s1-s6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/24/2022]
Abstract
BACKGROUND A variety of topic-focused wikis are used in the biomedical sciences to enable the mass-collaborative synthesis and distribution of diverse bodies of knowledge. To address complex problems such as defining the relationships between genes and disease, it is important to bring the knowledge from many different domains together. Here we show how advances in wiki technology and natural language processing can be used to automatically assemble 'meta-wikis' that present integrated views over the data collaboratively created in multiple source wikis. RESULTS We produced a semantic meta-wiki called the Gene Wiki+ that automatically mirrors and integrates data from the Gene Wiki and SNPedia. The Gene Wiki+, available at http://genewikiplus.org/, captures 8,047 distinct gene-disease relationships. SNPedia accounts for 4,149 of the gene-disease pairs, the Gene Wiki provides 4,377 and only 479 appear independently in both sources. All of this content is available to query and browse and is provided as linked open data. CONCLUSIONS Wikis contain increasing amounts of diverse, biological information useful for elucidating the connections between genes and disease. The Gene Wiki+ shows how wiki technology can be used in concert with natural language processing to provide integrated views over diverse underlying data sources.
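The three counts reported in this abstract can be cross-checked with inclusion-exclusion: pairs found in either source equal the SNPedia pairs plus the Gene Wiki pairs minus the pairs found in both. The sketch below is only an arithmetic check of the published figures; the variable and function names are illustrative and do not come from the paper.

```python
# Counts as reported in the abstract (names are illustrative, not from the paper).
snpedia_pairs = 4149      # gene-disease pairs contributed by SNPedia
gene_wiki_pairs = 4377    # gene-disease pairs contributed by the Gene Wiki
shared_pairs = 479        # pairs that appear independently in both sources


def union_size(a: int, b: int, both: int) -> int:
    """Size of the union of two sets, by inclusion-exclusion."""
    return a + b - both


distinct_pairs = union_size(snpedia_pairs, gene_wiki_pairs, shared_pairs)
overlap_fraction = shared_pairs / distinct_pairs

print(distinct_pairs)              # 8047, matching the reported total
print(round(overlap_fraction, 3))  # 0.06, i.e. about 6% of pairs are shared
```

The result agrees with the 8,047 distinct relationships stated in the abstract, and the small overlap (about 6%) is consistent with the authors' point that the two wikis contribute largely complementary content.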
Affiliation(s)
- Benjamin M Good
- The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA.
|
26
|
Krallinger M, Leitner F, Vazquez M, Salgado D, Marcelle C, Tyers M, Valencia A, Chatr-aryamontri A. How to link ontologies and protein-protein interactions to literature: text-mining approaches and the BioCreative experience. Database (Oxford) 2012; 2012:bas017. [PMID: 22438567 PMCID: PMC3309177 DOI: 10.1093/database/bas017] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 11/14/2022]
Abstract
There is an increasing interest in developing ontologies and controlled vocabularies to improve the efficiency and consistency of manual literature curation, to enable more formal biocuration workflow results and ultimately to improve analysis of biological data. Two ontologies that have been successfully used for this purpose are the Gene Ontology (GO) for annotating aspects of gene products and the Molecular Interaction ontology (PSI-MI) used by databases that archive protein–protein interactions. The examination of protein interactions has proven to be extremely promising for the understanding of cellular processes. Manual mapping of information from the biomedical literature to bio-ontology terms is one of the most challenging components in the curation pipeline. It requires that expert curators interpret the natural language descriptions contained in articles and infer their semantic equivalents in the ontology (controlled vocabulary). Since manual curation is a time-consuming process, there is strong motivation to implement text-mining techniques to automatically extract annotations from free text. A range of text mining strategies has been devised to assist in the automated extraction of biological data. These strategies either recognize technical terms used recurrently in the literature and propose them as candidates for inclusion in ontologies, or retrieve passages that serve as evidential support for annotating an ontology term, e.g. from the PSI-MI or GO controlled vocabularies. Here, we provide a general overview of current text-mining methods to automatically extract annotations of GO and PSI-MI ontology terms in the context of the BioCreative (Critical Assessment of Information Extraction Systems in Biology) challenge. Special emphasis is given to protein–protein interaction data and PSI-MI terms referring to interaction detection methods.
Affiliation(s)
- Martin Krallinger
- Structural and Computational Biology Group, Spanish National Cancer Research Centre (CNIO), Spain
|
27
|
Good BM, Clarke EL, Loguercio S, Su AI. Building a biomedical semantic network in Wikipedia with Semantic Wiki Links. Database (Oxford) 2012; 2012:bar060. [PMID: 22434829 PMCID: PMC3308151 DOI: 10.1093/database/bar060] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
Wikipedia is increasingly used as a platform for collaborative data curation, but its current technical implementation has significant limitations that hinder its use in biocuration applications. Specifically, while editors can easily link between two articles in Wikipedia to indicate a relationship, there is no way to indicate the nature of that relationship in a way that is computationally accessible to the system or to external developers. For example, in addition to noting a relationship between a gene and a disease, it would be useful to differentiate the cases where genetic mutation or altered expression causes the disease. Here, we introduce a straightforward method that allows Wikipedia editors to embed computable semantic relations directly in the context of current Wikipedia articles. In addition, we demonstrate two novel applications enabled by the presence of these new relationships. The first is a dynamically generated information box that can be rendered on all semantically enhanced Wikipedia articles. The second is a prototype gene annotation system that draws its content from the gene-centric articles on Wikipedia and exposes the new semantic relationships to enable previously impossible, user-defined queries. DATABASE URL: http://en.wikipedia.org/wiki/Portal:Gene_Wiki.
|
28
|
Capriotti E, Nehrt NL, Kann MG, Bromberg Y. Bioinformatics for personal genome interpretation. Brief Bioinform 2012; 13:495-512. [PMID: 22247263 DOI: 10.1093/bib/bbr070] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 01/02/2023]
Abstract
An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of this data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field: the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.
Affiliation(s)
- Emidio Capriotti
- Department of Mathematics and Computer Science, University of Balearic Islands, ctra. de Valldemossa Km 7.5, Palma de Mallorca, 07122 Spain.
|
29
|
Bergemann TL, Starr TK, Yu H, Steinbach M, Erdmann J, Chen Y, Cormier RT, Largaespada DA, Silverstein KAT. New methods for finding common insertion sites and co-occurring common insertion sites in transposon- and virus-based genetic screens. Nucleic Acids Res 2012; 40:3822-33. [PMID: 22241771 PMCID: PMC3351147 DOI: 10.1093/nar/gkr1295] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 01/11/2023]
Abstract
Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
Affiliation(s)
- Tracy L Bergemann
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA.
|
30
|
Good BM, Howe DG, Lin SM, Kibbe WA, Su AI. Mining the Gene Wiki for functional genomic knowledge. BMC Genomics 2011; 12:603. [PMID: 22165947 PMCID: PMC3271090 DOI: 10.1186/1471-2164-12-603] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 07/06/2011] [Accepted: 12/13/2011] [Indexed: 11/26/2022] Open
Abstract
Background Ontology-based gene annotations are important tools for organizing and analyzing genome-scale biological data. Collecting these annotations is a valuable but costly endeavor. The Gene Wiki makes use of Wikipedia as a low-cost, mass-collaborative platform for assembling text-based gene annotations. The Gene Wiki is comprised of more than 10,000 review articles, each describing one human gene. The goal of this study is to define and assess a computational strategy for translating the text of Gene Wiki articles into ontology-based gene annotations. We specifically explore the generation of structured annotations using the Gene Ontology and the Human Disease Ontology. Results Our system produced 2,983 candidate gene annotations using the Disease Ontology and 11,022 candidate annotations using the Gene Ontology from the text of the Gene Wiki. Based on manual evaluations and comparisons to reference annotation sets, we estimate a precision of 90-93% for the Disease Ontology annotations and 48-64% for the Gene Ontology annotations. We further demonstrate that this data set can systematically improve the results from gene set enrichment analyses. Conclusions The Gene Wiki is a rapidly growing corpus of text focused on human gene function. Here, we demonstrate that the Gene Wiki can be a powerful resource for generating ontology-based gene annotations. These annotations can be used immediately to improve workflows for building curated gene annotation databases and knowledge-based statistical analyses.
Affiliation(s)
- Benjamin M Good
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
|
31
|
Finn RD, Gardner PP, Bateman A. Making your database available through Wikipedia: the pros and cons. Nucleic Acids Res 2011; 40:D9-12. [PMID: 22144683 PMCID: PMC3245093 DOI: 10.1093/nar/gkr1195] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022] Open
Abstract
Wikipedia, the online encyclopedia, is the most famous wiki in use today. It contains over 3.7 million pages of content; with many pages written on scientific subject matters that include peer-reviewed citations, yet are written in an accessible manner and generally reflect the consensus opinion of the community. In this, the 19th Annual Database Issue of Nucleic Acids Research, there are 11 articles that describe the use of a wiki in relation to a biological database. In this commentary, we discuss how biological databases can be integrated with Wikipedia, thereby utilising the pre-existing infrastructure, tools and above all, large community of authors (or Wikipedians). The limitations to the content that can be included in Wikipedia are highlighted, with examples drawn from articles found in this issue and other wiki-based resources, indicating why other wiki solutions are necessary. We discuss the merits of using open wikis, like Wikipedia, versus other models, with particular reference to potential vandalism. Finally, we raise the question about the future role of dedicated database biocurators in context of the thousands of crowdsourced, community annotations that are now being stored in wikis.
Affiliation(s)
- Robert D Finn
- HHMI Janelia Farm Research Campus, 19700 Helix Drive, Ashburn, VA, USA.
|
32
|
Renfro DP, McIntosh BK, Venkatraman A, Siegele DA, Hu JC. GONUTS: the Gene Ontology Normal Usage Tracking System. Nucleic Acids Res 2011; 40:D1262-9. [PMID: 22110029 PMCID: PMC3245169 DOI: 10.1093/nar/gkr907] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 01/23/2023] Open
Abstract
The Gene Ontology Normal Usage Tracking System (GONUTS) is a community-based browser and usage guide for Gene Ontology (GO) terms and a community system for general GO annotation of proteins. GONUTS uses wiki technology to allow registered users to share and edit notes on the use of each term in GO, and to contribute annotations for specific genes of interest. By providing a site for generation of third-party documentation at the granularity of individual terms, GONUTS complements the official documentation of the Gene Ontology Consortium. To provide examples for community users, GONUTS displays the complete GO annotations from seven model organisms: Saccharomyces cerevisiae, Dictyostelium discoideum, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus and Arabidopsis thaliana. To support community annotation, GONUTS allows automated creation of gene pages for gene products in UniProt. GONUTS will improve the consistency of annotation efforts across genome projects, and should be useful in training new annotators and consumers in the production of GO annotations and the use of GO terms. GONUTS can be accessed at http://gowiki.tamu.edu. The source code for generating the content of GONUTS is available upon request.
Affiliation(s)
- Daniel P Renfro
- Department of Biochemistry and Biophysics, Texas A&M University and Texas Agrilife Research, College Station, TX 77843-3258, USA
|
33
|
Kelder T, van Iersel MP, Hanspers K, Kutmon M, Conklin BR, Evelo CT, Pico AR. WikiPathways: building research communities on biological pathways. Nucleic Acids Res 2011; 40:D1301-7. [PMID: 22096230 PMCID: PMC3245032 DOI: 10.1093/nar/gkr1074] [Citation(s) in RCA: 372] [Impact Index Per Article: 26.6] [Indexed: 12/24/2022] Open
Abstract
Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further.
Affiliation(s)
- Thomas Kelder
- Department of Bioinformatics-BiGCaT, Maastricht University, Maastricht, The Netherlands.
|
34
|
Schriml LM, Arze C, Nadendla S, Chang YWW, Mazaitis M, Felix V, Feng G, Kibbe WA. Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Res 2011; 40:D940-6. [PMID: 22080554 PMCID: PMC3245088 DOI: 10.1093/nar/gkr972] [Citation(s) in RCA: 521] [Impact Index Per Article: 37.2] [Indexed: 12/11/2022] Open
Abstract
The Disease Ontology (DO) database (http://disease-ontology.org) represents a comprehensive knowledge base of 8043 inherited, developmental and acquired human diseases (DO version 3, revision 2510). The DO web browser has been designed for speed, efficiency and robustness through the use of a graph database. Full-text contextual searching functionality using Lucene allows the querying of name, synonym, definition, DOID and cross-reference (xrefs) with complex Boolean search strings. The DO semantically integrates disease and medical vocabularies through extensive cross mapping and integration of MeSH, ICD, NCI's thesaurus, SNOMED CT and OMIM disease-specific terms and identifiers. The DO is utilized for disease annotation by major biomedical databases (e.g. Array Express, NIF, IEDB), as a standard representation of human disease in biomedical ontologies (e.g. IDO, Cell line ontology, NIFSTD ontology, Experimental Factor Ontology, Influenza Ontology), and as an ontological cross mappings resource between DO, MeSH and OMIM (e.g. GeneWiki). The DO project (http://diseaseontology.sf.net) has been incorporated into open source tools (e.g. Gene Answers, FunDO) to connect gene and disease biomedical data through the lens of human disease. The next iteration of the DO web browser will integrate DO's extended relations and logical definition representation along with these biomedical resource cross-mappings.
Affiliation(s)
- Lynn Marie Schriml
- Department of Epidemiology and Public Health, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
|
35
|
Good BM, Clarke EL, de Alfaro L, Su AI. The Gene Wiki in 2011: community intelligence applied to human gene annotation. Nucleic Acids Res 2011; 40:D1255-61. [PMID: 22075991 PMCID: PMC3245148 DOI: 10.1093/nar/gkr925] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Indexed: 12/01/2022] Open
Abstract
The Gene Wiki is an open-access and openly editable collection of Wikipedia articles about human genes. Initiated in 2008, it has grown to include articles about more than 10 000 genes that, collectively, contain more than 1.4 million words of gene-centric text with extensive citations back to the primary scientific literature. This growing body of useful, gene-centric content is the result of the work of thousands of individuals throughout the scientific community. Here, we describe recent improvements to the automated system that keeps the structured data presented on Gene Wiki articles in sync with the data from trusted primary databases. We also describe the expanding contents, editors and users of the Gene Wiki. Finally, we introduce a new automated system, called WikiTrust, which can effectively compute the quality of Wikipedia articles, including Gene Wiki articles, at the word level. All articles in the Gene Wiki can be freely accessed and edited at Wikipedia, and additional links and information can be found at the project's Wikipedia portal page: http://en.wikipedia.org/wiki/Portal:Gene_Wiki.
Affiliation(s)
- Benjamin M Good
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA.
|
36
|
McIntosh BK, Renfro DP, Knapp GS, Lairikyengbam CR, Liles NM, Niu L, Supak AM, Venkatraman A, Zweifel AE, Siegele DA, Hu JC. EcoliWiki: a wiki-based community resource for Escherichia coli. Nucleic Acids Res 2011; 40:D1270-7. [PMID: 22064863 PMCID: PMC3245172 DOI: 10.1093/nar/gkr880] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022] Open
Abstract
EcoliWiki is the community annotation component of the PortEco (http://porteco.org; formerly EcoliHub) project, an online data resource that integrates information on laboratory strains of Escherichia coli, its phages, plasmids and mobile genetic elements. As one of the early adopters of the wiki approach to model organism databases, EcoliWiki was designed to not only facilitate community-driven sharing of biological knowledge about E. coli as a model organism, but also to be interoperable with other data resources. EcoliWiki content currently covers genes from five laboratory E. coli strains, 21 bacteriophage genomes, F plasmid and eight transposons. EcoliWiki integrates the Mediawiki wiki platform with other open-source software tools and in-house software development to extend how wikis can be used for model organism databases. EcoliWiki can be accessed online at http://ecoliwiki.net.
Affiliation(s)
- Brenley K McIntosh
- Department of Biochemistry and Biophysics, Texas Agrilife Research, Texas A&M University College Station, TX 77843, USA
|
37
|
Abstract
Understanding complex biological systems requires extensive support from software tools. Such tools are needed at each step of a systems biology computational workflow, which typically consists of data handling, network inference, deep curation, dynamical simulation and model analysis. In addition, there are now efforts to develop integrated software platforms, so that tools that are used at different stages of the workflow and by different researchers can easily be used together. This Review describes the types of software tools that are required at different stages of systems biology research and the current options that are available for systems biology researchers. We also discuss the challenges and prospects for modelling the effects of genetic changes on physiology and the concept of an integrated platform.
|
38
|
Becker KG, Holmes KA, Zhang Y. Aging-kb: a knowledge base for the study of the aging process. Mech Ageing Dev 2011; 132:592-4. [PMID: 22100666 PMCID: PMC3287063 DOI: 10.1016/j.mad.2011.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2011] [Revised: 10/14/2011] [Accepted: 10/21/2011] [Indexed: 11/15/2022]
Abstract
As the science of the aging process moves forward, a recurring challenge is the integration of multiple types of data and information with classical aging theory while disseminating that information to the scientific community. Here we present AGING-kb, a public knowledge base with the goal of conceptualizing and presenting fundamental aspects of the study of the aging process. Aging-kb has two interconnected parts, the Aging-kb tree and the Aging Wiki. The Aging-kb tree is a simple intuitive dynamic tree hierarchy of terms describing the field of aging from the general to the specific. This enables the user to see relationships between areas of aging research in a logical comparative fashion. The second part is a specialized Aging Wiki which allows expert definition, description, supporting information, and documentation of each aging keyword term found in the Aging-kb tree. The Aging Wiki allows community participation in describing and defining concepts and terms in the Wiki format. This aging knowledge base provides a simple intuitive interface to the complexities of aging.
Affiliation(s)
- Kevin G Becker
- Research Resources Branch, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, United States.
|
39
|
Sintchenko V, Coiera EW. Translational web robots for pathogen genome analysis. MICROBIAL INFORMATICS AND EXPERIMENTATION 2011; 1:10. [PMID: 22587672 PMCID: PMC3372293 DOI: 10.1186/2042-5783-1-10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/20/2011] [Accepted: 10/31/2011] [Indexed: 11/10/2022]
Affiliation(s)
- Vitali Sintchenko
- Centre for Infectious Diseases and Microbiology-Public Health, Institute of Clinical Pathology and Medical Research, Westmead Hospital, Sydney, New South Wales, 2145 Australia.
|
40
|
Romano P, Giugno R, Pulvirenti A. Tools and collaborative environments for bioinformatics research. Brief Bioinform 2011; 12:549-61. [PMID: 21984743 PMCID: PMC3220874 DOI: 10.1093/bib/bbr055] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Indexed: 01/10/2023] Open
Abstract
Advanced research requires intensive interaction among a multitude of actors, often possessing different expertise and usually working at a distance from each other. The field of collaborative research aims to establish suitable models and technologies to properly support these interactions. In this article, we first present the reasons for an interest of Bioinformatics in this context by also suggesting some research domains that could benefit from collaborative research. We then review the principles and some of the most relevant applications of social networking, with a special attention to networks supporting scientific collaboration, by also highlighting some critical issues, such as identification of users and standardization of formats. We then introduce some systems for collaborative document creation, including wiki systems and tools for ontology development, and review some of the most interesting biological wikis. We also review the principles of Collaborative Development Environments for software and show some examples in Bioinformatics. Finally, we present the principles and some examples of Learning Management Systems. In conclusion, we try to devise some of the goals to be achieved in the short term for the exploitation of these technologies.
Affiliation(s)
- Paolo Romano
- Bioinformatics, National Cancer Research Institute (IST), Genoa, Italy.
|
41
|
Splendiani A, Gündel M, Austyn JM, Cavalieri D, Scognamiglio C, Brandizi M. Knowledge sharing and collaboration in translational research, and the DC-THERA Directory. Brief Bioinform 2011; 12:562-75. [PMID: 21969471 PMCID: PMC3220873 DOI: 10.1093/bib/bbr051] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022] Open
Abstract
Biomedical research relies increasingly on large collections of data sets and knowledge whose generation, representation and analysis often require large collaborative and interdisciplinary efforts. This dimension of ‘big data’ research calls for the development of computational tools to manage such a vast amount of data, as well as tools that can improve communication and access to information from collaborating researchers and from the wider community. Whenever research projects have a defined temporal scope, an additional issue of data management arises, namely how the knowledge generated within the project can be made available beyond its boundaries and life-time. DC-THERA is a European ‘Network of Excellence’ (NoE) that spawned a very large collaborative and interdisciplinary research community, focusing on the development of novel immunotherapies derived from fundamental research in dendritic cell immunobiology. In this article we introduce the DC-THERA Directory, which is an information system designed to support knowledge management for this research community and beyond. We present how the use of metadata and Semantic Web technologies can effectively help to organize the knowledge generated by modern collaborative research, how these technologies can enable effective data management solutions during and beyond the project lifecycle, and how resources such as the DC-THERA Directory fit into the larger context of e-science.
|
42
|
Moriarity B, Largaespada DA. A Comprehensive Guide to Sleeping Beauty-Based Somatic Transposon Mutagenesis in the Mouse. Curr Protoc Mouse Biol 2011; 1:347-68. [PMID: 26069058 DOI: 10.1002/9780470942390.mo110087] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
Recent advances in whole genome analyses made possible by next-generation DNA sequencing, high-density array comparative genome hybridization (aCGH), and other technologies have made it apparent that cancers harbor numerous genomic changes. However, without functional correlation or validation, it has proven difficult to determine which genetic changes are necessary or sufficient to produce cancer. Thus, it is still necessary to perform unbiased functional studies using model organisms to help interpret the results of whole genome analyses of human tumors. To this end, a Sleeping Beauty (SB) transposon-based mutagenesis technology was developed to identify genes that, when mutated, can cause cancer. Herein a detailed methodology to initiate and carry out an SB transposon mutagenesis screen is described. Although this system might be used to identify genes involved with many cellular phenotypes, it has been primarily implemented for cancer. Thus, SB transposon somatic cell screens for cancer development are highlighted. Curr. Protoc. Mouse Biol. 1:347-368 © 2011 by John Wiley & Sons, Inc.
Affiliation(s)
- Branden Moriarity
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota; Center for Genome Engineering, University of Minnesota, Minneapolis, Minnesota; Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota
- David A Largaespada
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota; Center for Genome Engineering, University of Minnesota, Minneapolis, Minnesota; Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota
|
43
|
Baxter R, Hong NC. Tracking community intelligence with Trac. PHILOSOPHICAL TRANSACTIONS. SERIES A, MATHEMATICAL, PHYSICAL, AND ENGINEERING SCIENCES 2011; 369:3372-3383. [PMID: 21768145 DOI: 10.1098/rsta.2011.0141] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/31/2023]
Abstract
We report on experiences at the Software Sustainability Institute (SSI) in customizing and using the Trac system to provide a single platform for recording, managing and tracking a wide range of community interactions. We note the essential requirement of a lightweight, easy-to-use system for recording 'community metadata' and discuss the pros and cons of using Trac in this way for day-to-day operations within SSI, and more generally as a means to record and track interactions with a wide and potentially very large community.
Affiliation(s)
- Rob Baxter
- Software Sustainability Institute, EPCC, University of Edinburgh, Edinburgh EH9 3JZ, UK
|
44
|
Hochheiser H, Aronow BJ, Artinger K, Beaty TH, Brinkley JF, Chai Y, Clouthier D, Cunningham ML, Dixon M, Donahue LR, Fraser SE, Hallgrimsson B, Iwata J, Klein O, Marazita ML, Murray JC, Murray S, de Villena FPM, Postlethwait J, Potter S, Shapiro L, Spritz R, Visel A, Weinberg SM, Trainor PA. The FaceBase Consortium: a comprehensive program to facilitate craniofacial research. Dev Biol 2011; 355:175-82. [PMID: 21458441 DOI: 10.1016/j.ydbio.2011.02.033] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Received: 01/25/2011] [Revised: 02/23/2011] [Accepted: 02/24/2011] [Indexed: 12/21/2022]
Abstract
The FaceBase Consortium consists of ten interlinked research and technology projects whose goal is to generate craniofacial research data and technology for use by the research community through a central data management and integrated bioinformatics hub. Funded by the National Institute of Dental and Craniofacial Research (NIDCR) and currently focused on studying the development of the middle region of the face, the Consortium will produce comprehensive datasets of global gene expression patterns, regulatory elements and sequencing; will generate anatomical and molecular atlases; will provide human normative facial data and other phenotypes; conduct follow up studies of a completed genome-wide association study; generate independent data on the genetics of craniofacial development, build repositories of animal models and of human samples and data for community access and analysis; and will develop software tools and animal models for analyzing and functionally testing and integrating these data. The FaceBase website (http://www.facebase.org) will serve as a web home for these efforts, providing interactive tools for exploring these datasets, together with discussion forums and other services to support and foster collaboration within the craniofacial research community.
Affiliation(s)
- Harry Hochheiser
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15232, USA
|
45
|
Gardner PP, Daub J, Tate J, Moore BL, Osuch IH, Griffiths-Jones S, Finn RD, Nawrocki EP, Kolbe DL, Eddy SR, Bateman A. Rfam: Wikipedia, clans and the "decimal" release. Nucleic Acids Res 2010; 39:D141-5. [PMID: 21062808 PMCID: PMC3013711 DOI: 10.1093/nar/gkq1129] [Citation(s) in RCA: 304] [Impact Index Per Article: 20.3] [Indexed: 11/21/2022] Open
Abstract
The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.
Affiliation(s)
- Paul P Gardner
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.
|
46
|
YTPdb: A wiki database of yeast membrane transporters. BIOCHIMICA ET BIOPHYSICA ACTA-BIOMEMBRANES 2010; 1798:1908-12. [DOI: 10.1016/j.bbamem.2010.06.008] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 02/19/2010] [Revised: 05/17/2010] [Accepted: 06/07/2010] [Indexed: 02/04/2023]
|
47
|
Sustainable digital infrastructure. Although databases and other online resources have become a central tool for biological research, their long-term support and maintenance is far from secure. EMBO Rep 2010; 11:730-4. [PMID: 20847740 DOI: 10.1038/embor.2010.145] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 05/26/2010] [Accepted: 08/26/2010] [Indexed: 01/25/2023] Open
|
48
|
Alonso F, Walsh CO, Salvador-Carulla L. Methodology for the development of a taxonomy and toolkit to evaluate health-related habits and lifestyle (eVITAL). BMC Res Notes 2010; 3:83. [PMID: 20334642 PMCID: PMC3003271 DOI: 10.1186/1756-0500-3-83] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 02/08/2010] [Accepted: 03/24/2010] [Indexed: 11/10/2022] Open
Abstract
Background Chronic diseases cause an ever-increasing percentage of morbidity and mortality, but many have modifiable risk factors. Many behaviors that predispose or protect an individual to chronic disease are interrelated, and therefore are best approached using an integrated model of health and the longevity paradigm, using years lived without disability as the endpoint. Findings This study used a 4-phase mixed qualitative design to create a taxonomy and related online toolkit for the evaluation of health-related habits. Core members of a working group conducted a literature review and created a framing document that defined relevant constructs. This document was revised, first by a working group and then by a series of multidisciplinary expert groups. The working group and expert panels also designed a systematic evaluation of health behaviors and risks, which was computerized and evaluated for feasibility. A demonstration study of the toolkit was performed in 11 healthy volunteers. Discussion In this protocol, we used forms of the community intelligence approach, including frame analysis, feasibility, and demonstration, to develop a clinical taxonomy and an online toolkit with standardized procedures for screening and evaluation of multiple domains of health, with a focus on longevity and the goal of integrating the toolkit into routine clinical practice. Trial Registration IMSERSO registry 200700012672
Affiliation(s)
- Federico Alonso
- Spanish Association for Research of Healthy Aging (Asociación Española para el Estudio Científico del Envejecimiento Saludable, AECES), Calle Infante Don Fernando 17, Antequera (Malaga) 29200, Spain.
|