1
Boosting biomedical document classification through the use of domain entity recognizers and semantic ontologies for document representation: The case of gluten bibliome. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
2
Causal biological network models for reactive astrogliosis: a systems approach to neuroinflammation. Sci Rep 2022; 12:4205. [PMID: 35273209 PMCID: PMC8913664 DOI: 10.1038/s41598-022-07651-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2021] [Accepted: 02/15/2022] [Indexed: 11/22/2022]
Abstract
Astrocytes play a central role in the neuroimmune response by responding to CNS pathologies with diverse molecular and morphological changes during the process of reactive astrogliosis. Here, we used a computational biological network model and mathematical algorithms that allow the interpretation of high-throughput transcriptomic datasets in the context of known biology to study reactive astrogliosis. We gathered available mechanistic information from the literature into a comprehensive causal biological network (CBN) model of astrocyte reactivity. The CBN model was built in the Biological Expression Language, which is both human-readable and computable. We characterized the CBN with a network analysis of highly connected nodes and demonstrated that the CBN captures relevant astrocyte biology. Subsequently, we used the CBN and transcriptomic data to identify key molecular pathways driving the astrocyte phenotype in four CNS pathologies: samples from mouse models of lipopolysaccharide-induced endotoxemia, Alzheimer’s disease, and amyotrophic lateral sclerosis; and samples from multiple sclerosis patients. The astrocyte CBN provides a new tool to identify causal mechanisms and quantify astrogliosis based on transcriptomic data.
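As a rough illustration of what "human-readable and computable" can mean in practice, the following Python sketch parses a few BEL-style statements into a signed directed graph. It rests on several assumptions: the three statements are made up for the example and are not taken from the astrocyte CBN, only the `increases` and `decreases` relations are handled, and the helper names `parse_statement` and `build_graph` are hypothetical. The full BEL grammar is much richer than this, so the sketch is not a substitute for dedicated BEL tooling.

```python
from collections import defaultdict

# Only two BEL relations are handled in this sketch; the sign encodes the direction of effect.
RELATION_SIGNS = {"increases": +1, "decreases": -1}

# Illustrative statements (assumption): not taken from the published network.
STATEMENTS = [
    'p(HGNC:IL6) increases p(HGNC:STAT3)',
    'p(HGNC:SOCS3) decreases p(HGNC:STAT3)',
    'p(HGNC:STAT3) increases bp(GO:"inflammatory response")',
]


def parse_statement(line):
    """Split a 'subject relation object' statement into (subject, sign, object)."""
    for relation, sign in RELATION_SIGNS.items():
        subject, separator, obj = line.partition(f" {relation} ")
        if separator:  # the relation keyword was found in the line
            return subject.strip(), sign, obj.strip()
    raise ValueError(f"unsupported statement: {line!r}")


def build_graph(statements):
    """Return {subject: [(object, sign), ...]} as a signed adjacency list."""
    graph = defaultdict(list)
    for line in statements:
        subject, sign, obj = parse_statement(line)
        graph[subject].append((obj, sign))
    return dict(graph)


if __name__ == "__main__":
    # Print each edge with '+' for increases and '-' for decreases.
    for source, edges in build_graph(STATEMENTS).items():
        for target, sign in edges:
            print(source, "+" if sign > 0 else "-", target)
```

The same text that a curator can read line by line becomes a data structure that an algorithm can traverse, which is the property the abstract attributes to BEL-encoded networks.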
3
Abdulkadhar S, Natarajan J. A Text Mining Protocol for Mining Biological Pathways and Regulatory Networks from Biomedical Literature. Methods Mol Biol 2022; 2496:141-157. [PMID: 35713863 DOI: 10.1007/978-1-0716-2305-3_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
A biological pathway or regulatory network is a collection of molecular regulators whose series of interactions can change cellular processes and lead to the assembly of new molecules. Three kinds of pathway are important in systems biology studies: signaling pathways, metabolic pathways, and genetic pathways (gene regulatory networks). Constructing biological pathways from the scientific literature has recently received much attention, because the literature contains a rich set of linguistic features from which biological associations between genes and proteins can be extracted. These associations can be combined to construct biological networks. Here, we present a brief overview of the various biological pathways, the biomedical text resources and corpora for network construction, and state-of-the-art methods for network construction, followed by our hybrid text mining protocol for extracting pathways and regulatory networks from biomedical literature.
Affiliation(s)
- Sabenabanu Abdulkadhar
  - Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore, Tamilnadu, India
- Jeyakumar Natarajan
  - Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore, Tamilnadu, India
4
Chen H, Chen X, Shen Y, Yin X, Liu F, Liu L, Yao J, Chu Q, Wang Y, Qi H, Timko MP, Fang W, Fan L. Signaling pathway perturbation analysis for assessment of biological impact of cigarette smoke on lung cells. Sci Rep 2021; 11:16715. [PMID: 34408184 PMCID: PMC8373939 DOI: 10.1038/s41598-021-95938-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/04/2020] [Accepted: 07/21/2021] [Indexed: 12/13/2022]
Abstract
Exposure to cigarette smoke (CS) results in injury to the epithelial cells of the human respiratory tract and has been implicated as a causative factor in the development of chronic obstructive pulmonary disease and lung cancers. The application of omics-scale methodologies has improved the capacity to understand cellular signaling processes underlying response to CS exposure. We report here the development of an algorithm based on quantitative assessment of transcriptomic profiles and signaling pathway perturbation analysis (SPPA) of human bronchial epithelial cells (HBEC) exposed to the toxic components present in CS. HBEC were exposed to CS of different compositions and for different durations using an ISO3308 smoking regime and the impact of exposure was monitored in 2263 signaling pathways in the cell to generate a total effect score that reflects the quantitative degree of impact of external stimuli on the cells. These findings support the conclusion that the SPPA algorithm provides an objective, systematic, sensitive means to evaluate the biological impact of exposures to CS of different compositions making a powerful comparative tool for commercial product evaluation and potentially for other known or potentially toxic environmental smoke substances.
Affiliation(s)
- Hongyu Chen
  - Department of Medical Oncology, First Affiliated Hospital, Zhejiang University, Hangzhou, 310058, China; Institute of Crop Science, Zhejiang University, Hangzhou, 310058, China
- Xi Chen
  - Institute of Crop Science, Zhejiang University, Hangzhou, 310058, China; Institute of Bioinformatics, Zhejiang University, Hangzhou, 310058, China
- Yifei Shen
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Xinxin Yin
  - Institute of Crop Science, Zhejiang University, Hangzhou, 310058, China
- Fangjie Liu
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, 310058, China
- Lu Liu
  - Institute of Crop Science, Zhejiang University, Hangzhou, 310058, China
- Jie Yao
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, 310058, China
- Qinjie Chu
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, 310058, China
- Yaqin Wang
  - Institute of Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Hongyan Qi
  - Department of Pathology and Pathophysiology, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Michael P Timko
  - Department of Biology and Public Health Sciences, University of Virginia, Charlottesville, VA, 22904, USA
- Weijia Fang
  - Department of Medical Oncology, First Affiliated Hospital, Zhejiang University, Hangzhou, 310058, China
- Longjiang Fan
  - Department of Medical Oncology, First Affiliated Hospital, Zhejiang University, Hangzhou, 310058, China; Institute of Crop Science, Zhejiang University, Hangzhou, 310058, China; Institute of Bioinformatics, Zhejiang University, Hangzhou, 310058, China
5
Anchang CG, Xu C, Raimondo MG, Atreya R, Maier A, Schett G, Zaburdaev V, Rauber S, Ramming A. The Potential of OMICs Technologies for the Treatment of Immune-Mediated Inflammatory Diseases. Int J Mol Sci 2021; 22:7506. [PMID: 34299122 PMCID: PMC8306614 DOI: 10.3390/ijms22147506] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/20/2021] [Revised: 07/02/2021] [Accepted: 07/09/2021] [Indexed: 01/08/2023]
Abstract
Immune-mediated inflammatory diseases (IMIDs), such as inflammatory bowel diseases and inflammatory arthritis (e.g., rheumatoid arthritis, psoriatic arthritis), are marked by increasing worldwide incidence rates. Apart from irreversible damage of the affected tissue, the systemic nature of these diseases heightens the incidence of cardiovascular insults and colitis-associated neoplasia. Only 40–60% of patients respond to currently used standard-of-care immunotherapies. In addition to this limited long-term effectiveness, all current therapies have to be given on a lifelong basis as they are unable to specifically reprogram the inflammatory process and thus achieve a true cure of the disease. On the other hand, the development of various OMICs technologies is considered as “the great hope” for improving the treatment of IMIDs. This review sheds light on the progressive development and the numerous approaches from basic science that gradually lead to the transfer from “bench to bedside” and the implementation into general patient care procedures.
Affiliation(s)
- Charles Gwellem Anchang
  - Department of Internal Medicine 3 – Rheumatology and Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum, 91054 Erlangen, Germany
- Cong Xu
  - Department of Internal Medicine 3 – Rheumatology and Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum, 91054 Erlangen, Germany
- Maria Gabriella Raimondo
  - Department of Internal Medicine 3 – Rheumatology and Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum, 91054 Erlangen, Germany
- Raja Atreya
  - Department of Internal Medicine 1, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum, 91054 Erlangen, Germany
- Andreas Maier
  - Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 91054 Erlangen, Germany
- Georg Schett
  - Department of Internal Medicine 3 – Rheumatology and Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum, 91054 Erlangen, Germany
- Vasily Zaburdaev
  - Max-Planck-Zentrum für Physik und Medizin, 91054 Erlangen, Germany
  - Department of Biology, Mathematics in Life Sciences, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 91054 Erlangen, Germany
- Simon Rauber
  - Department of Internal Medicine 3 – Rheumatology and Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum, 91054 Erlangen, Germany
- Andreas Ramming
  - Department of Internal Medicine 3 – Rheumatology and Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Universitätsklinikum, 91054 Erlangen, Germany
  - Correspondence: Tel.: +49-9131-8543048; Fax: +49-9131-8536448
6
Li R, Zupanic A, Talikka M, Belcastro V, Madan S, Dörpinghaus J, Berg CV, Szostak J, Martin F, Peitsch MC, Hoeng J. Systems Toxicology Approach for Testing Chemical Cardiotoxicity in Larval Zebrafish. Chem Res Toxicol 2020; 33:2550-2564. [PMID: 32638588 DOI: 10.1021/acs.chemrestox.0c00095] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/11/2022]
Abstract
Transcriptomic approaches can give insight into molecular mechanisms underlying chemical toxicity and are increasingly being used as part of toxicological assessments. To aid the interpretation of transcriptomic data, we have developed a systems toxicology method that relies on a computable biological network model. We created the first network model describing cardiotoxicity in zebrafish larvae, a valuable emerging model species in testing cardiotoxicity associated with drugs and chemicals. The network is based on scientific literature and represents hierarchical molecular pathways that lead from receptor activation to cardiac pathologies. To test the ability of our approach to detect cardiotoxic outcomes from transcriptomic data, we have selected three publicly available data sets that reported chemically induced heart pathologies in zebrafish larvae for five different chemicals. Network-based analysis detected cardiac perturbations for four out of five chemicals tested, for two of them using transcriptomic data collected up to 3 days before the onset of a visible phenotype. Additionally, we identified distinct molecular pathways that were activated by the different chemicals. The results demonstrate that the proposed integrational method can be used for evaluating the effects of chemicals on the zebrafish cardiac function and, together with observed cardiac apical end points, can provide a comprehensive method for connecting molecular events to organ toxicity. The computable network model is freely available and may be used to generate mechanistic hypotheses and quantifiable perturbation values from any zebrafish transcriptomic data.
Affiliation(s)
- Roman Li
  - Swiss Federal Institute of Aquatic Science and Technology, Eawag, Überlandstrasse 133, CH-8600 Dübendorf, Switzerland; PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchâtel, Switzerland
- Anze Zupanic
  - Swiss Federal Institute of Aquatic Science and Technology, Eawag, Überlandstrasse 133, CH-8600 Dübendorf, Switzerland
- Marja Talikka
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchâtel, Switzerland
- Vincenzo Belcastro
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchâtel, Switzerland
- Sumit Madan
  - Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin 53754, Germany
- Jens Dörpinghaus
  - Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin 53754, Germany
- Colette Vom Berg
  - Swiss Federal Institute of Aquatic Science and Technology, Eawag, Überlandstrasse 133, CH-8600 Dübendorf, Switzerland
- Justyna Szostak
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchâtel, Switzerland
- Florian Martin
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchâtel, Switzerland
- Manuel C Peitsch
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchâtel, Switzerland
- Julia Hoeng
  - PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchâtel, Switzerland
7
Martin F, Gubian S, Talikka M, Hoeng J, Peitsch MC. NPA: an R package for computing network perturbation amplitudes using gene expression data and two-layer networks. BMC Bioinformatics 2019; 20:451. [PMID: 31481014 PMCID: PMC6724309 DOI: 10.1186/s12859-019-3016-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/04/2018] [Accepted: 07/31/2019] [Indexed: 02/06/2023]
Abstract
Background: High-throughput gene expression technologies provide complex datasets reflecting mechanisms perturbed in an experiment, typically in a treatment versus control design. Analysis of these information-rich data can be guided by a priori knowledge, such as networks of related proteins or genes. Assessing the response of a specific mechanism and investigating its biological basis is extremely important in systems toxicology, as compounds or treatments need to be assessed with respect to a predefined set of key mechanisms that could lead to toxicity. Two-layer networks are suitable for this task, and a robust computational methodology specifically addressing those needs was previously published. The NPA package (https://github.com/philipmorrisintl/NPA) implements the algorithm, and a data package of eight two-layer networks representing key mechanisms, such as xenobiotic metabolism, apoptosis, or epithelial immune innate activation, is provided. Results: Gene expression data from an animal study are analyzed using the package and its network models. The functionalities are implemented using R6 classes, making the use of the package seamless and intuitive. The various network responses are analyzed using the leading node analysis, and an overall perturbation, called the Biological Impact Factor, is computed. Conclusions: The NPA package implements the published network perturbation amplitude methodology and provides a set of two-layer networks encoded in the Biological Expression Language.
Affiliation(s)
- Florian Martin
  - PMI R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, CH-2000, Neuchâtel, Switzerland
- Sylvain Gubian
  - PMI R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, CH-2000, Neuchâtel, Switzerland
- Marja Talikka
  - PMI R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, CH-2000, Neuchâtel, Switzerland
- Julia Hoeng
  - PMI R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, CH-2000, Neuchâtel, Switzerland
- Manuel C Peitsch
  - PMI R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, CH-2000, Neuchâtel, Switzerland
8
Yepiskoposyan H, Talikka M, Vavassori S, Martin F, Sewer A, Gubian S, Luettich K, Peitsch MC, Hoeng J. Construction of a Suite of Computable Biological Network Models Focused on Mucociliary Clearance in the Respiratory Tract. Front Genet 2019; 10:87. [PMID: 30828347 PMCID: PMC6384416 DOI: 10.3389/fgene.2019.00087] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/05/2018] [Accepted: 01/29/2019] [Indexed: 11/13/2022]
Abstract
Mucociliary clearance (MCC), considered as a collaboration of mucus secreted from goblet cells, the airway surface liquid layer, and the beating of cilia of ciliated cells, is the airways’ defense system against airborne contaminants. Because the process is well described at the molecular level, we gathered the available information into a suite of comprehensive causal biological network (CBN) models. The suite consists of three independent models that represent (1) cilium assembly, (2) ciliary beating, and (3) goblet cell hyperplasia/metaplasia and that were built in the Biological Expression Language, which is both human-readable and computable. The network analysis of highly connected nodes and pathways demonstrated that the relevant biology was captured in the MCC models. We also show the scoring of transcriptomic data onto these network models and demonstrate that the models capture the perturbation in each dataset accurately. This work is a continuation of our approach to use computational biological network models and mathematical algorithms that allow for the interpretation of high-throughput molecular datasets in the context of known biology. The MCC network model suite can be a valuable tool in personalized medicine to further understand heterogeneity and individual drug responses in complex respiratory diseases.
Affiliation(s)
- Marja Talikka
  - PMI R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
- Florian Martin
  - PMI R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
- Alain Sewer
  - PMI R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
- Sylvain Gubian
  - PMI R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
- Karsta Luettich
  - PMI R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
- Julia Hoeng
  - PMI R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
9
Hoyt CT, Domingo-Fernández D, Aldisi R, Xu L, Kolpeja K, Spalek S, Wollert E, Bachman J, Gyori BM, Greene P, Hofmann-Apitius M. Re-curation and rational enrichment of knowledge graphs in Biological Expression Language. Database (Oxford) 2019; 2019:baz068. [PMID: 31225582 PMCID: PMC6587072 DOI: 10.1093/database/baz068] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/31/2019] [Revised: 04/03/2019] [Accepted: 04/29/2019] [Indexed: 12/23/2022]
Abstract
The rapid accumulation of new biomedical literature not only causes curated knowledge graphs (KGs) to become outdated and incomplete, but also makes manual curation an impractical and unsustainable solution. Automated or semi-automated workflows are necessary to assist in prioritizing and curating the literature to update and enrich KGs. We have developed two workflows: one for re-curating a given KG to assure its syntactic and semantic quality and another for rationally enriching it by manually revising automatically extracted relations for nodes with low information density. We applied these workflows to the KGs encoded in Biological Expression Language from the NeuroMMSig database using content that was pre-extracted from MEDLINE abstracts and PubMed Central full-text articles using text mining output integrated by INDRA. We have made this workflow freely available at https://github.com/bel-enrichment/bel-enrichment.
Affiliation(s)
- Charles Tapley Hoyt
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Rana Aldisi
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Lingling Xu
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Kristian Kolpeja
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Sandra Spalek
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Esther Wollert
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- John Bachman
  - Laboratory of Systems Pharmacology, Harvard Medical School, 200 Longwood Ave, Boston, MA, USA
- Benjamin M Gyori
  - Laboratory of Systems Pharmacology, Harvard Medical School, 200 Longwood Ave, Boston, MA, USA
- Patrick Greene
  - Laboratory of Systems Pharmacology, Harvard Medical School, 200 Longwood Ave, Boston, MA, USA
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
10
Kondratova M, Sompairac N, Barillot E, Zinovyev A, Kuperstein I. Signalling maps in cancer research: construction and data analysis. Database (Oxford) 2018; 2018:4964960. [PMID: 29688383 PMCID: PMC5890450 DOI: 10.1093/database/bay036] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/03/2017] [Accepted: 03/19/2018] [Indexed: 12/22/2022]
Abstract
Generation and usage of high-quality molecular signalling network maps can be augmented by standardizing notations, establishing curation workflows and application of computational biology methods to exploit the knowledge contained in the maps. In this manuscript, we summarize the major aims and challenges of assembling information in the form of comprehensive maps of molecular interactions. Mainly, we share our experience gained while creating the Atlas of Cancer Signalling Network. In the step-by-step procedure, we describe the map construction process and suggest solutions for map complexity management by introducing a hierarchical modular map structure. In addition, we describe the NaviCell platform, a computational technology using Google Maps API to explore comprehensive molecular maps similar to geographical maps and explain the advantages of semantic zooming principles for map navigation. We also provide the outline to prepare signalling network maps for navigation using the NaviCell platform. Finally, several examples of cancer high-throughput data analysis and visualization in the context of comprehensive signalling maps are presented.
Affiliation(s)
- Maria Kondratova
  - Institut Curie, PSL Research University, F-75005 Paris, France; INSERM, U900, F-75005 Paris, France; MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Nicolas Sompairac
  - Institut Curie, PSL Research University, F-75005 Paris, France; INSERM, U900, F-75005 Paris, France; MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Emmanuel Barillot
  - Institut Curie, PSL Research University, F-75005 Paris, France; INSERM, U900, F-75005 Paris, France; MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Andrei Zinovyev
  - Institut Curie, PSL Research University, F-75005 Paris, France; INSERM, U900, F-75005 Paris, France; MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Inna Kuperstein
  - Institut Curie, PSL Research University, F-75005 Paris, France; INSERM, U900, F-75005 Paris, France; MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006 Paris, France
11
Müller HM, Van Auken KM, Li Y, Sternberg PW. Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature. BMC Bioinformatics 2018; 19:94. [PMID: 29523070 PMCID: PMC5845379 DOI: 10.1186/s12859-018-2103-8] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 07/13/2017] [Accepted: 03/01/2018] [Indexed: 11/10/2022]
Abstract
Background: The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. Results: We describe the next generation of the Textpresso information retrieval system, Textpresso Central (TPC). TPC builds on the strengths of the original system by expanding the full text corpus to include the PubMed Central Open Access Subset (PMC OA), as well as the WormBase C. elegans bibliography. In addition, TPC allows users to create a customized corpus by uploading and processing documents of their choosing. TPC is UIMA compliant, to facilitate compatibility with external processing modules, and takes advantage of Lucene indexing and search technology for efficient handling of millions of full text documents. Like Textpresso, TPC searches can be performed using keywords and/or categories (semantically related groups of terms), but to provide better context for interpreting and validating queries, search results may now be viewed as highlighted passages in the context of full text. To facilitate biocuration efforts, TPC also allows users to select text spans from the full text and annotate them, create customized curation forms for any data type, and send resulting annotations to external curation databases. As an example of such a curation form, we describe integration of TPC with the Noctua curation tool developed by the Gene Ontology (GO) Consortium. Conclusion: Textpresso Central is an online literature search and curation platform that enables biocurators and biomedical researchers to search and mine the full text of literature by integrating keyword and category searches with viewing search results in the context of the full text. It also allows users to create customized curation interfaces, use those interfaces to make annotations linked to supporting evidence statements, and then send those annotations to any database in the world. Textpresso Central URL: http://www.textpresso.org/tpc.
Affiliation(s)
- H.-M. Müller
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125 USA
- K. M. Van Auken
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125 USA
- Y. Li
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125 USA
- P. W. Sternberg
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125 USA
12
Geerts H, Hofmann-Apitius M, Anastasio TJ. Knowledge-driven computational modeling in Alzheimer's disease research: Current state and future trends. Alzheimers Dement 2017; 13:1292-1302. [PMID: 28917669 DOI: 10.1016/j.jalz.2017.08.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/20/2017] [Revised: 07/05/2017] [Accepted: 08/01/2017] [Indexed: 11/24/2022]
Abstract
Neurodegenerative diseases such as Alzheimer's disease (AD) follow a slowly progressing dysfunctional trajectory, with a large presymptomatic component and many comorbidities. Using preclinical models and large-scale omics studies ranging from genetics to imaging, a large number of processes that might be involved in AD pathology at different stages and levels have been identified. The sheer number of putative hypotheses makes it almost impossible to estimate their contribution to the clinical outcome and to develop a comprehensive view on the pathological processes driving the clinical phenotype. Traditionally, bioinformatics approaches have provided correlations and associations between processes and phenotypes. Focusing on causality, a new breed of advanced and more quantitative modeling approaches that use formalized domain expertise offer new opportunities to integrate these different modalities and outline possible paths toward new therapeutic interventions. This article reviews three different computational approaches and their possible complementarities. Process algebras, implemented using declarative programming languages such as Maude, facilitate simulation and analysis of complicated biological processes on a comprehensive but coarse-grained level. A model-driven Integration of Data and Knowledge, based on the OpenBEL platform and using reverse causative reasoning and network jump analysis, can generate mechanistic knowledge and a new, mechanism-based taxonomy of disease. Finally, Quantitative Systems Pharmacology is based on formalized implementation of domain expertise in a more fine-grained, mechanism-driven, quantitative, and predictive humanized computer model. We propose a strategy to combine the strengths of these individual approaches for developing powerful modeling methodologies that can provide actionable knowledge for rational development of preventive and therapeutic interventions. 
Development of these computational approaches is likely to be required for further progress in understanding and treating AD.
Affiliation(s)
- Hugo Geerts: In Silico Biosciences, Berwyn, PA, USA; Perelman School of Medicine, University of Pennsylvania
- Martin Hofmann-Apitius: Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
- Thomas J Anastasio: Department of Molecular and Integrative Physiology, and Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
|
13
|
Kreimeyer K, Foster M, Pandey A, Arya N, Halford G, Jones SF, Forshee R, Walderhaug M, Botsis T. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review. J Biomed Inform 2017; 73:14-29. [PMID: 28729030 DOI: 10.1016/j.jbi.2017.07.012] [Citation(s) in RCA: 290] [Impact Index Per Article: 41.4] [Received: 03/14/2017] [Revised: 06/07/2017] [Accepted: 07/14/2017] [Indexed: 12/24/2022]
Abstract
We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP.
Affiliation(s)
- Kory Kreimeyer: Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
- Matthew Foster: Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
- Abhishek Pandey: Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
- Nina Arya: Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
- Gwendolyn Halford: FDA Library, US Food and Drug Administration, Silver Spring, MD, United States
- Sandra F Jones: Cancer Surveillance Branch, Division of Cancer Prevention and Control, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, United States
- Richard Forshee: Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
- Mark Walderhaug: Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
- Taxiarchis Botsis: Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States
|
14
|
Szostak J, Martin F, Talikka M, Peitsch MC, Hoeng J. Semi-Automated Curation Allows Causal Network Model Building for the Quantification of Age-Dependent Plaque Progression in ApoE -/- Mouse. Gene Regulation and Systems Biology 2016; 10:95-103. [PMID: 27840576 PMCID: PMC5100841 DOI: 10.4137/grsb.s40031] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/26/2016] [Revised: 08/31/2016] [Accepted: 08/31/2016] [Indexed: 11/05/2022]
Abstract
The cellular and molecular mechanisms behind the process of atherosclerotic plaque destabilization are complex, and molecular data from aortic plaques are difficult to interpret. Biological network models may overcome these difficulties and precisely quantify the molecular mechanisms impacted during disease progression. The atherosclerosis plaque destabilization biological network model was constructed with the semiautomated curation pipeline, BELIEF. Cellular and molecular mechanisms promoting plaque destabilization or rupture were captured in the network model. Public transcriptomic data sets were used to demonstrate the specificity of the network model and to capture the different mechanisms that were impacted in ApoE-/- mouse aorta at 6 and 32 weeks. We concluded that network models combined with the network perturbation amplitude algorithm provide a sensitive, quantitative method to follow disease progression at the molecular level. This approach can be used to investigate and quantify molecular mechanisms during plaque progression.
Affiliation(s)
- Justyna Szostak: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Florian Martin: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Marja Talikka: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Manuel C Peitsch: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Julia Hoeng: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
|
15
|
Madan S, Hodapp S, Senger P, Ansari S, Szostak J, Hoeng J, Peitsch M, Fluck J. The BEL information extraction workflow (BELIEF): evaluation in the BioCreative V BEL and IAT track. Database (Oxford) 2016; 2016:baw136. [PMID: 27694210 PMCID: PMC5045868 DOI: 10.1093/database/baw136] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/11/2015] [Revised: 08/26/2016] [Accepted: 08/30/2016] [Indexed: 11/14/2022]
Abstract
Network-based approaches have become extremely important in systems biology to achieve a better understanding of biological mechanisms. For network representation, the Biological Expression Language (BEL) is well designed to collate findings from the scientific literature into biological network models. To facilitate encoding and biocuration of such findings in BEL, a BEL Information Extraction Workflow (BELIEF) was developed. BELIEF provides a web-based curation interface, the BELIEF Dashboard, that incorporates text mining techniques to support the biocurator in the generation of BEL networks. The underlying UIMA-based text mining pipeline (BELIEF Pipeline) uses several named entity recognition processes and relationship extraction methods to detect concepts and BEL relationships in literature. The BELIEF Dashboard allows easy curation of the automatically generated BEL statements and their context annotations. Resulting BEL statements and their context annotations can be syntactically and semantically verified to ensure consistency in the BEL network. In summary, the workflow supports experts in different stages of systems biology network building. Based on the BioCreative V BEL track evaluation, we show that the BELIEF Pipeline automatically extracts relationships with an F-score of 36.4% and fully correct statements can be obtained with an F-score of 30.8%. Participation in the BioCreative V Interactive task (IAT) track with BELIEF revealed a systems usability scale (SUS) of 67. Considering the complexity of the task for new users (learning BEL, working with a completely new interface, and performing complex curation), a score so close to the overall SUS average highlights the usability of BELIEF.
Database URL: BELIEF is available at http://www.scaiview.com/belief/.
Affiliation(s)
- Sumit Madan: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
- Sven Hodapp: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
- Philipp Senger: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
- Sam Ansari: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
- Justyna Szostak: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
- Julia Hoeng: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
- Manuel Peitsch: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
- Juliane Fluck: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
|
16
|
Fluck J, Madan S, Ansari S, Kodamullil AT, Karki R, Rastegar-Mojarad M, Catlett NL, Hayes W, Szostak J, Hoeng J, Peitsch M. Training and evaluation corpora for the extraction of causal relationships encoded in biological expression language (BEL). Database (Oxford) 2016; 2016:baw113. [PMID: 27554092 PMCID: PMC4995071 DOI: 10.1093/database/baw113] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/23/2015] [Accepted: 07/07/2016] [Indexed: 01/21/2023]
Abstract
Success in extracting biological relationships is mainly dependent on the complexity of the task as well as the availability of high-quality training data. Here, we describe the new corpora in the systems biology modeling language BEL for training and testing biological relationship extraction systems that we prepared for the BioCreative V BEL track. BEL was designed to capture relationships not only between proteins or chemicals, but also complex events such as biological processes or disease states. A BEL nanopub is the smallest unit of information and represents a biological relationship with its provenance. In BEL relationships (called BEL statements), the entities are normalized to defined namespaces mainly derived from public repositories, such as sequence databases, MeSH or publicly available ontologies. In the BEL nanopubs, the BEL statements are associated with citation information and supportive evidence such as a text excerpt. To enable the training of extraction tools, we prepared BEL resources and made them available to the community. We selected a subset of these resources focusing on a reduced set of namespaces, namely, human and mouse genes, ChEBI chemicals, MeSH diseases and GO biological processes, as well as relationship types ‘increases’ and ‘decreases’. The published training corpus contains 11 000 BEL statements from over 6000 supportive text excerpts. For method evaluation, we selected and re-annotated two smaller subcorpora containing 100 text excerpts. For this re-annotation, the inter-annotator agreement was measured by the BEL track evaluation environment and resulted in a maximal F-score of 91.18% for full statement agreement. In addition, for a set of 100 BEL statements, we do not only provide the gold standard expert annotations, but also text excerpts pre-selected by two automated systems. Those text excerpts were evaluated and manually annotated as true or false supportive in the course of the BioCreative V BEL track task. 
Database URL: http://wiki.openbel.org/display/BIOC/Datasets
Affiliation(s)
- Juliane Fluck: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
- Sumit Madan: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
- Sam Ansari: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
- Alpha T Kodamullil: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
- Reagon Karki: Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin, Germany
- William Hayes: Selventa, One Alewife Center, Cambridge, MA 02140, USA
- Justyna Szostak: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
- Julia Hoeng: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
- Manuel Peitsch: Philip Morris International R&D, Philip Morris Products S.A, Quai Jeanrenaud 5, Neuchâtel, 2000, Switzerland
|
17
|
Hirschman L, Fort K, Boué S, Kyrpides N, Islamaj Doğan R, Cohen KB. Crowdsourcing and curation: perspectives from biology and natural language processing. Database (Oxford) 2016; 2016:baw115. [PMID: 27504010 PMCID: PMC4976298 DOI: 10.1093/database/baw115] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/19/2016] [Accepted: 07/11/2016] [Indexed: 12/27/2022]
Abstract
Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging 'the crowd'; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9-11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing.
Database URL: http://www.mitre.org/publications/technical-papers/crowdsourcing-and-curation-perspectives.
Affiliation(s)
- Karën Fort: University of Paris-Sorbonne/STIH Team, Paris, France
- Stéphanie Boué: Philip Morris International R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
- Rezarta Islamaj Doğan: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
18
|
Rastegar-Mojarad M, Komandur Elayavilli R, Liu H. BELTracker: evidence sentence retrieval for BEL statements. Database (Oxford) 2016; 2016:baw079. [PMID: 27173525 PMCID: PMC4865361 DOI: 10.1093/database/baw079] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 12/04/2015] [Accepted: 04/22/2016] [Indexed: 01/09/2023]
Abstract
Biological expression language (BEL) is one of the main formal representation models of biological networks. The primary source of information for curating biological networks in BEL representation has been literature. It remains a challenge to identify relevant articles and the corresponding evidence statements for curating and validating BEL statements. In this paper, we describe BELTracker, a tool used to retrieve and rank evidence sentences from PubMed abstracts and full-text articles for a given BEL statement (per the 2015 task requirements of BioCreative V BEL Task). The system comprises three main components: (i) translation of a given BEL statement to an information retrieval (IR) query, (ii) retrieval of relevant PubMed citations and (iii) finding and ranking the evidence sentences in those citations. BELTracker uses a combination of multiple approaches based on traditional IR, machine learning, and heuristics to accomplish the task. The system identified and ranked at least one fully relevant evidence sentence in the top 10 retrieved sentences for 72 out of 97 BEL statements in the test set. BELTracker achieved a precision of 0.392, 0.532 and 0.615 when evaluated with three criteria, namely full, relaxed and context criteria, respectively, by the task organizers. Our team at Mayo Clinic was the only participant in this task. BELTracker is available as a RESTful API for public use.
Database URL: http://www.openbionlp.org:8080/BelTracker/finder/Given_BEL_Statement
Affiliation(s)
- Majid Rastegar-Mojarad: Department of Health Sciences Research, Mayo Clinic, USA; University of Wisconsin-Milwaukee, Milwaukee, WI, USA
- Hongfang Liu: Department of Health Sciences Research, Mayo Clinic, USA
|
19
|
Rodriguez-Esteban R. Biocuration with insufficient resources and fixed timelines. Database (Oxford) 2015; 2015:bav116. [PMID: 26708987 PMCID: PMC4691339 DOI: 10.1093/database/bav116] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 08/11/2015] [Accepted: 11/17/2015] [Indexed: 11/14/2022]
Abstract
Biological curation, or biocuration, is often studied from the perspective of creating and maintaining databases that have the goal of mapping and tracking certain areas of biology. However, much biocuration is, in fact, dedicated to finite and time-limited projects in which insufficient resources demand trade-offs. This typically more ephemeral type of curation is nonetheless of importance in biomedical research. Here, I propose a framework to understand such restricted curation projects from the point of view of return on curation (ROC), value, efficiency and productivity. Moreover, I suggest general strategies to optimize these curation efforts, such as the ‘multiple strategies’ approach, as well as a metric called overhead that can be used in the context of managing curation resources.
Affiliation(s)
- Raul Rodriguez-Esteban: Roche Pharmaceutical Research and Early Development, pRED Informatics, Roche Innovation Center Basel, Basel 4070, Switzerland
|
20
|
Talikka M, Boue S, Schlage WK. Causal Biological Network Database: A Comprehensive Platform of Causal Biological Network Models Focused on the Pulmonary and Vascular Systems. Methods in Pharmacology and Toxicology 2015. [DOI: 10.1007/978-1-4939-2778-4_3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 01/20/2023]
|