Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Garten Y, Coulet A, Altman RB. Recent progress in automatically extracting information from the pharmacogenomic literature. Pharmacogenomics 2011;11:1467-89. [PMID: 21047206 DOI: 10.2217/pgs.10.136] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open

For:	Garten Y, Coulet A, Altman RB. Recent progress in automatically extracting information from the pharmacogenomic literature. Pharmacogenomics 2011;11:1467-89. [PMID: 21047206 DOI: 10.2217/pgs.10.136] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open

Number

Cited by Other Article(s)

Alharbey R, Kim JI, Daud A, Song M, Alshdadi AA, Hayat MK. Indexing important drugs from medical literature. Scientometrics 2022. [DOI: 10.1007/s11192-022-04340-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]

Sun G, Ahn-Horst TA, Covert MW. The E. coli Whole-Cell Modeling Project. EcoSal Plus 2021;9:eESP00012020. [PMID: 34242084 PMCID: PMC11163835 DOI: 10.1128/ecosalplus.esp-0001-2020] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2020] [Accepted: 05/26/2021] [Indexed: 12/22/2022]

Yehya A, Altaany Z. A Decade of Pharmacogenetic Studies in Jordan: A Systemic Review. THE PHARMACOGENOMICS JOURNAL 2021;21:543-550. [PMID: 33850297 DOI: 10.1038/s41397-021-00236-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/02/2020] [Revised: 02/25/2021] [Accepted: 03/23/2021] [Indexed: 02/02/2023]

Gong L, Whirl-Carrillo M, Klein TE. PharmGKB, an Integrated Resource of Pharmacogenomic Knowledge. Curr Protoc 2021;1:e226. [PMID: 34387941 DOI: 10.1002/cpz1.226] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]

Guin D, Rani J, Singh P, Grover S, Bora S, Talwar P, Karthikeyan M, Satyamoorthy K, Adithan C, Ramachandran S, Saso L, Hasija Y, Kukreti R. Global Text Mining and Development of Pharmacogenomic Knowledge Resource for Precision Medicine. Front Pharmacol 2019;10:839. [PMID: 31447668 PMCID: PMC6692532 DOI: 10.3389/fphar.2019.00839] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2019] [Accepted: 07/01/2019] [Indexed: 11/20/2022] Open

Affiliation(s)

Debleena Guin Genomics and Molecular Medicine Unit, Council of Scientific and Industrial Research (CSIR)—Institute of Genomics and Integrative Biology (IGIB), New Delhi, India Department of Biotechnology, Delhi Technological University, Delhi, India
Jyoti Rani Department of Biomedical Sciences, Acharya Narayan Dev College, University of Delhi, New Delhi, India G N Ramachandran Knowledge Centre, Council of Scientific and Industrial Research (CSIR)—Institute of Genomics and Integrative Biology (IGIB), New Delhi, India
Priyanka Singh Genomics and Molecular Medicine Unit, Council of Scientific and Industrial Research (CSIR)—Institute of Genomics and Integrative Biology (IGIB), New Delhi, India Academy of Scientific & Innovative Research (AcSIR), New Delhi, India
Sandeep Grover Institute of Medical Biometry and Statistics, University of Lübeck University Medical Center Schleswig-Holstein - Campus Lübeck, Lübeck, Germany
Shivangi Bora Genomics and Molecular Medicine Unit, Council of Scientific and Industrial Research (CSIR)—Institute of Genomics and Integrative Biology (IGIB), New Delhi, India Department of Biotechnology, Delhi Technological University, Delhi, India
Puneet Talwar Institute of Human Behaviour and Allied Sciences, Delhi, India
Muthusamy Karthikeyan Department of Bioinformatics, Alagappa University, Karaikudi, India
K Satyamoorthy School of Life Sciences, Manipal University, Manipal, India
C Adithan Central Inter-Disciplinary Research Facility (CIDRF), Pondicherry, India
S Ramachandran G N Ramachandran Knowledge Centre, Council of Scientific and Industrial Research (CSIR)—Institute of Genomics and Integrative Biology (IGIB), New Delhi, India Academy of Scientific & Innovative Research (AcSIR), New Delhi, India
Luciano Saso Department of Physiology and Pharmacology “Vittorio Erspamer,” Sapienza University of Rome, Rome, Italy
Yasha Hasija Department of Biotechnology, Delhi Technological University, Delhi, India
Ritushree Kukreti Genomics and Molecular Medicine Unit, Council of Scientific and Industrial Research (CSIR)—Institute of Genomics and Integrative Biology (IGIB), New Delhi, India Academy of Scientific & Innovative Research (AcSIR), New Delhi, India

Collapse

Monnin P, Legrand J, Husson G, Ringot P, Tchechmedjiev A, Jonquet C, Napoli A, Coulet A. PGxO and PGxLOD: a reconciliation of pharmacogenomic knowledge of various provenances, enabling further comparison. BMC Bioinformatics 2019;20:139. [PMID: 30999867 PMCID: PMC6471679 DOI: 10.1186/s12859-019-2693-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022] Open

Abstract

Background

Pharmacogenomics (PGx) studies how genomic variations impact variations in drug response phenotypes. Knowledge in pharmacogenomics is typically composed of units that have the form of ternary relationships gene variant – drug – adverse event. Such a relationship states that an adverse event may occur for patients having the specified gene variant and being exposed to the specified drug. State-of-the-art knowledge in PGx is mainly available in reference databases such as PharmGKB and reported in scientific biomedical literature. But, PGx knowledge can also be discovered from clinical data, such as Electronic Health Records (EHRs), and in this case, may either correspond to new knowledge or confirm state-of-the-art knowledge that lacks “clinical counterpart” or validation. For this reason, there is a need for automatic comparison of knowledge units from distinct sources.

Results

In this article, we propose an approach, based on Semantic Web technologies, to represent and compare PGx knowledge units. To this end, we developed PGxO, a simple ontology that represents PGx knowledge units and their components. Combined with PROV-O, an ontology developed by the W3C to represent provenance information, PGxO enables encoding and associating provenance information to PGx relationships. Additionally, we introduce a set of rules to reconcile PGx knowledge, i.e. to identify when two relationships, potentially expressed using different vocabularies and levels of granularity, refer to the same, or to different knowledge units. We evaluated our ontology and rules by populating PGxO with knowledge units extracted from PharmGKB (2701), the literature (65,720) and from discoveries reported in EHR analysis studies (only 10, manually extracted); and by testing their similarity. We called PGxLOD (PGx Linked Open Data) the resulting knowledge base that represents and reconciles knowledge units of those various origins.

Conclusions

The proposed ontology and reconciliation rules constitute a first step toward a more complete framework for knowledge comparison in PGx. In this direction, the experimental instantiation of PGxO, named PGxLOD, illustrates the ability and difficulties of reconciling various existing knowledge sources.

Electronic supplementary material

The online version of this article (10.1186/s12859-019-2693-9) contains supplementary material, which is available to authorized users.

Collapse

Fujiwara T, Yamamoto Y, Kim JD, Buske O, Takagi T. PubCaseFinder: A Case-Report-Based, Phenotype-Driven Differential-Diagnosis System for Rare Diseases. Am J Hum Genet 2018;103:389-399. [PMID: 30173820 DOI: 10.1016/j.ajhg.2018.08.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2018] [Accepted: 08/01/2018] [Indexed: 01/29/2023] Open

Vasilevich A, de Boer J. Robot-scientists will lead tomorrow's biomaterials discovery. CURRENT OPINION IN BIOMEDICAL ENGINEERING 2018. [DOI: 10.1016/j.cobme.2018.03.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]

Chen L, Friedman C, Finkelstein J. Automated Metabolic Phenotyping of Cytochrome Polymorphisms Using PubMed Abstract Mining. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018;2017:535-544. [PMID: 29854118 PMCID: PMC5977704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Sharma V, Law W, Balick MJ, Sarkar IN. Harnessing Biomedical Natural Language Processing Tools to Identify Medicinal Plant Knowledge from Historical Texts. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018;2017:1537-1546. [PMID: 29854223 PMCID: PMC5977595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Mahmood ASMA, Rao S, McGarvey P, Wu C, Madhavan S, Vijay-Shanker K. eGARD: Extracting associations between genomic anomalies and drug responses from text. PLoS One 2017;12:e0189663. [PMID: 29261751 PMCID: PMC5738129 DOI: 10.1371/journal.pone.0189663] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2017] [Accepted: 11/29/2017] [Indexed: 12/25/2022] Open

Zwierzyna M, Overington JP. Classification and analysis of a large collection of in vivo bioassay descriptions. PLoS Comput Biol 2017;13:e1005641. [PMID: 28678787 PMCID: PMC5517062 DOI: 10.1371/journal.pcbi.1005641] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2017] [Revised: 07/19/2017] [Accepted: 06/21/2017] [Indexed: 12/17/2022] Open

Dalleau K, Marzougui Y, Da Silva S, Ringot P, Ndiaye NC, Coulet A. Learning from biomedical linked data to suggest valid pharmacogenes. J Biomed Semantics 2017;8:16. [PMID: 28427468 PMCID: PMC5399403 DOI: 10.1186/s13326-017-0125-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2016] [Accepted: 03/29/2017] [Indexed: 12/15/2022] Open

Abstract

Background

A standard task in pharmacogenomics research is identifying genes that may be involved in drug response variability, i.e., pharmacogenes. Because genomic experiments tended to generate many false positives, computational approaches based on the use of background knowledge have been proposed. Until now, only molecular networks or the biomedical literature were used, whereas many other resources are available.

Method

We propose here to consume a diverse and larger set of resources using linked data related either to genes, drugs or diseases. One of the advantages of linked data is that they are built on a standard framework that facilitates the joint use of various sources, and thus facilitates considering features of various origins. We propose a selection and linkage of data sources relevant to pharmacogenomics, including for example DisGeNET and Clinvar. We use machine learning to identify and prioritize pharmacogenes that are the most probably valid, considering the selected linked data. This identification relies on the classification of gene–drug pairs as either pharmacogenomically associated or not and was experimented with two machine learning methods –random forest and graph kernel–, which results are compared in this article.

Results

We assembled a set of linked data relative to pharmacogenomics, of 2,610,793 triples, coming from six distinct resources. Learning from these data, random forest enables identifying valid pharmacogenes with a F-measure of 0.73, on a 10 folds cross-validation, whereas graph kernel achieves a F-measure of 0.81. A list of top candidates proposed by both approaches is provided and their obtention is discussed.

Electronic supplementary material

The online version of this article (doi:10.1186/s13326-017-0125-1) contains supplementary material, which is available to authorized users.

Collapse

Huang Y, Wang L, Zan ALS. ARN: analysis and prediction by adipogenic professional database. BMC SYSTEMS BIOLOGY 2016;10:57. [PMID: 27503118 PMCID: PMC4977645 DOI: 10.1186/s12918-016-0321-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/09/2016] [Accepted: 07/14/2016] [Indexed: 12/23/2022]

Gonzalez GH, Tahsin T, Goodale BC, Greene AC, Greene CS. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery. Brief Bioinform 2015;17:33-42. [PMID: 26420781 PMCID: PMC4719073 DOI: 10.1093/bib/bbv087] [Citation(s) in RCA: 103] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2015] [Indexed: 02/06/2023] Open

A Relation Extraction Framework for Biomedical Text Using Hybrid Feature Set. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2015;2015:910423. [PMID: 26347797 PMCID: PMC4546954 DOI: 10.1155/2015/910423] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/05/2015] [Revised: 06/17/2015] [Accepted: 06/29/2015] [Indexed: 11/27/2022]

Ravikumar KE, Wagholikar KB, Li D, Kocher JP, Liu H. Text mining facilitates database curation - extraction of mutation-disease associations from Bio-medical literature. BMC Bioinformatics 2015;16:185. [PMID: 26047637 PMCID: PMC4457984 DOI: 10.1186/s12859-015-0609-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2014] [Accepted: 04/30/2015] [Indexed: 12/03/2022] Open

Abstract

Background

Advances in the next generation sequencing technology has accelerated the pace of individualized medicine (IM), which aims to incorporate genetic/genomic information into medicine. One immediate need in interpreting sequencing data is the assembly of information about genetic variants and their corresponding associations with other entities (e.g., diseases or medications). Even with dedicated effort to capture such information in biological databases, much of this information remains ‘locked’ in the unstructured text of biomedical publications. There is a substantial lag between the publication and the subsequent abstraction of such information into databases. Multiple text mining systems have been developed, but most of them focus on the sentence level association extraction with performance evaluation based on gold standard text annotations specifically prepared for text mining systems.

Results

We developed and evaluated a text mining system, MutD, which extracts protein mutation-disease associations from MEDLINE abstracts by incorporating discourse level analysis, using a benchmark data set extracted from curated database records. MutD achieves an F-measure of 64.3 % for reconstructing protein mutation disease associations in curated database records. Discourse level analysis component of MutD contributed to a gain of more than 10 % in F-measure when compared against the sentence level association extraction. Our error analysis indicates that 23 of the 64 precision errors are true associations that were not captured by database curators and 68 of the 113 recall errors are caused by the absence of associated disease entities in the abstract. After adjusting for the defects in the curated database, the revised F-measure of MutD in association detection reaches 81.5 %.

Conclusions

Our quantitative analysis reveals that MutD can effectively extract protein mutation disease associations when benchmarking based on curated database records. The analysis also demonstrates that incorporating discourse level analysis significantly improved the performance of extracting the protein-mutation-disease association. Future work includes the extension of MutD for full text articles.

Collapse

Segura-Bedmar I, Martínez P, Herrero-Zazo M. Lessons learnt from the DDIExtraction-2013 Shared Task. J Biomed Inform 2014;51:152-64. [PMID: 24858490 DOI: 10.1016/j.jbi.2014.05.007] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2014] [Revised: 05/05/2014] [Accepted: 05/08/2014] [Indexed: 10/25/2022]

Jones DE, Igo S, Hurdle J, Facelli JC. Automatic extraction of nanoparticle properties using natural language processing: NanoSifter an application to acquire PAMAM dendrimer properties. PLoS One 2014;9:e83932. [PMID: 24392101 PMCID: PMC3879259 DOI: 10.1371/journal.pone.0083932] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2013] [Accepted: 11/11/2013] [Indexed: 11/19/2022] Open

Luo G. Open issues in intelligent personal health record--an updated status report for 2012. J Med Syst 2013;37:9943. [PMID: 23584758 DOI: 10.1007/s10916-013-9943-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2012] [Accepted: 03/20/2013] [Indexed: 12/16/2022]

Xu R, Wang Q. A semi-supervised approach to extract pharmacogenomics-specific drug-gene pairs from biomedical literature for personalized medicine. J Biomed Inform 2013;46:585-93. [PMID: 23570835 DOI: 10.1016/j.jbi.2013.04.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2012] [Revised: 03/15/2013] [Accepted: 04/01/2013] [Indexed: 12/14/2022]

Jiang G, Wang C, Zhu Q, Chute CG. A Framework of Knowledge Integration and Discovery for Supporting Pharmacogenomics Target Predication of Adverse Drug Events: A Case Study of Drug-Induced Long QT Syndrome. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2013;2013:88-92. [PMID: 24303306 PMCID: PMC3814489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Xu R, Wang Q. An iterative searching and ranking algorithm for prioritising pharmacogenomics genes. INTERNATIONAL JOURNAL OF COMPUTATIONAL BIOLOGY AND DRUG DESIGN 2013;6:18-31. [PMID: 23428471 PMCID: PMC6100784 DOI: 10.1504/ijcbdd.2013.052199] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]

Trugenberger CA, Wälti C, Peregrim D, Sharp ME, Bureeva S. Discovery of novel biomarkers and phenotypes by semantic technologies. BMC Bioinformatics 2013;14:51. [PMID: 23402646 PMCID: PMC3605201 DOI: 10.1186/1471-2105-14-51] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2012] [Accepted: 02/01/2013] [Indexed: 11/13/2022] Open

Abstract

BACKGROUND

Biomarkers and target-specific phenotypes are important to targeted drug design and individualized medicine, thus constituting an important aspect of modern pharmaceutical research and development. More and more, the discovery of relevant biomarkers is aided by in silico techniques based on applying data mining and computational chemistry on large molecular databases. However, there is an even larger source of valuable information available that can potentially be tapped for such discoveries: repositories constituted by research documents.

RESULTS

This paper reports on a pilot experiment to discover potential novel biomarkers and phenotypes for diabetes and obesity by self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal Merck research documents. These documents were directly analyzed by the InfoCodex semantic engine, without prior human manipulations such as parsing. Recall and precision against established, but different benchmarks lie in ranges up to 30% and 50% respectively. Retrieval of known entities missed by other traditional approaches could be demonstrated. Finally, the InfoCodex semantic engine was shown to discover new diabetes and obesity biomarkers and phenotypes. Amongst these were many interesting candidates with a high potential, although noticeable noise (uninteresting or obvious terms) was generated.

CONCLUSIONS

The reported approach of employing autonomous self-organising semantic engines to aid biomarker discovery, supplemented by appropriate manual curation processes, shows promise and has potential to impact, conservatively, a faster alternative to vocabulary processes dependent on humans having to read and analyze all the texts. More optimistically, it could impact pharmaceutical research, for example to shorten time-to-market of novel drugs, or speed up early recognition of dead ends and adverse reactions.

Collapse

Nawaz R, Thompson P, Ananiadou S. Negated bio-events: analysis and identification. BMC Bioinformatics 2013;14:14. [PMID: 23323936 PMCID: PMC3561152 DOI: 10.1186/1471-2105-14-14] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2012] [Accepted: 01/10/2013] [Indexed: 12/19/2022] Open

Abstract

BACKGROUND

Negation occurs frequently in scientific literature, especially in biomedical literature. It has previously been reported that around 13% of sentences found in biomedical research articles contain negation. Historically, the main motivation for identifying negated events has been to ensure their exclusion from lists of extracted interactions. However, recently, there has been a growing interest in negative results, which has resulted in negation detection being identified as a key challenge in biomedical relation extraction. In this article, we focus on the problem of identifying negated bio-events, given gold standard event annotations.

RESULTS

We have conducted a detailed analysis of three open access bio-event corpora containing negation information (i.e., GENIA Event, BioInfer and BioNLP'09 ST), and have identified the main types of negated bio-events. We have analysed the key aspects of a machine learning solution to the problem of detecting negated events, including selection of negation cues, feature engineering and the choice of learning algorithm. Combining the best solutions for each aspect of the problem, we propose a novel framework for the identification of negated bio-events. We have evaluated our system on each of the three open access corpora mentioned above. The performance of the system significantly surpasses the best results previously reported on the BioNLP'09 ST corpus, and achieves even better results on the GENIA Event and BioInfer corpora, both of which contain more varied and complex events.

CONCLUSIONS

Recently, in the field of biomedical text mining, the development and enhancement of event-based systems has received significant interest. The ability to identify negated events is a key performance element for these systems. We have conducted the first detailed study on the analysis and identification of negated bio-events. Our proposed framework can be integrated with state-of-the-art event extraction systems. The resulting systems will be able to extract bio-events with attached polarities from textual documents, which can serve as the foundation for more elaborate systems that are able to detect mutually contradicting bio-events.

Collapse

Chapter 7: Pharmacogenomics. PLoS Comput Biol 2012;8:e1002817. [PMID: 23300409 PMCID: PMC3531317 DOI: 10.1371/journal.pcbi.1002817] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023] Open

Li J, Lu Z. Systematic identification of pharmacogenomics information from clinical trials. J Biomed Inform 2012;45:870-8. [PMID: 22546622 PMCID: PMC3760158 DOI: 10.1016/j.jbi.2012.04.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2011] [Revised: 03/13/2012] [Accepted: 04/11/2012] [Indexed: 11/23/2022]

The state of the art in text mining and natural language processing for pharmacogenomics. J Biomed Inform 2012. [DOI: 10.1016/j.jbi.2012.08.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Gyimesi G, Borsodi D, Sarankó H, Tordai H, Sarkadi B, Hegedűs T. ABCMdb: a database for the comparative analysis of protein mutations in ABC transporters, and a potential framework for a general application. Hum Mutat 2012;33:1547-56. [PMID: 22693078 DOI: 10.1002/humu.22138] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2012] [Accepted: 05/29/2012] [Indexed: 11/08/2022]

Samwald M, Coulet A, Huerga I, Powers RL, Luciano JS, Freimuth RR, Whipple F, Pichler E, Prud'hommeaux E, Dumontier M, Marshall MS. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics 2012;13:201-12. [PMID: 22256869 DOI: 10.2217/pgs.11.179] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open

Hahn U, Cohen KB, Garten Y, Shah NH. Mining the pharmacogenomics literature--a survey of the state of the art. Brief Bioinform 2012;13:460-94. [PMID: 22833496 PMCID: PMC3404399 DOI: 10.1093/bib/bbs018] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2011] [Accepted: 03/23/2012] [Indexed: 01/05/2023] Open

A mutation-centric approach to identifying pharmacogenomic relations in text. J Biomed Inform 2012;45:835-41. [PMID: 22683993 DOI: 10.1016/j.jbi.2012.05.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2011] [Revised: 05/19/2012] [Accepted: 05/21/2012] [Indexed: 11/21/2022]

Pakhomov S, McInnes BT, Lamba J, Liu Y, Melton GB, Ghodke Y, Bhise N, Lamba V, Birnbaum AK. Using PharmGKB to train text mining approaches for identifying potential gene targets for pharmacogenomic studies. J Biomed Inform 2012;45:862-9. [PMID: 22564551 DOI: 10.1016/j.jbi.2012.04.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2011] [Revised: 04/04/2012] [Accepted: 04/11/2012] [Indexed: 11/19/2022]

van Mulligen EM, Fourrier-Reglat A, Gurwitz D, Molokhia M, Nieto A, Trifiro G, Kors JA, Furlong LI. The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships. J Biomed Inform 2012;45:879-84. [PMID: 22554700 DOI: 10.1016/j.jbi.2012.04.004] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2011] [Revised: 02/02/2012] [Accepted: 04/11/2012] [Indexed: 11/25/2022]

Rinaldi F, Clematide S, Garten Y, Whirl-Carrillo M, Gong L, Hebert JM, Sangkuhl K, Thorn CF, Klein TE, Altman RB. Using ODIN for a PharmGKB revalidation experiment. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2012;2012:bas021. [PMID: 22529178 PMCID: PMC3332569 DOI: 10.1093/database/bas021] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

McDonagh EM, Whirl-Carrillo M, Garten Y, Altman RB, Klein TE. From pharmacogenomic knowledge acquisition to clinical applications: the PharmGKB as a clinical pharmacogenomic biomarker resource. Biomark Med 2012;5:795-806. [PMID: 22103613 DOI: 10.2217/bmm.11.94] [Citation(s) in RCA: 120] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open

WU YONGHUI, LIU MEI, ZHENG WJIM, ZHAO ZHONGMING, XU HUA. Ranking gene-drug relationships in biomedical literature using Latent Dirichlet Allocation. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2012:422-433. [PMID: 22174297 PMCID: PMC4095990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]

Percha B, Garten Y, Altman RB. Discovery and explanation of drug-drug interactions via text mining. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2012:410-421. [PMID: 22174296 PMCID: PMC3345566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]

Stringent response of Escherichia coli: revisiting the bibliome using literature mining. MICROBIAL INFORMATICS AND EXPERIMENTATION 2011;1:14. [PMID: 22587779 PMCID: PMC3372295 DOI: 10.1186/2042-5783-1-14] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/14/2011] [Accepted: 12/30/2011] [Indexed: 12/11/2022]

Abstract

Background

Understanding the mechanisms responsible for cellular responses depends on the systematic collection and analysis of information on the main biological concepts involved. Indeed, the identification of biologically relevant concepts in free text, namely genes, tRNAs, mRNAs, gene products and small molecules, is crucial to capture the structure and functioning of different responses.

Results

In this work, we review literature reports on the study of the stringent response in Escherichia coli. Rather than undertaking the development of a highly specialised literature mining approach, we investigate the suitability of concept recognition and statistical analysis of concept occurrence as means to highlight the concepts that are most likely to be biologically engaged during this response. The co-occurrence analysis of core concepts in this stringent response, i.e. the (p)ppGpp nucleotides with gene products was also inspected and suggest that besides the enzymes RelA and SpoT that control the basal levels of (p)ppGpp nucleotides, many other proteins have a key role in this response. Functional enrichment analysis revealed that basic cellular processes such as metabolism, transcriptional and translational regulation are central, but other stress-associated responses might be elicited during the stringent response. In addition, the identification of less annotated concepts revealed that some (p)ppGpp-induced functional activities are still overlooked in most reviews.

Conclusions

In this paper we applied a literature mining approach that offers a more comprehensive analysis of the stringent response in E. coli. The compilation of relevant biological entities to this stress response and the assessment of their functional roles provided a more systematic understanding of this cellular response. Overlooked regulatory entities, such as transcriptional regulators, were found to play a role in this stress response. Moreover, the involvement of other stress-associated concepts demonstrates the complexity of this cellular response.

Collapse

Tsuruoka Y, Miwa M, Hamamoto K, Tsujii J, Ananiadou S. Discovering and visualizing indirect associations between biomedical concepts. Bioinformatics 2011;27:i111-9. [PMID: 21685059 PMCID: PMC3117364 DOI: 10.1093/bioinformatics/btr214] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open

Fernald GH, Capriotti E, Daneshjou R, Karczewski KJ, Altman RB. Bioinformatics challenges for personalized medicine. ACTA ACUST UNITED AC 2011;27:1741-8. [PMID: 21596790 PMCID: PMC3117361 DOI: 10.1093/bioinformatics/btr295] [Citation(s) in RCA: 177] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023]

Coulet A, Garten Y, Dumontier M, Altman RB, Musen MA, Shah NH. Integration and publication of heterogeneous text-mined relationships on the Semantic Web. J Biomed Semantics 2011;2 Suppl 2:S10. [PMID: 21624156 PMCID: PMC3102890 DOI: 10.1186/2041-1480-2-s2-s10] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open

Abstract

Background

Advances in Natural Language Processing (NLP) techniques enable the extraction of fine-grained relationships mentioned in biomedical text. The variability and the complexity of natural language in expressing similar relationships causes the extracted relationships to be highly heterogeneous, which makes the construction of knowledge bases difficult and poses a challenge in using these for data mining or question answering.

Results

We report on the semi-automatic construction of the PHARE relationship ontology (the PHArmacogenomic RElationships Ontology) consisting of 200 curated relations from over 40,000 heterogeneous relationships extracted via text-mining. These heterogeneous relations are then mapped to the PHARE ontology using synonyms, entity descriptions and hierarchies of entities and roles. Once mapped, relationships can be normalized and compared using the structure of the ontology to identify relationships that have similar semantics but different syntax. We compare and contrast the manual procedure with a fully automated approach using WordNet to quantify the degree of integration enabled by iterative curation and refinement of the PHARE ontology. The result of such integration is a repository of normalized biomedical relationships, named PHARE-KB, which can be queried using Semantic Web technologies such as SPARQL and can be visualized in the form of a biological network.

Conclusions

The PHARE ontology serves as a common semantic framework to integrate more than 40,000 relationships pertinent to pharmacogenomics. The PHARE ontology forms the foundation of a knowledge base named PHARE-KB. Once populated with relationships, PHARE-KB (i) can be visualized in the form of a biological network to guide human tasks such as database curation and (ii) can be queried programmatically to guide bioinformatics applications such as the prediction of molecular interactions. PHARE is available at http://purl.bioontology.org/ontology/PHARE.

Collapse