Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kwon D, Kim S, Shin SY, Chatr-aryamontri A, Wilbur WJ. Assisting manual literature curation for protein-protein interactions using BioQRator. Database (Oxford) 2014;2014:bau067. [PMID: 25052701 PMCID: PMC4105708 DOI: 10.1093/database/bau067] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

For:	Kwon D, Kim S, Shin SY, Chatr-aryamontri A, Wilbur WJ. Assisting manual literature curation for protein-protein interactions using BioQRator. Database (Oxford) 2014;2014:bau067. [PMID: 25052701 PMCID: PMC4105708 DOI: 10.1093/database/bau067] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Number

Cited by Other Article(s)

Georgouli K, Yeom JS, Blake RC, Navid A. Multi-scale models of whole cells: progress and challenges. Front Cell Dev Biol 2023;11:1260507. [PMID: 38020904 PMCID: PMC10661945 DOI: 10.3389/fcell.2023.1260507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Accepted: 10/19/2023] [Indexed: 12/01/2023] Open

Boosting biomedical document classification through the use of domain entity recognizers and semantic ontologies for document representation: The case of gluten bibliome. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Neves M, Ševa J. An extensive review of tools for manual annotation of documents. Brief Bioinform 2021;22:146-163. [PMID: 31838514 PMCID: PMC7820865 DOI: 10.1093/bib/bbz130] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2019] [Indexed: 12/16/2022] Open

Grosman JS, Furtado PH, Rodrigues AM, Schardong GG, Barbosa SD, Lopes HC. Eras: Improving the quality control in the annotation process for Natural Language Processing tasks. INFORM SYST 2020. [DOI: 10.1016/j.is.2020.101553] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

Islamaj R, Kwon D, Kim S, Lu Z. TeamTat: a collaborative text annotation tool. Nucleic Acids Res 2020;48:W5-W11. [PMID: 32383756 DOI: 10.1093/nar/gkaa333] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2020] [Revised: 04/16/2020] [Accepted: 04/22/2020] [Indexed: 12/20/2022] Open

Piad-Morffis A, Gutiérrez Y, Almeida-Cruz Y, Muñoz R. A computational ecosystem to support eHealth Knowledge Discovery technologies in Spanish. J Biomed Inform 2020;109:103517. [PMID: 32712157 PMCID: PMC7377985 DOI: 10.1016/j.jbi.2020.103517] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2020] [Revised: 05/18/2020] [Accepted: 07/19/2020] [Indexed: 11/29/2022]

Nicholson DN, Greene CS. Constructing knowledge graphs and their biomedical applications. Comput Struct Biotechnol J 2020;18:1414-1428. [PMID: 32637040 PMCID: PMC7327409 DOI: 10.1016/j.csbj.2020.05.017] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2020] [Revised: 05/22/2020] [Accepted: 05/23/2020] [Indexed: 12/31/2022] Open

Kwon D, Kim S, Wei CH, Leaman R, Lu Z. ezTag: tagging biomedical concepts via interactive learning. Nucleic Acids Res 2019;46:W523-W529. [PMID: 29788413 PMCID: PMC6030907 DOI: 10.1093/nar/gky428] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2018] [Accepted: 05/07/2018] [Indexed: 12/22/2022] Open

Islamaj Dogan R, Kim S, Chatr-Aryamontri A, Wei CH, Comeau DC, Antunes R, Matos S, Chen Q, Elangovan A, Panyam NC, Verspoor K, Liu H, Wang Y, Liu Z, Altinel B, Hüsünbeyi ZM, Özgür A, Fergadis A, Wang CK, Dai HJ, Tran T, Kavuluru R, Luo L, Steppi A, Zhang J, Qu J, Lu Z. Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine. Database (Oxford) 2019;2019:5303240. [PMID: 30689846 PMCID: PMC6348314 DOI: 10.1093/database/bay147] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2018] [Accepted: 12/19/2018] [Indexed: 12/16/2022]

Abstract

The Precision Medicine Initiative is a multicenter effort aiming at formulating personalized treatments leveraging on individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease association studies, electronic health records and other data types. The biomedical literature provides a rich foundation for populating these KBs, reporting genetic and molecular interactions that provide the scaffold for the cellular regulatory systems and detailing the influence of genetic variants in these interactions. The goal of BioCreative VI Precision Medicine Track was to extract this particular type of information and was organized in two tasks: (i) document triage task, focused on identifying scientific literature containing experimentally verified protein-protein interactions (PPIs) affected by genetic mutations and (ii) relation extraction task, focused on extracting the affected interactions (protein pairs). To assist system developers and task participants, a large-scale corpus of PubMed documents was manually annotated for this task. Ten teams worldwide contributed 22 distinct text-mining models for the document triage task, and six teams worldwide contributed 14 different text-mining systems for the relation extraction task. When comparing the text-mining system predictions with human annotations, for the triage task, the best F-score was 69.06%, the best precision was 62.89%, the best recall was 98.0% and the best average precision was 72.5%. For the relation extraction task, when taking homologous genes into account, the best F-score was 37.73%, the best precision was 46.5% and the best recall was 54.1%. Submitted systems explored a wide range of methods, from traditional rule-based, statistical and machine learning systems to state-of-the-art deep learning methods. Given the level of participation and the individual team results we find the precision medicine track to be successful in engaging the text-mining research community. In the meantime, the track produced a manually annotated corpus of 5509 PubMed documents developed by BioGRID curators and relevant for precision medicine. The data set is freely available to the community, and the specific interactions have been integrated into the BioGRID data set. In addition, this challenge provided the first results of automatically identifying PubMed articles that describe PPI affected by mutations, as well as extracting the affected relations from those articles. Still, much progress is needed for computer-assisted precision medicine text mining to become mainstream. Future work should focus on addressing the remaining technical challenges and incorporating the practical benefits of text-mining tools into real-world precision medicine information-related curation.

Collapse

Affiliation(s)

Rezarta Islamaj Dogan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Sun Kim National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Andrew Chatr-Aryamontri Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Canada
Chih-Hsuan Wei National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Rui Antunes Department of Electronics, Telecommunications and Informatics (DETI)/Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal
Sérgio Matos Department of Electronics, Telecommunications and Informatics (DETI)/Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal
Qingyu Chen School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Aparna Elangovan School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Nagesh C Panyam School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Karin Verspoor School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Hongfang Liu Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
Yanshan Wang Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
Zhuang Liu School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Berna Altinel Department of Computer Engineering, Marmara University, Istanbul, Turkey
Zehra Melce Hüsünbeyi Department of Computer Engineering, Bogaziçi University, Istanbul, Turkey
Arzucan Özgür
Aris Fergadis School of Electrical and Computer Engineering, National Technical University of Athens, Zografou, Athens, Greece
Chen-Kai Wang Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan
Hong-Jie Dai Department of Electrical Engineering, National Kaousiung University of Science and Technology, Kaohsiung, Taiwan
Tung Tran Department of Computer Science, University of Kentucky, Lexington, KY, USA
Ramakanth Kavuluru Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY, USA
Ling Luo College of Computer Science and Technology, Dalian University of Technology, Dalian, China
Albert Steppi Department of Statistics, Florida State University, Florida, USA
Jinfeng Zhang Department of Statistics, Florida State University, Florida, USA
Jinchan Qu Department of Statistics, Florida State University, Florida, USA
Zhiyong Lu National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA

Collapse

Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, O'Donnell L, Oster S, Theesfeld C, Sellam A, Stark C, Breitkreutz BJ, Dolinski K, Tyers M. The BioGRID interaction database: 2017 update. Nucleic Acids Res 2016;45:D369-D379. [PMID: 27980099 PMCID: PMC5210573 DOI: 10.1093/nar/gkw1102] [Citation(s) in RCA: 666] [Impact Index Per Article: 83.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2016] [Revised: 10/25/2016] [Accepted: 10/27/2016] [Indexed: 01/05/2023] Open

Metabolic Pathway Mining. Methods Mol Biol 2016. [PMID: 27896740 DOI: 10.1007/978-1-4939-6613-4_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]

Shin SY, Kim S, Wilbur WJ, Kwon D. BioC viewer: a web-based tool for displaying and merging annotations in BioC. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016;2016:baw106. [PMID: 27515823 PMCID: PMC4980568 DOI: 10.1093/database/baw106] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/05/2016] [Accepted: 06/23/2016] [Indexed: 12/20/2022]

Mottin L, Gobeill J, Pasche E, Michel PA, Cusin I, Gaudet P, Ruch P. neXtA5: accelerating annotation of articles via automated approaches in neXtProt. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016;2016:baw098. [PMID: 27374119 PMCID: PMC4930835 DOI: 10.1093/database/baw098] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/10/2015] [Accepted: 05/31/2016] [Indexed: 11/13/2022]

Abstract

The rapid increase in the number of published articles poses a challenge for curated databases to remain up-to-date. To help the scientific community and database curators deal with this issue, we have developed an application, neXtA5, which prioritizes the literature for specific curation requirements. Our system, neXtA5, is a curation service composed of three main elements. The first component is a named-entity recognition module, which annotates MEDLINE over some predefined axes. This report focuses on three axes: Diseases, the Molecular Function and Biological Process sub-ontologies of the Gene Ontology (GO). The automatic annotations are then stored in a local database, BioMed, for each annotation axis. Additional entities such as species and chemical compounds are also identified. The second component is an existing search engine, which retrieves the most relevant MEDLINE records for any given query. The third component uses the content of BioMed to generate an axis-specific ranking, which takes into account the density of named-entities as stored in the Biomed database. The two ranked lists are ultimately merged using a linear combination, which has been specifically tuned to support the annotation of each axis. The fine-tuning of the coefficients is formally reported for each axis-driven search. Compared with PubMed, which is the system used by most curators, the improvement is the following: +231% for Diseases, +236% for Molecular Functions and +3153% for Biological Process when measuring the precision of the top-returned PMID (P0 or mean reciprocal rank). The current search methods significantly improve the search effectiveness of curators for three important curation axes. Further experiments are being performed to extend the curation types, in particular protein-protein interactions, which require specific relationship extraction capabilities. In parallel, user-friendly interfaces powered with a set of JSON web services are currently being implemented into the neXtProt annotation pipeline.Available on: http://babar.unige.ch:8082/neXtA5Database URL: http://babar.unige.ch:8082/neXtA5/fetcher.jsp.

Collapse

Matos S, Campos D, Pinho R, Silva RM, Mort M, Cooper DN, Oliveira JL. Mining clinical attributes of genomic variants through assisted literature curation in Egas. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016;2016:baw096. [PMID: 27278817 PMCID: PMC4897594 DOI: 10.1093/database/baw096] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/04/2015] [Accepted: 05/15/2016] [Indexed: 01/08/2023]

McIntosh L, Hudson-Vitale C, Prior F. Special Issue on Reproducible Research for Biomedical Informatics. J Biomed Inform 2016. [DOI: 10.1016/j.jbi.2015.12.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Badal VD, Kundrotas PJ, Vakser IA. Text Mining for Protein Docking. PLoS Comput Biol 2015;11:e1004630. [PMID: 26650466 PMCID: PMC4674139 DOI: 10.1371/journal.pcbi.1004630] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2015] [Accepted: 10/29/2015] [Indexed: 11/18/2022] Open

Abstract

The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied the text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~ 25% complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound benchmark set, significantly increasing the docking success rate.

Protein interactions are central for many cellular processes. Physical characterization of these interactions is essential for understanding of life processes and applications in biology and medicine. Because of the inherent limitations of experimental techniques and rapid development of computational power and methodology, computer modeling is a tool of choice in many studies. Publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for modeling of proteins and protein complexes. A major paradigm shift in modeling of protein complexes is emerging due to the rapidly expanding amount of such information, which can be used as modeling constraints. Text mining has been widely used in recreating networks of protein interactions, as well as in detecting small molecule binding sites on proteins. Combining and expanding these two well-developed areas of research, we applied the text mining to physical modeling of protein complexes (protein docking). Our procedure retrieves published abstracts on a protein-protein interaction and extracts the relevant information. The results show that correct information on binding can be obtained for about half of protein complexes. The extracted constraints were incorporated in a modeling procedure, significantly improving its performance.

Collapse

Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M. The BioGRID interaction database: 2015 update. Nucleic Acids Res 2014;43:D470-8. [PMID: 25428363 PMCID: PMC4383984 DOI: 10.1093/nar/gku1204] [Citation(s) in RCA: 648] [Impact Index Per Article: 64.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open

Affiliation(s)

Andrew Chatr-Aryamontri Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
Bobby-Joe Breitkreutz The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
Rose Oughtred Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Lorrie Boucher The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
Sven Heinicke Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Daici Chen Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
Chris Stark The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
Ashton Breitkreutz The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
Nadine Kolas The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
Lara O'Donnell The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
Teresa Reguly The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
Julie Nixon School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
Lindsay Ramage School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
Andrew Winter School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
Adnane Sellam Centre Hospitalier de l'Université Laval (CHUL), Québec, Québec G1V 4G2, Canada
Christie Chang Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Jodi Hirschman Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Chandra Theesfeld Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Jennifer Rust Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Michael S Livstone Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Kara Dolinski Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Mike Tyers Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK

Collapse

Liu RL, Shih CC. Identification of highly related references about gene-disease association. BMC Bioinformatics 2014;15:286. [PMID: 25155502 PMCID: PMC4162969 DOI: 10.1186/1471-2105-15-286] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2013] [Accepted: 08/12/2014] [Indexed: 02/03/2023] Open