Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Searched Name: Donald C Comeau

Total Articles: 31 (from Reference Citation Analysis)

Article PDFs: 16

Cited by > 0: 29

Tian S, Jin Q, Yeganova L, Lai PT, Zhu Q, Chen X, Yang Y, Chen Q, Kim W, Comeau DC, Islamaj R, Kapoor A, Gao X, Lu Z. Opportunities and challenges for ChatGPT and large language models in biomedicine and health. Brief Bioinform 2023;25:bbad493. [PMID: 38168838 PMCID: PMC10762511 DOI: 10.1093/bib/bbad493] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/28/2023] [Revised: 11/15/2023] [Accepted: 12/06/2023] [Indexed: 01/05/2024] Open

Jin Q, Kim W, Chen Q, Comeau DC, Yeganova L, Wilbur WJ, Lu Z. MedCPT: Contrastive Pre-trained Transformers with large-scale PubMed search logs for zero-shot biomedical information retrieval. Bioinformatics 2023;39:btad651. [PMID: 37930897 PMCID: PMC10627406 DOI: 10.1093/bioinformatics/btad651] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/07/2023] [Revised: 09/29/2023] [Indexed: 11/08/2023] Open

Tian S, Jin Q, Yeganova L, Lai PT, Zhu Q, Chen X, Yang Y, Chen Q, Kim W, Comeau DC, Islamaj R, Kapoor A, Gao X, Lu Z. Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health. ArXiv 2023:arXiv:2306.10070v2. [PMID: 37904734 PMCID: PMC10614979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023]

Kim W, Yeganova L, Comeau DC, Wilbur WJ, Lu Z. Towards a unified search: Improving PubMed retrieval with full text. J Biomed Inform 2022;134:104211. [PMID: 36152950 PMCID: PMC9561061 DOI: 10.1016/j.jbi.2022.104211] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/07/2022] [Revised: 09/12/2022] [Accepted: 09/15/2022] [Indexed: 10/14/2022]

Abstract

OBJECTIVE

A significant number of recent articles in PubMed have full text available in PubMed Central®, and the availability of full texts has been consistently growing. However, it is not currently possible for a user to simultaneously query the contents of both databases and receive a single integrated search result. In this study, we investigate how to score full text articles given a multitoken query and how to combine those full text article scores with scores originating from abstracts and achieve an overall improved retrieval performance.

MATERIALS AND METHODS

For scoring full text articles, we propose a method to combine information coming from different sections by converting the traditionally used BM25 scores into log odds ratio scores, which can be treated uniformly. We further propose a method that successfully combines scores from two heterogeneous retrieval sources (full text articles and abstract-only articles) by balancing the contributions of their respective scores through a probabilistic transformation. We use PubMed click data, consisting of queries sampled from PubMed user logs along with a subset of retrieved and clicked documents, to train the probabilistic functions and to evaluate retrieval effectiveness.

RESULTS AND CONCLUSIONS

Random ranking achieves a MAP score of 0.579 on our PubMed click data. BM25 ranking on PubMed abstracts improves the MAP by 10.6%. For full text documents, experiments confirm that BM25 section scores are of different value depending on the section type and are not directly comparable. Naïvely using the body text of articles along with abstract text degrades the overall quality of the search. The proposed log odds ratio scores normalize and combine the contributions of occurrences of query tokens in different sections. By including full text where available, we gain another 0.67%, or 7% relative improvement over abstract alone. We find an advantage in the more accurate estimate of the value of BM25 scores depending on the section from which they were produced. Taking the sum of the top three section scores performs best.
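
The scoring idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the binned click-rate estimate, the add-one smoothing, and the names `fit_bins`, `section_log_odds`, and `document_score` are all assumptions made for illustration. The sketch maps a section's BM25 score to a log odds value learned from click data, then sums the three highest section values for a document.

```python
import math
from bisect import bisect_right

def fit_bins(samples, edges):
    """Estimate a log odds value per BM25 score bin (hypothetical helper).

    samples: iterable of (bm25_score, clicked) pairs for ONE section type.
    edges:   sorted bin boundaries; len(edges) + 1 bins are produced.
    Add-one smoothing keeps the log odds finite for empty or one-sided bins.
    """
    clicks = [0] * (len(edges) + 1)
    totals = [0] * (len(edges) + 1)
    for score, clicked in samples:
        i = bisect_right(edges, score)
        totals[i] += 1
        clicks[i] += int(clicked)
    # log(p / (1 - p)) with p = (clicks + 1) / (total + 2)
    return [math.log((c + 1) / (t - c + 1)) for c, t in zip(clicks, totals)]

def section_log_odds(score, edges, bin_log_odds):
    """Look up the log odds value for a raw BM25 section score."""
    return bin_log_odds[bisect_right(edges, score)]

def document_score(section_scores, models, top_k=3):
    """Sum the top_k section log odds values for one document.

    section_scores: {section_name: bm25_score}
    models:         {section_name: (edges, bin_log_odds)}
    """
    values = [
        section_log_odds(score, *models[name])
        for name, score in section_scores.items()
    ]
    return sum(sorted(values, reverse=True)[:top_k])

# Toy click data (invented for illustration only).
edges = [5.0]
samples = [(2.0, False), (3.0, False), (7.0, True), (8.0, True), (9.0, False)]
model = (edges, fit_bins(samples, edges))
models = {name: model for name in ("title", "abstract", "methods", "results")}
doc = {"title": 1.0, "abstract": 6.0, "methods": 7.0, "results": 8.0}
print(document_score(doc, models))
```

Because every section is mapped onto the same log odds scale, scores from different section types can be added directly, which raw BM25 scores do not allow according to the abstract.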

Sayers EW, Bolton EE, Brister JR, Canese K, Chan J, Comeau DC, Connor R, Funk K, Kelly C, Kim S, Madej T, Marchler-Bauer A, Lanczycki C, Lathrop S, Lu Z, Thibaud-Nissen F, Murphy T, Phan L, Skripchenko Y, Tse T, Wang J, Williams R, Trawick BW, Pruitt KD, Sherry ST. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2021;50:D20-D26. [PMID: 34850941 DOI: 10.1093/nar/gkab1112] [Citation(s) in RCA: 711] [Impact Index Per Article: 237.0] [Received: 09/18/2021] [Revised: 10/20/2021] [Accepted: 11/18/2021] [Indexed: 11/14/2022] Open

Affiliation(s)

Eric W Sayers National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Evan E Bolton National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
J Rodney Brister National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kathi Canese National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Jessica Chan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Ryan Connor National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kathryn Funk National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Chris Kelly National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Sunghwan Kim National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Tom Madej National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Aron Marchler-Bauer National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Christopher Lanczycki National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Stacy Lathrop National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Zhiyong Lu National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Francoise Thibaud-Nissen National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Terence Murphy National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Lon Phan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Yuri Skripchenko National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Tony Tse National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Jiyao Wang National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Rebecca Williams National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Barton W Trawick National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kim D Pruitt National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Stephen T Sherry National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA

Islamaj R, Leaman R, Kim S, Kwon D, Wei CH, Comeau DC, Peng Y, Cissel D, Coss C, Fisher C, Guzman R, Kochar PG, Koppel S, Trinh D, Sekiya K, Ward J, Whitman D, Schmidt S, Lu Z. NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature. Sci Data 2021;8:91. [PMID: 33767203 PMCID: PMC7994842 DOI: 10.1038/s41597-021-00875-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/26/2020] [Accepted: 01/19/2021] [Indexed: 11/13/2022] Open

Affiliation(s)

Rezarta Islamaj National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Robert Leaman National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Sun Kim National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Dongseop Kwon National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Chih-Hsuan Wei National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Donald C Comeau National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Yifan Peng National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
David Cissel National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Cathleen Coss National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Carol Fisher National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Rob Guzman National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Preeti Gokal Kochar National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Stella Koppel National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Dorothy Trinh National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Keiko Sekiya National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Janice Ward National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Deborah Whitman National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Susan Schmidt National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Zhiyong Lu National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA

Sayers EW, Beck J, Bolton EE, Bourexis D, Brister JR, Canese K, Comeau DC, Funk K, Kim S, Klimke W, Marchler-Bauer A, Landrum M, Lathrop S, Lu Z, Madden TL, O’Leary N, Phan L, Rangwala SH, Schneider VA, Skripchenko Y, Wang J, Ye J, Trawick BW, Pruitt KD, Sherry ST. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2021;49:D10-D17. [PMID: 33095870 PMCID: PMC7778943 DOI: 10.1093/nar/gkaa892] [Citation(s) in RCA: 410] [Impact Index Per Article: 136.7] [Received: 09/15/2020] [Revised: 09/25/2020] [Accepted: 10/08/2020] [Indexed: 11/14/2022] Open

Affiliation(s)

Eric W Sayers National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Jeffrey Beck National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Evan E Bolton National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Devon Bourexis National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
James R Brister National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kathi Canese National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kathryn Funk National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Sunghwan Kim National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
William Klimke National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Aron Marchler-Bauer National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Melissa Landrum National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Stacy Lathrop National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Zhiyong Lu National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Thomas L Madden National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Nuala O’Leary National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Lon Phan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Sanjida H Rangwala National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Valerie A Schneider National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Yuri Skripchenko National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Jiyao Wang National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Jian Ye National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Barton W Trawick National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kim D Pruitt National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Stephen T Sherry National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA

Comeau DC, Wei CH, Islamaj Doğan R, Lu Z. PMC text mining subset in BioC: about three million full-text articles and growing. Bioinformatics 2020;35:3533-3535. [PMID: 30715220 DOI: 10.1093/bioinformatics/btz070] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 08/03/2018] [Revised: 01/17/2018] [Accepted: 01/28/2019] [Indexed: 12/19/2022] Open

Allot A, Chen Q, Kim S, Vera Alvarez R, Comeau DC, Wilbur WJ, Lu Z. LitSense: making sense of biomedical literature at sentence level. Nucleic Acids Res 2020;47:W594-W599. [PMID: 31020319 DOI: 10.1093/nar/gkz289] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 02/08/2019] [Revised: 04/05/2019] [Accepted: 04/10/2019] [Indexed: 11/15/2022] Open

Sayers EW, Beck J, Brister JR, Bolton EE, Canese K, Comeau DC, Funk K, Ketter A, Kim S, Kimchi A, Kitts PA, Kuznetsov A, Lathrop S, Lu Z, McGarvey K, Madden TL, Murphy TD, O'Leary N, Phan L, Schneider VA, Thibaud-Nissen F, Trawick BW, Pruitt KD, Ostell J. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2020;48:D9-D16. [PMID: 31602479 DOI: 10.1093/nar/gkz899] [Citation(s) in RCA: 267] [Impact Index Per Article: 66.8] [Received: 09/16/2019] [Accepted: 10/09/2019] [Indexed: 11/14/2022] Open

Affiliation(s)

Eric W Sayers National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Jeff Beck National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
J Rodney Brister National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Evan E Bolton National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kathi Canese National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kathryn Funk National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Anne Ketter National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Sunghwan Kim National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Avi Kimchi National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Paul A Kitts National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Anatoliy Kuznetsov National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Stacy Lathrop National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Zhiyong Lu National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kelly McGarvey National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Thomas L Madden National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Terence D Murphy National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Nuala O'Leary National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Lon Phan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Valerie A Schneider National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Françoise Thibaud-Nissen National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Bart W Trawick National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Kim D Pruitt National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
James Ostell National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA

Islamaj Dogan R, Kim S, Chatr-Aryamontri A, Wei CH, Comeau DC, Antunes R, Matos S, Chen Q, Elangovan A, Panyam NC, Verspoor K, Liu H, Wang Y, Liu Z, Altinel B, Hüsünbeyi ZM, Özgür A, Fergadis A, Wang CK, Dai HJ, Tran T, Kavuluru R, Luo L, Steppi A, Zhang J, Qu J, Lu Z. Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine. Database (Oxford) 2019;2019:5303240. [PMID: 30689846 PMCID: PMC6348314 DOI: 10.1093/database/bay147] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 03/02/2018] [Accepted: 12/19/2018] [Indexed: 12/16/2022]

Abstract

The Precision Medicine Initiative is a multicenter effort aiming to formulate personalized treatments by leveraging individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease association studies, electronic health records and other data types. The biomedical literature provides a rich foundation for populating these KBs, reporting genetic and molecular interactions that provide the scaffold for the cellular regulatory systems and detailing the influence of genetic variants in these interactions. The goal of the BioCreative VI Precision Medicine Track was to extract this particular type of information; the track was organized in two tasks: (i) a document triage task, focused on identifying scientific literature containing experimentally verified protein-protein interactions (PPIs) affected by genetic mutations and (ii) a relation extraction task, focused on extracting the affected interactions (protein pairs). To assist system developers and task participants, a large-scale corpus of PubMed documents was manually annotated for this task. Ten teams worldwide contributed 22 distinct text-mining models for the document triage task, and six teams worldwide contributed 14 different text-mining systems for the relation extraction task. When comparing the text-mining system predictions with human annotations, for the triage task, the best F-score was 69.06%, the best precision was 62.89%, the best recall was 98.0% and the best average precision was 72.5%. For the relation extraction task, when taking homologous genes into account, the best F-score was 37.73%, the best precision was 46.5% and the best recall was 54.1%. Submitted systems explored a wide range of methods, from traditional rule-based, statistical and machine learning systems to state-of-the-art deep learning methods.
Given the level of participation and the individual team results we find the precision medicine track to be successful in engaging the text-mining research community. In the meantime, the track produced a manually annotated corpus of 5509 PubMed documents developed by BioGRID curators and relevant for precision medicine. The data set is freely available to the community, and the specific interactions have been integrated into the BioGRID data set. In addition, this challenge provided the first results of automatically identifying PubMed articles that describe PPI affected by mutations, as well as extracting the affected relations from those articles. Still, much progress is needed for computer-assisted precision medicine text mining to become mainstream. Future work should focus on addressing the remaining technical challenges and incorporating the practical benefits of text-mining tools into real-world precision medicine information-related curation.
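
For readers unfamiliar with the evaluation metrics quoted in this abstract, the following is a minimal sketch of how precision, recall, and the balanced F-score are conventionally computed from counts of true positives, false positives, and false negatives. The function name `precision_recall_f1` and the example counts are hypothetical and are not taken from the track's evaluation scripts. Note also that the "best" precision, recall, and F-score figures reported above may come from different systems, so they need not be derivable from one another.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw prediction counts.

    tp: predictions that match a human annotation
    fp: predictions with no matching annotation
    fn: annotations that no prediction matched
    Zero denominators yield 0.0 rather than raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Invented counts, for illustration only.
p, r, f = precision_recall_f1(tp=6, fp=2, fn=4)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```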

Affiliation(s)

Rezarta Islamaj Dogan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Sun Kim National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Andrew Chatr-Aryamontri Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Canada
Chih-Hsuan Wei National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Rui Antunes Department of Electronics, Telecommunications and Informatics (DETI)/Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal
Sérgio Matos Department of Electronics, Telecommunications and Informatics (DETI)/Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal
Qingyu Chen School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Aparna Elangovan School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Nagesh C Panyam School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Karin Verspoor School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Hongfang Liu Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
Yanshan Wang Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
Zhuang Liu School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Berna Altinel Department of Computer Engineering, Marmara University, Istanbul, Turkey
Zehra Melce Hüsünbeyi Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
Arzucan Özgür
Aris Fergadis School of Electrical and Computer Engineering, National Technical University of Athens, Zografou, Athens, Greece
Chen-Kai Wang Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan
Hong-Jie Dai Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Tung Tran Department of Computer Science, University of Kentucky, Lexington, KY, USA
Ramakanth Kavuluru Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY, USA
Ling Luo College of Computer Science and Technology, Dalian University of Technology, Dalian, China
Albert Steppi Department of Statistics, Florida State University, Florida, USA
Jinfeng Zhang Department of Statistics, Florida State University, Florida, USA
Jinchan Qu Department of Statistics, Florida State University, Florida, USA
Zhiyong Lu National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA

Kim S, Yeganova L, Comeau DC, Wilbur WJ, Lu Z. PubMed Phrases, an open set of coherent phrases for searching biomedical literature. Sci Data 2018;5:180104. [PMID: 29893755 PMCID: PMC5996850 DOI: 10.1038/sdata.2018.104] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 09/22/2017] [Accepted: 04/06/2018] [Indexed: 11/09/2022] Open

Yeganova L, Kim W, Comeau DC, Wilbur WJ, Lu Z. A Field Sensor: computing the composition and intent of PubMed queries. Database (Oxford) 2018;2018:5053191. [PMID: 30010750 PMCID: PMC6044290 DOI: 10.1093/database/bay052] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/29/2017] [Revised: 04/19/2018] [Accepted: 05/17/2018] [Indexed: 11/13/2022]

Islamaj Dogan R, Kim S, Chatr-Aryamontri A, Chang CS, Oughtred R, Rust J, Wilbur WJ, Comeau DC, Dolinski K, Tyers M. The BioC-BioGRID corpus: full text articles annotated for curation of protein-protein and genetic interactions. Database (Oxford) 2017;2017:baw147. [PMID: 28077563 PMCID: PMC5225395 DOI: 10.1093/database/baw147] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 07/30/2016] [Revised: 10/14/2016] [Accepted: 10/18/2016] [Indexed: 11/13/2022]

Abstract

A great deal of information on the molecular genetics and biochemistry of model organisms has been reported in the scientific literature. However, this data is typically described in free text form and is not readily amenable to computational analyses. To this end, the BioGRID database systematically curates the biomedical literature for genetic and protein interaction data. This data is provided in a standardized computationally tractable format and includes structured annotation of experimental evidence. BioGRID curation necessarily involves substantial human effort by expert curators who must read each publication to extract the relevant information. Computational text-mining methods offer the potential to augment and accelerate manual curation. To facilitate the development of practical text-mining strategies, a new challenge was organized in BioCreative V for the BioC task, the collaborative Biocurator Assistant Task. This was a non-competitive, cooperative task in which the participants worked together to build BioC-compatible modules into an integrated pipeline to assist BioGRID curators. As an integral part of this task, a test collection of full text articles was developed that contained both biological entity annotations (gene/protein and organism/species) and molecular interaction annotations (protein–protein and genetic interactions (PPIs and GIs)). This collection, which we call the BioC-BioGRID corpus, was annotated by four BioGRID curators over three rounds of annotation and contains 120 full text articles curated in a dataset representing two major model organisms, namely budding yeast and human. The BioC-BioGRID corpus contains annotations for 6409 mentions of genes and their Entrez Gene IDs, 186 mentions of organism names and their NCBI Taxonomy IDs, 1867 mentions of PPIs and 701 annotations of PPI experimental evidence statements, 856 mentions of GIs and 399 annotations of GI evidence statements. 
The purpose, characteristics and possible future uses of the BioC-BioGRID corpus are detailed in this report.

Database URL: http://bioc.sourceforge.net/BioC-BioGRID.html

Kim S, Islamaj Doğan R, Chatr-Aryamontri A, Chang CS, Oughtred R, Rust J, Batista-Navarro R, Carter J, Ananiadou S, Matos S, Santos A, Campos D, Oliveira JL, Singh O, Jonnagaddala J, Dai HJ, Su ECY, Chang YC, Su YC, Chu CH, Chen CC, Hsu WL, Peng Y, Arighi C, Wu CH, Vijay-Shanker K, Aydın F, Hüsünbeyi ZM, Özgür A, Shin SY, Kwon D, Dolinski K, Tyers M, Wilbur WJ, Comeau DC. BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID. Database (Oxford) 2016;2016:baw121. [PMID: 27589962 PMCID: PMC5009341 DOI: 10.1093/database/baw121] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 05/06/2016] [Accepted: 08/02/2016] [Indexed: 11/14/2022]

Affiliation(s)

Sun Kim National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Rezarta Islamaj Doğan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Andrew Chatr-Aryamontri Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, QC H3C 3J7, Canada
Christie S Chang Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Rose Oughtred Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Jennifer Rust Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Riza Batista-Navarro National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
Jacob Carter National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
Sophia Ananiadou National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
Sérgio Matos DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
André Santos DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
David Campos BMD Software, Lda, Rua Calouste Gulbenkian 1, 3810-074 Aveiro, Portugal
José Luís Oliveira DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
Onkar Singh Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
Jitendra Jonnagaddala School of Public Health and Community Medicine, University of New South Wales, Kensington NSW 2033, Australia Prince of Wales Clinical School, University of New South Wales, Kensington NSW 2033, Australia
Hong-Jie Dai Department of Computer Science and Information Engineering, National Taitung University, Taitung, Taiwan
Emily Chia-Yu Su Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
Yung-Chun Chang Institute of Information Science, Academia Sinica, Taipei, Taiwan Department of Information Management, National Taiwan University, Taipei, Taiwan
Yu-Chen Su Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
Chun-Han Chu Institute of Information Science, Academia Sinica, Taipei, Taiwan
Chien Chin Chen Department of Information Management, National Taiwan University, Taipei, Taiwan
Wen-Lian Hsu Institute of Information Science, Academia Sinica, Taipei, Taiwan
Yifan Peng Computer & Information Sciences, University of Delaware, Newark, DE 19716, USA
Cecilia Arighi Computer & Information Sciences, University of Delaware, Newark, DE 19716, USA Center for Bioinformatics & Computational Biology, University of Delaware, Newark, DE 19716, USA
Cathy H Wu Computer & Information Sciences, University of Delaware, Newark, DE 19716, USA Center for Bioinformatics & Computational Biology, University of Delaware, Newark, DE 19716, USA
K Vijay-Shanker Computer & Information Sciences, University of Delaware, Newark, DE 19716, USA
Ferhat Aydın Department of Computer Engineering, Boğaziçi University, Bebek, 34342 Istanbul, Turkey
Zehra Melce Hüsünbeyi Department of Computer Engineering, Boğaziçi University, Bebek, 34342 Istanbul, Turkey
Arzucan Özgür Department of Computer Engineering, Boğaziçi University, Bebek, 34342 Istanbul, Turkey
Soo-Yong Shin Department of Biomedical Informatics, Asan Medical Center, 138-736 Seoul, South Korea
Dongseop Kwon Department of Computer Engineering, Myongji University, 449-728 Yongin, South Korea
Kara Dolinski Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Mike Tyers Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, QC H3C 3J7, Canada The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
W John Wilbur National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

Liu H, Verspoor K, Comeau DC, MacKinlay AD, Wilbur W. Optimizing graph-based patterns to extract biomedical events from the literature. BMC Bioinformatics 2015;16 Suppl 16:S2. [PMID: 26551594 PMCID: PMC4642081 DOI: 10.1186/1471-2105-16-s16-s2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022] Open

Abstract

In BioNLP-ST 2013

We participated in the BioNLP 2013 shared tasks on event extraction. Our extraction method is based on the search for an approximate subgraph isomorphism between key context dependencies of events and graphs of input sentences. Our system was able to address both the GENIA (GE) task focusing on 13 molecular biology related event types and the Cancer Genetics (CG) task targeting a challenging group of 40 cancer biology related event types with varying arguments concerning 18 kinds of biological entities. In addition to adapting our system to the two tasks, we also attempted to integrate semantics into the graph matching scheme using a distributional similarity model for more events, and evaluated the event extraction impact of using paths of all possible lengths as key context dependencies beyond using only the shortest paths in our system. We achieved a 46.38% F-score in the CG task (ranking 3rd) and a 48.93% F-score in the GE task (ranking 4th).

After BioNLP-ST 2013

We explored three ways to further extend our event extraction system in our previously published work: (1) We allow non-essential nodes to be skipped, and incorporated a node skipping penalty into the subgraph distance function of our approximate subgraph matching algorithm. (2) Instead of assigning a unified subgraph distance threshold to all patterns of an event type, we learned a customized threshold for each pattern. (3) We implemented the well-known Empirical Risk Minimization (ERM) principle to optimize the event pattern set by balancing prediction errors on training data against regularization. When evaluated on the official GE task test data, these extensions help to improve the extraction precision from 62% to 65%. However, the overall F-score stays equivalent to the previous performance due to a 1% drop in recall.

Comeau DC, Batista-Navarro RT, Dai HJ, Doğan RI, Yepes AJ, Khare R, Lu Z, Marques H, Mattingly CJ, Neves M, Peng Y, Rak R, Rinaldi F, Tsai RTH, Verspoor K, Wiegers TC, Wu CH, Wilbur WJ. BioC interoperability track overview. Database (Oxford) 2014;2014:bau053. [PMID: 24980129 PMCID: PMC4074764 DOI: 10.1093/database/bau053] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022]

Affiliation(s)

Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Riza Theresa Batista-Navarro National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Hong-Jie Dai National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Rezarta Islamaj Doğan National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Antonio Jimeno Yepes National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Ritu Khare National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Zhiyong Lu National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Hernani Marques National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Carolyn J Mattingly National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Mariana Neves National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Yifan Peng National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Rafal Rak National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Fabio Rinaldi National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Richard Tzong-Han Tsai National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Karin Verspoor National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Thomas C Wiegers National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
Cathy H Wu National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
W John Wilbur National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA, National Centre for Text Mining and School of Computer Science, University of Manchester, Manchester M1 7DN, UK, Graduate Institute of BioMedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan, R.O.C., Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria Australia 3010, Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA, WBI, Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany, Berlin Brandenburg Center for Regenerative Therapies, Charité - Universitätsmedizin Berlin, Berlin 13353, Germany, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19711, USA, Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan, R.O.C., Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, Victoria Australia 3010, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA

Liu W, Islamaj Doğan R, Kwon D, Marques H, Rinaldi F, Wilbur WJ, Comeau DC. BioC implementations in Go, Perl, Python and Ruby. Database (Oxford) 2014;2014:bau059. [PMID: 24961236 PMCID: PMC4067548 DOI: 10.1093/database/bau059] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]

Affiliation(s)

Wanli Liu National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA, Department of Computer Engineering, Myongji University, Yongin, Republic of Korea and Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland
Rezarta Islamaj Doğan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA, Department of Computer Engineering, Myongji University, Yongin, Republic of Korea and Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland
Dongseop Kwon National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA, Department of Computer Engineering, Myongji University, Yongin, Republic of Korea and Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland
Hernani Marques National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA, Department of Computer Engineering, Myongji University, Yongin, Republic of Korea and Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland
Fabio Rinaldi National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA, Department of Computer Engineering, Myongji University, Yongin, Republic of Korea and Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland
W John Wilbur National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA, Department of Computer Engineering, Myongji University, Yongin, Republic of Korea and Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA, Department of Computer Engineering, Myongji University, Yongin, Republic of Korea and Institute of Computational Linguistics, University of Zurich, Zurich 8050, Switzerland

Comeau DC, Liu H, Islamaj Doğan R, Wilbur WJ. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus. Database (Oxford) 2014;2014:bau056. [PMID: 24935050 PMCID: PMC4058794 DOI: 10.1093/database/bau056] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]

Islamaj Doğan R, Comeau DC, Yeganova L, Wilbur WJ. Finding abbreviations in biomedical literature: three BioC-compatible modules and four BioC-formatted corpora. Database (Oxford) 2014;2014:bau044. [PMID: 24914232 PMCID: PMC4051513 DOI: 10.1093/database/bau044] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]

Liu W, Islamaj Doğan R, Kim S, Comeau DC, Kim W, Yeganova L, Lu Z, Wilbur WJ. Author Name Disambiguation for PubMed. J Assoc Inf Sci Technol 2013;65:765-781. [PMID: 28758138 PMCID: PMC5530597 DOI: 10.1002/asi.23063] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Indexed: 12/02/2022]

Comeau DC, Islamaj Doğan R, Ciccarese P, Cohen KB, Krallinger M, Leitner F, Lu Z, Peng Y, Rinaldi F, Torii M, Valencia A, Verspoor K, Wiegers TC, Wu CH, Wilbur WJ. BioC: a minimalist approach to interoperability for biomedical text processing. Database (Oxford) 2013;2013:bat064. [PMID: 24048470 PMCID: PMC3889917 DOI: 10.1093/database/bat064] [Citation(s) in RCA: 100] [Impact Index Per Article: 9.1] [Indexed: 12/31/2022]

Yeganova L, Kim W, Comeau DC, Wilbur WJ. Finding biomedical categories in Medline®. J Biomed Semantics 2012;3 Suppl 3:S3. [PMID: 23046816 PMCID: PMC3465206 DOI: 10.1186/2041-1480-3-s3-s3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]

Kim W, Yeganova L, Comeau DC, Wilbur WJ. Identifying well-formed biomedical phrases in MEDLINE® text. J Biomed Inform 2012;45:1035-41. [PMID: 22683889 DOI: 10.1016/j.jbi.2012.05.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/06/2012] [Revised: 05/22/2012] [Accepted: 05/25/2012] [Indexed: 11/26/2022]

Yeganova L, Comeau DC, Wilbur WJ. Machine learning with naturally labeled data for identifying abbreviation definitions. BMC Bioinformatics 2011;12 Suppl 3:S6. [PMID: 21658293 PMCID: PMC3111592 DOI: 10.1186/1471-2105-12-s3-s6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]

Sohn S, Comeau DC, Kim W, Wilbur WJ. Term-Centric Active Learning for Naive Bayes Document Classification. ACTA ACUST UNITED AC 2009. [DOI: 10.2174/1874133900903010054] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]

Yeganova L, Comeau DC, Kim W, Wilbur WJ. How to Interpret PubMed Queries and Why It Matters. ACTA ACUST UNITED AC 2008;60:264-274. [PMID: 29456459 DOI: 10.1002/asi.20979] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]

Sohn S, Comeau DC, Kim W, Wilbur WJ. Abbreviation definition identification based on automatic precision estimates. BMC Bioinformatics 2008;9:402. [PMID: 18817555 PMCID: PMC2576267 DOI: 10.1186/1471-2105-9-402] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Received: 05/29/2008] [Accepted: 09/25/2008] [Indexed: 11/10/2022]

Sohn S, Kim W, Comeau DC, Wilbur WJ. Optimal training sets for Bayesian prediction of MeSH assignment. J Am Med Inform Assoc 2008;15:546-53. [PMID: 18436913 DOI: 10.1197/jamia.m2431] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 11/10/2022]

Tanabe L, Thom LH, Matten W, Comeau DC, Wilbur WJ. SemCat: semantically categorized entities for genomics. AMIA Annu Symp Proc 2006;2006:754-8. [PMID: 17238442 PMCID: PMC1839293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023]

Comeau DC, Shavitt I, Jensen P, Bunker PR. An ab initio determination of the potential-energy surfaces and rotation–vibration energy levels of methylene in the lowest triplet and singlet states and the singlet–triplet splitting. J Chem Phys 1989. [DOI: 10.1063/1.456315] [Citation(s) in RCA: 102] [Impact Index Per Article: 2.9] [Indexed: 11/14/2022]