Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Luo Y, Uzuner Ö, Szolovits P. Bridging semantics and syntax with graph algorithms-state-of-the-art of extracting biomedical relations. Brief Bioinform 2017;18:160-178. [PMID: 26851224 PMCID: PMC5221425 DOI: 10.1093/bib/bbw001] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 09/24/2015] [Revised: 11/29/2015] [Indexed: 01/18/2023]

Cited by Other Article(s)

Liu Q, Tian Y, Zhou T, Lyu K, Wang Z, Zheng Y, Liu Y, Ren J, Li J. An Explainable and Personalized Cognitive Reasoning Model Based on Knowledge Graph: Toward Decision Making for General Practice. IEEE J Biomed Health Inform 2024;28:707-718. [PMID: 37669206 DOI: 10.1109/jbhi.2023.3312154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]

Pacheco JA, Rasmussen LV, Wiley K, Person TN, Cronkite DJ, Sohn S, Murphy S, Gundelach JH, Gainer V, Castro VM, Liu C, Mentch F, Lingren T, Sundaresan AS, Eickelberg G, Willis V, Furmanchuk A, Patel R, Carrell DS, Deng Y, Walton N, Satterfield BA, Kullo IJ, Dikilitas O, Smith JC, Peterson JF, Shang N, Kiryluk K, Ni Y, Li Y, Nadkarni GN, Rosenthal EA, Walunas TL, Williams MS, Karlson EW, Linder JE, Luo Y, Weng C, Wei W. Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network. Sci Rep 2023;13:1971. [PMID: 36737471 PMCID: PMC9898520 DOI: 10.1038/s41598-023-27481-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/15/2022] [Accepted: 01/03/2023] [Indexed: 02/05/2023]

Affiliation(s)

Jennifer A Pacheco: Northwestern University, Evanston, USA
Luke V Rasmussen: Northwestern University, Evanston, USA
Ken Wiley: National Human Genome Research Institute, Bethesda, USA
Thomas Nate Person: Pennsylvania State University, Hershey, USA
David J Cronkite: Kaiser Permanente Washington Health Research Institute, Seattle, USA
Sunghwan Sohn: Mayo Clinic, Rochester, USA
Shawn Murphy: Massachusetts General Hospital, Boston, USA
Justin H Gundelach: Mayo Clinic, Rochester, USA
Vivian Gainer: Mass General Brigham, Somerville, USA
Victor M Castro: Mass General Brigham, Somerville, USA
Cong Liu: Columbia University, New York, USA
Frank Mentch: Children's Hospital of Philadelphia, Philadelphia, USA
Todd Lingren: Cincinnati Children's Hospital Medical Center, Cincinnati, USA
Agnes S Sundaresan: Geisinger, Danville, USA
Garrett Eickelberg: Northwestern University, Evanston, USA
Valerie Willis: National Human Genome Research Institute, Bethesda, USA
Al'ona Furmanchuk: Northwestern University, Evanston, USA
Roshan Patel: Geisinger, Danville, USA
David S Carrell: Kaiser Permanente Washington Health Research Institute, Seattle, USA
Yu Deng: Northwestern University, Evanston, USA
Nephi Walton: Intermountain Healthcare, Salt Lake City, USA
Benjamin A Satterfield: Mayo Clinic, Rochester, USA
Iftikhar J Kullo: Mayo Clinic, Rochester, USA
Ozan Dikilitas: Mayo Clinic, Rochester, USA
Joshua C Smith: Vanderbilt University Medical Center, Nashville, USA
Josh F Peterson: Vanderbilt University Medical Center, Nashville, USA
Ning Shang: Columbia University, New York, USA
Krzysztof Kiryluk: Columbia University, New York, USA
Yizhao Ni: Cincinnati Children's Hospital Medical Center, Cincinnati, USA
Yikuan Li: Northwestern University, Evanston, USA
Girish N Nadkarni: Icahn School of Medicine at Mount Sinai, New York, USA
Elisabeth A Rosenthal: University of Washington, Seattle, USA
Theresa L Walunas: Northwestern University, Evanston, USA
Marc S Williams: Geisinger, Danville, USA
Elizabeth W Karlson: Brigham and Women's Hospital, Boston, USA
Jodell E Linder: Vanderbilt University Medical Center, Nashville, USA
Yuan Luo: Northwestern University, Evanston, USA
Chunhua Weng: Columbia University, New York, USA
WeiQi Wei: Vanderbilt University Medical Center, Nashville, USA

Ramgopal S, Sanchez-Pinto LN, Horvat CM, Carroll MS, Luo Y, Florin TA. Artificial intelligence-based clinical decision support in pediatrics. Pediatr Res 2023;93:334-341. [PMID: 35906317 PMCID: PMC9668209 DOI: 10.1038/s41390-022-02226-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 03/21/2022] [Revised: 06/29/2022] [Accepted: 07/18/2022] [Indexed: 11/24/2022]

French E, McInnes BT. An overview of biomedical entity linking throughout the years. J Biomed Inform 2023;137:104252. [PMID: 36464228 PMCID: PMC9845184 DOI: 10.1016/j.jbi.2022.104252] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/22/2022] [Revised: 09/19/2022] [Accepted: 11/15/2022] [Indexed: 12/04/2022]

Kline A, Wang H, Li Y, Dennis S, Hutch M, Xu Z, Wang F, Cheng F, Luo Y. Multimodal machine learning in precision health: A scoping review. NPJ Digit Med 2022;5:171. [PMID: 36344814 PMCID: PMC9640667 DOI: 10.1038/s41746-022-00712-8] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Received: 06/20/2022] [Accepted: 10/14/2022] [Indexed: 11/09/2022]

Lee Y, Son J, Song M. BertSRC: transformer-based semantic relation classification. BMC Med Inform Decis Mak 2022;22:234. [PMID: 36068535 PMCID: PMC9446816 DOI: 10.1186/s12911-022-01977-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Accepted: 08/11/2022] [Indexed: 11/13/2022]

Abstract

The relationships between biomedical entities are complex, and many of them have not yet been identified. For many biomedical research areas including drug discovery, it is of paramount importance to identify the relationships that have already been established through a comprehensive literature survey. However, manually searching through literature is difficult as the amount of biomedical publications continues to increase. Therefore, the relation classification task, which automatically mines meaningful relations from the literature, is spotlighted in the field of biomedical text mining. By applying relation classification techniques to the accumulated biomedical literature, existing semantic relations between biomedical entities that can help to infer previously unknown relationships are efficiently grasped. To develop semantic relation classification models, which are a type of supervised machine learning, it is essential to construct a training dataset that is manually annotated by biomedical experts with semantic relations among biomedical entities. Any advanced model must be trained on a dataset with reliable quality and meaningful scale to be deployed in the real world and to assist biologists in their research. In addition, as the number of such public datasets increases, the performance of machine learning algorithms can be accurately revealed and compared by using those datasets as a benchmark for model development and improvement. In this paper, we aim to build such a dataset. Along with that, to validate the usability of the dataset as training data for relation classification models and to improve the performance of the relation extraction task, we built a relation classification model based on Bidirectional Encoder Representations from Transformers (BERT) trained on our dataset, applying our newly proposed fine-tuning methodology.
In experiments comparing performance among several models based on different deep learning algorithms, our model with the proposed fine-tuning methodology showed the best performance. The experimental results show that the constructed training dataset is an important information resource for the development and evaluation of semantic relation extraction models. Furthermore, relation extraction performance can be improved by integrating our proposed fine-tuning methodology. Therefore, this can lead to the promotion of future text mining research in the biomedical field.

Yue T, He Z, Li C, Hu Z, Li Y. Lightweight fine-grained classification for scientific paper. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-213022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

Deng Z, Sun C, Zhong G, Mao Y. Text Classification with Attention Gated Graph Neural Network. Cognit Comput 2022. [DOI: 10.1007/s12559-022-10017-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]

Entity understanding with hierarchical graph learning for enhanced text classification. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]

Devi R, Mehrotra D, Lamine SBAB. Constituent vs Dependency Parsing-Based RDF Model Generation from Dengue Patients’ Case Sheets. JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT 2022. [DOI: 10.1142/s0219649222500137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]

Zhao Y, Yu Y, Wang H, Li Y, Deng Y, Jiang G, Luo Y. Machine Learning in Causal Inference: Application in Pharmacovigilance. Drug Saf 2022;45:459-476. [PMID: 35579811 PMCID: PMC9114053 DOI: 10.1007/s40264-022-01155-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 02/09/2022] [Indexed: 01/28/2023]

Zhang J, Huang W, Ji D, Ren Y. Globally normalized neural model for joint entity and event extraction. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2021.102636] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/24/2022]

Weinzierl MA, Maldonado R, Harabagiu SM. The impact of learning Unified Medical Language System knowledge embeddings in relation extraction from biomedical texts. J Am Med Inform Assoc 2021;27:1556-1567. [PMID: 33029619 DOI: 10.1093/jamia/ocaa205] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/11/2020] [Revised: 07/23/2020] [Accepted: 08/07/2020] [Indexed: 11/14/2022]

Tran T, Kavuluru R, Kilicoglu H. Attention-Gated Graph Convolutions for Extracting Drug Interaction Information from Drug Labels. ACM TRANSACTIONS ON COMPUTING FOR HEALTHCARE 2021;2:10. [PMID: 34541578 PMCID: PMC8445229 DOI: 10.1145/3423209] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/01/2019] [Accepted: 09/01/2020] [Indexed: 01/02/2023]

Shi X, Yi Y, Xiong Y, Tang B, Chen Q, Wang X, Ji Z, Zhang Y, Xu H. Extracting entities with attributes in clinical text via joint deep learning. J Am Med Inform Assoc 2021;26:1584-1591. [PMID: 31550346 DOI: 10.1093/jamia/ocz158] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/03/2019] [Revised: 07/18/2019] [Accepted: 08/15/2019] [Indexed: 11/13/2022]

Sousa D, Lamurias A, Couto FM. Using Neural Networks for Relation Extraction from Biomedical Literature. Methods Mol Biol 2021;2190:289-305. [PMID: 32804372 DOI: 10.1007/978-1-0716-0826-5_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]

Zhang L, Hu J, Xu Q, Li F, Rao G, Tao C. A semantic relationship mining method among disorders, genes, and drugs from different biomedical datasets. BMC Med Inform Decis Mak 2020;20:283. [PMID: 33317518 PMCID: PMC7734713 DOI: 10.1186/s12911-020-01274-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/21/2020] [Accepted: 09/22/2020] [Indexed: 12/16/2022]

Abstract

BACKGROUND

Semantic web technology has been applied widely in the biomedical informatics field. Large numbers of biomedical datasets are available online in the resource description framework (RDF) format. Semantic relationship mining among genes, disorders, and drugs is widely used in, for example, precision medicine and drug repositioning. However, most of the existing studies focused on a single dataset. It is not easy to find the most current relationships among disorder-gene-drug relationships since the relationships are distributed in heterogeneous datasets. How to mine their semantic relationships from different biomedical datasets is an important issue.

METHODS

First, a variety of biomedical datasets were converted into RDF triple data; then, multisource biomedical datasets were integrated into a storage system using a data integration algorithm. Second, nine query patterns among genes, disorders, and drugs from different biomedical datasets were designed. Third, the gene-disorder-drug semantic relationship mining algorithm is presented. This algorithm can query the relationships among various entities from different datasets.
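As a rough illustration of the idea in this METHODS paragraph (merge several triple sets, then answer a cross-dataset pattern query), the following minimal Python sketch may help. It is not the authors' algorithm: the predicate names `associatedWith` and `treats`, the placeholder entities, and the helper functions `integrate`, `match`, and `gene_disorder_drug` are all hypothetical, and a plain in-memory set stands in for the storage system.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def integrate(*datasets: Iterable[Triple]) -> Set[Triple]:
    """Merge several triple collections into one de-duplicated store."""
    store: Set[Triple] = set()
    for dataset in datasets:
        store.update(dataset)
    return store


def match(store: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for subj, pred, obj in store:
        if s in (None, subj) and p in (None, pred) and o in (None, obj):
            yield (subj, pred, obj)


def gene_disorder_drug(store: Set[Triple]) -> Iterator[Tuple[str, str, str]]:
    """One hypothetical query pattern: gene -> disorder <- drug."""
    for gene, _, disorder in match(store, p="associatedWith"):
        for drug, _, _ in match(store, p="treats", o=disorder):
            yield (gene, disorder, drug)


# Placeholder data standing in for two separate source datasets.
dataset_a = [("GENE_A", "associatedWith", "DISORDER_X")]
dataset_b = [("DRUG_1", "treats", "DISORDER_X"),
             ("DRUG_2", "treats", "DISORDER_Y")]

store = integrate(dataset_a, dataset_b)
results = sorted(gene_disorder_drug(store))
print(results)
```

The join succeeds only because the two placeholder datasets share the identifier `DISORDER_X`; with real heterogeneous sources, aligning identifiers is the job of the integration step and is not modelled here.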

RESULTS AND CONCLUSIONS

We focused on mining the putative and the most current disorder-gene-drug relationships about Parkinson's disease (PD). The results demonstrate that our method has significant advantages in mining and integrating multisource heterogeneous biomedical datasets. Twenty-five new relationships among the genes, disorders, and drugs were mined from four different datasets. The query results showed that most of them came from different datasets. The precision of the method increased by 2.51% compared to that of the multisource linked open data fusion method presented in the 4th International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019). Moreover, the number of query results increased by 7.7%, and the number of correct queries increased by 9.5%.

Abdulkadhar S, Bhasuran B, Natarajan J. Multiscale Laplacian graph kernel combined with lexico-syntactic patterns for biomedical event extraction from literature. Knowl Inf Syst 2020. [DOI: 10.1007/s10115-020-01514-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]

Yan S, Wong KC. Context awareness and embedding for biomedical event extraction. Bioinformatics 2020;36:637-643. [PMID: 31392318 DOI: 10.1093/bioinformatics/btz607] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/23/2018] [Revised: 07/26/2019] [Accepted: 08/06/2019] [Indexed: 11/13/2022]

Perera N, Dehmer M, Emmert-Streib F. Named Entity Recognition and Relation Detection for Biomedical Information Extraction. Front Cell Dev Biol 2020;8:673. [PMID: 32984300 PMCID: PMC7485218 DOI: 10.3389/fcell.2020.00673] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 12/12/2019] [Accepted: 07/02/2020] [Indexed: 12/29/2022]

Peterson KJ, Jiang G, Liu H. A corpus-driven standardization framework for encoding clinical problems with HL7 FHIR. J Biomed Inform 2020;110:103541. [PMID: 32814201 DOI: 10.1016/j.jbi.2020.103541] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/17/2020] [Revised: 08/09/2020] [Accepted: 08/13/2020] [Indexed: 01/17/2023]

Kilicoglu H, Rosemblat G, Fiszman M, Shin D. Broad-coverage biomedical relation extraction with SemRep. BMC Bioinformatics 2020;21:188. [PMID: 32410573 PMCID: PMC7222583 DOI: 10.1186/s12859-020-3517-7] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 02/18/2020] [Accepted: 04/29/2020] [Indexed: 11/10/2022]

Abstract

BACKGROUND

In the era of information overload, natural language processing (NLP) techniques are increasingly needed to support advanced biomedical information management and discovery applications. In this paper, we present an in-depth description of SemRep, an NLP system that extracts semantic relations from PubMed abstracts using linguistic principles and UMLS domain knowledge. We also evaluate SemRep on two datasets. In one evaluation, we use a manually annotated test collection and perform a comprehensive error analysis. In another evaluation, we assess SemRep's performance on the CDR dataset, a standard benchmark corpus annotated with causal chemical-disease relationships.

RESULTS

A strict evaluation of SemRep on our manually annotated dataset yields 0.55 precision, 0.34 recall, and 0.42 F1 score. A relaxed evaluation, which more accurately characterizes SemRep performance, yields 0.69 precision, 0.42 recall, and 0.52 F1 score. An error analysis reveals named entity recognition/normalization as the largest source of errors (26.9%), followed by argument identification (14%) and trigger detection errors (12.5%). The evaluation on the CDR corpus yields 0.90 precision, 0.24 recall, and 0.38 F1 score. The recall and the F1 score increase to 0.35 and 0.50, respectively, when the evaluation on this corpus is limited to sentence-bound relationships, which represents a fairer evaluation, as SemRep operates at the sentence level.
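The figures in this RESULTS paragraph can be cross-checked with the standard definition of the F1 score as the harmonic mean of precision and recall. The short Python sketch below does that check; the function name `f1_score` and the dictionary labels are illustrative choices, and the numbers are the rounded values quoted above, so only agreement to two decimal places is expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as quoted in the abstract.
reported = {
    "strict, annotated test set": (0.55, 0.34, 0.42),
    "relaxed, annotated test set": (0.69, 0.42, 0.52),
    "CDR corpus": (0.90, 0.24, 0.38),
    "CDR corpus, sentence-bound": (0.90, 0.35, 0.50),
}

for label, (p, r, f1_reported) in reported.items():
    f1_computed = f1_score(p, r)
    # Inputs are rounded, so allow half a unit in the second decimal place.
    consistent = abs(f1_computed - f1_reported) < 0.005
    print(f"{label}: computed {f1_computed:.3f}, "
          f"reported {f1_reported:.2f}, consistent={consistent}")
```

Under these rounded inputs, each recomputed value agrees with the reported F1 score to two decimal places.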

CONCLUSIONS

SemRep is a broad-coverage, interpretable, strong baseline system for extracting semantic relations from biomedical text. It also underpins SemMedDB, a literature-scale knowledge graph based on semantic relations. Through SemMedDB, SemRep has had significant impact in the scientific community, supporting a variety of clinical and translational applications, including clinical decision making, medical diagnosis, drug repurposing, literature-based discovery and hypothesis generation, and contributing to improved health outcomes. In ongoing development, we are redesigning SemRep to increase its modularity and flexibility, and addressing weaknesses identified in the error analysis.

Pesaranghader A, Matwin S, Sokolova M, Pesaranghader A. deepBioWSD: effective deep neural word sense disambiguation of biomedical text data. J Am Med Inform Assoc 2020;26:438-446. [PMID: 30811548 DOI: 10.1093/jamia/ocy189] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 09/21/2018] [Revised: 12/03/2018] [Accepted: 12/19/2018] [Indexed: 01/05/2023]

Abstract

OBJECTIVE

In biomedicine, there is a wealth of information hidden in unstructured narratives such as research articles and clinical reports. To exploit these data properly, a word sense disambiguation (WSD) algorithm prevents downstream difficulties in the natural language processing applications pipeline. Supervised WSD algorithms largely outperform un- or semisupervised and knowledge-based methods; however, they train 1 separate classifier for each ambiguous term, necessitating a large number of expert-labeled training data, an unattainable goal in medical informatics. To alleviate this need, a single model that shares statistical strength across all instances and scales well with the vocabulary size is desirable.

MATERIALS AND METHODS

Built on recent advances in deep learning, our deepBioWSD model leverages 1 single bidirectional long short-term memory network that makes sense prediction for any ambiguous term. In the model, first, the Unified Medical Language System sense embeddings will be computed using their text definitions; and then, after initializing the network with these embeddings, it will be trained on all (available) training data collectively. This method also considers a novel technique for automatic collection of training data from PubMed to (pre)train the network in an unsupervised manner.

RESULTS

We use the MSH WSD dataset to compare WSD algorithms, with macro and micro accuracies employed as evaluation metrics. deepBioWSD outperforms existing models in biomedical text WSD by achieving the state-of-the-art performance of 96.82% for macro accuracy.

CONCLUSIONS

Apart from the disambiguation improvement and unsupervised training, deepBioWSD depends on considerably fewer expert-labeled data as it learns the target and the context terms jointly. These merits make deepBioWSD conveniently deployable in real-time biomedical applications.

Cullen CM, Aneja KK, Beyhan S, Cho CE, Woloszynek S, Convertino M, McCoy SJ, Zhang Y, Anderson MZ, Alvarez-Ponce D, Smirnova E, Karstens L, Dorrestein PC, Li H, Sen Gupta A, Cheung K, Powers JG, Zhao Z, Rosen GL. Emerging Priorities for Microbiome Research. Front Microbiol 2020;11:136. [PMID: 32140140 PMCID: PMC7042322 DOI: 10.3389/fmicb.2020.00136] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 08/13/2019] [Accepted: 01/21/2020] [Indexed: 12/12/2022]

Abstract

Microbiome research has increased dramatically in recent years, driven by advances in technology and significant reductions in the cost of analysis. Such research has unlocked a wealth of data, which has yielded tremendous insight into the nature of the microbial communities, including their interactions and effects, both within a host and in an external environment as part of an ecological community. Understanding the role of microbiota, including their dynamic interactions with their hosts and other microbes, can enable the engineering of new diagnostic techniques and interventional strategies that can be used in a diverse spectrum of fields, spanning from ecology and agriculture to medicine and from forensics to exobiology. From June 19-23 in 2017, the NIH and NSF jointly held an Innovation Lab on Quantitative Approaches to Biomedical Data Science Challenges in our Understanding of the Microbiome. This review is inspired by some of the topics that arose as priority areas from this unique, interactive workshop. The goal of this review is to summarize the Innovation Lab's findings by introducing the reader to emerging challenges, exciting potential, and current directions in microbiome research. The review is broken into five key topic areas: (1) interactions between microbes and the human body, (2) evolution and ecology of microbes, including the role played by the environment and microbe-microbe interactions, (3) analytical and mathematical methods currently used in microbiome research, (4) leveraging knowledge of microbial composition and interactions to develop engineering solutions, and (5) interventional approaches and engineered microbiota that may be enabled by selectively altering microbial composition. As such, this review seeks to arm the reader with a broad understanding of the priorities and challenges in microbiome research today and provide inspiration for future investigation and multi-disciplinary collaboration.

Affiliation(s)

Chad M. Cullen: School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, United States
Kawalpreet K. Aneja: The School District of Philadelphia, Philadelphia, PA, United States
Sinem Beyhan: Department of Infectious Diseases, J. Craig Venter Institute, La Jolla, CA, United States
Clara E. Cho: Department of Nutrition, Dietetics and Food Sciences, Utah State University, Logan, UT, United States
Stephen Woloszynek: Ecological and Evolutionary Signal-processing and Informatics Laboratory (EESI), Electrical and Computer Engineering, Drexel University, Philadelphia, PA, United States; College of Medicine, Drexel University, Philadelphia, PA, United States
Matteo Convertino: Nexus Group, Faculty of Information Science and Technology, Gi-CoRE Station for Big Data & Cybersecurity, Hokkaido University, Sapporo, Japan
Sophie J. McCoy: Department of Biological Science, Florida State University, Tallahassee, FL, United States
Yanyan Zhang: Department of Civil Engineering, New Mexico State University, Las Cruces, NM, United States
Matthew Z. Anderson: Department of Microbiology, The Ohio State University, Columbus, OH, United States; Department of Microbial Infection and Immunity, The Ohio State University, Columbus, OH, United States
David Alvarez-Ponce: Department of Biology, University of Nevada, Reno, Reno, NV, United States
Ekaterina Smirnova: Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, United States
Lisa Karstens: Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, United States; Department of Obstetrics and Gynecology, Oregon Health & Science University, Portland, OR, United States
Pieter C. Dorrestein: Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, United States
Hongzhe Li: Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
Ananya Sen Gupta: Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA, United States
Kevin Cheung: Department of Dermatology, The University of Iowa, Iowa City, IA, United States
Jennifer Gloeckner Powers: Department of Dermatology, The University of Iowa, Iowa City, IA, United States
Zhengqiao Zhao: Ecological and Evolutionary Signal-processing and Informatics Laboratory (EESI), Electrical and Computer Engineering, Drexel University, Philadelphia, PA, United States
Gail L. Rosen: School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, United States; Ecological and Evolutionary Signal-processing and Informatics Laboratory (EESI), Electrical and Computer Engineering, Drexel University, Philadelphia, PA, United States

Li Y, Jin R, Luo Y. Classifying relations in clinical narratives using segment graph convolutional and recurrent neural networks (Seg-GCRNs). J Am Med Inform Assoc 2019;26:262-268. [PMID: 30590613 DOI: 10.1093/jamia/ocy157] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 08/05/2018] [Accepted: 11/03/2018] [Indexed: 01/16/2023]

Rosemblat G, Fiszman M, Shin D, Kilicoglu H. Towards a characterization of apparent contradictions in the biomedical literature using context analysis. J Biomed Inform 2019;98:103275. [PMID: 31473364 DOI: 10.1016/j.jbi.2019.103275] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/07/2019] [Revised: 08/26/2019] [Accepted: 08/28/2019] [Indexed: 11/19/2022]

Abstract

BACKGROUND

With the substantial growth in the biomedical research literature, a larger number of claims are published daily, some of which seemingly disagree with or contradict prior claims on the same topics. Resolving such contradictions is critical to advancing our understanding of human disease and developing effective treatments. Automated text analysis techniques can facilitate such analysis by extracting claims from the literature, flagging those that are potentially contradictory, and identifying any study characteristics that may explain such contradictions.

METHODS

Using SemMedDB, our own PubMed-scale repository of semantic predications (subject-relation-object triples), we identified apparent contradictions in the biomedical research literature and developed a categorization of contextual characteristics that explain such contradictions. Clinically relevant semantic predications relating to 20 diseases and involving opposing predicate pairs (e.g., an intervention treats or causes a disease) were retrieved from SemMedDB. After addressing inference, uncertainty, generic concepts, and NLP errors through automatic and manual filtering steps, a set of apparent contradictions were identified and characterized.

RESULTS

We retrieved 117,676 predication instances from 62,360 PubMed abstracts (Jan 1980-Dec 2016). From these instances, automatic filtering steps generated 2236 candidate contradictory pairs. Through manual analysis, we determined that 58 of these pairs (2.6%) were apparent contradictions. We identified five main categories of contextual characteristics that explain these contradictions: (a) internal to the patient, (b) external to the patient, (c) endogenous/exogenous, (d) known controversy, and (e) contradictions in literature. Categories (a) and (b) were subcategorized further (e.g., species, dosage) and accounted for the bulk of the contradictory information.

CONCLUSIONS

Semantic predications, by accounting for lexical variability, and SemMedDB, owing to its literature scale, can support identification and elucidation of potentially contradictory claims across the biomedical domain. Further filtering and classification steps are needed to distinguish among them the true contradictory claims. The ability to detect contradictions automatically can facilitate important biomedical knowledge management tasks, such as tracking and verifying scientific claims, summarizing research on a given topic, identifying knowledge gaps, and assessing evidence for systematic reviews, with potential benefits to the scientific community. Future work will focus on automating these steps for fully automatic recognition of contradictions from the biomedical research literature.

Minimalistic Approach to Coreference Resolution in Lithuanian Medical Records. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2019;2019:9079840. [PMID: 31015858 PMCID: PMC6446105 DOI: 10.1155/2019/9079840] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/18/2019] [Accepted: 02/26/2019] [Indexed: 12/20/2022]

Assale M, Dui LG, Cina A, Seveso A, Cabitza F. The Revival of the Notes Field: Leveraging the Unstructured Content in Electronic Health Records. Front Med (Lausanne) 2019;6:66. [PMID: 31058150 PMCID: PMC6478793 DOI: 10.3389/fmed.2019.00066] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2018] [Accepted: 03/18/2019] [Indexed: 01/01/2023] Open

Luo Y, Cheng Y, Uzuner Ö, Szolovits P, Starren J. Segment convolutional neural networks (Seg-CNNs) for classifying relations in clinical notes. J Am Med Inform Assoc 2019;25:93-98. [PMID: 29025149 DOI: 10.1093/jamia/ocx090] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2017] [Accepted: 08/05/2017] [Indexed: 11/13/2022] Open

Zeng Z, Deng Y, Li X, Naumann T, Luo Y. Natural Language Processing for EHR-Based Computational Phenotyping. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019;16:139-153. [PMID: 29994486 PMCID: PMC6388621 DOI: 10.1109/tcbb.2018.2849968] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Li Y, Yao L, Mao C, Srivastava A, Jiang X, Luo Y. Early Prediction of Acute Kidney Injury in Critical Care Setting Using Clinical Notes. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2018;2018:683-686. [PMID: 33376624 PMCID: PMC7768909 DOI: 10.1109/bibm.2018.8621574] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Kilicoglu H. Biomedical text mining for research rigor and integrity: tasks, challenges, directions. Brief Bioinform 2018;19:1400-1414. [PMID: 28633401 PMCID: PMC6291799 DOI: 10.1093/bib/bbx057] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2017] [Revised: 04/10/2017] [Indexed: 01/01/2023] Open

Xing W, Qi J, Yuan X, Li L, Zhang X, Fu Y, Xiong S, Hu L, Peng J. A gene-phenotype relationship extraction pipeline from the biomedical literature using a representation learning approach. Bioinformatics 2018;34:i386-i394. [PMID: 29950017 PMCID: PMC6022650 DOI: 10.1093/bioinformatics/bty263] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023] Open

Bakal G, Talari P, Kakani EV, Kavuluru R. Exploiting semantic patterns over biomedical knowledge graphs for predicting treatment and causative relations. J Biomed Inform 2018;82:189-199. [PMID: 29763706 PMCID: PMC6070294 DOI: 10.1016/j.jbi.2018.05.003] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2017] [Revised: 01/31/2018] [Accepted: 05/09/2018] [Indexed: 01/27/2023]

Abstract

BACKGROUND

Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since not all candidate drugs can be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying different causal relations between biomedical entities is critical to understanding biomedical processes. Generally, natural language processing (NLP) and machine learning are used to predict specific relations between any given pair of entities using the distant supervision approach.

OBJECTIVE

To build high-accuracy supervised models that predict previously unknown treatment and causative relations between biomedical entities based only on semantic graph pattern features extracted from biomedical knowledge graphs.

METHODS

We used 7000 hand-curated treats relations and 2918 hand-curated causes relations from the UMLS Metathesaurus to train and test our models. Our graph pattern features are extracted from simple paths connecting biomedical entities in the SemMedDB graph (based on the well-known SemMedDB database made available by the U.S. National Library of Medicine). Using these graph patterns connecting biomedical entities as features of logistic regression and decision tree models, we computed mean performance measures (precision, recall, F-score) over 100 distinct 80-20% train-test splits of the datasets. For all experiments, we used a positive:negative class imbalance of 1:10 in the test set to model relatively more realistic scenarios.
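
As a rough illustration of the feature extraction described above, the sketch below counts the predicate sequences found along simple paths between two entities in a small labeled graph. It is a sketch under stated assumptions, not the authors' implementation: the function name `path_patterns`, the `max_len` cutoff, the treatment of edges as directed, and the toy graph are all hypothetical. The resulting counts could serve as input features for a classifier such as logistic regression.

```python
from collections import Counter


def path_patterns(edges, source, target, max_len=3):
    """Count predicate sequences along simple paths from source to target.

    edges is an iterable of (subject, predicate, object) triples, treated
    here as directed edges (an assumption). A simple path visits each node
    at most once. Paths longer than max_len edges are ignored.
    """
    adjacency = {}
    for subj, pred, obj in edges:
        adjacency.setdefault(subj, []).append((pred, obj))

    patterns = Counter()

    def walk(node, visited, preds):
        if node == target and preds:
            patterns[tuple(preds)] += 1  # one more path with this pattern
            return
        if len(preds) == max_len:
            return
        for pred, nxt in adjacency.get(node, []):
            if nxt not in visited:
                walk(nxt, visited | {nxt}, preds + [pred])

    walk(source, {source}, [])
    return patterns


# Toy graph, invented for illustration. A direct edge between the two
# entities is left out, on the assumption that the relation being
# predicted should not leak into its own features.
toy_edges = [
    ("drugA", "INHIBITS", "geneX"),
    ("geneX", "CAUSES", "diseaseY"),
    ("drugA", "INTERACTS_WITH", "drugB"),
    ("drugB", "TREATS", "diseaseY"),
]
features = path_patterns(toy_edges, "drugA", "diseaseY")
```

In this toy graph the pair (drugA, diseaseY) yields two patterns, INHIBITS followed by CAUSES and INTERACTS_WITH followed by TREATS, each seen once.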

RESULTS

Our models predict treats and causes relations with high F-scores of 99% and 90%, respectively. Logistic regression model coefficients also help us identify highly discriminative patterns that have an intuitive interpretation. In collaboration with two physician co-authors, we were also able to identify some new plausible relations among the false positives that our models scored highly. Finally, our decision tree models are able to retrieve over 50% of treatment relations from a recently created external dataset.

CONCLUSIONS

We employed semantic graph patterns connecting pairs of candidate biomedical entities in a knowledge graph as features to predict treatment/causative relations between them. We provide what we believe is the first evidence for direct prediction of biomedical relations based on graph features. Our work complements lexical pattern-based approaches in that the graph patterns can be used as additional features for weakly supervised relation prediction.

Maldonado R, Goodwin TR, Harabagiu SM. Memory-Augmented Active Deep Learning for Identifying Relations Between Distant Medical Concepts in Electroencephalography Reports. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2018;2017:156-165. [PMID: 29888063 PMCID: PMC5961777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/02/2022]

Luo Y. Recurrent neural networks for classifying relations in clinical notes. J Biomed Inform 2017;72:85-95. [PMID: 28694119 PMCID: PMC6657689 DOI: 10.1016/j.jbi.2017.07.006] [Citation(s) in RCA: 85] [Impact Index Per Article: 12.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2017] [Revised: 06/13/2017] [Accepted: 07/06/2017] [Indexed: 01/16/2023]

Eftimov T, Koroušić Seljak B, Korošec P. A rule-based named-entity recognition method for knowledge extraction of evidence-based dietary recommendations. PLoS One 2017. [PMID: 28644863 PMCID: PMC5482438 DOI: 10.1371/journal.pone.0179488] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open

Natural Language Processing for EHR-Based Pharmacovigilance: A Structured Review. Drug Saf 2017. [DOI: 10.1007/s40264-017-0558-6] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]

Zhang Y, Jiang M, Wang J, Xu H. Semantic Role Labeling of Clinical Text: Comparing Syntactic Parsers and Features. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2017;2016:1283-1292. [PMID: 28269926 PMCID: PMC5333340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Luo Y, Szolovits P. Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records. BIOMEDICAL INFORMATICS INSIGHTS 2016;8:29-38. [PMID: 27478379 PMCID: PMC4954589 DOI: 10.4137/bii.s38916] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/04/2016] [Revised: 06/13/2016] [Accepted: 06/22/2016] [Indexed: 11/07/2022]