Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: He Z, Chen Z, Oh S, Hou J, Bian J. Enriching consumer health vocabulary through mining a social Q&A site: A similarity-based approach. J Biomed Inform 2017;69:75-85. [PMID: 28359728 DOI: 10.1016/j.jbi.2017.03.016] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 11/17/2016] [Revised: 03/21/2017] [Accepted: 03/24/2017] [Indexed: 11/29/2022]

Cited by Other Article(s)

Lin AY, Arabandi S, Beale T, Duncan WD, Hicks A, Hogan WR, Jensen M, Koppel R, Martínez-Costa C, Nytrø Ø, Obeid JS, de Oliveira JP, Ruttenberg A, Seppälä S, Smith B, Soergel D, Zheng J, Schulz S. Improving the Quality and Utility of Electronic Health Record Data through Ontologies. Standards (Basel, Switzerland) 2023;3:316-340. [PMID: 37873508 PMCID: PMC10591519 DOI: 10.3390/standards3030023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]

Affiliation(s)

Asiyah Yu Lin: National Institutes of Health, Bethesda, MD 20892, USA
Sivaram Arabandi: ONTOPRO, Houston, TX 77025, USA
Thomas Beale: Ars Semantica Ltd., London W4 1PQ, UK
William D. Duncan: College of Dentistry, University of Florida, Gainesville, FL 32610, USA
Amanda Hicks: The Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723, USA
William R. Hogan: Data Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Mark Jensen: CUBRC Inc., Buffalo, NY 14225, USA
Ross Koppel: Department of Medical Informatics, Jacobs School of Medicine, University at Buffalo, Buffalo, NY 14260, USA; Department of Medical Informatics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Catalina Martínez-Costa: Department of Informatics and Systems, Faculty of Computer Science, University of Murcia, 30100 Murcia, Spain
Øystein Nytrø: Department of Computer Science, UIT Arctic University of Norway, 9037 Tromsø, Norway; Department of Computer Science, Norwegian University of Science and Technology, 7491 Trondheim, Norway
Jihad S. Obeid: Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC 29425, USA
Jose Parente de Oliveira: Aeronautics Institute of Technology, São José dos Campos 12228-900, Brazil
Alan Ruttenberg: School of Dental Medicine, University at Buffalo, Buffalo, NY 14260, USA
Selja Seppälä: Department of Business Information Systems, University College Cork, T12 K8AF Cork, Ireland
Barry Smith: Department of Philosophy, University at Buffalo, Buffalo, NY 14260, USA
Dagobert Soergel: Department of Philosophy, University at Buffalo, Buffalo, NY 14260, USA
Jie Zheng: Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI 48104, USA
Stefan Schulz: Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, 8036 Graz, Austria; Averbis GmbH, Salzstrasse 15, 79098 Freiburg im Breisgau, Germany

van Mens HJ, Martens SS, Paiman EH, Mertens AC, Nienhuis R, de Keizer NF, Cornet R. Diagnosis clarification by generalization to patient-friendly terms and definitions: Validation study. J Biomed Inform 2022;129:104071. [DOI: 10.1016/j.jbi.2022.104071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Revised: 03/12/2022] [Accepted: 04/05/2022] [Indexed: 11/16/2022]

Newman-Griffis D, Divita G, Desmet B, Zirikly A, Rosé CP, Fosler-Lussier E. Ambiguity in medical concept normalization: An analysis of types and coverage in electronic health record datasets. J Am Med Inform Assoc 2021;28:516-532. [PMID: 33319905 DOI: 10.1093/jamia/ocaa269] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/11/2020] [Revised: 09/13/2020] [Accepted: 11/17/2020] [Indexed: 12/18/2022]

Abstract

OBJECTIVES

Normalizing mentions of medical concepts to standardized vocabularies is a fundamental component of clinical text analysis. Ambiguity (words or phrases that may refer to different concepts) has been extensively researched as part of information extraction from biomedical literature, but less is known about the types and frequency of ambiguity in clinical text. This study characterizes the distribution and distinct types of ambiguity exhibited by benchmark clinical concept normalization datasets, in order to identify directions for advancing medical concept normalization research.

MATERIALS AND METHODS

We identified ambiguous strings in datasets derived from the 2 available clinical corpora for concept normalization and categorized the distinct types of ambiguity they exhibited. We then compared observed string ambiguity in the datasets with potential ambiguity in the Unified Medical Language System (UMLS) to assess how representative available datasets are of ambiguity in clinical language.

RESULTS

We found that <15% of strings were ambiguous within the datasets, while over 50% were ambiguous in the UMLS, indicating only partial coverage of clinical ambiguity. The percentage of strings in common between any pair of datasets ranged from 2% to only 36%; of these, 40% were annotated with different sets of concepts, severely limiting generalization. Finally, we observed 12 distinct types of ambiguity, distributed unequally across the available datasets, reflecting diverse linguistic and medical phenomena.
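
The two measurements reported above can be illustrated with a minimal sketch. It assumes each dataset is a list of (string, concept) annotation pairs, counts a string as ambiguous when it is annotated with more than one concept, and measures overlap between two datasets as shared strings over the union of their strings. The overlap definition, the toy data, and the concept labels are assumptions made for illustration and are not taken from the paper.

```python
def ambiguous_fraction(annotations):
    """Share of distinct strings annotated with more than one concept."""
    concepts_by_string = {}
    for string, concept in annotations:
        concepts_by_string.setdefault(string.lower(), set()).add(concept)
    ambiguous = [s for s, c in concepts_by_string.items() if len(c) > 1]
    return len(ambiguous) / len(concepts_by_string)


def shared_string_fraction(dataset_a, dataset_b):
    """Share of distinct strings (over the union) found in both datasets."""
    strings_a = {s.lower() for s, _ in dataset_a}
    strings_b = {s.lower() for s, _ in dataset_b}
    return len(strings_a & strings_b) / len(strings_a | strings_b)


# Hypothetical toy annotations; the concept labels are placeholders.
dataset_1 = [
    ("cold", "COMMON_COLD"),
    ("cold", "LOW_TEMPERATURE"),
    ("fever", "FEVER"),
    ("discharge", "BODY_FLUID_DISCHARGE"),
]
dataset_2 = [
    ("cold", "COMMON_COLD"),
    ("discharge", "HOSPITAL_DISCHARGE"),
    ("rash", "RASH"),
]

print(ambiguous_fraction(dataset_1))
print(shared_string_fraction(dataset_1, dataset_2))
```

In this toy data, "discharge" appears in both datasets with different concepts, which is the kind of disagreement the abstract describes as limiting generalization.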

DISCUSSION

Existing datasets are not sufficient to cover the diversity of clinical concept ambiguity, limiting both training and evaluation of normalization methods for clinical text. Additionally, the UMLS offers important semantic information for building and evaluating normalization methods.

CONCLUSIONS

Our findings identify 3 opportunities for concept normalization research, including a need for ambiguity-specific clinical datasets and leveraging the rich semantics of the UMLS in new methods and evaluation measures for normalization.

Ibrahim M, Gauch S, Salman O, Alqahtani M. An automated method to enrich consumer health vocabularies using GloVe word embeddings and an auxiliary lexical resource. PeerJ Comput Sci 2021;7:e668. [PMID: 34458573 PMCID: PMC8371999 DOI: 10.7717/peerj-cs.668] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/20/2021] [Accepted: 07/19/2021] [Indexed: 06/13/2023]

Abstract

BACKGROUND

Clear language makes communication easier between any two parties. A layman may have difficulty communicating with a professional due to not understanding the specialized terms common to the domain. In healthcare, it is rare to find a layman knowledgeable in medical terminology, which can lead to a poor understanding of their condition and/or treatment. To bridge this gap, several professional vocabularies and ontologies have been created to map laymen medical terms to professional medical terms and vice versa.

OBJECTIVE

Many of the presented vocabularies are built manually or semi-automatically, requiring large investments of time and human effort, and consequently these vocabularies grow slowly. In this paper, we present an automatic method to enrich laymen's vocabularies that can also be applied to vocabularies in any domain.

METHODS

Our entirely automatic approach uses machine learning, specifically Global Vectors for Word Embeddings (GloVe), on a corpus collected from a social media healthcare platform to extend and enhance consumer health vocabularies. Our approach further improves the consumer health vocabularies by incorporating synonyms and hyponyms from the WordNet ontology. The basic GloVe and our novel algorithms incorporating WordNet were evaluated using two laymen datasets from the National Library of Medicine (NLM), Open-Access Consumer Health Vocabulary (OAC CHV) and MedlinePlus Healthcare Vocabulary.

RESULTS

The results show that GloVe was able to find new laymen terms with an F-score of 48.44%. Furthermore, our enhanced GloVe approach outperformed basic GloVe with an average F-score of 61%, a relative improvement of 25%. The improvement of the enhanced GloVe was also statistically significant on the two ground truth datasets (P < 0.001).
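
As a quick check on the arithmetic, the sketch below computes an F-score from precision and recall, and the relative improvement between two scores. The function names are illustrative, and the only figures used are the two F-scores quoted in the abstract. With those rounded inputs the relative gain comes out near 26%, close to the reported 25%.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline, improved):
    """Gain of `improved` over `baseline`, as a fraction of the baseline."""
    return (improved - baseline) / baseline


baseline_f = 48.44  # basic GloVe F-score (%), as quoted in the abstract
enhanced_f = 61.0   # enhanced GloVe average F-score (%), as quoted

gain = relative_improvement(baseline_f, enhanced_f)
print(f"relative improvement: {gain:.1%}")
```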

CONCLUSIONS

This paper presents an automatic approach to enrich consumer health vocabularies using the GloVe word embeddings and an auxiliary lexical source, WordNet. Our approach was evaluated using healthcare text downloaded from MedHelp.org, a healthcare social media platform, and two standard laymen vocabularies, OAC CHV and MedlinePlus. We used the WordNet ontology to expand the healthcare corpus by including synonyms, hyponyms, and hypernyms for each layman term occurrence in the corpus. Given a seed term selected from a concept in the ontology, we measured our algorithms' ability to automatically extract synonyms for those terms that appeared in the ground truth concept. We found that enhanced GloVe outperformed GloVe with a relative improvement of 25% in the F-score.

Li P, Xu L, Tang T, Wu X, Huang C. Users' Willingness to Share Health Information in a Social Question-and-Answer Community: Cross-sectional Survey in China. JMIR Med Inform 2021;9:e26265. [PMID: 33783364 PMCID: PMC8075348 DOI: 10.2196/26265] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/04/2020] [Revised: 02/10/2021] [Accepted: 03/07/2021] [Indexed: 11/13/2022]

Sarker A, DeRoos A, Perrone J. Mining social media for prescription medication abuse monitoring: a review and proposal for a data-centric framework. J Am Med Inform Assoc 2021;27:315-329. [PMID: 31584645 PMCID: PMC7025330 DOI: 10.1093/jamia/ocz162] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/09/2019] [Revised: 08/14/2019] [Indexed: 01/02/2023]

Fodeh SJ, Al-Garadi M, Elsankary O, Perrone J, Becker W, Sarker A. Utilizing a multi-class classification approach to detect therapeutic and recreational misuse of opioids on Twitter. Comput Biol Med 2020;129:104132. [PMID: 33290931 DOI: 10.1016/j.compbiomed.2020.104132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/20/2020] [Revised: 11/10/2020] [Accepted: 11/16/2020] [Indexed: 10/23/2022]

Abstract

BACKGROUND

Opioid misuse (OM) is a major health problem in the United States, and can lead to addiction and fatal overdose. We sought to employ natural language processing (NLP) and machine learning to categorize Twitter chatter based on the motive of OM.

MATERIALS AND METHODS

We collected data from Twitter using opioid-related keywords, and manually annotated 6988 tweets into three classes (No-OM, Pain-related-OM, and Recreational-OM), with the No-OM class representing tweets indicating no use/misuse, and the Pain-related misuse and Recreational-misuse classes representing misuse for pain or recreation/addiction. We trained and evaluated multi-class classifiers, and performed term-level k-means clustering to assess whether there were terms closely associated with the three classes.

RESULTS

On a held-out test set of 1677 tweets, a transformer-based classifier (XLNet) achieved the best performance with F₁-score of 0.71 for the Pain-misuse class, and 0.79 for the Recreational-misuse class. Macro- and micro-averaged F₁-scores over all classes were 0.82 and 0.92, respectively. Content-analysis using clustering revealed distinct clusters of terms associated with each class.
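
The gap between the macro- and micro-averaged scores is typical when one class dominates. The sketch below shows how the two averages are computed from per-class counts. It is a plain-Python illustration on made-up labels, not the authors' evaluation code, and the toy predictions are hypothetical.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """True positives, false positives, and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 from raw counts; 0.0 when the class never occurs or is predicted."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred):
    tp, fp, fn = per_class_counts(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    # Macro: average the per-class F1 scores, so every class weighs the same.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts first, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical, heavily imbalanced toy labels.
y_true = ["No-OM"] * 8 + ["Pain-related-OM", "Recreational-OM"]
y_pred = ["No-OM"] * 8 + ["No-OM", "Recreational-OM"]

macro, micro = macro_micro_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

Because the micro average pools all decisions, a classifier that does well on the majority class can score high on it even when a minority class is missed, which is why both averages are usually reported together.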

DISCUSSION

While some past studies have attempted to automatically detect opioid misuse, none have further characterized the motive for misuse. Our multi-class classification approach using XLNet showed promising performance, including in detecting the subtle differences between pain-related and recreation-related misuse. The distinct clustering of class-specific keywords may help conduct targeted data collection, overcoming under-representation of minority classes.

CONCLUSION

Machine learning can help identify pain-related and recreational-related OM contents on Twitter to potentially enable the study of the characteristics of individuals exhibiting such behavior.

An Automatic Approach to Extending the Consumer Health Vocabulary. Journal of Data and Information Science 2020. [DOI: 10.2478/jdis-2021-0003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]

Abstract

Purpose

Given the ubiquitous presence of the internet in our lives, many individuals turn to the web for medical information. A challenge here is that many laypersons (as “consumers”) do not use professional terms found in the medical nomenclature when describing their conditions and searching the internet. The Consumer Health Vocabulary (CHV) ontology, initially developed in 2007, aimed to bridge this gap, although updates have been limited over the last decade. The purpose of this research is to implement a means of automatically creating a hierarchical consumer health vocabulary. This overall purpose is improving consumers’ ability to search for medical conditions and symptoms with an enhanced CHV and improving the search capabilities of our searching and indexing tool HIVE (Helping Interdisciplinary Vocabulary Engineering).

Design/methodology/approach

The research design uses ontological fusion, an approach for automatically extracting and integrating the Medical Subject Headings (MeSH) ontology into CHV, and further convert CHV from a flat mapping to a hierarchical ontology. The additional relationships and parent terms from MeSH allow us to uncover relationships between existing terms in the CHV ontology as well. The research design also included improving the search capabilities of HIVE identifying alternate relationships and consolidating them to a single entry.

Findings

The key findings are an improved CHV with a hierarchical structure that enables consumers to search through the ontology and uncover more relationships.

Research limitations

There are some cases where the improved search results in HIVE return terms that are related but not completely synonymous. We present an example and discuss the implications of this result.

Practical implications

This research makes available an updated and richer CHV ontology using the HIVE tool. Consumers may use this tool to search consumer terminology for medical conditions and symptoms. The HIVE tool will return results about the medical term linked with the consumer term as well as the hierarchy of other medical terms connected to the term.

Originality/value

This is a first attempt in over a decade to improve and enhance the CHV ontology with current terminology and the first research effort to convert CHV's original flat ontology structure to a hierarchical structure. This research also enhances the HIVE infrastructure and provides consumers with a simple, efficient mechanism for searching the CHV ontology and providing meaningful data to consumers.

Wu DTY, Xin C, Bindhu S, Xu C, Sachdeva J, Brown JL, Jung H. Clinician Perspectives and Design Implications in Using Patient-Generated Health Data to Improve Mental Health Practices: Mixed Methods Study. JMIR Form Res 2020;4:e18123. [PMID: 32763884 PMCID: PMC7442947 DOI: 10.2196/18123] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/04/2020] [Revised: 05/25/2020] [Accepted: 06/15/2020] [Indexed: 01/10/2023]

Abstract

Background

Patient-generated health data (PGHD) have been largely collected through mobile health (mHealth) apps and wearable devices. PGHD can be especially helpful in mental health, as patients’ illness history and symptom narratives are vital to developing diagnoses and treatment plans. However, the extent to which clinicians use mental health–related PGHD is unknown.

Objective

A mixed methods study was conducted to understand clinicians’ perspectives on PGHD and current mental health apps. This approach uses information gathered from semistructured interviews, workflow analysis, and user-written mental health app reviews to answer the following research questions: (1) What is the current workflow of mental health practice and how are PGHD integrated into this workflow, (2) what are clinicians’ perspectives on PGHD and how do they choose mobile apps for their patients, and (3) what are the features of current mobile apps in terms of interpreting and sharing PGHD?

Methods

The study consists of semistructured interviews with 12 psychiatrists and clinical psychologists from a large academic hospital. These interviews were thematically and qualitatively analyzed for common themes and workflow elements. User-posted reviews of 56 sleep and mood tracking apps were analyzed to understand app features in comparison with the information gathered from interviews.

Results

The results showed that PGHD have been part of the workflow, but their integration and use are not optimized. Mental health clinicians supported the use of PGHD but had concerns regarding data reliability and accuracy. They also identified challenges in selecting suitable apps for their patients. From the app review, it was discovered that mHealth apps had limited features to support personalization and collaborative care as well as data interpretation and sharing.

Conclusions

This study investigated clinicians’ perspectives on PGHD use and explored existing app features using the app review data in the mental health setting. A total of 3 design guidelines were generated: (1) improve data interpretation and sharing mechanisms, (2) consider clinical workflow and electronic health record integration, and (3) support personalized and collaborative care. More research is needed to demonstrate the best practices of PGHD use and to evaluate their effectiveness in improving patient outcomes.

Yu B, He Z, Xing A, Lustria MLA. An Informatics Framework to Assess Consumer Health Language Complexity Differences: Proof-of-Concept Study. J Med Internet Res 2020;22:e16795. [PMID: 32436849 PMCID: PMC7273233 DOI: 10.2196/16795] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/31/2019] [Revised: 01/21/2020] [Accepted: 02/21/2020] [Indexed: 11/23/2022]

Abstract

Background

The language gap between health consumers and health professionals has been long recognized as the main hindrance to effective health information comprehension. Although providing health information access in consumer health language (CHL) is widely accepted as the solution to the problem, health consumers are found to have varying health language preferences and proficiencies. To simplify health documents for heterogeneous consumer groups, it is important to quantify how CHLs are different in terms of complexity among various consumer groups.

Objective

This study aimed to propose an informatics framework (consumer health language complexity [CHELC]) to assess the complexity differences of CHL using syntax-level, text-level, term-level, and semantic-level complexity metrics. Specifically, we identified 8 language complexity metrics validated in previous literature and combined them into a 4-faceted framework. Through a rank-based algorithm, we developed unifying scores (CHELC scores [CHELCS]) to quantify syntax-level, text-level, term-level, semantic-level, and overall CHL complexity. We applied CHELCS to compare posts of each individual on online health forums designed for (1) the general public, (2) deaf and hearing-impaired people, and (3) people with autism spectrum disorder (ASD).

Methods

We examined posts with more than 4 sentences of each user from 3 health forums to understand CHL complexity differences among these groups: 12,560 posts from 3756 users in Yahoo! Answers, 25,545 posts from 1623 users in AllDeaf, and 26,484 posts from 2751 users in Wrong Planet. We calculated CHELCS for each user and compared the scores of 3 user groups (ie, deaf and hearing-impaired people, people with ASD, and the public) through 2-sample Kolmogorov-Smirnov tests and analysis of covariance tests.
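
For readers unfamiliar with the 2-sample Kolmogorov-Smirnov test, the sketch below computes its test statistic: the largest vertical gap between the empirical cumulative distribution functions of two samples. It is a plain-Python illustration on hypothetical scores and does not compute a p-value. In practice a library routine such as `scipy.stats.ks_2samp` would normally be used.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical per-user complexity scores for two forums.
forum_1_scores = [1, 2, 3, 4]
forum_2_scores = [3, 4, 5, 6]

print(ks_statistic(forum_1_scores, forum_2_scores))
```

A statistic of 0 means the two empirical distributions coincide, and a statistic of 1 means the samples do not overlap at all.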

Results

The results suggest that users in the public forum used more complex CHL, particularly more diverse semantics and more complex health terms compared with users in the ASD and deaf and hearing-impaired user forums. However, between the latter 2 groups, people with ASD used more complex words, and deaf and hearing-impaired users used more complex syntax.

Conclusions

Our results show that the users in 3 online forums had significantly different CHL complexities in different facets. The proposed framework and detailed measurements help to quantify these CHL complexity differences comprehensively. The results emphasize the importance of tailoring health-related content for different consumer groups with varying CHL complexities.

Khaleghi T, Murat A, Arslanturk S, Davies E. Automated Surgical Term Clustering: A Text Mining Approach for Unstructured Textual Surgery Descriptions. IEEE J Biomed Health Inform 2019;24:2107-2118. [PMID: 31796420 DOI: 10.1109/jbhi.2019.2956973] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]

Zhang Z, Lu Y, Kou Y, Wu DTY, Huh-Yoo J, He Z. Understanding Patient Information Needs About Their Clinical Laboratory Results: A Study of Social Q&A Site. Stud Health Technol Inform 2019;264:1403-1407. [PMID: 31438157 DOI: 10.3233/shti190458] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/15/2022]

Rizvi RF, Wang Y, Nguyen T, Vasilakes J, Bian J, He Z, Zhang R. Analyzing Social Media Data to Understand Consumer Information Needs on Dietary Supplements. Stud Health Technol Inform 2019;264:323-327. [PMID: 31437938 PMCID: PMC6792048 DOI: 10.3233/shti190236] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/18/2023]

Gu G, Zhang X, Zhu X, Jian Z, Chen K, Wen D, Gao L, Zhang S, Wang F, Ma H, Lei J. Development of a Consumer Health Vocabulary by Mining Health Forum Texts Based on Word Embedding: Semiautomatic Approach. JMIR Med Inform 2019;7:e12704. [PMID: 31124461 PMCID: PMC6552449 DOI: 10.2196/12704] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/06/2018] [Revised: 03/19/2019] [Accepted: 04/05/2019] [Indexed: 12/31/2022]

Abstract

Background

The vocabulary gap between consumers and professionals in the medical domain hinders information seeking and communication. Consumer health vocabularies have been developed to aid such informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language.

Objective

Our objective is to develop a method for identifying and adding new terms to consumer health vocabularies, so that they can keep up with the constantly evolving medical knowledge and language use.

Methods

In this paper, we propose a consumer health term–finding framework based on a distributed word vector space model. We first learned word vectors from a large-scale text corpus and then adopted a supervised method with existing consumer health vocabularies for learning vector representation of words, which can provide additional supervised fine tuning after unsupervised word embedding learning. With a fine-tuned word vector space, we identified pairs of professional terms and their consumer variants by their semantic distance in the vector space. A subsequent manual review of the extracted and labeled pairs of entities was conducted to validate the results generated by the proposed approach. The results were evaluated using mean reciprocal rank (MRR).
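
Mean reciprocal rank can be sketched as follows, using the standard definition: for each query, take the reciprocal of the rank at which the correct counterpart appears, then average over queries. The term pairs and rankings below are hypothetical, and this standard score lies between 0 and 1, so it is not on the same scale as rank positions.

```python
def mean_reciprocal_rank(ranked_candidates, gold):
    """Average of 1/rank of the gold answer; a miss contributes 0."""
    total = 0.0
    for query, candidates in ranked_candidates.items():
        target = gold[query]
        if target in candidates:
            rank = candidates.index(target) + 1  # ranks start at 1
            total += 1.0 / rank
    return total / len(ranked_candidates)


# Hypothetical nearest-neighbor rankings for two professional terms.
ranked_candidates = {
    "myocardial infarction": ["heart attack", "stroke", "angina"],
    "hypertension": ["diabetes", "high blood pressure", "cholesterol"],
}
gold = {
    "myocardial infarction": "heart attack",
    "hypertension": "high blood pressure",
}

print(mean_reciprocal_rank(ranked_candidates, gold))
```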

Results

Manual evaluation showed that it is feasible to identify alternative medical concepts by using professional or consumer concepts as queries in the word vector space without fine tuning, but the results are more promising in the final fine-tuned word vector space. The MRR values indicated that, on average, a professional or consumer concept is about 14th closest to its counterpart in the word vector space without fine tuning, and the MRR in the final fine-tuned word vector space is 8. Furthermore, the results demonstrate that our method can collect abbreviations and common typos frequently used by consumers.

Conclusions

By integrating a large amount of text information and existing consumer health vocabularies, our method outperformed several baseline ranking methods and is effective for generating a list of candidate terms for human review during consumer health vocabulary development.

Denecke K, Gabarron E, Grainger R, Konstantinidis ST, Lau A, Rivera-Romero O, Miron-Shatz T, Merolli M. Artificial Intelligence for Participatory Health: Applications, Impact, and Future Implications. Yearb Med Inform 2019;28:165-173. [PMID: 31022749 PMCID: PMC6697496 DOI: 10.1055/s-0039-1677902] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/14/2022]

Abstract

Objective: Artificial intelligence (AI) provides people and professionals working in the field of participatory health informatics an opportunity to derive robust insights from a variety of online sources. The objective of this paper is to identify the current state of the art and application areas of AI in the context of participatory health.

Methods: A search was conducted across seven databases (PubMed, Embase, CINAHL, PsychInfo, ACM Digital Library, IEEExplore, and SCOPUS), covering articles published since 2013. Additionally, clinical trials involving AI in participatory health contexts registered at clinicaltrials.gov were collected and analyzed.

Results: Twenty-two articles and 12 trials were selected for review. The most common application of AI in participatory health was the secondary analysis of social media data: self-reported data including patient experiences with healthcare facilities, reports of adverse drug reactions, safety and efficacy concerns about over-the-counter medications, and other perspectives on medications. Other application areas included determining which online forum threads required moderator assistance, identifying users who were likely to drop out from a forum, extracting terms used in an online forum to learn its vocabulary, highlighting contextual information that is missing from online questions and answers, and paraphrasing technical medical terms for consumers.

Conclusions: While AI for supporting participatory health is still in its infancy, there are a number of important research priorities that should be considered for the advancement of the field. Further research evaluating the impact of AI in participatory health informatics on the psychosocial wellbeing of individuals would help in facilitating the wider acceptance of AI into the healthcare ecosystem.

He Z, Keloth VK, Chen Y, Geller J. Extended Analysis of Topological-Pattern-Based Ontology Enrichment. Proceedings. IEEE International Conference on Bioinformatics and Biomedicine 2019;2018:1641-1648. [PMID: 30854243 DOI: 10.1109/bibm.2018.8621564] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]

Chen Z, He Z, Liu X, Bian J. Evaluating semantic relations in neural word embeddings with biomedical and general domain knowledge bases. BMC Med Inform Decis Mak 2018;18:65. [PMID: 30066651 PMCID: PMC6069806 DOI: 10.1186/s12911-018-0630-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/10/2022]

Abstract

BACKGROUND

In the past few years, neural word embeddings have been widely used in text mining. However, the vector representations of word embeddings mostly act as a black box in downstream applications using them, thereby limiting their interpretability. Even though word embeddings are able to capture semantic regularities in free text documents, it is not clear how different kinds of semantic relations are represented by word embeddings and how semantically-related terms can be retrieved from word embeddings.

METHODS

To improve the transparency of word embeddings and the interpretability of the applications using them, in this study, we propose a novel approach for evaluating the semantic relations in word embeddings using external knowledge bases: Wikipedia, WordNet and Unified Medical Language System (UMLS). We trained multiple word embeddings using health-related articles in Wikipedia and then evaluated their performance in the analogy and semantic relation term retrieval tasks. We also assessed if the evaluation results depend on the domain of the textual corpora by comparing the embeddings of health-related Wikipedia articles with those of general Wikipedia articles.
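
The analogy task mentioned above is commonly framed as vector arithmetic: for "a is to b as c is to ?", the query vector b - a + c is compared with every other word by cosine similarity. The sketch below uses tiny hand-made vectors, so the words, dimensions, and values are hypothetical and stand in for trained embeddings. It is not the authors' evaluation code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(vectors, query, exclude=(), k=1):
    """The k words whose vectors are most similar to the query vector."""
    scored = [
        (cosine(query, vec), word)
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def analogy(vectors, a, b, c, k=1):
    """Answer 'a is to b as c is to ?' using the offset b - a + c."""
    query = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(vectors, query, exclude={a, b, c}, k=k)


# Hypothetical 3-dimensional toy embeddings.
toy_vectors = {
    "heart": [1.0, 0.0, 0.0],
    "cardiologist": [1.0, 0.0, 1.0],
    "skin": [0.0, 1.0, 0.0],
    "dermatologist": [0.0, 1.0, 1.0],
    "fever": [0.2, 0.2, 0.1],
}

print(analogy(toy_vectors, "heart", "cardiologist", "skin"))
```

Calling `nearest` directly with a word's own vector gives the plain nearest-neighbor retrieval that the semantic relation task relies on.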

RESULTS

Regarding the retrieval of semantic relations, we were able to retrieve diverse semantic relations in the nearest neighbors of a given word. Meanwhile, the two popular word embedding approaches, Word2vec and GloVe, obtained comparable results on both the analogy retrieval task and the semantic relation retrieval task, while dependency-based word embeddings had much worse performance in both tasks. We also found that the word embeddings trained with health-related Wikipedia articles obtained better performance in the health-related relation retrieval tasks than those trained with general Wikipedia articles.

CONCLUSION

It is evident from this study that word embeddings can group terms with diverse semantic relations together. The domain of the training corpus does have an impact on the semantic relations represented by word embeddings. We thus recommend using a domain-specific corpus to train word embeddings for domain-specific text mining tasks.

Watson J. Social Media Use in Cancer Care. Semin Oncol Nurs 2018;34:126-131. [PMID: 29622519 DOI: 10.1016/j.soncn.2018.03.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/19/2022]

Gu H, He Z, Wei D, Elhanan G, Chen Y. Validating UMLS Semantic Type Assignments Using SNOMED CT Semantic Tags. Methods Inf Med 2018;57:43-53. [PMID: 29621830 DOI: 10.3414/me17-01-0120] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]

Abstract

BACKGROUND

The UMLS assigns semantic types to all its integrated concepts. The semantic types are widely used in various natural language processing tasks in the biomedical domain, such as named entity recognition, semantic disambiguation, and semantic annotation. Due to the size of the UMLS, erroneous semantic type assignments are hard to detect. It is imperative to devise automated techniques to identify errors and inconsistencies in semantic type assignments.

OBJECTIVES

To design a methodology that performs programmatic checks to detect semantic type assignment errors for UMLS concepts with one or more SNOMED CT terms, and to evaluate concepts in a selected set of SNOMED CT hierarchies to verify our hypothesis that UMLS semantic type assignment errors may exist in concepts residing in semantically inconsistent groups.

METHODS

Our methodology is a four-stage process: 1) partitioning concepts in a SNOMED CT hierarchy into semantically uniform groups based on their assigned semantic tags; 2) partitioning concepts in each group from 1) into disjoint sub-groups based on their semantic type assignments; 3) mapping all SNOMED CT semantic tags into one or more semantic types in the UMLS; 4) identifying semantically inconsistent groups that have inconsistent assignments between semantic tags and semantic types according to the mapping from 3) and providing concepts in such groups to the domain experts for review.
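
The four stages can be sketched roughly as follows. The sketch assumes each concept is a small record with an identifier, one semantic tag, and a set of semantic types, and it treats a sub-group as inconsistent when its types are not all allowed by the tag-to-type mapping. That consistency rule, the record layout, and every name and value below are assumptions made for illustration. The abstract does not spell out the authors' exact criterion.

```python
from collections import defaultdict


def find_inconsistent_groups(concepts, tag_to_types):
    """Return {(tag, types): [concept ids]} for sub-groups that look inconsistent."""
    # Stage 1: group concepts by their semantic tag.
    by_tag = defaultdict(list)
    for concept in concepts:
        by_tag[concept["tag"]].append(concept)

    flagged = {}
    for tag, group in by_tag.items():
        # Stage 2: split each tag group by its semantic type assignment.
        by_types = defaultdict(list)
        for concept in group:
            by_types[frozenset(concept["types"])].append(concept["id"])

        # Stage 3: look up the semantic types this tag is mapped to.
        allowed = tag_to_types.get(tag, set())

        # Stage 4: flag sub-groups whose types fall outside the mapping
        # (assumed rule), so that experts can review those concepts.
        for types, ids in by_types.items():
            if not types <= allowed:
                flagged[(tag, types)] = ids
    return flagged


# Hypothetical concepts and mapping; labels are placeholders.
concepts = [
    {"id": "c1", "tag": "physical force", "types": {"Phenomenon"}},
    {"id": "c2", "tag": "physical force", "types": {"Phenomenon"}},
    {"id": "c3", "tag": "physical force", "types": {"Finding"}},
]
tag_to_types = {"physical force": {"Phenomenon"}}

flagged = find_inconsistent_groups(concepts, tag_to_types)
print(flagged)
```

Restricting expert review to the flagged sub-groups is what allows auditing effort to be concentrated on a small part of a hierarchy.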

RESULTS

We applied our method to the UMLS 2013AA release. Concepts of the semantically inconsistent groups in the PHYSICAL FORCE and RECORD ARTIFACT hierarchies have error rates of 33% and 62.5%, respectively, which are much higher than the error rates of 0.6% and 1% in semantically consistent groups of the two hierarchies.

CONCLUSION

Concepts in semantically inconsistent groups are more likely to contain semantic type assignment errors. Our methodology can make auditing more efficient by limiting auditing resources on concepts of semantically inconsistent groups.
