1
Prompt Tuning in Biomedical Relation Extraction. JOURNAL OF HEALTHCARE INFORMATICS RESEARCH 2024; 8:206-224. [PMID: 38681754 PMCID: PMC11052745 DOI: 10.1007/s41666-024-00162-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2022] [Revised: 02/09/2024] [Accepted: 02/19/2024] [Indexed: 05/01/2024]
Abstract
Biomedical relation extraction (RE) is critical in constructing high-quality knowledge graphs and databases as well as supporting many downstream text mining applications. This paper explores prompt tuning on biomedical RE and its few-shot scenarios, aiming to propose a simple yet effective model for this specific task. Prompt tuning reformulates natural language processing (NLP) downstream tasks into masked language problems by embedding specific text prompts into the original input, facilitating the adaption of pre-trained language models (PLMs) to better address these tasks. This study presents a customized prompt tuning model designed explicitly for biomedical RE, including its applicability in few-shot learning contexts. The model's performance was rigorously assessed using the chemical-protein relation (CHEMPROT) dataset from BioCreative VI and the drug-drug interaction (DDI) dataset from SemEval-2013, showcasing its superior performance over conventional fine-tuned PLMs across both datasets, encompassing few-shot scenarios. This observation underscores the effectiveness of prompt tuning in enhancing the capabilities of conventional PLMs, though the extent of enhancement may vary by specific model. Additionally, the model demonstrated a harmonious balance between simplicity and efficiency, matching state-of-the-art performance without needing external knowledge or extra computational resources. The pivotal contribution of our study is the development of a suitably designed prompt tuning model, highlighting prompt tuning's effectiveness in biomedical RE. It offers a robust, efficient approach to the field's challenges and represents a significant advancement in extracting complex relations from biomedical texts. Supplementary Information The online version contains supplementary material available at 10.1007/s41666-024-00162-9.
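The reformulation described in this abstract, turning relation extraction into a masked language problem by embedding a text prompt in the input, can be sketched as follows. This is a minimal illustration only: the template wording, the label words, and the relation names are hypothetical and are not taken from the paper, and the scores passed to `decode_label` stand in for whatever a pre-trained language model would assign at the `[MASK]` position.

```python
# Hypothetical cloze template; the paper's actual prompt is not given in the abstract.
TEMPLATE = "{sentence} The relation between {head} and {tail} is [MASK]."

# Hypothetical verbalizer: maps a word predicted at [MASK] to a relation label.
VERBALIZER = {
    "inhibition": "INHIBITION",
    "activation": "ACTIVATION",
    "none": "NO_RELATION",
}


def build_prompt(sentence, head, tail):
    """Embed the entity pair into a cloze-style prompt appended to the input."""
    return TEMPLATE.format(sentence=sentence.strip(), head=head, tail=tail)


def decode_label(mask_scores):
    """Return the relation whose label word scores highest at [MASK].

    mask_scores is a dict of label word -> score; words missing from it
    are treated as having the lowest possible score.
    """
    best_word = max(VERBALIZER, key=lambda w: mask_scores.get(w, float("-inf")))
    return VERBALIZER[best_word]


prompt = build_prompt("Aspirin irreversibly blocks COX-1.", "Aspirin", "COX-1")
# Made-up scores standing in for a model's output at the [MASK] position.
label = decode_label({"inhibition": 0.81, "activation": 0.07, "none": 0.12})
```

In a real system the scores would come from the language model's vocabulary distribution at the masked token, restricted to the label words.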
2
Examining Linguistic Differences in Electronic Health Records for Diverse Patients With Diabetes: Natural Language Processing Analysis. JMIR Med Inform 2024; 12:e50428. [PMID: 38787295 DOI: 10.2196/50428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 09/26/2023] [Accepted: 04/23/2024] [Indexed: 05/25/2024] Open
Abstract
Background Individuals from minoritized racial and ethnic backgrounds experience pernicious and pervasive health disparities that have emerged, in part, from clinician bias. Objective We used a natural language processing approach to examine whether linguistic markers in electronic health record (EHR) notes differ based on the race and ethnicity of the patient. To validate this methodological approach, we also assessed the extent to which clinicians perceive linguistic markers to be indicative of bias. Methods In this cross-sectional study, we extracted EHR notes for patients who were aged 18 years or older; had more than 5 years of diabetes diagnosis codes; and received care between 2006 and 2014 from family physicians, general internists, or endocrinologists practicing in an urban, academic network of clinics. The race and ethnicity of patients were defined as White non-Hispanic, Black non-Hispanic, or Hispanic or Latino. We hypothesized that Sentiment Analysis and Social Cognition Engine (SEANCE) components (ie, negative adjectives, positive adjectives, joy words, fear and disgust words, politics words, respect words, trust verbs, and well-being words) and mean word count would be indicators of bias if racial differences emerged. We performed linear mixed effects analyses to examine the relationship between the outcomes of interest (the SEANCE components and word count) and patient race and ethnicity, controlling for patient age. To validate this approach, we asked clinicians to indicate the extent to which they thought variation in the use of SEANCE language domains for different racial and ethnic groups was reflective of bias in EHR notes. Results We examined EHR notes (n=12,905) of Black non-Hispanic, White non-Hispanic, and Hispanic or Latino patients (n=1562), who were seen by 281 physicians. A total of 27 clinicians participated in the validation study. 
In terms of bias, participants rated negative adjectives as 8.63 (SD 2.06), fear and disgust words as 8.11 (SD 2.15), and positive adjectives as 7.93 (SD 2.46) on a scale of 1 to 10, with 10 being extremely indicative of bias. Notes for Black non-Hispanic patients contained significantly more negative adjectives (coefficient 0.07, SE 0.02) and significantly more fear and disgust words (coefficient 0.007, SE 0.002) than those for White non-Hispanic patients. The notes for Hispanic or Latino patients included significantly fewer positive adjectives (coefficient -0.02, SE 0.007), trust verbs (coefficient -0.009, SE 0.004), and joy words (coefficient -0.03, SE 0.01) than those for White non-Hispanic patients. Conclusions This approach may enable physicians and researchers to identify and mitigate bias in medical interactions, with the goal of reducing health disparities stemming from bias.
3
Online continual decoding of streaming EEG signal with a balanced and informative memory buffer. Neural Netw 2024; 176:106338. [PMID: 38692190 DOI: 10.1016/j.neunet.2024.106338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 03/20/2024] [Accepted: 04/23/2024] [Indexed: 05/03/2024]
Abstract
Electroencephalography (EEG)-based brain-computer interface (BCI) systems play a significant role in helping individuals with neurological impairments interact effectively with their environment. In real-world applications of BCI systems for clinical assistance and rehabilitation training, the EEG classifier often needs to learn from sequentially arriving subjects in an online manner. Because EEG patterns can differ significantly between subjects, the classifier can easily erase knowledge of previously learnt subjects after learning on later ones while decoding in an online streaming scenario, a problem known as catastrophic forgetting. In this work, we tackle this problem with a memory-based approach that considers the following conditions: (1) subjects arrive sequentially in an online manner, with no large-scale dataset available for joint training beforehand; (2) data volume from different subjects can be imbalanced; (3) the decoding difficulty of the sequential streaming signal varies; and (4) continual classification over a long time is required. This online sequential EEG decoding problem is more challenging than classic cross-subject EEG decoding because no large-scale training data from the different subjects are available beforehand. The proposed model keeps a small balanced memory buffer during sequential learning, with memory data dynamically selected based on joint consideration of data volume and informativeness. Furthermore, for the more general scenario where subject identity is unknown to the EEG decoder (the subject-agnostic scenario), we propose a kernel-based subject shift detection method that identifies underlying subject changes on the fly in a computationally efficient manner. We develop challenging benchmarks of streaming EEG data from sequentially arriving subjects with both balanced and imbalanced data volumes, and perform extensive experiments with a detailed ablation study on the proposed model.
The results show the effectiveness of our proposed approach, enabling the decoder to maintain performance on all previously seen subjects over a long period of sequential decoding. The model demonstrates the potential for real-world applications.
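The idea of a small memory buffer that stays balanced across subjects while preferring informative samples can be sketched as below. This is only an illustration under stated assumptions: the abstract does not specify the selection rule, so the equal per-subject quota and the externally supplied `score` (a stand-in for informativeness) are hypothetical choices, not the authors' method.

```python
from collections import defaultdict


class BalancedBuffer:
    """Toy replay buffer with an equal per-subject quota (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # subject_id -> list of (score, sample), kept sorted by score, highest first
        self.slots = defaultdict(list)

    def add(self, subject_id, sample, score):
        """Store a sample with its (hypothetical) informativeness score."""
        self.slots[subject_id].append((score, sample))
        self._rebalance()

    def _rebalance(self):
        # Split capacity evenly over the subjects seen so far. With more
        # subjects than capacity, each still keeps one sample, so the total
        # can exceed capacity in this simplified sketch.
        quota = max(1, self.capacity // len(self.slots))
        for items in self.slots.values():
            items.sort(key=lambda pair: pair[0], reverse=True)
            del items[quota:]  # drop the least informative samples

    def sizes(self):
        return {subject: len(items) for subject, items in self.slots.items()}


buf = BalancedBuffer(capacity=4)
for i in range(6):  # subject A streams in first, with a large data volume
    buf.add("A", f"a{i}", score=i)
for i, s in enumerate([0.5, 0.9, 0.1]):  # subject B arrives later with less data
    buf.add("B", f"b{i}", score=s)
```

After subject B arrives, subject A's share shrinks from four samples to two, so neither subject dominates the buffer regardless of how much data each contributed.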
4
Artificial intelligence-powered pharmacovigilance: A review of machine and deep learning in clinical text-based adverse drug event detection for benchmark datasets. J Biomed Inform 2024; 152:104621. [PMID: 38447600 DOI: 10.1016/j.jbi.2024.104621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 02/19/2024] [Accepted: 03/03/2024] [Indexed: 03/08/2024]
Abstract
OBJECTIVE The primary objective of this review is to investigate the effectiveness of machine learning and deep learning methodologies in the context of extracting adverse drug events (ADEs) from clinical benchmark datasets. We conduct an in-depth analysis, aiming to compare the merits and drawbacks of both machine learning and deep learning techniques, particularly within the framework of named-entity recognition (NER) and relation classification (RC) tasks related to ADE extraction. Additionally, our focus extends to the examination of specific features and their impact on the overall performance of these methodologies. In a broader perspective, our research extends to ADE extraction from various sources, including biomedical literature, social media data, and drug labels, removing the limitation to exclusively machine learning or deep learning methods. METHODS We conducted an extensive literature review on PubMed using the query "(((machine learning [Medical Subject Headings (MeSH) Terms]) OR (deep learning [MeSH Terms])) AND (adverse drug event [MeSH Terms])) AND (extraction)", and supplemented this with a snowballing approach to review 275 references sourced from retrieved articles. RESULTS In our analysis, we included twelve articles for review. For the NER task, deep learning models outperformed machine learning models. In the RC task, gradient boosting, multilayer perceptron, and random forest models excelled. The Bidirectional Encoder Representations from Transformers (BERT) model consistently achieved the best performance in the end-to-end task. Future efforts in the end-to-end task should prioritize improving NER accuracy, especially for 'ADE' and 'Reason'. CONCLUSION These findings hold significant implications for advancing the field of ADE extraction and pharmacovigilance, ultimately contributing to improved drug safety monitoring and healthcare outcomes.
5
FedFSA: Hybrid and federated framework for functional status ascertainment across institutions. J Biomed Inform 2024; 152:104623. [PMID: 38458578 PMCID: PMC11005095 DOI: 10.1016/j.jbi.2024.104623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 01/12/2024] [Accepted: 03/04/2024] [Indexed: 03/10/2024]
Abstract
INTRODUCTION Patients' functional status assesses their independence in performing activities of daily living (ADLs), including basic ADLs (bADL) and more complex instrumental activities (iADL). Existing studies have discovered that patients' functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much of the functional status information is stored in electronic health records (EHRs) in either semi-structured or free-text formats. This indicates the pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions. METHODS FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on ICF definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs. RESULTS ADL extraction performance ranges from an F1-score of 0.907 to 0.986 for bADL and 0.825 to 0.951 for iADL across the four healthcare sites. The performance for ADL extraction with impairment ranges from an F1-score of 0.722 to 0.954 for bADL and 0.674 to 0.813 for iADL across four healthcare sites.
For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-high performance. Conversely, food preparation and toileting showed low performance. CONCLUSION NLP performance varied across ADL categories and healthcare sites. Federated learning using the FedFSA framework outperformed non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework in functional status extraction and impairment classification in EHRs, exemplifying the importance of a large-scale, multi-institutional collaborative development effort.
6
Ensemble pretrained language models to extract biomedical knowledge from literature. J Am Med Inform Assoc 2024:ocae061. [PMID: 38520725 DOI: 10.1093/jamia/ocae061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 02/14/2024] [Accepted: 03/12/2024] [Indexed: 03/25/2024] Open
Abstract
OBJECTIVES The rapid expansion of biomedical literature necessitates automated techniques to discern relationships between biomedical concepts from extensive free text. Such techniques facilitate the development of detailed knowledge bases and highlight research deficiencies. The LitCoin Natural Language Processing (NLP) challenge, organized by the National Center for Advancing Translational Science, aims to evaluate such potential and provides a manually annotated corpus for methodology development and benchmarking. MATERIALS AND METHODS For the named entity recognition (NER) task, we utilized ensemble learning to merge predictions from three domain-specific models, namely BioBERT, PubMedBERT, and BioM-ELECTRA; devised a rule-driven detection method for cell line and taxonomy names; and annotated 70 more abstracts as an additional corpus. We further finetuned the T0pp model, with 11 billion parameters, to boost performance on relation extraction (RE) and leveraged entities' location information (eg, title, background) to enhance novelty prediction performance in RE. RESULTS Our pioneering NLP system designed for this challenge secured first place in Phase I-NER and second place in Phase II-relation extraction and novelty prediction, outpacing over 200 teams. We tested OpenAI ChatGPT 3.5 and ChatGPT 4 in a zero-shot setting using the same test set, revealing that our finetuned model considerably surpasses these broad-spectrum large language models. DISCUSSION AND CONCLUSION Our outcomes depict a robust NLP system excelling in NER and RE across various biomedical entities, emphasizing that task-specific models remain superior to generic large ones. Such insights are valuable for endeavors like knowledge graph development and hypothesis formulation in biomedical research.
7
AE-GPT: Using Large Language Models to extract adverse events from surveillance reports-A use case with influenza vaccine adverse events. PLoS One 2024; 19:e0300919. [PMID: 38512919 PMCID: PMC10956752 DOI: 10.1371/journal.pone.0300919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Accepted: 03/06/2024] [Indexed: 03/23/2024] Open
Abstract
Though vaccines are instrumental in global health, mitigating infectious diseases and pandemic outbreaks, they can occasionally lead to adverse events (AEs). Recently, Large Language Models (LLMs) have shown promise in effectively identifying and cataloging AEs within clinical reports. Utilizing data from the Vaccine Adverse Event Reporting System (VAERS) from 1990 to 2016, this study particularly focuses on AEs to evaluate LLMs' capability for AE extraction. A variety of prevalent LLMs, including GPT-2, GPT-3 variants, GPT-4, and Llama2, were evaluated using influenza vaccine as a use case. The fine-tuned GPT 3.5 model (AE-GPT) stood out with an averaged micro F1 score of 0.704 for strict match and 0.816 for relaxed match. The encouraging performance of AE-GPT underscores LLMs' potential in processing medical data, indicating a significant stride towards advanced AE detection that is presumably generalizable to other AE extraction tasks.
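The difference between the strict and relaxed micro F1 scores reported above can be illustrated with a small sketch. The matching rules here are assumptions, since the abstract does not define them: "strict" is taken to mean identical span boundaries and label, and "relaxed" to mean any character overlap with the same label. Spans are hypothetical `(start, end, label)` tuples with an exclusive end offset.

```python
def spans_match(gold, pred, relaxed):
    """Compare two (start, end, label) spans under the assumed matching rule."""
    if gold[2] != pred[2]:
        return False
    if relaxed:
        return pred[0] < gold[1] and gold[0] < pred[1]  # any overlap
    return gold[0] == pred[0] and gold[1] == pred[1]    # exact boundaries


def micro_f1(gold_docs, pred_docs, relaxed=False):
    """Micro-averaged F1: pool TP/FP/FN over all documents before scoring."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        used = set()  # gold spans already matched, so each counts once
        for p in pred:
            hit = next(
                (i for i, g in enumerate(gold)
                 if i not in used and spans_match(g, p, relaxed)),
                None,
            )
            if hit is None:
                fp += 1
            else:
                used.add(hit)
                tp += 1
        fn += len(gold) - len(used)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up annotations for two reports, for illustration only.
gold_docs = [[(0, 5, "AE"), (10, 18, "AE")], [(3, 9, "AE")]]
pred_docs = [[(0, 5, "AE"), (12, 18, "AE")], [(20, 25, "AE")]]

strict = micro_f1(gold_docs, pred_docs)
relaxed = micro_f1(gold_docs, pred_docs, relaxed=True)
```

The prediction `(12, 18)` overlaps the gold span `(10, 18)` without matching its boundaries, so it counts as an error under the strict rule and as a hit under the relaxed rule, which is why relaxed scores are never lower than strict ones.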
8
A Semantic Approach to Describe Social and Economic Characteristics That Impact Health Outcomes (Social Determinants of Health): Ontology Development Study. Online J Public Health Inform 2024; 16:e52845. [PMID: 38477963 PMCID: PMC10973958 DOI: 10.2196/52845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 11/28/2023] [Accepted: 02/19/2024] [Indexed: 03/14/2024] Open
Abstract
BACKGROUND Social determinants of health (SDoH) have been described by the World Health Organization as the conditions in which individuals are born, live, work, and age. These conditions can be grouped into 3 interrelated levels known as macrolevel (societal), mesolevel (community), and microlevel (individual) determinants. The scope of SDoH expands beyond the biomedical level, and there remains a need to connect other areas such as economics, public policy, and social factors. OBJECTIVE Providing a computable artifact that can link health data to concepts involving the different levels of determinants may improve our understanding of the impact SDoH have on human populations. Modeling SDoH may help to reduce existing gaps in the literature through explicit links between the determinants and biological factors. This in turn can allow researchers and clinicians to make better sense of data and discover new knowledge through the use of semantic links. METHODS An experimental ontology was developed to represent knowledge of the social and economic characteristics of SDoH. Information from 27 literature sources was analyzed to gather concepts and encoded using Web Ontology Language, version 2 (OWL2) and Protégé. Four evaluators independently reviewed the ontology axioms using natural language translation. The analyses from the evaluations and selected terminologies from the Basic Formal Ontology were used to create a revised ontology with a broad spectrum of knowledge concepts ranging from the macrolevel to the microlevel determinants. RESULTS The literature search identified several topics of discussion for each determinant level. Publications for the macrolevel determinants centered around health policy, income inequality, welfare, and the environment. Articles relating to the mesolevel determinants discussed work, work conditions, psychosocial factors, socioeconomic position, outcomes, food, poverty, housing, and crime. 
Finally, sources found for the microlevel determinants examined gender, ethnicity, race, and behavior. Concepts were gathered from the literature and used to produce an ontology consisting of 383 classes, 109 object properties, and 748 logical axioms. A reasoning test revealed no inconsistent axioms. CONCLUSIONS This ontology models heterogeneous social and economic concepts to represent aspects of SDoH. The scope of SDoH is expansive, and although the ontology is broad, it is still in its early stages. To our current understanding, this ontology represents the first attempt to concentrate on knowledge concepts that are currently not covered by existing ontologies. Future direction will include further expanding the ontology to link with other biomedical ontologies, including alignment for granular semantics.
9
Identification and classification of principal features for analyzing unwarranted clinical variation. J Eval Clin Pract 2024; 30:251-259. [PMID: 37933789 DOI: 10.1111/jep.13940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 10/10/2023] [Accepted: 10/19/2023] [Indexed: 11/08/2023]
Abstract
RATIONALE, AIMS, AND OBJECTIVE Unwarranted clinical variation (UCV) is an undesirable aspect of a healthcare system, but analyzing for UCV can be difficult and time-consuming. No analytic feature guidelines currently exist to aid researchers. We performed a systematic review of UCV literature to identify and classify the features researchers have identified as necessary for the analysis of UCV. METHODS The literature search followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses. We looked for articles with the terms 'medical practice variation' and 'unwarranted clinical variation' from four databases: Medline, Web of Science, EMBASE and CINAHL. The search was performed on 24 March 2023. The articles selected were original research articles in the English language reporting on UCV analysis in adult populations. Most of the studies were retrospective cohort analyses. We excluded studies reporting geographic variation based on the Atlas of Variation or small-area analysis methods. We used ASReview Lab software to assist in identifying articles for abstract review. We also conducted subsequent reference searches of the primary articles to retrieve additional articles. RESULTS The search yielded 499 articles, and we reviewed 46. We identified 28 principal analytic features utilized to analyze for unwarranted variation, categorised under patient-related or local healthcare context factors. Within the patient-related factors, we identified three subcategories: patient sociodemographics, clinical characteristics, and preferences, and classified 17 features into seven subcategories. In the local context category, 11 features are classified under two subcategories. Examples are provided on the usage of each feature for analysis. CONCLUSION Twenty-eight analytic features have been identified, and a categorisation has been established showing the relationships between features. 
Identifying and classifying features provides guidelines for known confounders during analysis and reduces the steps required when performing UCV analysis; there is no longer a need for a UCV researcher to engage in time-consuming feature engineering activities.
10
Evaluating MedDRA-to-ICD terminology mappings. BMC Med Inform Decis Mak 2024; 23:299. [PMID: 38326827 PMCID: PMC10851449 DOI: 10.1186/s12911-023-02375-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Accepted: 11/14/2023] [Indexed: 02/09/2024] Open
Abstract
BACKGROUND In this era of big data, data harmonization is an important step to ensure reproducible, scalable, and collaborative research. Thus, terminology mapping is a necessary step to harmonize heterogeneous data. For example, the mapping between the Medical Dictionary for Regulatory Activities (MedDRA) and the International Classification of Diseases (ICD) is essential for drug safety and pharmacovigilance research. Our main objective is to provide a quantitative and qualitative analysis of the mapping status between MedDRA and ICD. We focus on evaluating the current mapping status between MedDRA and ICD through the Unified Medical Language System (UMLS) and Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). We summarized the current mapping statistics and evaluated the quality of the current MedDRA-ICD mapping; for unmapped terms, we used our self-developed algorithm to rank the best possible mapping candidates for additional mapping coverage. RESULTS The identified MedDRA-ICD mapped pairs cover 27.23% of the overall MedDRA preferred terms (PT). The systematic quality analysis demonstrated that, among the mapped pairs provided by UMLS, only 51.44% are considered an exact match. For the 2400 sampled unmapped terms, 56 of the 2400 MedDRA Preferred Terms (PT) could have exact match terms from ICD. CONCLUSION Some of the mapped pairs between MedDRA and ICD are not exact matches due to differences in granularity and focus. For 72% of the unmapped PT terms, the identified exact match pairs illustrate the possibility of identifying additional mapped pairs. Referring to its own mapping standard, some of the unmapped terms should qualify for the expansion of MedDRA to ICD mapping in UMLS.
11
Dynamic Prognosis Prediction for Patients on DAPT After Drug-Eluting Stent Implantation: Model Development and Validation. J Am Heart Assoc 2024; 13:e029900. [PMID: 38293921 PMCID: PMC11056175 DOI: 10.1161/jaha.123.029900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 12/01/2023] [Indexed: 02/01/2024]
Abstract
BACKGROUND The rapid evolution of artificial intelligence (AI) in conjunction with recent updates in dual antiplatelet therapy (DAPT) management guidelines emphasizes the necessity for innovative models to predict ischemic or bleeding events after drug-eluting stent implantation. Leveraging AI for dynamic prediction has the potential to revolutionize risk stratification and provide personalized decision support for DAPT management. METHODS AND RESULTS We developed and validated a new AI-based pipeline using retrospective data of drug-eluting stent-treated patients, sourced from the Cerner Health Facts data set (n=98 236) and Optum's de-identified Clinformatics Data Mart Database (n=9978). The 36 months following drug-eluting stent implantation were designated as our primary forecasting interval, further segmented into 6 sequential prediction windows. We evaluated 5 distinct AI algorithms for their precision in predicting ischemic and bleeding risks. Model discriminative accuracy was assessed using the area under the receiver operating characteristic curve, among other metrics. The weighted light gradient boosting machine stood out as the preeminent model, thus earning its place as our AI-DAPT model. The AI-DAPT demonstrated peak accuracy in the 30 to 36 months window, charting an area under the receiver operating characteristic curve of 90% [95% CI, 88%-92%] for ischemia and 84% [95% CI, 82%-87%] for bleeding predictions. CONCLUSIONS Our AI-DAPT excels in formulating iterative, refined dynamic predictions by assimilating ongoing updates from patients' clinical profiles, holding value as a novel smart clinical tool to facilitate optimal DAPT duration management with high accuracy and adaptability.
12
Promoting Personalized Reminiscence Among Cognitively Intact Older Adults Through an AI-Driven Interactive Multimodal Photo Album: Development and Usability Study. JMIR Aging 2024; 7:e49415. [PMID: 38261365 PMCID: PMC10848130 DOI: 10.2196/49415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 10/25/2023] [Accepted: 12/11/2023] [Indexed: 01/24/2024] Open
Abstract
BACKGROUND Reminiscence, a therapy that uses stimulating materials such as old photos and videos to stimulate long-term memory, can improve the emotional well-being and life satisfaction of older adults, including those who are cognitively intact. However, providing personalized reminiscence therapy can be challenging for caregivers and family members. OBJECTIVE This study aimed to achieve three objectives: (1) design and develop the GoodTimes app, an interactive multimodal photo album that uses artificial intelligence (AI) to engage users in personalized conversations and storytelling about their pictures, encompassing family, friends, and special moments; (2) examine the app's functionalities in various scenarios using use-case studies and assess the app's usability and user experience through the user study; and (3) investigate the app's potential as a supplementary tool for reminiscence therapy among cognitively intact older adults, aiming to enhance their psychological well-being by facilitating the recollection of past experiences. METHODS We used state-of-the-art AI technologies, including image recognition, natural language processing, knowledge graph, logic, and machine learning, to develop GoodTimes. First, we constructed a comprehensive knowledge graph that models the information required for effective communication, including photos, people, locations, time, and stories related to the photos. Next, we developed a voice assistant that interacts with users by leveraging the knowledge graph and machine learning techniques. Then, we created various use cases to examine the functions of the system in different scenarios. Finally, to evaluate GoodTimes' usability, we conducted a study with older adults (N=13; age range 58-84, mean 65.8 years). The study ran from January to March 2023. RESULTS The use-case tests demonstrated the performance of GoodTimes in handling a variety of scenarios, highlighting its versatility and adaptability.
For the user study, the feedback from our participants was highly positive, with 92% (12/13) reporting a positive experience conversing with GoodTimes. All participants mentioned that the app invoked pleasant memories and aided in recollecting loved ones, resulting in a sense of happiness for the majority (11/13, 85%). Additionally, a significant majority found GoodTimes to be helpful (11/13, 85%) and user-friendly (12/13, 92%). Most participants (9/13, 69%) expressed a desire to use the app frequently, although some (4/13, 31%) indicated a need for technical support to navigate the system effectively. CONCLUSIONS Our AI-based interactive photo album, GoodTimes, was able to engage users in browsing their photos and conversing about them. Preliminary evidence supports GoodTimes' usability and benefits cognitively intact older adults. Future work is needed to explore its potential positive effects among older adults with cognitive impairment.
13
Lessons learned from annotation of VAERS reports on adverse events following influenza vaccination and related to Guillain-Barré syndrome. BMC Med Inform Decis Mak 2024; 23:298. [PMID: 38183034 PMCID: PMC10770878 DOI: 10.1186/s12911-023-02374-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2020] [Accepted: 11/14/2023] [Indexed: 01/07/2024] Open
Abstract
BACKGROUND Vaccine Adverse Events Reporting System (VAERS) is a promising resource for tracking adverse events following immunization. Medical Dictionary for Regulatory Activities (MedDRA) terminology used for coding adverse events in VAERS reports has several limitations. We focus on developing an automated system for semantic extraction of adverse events following vaccination and their temporal relationships for a better understanding of VAERS data and its integration into other applications. The aim of the present study is to summarize the lessons learned during the initial phase of this project in annotating adverse events following influenza vaccination and related to Guillain-Barré syndrome (GBS). We emphasize identifying the limitations of VAERS and MedDRA. RESULTS We collected 282 VAERS reports documented between 1990 and 2016 and shortlisted those with at least 1,100 characters in the report. We used a subset of 50 reports for the preliminary investigation and annotated all adverse events following influenza vaccination by mapping to representative MedDRA terms. Associated time expressions were annotated when available. We used 16 System Organ Class (SOC) level MedDRA terms to map GBS related adverse events and expanded some SOC terms to Lowest Level Terms (LLT) for granular representation. We annotated three broad categories of events: problems, clinical investigations, and treatments/procedures. The inter-annotator agreement achieved for events was 86%. Incomplete reports, typographical errors, lack of clarity and coherence, repeated texts, unavailability of associated temporal information, difficulty of interpretation due to incorrect grammar, use of generalized terms to describe adverse events / symptoms, uncommon abbreviations, difficulty annotating multiple events with a conjunction / common phrase, irrelevant historical events and coexisting events were some of the challenges encountered. 
Some of the limitations we noted are in agreement with previous reports. CONCLUSIONS We reported the challenges encountered and lessons learned during annotation of adverse events in VAERS reports following influenza vaccination and related to GBS. Though the challenges may be due to the inevitable limitations of public reporting systems and widely reported limitations of MedDRA, we emphasize the need to understand these limitations and extraction of other supportive information for a better understanding of adverse events following vaccination.
14
Unpacking adverse events and associations post COVID-19 vaccination: a deep dive into vaccine adverse event reporting system data. Expert Rev Vaccines 2024; 23:53-59. [PMID: 38063069 PMCID: PMC10872386 DOI: 10.1080/14760584.2023.2292203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 11/30/2023] [Indexed: 12/18/2023]
Abstract
INTRODUCTION The rapid development of COVID-19 vaccines has provided crucial tools for pandemic control, but the occurrence of vaccine-related adverse events (AEs) underscores the need for comprehensive monitoring. METHODS This study analyzed the Vaccine Adverse Event Reporting System (VAERS) data from 2020-2022 using statistical methods such as zero-truncated Poisson regression and logistic regression to assess associations with age, gender groups, and vaccine manufacturers. RESULTS Logistic regression identified 26 System Organ Classes (SOCs) significantly associated with age and gender. Females displayed especially higher odds in SOC 19 (Pregnancy, puerperium and perinatal conditions), while males had higher odds in SOC 25 (Surgical and medical procedures). Older adults (>65) were more prone to symptoms like Cardiac disorders, whereas those aged 18-65 showed susceptibility to AEs like Skin and subcutaneous tissue disorders. Moderna and Pfizer vaccines induced fewer SOC symptoms compared to Janssen and Novavax. The zero-truncated Poisson regression model estimated an average of 4.243 symptoms per individual. CONCLUSION These findings offer vital insights into vaccine safety, guiding evidence-based vaccination strategies and monitoring programs for precise and effective outcomes.
15
Machine learning-based donor permission extraction from informed consent documents. BMC Bioinformatics 2023; 24:477. [PMID: 38102593 PMCID: PMC10724888 DOI: 10.1186/s12859-023-05568-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2021] [Accepted: 11/14/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND With more clinical trials offering optional participation in the collection of bio-specimens for biobanking comes increasing complexity in the requirements of informed consent forms. The aim of this study is to develop an automatic natural language processing (NLP) tool to annotate informed consent documents to promote biorepository data regulation, sharing, and decision support. We collected informed consent documents from several publicly available sources, then manually annotated them, covering sentences containing permission information about the sharing of either bio-specimens or donor data, or conducting genetic research or future research using bio-specimens or donor data. RESULTS We evaluated a variety of machine learning algorithms including random forest (RF) and support vector machine (SVM) for the automatic identification of these sentences. 120 informed consent documents containing 29,204 sentences were annotated, of which 1250 sentences (4.28%) provide answers to a permission question. An SVM model achieved an F1 score of 0.95 on classifying the sentences when using a gold standard, which is a prefiltered corpus containing all relevant sentences. CONCLUSIONS This study demonstrates the feasibility of using machine learning tools to classify permission-related sentences in informed consent documents.
16
Mapping Vaccine Names in Clinical Trials to Vaccine Ontology using Cascaded Fine-Tuned Domain-Specific Language Models. RESEARCH SQUARE 2023:rs.3.rs-3362256. [PMID: 37841880 PMCID: PMC10571639 DOI: 10.21203/rs.3.rs-3362256/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2023]
Abstract
Background Vaccines have revolutionized public health by providing protection against infectious diseases. They stimulate the immune system and generate memory cells to defend against targeted diseases. Clinical trials evaluate vaccine performance, including dosage, administration routes, and potential side effects. ClinicalTrials.gov is a valuable repository of clinical trial information, but its vaccine data lacks standardization, leading to challenges in automatic concept mapping, vaccine-related knowledge development, evidence-based decision-making, and vaccine surveillance. Results In this study, we developed a cascaded framework that capitalized on multiple domain knowledge sources, including clinical trials, Unified Medical Language System (UMLS), and the Vaccine Ontology (VO), to enhance the performance of domain-specific language models for automated mapping of VO from clinical trials. The VO is a community-based ontology that was developed to promote vaccine data standardization, integration, and computer-assisted reasoning. Our methodology involved extracting and annotating data from various sources. We then performed pre-training on the PubMedBERT model, leading to the development of CTPubMedBERT. Subsequently, we enhanced CTPubMedBERT by incorporating SAPBERT, which was pretrained using the UMLS, resulting in CTPubMedBERT + SAPBERT. Further refinement was accomplished through fine-tuning using the Vaccine Ontology corpus and vaccine data from clinical trials, yielding the CTPubMedBERT + SAPBERT + VO model. Finally, we utilized a collection of pre-trained models, along with the weighted rule-based ensemble approach, to normalize the vaccine corpus and improve the accuracy of the process. The ranking process in concept normalization involves prioritizing and ordering potential concepts to identify the most suitable match for a given context. 
We conducted a ranking of the Top 10 concepts, and our experimental results demonstrate that our proposed cascaded framework consistently outperformed existing effective baselines on vaccine mapping, achieving 71.8% on top 1 candidate's accuracy and 90.0% on top 10 candidate's accuracy. Conclusion This study provides a detailed insight into a cascaded framework of fine-tuned domain-specific language models improving mapping of VO from clinical trials. By effectively leveraging domain-specific information and applying weighted rule-based ensembles of different pre-trained BERT models, our framework can significantly enhance the mapping of VO from clinical trials.
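The top-1 and top-10 figures reported above are instances of top-k accuracy over ranked candidate concepts. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `top_k_accuracy` and the concept IDs are hypothetical placeholders, not real Vaccine Ontology identifiers.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_candidates: Sequence[Sequence[str]],
    gold_ids: Sequence[str],
    k: int,
) -> float:
    """Fraction of mentions whose gold concept ID is among the first k ranked candidates."""
    if len(ranked_candidates) != len(gold_ids):
        raise ValueError("one ranked candidate list per gold label is required")
    if not gold_ids:
        raise ValueError("at least one mention is required")
    hits = sum(
        1
        for candidates, gold in zip(ranked_candidates, gold_ids)
        if gold in candidates[:k]
    )
    return hits / len(gold_ids)


# Hypothetical toy data: each inner list is a model's ranking for one mention.
ranked = [
    ["C1", "C2", "C3"],  # gold C1 is ranked first
    ["C5", "C4", "C6"],  # gold C4 is ranked second
    ["C9", "C8", "C7"],  # gold C7 is ranked third
    ["C0", "C3", "C2"],  # gold C5 is not among the candidates
]
gold = ["C1", "C4", "C7", "C5"]

top1 = top_k_accuracy(ranked, gold, k=1)  # 1 of 4 mentions is a hit
top3 = top_k_accuracy(ranked, gold, k=3)  # 3 of 4 mentions are hits
```

A larger k can only keep or raise the score, which is why top-10 accuracy is never lower than top-1 accuracy for the same rankings.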
17
DEVO: an ontology to assist with dermoscopic feature standardization. BMC Med Inform Decis Mak 2023; 23:162. [PMID: 37596573 PMCID: PMC10436380 DOI: 10.1186/s12911-023-02251-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Accepted: 07/26/2023] [Indexed: 08/20/2023]
Abstract
BACKGROUND The utilization of dermoscopic analysis is becoming increasingly critical for diagnosing skin diseases by physicians and even artificial intelligence. With the expansion of dermoscopy, its vocabulary has proliferated, but the rapid evolution of the vocabulary of dermoscopy without standardized control is counterproductive. We aimed to develop a domain-specific ontology to formally represent knowledge for certain dermoscopic features. METHODS The first phase involved creating a fundamental-level ontology that covers the fundamental aspects and elements in describing visualizations, such as shapes and colors. The second phase involved creating a domain ontology that harnesses the fundamental-level ontology to formalize the definitions of dermoscopic metaphorical terms. RESULTS The Dermoscopy Elements of Visuals Ontology (DEVO) contains 1047 classes, 47 object properties, and 16 data properties. It has a better semiotic score compared to similar ontologies of the same domain. Three human annotators also examined the consistency, complexity, and future application of the ontology. CONCLUSIONS The proposed ontology was able to harness the definitions of metaphoric terms by decomposing them into their visual elements. Future applications include providing education for trainees and diagnostic support for dermatologists, with the goal of generating responses to queries about dermoscopic features and integrating these features to diagnose skin diseases.
18
Systematic design and data-driven evaluation of social determinants of health ontology (SDoHO). J Am Med Inform Assoc 2023; 30:1465-1473. [PMID: 37301740 PMCID: PMC10436148 DOI: 10.1093/jamia/ocad096] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/28/2023] [Revised: 05/23/2023] [Accepted: 06/02/2023] [Indexed: 06/12/2023]
Abstract
OBJECTIVE Social determinants of health (SDoH) play critical roles in health outcomes and well-being. Understanding the interplay of SDoH and health outcomes is critical to reducing healthcare inequalities and transforming a "sick care" system into a "health-promoting" system. To address the SDoH terminology gap and better embed relevant elements in advanced biomedical informatics, we propose an SDoH ontology (SDoHO), which represents fundamental SDoH factors and their relationships in a standardized and measurable way. MATERIAL AND METHODS Drawing on the content of existing ontologies relevant to certain aspects of SDoH, we used a top-down approach to formally model classes, relationships, and constraints based on multiple SDoH-related resources. Expert review and coverage evaluation, using a bottom-up approach employing clinical notes data and a national survey, were performed. RESULTS We constructed the SDoHO with 708 classes, 106 object properties, and 20 data properties, with 1,561 logical axioms and 976 declaration axioms in the current version. Three experts achieved 0.967 agreement in the semantic evaluation of the ontology. A comparison between the coverage of the ontology and SDoH concepts in 2 sets of clinical notes and a national survey instrument also showed satisfactory results. DISCUSSION SDoHO could potentially play an essential role in providing a foundation for a comprehensive understanding of the associations between SDoH and health outcomes and paving the way for health equity across populations. CONCLUSION SDoHO has well-designed hierarchies, practical objective properties, and versatile functionalities, and the comprehensive semantic and coverage evaluation achieved promising performance compared to the existing ontologies relevant to SDoH.
19
An ontology-based approach for harmonization and cross-cohort query of Alzheimer's disease data resources. BMC Med Inform Decis Mak 2023; 23:151. [PMID: 37542312 PMCID: PMC10401730 DOI: 10.1186/s12911-023-02250-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 07/26/2023] [Indexed: 08/06/2023]
Abstract
BACKGROUND In the United States, the National Alzheimer's Coordinating Center (NACC) and the Alzheimer's Disease Neuroimaging Initiative (ADNI) are two major data sharing resources for Alzheimer's Disease (AD) research. NACC and ADNI strive to make their data more FAIR (findable, interoperable, accessible and reusable) for the broader research community. However, there is limited work harmonizing and supporting cross-cohort interoperability of the two resources. METHOD In this paper, we leverage an ontology-based approach to harmonize data elements in the two resources and develop a web-based query system to search patient cohorts across the two resources. We first mapped data elements across NACC and ADNI, and performed value harmonization for the mapped data elements with inconsistent permissible values. Then we built an Alzheimer's Disease Data Element Ontology (ADEO) to model the mapped data elements in NACC and ADNI. We further developed a prototype cross-cohort query system to search patient cohorts across NACC and ADNI. RESULTS After manual review, we found 172 mappings between NACC and ADNI. These 172 mappings were further used to construct common concepts in ADEO. Our data element mapping and harmonization resulted in five files storing common concepts, variables in NACC and ADNI, mappings between variables and common concepts, permissible values of categorical type data elements, and coding inconsistency harmonization, respectively. Our cross-cohort query system consists of three core architectural elements: a web-based interface, an advanced query engine, and a backend MongoDB database. CONCLUSIONS In this work, ADEO has been specifically designed to facilitate data harmonization and cross-cohort query of NACC and ADNI data resources. 
Although our prototype cross-cohort query system was developed for exploring NACC and ADNI, its backend and frontend framework has been designed and implemented to be generally applicable to other domains for querying patient cohorts from multiple heterogeneous data sources.
20
Enriching Real-world Data with Social Determinants of Health for Health Outcomes and Health Equity: Successes, Challenges, and Opportunities. Yearb Med Inform 2023; 32:253-263. [PMID: 38147867 PMCID: PMC10751148 DOI: 10.1055/s-0043-1768732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
OBJECTIVE To summarize the recent methods and applications that leverage real-world data such as electronic health records (EHRs) with social determinants of health (SDoH) for public and population health and health equity and identify successes, challenges, and possible solutions. METHODS In this opinion review, grounded on a social-ecological-model-based conceptual framework, we surveyed data sources and recent informatics approaches that enable leveraging SDoH along with real-world data to support public health and clinical health applications including helping design public health intervention, enhancing risk stratification, and enabling the prediction of unmet social needs. RESULTS Besides summarizing data sources, we identified gaps in capturing SDoH data in existing EHR systems and opportunities to leverage informatics approaches to collect SDoH information either from structured and unstructured EHR data or through linking with public surveys and environmental data. We also surveyed recently developed ontologies for standardizing SDoH information and approaches that incorporate SDoH for disease risk stratification, public health crisis prediction, and development of tailored interventions. CONCLUSIONS To enable effective public health and clinical applications using real-world data with SDoH, it is necessary to develop both non-technical solutions involving incentives, policies, and training as well as technical solutions such as novel social risk management tools that are integrated into clinical workflow. Ultimately, SDoH-powered social risk management, disease risk prediction, and development of SDoH tailored interventions for disease prevention and management have the potential to improve population health, reduce disparities, and improve health equity.
21
Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions. Yearb Med Inform 2023; 32:215-224. [PMID: 38147863 PMCID: PMC10751115 DOI: 10.1055/s-0043-1768735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
OBJECTIVES Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research. METHODS We conducted a comprehensive search of multiple databases, including PubMed, Web of Science, IEEE Xplore, and Google Scholar, to collect relevant publications from the past two years (2021-2022). Studies were selected for review based on their relevance to the topic and publication quality. RESULTS A total of 78 articles were included in our analysis. We identified three main categories of GRL methods and summarized their methodological foundations and notable models. In terms of GRL applications, we focused on two main topics: drug and disease. We analyzed the study frameworks and achievements of the prominent research. Based on the current state of the art, we discussed the challenges and future directions. CONCLUSIONS GRL methods applied in the biomedical field demonstrated several key characteristics, including the utilization of attention mechanisms to prioritize relevant features, a growing emphasis on model interpretability, and the combination of various techniques to improve model performance. There are also challenges that need to be addressed, including mitigating model bias, accommodating the heterogeneity of large-scale knowledge graphs, and improving the availability of high-quality graph data. To fully leverage the potential of GRL, future efforts should prioritize these areas of research.
22
An NLP approach to identify SDoH-related circumstance and suicide crisis from death investigation narratives. J Am Med Inform Assoc 2023; 30:1408-1417. [PMID: 37040620 PMCID: PMC10354765 DOI: 10.1093/jamia/ocad068] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/05/2022] [Revised: 03/10/2023] [Accepted: 03/29/2023] [Indexed: 04/13/2023]
Abstract
OBJECTIVES Suicide presents a major public health challenge worldwide, affecting people across the lifespan. While previous studies revealed strong associations between Social Determinants of Health (SDoH) and suicide deaths, existing evidence is limited by the reliance on structured data. To resolve this, we aim to adapt a suicide-specific SDoH ontology (Suicide-SDoHO) and use natural language processing (NLP) to effectively identify individual-level SDoH-related social risks from death investigation narratives. MATERIALS AND METHODS We used the latest National Violent Death Reporting System (NVDRS), which contains data on 267 804 suicide victims from 2003 to 2019. After adapting the Suicide-SDoHO, we developed a transformer-based model to identify SDoH-related circumstances and crises in death investigation narratives. We applied our model retrospectively to annotate narratives whose crisis variables were not coded in NVDRS. The crisis rates were calculated as the percentage of the group's total suicide population with the crisis present. RESULTS The Suicide-SDoHO contains 57 fine-grained circumstances in a hierarchical structure. Our classifier achieves AUCs of 0.966 and 0.942 for classifying circumstances and crises, respectively. Through the crisis trend analysis, we observed that not everyone is equally affected by SDoH-related social risks. For the economic stability crisis, our result showed a significant increase in crisis rate in 2007-2009, parallel with the Great Recession. CONCLUSIONS This is the first study curating a Suicide-SDoHO using death investigation narratives. We showed that our model can effectively classify SDoH-related social risks through NLP approaches. We hope our study will facilitate the understanding of suicide crises and inform effective prevention strategies.
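The crisis rate defined in the methods (the percentage of a group's total suicide population with the crisis present) can be sketched as a simple grouped proportion. This is an illustrative sketch under assumed inputs, not the study's code: the record layout `(group, crisis_present)`, the function name `crisis_rates`, and the toy values are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def crisis_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return, per group, the percentage of records flagged with the crisis."""
    totals: Counter = Counter()
    with_crisis: Counter = Counter()
    for group, crisis_present in records:
        totals[group] += 1
        if crisis_present:
            with_crisis[group] += 1
    # Every group in `totals` has at least one record, so no division by zero.
    return {group: 100.0 * with_crisis[group] / totals[group] for group in totals}


# Hypothetical toy records grouped by year; these are not NVDRS values.
toy_records = [
    ("2006", True), ("2006", False), ("2006", False), ("2006", False),
    ("2008", True), ("2008", True), ("2008", False), ("2008", False),
]
rates = crisis_rates(toy_records)
```

Grouping by year gives a trend line; the same function works for any grouping key, such as an age band or sex, as long as each record carries one group label.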
23
CRENO: An ontology to model concepts relating to culture, race, ethnicity, and nationality for health data. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2023; 2023:398-407. [PMID: 37350894 PMCID: PMC10283130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2023]
Abstract
Generating categories and classifications is a common function in life science research; however, categorizing the human population based on "race" remains controversial. There is an awareness and recognition of social-economic disparities with respect to health which are sometimes impacted by someone's ethnicity or race. This work describes an endeavor to develop a computable ontology model to represent a standardization of the concepts surrounding culture, race, ethnicity, and nationality - concepts misrepresented widely. We constructed an OWL ontology based on reliable resources with iterative human expert evaluations and aligned it to existing biomedical ontological models. The effort produced a preliminary ontology that expresses concepts related to classes of ethnic, racial, national, and cultural identities and showcases how health disparity data can be linked and expressed within our ontological framework. Future work will explore automated methods to expand the ontology and its utilization for clinical informatics.
24
Named Entity Recognition and Normalization for Alzheimer's Disease Eligibility Criteria. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS 2023; 2023:558-564. [PMID: 38283164 PMCID: PMC10815931 DOI: 10.1109/ichi57859.2023.00100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2024]
Abstract
Alzheimer's Disease (AD) is a complex neurodegenerative disorder that affects millions of people worldwide. Finding effective treatments for this disease is crucial. Clinical trials play an essential role in developing and testing new treatments for AD. However, identifying eligible participants can be challenging, time-consuming, and costly. In recent years, the development of natural language processing (NLP) techniques, specifically named entity recognition (NER) and named entity normalization (NEN), has helped to automate the identification and extraction of relevant information from the eligibility criteria (EC) more efficiently, in order to facilitate semi-automatic patient recruitment and enable data FAIRness for clinical trial data. Nevertheless, most current biomedical NER models only provide annotations for a restricted set of entity types that may not be applicable to the clinical trial data. Additionally, established techniques are currently lacking for accurately performing NEN on entities that are negated using a negative prefix. In this paper, we introduce a pipeline designed for information extraction from AD clinical trial EC, which involves preprocessing of the EC data, clinical NER, and biomedical NEN to Unified Medical Language System (UMLS). Our NER model can identify named entities in seven pre-defined categories, while our NEN model employs a combination of exact match and partial match search strategies, as well as customized rules to accurately normalize entities with negative prefixes. To evaluate the performance of our pipeline, we measured the precision, recall, and F1 score for the NER component, and we manually reviewed the top five mapping results produced by the NEN component. Our evaluation of the pipeline's performance revealed that it can successfully normalize named entities in clinical trial ECs with high accuracy. 
The NER component achieved an overall F1 score of 0.816, demonstrating its ability to accurately identify seven types of named entities in clinical text. The NEN component of the pipeline also demonstrated impressive performance, with customized rules and a combination of exact and partial match strategies leading to an accuracy of 0.940 for normalized entities.
25
Higher mortality among lean patients with non-alcoholic fatty liver disease despite fewer metabolic comorbidities. Aliment Pharmacol Ther 2023; 57:1014-1027. [PMID: 36815445 PMCID: PMC10682563 DOI: 10.1111/apt.17424] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 01/17/2023] [Revised: 01/31/2023] [Accepted: 02/07/2023] [Indexed: 02/24/2023]
Abstract
BACKGROUND & AIMS Non-alcoholic fatty liver disease (NAFLD) can develop in individuals who are not overweight. Whether lean persons with NAFLD have lower mortality and lower incidence of cirrhosis, cardiovascular diseases (CVD), diabetes mellitus (DM) and cancer than overweight/obese persons with NAFLD remains inconclusive. We compared mortality and incidence of cirrhosis, CVD, DM and cancer between lean and non-lean persons with NAFLD. METHODS This is a retrospective study of adults with NAFLD in a single centre from 2012 to 2021. Primary outcomes were mortality and new diagnosis of cirrhosis, CVD, DM and cancer. Outcomes were modelled using competing risk analysis and Cox proportional hazards regression analysis. RESULTS A total of 18,594 and 13,420 patients were identified for cross-sectional and longitudinal analysis, respectively: approximately 11% lean, 25% overweight, 28% class 1 obesity and 35% class 2-3 obesity. The median age was 51.0 years, and 54.6% were women. The median follow-up was 49.3 months. Lean patients had lower prevalence of metabolic diseases at baseline and lower incidence of cirrhosis and DM than non-lean patients, with no difference in CVD, any cancer or obesity-related cancer during follow-up. However, lean patients had significantly higher mortality, with incidence per 1000 person-years of 16.67, 10.11, 7.37 and 8.99 in the lean, overweight, obesity class 1 and obesity class 2-3 groups, respectively. CONCLUSIONS Lean patients with NAFLD had higher mortality despite lower incidence of cirrhosis and DM and similar incidence of CVD and cancer, and they merit as much attention as non-lean patients with NAFLD, if not more.
26
Associations of Antihypertensive Medication Consumption and Drug-Drug Interaction with Statin and Metformin with Reduced Alzheimer's Disease and Related Dementias Risk among Hypertensive Patients with Mild Cognitive Impairment using High Volume Claims Data. RESEARCH SQUARE 2023:rs.3.rs-2629005. [PMID: 37090575 PMCID: PMC10120765 DOI: 10.21203/rs.3.rs-2629005/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Background While hypertension is a modifiable risk factor of Alzheimer's disease and related dementias (ADRD), limited studies have been conducted on the effectiveness of antihypertensive medications (AHMs) in altering the progression from mild cognitive impairment (MCI) to ADRD; similarly, few studies have assessed drug-drug interactions of AHMs with drugs targeted to modify other risk factors of ADRD such as type II diabetes and hypercholesterolemia. Method 128,683 unique hypertensive patients with MCI on US-based Optum claims data were identified. Diuretics, beta blockers (BBs), calcium channel blockers (CCBs), angiotensin-converting enzyme inhibitors (ACE inhibitors), and angiotensin II receptor antagonists (ARBs) were identified as five major AHM classes. Baseline characteristics were compared. Cox proportional hazards (PH) models were used to study the association between specific AHM exposure and the progression from MCI to ADRD while controlling for demographic variables, comorbidities, and the use of Statins and Metformin. To examine the association of AHM-Statin or AHM-Metformin interaction with ADRD progression, we also investigated models controlling for the aforementioned confounders, as well as drug-drug interactions. Result The study included 100,678 patients who were taking at least one class of AHM and 28,005 who were not taking any AHMs during the study period. AHM users had a higher incidence of comorbidities (all P≤0.039) and consumption of Metformin and Statins (both P<0.001) compared to non-users. Users of each major AHM class showed significantly lower risk of developing ADRD compared to non-users of that specific drug class (adjusted hazard ratio (aHR): 0.96-0.98; all P≤0.048). Within patients on monotherapy (using only one AHM drug), no specific AHM class had significantly lower risk of ADRD diagnosis compared to other AHM drug classes (aHR: 0.97-1.11; all P≥0.053). 
Use of Diuretics or CCBs in combination with Metformin consumption (aHR: 0.89, 0.91, respectively) showed lower risk of MCI to ADRD progression than use without Metformin consumption (aHR: 0.97, 0.98, respectively), whereas use of any of the five major AHMs with Statin consumption (aHR: 0.91-0.94) all showed lower risk than without Statin consumption (aHR: 0.98-1.04). Conclusion All five major AHM classes showed a protective effect against ADRD progression among hypertensive patients with MCI. Also, certain combinations of AHMs with Metformin or Statins showed a stronger protective effect compared to AHMs alone, and some drug-drug interactions of AHM-Metformin or AHM-Statin also showed protective effects against progression from MCI to ADRD.
|
27
|
Application of an ontology for model cards to generate computable artifacts for linking machine learning information from biomedical research. PROCEEDINGS OF THE ... INTERNATIONAL WORLD-WIDE WEB CONFERENCE. INTERNATIONAL WWW CONFERENCE 2023; 2023:820-825. [PMID: 38327770 PMCID: PMC10848146 DOI: 10.1145/3543873.3587601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Model card reports provide a transparent description of machine learning models, including information about their evaluation, limitations, and intended use. Federal health agencies have expressed an interest in model card reports for research studies using machine learning-based AI. Previously, we developed an ontology model for model card reports to structure and formalize these reports. In this paper, we demonstrate a Java-based library (OWL API, FaCT++) that leverages our ontology to publish computable model card reports. We discuss future directions and other use cases that highlight the applicability and feasibility of ontology-driven systems to support FAIR challenges.
|
28
|
Social Determinants, Cardiovascular Disease, and Health Care Cost: A Nationwide Study in the United States Using Machine Learning. J Am Heart Assoc 2023; 12:e027919. [PMID: 36802713 PMCID: PMC10111459 DOI: 10.1161/jaha.122.027919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Background Existing studies on cardiovascular diseases (CVDs) often focus on individual-level behavioral risk factors, but research examining social determinants is limited. This study applies a novel machine learning approach to identify the key predictors of county-level care costs and prevalence of CVDs (including atrial fibrillation, acute myocardial infarction, congestive heart failure, and ischemic heart disease). Methods and Results We applied the extreme gradient boosting machine learning approach to a total of 3137 counties. Data are from the Interactive Atlas of Heart Disease and Stroke and a variety of national data sets. We found that although demographic composition (eg, percentages of Black people and older adults) and risk factors (eg, smoking and physical inactivity) are among the most important predictors for inpatient care costs and CVD prevalence, contextual factors such as social vulnerability and racial and ethnic segregation are particularly important for the total and outpatient care costs. Poverty and income inequality are the major contributors to the total care costs for counties that are in nonmetro areas or have high segregation or social vulnerability levels. Racial and ethnic segregation is particularly important in shaping the total care costs for counties with low poverty rates or social vulnerability level. Demographic composition, education, and social vulnerability are consistently important across different scenarios. Conclusions The findings highlight the differences in predictors for different types of CVD cost outcomes and the importance of social determinants. Interventions directed toward areas that have been economically and socially marginalized may aid in reducing the impact of CVDs.
|
29
|
Klotho inhibits the formation of calcium oxalate stones by regulating the Keap1-Nrf2-ARE signaling pathway. Int Urol Nephrol 2023; 55:263-276. [PMID: 36336747 DOI: 10.1007/s11255-022-03398-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/08/2022] [Accepted: 10/17/2022] [Indexed: 11/09/2022]
Abstract
PURPOSE Oxidative damage is important in calcium oxalate (CaOx) stone development but occurs via multiple pathways. Studies have shown that klotho plays an essential role in ameliorating oxidative damage. This study aims to explore the role of klotho in CaOx stones and whether the underlying mechanism is related to the regulation of Keap1-Nrf2-ARE signaling. METHODS The levels of GSH, SOD, CAT, MDA, and ROS were examined by ELISA. The klotho, Bcl-2, caspase-3, Keap1, Nrf2, HO-1, and NQO1 mRNA levels were measured by qRT-PCR, and their protein levels were detected by Western blotting. Renal tissue apoptosis was examined by TUNEL staining, and crystal cell adherence and apoptosis in HKC cells were assessed based on the Ca2+ concentrations and by flow cytometry. The renal pathological changes and the adhesion of CaOx crystals in the kidneys were examined by hematoxylin-eosin and von Kossa staining, respectively. RESULTS We constructed a CaOx kidney stone model in vitro. By regulating the klotho gene, klotho overexpression inhibited the CaOx-induced promotion of crystal cell adherence and apoptosis in HKC cells, and these effects were reversed by klotho knockdown. Moreover, our in vivo assay demonstrated that klotho overexpression alleviated glyoxylate administration-induced renal oxidative damage, renal apoptosis, and crystal deposition in the kidneys of mice, and these effects were also associated with activation of the Keap1-Nrf2-ARE pathway. CONCLUSION Klotho protein inhibits the oxidative stress response of HKC cells through the Keap1-Nrf2-ARE signaling pathway, reduces the apoptosis of and adhesion of CaOx crystals to HKC cells, and decreases the occurrence of CaOx kidney stones. CLINICAL TRIAL REGISTRATION 20220304.
|
30
|
Issues in Melanoma Detection: Semisupervised Deep Learning Algorithm Development via a Combination of Human and Artificial Intelligence. JMIR DERMATOLOGY 2022; 5:e39113. [PMID: 37632881 PMCID: PMC10334941 DOI: 10.2196/39113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Revised: 09/01/2022] [Accepted: 10/12/2022] [Indexed: 12/14/2022]
Abstract
BACKGROUND Automatic skin lesion recognition has been shown to be effective in increasing access to reliable dermatology evaluation; however, most existing algorithms rely solely on images. Many diagnostic rules, including the 3-point checklist, comprise human knowledge and reflect the diagnostic process of human experts, yet they are not considered by artificial intelligence algorithms. OBJECTIVE In this paper, we aimed to develop a semisupervised model that can not only integrate the dermoscopic features and scoring rule from the 3-point checklist but also automate the feature-annotation process. METHODS We first trained the semisupervised model on a small, annotated data set with disease and dermoscopic feature labels and tried to improve the classification accuracy by integrating the 3-point checklist using a ranking loss function. We then used a large, unlabeled data set with only disease labels to learn from the trained algorithm to automatically classify skin lesions and features. RESULTS After adding the 3-point checklist to our model, its performance for melanoma classification improved from a mean of 0.8867 (SD 0.0191) to 0.8943 (SD 0.0115) under 5-fold cross-validation. The trained semisupervised model can automatically detect 3 dermoscopic features from the 3-point checklist, with best performances of 0.80 (area under the curve [AUC] 0.8380), 0.89 (AUC 0.9036), and 0.76 (AUC 0.8444), in some cases outperforming human annotators. CONCLUSIONS Our proposed semisupervised learning framework can help with the automatic diagnosis of skin disease based on its ability to detect dermoscopic features and automate the label-annotation process. The framework can also help combine semantic knowledge with a computer algorithm to arrive at a more accurate and more interpretable diagnostic result, which can be applied to broader use cases.
|
31
|
Mining on Alzheimer's diseases related knowledge graph to identity potential AD-related semantic triples for drug repurposing. BMC Bioinformatics 2022; 23:407. [PMID: 36180861 PMCID: PMC9523633 DOI: 10.1186/s12859-022-04934-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND To date, there are no effective treatments for most neurodegenerative diseases. Knowledge graphs can provide comprehensive and semantic representation for heterogeneous data, and have been successfully leveraged in many biomedical applications including drug repurposing. Our objective is to construct a knowledge graph from literature to study the relations between Alzheimer's disease (AD) and chemicals, drugs and dietary supplements in order to identify opportunities to prevent or delay neurodegenerative progression. We collected biomedical annotations and extracted their relations using SemRep via SemMedDB. We used both a BERT-based classifier and rule-based methods during data preprocessing to exclude noise while preserving most AD-related semantic triples. The 1,672,110 filtered triples were used to train with knowledge graph completion algorithms (i.e., TransE, DistMult, and ComplEx) to predict candidates that might be helpful for AD treatment or prevention. RESULTS Among three knowledge graph completion models, TransE outperformed the other two (MR = 10.53, Hits@1 = 0.28). We leveraged the time-slicing technique to further evaluate the prediction results. We found supporting evidence for most highly ranked candidates predicted by our model which indicates that our approach can inform reliable new knowledge. CONCLUSION This paper shows that our graph mining model can predict reliable new relationships between AD and other entities (i.e., dietary supplements, chemicals, and drugs). The knowledge graph constructed can facilitate data-driven knowledge discoveries and the generation of novel hypotheses.
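The evaluation reported in this abstract (TransE scored by MR and Hits@1) can be illustrated with a minimal sketch. It assumes that MR denotes mean rank, that a triple is scored by the negative L2 distance between head + relation and tail, and that ranking is done over candidate tails only; the two-dimensional embeddings and the entity names `a`, `b`, `c` are hypothetical and are not taken from the paper.

```python
import math

def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance of (head + relation) from tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def rank_of_true_tail(head, relation, candidates, true_name):
    """1-based rank of the true tail among all candidate tails (best score first)."""
    scores = {name: transe_score(head, relation, vec) for name, vec in candidates.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(true_name) + 1

def mean_rank(ranks):
    """MR: average rank of the true entity (lower is better)."""
    return sum(ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Hits@k: fraction of test triples whose true entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical toy embeddings, for illustration only.
head = [1.0, 0.0]
relation = [0.0, 1.0]
candidates = {"a": [1.0, 1.0], "b": [0.0, 0.0], "c": [1.0, 0.5]}

ranks = [rank_of_true_tail(head, relation, candidates, name) for name in ("a", "c", "b")]
print(ranks, mean_rank(ranks), hits_at_k(ranks, 1))
```

Under these assumptions, a lower MR and a higher Hits@1 both indicate that true triples are ranked closer to the top of the candidate list.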
|
32
|
Identification of missing hierarchical relations in the vaccine ontology using acquired term pairs. J Biomed Semantics 2022; 13:22. [PMID: 35964149 PMCID: PMC9375092 DOI: 10.1186/s13326-022-00276-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Accepted: 07/24/2022] [Indexed: 11/10/2022]
Abstract
Background The Vaccine Ontology (VO) is a biomedical ontology that standardizes vaccine annotation. Errors in VO will affect a multitude of applications that it is being used in. Quality assurance of VO is imperative to ensure that it provides accurate domain knowledge to these downstream tasks. Manual review to identify and fix quality issues (such as missing hierarchical is-a relations) is challenging given the complexity of the ontology. Automated approaches are highly desirable to facilitate the quality assurance of VO. Methods We developed an automated lexical approach that identifies potentially missing is-a relations in VO. First, we construct two types of VO concept-pairs: (1) linked; and (2) unlinked. Each concept-pair further derives an Acquired Term Pair (ATP) based on their lexical features. If the same ATP is obtained by a linked concept-pair and an unlinked concept-pair, this is considered to indicate a potentially missing is-a relation between the unlinked pair of concepts. Results Applying this approach on the 1.1.192 version of VO, we were able to identify 232 potentially missing is-a relations. A manual review by a VO domain expert on a random sample of 70 potentially missing is-a relations revealed that 65 of the cases were valid missing is-a relations in VO (a precision of 92.86%). Conclusions The results indicate that our approach is highly effective in identifying missing is-a relation in VO.
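The matching rule described in this abstract can be sketched as follows. The abstract does not say how an Acquired Term Pair is derived from lexical features, so this sketch assumes one plausible definition: the words unique to the child name paired with the words unique to the parent name. The concept names are hypothetical and are not drawn from VO.

```python
def acquired_term_pair(child_name, parent_name):
    """Assumed ATP: (words only in the child, words only in the parent)."""
    child = set(child_name.lower().split())
    parent = set(parent_name.lower().split())
    return (frozenset(child - parent), frozenset(parent - child))

def suggest_missing_is_a(linked_pairs, unlinked_pairs):
    """Flag unlinked pairs whose ATP is also produced by some linked (is-a) pair."""
    linked_atps = {acquired_term_pair(c, p) for c, p in linked_pairs}
    return [(c, p) for c, p in unlinked_pairs if acquired_term_pair(c, p) in linked_atps]

# Hypothetical concept names, for illustration only.
linked = [("live attenuated influenza vaccine", "influenza vaccine")]
unlinked = [
    ("live attenuated measles vaccine", "measles vaccine"),
    ("measles vaccine", "influenza vaccine"),
]
suggestions = suggest_missing_is_a(linked, unlinked)
print(suggestions)

# Precision reported in the abstract: 65 valid cases out of 70 reviewed.
precision_pct = round(65 / 70 * 100, 2)
print(precision_pct)
```

Here only the first unlinked pair shares an ATP with a linked pair, so it is the only suggestion; the final lines reproduce the reported precision from the reviewed sample.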
|
33
|
Toward a standard formal semantic representation of the model card report. BMC Bioinformatics 2022; 23:281. [PMID: 35836130 PMCID: PMC9284683 DOI: 10.1186/s12859-022-04797-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2022] [Accepted: 06/15/2022] [Indexed: 01/01/2023]
Abstract
BACKGROUND Model card reports aim to provide an informative and transparent description of machine learning models to stakeholders. This report document is of interest to the National Institutes of Health's Bridge2AI initiative to address the FAIR challenges with artificial intelligence-based machine learning models for biomedical research. We present our early undertaking in developing an ontology for capturing the conceptual-level information embedded in model card reports. RESULTS Sourcing from existing ontologies and developing the core framework, we generated the Model Card Report Ontology. Our development efforts yielded an OWL2-based artifact that represents and formalizes model card report information. The current release of this ontology utilizes standard concepts and properties from OBO Foundry ontologies. Also, the software reasoner indicated no logical inconsistencies with the ontology. With sample model cards of machine learning models for bioinformatics research (HIV social networks and adverse outcome prediction for stent implantation), we showed the coverage and usefulness of our model in transforming static model card reports to a computable format for machine-based processing. CONCLUSIONS The benefit of our work is that it draws on the expansive, standard terminologies and scientific rigor promoted by biomedical ontologists, and that it provides an avenue to make model cards machine-readable using semantic web technology. Our future goal is to assess the veracity of our model and later expand the model to include additional concepts to address terminological gaps. We discuss tools and software that will utilize our ontology for potential application services.
|
34
|
Comparability of clinical trials and spontaneous reporting data regarding COVID-19 vaccine safety. Sci Rep 2022; 12:10946. [PMID: 35768434 PMCID: PMC9243073 DOI: 10.1038/s41598-022-13809-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Accepted: 05/27/2022] [Indexed: 11/09/2022]
Abstract
Severe adverse events (AEs) after COVID-19 vaccination are not well studied in randomized controlled trials (RCTs) due to rarity and short follow-up. To monitor the safety of COVID-19 vaccines ("Pfizer" vaccine dose 1 and 2, "Moderna" vaccine dose 1 and 2, and "Janssen" vaccine single dose) in the U.S., especially regarding severe AEs, we compare the relative rankings of these vaccines using both RCT and the Vaccine Adverse Event Reporting System (VAERS) data. The risks of local and systemic AEs were assessed from the three pivotal COVID-19 vaccine trials and also calculated in the VAERS cohort consisting of 559,717 reports between December 14, 2020 and September 17, 2021. AE rankings of the five vaccine groups calculated separately by RCT and VAERS were consistent, especially for systemic AEs. For severe AEs reported in VAERS, the reported risks of thrombosis and GBS after Janssen vaccine were highest. The reported risk of shingles after the first dose of Moderna vaccine was highest, followed by the second dose of the Moderna vaccine. The reported risk of myocarditis was higher after the second dose of Pfizer and Moderna vaccines. The reported risk of anaphylaxis was higher after the first dose of Pfizer vaccine. Limitations of this study are the inherent biases of the spontaneous reporting system data, and only including three pivotal RCTs and no comparison with other active vaccine safety surveillance systems.
|
35
|
Understanding Public Perceptions of Measles from Twitter Using Multi-Task Convolutional Neural Networks. Stud Health Technol Inform 2022; 290:607-611. [PMID: 35673088 DOI: 10.3233/shti220149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Measles is a highly contagious cause of febrile illness typically seen in young children. Recent years have witnessed the resurgence of measles cases in the United States. Prompt understanding of public perceptions of measles will allow public health agencies to respond appropriately and promptly. We proposed a multi-task Convolutional Neural Network (MT-CNN) model to classify measles-related tweets in terms of three characteristics: Type of Message (6 subclasses), Emotion Expressed (6 subclasses), and Attitude towards Vaccination (3 subclasses). A gold standard corpus that contains 2,997 tweets with annotation in these dimensions was manually curated. A variety of conventional machine learning and deep learning models were evaluated as baseline models. The MT-CNN model performed better than the baseline conventional machine learning models and the single-task CNN models, and was then applied to predict unlabeled measles-related Twitter discussions that were crawled from 2007 to 2019, and the trends of public perceptions were analyzed along three dimensions.
|
36
|
Chemical-Protein Relation Extraction with Pre-trained Prompt Tuning. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS 2022; 2022:608-609. [PMID: 37664001 PMCID: PMC10474649 DOI: 10.1109/ichi54592.2022.00120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Biomedical relation extraction plays a critical role in the construction of high-quality knowledge graphs and databases, which can further support many downstream applications. Pre-trained prompt tuning, as a new paradigm, has shown great potential in many natural language processing (NLP) tasks. By inserting a piece of text into the original input, a prompt converts NLP tasks into masked language problems, which can be better addressed by pre-trained language models (PLMs). In this study, we applied pre-trained prompt tuning to chemical-protein relation extraction using the BioCreative VI CHEMPROT dataset. The experimental results showed that pre-trained prompt tuning outperformed the baseline approach in chemical-protein interaction classification. We conclude that prompt tuning can improve the efficiency of PLMs on chemical-protein relation extraction tasks.
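The reformulation described in this abstract, in which a prompt with a masked slot is appended to the input and the word predicted for that slot is mapped to a relation class, can be sketched as below. The template wording, the label words, the class names, and the example sentence are all hypothetical; the abstract does not give the authors' template or label mapping, and the scores that a PLM would assign to the masked position are supplied by hand here.

```python
MASK = "[MASK]"

# Hypothetical verbalizer: label word predicted at the mask -> relation class.
LABEL_WORDS = {
    "inhibition": "INHIBITION",
    "activation": "ACTIVATION",
    "none": "NO_RELATION",
}

def build_prompt(sentence, chemical, protein):
    """Append a cloze-style template so classification becomes mask filling."""
    return f"{sentence} The relation between {chemical} and {protein} is {MASK}."

def predict_relation(mask_scores):
    """Pick the relation whose label word scores highest at the mask position."""
    best_word = max(LABEL_WORDS, key=lambda w: mask_scores.get(w, float("-inf")))
    return LABEL_WORDS[best_word]

prompt = build_prompt("Aspirin irreversibly blocks COX-1 activity.", "aspirin", "COX-1")

# Stand-in for the scores a masked language model would produce (made-up values).
toy_scores = {"inhibition": 0.81, "activation": 0.07, "none": 0.12}
relation = predict_relation(toy_scores)
print(prompt)
print(relation)
```

In an actual system the hand-written scores would be replaced by the masked language model's output at the mask position, restricted to the label words.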
|
37
|
Visualization of pulmonary vein reconnections using dynamic mapping in redo procedures for patients with atrial fibrillation. Europace 2022. [DOI: 10.1093/europace/euac053.199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Funding Acknowledgements
Type of funding sources: None.
Background/Introduction
Pulmonary vein (PV) reconnection is commonly associated with recurrence of atrial fibrillation (AF) after the initial catheter ablation procedure. Visualization and identification of PV reconnections are critical during repeat procedures.
Purpose
To examine the use of dynamic mapping (LiveView) in combination with a high-density mapping catheter (HD Grid) in the recognition of PV reconnections in redo AF ablation procedures.
Methods
Acute procedure data from 81 patients were prospectively collected. Mapping catheter selection and the use of LiveView was determined at the physician’s discretion. For cases where LiveView was used, the location and number of gaps from the previous procedure were identified using both standard mapping and dynamic mapping separately.
Results
Most of the patients included in the analysis were treated for paroxysmal AF (PAF: n=63/81, 77.8%). Dynamic mapping data was incorporated in 50 PAF cases and 15 persistent AF cases. Within these 65 cases, standard mapping identified a total of 120 PV gaps whereas LiveView identified a total of 138 PV gaps; gaps were most frequently identified on the right PVs, especially in the anterior region (Table 1). A contact force-sensing ablation catheter was commonly (n=64/81, 79%) used by the operators. The right anterior region was ablated with an average contact force of 13.8±3.1 g and Lesion index (LSI) of 5.2±0.7 at a power of 35.8±8.4 W. Non-PV ablation was performed in 38 (46.9%) patients; the most common lesion sets were roofline, cavotricuspid isthmus (CTI) line, and mitral isthmus line. Acute PV isolation was achieved in all patients at the end of the procedure.
Conclusion
Data from this analysis suggest the incorporation of dynamic mapping data may help reveal more PV reconnections compared to standard mapping. Additional study is needed to assess the long-term clinical outcomes when regional dynamic mapping data is used to identify PV reconnections in repeat procedures.
|
38
|
Bactericidal efficacy of low concentration of vaporized hydrogen peroxide with validation in a BSL-3 laboratory. J Hosp Infect 2022; 127:51-58. [PMID: 35594986 DOI: 10.1016/j.jhin.2022.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Revised: 04/26/2022] [Accepted: 05/05/2022] [Indexed: 11/19/2022]
Abstract
BACKGROUND Highly infective pathogens are cultured and studied in biosafety laboratories. It is critical to thoroughly disinfect these laboratories to prevent laboratory infection. A whole-room, non-contact, reduced-corrosion disinfection strategy using hydrogen peroxide was proposed and evaluated. AIM To evaluate the bactericidal efficacy of 8% and 10% vaporized hydrogen peroxide (VHP) in a laboratory setting with spores and bacteria as bioindicators. METHODS Spores of B. atrophaeus and B. stearothermophilus, along with the bacteria E. coli, S. aureus, and S. epidermidis, were placed in pre-selected locations in a sealed laboratory, and an OXY-PHARM NOCOSPRAY2 vaporized hydrogen peroxide generator was applied. Spore-killing efficacy was evaluated qualitatively, bactericidal efficacy was analyzed quantitatively, and the mean log10 reduction was determined. Finally, the optimized disinfection strategy was verified in a BSL-3 laboratory. FINDINGS Significant reductions in microbial load were obtained for each of the selected spores and bacteria when exposed to VHP in concentrations of 8% and 10% for 2-3 h. S. aureus was found to be more resistant than E. coli and S. epidermidis. Tests with 8% hydrogen peroxide and exposure for more than 3 h completely killed B. atrophaeus on surfaces and equipment in the BSL-3 laboratory. CONCLUSION The vaporized hydrogen peroxide generator is superior in terms of good diffusivity and low corrosiveness and is time-effective in removing the disinfectant residue. This study provides a reference for the precise disinfection of air and object surfaces in biosafety laboratories under varying conditions.
|
39
|
Understanding Information Needs and Barriers to Accessing Health Information Across All Stages of Pregnancy: Systematic Review. JMIR Pediatr Parent 2022; 5:e32235. [PMID: 35188477 PMCID: PMC8902674 DOI: 10.2196/32235] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/27/2021] [Revised: 11/15/2021] [Accepted: 12/08/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Understanding consumers' health information needs across all stages of the pregnancy trajectory is crucial to the development of mechanisms that allow them to retrieve high-quality, customized, and layperson-friendly health information. OBJECTIVE The objective of this study was to identify research gaps in pregnancy-related consumer information needs and available information from different sources. METHODS We conducted a systematic review of CINAHL, Cochrane, PubMed, and Web of Science for relevant articles that were published from 2009 to 2019. The quality of the included articles was assessed using the Critical Appraisal Skills Program. A descriptive data analysis was performed on these articles. Based on the review result, we developed the Pregnancy Information Needs Ontology (PINO) and made it publicly available in GitHub and BioPortal. RESULTS A total of 33 articles from 9 countries met the inclusion criteria for this review, of which the majority were published no earlier than 2016. Most studies were either descriptive (9/33, 27%), interviews (7/33, 21%), or surveys/questionnaires (7/33, 21%); 20 articles mentioned consumers' pregnancy-related information needs. Half (9/18, 50%) of the human-subject studies were conducted in the United States. More than a third (13/33, 39%) of all studies focused on during-pregnancy stage; only one study (1/33, 3%) was about all stages of pregnancy. The most frequent consumer information needs were related to labor delivery (9/20, 45%), medication in pregnancy (6/20, 30%), newborn care (5/20, 25%), and lab tests (6/20, 30%). The most frequently available source of information was the internet (15/24, 63%). PINO consists of 267 classes, 555 axioms, and 271 subclass relationships. CONCLUSIONS Only a few articles assessed the barriers to access to pregnancy-related information and the quality of each source of information; further work is needed. 
Future work is also needed to address the gaps between the information needed and the information available.
|
40
|
Data and Model Biases in Social Media Analyses: A Case Study of COVID-19 Tweets. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022; 2021:1264-1273. [PMID: 35308985 PMCID: PMC8861742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
During the coronavirus disease (COVID-19) pandemic, social media platforms such as Twitter have become a venue for individuals, health professionals, and government agencies to share COVID-19 information. Twitter has been a popular source of data for researchers, especially for public health studies. However, the use of Twitter data for research also has drawbacks and barriers. Biases appear everywhere from data collection methods to modeling approaches, and those biases have not been systematically assessed. In this study, we examined six different data collection methods and three different machine learning (ML) models (commonly used in social media analysis) to assess data collection bias and measure ML models' sensitivity to data collection bias. We showed that (1) publicly available Twitter data collection endpoints with appropriate strategies can collect data that is reasonably representative of the Twitter universe; and (2) careful examinations of ML models' sensitivity to data collection bias are critical.
|
41
|
Expressing and Executing Informed Consent Permissions Using SWRL: The All of Us Use Case. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022; 2021:197-206. [PMID: 35309008 PMCID: PMC8861693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The informed consent process is a complicated procedure involving permissions as well as a variety of entities and actions. In this paper, we discuss the use of Semantic Web Rule Language (SWRL) to further extend the Informed Consent Ontology (ICO) to allow for semantic machine-based reasoning to manage and generate important permission-based information that can later be viewed by stakeholders. We present four use cases of permissions from the All of Us informed consent document and translate these permissions into SWRL expressions to extend and operationalize ICO. Our efforts show how SWRL is able to infer some of the implicit information based on the defined rules, and demonstrate the utility of ICO through the use of SWRL extensions. Future work will include developing formal and generalized rules and expressing permissions from the entire document, as well as working towards integrating ICO into software systems to enhance the semantic representation of informed consent for biomedical research.
|
42
|
Application of artificial intelligence and machine learning for HIV prevention interventions. Lancet HIV 2022; 9:e54-e62. [PMID: 34762838 PMCID: PMC9840899 DOI: 10.1016/s2352-3018(21)00247-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/06/2021] [Revised: 08/11/2021] [Accepted: 09/02/2021] [Indexed: 01/17/2023]
Abstract
In 2019, the US Government announced its goal to end the HIV epidemic within 10 years, mirroring the initiatives set forth by UNAIDS. Public health prevention interventions are a crucial part of this ambitious goal. However, numerous challenges to this goal exist, including improving HIV awareness, increasing early HIV infection detection, ensuring rapid treatment, optimising resource distribution, and providing efficient prevention services for vulnerable populations. Artificial intelligence has had a pivotal role in revolutionising health care and has shown great potential in developing effective HIV prevention intervention strategies. Although artificial intelligence has been used in a few HIV prevention intervention areas, there are challenges to address and opportunities to explore.
|
43
|
Dental EHR-infused Persona Ontologies to Enrich Dental Dialogue Interaction of Agents. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2021; 2021:1818-1825. [PMID: 35371617 PMCID: PMC8972912 DOI: 10.1109/bibm52615.2021.9669748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The quality of patient-provider communication can predict the healthcare outcomes in patients, and therefore, training dental providers to handle the communication effort with patients is crucial. In our previous work, we developed an ontology model that can standardize and represent patient-provider communication, which can later be integrated in conversational agents as tools for dental communication training. In this study, we embark on enriching our previous model with an ontology of patient personas to portray and express types of dental patient archetypes. The Ontology of Patient Personas that we developed was rooted in terminologies from an OBO Foundry ontology and dental electronic health record data elements. We discuss how this ontology aims to enhance the aforementioned dialogue ontology and future direction in executing our model in software agents to train dental students.
|
44
|
Equine chorionic gonadotropin pretreatment 15 days before fixed-time artificial insemination improves the reproductive performance of replacement gilts. Animal 2021; 15:100406. [PMID: 34844186 DOI: 10.1016/j.animal.2021.100406] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/07/2021] [Revised: 10/12/2021] [Accepted: 10/14/2021] [Indexed: 10/19/2022]
Abstract
Fixed-time artificial insemination (FTAI) technology uses exogenous reproductive hormones to regulate the sexual cycle and ovulation of sows without oestrus identification, which improves the sow breeding utilisation rate, reduces the number of non-productive days, and elevates the efficiency of pig farm management. In this study, we aimed to optimise FTAI procedures. Healthy 190-day-old Large White × Landrace crossbred replacement gilts weighing about 90 kg (n = 166) and of unknown reproductive status were randomly selected and divided into three groups: a control group (n = 62), an eCG-15D group in which the gilts were pretreated with equine chorionic gonadotropin (eCG) injection 15 days before starting FTAI (n = 50), and an eCG-20D group pretreated with eCG injection 20 days before starting FTAI (n = 54). All three groups were then subjected to the same conventional FTAI procedure. Pigs were orally administered altrenogest (ALT, 20 mg per pig per day) for 18 days; 42 h after ALT feeding was stopped, they were injected with 1 000 IU eCG, followed by 100 μg gonadotropin-releasing hormone (GnRH) 80 h later. The gilts were inseminated for the first time 24 h after GnRH injection and again 16 h later. After 42 h of ALT feeding, gilts in the eCG-15D group displayed a larger follicular diameter until artificial insemination (AI) than those in the other groups (P < 0.05). In addition, ovulation times were the most synchronised in the eCG-15D group, with 100% of the gilts ovulating before the second AI on day 25 of FTAI. Furthermore, the gilts in the eCG-15D group achieved the highest pregnancy rate (92%), farrowing rate (90%), total pigs born (11.59), and pigs born alive (11.18). Together, the findings of this study demonstrate that reproductive performance can be optimised by pretreating gilts with eCG 15 days before conventional FTAI.
|
45
|
Letter: hepatocellular carcinoma risk in patients with non-selective beta blockers-authors' reply. Aliment Pharmacol Ther 2021; 54:1095-1096. [PMID: 34564885 DOI: 10.1111/apt.16598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/09/2022]
|
46
|
Editorial: when to start carvedilol in cirrhosis-time to reconsider? Authors' reply. Aliment Pharmacol Ther 2021; 54:728-729. [PMID: 34379839 DOI: 10.1111/apt.16526] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
|
47
|
Using Machine Learning-Based Approaches for the Detection and Classification of Human Papillomavirus Vaccine Misinformation: Infodemiology Study of Reddit Discussions. J Med Internet Res 2021; 23:e26478. [PMID: 34383667 PMCID: PMC8380585 DOI: 10.2196/26478] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/13/2020] [Revised: 04/14/2021] [Accepted: 05/06/2021] [Indexed: 12/25/2022] Open
Abstract
Background The rapid growth of social media as an information channel has made it possible to quickly spread inaccurate or false vaccine information, thus creating obstacles for vaccine promotion. Objective The aim of this study is to develop and evaluate an intelligent automated protocol for identifying and classifying human papillomavirus (HPV) vaccine misinformation on social media using machine learning (ML)–based methods. Methods Reddit posts (from 2007 to 2017, N=28,121) that contained keywords related to HPV vaccination were compiled. A random subset (2200/28,121, 7.82%) was manually labeled for misinformation and served as the gold standard corpus for evaluation. A total of 5 ML-based algorithms, including a support vector machine, logistic regression, extremely randomized trees, a convolutional neural network, and a recurrent neural network designed to identify vaccine misinformation, were evaluated for identification performance. Topic modeling was applied to identify the major categories associated with HPV vaccine misinformation. Results A convolutional neural network model achieved the highest area under the receiver operating characteristic curve of 0.7943. Of the 28,121 Reddit posts, 7207 (25.63%) were classified as vaccine misinformation, with discussions about general safety issues identified as the leading type of misinformed posts (2666/7207, 36.99%). Conclusions ML-based approaches are effective in the identification and classification of HPV vaccine misinformation on Reddit and may be generalizable to other social media platforms. ML-based methods may provide the capacity and utility to meet the challenge involved in intelligent automated monitoring and classification of public health misinformation on social media platforms. The timely identification of vaccine misinformation on the internet is the first step in misinformation correction and vaccine promotion.
|
48
|
Abstract
Jupiter's upper atmosphere is considerably hotter than expected from the amount of sunlight that it receives [1-3]. Processes that couple the magnetosphere to the atmosphere give rise to intense auroral emissions and enormous deposition of energy in the magnetic polar regions, so it has been presumed that redistribution of this energy could heat the rest of the planet [4-6]. Instead, most thermospheric global circulation models demonstrate that auroral energy is trapped at high latitudes by the strong winds on this rapidly rotating planet [3,5,7-10]. Consequently, other possible heat sources have continued to be studied, such as heating by gravity waves and acoustic waves emanating from the lower atmosphere [2,11-13]. Each mechanism would imprint a unique signature on the global Jovian temperature gradients, thus revealing the dominant heat source, but a lack of planet-wide, high-resolution data has meant that these gradients have not been determined. Here we report infrared spectroscopy of Jupiter with a spatial resolution of 2 degrees in longitude and latitude, extending from pole to equator. We find that temperatures decrease steadily from the auroral polar regions to the equator. Furthermore, during a period of enhanced activity possibly driven by a solar wind compression, a high-temperature planetary-scale structure was observed that may be propagating from the aurora. These observations indicate that Jupiter's upper atmosphere is predominantly heated by the redistribution of auroral energy.
|
49
|
Nonselective beta-blockers are associated with a lower risk of hepatocellular carcinoma among cirrhotic patients in the United States. Aliment Pharmacol Ther 2021; 54:481-492. [PMID: 34224163 DOI: 10.1111/apt.16490] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 04/27/2021] [Revised: 05/21/2021] [Accepted: 06/04/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND Previous studies have demonstrated an association between nonselective beta-blockers (NSBBs) and lower risk of hepatocellular carcinoma (HCC) in cirrhosis. However, there has been no population-based study investigating the risk of HCC among cirrhotic patients treated using carvedilol. AIMS To determine the risk of HCC among cirrhotic patients treated with NSBBs, including carvedilol. METHODS This retrospective cohort study utilised the Cerner Health Facts database in the United States from 2000 to 2017. Kaplan-Meier estimates, Cox proportional hazards regression, and propensity score matching (PSM) were used to test the HCC risk in the carvedilol, nadolol, and propranolol groups compared with the no beta-blocker group. RESULTS The final cohort comprised 107 428 eligible patients. The 100-month cumulative HCC incidence in each NSBB group was significantly lower than in the no beta-blocker group: carvedilol (11.24%) vs no beta-blocker (15.69%), nadolol (27.55%) vs no beta-blocker (32.11%), and propranolol (26.17%) vs no beta-blocker (28.84%) (P values < 0.0001). NSBBs were associated with a significantly lower risk of HCC after PSM in the multivariate Cox analysis (hazard ratio: carvedilol 0.61 (95% CI 0.51-0.73), nadolol 0.74 (95% CI 0.63-0.87), propranolol 0.75 (95% CI 0.66-0.84)). In subgroup analysis, NSBBs reduced the risk of HCC in cirrhosis with complications and non-alcoholic cirrhosis. CONCLUSIONS NSBBs, including carvedilol, were associated with a significantly decreased risk of HCC in patients with cirrhosis when compared with no beta-blocker, regardless of complication status. Future randomised controlled studies comparing the incidence of HCC among NSBBs should elucidate which NSBB would be the best option to prevent HCC in cirrhosis.
|
50
|
Extracting postmarketing adverse events from safety reports in the vaccine adverse event reporting system (VAERS) using deep learning. J Am Med Inform Assoc 2021; 28:1393-1400. [PMID: 33647938 DOI: 10.1093/jamia/ocab014] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/26/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 11/14/2022] Open
Abstract
OBJECTIVE Automated analysis of vaccine postmarketing surveillance narrative reports is important to understand the progression of rare but severe vaccine adverse events (AEs). This study implemented and evaluated state-of-the-art deep learning algorithms for named entity recognition to extract nervous system disorder-related events from vaccine safety reports. MATERIALS AND METHODS We collected Guillain-Barré syndrome (GBS) related influenza vaccine safety reports from the Vaccine Adverse Event Reporting System (VAERS) from 1990 to 2016. VAERS reports were selected and manually annotated with major entities related to nervous system disorders, including investigation, nervous_AE, other_AE, procedure, social_circumstance, and temporal_expression. A variety of conventional machine learning and deep learning algorithms were then evaluated for the extraction of the above entities. We further pretrained domain-specific BERT (Bidirectional Encoder Representations from Transformers) using VAERS reports (VAERS BERT) and compared its performance with existing models. RESULTS AND CONCLUSIONS Ninety-one VAERS reports were annotated, resulting in 2512 entities. The corpus was made publicly available to promote community efforts on vaccine AE identification. Deep learning-based methods (eg, bi-long short-term memory and BERT models) outperformed conventional machine learning-based methods (ie, conditional random fields with extensive features). The BioBERT large model achieved the highest exact match F-1 scores on nervous_AE, procedure, social_circumstance, and temporal_expression, while VAERS BERT large models achieved the highest exact match F-1 scores on investigation and other_AE. An ensemble of these 2 models achieved the highest exact match microaveraged F-1 score at 0.6802 and the second highest lenient match microaveraged F-1 score at 0.8078 among peer models.
|