1
Tanaka H, Shimaoka M. Challenges associated with delayed definitive diagnosis among Japanese patients with specific intractable diseases: A cross-sectional study. Intractable Rare Dis Res 2023; 12:213-221. [PMID: 38024587 PMCID: PMC10680161 DOI: 10.5582/irdr.2023.01068] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/18/2023] [Revised: 09/18/2023] [Accepted: 10/15/2023] [Indexed: 12/01/2023] Open
Abstract
This study used a survey to identify the challenges that delay diagnosis in Japanese patients with specific intractable diseases. We conducted a questionnaire survey involving 424 patients with 12 specific intractable diseases. Pearson's chi-square test was used to examine the relationship between diagnostic delay and each factor, and the reasons for the diagnostic delay were analyzed. The test showed statistically significant associations between the period to definitive diagnosis and both the period between symptom onset and first hospital visit (p = 0.002) and the period when the patients suspected the disease (p < 0.001). Reasons for diagnostic delay were patients' time constraints, problems accessing medical institutions, hesitancy in seeking medical attention, and healthcare system issues. Early definitive diagnosis of intractable diseases was hindered by several important issues. The resolution of these issues will require combined societal efforts as well as improvements in the healthcare system. The study revealed the need for improving patients' awareness about their disease, enabling patients to be proactive towards achieving a definitive diagnosis, and making improvements in the healthcare system regarding early diagnosis and care of patients with intractable diseases.
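As a rough illustration of the test named in this abstract, the sketch below computes Pearson's chi-square statistic for a contingency table. The counts are hypothetical and are not taken from the study; the function name `chi_square_statistic` is likewise an assumption made for this example.

```python
from typing import List, Tuple


def chi_square_statistic(observed: List[List[int]]) -> Tuple[float, int]:
    """Return Pearson's chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Hypothetical counts (NOT from the study):
# rows = short vs. long delay before the first hospital visit,
# columns = early vs. late definitive diagnosis.
table = [[30, 10], [20, 40]]
stat, dof = chi_square_statistic(table)
print(f"chi-square = {stat:.2f}, dof = {dof}")
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to p < 0.05. An exact p-value would come from the chi-square distribution, for example via `scipy.stats.chi2.sf(stat, dof)` if SciPy is available.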
Affiliation(s)
- Hiroyuki Tanaka
  - Graduate School of Health Innovation, Kanagawa University of Human Services, Kawasaki, Kanagawa, Japan
  - CMIC Ashfield Co., Ltd., Tokyo, Japan
- Mikiko Shimaoka
  - Graduate School of Health Innovation, Kanagawa University of Human Services, Kawasaki, Kanagawa, Japan
2
Pathak R, Catalan-Matamoros D. Can Twitter posts serve as early indicators for potential safety signals? A retrospective analysis. International Journal of Risk & Safety in Medicine 2023; 34:41-61. [PMID: 35491804 DOI: 10.3233/jrs-210024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/18/2023]
Abstract
BACKGROUND As Twitter has gained significant popularity, tweets can serve as a large pool of readily available data to estimate the adverse events (AEs) of medications. OBJECTIVE This study evaluated whether tweets were an early indicator for potential safety warnings. Additionally, the trend of AEs posted on Twitter was compared with AEs from the Yellow Card system in the United Kingdom. METHODS English Tweets for 35 drug-event pairs for the period 2017-2019, two years prior to the date of EMA Pharmacovigilance Risk Assessment Committee (PRAC) meeting, were collected. Both signal and non-signal AEs were manually identified and encoded using the MedDRA dictionary. AEs from Yellow Card were also gathered for the same period. Descriptive and inferential statistical analysis was conducted using Fisher's exact test to assess the distribution and proportion of AEs from the two data sources. RESULTS Of the total 61,661 English tweets, 1,411 had negative or neutral sentiment and mention of at least one AE. Tweets for 15 out of the 35 drugs (42.9%) contained AEs associated with the signals. On pooling data from Twitter and Yellow Card, 24 out of 35 drug-event pairs (68.6%) were identified prior to the respective PRAC meetings. Both data sources showed a similar distribution of AEs based on seriousness; however, the distribution based on labelling was divergent. CONCLUSION Twitter cannot be used in isolation for signal detection in current pharmacovigilance (PV) systems. However, it can be used in combination with traditional PV systems for early signal detection, as it can provide a holistic drug safety profile.
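The abstract names Fisher's exact test for comparing AE proportions between two sources. A minimal sketch of the two-sided test for a 2x2 table follows, using only the standard library. The counts are hypothetical and the function name `fisher_exact_two_sided` is an assumption for illustration; the study's own tables are not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (NOT from the study):
# rows = Twitter vs. Yellow Card, columns = serious vs. non-serious AEs.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}")
```

The small relative tolerance in the comparison guards against floating-point ties between equally likely tables. In practice, `scipy.stats.fisher_exact` offers the same test.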
Affiliation(s)
- Revati Pathak
  - UC3M Medialab, Department of Communication and Media Studies, University Carlos III of Madrid, Madrid, Spain
  - Eu2P Programme, University of Bordeaux, Bordeaux, France
- Daniel Catalan-Matamoros
  - UC3M Medialab, Department of Communication and Media Studies, University Carlos III of Madrid, Madrid, Spain
  - Eu2P Programme, University of Bordeaux, Bordeaux, France
  - Health Research Centre, University of Almeria, Almeria, Spain
3
Thakur N. MonkeyPox2022Tweets: A Large-Scale Twitter Dataset on the 2022 Monkeypox Outbreak, Findings from Analysis of Tweets, and Open Research Questions. Infect Dis Rep 2022; 14:855-883. [PMID: 36412745 PMCID: PMC9680479 DOI: 10.3390/idr14060087] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/23/2022] [Revised: 10/13/2022] [Accepted: 11/08/2022] [Indexed: 11/16/2022] Open
Abstract
The mining of Tweets to develop datasets on recent issues, global challenges, pandemics, virus outbreaks, emerging technologies, and trending matters has been of significant interest to the scientific community in the recent past, as such datasets serve as a rich data resource for the investigation of different research questions. Furthermore, the virus outbreaks of the past, such as COVID-19, Ebola, Zika virus, and flu, just to name a few, were associated with various works related to the analysis of the multimodal components of Tweets to infer the different characteristics of conversations on Twitter related to these respective outbreaks. The ongoing outbreak of the monkeypox virus, declared a Global Public Health Emergency (GPHE) by the World Health Organization (WHO), has resulted in a surge of conversations about this outbreak on Twitter, which is resulting in the generation of tremendous amounts of Big Data. There has been no prior work in this field thus far that has focused on mining such conversations to develop a Twitter dataset. Furthermore, no prior work has focused on performing a comprehensive analysis of Tweets about this ongoing outbreak. To address these challenges, this work makes three scientific contributions to this field. First, it presents an open-access dataset of 556,427 Tweets about monkeypox that have been posted on Twitter since the first detected case of this outbreak. A comparative study is also presented that compares this dataset with 36 prior works in this field that focused on the development of Twitter datasets to further uphold the novelty, relevance, and usefulness of this dataset. Second, the paper reports the results of a comprehensive analysis of the Tweets of this dataset. 
This analysis presents several novel findings; for instance, out of all the 34 languages supported by Twitter, English has been the most used language to post Tweets about monkeypox, about 40,000 Tweets related to monkeypox were posted on the day WHO declared monkeypox as a GPHE, a total of 5470 distinct hashtags have been used on Twitter about this outbreak out of which #monkeypox is the most used hashtag, and Twitter for iPhone has been the leading source of Tweets about the outbreak. The sentiment analysis of the Tweets was also performed, and the results show that despite a lot of discussions, debate, opinions, information, and misinformation, on Twitter on various topics in this regard, such as monkeypox and the LGBTQI+ community, monkeypox and COVID-19, vaccines for monkeypox, etc., "neutral" sentiment was present in most of the Tweets. It was followed by "negative" and "positive" sentiments, respectively. Finally, to support research and development in this field, the paper presents a list of 50 open research questions related to the outbreak in the areas of Big Data, Data Mining, Natural Language Processing, and Machine Learning that may be investigated based on this dataset.
Affiliation(s)
- Nirmalya Thakur
  - Department of Computer Science, Emory University, Atlanta, GA 30322, USA
4
Frantz LM, Wall LB, Goldfarb CA. Media Depiction of Birth Differences of the Upper Extremity: Accuracy of Shared Diagnoses. J Pediatr Orthop 2022; 42:e753-e755. [PMID: 35576061 DOI: 10.1097/bpo.0000000000002185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
BACKGROUND To assess the diagnostic accuracy of public representation of congenital differences of the upper extremities. We hypothesized that there is an over-diagnosis of certain diagnoses such as amniotic constriction band and under-diagnosis of others such as symbrachydactyly and radial deficiency. METHODS Publicly shared images and associated diagnoses were searched on publicly available news media and social media accounts published from October 2018 through November 2021 using key terms such as "amniotic band syndrome," "congenital arm amputation," and "3D prosthetic arm" as well as The Lucky Fin Project account on Instagram. The images were collected and reviewed by 2 congenital hand surgeons. The surgeons' diagnoses were then compared to the reported diagnoses associated with each image to assess accuracy. RESULTS A total of 100 images were collected with the reported diagnosis associated with each image. Two images were removed due to evidence of prior surgery. The hand surgeons' diagnosis disagreed with the reported diagnosis in 60 of 98 (61%) images. Of those 60 inaccurate diagnoses, 2/3 were reported as amniotic constriction band. CONCLUSIONS Media and social media depictions of congenital upper extremity differences are frequently inaccurate, and our search demonstrated that the amniotic constriction band is the most commonly reported, inaccurate diagnosis. Accuracy of diagnosis in public media is important given the impact a diagnosis has on those viewing and sharing the images. LEVEL OF EVIDENCE Level IV, diagnostic.
Affiliation(s)
- Lisa M Frantz
  - University of Kansas School of Medicine, Wichita, KS
5
Klein AZ, O'Connor K, Levine LD, Gonzalez-Hernandez G. Using Twitter Data for Cohort Studies of Drug Safety in Pregnancy: Proof-of-concept With β-Blockers. JMIR Form Res 2022; 6:e36771. [PMID: 35771614 PMCID: PMC9284350 DOI: 10.2196/36771] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/24/2022] [Revised: 04/27/2022] [Accepted: 06/06/2022] [Indexed: 01/26/2023] Open
Abstract
Background Despite the fact that medication is taken during more than 90% of pregnancies, the fetal risk for most medications is unknown, and the majority of medications have no data regarding safety in pregnancy. Objective Using β-blockers as a proof-of-concept, the primary objective of this study was to assess the utility of Twitter data for a cohort study design—in particular, whether we could identify (1) Twitter users who have posted tweets reporting that they took medication during pregnancy and (2) their associated pregnancy outcomes. Methods We searched for mentions of β-blockers in 2.75 billion tweets posted by 415,690 users who announced their pregnancy on Twitter. We manually reviewed the matching tweets to first determine if the user actually took the β-blocker mentioned in the tweet. Then, to help determine if the β-blocker was taken during pregnancy, we used the time stamp of the tweet reporting intake and drew upon an automated natural language processing (NLP) tool that estimates the date of the user’s prenatal time period. For users who posted tweets indicating that they took or may have taken the β-blocker during pregnancy, we drew upon additional NLP tools to help identify tweets that report their pregnancy outcomes. Adverse pregnancy outcomes included miscarriage, stillbirth, birth defects, preterm birth (<37 weeks gestation), low birth weight (<5 pounds and 8 ounces at delivery), and neonatal intensive care unit (NICU) admission. Normal pregnancy outcomes included gestational age ≥37 weeks and birth weight ≥5 pounds and 8 ounces. Results We retrieved 5114 tweets, posted by 2339 users, that mention a β-blocker, and manually identified 2332 (45.6%) tweets, posted by 1195 (51.1%) of the users, that self-report taking the β-blocker. We were able to estimate the date of the prenatal time period for 356 pregnancies among 334 (27.9%) of these 1195 users. 
Among these 356 pregnancies, we identified 257 (72.2%) during which the β-blocker was or may have been taken. We manually verified an adverse pregnancy outcome—preterm birth, NICU admission, low birth weight, birth defects, or miscarriage—for 38 (14.8%) of these 257 pregnancies. We manually verified a gestational age ≥37 weeks for 198 (90.4%) and a birth weight ≥5 pounds and 8 ounces for 50 (22.8%) of the 219 pregnancies for which we did not identify an adverse pregnancy outcome. Conclusions Our ability to detect pregnancy outcomes for Twitter users who posted tweets reporting that they took or may have taken a β-blocker during pregnancy suggests that Twitter can be a complementary resource for cohort studies of drug safety in pregnancy.
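The percentages reported in this abstract can be rechecked from the stated counts. The short sketch below does that arithmetic; the step labels and the helper name `pct` are informal paraphrases introduced for this example, while the counts are those given in the abstract.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# (label, numerator, denominator) as stated in the abstract.
funnel = [
    ("tweets self-reporting intake", 2332, 5114),
    ("users self-reporting intake", 1195, 2339),
    ("users with a dated prenatal period", 334, 1195),
    ("pregnancies with possible exposure", 257, 356),
    ("adverse outcome verified", 38, 257),
    ("gestational age >= 37 weeks verified", 198, 219),
    ("birth weight >= 5 lb 8 oz verified", 50, 219),
]

percentages = [pct(part, whole) for _, part, whole in funnel]
for (label, part, whole), value in zip(funnel, percentages):
    print(f"{label}: {part}/{whole} = {value}%")

# The 219 pregnancies without an identified adverse outcome are 257 minus 38.
remaining = 257 - 38
```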
Affiliation(s)
- Ari Z Klein
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Karen O'Connor
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Lisa D Levine
  - Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
6
Ge Y, Guo Y, Yang YC, Al-Garadi MA, Sarker A. A comparison of few-shot and traditional named entity recognition models for medical text. IEEE International Conference on Healthcare Informatics 2022; 2022:84-89. [PMID: 37641590 PMCID: PMC10462421 DOI: 10.1109/ichi54592.2022.00024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Many research problems involving medical texts have limited amounts of annotated data available (e.g., expressions of rare diseases). Traditional supervised machine learning algorithms, particularly those based on deep neural networks, require large volumes of annotated data, and they underperform when only small amounts of labeled data are available. Few-shot learning (FSL) is a category of machine learning models that are designed with the intent of solving problems that have small annotated datasets available. However, there is no current study that compares the performances of FSL models with traditional models (e.g., conditional random fields) for medical text at different training set sizes. In this paper, we attempted to fill this gap in research by comparing multiple FSL models with traditional models for the task of named entity recognition (NER) from medical texts. Using five health-related annotated NER datasets, we benchmarked three traditional NER models based on BERT: BERT-Linear Classifier (BLC), BERT-CRF (BC) and SANER; and three FSL NER models: StructShot & NNShot, Few-Shot Slot Tagging (FS-ST) and ProtoNER. Our benchmarking results show that almost all models, whether traditional or FSL, achieve significantly lower performances compared to the state-of-the-art with small amounts of training data. For the NER experiments we executed, the F1-scores were very low with small training sets, typically below 30%. FSL models that were reported to perform well on non-medical texts significantly underperformed, compared to their reported best, on medical texts. Our experiments also suggest that FSL methods tend to perform worse on data sets from noisy sources of medical texts, such as social media (which includes misspellings and colloquial expressions), compared to less noisy sources such as medical literature.
Our experiments demonstrate that the current state-of-the-art FSL systems are not yet suitable for effective NER in medical natural language processing tasks, and further research needs to be carried out to improve their performances. Creation of specialized, standardized datasets replicating real-world scenarios may help to move this category of methods forward.
Affiliation(s)
- Yao Ge
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA
- Yuting Guo
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA
- Yuan-Chi Yang
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA
- Abeed Sarker
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA
7
Klein AZ, O'Connor K, Gonzalez-Hernandez G. Toward Using Twitter Data to Monitor COVID-19 Vaccine Safety in Pregnancy: Proof-of-Concept Study of Cohort Identification. JMIR Form Res 2022; 6:e33792. [PMID: 34870607 PMCID: PMC8734607 DOI: 10.2196/33792] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/23/2021] [Revised: 11/15/2021] [Accepted: 11/22/2021] [Indexed: 01/19/2023] Open
Abstract
Background COVID-19 during pregnancy is associated with an increased risk of maternal death, intensive care unit admission, and preterm birth; however, many people who are pregnant refuse to receive COVID-19 vaccination because of a lack of safety data. Objective The objective of this preliminary study was to assess whether Twitter data could be used to identify a cohort for epidemiologic studies of COVID-19 vaccination in pregnancy. Specifically, we examined whether it is possible to identify users who have reported (1) that they received COVID-19 vaccination during pregnancy or the periconception period, and (2) their pregnancy outcomes. Methods We developed regular expressions to search for reports of COVID-19 vaccination in a large collection of tweets posted through the beginning of July 2021 by users who have announced their pregnancy on Twitter. To help determine if users were vaccinated during pregnancy, we drew upon a natural language processing (NLP) tool that estimates the timeframe of the prenatal period. For users who posted tweets with a timestamp indicating they were vaccinated during pregnancy, we drew upon additional NLP tools to help identify tweets that reported their pregnancy outcomes. Results We manually verified the content of tweets detected automatically, identifying 150 users who reported on Twitter that they received at least one dose of COVID-19 vaccination during pregnancy or the periconception period. We manually verified at least one reported outcome for 45 of the 60 (75%) completed pregnancies. Conclusions Given the limited availability of data on COVID-19 vaccine safety in pregnancy, Twitter can be a complementary resource for potentially increasing the acceptance of COVID-19 vaccination in pregnant populations. The results of this preliminary study justify the development of scalable methods to identify a larger cohort for epidemiologic studies.
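The abstract describes searching tweets with regular expressions for reports of vaccination. The sketch below shows what such a search might look like. The pattern, the sample tweets, and the names `VACCINE_REPORT` and `find_reports` are invented for illustration and are not the authors' expressions, which the abstract does not list.

```python
import re
from typing import List

# Hypothetical pattern (NOT the study's): a first-person statement of
# having received a COVID-19 vaccine dose.
VACCINE_REPORT = re.compile(
    r"\b(?:i|we)\s+(?:just\s+)?(?:got|received|had)\s+(?:my|our|the)\s+"
    r"(?:first|second|1st|2nd)?\s*(?:dose\s+of\s+(?:the\s+)?)?"
    r"(?:covid(?:-19)?\s+)?(?:vaccine|vaccination|shot|jab)\b",
    re.IGNORECASE,
)


def find_reports(tweets: List[str]) -> List[str]:
    """Return the tweets that match the first-person vaccination pattern."""
    return [text for text in tweets if VACCINE_REPORT.search(text)]


# Invented sample tweets for demonstration only.
sample = [
    "I got my second covid vaccine today at 30 weeks pregnant",
    "Everyone should get the vaccine",
    "I received the first dose of the COVID-19 vaccine",
]
matches = find_reports(sample)
print(matches)
```

A pattern like this only flags candidates. As the abstract notes, the detected tweets were then verified manually, which a keyword match cannot replace.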
Affiliation(s)
- Ari Z Klein
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Karen O'Connor
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Graciela Gonzalez-Hernandez
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
8
Helman SM, Herrup EA, Christopher AB, Al-Zaiti SS. The role of machine learning applications in diagnosing and assessing critical and non-critical CHD: a scoping review. Cardiol Young 2021; 31:1770-1780. [PMID: 34725005 PMCID: PMC8805679 DOI: 10.1017/s1047951121004212] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
Abstract
Machine learning uses historical data to make predictions about new data. It has been frequently applied in healthcare to optimise diagnostic classification through discovery of hidden patterns in data that may not be obvious to clinicians. Congenital Heart Defect (CHD) machine learning research entails one of the most promising clinical applications, in which timely and accurate diagnosis is essential. The objective of this scoping review is to summarise the application and clinical utility of machine learning techniques used in paediatric cardiology research, specifically focusing on approaches aiming to optimise diagnosis and assessment of underlying CHD. Out of 50 full-text articles identified between 2015 and 2021, 40% focused on optimising the diagnosis and assessment of CHD. Deep learning and support vector machine were the most commonly used algorithms, accounting for an overall diagnostic accuracy > 0.80. Clinical applications primarily focused on the classification of auscultatory heart sounds, transthoracic echocardiograms, and cardiac MRIs. The range of these applications and directions of future research are discussed in this scoping review.
Affiliation(s)
- Stephanie M Helman
  - Department of Acute and Tertiary Care Nursing, University of Pittsburgh, Pittsburgh, PA, USA
- Elizabeth A Herrup
  - Division of Pediatric Critical Care Medicine, UPMC Children's Hospital of Pittsburgh, Pittsburgh, PA, USA
- Adam B Christopher
  - Division of Pediatric Cardiology, UPMC Children's Hospital of Pittsburgh, Pittsburgh, PA, USA
- Salah S Al-Zaiti
  - Department of Acute and Tertiary Care Nursing, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Emergency Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Division of Cardiology, University of Pittsburgh, Pittsburgh, PA, USA
9
Koss J, Rheinlaender A, Truebel H, Bohnet-Joschko S. Social media mining in drug development - Fundamentals and use cases. Drug Discov Today 2021; 26:2871-2880. [PMID: 34481080 DOI: 10.1016/j.drudis.2021.08.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/15/2021] [Revised: 06/03/2021] [Accepted: 08/27/2021] [Indexed: 11/18/2022]
Abstract
The incorporation of patients' perspectives into drug discovery and development has become critically important from the viewpoint of accounting for modern-day business dynamics. There is a trend among patients to narrate their disease experiences on social media. The insights gained by analyzing the data pertaining to such social-media posts could be leveraged to support patient-centered drug development. Manual analysis of these data is nearly impossible, but artificial intelligence enables automated and cost-effective processing, also referred to as social media mining (SMM). This paper discusses the fundamental SMM methods along with several relevant drug-development use cases.
Affiliation(s)
- Hubert Truebel
  - Witten/Herdecke University, Witten, Germany; AiCuris AG, Wuppertal, Germany
10
Durand-Moreau Q, Mackenzie G, Adisesh A, Straube S, Chan XHS, Zelyas N, Greenhalgh T. Twitter Analytics to Inform Provisional Guidance for COVID-19 Challenges in the Meatpacking Industry. Ann Work Expo Health 2021; 65:373-376. [PMID: 33492381 PMCID: PMC7929462 DOI: 10.1093/annweh/wxaa123] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/08/2020] [Revised: 11/02/2020] [Accepted: 11/13/2020] [Indexed: 12/04/2022] Open
Abstract
The COVID-19 pandemic made it considerably harder to obtain reliable guidance to help occupational health practitioners, workers, and stakeholders build efficient prevention strategies at the workplace, given the constant increase of publications in the domain, the time required to run high-quality research and systematic reviews, and the urgent need to identify areas for prevention at the workplace. Social media, and Twitter in particular, have already been used in research and constitute a useful source of information to identify community needs and topics of interest for prevention in the meatpacking industry. In this commentary, we introduce the methods and tools we used to screen relevant posts on Twitter. Twitter analytics is a way to capture real-time concerns of the community and help ensure compliance with the notion of social accountability. As such research has limitations in terms of exhaustiveness and level of evidence, it should be considered as provisional guidance to direct both actions at the workplace and further conventional research projects.
Affiliation(s)
- Quentin Durand-Moreau
  - Division of Preventive Medicine, Department of Medicine, Faculty of Medicine and Dentistry, 5-30 University Terrace, 8303-112St, Edmonton AB, Canada
- Graham Mackenzie
  - Department of Public Health, NHS Education for Scotland, 102 West Port, Edinburgh, UK
- Anil Adisesh
  - Division of Occupational Medicine, Department of Medicine, University of Toronto and St. Michael's Hospital, 30 Bond St, Toronto, ON, Canada
- Sebastian Straube
  - Division of Preventive Medicine, Department of Medicine, Faculty of Medicine and Dentistry, 5-30 University Terrace, 8303-112St, Edmonton AB, Canada
- Xin Hui S Chan
  - Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Nathan Zelyas
  - Department of Laboratory Medicine and Pathology, University of Alberta, 116 St & 85 Ave, Edmonton, AB, Canada
- Trisha Greenhalgh
  - Nuffield Department of Primary Care Health Sciences, University of Oxford, Radcliffe Primary Care Building, Radcliffe Observatory Quarter, Woodstock Rd, Oxford, UK
11
Davidson L, Boland MR. Towards deep phenotyping pregnancy: a systematic review on artificial intelligence and machine learning methods to improve pregnancy outcomes. Brief Bioinform 2021; 22:6065792. [PMID: 33406530 PMCID: PMC8424395 DOI: 10.1093/bib/bbaa369] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/22/2020] [Revised: 10/13/2020] [Accepted: 11/18/2020] [Indexed: 12/16/2022] Open
Abstract
Objective Development of novel informatics methods focused on improving pregnancy outcomes remains an active area of research. The purpose of this study is to systematically review the ways that artificial intelligence (AI) and machine learning (ML), including deep learning (DL), methodologies can inform patient care during pregnancy and improve outcomes. Materials and methods We searched English articles on EMBASE, PubMed and SCOPUS. Search terms included ML, AI, pregnancy and informatics. We included research articles and book chapters, excluding conference papers, editorials and notes. Results We identified 127 distinct studies from our queries that were relevant to our topic and included in the review. We found that supervised learning methods were more popular (n = 69) than unsupervised methods (n = 9). Popular methods included support vector machines (n = 30), artificial neural networks (n = 22), regression analysis (n = 17) and random forests (n = 16). Methods such as DL are beginning to gain traction (n = 13). Common areas within the pregnancy domain where AI and ML methods were used the most include prenatal care (e.g. fetal anomalies, placental functioning) (n = 73); perinatal care, birth and delivery (n = 20); and preterm birth (n = 13). Efforts to translate AI into clinical care include clinical decision support systems (n = 24) and mobile health applications (n = 9). Conclusions Overall, we found that ML and AI methods are being employed to optimize pregnancy outcomes, including modern DL methods (n = 13). Future research should focus on less-studied pregnancy domain areas, including postnatal and postpartum care (n = 2). Also, more work on clinical adoption of AI methods and the ethical implications of such adoption is needed.
Affiliation(s)
- Lena Davidson
  - MS degree at College of St. Scholastica, Duluth, MN, USA
- Mary Regina Boland
  - Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania
12
Occupational physicians must be more visible on Twitter [Les médecins du travail doivent être plus visibles sur Twitter]. Arch Mal Prof Enviro 2020. [DOI: 10.1016/j.admp.2020.09.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
13
Klein AZ, Gonzalez-Hernandez G. An annotated data set for identifying women reporting adverse pregnancy outcomes on Twitter. Data Brief 2020; 32:106249. [PMID: 32944604 PMCID: PMC7481818 DOI: 10.1016/j.dib.2020.106249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2020] [Accepted: 08/25/2020] [Indexed: 10/29/2022] Open
Abstract
Despite the prevalence in the United States of miscarriage [1], stillbirth [2], and infant mortality associated with preterm birth and low birthweight [3], their causes remain largely unknown [4], [5], [6]. To advance the use of social media data as a complementary resource for epidemiology of adverse pregnancy outcomes, we present a data set of 6487 tweets that mention miscarriage, stillbirth, preterm birth or premature labor, low birthweight, neonatal intensive care, or fetal/infant loss in general. These tweets are a subset of 22,912 tweets retrieved by applying hand-written regular expressions to a database containing more than 400 million public tweets posted by more than 100,000 women who have announced their pregnancy on Twitter [7]. Two professional annotators labeled the 6487 tweets in a binary fashion, distinguishing those potentially reporting that the user has personally experienced the outcome ("outcome" tweets) from those that merely mention the outcome ("non-outcome" tweets). Inter-annotator agreement was κ = 0.90 (Cohen's kappa). The tweets annotated as "outcome" include 1318 women reporting miscarriage, 94 stillbirth, 591 preterm birth or premature labor, 171 low birthweight, 453 neonatal intensive care, and 356 fetal/infant loss in general. These "outcome" tweets can be used to explore patient experiences and perceptions of adverse pregnancy outcomes, and can direct researchers to the users' broader timelines (tweets posted by a user over time) for observational studies. Our past work demonstrates the analysis of timelines for selecting a study population [8] and conducting a case-control study [9] of users reporting that their child has a birth defect. For larger-scale studies, the full annotated corpus can be used to train supervised machine learning algorithms to automatically identify additional users reporting adverse pregnancy outcomes on Twitter.
We used the annotated corpus to train feature-engineered and deep learning-based classifiers presented in "A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes" [10].
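The inter-annotator agreement statistic cited in this abstract, Cohen's kappa, compares observed agreement with the agreement expected by chance. The following is a minimal sketch of that calculation; the function name and the toy labels are assumptions made for illustration and are not taken from the data set.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both annotators used one identical label
    return (p_o - p_e) / (1 - p_e)


# Toy labels invented for illustration (not drawn from the corpus).
annotator_1 = ["outcome"] * 4 + ["non-outcome"] * 6
annotator_2 = ["outcome"] * 3 + ["non-outcome"] * 7
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In the toy example the annotators disagree on one of ten tweets, so observed agreement is 0.9 while chance agreement is 0.54, which is why kappa is noticeably lower than raw agreement.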
Affiliation(s)
- Ari Z. Klein
- University of Pennsylvania, Philadelphia, PA, USA
|
14
|
Yamaguchi A, Queralt-Rosinach N. A proof-of-concept study of extracting patient histories for rare/intractable diseases from social media. Genomics Inform 2020; 18:e17. [PMID: 32634871 PMCID: PMC7362943 DOI: 10.5808/gi.2020.18.2.e17] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/18/2020] [Accepted: 06/18/2020] [Indexed: 12/22/2022] Open
Abstract
The amount of content on social media platforms such as Twitter is expanding rapidly. Simultaneously, the lack of patient information seriously hinders the diagnosis and treatment of rare/intractable diseases. However, these patient communities are especially active on social media. Data from social media could serve as a source of patient-centric knowledge for these diseases complementary to the information collected in clinical settings and patient registries, and may also have potential for research use. To explore this question, we attempted to extract patient-centric knowledge from social media as a task for the 3-day Biomedical Linked Annotation Hackathon 6 (BLAH6). We selected amyotrophic lateral sclerosis and multiple sclerosis as use cases of rare and intractable diseases, respectively, and we extracted patient histories related to these health conditions from Twitter. Four diagnosed patients for each disease were selected. From the user timelines of these eight patients, we extracted tweets that might be related to health conditions. Based on our experiment, we show that our approach has considerable potential, although we identified problems that should be addressed in future attempts to mine information about rare/intractable diseases from Twitter.
|
15
|
Davoudi A, Klein AZ, Sarker A, Gonzalez-Hernandez G. Towards Automatic Bot Detection in Twitter for Health-related Tasks. AMIA Jt Summits Transl Sci Proc 2020; 2020:136-141. [PMID: 32477632 PMCID: PMC7233076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
With the increasing use of social media data for health-related research, the credibility of the information from this source has been questioned as the posts may not originate from personal accounts. While automatic bot detection approaches have been proposed, none have been evaluated on users posting health-related information. In this paper, we extend an existing bot detection system and customize it for health-related research. Using a dataset of Twitter users, we first show that the system, which was designed for political bot detection, underperforms when applied to health-related Twitter users. We then incorporate additional features and a statistical machine learning classifier to improve bot detection performance significantly. Our approach obtains an F1-score of 0.7 for the "bot" class, representing an improvement of 0.339. Our approach is customizable and generalizable for bot detection in other health-related social media cohorts.
Affiliation(s)
- Anahita Davoudi
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Ari Z Klein
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Abeed Sarker
- Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA 30322
- Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
|
16
|
Klein AZ, Gebreyesus A, Gonzalez-Hernandez G. Automatically Identifying Comparator Groups on Twitter for Digital Epidemiology of Pregnancy Outcomes. AMIA Jt Summits Transl Sci Proc 2020; 2020:317-325. [PMID: 32477651 PMCID: PMC7233041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Despite the prevalence of adverse pregnancy outcomes such as miscarriage, stillbirth, birth defects, and preterm birth, their causes are largely unknown. We seek to advance the use of social media for observational studies of pregnancy outcomes by developing a natural language processing pipeline for automatically identifying users from which to select comparator groups on Twitter. We annotated 2361 tweets by users who have announced their pregnancy on Twitter, which were used to train and evaluate supervised machine learning algorithms as a basis for automatically detecting women who have reported that their pregnancy had reached term and their baby was born at a normal weight. Upon further processing the tweet-level predictions of a majority voting-based ensemble classifier, the pipeline achieved a user-level F1-score of 0.933 (precision = 0.947, recall = 0.920). Our pipeline will be deployed to identify large comparator groups for studying pregnancy outcomes on Twitter.
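The reported user-level F1-score can be checked against the stated precision and recall, since F1 is their harmonic mean. The sketch below does that arithmetic and also shows one simple form of majority voting over per-classifier labels; the function names and the toy labels are assumptions for illustration, and the abstract does not describe how the authors' ensemble or their tweet-to-user post-processing is actually implemented.

```python
from collections import Counter


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Assumes an odd number of binary classifiers so that ties cannot occur.
    """
    return Counter(predictions).most_common(1)[0][0]


# Precision and recall as reported in the abstract.
user_level_f1 = f1_score(0.947, 0.920)
print(round(user_level_f1, 3))  # 0.933

# Toy votes from three hypothetical classifiers for a single tweet.
print(majority_vote(["term", "term", "other"]))  # term
```

The recomputed value agrees with the F1-score of 0.933 given in the abstract.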
Affiliation(s)
- Ari Z Klein
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Abeselom Gebreyesus
- Department of Sociology, Anthropology, and Health Administration and Policy, University of Maryland, Baltimore County, Baltimore, MD, USA
- Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
17
|
Tavoschi L, Quattrone F, D’Andrea E, Ducange P, Vabanesi M, Marcelloni F, Lopalco PL. Twitter as a sentinel tool to monitor public opinion on vaccination: an opinion mining analysis from September 2016 to August 2017 in Italy. Hum Vaccin Immunother 2020; 16:1062-1069. [PMID: 32118519 PMCID: PMC7227677 DOI: 10.1080/21645515.2020.1714311] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 09/29/2019] [Revised: 12/18/2019] [Accepted: 01/06/2020] [Indexed: 11/29/2022] Open
Abstract
Social media have become a common way for people to express their personal viewpoints, including sentiments about health topics. We present the results of an opinion mining analysis on vaccination performed on Twitter from September 2016 to August 2017 in Italy. Vaccine-related tweets were automatically classified as against, in favor, or neutral with respect to the vaccination topic by means of supervised machine-learning techniques. During this period, we found an increasing trend in the number of tweets on this topic. According to the overall analysis by category, 60% of tweets were classified as neutral, 23% against vaccination, and 17% in favor of vaccination. Vaccine-related events appeared able to influence the number and the opinion polarity of tweets. In particular, the approval of the decree introducing mandatory immunization for selected childhood diseases produced a prominent effect in the social discussion in terms of number of tweets. Opinion mining analysis based on Twitter proved to be a potentially useful and timely sentinel system to assess the orientation of public opinion toward vaccination and, in future, it may effectively contribute to the development of appropriate communication and information strategies.
Affiliation(s)
- Lara Tavoschi
- Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa, Pisa, Italy
- Filippo Quattrone
- Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa, Pisa, Italy
- Eleonora D’Andrea
- Dipartimento di Ingegneria dell’Informazione, University of Pisa, Pisa, Italy
- Pietro Ducange
- Dipartimento di Ingegneria dell’Informazione, University of Pisa, Pisa, Italy
- Marco Vabanesi
- Department of Neurology, San Raffaele Scientific Institute and University, Milan, Italy
- Pier Luigi Lopalco
- Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa, Pisa, Italy
|
18
|
Wexler A, Davoudi A, Weissenbacher D, Choi R, O’Connor K, Cummings H, Gonzalez-Hernandez G. Pregnancy and health in the age of the Internet: A content analysis of online "birth club" forums. PLoS One 2020; 15:e0230947. [PMID: 32287266 PMCID: PMC7156049 DOI: 10.1371/journal.pone.0230947] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 01/02/2020] [Accepted: 03/12/2020] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Although studies report that more than 90% of pregnant women utilize digital sources to supplement their maternal healthcare, little is known about the kinds of information that women seek from their peers during pregnancy. To date, most research has used self-report measures to elucidate how and why women turn to digital sources during pregnancy. However, given that these measures may differ from actual utilization of online health information, it is important to analyze the online content pregnant women generate. OBJECTIVE To apply machine learning methods to analyze online pregnancy forums, to better understand how women seek information from a community of online peers during pregnancy. METHODS Data from seven WhatToExpect.com "birth club" forums (September 2018; January-June 2018) were scraped. Forum posts were collected for a one-year period, which included three trimesters and three months postpartum. Only initial posts from each thread were analyzed (n = 262,238). Automatic natural language processing (NLP) methods captured 50 discussed topics, which were annotated by two independent coders and grouped categorically. RESULTS The largest topic categories were maternal health (45%), baby-related topics (29%), and people/relationships (10%). While pain was a popular topic throughout pregnancy, individual topics that were dominant by trimester included miscarriage (first trimester), labor (third trimester), and baby sleeping routine (postpartum period). CONCLUSION More than just emotional or peer support, pregnant women turn to online forums to discuss their health. Dominant topics, such as labor and miscarriage, suggest unmet informational needs in these domains. With misinformation becoming a growing public health concern, more attention must be directed toward peer-exchange outlets.
Affiliation(s)
- Anna Wexler
- Department of Medical Ethics and Health Policy, University of Pennsylvania, Philadelphia, PA, United States of America
- Anahita Davoudi
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, United States of America
- Davy Weissenbacher
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, United States of America
- Rebekah Choi
- Department of Medical Ethics and Health Policy, University of Pennsylvania, Philadelphia, PA, United States of America
- Karen O’Connor
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, United States of America
- Holly Cummings
- Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA, United States of America
- Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, United States of America
|
19
|
Pharmacoepidemiologic Evaluation of Birth Defects from Health-Related Postings in Social Media During Pregnancy. Drug Saf 2020; 42:389-400. [PMID: 30284214 PMCID: PMC6426821 DOI: 10.1007/s40264-018-0731-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 11/29/2022]
Abstract
Introduction Adverse effects of medications taken during pregnancy are traditionally studied through post-marketing pregnancy registries, which have limitations. Social media data may be an alternative data source for pregnancy surveillance studies. Objective The objective of this study was to assess the feasibility of using social media data as an alternative source for pregnancy surveillance for regulatory decision making. Methods We created an automated method to identify Twitter accounts of pregnant women. We identified 196 pregnant women with a mention of a birth defect in relation to their baby and 196 without a mention of a birth defect in relation to their baby. We extracted information on pregnancy and maternal demographics, medication intake and timing, and birth defects. Results Although often incomplete, we extracted data for the majority of the pregnancies. Among women that reported birth defects, 35% reported taking one or more medications during pregnancy compared with 17% of controls. After accounting for age, race, and place of residence, a higher medication intake was observed in women who reported birth defects. The rate of birth defects in the pregnancy cohort was lower (0.44%) compared with the rate in the general population (3%). Conclusions Twitter data capture information on medication intake and birth defects; however, the information obtained cannot replace pregnancy registries at this time. Development of improved methods to automatically extract and annotate social media data may increase their value to support regulatory decision making regarding pregnancy outcomes in women using medications during their pregnancies.
|
20
|
Klein AZ, Cai H, Weissenbacher D, Levine LD, Gonzalez-Hernandez G. A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes. J Biomed Inform 2020; 112S:100076. [PMID: 34417007 DOI: 10.1016/j.yjbinx.2020.100076] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/22/2020] [Revised: 06/30/2020] [Accepted: 07/27/2020] [Indexed: 10/23/2022]
Abstract
BACKGROUND In the United States, 17% of pregnancies end in fetal loss: miscarriage or stillbirth. Preterm birth affects 10% of live births in the United States and is the leading cause of neonatal death globally. Preterm births with low birthweight are the second leading cause of infant mortality in the United States. Despite their prevalence, the causes of miscarriage, stillbirth, and preterm birth are largely unknown. OBJECTIVE The primary objectives of this study are to (1) assess whether women report miscarriage, stillbirth, and preterm birth, among others, on Twitter, and (2) develop natural language processing (NLP) methods to automatically identify users from which to select cases for large-scale observational studies. METHODS We handcrafted regular expressions to retrieve tweets that mention an adverse pregnancy outcome, from a database containing more than 400 million publicly available tweets posted by more than 100,000 users who have announced their pregnancy on Twitter. Two annotators independently annotated 8109 (one random tweet per user) of the 22,912 retrieved tweets, distinguishing those reporting that the user has personally experienced the outcome ("outcome" tweets) from those that merely mention the outcome ("non-outcome" tweets). Inter-annotator agreement was κ = 0.90 (Cohen's kappa). We used the annotated tweets to train and evaluate feature-engineered and deep learning-based classifiers. We further annotated 7512 (of the 8109) tweets to develop a generalizable, rule-based module designed to filter out reported speech (that is, posts containing what was said by others) prior to automatic classification. We performed an extrinsic evaluation assessing whether the reported speech filter could improve the detection of women reporting adverse pregnancy outcomes on Twitter.
RESULTS The tweets annotated as "outcome" include 1632 women reporting miscarriage, 119 stillbirth, 749 preterm birth or premature labor, 217 low birthweight, 558 NICU admission, and 458 fetal/infant loss in general. A deep neural network, BERT-based classifier achieved the highest overall F1-score (0.88) for automatically detecting "outcome" tweets (precision = 0.87, recall = 0.89), with an F1-score of at least 0.82 and a precision of at least 0.84 for each of the adverse pregnancy outcomes. Our reported speech filter significantly (P < 0.05) improved the accuracy of Logistic Regression (from 78.0% to 80.8%) and majority voting-based ensemble (from 81.1% to 82.9%) classifiers. Although the filter did not improve the F1-score of the BERT-based classifier, it did improve precision, a trade-off against recall that may be acceptable for automated case selection of more prevalent outcomes. Without the filter, reported speech is one of the main sources of errors for the BERT-based classifier. CONCLUSION This study demonstrates that (1) women do report their adverse pregnancy outcomes on Twitter, (2) our NLP pipeline can automatically identify users from which to select cases for large-scale observational studies, and (3) our reported speech filter would reduce the cost of annotating health-related social media data and can significantly improve the overall performance of feature-based classifiers.
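The retrieval step described in the METHODS, matching tweets against handcrafted regular expressions, can be sketched as follows. The patterns, the dictionary name, and the helper function are hypothetical and written only for illustration; the authors' actual expressions are not given in the abstract and are likely far more extensive.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the authors'
# hand-written expressions, which the abstract does not reproduce.
OUTCOME_PATTERNS = {
    "miscarriage": re.compile(r"\bmiscarr(?:y|ied|iage|iages)\b", re.IGNORECASE),
    "stillbirth": re.compile(r"\bstill\s?(?:birth|born)\b", re.IGNORECASE),
    "preterm birth": re.compile(
        r"\b(?:pre\s?term|premature)\s+(?:birth|labou?r)\b", re.IGNORECASE
    ),
    "nicu": re.compile(r"\bNICU\b", re.IGNORECASE),
}


def mentioned_outcomes(tweet):
    """Return the names of all outcome patterns that match the tweet text."""
    return [name for name, pattern in OUTCOME_PATTERNS.items() if pattern.search(tweet)]


# Invented example tweets, not taken from the corpus.
print(mentioned_outcomes("I had a miscarriage last year"))
print(mentioned_outcomes("Went into premature labor at 30 weeks, baby is in the NICU"))
print(mentioned_outcomes("Nice weather today"))
```

A match only shows that a tweet mentions an outcome. Distinguishing "outcome" from "non-outcome" tweets is the job of the annotation and classification steps described in the abstract.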
Affiliation(s)
- Ari Z Klein
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Haitao Cai
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Davy Weissenbacher
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Lisa D Levine
- Maternal and Child Health Research Center, Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
|
21
|
Klein AZ, Sarker A, Weissenbacher D, Gonzalez-Hernandez G. Towards scaling Twitter for digital epidemiology of birth defects. NPJ Digit Med 2019; 2:96. [PMID: 31583284 PMCID: PMC6773753 DOI: 10.1038/s41746-019-0170-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/22/2019] [Accepted: 08/12/2019] [Indexed: 11/13/2022] Open
Abstract
Social media has recently been used to identify and study a small cohort of Twitter users whose pregnancies with birth defect outcomes (the leading cause of infant mortality) could be observed via their publicly available tweets. In this study, we exploit social media on a larger scale by developing natural language processing (NLP) methods to automatically detect, among thousands of users, a cohort of mothers reporting that their child has a birth defect. We used 22,999 annotated tweets to train and evaluate supervised machine learning algorithms (feature-engineered and deep learning-based classifiers) that automatically distinguish tweets referring to the user's pregnancy outcome from tweets that merely mention birth defects. Because 90% of the tweets merely mention birth defects, we experimented with under-sampling and over-sampling approaches to address this class imbalance. An SVM classifier achieved the best performance for the two positive classes: an F1-score of 0.65 for the "defect" class and 0.51 for the "possible defect" class. We deployed the classifier on 20,457 unlabeled tweets that mention birth defects, which helped identify 542 additional users for potential inclusion in our cohort. Contributions of this study include (1) NLP methods for automatically detecting tweets by users reporting their birth defect outcomes, (2) findings that an SVM classifier can outperform a deep neural network-based classifier for highly imbalanced social media data, (3) evidence that automatic classification can be used to identify additional users for potential inclusion in our cohort, and (4) a publicly available corpus for training and evaluating supervised machine learning algorithms.
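The under-sampling and over-sampling mentioned in this abstract can be illustrated with the simplest random variants. This is a sketch under stated assumptions: the abstract does not say which sampling procedures the authors used, the function names are invented here, and the toy data collapses the study's three classes into a majority label versus everything else.

```python
import random


def random_undersample(examples, majority_label, seed=0):
    """Drop random majority-class examples until both groups are the same size."""
    rng = random.Random(seed)
    majority = [ex for ex in examples if ex[1] == majority_label]
    minority = [ex for ex in examples if ex[1] != majority_label]
    kept = rng.sample(majority, min(len(majority), len(minority)))
    balanced = kept + minority
    rng.shuffle(balanced)
    return balanced


def random_oversample(examples, majority_label, seed=0):
    """Duplicate random minority examples until both groups are the same size."""
    rng = random.Random(seed)
    majority = [ex for ex in examples if ex[1] == majority_label]
    minority = [ex for ex in examples if ex[1] != majority_label]
    if not minority:
        return list(examples)
    shortfall = max(0, len(majority) - len(minority))
    extra = [rng.choice(minority) for _ in range(shortfall)]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced


# Toy data mimicking a 90/10 imbalance: (tweet_id, label) pairs, invented here.
toy = [(i, "non-defect") for i in range(90)] + [(i, "defect") for i in range(90, 100)]
under = random_undersample(toy, "non-defect")
over = random_oversample(toy, "non-defect")
print(len(under), len(over))  # 20 180
```

Sampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class distribution.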
Affiliation(s)
- Ari Z. Klein
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Abeed Sarker
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Davy Weissenbacher
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
22
|
Grabar N, Grouin C. A Year of Papers Using Biomedical Texts: Findings from the Section on Natural Language Processing of the IMIA Yearbook. Yearb Med Inform 2019; 28:218-222. [PMID: 31419835 PMCID: PMC6697498 DOI: 10.1055/s-0039-1677937] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/09/2023] Open
Abstract
OBJECTIVES To analyze the content of publications within the medical Natural Language Processing (NLP) domain in 2018. METHODS Automatic and manual pre-selection of publications to be reviewed, and selection of the best NLP papers of the year. Analysis of the important issues. RESULTS Two best papers were selected this year: one dedicated to the generation of multi-document summaries and another dedicated to the generation of imaging reports. We also propose an analysis of the main research trends of NLP publications in 2018. CONCLUSIONS The year 2018 is very rich with regard to NLP issues and topics addressed. It shows the will of researchers to move towards robust and reproducible results. Researchers also prove to be creative in finding original issues and approaches.
Affiliation(s)
- Natalia Grabar
- LIMSI, CNRS, Université Paris-Saclay, Orsay, France
- STL, CNRS, Université de Lille, Villeneuve-d'Ascq, France
- Cyril Grouin
- LIMSI, CNRS, Université Paris-Saclay, Orsay, France
|
23
|
Natsiavas P, Malousi A, Bousquet C, Jaulent MC, Koutkias V. Computational Advances in Drug Safety: Systematic and Mapping Review of Knowledge Engineering Based Approaches. Front Pharmacol 2019; 10:415. [PMID: 31156424 PMCID: PMC6533857 DOI: 10.3389/fphar.2019.00415] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/27/2018] [Accepted: 04/02/2019] [Indexed: 12/12/2022] Open
Abstract
Drug Safety (DS) is a domain with significant public health and social impact. Knowledge Engineering (KE) is the Computer Science discipline elaborating on methods and tools for developing “knowledge-intensive” systems, depending on a conceptual “knowledge” schema and some kind of “reasoning” process. The present systematic and mapping review aims to investigate KE-based approaches employed for DS and highlight the introduced added value as well as trends and possible gaps in the domain. Journal articles published between 2006 and 2017 were retrieved from PubMed/MEDLINE and Web of Science® (873 in total) and filtered based on a comprehensive set of inclusion/exclusion criteria. The 80 finally selected articles were reviewed on full-text, while the mapping process relied on a set of concrete criteria (concerning specific KE and DS core activities, special DS topics, employed data sources, reference ontologies/terminologies, and computational methods, etc.). The analysis results are publicly available as online interactive analytics graphs. The review clearly depicted increased use of KE approaches for DS. The collected data illustrate the use of KE for various DS aspects, such as Adverse Drug Event (ADE) information collection, detection, and assessment. Moreover, the quantified analysis of using KE for the respective DS core activities highlighted room for intensifying research on KE for ADE monitoring, prevention and reporting. Finally, the assessed use of the various data sources for DS special topics demonstrated extensive use of dominant data sources for DS surveillance, i.e., Spontaneous Reporting Systems, but also increasing interest in the use of emerging data sources, e.g., observational healthcare databases, biochemical/genetic databases, and social media. 
Various exemplar applications were identified with promising results, e.g., improvement in Adverse Drug Reaction (ADR) prediction, detection of drug interactions, and novel ADE profiles related to specific mechanisms of action. Nevertheless, since the reviewed studies mostly concerned proof-of-concept implementations, more intense research is required to increase the maturity level that is necessary for KE approaches to reach routine DS practice. In conclusion, we argue that efficiently addressing DS data analytics and management challenges requires the introduction of high-throughput KE-based methods for effective knowledge discovery and management, resulting, ultimately, in the establishment of a continuous learning DS system.
Affiliation(s)
- Pantelis Natsiavas
- Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, Greece; Sorbonne Université, INSERM, Univ Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé, LIMICS, Paris, France
- Andigoni Malousi
- Laboratory of Biological Chemistry, Department of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Cédric Bousquet
- Sorbonne Université, INSERM, Univ Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé, LIMICS, Paris, France; Public Health and Medical Information Unit, University Hospital of Saint-Etienne, Saint-Étienne, France
- Marie-Christine Jaulent
- Sorbonne Université, INSERM, Univ Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé, LIMICS, Paris, France
- Vassilis Koutkias
- Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, Greece
|