1
Pérez-Pérez M, Fernandez Gonzalez M, Rodriguez-Rajo FJ, Fdez-Riverola F. Tracking the Spread of Pollen on Social Media Using Pollen-Related Messages From Twitter: Retrospective Analysis. J Med Internet Res 2024; 26:e58309. [PMID: 39432897 PMCID: PMC11535798 DOI: 10.2196/58309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 05/27/2024] [Accepted: 09/10/2024] [Indexed: 10/23/2024] Open
Abstract
BACKGROUND Allergy disorders caused by biological particles, such as the proteins in some airborne pollen grains, are currently considered one of the most common chronic diseases, and European Academy of Allergy and Clinical Immunology forecasts indicate that within 15 years 50% of Europeans will have some kind of allergy as a consequence of urbanization, industrialization, pollution, and climate change. OBJECTIVE The aim of this study was to monitor and analyze the dissemination of information about pollen symptoms from December 2006 to January 2022. By conducting a comprehensive evaluation of public comments and trends on Twitter, the research sought to provide valuable insights into the impact of pollen on sensitive individuals, ultimately enhancing our understanding of how pollen-related information spreads and its implications for public health awareness. METHODS Using a blend of large language models, dimensionality reduction, unsupervised clustering, and term frequency-inverse document frequency, alongside visual representations such as word clouds and semantic interaction graphs, our study analyzed Twitter data to uncover insights on respiratory allergies. This concise methodology enabled the extraction of significant themes and patterns, offering a deep dive into public knowledge and discussions surrounding respiratory allergies on Twitter. RESULTS The months between March and August had the highest volume of messages. The percentage of patient tweets appeared to increase notably during the later years, and there was also a potential increase in the prevalence of symptoms, mainly in the morning hours, indicating a potential rise in pollen allergies and related discussions on social media. While pollen allergy is a global issue, specific sociocultural, political, and economic contexts mean that patients experience symptomatology at a localized level, needing appropriate localized responses. 
CONCLUSIONS The interpretation of tweet information represents a valuable tool to take preventive measures to mitigate the impact of pollen allergy on sensitive patients to achieve equity in living conditions and enhance access to health information and services.
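The METHODS above name term frequency-inverse document frequency (TF-IDF) as one component of the analysis. The following is a minimal sketch of that weighting only, assuming whitespace-tokenized messages and the plain tf × log(N/df) variant; the abstract does not state which variant or tooling the authors used, and the function name `tf_idf` and the toy messages are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: relative term frequency times log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy messages, not data from the study.
toy_docs = [
    "pollen allergy sneezing today".split(),
    "pollen count high today".split(),
    "itchy eyes and sneezing".split(),
]
scores = tf_idf(toy_docs)
```

In this toy corpus, "allergy" appears in one message and "pollen" in two, so "allergy" receives the higher weight within the first message; terms that occur in every document would receive a weight of zero under this variant.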
Affiliation(s)
- Martín Pérez-Pérez
  - CINBIO, Universidade de Vigo (University of Vigo), Vigo, Spain
  - Department of Computer Science, School of Computer Engineering, Universidade de Vigo (University of Vigo), Ourense, Spain
  - Next Generation Computer Systems Group, School of Computer Engineering, Galicia Sur Health Research Institute, Galician Health Service, SERGAS-UVIGO, Ourense, Spain
- María Fernandez Gonzalez
  - Department of Plant Biology and Soil Sciences, Faculty of Sciences, Universidade de Vigo (University of Vigo), Ourense, Spain
- Francisco Javier Rodriguez-Rajo
  - Department of Plant Biology and Soil Sciences, Faculty of Sciences, Universidade de Vigo (University of Vigo), Ourense, Spain
- Florentino Fdez-Riverola
  - CINBIO, Universidade de Vigo (University of Vigo), Vigo, Spain
  - Department of Computer Science, School of Computer Engineering, Universidade de Vigo (University of Vigo), Ourense, Spain
  - Next Generation Computer Systems Group, School of Computer Engineering, Galicia Sur Health Research Institute, Galician Health Service, SERGAS-UVIGO, Ourense, Spain
2
Correia JC, Ahmad SS, Waqas A, Meraj H, Pataky Z. Exploring Public Emotions on Obesity During the COVID-19 Pandemic Using Sentiment Analysis and Topic Modeling: Cross-Sectional Study. J Med Internet Res 2024; 26:e52142. [PMID: 39393064 PMCID: PMC11512131 DOI: 10.2196/52142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 02/06/2024] [Accepted: 06/27/2024] [Indexed: 10/13/2024] Open
Abstract
BACKGROUND Obesity is a chronic, multifactorial, and relapsing disease, affecting people of all ages worldwide, and is directly related to multiple complications. Understanding public attitudes and perceptions toward obesity is essential for developing effective health policies, prevention strategies, and treatment approaches. OBJECTIVE This study investigated the sentiments of the general public, celebrities, and important organizations regarding obesity using social media data, specifically from Twitter (subsequently rebranded as X). METHODS The study analyzed a dataset of 53,414 tweets related to obesity posted on Twitter during the COVID-19 pandemic, from April 2019 to December 2022. Sentiment analysis was performed using the XLM-RoBERTa-base model, and topic modeling was conducted using the BERTopic library. RESULTS The analysis revealed that tweets regarding obesity were predominantly negative. Spikes in Twitter activity correlated with significant political events, such as the exchange of obesity-related comments between US politicians and criticism of the United Kingdom's obesity campaign. Topic modeling identified 243 clusters representing various obesity-related topics, such as childhood obesity; the US President's obesity struggle; COVID-19 vaccinations; the UK government's obesity campaign; body shaming; racism and high obesity rates among Black American people; smoking, substance abuse, and alcohol consumption among people with obesity; environmental risk factors; and surgical treatments. CONCLUSIONS Twitter serves as a valuable source for understanding obesity-related sentiments and attitudes among the public, celebrities, and influential organizations. Sentiments regarding obesity were predominantly negative. Negative portrayals of obesity by influential politicians and celebrities were shown to contribute to negative public sentiments, which can have adverse effects on public health. 
It is essential for public figures to be mindful of their impact on public opinion and the potential consequences of their statements.
Affiliation(s)
- Jorge César Correia
  - Unit of Therapeutic Patient Education, WHO Collaborating Center, University Hospitals of Geneva and University of Geneva, Geneva, Switzerland
- Sarmad Shaharyar Ahmad
  - School of Mathematics, Computer Science & Engineering, Liverpool Hope University, Liverpool, United Kingdom
- Ahmed Waqas
  - Department of Primary Care & Mental Health, Institute of Population Health, University of Liverpool, Liverpool, United Kingdom
- Hafsa Meraj
  - Greater Manchester Mental Health NHS Foundation Trust, Salford, United Kingdom
- Zoltan Pataky
  - Unit of Therapeutic Patient Education, WHO Collaborating Center, University Hospitals of Geneva and University of Geneva, Geneva, Switzerland
3
Molenaar A, Jenkins EL, Brennan L, Lukose D, McCaffrey TA. The use of sentiment and emotion analysis and data science to assess the language of nutrition-, food- and cooking-related content on social media: a systematic scoping review. Nutr Res Rev 2024; 37:43-78. [PMID: 36991525 DOI: 10.1017/s0954422423000069] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/31/2023]
Abstract
Social media data are rapidly evolving and accessible, which presents opportunities for research. Data science techniques, such as sentiment or emotion analysis which analyse textual emotion, provide an opportunity to gather insight from social media. This paper describes a systematic scoping review of interdisciplinary evidence to explore how sentiment or emotion analysis methods alongside other data science methods have been used to examine nutrition, food and cooking social media content. A PRISMA search strategy was used to search nine electronic databases in November 2020 and January 2022. Of 7325 studies identified, thirty-six studies were selected from seventeen countries, and content was analysed thematically and summarised in an evidence table. Studies were published between 2014 and 2022 and used data from seven different social media platforms (Twitter, YouTube, Instagram, Reddit, Pinterest, Sina Weibo and mixed platforms). Five themes of research were identified: dietary patterns, cooking and recipes, diet and health, public health and nutrition and food in general. Papers developed a sentiment or emotion analysis tool or used available open-source tools. Accuracy to predict sentiment ranged from 33·33% (open-source engine) to 98·53% (engine developed for the study). The average proportion of sentiment was 38·8% positive, 46·6% neutral and 28·0% negative. Additional data science techniques used included topic modelling and network analysis. Future research requires optimising data extraction processes from social media platforms, the use of interdisciplinary teams to develop suitable and accurate methods for the subject and the use of complementary methods to gather deeper insights into these complex data.
Affiliation(s)
- Annika Molenaar
  - Department of Nutrition, Dietetics and Food, Monash University, Level 1, 264 Ferntree Gully Road, Notting Hill, VIC 3168, Australia
- Eva L Jenkins
  - Department of Nutrition, Dietetics and Food, Monash University, Level 1, 264 Ferntree Gully Road, Notting Hill, VIC 3168, Australia
- Linda Brennan
  - School of Media and Communication, RMIT University, 124 La Trobe St, Melbourne, VIC 3004, Australia
- Dickson Lukose
  - Monash Data Futures Institute, Monash University, Level 2, 13 Rainforest Walk, Monash University, Clayton, VIC 3800, Australia
- Tracy A McCaffrey
  - Department of Nutrition, Dietetics and Food, Monash University, Level 1, 264 Ferntree Gully Road, Notting Hill, VIC 3168, Australia
4
Ni Z, Zhu L, Li S, Zhang Y, Zhao R. Characteristics and associated factors of health information-seeking behaviour among patients with inflammatory bowel disease in the digital era: a scoping review. BMC Public Health 2024; 24:307. [PMID: 38279086 PMCID: PMC10821566 DOI: 10.1186/s12889-024-17758-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 01/12/2024] [Indexed: 01/28/2024] Open
Abstract
BACKGROUND Health Information-Seeking Behaviour (HISB) is necessary for self-management and medical decision-making among patients with inflammatory bowel disease (IBD). With the advancement of information technology, health information needs and seeking are reshaped among patients with IBD. This scoping review aims to gain a comprehensive understanding of HISB of people with IBD in the digital age. METHODS This scoping review adhered to Arksey and O'Malley's framework and Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews frameworks (PRISMA-ScR). A comprehensive literature search was conducted in PubMed, Embase, Web of Science, PsycINFO, CINAHL, and three Chinese databases from January 1, 2010 to April 10, 2023. Employing both deductive and inductive content analysis, we scrutinized studies using Wilson's model. RESULTS In total, 56 articles were selected. Within the information dimension of HISB among patients with IBD, treatment-related information, particularly medication-related information, was identified as the most critical information need. Other information requirements included basic IBD-related information, daily life and self-management, sexual and reproductive health, and other needs. In the sources dimension, of the eight common sources of information, the internet was the most frequently mentioned source of information, while face-to-face communication with healthcare professionals was the preferred source. Associated factors were categorized into six categories: demographic characteristics, psychological aspects, role-related or interpersonal traits, environmental aspects, source-related characteristics, and disease-related factors. Moreover, the results showed five types of HISB among people with IBD, including active searching, ongoing searching, passive attention, passive searching, and avoid seeking. 
Notably, active searching, especially social information seeking, appeared to be the predominant type of HISB among people with IBD in the digital era. CONCLUSION Information needs and sources for patients with IBD vary, and their health information-seeking behaviour is influenced by a combination of diverse factors, including resource-related and individual factors. Future research should focus on the longitudinal changes in HISB among patients with IBD. Moreover, efforts should be made to develop information resources that are both convenient and credible, although the development of such resources requires further investigation and evaluation.
Affiliation(s)
- Zijun Ni
  - Nursing Department, The Second Affiliated Hospital of Zhejiang University School of Medicine, No.88 Jiefang Road, Hangzhou, 310009, China
  - Department of Nursing, School of Medicine, Zhejiang University, Hangzhou, China
- Lingli Zhu
  - Nursing Department, The Second Affiliated Hospital of Zhejiang University School of Medicine, No.88 Jiefang Road, Hangzhou, 310009, China
  - Department of Nursing, School of Medicine, Zhejiang University, Hangzhou, China
- Shuyan Li
  - Nursing Department, The Second Affiliated Hospital of Zhejiang University School of Medicine, No.88 Jiefang Road, Hangzhou, 310009, China
- Yuping Zhang
  - Nursing Department, The Second Affiliated Hospital of Zhejiang University School of Medicine, No.88 Jiefang Road, Hangzhou, 310009, China
- Ruiyi Zhao
  - Nursing Department, The Second Affiliated Hospital of Zhejiang University School of Medicine, No.88 Jiefang Road, Hangzhou, 310009, China
5
Lee E, Tsuchiya H, Iida H, Nagano K, Murata Y, Maemoto A. Perceptions and Responses to Diseases among Patients with Inflammatory Bowel Disease: Text Mining Analysis of Posts on a Japanese Patient Community Website. Inflamm Intest Dis 2024; 9:283-295. [PMID: 39640255 PMCID: PMC11620774 DOI: 10.1159/000541837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 10/02/2024] [Indexed: 12/07/2024] Open
Abstract
Introduction Patients with inflammatory bowel disease (IBD) are increasingly using online platforms to communicate with other patients and healthcare professionals seeking disease-related information and support. Free-text posts on these platforms could provide insights into patients' everyday lives, which could help improve patient care. In this proof-of-concept (POC) study, we applied text mining to extract patient needs from free-text posts on a community forum in Japan, holistically visualized the patients' perceptions and their connections, and explored the patient characteristic-dependent trends in the use of words. Methods Free-text posts written between May 11, 2020 and May 31, 2022 on the community forum were retrieved and subjected to text mining analysis. Trends in the use of words were extracted from the posts for correspondence and co-occurrence network analyses using KH Coder open-source text mining software. Results Seventy-four posts were analyzed. Using text mining methods, we successfully extracted and visualized a variety of patient concerns and their connections. The correspondence and co-occurrence analyses revealed patient segment-dependent trends in the use of words. For example, patients with a disease duration of ≤5 years were more likely to use words related to emotions or their desire to change or quit their job, such as "anxiety" and "resignation." Patients with a disease duration of >10 years were more likely to use words showing that they are finding ways to live with or accept their disease, and are getting used to the lifestyle, but some patients continued to experience worsening disease. Conclusions We found that free-text posts on an IBD community forum can be a useful source of information to capture the wide variety of thoughts of patients. Text mining procedures can help visualize the relative importance of the topics identified from free-text posts. 
The findings of this POC study will be useful for generating new hypotheses to better understand and address the needs of patients with IBD.
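The Methods above describe a co-occurrence network analysis of words in forum posts. As a rough illustration of the underlying idea only, the sketch below counts how many posts each pair of words shares; those counts could serve as edge weights in a network. It is not KH Coder's implementation, and the function name `cooccurrence_counts` and the toy tokenized posts are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(posts):
    """Count, for each unordered word pair, the number of posts containing both."""
    pair_counts = Counter()
    for tokens in posts:
        # Use each word once per post and sort so (a, b) and (b, a) coincide.
        for a, b in combinations(sorted(set(tokens)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Hypothetical toy posts, already tokenized; not data from the study.
toy_posts = [
    ["anxiety", "job", "resignation"],
    ["anxiety", "job"],
    ["diet", "lifestyle"],
]
edges = cooccurrence_counts(toy_posts)
```

Pairs with higher counts would be drawn as stronger edges. A real analysis would typically normalize the raw counts (for example with a similarity coefficient) and filter out rare words before drawing the network.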
Affiliation(s)
- Eujin Lee
  - Medical Affairs Division, Janssen Pharmaceutical K.K., Tokyo, Japan
- Hiroaki Tsuchiya
  - Medical Affairs Division, Janssen Pharmaceutical K.K., Tokyo, Japan
- Hajime Iida
  - Medical Affairs Division, Janssen Pharmaceutical K.K., Tokyo, Japan
- Katsumasa Nagano
  - Medical Affairs Division, Janssen Pharmaceutical K.K., Tokyo, Japan
- Yoko Murata
  - Medical Affairs Division, Janssen Pharmaceutical K.K., Tokyo, Japan
- Atsuo Maemoto
  - Inflammatory Bowel Disease Center, Sapporo Higashi Tokushukai Hospital, Sapporo, Japan
6
Fu J, Li C, Zhou C, Li W, Lai J, Deng S, Zhang Y, Guo Z, Wu Y. Methods for Analyzing the Contents of Social Media for Health Care: Scoping Review. J Med Internet Res 2023; 25:e43349. [PMID: 37358900 DOI: 10.2196/43349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 05/28/2023] [Accepted: 05/30/2023] [Indexed: 06/27/2023] Open
Abstract
BACKGROUND Given the rapid development of social media, effective extraction and analysis of the contents of social media for health care have attracted widespread attention from health care providers. As far as we know, most of the reviews focus on the application of social media, and there is a lack of reviews that integrate the methods for analyzing social media information for health care. OBJECTIVE This scoping review aims to answer the following 4 questions: (1) What types of research have been used to investigate social media for health care, (2) what methods have been used to analyze the existing health information on social media, (3) what indicators should be applied to collect and evaluate the characteristics of methods for analyzing the contents of social media for health care, and (4) what are the current problems and development directions of methods used to analyze the contents of social media for health care? METHODS A scoping review following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines was conducted. We searched PubMed, the Web of Science, EMBASE, the Cumulative Index to Nursing and Allied Health Literature, and the Cochrane Library for the period from 2010 to May 2023 for primary studies focusing on social media and health care. Two independent reviewers screened eligible studies against inclusion criteria. A narrative synthesis of the included studies was conducted. RESULTS Of 16,161 identified citations, 134 (0.8%) studies were included in this review. These included 67 (50.0%) qualitative designs, 43 (32.1%) quantitative designs, and 24 (17.9%) mixed methods designs. 
The applied research methods were classified based on the following aspects: (1) manual analysis methods (content analysis methodology, grounded theory, ethnography, classification analysis, thematic analysis, and scoring tables) and computer-aided analysis methods (latent Dirichlet allocation, support vector machine, probabilistic clustering, image analysis, topic modeling, sentiment analysis, and other natural language processing technologies), (2) categories of research contents, and (3) health care areas (health practice, health services, and health education). CONCLUSIONS Based on an extensive literature review, we investigated the methods for analyzing the contents of social media for health care to determine the main applications, differences, trends, and existing problems. We also discussed the implications for the future. Traditional content analysis is still the mainstream method for analyzing social media content, and future research may be combined with big data research. With the progress of computers, mobile phones, smartwatches, and other smart devices, social media information sources will become more diversified. Future research can combine new sources, such as pictures, videos, and physiological signals, with online social networking to adapt to the development trend of the internet. More medical information talents need to be trained in the future to better solve the problem of network information analysis. Overall, this scoping review can be useful for a large audience that includes researchers entering the field.
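The percentages in the RESULTS above follow directly from the reported counts. The short check below reproduces them; the counts are taken from the abstract, while the helper name `pct` and the one-decimal rounding are assumptions made for this illustration.

```python
# Counts reported in the RESULTS of the abstract above.
identified = 16161
included = 134
designs = {"qualitative": 67, "quantitative": 43, "mixed methods": 24}

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)

# Share of identified citations that were included in the review.
inclusion_rate = pct(included, identified)
# Share of included studies using each study design.
design_shares = {name: pct(count, included) for name, count in designs.items()}
```

The three design counts sum to the 134 included studies, so the design percentages are shares of the included set rather than of all identified citations.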
Affiliation(s)
- Jiaqi Fu
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
  - School of Nursing, Southern Medical University, Guangzhou, China
- Chaixiu Li
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
  - School of Nursing, Southern Medical University, Guangzhou, China
- Chunlan Zhou
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
- Wenji Li
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
- Jie Lai
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
  - School of Nursing, Southern Medical University, Guangzhou, China
- Shisi Deng
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
  - School of Nursing, Southern Medical University, Guangzhou, China
- Yujie Zhang
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
  - School of Nursing, Southern Medical University, Guangzhou, China
- Zihan Guo
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
  - School of Nursing, Southern Medical University, Guangzhou, China
- Yanni Wu
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
7
Cheng Q, Lin Y. Multilevel Classification of Users' Needs in Chinese Online Medical and Health Communities: Model Development and Evaluation Based on Graph Convolutional Network. JMIR Form Res 2023; 7:e42297. [PMID: 37079346 PMCID: PMC10160934 DOI: 10.2196/42297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 03/20/2023] [Accepted: 03/22/2023] [Indexed: 04/21/2023] Open
Abstract
BACKGROUND Online medical and health communities provide a platform for internet users to share experiences and ask questions about medical and health issues. However, there are problems in these communities, such as the low accuracy of the classification of users' questions and the uneven health literacy of users, which affect the accuracy of user retrieval and the professionalism of the medical personnel answering the question. In this context, it is essential to study more effective classification methods of users' information needs. OBJECTIVE Most online medical and health communities tend to provide only disease-type labels, which do not give a comprehensive summary of users' needs. The study aims to construct a multilevel classification framework based on the graph convolutional network (GCN) model for users' needs in online medical and health communities so that users can perform more targeted information retrieval. METHODS Using the Chinese online medical and health community "Qiuyi" as an example, we crawled questions posted by users in the "Cardiovascular Disease" section as the data source. First, the disease types involved in the problem data were segmented by manual coding to generate the first-level label. Second, the needs were identified by K-means clustering to generate the users' information needs label as the second-level label. Finally, by constructing a GCN model, users' questions were automatically classified, thus realizing the multilevel classification of users' needs. RESULTS Based on the empirical research of questions posted by users in the "Cardiovascular Disease" section of Qiuyi, the hierarchical classification of users' questions (data) was realized. The classification models designed in the study achieved accuracy, precision, recall, and F1-score of 0.6265, 0.6328, 0.5788, and 0.5912, respectively. 
Compared with the traditional machine learning method naïve Bayes and the deep learning method hierarchical text classification convolutional neural network, our classification model showed better performance. At the same time, we also performed a single-level classification experiment on users' needs, which in comparison with the multilevel classification model exhibited a great improvement. CONCLUSIONS A multilevel classification framework has been designed based on the GCN model. The results demonstrated that the method is effective in classifying users' information needs in online medical and health communities. At the same time, users with different diseases have different directions for information needs, which plays an important role in providing diversified and targeted services to the online medical and health community. Our method is also applicable to other similar disease classifications.
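The RESULTS above report accuracy, precision, recall, and F1-score for a multi-class classifier. The sketch below shows one common way such figures are computed, namely macro-averaging per-class scores. The abstract does not state which averaging scheme the authors used, so macro-averaging is an assumption here, and the function name `macro_metrics` and the toy labels are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Per-class F1 is the harmonic mean of that class's precision and recall.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical toy labels, not data from the study.
y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
accuracy, precision, recall, f1 = macro_metrics(y_true, y_pred)
```

Under macro-averaging, the averaged F1 is generally not the harmonic mean of the averaged precision and recall. The reported F1-score of 0.5912 likewise differs from the harmonic mean of the reported precision and recall (about 0.605), which would be consistent with per-class averaging, although the abstract does not confirm this.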
Affiliation(s)
- Quan Cheng
  - School of Economics and Management, Fuzhou University, Fuzhou, China
- Yingru Lin
  - School of Economics and Management, Fuzhou University, Fuzhou, China
8
Khademi S, Hallinan CM, Conway M, Bonomo Y. Using Social Media Data to Investigate Public Perceptions of Cannabis as a Medicine: Narrative Review. J Med Internet Res 2023; 25:e36667. [PMID: 36848191 PMCID: PMC10012004 DOI: 10.2196/36667] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/20/2022] [Revised: 08/31/2022] [Accepted: 12/16/2022] [Indexed: 03/01/2023] Open
Abstract
BACKGROUND The use and acceptance of medicinal cannabis are on the rise across the globe. To support the interests of public health, evidence relating to its use, effects, and safety is required to match this community demand. Web-based user-generated data are often used by researchers and public health organizations for the investigation of consumer perceptions, market forces, population behaviors, and for pharmacoepidemiology. OBJECTIVE In this review, we aimed to summarize the findings of studies that have used user-generated text as a data source to study medicinal cannabis or the use of cannabis as medicine. Our objectives were to categorize the insights provided by social media research on cannabis as medicine and describe the role of social media for consumers using medicinal cannabis. METHODS The inclusion criteria for this review were primary research studies and reviews that reported on the analysis of web-based user-generated content on cannabis as medicine. The MEDLINE, Scopus, Web of Science, and Embase databases were searched from January 1974 to April 2022. RESULTS We examined 42 studies published in English and found that consumers value their ability to exchange experiences on the web and tend to rely on web-based information sources. Cannabis discussions have portrayed the substance as a safe and natural medicine to help with many health conditions including cancer, sleep disorders, chronic pain, opioid use disorders, headaches, asthma, bowel disease, anxiety, depression, and posttraumatic stress disorder. These discussions provide a rich resource for researchers to investigate medicinal cannabis-related consumer sentiment and experiences, including the opportunity to monitor cannabis effects and adverse events, provided that the anecdotal and often biased nature of the information is properly accounted for. 
CONCLUSIONS The extensive web-based presence of the cannabis industry coupled with the conversational nature of social media discourse results in rich but potentially biased information that is often not well-supported by scientific evidence. This review summarizes what social media is saying about the medicinal use of cannabis and discusses the challenges faced by health governance agencies and professionals to make use of web-based resources to both learn from medicinal cannabis users and provide factual, timely, and reliable evidence-based health information to consumers.
Affiliation(s)
- Sedigh Khademi
  - Department of General Practice, Faculty of Medicine, Dentistry & Health Sciences, University of Melbourne, Victoria, Australia
  - Centre for Health Analytics, Murdoch Children's Research Institute, Melbourne, Australia
- Christine Mary Hallinan
  - Department of General Practice, Faculty of Medicine, Dentistry & Health Sciences, University of Melbourne, Victoria, Australia
  - Health & Biomedical Research Information Technology Unit, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, Australia
- Mike Conway
  - School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Yvonne Bonomo
  - Department of General Practice, Faculty of Medicine, Dentistry & Health Sciences, University of Melbourne, Victoria, Australia
  - Department of Addiction Medicine, St Vincent's Health, Melbourne, Australia
9
Hallinan CM, Khademi Habibabadi S, Conway M, Bonomo YA. Social media discourse and internet search queries on cannabis as a medicine: A systematic scoping review. PLoS One 2023; 18:e0269143. [PMID: 36662832 PMCID: PMC9858862 DOI: 10.1371/journal.pone.0269143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Accepted: 12/15/2022] [Indexed: 01/21/2023] Open
Abstract
The use of cannabis for medicinal purposes has increased globally over the past decade since patient access to medicinal cannabis has been legislated across jurisdictions in Europe, the United Kingdom, the United States, Canada, and Australia. Yet, evidence relating to the effect of medical cannabis on the management of symptoms for a suite of conditions is only just emerging. Although there is considerable engagement from many stakeholders to add to the evidence base through randomized controlled trials, many gaps in the literature remain. Data from real-world and patient-reported sources can provide opportunities to address this evidence deficit. These real-world data can be captured from a variety of sources, such as those found in routinely collected health care and health services records, which include but are not limited to patient-generated data from medical, administrative and claims data, patient-reported data from surveys, wearable trackers, patient registries, and social media. In this systematic scoping review, we seek to understand the utility of online user-generated text for studying the use of cannabis as a medicine. We aimed to systematically search published literature to examine the extent, range, and nature of research that utilises user-generated content to examine cannabis as a medicine. The objective of this methodological review is to synthesise primary research that uses social media discourse and internet search engine queries to answer the following questions: (i) In what way is online user-generated text used as a data source in the investigation of cannabis as a medicine? (ii) What are the aims, data sources, methods, and research themes of studies using online user-generated text to discuss the medicinal use of cannabis? We conducted a manual search of primary research studies which used online user-generated text as a data source using the MEDLINE, Embase, Web of Science, and Scopus databases in October 2022. 
Editorials, letters, commentaries, surveys, protocols, and book chapters were excluded from the review. Forty-two studies were included in this review: twenty-two used manually labelled data, four used existing meta-data (Google Trends/geo-location data), two used data that were manually coded using crowdsourcing services, two used automated coding supplied by a social media analytics company, and fifteen used computational methods for annotating data. Our review reflects a growing interest in the use of user-generated content for public health surveillance. It also demonstrates the need for the development of a systematic approach for evaluating the quality of social media studies and highlights the utility of automatic processing and computational methods (machine learning technologies) for large social media datasets. This systematic scoping review has shown that user-generated content as a data source for studying cannabis as a medicine provides another means to understand how cannabis is perceived and used in the community. As such, it provides another potential 'tool' with which to engage in pharmacovigilance of not only cannabis as a medicine but also other novel therapeutics as they enter the market.
Affiliation(s)
- Christine Mary Hallinan
  - Faculty of Medicine, Department of General Practice, Dentistry & Health Sciences, The University of Melbourne, Melbourne, Victoria, Australia
  - Faculty of Medicine, Department of General Practice, Health & Biomedical Research Information Technology Unit (HaBIC R2), Melbourne Medical School, Dentistry & Health Sciences, The University of Melbourne, Melbourne, Victoria, Australia
- Sedigheh Khademi Habibabadi
  - Faculty of Medicine, Department of General Practice, Dentistry & Health Sciences, The University of Melbourne, Melbourne, Victoria, Australia
- Mike Conway
  - Centre for Digital Transformation of Health, Victorian Comprehensive Cancer Centre, The University of Melbourne, Melbourne, Victoria, Australia
- Yvonne Ann Bonomo
  - St Vincent’s Health, Department of Addiction Medicine, Melbourne, Victoria, Australia
  - Faculty of Medicine, St Vincent’s Clinical School, Melbourne Medical School, Dentistry & Health Sciences, The University of Melbourne, Melbourne, Victoria, Australia
|
10
|
Stemmer M, Parmet Y, Ravid G. What are IBD Patients Talking About on Twitter? Using Natural Language Understanding to Investigate Patients' Tweets. SN Computer Science 2023; 4:343. [PMID: 37125220 PMCID: PMC10117261 DOI: 10.1007/s42979-023-01772-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2022] [Accepted: 03/03/2023] [Indexed: 05/02/2023]
Abstract
This research aims to investigate what patients with inflammatory bowel disease (IBD) are talking about on Twitter and learn from the experimental knowledge they share online. The study presents a framework for analyzing patients' tweets and comparing their content to tweets of the general population. We started by constructing two datasets of tweets: a dataset of patients' tweets and a control dataset for comparison. Then, we thematically classified the tweets and obtained a subset of tweets related to health and nutrition. We used a Dirichlet regression to compare the thematic segmentations of the two groups. We continued by extracting keywords from the filtered tweets and applying entity sentiment analysis to determine the patients' sentiments towards the extracted keywords. Finally, we detected emotions within the tweets and used a Wilcoxon test to compare the emotions conveyed in each group. We found statistically significant differences between the patients' thematic segmentations and those of the control group and observed significant differences in the emotions each group expressed while talking about health. Not only do patients talk more about health in comparison to the general Twitter population, but they also address the subject with negative sentiments and express more negative emotions. The personal information IBD patients share on Twitter can be used to derive complementary knowledge about the disease and provide an additional foundation to existing medical research on IBD. The four stages of the study can also be extended to other chronic conditions.
Affiliation(s)
- Maya Stemmer
  - Ben-Gurion University of the Negev, P.O.B. 653, 8410501 Beer-Sheva, Israel
- Yisrael Parmet
  - Ben-Gurion University of the Negev, P.O.B. 653, 8410501 Beer-Sheva, Israel
- Gilad Ravid
  - Ben-Gurion University of the Negev, P.O.B. 653, 8410501 Beer-Sheva, Israel
|
11
|
Khademi Habibabadi S, Hallinan C, Bonomo Y, Conway M. Consumer-Generated Discourse on Cannabis as a Medicine: Scoping Review of Techniques. J Med Internet Res 2022; 24:e35974. [PMID: 36383417 PMCID: PMC9713623 DOI: 10.2196/35974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Revised: 06/16/2022] [Accepted: 07/27/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Medicinal cannabis is increasingly being used for a variety of physical and mental health conditions. Social media and web-based health platforms provide valuable, real-time, and cost-effective surveillance resources for gleaning insights regarding individuals who use cannabis for medicinal purposes. This is particularly important considering that the evidence for the optimal use of medicinal cannabis is still emerging. Despite the web-based marketing of medicinal cannabis to consumers, currently, there is no robust regulatory framework to measure clinical health benefits or individual experiences of adverse events. In a previous study, we conducted a systematic scoping review of studies that contained themes of the medicinal use of cannabis and used data from social media and search engine results. This study analyzed the methodological approaches and limitations of these studies. OBJECTIVE We aimed to examine research approaches and study methodologies that use web-based user-generated text to study the use of cannabis as a medicine. METHODS We searched MEDLINE, Scopus, Web of Science, and Embase databases for primary studies in the English language from January 1974 to April 2022. Studies were included if they aimed to understand web-based user-generated text related to health conditions where cannabis is used as a medicine or where health was mentioned in general cannabis-related conversations. RESULTS We included 42 articles in this review. In these articles, Twitter was used 3 times more than other computer-generated sources, including Reddit, web-based forums, GoFundMe, YouTube, and Google Trends. Analytical methods included sentiment assessment, thematic analysis (manual and automatic), social network analysis, and geographic analysis. CONCLUSIONS This study is the first to review techniques used by research on consumer-generated text for understanding cannabis as a medicine. 
It is increasingly evident that consumer-generated data offer opportunities for a greater understanding of individual behavior and population health outcomes. However, research using these data has some limitations that include difficulties in establishing sample representativeness and a lack of methodological best practices. To address these limitations, deidentified annotated data sources should be made publicly available, researchers should determine the origins of posts (organizations, bots, power users, or ordinary individuals), and powerful analytical techniques should be used.
Affiliation(s)
- Sedigheh Khademi Habibabadi
  - Department of General Practice, Melbourne Medical School, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, Australia
- Christine Hallinan
  - Department of General Practice, Melbourne Medical School, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, Australia
  - Health & Biomedical Research Information Technology Unit, The University of Melbourne, Melbourne, Australia
- Yvonne Bonomo
  - Department of General Practice, Melbourne Medical School, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, Australia
- Mike Conway
  - School of Computing & Information Systems, The University of Melbourne, Melbourne, Australia
|
12
|
Stemmer M, Parmet Y, Ravid G. Identifying Patients With Inflammatory Bowel Disease on Twitter and Learning From Their Personal Experience: Retrospective Cohort Study. J Med Internet Res 2022; 24:e29186. [PMID: 35917151 PMCID: PMC9382547 DOI: 10.2196/29186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2021] [Revised: 10/29/2021] [Accepted: 05/20/2022] [Indexed: 11/25/2022] Open
Abstract
Background Patients use social media as an alternative information source, where they share information and provide social support. Although large amounts of health-related data are posted on Twitter and other social networking platforms each day, research using social media data to understand chronic conditions and patients’ lifestyles is limited. Objective In this study, we contributed to closing this gap by providing a framework for identifying patients with inflammatory bowel disease (IBD) on Twitter and learning from their personal experiences. We enabled the analysis of patients’ tweets by building a classifier of Twitter users that distinguishes patients from other entities. This study aimed to uncover the potential of using Twitter data to promote the well-being of patients with IBD by relying on the wisdom of the crowd to identify healthy lifestyles. We sought to leverage posts describing patients’ daily activities and their influence on their well-being to characterize lifestyle-related treatments. Methods In the first stage of the study, a machine learning method combining social network analysis and natural language processing was used to automatically classify users as patients or not. We considered 3 types of features: the user’s behavior on Twitter, the content of the user’s tweets, and the social structure of the user’s network. We compared the performances of several classification algorithms within 2 classification approaches. One classified each tweet and deduced the user’s class from their tweet-level classification. The other aggregated tweet-level features to user-level features and classified the users themselves. Different classification algorithms were examined and compared using 4 measures: precision, recall, F1 score, and the area under the receiver operating characteristic curve. 
In the second stage, a classifier from the first stage was used to collect patients' tweets describing the different lifestyles patients adopt to deal with their disease. Using IBM Watson Service for entity sentiment analysis, we calculated the average sentiment of 420 lifestyle-related words that patients with IBD use when describing their daily routine. Results Both classification approaches showed promising results. Although the precision rates were slightly higher for the tweet-level approach, the recall and area under the receiver operating characteristic curve of the user-level approach were significantly better. Sentiment analysis of tweets written by patients with IBD identified frequently mentioned lifestyles and their influence on patients’ well-being. The findings reinforced what is known about suitable nutrition for IBD as several foods known to cause inflammation were pointed out in negative sentiment, whereas relaxing activities and anti-inflammatory foods surfaced in a positive context. Conclusions This study suggests a pipeline for identifying patients with IBD on Twitter and collecting their tweets to analyze the experimental knowledge they share. These methods can be adapted to other diseases and enhance medical research on chronic conditions.
Affiliation(s)
- Maya Stemmer
  - Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Yisrael Parmet
  - Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Gilad Ravid
  - Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer-Sheva, Israel
|
13
|
He L, Yin T, Zheng K. They May not Work! An Evaluation of Eleven Sentiment Analysis Tools on Seven Social Media Datasets. J Biomed Inform 2022; 132:104142. [PMID: 35835437 DOI: 10.1016/j.jbi.2022.104142] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/14/2022] [Revised: 07/05/2022] [Accepted: 07/07/2022] [Indexed: 11/29/2022]
Abstract
OBJECTIVE Sentiment analysis is an important method for understanding emotions and opinions expressed through social media exchanges. Little work has been done to evaluate the performance of existing sentiment analysis tools on social media datasets, particularly those related to health, healthcare, or public health. This study aims to address the gap. MATERIAL AND METHODS We evaluated 11 commonly used sentiment analysis tools on five health-related social media datasets curated in previously published studies. These datasets include Human Papillomavirus Vaccine, Health Care Reform, COVID-19 Masking, Vitals.com Physician Reviews, and the Breast Cancer Forum from MedHelp.org. For comparison, we also analyzed two non-health datasets based on movie reviews and generic tweets. We conducted a qualitative error analysis on the social media posts that were incorrectly classified by all tools. RESULTS The existing sentiment analysis tools performed poorly, with an average weighted F1 score below 0.6. The inter-tool agreement was also low; the average Fleiss Kappa score was 0.066. The qualitative error analysis identified two major causes for misclassification: (1) correct sentiment but on wrong subject(s) and (2) failure to properly interpret inexplicit/indirect sentiment expressions. DISCUSSION AND CONCLUSION The performance of the existing sentiment analysis tools is insufficient to generate accurate sentiment classification results. The low inter-tool agreement suggests that the conclusion of a study could be entirely driven by the idiosyncrasies of the tool selected, rather than by the data. This is very concerning, especially if the results may be used to inform important policy decisions such as mask or vaccination mandates.
Affiliation(s)
- Lu He
  - Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, United States
- Tingjue Yin
  - Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, United States
- Kai Zheng
  - Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, United States
  - Department of Emergency Medicine, School of Medicine, University of California, Irvine, Irvine, California, United States
|
14
|
Malloy C, Rawl SM, Miller WR. Inflammatory Bowel Disease Self-Management: Exploring Adolescent Use of an Online Instagram Support Community. Gastroenterol Nurs 2022; 45:254-266. [PMID: 35833744 PMCID: PMC9425855 DOI: 10.1097/sga.0000000000000657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2021] [Accepted: 10/13/2021] [Indexed: 11/26/2022] Open
Abstract
The purpose of this qualitative study was to explore the challenges adolescents with inflammatory bowel disease (IBD) experience with disease self-management as expressed in an online Instagram social support community. Public Instagram posts between January and December 2019 were manually collected from an online IBD support community. To focus on adolescent self-management needs, only posts from Instagram users who (1) indicated they had inflammatory bowel disease, (2) were 13-24 years old, or were in middle school, high school, or college were collected. Using thematic analysis, authors independently coded and identified emerging themes about self-management. Of 2,700 Instagram posts assessed for eligibility, 83 posts met inclusion criteria. Six major themes about inflammatory bowel disease self-management emerged: Desire for Normalcy, Dietary Changes, Education and Career, Healthcare System, Relationships With Others, and Symptoms and Complications. As the first thematic analysis of Instagram posts in an online inflammatory bowel disease community, results provide a crucial perspective of the concerns of adolescents with inflammatory bowel disease. Self-management challenges were wide-ranging and complex, underscoring the importance of IBD self-management in the adolescent population. Nurses should take a holistic approach to assess self-management challenges and tailor care to the specific needs of adolescents living with inflammatory bowel disease.
|
15
|
Boosting biomedical document classification through the use of domain entity recognizers and semantic ontologies for document representation: The case of gluten bibliome. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
16
|
Li L, Grando A, Sarker A. A Data-Driven Iterative Approach for Semi-automatically Assessing the Correctness of Medication Value Sets: A Proof of Concept Based on Opioids. Methods Inf Med 2021; 60:e111-e119. [PMID: 34965602 PMCID: PMC8716187 DOI: 10.1055/s-0041-1740358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background
Value sets are lists of terms (e.g., opioid medication names) and their corresponding codes from standard clinical vocabularies (e.g., RxNorm) created with the intent of supporting health information exchange and research. Value sets are manually-created and often exhibit errors.
Objectives
The aim of the study is to develop a semi-automatic, data-centric natural language processing (NLP) method to assess medication-related value set correctness and evaluate it on a set of opioid medication value sets.
Methods
We developed an NLP algorithm that utilizes value sets containing mostly true positives and true negatives to learn lexical patterns associated with the true positives, and then employs these patterns to identify potential errors in unseen value sets. We evaluated the algorithm on a set of opioid medication value sets, using the recall, precision and F1-score metrics. We applied the trained model to assess the correctness of unseen opioid value sets based on recall. To replicate the application of the algorithm in real-world settings, a domain expert manually conducted error analysis to identify potential system and value set errors.
Results
Thirty-eight value sets were retrieved from the Value Set Authority Center, and six (two opioid, four non-opioid) were used to develop and evaluate the system. Average precision, recall, and F1-score were 0.932, 0.904, and 0.909, respectively, on uncorrected value sets, and 0.958, 0.953, and 0.953, respectively, after manual correction of the same value sets. On 20 unseen opioid value sets, the algorithm obtained average recall of 0.89. Error analyses revealed that the main sources of system misclassifications were differences in how opioids were coded in the value sets: while the training value sets had generic names mostly, some of the unseen value sets had new trade names and ingredients.
Conclusion
The proposed approach is data-centric, reusable, customizable, and not resource intensive. It may help domain experts to easily validate value sets.
Affiliation(s)
- Linyi Li
  - Department of Computer Science, Emory University, Atlanta, Georgia, United States
  - Language Technologies Institute, Carnegie Mellon University, Pennsylvania, United States
- Adela Grando
  - College of Health Solutions, Arizona State University, Phoenix, Arizona, United States
- Abeed Sarker
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, Georgia, United States
|
17
|
Lazarus JV, Kakalou C, Palayew A, Karamanidou C, Maramis C, Natsiavas P, Picchio CA, Villota-Rivas M, Zelber-Sagi S, Carrieri P. A Twitter discourse analysis of negative feelings and stigma related to NAFLD, NASH and obesity. Liver Int 2021; 41:2295-2307. [PMID: 34022107 DOI: 10.1111/liv.14969] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/19/2021] [Revised: 04/27/2021] [Accepted: 05/18/2021] [Indexed: 12/15/2022]
Abstract
BACKGROUND People with non-alcoholic fatty liver disease (NAFLD) and non-alcoholic steatohepatitis (NASH) are stigmatized, partly since 'non-alcoholic' is in the name, but also because of obesity, which is a common condition in this group. Stigma is pervasive in social media and can contribute to poorer health outcomes. We examine how stigma and negative feelings concerning NAFLD/NASH and obesity manifest on Twitter. METHODS Using a self-developed search terms index, we collected NAFLD/NASH tweets from May to October 2019 (Phase I). Because stigmatizing NAFLD/NASH tweets were limited, Phase II focused on obesity (November-December 2019). Via sentiment analysis, >5000 tweets were annotated as positive, neutral or negative and used to train machine learning-based Natural Language Processing software, applied to 193 747 randomly sampled tweets. All tweets collected were analysed. RESULTS In Phase I, 16 835 tweets for NAFLD and 2376 for NASH were retrieved. Of the annotated NAFLD/NASH tweets, 97/1130 (8.6%) and 63/535 (11.8%), respectively, related to obesity and 13/1130 (1.2%) and 5/535 (0.9%), to stigma; they primarily focused on scientific discourse and unverified information. Of the 193 747 non-annotated obesity tweets (Phase II), the algorithm classified 40.0% as related to obesity, of which 85.2% were negative, 1.0% positive and 13.7% neutral. CONCLUSIONS NAFLD/NASH tweets mostly indicated an unmet information need and showed no clear signs of stigma. However, the negative content of obesity tweets was recurrent. As obesity-related stigma is associated with reduced care engagement and lifestyle modification, the main NAFLD/NASH treatment, stigma-reducing interventions in social media should be included in the liver health agenda.
Affiliation(s)
- Jeffrey V Lazarus
  - Barcelona Institute for Global Health (ISGlobal), Hospital Clínic, University of Barcelona, Barcelona, Spain
- Christine Kakalou
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Adam Palayew
  - McGill Department of Epidemiology, Biostatistics, and Occupational Health, Montreal, QC, Canada
- Christina Karamanidou
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Christos Maramis
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Pantelis Natsiavas
  - Institute of Applied Biosciences, Centre for Research & Technology Hellas, Thessaloniki, Greece
- Camila A Picchio
  - Barcelona Institute for Global Health (ISGlobal), Hospital Clínic, University of Barcelona, Barcelona, Spain
- Marcela Villota-Rivas
  - Barcelona Institute for Global Health (ISGlobal), Hospital Clínic, University of Barcelona, Barcelona, Spain
- Shira Zelber-Sagi
  - School of Public Health, University of Haifa, Haifa, Israel
  - Department of Gastroenterology, Tel Aviv Medical Center, Tel Aviv, Israel
- Patrizia Carrieri
  - Aix Marseille Univ, INSERM, IRD, SESSTIM, Sciences Economiques & Sociales de la Santé & Traitement de l'Information Médicale, ISSPAM, Marseille, France
|
18
|
Ricard BJ, Hassanpour S. Deep Learning for Identification of Alcohol-Related Content on Social Media (Reddit and Twitter): Exploratory Analysis of Alcohol-Related Outcomes. J Med Internet Res 2021; 23:e27314. [PMID: 34524095 PMCID: PMC8482254 DOI: 10.2196/27314] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/21/2021] [Revised: 03/30/2021] [Accepted: 08/01/2021] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Many social media studies have explored the ability of thematic structures, such as hashtags and subreddits, to identify information related to a wide variety of mental health disorders. However, studies and models trained on specific themed communities are often difficult to apply to different social media platforms and related outcomes. A deep learning framework using thematic structures from Reddit and Twitter can have distinct advantages for studying alcohol abuse, particularly among the youth in the United States. OBJECTIVE This study proposes a new deep learning pipeline that uses thematic structures to identify alcohol-related content across different platforms. We apply our method on Twitter to determine the association of the prevalence of alcohol-related tweets with alcohol-related outcomes reported from the National Institute of Alcoholism and Alcohol Abuse, Centers for Disease Control Behavioral Risk Factor Surveillance System, county health rankings, and the National Industry Classification System. METHODS The Bidirectional Encoder Representations From Transformers neural network learned to classify 1,302,524 Reddit posts as either alcohol-related or control subreddits. The trained model identified 24 alcohol-related hashtags from an unlabeled data set of 843,769 random tweets. Querying alcohol-related hashtags identified 25,558,846 alcohol-related tweets, including 790,544 location-specific (geotagged) tweets. We calculated the correlation between the prevalence of alcohol-related tweets and alcohol-related outcomes, controlling for confounding effects of age, sex, income, education, and self-reported race, as recorded by the 2013-2018 American Community Survey. 
RESULTS Significant associations were observed: between alcohol-hashtagged tweets and alcohol consumption (P=.01) and heavy drinking (P=.005) but not binge drinking (P=.37), self-reported at the metropolitan-micropolitan statistical area level; between alcohol-hashtagged tweets and self-reported excessive drinking behavior (P=.03) but not motor vehicle fatalities involving alcohol (P=.21); between alcohol-hashtagged tweets and the number of breweries (P<.001), wineries (P<.001), and beer, wine, and liquor stores (P<.001) but not drinking places (P=.23), per capita at the US county and county-equivalent level; and between alcohol-hashtagged tweets and all gallons of ethanol consumed (P<.001), as well as ethanol consumed from wine (P<.001) and liquor (P=.01) sources but not beer (P=.63), at the US state level. CONCLUSIONS Here, we present a novel natural language processing pipeline, developed using Reddit's alcohol-related subreddits, that identifies highly specific alcohol-related Twitter hashtags. The prevalence of identified hashtags contains interpretable information about alcohol consumption at both coarse (eg, US state) and fine-grained (eg, metropolitan-micropolitan statistical area level and county) geographical designations. This approach can expand research and deep learning interventions on alcohol abuse and other behavioral health outcomes.
Affiliation(s)
- Saeed Hassanpour
  - Department of Biomedical Data Science, Dartmouth College, Lebanon, NH, United States
  - Department of Epidemiology, Dartmouth College, Hanover, NH, United States
  - Department of Computer Science, Dartmouth College, Hanover, NH, United States
|
19
|
Abbasi-Perez A, Alvarez-Mon MA, Donat-Vargas C, Ortega MA, Monserrat J, Perez-Gomez A, Sanz I, Alvarez-Mon M. Analysis of Tweets Containing Information Related to Rheumatological Diseases on Twitter. International Journal of Environmental Research and Public Health 2021; 18:9094. [PMID: 34501681 PMCID: PMC8430833 DOI: 10.3390/ijerph18179094] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/19/2021] [Revised: 08/21/2021] [Accepted: 08/24/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Tweets often indicate the interests of Twitter users. Data from Twitter could be used to better understand the interest in and perceptions of a variety of diseases and medical conditions, including rheumatological diseases which have increased in prevalence over the past several decades. The aim of this study was to perform a content analysis of tweets referring to rheumatological diseases. METHODS The content of each tweet was rated as medical (including a reference to diagnosis, treatment, or other aspects of the disease) or non-medical (such as requesting help). The type of user and the suitability of the medical content (appropriate content or, on the contrary, fake content if it was medically inappropriate according to the current medical knowledge) were also evaluated. The number of retweets and likes generated were also investigated. RESULTS We analyzed a total of 1514 tweets: 1093 classified as medical and 421 as non-medical. The diseases with more tweets were the most prevalent. Within the medical tweets, the content of these varied according to the disease (some more focused on diagnosis and others on treatment). The fake content came from unidentified users and mostly referred to the treatment of diseases. CONCLUSIONS According to our results, the analysis of content posted on Twitter in regard to rheumatological diseases may be useful for investigating the public's prevailing areas of interest, concerns and opinions. Thus, it could facilitate communication between health care professionals and patients, and ultimately improve the doctor-patient relationship. Due to the interest shown in medical issues it seems desirable to have healthcare institutions and healthcare workers involved in Twitter.
Affiliation(s)
- Adrian Abbasi-Perez
  - Service of Internal Medicine and Rheumatology, Autoimmune Diseases University Hospital “Principe de Asturias”, 28805 Alcala de Henares, Spain
- Miguel Angel Alvarez-Mon
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, 28805 Alcala de Henares, Spain
- Carolina Donat-Vargas
  - Carol Cardiovascular and Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institute, 17177 Stockholm, Sweden
  - IMDEA-Food Institute, Campus of International Excellence, Universidad Autónoma de Madrid, Consejo Superior de Investigaciones Científicas, 28049 Madrid, Spain
- Miguel A. Ortega
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, 28805 Alcala de Henares, Spain
  - Institute Ramon y Cajal for Health Research (IRYCIS), 28034 Madrid, Spain
- Jorge Monserrat
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, 28805 Alcala de Henares, Spain
  - Institute Ramon y Cajal for Health Research (IRYCIS), 28034 Madrid, Spain
- Ana Perez-Gomez
  - Service of Internal Medicine and Rheumatology, Autoimmune Diseases University Hospital “Principe de Asturias”, 28805 Alcala de Henares, Spain
- Ignacio Sanz
  - Division of Immunology and Rheumatology, Department of Medicine, Emory University, Atlanta, GA 30322, USA
- Melchor Alvarez-Mon
  - Service of Internal Medicine and Rheumatology, Autoimmune Diseases University Hospital “Principe de Asturias”, 28805 Alcala de Henares, Spain
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, 28805 Alcala de Henares, Spain
  - Institute Ramon y Cajal for Health Research (IRYCIS), 28034 Madrid, Spain
|
20
|
Lui CW, Wang Z, Wang N, Milinovich G, Ding H, Mengersen K, Bambrick H, Hu W. A call for better understanding of social media in surveillance and management of noncommunicable diseases. Health Res Policy Syst 2021; 19:18. [PMID: 33568155 PMCID: PMC7876784 DOI: 10.1186/s12961-021-00683-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2020] [Accepted: 01/24/2021] [Indexed: 11/13/2022] Open
Abstract
Using social media for health purposes has attracted much attention over the past decade. Given the challenges of population ageing and changes in national health profile and disease patterns following the epidemiologic transition, researchers and policy-makers should pay attention to the potential of social media in chronic disease surveillance, management and support. This commentary overviews the evidence base for this inquiry and outlines the key challenges to research lying ahead. The authors provide concrete suggestions and recommendations for developing a research agenda to guide future investigation and action on this topic.
Affiliation(s)
- Chi-Wai Lui
  - School of Public Health, The University of Queensland, Brisbane, QLD, Australia
- Zaimin Wang
  - Centre for Chronic Disease, School of Clinical Medicine, The University of Queensland, Brisbane, QLD, Australia
  - School of Public Health and Social Work, Queensland University of Technology, Brisbane, QLD, Australia
- Ning Wang
  - School of Public Health and Social Work, Queensland University of Technology, Brisbane, QLD, Australia
- Gabriel Milinovich
  - School of Public Health and Social Work, Queensland University of Technology, Brisbane, QLD, Australia
- Hang Ding
  - RECOVER Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD, 4059, Australia
- Kerrie Mengersen
  - ARC Centre of Excellence for the Mathematical and Statistical Frontiers, School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia
- Hilary Bambrick
  - School of Public Health and Social Work, Queensland University of Technology, Brisbane, QLD, Australia
- Wenbiao Hu
  - School of Public Health and Social Work, Queensland University of Technology, Brisbane, QLD, Australia
|
21
|
Chowdhry A, Kapoor P. Twitter for microblogging in oral health care, research, and academics: Road map and future directions. J Oral Maxillofac Pathol 2021; 25:511-514. [PMID: 35281148 PMCID: PMC8859589 DOI: 10.4103/jomfp.jomfp_190_21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/13/2021] [Accepted: 07/03/2021] [Indexed: 11/06/2022] Open
Abstract
Current times have seen growing use of social media tools, including microblogging sites like Twitter, as an efficient method to disseminate health-related information among patients, students, and health care workers. This article explores the role of this short, effective messaging platform in oral health care, teaching, research and learning. The concepts of “tweeting the meeting” and aggregation of conversations via “hashtags” are advocated for academic conferences, as they extend the conference reach, give users better access to the instructors, and enhance the related outcomes. Tweeting and retweeting the required research content may increase the academic footprint of the conducted research and researchers. In addition, Twitter has played an immense role in the current COVID-19 pandemic through the regular circulation of information to the public, and it has helped governments in policymaking and in showcasing areas of public concern. However, it still has huge potential yet to be explored, with collective efforts needed to strengthen the authenticity and standardization of the shared content.
|
22
|
Exploring Public Response to COVID-19 on Weibo with LDA Topic Modeling and Sentiment Analysis. DATA AND INFORMATION MANAGEMENT 2021; 5:86-99. [PMID: 35402850 PMCID: PMC8975181 DOI: 10.2478/dim-2020-0023] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/15/2020] [Accepted: 09/15/2020] [Indexed: 01/26/2023]
Abstract
It is necessary and important to understand public responses to crises, including disease outbreaks. Traditionally, surveys have played an essential role in collecting public opinion, while nowadays, with the increasing popularity of social media, mining social media data serves as another popular tool in opinion mining research. To understand the public response to COVID-19 on Weibo, this research collects 719,570 Weibo posts through a web crawler and analyzes the data with text mining techniques, including Latent Dirichlet Allocation (LDA) topic modeling and sentiment analysis. It is found that, in response to the COVID-19 outbreak, people learn about COVID-19, show their support for frontline warriors, encourage each other spiritually, and, in terms of taking preventive measures, express concerns about economic and life restoration, and so on. Analysis of sentiments and semantic networks further reveals that country media, as well as influential individuals and “self-media,” together contribute to the information spread of positive sentiment.
|
23
|
Theodoridis X, Pittas S, Bogdanos DP, Grammatikopoulou MG. Social Media as Tools to Study Dietary Habits of Patients with Rheumatic Diseases: Learning from Relevant Work on Inflammatory Bowel Diseases. Mediterr J Rheumatol 2020; 31:382-383. [PMID: 33521568 PMCID: PMC7841103 DOI: 10.31138/mjr.31.4.382] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/21/2020] [Revised: 03/15/2020] [Accepted: 03/21/2020] [Indexed: 11/12/2022] Open
Affiliation(s)
- Xenophon Theodoridis
  - Department of Rheumatology and Clinical Immunology, Faculty of Medicine, School of Health Sciences, University of Thessaly, Larissa, Greece
  - Department of Medicine, School of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Stefanos Pittas
  - Department of Medicine, School of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Dimitrios P. Bogdanos
  - Department of Rheumatology and Clinical Immunology, Faculty of Medicine, School of Health Sciences, University of Thessaly, Larissa, Greece
  - Division of Transplantation Immunology and Mucosal Biology, MRC Centre for Transplantation, King’s College London Medical School, SE5 9RS, London, United Kingdom
- Maria G. Grammatikopoulou
  - Department of Rheumatology and Clinical Immunology, Faculty of Medicine, School of Health Sciences, University of Thessaly, Larissa, Greece
|
24
|
Schäfer F, Faviez C, Voillot P, Foulquié P, Najm M, Jeanne JF, Fagherazzi G, Schück S, Le Nevé B. Mapping and Modeling of Discussions Related to Gastrointestinal Discomfort in French-Speaking Online Forums: Results of a 15-Year Retrospective Infodemiology Study. J Med Internet Res 2020; 22:e17247. [PMID: 33141087 PMCID: PMC7671840 DOI: 10.2196/17247] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/28/2019] [Revised: 04/30/2020] [Accepted: 06/25/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Gastrointestinal (GI) discomfort is prevalent and known to be associated with impaired quality of life. Real-world information on factors of GI discomfort and solutions used by people is, however, limited. Social media, including online forums, have been considered a new source of information to examine the health of populations in real-life settings. OBJECTIVE The aims of this retrospective infodemiology study are to identify discussion topics, characterize users, and identify perceived determinants of GI discomfort in web-based messages posted by users of French social media. METHODS Messages related to GI discomfort posted between January 2003 and August 2018 were extracted from 14 French-speaking general and specialized publicly available online forums. Extracted messages were cleaned and deidentified. Relevant medical concepts were determined on the basis of the Medical Dictionary for Regulatory Activities and vernacular terms. The identification of discussion topics was carried out by using a correlated topic model on the basis of the latent Dirichlet allocation. A nonsupervised clustering algorithm was applied to cluster forum users according to the reported symptoms of GI discomfort, discussion topics, and activity on online forums. Users' age and gender were determined by linear regression and application of a support vector machine, respectively, to characterize the identified clusters according to demographic parameters. Perceived factors of GI discomfort were classified by a combined method on the basis of syntactic analysis to identify messages with causality terms and a second topic modeling in a relevant segment of phrases. RESULTS A total of 198,866 messages associated with GI discomfort were included in the analysis corpus after extraction and cleaning. These messages were posted by 36,989 separate web users, most of them being women younger than 40 years. 
Everyday life, diet, digestion, abdominal pain, impact on the quality of life, and tips to manage stress were among the most discussed topics. Segmentation of users identified 5 clusters corresponding to chronic and acute GI concerns. Diet topic was associated with each cluster, and stress was strongly associated with abdominal pain. Psychological factors, food, and allergens were perceived as the main causes of GI discomfort by web users. CONCLUSIONS GI discomfort is actively discussed by web users. This study reveals a complex relationship between food, stress, and GI discomfort. Our approach has shown that identifying web-based discussion topics associated with GI discomfort and its perceived factors is feasible and can serve as a complementary source of real-world evidence for caregivers.
Affiliation(s)
- Florent Schäfer
  - Innovation Science and Nutrition, Danone Nutricia Research, Palaiseau, France
- Guy Fagherazzi
  - Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
  - Center of Research in Epidemiology and Population Health, UMR 1018 Inserm, Institut Gustave Roussy, Paris-Sud Paris-Saclay University, Villejuif, France
- Boris Le Nevé
  - Innovation Science and Nutrition, Danone Nutricia Research, Palaiseau, France
|
25
|
Ruffer N, Knitza J, Krusche M. #Covid4Rheum: an analytical twitter study in the time of the COVID-19 pandemic. Rheumatol Int 2020; 40:2031-2037. [PMID: 32995894 PMCID: PMC7523492 DOI: 10.1007/s00296-020-04710-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/28/2020] [Accepted: 09/18/2020] [Indexed: 12/19/2022]
Abstract
Social media services, such as Twitter, offer great potential for a better understanding of rheumatic and musculoskeletal disorders (RMDs) and improved care in the field of rheumatology. This study examined the content and stakeholders associated with the Twitter hashtag #Covid4Rheum during the COVID-19 pandemic. The content analysis shows that Twitter connects stakeholders of the rheumatology community on a global level, reaching millions of users. Specifically, the use of hashtags on Twitter assists digital crowdsourcing projects and scientific collaboration, as exemplified by the COVID-19 Global Rheumatology Alliance registry. Moreover, Twitter facilitates the distribution of scientific content, such as guidelines or publications. Finally, digital data mining enables the identification of hot topics within the field of rheumatology.
Affiliation(s)
- Nikolas Ruffer
  - Department of Rheumatology and Immunology, Klinikum Bad Bramstedt, Bad Bramstedt, Germany
- Johannes Knitza
  - Department of Internal Medicine 3-Rheumatology and Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Universitätsklinikum Erlangen, Erlangen, Germany
- Martin Krusche
  - Department of Rheumatology and Clinical Immunology, Charité-Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany
|
26
|
O'Neill P, Shandro B, Poullis A. Patient perspectives on social-media-delivered telemedicine for inflammatory bowel disease. Future Healthc J 2020; 7:241-244. [PMID: 33094237 DOI: 10.7861/fhj.2020-0094] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023]
Abstract
In attempts to reduce the spread of COVID-19 among high-risk inflammatory bowel disease patients, many gastroenterology practices have recently gone 'virtual', using telemedicine technologies to care for their patients. In efforts to support this transition and improve approachability, social media platforms have been used to deliver telemedicine services with significant success. However, the patient perspective on this use of social media has largely been ignored. This study provides a baseline patient perspective on social media usage to help inform clinicians on which methods of telemedicine delivery will be best suited to their patient populations.
Affiliation(s)
- Parker O'Neill
  - St George's University Hospitals NHS Foundation Trust, London, UK
- Benjamin Shandro
  - St George's University Hospitals NHS Foundation Trust, London, UK
- Andrew Poullis
  - St George's University Hospitals NHS Foundation Trust, London, UK
|
27
|
Du Y, Paiva K, Cebula A, Kim S, Lopez K, Li C, White C, Myneni S, Seshadri S, Wang J. Diabetes-Related Topics in an Online Forum for Caregivers of Individuals Living With Alzheimer Disease and Related Dementias: Qualitative Inquiry. J Med Internet Res 2020; 22:e17851. [PMID: 32628119 PMCID: PMC7381255 DOI: 10.2196/17851] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/16/2020] [Revised: 04/07/2020] [Accepted: 06/03/2020] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Diabetes and Alzheimer disease and related dementias (ADRD) are the seventh and sixth leading causes of death in the United States, respectively, and they coexist in many older adults. Caring for a loved one with both ADRD and diabetes is challenging and burdensome. OBJECTIVE This study aims to explore diabetes-related topics in the Alzheimer's Association ALZConnected caregiver forum by family caregivers of persons living with ADRD. METHODS User posts on the Alzheimer's Association ALZConnected caregiver forum were extracted. A total of 528 posts related to diabetes were included in the analysis. Of the users who generated the 528 posts, approximately 96.1% (275/286) were relatives of the care recipient with ADRD (eg, child, grandchild, spouse, sibling, or unspecified relative). Two researchers analyzed the data independently using thematic analysis. Any divergence was discussed among the research team, and an agreement was reached with a senior researcher's input as deemed necessary. RESULTS Thematic analysis revealed 7 key themes. The results showed that comorbidities of ADRD were common topics of discussions among family caregivers. Diabetes management in ADRD challenged family caregivers. Family caregivers might neglect their own health care because of the caring burden, and they reported poor health outcomes and reduced quality of life. The online forum provided a platform for family caregivers to seek support in their attempts to learn more about how to manage the ADRD of their care recipients and seek support for managing their own lives as caregivers. CONCLUSIONS The ALZConnected forum provided a platform for caregivers to seek informational and emotional support for caring for persons living with ADRD and diabetes. The overwhelming burdens with these two health conditions were apparent for both caregivers and care recipients based on discussions from the online forum. 
Studies are urgently needed to provide practical guidelines and interventions for diabetes management in individuals with diabetes and ADRD. Future studies to explore delivering diabetes management interventions through online communities in caregivers and their care recipients with ADRD and diabetes are warranted.
Affiliation(s)
- Yan Du
  - Center on Smart and Connected Health Technologies, School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Kristi Paiva
  - Center on Smart and Connected Health Technologies, School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Adrian Cebula
  - Center on Smart and Connected Health Technologies, School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Seon Kim
  - Center on Smart and Connected Health Technologies, School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Katrina Lopez
  - Center on Smart and Connected Health Technologies, School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Chengdong Li
  - Center on Smart and Connected Health Technologies, School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Carole White
  - School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Sahiti Myneni
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Sudha Seshadri
  - Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Jing Wang
  - Center on Smart and Connected Health Technologies, School of Nursing, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
|
28
|
Wang J, Deng H, Liu B, Hu A, Liang J, Fan L, Zheng X, Wang T, Lei J. Systematic Evaluation of Research Progress on Natural Language Processing in Medicine Over the Past 20 Years: Bibliometric Study on PubMed. J Med Internet Res 2020; 22:e16816. [PMID: 32012074 PMCID: PMC7005695 DOI: 10.2196/16816] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 10/28/2019] [Revised: 12/05/2019] [Accepted: 12/15/2019] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Natural language processing (NLP) is an important traditional field in computer science, but its application in medical research has faced many challenges. With the extensive digitalization of medical information globally and increasing importance of understanding and mining big data in the medical field, NLP is becoming more crucial. OBJECTIVE The goal of the research was to perform a systematic review on the use of NLP in medical research with the aim of understanding the global progress on NLP research outcomes, content, methods, and study groups involved. METHODS A systematic review was conducted using the PubMed database as a search platform. All published studies on the application of NLP in medicine (except biomedicine) during the 20 years between 1999 and 2018 were retrieved. The data obtained from these published studies were cleaned and structured. Excel (Microsoft Corp) and VOSviewer (Nees Jan van Eck and Ludo Waltman) were used to perform bibliometric analysis of publication trends, author orders, countries, institutions, collaboration relationships, research hot spots, diseases studied, and research methods. RESULTS A total of 3498 articles were obtained during initial screening, and 2336 articles were found to meet the study criteria after manual screening. The number of publications increased every year, with a significant growth after 2012 (number of publications ranged from 148 to a maximum of 302 annually). The United States has occupied the leading position since the inception of the field, with the largest number of articles published. The United States contributed to 63.01% (1472/2336) of all publications, followed by France (5.44%, 127/2336) and the United Kingdom (3.51%, 82/2336). The author with the largest number of articles published was Hongfang Liu (70), while Stéphane Meystre (17) and Hua Xu (33) published the largest number of articles as the first and corresponding authors. 
Among first authors' affiliated institutions, Columbia University published the largest number of articles, accounting for 4.54% (106/2336) of the total. Approximately one-fifth (17.68%, 413/2336) of the articles involved research on specific diseases, and the subject areas primarily focused on mental illness (16.46%, 68/413), breast cancer (5.81%, 24/413), and pneumonia (4.12%, 17/413). CONCLUSIONS NLP is in a period of robust development in the medical field, with an average of approximately 100 publications annually. Electronic medical records were the most used research materials, but social media such as Twitter have become important research materials since 2015. Cancer (24.94%, 103/413) was the most common subject area in NLP-assisted medical research on diseases, with breast cancers (23.30%, 24/103) and lung cancers (14.56%, 15/103) accounting for the highest proportions of studies. Columbia University and the researchers trained there were the most active and prolific research forces on NLP in the medical field.
Affiliation(s)
- Jing Wang
  - School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
- Huan Deng
  - School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
- Bangtao Liu
  - School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
- Anbin Hu
  - School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
- Jun Liang
  - IT Center, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Lingye Fan
  - Affiliated Hospital, Southwest Medical University, Luzhou, China
- Xu Zheng
  - Center for Medical Informatics, Peking University, Beijing, China
- Tong Wang
  - School of Public Health, Jilin University, Jilin, China
- Jianbo Lei
  - School of Medical Informatics and Engineering, Southwest Medical University, Luzhou, China
  - Center for Medical Informatics, Peking University, Beijing, China
  - Institute of Medical Technology, Health Science Center, Peking University, Beijing, China
|