1
Fu J, Yang J, Li Q, Huang D, Yang H, Xie X, Xu H, Zhang M, Zheng C. What can we learn from a Chinese social media used by glaucoma patients? BMC Ophthalmol 2023; 23:470. [PMID: 37986061 PMCID: PMC10661764 DOI: 10.1186/s12886-023-03208-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 11/07/2023] [Indexed: 11/22/2023] Open
Abstract
PURPOSE Our study aims to explore glaucoma patients' needs and Internet habits using big data analysis and Natural Language Processing (NLP) based on deep learning (DL). METHODS In this retrospective study, we used web crawler technology to collect glaucoma-related topic posts from the glaucoma bar of Baidu Tieba, China. According to their contents, we classified topic posts into those seeking medical advice and those not seeking medical advice (social support, expressing emotions, sharing knowledge, and others). Word cloud and frequency statistics were used to analyze the contents and visualize the keywords of topic posts. Two DL models, Bidirectional Long Short-Term Memory (Bi-LSTM) and Bidirectional Encoder Representations from Transformers (BERT), were trained to identify the posts seeking medical advice. The evaluation metrics included accuracy, F1 score, and the area under the ROC curve (AUC). RESULTS A total of 10,892 topic posts were included. Most were seeking medical advice (N = 7071, 64.91%), and posts seeking advice regarding symptoms or examination (N = 4913, 45.11%) formed the largest group. These were followed by posts searching for social support (N = 2362, 21.69%), expressing emotions (N = 497, 4.56%), and sharing knowledge (N = 527, 4.84%). The word cloud analysis showed that ocular pressure, visual field, examination, and operation were the most frequent words. The accuracy, F1 score, and AUC were 0.891, 0.891, and 0.931 for the BERT model, and 0.82, 0.821, and 0.890 for the Bi-LSTM model. CONCLUSION Social media can help enhance the patient-doctor relationship by revealing patients' concerns and cognition about glaucoma in China. NLP can be a powerful tool to reflect patients' focus on diseases. DL models performed well in classifying Chinese medical-related texts, which could play an important role in public health monitoring.
Affiliation(s)
- Junxia Fu
  - Department of Ophthalmology, School of Medicine, Xinhua Hospital Affiliated to Shanghai Jiao Tong University, 200092, Shanghai, China
  - Institute of Hospital Development Strategy, China Hospital Development Institute, Shanghai Jiao Tong University, 200092, Shanghai, China
- Junrui Yang
  - Joint Shantou International Eye Center of Shantou University and the Chinese University of Hong Kong, Shantou University Medical College, Shantou, Guangdong, China
  - Department of Ophthalmology, The 74th Army Group Hospital, Guangzhou, Guangdong, China
- Qiuman Li
  - Department of Pediatric Cardiology, Guangzhou Women and Children's Medical Center, Guangzhou, Guangdong, China
- Danqing Huang
  - Institute of Hospital Development Strategy, China Hospital Development Institute, Shanghai Jiao Tong University, 200092, Shanghai, China
- Hongyang Yang
  - Institute of Hospital Development Strategy, China Hospital Development Institute, Shanghai Jiao Tong University, 200092, Shanghai, China
- Xiaoling Xie
  - Joint Shantou International Eye Center of Shantou University and the Chinese University of Hong Kong, Shantou University Medical College, Shantou, Guangdong, China
- Huaxin Xu
  - The Faculty of Science, University of Technology Sydney, Sydney, Australia
- Mingzhi Zhang
  - Joint Shantou International Eye Center of Shantou University and the Chinese University of Hong Kong, Shantou University Medical College, Shantou, Guangdong, China
- Ce Zheng
  - Department of Ophthalmology, School of Medicine, Xinhua Hospital Affiliated to Shanghai Jiao Tong University, 200092, Shanghai, China
  - Institute of Hospital Development Strategy, China Hospital Development Institute, Shanghai Jiao Tong University, 200092, Shanghai, China
2
Fu R, Kundu A, Mitsakakis N, Elton-Marshall T, Wang W, Hill S, Bondy SJ, Hamilton H, Selby P, Schwartz R, Chaiton MO. Machine learning applications in tobacco research: a scoping review. Tob Control 2023; 32:99-109. [PMID: 34452986 DOI: 10.1136/tobaccocontrol-2020-056438] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 12/19/2020] [Accepted: 04/14/2021] [Indexed: 12/23/2022]
Abstract
OBJECTIVE Identify and review the body of tobacco research literature that self-identified as using machine learning (ML) in the analysis. DATA SOURCES MEDLINE, EMBASE, PubMed, CINAHL Plus, APA PsycINFO and IEEE Xplore databases were searched up to September 2020. Studies were restricted to peer-reviewed, English-language journal articles, dissertations and conference papers comprising an empirical analysis where ML was identified to be the method used to examine human experience of tobacco. Studies of genomics and diagnostic imaging were excluded. STUDY SELECTION Two reviewers independently screened the titles and abstracts. The reference list of articles was also searched. In an iterative process, eligible studies were classified into domains based on their objectives and types of data used in the analysis. DATA EXTRACTION Using data charting forms, two reviewers independently extracted data from all studies. A narrative synthesis method was used to describe findings from each domain such as study design, objective, ML classes/algorithms, knowledge users and the presence of a data sharing statement. Trends of publication were visually depicted. DATA SYNTHESIS 74 studies were grouped into four domains: ML-powered technology to assist smoking cessation (n=22); content analysis of tobacco on social media (n=32); smoker status classification from narrative clinical texts (n=6) and tobacco-related outcome prediction using administrative, survey or clinical trial data (n=14). Implications of these studies and future directions for ML researchers in tobacco control were discussed. CONCLUSIONS ML represents a powerful tool that could advance the research and policy decision-making of tobacco control. Further opportunities should be explored.
Affiliation(s)
- Rui Fu
  - Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Anasua Kundu
  - Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Nicholas Mitsakakis
  - Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  - Children's Hospital of Eastern Ontario Research Institute, Ottawa, Ontario, Canada
- Tara Elton-Marshall
  - Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Wei Wang
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Sean Hill
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Susan J Bondy
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Hayley Hamilton
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Peter Selby
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Robert Schwartz
  - Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  - Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Michael Oliver Chaiton
  - Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  - Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
3
Utilizing Deep Learning for Detecting Adverse Drug Events in Structured and Unstructured Regulatory Drug Data Sets. Pharmaceut Med 2022; 36:307-317. [DOI: 10.1007/s40290-022-00434-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/01/2022] [Indexed: 10/16/2022]
4
Xie J, Zhang B, Ma J, Zeng D, Lo-Ciganic J. Readmission Prediction for Patients with Heterogeneous Medical History: A Trajectory-Based Deep Learning Approach. ACM Transactions on Management Information Systems 2022. [DOI: 10.1145/3468780] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
Hospital readmission refers to the situation where a patient is re-hospitalized with the same primary diagnosis within a specific time interval after discharge. Hospital readmission causes $26 billion preventable expenses to the U.S. health systems annually and often indicates suboptimal patient care. To alleviate those severe financial and health consequences, it is crucial to proactively predict patients’ readmission risk. Such prediction is challenging because the evolution of patients’ medical history is dynamic and complex. The state-of-the-art studies apply statistical models which use static predictors in a period, failing to consider patients’ heterogeneous medical history. Our approach, Trajectory-BAsed DEep Learning (TADEL), is motivated to tackle the deficiencies of the existing approaches by capturing dynamic medical history. We evaluate TADEL on a five-year national Medicare claims dataset including 3.6 million patients per year over all hospitals in the United States, reaching an F1 score of 87.3% and an AUC of 88.4%. Our approach significantly outperforms all the state-of-the-art methods. Our findings suggest that health status factors and insurance coverage are important predictors for readmission. This study contributes to IS literature and analytical methodology by formulating the trajectory-based readmission prediction problem and developing a novel deep-learning-based readmission risk prediction framework. From a health IT perspective, this research delivers implementable methods to assess patients’ readmission risk and take early interventions to avoid potential negative consequences.
Affiliation(s)
- Jiaheng Xie
  - Lerner College of Business & Economics, University of Delaware, Newark, DE, USA
- Bin Zhang
  - Eller College of Management, University of Arizona, Tucson, AZ, USA
- Jian Ma
  - University of Colorado, Colorado Springs, Colorado Springs CO, USA
- Daniel Zeng
  - Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Jenny Lo-Ciganic
  - Department of Pharmaceutical Outcomes & Policy, University of Florida, FL
5
Tang KY, Hsiao CH, Hwang GJ. A scholarly network of AI research with an information science focus: Global North and Global South perspectives. PLoS One 2022; 17:e0266565. [PMID: 35427381 PMCID: PMC9012391 DOI: 10.1371/journal.pone.0266565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Accepted: 03/22/2022] [Indexed: 11/19/2022] Open
Abstract
This paper primarily aims to provide a citation-based method for exploring the scholarly network of artificial intelligence (AI)-related research in the information science (IS) domain, especially from Global North (GN) and Global South (GS) perspectives. Three research objectives were addressed, namely (1) the publication patterns in the field, (2) the most influential articles and researched keywords in the field, and (3) the visualization of the scholarly network between GN and GS researchers between the years 2010 and 2020. On the basis of the PRISMA statement, longitudinal research data were retrieved from the Web of Science and analyzed. Thirty-two AI-related keywords were used to retrieve relevant quality articles. Finally, 149 articles accompanying the follow-up 8838 citing articles were identified as eligible sources. A co-citation network analysis was adopted to scientifically visualize the intellectual structure of AI research in GN and GS networks. The results revealed that the United States, Australia, and the United Kingdom are the most productive GN countries; by contrast, China and India are the most productive GS countries. Next, the 10 most frequently co-cited AI research articles in the IS domain were identified. Third, the scholarly networks of AI research in the GN and GS areas were visualized. Between 2010 and 2015, GN researchers in the IS domain focused on applied research involving intelligent systems (e.g., decision support systems); between 2016 and 2020, GS researchers focused on big data applications (e.g., geospatial big data research). Both GN and GS researchers focused on technology adoption research (e.g., AI-related products and services) throughout the investigated period. Overall, this paper reveals the intellectual structure of the scholarly network on AI research and several applications in the IS literature. The findings provide research-based evidence for expanding global AI research.
Affiliation(s)
- Kai-Yu Tang
  - Department of International Business, Ming Chuan University, Taipei, Taiwan
- Gwo-Jen Hwang
  - Graduate Institute of Digital Learning and Education, National Taiwan University of Science and Technology, Taipei, Taiwan
6
Purushothaman V, McMann T, Nali M, Li Z, Cuomo R, Mackey TK. Content Analysis of Nicotine Poisoning (Nic Sick) Videos on TikTok: Retrospective Observational Infodemiology Study. J Med Internet Res 2022; 24:e34050. [PMID: 35353056 PMCID: PMC9008518 DOI: 10.2196/34050] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/04/2021] [Revised: 02/01/2022] [Accepted: 02/04/2022] [Indexed: 12/24/2022] Open
Abstract
Background TikTok is a microvideo social media platform currently experiencing rapid growth and with 60% of its monthly users between the ages of 16 and 24 years. Increased exposure to e-cigarette content on social media may influence patterns of use, including the risk of overconsumption and possible nicotine poisoning, when users engage in trending challenges online. However, there is limited research assessing the characteristics of nicotine poisoning–related content posted on social media. Objective We aimed to assess the characteristics of content on TikTok that is associated with a popular nicotine poisoning–related hashtag. Methods We collected TikTok posts associated with the hashtag #nicsick, using a Python programming package (Selenium) and used an inductive coding approach to analyze video content and characteristics of interest. Videos were manually annotated to generate a codebook of the nicotine sickness–related themes. Statistical analysis was used to compare user engagement characteristics and video length in content with and without active nicotine sickness TikTok topics. Results A total of 132 TikTok videos associated with the hashtag #nicsick were manually coded, with 52.3% (69/132) identified as discussing firsthand and secondhand reports of suspected nicotine poisoning symptoms and experiences. More than one-third of nicotine poisoning–related content (26/69, 37.68%) portrayed active vaping by users, which included content with vaping behavior such as vaping tricks and overconsumption, and 43% (30/69) of recorded users self-reported experiencing nicotine sickness, poisoning, or adverse events such as vomiting following nicotine consumption. The average follower count of users posting content related to nicotine sickness was significantly higher than that for users posting content unrelated to nicotine sickness (W=2350.5, P=.03). 
Conclusions TikTok users openly discuss experiences, both firsthand and secondhand, with nicotine adverse events via the #nicsick hashtag including reports of overconsumption resulting in sickness. These study results suggest that there is a need to assess the utility of digital surveillance on emerging social media platforms for vaping adverse events, particularly on sites popular among youth and young adults. As vaping product use-patterns continue to evolve, digital adverse event detection likely represents an important tool to supplement traditional methods of public health surveillance (such as poison control center prevalence numbers).
Affiliation(s)
- Vidya Purushothaman
  - Global Health Policy and Data Institute, San Diego, CA, United States
  - Department of Anesthesiology and Infectious Diseases and Global Public Health, University of California San Diego, La Jolla, CA, United States
- Tiana McMann
  - Global Health Policy and Data Institute, San Diego, CA, United States
  - Department of Anthropology, University of California San Diego, La Jolla, CA, United States
  - S-3 Research, San Diego, CA, United States
- Matthew Nali
  - Global Health Policy and Data Institute, San Diego, CA, United States
  - Department of Anesthesiology and Infectious Diseases and Global Public Health, University of California San Diego, La Jolla, CA, United States
  - S-3 Research, San Diego, CA, United States
- Zhuoran Li
  - Global Health Policy and Data Institute, San Diego, CA, United States
  - Department of Anthropology, University of California San Diego, La Jolla, CA, United States
  - S-3 Research, San Diego, CA, United States
- Raphael Cuomo
  - Global Health Policy and Data Institute, San Diego, CA, United States
  - Department of Anesthesiology and Infectious Diseases and Global Public Health, University of California San Diego, La Jolla, CA, United States
- Tim K Mackey
  - Global Health Policy and Data Institute, San Diego, CA, United States
  - Department of Anthropology, University of California San Diego, La Jolla, CA, United States
  - S-3 Research, San Diego, CA, United States
7
Chew R, Wenger M, Guillory J, Nonnemaker J, Kim A. Identifying Electronic Nicotine Delivery System Brands and Flavors on Instagram: Natural Language Processing Analysis. J Med Internet Res 2022; 24:e30257. [PMID: 35040793 PMCID: PMC8808345 DOI: 10.2196/30257] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/09/2021] [Revised: 11/01/2021] [Accepted: 11/21/2021] [Indexed: 01/30/2023] Open
Abstract
Background Electronic nicotine delivery system (ENDS) brands, such as JUUL, used social media as a key component of their marketing strategy, which led to massive sales growth from 2015 to 2018. During this time, ENDS use rapidly increased among youths and young adults, with flavored products being particularly popular among these groups. Objective The aim of our study is to develop a named entity recognition (NER) model to identify potential emerging vaping brands and flavors from Instagram post text. NER is a natural language processing task for identifying specific types of words (entities) in text based on the characteristics of the entity and surrounding words. Methods NER models were trained on a labeled data set of 2272 Instagram posts coded for ENDS brands and flavors. We compared three types of NER models—conditional random fields, a residual convolutional neural network, and a fine-tuned distilled bidirectional encoder representations from transformers (FTDB) network—to identify brands and flavors in Instagram posts with key model outcomes of precision, recall, and F1 scores. We used data from Nielsen scanner sales and Wikipedia to create benchmark dictionaries to determine whether brands from established ENDS brand and flavor lists were mentioned in the Instagram posts in our sample. To prevent overfitting, we performed 5-fold cross-validation and reported the mean and SD of the model validation metrics across the folds. Results For brands, the residual convolutional neural network exhibited the highest mean precision (0.797, SD 0.084), and the FTDB exhibited the highest mean recall (0.869, SD 0.103). For flavors, the FTDB exhibited both the highest mean precision (0.860, SD 0.055) and recall (0.801, SD 0.091). All NER models outperformed the benchmark brand and flavor dictionary look-ups on mean precision, recall, and F1. Comparing between the benchmark brand lists, the larger Wikipedia list outperformed the Nielsen list in both precision and recall. 
Conclusions Our findings suggest that NER models correctly identified ENDS brands and flavors in Instagram posts at rates competitive with, or better than, others in the published literature. Brands identified during manual annotation showed little overlap with those in Nielsen scanner data, suggesting that NER models may capture emerging brands with limited sales and distribution. NER models address the challenges of manual brand identification and can be used to support future infodemiology and infoveillance studies. Brands identified on social media should be cross-validated with Nielsen and other data sources to differentiate emerging brands that have become established from those with limited sales and distribution.
Affiliation(s)
- Rob Chew
  - Center for Data Science, RTI International, Research Triangle Park, NC, United States
- Michael Wenger
  - Center for Data Science, RTI International, Research Triangle Park, NC, United States
- Jamie Guillory
  - Center for Health Analytics, Media, and Policy, RTI International, Research Triangle Park, NC, United States
- James Nonnemaker
  - Center for Health Analytics, Media, and Policy, RTI International, Research Triangle Park, NC, United States
- Annice Kim
  - Center for Health Analytics, Media, and Policy, RTI International, Research Triangle Park, NC, United States
8
Wu H, Ji J, Tian H, Chen Y, Ge W, Zhang H, Yu F, Zou J, Nakamura M, Liao J. Chinese-Named Entity Recognition From Adverse Drug Event Records: Radical Embedding-Combined Dynamic Embedding-Based BERT in a Bidirectional Long Short-term Conditional Random Field (Bi-LSTM-CRF) Model. JMIR Med Inform 2021; 9:e26407. [PMID: 34855616 PMCID: PMC8686410 DOI: 10.2196/26407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/10/2020] [Revised: 04/22/2021] [Accepted: 10/05/2021] [Indexed: 12/17/2022] Open
Abstract
Background With the increasing variety of drugs, the incidence of adverse drug events (ADEs) is increasing year by year. Massive numbers of ADEs are recorded in electronic medical records and adverse drug reaction (ADR) reports, which are important sources of potential ADR information. Meanwhile, it is essential to make latent ADR information automatically available for better postmarketing drug safety reevaluation and pharmacovigilance. Objective This study describes how to identify ADR-related information from Chinese ADE reports. Methods Our study established an efficient automated tool, named BBC-Radical. BBC-Radical is a model that consists of 3 components: Bidirectional Encoder Representations from Transformers (BERT), bidirectional long short-term memory (bi-LSTM), and conditional random field (CRF). The model identifies ADR-related information from Chinese ADR reports. Token features and radical features of Chinese characters were used to represent the common meaning of a group of words. BERT and Bi-LSTM-CRF were novel models that combined these features to conduct named entity recognition (NER) tasks in the free-text section of 24,890 ADR reports from the Jiangsu Province Adverse Drug Reaction Monitoring Center from 2010 to 2016. Moreover, the man-machine comparison experiment on the ADE records from Drum Tower Hospital was designed to compare the NER performance between the BBC-Radical model and a manual method. Results The NER model achieved relatively high performance, with a precision of 96.4%, recall of 96.0%, and F1 score of 96.2%. This indicates that the performance of the BBC-Radical model (precision 87.2%, recall 85.7%, and F1 score 86.4%) is much better than that of the manual method (precision 86.1%, recall 73.8%, and F1 score 79.5%) in the recognition task of each kind of entity. 
Conclusions The proposed model was competitive in extracting ADR-related information from ADE reports, and the results suggest that the application of our method to extract ADR-related information is of great significance in improving the quality of ADR reports and postmarketing drug safety evaluation.
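The three precision/recall/F1 triples quoted in this abstract can be checked for internal consistency. The sketch below is not from the paper; it assumes the standard definition of F1 as the harmonic mean of precision and recall, and the function names `f1_score` and `recompute` are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) exactly as quoted in the abstract.
reported = {
    "NER model": (96.4, 96.0, 96.2),
    "BBC-Radical (man-machine comparison)": (87.2, 85.7, 86.4),
    "manual method": (86.1, 73.8, 79.5),
}


def recompute(table: dict) -> dict:
    """Recompute F1 (in percent, one decimal) from each precision/recall pair."""
    return {
        name: round(100 * f1_score(p / 100, r / 100), 1)
        for name, (p, r, _reported_f1) in table.items()
    }


if __name__ == "__main__":
    for name, f1 in recompute(reported).items():
        print(f"{name}: recomputed F1 = {f1}%, reported F1 = {reported[name][2]}%")
```

Under that assumption, each recomputed value agrees with the reported F1 to one decimal place.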
Affiliation(s)
- Hong Wu
  - School of Science, China Pharmaceutical University, Nanjing, China
- Jiatong Ji
  - School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
- Haimei Tian
  - School of Computer Engineering, Jinling Institute of Technology, Nanjing, China
- Yao Chen
  - School of Science, China Pharmaceutical University, Nanjing, China
- Weihong Ge
  - Department of Pharmacy, Nanjing Drum Tower Hospital, Nanjing, China
- Haixia Zhang
  - Department of Pharmacy, Nanjing Drum Tower Hospital, Nanjing, China
- Feng Yu
  - School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
- Jianjun Zou
  - Department of Clinical Pharmacology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Mitsuhiro Nakamura
  - Laboratory of Drug Informatics, Gifu Pharmaceutical University, Gifu, Japan
- Jun Liao
  - School of Science, China Pharmaceutical University, Nanjing, China
9
Chi S, Tian Y, Wang F, Wang Y, Chen M, Li J. Deep Semisupervised Multitask Learning Model and Its Interpretability for Survival Analysis. IEEE J Biomed Health Inform 2021; 25:3185-3196. [PMID: 33687852 DOI: 10.1109/jbhi.2021.3064696] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
Abstract
Survival analysis is a commonly used method in the medical field to analyze and predict the time of events. In medicine, this approach plays a key role in determining the course of treatment, developing new drugs, and improving hospital procedures. Most of the existing work in this area has addressed the problem by making strong assumptions about the underlying stochastic process. However, these assumptions are usually violated in the real-world data. This paper proposed a semisupervised multitask learning (SSMTL) method based on deep learning for survival analysis with or without competing risks. SSMTL transforms the survival analysis problem into a multitask learning problem that includes semisupervised learning and multipoint survival probability prediction. The distribution of survival times and the relationship between covariates and outcomes were modeled directly without any assumptions. Semisupervised loss and ranking loss are used to deal with censored data and the prior knowledge of the nonincreasing trend of the survival probability. Additionally, the importance of prognostic factors is determined, and the time-dependent and nonlinear effects of these factors on survival outcomes are visualized. The prediction performance of SSMTL is better than that of previous models in settings with or without competing risks, and the effects of predictors are successfully described. This study is of great significance for the exploration and application of deep learning methods involving medical structured data and provides an effective deep-learning-based method for survival analysis with complex-structured clinical data.
10
Pénzes M, Bakacs M, Brys Z, Vitrai J, Tóth G, Berezvai Z, Urbán R. Vaping-Related Adverse Events and Perceived Health Improvements: A Cross-Sectional Survey among Daily E-Cigarette Users. International Journal of Environmental Research and Public Health 2021; 18:ijerph18168301. [PMID: 34444050 PMCID: PMC8394644 DOI: 10.3390/ijerph18168301] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Revised: 07/30/2021] [Accepted: 08/01/2021] [Indexed: 11/20/2022]
Abstract
Web-based samples of e-cigarette users commonly report significant vaping-related health improvements (HIs) and mild adverse events (AEs). This cross-sectional study with in-person interviewing data collection examined self-reported AEs and perceived HIs among Hungarian adult current daily exclusive e-cigarette (n = 65) and dual users (n = 127), and former daily e-cigarette users (n = 91) in 2018. Logistic regression was used to evaluate associations between reporting any AEs/HIs, vaping status, and covariates. More former users (52.7%) reported AEs than current users (39.6%; p = 0.038). Exclusive and dual daily users reported similar rates of AEs (44.6% and 37.0%, respectively; p = 0.308). More current users (46.9%) experienced HIs than former users (35.2%; p = 0.064). Exclusive daily users were more likely to report HIs than dual users (63.1% versus 38.6%; p = 0.001). Former user status and smoking cessation/reduction reasons increased the odds of reporting AEs, whereas nicotine-containing e-liquid use and older age decreased the odds of reporting AEs. Exclusive vaper status, using advanced generation devices, and smoking cessation/reduction reasons increased the odds of experiencing HIs. This study, which used a traditional data collection methodology, found a higher rate of AEs and a lower rate of HIs compared to web-based surveys. Our results highlight that experiencing AEs and HIs is affected by users’ characteristics, in addition to the device and e-liquid type.
Affiliation(s)
- Melinda Pénzes
  - Department of Public Health, Faculty of Medicine, Semmelweis University, H-1085 Budapest, Hungary
  - Correspondence: Tel.: +36-70-380-7655
- Márta Bakacs
  - National Institute of Pharmacy and Nutrition, H-1051 Budapest, Hungary
- Zoltán Brys
  - Department of Telecommunications and Media Informatics, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, H-1117 Budapest, Hungary
- József Vitrai
  - Pharmaproject-Statisztika Ltd., H-2081 Piliscsaba, Hungary
- Gergely Tóth
  - Institute of Sociology, Centre for Social Sciences, Hungarian Academy of Sciences Centre of Excellence, Eötvös Loránd Research Network, H-1097 Budapest, Hungary
  - Faculty of Humanities and Social Sciences, Károli Gáspár University of the Reformed Church in Hungary, H-1091 Budapest, Hungary
- Zombor Berezvai
  - Institute of Marketing, Corvinus University of Budapest, H-1093 Budapest, Hungary
- Róbert Urbán
  - Institute of Psychology, Eötvös Loránd University, H-1064 Budapest, Hungary
11
Su D, Zhang X, He K, Chen Y. Use of machine learning approach to predict depression in the elderly in China: A longitudinal study. J Affect Disord 2021; 282:289-298. [PMID: 33418381 DOI: 10.1016/j.jad.2020.12.160] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 06/10/2020] [Revised: 10/28/2020] [Accepted: 12/23/2020] [Indexed: 10/24/2022]
Abstract
BACKGROUND Early detection of potential depression among elderly people is conducive to timely preventive intervention and clinical care to improve quality of life. Therefore, depression prediction that considers sequential progression patterns in the elderly needs to be further explored. METHODS We selected 1,538 elderly people from waves 3-7 of the Chinese Longitudinal Healthy Longevity Study (CLHLS) survey. Long short-term memory (LSTM) and six machine learning (ML) models were used to predict different depression risk factors and the depression risks in the elderly population in the next two years. Receiver operating characteristic (ROC) curve analysis and decision curve analysis (DCA) were used to evaluate the prediction accuracy of the reference model and ML models. RESULTS The area under the ROC curve (AUC) of logistic regression with lasso regularisation (AUC=0.629, p-value=0.020) was the highest among the ML models. DCA results showed that the net benefit of the six ML models was similar (threshold: 0.00-0.10), the net benefit of lasso regression was the largest (threshold: 0.10-0.17 and 0.22-0.25), and the net benefit of the deep neural network (DNN) was the largest (threshold: 0.17-0.22 and 0.25-0.40). In two ML models, activities of daily living (ADL)/instrumental ADL (IADL), self-rated health, marital status, arthritis, and number of cohabitants were the most important predictors of depression in the elderly. LIMITATIONS The number of retrospective waves used in the LSTM model needs to be increased. CONCLUSION The decision support system based on the proposed LSTM+ML model may be very valuable for doctors, nurses and community medical providers for early diagnosis and intervention.
Affiliation(s)
- Dai Su
  - Department of Health Management, School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China; Research Center for Rural Health Services, Hubei Province Key Research Institute of Humanities and Social Sciences, Wuhan, China; Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, USA
- Xingyu Zhang
  - Department of Systems, Populations, and Leadership, University of Michigan School of Nursing, Ann Arbor, USA
- Kevin He
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, USA
- Yingchun Chen
  - Department of Health Management, School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China; Research Center for Rural Health Services, Hubei Province Key Research Institute of Humanities and Social Sciences, Wuhan, China
12
Pandey B, Kumar Pandey D, Pratap Mishra B, Rhmann W. A comprehensive survey of deep learning in the field of medical imaging and medical natural language processing: Challenges and research directions. JOURNAL OF KING SAUD UNIVERSITY - COMPUTER AND INFORMATION SCIENCES 2021. [DOI: 10.1016/j.jksuci.2021.01.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/22/2022]
13
Choi J, Yoon J, Chung J, Coh BY, Lee JM. Social media analytics and business intelligence research: A systematic review. Inf Process Manag 2020. [DOI: 10.1016/j.ipm.2020.102279] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 01/09/2023]
14
Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations. ENTROPY 2020; 22:e22020252. [PMID: 33286026 PMCID: PMC7516692 DOI: 10.3390/e22020252] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/27/2020] [Revised: 02/20/2020] [Accepted: 02/21/2020] [Indexed: 11/25/2022]
Abstract
Increasingly popular online museums have significantly changed the way people acquire cultural knowledge. These online museums have been generating large amounts of cultural relics data. In recent years, researchers have used deep learning models that can automatically extract complex features and have rich representation capabilities to implement named-entity recognition (NER). However, the lack of labeled data in the field of cultural relics makes it difficult for deep learning models that rely on labeled data to achieve excellent performance. To address this problem, this paper proposes a semi-supervised deep learning model named SCRNER (Semi-supervised model for Cultural Relics’ Named Entity Recognition) that uses a bidirectional long short-term memory (BiLSTM) and conditional random fields (CRF) model trained on scarce labeled data and abundant unlabeled data to attain effective performance. For semi-supervised sample selection, we propose a repeat-labeled (relabeled) strategy that selects high-confidence samples to enlarge the training set iteratively. In addition, we use Embeddings from Language Models (ELMo) representations to dynamically acquire word representations as the input of the model, which addresses the blurred boundaries of cultural objects and the Chinese characteristics of texts in the field of cultural relics. Experimental results demonstrate that our proposed model, trained on limited labeled data, achieves effective performance in the task of named entity recognition of cultural relics.
15
SECNLP: A survey of embeddings in clinical natural language processing. J Biomed Inform 2020; 101:103323. [DOI: 10.1016/j.jbi.2019.103323] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/17/2019] [Revised: 09/12/2019] [Accepted: 10/27/2019] [Indexed: 12/11/2022]
16
Xu Z, Zhang Q, Li W, Li M, Yip PSF. Individualized prediction of depressive disorder in the elderly: A multitask deep learning approach. Int J Med Inform 2019; 132:103973. [PMID: 31569007 DOI: 10.1016/j.ijmedinf.2019.103973] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/25/2019] [Revised: 08/27/2019] [Accepted: 09/17/2019] [Indexed: 01/17/2023]
Abstract
INTRODUCTION Depressive disorder is one of the major public health problems among the elderly. An effective depression risk prediction model can provide insights into disease progression and potentially inform timely targeted interventions. Therefore, research on predicting the onset of depressive disorder in elderly adults that considers sequential progression patterns is critically needed. OBJECTIVE This research aims to develop a state-of-the-art deep learning model for the individualized prediction of depressive disorder using 22-year longitudinal survey data on elderly people in the United States. METHODS We obtained the 22-year longitudinal survey data from the University of Michigan Health and Retirement Study, which contains information on 20,000 elderly people in the United States from 1992 to 2014. To capture temporal and high-order interactions among risk factors, the proposed deep learning model uses a recurrent neural network framework with a multitask structure. The C-statistic and the mean absolute error are used to evaluate the prediction accuracy of the proposed model and a set of baseline models. RESULTS The experiments with the 22-year longitudinal survey data indicate that (a) machine learning models can provide an accurate prediction of the onset of depressive disorder for elderly individuals; (b) the temporal patterns of risk factors are associated with the onset of depressive disorder; and (c) the proposed multitask deep learning model exhibits superior performance compared with baseline models. CONCLUSION The results demonstrate the capability of deep learning-based prediction models to capture temporal and high-order interactions among risk factors, which are usually ignored by traditional regression models. This research sheds light on the use of machine learning models to predict the onset of depressive disorder among elderly people. Practically, the proposed methods can be implemented as a decision support system to help clinicians make decisions and inform actionable intervention strategies for elderly people.
Affiliation(s)
- Zhongzhi Xu
  - School of Data Science, City University of Hong Kong, Hong Kong, China
- Qingpeng Zhang
  - School of Data Science, City University of Hong Kong, Hong Kong, China
- Wentian Li
  - Wuhan Hospital for Psychotherapy, Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China
- Mingyang Li
  - Department of Industrial and Management Systems Engineering, The University of South Florida, Tampa, USA
- Paul Siu Fai Yip
  - Centre for Suicide Research and Prevention and the Faculty of Social Sciences, The University of Hong Kong, Hong Kong, China
17
Named entity recognition from Chinese adverse drug event reports with lexical feature based BiLSTM-CRF and tri-training. J Biomed Inform 2019; 96:103252. [PMID: 31323311 DOI: 10.1016/j.jbi.2019.103252] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 12/10/2018] [Revised: 07/12/2019] [Accepted: 07/14/2019] [Indexed: 01/04/2023]
Abstract
BACKGROUND The Adverse Drug Event Reports (ADERs) from the spontaneous reporting system are important data sources for studying Adverse Drug Reactions (ADRs) as well as for post-marketing pharmacovigilance. Apart from the conventional ADR information contained in the structured section of ADERs, more detailed information such as pre- and post-ADR symptoms, multi-drug usage and ADR-relief treatments is described in the free-text section, which can be mined with Natural Language Processing (NLP) tools. OBJECTIVE The goal of this study was to extract ADR-related entities from the free-text section of Chinese ADERs, which can supplement the information contained in the structured section and so further assist in ADR evaluation. METHODS Three models, Conditional Random Field (CRF), Bidirectional Long Short-Term Memory-CRF (BiLSTM-CRF) and Lexical Feature based BiLSTM-CRF (LF-BiLSTM-CRF), were constructed to conduct Named Entity Recognition (NER) tasks in the free-text section of Chinese ADERs. A semi-supervised learning method, tri-training, was applied on the basis of the three established models to assign reliable tags to un-annotated raw data. RESULTS Among the three basic models, the LF-BiLSTM-CRF achieved the highest average F1 score of 94.35%. After the process of tri-training, almost half of the un-annotated cases were tagged with labels, and the performance of all three models improved after iterative training. CONCLUSIONS The LF-BiLSTM-CRF model that we constructed could achieve a comparatively high F1 score, and the fusion of CRF, BiLSTM-CRF and LF-BiLSTM-CRF in tri-training might further strengthen the reliability of predicted tags. The results suggest the usefulness of our methods in developing specialized NER tools for identifying ADR-related information from Chinese ADERs.
18
Ali F, El-Sappagh S, Kwak D. Fuzzy Ontology and LSTM-Based Text Mining: A Transportation Network Monitoring System for Assisting Travel. SENSORS 2019; 19:s19020234. [PMID: 30634527 PMCID: PMC6358771 DOI: 10.3390/s19020234] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 11/08/2018] [Revised: 12/31/2018] [Accepted: 01/07/2019] [Indexed: 12/31/2022]
Abstract
Intelligent Transportation Systems (ITSs) use a sensor network-based system to gather and interpret traffic information. In addition, mobility users use mobile applications to collect transport information for safe traveling. However, these types of information are not sufficient to examine all aspects of transportation networks. Therefore, both ITSs and mobility users need a smart approach and social media data, which can help ITSs examine transport services, support traffic and control management, and help mobility users travel safely. People use social networks to share their thoughts and opinions regarding transportation, which are useful for ITSs and travelers. However, user-generated text on social media is short, unstructured, and covers a broad range of dynamic topics. Recent Machine Learning (ML) approaches are inefficient at extracting relevant features from unstructured data, detecting the word polarity of features, and classifying the sentiment of features correctly. In addition, ML classifiers consistently miss the semantic features of word meaning. A novel fuzzy ontology-based semantic knowledge with Word2vec model is proposed to improve transportation feature extraction and text classification using the Bi-directional Long Short-Term Memory (Bi-LSTM) approach. The proposed fuzzy ontology describes semantic knowledge about entities and features and their relations in the transportation domain. The fuzzy ontology and the smart methodology are developed in Web Ontology Language and Java, respectively. By using word embedding with fuzzy ontology as a representation of text, Bi-LSTM shows satisfactory improvement in both the extraction of features and the classification of unstructured social media text.
Affiliation(s)
- Farman Ali
  - Department of Information and Communication Engineering, Inha University, Incheon 22212, Korea
- Shaker El-Sappagh
  - Department of Information and Communication Engineering, Inha University, Incheon 22212, Korea
  - Department of Information Systems, Benha University, Banha 13518, Egypt
- Daehan Kwak
  - Department of Computer Science, Kean University, Union, NJ 07083, USA
19
Chu J, Dong W, He K, Duan H, Huang Z. Using neural attention networks to detect adverse medical events from electronic health records. J Biomed Inform 2018; 87:118-130. [DOI: 10.1016/j.jbi.2018.10.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 04/25/2018] [Revised: 10/10/2018] [Accepted: 10/12/2018] [Indexed: 01/24/2023]
20
Zhou L, Zhang D, Yang C, Wang Y. HARNESSING SOCIAL MEDIA FOR HEALTH INFORMATION MANAGEMENT. ELECTRONIC COMMERCE RESEARCH AND APPLICATIONS 2018; 27:139-151. [PMID: 30147636 PMCID: PMC6105292 DOI: 10.1016/j.elerap.2017.12.003] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 05/30/2023]
Abstract
The remarkable upsurge of social media has had a dramatic impact on health care research and practice over the past decade. Social media are reshaping health information management in a variety of ways, ranging from providing cost-effective ways to improve clinician-patient communication and exchange health-related information and experience, to enabling the discovery of new medical knowledge and information. Despite some demonstrated initial success, social media use and analytics for improving health is still in its infancy as a research field. Information systems researchers can potentially play a key role in advancing the field. This study proposes a conceptual framework for social media-based health information management by drawing on multi-disciplinary research. Guided by the framework, this research presents related research challenges, identifies important yet under-explored research issues, and discusses promising directions for future research.
Affiliation(s)
- Lina Zhou
  - University of Maryland, Baltimore County
- Dongsong Zhang
  - International Business School, Jinan University, China
  - University of Maryland, Baltimore County
- Yu Wang
  - International Business School, Jinan University, China
21
Xie J, Zeng DD, Marcum ZA. Using deep learning to improve medication safety: the untapped potential of social media. Ther Adv Drug Saf 2017; 8:375-377. [PMID: 29204265 DOI: 10.1177/2042098617729318] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022] Open
Affiliation(s)
- Jiaheng Xie
  - University of Arizona, Eller College of Management, Tucson, AZ, USA
- Zachary A Marcum
  - University of Washington, School of Pharmacy, 1959 NE Pacific Street, H375G, Box 357630, Seattle, WA 98195-7630, USA