1
Bey R, Cohen A, Trebossen V, Dura B, Geoffroy PA, Jean C, Landman B, Petit-Jean T, Chatellier G, Sallah K, Tannier X, Bourmaud A, Delorme R. Natural language processing of multi-hospital electronic health records for public health surveillance of suicidality. NPJ MENTAL HEALTH RESEARCH 2024; 3:6. [PMID: 38609541 PMCID: PMC10955903 DOI: 10.1038/s44184-023-00046-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 12/06/2023] [Indexed: 04/14/2024]
Abstract
There is an urgent need to monitor the mental health of large populations, especially during crises such as the COVID-19 pandemic, in order to identify the most at-risk subgroups in a timely manner and to design targeted prevention campaigns. We therefore developed and validated surveillance indicators related to suicidality: the monthly number of hospitalisations caused by suicide attempts and the prevalence among them of five known risk factors. These indicators were computed automatically by analysing the electronic health records of fifteen university hospitals of the Paris area, France, using natural language processing algorithms based on artificial intelligence. We evaluated the relevance of these indicators by conducting a retrospective cohort study. Considering 2,911,920 records contained in a common data warehouse, we tested for changes after the pandemic outbreak in the slope of the monthly number of suicide attempts by conducting an interrupted time-series analysis. We segmented the assessment time into two sub-periods: before (August 1, 2017, to February 29, 2020) and during (March 1, 2020, to June 31, 2022) the COVID-19 pandemic. We detected 14,023 hospitalisations caused by suicide attempts. Their monthly number accelerated after the COVID-19 outbreak, with an estimated trend variation reaching 3.7 (95%CI 2.1-5.3), mainly driven by an increase among girls aged 8-17 (trend variation 1.8, 95%CI 1.2-2.5). After the pandemic outbreak, acts of domestic, physical and sexual violence were more often reported (prevalence ratios: 1.3, 95%CI 1.16-1.48; 1.3, 95%CI 1.10-1.64 and 1.7, 95%CI 1.48-1.98), fewer patients died (p = 0.007) and stays were shorter (p < 0.001). Our study demonstrates that textual clinical data collected in multiple hospitals can be jointly analysed to compute timely indicators describing the mental health conditions of populations.
Our findings also highlight the need to better take into account the violence imposed on women, especially at early ages and in the aftermath of the COVID-19 pandemic.
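As a rough illustration of the interrupted time-series idea named in the abstract, the sketch below fits a plain segmented regression by ordinary least squares and reads the "trend variation" as the change in slope after the break. It is only a sketch under stated assumptions: the function names, the four-term model (intercept, pre-period slope, level change, slope change) and the synthetic monthly counts are hypothetical, and the study's actual model may differ (for example by adjusting for seasonality or autocorrelation).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_interrupted_time_series(counts, break_index):
    """OLS fit of y = b0 + b1*t + b2*post + b3*(t - break_index)*post.

    b3 is the change in slope after the interruption (assumed model).
    """
    rows = []
    for t in range(len(counts)):
        post = 1.0 if t >= break_index else 0.0
        rows.append([1.0, float(t), post, (t - break_index) * post])
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, counts)) for i in range(k)]
    b0, b1, b2, b3 = solve_linear_system(xtx, xty)
    return {"intercept": b0, "pre_slope": b1, "level_change": b2, "slope_change": b3}


# Synthetic, noiseless monthly counts (NOT the study's data): 31 months
# before the break and 28 after, with the slope increasing by 3.7.
synthetic_counts = [
    200.0 + 1.0 * t + (3.7 * (t - 31) if t >= 31 else 0.0) for t in range(59)
]
fit = fit_interrupted_time_series(synthetic_counts, break_index=31)
print(round(fit["slope_change"], 3))
```

On real counts, the slope-change estimate would carry a standard error and confidence interval, which this sketch does not compute.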
Affiliation(s)
- Romain Bey
  - Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
- Ariel Cohen
  - Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
- Vincent Trebossen
  - Child and Adolescent Psychiatry Department, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Basile Dura
  - Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
- Pierre-Alexis Geoffroy
  - Département de psychiatrie et d'addictologie, GHU Paris Nord, DMU neurosciences, Bichat - Claude Bernard Hospital, Assistance Publique-Hôpitaux de Paris, 75018, Paris, France
  - GHU Paris - psychiatry & neurosciences, 1, rue Cabanis, 75014, Paris, France
  - NeuroDiderot, Inserm, FHU I2-D2, université Paris Cité, 75019, Paris, France
  - CNRS UPR 3212, Institute for cellular and integrative neurosciences, 67000, Strasbourg, France
- Charline Jean
  - Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
  - Université Paris-Est Créteil, INSERM, IMRB U955, Créteil, France
  - Service Santé Publique & URC, Hôpital Henri Mondor, Assistance Publique-Hôpitaux de Paris, Créteil, France
- Benjamin Landman
  - Child and Adolescent Psychiatry Department, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Thomas Petit-Jean
  - Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
- Gilles Chatellier
  - Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
  - Université Paris Cité, Paris, France
- Kankoe Sallah
  - URC PNVS, CIC-EC 1425, INSERM, Bichat - Claude Bernard Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Xavier Tannier
  - Sorbonne Université, Inserm, Université Sorbonne Paris Nord, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé (LIMICS), Paris, France
- Aurelie Bourmaud
  - Université Paris Cité, Paris, France
  - Clinical Epidemiology Unit, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - CIC 1426, Inserm, Paris, France
- Richard Delorme
  - Child and Adolescent Psychiatry Department, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - Human Genetics and Cognitive Functions, Institut Pasteur, Paris, France
2
Ni C, Song Q, Malin B, Song L, Commiskey P, Stratton L, Yin Z. Examining Online Behaviors of Adult-Child and Spousal Caregivers for People Living With Alzheimer Disease or Related Dementias: Comparative Study in an Open Online Community. J Med Internet Res 2023; 25:e48193. [PMID: 37976095 PMCID: PMC10692884 DOI: 10.2196/48193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Revised: 09/05/2023] [Accepted: 10/11/2023] [Indexed: 11/19/2023]
Abstract
BACKGROUND Alzheimer disease or related dementias (ADRD) are severe neurological disorders that impair the thinking and memory skills of older adults. Most persons living with dementia receive care at home from their family members or other unpaid informal caregivers; this results in significant mental, physical, and financial challenges for these caregivers. To combat these challenges, many informal ADRD caregivers seek social support in online environments. Although research examining online caregiving discussions is growing, few investigations have distinguished caregivers according to their kin relationships with persons living with dementias. Various studies have suggested that caregivers in different relationships experience distinct caregiving challenges and support needs. OBJECTIVE This study aims to examine and compare the online behaviors of adult-child and spousal caregivers, the 2 largest groups of informal ADRD caregivers, in an open online community. METHODS We collected posts from ALZConnected, an online community managed by the Alzheimer's Association. To gain insights into online behaviors, we first applied structural topic modeling to identify topics and topic prevalence between adult-child and spousal caregivers. Next, we applied VADER (Valence Aware Dictionary for Sentiment Reasoning) and LIWC (Linguistic Inquiry and Word Count) to evaluate sentiment changes in the online posts over time for both types of caregivers. We further built machine learning models to distinguish the posts of each caregiver type and evaluated them in terms of precision, recall, F1-score, and area under the precision-recall curve. Finally, we applied the best prediction model to compare the temporal trend of relationship-predicting capacities in posts between the 2 types of caregivers. 
RESULTS Our analysis showed that the number of posts from both types of caregivers followed a long-tailed distribution, indicating that most caregivers in this online community were infrequent users. In comparison with adult-child caregivers, spousal caregivers tended to be more active in the community, publishing more posts and engaging in discussions on a wider range of caregiving topics. Spousal caregivers also exhibited slower growth in positive emotional communication over time. The best machine learning model for predicting adult-child, spousal, or other caregivers achieved an area under the precision-recall curve of 81.3%. The subsequent trend analysis showed that it became more difficult to predict adult-child caregiver posts than spousal caregiver posts over time. This suggests that adult-child and spousal caregivers might gradually shift their discussions from questions that are more directly related to their own experiences and needs to questions that are more general and applicable to other types of caregivers. CONCLUSIONS Our findings suggest that it is important for researchers and community organizers to consider the heterogeneity of caregiving experiences and subsequent online behaviors among different types of caregivers when tailoring online peer support to meet the specific needs of each caregiver group.
Affiliation(s)
- Congning Ni
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
- Qingyuan Song
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
- Bradley Malin
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
  - Center for Genetic Privacy & Identity in Community Settings, Vanderbilt University Medical Center, Nashville, TN, United States
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States
- Lijun Song
  - Department of Sociology, Vanderbilt University, Nashville, TN, United States
- Patricia Commiskey
  - Department of Neurology, Vanderbilt University Medical Center, Nashville, TN, United States
- Lauren Stratton
  - Care and Support, Alzheimer's Association, Chicago, IL, United States
- Zhijun Yin
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
  - Center for Genetic Privacy & Identity in Community Settings, Vanderbilt University Medical Center, Nashville, TN, United States
3
Di Cara NH, Maggio V, Davis OSP, Haworth CMA. Methodologies for Monitoring Mental Health on Twitter: Systematic Review. J Med Internet Res 2023; 25:e42734. [PMID: 37155236 PMCID: PMC10203928 DOI: 10.2196/42734] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/20/2022] [Revised: 11/23/2022] [Accepted: 03/15/2023] [Indexed: 05/10/2023]
Abstract
BACKGROUND The use of social media data to predict mental health outcomes has the potential to allow for the continuous monitoring of mental health and well-being and provide timely information that can supplement traditional clinical assessments. However, it is crucial that the methodologies used to create models for this purpose are of high quality from both a mental health and machine learning perspective. Twitter has been a popular choice of social media because of the accessibility of its data, but access to big data sets is not a guarantee of robust results. OBJECTIVE This study aims to review the current methodologies used in the literature for predicting mental health outcomes from Twitter data, with a focus on the quality of the underlying mental health data and the machine learning methods used. METHODS A systematic search was performed across 6 databases, using keywords related to mental health disorders, algorithms, and social media. In total, 2759 records were screened, of which 164 (5.94%) papers were analyzed. Information about methodologies for data acquisition, preprocessing, model creation, and validation was collected, as well as information about replicability and ethical considerations. RESULTS The 164 studies reviewed used 119 primary data sets. There were an additional 8 data sets identified that were not described in enough detail to include, and 6.1% (10/164) of the papers did not describe their data sets at all. Of these 119 data sets, only 16 (13.4%) had access to ground truth data (ie, known characteristics) about the mental health disorders of social media users. The other 86.6% (103/119) of data sets collected data by searching keywords or phrases, which may not be representative of patterns of Twitter use for those with mental health disorders. The annotation of mental health disorders for classification labels was variable, and 57.1% (68/119) of the data sets had no ground truth or clinical input on this annotation. 
Despite being a common mental health disorder, anxiety received little attention. CONCLUSIONS The sharing of high-quality ground truth data sets is crucial for the development of trustworthy algorithms that have clinical and research utility. Further collaboration across disciplines and contexts is encouraged to better understand what types of predictions will be useful in supporting the management and identification of mental health disorders. A series of recommendations for researchers in this field and for the wider research community are made, with the aim of enhancing the quality and utility of future outputs.
Affiliation(s)
- Nina H Di Cara
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
- Valerio Maggio
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Oliver S P Davis
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
  - The Alan Turing Institute, London, United Kingdom
- Claire M A Haworth
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom
  - The Alan Turing Institute, London, United Kingdom
4
Liu Y, Yin Z, Ni C, Yan C, Wan Z, Malin B. Examining Rural and Urban Sentiment Difference in COVID-19-Related Topics on Twitter: Word Embedding-Based Retrospective Study. J Med Internet Res 2023; 25:e42985. [PMID: 36790847 PMCID: PMC9937112 DOI: 10.2196/42985] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/26/2022] [Revised: 01/12/2023] [Accepted: 01/27/2023] [Indexed: 02/16/2023]
Abstract
BACKGROUND By the end of 2022, more than 100 million people were infected with COVID-19 in the United States, and the cumulative death rate in rural areas (383.5/100,000) was much higher than in urban areas (280.1/100,000). As the pandemic spread, people used social media platforms to express their opinions and concerns about COVID-19-related topics. OBJECTIVE This study aimed to (1) identify the primary COVID-19-related topics in the contiguous United States communicated over Twitter and (2) compare the sentiments urban and rural users expressed about these topics. METHODS We collected tweets containing geolocation data from May 2020 to January 2022 in the contiguous United States. We relied on the tweets' geolocations to determine if their authors were in an urban or rural setting. We trained multiple word2vec models with several corpora of tweets based on geospatial and timing information. Using a word2vec model built on all tweets, we identified hashtags relevant to COVID-19 and performed hashtag clustering to obtain related topics. We then ran an inference analysis for urban and rural sentiments with respect to the topics based on the similarity between topic hashtags and opinion adjectives in the corresponding urban and rural word2vec models. Finally, we analyzed the temporal trend in sentiments using monthly word2vec models. RESULTS We created a corpus of 407 million tweets, 350 million (86%) of which were posted by users in urban areas, while 18 million (4.4%) were posted by users in rural areas. There were 2666 hashtags related to COVID-19, which clustered into 20 topics. Rural users expressed stronger negative sentiments than urban users about COVID-19 prevention strategies and vaccination (P<.001). Moreover, there was a clear political divide in the perception of politicians by urban and rural users; these users communicated stronger negative sentiments about Republican and Democratic politicians, respectively (P<.001). 
Regarding misinformation and conspiracy theories, urban users exhibited stronger negative sentiments about the "covidiots" and "China virus" topics, while rural users exhibited stronger negative sentiments about the "Dr. Fauci" and "plandemic" topics. Finally, we observed that urban users' sentiments about the economy appeared to transition from negative to positive in late 2021, which was in line with the US economic recovery. CONCLUSIONS This study demonstrates there is a statistically significant difference in the sentiments of urban and rural Twitter users regarding a wide range of COVID-19-related topics. This suggests that social media can be relied upon to monitor public sentiment during pandemics in disparate types of regions. This may assist in the geographically targeted deployment of epidemic prevention and management efforts.
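The similarity-based sentiment inference described in this abstract can be pictured with a toy example. The sketch below scores a topic as the mean cosine similarity between its hashtag vectors and a set of positive adjectives, minus the mean similarity to a set of negative adjectives. This scoring rule, the function names, the adjective lists and the two-dimensional vectors are all illustrative assumptions; the abstract does not give the exact formula, and real word2vec embeddings have far more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_sentiment(vectors, topic_hashtags, positive_adjs, negative_adjs):
    """Assumed score: mean similarity to positive minus negative adjectives."""

    def mean_similarity(adjectives):
        sims = [
            cosine(vectors[tag], vectors[adj])
            for tag in topic_hashtags
            for adj in adjectives
        ]
        return sum(sims) / len(sims)

    return mean_similarity(positive_adjs) - mean_similarity(negative_adjs)


# Toy embeddings standing in for separately trained urban and rural models.
urban_vectors = {"#vaccine": [1.0, 0.0], "good": [1.0, 0.0], "bad": [0.0, 1.0]}
rural_vectors = {"#vaccine": [0.0, 1.0], "good": [1.0, 0.0], "bad": [0.0, 1.0]}

urban_score = topic_sentiment(urban_vectors, ["#vaccine"], ["good"], ["bad"])
rural_score = topic_sentiment(rural_vectors, ["#vaccine"], ["good"], ["bad"])
print(urban_score, rural_score)
```

Because each group gets its own embedding model, the same hashtag can sit near different adjectives in each, which is what makes a between-group comparison possible.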
Affiliation(s)
- Yongtai Liu
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
- Zhijun Yin
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Congning Ni
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
- Chao Yan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Zhiyu Wan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Bradley Malin
  - Department of Computer Science, Vanderbilt University, Nashville, TN, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States
5
Sabharwal R, Miah SJ, Fosso Wamba S. Extending artificial intelligence research in the clinical domain: a theoretical perspective. ANNALS OF OPERATIONS RESEARCH 2022:1-32. [PMID: 36407943 PMCID: PMC9641309 DOI: 10.1007/s10479-022-05035-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/17/2022] [Indexed: 06/16/2023]
Abstract
Academic research on the use of artificial intelligence (AI) has proliferated over the past few years. While AI and its subsets are continuously evolving in the fields of marketing, social media and finance, its application in the daily practice of clinical care is insufficiently explored. In this systematic review, we aim to map the application areas of clinical care in which machine learning is used to improve patient care. By designing a specific smart literature review approach, we give new insight into the existing literature on AI technologies in the clinical domain. Our review approach focuses on strategies, algorithms, applications, results, qualities, and implications, using Latent Dirichlet Allocation topic modeling. A total of 305 unique articles were reviewed, of which 115 were selected using Latent Dirichlet Allocation topic modeling as meeting our inclusion criteria. The primary result of this approach is a proposition for the future research direction, abilities, and influence of AI technologies, together with a display of the areas of disease management in clinics. This research concludes with implications for disease management, limitations, and directions for future research.
Affiliation(s)
- Renu Sabharwal
  - Newcastle Business School, The University of Newcastle, Callaghan, NSW Australia
- Shah J. Miah
  - Newcastle Business School, The University of Newcastle, Callaghan, NSW Australia
6
Cao XJ, Liu XQ. Artificial intelligence-assisted psychosis risk screening in adolescents: Practices and challenges. World J Psychiatry 2022; 12:1287-1297. [PMID: 36389087 PMCID: PMC9641379 DOI: 10.5498/wjp.v12.i10.1287] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 07/11/2022] [Revised: 08/09/2022] [Accepted: 09/22/2022] [Indexed: 02/05/2023]
Abstract
Artificial intelligence-based technologies are gradually being applied to psychiatric research and practice. This paper reviews the primary literature concerning artificial intelligence-assisted psychosis risk screening in adolescents. In terms of the practice of psychosis risk screening, the application of two artificial intelligence-assisted screening methods, chatbot and large-scale social media data analysis, is summarized in detail. Regarding the challenges of psychiatric risk screening, ethical issues constitute the first challenge of psychiatric risk screening through artificial intelligence, which must comply with the four biomedical ethical principles of respect for autonomy, nonmaleficence, beneficence and impartiality such that the development of artificial intelligence can meet the moral and ethical requirements of human beings. By reviewing the pertinent literature concerning current artificial intelligence-assisted adolescent psychosis risk screens, we propose that, assuming they meet ethical requirements, there are three directions worth considering in the future development of artificial intelligence-assisted psychosis risk screening in adolescents: nonperceptual real-time artificial intelligence-assisted screening, further reducing the cost of artificial intelligence-assisted screening, and improving the ease of use of artificial intelligence-assisted screening techniques and tools.
Affiliation(s)
- Xiao-Jie Cao
  - Graduate School of Education, Peking University, Beijing 100871, China
- Xin-Qiao Liu
  - School of Education, Tianjin University, Tianjin 300350, China
7
Gladwin TE, Markwell N, Panno A. Do Semantic Vectors Contain Traces of Biophilic Connections Between Nature and Mental Health? ECOPSYCHOLOGY 2022. [DOI: 10.1089/eco.2022.0036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Affiliation(s)
- Angelo Panno
  - Department of Human Science, The European University of Rome, Rome, Italy
8
Kaswa R, Nair A, Murphy S, Von Pressentin KB. Artificial intelligence: A strategic opportunity for enhancing primary care in South Africa. S Afr Fam Pract (2004) 2022; 64:e1-e2. [PMID: 36073098 PMCID: PMC9558344 DOI: 10.4102/safp.v64i1.5596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 06/22/2022] [Indexed: 11/13/2022]
Affiliation(s)
- Ramprakash Kaswa
  - Department of Family Medicine and Rural Health, Walter Sisulu University, Mthatha, South Africa
  - Mthatha General Hospital, Mthatha
9
Li N, Li RYM, Yao Q, Song L, Deeprasert J. Housing safety and health academic and public opinion mining from 1945 to 2021: PRISMA, cluster analysis, and natural language processing approaches. Front Public Health 2022; 10:902576. [PMID: 36117599 PMCID: PMC9472747 DOI: 10.3389/fpubh.2022.902576] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/23/2022] [Accepted: 07/25/2022] [Indexed: 01/22/2023]
Abstract
Housing safety and health problems threaten owners' and occupiers' safety and health. Nevertheless, to the best of our knowledge, there is no systematic review on this topic. This study compared academic and public opinions on housing safety and health and reviewed 982 research articles and 3,173 author works on housing safety and health published in the Web of Science Core Collection. PRISMA was used to filter the data, and natural language processing (NLP) was used to analyze the emotions of the abstracts. Only 16 housing safety and health articles existed worldwide before 1998, but the number increased afterward. U.S. scholars published the most research articles (30.76%). All of the top 10 most productive countries were developed countries, except China, which ranked second (16.01%). Only 25.9% of institutions have inter-institutional cooperation, and collaborators from the same institution produce most of the work. This study found that most abstracts were positive (n = 521), but abstracts with negative emotions attracted more citations. Despite many industries moving toward AI, housing safety and health research is an exception, judging by published articles and Tweets. This study also reviewed 8,257 Tweets to compare the focus of the public with that of academia. There were substantially more housing/residential safety Tweets (n = 8198) than housing health Tweets (n = 59), which is the opposite of academic research. Most Tweets about housing/residential safety were from the United Kingdom or Canada, while most about housing health hazards were from India. The main concerns about housing safety on Twitter include finance, people, and threats to housing safety. By contrast, people were mainly concerned about the costs of housing health issues, COVID, and air quality. In addition, most housing safety Tweets were neutral, but positive sentiment dominated residential safety and health Tweets.
Affiliation(s)
- Na Li
  - Rattanakosin International College of Creative Entrepreneurship, Rajamangala University of Technology Rattanakosin, Bangkok, Thailand
  - College of Computer Science and Information Engineering, Qilu Institute of Technology, Jinan, China
- Rita Yi Man Li
  - Sustainable Real Estate Research Center, Department of Economics and Finance, Hong Kong Shue Yan University, Hong Kong, Hong Kong SAR, China
- Qi Yao
  - School of Literature and Journalism, Chongqing Technology and Business University, Chongqing, China
- Lingxi Song
  - Chakrabongse Bhuvanarth International Institute for Interdisciplinary Studies, Rajamangala University of Technology Tawan-Ok, Bangkok, Thailand
- Jirawan Deeprasert
  - Rattanakosin International College of Creative Entrepreneurship, Rajamangala University of Technology Rattanakosin, Bangkok, Thailand
10
Metzler H, Baginski H, Niederkrotenthaler T, Garcia D. Detecting Potentially Harmful and Protective Suicide-Related Content on Twitter: Machine Learning Approach. J Med Internet Res 2022; 24:e34705. [PMID: 35976193 PMCID: PMC9434391 DOI: 10.2196/34705] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/11/2021] [Revised: 06/02/2022] [Accepted: 06/08/2022] [Indexed: 01/11/2023]
Abstract
Background Research has repeatedly shown that exposure to suicide-related news media content is associated with suicide rates, with some content characteristics likely having harmful and others potentially protective effects. Although good evidence exists for a few selected characteristics, systematic and large-scale investigations are lacking. Moreover, the growing importance of social media, particularly among young adults, calls for studies on the effects of the content posted on these platforms. Objective This study applies natural language processing and machine learning methods to classify large quantities of social media data according to characteristics identified as potentially harmful or beneficial in media effects research on suicide and prevention. Methods We manually labeled 3202 English tweets using a novel annotation scheme that classifies suicide-related tweets into 12 categories. Based on these categories, we trained a benchmark of machine learning models for a multiclass and a binary classification task. As models, we included a majority classifier, an approach based on word frequency (term frequency-inverse document frequency with a linear support vector machine) and 2 state-of-the-art deep learning models (Bidirectional Encoder Representations from Transformers [BERT] and XLNet). The first task classified posts into 6 main content categories, which are particularly relevant for suicide prevention based on previous evidence. These included personal stories of either suicidal ideation and attempts or coping and recovery, calls for action intending to spread either problem awareness or prevention-related information, reporting of suicide cases, and other tweets irrelevant to these 5 categories. The second classification task was binary and separated posts in the 11 categories referring to actual suicide from posts in the off-topic category, which use suicide-related terms in another meaning or context. 
Results In both tasks, the performance of the 2 deep learning models was very similar and better than that of the majority or the word frequency classifier. BERT and XLNet reached accuracy scores above 73% on average across the 6 main categories in the test set and F1-scores between 0.69 and 0.85 for all but the suicidal ideation and attempts category (F1=0.55). In the binary classification task, they correctly labeled around 88% of the tweets as about suicide versus off-topic, with BERT achieving F1-scores of 0.93 and 0.74, respectively. These classification performances were similar to human performance in most cases and were comparable with state-of-the-art models on similar tasks. Conclusions The achieved performance scores highlight machine learning as a useful tool for media effects research on suicide. The clear advantage of BERT and XLNet suggests that there is crucial information about meaning in the context of words beyond mere word frequencies in tweets about suicide. By making data labeling more efficient, this work has enabled large-scale investigations on harmful and protective associations of social media content with suicide rates and help-seeking behavior.
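The abstract reports both an overall accuracy (around 88%) and separate F1-scores per class (0.93 and 0.74) for the binary task. The gap between those figures is easier to see with a small worked computation. The sketch below derives per-class precision, recall and F1 from paired label lists; the function name, the label strings and the four toy predictions are hypothetical and are not taken from the study's data.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            # F1 is the harmonic mean of precision and recall.
            scores[label] = 2 * precision * recall / (precision + recall)
        else:
            scores[label] = 0.0
    return scores


# Toy labels: three tweets about suicide, one off-topic; one miss.
toy_true = ["about_suicide", "about_suicide", "about_suicide", "off_topic"]
toy_pred = ["about_suicide", "about_suicide", "off_topic", "off_topic"]

toy_scores = f1_per_class(toy_true, toy_pred)
toy_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print(toy_scores, toy_accuracy)
```

In this toy case accuracy is 0.75 while the two classes score 0.8 and about 0.67, which shows how a less frequent class can have a noticeably lower F1 than the headline accuracy suggests.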
Affiliation(s)
- Hannah Metzler
  - Section for the Science of Complex Systems, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
  - Unit Suicide Research and Mental Health Promotion, Center for Public Health, Medical University of Vienna, Vienna, Austria
  - Complexity Science Hub Vienna, Vienna, Austria
  - Computational Social Science Lab, Institute of Interactive Systems and Data Science, Graz University of Technology, Graz, Austria
  - Institute for Globally Distributed Open Research and Education, Vienna, Austria
- Hubert Baginski
  - Complexity Science Hub Vienna, Vienna, Austria
  - Institute of Information Systems Engineering, Vienna University of Technology, Vienna, Austria
- Thomas Niederkrotenthaler
  - Unit Suicide Research and Mental Health Promotion, Center for Public Health, Medical University of Vienna, Vienna, Austria
- David Garcia
  - Section for the Science of Complex Systems, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
  - Complexity Science Hub Vienna, Vienna, Austria
  - Computational Social Science Lab, Institute of Interactive Systems and Data Science, Graz University of Technology, Graz, Austria
11
|
Machine learning-based identification of craniosynostosis in newborns. MACHINE LEARNING WITH APPLICATIONS 2022. [DOI: 10.1016/j.mlwa.2022.100292] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]
12
Walsh J, Dwumfour C, Cave J, Griffiths F. Spontaneously generated online patient experience data - how and why is it being used in health research: an umbrella scoping review. BMC Med Res Methodol 2022; 22:139. [PMID: 35562661 PMCID: PMC9106384 DOI: 10.1186/s12874-022-01610-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2021] [Accepted: 04/13/2022] [Indexed: 11/10/2022]
Abstract
PURPOSE Social media has led to fundamental changes in the way that people look for and share health-related information. There is increasing interest in using this spontaneously generated online patient experience (SGOPE) data as a data source for health research. The aim was to summarise the state of the art regarding how and why SGOPE data has been used in health research. We determined the sites and platforms used as data sources, the purposes of the studies, the tools and methods being used, and any identified research gaps. METHODS A scoping umbrella review was conducted looking at review papers from 2015 to Jan 2021 that studied the use of SGOPE data for health research. Using keyword searches we identified 1759 papers, from which we included 58 relevant studies in our review. RESULTS Data was used from many individual general or health-specific platforms, although Twitter was the most widely used data source. The most frequent purposes were surveillance based: tracking infectious disease, adverse event identification and mental health triaging. Despite the developments in machine learning, the reviews included many small qualitative studies. Most NLP used supervised methods for sentiment analysis and classification. The field is at a very early stage and its methods need development. Methods are often not explained. There are disciplinary differences, with some work focused on accuracy improvements and other work on application. There is little evidence of any work that either compares the results of both methods on the same data set or brings the ideas together. CONCLUSION Tools, methods, and techniques are still at an early stage of development, but strong consensus exists that this data source will become very important to patient-centred health research.
Affiliation(s)
- Julia Walsh
- Warwick Medical School, University of Warwick, Coventry, UK
- Jonathan Cave
- Department of Economics, University of Warwick, Coventry, UK
- Frances Griffiths
- Warwick Medical School, University of Warwick, Coventry, UK; Centre for Health Policy, University of the Witwatersrand, Johannesburg, South Africa
13
Shaikh SG, Suresh Kumar B, Narang G. Recommender system for health care analysis using machine learning technique: a review. THEORETICAL ISSUES IN ERGONOMICS SCIENCE 2022. [DOI: 10.1080/1463922x.2022.2061078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Salim G. Shaikh
- Amity School of Engineering and Technology, Amity University Jaipur, Jaipur, India
14
Hao F, Zheng K. Online Disease Identification and Diagnosis and Treatment Based on Machine Learning Technology. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:6736249. [PMID: 35449857 PMCID: PMC9018189 DOI: 10.1155/2022/6736249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 03/02/2022] [Accepted: 03/12/2022] [Indexed: 11/18/2022]
Abstract
The article uses machine learning algorithms to extract disease symptom keyword vectors. At the same time, we used deep learning technology to design a disease symptom classification model. We apply this model to an online disease consultation recommendation system. The system integrates machine learning algorithms and knowledge graph technology to help patients conduct online consultations. The system analyses the misclassification data of different departments through high-frequency word analysis. The study found that the accuracy rate of our machine learning algorithm model to identify entities in electronic medical records reached 96.29%. This type of model can effectively screen out the most important pathogenic features.
Affiliation(s)
- Feng Hao
- Jilin Medical University, Jilin 132013, China
- Kai Zheng
- Jilin Medical University, Jilin 132013, China
15
Cheng K, Yin Z. "Please Advise": Understanding the Needs of Informal Caregivers of People with Alzheimer's Disease and Related Dementia from Online Communities Through a Structured Topic Modeling Approach. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022; 2022:149-158. [PMID: 35854737 PMCID: PMC9285182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Abstract
Informal or family caregivers of patients with Alzheimer's disease or related dementia (ADRD), also known as the "invisible second patients", are often reported to experience emotional and behavioral hardships. In recent years, the rapid development of online communities has given these caregivers a new opportunity to seek information and emotional support. Compared with offline social support services, which have been constrained during the COVID-19 pandemic, online support allows caregivers to reach many peers in a convenient manner. This research aimed to examine the issues faced by ADRD caregivers by performing structural topic modeling on posts from two online communities. Results revealed that the top concerns of the caregivers include getting along with Alzheimer's patients, family issues, patients' internal medical issues, stages of the disease, care facilities, etc. The results may have further implications for the future implementation of psychological and social interventions in ADRD family care.
Affiliation(s)
- Kerou Cheng
- Department of Computer Science, Vanderbilt University, Nashville, TN USA
- Zhijun Yin
- Department of Computer Science, Vanderbilt University, Nashville, TN USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN USA
16
Rahim AIA, Ibrahim MI, Chua SL, Musa KI. Hospital Facebook Reviews Analysis Using a Machine Learning Sentiment Analyzer and Quality Classifier. Healthcare (Basel) 2021; 9:1679. [PMID: 34946405 PMCID: PMC8701188 DOI: 10.3390/healthcare9121679] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/03/2021] [Revised: 11/30/2021] [Accepted: 12/02/2021] [Indexed: 02/05/2023]
Abstract
While experts have recognised the significance and necessity of social media integration in healthcare, no systematic method has been devised in Malaysia or Southeast Asia to include social media input into the hospital quality improvement process. The goal of this work is to explain how to develop a machine learning system for classifying Facebook reviews of public hospitals in Malaysia by using service quality (SERVQUAL) dimensions and sentiment analysis. We developed a Machine Learning Quality Classifier (MLQC) based on the SERVQUAL model and a Machine Learning Sentiment Analyzer (MLSA) by manually annotating multiple batches of randomly chosen reviews. Logistic regression (LR), naive Bayes (NB), support vector machine (SVM), and other methods were used to train the classifiers. The performance of each classifier was tested using 5-fold cross validation. For topic classification, the average F1-score was between 0.687 and 0.757 for all models. In a 5-fold cross validation of each SERVQUAL dimension and in sentiment analysis, SVM consistently outperformed other methods. The study demonstrates how to use supervised learning to automatically identify SERVQUAL domains and sentiments from patient experiences on a hospital's Facebook page. Malaysian healthcare providers can gather and assess data on patient care via the use of this content analysis technology to improve hospital quality of care.
Affiliation(s)
- Afiq Izzudin A. Rahim
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Mohd Ismail Ibrahim
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Sook-Ling Chua
- Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, Cyberjaya 63100, Selangor, Malaysia
- Kamarul Imran Musa
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
17
Implementation of Real-Time Medical and Health Data Mining System Based on Machine Learning. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:7011205. [PMID: 34840702 PMCID: PMC8626197 DOI: 10.1155/2021/7011205] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/15/2021] [Revised: 11/05/2021] [Accepted: 11/06/2021] [Indexed: 11/18/2022]
Abstract
This article analyzes the application process of data mining technology in medical and health management systems and uses machine learning algorithms to design a medical and health data mining system. The system collects patients' physical health data using wireless sensing technology and analyzes the data with machine learning algorithms. The collected health data are uploaded to the system for cluster analysis. Finally, the method is applied to mining patients' diagnosis data to demonstrate, through examples, the effectiveness of the classification method in the medical field.
18
Medical and Health Data Classification Method Based on Machine Learning. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:2722854. [PMID: 34824763 PMCID: PMC8610658 DOI: 10.1155/2021/2722854] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/16/2021] [Revised: 11/02/2021] [Accepted: 11/05/2021] [Indexed: 01/30/2023]
Abstract
The information defined in medical health data is researched based on machine learning-related algorithms. Also, this paper used random forest and other related algorithms to perform health data training and fitting. Research shows that the algorithm proposed in the paper can improve the progress of health data classification. The algorithm can provide technical support for the improvement of medical data classification.
19
Aggregating Twitter Text through Generalized Linear Regression Models for Tweet Popularity Prediction and Automatic Topic Classification. Eur J Investig Health Psychol Educ 2021; 11:1537-1554. [PMID: 34940387 PMCID: PMC8700529 DOI: 10.3390/ejihpe11040109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Revised: 11/10/2021] [Accepted: 11/23/2021] [Indexed: 11/16/2022]
Abstract
Social media platforms have become accessible resources for health data analysis. However, the advanced computational techniques involved in big data text mining and analysis are challenging for public health data analysts to apply. This study proposes and explores the feasibility of a novel yet straightforward method by regressing the outcome of interest on the aggregated influence scores for association and/or classification analyses based on generalized linear models. The method reduces the document term matrix by transforming text data into a continuous summary score, thereby reducing the data dimension substantially and easing the data sparsity issue of the term matrix. To illustrate the proposed method in detailed steps, we used three Twitter datasets on various topics: autism spectrum disorder, influenza, and violence against women. We found that our results were generally consistent with the critical factors associated with the specific public health topic in the existing literature. The proposed method could also classify tweets into different topic groups appropriately with consistent performance compared with existing text mining methods for automatic classification based on tweet contents.
20
Gong J, Sihag V, Kong Q, Zhao L. Visualizing Knowledge Evolution Trends and Research Hotspots of Personal Health Data Research: Bibliometric Analysis. JMIR Med Inform 2021; 9:e31142. [PMID: 34723823 PMCID: PMC8593818 DOI: 10.2196/31142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2021] [Revised: 08/17/2021] [Accepted: 09/17/2021] [Indexed: 11/13/2022]
Abstract
Background The recent surge in clinical and nonclinical health-related data has been accompanied by a concomitant increase in personal health data (PHD) research across multiple disciplines such as medicine, computer science, and management. There is now a need to synthesize the dynamic knowledge of PHD in various disciplines to spot potential research hotspots. Objective The aim of this study was to reveal the knowledge evolutionary trends in PHD and detect potential research hotspots using bibliometric analysis. Methods We collected 8281 articles published between 2009 and 2018 from the Web of Science database. The knowledge evolution analysis (KEA) framework was used to analyze the evolution of PHD research. The KEA framework is a bibliometric approach that is based on 3 knowledge networks: reference co-citation, keyword co-occurrence, and discipline co-occurrence. Results The findings show that the focus of PHD research has evolved from medicine centric to technology centric to human centric since 2009. The most active PHD knowledge cluster is developing knowledge resources and allocating scarce resources. The field of computer science, especially the topic of artificial intelligence (AI), has been the focal point of recent empirical studies on PHD. Topics related to psychology and human factors (eg, attitude, satisfaction, education) are also receiving more attention. Conclusions Our analysis shows that PHD research has the potential to provide value-based health care in the future. All stakeholders should be educated about AI technology to promote value generation through PHD. Moreover, technology developers and health care institutions should consider human factors to facilitate the effective adoption of PHD-related technology. 
These findings indicate opportunities for interdisciplinary cooperation in several PHD research areas: (1) AI applications for PHD; (2) regulatory issues and governance of PHD; (3) education of all stakeholders about AI technology; and (4) value-based health care including “allocative value,” “technology value,” and “personalized value.”
Affiliation(s)
- Jianxia Gong
- School of Economics and Management, Southeast University, Nanjing, China
- Vikrant Sihag
- Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, Eindhoven, Netherlands
- Qingxia Kong
- Department of Technology and Operations Management, Erasmus University Rotterdam, Rotterdam, Netherlands
- Lindu Zhao
- School of Economics and Management, Southeast University, Nanjing, China
21
Walsh J, Cave J, Griffiths F. Spontaneously Generated Online Patient Experience of Modafinil: A Qualitative and NLP Analysis. Front Digit Health 2021; 3:598431. [PMID: 34713085 PMCID: PMC8521895 DOI: 10.3389/fdgth.2021.598431] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/24/2020] [Accepted: 01/27/2021] [Indexed: 11/16/2022]
Abstract
Objective: To compare the findings from a qualitative and a natural language processing (NLP) based analysis of online patient experience posts on patient experience of the effectiveness and impact of the drug Modafinil. Methods: Posts (n = 260) from 5 online social media platforms where posts were publicly available formed the dataset/corpus. Three platforms asked posters to give a numerical rating of Modafinil. Thematic analysis: data was coded and themes generated. Data were categorized into PreModafinil, Acquisition, Dosage, and PostModafinil and compared to identify each poster's own view of whether taking Modafinil was linked to an identifiable outcome. We classified this as positive, mixed, negative, or neutral and compared this with numerical ratings. NLP: Corpus text was speech tagged and keywords and key terms extracted. We identified the following entities: drug names, condition names, symptoms, actions, and side-effects. We searched for simple relationships, collocations, and co-occurrences of entities. To identify causal text, we split the corpus into PreModafinil and PostModafinil and used n-gram analysis. To evaluate sentiment, we calculated the polarity of each post between −1 (negative) and +1 (positive). NLP results were mapped to qualitative results. Results: Posters had used Modafinil for 33 different primary conditions. Eight themes were identified: the reason for taking (condition or symptom), impact of symptoms, acquisition, dosage, side effects, other interventions tried or compared to, effectiveness of Modafinil, and quality of life outcomes. Posters reported perceived effectiveness as follows: 68% positive, 12% mixed, 18% negative. Our classification was consistent with poster ratings. Of the most frequent 100 keywords/keyterms identified by term extraction 88/100 keywords and 84/100 keyterms mapped directly to the eight themes. Seven keyterms indicated negation and temporal states. 
Sentiment was as follows: 72% positive, 4% neutral, and 24% negative. Matching of sentiment between the qualitative and NLP methods was accurate in 64.2% of posts. If we allow for a one-category difference, matching was accurate in 85% of posts. Conclusions: User-generated patient experience is a rich resource for evaluating real-world effectiveness, understanding patient perspectives, and identifying research gaps. Both methods successfully identified the entities and topics contained in the posts. In contrast to current evidence, posters with a wide range of other conditions found Modafinil effective. Perceived causality and effectiveness were identified by both methods, demonstrating the potential to augment existing knowledge.
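The exact and one-category-tolerant matching rates reported above can be illustrated with a small Python sketch. It assumes an ordinal scale (negative < neutral < positive), places "mixed" at the same rank as neutral, and converts a polarity score to a label with a hypothetical threshold `eps`. The abstract does not say how the authors made any of these choices, and the sample inputs are invented.

```python
# Assumed ordinal ranks; "mixed" sharing a rank with "neutral" is an illustrative choice.
RANK = {"negative": 0, "neutral": 1, "mixed": 1, "positive": 2}


def polarity_to_label(polarity: float, eps: float = 0.05) -> str:
    """Bucket a polarity score in [-1, +1] into a sentiment label (eps is hypothetical)."""
    if polarity > eps:
        return "positive"
    if polarity < -eps:
        return "negative"
    return "neutral"


def agreement(manual_labels, polarities, eps: float = 0.05):
    """Return (exact match rate, match rate allowing a one-category difference)."""
    if len(manual_labels) != len(polarities):
        raise ValueError("inputs must have the same length")
    if not manual_labels:
        raise ValueError("inputs must not be empty")
    exact = 0
    within_one = 0
    for label, polarity in zip(manual_labels, polarities):
        diff = abs(RANK[label] - RANK[polarity_to_label(polarity, eps)])
        if diff == 0:
            exact += 1
        if diff <= 1:
            within_one += 1
    n = len(manual_labels)
    return exact / n, within_one / n


# Invented example: four posts with a manual label and an NLP polarity score each.
manual = ["positive", "negative", "mixed", "positive"]
scores = [0.6, 0.0, 0.3, -0.5]
exact_rate, tolerant_rate = agreement(manual, scores)
print(exact_rate, tolerant_rate)
```

Relaxing the criterion can only raise the rate, since every exact match also counts as a match within one category.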
Affiliation(s)
- Julia Walsh
- Warwick Medical School, University of Warwick, Coventry, United Kingdom
- Jonathan Cave
- Department of Economics, University of Warwick, Coventry, United Kingdom
- Frances Griffiths
- Warwick Medical School, University of Warwick, Coventry, United Kingdom
22
Safa R, Bayat P, Moghtader L. Automatic detection of depression symptoms in twitter using multimodal analysis. THE JOURNAL OF SUPERCOMPUTING 2021; 78:4709-4744. [PMID: 34518741 PMCID: PMC8426595 DOI: 10.1007/s11227-021-04040-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Accepted: 08/19/2021] [Indexed: 05/03/2023]
Abstract
Depression is the most prevalent mental disorder that can lead to suicide. Due to the tendency of people to share their thoughts on social platforms, social data contain valuable information that can be used to identify users' psychological states. In this paper, we provide an automated approach to collect and evaluate tweets based on self-reported statements and present a novel multimodal framework to predict depression symptoms from user profiles. We used n-gram language models, LIWC dictionaries, automatic image tagging, and bag-of-visual-words. We used correlation-based feature selection and nine different classifiers with standard evaluation metrics to assess the effectiveness of the method. Based on the analysis, the tweets and bio-text alone showed 91% and 83% accuracy in predicting depressive symptoms, respectively, which seems to be an acceptable result. We also believe performance improvements can be achieved by limiting the user domain or by including clinical information.
Affiliation(s)
- Ramin Safa
- Department of Computer Engineering, Rasht Branch, Islamic Azad University, Rasht, Iran
- Peyman Bayat
- Department of Computer Engineering, Rasht Branch, Islamic Azad University, Rasht, Iran
- Leila Moghtader
- Department of Psychology, Rasht Branch, Islamic Azad University, Rasht, Iran
23
He L, Yin T, Hu Z, Chen Y, Hanauer DA, Zheng K. Developing a standardized protocol for computational sentiment analysis research using health-related social media data. J Am Med Inform Assoc 2021; 28:1125-1134. [PMID: 33355353 DOI: 10.1093/jamia/ocaa298] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/01/2020] [Accepted: 12/04/2020] [Indexed: 12/18/2022]
Abstract
OBJECTIVE Sentiment analysis is a popular tool for analyzing health-related social media content. However, existing studies exhibit numerous methodological issues and inconsistencies with respect to research design and results reporting, which could lead to biased data, imprecise or incorrect conclusions, or incomparable results across studies. This article reports a systematic analysis of the literature with respect to such issues. The objective was to develop a standardized protocol for improving the research validity and comparability of results in future relevant studies. MATERIALS AND METHODS We developed the Protocol of Analysis of senTiment in Health (PATH) based on a systematic review that analyzed common research design choices and how such choices were made, or reported, among eligible studies published 2010-2019. RESULTS Of 409 articles screened, 89 met the inclusion criteria. A total of 16 distinctive research design choices were identified, 9 of which have significant methodological or reporting inconsistencies among the articles reviewed, ranging from how relevance of study data was determined to how the sentiment analysis tool selected was validated. Based on this result, we developed the PATH protocol that encompasses all these distinctive design choices and highlights the ones for which careful consideration and detailed reporting are particularly warranted. CONCLUSIONS A substantial degree of methodological and reporting inconsistencies exist in the extant literature that applied sentiment analysis to analyzing health-related social media data. The PATH protocol developed through this research may contribute to mitigating such issues in future relevant studies.
Affiliation(s)
- Lu He
- Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, USA
- Tingjue Yin
- Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, USA
- Zhaoxian Hu
- Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, USA
- Yunan Chen
- Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, USA
- David A Hanauer
- Department of Learning Health Sciences, School of Medicine, University of Michigan, Ann Arbor, Michigan, USA; Department of Pediatrics, School of Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Kai Zheng
- Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, Irvine, California, USA; Department of Emergency Medicine, School of Medicine, University of California, Irvine, Irvine, California, USA
24
Martinez-Martin N, Greely HT, Cho MK. Ethical Development of Digital Phenotyping Tools for Mental Health Applications: Delphi Study. JMIR Mhealth Uhealth 2021; 9:e27343. [PMID: 34319252 PMCID: PMC8367187 DOI: 10.2196/27343] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/22/2021] [Revised: 05/06/2021] [Accepted: 05/21/2021] [Indexed: 01/15/2023]
Abstract
BACKGROUND Digital phenotyping (also known as personal sensing, intelligent sensing, or body computing) involves the collection of biometric and personal data in situ from digital devices, such as smartphones, wearables, or social media, to measure behavior or other health indicators. The collected data are analyzed to generate moment-by-moment quantification of a person's mental state and potentially predict future mental states. Digital phenotyping projects incorporate data from multiple sources, such as electronic health records, biometric scans, or genetic testing. As digital phenotyping tools can be used to study and predict behavior, they are of increasing interest for a range of consumer, government, and health care applications. In clinical care, digital phenotyping is expected to improve mental health diagnoses and treatment. At the same time, mental health applications of digital phenotyping present significant areas of ethical concern, particularly in terms of privacy and data protection, consent, bias, and accountability. OBJECTIVE This study aims to develop consensus statements regarding key areas of ethical guidance for mental health applications of digital phenotyping in the United States. METHODS We used a modified Delphi technique to identify the emerging ethical challenges posed by digital phenotyping for mental health applications and to formulate guidance for addressing these challenges. Experts in digital phenotyping, data science, mental health, law, and ethics participated as panelists in the study. The panel arrived at consensus recommendations through an iterative process involving interviews and surveys. The panelists focused primarily on clinical applications for digital phenotyping for mental health but also included recommendations regarding transparency and data protection to address potential areas of misuse of digital phenotyping data outside of the health care domain. 
RESULTS The findings of this study showed strong agreement related to these ethical issues in the development of mental health applications of digital phenotyping: privacy, transparency, consent, accountability, and fairness. Consensus regarding the recommendation statements was strongest when the guidance was stated broadly enough to accommodate a range of potential applications. The privacy and data protection issues that the Delphi participants found particularly critical to address related to the perceived inadequacies of current regulations and frameworks for protecting sensitive personal information and the potential for sale and analysis of personal data outside of health systems. CONCLUSIONS The Delphi study found agreement on a number of ethical issues to prioritize in the development of digital phenotyping for mental health applications. The Delphi consensus statements identified general recommendations and principles regarding the ethical application of digital phenotyping to mental health. As digital phenotyping for mental health is implemented in clinical care, there remains a need for empirical research and consultation with relevant stakeholders to further understand and address relevant ethical issues.
Affiliation(s)
- Nicole Martinez-Martin
- Center for Biomedical Ethics, School of Medicine, Stanford University, Stanford, CA, United States
- Mildred K Cho
- Center for Biomedical Ethics, School of Medicine, Stanford University, Stanford, CA, United States
25
Goering S, Klein E, Specker Sullivan L, Wexler A, Agüera y Arcas B, Bi G, Carmena JM, Fins JJ, Friesen P, Gallant J, Huggins JE, Kellmeyer P, Marblestone A, Mitchell C, Parens E, Pham M, Rubel A, Sadato N, Teicher M, Wasserman D, Whittaker M, Wolpaw J, Yuste R. Recommendations for Responsible Development and Application of Neurotechnologies. NEUROETHICS-NETH 2021; 14:365-386. [PMID: 33942016 PMCID: PMC8081770 DOI: 10.1007/s12152-021-09468-6] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 06/11/2020] [Accepted: 04/15/2021] [Indexed: 12/12/2022]
Abstract
Advancements in novel neurotechnologies, such as brain computer interfaces (BCI) and neuromodulatory devices such as deep brain stimulators (DBS), will have profound implications for society and human rights. While these technologies are improving the diagnosis and treatment of mental and neurological diseases, they can also alter individual agency and estrange those using neurotechnologies from their sense of self, challenging basic notions of what it means to be human. As an international coalition of interdisciplinary scholars and practitioners, we examine these challenges and make recommendations to mitigate negative consequences that could arise from the unregulated development or application of novel neurotechnologies. We explore potential ethical challenges in four key areas: identity and agency, privacy, bias, and enhancement. To address them, we propose (1) democratic and inclusive summits to establish globally-coordinated ethical and societal guidelines for neurotechnology development and application, (2) new measures, including "Neurorights," for data privacy, security, and consent to empower neurotechnology users' control over their data, (3) new methods of identifying and preventing bias, and (4) the adoption of public guidelines for safe and equitable distribution of neurotechnological devices.
Affiliation(s)
- Eran Klein
- University of Washington, Seattle, WA USA
- Oregon Health & Science University, Portland, OR USA
- Anna Wexler
- University of Pennsylvania, Philadelphia, PA USA
- Guoqiang Bi
- University of Science and Technology of China, Hefei, China
- CAS Shenzhen Institute of Advanced Technology, Shenzhen, China
- Erik Parens
- The Hastings Center, Philipstown, Garrison, NY USA
- Alan Rubel
- University of Wisconsin-Madison, Madison, WI USA
- Norihiro Sadato
- National Institute for Physiological Sciences, Okazaki, Aichi Japan
- Meredith Whittaker
- Google, Mountain View, CA USA
- AI Now, New York City, NY USA
- New York University, New York City, NY USA
- Jonathan Wolpaw
- National Center for Adaptive Neurotechnologies, Albany, NY USA
26
Borges do Nascimento IJ, Marcolino MS, Abdulazeem HM, Weerasekara I, Azzopardi-Muscat N, Gonçalves MA, Novillo-Ortiz D. Impact of Big Data Analytics on People's Health: Overview of Systematic Reviews and Recommendations for Future Studies. J Med Internet Res 2021; 23:e27275. [PMID: 33847586 PMCID: PMC8080139 DOI: 10.2196/27275] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 01/19/2021] [Revised: 02/19/2021] [Accepted: 03/24/2021] [Indexed: 12/17/2022] Open
Abstract
Background Although the potential of big data analytics for health care is well recognized, evidence is lacking on its effects on public health. Objective The aim of this study was to assess the impact of the use of big data analytics on people’s health based on the health indicators and core priorities in the World Health Organization (WHO) General Programme of Work 2019/2023 and the European Programme of Work (EPW), approved and adopted by its Member States, in addition to SARS-CoV-2–related studies. Furthermore, we sought to identify the most relevant challenges and opportunities of these tools with respect to people’s health. Methods Six databases (MEDLINE, Embase, Cochrane Database of Systematic Reviews via Cochrane Library, Web of Science, Scopus, and Epistemonikos) were searched from the inception date to September 21, 2020. Systematic reviews assessing the effects of big data analytics on health indicators were included. Two authors independently performed screening, selection, data extraction, and quality assessment using the AMSTAR-2 (A Measurement Tool to Assess Systematic Reviews 2) checklist. Results The literature search initially yielded 185 records, 35 of which met the inclusion criteria, involving more than 5,000,000 patients. Most of the included studies used patient data collected from electronic health records, hospital information systems, private patient databases, and imaging datasets, and involved the use of big data analytics for noncommunicable diseases. “Probability of dying from any of cardiovascular, cancer, diabetes or chronic renal disease” and “suicide mortality rate” were the most commonly assessed health indicators and core priorities within the WHO General Programme of Work 2019/2023 and the EPW 2020/2025. 
Big data analytics have shown moderate to high accuracy for the diagnosis and prediction of complications of diabetes mellitus as well as for the diagnosis and classification of mental disorders; prediction of suicide attempts and behaviors; and the diagnosis, treatment, and prediction of important clinical outcomes of several chronic diseases. Confidence in the results was rated as “critically low” for 25 reviews, as “low” for 7 reviews, and as “moderate” for 3 reviews. The most frequently identified challenges were establishment of a well-designed and structured data source, and a secure, transparent, and standardized database for patient data. Conclusions Although the overall quality of included studies was limited, big data analytics has shown moderate to high accuracy for the diagnosis of certain diseases, improvement in managing chronic diseases, and support for prompt and real-time analyses of large sets of varied input data to diagnose and predict disease outcomes. Trial Registration International Prospective Register of Systematic Reviews (PROSPERO) CRD42020214048; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=214048
Affiliation(s)
- Israel Júnior Borges do Nascimento
  - School of Medicine, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Department of Medicine, School of Medicine, Medical College of Wisconsin, Wauwatosa, WI, United States
- Milena Soriano Marcolino
  - Department of Internal Medicine, University Hospital, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - School of Medicine and Telehealth Center, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Ishanka Weerasekara
  - School of Health Sciences, Faculty of Health and Medicine, The University of Newcastle, Callaghan, Australia
  - Department of Physiotherapy, Faculty of Allied Health Sciences, University of Peradeniya, Peradeniya, Sri Lanka
- Natasha Azzopardi-Muscat
  - Division of Country Health Policies and Systems, World Health Organization, Regional Office for Europe, Copenhagen, Denmark
- Marcos André Gonçalves
  - Department of Computer Science, Institute of Exact Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- David Novillo-Ortiz
  - Division of Country Health Policies and Systems, World Health Organization, Regional Office for Europe, Copenhagen, Denmark
27
Understanding current states of machine learning approaches in medical informatics: a systematic literature review. HEALTH AND TECHNOLOGY 2021. [DOI: 10.1007/s12553-021-00538-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/25/2022]
28
Singh N, Singh P, Gupta M. An inclusive survey on machine learning for CRM: a paradigm shift. DECISION 2021. [DOI: 10.1007/s40622-020-00261-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/19/2022]
29
Okechukwu C. The Effectiveness of machine learning in suicide prediction and prevention. MGM JOURNAL OF MEDICAL SCIENCES 2021. [DOI: 10.4103/mgmj.mgmj_82_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022] Open
30
Bourke A, Dixon WG, Roddam A, Lin KJ, Hall GC, Curtis JR, van der Veer SN, Soriano-Gabarró M, Mills JK, Major JM, Verstraeten T, Francis MJ, Bartels DB. Incorporating patient generated health data into pharmacoepidemiological research. Pharmacoepidemiol Drug Saf 2020; 29:1540-1549. [PMID: 33146896 DOI: 10.1002/pds.5169] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/12/2020] [Revised: 09/17/2020] [Accepted: 10/31/2020] [Indexed: 01/18/2023]
Abstract
Epidemiology and pharmacoepidemiology frequently employ Real-World Data (RWD) from healthcare teams to inform research. These data sources usually include signs, symptoms, tests, and treatments, but may lack important information such as the patient's diet, adherence, or quality of life. By harnessing digital tools, a new fount of evidence, Patient (or Citizen/Person) Generated Health Data (PGHD), is becoming more readily available. This review focusses on the advantages and considerations in using PGHD for pharmacoepidemiological research. New and corroborative types of data can be collected directly from patients using digital devices, both passively and actively. Practical issues such as patient engagement, data linking, validation, and analysis are among important considerations in the use of PGHD. In our increasingly patient-centric world, PGHD incorporated into more traditional Real-World Data sources offers innovative opportunities to expand our understanding of the complex factors involved in health and the safety and effectiveness of disease treatments. Pharmacoepidemiologists have a unique role in realizing the potential of PGHD by ensuring that robust methodology, governance, and analytical techniques underpin its use to generate meaningful research results.
Affiliation(s)
- William G Dixon
  - Arthritis Research UK Centre for Epidemiology, The University of Manchester, Manchester, UK
- Kueiyu Joshua Lin
  - Brigham and Women's & Department of Medicine, Boston, Massachusetts, USA
- Jeffrey R Curtis
  - Division of Clinical Immunology & Rheumatology, The University of Birmingham, Birmingham, Alabama, USA
- Sabine N van der Veer
  - Centre for Health Informatics, Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK
- Jacqueline M Major
  - Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland, USA
31
Hatwell J, Gaber MM, Atif Azad RM. Ada-WHIPS: explaining AdaBoost classification with applications in the health sciences. BMC Med Inform Decis Mak 2020; 20:250. [PMID: 33008388 PMCID: PMC7531148 DOI: 10.1186/s12911-020-01201-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/13/2019] [Accepted: 07/23/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Computer Aided Diagnostics (CAD) can support medical practitioners to make critical decisions about their patients' disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning models and high dimensional data sources such as electronic health records, magnetic resonance imaging scans, cardiotocograms, etc. These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years because it addresses the interpretability and trust concerns of critical decision makers, including those in clinical and medical practice. METHODS In this work, we focus on AdaBoost, a black box model that has been widely adopted in the CAD literature. We address the challenge - to explain AdaBoost classification - with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), makes use of AdaBoost's adaptive classifier weights. Using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes of the internal decision trees of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model's decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and of a novel measure, stability, that is better suited to the XAI setting.
RESULTS Experiments on 9 CAD-related data sets showed that Ada-WHIPS explanations consistently generalise better (mean coverage 15%-68%) than the state of the art while remaining competitive for specificity (mean precision 80%-99%). A very small trade-off in specificity is shown to guard against over-fitting, which is a known problem in state-of-the-art methods. CONCLUSIONS The experimental results demonstrate the benefits of using our novel algorithm for explaining CAD AdaBoost classifiers widely found in the literature. Our tightly coupled, AdaBoost-specific approach outperforms model-agnostic explanation methods and should be considered by practitioners looking for an XAI solution for this class of models.
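The two well-known quality measures named in this abstract, precision and coverage, can be illustrated with a short sketch. This is not the Ada-WHIPS algorithm and is not taken from the paper: the rule, the feature names, and the toy records are hypothetical, and the definitions assumed here are the ones commonly used for rule-based explanations (coverage is the share of instances a rule applies to; precision is the share of covered instances whose model prediction matches the class the rule predicts).

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, float]


def rule_quality(
    rule: Callable[[Instance], bool],
    target_class: str,
    instances: List[Instance],
    predictions: List[str],
) -> Tuple[float, float]:
    """Return (precision, coverage) of a single explanation rule.

    Assumed definitions (common usage, not quoted from the paper):
    coverage  = covered instances / all instances
    precision = covered instances predicted as target_class / covered instances
    """
    # Keep the model's prediction for every instance the rule applies to.
    covered = [pred for inst, pred in zip(instances, predictions) if rule(inst)]
    coverage = len(covered) / len(instances) if instances else 0.0
    matches = sum(1 for pred in covered if pred == target_class)
    precision = matches / len(covered) if covered else 0.0
    return precision, coverage


# Hypothetical toy records and black-box model outputs, for illustration only.
instances = [
    {"heart_rate": 160.0, "accelerations": 0.0},
    {"heart_rate": 155.0, "accelerations": 1.0},
    {"heart_rate": 130.0, "accelerations": 3.0},
    {"heart_rate": 158.0, "accelerations": 0.0},
]
predictions = ["abnormal", "normal", "normal", "abnormal"]


def example_rule(inst: Instance) -> bool:
    """Hypothetical rule: 'heart_rate > 150 implies abnormal'."""
    return inst["heart_rate"] > 150.0


precision, coverage = rule_quality(example_rule, "abnormal", instances, predictions)
print(f"precision={precision:.2f} coverage={coverage:.2f}")
```

On this toy data the rule covers three of the four records, and two of those three are predicted "abnormal", which shows the trade-off the abstract describes: a broader rule raises coverage but can lower precision.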
Affiliation(s)
- Julian Hatwell
  - Birmingham City University, Curzon Street, Birmingham, B5 5JU UK
32
Liu Y, Yin Z. Understanding Weight Loss via Online Discussions: Content Analysis of Reddit Posts Using Topic Modeling and Word Clustering Techniques. J Med Internet Res 2020; 22:e13745. [PMID: 32510460 PMCID: PMC7308899 DOI: 10.2196/13745] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/15/2019] [Revised: 12/24/2019] [Accepted: 02/26/2020] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Maintaining a healthy weight can reduce the risk of developing many diseases, including type 2 diabetes, hypertension, and certain types of cancers. Online social media platforms are popular among people seeking social support regarding weight loss and sharing their weight loss experiences, which provides opportunities for learning about weight loss behaviors. OBJECTIVE This study aimed to investigate the extent to which the content posted by users in the r/loseit subreddit, an online community for discussing weight loss, and online interactions were associated with their weight loss in terms of the number of replies and votes that these users received. METHODS All posts that were published before January 2018 in r/loseit were collected. We focused on users who revealed their start weight, current weight, and goal weight and were active in this online community for at least 30 days. A topic modeling technique and a hierarchical clustering algorithm were used to obtain both global topics and local word semantic clusters. Finally, we used a regression model to learn the association between weight loss and topics, word semantic clusters, and online interactions. RESULTS Our data comprised 477,904 posts that were published by 7660 users within a span of 7 years. We identified 25 topics, including food and drinks, calories, exercises, family members and friends, and communication. Our results showed that the start weight (β=.823; P<.001), active days (β=.017; P=.009), and median number of votes (β=.263; P=.02), mentions of exercises (β=.145; P<.001), and nutrition (β=.120; P<.001) were associated with higher weight loss. Users who lost more weight might be motivated by the negative emotions (β=-.098; P<.001) that they experienced before starting the journey of weight loss. In contrast, users who mentioned vacations (β=-.108; P=.005) and payments (β=-.112; P=.001) tended to experience relatively less weight loss. 
Mentions of family members (β=-.031; P=.03) and employment status (β=-.041; P=.03) were associated with less weight loss as well. CONCLUSIONS Our study showed that both online interactions and offline activities were associated with weight loss, suggesting that future interventions based on existing online platforms should focus on both aspects. Our findings suggest that online personal health data can be used to learn about health-related behaviors effectively.
Affiliation(s)
- Yang Liu
  - College of Computer Science and Technology, Changchun Normal University, Changchun, China
- Zhijun Yin
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
  - Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, United States
33
Baclic O, Tunis M, Young K, Doan C, Swerdfeger H, Schonfeld J. Challenges and opportunities for public health made possible by advances in natural language processing. CANADA COMMUNICABLE DISEASE REPORT = RELEVE DES MALADIES TRANSMISSIBLES AU CANADA 2020; 46:161-168. [PMID: 32673380 PMCID: PMC7343054 DOI: 10.14745/ccdr.v46i06a02] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Indexed: 12/13/2022]
Abstract
Natural language processing (NLP) is a subfield of artificial intelligence devoted to understanding and generation of language. The recent advances in NLP technologies are enabling rapid analysis of vast amounts of text, thereby creating opportunities for health research and evidence-informed decision making. The analysis and data extraction from scientific literature, technical reports, health records, social media, surveys, registries and other documents can support core public health functions including the enhancement of existing surveillance systems (e.g. through faster identification of diseases and risk factors/at-risk populations), disease prevention strategies (e.g. through more efficient evaluation of the safety and effectiveness of interventions) and health promotion efforts (e.g. by providing the ability to obtain expert-level answers to any health related question). NLP is emerging as an important tool that can assist public health authorities in decreasing the burden of health inequality/inequity in the population. The purpose of this paper is to provide some notable examples of both the potential applications and challenges of NLP use in public health.
Affiliation(s)
- Oliver Baclic
  - Centre for Immunization and Respiratory Infectious Disease, Public Health Agency of Canada, Ottawa, ON
- Matthew Tunis
  - Centre for Immunization and Respiratory Infectious Disease, Public Health Agency of Canada, Ottawa, ON
- Kelsey Young
  - Centre for Immunization and Respiratory Infectious Disease, Public Health Agency of Canada, Ottawa, ON
- Coraline Doan
  - Data, Partnerships and Innovation Hub, Public Health Agency of Canada, Ottawa, ON
- Howard Swerdfeger
  - Data, Partnerships and Innovation Hub, Public Health Agency of Canada, Ottawa, ON
- Justin Schonfeld
  - National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB
34
Liu Y, Yan C, Yin Z, Wan Z, Xia W, Kantarcioglu M, Vorobeychik Y, Clayton EW, Malin BA. Biomedical Research Cohort Membership Disclosure on Social Media. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2020; 2019:607-616. [PMID: 32308855 PMCID: PMC7153128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
To accelerate medical knowledge discovery, an increasing number of research programs are gathering and sharing data on a large number of participants. Due to the privacy concerns and legal restrictions on data sharing, these programs apply various strategies to mitigate privacy risk. However, the activities of participants and research program sponsors, particularly on social media, might reveal an individual's membership in a study, making it easier to recognize participants' records and uncover the information they have yet to disclose. This behavior can jeopardize the privacy of the participants themselves, the reputation of the projects, sponsors, and the research enterprise. To investigate the dangers of self-disclosure behavior, we gathered and analyzed 4,020 tweets, and uncovered over 100 tweets disclosing the individuals' memberships in over 15 programs. Our investigation showed that self-disclosure on social media can reveal participants' membership in research cohorts, and such activity might lead to the leakage of a person's identity, genomic, and other sensitive health information.
Affiliation(s)
- Chao Yan
  - Vanderbilt University, Nashville, TN
- Zhiyu Wan
  - Vanderbilt University, Nashville, TN
- Weiyi Xia
  - Vanderbilt University, Nashville, TN
- Bradley A Malin
  - Vanderbilt University, Nashville, TN
  - Vanderbilt University Medical Center, Nashville, TN
35
Huang R, Liu N, Nicdao MA, Mikaheal M, Baldacchino T, Albeos A, Petoumenos K, Sud K, Kim J. Emotion sharing in remote patient monitoring of patients with chronic kidney disease. J Am Med Inform Assoc 2020; 27:185-193. [PMID: 31633755 PMCID: PMC7647270 DOI: 10.1093/jamia/ocz183] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/25/2019] [Revised: 09/15/2019] [Accepted: 09/30/2019] [Indexed: 12/22/2022] Open
Abstract
OBJECTIVE To investigate the relationship between emotion sharing and technically troubled dialysis (TTD) in a remote patient monitoring (RPM) setting. MATERIALS AND METHODS A custom software system was developed for home hemodialysis patients to use in an RPM setting, with focus on emoticon sharing and sentiment analysis of patients' text data. We analyzed the outcome of emoticon and sentiment against TTD. Logistic regression was used to assess the relationship between patients' emotions (emoticon and sentiment) and TTD. RESULTS Usage data were collected from January 1, 2015 to June 1, 2018 from 156 patients who actively used the app system, with a total of 31 159 dialysis sessions recorded. Overall, 122 patients (78%) made use of the emoticon feature while 146 patients (94%) wrote at least 1 session note for sentiment analysis. In total, 4087 (13%) sessions were classified as TTD. In the multivariate model, when compared to sessions with self-reported very happy emoticons, those with sad emoticons showed significantly higher associations with TTD (aOR 4.97; 95% CI 4.13-5.99; P < .001). Similarly, negative sentiments also revealed significant associations with TTD (aOR 1.56; 95% CI 1.22-2; P = .003) when compared to positive sentiments. DISCUSSION The distribution of emoticons varied greatly when compared to sentiment analysis outcomes due to the differences in the design features. The emoticon feature was generally easier to understand and quicker to input while the sentiment analysis required patients to manually input their personal thoughts. CONCLUSION Patients on home hemodialysis actively expressed their emotions during RPM. Negative emotions were found to have significant associations with TTD. Emoticons and sentiment analysis may be used as a predictive indicator for prolonged TTD.
Affiliation(s)
- Robin Huang
  - School of Computer Science, The University of Sydney, Camperdown, Australia
- Na Liu
  - School of Computer Science, The University of Sydney, Camperdown, Australia
- Mary Ann Nicdao
  - Home Hemodialysis Unit, Regional Dialysis Centre, Blacktown Hospital, Sydney, Australia
- Mary Mikaheal
  - Home Hemodialysis Unit, Regional Dialysis Centre, Blacktown Hospital, Sydney, Australia
- Tanya Baldacchino
  - Home Hemodialysis Unit, Regional Dialysis Centre, Blacktown Hospital, Sydney, Australia
- Annabelle Albeos
  - Home Hemodialysis Unit, Regional Dialysis Centre, Blacktown Hospital, Sydney, Australia
- Kamal Sud
  - Home Hemodialysis Unit, Regional Dialysis Centre, Blacktown Hospital, Sydney, Australia
  - Department of Renal Medicine, Nepean Hospital, Sydney, Australia
  - The University of Sydney Medical School, Sydney, Australia
- Jinman Kim
  - School of Computer Science, The University of Sydney, Camperdown, Australia
36
Bakken S. Breadth and Diversity in Biomedical and Health Informatics. J Am Med Inform Assoc 2019. [DOI: 10.1093/jamia/ocz055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open