1
Agley J, Mumaw C, Johnson B. Rationale and Study Checklist for Ethical Rejection of Participants on Crowdsourcing Research Platforms. Ethics Hum Res 2024; 46:38-46. [PMID: 38944883 DOI: 10.1002/eahr.500217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Online participant recruitment ("crowdsourcing") platforms are increasingly being used for research studies. While such platforms can rapidly provide access to large samples, there are concomitant concerns around data quality. Researchers have studied and demonstrated means to reduce the prevalence of low-quality data from crowdsourcing platforms, but approaches to doing so often involve rejecting work and/or denying payment to participants, which can pose ethical dilemmas. We write this essay as an associate professor and two institutional review board (IRB) directors to provide a perspective on the competing interests of participants/workers and researchers and to propose a checklist of steps that we believe may support workers' agency on the platform and lessen instances of unfair consequences to them while enabling researchers to definitively reject lower-quality work that might otherwise reduce the likelihood of their studies producing true results. We encourage further, explicit discussion of these issues among academics and among IRBs.
2
Ristow T, Hernandez I. VOIS: A framework for recording Voice Over Internet Surveys. Behav Res Methods 2024; 56:447-467. [PMID: 36697999 PMCID: PMC9876413 DOI: 10.3758/s13428-022-02045-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/01/2022] [Indexed: 01/26/2023]
Abstract
Verbal data provide researchers insight beyond that offered by text-based responses, including tone, reasoning elaboration, and experienced difficulty, among other processes. Additionally, they offer a less cognitively taxing way for participants to provide long responses. Verbal data collection methods are found in a variety of fields but are mostly conducted in lab-based settings or require specialized hardware. Restricting verbal protocols to lab-based settings can have several drawbacks, including smaller sample sizes, biased populations, reduced adoption, and incompatibility with potential social distancing requirements. No method currently exists for researchers to collect verbal data within major online survey collection platforms. The current paper offers a user-friendly approach for collecting verbal data online, where a researcher can copy and paste JavaScript code into the desired survey platform. By providing a framework that does not require any advanced programming ability, researchers can collect verbal data in a scalable way using familiar modalities.
3
Patel AA, Feng CL, Marquez J, Spaw JP, Garza RM, Lee GK, Nazerali RS. Prioritizing Native Breast Skin Preservation or Scar Symmetry in Autologous Breast Reconstruction? Using Crowdsourcing to Assess Preference. EPLASTY 2023; 23:e75. [PMID: 38229965 PMCID: PMC10790140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2024]
Abstract
Background Recent literature on autologous breast reconstruction suggests that such factors as scar symmetry and skin paddle size impact patient preferences more than preservation of native breast skin. Since patient satisfaction with plastic surgery procedures can be largely influenced by beauty standards set by the general public, this study used a novel crowdsourcing method to evaluate laypeople's aesthetic preferences for different bilateral autologous breast reconstructions to determine the relative importance of scar and skin paddle symmetry and preservation of native skin. Methods Using Amazon's Mechanical Turk crowdsourcing marketplace, participants ranked images of reconstructions based on overall aesthetic appearance. Images were digitally modified to reflect 4 types of reconstruction: immediate (IR), delayed symmetric (DS), delayed asymmetric (DA), or mixed (MR). Results DS was ranked most favorably (1.74), followed by IR (1.95), DA (2.93), and MR (3.34). Friedman rank sum and pairwise tests showed statistical significance for comparisons of all 4 reconstruction types. Likert ratings were higher for IR than for DA reconstructions for skin quality (P = .002), scar visibility (P < .001), scar position (P < .001), and breast symmetry, shape, and position (P < .001). Ratings increased for all aesthetic factors following nipple-areolar-complex reconstruction (P < .001). Conclusions More symmetric breast scars were rated aesthetically higher than nonsymmetric scarring, and our participants preferred maintenance of scar symmetry over preservation of native breast skin. These findings are consistent with previous studies that surveyed non-crowdsourced participants, which demonstrates the potential for crowdsourcing to be used to better understand the general public's preferences in plastic surgery.
4
Alsoof D, Kasthuri V, McDonald C, Cusano J, Anderson G, Diebo BG, Kuris E, Daniels AH. How much are patients willing to pay for spine surgery? An evaluation of attitudes toward out-of-pocket expenses and cost-reducing measures. Spine J 2023; 23:1886-1893. [PMID: 37619868 DOI: 10.1016/j.spinee.2023.08.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/20/2023] [Revised: 08/14/2023] [Accepted: 08/16/2023] [Indexed: 08/26/2023]
Abstract
BACKGROUND CONTEXT With rising healthcare expenditures in the United States, patients and providers are seeking to maintain quality while reducing costs. PURPOSE The aim of this study was to investigate patient willingness to pay for anterior cervical discectomy and fusion (ACDF), degenerative lumbar spinal fusions (LF), and adult spine deformity (ASD) surgery. STUDY DESIGN/SETTING A survey was developed and distributed to anonymous respondents through Amazon Mechanical Turk (MTurk). METHODS The survey introduced 3 procedures: ACDF, LF, and ASD surgery. Respondents were asked sequentially if they would pay at each increasing price option. Respondents were then presented with various cost-saving methods and asked to select the options that made them most uncomfortable, even if those would save them out-of-pocket costs. RESULTS In total, 979 of 1,172 responses (84%) were retained for analysis. The average age was 36.2 years and 44% of participants reported a household income of $50,000 to $100,000. A total of 63% used Medicare and 13% used Medicaid. A total of 40% stated they had high levels of financial stress. A total of 30.1% of participants were willing to undergo an ACDF, 30.3% were willing to undergo an LF, and 29.6% were willing to undergo ASD surgery for the cost of $3,000 (p=.98). Regression demonstrated that for ACDF surgery, a $100 increase in price resulted in a 2.1% decrease in willingness to pay. This is comparable to degenerative LF surgery (1.8% decrease) and ASD surgery (2%). When asked which cost-saving measures participants were least comfortable with for ACDF surgery, 60% stated "Use of the older generation implants/devices" (LF: 51%, ASD: 60%), 61% stated "Having the surgery performed at a community hospital instead of at a major academic center" (LF: 49%, ASD: 56%), and 55% stated "Administration of anesthesia by a nurse anesthetist" (LF: 48.01%, ASD: 55%).
Conversely, 36% of ACDF patients were uncomfortable with a "Video/telephone postoperative visit" to cut costs (LF: 51%, ASD: 39%). CONCLUSIONS Patients are unwilling to contribute larger copays for adult spinal deformity correction than for ACDF and degenerative lumbar spine surgery, despite significantly higher procedural costs and case complexity/invasiveness. Patients were most uncomfortable forfeiting newer generation implants, receiving the operation at a community rather than an academic center, and receiving care by physician extenders. Conversely, patients were more willing to convert postoperative visits to telehealth and forgo neuromonitoring, indicating a potentially poor understanding of which cost-saving measures may be implemented without increasing the risk of complications.
5
Stewart Z, Korsapathy S, Frohlich F. Crowd-sourced investigation of a potential relationship between Bartonella-associated cutaneous lesions and neuropsychiatric symptoms. Front Psychiatry 2023; 14:1244121. [PMID: 37941969 PMCID: PMC10628448 DOI: 10.3389/fpsyt.2023.1244121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 09/19/2023] [Indexed: 11/10/2023]
Abstract
Introduction Preliminary studies suggest that infection with Bartonella bacteria can cause not only a characteristic rash, headache, fever, and fatigue but also neuropsychiatric symptoms. To date, this association has only been reported in case studies, and it remains unclear if this association generalizes to larger samples. Methods We used Amazon's Mechanical Turk (MTurk) to crowdsource a large sample (N = 996) of individuals to ascertain the extent to which the presence of participant-identified Bartonella-associated cutaneous lesions (BACL) was associated with self-reported measures of anxiety, depression, and schizotypy. Participants were asked to select images of cutaneous lesions they had seen on their own bodies and complete a battery of self-report questionnaires to assess psychiatric symptoms. Participants were not informed that the focus of the study was on potential dermatological lesions associated with Bartonella. Point-biserial correlations were used to determine the potential relationship between selecting a BACL image and the severity of self-reported psychiatric symptoms. Results Scores of anxiety, depression, and schizotypy were positively and significantly correlated with selecting a BACL image. Furthermore, self-report scores of 10 or higher on the GAD-7 and PHQ-9, which represent the suggested clinical cutoffs for meeting criteria for a depressive or anxiety-related disorder, were also significantly associated with selecting a BACL image. Non-Bartonella-associated cutaneous lesions were also significantly associated with self-reported measures of psychiatric symptoms. Discussion The current study broadens the link between the presence of BACL and the presence of psychiatric symptoms of anxiety, depression, and schizotypy and extends a potential relationship beyond the small sample sizes of previous case studies and case series. Further investigation is recommended to address limitations and expand on these findings.
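Illustrative note: the point-biserial correlation named in the Methods above relates a yes/no variable (for example, whether a BACL image was selected) to a numeric score (for example, a symptom total). The sketch below is a minimal, assumption-laden illustration, not the authors' analysis code. The function name `point_biserial` and the toy data are hypothetical and do not come from the study.

```python
from math import sqrt


def point_biserial(selected, scores):
    """Point-biserial correlation between a 0/1 indicator and numeric scores.

    Hypothetical helper for illustration only. It uses the standard formula
    r = (M1 - M0) / s * sqrt(n1 * n0 / n^2), where s is the population
    standard deviation of all scores.
    """
    if len(selected) != len(scores):
        raise ValueError("selected and scores must have the same length")
    n = len(scores)
    group1 = [s for flag, s in zip(selected, scores) if flag == 1]
    group0 = [s for flag, s in zip(selected, scores) if flag == 0]
    n1, n0 = len(group1), len(group0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups need at least one observation")
    mean_all = sum(scores) / n
    sd = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd == 0:
        raise ValueError("scores have zero variance")
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    return (mean1 - mean0) / sd * sqrt(n1 * n0 / (n * n))


if __name__ == "__main__":
    # Made-up data: 1 = selected a lesion image, paired with a symptom score.
    selected_image = [1, 0, 1, 0, 0, 1]
    symptom_score = [12, 3, 9, 5, 2, 14]
    print(round(point_biserial(selected_image, symptom_score), 3))
```

A positive value means that respondents in the "selected" group tend to have higher scores. The value is numerically the same as a Pearson correlation computed with the indicator coded as 0 and 1.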
6
Lin YK, Newman S, Piette J. Response Consistency of Crowdsourced Web-Based Surveys on Type 1 Diabetes. J Med Internet Res 2023; 25:e43593. [PMID: 37594797 PMCID: PMC10474500 DOI: 10.2196/43593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 08/01/2023] [Accepted: 08/02/2023] [Indexed: 08/19/2023]
Abstract
Although Amazon Mechanical Turk facilitates the quick surveying of a large sample from various demographic and socioeconomic backgrounds, it may not be an optimal platform for obtaining reliable diabetes-related information from the online type 1 diabetes population.
7
Hays RD, Qureshi N, Herman PM, Rodriguez A, Kapteyn A, Edelen MO. Effects of Excluding Those Who Report Having "Syndomitis" or "Chekalism" on Data Quality: Longitudinal Health Survey of a Sample From Amazon's Mechanical Turk. J Med Internet Res 2023; 25:e46421. [PMID: 37540543 PMCID: PMC10439462 DOI: 10.2196/46421] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/10/2023] [Revised: 06/28/2023] [Accepted: 06/29/2023] [Indexed: 08/05/2023]
Abstract
BACKGROUND Researchers have implemented multiple approaches to increase data quality from existing web-based panels such as Amazon's Mechanical Turk (MTurk). OBJECTIVE This study extends prior work by examining improvements in data quality and effects on mean estimates of health status by excluding respondents who endorse 1 or both of 2 fake health conditions ("Syndomitis" and "Chekalism"). METHODS Survey data were collected in 2021 at baseline and 3 months later from MTurk study participants, aged 18 years or older, with an internet protocol address in the United States, and who had completed a minimum of 500 previous MTurk "human intelligence tasks." We included questions about demographic characteristics, health conditions (including the 2 fake conditions), and the Patient Reported Outcomes Measurement Information System (PROMIS)-29+2 (version 2.1) preference-based score survey. The 3-month follow-up survey was only administered to those who reported having back pain and did not endorse a fake condition at baseline. RESULTS In total, 15% (996/6832) of the sample endorsed at least 1 of the 2 fake conditions at baseline. Those who endorsed a fake condition at baseline were more likely to identify as male, non-White, younger, report more health conditions, and take longer to complete the survey than those who did not endorse a fake condition. They also had substantially lower internal consistency reliability on the PROMIS-29+2 scales than those who did not endorse a fake condition: physical function (0.69 vs 0.89), pain interference (0.80 vs 0.94), fatigue (0.80 vs 0.92), depression (0.78 vs 0.92), anxiety (0.78 vs 0.90), sleep disturbance (-0.27 vs 0.84), ability to participate in social roles and activities (0.77 vs 0.92), and cognitive function (0.65 vs 0.77). The lack of reliability of the sleep disturbance scale for those endorsing a fake condition was because it includes both positively and negatively worded items. 
Those who endorsed a fake condition had significantly worse self-reported health scores (except for sleep disturbance) than those who did not endorse a fake condition. Excluding those who endorsed a fake condition improved the overall mean PROMIS-29+2 (version 2.1) T-scores by 1-2 points and the PROMIS preference-based score by 0.04. Although none of the follow-up respondents had endorsed a fake condition at baseline, 6% (n=59) endorsed at least 1 on the 3-month survey; this group had lower PROMIS-29+2 score internal consistency reliability and worse mean scores on the 3-month survey than those who did not report having a fake condition. Based on these results, we estimate that 25% (1708/6832) of the MTurk respondents provided careless or dishonest responses. CONCLUSIONS This study provides evidence that asking about fake health conditions can help to screen out respondents who may be dishonest or careless. We recommend this approach be used routinely in samples of members of MTurk.
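Illustrative note: the screening rule described in this abstract (drop any respondent who endorses at least one fake condition) can be sketched in a few lines. This is a hypothetical illustration, not the authors' code. The record layout, the field names `syndomitis` and `chekalism`, and the helpers `endorsed_fake`, `screen`, and `percent` are all assumptions made for the example. Only the counts passed to `percent` come from the abstract.

```python
# Hypothetical field names for the two fake conditions named in the abstract.
FAKE_CONDITIONS = ("syndomitis", "chekalism")


def endorsed_fake(respondent):
    """Return True if the respondent reported having any fake condition."""
    return any(respondent.get(name, False) for name in FAKE_CONDITIONS)


def screen(respondents):
    """Split respondents into (kept, flagged) lists using the fake-condition rule."""
    kept = [r for r in respondents if not endorsed_fake(r)]
    flagged = [r for r in respondents if endorsed_fake(r)]
    return kept, flagged


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Made-up records, for illustration only.
    sample = [
        {"id": 1, "syndomitis": False, "chekalism": False},
        {"id": 2, "syndomitis": True, "chekalism": False},
        {"id": 3, "syndomitis": False, "chekalism": True},
        {"id": 4},  # a missing answer is treated as not endorsed (an assumption)
    ]
    kept, flagged = screen(sample)
    print([r["id"] for r in kept], [r["id"] for r in flagged])

    # Counts reported in the abstract: 996 of 6832 and 1708 of 6832.
    print(round(percent(996, 6832)), round(percent(1708, 6832)))
```

Treating a missing answer as "not endorsed" is a design choice of this sketch. A real analysis would need an explicit rule for missing data.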
8
Madden T, Cohen SY, Paul R, Hurley EG, Thomas MA, Pauletti G. Women's preferences for a new contraceptive under development: an exploratory study. Front Glob Womens Health 2023; 4:1095112. [PMID: 37547129 PMCID: PMC10401268 DOI: 10.3389/fgwh.2023.1095112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 07/06/2023] [Indexed: 08/08/2023]
Abstract
Objective Currently available contraceptive methods do not meet the needs of all users. We sought to explore preferences of potential end-users regarding an on-demand, non-hormonal female contraceptive currently under development, using a web-based survey. Study design We recruited respondents for an exploratory survey via web link on Amazon Mechanical Turk (MTurk). Individuals were eligible if they were 18-44 years of age, identified as cis-gender female, were English-speaking, not pregnant, and had used barrier contraception previously. Respondents provided demographic characteristics and a basic reproductive history. We then provided a brief description of the potential contraceptive. Respondents were asked about their interest in the proposed contraceptive and preferences for method attributes. Results A total of 500 respondents completed the survey. Three-quarters of respondents were <35 years of age and 48.2% were currently using a barrier contraceptive method. Three-fourths of respondents (73.8%) expressed interest in using the contraceptive under development. The majority wanted the method to be small (≤2 inches), rod-shaped, and low cost (<$5 per use). More than half (59.4%) said it was important to be able to use the method without partners' knowledge. The most reported potential concerns were vaginal irritation (51.6%) and lack of effectiveness (46.4%). Sixty percent of respondents were confident they could use the method correctly. Discussion Available contraceptive methods lack attributes preferred by some users. Development of new contraceptives frequently does not involve end-user input early in the development process. Individuals in this sample displayed interest in the proposed contraceptive and expressed preferences that can inform the further development of this method.
9
Imeri H, Holmes E, Desselle S, Rosenthal M, Barnard M. A survey study of adults with chronic conditions: Examining the correlation between patient activation and health locus of control. Chronic Illn 2023; 19:118-131. [PMID: 36638782 DOI: 10.1177/17423953211067431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
OBJECTIVES This study aimed to examine (1) the association between patient activation (PA), health locus of control (HLOC), sociodemographic and clinical factors, and (2) the effect of HLOC dimensions, sociodemographic and clinical factors on PA. METHODS Three hundred U.S. adults, with at least one chronic condition (CC) were recruited through Amazon Mechanical Turk and completed an online survey which included sociodemographic questions, the Patient Activation Measure® - 10, and the Multidimensional Locus of Control (MHLC) - Form B. Statistical analyses, including descriptive, correlation, and multiple linear regression, were conducted using IBM SPSS v25. RESULTS Of the 300 participants, more than half were male (66.3%), White (70.7%), with at least a college degree (76.0%), and employed full-time (79.0%). The average PA score was 68.8 ± 14.5. Multiple linear regression indicated that participants who reported they were Black, retired, with a greater number of CCs, and with higher scores in Chance MHLC had higher PA, while participants with higher scores in Internal MHLC, were unemployed and reported to have been affected by COVID-19-related worry or fear to manage their CC, had lower PA. DISCUSSION HLOC dimensions should be addressed concurrently with PA for patients with CCs, thus adding to a more patient-centered clinical approach.
10
Reynolds J, Kincaid R. Gig Work and the Pandemic: Looking for Good Pay from Bad Jobs During the COVID-19 Crisis. WORK AND OCCUPATIONS 2023; 50:60-96. [PMID: 38603298 PMCID: PMC9520279 DOI: 10.1177/07308884221128511] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/07/2023]
Abstract
COVID-19 led to work hour reductions and layoffs for many Americans with wage/salary jobs. Some gig work, however, which is usually considered precarious, remained available. We examine whether people doing gig microtasks right before the pandemic increased their microtask hours during COVID-19 and whether those changes helped them financially. Using data from workers on Amazon's Mechanical Turk platform from February, March, and April of 2020, we find that roughly one third of existing workers increased their microtask hours. Increases were larger for people who lost household income or wage/salary hours. Spending more time on microtasks, however, did little to help workers financially. Furthermore, the people most reliant on microtasks before the pandemic had worse financial outcomes than others. In short, even though microtask work might seem like a good way for people to recoup lost income during the pandemic, it was of limited utility even for the experienced workers in our sample.
11
Abrams AL, Reavy R, Linden-Carmichael AN. Using Young Adult Language to Describe the Effects of Simultaneous Alcohol and Marijuana Use: Implications for Assessment. Subst Use Misuse 2022; 57:1873-1881. [PMID: 36083235 PMCID: PMC9972526 DOI: 10.1080/10826084.2022.2120362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Introduction: Prevalence of alcohol and marijuana use is highest in young adulthood and an increasing number of young adults report simultaneous alcohol and marijuana (SAM) use, which is consistently linked with numerous negative consequences. To better understand reasons for engaging in SAM use and to refine measurement of subjective effects of SAM use, this study aimed to identify (1) how young adults describe subjective experiences during a SAM use occasion and (2) how language describing subjective effects changes as a function of level of alcohol and marijuana use. Methods: Using Amazon's Mechanical Turk (MTurk), 323 participants (53.6% women, 68.4% White, M age = 23.0 years) who reported past-month heavy episodic drinking and past-month SAM use were asked to list words to describe how they feel when using only alcohol, only marijuana, and various combinations of alcohol and marijuana. Results: SAM use language varied as a function of age and substance use behavior but was not associated with sex or race. Large differences in the terms used to describe subjective effects were observed when comparing different combinations of alcohol and marijuana use; most notably the term "cross-faded" appeared primarily when engaging at the heaviest combinations of alcohol and marijuana. Conclusion: Young adults have a wide range of vocabulary for describing subjective effects of SAM use, and subjective effects vary as a function of the level of each substance used. Future research should consider integrating such contemporary language when measuring subjective effects of SAM use.
12
Sehgal NJ, Huang S, Johnson NM, Dickerson J, Jackson D, Baur C. The Benefits of Crowdsourcing to Seed and Align an Algorithm in an mHealth Intervention for African American and Hispanic Adults: Survey Study. J Med Internet Res 2022; 24:e30216. [PMID: 35727616 PMCID: PMC9257620 DOI: 10.2196/30216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2021] [Revised: 01/31/2022] [Accepted: 03/07/2022] [Indexed: 11/23/2022]
Abstract
BACKGROUND The lack of publicly available and culturally relevant data sets on African American and bilingual/Spanish-speaking Hispanic adults' disease prevention and health promotion priorities presents a major challenge for researchers and developers who want to create and test personalized tools built on and aligned with those priorities. Personalization depends on prediction and performance data. A recommender system (RecSys) could predict the most culturally and personally relevant preventative health information and serve it to African American and Hispanic users via a novel smartphone app. However, early in a user's experience, a RecSys can face the "cold start problem" of serving untailored and irrelevant content before it learns user preferences. For underserved African American and Hispanic populations, who are consistently being served health content targeted toward the White majority, the cold start problem can become an example of algorithmic bias. To avoid this, a RecSys needs population-appropriate seed data aligned with the app's purposes. Crowdsourcing provides a means to generate population-appropriate seed data. OBJECTIVE Our objective was to identify and test a method to address the lack of culturally specific preventative personal health data and sidestep the type of algorithmic bias inherent in a RecSys not trained in the population of focus. We did this by collecting a large amount of data quickly and at low cost from members of the population of focus, thereby generating a novel data set based on prevention-focused, population-relevant health goals. We seeded our RecSys with data collected anonymously from self-identified Hispanic and self-identified non-Hispanic African American/Black adult respondents, using Amazon Mechanical Turk (MTurk). 
METHODS MTurk provided the crowdsourcing platform for a web-based survey in which respondents completed a personal profile and a health information-seeking assessment, and provided data on family health history and personal health history. Respondents then selected their top 3 health goals related to preventable health conditions, and for each goal, reviewed and rated the top 3 information returns by importance, personal utility, whether the item should be added to their personal health library, and their satisfaction with the quality of the information returned. This paper reports the article ratings because our intent was to assess the benefits of crowdsourcing to seed a RecSys. The analysis of the data from health goals will be reported in future papers. RESULTS The MTurk crowdsourcing approach generated 985 valid responses from 485 (49%) self-identified Hispanic and 500 (51%) self-identified non-Hispanic African American adults over the course of only 64 days at a cost of US $6.74 per respondent. Respondents rated 92 unique articles to inform the RecSys. CONCLUSIONS Researchers have options such as MTurk as a quick, low-cost means to avoid the cold start problem for algorithms and to sidestep bias and low relevance for an intended population of app users. Seeding a RecSys with responses from people like the intended users allows for the development of a digital health tool that can recommend information to users based on similar demography, health goals, and health history. This approach minimizes the potential, initial gaps in algorithm performance; allows for quicker algorithm refinement in use; and may deliver a better user experience to individuals seeking preventative health information to improve health and achieve health goals.
13
Roman ZJ, Brandt H, Miller JM. Automated Bot Detection Using Bayesian Latent Class Models in Online Surveys. Front Psychol 2022; 13:789223. [PMID: 35572225 PMCID: PMC9093679 DOI: 10.3389/fpsyg.2022.789223] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/04/2021] [Accepted: 03/29/2022] [Indexed: 11/16/2022]
Abstract
Behavioral scientists have become increasingly reliant on online survey platforms such as Amazon's Mechanical Turk (MTurk). These platforms have many advantages: for example, they provide ease of access to difficult-to-sample populations, a large pool of participants, and an easy-to-use implementation. A major drawback is the existence of bots that are used to complete online surveys for financial gain. These bots contaminate data and need to be identified in order to draw valid conclusions from data obtained with these platforms. In this article, we will provide a Bayesian latent class joint modeling approach that can be routinely applied to identify bots and simultaneously estimate a model of interest. This method can be used to separate the bots' response patterns from real human responses that were provided in line with the item content. The model has the advantage that it is very flexible and is based on plausible assumptions that are met in most empirical settings. We will provide a simulation study that investigates the performance of the model under several relevant scenarios including sample size, proportion of bots, and model complexity. We will show that ignoring bots will lead to severe parameter bias whereas the Bayesian latent class model results in unbiased estimates and thus controls this source of bias. We will illustrate the model and its capabilities with data from an empirical political ideation survey with known bots. We will discuss the implications of the findings with regard to future data collection via online platforms.
14
Burnette CB, Luzier J, Bennett BL, Weisenmuller C, Kerr P, Keener J. The tension between ethics and rigor when using Amazon MTurk for eating disorder research: Response to commentaries on Burnette et al. (2021). Int J Eat Disord 2022; 55:288-289. [PMID: 35064602 PMCID: PMC8849558 DOI: 10.1002/eat.23681] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/12/2022] [Revised: 01/12/2022] [Accepted: 01/12/2022] [Indexed: 02/03/2023]
Abstract
We respond to commentaries on our 2021 paper "Concerns and recommendations for using Amazon MTurk for eating disorder research." The commentators raised many thoughtful and nuanced points regarding data validity and ethical means of online data collection. We echo concerns about the ethics of recruiting via platforms such as MTurk, and highlight tensions between recommendations for ethical data collection and ensuring data integrity. In particular, we highlight the consistent finding that MTurk workers display elevated (often remarkably so) rates of psychopathology, and argue such findings merit further scrutiny to ensure both that data are valid and that workers are not exploited.
15
Vogel M, Krüger J, Junne F. Eating disorder related research using Amazon Mechanical Turk (MTurk): Friend or foe?: Commentary on Burnette et al. (2021). Int J Eat Disord 2022; 55:285-287. [PMID: 35014056 DOI: 10.1002/eat.23675] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/22/2021] [Revised: 12/31/2021] [Accepted: 12/31/2021] [Indexed: 11/09/2022]
Abstract
Burnette et al. reported a study that they sought to undertake to validate common eating disorder questionnaires in sexual and gender minorities. The researchers took advantage of the online recruitment platform Amazon Mechanical Turk (MTurk). Contrary to their expectations, the study proved not feasible due to invalid answering. Thus, Burnette et al. raise concerns against the trustworthiness of crowd-sourced data that may be undermined by financial interests and other kinds of motivations. Our commentary highlights the potential of the COVID-19 pandemic to inflate especially those intentions, which are monetary. Against the background of the COVID-19 pandemic, a further problem seems to be that the anonymity of online crowd sourcing platforms might tempt participants to provide inconsistent answers, possibly reflecting tendencies of reactance. The reported pattern of paradoxical responses in Burnette et al.'s work does not reflect malingering; rather we believe that the study might have served some participants as an outlet for negative emotions. We discuss mechanisms of quality control and highlight the lack of interpersonal interaction associated with online data collections.
|
16
|
Burnette CB, Luzier J, Bennett BL, Weisenmuller C, Kerr P, Martin S, Keener J, Calderwood L. Concerns and recommendations for using Amazon MTurk for eating disorder research. Int J Eat Disord 2022; 55:263-272. [PMID: 34562036 PMCID: PMC8992375 DOI: 10.1002/eat.23614] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 05/17/2021] [Revised: 09/15/2021] [Accepted: 09/15/2021] [Indexed: 02/03/2023]
Abstract
OBJECTIVE Our original aim was to validate and norm common eating disorder (ED) symptom measures in a large, representative community sample of transgender adults in the United States. We recruited via Amazon Mechanical Turk (MTurk), a popular online recruitment and data collection platform both within and outside of the ED field. We present an overview of our experience using MTurk. METHOD Recruitment began in Spring 2020; our original target N was 2,250 transgender adults stratified evenly across the United States. Measures included a demographics questionnaire, the Eating Disorder Examination-Questionnaire, and the Eating Attitudes Test-26. Consistent with current literature recommendations, we implemented a comprehensive set of attention and validity measures to reduce and identify bot responding, data farming, and participant misrepresentation. RESULTS Recommended validity and attention checks failed to identify the majority of likely invalid responses. Our collection of two similar ED measures, thorough weight history assessment, and gender identity experiences allowed us to examine response concordance and identify impossible and improbable responses, which revealed glaring discrepancies and invalid data. Furthermore, qualitative data (e.g., emails received from MTurk workers) raised concerns about economic conditions facing MTurk workers that could compel misrepresentation. DISCUSSION Our results strongly suggest most of our data were invalid, and call into question results of recently published MTurk studies. We assert that caution and rigor must be applied when using MTurk as a recruitment tool for ED research, and offer several suggestions for ED researchers to mitigate and identify invalid data.
|
17
|
Condon M, Wichowsky A. Economic anxiety among contingent survey workers. CURRENT PSYCHOLOGY 2022; 42:1-4. [PMID: 35018080 PMCID: PMC8736285 DOI: 10.1007/s12144-021-02535-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/14/2021] [Indexed: 11/03/2022]
Abstract
Psychologists and other social scientists increasingly conduct experiments with online convenience samples from Amazon's Mechanical Turk Marketplace (MTurk). MTurk and population-based samples differ in well-documented ways, but whether or not compositional differences are problematic for experiments remains controversial. We highlight a critically important characteristic that is likely to interact with many experimental treatments in the psychological and behavioral sciences, and that has not been identified by other studies of MTurk samples: economic anxiety. We document a sizable difference between contingent survey workers and the general population and explain the ways in which economic anxiety is likely to interact with experimental treatments. In an era of rapidly growing economic anxiety and group disparities in economic wellbeing, awareness of this compositional difference is essential, especially in cases where experimental stimuli may interact with economic anxiety. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s12144-021-02535-4.
|
18
|
Israel T, Goodman JA, Merrill CRS, Lin YJ, Kary KG, Matsuno E, Choi AY. Reducing Internalized Homonegativity: Refinement and Replication of an Online Intervention for Gay Men. JOURNAL OF HOMOSEXUALITY 2021; 68:2393-2409. [PMID: 33001000 DOI: 10.1080/00918369.2020.1804262] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
We refined and replicated an efficacious brief intervention to reduce internalized homonegativity (IH) with a sample of gay and exclusively same-sex attracted men recruited from outside of LGBT community networks using Amazon Mechanical Turk. We sought to 1) determine if levels of IH differed between the original study's community-based sample and our non-community-based sample, 2) examine the efficacy of the replicated intervention, and 3) assess for longitudinal effects of the intervention at a 30-day follow-up. Four hundred eighty-four participants completed either the intervention or a stress management control condition. Mean levels of IH were higher in the current sample compared with the earlier study's community sample. The intervention was efficacious at reducing global IH, reducing personal homonegativity, and increasing gay affirmation. Ninety-six participants completed the follow-up; follow-up results were not significant and may have been affected by high rates of attrition. Implications for research and practice are discussed.
|
19
|
O'Brien EL, Torres GE, Neupert SD. Cognitive Interference in the Context of Daily Stressors, Daily Awareness of Age-Related Change, and General Aging Attitudes. J Gerontol B Psychol Sci Soc Sci 2021; 76:920-929. [PMID: 32898263 DOI: 10.1093/geronb/gbaa155] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/27/2020] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVES Previous diary work indicates that older people experience more intrusive and unwanted thoughts (i.e., cognitive interference) on days with stressors. We examined additional predictors of daily cognitive interference to enhance understanding of the psychological context surrounding this link. We specifically focused on factors related to subjective experiences of aging based on studies that have related higher stress and impairments in cognition such as executive control processes (working memory) to negative age stereotypes. Consistent with these findings, we generally expected stronger stress effects on cognitive interference when daily self-perceptions of aging (i.e., within-person fluctuations in awareness of age-related losses [AARC losses]) and general aging attitudes (i.e., individual differences in attitudes toward own aging [ATOA]) were more negative. METHODS Participants (n = 91; aged 60-80) on Amazon's Mechanical Turk completed surveys on 9 consecutive days, reporting on their ATOA (Day 1) as well as their stressors, AARC losses, and cognitive interference (Days 2-9). RESULTS Multilevel models showed that people reported more cognitive interference on days with more AARC losses. Individuals with positive ATOA also experienced less cognitive interference on days with more stressors, whereas those with negative ATOA experienced more. DISCUSSION Both individual differences and fluctuating daily perceptions of aging appear to be important for older adults' cognitive interference. Consistent with other work, positive ATOA protected against daily stressor effects. Further elucidating these relationships can increase understanding of and facilitate efforts to improve (daily) cognitive experiences in older adults.
|
20
|
Quality control questions on Amazon's Mechanical Turk (MTurk): A randomized trial of impact on the USAUDIT, PHQ-9, and GAD-7. Behav Res Methods 2021; 54:885-897. [PMID: 34357539 PMCID: PMC8344397 DOI: 10.3758/s13428-021-01665-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Accepted: 07/01/2021] [Indexed: 11/08/2022]
Abstract
Crowdsourced psychological and other biobehavioral research using platforms like Amazon's Mechanical Turk (MTurk) is increasingly common - but has proliferated more rapidly than studies to establish data quality best practices. Thus, this study investigated whether outcome scores for three common screening tools would be significantly different among MTurk workers who were subject to different sets of quality control checks. We conducted a single-stage, randomized controlled trial with equal allocation to each of four study arms: Arm 1 (Control Arm), Arm 2 (Bot/VPN Check), Arm 3 (Truthfulness/Attention Check), and Arm 4 (Stringent Arm - All Checks). Data collection was completed in Qualtrics, to which participants were referred from MTurk. Subjects (n = 1100) were recruited on November 20-21, 2020. Eligible workers were required to claim U.S. residency, have a successful task completion rate > 95%, have completed a minimum of 100 tasks, and have completed a maximum of 10,000 tasks. Participants completed the US-Alcohol Use Disorders Identification Test (USAUDIT), the Patient Health Questionnaire (PHQ-9), and a screener for Generalized Anxiety Disorder (GAD-7). We found that differing quality control approaches significantly, meaningfully, and directionally affected outcome scores on each of the screening tools. Most notably, workers in Arm 1 (Control) reported higher scores than those in Arms 3 and 4 for all tools, and a higher score than workers in Arm 2 for the PHQ-9. These data suggest that the use, or lack thereof, of quality control questions in crowdsourced research may substantively affect findings, as might the types of quality control items.
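The eligibility screen and four-arm equal allocation described in this abstract can be sketched in a few lines. This is only an illustrative sketch: the `Worker` fields, the function names, and the shuffle-then-rotate allocation scheme are assumptions made here, not the authors' actual procedure, and the numeric thresholds are taken from the abstract as written.

```python
import random
from collections import Counter
from dataclasses import dataclass

# Arm labels as named in the abstract.
ARMS = (
    "Control",
    "Bot/VPN Check",
    "Truthfulness/Attention Check",
    "Stringent (All Checks)",
)


@dataclass
class Worker:
    # Hypothetical record; field names are assumptions for illustration.
    worker_id: str
    claims_us_residency: bool
    completion_rate: float  # successful task completion rate, in percent
    tasks_completed: int


def is_eligible(w: Worker) -> bool:
    """Apply the screen from the abstract: U.S. residency claimed,
    completion rate > 95%, and between 100 and 10,000 completed tasks."""
    return (
        w.claims_us_residency
        and w.completion_rate > 95
        and 100 <= w.tasks_completed <= 10_000
    )


def allocate_equally(worker_ids, seed=0):
    """Shuffle the ids, then deal them to the arms in rotation so that
    arm sizes differ by at most one (one simple way to get equal allocation)."""
    rng = random.Random(seed)
    ids = list(worker_ids)
    rng.shuffle(ids)
    return {wid: ARMS[i % len(ARMS)] for i, wid in enumerate(ids)}


# Hypothetical usage with 1100 synthetic workers, matching the reported n.
workers = [Worker(f"w{i}", True, 99.0, 500) for i in range(1100)]
eligible = [w.worker_id for w in workers if is_eligible(w)]
assignment = allocate_equally(eligible)
arm_sizes = Counter(assignment.values())
```

With 1100 eligible ids and four arms, the rotation assigns 275 workers to each arm.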
|
21
|
Robinson TP, Kelley ME. Renewal and resurgence phenomena generalize to Amazon's Mechanical Turk. J Exp Anal Behav 2021; 113:206-213. [PMID: 31965578 DOI: 10.1002/jeab.576] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/09/2019] [Accepted: 12/12/2019] [Indexed: 01/10/2023]
Abstract
Amazon's Mechanical Turk (MTurk) is a crowdsourcing platform that provides researchers with the potential for obtaining behavioral data for very little cost. However, the extent to which the results of common behavioral phenomena found in basic, translational, and applied laboratories may be reproduced (as a first step towards prospective research) via MTurk remains relatively unexplored. We evaluated renewal and resurgence arrangements using MTurk as the subject recruitment platform as a first step to determining the generality of the obtained data. Results suggested that MTurk participants produced renewal and resurgence data similar to those reported in basic, translational, and applied studies.
|
22
|
Utility estimation for neurogenic bowel dysfunction in the general population. J Pediatr Urol 2021; 17:395.e1-395.e9. [PMID: 33612400 PMCID: PMC8217085 DOI: 10.1016/j.jpurol.2021.01.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2020] [Revised: 12/11/2020] [Accepted: 01/20/2021] [Indexed: 11/24/2022]
Abstract
BACKGROUND Neurogenic bowel dysfunction (NBD) affects over 80% of individuals with spina bifida causing bowel incontinence and/or constipation. NBD is also associated with decreased quality of life, depression, anxiety, and decreased employment/educational attainment. Because NBD is a life-altering condition without a cure, understanding the utility of different health states related to NBD would aid clinicians as they try to counsel families regarding management options and to better understand the quality of life associated with disease management. OBJECTIVE To elicit utility scores for NBD using an online community sample. STUDY DESIGN A cross-sectional anonymous survey was completed by 1534 voluntary participants via an online platform (Amazon Mechanical Turk (MTurk, http://www.mturk.com/)), representing an 87% response rate. The survey presented hypothetical scenarios that asked respondents to imagine themselves as an individual living with NBD or as the caretaker of a child with NBD. The time trade-off (TTO) method was used to estimate a utility score, and outcomes for each scenario were calculated using median and IQR. Univariate comparisons of distributions of TTO for demographic data were made using Kruskal-Wallis tests. RESULTS The median utility score for NBD was 0.84 [0.70-0.92]. Participants reported that they would give up a median of 5 years of their own life to prevent NBD in themselves or their child. Utility values for child scenarios were significantly different when stratified by age, gender, race, parental status, marital status, and income. Stratification by current health status did not yield significantly different utility values. DISCUSSION Study findings are comparable with other TTO-determined utility values of moderately severe disease states, including severe persistent asthma (0.83), moderate seizure disorder (0.84) and mild mental retardation (0.84).
Significant variations in utility values by age, gender, race, parent status, partner/marital status, and income were observed in our study, similar to findings in other health fields. Study limitations include the lack of unanimous agreement about TTO's validity in measuring utility values and uncertainty about whether MTurk participant reports can be generalized to the greater population. CONCLUSION NBD is perceived by the community as having a substantial impact on the lives of children with spina bifida, representing a 16% reduction from perfect health. In general, health state utilities have been increasingly used in healthcare systems to understand how burdensome a population perceives a disease to be and to evaluate whether interventions improve quality of life years.
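The arithmetic behind a time trade-off (TTO) utility and the "16% reduction from perfect health" can be illustrated with a short sketch. It assumes the conventional TTO formulation, in which utility is the fraction of the time horizon a respondent keeps after trading years for perfect health. The abstract does not state the time horizon used, so the 30-year horizon and the sample responses below are hypothetical values chosen only for illustration.

```python
import statistics


def tto_utility(years_traded: float, horizon_years: float) -> float:
    """Conventional TTO utility: the share of the time horizon retained
    after giving up `years_traded` in exchange for perfect health."""
    if not 0 <= years_traded <= horizon_years:
        raise ValueError("years_traded must lie between 0 and the horizon")
    return (horizon_years - years_traded) / horizon_years


def utility_decrement(utility: float) -> float:
    """Reduction relative to perfect health (utility = 1.0)."""
    return 1.0 - utility


# Hypothetical responses: years each respondent would trade, 30-year horizon.
HORIZON = 30.0  # assumption; not reported in the abstract
years_traded = [2.0, 3.0, 5.0, 5.0, 9.0]
utilities = [tto_utility(y, HORIZON) for y in years_traded]
median_utility = statistics.median(utilities)

# The reported median utility of 0.84 corresponds to a 16% decrement.
reported_decrement = round(utility_decrement(0.84), 2)
```

Because the sketch reports a median rather than a mean, a few extreme responses do not dominate the summary, which matches the abstract's choice of median and IQR.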
|
23
|
Sorkin DH, Janio EA, Eikey EV, Schneider M, Davis K, Schueller SM, Stadnick NA, Zheng K, Neary M, Safani D, Mukamel DB. Rise in Use of Digital Mental Health Tools and Technologies in the United States During the COVID-19 Pandemic: Survey Study. J Med Internet Res 2021; 23:e26994. [PMID: 33822737 PMCID: PMC8054774 DOI: 10.2196/26994] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 01/06/2021] [Revised: 02/18/2021] [Accepted: 04/03/2021] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Accompanying the rising rates of reported mental distress during the COVID-19 pandemic has been a reported increase in the use of digital technologies to manage health generally, and mental health more specifically. OBJECTIVE The objective of this study was to systematically examine whether there was a COVID-19 pandemic-related increase in the self-reported use of digital mental health tools and other technologies to manage mental health. METHODS We analyzed results from a survey of 5907 individuals in the United States using Amazon Mechanical Turk (MTurk); the survey was administered during 4 week-long periods in 2020 and survey respondents were from all 50 states and Washington DC. The first set of analyses employed two different logistic regression models to estimate the likelihood of having symptoms indicative of clinical depression and anxiety, respectively, as a function of the rate of COVID-19 cases per 10 people and survey time point. The second set employed seven different logistic regression models to estimate the likelihood of using seven different types of digital mental health tools and other technologies to manage one's mental health, as a function of symptoms indicative of clinical depression and anxiety, rate of COVID-19 cases per 10 people, and survey time point. These models also examined potential interactions between symptoms of clinical depression and anxiety, respectively, and rate of COVID-19 cases. All models controlled for respondent sociodemographic characteristics and state fixed effects. RESULTS Higher COVID-19 case rates were associated with a significantly greater likelihood of reporting symptoms of depression (odds ratio [OR] 2.06, 95% CI 1.27-3.35), but not anxiety (OR 1.21, 95% CI 0.77-1.88). Survey time point, a proxy for time, was associated with a greater likelihood of reporting clinically meaningful symptoms of depression and anxiety (OR 1.19, 95% CI 1.12-1.27 and OR 1.12, 95% CI 1.05-1.19, respectively). 
Reported symptoms of depression and anxiety were associated with a greater likelihood of using each type of technology. Higher COVID-19 case rates were associated with a significantly greater likelihood of using mental health forums, websites, or apps (OR 2.70, 95% CI 1.49-4.88), and other health forums, websites, or apps (OR 2.60, 95% CI 1.55-4.34). Time was associated with increased odds of reported use of mental health forums, websites, or apps (OR 1.20, 95% CI 1.11-1.30), phone-based or text-based crisis lines (OR 1.20, 95% CI 1.10-1.31), and online, computer, or console gaming/video gaming (OR 1.12, 95% CI 1.05-1.19). Interactions between COVID-19 case rate and mental health symptoms were not significantly associated with any of the technology types. CONCLUSIONS Findings suggested increased use of digital mental health tools and other technologies over time during the early stages of the COVID-19 pandemic. As such, additional effort is urgently needed to consider the quality of these products, either by ensuring users have access to evidence-based and evidence-informed technologies and/or by providing them with the skills to make informed decisions around their potential efficacy.
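The odds ratios and 95% confidence intervals quoted in this abstract relate to logistic regression coefficients through a standard transformation, sketched below. The sketch assumes Wald-type intervals (coefficient plus or minus 1.96 standard errors, exponentiated), which is a common convention but is not confirmed by the abstract. The function names are hypothetical, and the coefficient and standard error are back-calculated from the reported figures rather than taken from the paper.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta: float, se: float, z: float = Z_95):
    """Turn a logistic regression coefficient and its standard error
    into an odds ratio with a Wald confidence interval."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


def coefficient_from_or(odds_ratio: float, ci_low: float, ci_high: float,
                        z: float = Z_95):
    """Recover the log-odds coefficient and an approximate standard error
    from a reported odds ratio and its confidence limits."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Depression result as reported: OR 2.06, 95% CI 1.27-3.35.
beta, se = coefficient_from_or(2.06, 1.27, 3.35)
or_, low, high = odds_ratio_with_ci(beta, se)
```

Round-tripping the reported depression estimate this way reproduces its interval to within rounding, which is what one would expect if the published limits are symmetric on the log-odds scale.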
|
24
|
Linden-Carmichael AN, Allen H. PROFILES OF ALCOHOL AND MARIJUANA USE AMONG SIMULTANEOUS ALCOHOL AND MARIJUANA USERS: INDIVIDUAL DIFFERENCES IN DEMOGRAPHICS AND SUBSTANCE USE. JOURNAL OF DRUG ISSUES 2021; 51:243-252. [PMID: 36875005 PMCID: PMC9979248 DOI: 10.1177/0022042620979617] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]
Abstract
Simultaneous alcohol and marijuana (SAM) use - or use of both substances with overlapping effects - is common among emerging adults and is linked to increased risk for problematic substance use outcomes. The current study identified subgroups of emerging adult SAM users based on their typical alcohol and marijuana use patterns and compared groups on key individual characteristics. Latent profile analysis uncovered four profiles of SAM users (n=522): Light Users (LU; 49.0%), Moderate Drinkers with Frequent Marijuana Use (MDFM; 37.9%), Moderate Drinkers with High Peak Levels (MDHP; 5.4%), and Heavy/Frequent Users (HFU; 7.7%). Group differences by demographic characteristics were found, with LU more likely to be college attendees/graduates than MDFM. Additionally, HFU were more likely to be Greek-affiliated than both LU and MDFM. Groups also differed based on other drug use behavior and preferred route of marijuana administration. Findings demonstrate diversity among SAM users based on typical substance use patterns.
|
25
|
Han L, Alton K, Colwill AC, Jensen JT, McCrimmon S, Darney BG. Willingness to Use Cannabis for Gynecological Conditions: A National Survey. J Womens Health (Larchmt) 2021; 30:438-444. [PMID: 33667129 DOI: 10.1089/jwh.2020.8491] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/23/2022] Open
Abstract
Objective: Expanded legal access to cannabis in the United States has led to its increased use for treating medical conditions. We assessed the use of and attitudes toward cannabis for treating gynecological conditions. Materials and Methods: We utilized Amazon.com Inc.'s Mechanical Turk platform to administer a survey to U.S. women 18 years and older about cannabis use for recreational and medicinal purposes and willingness to use cannabis to treat 17 gynecological conditions. We collected sociodemographic data and views about the legal status of cannabis. We used logistic regression to identify factors associated with willingness to use cannabis for gynecological conditions. Results: In our analytical sample (N = 995), women who reported ever using cannabis were more willing to use cannabis to treat a gynecological condition compared with never users (91.6% vs. 64.6%, p < 0.01). Women willing to use cannabis for gynecological conditions were most interested in using cannabis for treating gynecological pain (61.2% of never users vs. 90.0% of ever users; p < 0.001) compared with procedural pain (38.2% vs. 79.0%, respectively; p < 0.001) or other conditions (38.0% vs. 79.8%, respectively; p < 0.001). In multivariate analysis, willingness to use cannabis for a gynecological condition was associated only with a history of ever using cannabis and views that cannabis should be legal in some capacity, and not with age, race, or education. Conclusions: The majority of women would consider using cannabis to treat gynecological conditions. Overall, respondents who had a history of cannabis use were more likely to report willingness to use cannabis for all gynecological conditions, but a large proportion of those who reported never using cannabis were also willing to use it.
|