1. Response shift results of quantitative research using patient-reported outcome measures: a descriptive systematic review. Qual Life Res 2024;33:293-315. PMID: 37702809. PMCID: PMC10850024. DOI: 10.1007/s11136-023-03495-x.
Abstract
PURPOSE: The objective of this systematic review was to describe the prevalence and magnitude of response shift effects for different response shift methods, populations, study designs, and patient-reported outcome measures (PROMs).
METHODS: A literature search was performed in MEDLINE, PsycINFO, CINAHL, EMBASE, Social Science Citation Index, and Dissertations & Theses Global to identify longitudinal quantitative studies that examined response shift using PROMs, published before 2021. The magnitude of each response shift effect (effect sizes, R-squared, or percentage of respondents with response shift) was ascertained from reported statistical information or as stated in the manuscript. Prevalence and magnitudes of response shift effects were summarized at two levels of analysis (study and effect levels), separately for recalibration and reprioritization/reconceptualization, and for different response shift methods and population, study design, and PROM characteristics. Analyses were conducted twice: (a) including all studies and samples, and (b) including only unrelated studies and independent samples.
RESULTS: Of the 150 included studies, 130 (86.7%) detected response shift effects. Of the 4868 effects investigated, 793 (16.3%) revealed response shift. Effect sizes could be determined for 105 (70.0%) of the studies, covering a total of 1130 effects, of which 537 (47.5%) resulted in detection of response shift. Whereas effect sizes varied widely, most median recalibration effect sizes (Cohen's d) were between 0.20 and 0.30, and median reprioritization/reconceptualization effect sizes rarely exceeded 0.15 across the characteristics. Similar results were obtained from unrelated studies.
CONCLUSION: The results draw attention to the need to understand variability in response shift results: who experiences response shifts, to what extent, and under which circumstances?
2. A tutorial on using the paired t test for power calculations in repeated measures ANOVA with interactions. Behav Res Methods 2023;55:2467-2484. PMID: 36002625. PMCID: PMC10439102. DOI: 10.3758/s13428-022-01902-8.
Abstract
The a priori calculation of statistical power has become common practice in the behavioral and social sciences to determine the sample size needed to detect an expected effect size with a certain probability (i.e., power). In multi-factorial repeated measures ANOVA, these calculations can be cumbersome, especially for higher-order interactions. For designs that only involve factors with two levels each, the paired t test can be used for power calculations, but some pitfalls need to be avoided. In this tutorial, we provide practical advice on how to express main and interaction effects in repeated measures ANOVA as single difference variables. In particular, we demonstrate how to calculate the effect size Cohen's d of this difference variable, either from the means, variances, and covariances of the conditions or by transforming effect size measures from the ANOVA framework into d. With the effect size correctly specified, we then show how to use the t test for sample size considerations by means of an empirical example. The relevant R code is provided in an online repository for all example calculations covered in this article.
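The two steps described above — computing d_z for the paired difference variable and feeding it into a paired t test power calculation — can be sketched in Python (hypothetical input values; the authors' own R code lives in their online repository):

```python
import numpy as np
from scipy import stats

def dz_from_conditions(mean_diff, sd1, sd2, r):
    """Cohen's d_z for a paired contrast: the mean difference divided
    by the SD of the difference variable (inputs are hypothetical)."""
    sd_diff = np.sqrt(sd1**2 + sd2**2 - 2 * r * sd1 * sd2)
    return mean_diff / sd_diff

def n_for_paired_t(dz, alpha=0.05, power=0.80):
    """Smallest n whose one-sample t test on the difference variable
    reaches the target power (exact noncentral-t computation; the
    negligible lower rejection tail is ignored)."""
    n = 2
    while True:
        df = n - 1
        ncp = dz * np.sqrt(n)                  # noncentrality parameter
        tcrit = stats.t.ppf(1 - alpha / 2, df)
        achieved = 1 - stats.nct.cdf(tcrit, df, ncp)
        if achieved >= power:
            return n
        n += 1

dz = dz_from_conditions(mean_diff=0.5, sd1=1.0, sd2=1.0, r=0.5)  # d_z = 0.5
print(dz, n_for_paired_t(dz))
```

With r = 0.5 and equal SDs, d_z equals the raw standardized mean difference, and the required n matches standard paired t test power tables.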
3. r2mlm: An R package calculating R-squared measures for multilevel models. Behav Res Methods 2023;55:1942-1964. PMID: 35798918. DOI: 10.3758/s13428-022-01841-4.
Abstract
Multilevel models are used ubiquitously in the social and behavioral sciences and effect sizes are critical for contextualizing results. A general framework of R-squared effect size measures for multilevel models has only recently been developed. Rights and Sterba (2019) distinguished each source of explained variance for each possible kind of outcome variance. Though researchers have long desired a comprehensive and coherent approach to computing R-squared measures for multilevel models, the use of this framework has a steep learning curve. The purpose of this tutorial is to introduce and demonstrate using a new R package - r2mlm - that automates the intensive computations involved in implementing the framework and provides accompanying graphics to visualize all multilevel R-squared measures together. We use accessible illustrations with open data and code to demonstrate how to use and interpret the R package output.
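As a rough illustration of what such measures capture, here is a simplified variance-decomposition sketch for a random-intercept model — an assumption-laden toy, not the full Rights and Sterba framework that r2mlm implements:

```python
import numpy as np

def total_r2_components(fixed_pred_var, intercept_var, residual_var):
    """Share of total outcome variance attributable to fixed effects,
    cluster intercepts, and residual noise (hypothetical variance
    components; the full framework further splits within/between)."""
    total = fixed_pred_var + intercept_var + residual_var
    return {"fixed": fixed_pred_var / total,
            "intercept": intercept_var / total,
            "residual": residual_var / total}

parts = total_r2_components(fixed_pred_var=2.0,
                            intercept_var=1.0,
                            residual_var=1.0)
print(parts)  # shares sum to 1 by construction
```

Here the "fixed" share plays the role of a total R², while the intercept share is the familiar intraclass-correlation-style quantity.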
4. Effect sizes and effect size benchmarks in family violence research. Child Abuse & Neglect 2023;139:106095. PMID: 36989983. DOI: 10.1016/j.chiabu.2023.106095.
Abstract
Scholarly journals increasingly request that authors include effect size (ES) estimates when reporting statistical results. However, there is little guidance on how authors should interpret ESs. Consequently, some authors do not provide ES interpretations, or, when interpretations are provided, they often fail to use appropriate reference groups, using instead the ES benchmarks suggested by Cohen (1988). After discussing the most commonly used ES estimates, we describe the method used by Cohen (1962) to develop ES benchmarks (i.e., small, medium, and large) for use in power analyses and describe the limitations associated with using these benchmarks. Next, we establish general benchmarks for family violence (FV) research. That is, we followed Cohen's approach to establishing his original ES benchmarks using family violence research published in 2021 in Child Abuse & Neglect, which produced a medium ES (d = 0.354) that was smaller than Cohen's recommended medium ES (d = 0.500). Then, we examined the ESs in different subspecialty areas of FV research to provide benchmarks for contextualized FV ESs and to provide information that can be used to conduct power analyses when planning future FV research. Finally, some of the challenges to developing ES benchmarks in any scholarly discipline are discussed. For professionals who are not well informed about ESs, the present review is designed to increase their understanding of ESs and what ES benchmarks tell them (and do not tell them) with respect to understanding the meaningfulness of FV research findings.
5. Evaluation of a decided sample size in machine learning applications. BMC Bioinformatics 2023;24:48. PMID: 36788550. PMCID: PMC9926644. DOI: 10.1186/s12859-023-05156-9.
Abstract
BACKGROUND: An appropriate sample size is essential for obtaining a precise and reliable outcome of a study. In machine learning (ML), studies with inadequate samples suffer from overfitting and have a lower probability of producing true effects, while increasing the sample size improves prediction accuracy but may yield little further change beyond a certain point. Existing statistical approaches using the standardized mean difference, effect size, and statistical power for determining sample size are potentially biased due to miscalculations or a lack of experimental details. This study aims to design criteria for evaluating sample size in ML studies. We examined the average and grand effect sizes and the performance of five ML methods using simulated datasets and three real datasets to derive the criteria. We systematically increased the sample size, starting from 16, by random sampling, and examined the impact of sample size on classifier performance and on both effect sizes. Tenfold cross-validation was used to quantify accuracy.
RESULTS: Effect sizes and classification accuracies increase, while the variances in effect sizes shrink, as samples are added when the dataset discriminates well between the two classes. By contrast, indeterminate datasets had poor effect sizes and classification accuracies, which did not improve with sample size in either the simulated or the real datasets. A good dataset exhibited a significant difference between the average and grand effect sizes. We derived two criteria from these findings to assess a decided sample size by combining effect size and ML accuracy: the sample size is considered suitable when it yields an appropriate effect size (≥ 0.5) and ML accuracy (≥ 80%). Beyond an appropriate sample size, adding samples brings little benefit, as it does not significantly change the effect size or accuracy, so stopping there gives a good cost-benefit ratio.
CONCLUSION: We believe that these practical criteria can serve as a reference for both authors and editors to evaluate whether the selected sample size is adequate for a study.
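A minimal sketch of the evaluation loop described above — Cohen's d between classes plus 10-fold cross-validated accuracy at growing sample sizes — might look like this (simulated one-feature data with an assumed true d of 2.0; the study used five ML methods and much richer datasets):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def cohens_d(a, b):
    """Pooled-SD standardized mean difference between two samples."""
    n1, n2 = len(a), len(b)
    sp = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
                 / (n1 + n2 - 2))
    return (a.mean() - b.mean()) / sp

def evaluate(n_per_class):
    """Simulate a two-class dataset with one informative feature and
    return (effect size between classes, 10-fold CV accuracy)."""
    x0 = rng.normal(0.0, 1.0, n_per_class)
    x1 = rng.normal(2.0, 1.0, n_per_class)   # assumed true d = 2.0
    X = np.concatenate([x0, x1]).reshape(-1, 1)
    y = np.array([0] * n_per_class + [1] * n_per_class)
    acc = cross_val_score(LogisticRegression(), X, y, cv=10).mean()
    return abs(cohens_d(x1, x0)), acc

# Criterion from the paper: a sample size is adequate when the effect
# size is >= 0.5 and the cross-validated accuracy is >= 80%.
for n in (16, 64, 256):
    d, acc = evaluate(n)
    print(n, round(d, 2), round(acc, 2))
```

At small n both estimates are noisy; as n grows they stabilize, which is the behavior the criteria exploit.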
6. DIY bootstrapping: Getting the nonparametric bootstrap confidence interval in SPSS for any statistics or function of statistics (when this bootstrapping is appropriate). Behav Res Methods 2023;55:474-490. PMID: 35292932. DOI: 10.3758/s13428-022-01808-5.
Abstract
Researchers can generate bootstrap confidence intervals for some statistics in SPSS using the BOOTSTRAP command. However, this command can be applied only to selected procedures, and only to selected statistics within those procedures. We developed an extension command and prepared sample syntax files, based on existing approaches from the Internet, to illustrate how researchers can (a) generate a large number of nonparametric bootstrap samples, (b) run the desired analysis on all these samples, and (c) form bootstrap confidence intervals for selected statistics using the OMS commands. We developed these tools to help researchers apply nonparametric bootstrapping to any statistic for which this method is appropriate, including statistics derived from other statistics, such as standardized effect size measures computed from t test results. We also discuss how researchers can extend the tools to other statistics and scenarios they encounter.
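The same generate-analyze-summarize workflow can be sketched outside SPSS; here is a hypothetical Python version of a percentile bootstrap interval for a derived statistic (Cohen's d), with made-up data:

```python
import numpy as np

rng = np.random.default_rng(42)

def cohens_d(a, b):
    """Pooled-SD standardized mean difference (the derived statistic)."""
    n1, n2 = len(a), len(b)
    sp = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
                 / (n1 + n2 - 2))
    return (a.mean() - b.mean()) / sp

def percentile_boot_ci(a, b, stat, n_boot=5000, alpha=0.05):
    """Nonparametric percentile bootstrap CI: resample each group with
    replacement, recompute the statistic, take empirical quantiles."""
    reps = [stat(rng.choice(a, size=len(a), replace=True),
                 rng.choice(b, size=len(b), replace=True))
            for _ in range(n_boot)]
    return np.quantile(reps, [alpha / 2, 1 - alpha / 2])

a = rng.normal(0.6, 1.0, 50)   # hypothetical treatment scores
b = rng.normal(0.0, 1.0, 50)   # hypothetical control scores
lo, hi = percentile_boot_ci(a, b, cohens_d)
print(round(lo, 2), round(hi, 2))
```

The `stat` argument takes any function of the two samples, which mirrors the paper's point that the method applies to any statistic for which bootstrapping is appropriate.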
7. Probability of superiority for comparing two groups of clusters. Behav Res Methods 2023;55:646-656. PMID: 35411476. DOI: 10.3758/s13428-022-01815-6.
Abstract
The probability of superiority (PS) has been recommended as a simple-to-interpret effect size for comparing two independent samples, and there are several methods for computing the PS for that study design. However, educational and psychological interventions increasingly occur in clustered data contexts, and a review of the literature returned only one method for computing the PS in such contexts. In this paper, we propose a method for estimating the PS in clustered data contexts. Specifically, the proposal addresses study designs that compare two groups where group membership is determined at the cluster level. A cluster may be (i) a group of cases, each measured once, or (ii) a single case measured multiple times, resulting in longitudinal data. The proposal relies on nonparametric point estimates of the PS coupled with cluster-robust variance estimation, so the proposed approach should remain adequate regardless of the distribution of the response data. Using Monte Carlo simulation, we show the approach to be unbiased for continuous and binary outcomes while maintaining adequate frequentist properties. Moreover, our proposal performs better than the single extant method we found in the literature. The proposal is simple to implement in commonplace statistical software, and we provide accompanying R code. Hence, it is our hope that the method we present helps applied researchers better estimate group differences when comparing two groups whose membership is determined at the cluster level.
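The nonparametric point estimate at the heart of such proposals can be sketched as follows; the cluster-robust variance step is omitted, and applying the estimator to cluster means with equal weights is a simplifying assumption, not the authors' exact estimator:

```python
import numpy as np

def prob_superiority(x, y):
    """Nonparametric point estimate of P(X > Y) + 0.5 * P(X = Y),
    computed over all cross-group pairs of observations."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=float)[None, :]
    return (x > y).mean() + 0.5 * (x == y).mean()

# Hypothetical clustered data: group membership is set per cluster,
# so we first reduce each cluster to its mean.
clusters_g1 = [[3, 4, 5], [6, 7, 8]]
clusters_g2 = [[1, 2, 3], [2, 3, 4]]
ps = prob_superiority([np.mean(c) for c in clusters_g1],
                      [np.mean(c) for c in clusters_g2])
print(ps)
```

A PS of 0.5 means neither group tends to score higher; values toward 1.0 favor the first group.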
8. Statistical Inferences Using Effect Sizes in Human Endothelial Function Research. Artery Res 2021;27:176-185. PMID: 34966462. PMCID: PMC8654719. DOI: 10.1007/s44200-021-00006-6.
Abstract
INTRODUCTION: Magnitudes of change in endothelial function research can be articulated using effect size statistics. Effect sizes are commonly interpreted against Cohen's seminal guidelines of small (d = 0.2), medium (d = 0.5), and large (d = 0.8). Quantitative analyses of effect size distributions across various research disciplines have revealed values differing from Cohen's original recommendations. Here we examine effect size distributions in human endothelial function research, and the magnitude of small, medium, and large effects for macro- and microvascular endothelial function.
METHODS: Effect sizes reported as standardized mean differences were extracted from the available meta-research on endothelial function. A frequency distribution was constructed to sort the effect sizes, and the 25th, 50th, and 75th percentiles were used to derive small, medium, and large effects. Group sample sizes and publication year were also extracted from the primary studies to observe any potential trends, related to these factors, in effect size reporting.
RESULTS: Seven hundred fifty-two effect sizes were extracted from eligible meta-analyses. We determined small (d = 0.28), medium (d = 0.69), and large (d = 1.21) effects for endothelial function, corresponding to the 25th, 50th, and 75th percentiles of the data distribution.
CONCLUSION: Our data indicate that direct application of Cohen's guidelines would underestimate the magnitude of effects in human endothelial function research. This investigation facilitates future a priori power analyses, provides a practical benchmark for contextualizing an effect when no other information is available, and further encourages the reporting of effect sizes in endothelial function research.
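The percentile-based benchmarking procedure is straightforward to reproduce on any collection of effect sizes; the values below are synthetic, not the paper's 752 extracted effects:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical distribution of |d| values pooled from meta-analyses
# (the actual endothelial data yielded 0.28 / 0.69 / 1.21).
effect_sizes = np.abs(rng.normal(0.7, 0.5, 752))

# Field-specific small / medium / large benchmarks are simply the
# 25th, 50th, and 75th percentiles of the observed distribution.
small, medium, large = np.percentile(effect_sizes, [25, 50, 75])
print(round(small, 2), round(medium, 2), round(large, 2))
```

Whatever the field, the quartiles of the observed distribution replace Cohen's fixed 0.2 / 0.5 / 0.8 cut points.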
9. Denouncing the use of field-specific effect size distributions to inform magnitude. PeerJ 2021;9:e11383. PMID: 34178435. PMCID: PMC8210805. DOI: 10.7717/peerj.11383.
Abstract
An effect size (ES) provides valuable information regarding the magnitude of effects, and interpreting that magnitude is the most important step. Interpreting ES magnitude requires combining the numerical ES value with the context of the research. However, many researchers adopt popular benchmarks such as those proposed by Cohen. More recently, researchers have proposed interpreting ES magnitude relative to the distribution of observed ESs in a specific field, creating unique benchmarks for declaring effects small, medium, or large. However, there is no valid rationale whatsoever for this approach. This study was carried out in two parts: (1) we identified articles that proposed using field-specific ES distributions to interpret magnitude (primary articles); and (2) we identified articles that cited the primary articles and classified them by year and publication type. The first type consisted of methodological papers; the second included articles that interpreted ES magnitude using the approach proposed in the primary articles. There has been a steady increase in the number of methodological and substantive articles discussing or adopting the approach of interpreting ES magnitude by considering the distribution of observed ESs in a field, even though the approach is devoid of a theoretical framework. It is hoped that this research will restrict the practice of interpreting ES magnitude relative to the distribution of ES values in a field and instead encourage researchers to interpret magnitude in the specific context of the study.
10.
Abstract
Current statistical inference methods for task-fMRI suffer from two fundamental limitations. First, the focus is solely on detection of non-zero signal or signal change, a problem that is exacerbated for large scale studies (e.g. UK Biobank, N=40,000+) where the 'null hypothesis fallacy' causes even trivial effects to be determined as significant. Second, for any sample size, widely used cluster inference methods only indicate regions where a null hypothesis can be rejected, without providing any notion of spatial uncertainty about the activation. In this work, we address these issues by developing spatial Confidence Sets (CSs) on clusters found in thresholded Cohen's d effect size images. We produce an upper and lower CS to make confidence statements about brain regions where Cohen's d effect sizes have exceeded and fallen short of a non-zero threshold, respectively. The CSs convey information about the magnitude and reliability of effect sizes that is usually given separately in a t-statistic and effect estimate map. We expand the theory developed in our previous work on CSs for %BOLD change effect maps (Bowring et al., 2019) using recent results from the bootstrapping literature. By assessing the empirical coverage with 2D and 3D Monte Carlo simulations resembling fMRI data, we find our method is accurate in sample sizes as low as N=60. We compute Cohen's d CSs for the Human Connectome Project working memory task-fMRI data, illustrating the brain regions with a reliable Cohen's d response for a given threshold. By comparing the CSs with results obtained from a traditional statistical voxelwise inference, we highlight the improvement in activation localization that can be gained with the Confidence Sets.
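The nesting of the sets can be illustrated with a pointwise normal-approximation sketch on a toy effect-size image; the paper constructs simultaneous sets via the bootstrap, so this is only a simplified illustration of the idea, with hypothetical values throughout:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical voxelwise Cohen's d estimates and standard errors.
d_hat = rng.normal(0.4, 0.3, size=(8, 8))
se = np.full((8, 8), 0.1)
c, z = 0.5, 1.96          # effect-size threshold and critical value

point_set = d_hat >= c             # estimated excursion set
upper_cs = d_hat - z * se >= c     # confidently ABOVE the threshold
lower_cs = d_hat + z * se >= c     # cannot be ruled OUT of the set

# Nesting: upper CS is a subset of the point-estimate set, which is
# a subset of the lower CS, by construction.
print(upper_cs.sum(), point_set.sum(), lower_cs.sum())
```

The upper set makes the strong claim (d reliably exceeds the threshold here), while the complement of the lower set makes the opposite strong claim (d reliably falls short).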
11. Improving Social Knowledge and Skills among Adolescents with Autism: Systematic Review and Meta-Analysis of UCLA PEERS® for Adolescents. J Autism Dev Disord 2021;51:4488-4503. PMID: 33512626. DOI: 10.1007/s10803-021-04885-1.
Abstract
UCLA PEERS® for Adolescents is a widely applied program among the many social skills training programs developed over the years. We synthesized current research evidence on the PEERS program to evaluate its treatment effect on four commonly used outcome measures. Twelve studies met the inclusion criteria for the review and nine met the criteria for meta-analysis. Results showed moderate to large pooled effects across measures and informants in favor of the PEERS program, with the largest effect seen in social knowledge improvement and the smallest in the frequency of get-togethers. The heterogeneity of effects across studies was examined, and the limitations of the current evidence are discussed.
12. Interpreting the effectiveness of a summer reading program: The eye of the beholder. Evaluation and Program Planning 2020;83:101852. PMID: 32801067. DOI: 10.1016/j.evalprogplan.2020.101852.
Abstract
In applying a methods-oriented approach to evaluation, this study interpreted the effectiveness of a summer reading program from three different stakeholder perspectives: practitioners from the school district, the funding agency supporting the program, and the policymakers considering mandating summer school. Archival data were obtained on 2330 students reading below benchmark in Grades 2-5. After propensity score matching participants to peers who did not attend the summer program, the final sample consisted of 630 students. Pre-to-posttest growth models revealed positive effects in Grades 2-4 (standardized slopes of .40-.54), but fifth graders demonstrated negligible improvement (standardized slope of .15). The standardized mean differences of propensity score matched treatment and control group students indicated null effects in all grade levels (d = -.13 to .05). Achieving proficient reading performance also was not attributable to summer school participation. Findings underscore the importance of operationalizing effectiveness in summative evaluation.
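The matching step described above can be sketched as a greedy 1:1 nearest-neighbor match on estimated propensity scores; the single pretest covariate and selection model below are hypothetical, and the study's actual matching used richer archival data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)

# Hypothetical pretest covariate and treatment indicator (attending
# the summer program), with selection depending on the pretest.
n = 400
pretest = rng.normal(0, 1, n)
treated = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.8 * pretest))))

# 1. Estimate propensity scores from the covariate.
X = pretest.reshape(-1, 1)
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Greedy 1:1 nearest-neighbor matching on the propensity score,
#    each comparison student used at most once.
t_idx = np.flatnonzero(treated == 1)
c_idx = np.flatnonzero(treated == 0)
available = set(c_idx)
pairs = []
for i in t_idx:
    j = min(available, key=lambda k: abs(ps[i] - ps[k]))
    pairs.append((i, j))
    available.remove(j)
    if not available:
        break

print(len(pairs))
```

Outcome models (such as the growth models in the study) are then fit on the matched sample rather than the full archive.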
13. Peer tutoring and mathematics in secondary education: literature review, effect sizes, moderators, and implications for practice. Heliyon 2019;5:e02491. PMID: 31687584. PMCID: PMC6819807. DOI: 10.1016/j.heliyon.2019.e02491.
Abstract
A literature review was undertaken to compile all data on peer tutoring in secondary education (7th to 12th grade) mathematics from existing articles. Data from 42 independent studies were included. The article reports all data regarding participants' roles (fixed vs. reciprocal), participants' ages (same-age vs. cross-age), the methodological approach taken (quantitative or qualitative), the type of design for studies with a quantitative approach, the variables analyzed, and organizational matters (number of participants, duration of the program, sessions per week, and duration of the sessions). The effect sizes of the 42 studies were calculated and examined. The main goal of the study was to identify moderators of effect size, that is, the variables that significantly influenced students' academic achievement outcomes. Inferential statistical analyses (Student's t tests and ANOVAs) were carried out for these variables. Of the 42 studies examined, 88% showed positive effect sizes, with a mean close to medium (Cohen's d = 0.38). The conclusions suggest that same-age rather than cross-age tutoring, in programs shorter than 8 weeks with sessions under 30 minutes, is optimal for improving students' academic outcomes. Including control groups in similar future studies is recommended so that effect sizes are not overestimated.
14. Empirically Based Mean Effect Size Distributions for Universal Prevention Programs Targeting School-Aged Youth: A Review of Meta-Analyses. Prevention Science 2019;19:1091-1101. PMID: 30136245. DOI: 10.1007/s11121-018-0942-1.
Abstract
This review of reviews presents an empirically based set of mean effect size distributions for judging the relative impact of the effects of universal mental health promotion and prevention programs for school-age youth (ages 5 through 18) across a range of program targets and types of outcomes. Mean effect size distributions were established by examining the findings from 74 meta-analyses of universal prevention and promotion programs that included more than 1100 controlled outcome studies involving over 490,000 school-age youth. The distributions of mean effect sizes from these meta-analyses indicated considerable variability across program targets and outcomes that differed substantially from Cohen's (1988, Statistical power analysis for the behavioral sciences (2nd ed.)) widely used set of conventions for assessing if effects are small, medium, or large. These updated mean effect size distributions will provide researchers, practitioners, and funders with more appropriate evidence-based standards for judging the relative effects of universal prevention programs for youth. Limitations in current data and directions for future work are also discussed.
15. Exploring perceptions of meaningfulness in visual representations of bivariate relationships. PeerJ 2019;7:e6853. PMID: 31139500. PMCID: PMC6524627. DOI: 10.7717/peerj.6853.
Abstract
Researchers often need to consider the practical significance of a relationship. For example, interpreting the magnitude of an effect size or establishing bounds in equivalence testing requires knowledge of the meaningfulness of a relationship. However, there has been little research exploring the degree of relationship among variables (e.g., correlation, mean difference) necessary for an association to be interpreted as meaningful or practically significant. In this study, we presented statistically trained and untrained participants with a collection of figures that displayed varying degrees of mean difference between groups or correlations among variables and participants indicated whether or not each relationship was meaningful. The results suggest that statistically trained and untrained participants differ in their qualification of a meaningful relationship, and that there is significant variability in how large a relationship must be before it is labeled meaningful. The results also shed some light on what degree of relationship is considered meaningful by individuals in a context-free setting.
16. Non-invasive neurophysiological measures of learning: A meta-analysis. Neurosci Biobehav Rev 2019;99:59-89. PMID: 30735681. DOI: 10.1016/j.neubiorev.2019.02.001.
Abstract
In a meta-analysis of 113 experiments, we examined neurophysiological outcomes of learning and the relationship between neurophysiological and behavioral outcomes of learning. Neurophysiology yielded large effect sizes, with the majority of studies examining electroencephalography and eye-related outcome measures. Effect sizes on neurophysiological outcomes were smaller than effect sizes on behavioral outcomes, however. Neurophysiological outcomes were, but behavioral outcomes were not, influenced by several modulating factors: the sensory system in which learning took place, the number of learning days, whether feedback on performance was provided, and the age of participants. Controlling for these factors made the effect size differences between behavior and neurophysiology disappear. The findings of this meta-analysis demonstrate that neurophysiology is an appropriate measure for assessing learning, particularly when factors that could influence neurophysiology are taken into account. We propose a first model to aid the further studies needed to examine the exact interplay between learning, neurophysiology, behavior, individual differences, and task-related aspects.
17.
Abstract
This project examined the performance of classical and Bayesian estimators of four effect size measures for the indirect effect in a single-mediator model and a two-mediator model. Compared to the proportion and ratio mediation effect sizes, standardized mediation effect size measures were relatively unbiased and efficient in both models. Percentile and bias-corrected bootstrap interval estimates of ab/sY and ab·sX/sY in the single-mediator model outperformed interval estimates of the proportion and ratio effect sizes in terms of power, Type I error rate, coverage, imbalance, and interval width. For the two-mediator model, standardized effect size measures were likewise superior to the proportion and ratio measures. Furthermore, Bayesian point and interval summaries of the posterior distributions of standardized effect size measures reduced excessive relative bias for certain parameter combinations. The standardized measures are thus the best effect sizes for quantifying mediated effects.
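The two standardized measures named in the abstract are simple functions of the path coefficients and standard deviations; the values below are hypothetical:

```python
def mediation_effect_sizes(a, b, s_x, s_y):
    """Standardized indirect-effect measures: ab / s_Y (partially
    standardized) and ab * s_X / s_Y (completely standardized), where
    a and b are the X->M and M->Y path coefficients."""
    ab = a * b                       # the indirect (mediated) effect
    return {"ab/sY": ab / s_y,
            "ab*sX/sY": ab * s_x / s_y}

es = mediation_effect_sizes(a=0.5, b=0.4, s_x=2.0, s_y=1.0)
print(es)
```

Both rescale the raw indirect effect ab so that it can be compared across outcomes (and, for the completely standardized version, across predictors) measured on different scales.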
18. The use of nonparametric effect sizes in single study musculoskeletal physiotherapy research: A practical primer. Phys Ther Sport 2018;33:117-124. PMID: 30077090. DOI: 10.1016/j.ptsp.2018.07.009.
Abstract
There is a strong push for the inclusion of effect size indexes alongside the reporting of statistical analyses in academic journals. Nonparametric methods of analysis are generally less well developed, and less well known, than their parametric counterparts. Too often, researchers use parametric statistics where nonparametric measures would be more appropriate. This holds true for nonparametric measures of effect size: even when researchers use nonparametric statistics, some use parametric effect size measures to interpret the result. This paper provides a practical overview and illustration of the correct usage and interpretation of effect size measures for nonparametric statistics in single study designs, using real-world physiotherapy data in the worked examples. The primer covers a range of formulae based on categorical measures of effect size, as well as between- and within-group designs using ranked data. While the examples focus on physiotherapy research, the information can be applied in any field of research.
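One commonly used nonparametric effect size of the kind such a primer covers is the rank-biserial correlation accompanying a Mann-Whitney U test; a sketch with hypothetical scores (not the primer's physiotherapy data):

```python
import numpy as np
from scipy import stats

x = np.array([12, 15, 17, 19, 22, 25])   # hypothetical treatment scores
y = np.array([8, 10, 11, 14, 16, 18])    # hypothetical control scores

# Mann-Whitney U test for the between-group comparison.
u, p = stats.mannwhitneyu(x, y, alternative="two-sided")

# Rank-biserial correlation as its effect size: the proportion of
# favorable cross-group pairs minus the proportion of unfavorable
# ones, computed directly from pair counts (ties split 50/50).
greater = (x[:, None] > y[None, :]).sum()
ties = (x[:, None] == y[None, :]).sum()
n_pairs = len(x) * len(y)
r_rb = (greater + 0.5 * ties) / n_pairs * 2 - 1
print(round(r_rb, 2), round(p, 3))
```

Unlike Cohen's d, this measure depends only on the ranks, so it remains interpretable for skewed or ordinal outcome data.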
|
19
|
Abstract
This review summarizes the literature on QoL in early stage lung cancer patients who underwent surgery. PubMed and PsycINFO were searched. Twelve articles from 10 distinct studies were identified for a total of 992 patients. Five QoL measures were used. One study reported only on pre-surgical QoL, six only on post-surgical QoL and three studies reported on both pre- and post-surgical QoL. Timing for the administration of post-surgical QoL surveys varied. The literature on QoL in Stage I non-small-cell lung cancer patients is very sparse. Additional research is needed to explore the impact of different surgical approaches on QoL.
|
20
|
Abstract
Understanding the results and statistics reported in original research remains a large challenge for many sports medicine practitioners and, in turn, may be one of the biggest barriers to integrating research into sports medicine practice. The purpose of this article is to provide the minimal essentials a sports medicine practitioner needs to know about interpreting statistics and research results to facilitate the incorporation of the latest evidence into practice. Topics covered include the difference between statistical significance and clinical meaningfulness; effect sizes and confidence intervals; reliability statistics, including the minimal detectable difference and minimal important difference; and statistical power.
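The minimal detectable difference mentioned here follows a standard textbook formula (1.96 x SEM x sqrt(2), with SEM = SD x sqrt(1 - reliability)); this sketch is a generic illustration, not code or values from the article:

```python
def minimal_detectable_change(sd, reliability, z=1.96):
    """Minimal detectable change at 95% confidence.

    sd: between-subject standard deviation of the measure;
    reliability: test-retest reliability (e.g. an ICC).
    SEM = sd * sqrt(1 - reliability); MDC = z * SEM * sqrt(2).
    """
    sem = sd * (1 - reliability) ** 0.5   # standard error of measurement
    return z * sem * 2 ** 0.5             # sqrt(2) accounts for two test occasions
```

An observed change smaller than the MDC cannot be distinguished from measurement error, regardless of its P value.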
|
21
|
The use of parametric effect sizes in single study musculoskeletal physiotherapy research: A practical primer. Phys Ther Sport 2018; 32:87-97. [PMID: 29778828 DOI: 10.1016/j.ptsp.2018.05.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/13/2017] [Revised: 11/14/2017] [Accepted: 05/03/2018] [Indexed: 10/16/2022]
Abstract
Many researchers do not report effect sizes at all, and those who do often report the wrong measure for the design used in the research. With the increased attention being given to the reporting of effect sizes and their corresponding confidence intervals, it is important that there is field-specific literature on the calculation and reporting of these measures. This paper acts as a practical primer for the calculation and reporting of effect size measures aimed at, but not limited to, the field of musculoskeletal physiotherapy research. The primer discusses which effect sizes are appropriate for within- and between-subject single study research, illustrating through examples based on musculoskeletal research data how these measures are calculated, interpreted, and reported.
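For the between-subject case such a primer covers, the pooled-SD form of Cohen's d, with the Hedges small-sample correction, can be sketched as follows. This is a generic illustration under the usual equal-variance assumption, not code from the paper:

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d for two independent groups, using the pooled SD."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / pooled_var ** 0.5

def hedges_g(group1, group2):
    """Cohen's d with the approximate small-sample bias correction."""
    n = len(group1) + len(group2)
    return cohens_d(group1, group2) * (1 - 3 / (4 * n - 9))
```

The correction factor shrinks d toward zero, which matters most at the small sample sizes typical of single-study physiotherapy research.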
|
22
|
The value of facial attractiveness for encouraging fruit and vegetable consumption: analyses from a randomized controlled trial. BMC Public Health 2018; 18:298. [PMID: 29490640 PMCID: PMC5831823 DOI: 10.1186/s12889-018-5202-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/13/2017] [Accepted: 02/22/2018] [Indexed: 12/31/2022]
Abstract
BACKGROUND An effect of increased fruit and vegetable (FV) consumption on facial attractiveness has been proposed and recommended as a strategy to promote FV intakes, but no studies to date demonstrate a causal link between FV consumption and perceived attractiveness. This study investigated perceptions of attractiveness before and after the supervised consumption of 2, 5 or 8 FV portions/day for 4 weeks in 30 low FV consumers. Potential mechanisms for change via skin colour and perceived skin healthiness were also investigated. METHODS Faces were photographed at the start and end of the 4 week intervention in controlled conditions. Seventy-three independent individuals subsequently rated all 60 photographs in a randomized order, for facial attractiveness, facial skin yellowness, redness, healthiness, clarity, and symmetry. RESULTS Using clustered multiple regression, FV consumption over the previous 4 weeks had no direct effect on attractiveness, but, for female faces, some evidence was found for an indirect impact, via linear and non-linear changes in skin yellowness. Effect sizes, however, were small. No association between FV consumption and skin healthiness was found, but skin healthiness was associated with facial attractiveness. CONCLUSIONS Controlled and objectively measured increases in FV consumption for 4 weeks resulted indirectly in increased attractiveness in females via increases in skin yellowness, but effects are small and gradually taper as FV consumption increases. Based on the effect sizes from this study, we are hesitant to recommend the use of facial attractiveness to encourage increased FV consumption. TRIAL REGISTRATION Clinical trial Registration Number NCT01591057 ( www.clinicaltrials.gov ). Registered: 27th April, 2012.
|
23
|
Abstract
Elopement is a dangerous behavior that is emitted by a large proportion of individuals with intellectual and developmental disabilities. Functional analysis and function-based treatments are critical in identifying maintaining reinforcers and decreasing elopement. The purpose of this review was to identify recent trends in the functional analysis and treatment of elopement, as well as determine the efficacy (standardized mean differences) of recent treatments. Over half of subjects' elopement was maintained by social positive reinforcement, while only 25% of subjects' elopement was maintained by social negative reinforcement. Elopement was rarely maintained by automatic reinforcement, and none of the studies in the current review evaluated treatments to address automatically maintained elopement. Functional communication training was the most common intervention regardless of function. Results are discussed in terms of clinical implications and directions for future research.
|
24
|
A meta-analysis of single-case research on the use of tablet-mediated interventions for persons with ASD. Research in Developmental Disabilities 2017; 70:198-214. [PMID: 28964654 DOI: 10.1016/j.ridd.2017.09.013] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 10/17/2016] [Revised: 09/06/2017] [Accepted: 09/19/2017] [Indexed: 06/07/2023]
Abstract
BACKGROUND There is a growing amount of single-case research literature on the benefits of tablet-mediated interventions for individuals with autism spectrum disorder (ASD). With the development of tablet-based computers, tablet-mediated interventions have been widely utilized for education and treatment purposes; however, the overall quality and evidence of this literature-base are unknown. AIMS This article aims to present a quality review of the single-case experimental literature and aggregate results across studies involving the use of tablet-mediated interventions for individuals with ASD. METHODS AND PROCEDURES Using the Tau nonoverlap effect size measure, the authors extracted data from single-case experimental studies and calculated effect sizes differentiated by moderator variables. The moderator variables included the ages of participants, participants' diagnoses, interventions, outcome measures, settings, and contexts. OUTCOMES AND RESULTS Results indicate that tablet-mediated interventions for individuals with ASD have moderate to large effect sizes across the variables evaluated. The majority of research in this review used tablets for video modeling and augmentative and alternative communication. CONCLUSIONS AND IMPLICATIONS To promote the usability of tablet-mediated interventions for individuals with ASD, this review indicates that more single-case experimental studies should be conducted with this population in naturalistic home, community, and employment settings.
|
25
|
Out with .05, in with Replication and Measurement: Isolating and Working with the Particular Effect Sizes that are Troublesome for Inferential Statistics. The Journal of General Psychology 2017; 144:309-316. [PMID: 29023206 DOI: 10.1080/00221309.2017.1381496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
It is difficult to obtain adequate power to test a small effect size with a set criterion alpha of 0.05. Most often, an inferential test will indicate non-significance and the result will not be published. Occasionally, statistical significance will be obtained, and an exaggerated effect size will be calculated and reported. Accepting all inferential probabilities and their associated effect sizes could solve this exaggeration problem. Graphs, generated through Monte Carlo methods, are presented to illustrate this. The first graph presents effect sizes (Cohen's d) as lines from 1 to 0, with probabilities on the Y axis and the number of measures on the X axis; it shows that effect sizes of .5 or less should yield non-significance with sample sizes below 120 measures. The other graphs show results with as many as 10 small-sample-size replications. As sample size increases, the means converge on the effect size and measurement accuracy emerges.
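The Monte Carlo point can be reproduced with a short simulation. This sketch uses a large-sample critical value of 1.96 and a Welch-style test statistic, so it is an approximation of the kind of graphs described rather than a reconstruction of them:

```python
import math
import random

def simulated_power(d, n_per_group, sims=2000, crit=1.96, seed=1):
    """Approximate power of a two-sample test when the true effect size is d."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        g1 = [rng.gauss(d, 1) for _ in range(n_per_group)]
        g2 = [rng.gauss(0, 1) for _ in range(n_per_group)]
        m1, m2 = sum(g1) / n_per_group, sum(g2) / n_per_group
        v1 = sum((x - m1) ** 2 for x in g1) / (n_per_group - 1)
        v2 = sum((x - m2) ** 2 for x in g2) / (n_per_group - 1)
        t = (m1 - m2) / math.sqrt(v1 / n_per_group + v2 / n_per_group)
        rejections += abs(t) > crit  # count significant replications
    return rejections / sims
```

Consistent with the first graph, d = 0.5 is badly underpowered at small samples but well powered near 100 measures per group.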
|
26
|
Abstract
BACKGROUND Disagreements over genetic signatures associated with disease have been particularly prominent in the field of psychiatric genetics, creating a sharp divide between disease burdens attributed to common and rare variation, with study designs independently targeting each. Meta-analysis within each of these study designs is routine, whether using raw data or summary statistics, but combining results across study designs is atypical. However, tests of functional convergence are used across all study designs, where candidate gene sets are assessed for overlaps with previously known properties. This suggests one possible avenue for combining not study data, but the functional conclusions that they reach. METHOD In this work, we test for functional convergence in autism spectrum disorder (ASD) across different study types, and specifically whether the degree to which a gene is implicated in autism is correlated with the degree to which it drives functional convergence. Because different study designs are distinguishable by their differences in effect size, this also provides a unified means of incorporating the impact of study design into the analysis of convergence. RESULTS We detected remarkably significant positive trends in aggregate (p < 2.2e-16) with 14 individually significant properties (false discovery rate <0.01), many in areas researchers have targeted based on different reasoning, such as the fragile X mental retardation protein (FMRP) interactor enrichment (false discovery rate 0.003). We are also able to detect novel technical effects and we see that network enrichment from protein-protein interaction data is heavily confounded with study design, arising readily in control data. CONCLUSIONS We see a convergent functional signal for a subset of known and novel functions in ASD from all sources of genetic variation. Meta-analytic approaches explicitly accounting for different study designs can be adapted to other diseases to discover novel functional associations and increase statistical power.
|
27
|
Comparing effects: a reanalysis of two studies on season of birth bias in anorexia nervosa. J Eat Disord 2017; 5:2. [PMID: 28078085 PMCID: PMC5223376 DOI: 10.1186/s40337-016-0131-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2016] [Accepted: 12/07/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND Outcomes from studies on season of birth bias in eating disorders have been inconsistent. This inconsistency has been explained by differences in methodologies resulting in different types of effect sizes. The aim of the current study was to facilitate comparison by using the same methodology on samples from two studies with differing conclusions. METHODS The statistical analyses used in each study were applied to the samples from the other study, and the resulting effect sizes, Cramér's V and odds ratio (OR), were compared and discussed. RESULTS For both studies, the Cramér's Vs ranged between 0.03 and 0.08 and the ORs ranged between 0.85 and 1.31. According to common conventions, Cramér's Vs below 0.10 and ORs below 1.44 are considered small. CONCLUSION As a marker of one or more potential risk factors, the observed effects are considered small. When reanalysed to allow direct comparisons, studies with contrasting conclusions converge towards an absence of support for a season of birth bias in patients with AN.
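For a 2x2 season-by-diagnosis table, both effect sizes compared here reduce to simple closed forms (for 2x2 tables, Cramér's V equals the absolute phi coefficient). A sketch with hypothetical cell counts, not data from either reanalysed study:

```python
def cramers_v_2x2(a, b, c, d):
    """Cramér's V for a 2x2 table with cell counts a, b (row 1) and c, d (row 2).

    For 2x2 tables this equals |phi| = |ad - bc| divided by the square
    root of the product of the four marginal totals.
    """
    return abs(a * d - b * c) / (((a + b) * (c + d) * (a + c) * (b + d)) ** 0.5)

def odds_ratio(a, b, c, d):
    """Odds ratio (ad / bc) for the same 2x2 layout."""
    return (a * d) / (b * c)
```

Computing both statistics on the same table is what makes direct comparison across studies possible.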
|
28
|
Alternatives to P value: confidence interval and effect size. Korean J Anesthesiol 2016; 69:555-562. [PMID: 27924194 PMCID: PMC5133225 DOI: 10.4097/kjae.2016.69.6.555] [Citation(s) in RCA: 193] [Impact Index Per Article: 24.1] [Received: 08/11/2016] [Revised: 09/13/2016] [Accepted: 09/15/2016] [Indexed: 11/24/2022]
Abstract
The previous articles of the Statistical Round in the Korean Journal of Anesthesiology raised strong concerns about null hypothesis significance testing (NHST). P values lie at the core of NHST and are used to classify all treatments into two groups: those that "have a significant effect" and those that do not. NHST is frequently criticized for inviting misinterpretation of relationships, for its limitations in assessing practical importance, and for merely separating treatments that "have a significant effect" from others that do not. Effect sizes and CIs expand the approach to statistical thinking: these estimates help authors and readers discriminate between a multitude of treatment effects. This article illustrates the concept and estimating principles of effect sizes and CIs.
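The shift the article advocates, from a bare P value to an effect size reported with a confidence interval, can be sketched as follows. The normal-approximation standard error for d used here is a standard textbook formula, not one taken from the article:

```python
from statistics import mean, stdev

def d_with_ci(group1, group2, z=1.96):
    """Cohen's d with an approximate 95% confidence interval."""
    n1, n2 = len(group1), len(group2)
    pooled_sd = (((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)) ** 0.5
    d = (mean(group1) - mean(group2)) / pooled_sd
    # large-sample standard error of d
    se = ((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))) ** 0.5
    return d, d - z * se, d + z * se
```

A CI that excludes 0 conveys "significant" while also showing how large, and how precisely estimated, the effect actually is.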
|
29
|
The effects of video modeling in teaching functional living skills to persons with ASD: A meta-analysis of single-case studies. Research in Developmental Disabilities 2016; 57:158-169. [PMID: 27442687 DOI: 10.1016/j.ridd.2016.07.001] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 11/20/2015] [Revised: 06/02/2016] [Accepted: 07/04/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND Many individuals with autism spectrum disorders (ASD) show deficits in functional living skills, leading to low independence, limited community involvement, and poor quality of life. With development of mobile devices, utilizing video modeling has become more feasible for educators to promote functional living skills of individuals with ASD. AIMS This article aims to review the single-case experimental literature and aggregate results across studies involving the use of video modeling to improve functional living skills of individuals with ASD. METHODS AND PROCEDURES The authors extracted data from single-case experimental studies and evaluated them using the Tau-U effect size measure. Effects were also differentiated by categories of potential moderators and other variables, including age of participants, concomitant diagnoses, types of video modeling, and outcome measures. OUTCOMES AND RESULTS Results indicate that video modeling interventions are overall moderately effective with this population and dependent measures. While significant differences were not found between categories of moderators and other variables, effects were found to be at least moderate for most of them. CONCLUSIONS AND IMPLICATIONS It is apparent that more single-case experiments are needed in this area, particularly with preschool and secondary-school aged participants, participants with ASD-only and those with high-functioning ASD, and for video modeling interventions addressing community access skills.
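The Tau-U measure used in this meta-analysis extends simple nonoverlap between baseline (A) and intervention (B) phases by subtracting the baseline trend. A minimal sketch of the statistic for one A-B comparison, using hypothetical data points rather than values from the reviewed studies:

```python
def tau_u(phase_a, phase_b):
    """Tau-U for one A-B comparison: cross-phase nonoverlap minus baseline trend.

    pos/neg count intervention points above/below each baseline point;
    the trend term counts improving minus deteriorating pairs within
    the baseline itself, penalizing pre-existing improvement.
    """
    n_pairs = len(phase_a) * len(phase_b)
    pos = sum(1 for a in phase_a for b in phase_b if b > a)
    neg = sum(1 for a in phase_a for b in phase_b if b < a)
    trend = sum((a2 > a1) - (a2 < a1)
                for i, a1 in enumerate(phase_a)
                for a2 in phase_a[i + 1:])
    return (pos - neg - trend) / n_pairs
```

With a flat baseline, Tau-U equals the plain Tau nonoverlap; an improving baseline shrinks the score.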
|
30
|
Effect Sizes and Primary Outcomes in Large-Budget, Cardiovascular-Related Behavioral Randomized Controlled Trials Funded by NIH Since 1980. Ann Behav Med 2016; 50:130-46. [PMID: 26507906 PMCID: PMC4744141 DOI: 10.1007/s12160-015-9739-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
Abstract
PURPOSE We reviewed large-budget, National Institutes of Health (NIH)-supported randomized controlled trials (RCTs) with behavioral interventions to assess (1) publication rates, (2) trial registration, (3) use of objective measures, (4) significant behavior and physiological change, and (5) effect sizes. METHODS We identified large-budget grants (>$500,000/year) funded by NIH (National Heart Lung and Blood Institute (NHLBI) or National Institute of Diabetes & Digestive and Kidney Diseases (NIDDK)) for cardiovascular disease (dates January 1, 1980 to December 31, 2012). Among 106 grants that potentially met inclusion criteria, 20 studies were not published and 48 publications were excluded, leaving 38 publications for analysis. ClinicalTrials.gov abstracts were used to determine whether outcome measures had been pre-specified. RESULTS Three fourths of trials were registered in ClinicalTrials.gov and all published pre-specified outcomes. Twenty-six trials reported a behavioral outcome with 81 % reporting significant improvements for the target behavior. Thirty-two trials reported a physiological outcome. All were objectively measured, and 81 % reported significant benefit. Seventeen trials reported morbidity outcomes, and seven reported a significant benefit. Nine trials assessed mortality, and all were null for this outcome. CONCLUSIONS Behavioral trials complied with trial registration standards. Most reported a physiological benefit, but few documented morbidity or mortality benefits.
|
31
|
Abstract
BACKGROUND Traditional null hypothesis significance testing suffers many limitations and is poorly adapted to theory testing. PURPOSE A proposed alternative approach, called Testing Theory-based Quantitative Predictions, uses effect size estimates and confidence intervals to directly test predictions based on theory. METHOD This paper replicates findings from previous smoking studies and extends the approach to diet and sun protection behaviors using baseline data from a Transtheoretical Model behavioral intervention (N = 5407). Effect size predictions were developed using two methods: (1) applying refined effect size estimates from previous smoking research or (2) using predictions developed by an expert panel. RESULTS Thirteen of 15 predictions were confirmed for smoking. For diet, 7 of 14 predictions were confirmed using smoking predictions and 6 of 16 using expert panel predictions. For sun protection, 3 of 11 predictions were confirmed using smoking predictions and 5 of 19 using expert panel predictions. CONCLUSION Expert panel predictions and smoking-based predictions poorly predicted effect sizes for diet and sun protection constructs. Future studies should aim to use previous empirical data to generate predictions whenever possible. The best results occur when there have been several iterations of predictions for a behavior, such as with smoking, demonstrating that expected values begin to converge on the population effect size. Overall, the study supports the necessity of strengthening and revising theory with empirical data.
|
32
|
Recovery after brain damage: Is there any indication for generalization between different cognitive functions? J Clin Exp Neuropsychol 2015; 37:571-80. [PMID: 26059257 DOI: 10.1080/13803395.2015.1030358] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Abstract
INTRODUCTION The question whether recovery in various cognitive functions is supported by one or two more fundamental functions (for instance, attentional or working memory functions) is a long-standing problem of cognitive rehabilitation. One possibility to answer this question is to analyze the recovery pattern in different cognitive domains and to see whether improvement in one domain is related to performance in another domain. METHOD Ninety-two inpatients with stroke or other brain lesions (Barthel Index >75) were included. Neuropsychological assessment was done at the beginning and the end of a rehabilitation stay. Cognitive performance was analyzed at test and at domain level using conceptually and statistically defined composite scores for attention, immediate and delayed memory, working memory, prospective memory, and word fluency. We used regression analysis to look for generalization between cognitive domains. RESULTS Effect sizes of improvement varied widely (from d = 0.18 in attention to d = 1.36 in episodic memory). Age, gender, and time since injury had no impact on recovery. Impaired patients showed significantly more improvement than nonimpaired patients. Regression analysis revealed no effect of initial performance in one cognitive domain on improvements in other cognitive domains. CONCLUSION Significant recovery in impaired cognitive domains can be expected during neuropsychological rehabilitation. It depends more or less exclusively on improvement in the specific functions themselves, and there was no evidence for generalization between cognitive domains.
|
33
|
Size and consistency of problem-solving consultation outcomes: an empirical analysis. J Sch Psychol 2015; 53:161-78. [PMID: 25746825 DOI: 10.1016/j.jsp.2015.01.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 01/23/2013] [Revised: 01/17/2015] [Accepted: 01/24/2015] [Indexed: 12/14/2022]
Abstract
In this study, we analyzed extant data to evaluate the variability and magnitude of students' behavior change outcomes (academic, social, and behavioral) produced by consultants through problem-solving consultation with teachers. Research questions were twofold: (a) Do consultants produce consistent and sizeable positive student outcomes across their cases as measured through direct and frequent assessment? and (b) What proportion of variability in student outcomes is attributable to consultants? Extant data from problem-solving consultation outcome studies that used single-case, time-series AB designs with multiple participants were analyzed. Four such studies ultimately met the inclusion criteria, comprising 124 consultants who worked with 302 school teachers regarding 453 individual students. Consultants constituted the independent variable, while the primary dependent variable was a descriptive effect size based on student behavior change as measured by (a) curriculum-based measures, (b) permanent products, or (c) direct observations. Primary analyses involved visual and statistical evaluation of effect size magnitude and variability observed within and between consultants and studies. Given the nested nature of the data, multilevel analyses were used to assess consultant effects on student outcomes. Results suggest that consultants consistently produced positive effect sizes on average across their cases, but outcomes varied between consultants. Findings also indicated that consultants, teachers, and the corresponding studies accounted for a significant proportion of variability in student outcomes. This investigation advances the use of multilevel and integrative data analyses to evaluate consultation outcomes and extends research on problem-solving consultation, consultant effects, and meta-analysis of case study AB designs. Practical implications for evaluating consultation service delivery in school settings are also discussed.
|
34
|
Benchmarks for Expected Annual Academic Growth for Students in the Bottom Quartile of the Normative Distribution. Journal of Research on Educational Effectiveness 2015; 8:366-379. [PMID: 26726300 PMCID: PMC4696502 DOI: 10.1080/19345747.2014.952464] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 05/06/2023]
Abstract
Effect sizes are commonly reported for the results of educational interventions. However, researchers struggle with interpreting their magnitude in a way that transcends generic guidelines. Effect sizes can be interpreted in a meaningful context by benchmarking them against typical growth for students in the normative distribution. Such benchmarks are not currently available for students in the bottom quartile. This report remedies this by providing a comparative context for interventions involving these students. Annual growth effect sizes for K-12 students were computed from nationally normed assessments and a longitudinal study of students in special education. They reveal declining growth over time, especially for reading and math. These results allow researchers to better interpret the effects of their interventions and help practitioners by quantifying typical growth for struggling students. More longitudinal research is needed to show growth trajectories for students in the bottom quartile.
|
35
|
Estimating the size of treatment effects: moving beyond p values. Psychiatry (Edgmont) 2009; 6:21-29. [PMID: 20011465 PMCID: PMC2791668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
OBJECTIVE To increase understanding of effect size calculations among clinicians who over-rely on interpretations of P values in their assessment of the medical literature. DESIGN We review five methods of calculating effect sizes: Cohen's d (also known as the standardized mean difference)-used in studies that report efficacy in terms of a continuous measurement and calculated from two mean values and their standard deviations; relative risk-the ratio of patients responding to treatment divided by the ratio of patients responding to a different treatment (or placebo), which is particularly useful in prospective clinical trials to assess differences between treatments; odds ratio-used to interpret results of retrospective case-control studies and provide estimates of the risk of side effects by comparing the probability (odds) of an outcome occurring in the presence or absence of a specified condition; number needed to treat-the number of subjects one would expect to treat with agent A to have one more success (or one less failure) than if the same number were treated with agent B; and area under the curve (also known as the drug-placebo response curve)-a six-step process that can be used to assess the effects of medication on both worsening and improvement and the probability that a medication-treated subject will have a better outcome than a placebo-treated subject. CONCLUSION Effect size statistics provide a better estimate of treatment effects than P values alone.
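Three of the comparative measures in this list come straight from a 2x2 treatment-by-outcome table. A sketch with hypothetical counts (the helper name and example numbers are illustrative, not from the article):

```python
def treatment_effect_sizes(success_a, n_a, success_b, n_b):
    """Relative risk, odds ratio, and number needed to treat
    for treatment A versus treatment B (or placebo)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    relative_risk = p_a / p_b
    odds_ratio = (success_a / (n_a - success_a)) / (success_b / (n_b - success_b))
    nnt = 1 / (p_a - p_b)  # patients treated with A per extra success over B
    return relative_risk, odds_ratio, nnt
```

With 30/50 responders on A versus 20/50 on B, the risk difference is 0.2, so five patients must be treated with A to expect one additional success over B.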