1
D'Agostino McGowan L, Lotspeich SC, Hepler SA. The "Why" behind including "Y" in your imputation model. Stat Methods Med Res 2024:9622802241244608. [PMID: 38625810] [DOI: 10.1177/09622802241244608]
Abstract
Missing data is a common challenge when analyzing epidemiological data, and imputation is often used to address this issue. Here, we investigate the scenario where a covariate used in an analysis has missingness and will be imputed. There are recommendations to include the outcome from the analysis model in the imputation model for missing covariates, but it is not necessarily clear if this recommendation always holds and why this is sometimes true. We examine deterministic imputation (i.e. single imputation with fixed values) and stochastic imputation (i.e. single or multiple imputation with random values) methods and their implications for estimating the relationship between the imputed covariate and the outcome. We mathematically demonstrate that including the outcome variable in imputation models is not just a recommendation but a requirement to achieve unbiased results when using stochastic imputation methods. Moreover, we dispel common misconceptions about deterministic imputation models and demonstrate why the outcome should not be included in these models. This article aims to bridge the gap between imputation in theory and in practice, providing mathematical derivations to explain common statistical recommendations. We offer a better understanding of the considerations involved in imputing missing covariates and emphasize when it is necessary to include the outcome variable in the imputation model.
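The abstract's central claim, that stochastic imputation of a covariate must condition on the outcome to avoid bias, can be illustrated with a small simulation. This is an illustrative sketch with invented parameters and variable names, not the authors' code:

```python
import numpy as np

# Toy illustration: stochastic single imputation of a missing covariate X,
# with and without the outcome Y in the imputation model.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                # fully observed covariate
x = z + rng.normal(size=n)            # partially missing covariate
y = 1.0 * x + rng.normal(size=n)      # analysis model: true slope on X is 1
miss = rng.random(n) < 0.5            # 50% of X missing completely at random

def ols(design, response):
    return np.linalg.lstsq(design, response, rcond=None)[0]

def stochastic_impute(design):
    """Impute missing X from a linear model on `design`, adding a random residual."""
    coefs = ols(design[~miss], x[~miss])
    resid_sd = np.std(x[~miss] - design[~miss] @ coefs)
    x_imp = x.copy()
    x_imp[miss] = design[miss] @ coefs + rng.normal(scale=resid_sd, size=miss.sum())
    return x_imp

ones = np.ones(n)
x_imp_no_y = stochastic_impute(np.column_stack([ones, z]))       # Y excluded
x_imp_with_y = stochastic_impute(np.column_stack([ones, z, y]))  # Y included

def fitted_slope(x_imp):
    return ols(np.column_stack([ones, x_imp]), y)[1]

b_without = fitted_slope(x_imp_no_y)   # attenuated: imputed X carries no signal from Y
b_with = fitted_slope(x_imp_with_y)    # approximately unbiased
```

Because the imputed values drawn without Y are independent of Y given Z, the fitted slope is pulled toward zero, while conditioning on Y draws imputations from the correct joint distribution.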
Affiliation(s)
- Sarah C Lotspeich
- Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC, USA
- Staci A Hepler
- Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC, USA
2
Shih JH, Albert PS, Fine J, Liu D. An imputation approach for a time-to-event analysis subject to missing outcomes due to noncoverage in disease registries. Biostatistics 2023; 25:117-133. [PMID: 36534828] [PMCID: PMC10939403] [DOI: 10.1093/biostatistics/kxac049]
Abstract
Disease incidence data in a national-based cohort study would ideally be obtained through a national disease registry. Unfortunately, no such registry currently exists in the United States. Instead, the results from individual state registries need to be combined to ascertain certain disease diagnoses in the United States. The National Cancer Institute has initiated a program to assemble all state registries to provide a complete assessment of all cancers in the United States. Unfortunately, not all registries have agreed to participate. In this article, we develop an imputation-based approach that uses self-reported cancer diagnosis from longitudinally collected questionnaires to impute cancer incidence not covered by the combined registry. We propose a two-step procedure, where in the first step a mover-stayer model is used to impute a participant's registry coverage status when it is only reported at the time of the questionnaires given at 10-year intervals and the time of the last-alive vital status and death. In the second step, we propose a semiparametric working model, fit using an imputed coverage area sample identified from the mover-stayer model, to impute registry-based survival outcomes for participants in areas not covered by the registry. The simulation studies show the approach performs well as compared with alternative ad hoc approaches for dealing with this problem. We illustrate the methodology with an analysis that links the United States Radiologic Technologists study cohort with the combined registry that includes 32 of the 50 states.
Affiliation(s)
- Joanna H Shih
- Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, 9609 Medical Center Drive, Bethesda, MD 20892, USA
- Paul S Albert
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Drive, Bethesda, MD 20892, USA
- Jason Fine
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Drive, Bethesda, MD 20892, USA
- Danping Liu
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Drive, Bethesda, MD 20892, USA
3
Tao R, Lotspeich SC, Amorim G, Shaw PA, Shepherd BE. Efficient semiparametric inference for two-phase studies with outcome and covariate measurement errors. Stat Med 2021; 40:725-738. [PMID: 33145800] [PMCID: PMC8214478] [DOI: 10.1002/sim.8799]
Abstract
In modern observational studies using electronic health records or other routinely collected data, both the outcome and covariates of interest can be error-prone and their errors often correlated. A cost-effective solution is the two-phase design, under which the error-prone outcome and covariates are observed for all subjects during the first phase and that information is used to select a validation subsample for accurate measurements of these variables in the second phase. Previous research on two-phase measurement error problems largely focused on scenarios where there are errors in covariates only or the validation sample is a simple random sample of study subjects. Herein, we propose a semiparametric approach to general two-phase measurement error problems with a quantitative outcome, allowing for correlated errors in the outcome and covariates and arbitrary second-phase selection. We devise a computationally efficient and numerically stable expectation-maximization algorithm to maximize the nonparametric likelihood function. The resulting estimators possess desired statistical properties. We demonstrate the superiority of the proposed methods over existing approaches through extensive simulation studies, and we illustrate their use in an observational HIV study.
Affiliation(s)
- Ran Tao
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
- Sarah C. Lotspeich
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee
- Gustavo Amorim
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee
- Pamela A. Shaw
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Bryan E. Shepherd
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee
4
Shaw PA, He J, Shepherd BE. Regression calibration to correct correlated errors in outcome and exposure. Stat Med 2021; 40:271-286. [PMID: 33086428] [PMCID: PMC8670514] [DOI: 10.1002/sim.8773]
Abstract
Measurement error arises through a variety of mechanisms. A rich literature exists on the bias introduced by covariate measurement error and on methods of analysis to address this bias. By comparison, less attention has been given to errors in outcome assessment and nonclassical covariate measurement error. We consider an extension of the regression calibration method to settings with errors in a continuous outcome, where the errors may be correlated with prognostic covariates or with covariate measurement error. This method adjusts for the measurement error in the data and can be applied with either a validation subset, on which the true data are also observed (eg, a study audit), or a reliability subset, where a second observation of error-prone measurements is available. For each case, we provide conditions under which the proposed method is identifiable and leads to consistent estimates of the regression parameter. When the second measurement on the reliability subset has no error or classical unbiased measurement error, the proposed method is consistent even when the primary outcome and exposures of interest are subject to both systematic and random error. We examine the performance of the method with simulations for a variety of measurement error scenarios and sizes of the reliability subset. We illustrate the method's application using data from the Women's Health Initiative Dietary Modification Trial.
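The basic regression calibration idea that this paper extends can be sketched in a few lines. The sketch below handles only the simplest case, classical error in the exposure with a validation subset, not the correlated outcome-and-exposure errors the paper addresses; all parameter values are invented:

```python
import numpy as np

# Toy sketch of basic regression calibration with a validation subset.
rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(size=n)                          # true exposure
w = x + rng.normal(size=n)                      # error-prone measurement of X
y = 2.0 * x + rng.normal(size=n)                # true slope is 2
val = rng.random(n) < 0.25                      # validation subset observes true X

def slope(a, b):
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

naive = slope(w, y)                             # regressing Y on W: attenuated

# Calibration step: estimate E[X | W] from the validation subset ...
g = slope(w[val], x[val])
c = x[val].mean() - g * w[val].mean()
x_hat = c + g * w                               # ... and substitute X-hat for W
calibrated = slope(x_hat, y)                    # approximately recovers the true slope
```

Substituting the calibrated value E[X | W] for the error-prone W undoes the attenuation factor var(X)/var(W) that biases the naive regression.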
Affiliation(s)
- Pamela A. Shaw
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Jiwei He
- Office of Biostatistics, Office of Translational Sciences, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Bryan E. Shepherd
- Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee
5
Giganti MJ, Shepherd BE. Multiple-Imputation Variance Estimation in Studies With Missing or Misclassified Inclusion Criteria. Am J Epidemiol 2020; 189:1628-1632. [PMID: 32685964] [PMCID: PMC7705600] [DOI: 10.1093/aje/kwaa153]
Abstract
In observational studies using routinely collected data, a variable with a high level of missingness or misclassification may determine whether an observation is included in the analysis. In settings where inclusion criteria are assessed after imputation, the popular multiple-imputation variance estimator proposed by Rubin ("Rubin's rules" (RR)) is biased due to incompatibility between imputation and analysis models. While alternative approaches exist, most analysts are not familiar with them. Using partially validated data from a human immunodeficiency virus cohort, we illustrate the calculation of an imputation variance estimator proposed by Robins and Wang (RW) in a scenario where the study exclusion criteria are based on a variable that must be imputed. In this motivating example, the corresponding imputation variance estimate for the log odds was 29% smaller using the RW estimator than using the RR estimator. We further compared these 2 variance estimators with a simulation study which showed that coverage probabilities of 95% confidence intervals based on the RR estimator were too high and became worse as more observations were imputed and more subjects were excluded from the analysis. The RW imputation variance estimator performed much better and should be employed when there is incompatibility between imputation and analysis models. We provide analysis code to aid future analysts in implementing this method.
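For readers unfamiliar with the estimator being critiqued, Rubin's rules (RR) pool m completed-data analyses as sketched below. This is the standard RR formula only, not the Robins-Wang alternative the paper recommends, and not the authors' code:

```python
import numpy as np

def rubin_pool(estimates, within_variances):
    """Rubin's rules ("RR") pooling of m multiply-imputed estimates:
    the pooled point estimate is the mean of the m estimates, and the
    total variance is T = W + (1 + 1/m) * B, where W is the average
    within-imputation variance and B is the between-imputation
    (sample) variance of the point estimates."""
    q = np.asarray(estimates, dtype=float)
    w = np.asarray(within_variances, dtype=float)
    m = q.size
    qbar = q.mean()                  # pooled point estimate
    W = w.mean()                     # within-imputation variance
    B = q.var(ddof=1)                # between-imputation variance
    return qbar, W + (1 + 1 / m) * B # (estimate, total variance)
```

For example, `rubin_pool([1, 2, 3], [0.5, 0.5, 0.5])` pools to a point estimate of 2 with total variance 0.5 + (4/3)·1 ≈ 1.83. The paper's point is that when the imputation and analysis models are incompatible (for instance, when imputed values determine study inclusion), this T overstates the variance, and the Robins-Wang estimator should be used instead.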
Affiliation(s)
- Mark J Giganti
- Correspondence to Dr. Mark J. Giganti, Center for Biostatistics in AIDS Research, Harvard T.H. Chan School of Public Health, 651 Huntington Avenue, Boston, MA 02115
6
Shepherd BE, Shaw PA. Errors in multiple variables in human immunodeficiency virus (HIV) cohort and electronic health record data: statistical challenges and opportunities. Stat Commun Infect Dis 2020; 12:20190015. [PMID: 35880997] [PMCID: PMC9204761] [DOI: 10.1515/scid-2019-0015]
Abstract
Objectives: Observational data derived from patient electronic health records (EHR) data are increasingly used for human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS) research. There are challenges to using these data, in particular with regards to data quality; some are recognized, some unrecognized, and some recognized but ignored. There are great opportunities for the statistical community to improve inference by incorporating validation subsampling into analyses of EHR data. Methods: Methods to address measurement error, misclassification, and missing data are relevant, as are sampling designs such as two-phase sampling. However, many of the existing statistical methods for measurement error, for example, only address relatively simple settings, whereas the errors seen in these datasets span multiple variables (both predictors and outcomes), are correlated, and even affect who is included in the study. Results/Conclusion: We will discuss some preliminary methods in this area with a particular focus on time-to-event outcomes and outline areas of future research.
Affiliation(s)
- Bryan E. Shepherd
- Biostatistics, Vanderbilt University, 2525 West End, Suite 11000, Nashville, Tennessee 37203, USA
- Pamela A. Shaw
- Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
7
Shaw PA, Gustafson P, Carroll RJ, Deffner V, Dodd KW, Keogh RH, Kipnis V, Tooze JA, Wallace MP, Küchenhoff H, Freedman LS. STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 2-More complex methods of adjustment and advanced topics. Stat Med 2020; 39:2232-2263. [PMID: 32246531] [PMCID: PMC7272296] [DOI: 10.1002/sim.8531]
Abstract
We continue our review of issues related to measurement error and misclassification in epidemiology. We further describe methods of adjusting for biased estimation caused by measurement error in continuous covariates, covering likelihood methods, Bayesian methods, moment reconstruction, moment-adjusted imputation, and multiple imputation. We then describe which methods can also be used with misclassification of categorical covariates. Methods of adjusting estimation of distributions of continuous variables for measurement error are then reviewed. Illustrative examples are provided throughout these sections. We provide lists of available software for implementing these methods and also provide the code for implementing our examples in the Supporting Information. Next, we present several advanced topics, including data subject to both classical and Berkson error, modeling continuous exposures with measurement error, and categorical exposures with misclassification in the same model, variable selection when some of the variables are measured with error, adjusting analyses or design for error in an outcome variable, and categorizing continuous variables measured with error. Finally, we provide some advice for the often met situations where variables are known to be measured with substantial error, but there is only an external reference standard or partial (or no) information about the type or magnitude of the error.
Affiliation(s)
- Pamela A Shaw
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Paul Gustafson
- Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
- Raymond J Carroll
- Department of Statistics, Texas A&M University, College Station, Texas, USA
- School of Mathematical and Physical Sciences, University of Technology Sydney, Broadway, New South Wales, Australia
- Veronika Deffner
- Statistical Consulting Unit StaBLab, Department of Statistics, Ludwig-Maximilians-Universität, Munich, Germany
- Kevin W Dodd
- Biometry Research Group, Division of Cancer Prevention, National Cancer Institute, Bethesda, Maryland, USA
- Ruth H Keogh
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
- Victor Kipnis
- Biometry Research Group, Division of Cancer Prevention, National Cancer Institute, Bethesda, Maryland, USA
- Janet A Tooze
- Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
- Michael P Wallace
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Helmut Küchenhoff
- Statistical Consulting Unit StaBLab, Department of Statistics, Ludwig-Maximilians-Universität, Munich, Germany
- Laurence S Freedman
- Biostatistics and Biomathematics Unit, Gertner Institute for Epidemiology and Health Policy Research, Sheba Medical Center, Tel Hashomer, Israel
- Information Management Services Inc., Rockville, Maryland, USA
8
Giganti MJ, Shaw PA, Chen G, Bebawy SS, Turner MM, Sterling TR, Shepherd BE. Accounting for dependent errors in predictors and time-to-event outcomes using electronic health records, validation samples, and multiple imputation. Ann Appl Stat 2020; 14:1045-1061. [PMID: 32999698] [PMCID: PMC7523695] [DOI: 10.1214/20-aoas1343]
Abstract
Data from electronic health records (EHR) are prone to errors, which are often correlated across multiple variables. The error structure is further complicated when analysis variables are derived as functions of two or more error-prone variables. Such errors can substantially impact estimates, yet we are unaware of methods that simultaneously account for errors in covariates and time-to-event outcomes. Using EHR data from 4217 patients, the hazard ratio for an AIDS-defining event associated with a 100 cell/mm3 increase in CD4 count at ART initiation was 0.74 (95%CI: 0.68-0.80) using unvalidated data and 0.60 (95%CI: 0.53-0.68) using fully validated data. Our goal is to obtain unbiased and efficient estimates after validating a random subset of records. We propose fitting discrete failure time models to the validated subsample and then multiply imputing values for unvalidated records. We demonstrate how this approach simultaneously addresses dependent errors in predictors, time-to-event outcomes, and inclusion criteria. Using the fully validated dataset as a gold standard, we compare the mean squared error of our estimates with those from the unvalidated dataset and the corresponding subsample-only dataset for various subsample sizes. By incorporating reasonably sized validated subsamples and appropriate imputation models, our approach had improved estimation over both the naive analysis and the analysis using only the validation subsample.
Affiliation(s)
- Pamela A. Shaw
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania
- Guanhua Chen
- Department of Biostatistics and Medical Informatics, University of Wisconsin
9
Gustafson P, Karim ME. When exposure is subject to nondifferential misclassification, are validation data helpful in testing for an exposure–disease association? Can J Stat 2019. [DOI: 10.1002/cjs.11490]
Affiliation(s)
- Paul Gustafson
- Department of Statistics, University of British Columbia, Vancouver, Canada
- Mohammad Ehsanul Karim
- School of Population and Public Health, University of British Columbia, Vancouver, Canada
- Centre for Health Evaluation and Outcome Sciences, Providence Health Care, Vancouver, Canada
10
Shen Y, Cheng X, Zhang W. The fragility of randomized controlled trials in intracranial hemorrhage. Neurosurg Rev 2017. [DOI: 10.1007/s10143-017-0870-8]
11
Wang LE, Shaw PA, Mathelier HM, Kimmel SE, French B. Evaluating risk-prediction models using data from electronic health records. Ann Appl Stat 2016; 10:286-304. [PMID: 27158296] [DOI: 10.1214/15-aoas891]
Abstract
The availability of data from electronic health records facilitates the development and evaluation of risk-prediction models, but estimation of prediction accuracy could be limited by outcome misclassification, which can arise if events are not captured. We evaluate the robustness of prediction accuracy summaries, obtained from receiver operating characteristic curves and risk-reclassification methods, if events are not captured (i.e., "false negatives"). We derive estimators for sensitivity and specificity if misclassification is independent of marker values. In simulation studies, we quantify the potential for bias in prediction accuracy summaries if misclassification depends on marker values. We compare the accuracy of alternative prognostic models for 30-day all-cause hospital readmission among 4548 patients discharged from the University of Pennsylvania Health System with a primary diagnosis of heart failure. Simulation studies indicate that if misclassification depends on marker values, then the estimated accuracy improvement is also biased, but the direction of the bias depends on the direction of the association between markers and the probability of misclassification. In our application, 29% of the 1143 readmitted patients were readmitted to a hospital elsewhere in Pennsylvania, which reduced prediction accuracy. Outcome misclassification can result in erroneous conclusions regarding the accuracy of risk-prediction models.
Affiliation(s)
- L E Wang
- Department of Biostatistics and Epidemiology, University of Pennsylvania, 423 Guardian Drive, Philadelphia, Pennsylvania 19104, USA
- Pamela A Shaw
- Department of Biostatistics and Epidemiology, University of Pennsylvania, 423 Guardian Drive, Philadelphia, Pennsylvania 19104, USA
- Hansie M Mathelier
- Department of Medicine, University of Pennsylvania, 51 N 39th Street, Philadelphia, Pennsylvania 19104, USA
- Stephen E Kimmel
- Department of Biostatistics and Epidemiology, University of Pennsylvania, 423 Guardian Drive, Philadelphia, Pennsylvania 19104, USA
- Benjamin French
- Department of Biostatistics and Epidemiology, University of Pennsylvania, 423 Guardian Drive, Philadelphia, Pennsylvania 19104, USA
12
Kurland BF, Doot RK, Linden HM, Mankoff DA, Kinahan PE. Multicenter trials using ¹⁸F-fluorodeoxyglucose (FDG) PET to predict chemotherapy response: effects of differential measurement error and bias on power calculations for unselected and enrichment designs. Clin Trials 2013; 10:886-95. [PMID: 24169628] [DOI: 10.1177/1740774513506618]
Abstract
BACKGROUND: Clinical validation of a predictive biomarker is especially difficult when the biomarker cannot be assessed retrospectively. A cost-effective, prospective multicenter replication study with rapid accrual is warranted prior to further validation studies such as a marker-based strategy for treatment selection. However, it is often unknown how measurement error and bias in a multicenter trial will differ from that in single-institution studies. PURPOSE: Power calculations using simulated data may inform the efficient design of a multicenter study to replicate single-institution findings. This case study used serial standardized uptake value (SUV) measures from (18)F-fluorodeoxyglucose (FDG) positron emission tomography (PET) to predict early response to breast cancer neoadjuvant chemotherapy. We examined the impact of accelerating accrual through increased inclusion of secondary sites with greater levels of measurement error and bias. We also examined whether enrichment designs based on breast cancer initial uptake could increase the study power for a fixed budget (200 total scans). METHODS: Reference FDG PET SUV data were selected with replacement from a single-institution trial; pathologic complete response (pCR) data were simulated using a logistic regression model predicting response by mid-therapy percent change in SUV. The impact of increased error for SUV measurements in multicenter trials was simulated by sampling from error and bias distributions: 20%-40% measurement error, 0%-40% bias, and fixed error/bias values. The proportion of patients recruited from secondary sites (with higher additional error/bias compared to primary sites) varied from 25% to 75%. RESULTS: Reference power (from source data with no added error) was 0.92 for N = 100 to detect an association between percentage change in SUV and response. With moderate (20%) simulated measurement error for 3/4, 1/2, and 1/4 of measurements and 40% for the remainder, power was 0.70, 0.61, and 0.53, respectively. Reduction of study power was similar for other manifestations of measurement error (bias as a percentage of true value, absolute error, and absolute bias). Enrichment designs, which recruit additional patients by not conducting a second scan in patients with unsuitable pre-therapy uptake (low baseline SUV), did not lead to greater power for studies constrained to the same total cost. LIMITATIONS: Simulation parameters could be incorrect, or not generalizable. Under a different logistic regression model relating mid-therapy percent change in SUV to pCR (with no relationship for patients with low baseline SUV, rather than the modest point estimate from reference data), the enrichment design did have somewhat greater power than the unselected design. CONCLUSION: Even moderate additional measurement error substantially reduced study power under both unselected and enrichment designs.
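The mechanism driving the abstract's results, attenuation of study power by marker measurement error, can be sketched with a toy simulation. A continuous outcome and invented parameters stand in for the paper's logistic model of pCR and its actual PET data; this is illustrative only:

```python
import numpy as np

# Toy power simulation: measurement error in a predictive marker
# (e.g., percent change in SUV) reduces the power to detect a
# marker-response association at a fixed sample size.
rng = np.random.default_rng(2)

def sim_power(error_sd, n=100, n_sims=500, beta=0.3):
    """Fraction of simulated trials detecting the association at alpha = 0.05."""
    hits = 0
    for _ in range(n_sims):
        marker = rng.normal(size=n)                    # true marker value
        response = beta * marker + rng.normal(size=n)  # continuous stand-in outcome
        observed = marker + rng.normal(scale=error_sd, size=n)
        r = np.corrcoef(observed, response)[0, 1]
        t = r * np.sqrt((n - 2) / (1 - r**2))          # t-test of the correlation
        hits += abs(t) > 1.984                         # ~ t-quantile for df = 98
    return hits / n_sims

power_clean = sim_power(error_sd=0.0)   # no added measurement error
power_noisy = sim_power(error_sd=1.0)   # substantial added error: lower power
```

Adding noise to the observed marker shrinks its correlation with the response, so the same trial detects the association far less often, mirroring the power losses reported in the abstract.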