1. Optimal allocation strategies in platform trials with continuous endpoints. Stat Methods Med Res 2024; 33:858-874. [PMID: 38505941] [DOI: 10.1177/09622802241239008]
Abstract
Platform trials are randomized clinical trials that allow simultaneous comparison of multiple interventions, usually against a common control. Arms to test experimental interventions may enter and leave the platform over time, so the number of experimental intervention arms in the trial may change as the trial progresses. Determining optimal allocation rates for assigning patients to the treatment and control arms in platform trials is challenging because the optimal allocation depends on the number of arms in the platform, which typically varies over time. In addition, the optimal allocation depends on the analysis strategy used and the optimality criteria considered. In this article, we derive optimal treatment allocation rates for platform trials with shared controls, assuming that a stratified estimation and testing procedure based on a regression model is used to adjust for time trends. We consider both analyses using concurrent controls only and analyses using concurrent and non-concurrent controls, and assume that the total sample size is fixed. The objective function to be minimized is the maximum of the variances of the effect estimators. We show that the optimal solution depends on the entry times of the arms into the trial and, in general, does not correspond to the square-root-of-k allocation rule used in classical multi-arm trials. We illustrate the optimal allocation and, by means of a case study, evaluate the power and type 1 error rate relative to trials using one-to-one and square-root-of-k allocations.
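The square-root-of-k rule mentioned above can be made concrete with a short sketch (Python; stated for the simplified static setting of a classical multi-arm trial with equal outcome variances and a fixed total sample size, not the paper's time-trend-adjusted estimator):

```python
import math

def sqrt_k_allocation(k):
    """Square-root-of-k rule for a k-arm trial with a shared control:
    allocate sqrt(k) patients to control for every 1 patient per treatment arm.
    Returns (control share, per-treatment-arm share) of the total sample size."""
    denom = k + math.sqrt(k)
    return math.sqrt(k) / denom, 1.0 / denom

def max_contrast_variance(w_control, w_treat, n_total=1000, sigma=1.0):
    """Variance of a treatment-vs-control mean difference under a common
    outcome variance sigma^2 and allocation shares w_control, w_treat."""
    return sigma ** 2 * (1.0 / (w_control * n_total) + 1.0 / (w_treat * n_total))

k = 3
wc, wt = sqrt_k_allocation(k)
equal = 1.0 / (k + 1)  # one-to-one allocation across all k+1 arms
# In this static setting the sqrt(k) rule gives a smaller contrast variance
assert max_contrast_variance(wc, wt) < max_contrast_variance(equal, equal)
```

In a platform trial the number of active arms changes over time, which is exactly why the optimal rates derived in the paper depart from this static rule.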
2. A Bayesian analysis of mortality outcomes in multicentre clinical trials in critical care. Br J Anaesth 2021; 127:487-494. [PMID: 34275603] [DOI: 10.1016/j.bja.2021.06.026]
Abstract
BACKGROUND Multicentre RCTs are widely used by critical care researchers to answer important clinical questions, yet few trials evaluating mortality outcomes report statistically significant results. We hypothesised that the low proportion of trials reporting statistically significant differences in mortality is plausibly explained by lower-than-expected effect sizes combined with a low proportion of participants who could realistically benefit from the studied interventions. METHODS We reviewed multicentre trials in critical care published over a 10-yr period in the New England Journal of Medicine, the Journal of the American Medical Association, and the Lancet. To test our hypothesis, we analysed the results using a Bayesian model to investigate the relationship between the proportion of effective interventions and the proportion of statistically significant results under prior distributions for effect size and trial participant susceptibility. RESULTS Five of 54 trials (9.3%) reported a significant difference in mortality between the control and intervention groups. The median expected and observed differences in absolute mortality were 8.0% and 2.0%, respectively. Our modelling shows that, across trials, a lower-than-expected effect size combined with a low proportion of potentially susceptible participants is consistent with the observed proportion of trials reporting significant differences, even when most interventions are effective. CONCLUSIONS When designing clinical trials, researchers most likely overestimate true population effect sizes for critical care interventions. Bayesian modelling demonstrates that it is not necessarily the case that most studied interventions lack efficacy; it is plausible that many have clinically important effects that are missed.
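The gap between the expected (8.0%) and observed (2.0%) absolute mortality differences can be illustrated with a standard power approximation; the 30% baseline mortality and 1000 patients per arm below are hypothetical illustration values, not figures from the review:

```python
import math
from statistics import NormalDist

def power_two_proportions(p_control, p_treat, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in proportions."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    se = math.sqrt(p_control * (1 - p_control) / n_per_arm
                   + p_treat * (1 - p_treat) / n_per_arm)
    return nd.cdf(abs(p_control - p_treat) / se - z_alpha)

# Hypothetical trial with 30% control-arm mortality and 1000 patients per arm:
print(power_two_proportions(0.30, 0.22, 1000))  # 8% ARR (expected): well powered
print(power_two_proportions(0.30, 0.28, 1000))  # 2% ARR (observed): badly underpowered
```

A trial sized for an 8% absolute risk reduction has little chance of declaring significance when the true effect is 2%, even though a 2% mortality reduction is clinically important.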
3. How to use frailtypack for validating failure-time surrogate endpoints using individual patient data from meta-analyses of randomized controlled trials. PLoS One 2020; 15:e0228098. [PMID: 31990928] [PMCID: PMC6986733] [DOI: 10.1371/journal.pone.0228098]
Abstract
Background and Objective The use of valid surrogate endpoints can accelerate the development of phase III trials. Numerous validation methods have been proposed, the most popular of which are used in the context of meta-analyses and based on a two-step analysis strategy. For two failure-time endpoints, two association measures are usually considered: Kendall's τ at the individual level and the adjusted R² at the trial level (adj R²_trial). However, adj R²_trial is not always available, mainly owing to model estimation constraints. More recently, we proposed a one-step validation method based on a joint frailty model, with the aim of reducing estimation issues and bias in the surrogacy evaluation criteria. The model was quite robust, with satisfactory results in simulation studies. This study seeks to popularize this new surrogate endpoint validation approach by making the method available in a user-friendly R package. Methods We provide numerous tools in the frailtypack R package, including more flexible functions, for the validation of candidate surrogate endpoints using data from multiple randomized clinical trials. Results We implemented the surrogate threshold effect, which is used in combination with R²_trial to make decisions about the validity of surrogate endpoints. frailtypack also makes it possible to predict the treatment effect on the true endpoint in a new trial from the treatment effect observed on the surrogate endpoint. Leave-one-out cross-validation is available for assessing the accuracy of predictions from the joint surrogate model. Other tools include data generation, simulation studies and graphical representations. We illustrate the use of the new functions with both real and simulated data. Conclusion This article proposes attractive and well-developed tools for validating failure-time surrogate endpoints.
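At the individual level, surrogacy is summarised by Kendall's τ. A minimal sketch of the rank-correlation idea (Python; frailtypack itself is an R package and estimates τ from the joint frailty model while accounting for censoring, which this complete-data toy version does not):

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a for fully observed paired endpoints:
    (concordant pairs - discordant pairs) / (all pairs)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

assert kendall_tau_a([1, 2, 3, 4], [2, 4, 6, 8]) == 1.0   # perfect concordance
assert kendall_tau_a([1, 2, 3, 4], [8, 6, 4, 2]) == -1.0  # perfect discordance
```

A τ near 1 means that patients who fail earlier on the surrogate also tend to fail earlier on the true endpoint, one of the two conditions a useful surrogate must satisfy.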
4. Outcome reporting from clinical trials of non-valvular atrial fibrillation treated with traditional Chinese medicine or Western medicine: a systematic review. BMJ Open 2019; 9:e028803. [PMID: 31471437] [PMCID: PMC6720335] [DOI: 10.1136/bmjopen-2018-028803]
Abstract
OBJECTIVES To examine variation in outcomes, outcome measurement instruments (OMIs) and measurement times in clinical trials of non-valvular atrial fibrillation (NVAF) and to identify outcomes for prioritisation in developing a core outcome set (COS) in this field. DESIGN This study was a systematic review. DATA SOURCES Clinical trials published between January 2015 and March 2019 were obtained from PubMed, the Cochrane Library, Web of Science, Wanfang Database, the China National Knowledge Infrastructure and SinoMed. ELIGIBILITY CRITERIA Randomised controlled trials (RCTs) and observational studies were considered. Interventions included traditional Chinese medicine and Western medicine. The required treatment duration or follow-up time was ≥4 weeks. The required sample size per group was ≥30 in RCTs and ≥50 in observational studies. We excluded trials that aimed to investigate outcomes of complications of NVAF or to assess mechanisms or pharmacokinetics, and trials for which the full text could not be acquired. DATA EXTRACTION AND SYNTHESIS General information and the outcomes, OMIs and measurement times were extracted. Methodological and outcome reporting quality were assessed. The results were analysed by descriptive analysis. RESULTS A total of 218 articles were included from 25 255 articles. For clinical trials of antiarrhythmic therapy, 69 outcomes from 16 outcome domains were reported, and 28 (31.82%, 28/88) outcomes were reported only once; the most frequently reported outcome was the ultrasonic cardiogram. Definitions or OMIs were provided for 31 outcomes (44.93%, 31/69), and the number of outcome measurement times ranged from 1 to 20 with a median of 3. For clinical trials of anticoagulation therapy, 82 outcomes from 18 outcome domains were reported, and 38 (29.23%, 38/130) outcomes were reported only once; the most frequently reported outcome was ischaemic stroke. OMIs or definitions were provided for 40 (48.78%, 40/82) outcomes, and the number of outcome measurement times ranged from 1 to 27 with a median of 8. CONCLUSION Outcome reporting in NVAF trials is inconsistent. Thus, developing a COS that can be used in clinical trials is necessary.
5. Selection of Endpoints in Clinical Trials: Trends in European Marketing Authorization Practice in Oncological Indications. Value Health 2019; 22:884-890. [PMID: 31426929] [DOI: 10.1016/j.jval.2019.03.007]
Abstract
OBJECTIVES To determine the types of endpoints that formed the basis for efficacy assessment of medicines used in particular groups of oncological indications, and to examine changes in the endpoints applied in marketing authorization practice. METHODS The analysis included marketing authorization applications (MAAs) for medicines used in oncological indications that were first approved by the European Medicines Agency (EMA) between 2009 and 2017, together with extensions of the analyzed medicines. RESULTS The analysis covered 125 MAAs: first-time approvals (62%) and extensions (38%). In the analyzed trials, the endpoints reported most frequently were overall survival (OS), progression-free survival (PFS), and overall response rate (in 94.4%, 92.8%, and 87.2% of MAAs, respectively). Two trends were observed: decreased significance of OS as a primary endpoint and increased significance of PFS as a primary endpoint (in hematological indications). An analysis of MAAs for which the OS results were immature confirms the increased significance of PFS and of new efficacy indicators (ie, pathological complete response). CONCLUSIONS An analysis of the EMA's marketing authorization practice shows that the use of surrogate endpoints is becoming increasingly common in evaluating oncological health technologies. The EMA's guidelines underline the role played by surrogates in assessing the efficacy of new therapies. Clinical trial protocols increasingly define surrogates as primary endpoints, and marketing authorization can be granted even when only such clinical data are available.
6. Measuring Survival Benefit in Health Technology Assessment in the Presence of Nonproportional Hazards. Value Health 2019; 22:431-438. [PMID: 30975394] [DOI: 10.1016/j.jval.2019.01.005]
Abstract
BACKGROUND Proportional hazards (PH) is an assumption often made by researchers, despite evidence of nonproportionality in a significant proportion of clinical trials. In the presence of non-PH, the interpretation of hazard ratios, medians, and landmark survival as summary measures of treatment effect can become problematic. Several recent studies have recommended restricted mean survival time (RMST) as an alternative metric for survival analysis, particularly where non-PH may apply. OBJECTIVES To determine the current approaches of health technology assessment (HTA) agencies to value assessment in the presence of non-PH, and the extent to which RMST is accepted as an alternative measure of treatment benefit. METHODS Methodological guidelines published by 10 HTA agencies were reviewed to establish recommended approaches for presenting survival benefit from clinical trials. Published HTA reports for 23 oncology agents approved by the US Food and Drug Administration and the European Medicines Agency since 2014 were reviewed to determine how guidelines are implemented in practice and identify instances where the PH assumption was tested and RMST analyses reported. RESULTS Testing for non-PH is not widely incorporated into HTA except by the UK National Institute for Health and Care Excellence. RMST is used infrequently but has been used in a number of countries, particularly by agencies that focus on cost effectiveness. CONCLUSIONS HTA agencies vary in their approaches to non-PH. Most do not routinely check the PH assumption. RMST has played a role in assessing clinical benefit within HTA, although not consistently within countries (across drugs) or across countries (for the same drug).
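As a reminder of what RMST measures: it is the area under the Kaplan-Meier curve up to a prespecified horizon τ, which is why it stays interpretable when hazards are nonproportional. A minimal sketch (Python; ignores ties between event and censoring times):

```python
def rmst(times, events, tau):
    """Restricted mean survival time: area under the Kaplan-Meier curve up
    to tau. `events` holds 1 for an observed event and 0 for censoring."""
    n_at_risk = len(times)
    surv, area, prev_t = 1.0, 0.0, 0.0
    for t, event in sorted(zip(times, events)):
        if t > tau:
            break
        area += surv * (t - prev_t)  # accumulate area of the current step
        prev_t = t
        if event:
            surv *= 1.0 - 1.0 / n_at_risk  # Kaplan-Meier step down
        n_at_risk -= 1
    return area + surv * (tau - prev_t)  # final partial step up to tau

# Sanity check: with no censoring and tau at the last event time,
# RMST equals the sample mean survival time
assert abs(rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4) - 2.5) < 1e-9
```

Comparing RMST between arms yields a difference in mean survival time up to τ, an absolute measure that does not rely on the proportional hazards assumption.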
7. Application of Bayesian analyses to doubly randomized delayed start, matched control designs to demonstrate disease modification. Pharm Stat 2018; 18:22-38. [PMID: 30221459] [DOI: 10.1002/pst.1905]
Abstract
Disease modification is a primary therapeutic aim when developing treatments for most chronic progressive diseases. The best treatments do not simply affect disease symptoms but fundamentally improve disease course by slowing, halting, or reversing disease progression. One of many challenges for establishing disease modification relates to the identification of adequate analytic tools to show differences in a disease course following intervention. Traditional approaches rely on comparisons of slopes or noninferiority margins. However, it has proven difficult to conclusively demonstrate disease modification using such approaches. To address these challenges, we propose a novel adaptation of the delayed start study design that incorporates posterior probabilities identified by hierarchical Bayesian inference approaches to establish evidence for disease modification. Our models compare the size of treatment differences at the end of the delayed start period with those at the end of the early start period. Simulations that compare several models are provided. These include general linear models, repeated measures models, spline models, and model averaging. Our work supports the superiority of model averaging for accurately characterizing complex data that arise in real-world applications. This novel approach has been applied to the design of an ongoing, doubly randomized, matched control study that aims to show disease modification in young persons with schizophrenia (the Disease Recovery Evaluation and Modification (DREaM) study). The application of this Bayesian methodology to the DREaM study highlights the value of this approach and demonstrates many practical challenges that must be addressed when implementing this methodology in a real-world trial.
8. Randomised feasibility trial to compare three standard of care chemotherapy regimens for early stage triple-negative breast cancer (REaCT-TNBC trial). PLoS One 2018; 13:e0199297. [PMID: 30040817] [PMCID: PMC6057636] [DOI: 10.1371/journal.pone.0199297]
Abstract
INTRODUCTION Despite the importance of chemotherapy in the treatment of early stage triple negative breast cancer (TNBC), no single optimal regimen has been identified. We conducted a pilot trial comparing outcomes for the three most commonly used chemotherapy regimens to assess the feasibility of conducting a larger definitive trial. METHODS Using integrated consent, newly diagnosed TNBC patients were randomised to one of three standard regimens: dose-dense doxorubicin-cyclophosphamide then paclitaxel, doxorubicin-cyclophosphamide then weekly paclitaxel, or 5-FU-epirubicin-cyclophosphamide then docetaxel. Feasibility endpoints included: physician engagement, accrual rates, physician compliance and patient satisfaction with the integrated consent model. Our anticipated pilot trial sample size was 35 patients randomised in one year. RESULTS Between August 30th, 2016 and January 31st, 2017, 2 patients met eligibility criteria and were randomised. A survey of 10 participating oncologists was performed to identify potential strategies to enhance accrual. Most investigators (9/10) believed that the best regimen for TNBC was unknown, and 4/10 felt this was a pressing clinical question. Physicians' responses suggested that poor accrual was due to a lack of interest in some study arms, as oncologists already had a preferred regimen (4/10), and to concerns about trial demands in busy clinics (3/10). The pilot feasibility endpoints were not met and the study was closed. CONCLUSIONS Despite initial interest in the trial question and multiple investigators agreeing to approach patients, this trial failed to meet its feasibility endpoints. The reasons for poor accrual were multiple and require further evaluation if this important patient-centred question is to be answered. TRIAL REGISTRATION ClinicalTrials.gov NCT02688803.
9. Statistical controversies in clinical research: building the bridge to phase II-efficacy estimation in dose-expansion cohorts. Ann Oncol 2017; 28:1427-1435. [PMID: 28200082] [PMCID: PMC5834117] [DOI: 10.1093/annonc/mdx045]
Abstract
BACKGROUND Regulatory agencies and others have expressed concern about the uncritical use of dose-expansion cohorts (DECs) in phase I oncology trials. Nonetheless, by several metrics (prevalence, size, and number) their popularity is increasing. Although early efficacy estimation in defined populations is a common primary endpoint of DECs, the types of designs best equipped to identify efficacy signals have not been established. METHODS We conducted a simulation study of six phase I design templates with multiple DECs: three dose-assignment/adjustment mechanisms crossed with two analytic approaches for estimating efficacy after the trial is complete. We also investigated the effect of sample size and interim futility analysis on trial performance. Identifying populations in which the treatment is efficacious (true positives) and weeding out inefficacious treatment/populations (true negatives) are competing goals in these trials. Thus, we estimated true and false positive rates for each design. RESULTS Adaptively updating the maximum tolerated dose (MTD) during the DEC improved true positive rates by 8-43% compared with fixing the dose during the DEC phase, while maintaining false positive rates. Inclusion of an interim futility analysis decreased the number of patients treated under inefficacious DECs without hurting performance. CONCLUSION A substantial gain in efficiency is obtainable using a design template that statistically models toxicity and efficacy against dose level during expansion. Design choices for dose expansion should be motivated by and based upon expected performance. As is common practice in single-arm phase II trials, cohort sample sizes should be justified with respect to their primary aim and include interim analyses to allow for early stopping.
10. Why clinical trial outcomes fail to translate into benefits for patients. Trials 2017. [PMID: 28288676] [DOI: 10.1186/s13063-017-1870-2]
Abstract
Clinical research should ultimately improve patient care. For this to be possible, trials must evaluate outcomes that genuinely reflect real-world settings and concerns. However, many trials continue to measure and report outcomes that fall short of this clear requirement. We highlight problems with trial outcomes that make evidence difficult or impossible to interpret and that undermine the translation of research into practice and policy. These complex issues include the use of surrogate, composite and subjective endpoints; a failure to take account of patients' perspectives when designing research outcomes; publication and other outcome reporting biases, including the under-reporting of adverse events; the reporting of relative measures at the expense of more informative absolute outcomes; misleading reporting; multiplicity of outcomes; and a lack of core outcome sets. Trial outcomes can be developed with patients in mind, however, and can be reported completely, transparently and competently. Clinicians, patients, researchers and those who pay for health services are entitled to demand reliable evidence demonstrating whether interventions improve patient-relevant clinical outcomes.
11. Beyond Composite Endpoints Analysis: Semicompeting Risks as an Underutilized Framework for Cancer Research. J Natl Cancer Inst 2016; 108:djw154. [PMID: 27381741] [PMCID: PMC5241896] [DOI: 10.1093/jnci/djw154]
Abstract
BACKGROUND Composite endpoints (CEP), such as progression-free survival, are commonly used in cancer research. Notwithstanding their popularity, however, CEP analyses suffer from a number of drawbacks, especially when death is combined with a nonterminal event (ie, progression or recurrence), exemplifying the semicompeting risks setting. We investigated the semicompeting risks framework as a complementary analysis strategy that avoids certain drawbacks of CEPs. METHODS The illness-death model under the semicompeting risks framework was compared with standard analysis approaches: CEP analyses and (separate) univariate analyses for each component endpoint. Data from a previously published phase III randomized clinical trial in metastatic colon cancer including 1419 participants in the N9741 trial (conducted between 1997 and 2003) were used to determine the impact of the loss of information associated with combining multiple endpoints, as well as of ignoring the potentially informative role of death. A simulation study was conducted to further explore these issues. RESULTS Failure to account for critical features of semicompeting risks data can lead to potentially severely misleading conclusions. Advantages of semicompeting risks analyses include a clear delineation of treatment effects on both events, the ability to draw conclusions about a patient's joint risk of the two events, and an assessment of the dependence between the two event types. CONCLUSIONS Embedding and analyzing component outcomes in the semicompeting risks framework, either as a supplement or alternative to CEP analyses, represents an important, underutilized, and feasible opportunity for cancer research.
12.
Abstract
This review covers a number of the many design and analytic issues associated with clinical trials that incorporate patient reported outcomes as primary or secondary endpoints. We use a clinical trial designed to evaluate a new therapy for the prevention of migraines to illustrate how endpoints are defined by the objectives of the study, the methods for handling longitudinal assessments with multiple scales or outcomes, and the methods of analysis in the presence of missing data.
13. Semi-Competing Risks Data Analysis: Accounting for Death as a Competing Risk When the Outcome of Interest Is Nonterminal. Circ Cardiovasc Qual Outcomes 2016; 9:322-31. [PMID: 27072677] [PMCID: PMC4871755] [DOI: 10.1161/circoutcomes.115.001841]
Abstract
Hospital readmission is a key marker of quality of health care. Notwithstanding its widespread use, however, it remains controversial in part because statistical methods used to analyze readmission, primarily logistic regression and related models, may not appropriately account for patients who die before experiencing a readmission event within the time frame of interest. Toward resolving this, we describe and illustrate the semi-competing risks framework, which refers to the general setting where scientific interest lies with some nonterminal event (eg, readmission), the occurrence of which is subject to a terminal event (eg, death). Although several statistical analysis methods have been proposed for semi-competing risks data, we describe in detail the use of illness-death models primarily because of their relation to well-known methods for survival analysis and the availability of software. We also describe and consider in detail several existing approaches that could, in principle, be used to analyze semi-competing risks data, including composite end point and competing risks analyses. Throughout we illustrate the ideas and methods using data on N=49 763 Medicare beneficiaries hospitalized between 2011 and 2013 with a principal discharge diagnosis of heart failure.
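The structure of semi-competing risks data can be sketched with a small Monte Carlo simulation of the illness-death model; the constant (Markov) transition hazards below are hypothetical values chosen only for illustration:

```python
import random

def simulate_illness_death(n, h_readmit, h_death, h_death_post, horizon, seed=42):
    """Monte Carlo sketch of a Markov illness-death model with constant hazards:
    healthy -> readmitted (nonterminal), healthy -> dead, readmitted -> dead.
    Death before readmission precludes the nonterminal event, which is what
    makes the data semi-competing rather than ordinary competing risks."""
    rng = random.Random(seed)
    counts = {"death_first": 0, "readmit_then_death": 0,
              "readmit_only": 0, "event_free": 0}
    for _ in range(n):
        t_readmit = rng.expovariate(h_readmit)
        t_death = rng.expovariate(h_death)
        if t_death <= t_readmit:
            counts["death_first" if t_death <= horizon else "event_free"] += 1
        elif t_readmit > horizon:
            counts["event_free"] += 1
        elif t_readmit + rng.expovariate(h_death_post) <= horizon:
            counts["readmit_then_death"] += 1
        else:
            counts["readmit_only"] += 1
    return counts

counts = simulate_illness_death(10_000, h_readmit=0.5, h_death=0.2,
                                h_death_post=0.8, horizon=1.0)
assert sum(counts.values()) == 10_000
```

A logistic regression on "readmitted within one year" would lump "death_first" patients together with genuinely event-free ones; the illness-death model keeps the two transitions, and the dependence between them, distinct.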
14. Twenty-five years of confirmatory adaptive designs: opportunities and pitfalls. Stat Med 2016; 35:325-47. [PMID: 25778935] [PMCID: PMC6680191] [DOI: 10.1002/sim.6472]
Abstract
'Multistage testing with adaptive designs' was the title of an article by Peter Bauer that appeared in 1989 in the German journal Biometrie und Informatik in Medizin und Biologie. The journal no longer exists, but the methodology has found widespread interest in the scientific community over the past 25 years. The use of such multistage adaptive designs sparked controversy from the beginning, especially after the publication by Bauer and Köhne in 1994 in Biometrics: broad enthusiasm about potential applications of such designs was met with criticism of their statistical efficiency. Despite, or possibly because of, this controversy, the methodology and its areas of application grew steadily over the years, with significant contributions from statisticians working in academia, industry and agencies around the world. Such adaptive designs have since become the subject of two major regulatory guidance documents in the US and Europe, and the field is still evolving. Developments are particularly noteworthy in the most important applications of adaptive designs, including sample size reassessment, treatment selection procedures, and population enrichment designs. In this article, we summarize the developments of the past 25 years from different perspectives. We provide a historical overview of the early days, review the key methodological concepts and summarize regulatory and industry perspectives on such designs. We then illustrate the application of adaptive designs with three case studies, covering unblinded sample size reassessment, adaptive treatment selection, and adaptive endpoint selection. We also discuss the availability of software for evaluating and performing such designs. We conclude with a critical review of whether the expectations from the beginning were fulfilled and, if not, discuss potential reasons why.
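The core of the Bauer-Köhne methodology is a combination test: stage-wise p-values computed from disjoint cohorts are combined by a rule whose null distribution is fixed in advance, so the overall type I error is controlled even when the second stage is adapted after the interim look. A sketch of Fisher's product rule for two stages (Python; the closed form uses the fact that -2·ln(p1·p2) is chi-square with 4 df under H0):

```python
import math

def fisher_combination_p(p1, p2):
    """Combined p-value of Fisher's product test for two stage-wise p-values.
    Under H0, -2*ln(p1*p2) ~ chi-square(4), whose survival function at x is
    exp(-x/2)*(1 + x/2), giving the closed form q*(1 - ln q) with q = p1*p2."""
    q = p1 * p2
    return q * (1.0 - math.log(q))

# Two moderately significant stages combine to a clearly significant result
assert fisher_combination_p(0.05, 0.05) < 0.05
```

Because only the combination rule is fixed, the second-stage design (sample size, even the tested population) can react to the interim data without inflating the error rate, which is the controversial flexibility the article reviews.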
15. Flexible selection of a single treatment incorporating short-term endpoint information in a phase II/III clinical trial. Stat Med 2015; 34:3104-15. [PMID: 26112909] [PMCID: PMC4745001] [DOI: 10.1002/sim.6567]
Abstract
Seamless phase II/III clinical trials in which an experimental treatment is selected at an interim analysis have been the focus of much recent research interest. Many of the methods proposed are based on the group sequential approach. This paper considers designs of this type in which the treatment selection can be based on short-term endpoint information for more patients than have primary endpoint data available. We show that in such a case, the familywise type I error rate may be inflated if previously proposed group sequential methods are used and the treatment selection rule is not specified in advance. A method is proposed to avoid this inflation by considering the treatment selection that maximises the conditional error given the data available at the interim analysis. A simulation study is reported that illustrates the type I error rate inflation and compares the power of the new approach with two other methods: a combination testing approach and a group sequential method that does not use the short-term endpoint data, both of which also strongly control the type I error rate. The new method is also illustrated through application to a study in Alzheimer's disease.
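The conditional-error idea underlying the proposed method can be sketched for the simplest case, a two-stage single-hypothesis z-test with independent increments (Python; a simplified illustration of the quantity being maximised, not the paper's treatment-selection procedure):

```python
import math
from statistics import NormalDist

def conditional_error(z1, c_final, info_frac):
    """Conditional probability of crossing the final boundary c_final, given the
    interim z-statistic z1, for a z-test with independent increments:
    Z_final = sqrt(t)*Z1 + sqrt(1-t)*Z_inc, where t = info_frac at the interim."""
    nd = NormalDist()
    t = info_frac
    return 1.0 - nd.cdf((c_final - math.sqrt(t) * z1) / math.sqrt(1.0 - t))

# A stronger interim signal leaves a larger conditional error to spend
assert conditional_error(3.0, 1.96, 0.5) > conditional_error(1.0, 1.96, 0.5)
```

Bounding the post-selection rejection probability by the largest conditional error over all selections that could have been made is what protects the familywise error rate when the selection rule is not prespecified.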
16.
17. Specifying the target difference in the primary outcome for a randomised controlled trial: guidance for researchers. Trials 2015; 16:12. [PMID: 25928502] [PMCID: PMC4302137] [DOI: 10.1186/s13063-014-0526-8]
Abstract
BACKGROUND Central to the design of a randomised controlled trial is the calculation of the number of participants needed. This is typically achieved by specifying a target difference and calculating the corresponding sample size, which provides reassurance that the trial will have the required statistical power (at the planned statistical significance level) to identify whether a difference of a particular magnitude exists. Beyond pure statistical or scientific concerns, it is ethically imperative that an appropriate number of participants should be recruited. Despite the critical role of the target difference for the primary outcome in the design of randomised controlled trials, its determination has received surprisingly little attention. This article provides guidance on the specification of the target difference for the primary outcome in a sample size calculation for a two parallel group randomised controlled trial with a superiority question. METHODS This work was part of the DELTA (Difference ELicitation in TriAls) project. Draft guidance was developed by the project steering and advisory groups utilising the results of the systematic review and surveys. Findings were circulated and presented to members of the combined group at a face-to-face meeting, along with a proposed outline of the guidance document structure, containing recommendations and reporting items for a trial protocol and report. The guidance and was subsequently drafted and circulated for further comment before finalisation. RESULTS Guidance on specification of a target difference in the primary outcome for a two group parallel randomised controlled trial was produced. Additionally, a list of reporting items for protocols and trial reports was generated. CONCLUSIONS Specification of the target difference for the primary outcome is a key component of a randomized controlled trial sample size calculation. 
There is a need for better justification of the target difference and reporting of its specification.
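As context for this guidance, the arithmetic linking a target difference to the number of participants is the standard two-sample normal approximation. The sketch below is not part of the DELTA guidance itself; it assumes a continuous outcome with known standard deviation, equal allocation, and a two-sided test:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(target_diff, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-arm parallel superiority trial
    with a continuous outcome (two-sided test, equal allocation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sd / target_diff) ** 2
    return ceil(n)

# Halving the target difference quadruples the required sample size.
print(sample_size_per_group(target_diff=0.5, sd=1.0))   # 63 per group
print(sample_size_per_group(target_diff=0.25, sd=1.0))  # 252 per group
```

At alpha = 0.05 and 80% power, a standardised target difference of 0.5 needs 63 participants per group while 0.25 needs 252, which is why the choice of target difference dominates the calculation and deserves the explicit justification the guidance calls for.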
Collapse
|
18
|
Flexible stopping boundaries when changing primary endpoints after unblinded interim analyses. J Biopharm Stat 2014; 24:817-33. [PMID: 24697500 PMCID: PMC4024106 DOI: 10.1080/10543406.2014.901341] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2013] [Accepted: 05/04/2013] [Indexed: 10/25/2022]
Abstract
It has been widely recognized that interim analyses of accumulating data in a clinical trial can inflate the type I error. Different methods, from group sequential boundaries to flexible alpha spending functions, have been developed to control the overall type I error at a prespecified level. These methods mainly apply to testing the same endpoint in multiple interim analyses. In this article, we consider a group sequential design with preplanned endpoint switching after unblinded interim analyses. We extend the alpha spending function method to group sequential stopping boundaries when the parameters can differ between interim analyses, or between interim and final analyses.
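For readers unfamiliar with alpha spending, a minimal sketch of one commonly used spending function, the Lan-DeMets O'Brien-Fleming-type function, is below. This is the building block the article extends, not the proposed extension itself:

```python
from statistics import NormalDist

def obf_spending(t, alpha=0.05):
    """Lan-DeMets O'Brien-Fleming-type alpha spending function:
    cumulative type I error allowed to be spent by information fraction t."""
    z = NormalDist()
    z_half = z.inv_cdf(1 - alpha / 2)
    return 2 * (1 - z.cdf(z_half / t ** 0.5))

for t in (0.25, 0.5, 0.75, 1.0):
    print(f"t={t:.2f}  cumulative alpha spent={obf_spending(t):.5f}")
```

The function spends almost no alpha early and exactly alpha by the final analysis (t = 1); the article generalizes such boundaries to settings where the endpoint tested can change between analyses.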
Collapse
|
19
|
[Impacts that dimethoate inhibited the benchmark dose of acetylcholinesterase based on experimental designs]. WEI SHENG YAN JIU = JOURNAL OF HYGIENE RESEARCH 2013; 42:999-1003. [PMID: 24459918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
OBJECTIVE To assess the impact of experimental design on the benchmark dose (BMD), and to test against experimental data the computer-simulation finding implemented in Slob's software (namely, that for a fixed sample size the optimal way to calculate the BMD is to add dose groups while reducing the number of animals per group), so that this method can be applied more widely in the future. METHODS Eighty adult female SD rats were given dimethoate by gavage at 0.5, 1, 2, 4, 8, 16 and 32 mg/kg, respectively, for 21 d. Rats were then sacrificed, and acetylcholinesterase (AChE) activity in the hippocampus, cerebral cortex and serum was determined. The software package PROAST28.1 was then used to calculate the BMD. A 4 x 10 design (four dose groups of 10 animals) and an 8 x 5 design were selected from the full 8 x 10 design to study the impact of experimental design on the BMD. RESULTS Compared with the normal control, a significant decline of AChE in the hippocampus was observed in the 2, 4, 8, 16 and 32 mg/kg groups (P < 0.05), whereas a significant decrease was obtained in the 0.5, 1, 2, 4, 8, 16 and 32 mg/kg groups (P < 0.05). Taking the 8 x 10 design as the standard, the BMD confidence intervals calculated from both the 4 x 10 and the 8 x 5 designs covered the BMD from the 8 x 10 design. In addition, the BMD confidence intervals from schemes 1, 2, 3, 4 and 6 of the 4 x 10 design were wider than that of the 8 x 5 design, whereas scheme 5 gave a narrower interval. CONCLUSION Adding dose groups for a fixed sample size was the optimal way to calculate the BMD, in contrast to the common toxicity experimental design (e.g., four groups: control, low-dose, moderate-dose and high-dose).
Collapse
|
20
|
Efficient estimation of the distribution of time to composite endpoint when some endpoints are only partially observed. LIFETIME DATA ANALYSIS 2013; 19:513-546. [PMID: 23722304 PMCID: PMC3982403 DOI: 10.1007/s10985-013-9261-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/30/2012] [Accepted: 05/10/2013] [Indexed: 06/02/2023]
Abstract
Two common features of clinical trials, and other longitudinal studies, are (1) a primary interest in composite endpoints, and (2) the problem of subjects withdrawing prematurely from the study. In some settings, withdrawal may only affect observation of some components of the composite endpoint, for example when another component is death, information on which may be available from a national registry. In this paper, we use the theory of augmented inverse probability weighted estimating equations to show how such partial information on the composite endpoint for subjects who withdraw from the study can be incorporated in a principled way into the estimation of the distribution of time to composite endpoint, typically leading to increased efficiency without relying on additional assumptions above those that would be made by standard approaches. We describe our proposed approach theoretically, and demonstrate its properties in a simulation study.
Collapse
|
21
|
Reply: Correlation between clinical outcomes and appropriateness grading for referral to myocardial perfusion imaging for preoperative evaluation prior to non-cardiac surgery. J Nucl Cardiol 2013; 20:654. [PMID: 23475439 DOI: 10.1007/s12350-013-9700-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2013] [Accepted: 02/18/2013] [Indexed: 11/30/2022]
|
22
|
Reply: Logistic regression, odds ratio, and factor variables. J Nucl Cardiol 2013; 20:652-3. [PMID: 23670351 DOI: 10.1007/s12350-013-9729-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
23
|
|
24
|
PODSE: a computer program for optimal design of trials with discrete-time survival endpoints. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2013; 111:115-127. [PMID: 23578981 DOI: 10.1016/j.cmpb.2013.02.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/09/2012] [Revised: 02/14/2013] [Accepted: 02/18/2013] [Indexed: 06/02/2023]
Abstract
In experimental settings, one or more groups of subjects receive a treatment and are compared to a group of subjects that receives a standard treatment or no treatment at all. The compared groups may have equal numbers of subjects, or some groups may have more participants than others. Moreover, subjects in these groups can be followed over a short or a long period. To conduct experiments efficiently, researchers should settle on a good design in the planning phase of the trial. The optimal design for experimental studies on event occurrence with discrete-time survival endpoints, where two treatment groups are followed over time, is an optimal combination of the number of time periods, the total number of participants in the trial, and the proportion of subjects in the experimental group. The PODSE program makes it easy to find the best design for such studies.
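The design trade-off described here hinges on how the number of time periods drives the expected proportion of events. A minimal sketch of the discrete-time survival arithmetic (illustrative only, not the PODSE optimization itself):

```python
def event_probability(hazards):
    """Probability of observing the event by the end of follow-up when
    the per-period hazards are h_1, ..., h_T (discrete-time survival:
    survive a period with probability 1 - h_t)."""
    surv = 1.0
    for h in hazards:
        surv *= 1 - h
    return 1 - surv

# Extending follow-up from 3 to 6 periods at a constant hazard of 0.10
# raises the expected event proportion, which in turn affects power:
print(event_probability([0.10] * 3))   # 1 - 0.9**3
print(event_probability([0.10] * 6))   # 1 - 0.9**6
```

More periods mean more observed events per subject recruited, so the optimal design trades the cost of longer follow-up against the cost of a larger sample.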
Collapse
|
25
|
|
26
|
|
27
|
[Introduction to randomized trials: the choice of endpoint]. GYNECOLOGIE, OBSTETRIQUE & FERTILITE 2011; 39:595-596. [PMID: 21924939 DOI: 10.1016/j.gyobfe.2011.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/08/2011] [Indexed: 05/31/2023]
|
28
|
Analysis and design of randomised clinical trials involving competing risks endpoints. Trials 2011; 12:127. [PMID: 21595883 PMCID: PMC3130669 DOI: 10.1186/1745-6215-12-127] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2011] [Accepted: 05/19/2011] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND In randomised clinical trials involving time-to-event outcomes, the failures concerned may be events of an entirely different nature and as such define a classical competing risks framework. In designing and analysing clinical trials involving such endpoints, it is important to account for the competing events and to evaluate how each contributes to the overall failure. An appropriate choice of statistical model is important for adequate determination of sample size. METHODS We describe how competing events may be summarised in such trials using cumulative incidence functions and Gray's test. The statistical modelling of competing events using proportional cause-specific and subdistribution hazard functions, and the corresponding procedures for sample size estimation, are outlined. These are illustrated using data from a randomised clinical trial (SQNP01) of patients with advanced (non-metastatic) nasopharyngeal cancer. RESULTS In this trial, treatment had no effect on the competing event of loco-regional recurrence. Thus the effects of treatment on the hazard of distant metastasis were similar via both the cause-specific (unadjusted csHR = 0.43, 95% CI 0.25 - 0.72) and subdistribution (unadjusted subHR = 0.43, 95% CI 0.25 - 0.76) hazard analyses, in favour of concurrent chemo-radiotherapy followed by adjuvant chemotherapy. Adjusting for nodal status and tumour size did not alter the results. The results of the logrank test (p = 0.002) comparing the cause-specific hazards and of Gray's test (p = 0.003) comparing the cumulative incidences also led to the same conclusion. However, the subdistribution hazard analysis requires many more subjects than the cause-specific hazard analysis to detect the same magnitude of effect. CONCLUSIONS The cause-specific hazard analysis is appropriate for analysing competing risks outcomes when treatment has no effect on the cause-specific hazard of the competing event.
It requires fewer subjects than the subdistribution hazard analysis for a similar effect size. However, if the main and competing events are influenced in opposing directions by an intervention, a subdistribution hazard analysis may be warranted.
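A cumulative incidence function of the kind summarised above can be computed with the Aalen-Johansen estimator. The sketch below uses invented times and cause labels, not the SQNP01 data:

```python
def cumulative_incidence(times, causes):
    """Aalen-Johansen estimate of cumulative incidence functions for
    competing risks (cause 0 = censored, causes 1, 2, ... = event types).
    Returns {cause: [(time, CIF(time)), ...]}."""
    data = sorted(zip(times, causes))
    n = len(data)
    event_causes = sorted({c for _, c in data if c != 0})
    surv = 1.0                       # overall Kaplan-Meier survival S(t-)
    cif = {c: 0.0 for c in event_causes}
    curves = {c: [] for c in event_causes}
    at_risk = n
    i = 0
    while i < n:
        t = data[i][0]
        d = {c: 0 for c in event_causes}   # events of each cause at time t
        censored = 0
        while i < n and data[i][0] == t:
            c = data[i][1]
            if c == 0:
                censored += 1
            else:
                d[c] += 1
            i += 1
        for c in event_causes:
            cif[c] += surv * d[c] / at_risk
            curves[c].append((t, cif[c]))
        surv *= 1 - sum(d.values()) / at_risk
        at_risk -= sum(d.values()) + censored
    return curves

# Hypothetical data: cause 1 = distant metastasis, cause 2 = loco-regional
# recurrence, 0 = censored.
times  = [2, 3, 3, 5, 7, 8, 9, 11]
causes = [1, 2, 1, 0, 1, 2, 0, 1]
curves = cumulative_incidence(times, causes)
print(curves[1][-1], curves[2][-1])
```

Unlike one minus a Kaplan-Meier estimate applied to each cause separately, the cause-specific CIFs estimated this way never sum to more than one, which is why they are the appropriate summary in a competing risks framework.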
Collapse
|
29
|
Abstract
PURPOSE To review the existing endpoints of tumour growth delay assays in experimental radiobiology with an emphasis on their efficient estimation for statistically significant identification of the treatment effect. To mathematically define doubling time (DT), tumour-growth delay (TGD) and cancer-cell surviving fraction (SF) in vivo using exponential growth and regrowth models with tumour volume measurements obtained from animal experiments. MATERIALS AND METHODS A statistical model-based approach is used to define and efficiently estimate the three endpoints of tumour therapy in experimental cancer research. RESULTS The log scale is advocated for plotting the tumour volume data and the respective analysis. Therefore, the geometric mean should be used to display the mean tumour volume data, and the group comparison should be a t-test for the log volume to comply with the Gaussian-distribution assumption. The relationship between cancer-cell SF, TGD and rate of growth is rigorously established. The widespread formula for cell kill is corrected; it has been rigorously shown that TGD is the difference between DTs. The software for the tumour growth delay analysis based on the mixed modeling approach with a complete set of instructions and example can be found on the author's webpage. CONCLUSIONS The existing practice for TGD data analysis from animal experiments suffers from imprecision and large standard errors that yield low power and statistically insignificant treatment effect. This practice should be replaced with a model-based statistical analysis on the log scale.
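For exponential (re)growth, the endpoint definitions advocated here reduce to a log-linear regression per tumour. A minimal per-animal sketch with hypothetical volume data (the article itself recommends a mixed-model analysis, which this toy version does not attempt):

```python
from math import log

def doubling_time(days, volumes):
    """Doubling time from an exponential growth fit: regress log volume
    on time by ordinary least squares and return ln(2)/slope."""
    y = [log(v) for v in volumes]
    n = len(days)
    tbar = sum(days) / n
    ybar = sum(y) / n
    slope = (sum((t - tbar) * (yi - ybar) for t, yi in zip(days, y))
             / sum((t - tbar) ** 2 for t in days))
    return log(2) / slope

# Hypothetical volumes (mm^3): the treated tumour regrows more slowly,
# so its doubling time (DT) is longer.
control = doubling_time([0, 3, 6, 9], [100, 152, 230, 350])
treated = doubling_time([0, 3, 6, 9], [100, 119, 141, 168])
tgd = treated - control   # tumour-growth delay as a difference of DTs
print(control, treated, tgd)
```

On exact exponential data the fit recovers the doubling time exactly; with real measurements, working on the log scale keeps the Gaussian assumption plausible, as the abstract argues.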
Collapse
|
30
|
Interpreting epidemiological evidence in the presence of multiple endpoints: an alternative analytic approach using the 9-year follow-up of the Seychelles child development study. Int Arch Occup Environ Health 2009; 82:1031-41. [PMID: 19205720 PMCID: PMC3330475 DOI: 10.1007/s00420-009-0402-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2008] [Accepted: 01/18/2009] [Indexed: 10/21/2022]
Abstract
PURPOSE The potential for ill-informed causal inference is a major concern in published longitudinal studies evaluating impaired neurological function in children prenatally exposed to background levels of methyl mercury (MeHg). These studies evaluate a large number of developmental tests. We propose an alternative analysis strategy that reduces the number of comparisons tested in these studies. METHODS Using data from the 9-year follow-up of 643 children in the Seychelles child development study, we grouped 18 individual endpoints into one overall ordinal outcome variable as well as by developmental domains. Subsequently, ordinal logistic regression analyses were performed. RESULTS We did not find an association between prenatal MeHg exposure and developmental outcomes at 9 years of age. CONCLUSION Our proposed framework is more likely to result in a balanced interpretation of a posteriori associations. In addition, this new strategy should facilitate the use of complex epidemiological data in quantitative risk assessment.
Collapse
|
31
|
Primary and secondary endpoints in clinical trials. ARBEITEN AUS DEM PAUL-EHRLICH-INSTITUT (BUNDESINSTITUT FUR IMPFSTOFFE UND BIOMEDIZINISCHE ARZNEIMITTEL) LANGEN/HESSEN 2009; 96:96-105. [PMID: 20799449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
|
32
|
|
33
|
Stopping trials early for benefit: too good to be true. Lancet 2008; 371:1310. [PMID: 18424306 DOI: 10.1016/s0140-6736(08)60569-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
34
|
Abstract
Composite endpoints are often used in clinical trials in order to increase the overall event rates, reduce the sizes of the trials and achieve the desired power. For example, in a trial to study the effect of a treatment on the prevention of venous thromboembolic events after major orthopaedic surgery of the lower limbs, the primary endpoint is usually a composite consisting of any deep vein thrombosis identified by systematic venography of the lower limbs, symptomatic and well-documented non-fatal pulmonary embolism, and death from all causes. As with any endpoint, missing data can occur in the components of the composite endpoint. If a patient has missing data on some but not all of the components, that patient has partial rather than complete data for the composite endpoint. To be consistent with the intention-to-treat principle, the patient should not be discarded from the analysis. In this research, we propose an approach for the analysis of a composite endpoint with missing data in its components. The main idea is to first derive the probabilities of all possible study outcomes based on an appropriate model and then to construct the overall rate for the composite endpoint. Simulations are conducted to compare the approach with several naïve methods. A data example is used to illustrate the application of the approach.
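As a baseline for the idea of building the composite rate from outcome probabilities, the sketch below computes the overall rate from component rates under an independence assumption. The article's model-based approach handles missing components and dependence between components, which this toy version does not:

```python
def composite_rate(component_rates):
    """Overall event rate for a composite endpoint, assuming (for
    illustration only) that the component events occur independently:
    P(composite) = 1 - prod(1 - p_i)."""
    p_none = 1.0
    for p in component_rates:
        p_none *= 1 - p
    return 1 - p_none

# Hypothetical component rates for DVT, non-fatal PE, all-cause death:
print(composite_rate([0.15, 0.03, 0.02]))
```

The composite rate always exceeds each component rate, which is precisely why composites raise event rates and shrink required trial sizes.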
Collapse
|
35
|
Abstract
Statistical validation of a surrogate marker has been studied for more than a decade. Recently, Alonso et al. (2004, Biometrics 60, 724-728) proposed a quantity called the likelihood reduction factor (LRF) to evaluate the validity of a surrogate marker. However, as pointed out in the present article, the LRF may not correctly validate a surrogate marker. Therefore, a new quantity, the proportion of information gain (PIG) using the Kullback-Leibler information, is proposed. Simulations show that under some model assumptions, the PIG precisely reflects the role of a surrogate marker.
Collapse
|
36
|
Abstract
Bauer and Köhne proposed an adaptive design using Fisher's combination of independent p-values based on subsamples from different stages (Biometrics 1994; 50(4):1029-1041). Their method provides great flexibility in the selection of statistical methods for hypothesis testing of subsamples. However, the choices for the stopping boundaries are not flexible enough to meet practical needs (Biometrics 2001; 57(3):886-891). In this paper, an adaptive design method is proposed using a linear combination of the independent p-values. The method provides great flexibility in the selection of stopping boundaries, and no numerical integration is required for two-stage designs: the stopping boundaries and p-values can be calculated by hand. The operating characteristics of the adaptive designs are studied using computer simulations with and without sample size adjustment. Examples are presented for superiority and non-inferiority trials with different endpoints (normal, binary, and survival) under different adaptations. The statistical efficiency of the proposed method is compared with that of other methods based on conditional power.
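For comparison with the proposed linear combination, the Fisher product rule referenced here has a closed-form level condition in the two-stage case without early stopping: under H0 with independent uniform p-values, P(p1 p2 <= c) = c(1 - ln c). A sketch that solves for the critical value by bisection:

```python
from math import log

def fisher_product_critical(alpha=0.05, tol=1e-12):
    """Critical value c for a two-stage Fisher combination test with no
    early stopping: reject H0 when p1 * p2 <= c, where c solves
    c * (1 - ln c) = alpha (c(1 - ln c) is increasing on (0, 1))."""
    lo, hi = 1e-15, alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * (1 - log(mid)) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = fisher_product_critical(0.05)
print(c)                    # roughly 0.0087
print(0.04 * 0.10 <= c)     # do p1=0.04, p2=0.10 jointly reject?
```

At alpha = 0.05 the critical value is roughly 0.0087, so two stage-wise p-values of 0.04 and 0.10, neither significant alone, jointly reject; the linear-combination method in this paper pursues the same flexibility with simpler boundary calculations.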
Collapse
|
37
|
Rationale and strategies for reevaluating the ACR20. J Rheumatol 2007; 34:1184-7. [PMID: 17477484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
OBJECTIVE To assess whether the American College of Rheumatology response criterion ACR20 should be replaced by another definition of response with enhanced discriminant validity. METHODS We worked with statisticians to define over 100 different ways of defining response, including dichotomous definitions (e.g., ACR20, ACR50, ACR70, low disease activity), ordinal definitions (EULAR response; ACR20, ACR50, ACR70), disease activity indexes [Disease Activity Score (DAS); Simplified Disease Activity Index (SDAI)], continuous definitions (mean percentage improvement in all core set measures; nACR, ACRn), and hybrid definitions (ACR20, ACR50, ACR70 defined for a patient on a 0, 1, 2, 3 scale with continuous measures between intervals), along with variations on each of these approaches (e.g., percentage vs absolute change in DAS; measures requiring vs not requiring joint count improvement). To test clinical validity, we administered a survey using patients from a trial who had various levels of improvement and asked rheumatologists whether and by how much these patients had improved. For sensitivity to change (Sn-to-Chge), we are collecting data from large multicenter disease-modifying antirheumatic drug trials in rheumatoid arthritis and ranking candidate definitions of response on their average p values in distinguishing active treatment from placebo, or combination therapy from a single comparator. RESULTS We surveyed 52 rheumatologists about which trial patients had improved and by how much. Trial data were obtained and tested for sensitivity to change. CONCLUSION A rigorous data-driven consensus process was used to reassess the ACR20.
Collapse
|
38
|
Abstract
Scleroderma patients usually have serious medical events in several organ systems, and it is desirable to have a composite index that accounts for disease activity in these organ systems. We show how one may use a composite 'time to event' analysis for evaluating such patients and, more generally, patients suffering from a chronic disease. The composite 'time to event' analysis requires a composite endpoint with a Kaplan-Meier type analysis. As an illustration, we use data from a clinical trial for scleroderma patients and present a sensitivity analysis in which one or more of the organ involvement definition criteria are modified. In addition, we propose desirability functions to monitor patients' disease improvement when the outcomes are all continuous. This method offers several possible advantages over existing methods for measuring patients' improvement.
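Desirability functions of the kind proposed here map each continuous outcome onto [0, 1] and combine them multiplicatively. A minimal sketch; the particular transformation, scales, and scores below are illustrative assumptions, not the authors' definitions:

```python
def desirability(y, low, target):
    """One-sided linear desirability for a larger-is-better outcome:
    0 at or below `low`, 1 at or above `target`, linear in between."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return (y - low) / (target - low)

def overall_desirability(ds):
    """Geometric mean of component desirabilities: any component at 0
    drives the composite to 0."""
    prod = 1.0
    for d in ds:
        prod *= d
    return prod ** (1 / len(ds))

# Hypothetical organ-system scores scaled so that higher = better:
scores = [desirability(6, 0, 10), desirability(9, 0, 10), desirability(4, 0, 10)]
print(overall_desirability(scores))
```

The geometric mean ensures a patient cannot trade complete failure in one organ system against success in another, which is the property that makes such composites attractive for multi-organ diseases.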
Collapse
|
39
|
Type I Error and Power in Noninferiority/Equivalence Trials with Correlated Multiple Endpoints: An Example from Vaccine Development Trials. J Biopharm Stat 2007; 14:893-907. [PMID: 15587971 DOI: 10.1081/bip-200035454] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
Abstract
Clinical trials necessary for the development of new treatment often require testing of multiple endpoints for equivalence or noninferiority relative to an existing effective standard therapy. An example is a vaccine study with multiple antibody measurements in sera of subjects receiving a combination vaccine such as a pneumococcal vaccine, which contains many different serotypes of the pneumococcal organism. This article describes testing methods for the demonstration of simultaneous marginal equivalence or noninferiority of two treatments on each component of the response vector that follows a multivariate normal distribution. Systematic simulation studies are conducted to evaluate the performance of the testing method and to examine under what conditions the power is substantially different if the multiple endpoints are assumed to be independent when they are actually strongly correlated. Data from an illustrative example are used to describe how the study power can be evaluated in the design of the trials.
Collapse
|
40
|
Efficacy and safety of extended-release venlafaxine in the treatment of generalized anxiety disorder in children and adolescents: two placebo-controlled trials. Am J Psychiatry 2007; 164:290-300. [PMID: 17267793 DOI: 10.1176/ajp.2007.164.2.290] [Citation(s) in RCA: 97] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
OBJECTIVE The authors evaluated the efficacy, safety, and tolerability of extended-release venlafaxine in the treatment of pediatric generalized anxiety disorder. METHOD Two randomized, double-blind, placebo-controlled trials were conducted at 59 sites in 2000 and 2001. Participants 6 to 17 years of age who met DSM-IV criteria for generalized anxiety disorder received a flexible dosage of extended-release venlafaxine (N=157) or placebo (N=163) for 8 weeks. The primary outcome measure was the composite score for nine delineated items from the generalized anxiety disorder section of a modified version of the Schedule for Affective Disorders and Schizophrenia for School-Age Children, and the primary efficacy variable was the baseline-to-endpoint change in this composite score. Secondary outcome measures were overall score on the nine delineated items, Pediatric Anxiety Rating Scale, Hamilton Anxiety Rating Scale, Screen for Child Anxiety Related Emotional Disorders, and the severity of illness and improvement scores from the Clinical Global Impression scale (CGI). RESULTS The extended-release venlafaxine group showed statistically significant improvements in the primary and secondary outcome measures in study 1 and significant improvements in some secondary outcome measures but not the primary outcome measure in study 2. In a pooled analysis, the extended-release venlafaxine group showed a significantly greater mean decrease in the primary outcome measure compared with the placebo group (-17.4 versus -12.7). The response rate as indicated by a CGI improvement score <3 was significantly greater with extended-release venlafaxine than placebo (69% versus 48%). Common adverse events were asthenia, anorexia, pain, and somnolence. Statistically significant changes in height, weight, blood pressure, pulse, and cholesterol levels were observed in the extended-release venlafaxine group. 
CONCLUSIONS Extended-release venlafaxine may be an effective, well-tolerated short-term treatment for pediatric generalized anxiety disorder.
Collapse
|
41
|
Abstract
Treatment comparisons in clinical trials often involve multiple endpoints. By making use of bootstrap tests, we develop a new non-parametric approach to multiple-endpoint testing that can be used to demonstrate non-inferiority of a new treatment for all endpoints and superiority for some endpoint when it is compared to an active control. It is shown that this approach does not incur a large multiplicity cost in sample size to achieve reasonable power and that it can incorporate complex dependencies in the multivariate distributions of all outcome variables for the two treatments via bootstrap resampling.
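A stripped-down version of the bootstrap machinery underlying such tests, for a single endpoint, is a percentile interval for the treatment-control difference; the actual procedure resamples all endpoints jointly to preserve their dependence. The data, margin, and settings below are hypothetical:

```python
import random

def bootstrap_ci_diff(x, y, n_boot=2000, level=0.95, seed=42):
    """Percentile bootstrap confidence interval for the difference in
    means between treatment x and control y (illustrative sketch)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        xb = [rng.choice(x) for _ in x]   # resample within each group
        yb = [rng.choice(y) for _ in y]
        diffs.append(sum(xb) / len(xb) - sum(yb) / len(yb))
    diffs.sort()
    lo = diffs[int(((1 - level) / 2) * n_boot)]
    hi = diffs[int((1 - (1 - level) / 2) * n_boot) - 1]
    return lo, hi

# Non-inferiority on one endpoint: is the lower CI bound above -margin?
treatment = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3, 5.2, 4.7, 5.4, 5.0]
control   = [5.0, 4.9, 5.2, 5.1, 4.8, 5.3, 4.9, 5.0, 5.1, 4.8]
lo, hi = bootstrap_ci_diff(treatment, control)
print(lo > -0.5)   # non-inferior at a margin of 0.5 if True
```

Because the resampling can be done on whole patient vectors, the same idea extends to many endpoints at once without modelling their joint distribution, which is the point of the paper's approach.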
Collapse
|
42
|
Confirmatory seamless phase II/III clinical trials with hypotheses selection at interim: opportunities and limitations. Biom J 2006; 48:650-5; discussion 660-2. [PMID: 16972717 DOI: 10.1002/bimj.200610248] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
This is a discussion of the following two papers in this special issue on adaptive designs: 'Confirmatory seamless phase II/III clinical trials with hypotheses selection at interim: General concepts' by Frank Bretz, Heinz Schmidli, Franz König, Amy Racine and Willi Maurer, and 'Confirmatory seamless phase II/III clinical trials with hypotheses selection at interim: Applications and practical considerations' by Heinz Schmidli, Frank Bretz, Amy Racine and Willi Maurer.
Collapse
|
43
|
Design of vaccine equivalence/non-inferiority trials with correlated multiple binomial endpoints. J Biopharm Stat 2006; 16:555-72. [PMID: 16892913 DOI: 10.1080/10543400600721596] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
Immunogenicity trials that study the immune responses to vaccination are often used in the vaccine development process as alternatives to clinical efficacy trials. The comparisons of immune responses among various treatment groups are conducted in a non-inferiority or equivalence framework. When there exists a level of immune response that correlates with protection against disease, it is of interest to compare the proportion of responders, defined either as a response above a specific level or as a predefined increase of post-vaccination levels above pre-vaccination levels. Since vaccines often contain several antigens, the correlations between the immune responses need to be taken into account in the analysis. In this paper, we describe appropriate testing methods for demonstrating the non-inferiority/equivalence of two treatments on each of the binomial endpoints. We conduct a comprehensive simulation study to shed light on how, and to what extent, the Type I error and power are affected when correlated multiple binomial endpoints are present in vaccine trials. We also illustrate the computation of power for the assessment of non-inferiority/equivalence in real studies.
Collapse
|
44
|
Abstract
Noninferiority/equivalence designs are often used in vaccine clinical trials. The goal of these designs is to demonstrate that a new vaccine, or a new formulation or regimen of an existing vaccine, is similar in terms of effectiveness to the existing vaccine, while offering such advantages as easier manufacturing, easier administration, lower cost, or an improved safety profile. These noninferiority/equivalence designs are particularly useful in four common types of immunogenicity trials: vaccine bridging trials, combination vaccine trials, vaccine concomitant use trials, and vaccine consistency lot trials. In this paper, we give an overview of the key statistical issues and recent developments for noninferiority/equivalence vaccine trials. Specifically, we cover the following topics: (i) selection of study endpoints; (ii) formulation of the null and alternative hypotheses; (iii) determination of the noninferiority/equivalence margin; (iv) selection of efficient statistical methods for the statistical analysis of noninferiority/equivalence vaccine trials, with particular emphasis on adjustment for stratification factors and missing pre- or post-vaccination data; and (v) the calculation of sample size and power.
Collapse
|
45
|
Comparing experimental designs for benchmark dose calculations for continuous endpoints. RISK ANALYSIS : AN OFFICIAL PUBLICATION OF THE SOCIETY FOR RISK ANALYSIS 2006; 26:1031-43. [PMID: 16948695 DOI: 10.1111/j.1539-6924.2006.00798.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
The BMD (benchmark dose) method that is used in risk assessment of chemical compounds was introduced by Crump (1984) and is based on dose-response modeling. To take uncertainty in the data and model fitting into account, the lower confidence bound of the BMD estimate (BMDL) is suggested to be used as a point of departure in health risk assessments. In this article, we study how to design optimum experiments for applying the BMD method for continuous data. We exemplify our approach by considering the class of Hill models. The main aim is to study whether an increased number of dose groups and at the same time a decreased number of animals in each dose group improves conditions for estimating the benchmark dose. Since Hill models are nonlinear, the optimum design depends on the values of the unknown parameters. That is why we consider Bayesian designs and assume that the parameter vector has a prior distribution. A natural design criterion is to minimize the expected variance of the BMD estimator. We present an example where we calculate the value of the design criterion for several designs and try to find out how the number of dose groups, the number of animals in the dose groups, and the choice of doses affects this value for different Hill curves. It follows from our calculations that to avoid the risk of unfavorable dose placements, it is good to use designs with more than four dose groups. We can also conclude that any additional information about the expected dose-response curve, e.g., information obtained from studies made in the past, should be taken into account when planning a study because it can improve the design.
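To make the estimand concrete: for a fitted Hill curve, the BMD is the dose producing a prespecified relative change (the benchmark response, BMR) from background. The sketch below inverts a simplified decreasing Hill function by bisection; the absence of a residual plateau and all parameter values are invented for illustration, and the article's concern is how the design affects the uncertainty around this quantity rather than the point estimate itself:

```python
def hill(d, a, b, n):
    """Simplified decreasing Hill dose-response: background a, potency b,
    steepness n (no residual plateau -- an illustrative assumption)."""
    return a / (1 + (d / b) ** n)

def benchmark_dose(a, b, n, bmr=0.05, hi=1e6, tol=1e-10):
    """Dose at which the response falls by the benchmark response `bmr`
    relative to background, found by bisection on the decreasing curve."""
    target = a * (1 - bmr)
    lo_d, hi_d = 0.0, hi
    while hi_d - lo_d > tol:
        mid = (lo_d + hi_d) / 2
        if hill(mid, a, b, n) > target:
            lo_d = mid          # response still above target: dose too low
        else:
            hi_d = mid
    return (lo_d + hi_d) / 2

bmd = benchmark_dose(a=100.0, b=10.0, n=2.0, bmr=0.05)
print(bmd)   # dose giving a 5% fall in, say, enzyme activity
```

Because the Hill model is nonlinear in its parameters, the precision of this estimate depends on where the doses sit relative to the curve, which is exactly why the optimal design problem studied in the article is parameter-dependent and motivates the Bayesian treatment.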
Collapse
|
46
|
Abstract
There are many disorders where regulatory agencies have required a new treatment to demonstrate efficacy on multiple co-primary endpoints, all significant at the one-sided 2.5 per cent level, before accepting the treatment's effect for the disorder. This requirement, rooted in the intersection-union (IU) test, has led many researchers to increase the study sample size to make up for the reduction in the statistical power at the study level. Unfortunately, the increase in sample size could be substantial when the endpoints are minimally correlated and the treatment effects on the multiple endpoints are comparable. In this paper, we demonstrate that the frequentist concept of controlling the maximum false positive rate, even when applied to a restricted null space, has only limited success in keeping the sample size increase at a reasonable level. We therefore propose an approach that is based on the notion of controlling an average type I error rate. By employing an upper bound for the average type I error rate, the new approach provides an adjustment to the significance level that depends only on the correlation among the endpoints. For the most common case of two or three co-primary endpoints, the adjusted significance level is at most 5 per cent (one-sided) when the endpoints are moderately correlated. We show how sample size could be calculated under the proposed approach and contrast the needed sample size with that required under the IU test. We provide additional comments and discuss why the new approach is consistent with the principle requiring evidence of significance in the drug development and approval process.
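The sample size inflation discussed here is easy to quantify in the worst case of independent endpoints: under the IU test, study-level power is the product of the per-endpoint powers, so each of k endpoints must be powered at 0.8^(1/k) to retain 80% overall. A sketch using the usual normal approximation (illustrative of the IU penalty only, not the authors' average type I error method):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, alpha_one_sided, power):
    """Per-group n for a one-sided two-sample z-test at a standardized
    effect size (normal approximation)."""
    z = NormalDist()
    n = 2 * ((z.inv_cdf(1 - alpha_one_sided) + z.inv_cdf(power)) / effect) ** 2
    return ceil(n)

# With k independent co-primary endpoints, each endpoint must reach
# power 0.8**(1/k) for 80% study-level power (worst case; correlation helps).
k = 2
per_endpoint_power = 0.80 ** (1 / k)
print(n_per_group(0.3, 0.025, 0.80))                # single endpoint
print(n_per_group(0.3, 0.025, per_endpoint_power))  # each of two endpoints
```

With a standardized effect of 0.3, a single endpoint needs 175 subjects per group, while each of two independent co-primary endpoints pushes this to roughly 230; correlation between endpoints shrinks the penalty, which is what motivates the correlation-dependent significance-level adjustment proposed in the paper.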
Collapse
|
47
|
A-Line, Bispectral Index, and Estimated Effect-Site Concentrations: A Prediction of Clinical End-Points of Anesthesia. Anesth Analg 2006; 102:1141-6. [PMID: 16551913 DOI: 10.1213/01.ane.0000202385.96653.32] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
Autoregressive modeling with exogenous input of middle-latency auditory evoked potentials (A-Line AEP index, AAI) has been developed for monitoring depth of anesthesia. We investigated the prediction of recovery and the dose-response relationship between desflurane and AAI or bispectral index (BIS) values. Twenty adult men scheduled for radical prostatectomy were recruited. To minimize opioid effects, analgesia was provided by a concurrent epidural in addition to the general anesthetic. Electrodes for AAI and BIS monitoring and a headphone for auditory stimuli were applied. Propofol and remifentanil were used for anesthetic induction; anesthesia was maintained with desflurane only. For comparison with the AAI and BIS monitor parameters, pharmacokinetic models of desflurane and propofol distribution and effect-site concentrations were used to predict clinical end-points (prediction probability, P(K)). Patients opened their eyes at an AAI value of 47 +/- 20 and a BIS value of 77 +/- 14 (mean +/- SD); the prediction probability for eye opening was P(K) = 0.81 for AAI, P(K) = 0.89 for BIS, and P(K) = 0.91 for the desflurane effect-site concentration. Eye opening was thus best predicted by the calculated desflurane effect-site concentration. The relationship between the predicted desflurane effect-site concentration and AAI or BIS was assessed by nonlinear regression (r = 0.75 for AAI and r = 0.80 for BIS). The correlation between BIS and clinical end-points of anesthesia or the desflurane effect-compartment concentration is better than that for the AAI.
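P(K) is Smith's prediction probability, a concordance measure over all pairs of cases in different clinical states: the fraction of such pairs that the indicator orders correctly, counting ties in the indicator as half. A minimal sketch of that standard definition follows; the AAI values and awake/anesthetized states are made up for illustration.

```python
from itertools import combinations

def prediction_probability(indicator, state):
    """Prediction probability P(K): over all pairs of cases in distinct
    clinical states, the proportion ordered correctly by the indicator,
    with indicator ties counted as one half."""
    conc = disc = tie = 0
    for (x1, s1), (x2, s2) in combinations(zip(indicator, state), 2):
        if s1 == s2:
            continue  # same-state pairs carry no ordering information
        if x1 == x2:
            tie += 1
        elif (x1 < x2) == (s1 < s2):
            conc += 1
        else:
            disc += 1
    return (conc + 0.5 * tie) / (conc + disc + tie)

# Hypothetical example: AAI values and awake(1)/anesthetized(0) states
aai   = [30, 35, 42, 47, 60, 75]
awake = [0, 0, 0, 1, 1, 1]
print(prediction_probability(aai, awake))  # 1.0: indicator orders every pair correctly
```

A P(K) of 0.5 means the indicator predicts the clinical state no better than chance; 1.0 means it always ranks the awake patient above the anesthetized one.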
Collapse
|
48
|
Predictive probability of serum prostate-specific antigen for prostate cancer: an approach using Bayes rule. Am J Clin Pathol 2006; 125:336-42. [PMID: 16613336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/08/2023] Open
Abstract
This article introduces the use of the Bayes probability rule to calculate age- and serum prostate-specific antigen (PSA)-specific positive predictive values (PPVs) for prostate cancer. The PPV is the conditional probability of having prostate cancer given a value of PSA and a particular age group. The formulation uses values of sensitivity obtained from previously reported studies of more than 2,700 men with prostate cancer and values of specificity obtained from previously reported studies of more than 99,000 men without prostate cancer. It also introduces a population-based, age-specific prior probability of prostate cancer, for which it relies on the National Cancer Institute-sponsored Surveillance, Epidemiology, and End Results (SEER) data. The Bayes PPV suggests that in younger men, cut points defining an elevated PSA level should be raised rather than lowered. The Bayes formulation also provides estimates of the PPV for narrow intervals of PSA, and these tabulated results may provide useful guidelines for the implications of serum PSA levels in specific age groups.
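The calculation behind such a table is a direct application of Bayes' rule. A minimal sketch follows, with test characteristics and prior probability chosen purely for illustration (they are not the article's figures).

```python
def ppv(sensitivity, specificity, prevalence):
    """Bayes' rule: P(cancer | positive test) from the test characteristics
    and the prior (e.g. age-specific) probability of prostate cancer."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical figures: a PSA cut point with 80% sensitivity and 90%
# specificity in an age group with a 10% prior probability of cancer
print(round(ppv(0.80, 0.90, 0.10), 3))  # 0.471
```

Because the prior probability enters multiplicatively, the same PSA cut point yields a much lower PPV in younger, lower-prevalence age groups, which is why the article suggests raising rather than lowering the cut points there.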
Collapse
|
49
|
The natural history of untreated HIV infection in Lima, Peru: implications for clinical trial endpoints for HIV vaccines. HUMAN VACCINES 2005; 1:160-4. [PMID: 17012861 DOI: 10.4161/hv.1.4.1976] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
Most candidate HIV vaccines are directed at priming memory T cell responses and are being evaluated for their effects on post-acquisition viremia and/or disease progression. These vaccines are being studied in areas of high HIV-1 prevalence. We therefore evaluated the frequency of CD4+ T cell decline and the time course of opportunistic infections (OIs) among patients presenting at a major metropolitan hospital in Lima, Peru, an area where such candidate vaccines are being tested. We examined 92 patients with untreated HIV-1 infection in calendar year 2002: 35% presented with CD4+ T cell counts of <200 cells/mm3, 25% between 201 and 400, and 17% with >400. Thirty of the 92 patients presented with overt AIDS, and 6 had no AIDS-defining OI but CD4 counts <200. Over the course of follow-up, the CD4 count decreased by a mean of 31 cells/mm3/year in women and 28 in men (p>0.5). Among persons presenting with CD4 counts >250 cells/mm3, the median time to first OI was 3.5 years. If clinical endpoints are required to evaluate the clinical effectiveness of T cell-based vaccines, extended clinical follow-up of subjects enrolled in such trials will be required.
Collapse
|
50
|
Improving the estimation of change from baseline in a continuous outcome measure in the clinical trial setting. Contemp Clin Trials 2005; 26:2-16. [PMID: 15837448 DOI: 10.1016/j.cct.2004.08.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2003] [Revised: 04/29/2004] [Accepted: 08/25/2004] [Indexed: 11/25/2022]
Abstract
In many clinical trials, the primary focus is whether treatment groups differ with respect to the change from baseline to end of therapy in a continuous response variable. Randomized clinical trials often use a repeated measures design in which subjects are followed up at fixed times throughout the study. With this design, testing for differences between treatment groups in the average change from baseline to end of therapy is equivalent to testing for differences between the rates of change in the response variable, assuming the rate of change in each treatment group is linear. This analysis can be performed quite easily using methods such as generalized estimating equations (GEE). However, if the rate of change in the response cannot be assumed linear, the average change from baseline is often calculated simply as the difference between the baseline and final measurements, and the additional data points are not included in the analysis. Instead, we propose using all available data in a repeated measures model, based on the nonlinear treatment response pattern, to estimate the average change from baseline to end of therapy in each treatment group. GEE with robust variance estimation is used to obtain these model-based estimates of the treatment effect, and a simple test for the appropriateness of the model is presented. The GEE model, in conjunction with this test, forms the basis for an adaptive analysis approach to determining the method of estimation of the primary endpoint. This approach yields more efficient estimates of the treatment effect when the response pattern is specified correctly and minimizes the bias in the estimate when the hypothesized response pattern is misspecified. We are motivated by examples in the cystic fibrosis (CF) clinical trial setting and demonstrate the potential of this approach for reducing the sample size required for future CF clinical trials.
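For a linear mean model, GEE with an independence working correlation and robust variance reduces to ordinary least squares with a cluster-robust (sandwich) covariance, which is enough to sketch the idea. The response pattern (a square-root time trend), visit schedule, sample size, and effect size below are all invented for illustration; this shows a model-based change-from-baseline estimate using every visit, not the paper's CF analysis.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated repeated-measures trial with a nonlinear (square-root) time trend:
# response improves quickly early, then plateaus, so a linear rate is wrong.
n_subj = 60
visits = np.array([0.0, 1.0, 2.0, 4.0, 6.0])   # months
treat = np.repeat([0, 1], n_subj // 2)
effect = 3.0   # true treatment difference in 6-month change from baseline

rows = []
for i in range(n_subj):
    u = rng.normal(0, 2)                       # subject-level random intercept
    for t in visits:
        mu = 10 + 2 * np.sqrt(t) + treat[i] * effect * np.sqrt(t) / np.sqrt(6)
        rows.append((i, treat[i], t, mu + u + rng.normal(0, 1)))
ids, trt, t, y = np.array(rows).T

# Mean model: y = b0 + b1*sqrt(t) + b2*trt*sqrt(t); the treatment effect on
# 6-month change from baseline is b2*sqrt(6).
X = np.column_stack([np.ones_like(t), np.sqrt(t), trt * np.sqrt(t)])
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Cluster-robust sandwich covariance: the GEE robust variance under an
# independence working correlation, clustering on subject.
bread = np.linalg.inv(X.T @ X)
meat = np.zeros((3, 3))
for i in np.unique(ids):
    m = ids == i
    s = X[m].T @ (y[m] - X[m] @ beta)
    meat += np.outer(s, s)
cov = bread @ meat @ bread

diff = beta[2] * np.sqrt(6)        # model-based treatment effect on change
se = np.sqrt(cov[2, 2] * 6)
print(f"model-based effect {diff:.2f} (robust SE {se:.2f})")
```

Because the model uses all five visits rather than only the baseline and final measurements, the resulting estimate is more efficient when the hypothesized response pattern is correct, which is the source of the sample-size savings the abstract mentions.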
Collapse
|