1. Held L, Pawel S, Micheloud C. The assessment of replicability using the sum of p-values. Royal Society Open Science 2024; 11:240149. [PMID: 39205991 PMCID: PMC11349439 DOI: 10.1098/rsos.240149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 05/30/2024] [Accepted: 06/26/2024] [Indexed: 09/04/2024]
Abstract
Statistical significance of both the original and the replication study is a commonly used criterion to assess replication attempts, also known as the two-trials rule in drug development. However, replication studies are sometimes conducted although the original study is non-significant, in which case Type-I error rate control across both studies is no longer guaranteed. We propose an alternative method to assess replicability using the sum of p-values from the two studies. The approach provides a combined p-value and can be calibrated to control the overall Type-I error rate at the same level as the two-trials rule but allows for replication success even if the original study is non-significant. The unweighted version requires a less restrictive level of significance at replication if the original study is already convincing, which facilitates sample size reductions of up to 10%. Downweighting the original study accounts for possible bias and requires a more stringent significance level and larger sample sizes at replication. Data from four large-scale replication projects are used to illustrate and compare the proposed method with the two-trials rule, meta-analysis and Fisher's combination method.
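The calibration described in this abstract can be illustrated with a minimal sketch. It assumes, which the abstract does not state explicitly, that the two one-sided p-values are independent and uniform under the null hypothesis, so that their sum follows the triangular (Irwin-Hall) distribution. The function names and the level alpha = 0.025 are illustrative choices, not taken from the paper.

```python
import math


def sum_p_combined(p_orig, p_rep):
    """Combined p-value for the sum of two independent uniform p-values."""
    s = p_orig + p_rep
    if s <= 1.0:
        return s * s / 2.0
    return 1.0 - (2.0 - s) ** 2 / 2.0


def sum_p_threshold(alpha=0.025):
    """Bound on p_orig + p_rep that gives an overall Type-I error of alpha**2,
    the level of the two-trials rule under the stated assumptions."""
    return math.sqrt(2.0) * alpha


def replication_level(p_orig, alpha=0.025):
    """Significance level left for the replication study given p_orig."""
    return max(sum_p_threshold(alpha) - p_orig, 0.0)


alpha = 0.025
print(sum_p_threshold(alpha))           # bound on the sum of p-values
print(replication_level(0.001, alpha))  # convincing original study
print(replication_level(0.03, alpha))   # non-significant original study
print(sum_p_combined(0.001, 0.03) <= alpha ** 2)
```

Under these assumptions, an original p-value of 0.001 leaves a replication level of about 0.034, above the 0.025 of the two-trials rule, while a non-significant original p-value of 0.03 still leaves a level of about 0.005.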
Affiliation(s)
- Leonhard Held
- Epidemiology Biostatistics and Prevention Institute (EBPI) and Center for Reproducible Science (CRS), University of Zurich, Hirschengraben 84, Zurich 8001, Switzerland
- Samuel Pawel
- Epidemiology Biostatistics and Prevention Institute (EBPI) and Center for Reproducible Science (CRS), University of Zurich, Hirschengraben 84, Zurich 8001, Switzerland
- Charlotte Micheloud
- Epidemiology Biostatistics and Prevention Institute (EBPI) and Center for Reproducible Science (CRS), University of Zurich, Hirschengraben 84, Zurich 8001, Switzerland
2. Vandemeulebroecke M, Häring DA, Hua E, Wei X, Xi D. New strategies for confirmatory testing of secondary hypotheses on combined data from multiple trials. Clin Trials 2024; 21:171-179. [PMID: 38311901 DOI: 10.1177/17407745231214382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2024]
Abstract
BACKGROUND Pivotal evidence of efficacy of a new drug is typically generated by (at least) two clinical trials which independently provide statistically significant and mutually corroborating evidence of efficacy based on a primary endpoint. In this situation, showing drug effects on clinically important secondary objectives can be demanding in terms of sample size requirements. Statistically efficient methods to power for such endpoints while controlling the Type I error are needed. METHODS We review existing strategies for establishing claims on important but sample size-intense secondary endpoints. We present new strategies based on combined data from two independent, identically designed and concurrent trials, controlling the Type I error at the submission level. We explain the methodology and provide three case studies. RESULTS Different strategies have been used for establishing secondary claims. One new strategy, involving a protocol-planned analysis of combined data across trials and controlling the Type I error at the submission level, is particularly efficient. It has already been successfully used in support of label claims. Regulatory views on this strategy differ. CONCLUSIONS Inference on combined data across trials is a useful approach for generating pivotal evidence of efficacy for important but sample size-intense secondary endpoints. It requires careful preparation and regulatory discussion.
Affiliation(s)
- Eva Hua
- China Novartis Institutes for Biomedical Research Co., Shanghai, China
- Xiaoling Wei
- China Novartis Institutes for Biomedical Research Co., Shanghai, China
- Dong Xi
- Gilead Sciences, Foster City, CA, USA
3. Rosenkranz GK. A Generalization of the Two Trials Paradigm. Ther Innov Regul Sci 2023; 57:316-320. [PMID: 36289189 DOI: 10.1007/s43441-022-00471-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 10/14/2022] [Indexed: 11/30/2022]
Abstract
The two trials paradigm plays a prominent role in drug development and has been widely and controversially discussed. Its purpose is to ensure replicability or substantiation of study results. This note investigates a simple generalization of the paradigm to more than two trials that preserves the project-wise type-I error rate and power.
Affiliation(s)
- Gerd K Rosenkranz
- Statistical Consultant, Obereckstrasse 11a, 79539, Lörrach, Germany.
4. Held L, Micheloud C, Pawel S. The assessment of replication success based on relative effect size. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1502] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
Affiliation(s)
- Leonhard Held
- Epidemiology, Biostatistics and Prevention Institute, Center for Reproducible Science, University of Zurich
- Charlotte Micheloud
- Epidemiology, Biostatistics and Prevention Institute, Center for Reproducible Science, University of Zurich
- Samuel Pawel
- Epidemiology, Biostatistics and Prevention Institute, Center for Reproducible Science, University of Zurich
5. Rosenkranz GK. Replicability of studies following a dual-criterion design. Stat Med 2021; 40:4068-4076. [PMID: 33928668 DOI: 10.1002/sim.9014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2020] [Revised: 03/20/2021] [Accepted: 04/13/2021] [Indexed: 11/11/2022]
Abstract
Replicability of results is regarded as the cornerstone of science. Recent research seems to raise doubts about whether this requirement is generally fulfilled. Often, replicability of results is defined as repeating a statistically significant result. However, since significance may not imply scientific relevance, dual-criterion study designs that take both aspects into account have been proposed and investigated during the last decade. Originally developed for proof-of-concept trials, the design could be appropriate for phase III trials as well. In fact, a dual-criterion design has been requested for COVID-19 vaccine applications by major health authorities. In this article, replicability of dual-criterion designs is investigated. It turns out that the probability of replicating a significant and relevant result can become as low as 0.5. The replication probability increases if the effect estimator exceeds the minimum relevant effect in the original study by an extra amount.
6. Hua E, Janocha R, Severin T, Wei J, Vandemeulebroecke M. A Phase 3 Trial Analysis Proposal for Mitigating the Impact of the COVID-19 Pandemic. Stat Biopharm Res 2021. [DOI: 10.1080/19466315.2021.1905056] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
Affiliation(s)
- Eva Hua
- Biostatistical Sciences and Pharmacometrics, China Novartis Institutes for Biomedical Research Co., Shanghai, China
- Thomas Severin
- Global Drug Development, Novartis Pharma AG, Basel, Switzerland
- Jiawei Wei
- Biostatistical Sciences and Pharmacometrics, China Novartis Institutes for Biomedical Research Co., Shanghai, China
7. Held L. The harmonic mean χ²-test to substantiate scientific findings. J R Stat Soc Ser C Appl Stat 2020. [DOI: 10.1111/rssc.12410] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
8. Yuan J, Zhu R, Jia D, Palm U, Koch G. Combined-Indications Significance Level of Multiple Related Indications Developed Simultaneously. Ther Innov Regul Sci 2019. [DOI: 10.1177/2168479019864087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Background: Typically, regulatory requirements include 2 confirmatory studies, each at a 1-sided .025 significance level, for a medicine to be approved for a specific indication. When the same medicine has been approved in related indications, 1 confirmatory study at a 1-sided .025 significance level could constitute adequate evidence of efficacy for a new indication. Methods: This article does not contain any studies with human or animal subjects performed by any of the authors. For multiple related indications developed simultaneously to constitute sufficient evidence of clinical efficacy, the combined-studies significance level can be set at the same level as if those indications are developed sequentially. Results: This article establishes possible strategies to develop a few related indications at the same time for marketing registration approval, maintaining a desired combined-studies significance level; for example, 1-sided .0000156 for 2 indications, with 1 option having each indication assessed with 1 confirmatory study at .00395 significance level. Conclusion: It is possible to develop a few indications at the same time for marketing registration approval, where the combined-studies significance level is less stringent than that of the usual paradigm with 2 confirmatory studies each at 1-sided .025 significance level for every indication.
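The figures quoted in the Results of this abstract can be checked with simple arithmetic. The sketch below assumes, as one reading of the abstract rather than a statement taken from the article, that sequential development requires two studies at level 0.025 for the first indication and one further study for the second, so that the combined level is the product of three independent one-sided levels. Variable names are illustrative.

```python
import math

ALPHA = 0.025  # one-sided significance level per confirmatory study

# Usual paradigm for a single indication: two independent studies.
two_study_level = ALPHA ** 2

# Assumed sequential path for two related indications: three studies in total.
combined_level = ALPHA ** 3

# Option with one study per indication, both run at the same level,
# chosen so that the product of the two levels equals the combined level.
per_study_level = math.sqrt(combined_level)

print(f"{two_study_level:.6f}")
print(f"{combined_level:.7f}")
print(f"{per_study_level:.5f}")
```

After rounding, the last two values agree with the .0000156 and .00395 quoted in the abstract, which supports this reading without confirming it.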
Affiliation(s)
- Ray Zhu
- Allergan Inc, Irvine, CA, USA
- Gary Koch
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
9. Koch GG. Commentary on “Statistics at FDA: Reflections on the Past Six Years”. Stat Biopharm Res 2019. [DOI: 10.1080/19466315.2018.1554505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Gary G. Koch
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
10. Bretz F, Xi D. Commentary on “Statistics at FDA: Reflections on the Past Six Years”. Stat Biopharm Res 2019. [DOI: 10.1080/19466315.2018.1562369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Dong Xi
- Novartis Pharmaceuticals, East Hanover, NJ
11. Preussler S, Kieser M, Kirchner M. Optimal sample size allocation and go/no-go decision rules for phase II/III programs where several phase III trials are performed. Biom J 2018; 61:357-378. [PMID: 30182372 DOI: 10.1002/bimj.201700241] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/13/2017] [Revised: 05/08/2018] [Accepted: 07/01/2018] [Indexed: 11/07/2022]
Abstract
The conduct of phase II and III programs is costly, time-consuming and, due to high failure rates in late development stages, risky. There is a strong connection between phase II and III trials as the go/no-go decision and the sample size chosen for phase III are based on the results observed in phase II. An integrated planning of phase II and III is therefore reasonable. The success of phase II/III programs crucially depends on the allocation of the resources to phase II and III in terms of sample size and the rule applied to decide whether to stop or to proceed with phase III. Recently, a utility-based approach was proposed, where optimal planning of phase II/III programs is achieved by taking fixed and variable costs of the drug development program and potential gains after a successful launch into account. However, this method is restricted to programs with a single phase III trial, while regulatory authorities usually require statistical significance in two or more phase III trials. We present a generalization of this procedure to programs where two or more phase III trials are performed. Optimal phase II sample sizes and go/no-go decision rules are provided for time-to-event outcomes and cases, where at least one, two, or three phase III trials need to be successful. Different drug development program strategies (e.g. one large vs. two phase III trials) are compared within these different cases. Application to practical examples typically met in oncology trials illustrates the proposed method.
Affiliation(s)
- Stella Preussler
- Institute of Medical Biometry and Informatics, Medical Biometry, University of Heidelberg, Heidelberg, Germany
- Meinhard Kieser
- Institute of Medical Biometry and Informatics, Medical Biometry, University of Heidelberg, Heidelberg, Germany
- Marietta Kirchner
- Institute of Medical Biometry and Informatics, Medical Biometry, University of Heidelberg, Heidelberg, Germany
12. Neuenschwander B, Roychoudhury S, Branson M. Predictive Evidence Threshold Scaling: Does the Evidence Meet a Confirmatory Standard? Stat Biopharm Res 2018. [DOI: 10.1080/19466315.2017.1392892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
13. Wang SJ, Bretz F, Dmitrienko A, Hsu J, Hung HMJ, Huque M, Koch G. Panel forum on multiple comparison procedures: A commentary from a complex trial design and analysis plan. Biom J 2014; 55:275-93. [DOI: 10.1002/bimj.201200047] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 02/25/2012] [Revised: 12/21/2012] [Accepted: 12/21/2012] [Indexed: 11/10/2022]
Affiliation(s)
- Sue-Jane Wang
- U.S. Food and Drug Administration; HFD-700, WO 21, MailStop Room 3562, 10903 New Hampshire Avenue Silver Spring MD 20993 USA
- Johns Hopkins University; Baltimore MD 21218 USA
- Frank Bretz
- Novartis; WSJ-27.1.005, Novartis, Lichtstr. 35, 4002 Basel Switzerland
- Hannover Medical School; OE 8410, Carl-Neuberg-Str. 1 30625 Hannover Germany
- Jason Hsu
- The Ohio State University; Columbus OH 43210 USA
- H. M. James Hung
- U.S. Food and Drug Administration; HFD-700, WO 21, MailStop Room 3562, 10903 New Hampshire Avenue Silver Spring MD 20993 USA
- Johns Hopkins University; Baltimore MD 21218 USA
- Mohammad Huque
- U.S. Food and Drug Administration; HFD-700, WO 21, MailStop Room 3562, 10903 New Hampshire Avenue Silver Spring MD 20993 USA
- Jiann-Ping Hsu College of Public Health at Georgia Southern University; Statesboro GA 30458 USA
- Gary Koch
- University of North Carolina at Chapel Hill; Chapel Hill NC 27599 USA
14. Zhang J, Zhang JJ. Joint probability of statistical success of multiple phase III trials. Pharm Stat 2013; 12:358-65. [DOI: 10.1002/pst.1597] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/27/2012] [Revised: 08/13/2013] [Accepted: 08/13/2013] [Indexed: 11/11/2022]
Affiliation(s)
- Jenny J. Zhang
- Gilead Sciences; 333 Lakeside Drive, Foster City CA 94404 USA
15.
Affiliation(s)
- Hiroyuki Uesaka
- The Center for Advanced Medical Engineering and Informatics, Osaka University, Osaka, Japan
16. Bretz F, Maurer W, Gallo P. Discussion of “Some Controversial Multiple Testing Problems in Regulatory Applications” by H. M. J. Hung and S.-J. Wang. J Biopharm Stat 2009. [DOI: 10.1080/10543400802541834] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 02/07/2023]
Affiliation(s)
- Frank Bretz
- Novartis Pharma AG, Basel, Switzerland
- Institute of Biometry, Hanover Medical University, Hanover, Germany
- Paul Gallo
- Novartis Pharmaceuticals, East Hanover, New Jersey, USA
17. Denne J, Enas G. “Substantial Evidence” from a Replicated Secondary Analysis, Followed by a Single Prospective Confirmatory Study. Drug Inf J 2008. [DOI: 10.1177/009286150804200205] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022]
18.
Abstract
A common approach to analysing clinical trials with multiple outcomes is to control the probability for the trial as a whole of making at least one incorrect positive finding under any configuration of true and false null hypotheses. Popular approaches are to use Bonferroni corrections or structured approaches such as, for example, closed-test procedures. As is well known, such strategies, which control the family-wise error rate, typically reduce the type I error for some or all of the tests of the various null hypotheses to below the nominal level. In consequence, there is generally a loss of power for individual tests. What is less well appreciated, perhaps, is that depending on approach and circumstances, the test-wise loss of power does not necessarily lead to a family-wise loss of power. In fact, it may be possible to increase the overall power of a trial by carrying out tests on multiple outcomes without increasing the probability of making at least one type I error when all null hypotheses are true. We examine two types of problems to illustrate this. Unstructured testing problems arise typically (but not exclusively) when many outcomes are being measured. We consider the case of more than two hypotheses when a Bonferroni approach is being applied, while for illustration we assume compound symmetry to hold for the correlation of all variables. Using the device of a latent variable it is easy to show that power is not reduced as the number of variables tested increases, provided that the common correlation coefficient is not too high (say less than 0.75). Afterwards, we will consider structured testing problems. Here, multiplicity problems arising from the comparison of more than two treatments, as opposed to more than one measurement, are typical. We conduct a numerical study and conclude again that power is not reduced as the number of tested variables increases.
Affiliation(s)
- Stephen Senn
- Department of Statistics, University of Glasgow, Glasgow, Scotland, UK
19. Gallo P, Maurer W. Challenges in implementing adaptive designs: comments on the viewpoints expressed by regulatory statisticians. Biom J 2006; 48:591-7; discussion 613-22. [PMID: 16972710 DOI: 10.1002/bimj.200610250] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Abstract
This is a discussion of the following three papers appearing in this special issue on adaptive designs: 'FDA's critical path initiative: A perspective on contributions of biostatistics' by Robert T. O'Neill; 'A regulatory view on adaptive/flexible clinical trial design' by H. M. James Hung, Robert T. O'Neill, Sue-Jane Wang and John Lawrence; and 'Confirmatory clinical trials with an adaptive design' by Armin Koch.
Affiliation(s)
- Paul Gallo
- Novartis Pharmaceuticals, East Hanover, NJ, USA
20. Koch GG. Statistical consideration of the strategy for demonstrating clinical evidence of effectiveness—one larger vs two smaller pivotal studies by Z. Shun, E. Chi, S. Durrleman and L. Fisher, Statistics in Medicine 2005;24:1619–1637. Stat Med 2005. [DOI: 10.1002/sim.2016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]