1
Fischer L, Roig MB, Brannath W. An exhaustive ADDIS principle for online FWER control. Biom J 2024; 66:e2300237. [PMID: 38637319 DOI: 10.1002/bimj.202300237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 01/27/2024] [Accepted: 03/09/2024] [Indexed: 04/20/2024]
Abstract
In this paper, we consider online multiple testing with familywise error rate (FWER) control, where the probability of committing at least one type I error remains under control while testing a possibly infinite sequence of hypotheses over time. Currently, adaptive-discard (ADDIS) procedures seem to be the most promising online procedures with FWER control in terms of power. Our main contribution is a uniform improvement of the ADDIS principle and thus of all ADDIS procedures. This means that the methods we propose reject at least as many hypotheses as ADDIS procedures, and in some cases even more, while maintaining FWER control. In addition, we show that there is no other FWER controlling procedure that enlarges the event of rejecting any hypothesis. Finally, we apply the new principle to derive uniform improvements of the ADDIS-Spending and ADDIS-Graph.
Affiliation(s)
- Lasse Fischer
- Competence Center for Clinical Trials Bremen, University of Bremen, Bremen, Germany
- Marta Bofill Roig
- Center for Medical Data Science, Medical University of Vienna, Vienna, Austria
- Werner Brannath
- Competence Center for Clinical Trials Bremen, University of Bremen, Bremen, Germany

2
Blackwell SE. Using the 'Leapfrog' Design as a Simple Form of Adaptive Platform Trial to Develop, Test, and Implement Treatment Personalization Methods in Routine Practice. Administration and Policy in Mental Health and Mental Health Services Research 2024:10.1007/s10488-023-01340-4. [PMID: 38316652 DOI: 10.1007/s10488-023-01340-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/21/2023] [Indexed: 02/07/2024]
Abstract
The route for the development, evaluation and dissemination of personalized psychological therapies is complex and challenging. In particular, the large sample sizes needed to provide adequately powered trials of newly-developed personalization approaches mean that the traditional treatment development route is extremely inefficient. This paper outlines the promise of adaptive platform trials (APT) embedded within routine practice as a method to streamline development and testing of personalized psychological therapies, and close the gap to implementation in real-world settings. It focuses in particular on a recently-developed simplified APT design, the 'leapfrog' trial, illustrating via simulation how such a trial may proceed and the advantages it can bring, for example in terms of reduced sample sizes. Finally, it discusses models of how such trials could be implemented in routine practice, including potential challenges and caveats, alongside a longer-term perspective on the development of personalized psychological treatments.
Affiliation(s)
- Simon E Blackwell
- Department of Clinical Psychology and Experimental Psychopathology, Georg-Elias-Mueller-Institute of Psychology, University of Göttingen, Kurze-Geismar-Str.1, 37073, Göttingen, Germany.

3
Fisher A. Online false discovery rate control for LORD++ and SAFFRON under positive, local dependence. Biom J 2024; 66:e2300177. [PMID: 38102999 DOI: 10.1002/bimj.202300177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 10/25/2023] [Accepted: 11/22/2023] [Indexed: 12/17/2023]
Abstract
Online testing procedures assume that hypotheses are observed in sequence, and allow the significance thresholds for upcoming tests to depend on the test statistics observed so far. Some of the most popular online methods include alpha investing, LORD++, and SAFFRON. These three methods have been shown to provide online control of the "modified" false discovery rate (mFDR) under a condition known as conditional super-uniformity (CS). However, to our knowledge, LORD++ and SAFFRON have only been shown to control the traditional false discovery rate (FDR) under an independence condition on the test statistics. Our work bolsters these results by showing that SAFFRON and LORD++ additionally ensure online control of the FDR under a "local" form of nonnegative dependence. Further, FDR control is maintained under certain types of adaptive stopping rules, such as stopping after a certain number of rejections have been observed. Because alpha investing can be recovered as a special case of the SAFFRON framework, our results immediately apply to alpha investing as well. In the process of deriving these results, we also formally characterize how the conditional super-uniformity assumption implicitly limits the allowed p-value dependencies. This implicit limitation is important not only to our proposed FDR result, but also to many existing mFDR results.
Affiliation(s)
- Aaron Fisher
- Foundation Medicine Inc., Cambridge, Massachusetts, USA

4
Decraene L, Orban de Xivry JJ, Kleeren L, Crotti M, Verheyden G, Ortibus E, Feys H, Mailleux L, Klingels K. In-depth quantification of bimanual coordination using the Kinarm exoskeleton robot in children with unilateral cerebral palsy. J Neuroeng Rehabil 2023; 20:154. [PMID: 37951867 PMCID: PMC10640737 DOI: 10.1186/s12984-023-01278-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 11/01/2023] [Indexed: 11/14/2023] Open
Abstract
BACKGROUND Robots have been proposed as tools to measure bimanual coordination in children with unilateral cerebral palsy (uCP). However, previous research only examined one task and clinical interpretation remains challenging due to the large amount of generated data. This cross-sectional study aims to examine bimanual coordination by using multiple bimanual robotics tasks in children with uCP, and their relation to task execution and unimanual performance. METHODS The Kinarm exoskeleton robot was used in 50 children with uCP (mean age: 11 years 11 months ± 2 years 10 months; Manual Ability Classification System (MACS) levels: I = 27, II = 16, III = 7) and 50 individually matched typically developing children (TDC). All participants performed three tasks: object-hit (hit falling balls), ball-on-bar (balance a ball on a bar while moving to a target) and circuit task (move a cursor along a circuit by making horizontal and vertical motions with their right and left hand, respectively). Bimanual parameters provided information about bimanual coupling and interlimb differences. Differences between groups and MACS levels were investigated using ANCOVA with age as covariate (α < 0.05, [Formula: see text]). Correlation analysis (r) linked bimanual coordination to task execution and unimanual parameters. RESULTS Children with uCP exhibited worse bimanual coordination compared to TDC in all tasks (p ≤ 0.05, [Formula: see text] = 0.05-0.34). The ball-on-bar task displayed high effect size differences between groups in both bimanual coupling and interlimb differences (p < 0.001, [Formula: see text] = 0.18-0.36), while the object-hit task exhibited variations in interlimb differences (p < 0.001, [Formula: see text] = 0.22-0.34) and the circuit task in bimanual coupling (p < 0.001, [Formula: see text] = 0.31). Mainly the performance of the ball-on-bar task (p < 0.05, [Formula: see text] = 0.18-0.51) was modulated by MACS levels, showing that children with MACS level III had worse bimanual coordination compared to children with MACS level I and/or II. Ball-on-bar outcomes were highly related to task execution (r = −0.75 to 0.70), whereas more interlimb differences of the object-hit task were moderately associated with a worse performance of the non-dominant hand (r = −0.69 to −0.53). CONCLUSION This study provides a first insight into important robotic tasks and outcome measures to quantify bimanual coordination deficits in children with uCP. The ball-on-bar task showed the most discriminative ability for both bimanual coupling and interlimb differences, while the object-hit and circuit tasks are unique to interlimb differences and bimanual coupling, respectively.
Affiliation(s)
- Lisa Decraene
- Department of Rehabilitation Sciences, Research Group for Neurorehabilitation, KU Leuven, 3000, Leuven, Belgium.
- REVAL-Rehabilitation Research Centre, Faculty of Rehabilitation Sciences, Hasselt University, 3590, Diepenbeek, Belgium.
- Child and Youth Institute, KU Leuven, 3000, Leuven, Belgium.
- Jean-Jacques Orban de Xivry
- Department of Movement Sciences, Research Group of Motor Control and Neuroplasticity, KU Leuven, 3000, Leuven, Belgium
- Leuven Brain Institute, KU Leuven, 3000, Leuven, Belgium
- Lize Kleeren
- Department of Rehabilitation Sciences, Research Group for Neurorehabilitation, KU Leuven, 3000, Leuven, Belgium
- Child and Youth Institute, KU Leuven, 3000, Leuven, Belgium
- Monica Crotti
- Child and Youth Institute, KU Leuven, 3000, Leuven, Belgium
- Department of Development and Regeneration, KU Leuven, 3000, Leuven, Belgium
- Geert Verheyden
- Department of Rehabilitation Sciences, Research Group for Neurorehabilitation, KU Leuven, 3000, Leuven, Belgium
- Els Ortibus
- Child and Youth Institute, KU Leuven, 3000, Leuven, Belgium
- Department of Development and Regeneration, KU Leuven, 3000, Leuven, Belgium
- Department of Pediatric Neurology, University Hospitals Leuven, 3000, Leuven, Belgium
- Hilde Feys
- Department of Rehabilitation Sciences, Research Group for Neurorehabilitation, KU Leuven, 3000, Leuven, Belgium
- Child and Youth Institute, KU Leuven, 3000, Leuven, Belgium
- Lisa Mailleux
- Department of Rehabilitation Sciences, Research Group for Neurorehabilitation, KU Leuven, 3000, Leuven, Belgium
- Child and Youth Institute, KU Leuven, 3000, Leuven, Belgium
- Katrijn Klingels
- Department of Rehabilitation Sciences, Research Group for Neurorehabilitation, KU Leuven, 3000, Leuven, Belgium
- REVAL-Rehabilitation Research Centre, Faculty of Rehabilitation Sciences, Hasselt University, 3590, Diepenbeek, Belgium

5
Robertson DS, Wason JM, Ramdas A. Online multiple hypothesis testing. Stat Sci 2023; 38:557-575. [PMID: 38223302 PMCID: PMC7615519 DOI: 10.1214/23-sts901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Modern data analysis frequently involves large-scale hypothesis testing, which naturally gives rise to the problem of maintaining control of a suitable type I error rate, such as the false discovery rate (FDR). In many biomedical and technological applications, an additional complexity is that hypotheses are tested in an online manner, one-by-one over time. However, traditional procedures that control the FDR, such as the Benjamini-Hochberg procedure, assume that all p-values are available to be tested at a single time point. To address these challenges, a new field of methodology has developed over the past 15 years showing how to control error rates for online multiple hypothesis testing. In this framework, hypotheses arrive in a stream, and at each time point the analyst decides whether to reject the current hypothesis based both on the evidence against it, and on the previous rejection decisions. In this paper, we present a comprehensive exposition of the literature on online error rate control, with a review of key theory as well as a focus on applied examples. We also provide simulation results comparing different online testing algorithms and an up-to-date overview of the many methodological extensions that have been proposed.
Affiliation(s)
- James M.S. Wason
- Population Health Sciences Institute, Newcastle University, Newcastle, UK
- Aaditya Ramdas
- Departments of Statistics and Machine Learning, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

6
Robertson DS, Wason JMS, König F, Posch M, Jaki T. Online error rate control for platform trials. Stat Med 2023; 42:2475-2495. [PMID: 37005003 PMCID: PMC7614610 DOI: 10.1002/sim.9733] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/03/2021] [Revised: 01/20/2023] [Accepted: 03/18/2023] [Indexed: 04/04/2023]
Abstract
Platform trials evaluate multiple experimental treatments under a single master protocol, where new treatment arms are added to the trial over time. Given the multiple treatment comparisons, there is the potential for inflation of the overall type I error rate, which is complicated by the fact that the hypotheses are tested at different times and are not necessarily pre-specified. Online error rate control methodology provides a possible solution to the problem of multiplicity for platform trials where a relatively large number of hypotheses are expected to be tested over time. In the online multiple hypothesis testing framework, hypotheses are tested one-by-one over time, where at each time-step an analyst decides whether to reject the current null hypothesis without knowledge of future tests but based solely on past decisions. Methodology has recently been developed for online control of the false discovery rate as well as the familywise error rate (FWER). In this article, we describe how to apply online error rate control to the platform trial setting, present extensive simulation results, and give some recommendations for the use of this new methodology in practice. We show that the algorithms for online error rate control can have a substantially lower FWER than uncorrected testing, while still achieving noticeable gains in power when compared with the use of a Bonferroni correction. We also illustrate how online error rate control would have impacted a currently ongoing platform trial.
Affiliation(s)
- David S. Robertson
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- James M. S. Wason
- Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Franz König
- Section of Medical Statistics, Medical University of Vienna, Vienna, Austria
- Martin Posch
- Section of Medical Statistics, Medical University of Vienna, Vienna, Austria
- Thomas Jaki
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany

7
Zehetmayer S, Posch M, Koenig F. Online control of the False Discovery Rate in group-sequential platform trials. Stat Methods Med Res 2022; 31:2470-2485. [PMID: 36189481 PMCID: PMC10130539 DOI: 10.1177/09622802221129051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/15/2022]
Abstract
When testing multiple hypotheses, a suitable error rate should be controlled even in exploratory trials. Conventional methods to control the False Discovery Rate assume that all p-values are available at the time point of test decision. In platform trials, however, treatment arms enter and leave the trial at different times during its conduct. Therefore, the actual number of treatments and hypothesis tests is not fixed in advance and hypotheses are not tested at once, but sequentially. Recently, the concept of online control of the False Discovery Rate was introduced for such a setting. We propose several heuristic variations of the LOND procedure (significance Levels based On Number of Discoveries) that incorporate interim analyses for platform trials, and study their online False Discovery Rate via simulations. To adjust for the interim looks, spending functions are applied with O'Brien-Fleming or Pocock type group-sequential boundaries. The power depends on the prior distribution of effect sizes, for example, whether true alternatives are uniformly distributed over time or not. We consider the choice of design parameters for the LOND procedure to maximize the overall power and investigate the impact on the False Discovery Rate by including both concurrent and non-concurrent control data.
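As an illustration of the LOND rule named in the abstract above, the following is a minimal sketch of the basic procedure only, in which the i-th hypothesis is tested at level α·γ_i·(D + 1), with D the number of discoveries made before step i. The function names, the default α, and the choice γ_i = 6/(π² i²) are assumptions made for this example; the interim analyses and group-sequential boundaries studied in the article are not reproduced here.

```python
import math


def gamma_weight(i: int) -> float:
    """Hypothetical weight sequence: non-negative, sums to 1 over i = 1, 2, ..."""
    return 6.0 / (math.pi ** 2 * i ** 2)


def lond(p_values, alpha=0.05):
    """Basic LOND sketch: the test level grows with the number of earlier discoveries."""
    decisions = []
    discoveries = 0  # rejections made before the current hypothesis
    for i, p in enumerate(p_values, start=1):
        level = alpha * gamma_weight(i) * (discoveries + 1)
        reject = p <= level
        if reject:
            discoveries += 1
        decisions.append(reject)
    return decisions


if __name__ == "__main__":
    print(lond([0.001, 0.01, 0.5, 0.005]))
```

Each rejection raises the level available to later hypotheses, which is why the second p-value in the usage line can be rejected even though its weight γ_2 alone would not allow it.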
Affiliation(s)
- Sonja Zehetmayer
- Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Martin Posch
- Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Franz Koenig
- Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria

8
Liou L, Hornburg M, Robertson DS. Global FDR control across multiple RNAseq experiments. Bioinformatics 2022; 39:6795009. [PMID: 36326442 PMCID: PMC9805573 DOI: 10.1093/bioinformatics/btac718] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/13/2022] [Revised: 10/16/2022] [Accepted: 11/02/2022] [Indexed: 11/06/2022] Open
Abstract
MOTIVATION While classical approaches for controlling the false discovery rate (FDR) of RNA sequencing (RNAseq) experiments have been well described, modern research workflows and growing databases enable a new paradigm of controlling the FDR globally across RNAseq experiments in the past, present and future. The simplest analysis strategy, which analyses each RNAseq experiment separately and applies an FDR correction method, can lead to inflation of the overall FDR. We propose applying recently developed methodology for online multiple hypothesis testing to control the global FDR in a principled way across multiple RNAseq experiments. RESULTS We show that repeated application of classical offline approaches has variable control of global FDR of RNAseq experiments over time. We demonstrate that the online FDR algorithms are a principled way to control FDR. Furthermore, in certain simulation scenarios, we observe empirically that online approaches have comparable power to repeated offline approaches. AVAILABILITY AND IMPLEMENTATION The onlineFDR package is freely available at http://www.bioconductor.org/packages/onlineFDR. Additional code used for the simulation studies can be found at https://github.com/latlio/onlinefdr_rnaseq_simulation. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Milena Hornburg
- Merck Research Laboratories, Merck & Co., Kenilworth, NJ 07033, USA
- David S Robertson
- MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK

9
Soil Microbial Co-Occurrence Patterns under Controlled-Release Urea and Fulvic Acid Applications. Microorganisms 2022; 10:microorganisms10091823. [PMID: 36144425 PMCID: PMC9502011 DOI: 10.3390/microorganisms10091823] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/29/2022] [Revised: 08/17/2022] [Accepted: 09/07/2022] [Indexed: 11/17/2022] Open
Abstract
The increasing number of agricultural applications of controlled-release urea (CRU) and fulvic acids (FA) demands a better understanding of FA’s effects on microbially mediated nitrogen (N) nutrient cycling. Herein, a 0–60 day laboratory experiment and a consecutive pot experiment (2016–2018) were carried out to reveal the effects of using CRU on soil microbial N-cycling processes and soil fertility, with and without the application of FA. Compared to the CRU treatment, the CRU+FA treatment boosted wheat yield by 22.1%. To reveal the mechanism by which CRU+FA affects soil fertility, soil nutrient supply and microbial community were assessed and contrasted in this research. From 0–60 days, compared with the CRU treatment, the leaching NO3−-N content of CRU+FA was dramatically decreased by 12.7–84.2% in the 20 cm depth of the soil column. Both the type of fertilizer and the day of fertilization have an impact on the soil microbiota. The most dominant bacterial phyla, Actinobacteria and Proteobacteria, were increased with CRU+FA treatment during 0–60 days. Network analysis revealed that microbial co-occurrence grew more intensive during the CRU+FA treatment, and the environmental change enhanced the microbial community. The CRU+FA treatment, in particular, significantly decreased the relative abundance of Sphingomonas, Lysobacter and Nitrospira, associated with nitrification reactions, and of Nocardioides and Gaiella, related to denitrification reactions. Meanwhile, the CRU+FA treatment increased the relative abundance of Ensifer, Blastococcus, and Pseudolabrys, which function in N fixation, and could thereby reduce NH4+-N and NO3−-N leaching and improve the soil nutrient supply. In conclusion, the synergistic effects of the slow nutrient release of CRU and the growth promotion of FA could improve the soil microbial community of the N cycle, reduce the loss of nutrients, and increase the wheat yield.

10
Lin B, Pang Z, Zhang J, Chen C. Fast feature selection via streamwise procedure for massive data. Braz J Probab Stat 2022. [DOI: 10.1214/21-bjps516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Bingqing Lin
- Shenzhen Key Laboratory of Advanced Machine Learning and Applications, College of Mathematics and Statistics, Shenzhen University, 518060, China
- Zhen Pang
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong
- Jun Zhang
- College of Mathematics and Statistics, Shenzhen University, Shenzhen, 518060, China
- Cuiqing Chen
- College of Mathematics and Statistics, Shenzhen University, Shenzhen, 518060, China

11
Hahn G. Online multivariate changepoint detection with type I error control and constant time/memory updates per series. Stat Probab Lett 2022. [DOI: 10.1016/j.spl.2021.109258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]

12
Determination of molecular signatures and pathways common to brain tissues of autism spectrum disorder: Insights from comprehensive bioinformatics approach. Informatics in Medicine Unlocked 2022. [DOI: 10.1016/j.imu.2022.100871] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022] Open

13
Li Y, Hu YJ, Satten GA. A Bottom-up Approach to Testing Hypotheses That Have a Branching Tree Dependence Structure, with Error Rate Control. J Am Stat Assoc 2022; 117:664-677. [PMID: 35814292 PMCID: PMC9269868 DOI: 10.1080/01621459.2020.1799811] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/03/2023]
Abstract
Modern statistical analyses often involve testing large numbers of hypotheses. In many situations, these hypotheses may have an underlying tree structure that both helps determine the order in which tests should be conducted and imposes a dependency between tests that must be accounted for. Our motivating example comes from testing the association between a trait of interest and groups of microbes that have been organized into operational taxonomic units (OTUs) or amplicon sequence variants (ASVs). Given p-values from association tests for each individual OTU or ASV, we would like to know if we can declare a certain species, genus, or higher taxonomic group to be associated with the trait. For this problem, a bottom-up testing algorithm that starts at the lowest level of the tree (OTUs or ASVs) and proceeds upward through successively higher taxonomic groupings (species, genus, family etc.) is required. We develop such a bottom-up testing algorithm that controls a novel error rate that we call the false selection rate. By simulation, we also show that our approach is better at finding driver taxa, the highest level taxa below which there are dense association signals. We illustrate our approach using data from a study of the microbiome among patients with ulcerative colitis and healthy controls.
Affiliation(s)
- Yunxiao Li
- Department of Biostatistics and Bioinformatics, Emory University
- Yi-Juan Hu
- Department of Biostatistics and Bioinformatics, Emory University (corresponding author)

14
Gang B, Sun W, Wang W. Structure–Adaptive Sequential Testing for Online False Discovery Rate Control. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1955688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Bowen Gang
- Department of Statistics, Fudan University, Shanghai, China
- Wenguang Sun
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA

15
Liu Y, Ročková V. Variable Selection Via Thompson Sampling. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1928514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Yi Liu
- Department of Statistics, University of Chicago, Chicago, IL

16
Badsha MB, Martin EA, Fu AQ. MRPC: An R Package for Inference of Causal Graphs. Front Genet 2021; 12:651812. [PMID: 33995486 PMCID: PMC8120292 DOI: 10.3389/fgene.2021.651812] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/11/2021] [Accepted: 04/06/2021] [Indexed: 11/24/2022] Open
Abstract
Understanding the causal relationships between variables is a central goal of many scientific inquiries. Causal relationships may be represented by directed edges in a graph (or equivalently, a network). In biology, for example, gene regulatory networks may be viewed as a type of causal network, where X→Y represents gene X regulating (i.e., being causal to) gene Y. However, existing general-purpose graph inference methods often result in a high number of false edges, whereas current causal inference methods developed for observational data in genomics can handle only limited types of causal relationships. We present MRPC (a PC algorithm with the principle of Mendelian Randomization), an R package that learns causal graphs with improved accuracy over existing methods. Our algorithm builds on the powerful PC algorithm (named after its developers Peter Spirtes and Clark Glymour), a canonical algorithm in computer science for learning directed acyclic graphs. The improvements in MRPC result in increased accuracy in identifying v-structures (i.e., X→Y←Z), and robustness to how the nodes are arranged in the input data. In the special case of genomic data that contain genotypes and phenotypes (e.g., gene expression) at the individual level, MRPC incorporates the principle of Mendelian randomization as constraints on edge direction to help orient the edges. MRPC allows for inference of causal graphs not only for general purposes, but also for biomedical data where multiple types of data may be input to provide evidence for causality. The R package is available on CRAN and is a free open-source software package under a GPL (≥2) license.
Affiliation(s)
- Md. Bahadur Badsha
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, United States
- Evan A. Martin
- The Graduate Program in Bioinformatics and Computational Biology, University of Idaho, Moscow, ID, United States
- Audrey Qiuyan Fu
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, United States
- Department of Mathematics and Statistical Science, Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, ID, United States

17
Goeman JJ, Hemerik J, Solari A. Only closed testing procedures are admissible for controlling false discovery proportions. Ann Stat 2021. [DOI: 10.1214/20-aos1999] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
Affiliation(s)
- Jelle J. Goeman
- Department of Biomedical Data Sciences, Leiden University Medical Center
- Jesse Hemerik
- Oslo Centre for Biostatistics and Epidemiology, University of Oslo, and Biometris, Wageningen University & Research
- Aldo Solari
- Department of Economics, Management and Statistics, University of Milano-Bicocca

18
Lee KM, Brown LC, Jaki T, Stallard N, Wason J. Statistical consideration when adding new arms to ongoing clinical trials: the potentials and the caveats. Trials 2021; 22:203. [PMID: 33691748 PMCID: PMC7944243 DOI: 10.1186/s13063-021-05150-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/31/2020] [Accepted: 02/24/2021] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Platform trials improve the efficiency of the drug development process through flexible features such as adding and dropping arms as evidence emerges. The benefits and practical challenges of implementing novel trial designs have been discussed widely in the literature, yet less consideration has been given to the statistical implications of adding arms. MAIN We explain different statistical considerations that arise from allowing new research interventions to be added to ongoing studies. We present recent methodology development on addressing these issues and illustrate design and analysis approaches that might be enhanced to provide robust inference from platform trials. We also discuss the implication of changing the control arm, how patient eligibility for different arms may complicate the trial design and analysis, and how operational bias may arise when revealing some results of the trials. Lastly, we comment on the appropriateness and the application of platform trials in phase II and phase III settings, as well as publicly versus industry-funded trials. CONCLUSION Platform trials provide great opportunities for improving the efficiency of evaluating interventions. Although several statistical issues are present, there is a range of methods available that allow robust and efficient design and analysis of these trials.
Collapse
Affiliation(s)
- Kim May Lee
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SR, UK.
- Pragmatic Clinical Trials Unit, Queen Mary University of London, Yvonne Carter Building, 58 Turner Street, London, E1 2AB, UK.
| | - Louise C Brown
- MRC Clinical Trials Unit, University College London, 90 High Holborn 2nd Floor, London, WC1V 6LJ, UK
| | - Thomas Jaki
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SR, UK
- Medical and Pharmaceutical Statistics Research Unit, Department of Mathematics and Statistics, Lancaster University, Lancaster, UK
| | - Nigel Stallard
- Statistics and Epidemiology, Division of Health Sciences, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| | - James Wason
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SR, UK
- Population Health Sciences Institute, Baddiley-Clark Building, Newcastle University, Richardson Road, Newcastle upon Tyne, UK
| |
|
19
|
Abstract
Biological research often involves testing a growing number of null hypotheses as new data are accumulated over time. We study the problem of online control of the familywise error rate, that is testing an a priori unbounded sequence of hypotheses (p-values) one by one over time without knowing the future, such that with high probability there are no false discoveries in the entire sequence. This paper unifies algorithmic concepts developed for offline (single batch) familywise error rate control and online false discovery rate control to develop novel online familywise error rate control methods. Though many offline familywise error rate methods (e.g., Bonferroni, fallback procedures and Sidak's method) can trivially be extended to the online setting, our main contribution is the design of new, powerful, adaptive online algorithms that control the familywise error rate when the p-values are independent or locally dependent in time. Our numerical experiments demonstrate substantial gains in power, that are also formally proved in an idealized Gaussian sequence model. A promising application to the International Mouse Phenotyping Consortium is described.
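The remark that Bonferroni extends trivially to the online setting can be illustrated with a minimal sketch. The function name `online_bonferroni` and the spending sequence gamma_i = 6 / (pi^2 i^2) are assumptions chosen for illustration, not taken from the paper; any nonnegative sequence summing to at most 1 would serve. This is the simple baseline, not the adaptive algorithms the abstract proposes.

```python
import math


def online_bonferroni(p_values, alpha=0.05):
    """Test p-values one at a time, spending alpha * gamma_i on the i-th test.

    gamma_i = 6 / (pi^2 * i^2) is an illustrative choice (an assumption);
    it sums to 1 over i = 1, 2, ..., so by the union bound the probability
    of any false rejection stays at or below alpha, however long the
    sequence runs.
    """
    decisions = []
    for i, p in enumerate(p_values, start=1):
        # Level for the i-th hypothesis depends only on its position,
        # never on future p-values.
        level = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        decisions.append(p <= level)
    return decisions


if __name__ == "__main__":
    print(online_bonferroni([0.001, 0.5, 0.0001]))
```

Because the levels shrink quickly and are never recovered, this baseline loses power on long sequences, which is the gap adaptive online procedures aim to close.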
Affiliation(s)
- Jinjin Tian
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Aaditya Ramdas
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Machine Learning, Carnegie Mellon University, Pittsburgh, PA, USA
| |
|
20
|
He X, Bartroff J. Asymptotically optimal sequential FDR and pFDR control with (or without) prior information on the number of signals. J Stat Plan Inference 2021. [DOI: 10.1016/j.jspi.2020.05.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
21
|
Katsevich E, Ramdas A. Simultaneous high-probability bounds on the false discovery proportion in structured, regression and online settings. Ann Stat 2020. [DOI: 10.1214/19-aos1938] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022]
|
22
|
Robertson DS, Wason JMS, Bretz F. Graphical approaches for the control of generalized error rates. Stat Med 2020; 39:3135-3155. [PMID: 32557848 PMCID: PMC7612110 DOI: 10.1002/sim.8595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2020] [Accepted: 05/10/2020] [Indexed: 11/12/2022]
Abstract
When simultaneously testing multiple hypotheses, the usual approach in the context of confirmatory clinical trials is to control the familywise error rate (FWER), which bounds the probability of making at least one false rejection. In many trial settings, these hypotheses will additionally have a hierarchical structure that reflects the relative importance and links between different clinical objectives. The graphical approach of Bretz et al (2009) is a flexible and easily communicable way of controlling the FWER while respecting complex trial objectives and multiple structured hypotheses. However, the FWER can be a very stringent criterion that leads to procedures with low power, and may not be appropriate in exploratory trial settings. This motivates controlling generalized error rates, particularly when the number of hypotheses tested is no longer small. We consider the generalized familywise error rate (k-FWER), which is the probability of making k or more false rejections, as well as the tail probability of the false discovery proportion (FDP), which is the probability that the proportion of false rejections is greater than some threshold. We also consider asymptotic control of the false discovery rate, which is the expectation of the FDP. In this article, we show how to control these generalized error rates when using the graphical approach and its extensions. We demonstrate the utility of the resulting graphical procedures on three clinical trial case studies.
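The error rates named in this abstract can be written compactly. The symbols below are assumptions introduced here for illustration: V for the number of false rejections, R for the total number of rejections, and gamma for the FDP threshold.

```latex
\begin{align*}
  \mathrm{FWER} &= \Pr(V \ge 1), \\
  k\text{-}\mathrm{FWER} &= \Pr(V \ge k), \\
  \mathrm{FDP} &= \frac{V}{\max(R, 1)}, \\
  \text{FDP tail probability} &= \Pr(\mathrm{FDP} > \gamma), \\
  \mathrm{FDR} &= \mathbb{E}[\mathrm{FDP}].
\end{align*}
```

Setting k = 1 recovers the usual FWER, which is why the k-FWER is described as a generalization.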
Affiliation(s)
| | - James M S Wason
- MRC Biostatistics Unit, University of Cambridge, Cambridge, UK; Institute of Health and Society, Newcastle University, Newcastle, UK
| | - Frank Bretz
- Statistical Methodology, Novartis Pharma AG, Basel, Switzerland; Section for Medical Statistics, Medical University of Vienna, Vienna, Austria
| |
|
23
|
Robertson DS, Wildenhain J, Javanmard A, Karp NA. onlineFDR: an R package to control the false discovery rate for growing data repositories. Bioinformatics 2020; 35:4196-4199. [PMID: 30873526 PMCID: PMC6792083 DOI: 10.1093/bioinformatics/btz191] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/17/2018] [Revised: 12/18/2018] [Accepted: 03/13/2019] [Indexed: 12/01/2022]
Abstract
Summary In many areas of biological research, hypotheses are tested in a sequential manner, without having access to future P-values or even the number of hypotheses to be tested. A key setting where this online hypothesis testing occurs is in the context of publicly available data repositories, where the family of hypotheses to be tested is continually growing as new data is accumulated over time. Recently, Javanmard and Montanari proposed the first procedures that control the FDR for online hypothesis testing. We present an R package, onlineFDR, which implements these procedures and provides wrapper functions to apply them to a historic dataset or a growing data repository. Availability and implementation The R package is freely available through Bioconductor (http://www.bioconductor.org/packages/onlineFDR). Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Jan Wildenhain
- Quantitative Biology, Discovery Sciences, IMED Biotech Unit, AstraZeneca, Cambridge, UK
| | - Adel Javanmard
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA, USA
| | - Natasha A Karp
- Quantitative Biology, Discovery Sciences, IMED Biotech Unit, AstraZeneca, Cambridge, UK
| |
|
24
|
Chen S, Arias-Castro E. On the power of some sequential multiple testing procedures. Ann Inst Stat Math 2020. [DOI: 10.1007/s10463-020-00752-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
25
|
Cui Z, Kancherla J, Chang KW, Elmqvist N, Corrada Bravo H. Proactive visual and statistical analysis of genomic data in Epiviz. Bioinformatics 2020; 36:2195-2201. [PMID: 31782758 DOI: 10.1093/bioinformatics/btz883] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2019] [Revised: 11/04/2019] [Accepted: 11/27/2019] [Indexed: 11/12/2022]
Abstract
MOTIVATION Integrative analysis of genomic data that includes statistical methods in combination with visual exploration has gained widespread adoption. Many existing methods involve a combination of tools and resources: user interfaces that provide visualization of large genomic datasets, and computational environments that focus on data analyses over various subsets of a given dataset. Over the last few years, we have developed Epiviz as an integrative and interactive genomic data analysis tool that incorporates visualization tightly with state-of-the-art statistical analysis framework. RESULTS In this article, we present Epiviz Feed, a proactive and automatic visual analytics system integrated with Epiviz that alleviates the burden of manually executing data analysis required to test biologically meaningful hypotheses. Results of interest that are proactively identified by server-side computations are listed as notifications in a feed. The feed turns genomic data analysis into a collaborative work between the analyst and the computational environment, which shortens the analysis time and allows the analyst to explore results efficiently. We discuss three ways where the proposed system advances the field of genomic data analysis: (i) takes the first step of proactive data analysis by utilizing available CPU power from the server to automate the analysis process; (ii) summarizes hypothesis test results in a way that analysts can easily understand and investigate; (iii) enables filtering and grouping of analysis results for quick search. This effort provides initial work on systems that substantially expand how computational and visualization frameworks can be tightly integrated to facilitate interactive genomic data analysis. AVAILABILITY AND IMPLEMENTATION The source code for Epiviz Feed application is available at http://github.com/epiviz/epiviz_feed_polymer. The Epiviz Computational Server is available at http://github.com/epiviz/epiviz-feed-computation. Please refer to Epiviz documentation site for details: http://epiviz.github.io/.
Affiliation(s)
- Zhe Cui
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742, USA; Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA; Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA; Human-Computer Interaction Laboratory, University of Maryland, College Park, MD 20742, USA
| | - Jayaram Kancherla
- Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA; Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA; Department of Computer Science, University of Maryland, College Park, MD 20742, USA
| | - Kyle W Chang
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA
| | - Niklas Elmqvist
- Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA; Human-Computer Interaction Laboratory, University of Maryland, College Park, MD 20742, USA; Department of Computer Science, University of Maryland, College Park, MD 20742, USA; College of Information Studies, University of Maryland, College Park, MD 20742, USA
| | - Héctor Corrada Bravo
- Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA; Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA; Department of Computer Science, University of Maryland, College Park, MD 20742, USA
| |
|
26
|
Bartroff J, Song J. Sequential Tests of Multiple Hypotheses Controlling False Discovery and Nondiscovery Rates. Seq Anal 2020; 39:65-91. [PMID: 33776197 DOI: 10.1080/07474946.2020.1726686] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]
Abstract
We propose a general and flexible procedure for testing multiple hypotheses about sequential (or streaming) data that simultaneously controls both the false discovery rate (FDR) and false nondiscovery rate (FNR) under minimal assumptions about the data streams which may differ in distribution, dimension, and be dependent. All that is needed is a test statistic for each data stream that controls its conventional type I and II error probabilities, and no information or assumptions are required about the joint distribution of the statistics or data streams. The procedure can be used with sequential, group sequential, truncated, or other sampling schemes. The procedure is a natural extension of Benjamini and Hochberg's (1995) widely-used fixed sample size procedure to the domain of sequential data, with the added benefit of simultaneous FDR and FNR control that sequential sampling affords. We prove the procedure's error control and give some tips for implementation in commonly encountered testing situations.
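For orientation, the fixed-sample Benjamini and Hochberg (1995) step-up procedure that this abstract takes as its starting point can be sketched as follows. This is only the classical procedure, not the authors' sequential extension, and the function name `benjamini_hochberg` is an illustrative choice.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical fixed-sample Benjamini-Hochberg step-up procedure.

    Returns a list of booleans (True = reject) in the input order.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered p-value is <= k * alpha / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k = rank
    # Reject the k smallest p-values (step-up: earlier ranks need not
    # individually meet their own thresholds).
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_hochberg([0.02, 0.03, 0.035, 0.5]))
```

In the example call, the two smallest p-values miss their own thresholds, but the third meets its threshold, so all three are rejected; this step-up behavior is what distinguishes the procedure from a per-rank comparison.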
Affiliation(s)
- Jay Bartroff
- Department of Mathematics, University of Southern California, Los Angeles, California, USA
| | - Jinlin Song
- Analysis Group, Inc., Los Angeles, California, USA
| |
|
27
|
Xue Y, Wang H, Yan J, Schifano ED. An online updating approach for testing the proportional hazards assumption with streams of survival data. Biometrics 2019; 76:171-182. [PMID: 31424095 DOI: 10.1111/biom.13137] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/03/2018] [Accepted: 08/07/2019] [Indexed: 11/28/2022]
Abstract
The Cox model, which remains the first choice for analyzing time-to-event data, even for large data sets, relies on the proportional hazards (PH) assumption. When survival data arrive sequentially in chunks, a fast and minimally storage-intensive approach to test the PH assumption is desirable. We propose an online updating approach that updates the standard test statistic as each new block of data becomes available and greatly lightens the computational burden. Under the null hypothesis of PH, the proposed statistic is shown to have the same asymptotic distribution as the standard version computed on an entire data stream with the data blocks pooled into one data set. In simulation studies, the test and its variant based on most recent data blocks maintain their sizes when the PH assumption holds and have substantial power to detect different violations of the PH assumption. We also show in simulation that our approach can be used successfully with "big data" that exceed a single computer's computational resources. The approach is illustrated with the survival analysis of patients with lymphoma cancer from the Surveillance, Epidemiology, and End Results Program. The proposed test promptly identified deviation from the PH assumption, which was not captured by the test based on the entire data.
Affiliation(s)
- Yishu Xue
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - HaiYing Wang
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Jun Yan
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | | |
|
28
|
Ramdas AK, Barber RF, Wainwright MJ, Jordan MI. A unified treatment of multiple testing with prior knowledge using the p-filter. Ann Stat 2019. [DOI: 10.1214/18-aos1765] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/19/2022]
|
29
|
Ramdas A, Chen J, Wainwright MJ, Jordan MI. A sequential algorithm for false discovery rate control on directed acyclic graphs. Biometrika 2019. [DOI: 10.1093/biomet/asy066] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022]
Affiliation(s)
- Aaditya Ramdas
- Department of Statistics and Data Science, Carnegie Mellon University, 132H Baker Hall, Pittsburgh, Pennsylvania, USA
| | - Jianbo Chen
- Department of Statistics, University of California, 367 Evans Hall, Berkeley, California, USA
| | - Martin J Wainwright
- Department of Statistics, University of California, 367 Evans Hall, Berkeley, California, USA
| | - Michael I Jordan
- Department of Statistics, University of California, 367 Evans Hall, Berkeley, California, USA
| |
|
30
|
Madrid Padilla OH, Athey A, Reinhart A, Scott JG. Sequential Nonparametric Tests for a Change in Distribution: An Application to Detecting Radiological Anomalies. J Am Stat Assoc 2018. [DOI: 10.1080/01621459.2018.1476245] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 10/14/2022]
Affiliation(s)
| | - Alex Athey
- Applied Research Laboratories, University of Texas at Austin, Austin, TX
| | - Alex Reinhart
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA
| | - James G. Scott
- Department of Statistics and Data Sciences and McCombs School of Business, University of Texas at Austin, Austin, TX
| |
|