1
Clark DA, Handcock MS. Causal inference over stochastic networks. Journal of the Royal Statistical Society. Series A (Statistics in Society) 2024; 187:772-795. [PMID: 39281781 PMCID: PMC11393554 DOI: 10.1093/jrsssa/qnae001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 10/25/2023] [Accepted: 12/02/2023] [Indexed: 09/18/2024]
Abstract
Claiming causal inferences in network settings necessitates careful consideration of the often complex dependency between outcomes for actors. Of particular importance are treatment spillover or outcome interference effects. We consider causal inference when the actors are connected via an underlying network structure. Our key contribution is a model for causality when the underlying network is endogenous, that is, when the ties between actors and the actor covariates are statistically dependent. We develop a joint model for the relational and covariate generating process that avoids restrictive separability and fixed network assumptions, as these rarely hold in realistic social settings. While our framework can be used with general models, we develop the highly expressive class of Exponential-family Random Network models (ERNM), of which Markov random fields and Exponential-family Random Graph models are special cases. We present potential outcome-based inference within a Bayesian framework and propose a modification to the exchange algorithm to allow for sampling from ERNM posteriors. We present results of a simulation study demonstrating the validity of the approach. Finally, we demonstrate the value of the framework in a case study of smoking in the context of adolescent friendship networks.
Affiliation(s)
- Duncan A Clark
- Department of Statistics & Data Science, University of California - Los Angeles, Los Angeles, CA, USA
- Mark S Handcock
- Department of Statistics & Data Science, University of California - Los Angeles, Los Angeles, CA, USA
2
Egami N, Tchetgen Tchetgen EJ. Identification and estimation of causal peer effects using double negative controls for unmeasured network confounding. J R Stat Soc Series B Stat Methodol 2024; 86:487-511. [PMID: 38618143 PMCID: PMC11009281 DOI: 10.1093/jrsssb/qkad132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2021] [Revised: 09/27/2023] [Accepted: 10/06/2023] [Indexed: 04/16/2024]
Abstract
Identification and estimation of causal peer effects are challenging in observational studies for two reasons. The first is the identification challenge due to unmeasured network confounding, for example, homophily bias and contextual confounding. The second is network dependence of observations. We establish a framework that leverages a pair of negative control outcome and exposure variables (double negative controls) to non-parametrically identify causal peer effects in the presence of unmeasured network confounding. We then propose a generalised method of moments estimator and establish its consistency and asymptotic normality under an assumption about ψ-network dependence. Finally, we provide a consistent variance estimator.
Affiliation(s)
- Naoki Egami
- Department of Political Science, Columbia University, New York, NY, USA
- Eric J Tchetgen Tchetgen
- Department of Statistics and Data Science and Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
3
Malenica I, Coyle JR, van der Laan MJ, Petersen ML. Adaptive sequential surveillance with network and temporal dependence. Biometrics 2024; 80:ujad007. [PMID: 38281772 PMCID: PMC10826884 DOI: 10.1093/biomtc/ujad007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 08/17/2023] [Accepted: 10/31/2023] [Indexed: 01/30/2024]
Abstract
Strategic test allocation is important for control of both emerging and existing pandemics (eg, COVID-19, HIV). It supports effective epidemic control by (1) reducing transmission via identifying cases and (2) tracking outbreak dynamics to inform targeted interventions. However, infectious disease surveillance presents unique statistical challenges. For instance, the true outcome of interest (positive infection status) is often a latent variable. In addition, presence of both network and temporal dependence reduces data to a single observation. In this work, we study an adaptive sequential design, which allows for unspecified dependence among individuals and across time. Our causal parameter is the mean latent outcome we would have obtained, if, starting at time t given the observed past, we had carried out a stochastic intervention that maximizes the outcome under a resource constraint. The key strength of the method is that we do not have to model network and time dependence: a short-term performance Online Super Learner is used to select among dependence models and randomization schemes. The proposed strategy learns the optimal choice of testing over time while adapting to the current state of the outbreak and learning across samples, through time, or both. We demonstrate the superior performance of the proposed strategy in an agent-based simulation modeling a residential university environment during the COVID-19 pandemic.
Affiliation(s)
- Ivana Malenica
- Department of Statistics, Harvard University, Cambridge, MA 02138, United States
- Division of Biostatistics, Berkeley, CA 94704, United States
- Maya L Petersen
- Division of Biostatistics, Berkeley, CA 94704, United States
4
Smith MJ, Phillips RV, Luque-Fernandez MA, Maringe C. Application of targeted maximum likelihood estimation in public health and epidemiological studies: a systematic review. Ann Epidemiol 2023; 86:34-48.e28. [PMID: 37343734 DOI: 10.1016/j.annepidem.2023.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 05/24/2023] [Accepted: 06/06/2023] [Indexed: 06/23/2023]
Abstract
Purpose: The targeted maximum likelihood estimation (TMLE) statistical data analysis framework integrates machine learning, statistical theory, and statistical inference to provide a least biased, efficient, and robust strategy for estimation and inference of a variety of statistical and causal parameters. We describe and evaluate the epidemiological applications that have benefited from recent methodological developments. Methods: We conducted a systematic literature review in PubMed for articles that applied any form of TMLE in observational studies. We summarized the epidemiological discipline, geographical location, expertise of the authors, and TMLE methods over time. We used the Roadmap of Targeted Learning and Causal Inference to extract key methodological aspects of the publications. We showcase the contributions to the literature of these TMLE results. Results: Of the 89 publications included, 33% originated from the University of California at Berkeley, where the framework was first developed by Professor Mark van der Laan. By 2022, 59% of the publications originated from outside the United States and explored up to seven different epidemiological disciplines in 2021-2022. Double-robustness, bias reduction, and model misspecification were the main motivations that drew researchers toward the TMLE framework. Through time, a wide variety of methodological, tutorial, and software-specific articles were cited, owing to the constant growth of methodological developments around TMLE. Conclusions: There is a clear dissemination trend of the TMLE framework to various epidemiological disciplines and to increasing numbers of geographical areas. The availability of R packages, publication of tutorial papers, and involvement of methodological experts in applied publications have contributed to an exponential increase in the number of studies that recognize the benefits of TMLE and adopt it.
Affiliation(s)
- Matthew J Smith
- Inequalities in Cancer Outcomes Network, London School of Hygiene and Tropical Medicine, London, UK
- Rachael V Phillips
- Division of Biostatistics, School of Public Health, University of California at Berkeley, Berkeley, CA
- Miguel Angel Luque-Fernandez
- Inequalities in Cancer Outcomes Network, London School of Hygiene and Tropical Medicine, London, UK; Department of Statistics and Operations Research, University of Granada, Granada, Spain
- Camille Maringe
- Inequalities in Cancer Outcomes Network, London School of Hygiene and Tropical Medicine, London, UK
5
Novak A, Boutwell BB, Smith TB. Taking the problem of colliders seriously in the study of crime: A research note. Journal of Experimental Criminology 2023:1-10. [PMID: 37361450 PMCID: PMC10061360 DOI: 10.1007/s11292-023-09565-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/20/2023] [Indexed: 06/28/2023]
Abstract
Objectives: We provide a brief overview of collider bias and its implications for criminological research. Methods: Owing to the nature of the topics studied, as well as the common data sources used to carry out much of this research, work in the field may often become vulnerable to a specific methodological problem known as collider bias. Collider bias occurs when exposure variables and outcomes independently cause a third variable, and this variable is included in statistical models. Colliders represent something of a paradox in that there is scholarship discussing the issue, yet it has managed to remain a relatively cryptic threat compared to other sources of bias. Results: We argue that, far from being an obscure concern, colliders almost certainly have pervasive impact in criminal justice and criminology. Conclusion: We close by offering a general set of strategies for addressing the challenges posed by collider bias. While there is no panacea, there are better practices, many of which are underutilized in the disciplines that study crime and its attendant topics.
Affiliation(s)
- Abigail Novak
- The University of Mississippi, Oxford, Mississippi USA
- Brian B. Boutwell
- The University of Mississippi, Oxford, Mississippi USA
- University of Mississippi Medical Center, Jackson, USA
- Thomas Bryan Smith
- Bureau of Economic and Business Research, University of Florida, Gainesville, USA
6
Buchanan AL, Katenka N, Lee Y, Wu J, Pantavou K, Friedman SR, Halloran ME, Marshall BDL, Forastiere L, Nikolopoulos GK. Methods for Assessing Spillover in Network-Based Studies of HIV/AIDS Prevention among People Who Use Drugs. Pathogens 2023; 12:326. [PMID: 36839598 PMCID: PMC9967280 DOI: 10.3390/pathogens12020326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 02/03/2023] [Accepted: 02/08/2023] [Indexed: 02/17/2023] Open
Abstract
Human Immunodeficiency Virus (HIV) interventions among people who use drugs (PWUD) often have spillover, also known as interference or dissemination, which occurs when one participant's exposure affects another participant's outcome. PWUD are often members of networks defined by social, sexual, and drug-use partnerships, and their receipt of interventions can affect other members in their network. For example, HIV interventions with possible spillover include educational training about HIV risk reduction, pre-exposure prophylaxis, or treatment as prevention. In turn, intervention effects frequently depend on the network structure and intervention coverage levels, and spillover can occur even if not measured in a study, possibly resulting in an underestimation of intervention effects. Recent methodological approaches were developed to assess spillover in the context of network-based studies. This tutorial provides an overview of different study designs for network-based studies and related methodological approaches for assessing spillover in each design. We also provide an overview of other important methodological issues in network studies, including causal influence in networks and missing data. Finally, we highlight applications of different designs and methods from studies of PWUD and conclude with an illustrative example from the Transmission Reduction Intervention Project (TRIP) in Athens, Greece.
Affiliation(s)
- Ashley L. Buchanan
- Department of Pharmacy Practice, University of Rhode Island, Kingston, RI 02881, USA
- Natallia Katenka
- Department of Computer Science and Statistics, University of Rhode Island, Kingston, RI 02881, USA
- Youjin Lee
- Department of Biostatistics, Brown University, Providence, RI 02912, USA
- Jing Wu
- Department of Computer Science and Statistics, University of Rhode Island, Kingston, RI 02881, USA
- Samuel R. Friedman
- Department of Population Health, New York University, New York, NY 10016, USA
- M. Elizabeth Halloran
- Vaccine and Infectious Diseases Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Brandon D. L. Marshall
- Department of Epidemiology, Brown University School of Public Health, Providence, RI 02912, USA
- Laura Forastiere
- Department of Biostatistics, Yale School of Public Health, New Haven, CT 06520, USA
7
Moodie EEM, Stephens DA. Causal inference: Critical developments, past and future. Can J Stat 2022. [DOI: 10.1002/cjs.11718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Erica E. M. Moodie
- Department of Epidemiology and Biostatistics McGill University, 2001 McGill College Ave Montréal Quebec Canada H3A 1G1
- David A. Stephens
- Department of Mathematics and Statistics McGill University, 805 Sherbrooke St W Montréal Quebec Canada H3A 2K6
8
OUP accepted manuscript. Biometrika 2022. [DOI: 10.1093/biomet/asac009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
9
Ogburn EL, Shpitser I, Lee Y. Causal inference, social networks and chain graphs. Journal of the Royal Statistical Society. Series A (Statistics in Society) 2020; 183:1659-1676. [PMID: 34316102 PMCID: PMC8313030 DOI: 10.1111/rssa.12594] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/12/2023]
Abstract
Traditionally, statistical inference and causal inference on human subjects rely on the assumption that individuals are independently affected by treatments or exposures. However, recently there has been increasing interest in settings, such as social networks, where individuals may interact with one another such that treatments may spill over from the treated individual to their social contacts and outcomes may be contagious. Existing models proposed for causal inference using observational data from networks of interacting individuals have two major shortcomings. First, they often require a level of granularity in the data that is infeasible in practice to collect in most settings and, second, the models are high dimensional and often too big to fit to the available data. We illustrate and justify a parsimonious parameterization for network data with interference and contagion. Our parameterization corresponds to a particular family of graphical models known as chain graphs. We argue that, in some settings, chain graph models approximate the marginal distribution of a snapshot of a longitudinal data-generating process on interacting units. We illustrate the use of chain graphs for causal inference about collective decision making in social networks by using data from US Supreme Court decisions between 1994 and 2004 and in simulations.
Affiliation(s)
- Youjin Lee
- University of Pennsylvania, Philadelphia, USA
10
Forastiere L, Airoldi EM, Mealli F. Identification and Estimation of Treatment and Interference Effects in Observational Studies on Networks. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1768100] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/24/2022]
Affiliation(s)
- Edoardo M. Airoldi
- Department of Statistical Science, Fox School of Business, Temple University, Philadelphia, PA
- Fabrizia Mealli
- Department of Statistics, Informatics, Applications “G. Parenti”, University of Florence, Florence, Italy
11
Miles CH, Petersen M, van der Laan MJ. Causal inference when counterfactuals depend on the proportion of all subjects exposed. Biometrics 2019; 75:768-777. [PMID: 30714118 PMCID: PMC6679813 DOI: 10.1111/biom.13034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/27/2017] [Accepted: 01/23/2019] [Indexed: 10/27/2022]
Abstract
The assumption that no subject's exposure affects another subject's outcome, known as the no-interference assumption, has long held a foundational position in the study of causal inference. However, this assumption may be violated in many settings, and in recent years has been relaxed considerably. Often this has been achieved with either the aid of a known underlying network, or the assumption that the population can be partitioned into separate groups, between which there is no interference, and within which each subject's outcome may be affected by all the other subjects in the group via the proportion exposed (the stratified interference assumption). In this article, we instead consider a complete interference setting, in which each subject affects every other subject's outcome. In particular, we make the stratified interference assumption for a single group consisting of the entire sample. We show that a targeted maximum likelihood estimator for the i.i.d. setting can be used to estimate a class of causal parameters that includes direct effects and overall effects under certain interventions. This estimator remains doubly-robust, semiparametric efficient, and continues to allow for incorporation of machine learning under our model. We conduct a simulation study, and present results from a data application where we study the effect of a nurse-based triage system on the outcomes of patients receiving HIV care in Kenyan health clinics.
Affiliation(s)
- Caleb H. Miles
- Department of Biostatistics, Columbia Mailman School of Public Health, New York, New York, U.S.A
- Maya Petersen
- Division of Biostatistics, University of California at Berkeley, Berkeley, California, U.S.A
- Division of Epidemiology, University of California at Berkeley, Berkeley, California, U.S.A
- Mark J. van der Laan
- Division of Biostatistics, University of California at Berkeley, Berkeley, California, U.S.A
- Department of Statistics, University of California at Berkeley, Berkeley, California, U.S.A
12
Benjamin-Chung J, Arnold BF, Berger D, Luby SP, Miguel E, Colford JM, Hubbard AE. Spillover effects in epidemiology: parameters, study designs and methodological considerations. Int J Epidemiol 2019; 47:332-347. [PMID: 29106568 PMCID: PMC5837695 DOI: 10.1093/ije/dyx201] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Accepted: 08/25/2017] [Indexed: 11/13/2022] Open
Abstract
Many public health interventions provide benefits that extend beyond their direct recipients and impact people in close physical or social proximity who did not directly receive the intervention themselves. A classic example of this phenomenon is the herd protection provided by many vaccines. If these 'spillover effects' (i.e. 'herd effects') are present in the same direction as the effects on the intended recipients, studies that only estimate direct effects on recipients will likely underestimate the full public health benefits of the intervention. Causal inference assumptions for spillover parameters have been articulated in the vaccine literature, but many studies measuring spillovers of other types of public health interventions have not drawn upon that literature. In conjunction with a systematic review we conducted of spillovers of public health interventions delivered in low- and middle-income countries, we classified the most widely used spillover parameters reported in the empirical literature into a standard notation. General classes of spillover parameters include: cluster-level spillovers; spillovers conditional on treatment or outcome density, distance or the number of treated social network links; and vaccine efficacy parameters related to spillovers. We draw on high quality empirical examples to illustrate each of these parameters. We describe study designs to estimate spillovers and assumptions required to make causal inferences about spillovers. We aim to advance and encourage methods for spillover estimation and reporting by standardizing spillover parameter nomenclature and articulating the causal inference assumptions required to estimate spillovers.
Affiliation(s)
- Jade Benjamin-Chung
- Division of Epidemiology, UC Berkeley School of Public Health, 101 Haviland Hall, Berkeley, CA 94720-7358, USA
- Benjamin F Arnold
- Division of Epidemiology, UC Berkeley School of Public Health, 101 Haviland Hall, Berkeley, CA 94720-7358, USA; Division of Biostatistics, UC Berkeley School of Public Health, 101 Haviland Hall, Berkeley, CA 94720-7358, USA
- David Berger
- Department of Economics, University of California, Berkeley, CA 94720-7358, USA
- Stephen P Luby
- Division of Medicine, Stanford University, Stanford, CA 94305, USA
- Edward Miguel
- Department of Economics, University of California, Berkeley, CA 94720-7358, USA
- John M Colford
- Division of Epidemiology, UC Berkeley School of Public Health, 101 Haviland Hall, Berkeley, CA 94720-7358, USA
- Alan E Hubbard
- Division of Biostatistics, UC Berkeley School of Public Health, 101 Haviland Hall, Berkeley, CA 94720-7358, USA
13
Affiliation(s)
- Susan Athey
- Graduate School of Business, Stanford University, Stanford, CA
- NBER, Cambridge, MA
- Dean Eckles
- MIT Sloan School of Management, Cambridge, MA
- Guido W. Imbens
- Graduate School of Business, Stanford University, Stanford, CA
- NBER, Cambridge, MA
14
Sofrygin O, van der Laan MJ. Semi-Parametric Estimation and Inference for the Mean Outcome of the Single Time-Point Intervention in a Causally Connected Population. Journal of Causal Inference 2017; 5:20160003. [PMID: 29057197 PMCID: PMC5650205 DOI: 10.1515/jci-2016-0003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 06/07/2023]
Abstract
We study the framework for semi-parametric estimation and statistical inference for the sample average treatment-specific mean effects in observational settings where data are collected on a single network of connected units (e.g., in the presence of interference or spillover). Despite recent advances, many of the current statistical methods rely on estimation techniques that assume a particular parametric model for the outcome, even though some of the most important statistical assumptions required by these models are most likely violated in the observational network settings, often resulting in invalid and anti-conservative statistical inference. In this manuscript, we rely on the recent methodological advances for the targeted maximum likelihood estimation (TMLE) of causal effects in a network of causally connected units, to describe an estimation approach that permits more realistic classes of data-generative models and provides valid statistical inference in the context of network-dependent data. The approach is applied to an observational setting with a single time point stochastic intervention. We start by assuming that the true observed data-generating distribution belongs to a large class of semi-parametric statistical models. We then impose some restrictions on the possible set of the data-generative distributions that may belong to our statistical model. For example, we assume that the dependence among units can be fully described by the known network, and that the dependence on other units can be summarized via some known (but otherwise arbitrary) summary measures. We show that under our modeling assumptions, our estimand is equivalent to an estimand in a hypothetical iid data distribution, where the latter distribution is a function of the observed network data-generating distribution. With this key insight in mind, we show that the TMLE for our estimand in dependent network data can be described as a certain iid data TMLE algorithm, also resulting in a new simplified approach to conducting statistical inference. We demonstrate the validity of our approach in a network simulation study. We also extend prior work on dependent-data TMLE towards estimation of novel causal parameters, e.g., the unit-specific direct treatment effects under interference and the effects of interventions that modify the initial network structure.
Affiliation(s)
- Oleg Sofrygin
- Department of Biostatistics, University of California, Berkeley, 101 Haviland Hall, Berkeley, CA, 94720, USA
- Mark J. van der Laan
- Department of Biostatistics, University of California, Berkeley, 101 Haviland Hall, Berkeley, CA, 94720, USA
15
Abstract
One hundred years ago Sir Ronald Ross published his treatise on a general Theory of Happenings. Dependent happenings are those in which the frequency depends on the number already affected. When there is dependency of events, interventions can have different types of effects. Interventions such as vaccination can have direct protective effects for the person receiving the treatment, as well as indirect/spillover effects for others in the population. Causal inference is a framework for carefully defining the causal effect of a treatment, exposure, or policy, and then determining conditions under which such effects can be estimated from the observed data. We consider here scenarios in which the potential outcomes of an individual can depend on the treatment of other individuals in the population, known as causal inference with interference. Much of the research so far has assumed the population is divided into groups or clusters, and individuals can interfere with others within their clusters but not across clusters. Recent developments have assumed more general forms of interference. We review some of the different types of effects that have been defined for dependent happenings, particularly using the methods of causal inference with interference. Many of the methods are applicable across disciplines, such as infectious diseases, social sciences, and economics.
Affiliation(s)
- M Elizabeth Halloran
- Center for Inference and Dynamics of Infectious Diseases, Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center; Department of Biostatistics, School of Public Health, University of Washington
- Michael G Hudgens
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill
16
Ogburn EL, Zeger SL. Statistical Reasoning and Methods in Epidemiology to Promote Individualized Health: In Celebration of the 100th Anniversary of the Johns Hopkins Bloomberg School of Public Health. Am J Epidemiol 2016; 183:427-34. [PMID: 26867776 DOI: 10.1093/aje/kwv453] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/10/2015] [Accepted: 12/23/2015] [Indexed: 11/12/2022] Open
Abstract
Epidemiology is concerned with determining the distribution and causes of disease. Throughout its history, epidemiology has drawn upon statistical ideas and methods to achieve its aims. Because of the exponential growth in our capacity to measure and analyze data on the underlying processes that define each person's state of health, there is an emerging opportunity for population-based epidemiologic studies to influence health decisions made by individuals in ways that take into account the individuals' characteristics, circumstances, and preferences. We refer to this endeavor as "individualized health." The present article comprises 2 sections. In the first, we describe how graphical, longitudinal, and hierarchical models can inform the project of individualized health. We propose a simple graphical model for informing individual health decisions using population-based data. In the second, we review selected topics in causal inference that we believe to be particularly useful for individualized health. Epidemiology and biostatistics were 2 of the 4 founding departments in the world's first graduate school of public health at Johns Hopkins University, the centennial of which we honor. This survey of a small part of the literature is intended to demonstrate that the 2 fields remain just as inextricably linked today as they were 100 years ago.
17
Semiparametric Theory and Empirical Processes in Causal Inference. Statistical Causal Inferences and Their Applications in Public Health Research 2016. [DOI: 10.1007/978-3-319-41259-7_8] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 12/03/2022]
18
Balzer LB, Petersen ML, van der Laan MJ. Adaptive pair-matching in randomized trials with unbiased and efficient effect estimation. Stat Med 2015; 34:999-1011. [PMID: 25421503 PMCID: PMC4318754 DOI: 10.1002/sim.6380] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 01/31/2014] [Revised: 11/02/2014] [Accepted: 11/07/2014] [Indexed: 11/08/2022]
Abstract
In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed pairs of units, as common practice assumes. Instead, the observed data consist of n dependent units. This paper explores the consequences of adaptive pair-matching in randomized trials for estimation of the average treatment effect, conditional on the baseline covariates of the n study units. By avoiding estimation of the covariate distribution, estimators of this conditional effect will often be more precise than estimators of the marginal effect. We contrast the unadjusted estimator with targeted minimum loss based estimation and show substantial efficiency gains from matching and further gains with adjustment. This work is motivated by the Sustainable East Africa Research in Community Health study, an ongoing community randomized trial to evaluate the impact of immediate and streamlined antiretroviral therapy on HIV incidence in rural East Africa.
Affiliation(s)
- Laura B. Balzer
- Division of Biostatistics, University of California, Berkeley, CA 94110-7358, USA
- Maya L. Petersen
- Division of Biostatistics, University of California, Berkeley, CA 94110-7358, USA
- Mark J. van der Laan
- Division of Biostatistics, University of California, Berkeley, CA 94110-7358, USA