1
|
Wang W, Zeng J, Li X, Liao F, Zhang T, Yin F, Deng Y, Ma Y. Using a novel strategy to identify the clustered regions of associations between short-term exposure to temperature and mortality and evaluate the inequality of heat- and cold-attributable burdens: A case study in the Sichuan Basin, China. J Environ Manage 2024; 349:119402. [PMID: 37879222 DOI: 10.1016/j.jenvman.2023.119402] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/21/2023] [Revised: 09/24/2023] [Accepted: 10/12/2023] [Indexed: 10/27/2023]
Abstract
BACKGROUND Few studies have focused on the spatially clustered regions in the association between short-term exposure to temperature and mortality, which is important for identifying high-susceptibility populations and enhancing protection against high and low temperatures. Previous studies have explored the association inequality, but no study has evaluated the inequalities of temperature-attributable burdens, which may be more meaningful for reducing temperature-related regional inequality. METHODS Taking the Sichuan Basin (SCB), an economically imbalanced area with high humidity and four distinctive seasons, as an example, we used a novel multi-stage strategy to investigate the two issues. First, distributed lag nonlinear models were independently constructed to obtain the county-level associations between daily temperature and cardiorespiratory mortality. Second, an estimation-error-based spatial scan statistic was used to detect the association-clustered regions. Third, multivariate meta-regression incorporating the identified clustered regions and socioeconomic and natural factors was used to obtain stable county-specific associations, based on which the heat- and cold-attributable deaths were mapped and their inequalities were evaluated using concentration indices and Lorenz curves. RESULTS On average, a U-shaped temperature-mortality association was observed. A significantly association-clustered region was detected (P = 0.017), in which heat and cold temperatures presented significantly stronger associations than those in the non-clustered region, particularly for heat temperatures. The cold-attributable deaths (3.5%) were substantially more than the heat-attributable deaths (0.5%). Both presented severe inequalities over counties. Significant temperature-attributable inequalities were also found over per-capita public budget, urbanization rate, employment rate and per-capita GDP. The directions of inequalities over GDP and urbanization rate were opposite between heat and cold temperatures. CONCLUSIONS Our analysis provided the first evidence about the clustering of temperature-mortality associations and the inequality of cold- and heat-attributable burdens. Significantly association-clustered regions and heavy temperature-attributable inequalities were found in the SCB. Rural people bore heavier cold-attributable but less heat-attributable mortality risk than urban people, suggesting that different policies should be designed to reduce the temperature-attributable inequalities for heat and cold temperatures and different regions. This novel strategy can provide an interesting new perspective on the association between environmental exposure and human health.
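The concentration index mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common covariance form, C = 2·cov(h, R)/mean(h), where h is the attributable burden and R is the fractional rank of each county on a socioeconomic variable. The function name, the tie-free ranking, and all numbers are hypothetical and are not taken from the study.

```python
def concentration_index(burden, ses):
    """Concentration index of `burden` ranked by `ses` (covariance form).

    Assumption: `ses` values are distinct (no tie handling) and the mean
    burden is positive. A negative index means the burden is concentrated
    among units with low `ses`; a positive one, among units with high `ses`.
    """
    if len(burden) != len(ses) or len(burden) < 2:
        raise ValueError("need two equally long sequences with at least 2 units")
    n = len(burden)
    mean_burden = sum(burden) / n
    if mean_burden <= 0:
        raise ValueError("mean burden must be positive")

    # Order units from lowest to highest socioeconomic value.
    order = sorted(range(n), key=lambda i: ses[i])

    total = 0.0
    for position, i in enumerate(order, start=1):
        frac_rank = (position - 0.5) / n          # fractional rank, mean 0.5
        total += (burden[i] - mean_burden) * (frac_rank - 0.5)

    # 2 * cov(burden, rank) / mean(burden), using the population covariance.
    return 2.0 * total / (n * mean_burden)


# Hypothetical counties: urbanization rate and cold-attributable death rate.
urbanization = [0.2, 0.4, 0.6, 0.8]
cold_rate = [4.0, 3.0, 2.0, 1.0]
print(concentration_index(cold_rate, urbanization))  # -0.25
```

In this made-up example the index is negative because the burden falls as urbanization rises, which is the direction the abstract reports for cold-attributable mortality.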
Affiliation(s)
- Wei Wang
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Jing Zeng
  - Sichuan Provincial Center for Disease Prevention and Control, China
- Xuelin Li
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Fang Liao
  - Sichuan Provincial Center for Mental Health, Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital, Chengdu 610072, China
- Tao Zhang
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Fei Yin
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Ying Deng
  - Sichuan Provincial Center for Disease Prevention and Control, China
- Yue Ma
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
|
2
|
Wang W, Zeng J, Li X, Liao F, Li S, Tian X, Yin F, Zhang T, Deng Y, Ma Y. Using a novel strategy to investigate the spatially autocorrelated and clustered associations between short-term exposure to PM2.5 and mortality and the attributable burden: A case study in the Sichuan Basin, China. Ecotoxicol Environ Saf 2023; 264:115405. [PMID: 37657390 DOI: 10.1016/j.ecoenv.2023.115405] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Revised: 08/21/2023] [Accepted: 08/22/2023] [Indexed: 09/03/2023]
Abstract
Due to the lack of statistical methods, few studies have investigated the spatially autocorrelated distribution of the association between short-term exposure to PM2.5 and mortality or used statistical methods to explore the association-clustered regions, which play important roles in identifying high-sensitivity/susceptibility regions. The Sichuan Basin (SCB) is one of the most PM2.5-polluted areas, and its extreme economic imbalance may cause considerable spatial heterogeneity and clustering in the PM2.5-mortality association. In this work, we used a strategy we recently proposed to investigate the spatially autocorrelated and clustered association between daily PM2.5 and cardiorespiratory mortality from 2015 to 2019 in 130 counties of the SCB. First, generalized additive models were independently constructed to obtain the county-level association estimates. Second, an estimation-error-based spatial scan statistic was used to detect the association-clustered regions. Third, multivariate conditional meta autoregression was used to obtain the spatially autocorrelated association distribution, based on which the attributable deaths were mapped and their inequality was evaluated using the Gini coefficient and Lorenz curve. Two significantly association-clustered regions were detected. One is mainly located in the megacity Chengdu, where PM2.5 presented a significantly stronger association with no threshold effect at low-level PM2.5 but a threshold at high-level PM2.5. In the other cluster, a threshold effect at low-level PM2.5 but no threshold at high-level PM2.5 was found. The mortality risk at low/middle-level PM2.5 decreased from Chengdu as the center to the surrounding areas. A total of 29,129 (2.0%) deaths were attributable to the excess PM2.5 exposure. The attributable deaths also decreased from Chengdu as the center to the surrounding areas, with Gini coefficients of 0.43 and 0.3 for absolute and relative attributable deaths, respectively. This novel strategy provided a new epidemiological perspective on the association and implied that Chengdu deserves considerably more attention regarding PM2.5-related health loss.
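As a rough illustration of how a Gini coefficient follows from a Lorenz curve, the sketch below builds the curve from per-county values and takes one minus twice the trapezoidal area under it. The function names and the county counts are hypothetical, and this unweighted form is an assumption; the abstract does not state which variant the authors used.

```python
def lorenz_curve(values):
    """Return Lorenz-curve points (cumulative unit share, cumulative value share)."""
    xs = sorted(values)
    total = sum(xs)
    if not xs or total <= 0 or xs[0] < 0:
        raise ValueError("values must be non-negative with a positive sum")
    n = len(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient as 1 - 2 * (area under the Lorenz curve)."""
    pts = lorenz_curve(values)
    area = 0.0
    for (p0, l0), (p1, l1) in zip(pts, pts[1:]):
        area += (p1 - p0) * (l0 + l1) / 2.0   # trapezoid between adjacent points
    return 1.0 - 2.0 * area


# Hypothetical attributable deaths in four counties (not study data).
print(gini([0, 0, 0, 10]))   # 0.75: all deaths in one county
print(gini([5, 5, 5, 5]))    # 0.0: perfectly equal distribution
```

With n units the maximum of this estimator is (n - 1)/n, so values from small samples should be compared with care.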
Affiliation(s)
- Wei Wang
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Jing Zeng
  - Sichuan Provincial Center for Disease Prevention and Control, China
- Xuelin Li
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Fang Liao
  - Sichuan Provincial Center for Mental Health, Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital, Chengdu 610072, China
- Sheng Li
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Xinyue Tian
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Fei Yin
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Tao Zhang
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Ying Deng
  - Sichuan Provincial Center for Disease Prevention and Control, China
- Yue Ma
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, China
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
|
3
|
Advanced methods and implementations for the meta-analyses of animal models: Current practices and future recommendations. Neurosci Biobehav Rev 2023; 146:105016. [PMID: 36566804 DOI: 10.1016/j.neubiorev.2022.105016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 12/19/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022]
Abstract
Meta-analytic techniques have been widely used to synthesize data from animal models of human diseases and conditions, but these analyses often face two statistical challenges due to the complex nature of animal data (e.g., multiple effect sizes and multiple species): statistical dependency and confounding heterogeneity. These challenges can lead to unreliable and less informative evidence, which hinders the translation of findings from animal to human studies. We present a literature survey of meta-analyses using animal models (animal meta-analyses), showing that these issues are not adequately addressed in current practice. To address these challenges, we propose a meta-analytic framework based on multilevel (linear mixed-effects) models. Through conceptualization, formulations, and worked examples, we illustrate how this framework can appropriately address these issues while allowing for testing new questions. Additionally, we introduce other advanced techniques such as multivariate models, robust variance estimation, and meta-analysis of emergent effect sizes, which can deliver robust inferences and novel biological insights. We also provide a tutorial with annotated R code to demonstrate the implementation of these techniques.
|
4
|
Wang Y, Lin L, Thompson CG, Chu H. A penalization approach to random-effects meta-analysis. Stat Med 2022; 41:500-516. [PMID: 34796539 PMCID: PMC8792303 DOI: 10.1002/sim.9261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2020] [Revised: 09/08/2021] [Accepted: 10/29/2021] [Indexed: 11/06/2022]
Abstract
Systematic reviews and meta-analyses are principal tools to synthesize evidence from multiple independent sources in many research fields. The assessment of heterogeneity among collected studies is a critical step when performing a meta-analysis, given its influence on model selection and conclusions about treatment effects. A common-effect (CE) model is conventionally used when the studies are deemed homogeneous, while a random-effects (RE) model is used for heterogeneous studies. However, both models have limitations. For example, the CE model produces excessively conservative confidence intervals with low coverage probabilities when the collected studies have heterogeneous treatment effects. The RE model, on the other hand, assigns higher weights to small studies compared to the CE model. In the presence of small-study effects or publication bias, the over-weighted small studies from a RE model can lead to substantially biased overall treatment effect estimates. In addition, outlying studies may exaggerate between-study heterogeneity. This article introduces penalization methods as a compromise between the CE and RE models. The proposed methods are motivated by the penalized likelihood approach, which is widely used in the current literature to control model complexity and reduce variances of parameter estimates. We compare the existing and proposed methods with simulated data and several case studies to illustrate the benefits of the penalization methods.
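The weighting contrast between the CE and RE models described above can be seen in a minimal sketch. It uses standard inverse-variance pooling and the DerSimonian-Laird moment estimator of the between-study variance; it is not the penalization method proposed in the article, and the function name and the three example studies are hypothetical.

```python
def pool_ce_re(effects, variances):
    """Common-effect and DerSimonian-Laird random-effects pooled estimates."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    k = len(effects)

    # Common-effect model: weights are inverse within-study variances.
    w_ce = [1.0 / v for v in variances]
    sum_w = sum(w_ce)
    ce = sum(w * y for w, y in zip(w_ce, effects)) / sum_w

    # Cochran's Q and the DerSimonian-Laird estimate of tau^2 (floored at 0).
    q = sum(w * (y - ce) ** 2 for w, y in zip(w_ce, effects))
    c = sum_w - sum(w * w for w in w_ce) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects model: tau^2 is added to every within-study variance,
    # which flattens the weights and favours small studies relative to CE.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    re = sum(w * y for w, y in zip(w_re, effects)) / sum_w_re

    return {
        "ce": ce,
        "re": re,
        "tau2": tau2,
        "ce_share": [w / sum_w for w in w_ce],
        "re_share": [w / sum_w_re for w in w_re],
    }


# Two precise studies and one small, outlying study (made-up numbers).
result = pool_ce_re([0.1, 0.2, 0.9], [0.01, 0.01, 0.25])
print(result["ce_share"][2], result["re_share"][2])
```

In this example the small third study carries a larger share of the total weight under RE than under CE, so the RE estimate is pulled further toward its outlying effect.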
Affiliation(s)
- Yipeng Wang
  - Department of Statistics, Florida State University, FL, USA
  - Department of Biostatistics, University of Florida, FL, USA
- Lifeng Lin
  - Department of Statistics, Florida State University, FL, USA
- Haitao Chu
  - Division of Biostatistics, University of Minnesota School of Public Health, MN, USA
|
5
|
Agarwala N, Park J, Roy A. Efficient integration of aggregate data and individual participant data in one-way mixed models. Stat Med 2022; 41:1555-1572. [PMID: 35040178 DOI: 10.1002/sim.9307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2021] [Revised: 10/26/2021] [Accepted: 12/16/2021] [Indexed: 11/06/2022]
Abstract
Often both aggregate data (AD) studies and individual participant data (IPD) studies are available for specific treatments. Combining these two sources of data could improve the overall meta-analytic estimates of treatment effects. Moreover, often for some studies with AD, the associated IPD may be available, albeit at some extra effort or cost to the analyst. We propose a method for combining treatment effects across trials when the response is from the exponential family of distributions and hence a generalized linear model structure can be used. We consider the case when treatment effects are fixed and common across studies. Using the proposed combination method, we study the relative efficiency of analyzing all IPD studies vs combining various percentages of AD and IPD studies. For many different models, design constraints under which the AD estimators are the IPD estimators, and hence fully efficient, are known. For such models, we advocate a selection procedure that chooses AD studies over IPD studies in a manner that forces the least departure from design constraints and hence ensures an efficient combined AD and IPD estimator.
Affiliation(s)
- Neha Agarwala
  - Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland, USA
- Junyong Park
  - Department of Statistics, Seoul National University, Seoul, South Korea
- Anindya Roy
  - Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland, USA
|
6
|
Li Y, Wang F, Li R, Sun Y. Semiparametric integrative interaction analysis for non-small-cell lung cancer. Stat Methods Med Res 2020; 29:2865-2880. [PMID: 32281490 DOI: 10.1177/0962280220909969] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022]
Abstract
In genomic analysis, it is significant though challenging to identify markers associated with cancer outcomes or phenotypes. Based on the biological mechanisms of cancers and the characteristics of datasets, we propose a novel integrative interaction approach under a semiparametric model, in which genetic and environmental factors are included as the parametric and nonparametric components, respectively. The goal of this approach is to identify the genetic factors and gene-gene interactions associated with cancer outcomes, while estimating the nonlinear effects of environmental factors. The proposed approach is based on the threshold gradient-directed regularisation technique. Simulation studies indicate that the proposed approach outperforms alternative methods at identifying the main effects and interactions, and has favourable estimation and prediction accuracy. We analysed non-small-cell lung carcinoma datasets from the Cancer Genome Atlas, and the results demonstrate that the proposed approach can identify markers with important implications and that it performs favourably in terms of prediction accuracy, identification stability, and computation cost.
Affiliation(s)
- Yang Li
  - Center for Applied Statistics, Renmin University of China, Beijing, China
  - School of Statistics, Renmin University of China, Beijing, China
  - Statistical Consulting Center, Renmin University of China, Beijing, China
- Fan Wang
  - School of Statistics, Renmin University of China, Beijing, China
  - Statistical Consulting Center, Renmin University of China, Beijing, China
- Rong Li
  - School of Statistics, Renmin University of China, Beijing, China
  - Statistical Consulting Center, Renmin University of China, Beijing, China
- Yifan Sun
  - Center for Applied Statistics, Renmin University of China, Beijing, China
  - School of Statistics, Renmin University of China, Beijing, China
|
7
|
Kundu P, Tang R, Chatterjee N. Generalized meta-analysis for multiple regression models across studies with disparate covariate information. Biometrika 2019; 106:567-585. [PMID: 31427822 PMCID: PMC6690173 DOI: 10.1093/biomet/asz030] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 10/22/2017] [Indexed: 01/23/2023] Open
Abstract
Meta-analysis is widely popular for synthesizing information on common parameters of interest across multiple studies because of its logistical convenience and statistical efficiency. We develop a generalized meta-analysis approach to combining information on multivariate regression parameters across multiple studies that have varying levels of covariate information. Using algebraic relationships among regression parameters in different dimensions, we specify a set of moment equations for estimating parameters of a maximal model through information available from sets of parameter estimates for a series of reduced models from the different studies. The specification of the equations requires a reference dataset for estimating the joint distribution of the covariates. We propose to solve these equations using the generalized method of moments approach, with the optimal weighting of the equations taking into account uncertainty associated with estimates of the parameters of the reduced models. We describe extensions of the iterated reweighted least-squares algorithm for fitting generalized linear regression models using the proposed framework. Based on the same moment equations, we also develop a diagnostic test for detecting violations of underlying model assumptions, such as those arising from heterogeneity in the underlying study populations. The proposed methods are illustrated with extensive simulation studies and a real-data example involving the development of a breast cancer risk prediction model using disparate risk factor information from multiple studies.
Affiliation(s)
- Prosenjit Kundu
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, 615 N. Wolfe Street, Baltimore, Maryland, U.S.A
- Runlong Tang
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, 615 N. Wolfe Street, Baltimore, Maryland, U.S.A
- Nilanjan Chatterjee
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, 615 N. Wolfe Street, Baltimore, Maryland, U.S.A
|
8
|
Ibrahim JG, Kim S, Chen MH, Shah AK, Lin J. Bayesian multivariate skew meta-regression models for individual patient data. Stat Methods Med Res 2018; 28:3415-3436. [PMID: 30309294 DOI: 10.1177/0962280218801147] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
We examine a class of multivariate meta-regression models in the presence of individual patient data. The methodology is well motivated from several studies of cholesterol-lowering drugs where the goal is to jointly analyze the multivariate outcomes, low density lipoprotein cholesterol, high density lipoprotein cholesterol, and triglycerides. These three continuous outcome measures are correlated and shed much light on a subject's lipid status. One of the main goals in lipid research is the joint analysis of these three outcome measures in a meta-regression setting. Since these outcome measures are not typically multivariate normal, one must consider classes of distributions that allow for skewness in one or more of the outcomes. In this paper, we consider a new general class of multivariate skew distributions for multivariate meta-regression and examine their theoretical properties. Using these distributions, we construct a Bayesian model for the meta-data and develop an efficient Markov chain Monte Carlo computational scheme for carrying out the computations. In addition, we develop a multivariate L measure for model comparison, Bayesian residuals for model assessment, and a Bayesian procedure for detecting outlying trials. The proposed multivariate L measure, Bayesian residuals, and Bayesian outlying trial detection procedure are particularly suitable and computationally attractive in the multivariate meta-regression setting. A detailed case study demonstrating the usefulness of the proposed methodology is carried out in an individual patient data multivariate meta-regression setting using 26 pivotal Merck clinical trials that compare statins (cholesterol-lowering drugs) in combination with ezetimibe and statins alone on treatment-naïve patients and those continuing on statins at baseline.
Affiliation(s)
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, USA
- Sungduk Kim
  - Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, USA
- Ming-Hui Chen
  - Department of Statistics, University of Connecticut, USA
- Arvind K Shah
  - Clinical Biostatistics, Merck Research Laboratories, USA
- Jianxin Lin
  - Clinical Biostatistics, Merck Research Laboratories, USA
|
9
|
Crippa A, Discacciati A, Bottai M, Spiegelman D, Orsini N. One-stage dose-response meta-analysis for aggregated data. Stat Methods Med Res 2018; 28:1579-1596. [PMID: 29742975 DOI: 10.1177/0962280218773122] [Citation(s) in RCA: 200] [Impact Index Per Article: 33.3] [Indexed: 12/31/2022]
Abstract
The standard two-stage approach for estimating non-linear dose-response curves based on aggregated data typically excludes those studies with fewer than three exposure groups. We develop the one-stage method as a linear mixed model and present the main aspects of the methodology, including model specification, estimation, testing, prediction, goodness-of-fit, model comparison, and quantification of between-studies heterogeneity. Using both fictitious and real data from a published meta-analysis, we illustrated the main features of the proposed methodology and compared it to a traditional two-stage analysis. In a one-stage approach, the pooled curve and estimates of the between-studies heterogeneity are based on the whole set of studies without any exclusion. Thus, even complex curves (splines, spike at zero exposure) defined by several parameters can be estimated. We showed how the one-stage method may facilitate several applications, in particular quantification of heterogeneity over the exposure range, prediction of marginal and conditional curves, and comparison of alternative models. The one-stage method for meta-analysis of non-linear curves is implemented in the dosresmeta R package. It is particularly suited for dose-response meta-analyses of aggregated data where the complexity of the research question is better addressed by including all the studies.
Affiliation(s)
- Alessio Crippa
  - Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden
- Andrea Discacciati
  - Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Matteo Bottai
  - Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Nicola Orsini
  - Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden
|
10
|
Boca SM, Pfeiffer RM, Sampson JN. Multivariate meta-analysis with an increasing number of parameters. Biom J 2017; 59:496-510. [PMID: 28195655 DOI: 10.1002/bimj.201600013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/21/2016] [Revised: 10/28/2016] [Accepted: 11/10/2016] [Indexed: 11/11/2022]
Abstract
Meta-analysis can average estimates of multiple parameters, such as a treatment's effect on multiple outcomes, across studies. Univariate meta-analysis (UVMA) considers each parameter individually, while multivariate meta-analysis (MVMA) considers the parameters jointly and accounts for the correlation between their estimates. The performance of MVMA and UVMA has been extensively compared in scenarios with two parameters. Our objective is to compare the performance of MVMA and UVMA as the number of parameters, p, increases. Specifically, we show that (i) for fixed-effect (FE) meta-analysis, the benefit from using MVMA can substantially increase as p increases; (ii) for random effects (RE) meta-analysis, the benefit from MVMA can increase as p increases, but the potential improvement is modest in the presence of high between-study variability and the actual improvement is further reduced by the need to estimate an increasingly large between study covariance matrix; and (iii) when there is little to no between-study variability, the loss of efficiency due to choosing RE MVMA over FE MVMA increases as p increases. We demonstrate these three features through theory, simulation, and a meta-analysis of risk factors for non-Hodgkin lymphoma.
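The fixed-effect comparison between MVMA and UVMA can be sketched for the two-parameter case. The code assumes the usual generalized-least-squares form of FE MVMA, in which each study contributes the inverse of its within-study covariance matrix, and compares the pooled variance of the first parameter with that of a univariate inverse-variance analysis. The helper names and the two example studies are hypothetical and are not drawn from the article.

```python
def inv2(m):
    """Inverse of a 2x2 matrix; a positive determinant is required as a rough
    positive-definiteness check for covariance matrices."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det <= 0:
        raise ValueError("matrix must be positive definite")
    return [[d / det, -b / det], [-c / det, a / det]]


def fe_mvma(estimates, covariances):
    """Fixed-effect multivariate pooling for two parameters."""
    info = [[0.0, 0.0], [0.0, 0.0]]   # sum of inverse covariance matrices
    score = [0.0, 0.0]                # sum of inverse covariance times estimate
    for y, s in zip(estimates, covariances):
        p = inv2(s)
        for r in range(2):
            for c in range(2):
                info[r][c] += p[r][c]
            score[r] += p[r][0] * y[0] + p[r][1] * y[1]
    cov = inv2(info)
    pooled = [cov[r][0] * score[0] + cov[r][1] * score[1] for r in range(2)]
    return pooled, cov


def fe_uvma(estimates, covariances, j):
    """Fixed-effect univariate pooling of parameter j, ignoring correlations."""
    weights = [1.0 / s[j][j] for s in covariances]
    pooled = sum(w * y[j] for w, y in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)


# Two made-up studies whose within-study correlations differ in sign.
est = [[0.3, 0.5], [0.1, 0.4]]
covs = [[[0.04, 0.03], [0.03, 0.04]], [[0.09, -0.02], [-0.02, 0.01]]]
_, mv_cov = fe_mvma(est, covs)
_, uv_var = fe_uvma(est, covs, 0)
print(mv_cov[0][0], uv_var)   # the MVMA variance is the smaller of the two
```

When every within-study correlation is zero the two analyses coincide, so any gain comes entirely from the correlation structure.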
Affiliation(s)
- Simina M Boca
  - Innovation Center for Biomedical Informatics, Georgetown University Medical Center, 2115 Wisconsin Avenue, Suite 110, Washington, DC 20007, USA
  - Department of Oncology, Georgetown University Medical Center, 3970 Reservoir Road NW, Research Building, Suite E501, Washington, DC 20057, USA
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, 4000 Reservoir Road NW, Washington, DC 20057, USA
- Ruth M Pfeiffer
  - Division of Cancer Epidemiology and Genetics, Biostatistics Branch, National Cancer Institute, 9609 Medical Center Drive, MSC 9776, Bethesda, MD 20892, USA
- Joshua N Sampson
  - Division of Cancer Epidemiology and Genetics, Biostatistics Branch, National Cancer Institute, 9609 Medical Center Drive, MSC 9776, Bethesda, MD 20892, USA
|
11
|
Huang Y, Liu J, Yi H, Shia BC, Ma S. Promoting similarity of model sparsity structures in integrative analysis of cancer genetic data. Stat Med 2016; 36:509-559. [PMID: 27667129 DOI: 10.1002/sim.7138] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 12/07/2014] [Revised: 07/24/2016] [Accepted: 09/02/2016] [Indexed: 01/05/2023]
Abstract
In profiling studies, the analysis of a single dataset often leads to unsatisfactory results because of the small sample size. Multi-dataset analysis utilizes information from multiple independent datasets and outperforms single-dataset analysis. Among the available multi-dataset analysis methods, integrative analysis methods aggregate and analyze raw data and outperform meta-analysis methods, which analyze multiple datasets separately and then pool summary statistics. In this study, we conduct integrative analysis and marker selection under the heterogeneity structure, which allows different datasets to have overlapping but not necessarily identical sets of markers. Under certain scenarios, it is reasonable to expect some similarity of identified marker sets - or equivalently, similarity of model sparsity structures - across multiple datasets. However, the existing methods do not have a mechanism to explicitly promote such similarity. To tackle this problem, we develop a sparse boosting method. This method uses a BIC/HDBIC criterion to select weak learners in boosting and encourages sparsity. A new penalty is introduced to promote the similarity of model sparsity structures across datasets. The proposed method has an intuitive formulation and is broadly applicable and computationally affordable. In numerical studies, we analyze right-censored survival data under the accelerated failure time model. Simulation shows that the proposed method outperforms alternative boosting and penalization methods with more accurate marker identification. The analysis of three breast cancer prognosis datasets shows that the proposed method can identify marker sets with increased similarity across datasets and improved prediction performance. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Yuan Huang
  - VA Cooperative Studies Program Coordinating Center, West Haven, CT
  - Department of Biostatistics, Yale University, New Haven, CT, U.S.A
- Jin Liu
  - Center of Quantitative Medicine, Duke-NUS Medical School, Singapore
- Huangdi Yi
  - Department of Biostatistics, Yale University, New Haven, CT, U.S.A
- Ben-Chang Shia
  - School of Health Care Administration, Big Data Research Center & School of Management, Taipei Medical University, Taipei, Taiwan
- Shuangge Ma
  - VA Cooperative Studies Program Coordinating Center, West Haven, CT
  - Department of Biostatistics, Yale University, New Haven, CT, U.S.A
|
12
|
Crippa A, Orsini N. Dose-response meta-analysis of differences in means. BMC Med Res Methodol 2016; 16:91. [PMID: 27485429 PMCID: PMC4971698 DOI: 10.1186/s12874-016-0189-0] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Received: 10/27/2015] [Accepted: 07/13/2016] [Indexed: 11/10/2022] Open
Abstract
Background Meta-analytical methods are frequently used to combine dose-response findings expressed in terms of relative risks. However, no methodology has been established when results are summarized in terms of differences in means of quantitative outcomes. Methods We proposed a two-stage approach. A flexible dose-response model is estimated within each study (first stage) taking into account the covariance of the data points (mean differences, standardized mean differences). Parameters describing the study-specific curves are then combined using a multivariate random-effects model (second stage) to address heterogeneity across studies. Results The method is fairly general and can accommodate a variety of parametric functions. Compared to traditional non-linear models (e.g. Emax, logistic), spline models do not assume any pre-specified dose-response curve. Spline models allow inclusion of studies with a small number of dose levels, and almost any shape, even non-monotonic ones, can be estimated using only two parameters. We illustrated the method using dose-response data arising from five clinical trials on an antipsychotic drug, aripiprazole, and improvement in symptoms in schizoaffective patients. Using the Positive and Negative Syndrome Scale (PANSS), pooled results indicated a non-linear association with the maximum change in mean PANSS score equal to 10.40 (95% confidence interval 7.48, 13.30) observed for 19.32 mg/day of aripiprazole. No substantial change in PANSS score was observed above this value. An estimated dose of 10.43 mg/day was found to produce 80% of the maximum predicted response. Conclusion The described approach should be adopted to combine correlated differences in means of quantitative outcomes arising from multiple studies. Sensitivity analysis can be a useful tool to assess the robustness of the overall dose-response curve to different modelling strategies. A user-friendly R package has been developed to facilitate applications by practitioners.
Affiliation(s)
- Alessio Crippa
- Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden.
- Nicola Orsini
- Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden.
|
13
|
Discacciati A, Crippa A, Orsini N. Goodness of fit tools for dose-response meta-analysis of binary outcomes. Res Synth Methods 2015; 8:149-160. [PMID: 26679736 PMCID: PMC5484373 DOI: 10.1002/jrsm.1194] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 06/11/2015] [Revised: 08/28/2015] [Accepted: 10/31/2015] [Indexed: 02/05/2023]
Abstract
Goodness of fit evaluation should be a natural step in assessing and reporting dose-response meta-analyses from aggregated data of binary outcomes. However, little attention has been given to this topic in the epidemiological literature, and goodness of fit is rarely, if ever, assessed in practice. We briefly review the two-stage and one-stage methods used to carry out dose-response meta-analyses. We then illustrate and discuss three tools specifically aimed at testing, quantifying, and graphically evaluating the goodness of fit of dose-response meta-analyses. These tools are the deviance, the coefficient of determination, and the decorrelated residuals-versus-exposure plot. Data from two published meta-analyses are used to show how these three tools can improve the practice of quantitative synthesis of aggregated dose-response data. In fact, evaluating the degree of agreement between model predictions and empirical data can help the identification of dose-response patterns, the investigation of sources of heterogeneity, and the assessment of whether the pooled dose-response relation adequately summarizes the published results. © 2015 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.
Affiliation(s)
- Andrea Discacciati
- Unit of Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Unit of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.
- Alessio Crippa
- Unit of Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Unit of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.
- Nicola Orsini
- Unit of Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden; Unit of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.
|
14
|
Wang M, Spiegelman D, Kuchiba A, Lochhead P, Kim S, Chan AT, Poole EM, Tamimi R, Tworoger SS, Giovannucci E, Rosner B, Ogino S. Statistical methods for studying disease subtype heterogeneity. Stat Med 2015; 35:782-800. [PMID: 26619806 DOI: 10.1002/sim.6793] [Citation(s) in RCA: 207] [Impact Index Per Article: 23.0] [Received: 04/28/2014] [Revised: 09/08/2015] [Accepted: 10/13/2015] [Indexed: 12/31/2022]
Abstract
A fundamental goal of epidemiologic research is to investigate the relationship between exposures and disease risk. Cases of the disease are often considered a single outcome and assumed to share a common etiology. However, evidence indicates that many human diseases arise and evolve through a range of heterogeneous molecular pathologic processes, influenced by diverse exposures. Pathogenic heterogeneity has been considered in various neoplasms such as colorectal, lung, prostate, and breast cancers, leukemia and lymphoma, and non-neoplastic diseases, including obesity, type II diabetes, glaucoma, stroke, cardiovascular disease, autism, and autoimmune disease. In this article, we discuss analytic options for studying disease subtype heterogeneity, emphasizing methods for evaluating whether the association of a potential risk factor with disease varies by disease subtype. Methods are described for scenarios where disease subtypes are categorical and ordinal and for cohort studies, matched and unmatched case-control studies, and case-case study designs. For illustration, we apply the methods to a molecular pathological epidemiology study of alcohol intake and colon cancer risk by tumor LINE-1 methylation subtypes. User-friendly software to implement the methods is publicly available.
Affiliation(s)
- Molin Wang
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
- Donna Spiegelman
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.
- Aya Kuchiba
- Department of Biostatistics, National Cancer Center, Tokyo, Japan.
- Paul Lochhead
- Division of Gastroenterology, Massachusetts General Hospital, Boston, MA, U.S.A.
- Sehee Kim
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, U.S.A.
- Andrew T Chan
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.; Division of Gastroenterology, Massachusetts General Hospital, Boston, MA, U.S.A.
- Elizabeth M Poole
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
- Rulla Tamimi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
- Shelley S Tworoger
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
- Edward Giovannucci
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
- Bernard Rosner
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
- Shuji Ogino
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.; Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, U.S.A.
|
15
|
Nordio F, Zanobetti A, Colicino E, Kloog I, Schwartz J. Changing patterns of the temperature-mortality association by time and location in the US, and implications for climate change. ENVIRONMENT INTERNATIONAL 2015; 81:80-6. [PMID: 25965185 PMCID: PMC4780576 DOI: 10.1016/j.envint.2015.04.009] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 07/07/2014] [Revised: 04/13/2015] [Accepted: 04/14/2015] [Indexed: 05/20/2023]
Abstract
The shape of the non-linear relationship between temperature and mortality varies among cities with different climatic conditions. There has been little examination of how these curves change over space and time. We evaluated the short-term effects of hot and cold temperatures on daily mortality over six 7-year periods in 211 US cities, comprising over 42 million deaths. Cluster analysis was used to group the cities according to similar temperatures and relative humidity. Temperature-mortality functions were calculated using B-splines to model the heat effect (lag 0) and the cold effect on mortality (moving average lags 1-5). The functions were then combined through meta-smoothing and subsequently analyzed by meta-regression. We identified eight clusters. At lag 0, Cluster 5 (West Coast) had an RR of 1.14 (95% CI: 1.11, 1.17) for temperatures of 27 °C vs 15.6 °C, and Cluster 6 (Gulf Coast) had an RR of 1.04 (95% CI: 1.03, 1.05), suggesting that people are acclimated to their respective climates. Controlling for cluster effect in the multivariate meta-regression, we found that across the US, the excess mortality from a 24-h temperature of 27 °C decreased over time from 10.6% to 0.9%. We found that the overall risk due to the heat effect is significantly affected by mean summer temperature and air conditioning usage, which could be potential predictors in building climate-change scenarios.
Affiliation(s)
- Francesco Nordio
- TIMI Study Group, Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Environmental Health, Exposure, Epidemiology, and Risk Program, Harvard School of Public Health, Boston, MA, USA.
- Antonella Zanobetti
- Department of Environmental Health, Exposure, Epidemiology, and Risk Program, Harvard School of Public Health, Boston, MA, USA.
- Elena Colicino
- Department of Environmental Health, Exposure, Epidemiology, and Risk Program, Harvard School of Public Health, Boston, MA, USA.
- Itai Kloog
- Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer Sheva, Israel.
- Joel Schwartz
- Department of Environmental Health, Exposure, Epidemiology, and Risk Program, Harvard School of Public Health, Boston, MA, USA.
|
16
|
Wang M, Kuchiba A, Ogino S. A Meta-Regression Method for Studying Etiological Heterogeneity Across Disease Subtypes Classified by Multiple Biomarkers. Am J Epidemiol 2015; 182:263-70. [PMID: 26116215 DOI: 10.1093/aje/kwv040] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 07/22/2014] [Accepted: 02/04/2015] [Indexed: 12/22/2022] Open
Abstract
In interdisciplinary biomedical, epidemiologic, and population research, it is increasingly necessary to consider pathogenesis and inherent heterogeneity of any given health condition and outcome. As the unique disease principle implies, no single biomarker can perfectly define disease subtypes. The complex nature of molecular pathology and biology necessitates biostatistical methodologies to simultaneously analyze multiple biomarkers and subtypes. To analyze and test for heterogeneity hypotheses across subtypes defined by multiple categorical and/or ordinal markers, we developed a meta-regression method that can utilize existing statistical software for mixed-model analysis. This method can be used to assess whether the exposure-subtype associations are different across subtypes defined by 1 marker while controlling for other markers and to evaluate whether the difference in exposure-subtype association across subtypes defined by 1 marker depends on any other markers. To illustrate this method in molecular pathological epidemiology research, we examined the associations between smoking status and colorectal cancer subtypes defined by 3 correlated tumor molecular characteristics (CpG island methylator phenotype, microsatellite instability, and the B-Raf protooncogene, serine/threonine kinase (BRAF), mutation) in the Nurses' Health Study (1980-2010) and the Health Professionals Follow-up Study (1986-2010). This method can be widely useful as molecular diagnostics and genomic technologies become routine in clinical medicine and public health.
|
17
|
Abstract
In a meta-analysis with multiple end points of interest that are correlated between or within studies, a multivariate approach to meta-analysis has the potential to produce more precise estimates of effects by exploiting the correlation structure between end points. However, under a random-effects assumption the multivariate estimation is more complex (as it involves estimation of more parameters simultaneously) than univariate estimation, and sometimes can produce unrealistic parameter estimates. The usefulness of the multivariate approach to meta-analysis of the effects of a genetic variant on two or more correlated traits is not well understood in the area of genetic association studies. In such studies, genetic variants are expected to roughly maintain Hardy-Weinberg equilibrium within studies, and also their effects on complex traits are generally very small to modest and could be heterogeneous across studies for genuine reasons. We carried out extensive simulations to explore the comparative performance of the multivariate approach with the most commonly used univariate inverse-variance weighted approach under a random-effects assumption in various realistic meta-analytic scenarios of genetic association studies of correlated end points. We evaluated the performance with respect to relative mean bias percentage, and root mean square error (RMSE) of the estimate and coverage probability of the corresponding 95% confidence interval of the effect for each end point. Our simulation results suggest that the multivariate approach performs similarly to or better than the univariate method when correlations between end points within or between studies are at least moderate and between-study variation is similar to or larger than average within-study variation for meta-analyses of 10 or more genetic studies.
The multivariate approach produces estimates with smaller bias and RMSE especially for the end point that has randomly or informatively missing summary data in some individual studies, when the missing data in the end point are imputed with null effects and quite large variance.
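For orientation, the univariate comparator named in this abstract (inverse-variance weighting under a random-effects assumption) can be sketched as follows. This is a minimal illustration and not the authors' simulation code: the function name `random_effects_pool` is hypothetical, and the DerSimonian-Laird moment estimator for the between-study variance is an assumption, since the abstract does not say which estimator was used.

```python
import math


def random_effects_pool(estimates, variances):
    """Pool per-study effect estimates for a single end point.

    Uses inverse-variance weights and a DerSimonian-Laird estimate of the
    between-study variance (tau2). The choice of estimator is an assumption
    made for illustration only.
    Returns (pooled_estimate, standard_error, tau2).
    """
    k = len(estimates)
    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q statistic measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Moment estimate of tau2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

A call such as `random_effects_pool([0.0, 1.0], [0.01, 0.01])` pools two equally precise but discordant studies; the multivariate approach discussed in the abstract would instead model several such end points jointly, using their correlations.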
|
18
|
Soura AB, Mberu B, Elungata P, Lankoande B, Millogo R, Beguy D, Compaore Y. Understanding inequities in child vaccination rates among the urban poor: evidence from Nairobi and Ouagadougou health and demographic surveillance systems. J Urban Health 2015; 92:39-54. [PMID: 25316191 PMCID: PMC4338131 DOI: 10.1007/s11524-014-9908-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022]
Abstract
Studies on informal settlements in sub-Saharan Africa have questioned the health benefits of urban residence, but this should not suggest that informal settlements (within cities and across cities and/or countries) are homogeneous. They vary in terms of poverty, pollution, overcrowding, criminality, and social exclusion. Moreover, while some informal settlements completely lack public services, others have access to health facilities, sewers, running water, and electricity. There are few comparative studies that have looked at informal settlements across countries accounting for these contextual nuances. In this paper, we comparatively examine the differences in child vaccination rates between Nairobi and Ouagadougou's informal settlements. We further investigate whether the identified differences are related to the differences in demographic and socioeconomic composition between the two settings. We use data from the Ouagadougou and Nairobi Urban Health and Demographic Surveillance Systems (HDSSs), which are the only two urban-based HDSSs in Africa. The results show that children in the slums of Nairobi are less vaccinated than children in the informal settlements in Ouagadougou. The difference in child vaccination rates between Nairobi and Ouagadougou informal settlements is not related to the differences in their demographic and socioeconomic composition but to the inequalities in access to immunization services.
|
19
|
Affiliation(s)
- Elena Kulinskaya
- School of Computing Sciences; University of East Anglia; Norwich NR4 7TJ UK
- Stephan Morgenthaler
- Ecole polytechnique fédérale de Lausanne (EPFL); Station 8, 1015 Lausanne Switzerland
- Robert G. Staudte
- Department of Statistics and Mathematics; La Trobe University; Melbourne, VIC 3086 Australia
|
20
|
Gasparrini A, Armstrong B, Kenward MG. Multivariate meta-analysis for non-linear and other multi-parameter associations. Stat Med 2012; 31:3821-39. [PMID: 22807043 PMCID: PMC3546395 DOI: 10.1002/sim.5471] [Citation(s) in RCA: 465] [Impact Index Per Article: 38.8] [Received: 08/09/2011] [Accepted: 05/11/2012] [Indexed: 12/13/2022]
Abstract
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd.
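As a rough sketch of the second-stage pooling idea described in this abstract, the snippet below combines two-parameter coefficient vectors (for example, two spline coefficients per city) by generalized least squares. It is a fixed-effects simplification written in plain Python and does not use or reproduce the mvmeta package; a random-effects version would add a between-study covariance matrix to each within-study covariance before inverting. The names `inv2` and `pool_bivariate_fixed` are hypothetical.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular covariance matrix")
    return [[d / det, -b / det], [-c / det, a / det]]


def pool_bivariate_fixed(coefs, covs):
    """Fixed-effects pooling of two-parameter estimates.

    coefs: list of [b1, b2] coefficient vectors, one per study.
    covs:  list of 2x2 within-study covariance matrices.
    Returns (pooled_coefficients, pooled_covariance), where
    pooled = (sum S_i^-1)^-1 * sum(S_i^-1 * b_i).
    """
    w_sum = [[0.0, 0.0], [0.0, 0.0]]
    wb_sum = [0.0, 0.0]
    for b, s in zip(coefs, covs):
        w = inv2(s)  # precision matrix acts as the study weight
        for r in range(2):
            for c in range(2):
                w_sum[r][c] += w[r][c]
            wb_sum[r] += w[r][0] * b[0] + w[r][1] * b[1]
    v = inv2(w_sum)  # covariance of the pooled coefficients
    pooled = [v[r][0] * wb_sum[0] + v[r][1] * wb_sum[1] for r in range(2)]
    return pooled, v
```

With identity covariances the result reduces to a simple average of the coefficient vectors; unequal or correlated covariances shift weight toward the more precisely estimated studies and parameters.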
Affiliation(s)
- A Gasparrini
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK.
|
21
|
Jackson D, Riley R, White IR. Multivariate meta-analysis: potential and promise. Stat Med 2011; 30:2481-98. [PMID: 21268052 PMCID: PMC3470931 DOI: 10.1002/sim.4172] [Citation(s) in RCA: 262] [Impact Index Per Article: 20.2] [Received: 08/27/2010] [Accepted: 11/01/2010] [Indexed: 01/14/2023]
Abstract
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one-day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various viewpoints and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.
|