1
Carlin DJ, Rider CV. Combined Exposures and Mixtures Research: An Enduring NIEHS Priority. ENVIRONMENTAL HEALTH PERSPECTIVES 2024; 132:75001. [PMID: 38968090 PMCID: PMC11225971 DOI: 10.1289/ehp14340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 04/25/2024] [Accepted: 06/12/2024] [Indexed: 07/07/2024]
Abstract
BACKGROUND: The National Institute of Environmental Health Sciences (NIEHS) continues to prioritize research to better understand the health effects resulting from exposure to mixtures of chemical and nonchemical stressors. Mixtures research activities over the last decade were informed by expert input during the development and deliberations of the 2011 NIEHS Workshop "Advancing Research on Mixtures: New Perspectives and Approaches for Predicting Adverse Human Health Effects." NIEHS mixtures research efforts since then have focused on key themes including a) prioritizing mixtures for study, b) translating mixtures data from in vitro and in vivo studies, c) developing cross-disciplinary collaborations, d) informing component-based and whole-mixture assessment approaches, e) developing sufficient similarity methods to compare across complex mixtures, f) using systems-based approaches to evaluate mixtures, and g) focusing on management and integration of mixtures-related data.
OBJECTIVES: We aimed to describe NIEHS-driven research on mixtures and combined exposures over the last decade and present areas for future attention.
RESULTS: Intramural and extramural mixtures research projects have incorporated a diverse array of chemicals (e.g., polycyclic aromatic hydrocarbons, botanicals, personal care products, wildfire emissions) and nonchemical stressors (e.g., socioeconomic factors, social adversity) and have focused on many diseases (e.g., breast cancer, atherosclerosis, immune disruption). We have made significant progress in certain areas, such as developing statistical methods for evaluating multiple chemical associations in epidemiology and building translational mixtures projects that include both in vitro and in vivo models.
DISCUSSION: Moving forward, additional work is needed to improve mixtures data integration, elucidate interactions between chemical and nonchemical stressors, and resolve the geospatial and temporal nature of mixture exposures. Continued mixtures research will be critical to informing cumulative impact assessments and addressing complex challenges, such as environmental justice and climate change. https://doi.org/10.1289/EHP14340.
Affiliation(s)
- Danielle J. Carlin
  - Division of Extramural Research and Training, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
- Cynthia V. Rider
  - Division of Translational Toxicology, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
2
Kowal DR. Regression with race-modifiers: towards equity and interpretability. RESEARCH SQUARE 2024:rs.3.rs-4158747. [PMID: 38645193 PMCID: PMC11030512 DOI: 10.21203/rs.3.rs-4158747/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
The pervasive effects of structural racism and racial discrimination are well-established and offer strong evidence that the effects of many important variables on health and life outcomes vary by race. Alarmingly, standard practices for statistical regression analysis introduce racial biases into the estimation and presentation of these race-modified effects. We introduce abundance-based constraints (ABCs) to eliminate these racial biases. ABCs offer a remarkable invariance property: estimates and inference for main effects are nearly unchanged by the inclusion of race-modifiers. Thus, quantitative researchers can estimate race-specific effects "for free", without sacrificing parameter interpretability, equitability, or statistical efficiency. The benefits extend to prominent statistical learning techniques, especially regularization and selection. We leverage these tools to estimate the joint effects of environmental, social, and other factors on 4th end-of-grade reading scores for students in North Carolina (n = 27,638) and identify race-modified effects for racial (residential) isolation, PM2.5 exposure, and mother's age at birth.
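The invariance property described in this abstract can be illustrated with a small simulation. The sketch below is not the author's implementation: it assumes that an abundance-based constraint requires the group-specific slope modifiers to sum to zero when weighted by the sample proportion of each group, so that the main effect is the abundance-weighted average of the group-specific slopes. The function names, group labels, and simulated data are all hypothetical.

```python
import numpy as np

def abc_effects(x, y, group):
    """Group-specific OLS slopes re-expressed under an assumed abundance-based constraint."""
    x, y, group = np.asarray(x, float), np.asarray(y, float), np.asarray(group)
    labels, counts = np.unique(group, return_counts=True)
    pi = counts / counts.sum()  # sample abundance of each group
    # Fitting y ~ x separately in each group matches the full interaction model.
    slopes = np.array([np.polyfit(x[group == r], y[group == r], 1)[0] for r in labels])
    main = float(pi @ slopes)   # main effect: abundance-weighted mean slope
    modifiers = slopes - main   # group modifiers; their pi-weighted sum is zero
    return labels, pi, main, modifiers

def main_only_slope(x, y, group):
    """Slope of x in the model y ~ x + group, with no group-by-x interaction."""
    x, y, group = np.asarray(x, float), np.asarray(y, float), np.asarray(group)
    labels = np.unique(group)
    dummies = (group[:, None] == labels[None, :]).astype(float)
    design = np.column_stack([x, dummies])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0])

# Hypothetical simulated data: three groups with different true slopes.
rng = np.random.default_rng(0)
group = rng.choice(np.array(["A", "B", "C"]), size=3000, p=[0.6, 0.3, 0.1])
x = rng.normal(size=3000)
true_slope = {"A": 1.0, "B": 0.5, "C": 2.0}
y = np.array([true_slope[g] for g in group]) * x + rng.normal(size=3000)

labels, pi, main, modifiers = abc_effects(x, y, group)
print("main effect with modifiers:   ", main)
print("main effect without modifiers:", main_only_slope(x, y, group))
print("group-specific slopes:        ", dict(zip(labels, main + modifiers)))
```

In this setup the two main-effect estimates should be close but not identical, which is consistent with the abstract's wording that main effects are "nearly unchanged"; the agreement depends on the covariate having similar spread in every group.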
Affiliation(s)
- Daniel R. Kowal
  - Department of Statistics, Rice University, Houston, TX 77005
3
Kowal DR. Regression with race-modifiers: towards equity and interpretability. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.01.04.23300033. [PMID: 38464140 PMCID: PMC10925363 DOI: 10.1101/2024.01.04.23300033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
The pervasive effects of structural racism and racial discrimination are well-established and offer strong evidence that the effects of many important variables on health and life outcomes vary by race. Alarmingly, standard practices for statistical regression analysis introduce racial biases into the estimation and presentation of these race-modified effects. We introduce abundance-based constraints (ABCs) to eliminate these racial biases. ABCs offer a remarkable invariance property: estimates and inference for main effects are nearly unchanged by the inclusion of race-modifiers. Thus, quantitative researchers can estimate race-specific effects "for free", without sacrificing parameter interpretability, equitability, or statistical efficiency. The benefits extend to prominent statistical learning techniques, especially regularization and selection. We leverage these tools to estimate the joint effects of environmental, social, and other factors on 4th end-of-grade reading scores for students in North Carolina (n = 27,638) and identify race-modified effects for racial (residential) isolation, PM2.5 exposure, and mother's age at birth.
Affiliation(s)
- Daniel R. Kowal
  - Department of Statistics, Rice University, Houston, TX 77005
4
Midya V, Alcala CS, Rechtman E, Gregory JK, Kannan K, Hertz-Picciotto I, Teitelbaum SL, Gennings C, Rosa MJ, Valvi D. Machine Learning Assisted Discovery of Interactions between Pesticides, Phthalates, Phenols, and Trace Elements in Child Neurodevelopment. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:18139-18150. [PMID: 37595051 PMCID: PMC10666542 DOI: 10.1021/acs.est.3c00848] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/31/2023] [Revised: 08/10/2023] [Accepted: 08/10/2023] [Indexed: 08/20/2023]
Abstract
A growing body of literature suggests that developmental exposure to individual environmental chemicals (ECs) or mixtures of ECs is associated with autism spectrum disorder (ASD). However, investigating the effect of interactions among these ECs can be challenging. We introduced a combination of the classical exposure-mixture Weighted Quantile Sum (WQS) regression and a machine-learning method termed Signed iterative Random Forest (SiRF) to discover synergistic interactions between ECs that (1) are associated with higher odds of ASD diagnosis, (2) mimic toxicological interactions, and (3) are present only in a subset of the sample whose chemical concentrations are higher than certain thresholds. In a case-control Childhood Autism Risks from Genetics and Environment (CHARGE) study, we evaluated multiordered synergistic interactions among 62 ECs measured in the urine samples of 479 children in association with increased odds for ASD diagnosis (yes vs no). WQS-SiRF identified two synergistic two-ordered interactions between (1) trace-element cadmium (Cd) and the organophosphate pesticide metabolite diethyl-phosphate (DEP); and (2) 2,4,6-trichlorophenol (TCP-246) and DEP. Both interactions were suggestively associated with increased odds of ASD diagnosis in the subset of children with urinary concentrations of Cd, DEP, and TCP-246 above the 75th percentile. This study demonstrates a novel method that combines the inferential power of WQS and the predictive accuracy of machine-learning algorithms to discover potentially biologically relevant chemical-chemical interactions associated with ASD.
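The threshold-defined subset mentioned in this abstract (children whose concentrations of two chemicals both exceed the 75th percentile) can be sketched as a simple indicator. This is only an illustration of that one idea and does not implement WQS regression or SiRF; the function names are hypothetical, and the crude odds ratio uses a 0.5 continuity correction, which is an assumption made here to avoid division by zero rather than anything stated in the abstract.

```python
import numpy as np

def joint_high_exposure(a, b, q=0.75):
    """Flag samples where both concentrations exceed their own q-th quantile."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a > np.quantile(a, q)) & (b > np.quantile(b, q))

def odds_ratio(flag, case):
    """Crude odds ratio of case status for flagged vs. unflagged samples.

    Adds 0.5 to every cell of the 2x2 table (continuity correction)
    so that empty cells do not cause a division by zero.
    """
    flag = np.asarray(flag, dtype=bool)
    case = np.asarray(case, dtype=bool)
    n11 = np.sum(flag & case) + 0.5    # flagged cases
    n10 = np.sum(flag & ~case) + 0.5   # flagged controls
    n01 = np.sum(~flag & case) + 0.5   # unflagged cases
    n00 = np.sum(~flag & ~case) + 0.5  # unflagged controls
    return float((n11 * n00) / (n10 * n01))

# Hypothetical toy data: two chemicals and a binary diagnosis.
cd = [1, 2, 3, 4, 5, 6, 7, 8]
dep = [1, 2, 3, 4, 5, 6, 7, 8]
diagnosis = [0, 0, 0, 0, 0, 1, 1, 1]
flag = joint_high_exposure(cd, dep)
print(flag, odds_ratio(flag, diagnosis))
```

A real analysis would adjust for covariates and account for the search over many candidate interactions, neither of which this crude two-by-two summary does.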
Affiliation(s)
- Vishal Midya
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Cecilia Sara Alcala
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Elza Rechtman
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Jill K. Gregory
  - Instructional Technology Group, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Kurunthachalam Kannan
  - Department of Pediatrics and Department of Environmental Medicine, New York University School of Medicine, New York, New York 10016, United States
- Irva Hertz-Picciotto
  - Department of Public Health Sciences, School of Medicine, University of California at Davis, Davis, California 95616, United States
  - UC Davis MIND (Medical Investigations of Neurodevelopmental Disorders) Institute, University of California at Davis, Sacramento, California 95817, United States
- Susan L. Teitelbaum
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Chris Gennings
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Maria J. Rosa
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Damaskini Valvi
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
5
Kowal DR. Subset selection for linear mixed models. Biometrics 2023; 79:1853-1867. [PMID: 35758839 PMCID: PMC9792623 DOI: 10.1111/biom.13707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2021] [Accepted: 06/09/2022] [Indexed: 12/30/2022]
Abstract
Linear mixed models (LMMs) are instrumental for regression analysis with structured dependence, such as grouped, clustered, or multilevel data. However, selection among the covariates, while accounting for this structured dependence, remains a challenge. We introduce a Bayesian decision analysis for subset selection with LMMs. Using a Mahalanobis loss function that incorporates the structured dependence, we derive optimal linear coefficients for (i) any given subset of variables and (ii) all subsets of variables that satisfy a cardinality constraint. Crucially, these estimates inherit shrinkage or regularization and uncertainty quantification from the underlying Bayesian model, and apply for any well-specified Bayesian LMM. More broadly, our decision analysis strategy deemphasizes the role of a single "best" subset, which is often unstable and limited in its information content, and instead favors a collection of near-optimal subsets. This collection is summarized by key member subsets and variable-specific importance metrics. Customized subset search and out-of-sample approximation algorithms are provided for more scalable computing. These tools are applied to simulated data and a longitudinal physical activity dataset, and demonstrate excellent prediction, estimation, and selection ability.
6
Feldman J, Kowal DR. Bayesian data synthesis and the utility-risk trade-off for mixed epidemiological data. Ann Appl Stat 2022. [DOI: 10.1214/22-aoas1604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
7
Joubert BR, Kioumourtzoglou MA, Chamberlain T, Chen HY, Gennings C, Turyk ME, Miranda ML, Webster TF, Ensor KB, Dunson DB, Coull BA. Powering Research through Innovative Methods for Mixtures in Epidemiology (PRIME) Program: Novel and Expanded Statistical Methods. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:1378. [PMID: 35162394 PMCID: PMC8835015 DOI: 10.3390/ijerph19031378] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 12/22/2021] [Revised: 01/18/2022] [Accepted: 01/21/2022] [Indexed: 11/16/2022]
Abstract
Humans are exposed to a diverse mixture of chemical and non-chemical exposures across their lifetimes. Well-designed epidemiology studies as well as sophisticated exposure science and related technologies enable the investigation of the health impacts of mixtures. While existing statistical methods can address the most basic questions related to the association between environmental mixtures and health endpoints, there were gaps in our ability to learn from mixtures data in several common epidemiologic scenarios, including high correlation among health and exposure measures in space and/or time, the presence of missing observations, the violation of important modeling assumptions, and the presence of computational challenges incurred by current implementations. To address these and other challenges, NIEHS initiated the Powering Research through Innovative methods for Mixtures in Epidemiology (PRIME) program, to support work on the development and expansion of statistical methods for mixtures. Six independent projects supported by PRIME have been highly productive but their methods have not yet been described collectively in a way that would inform application. We review 37 new methods from PRIME projects and summarize the work across previously published research questions, to inform methods selection and increase awareness of these new methods. We highlight important statistical advancements considering data science strategies, exposure-response estimation, timing of exposures, epidemiological methods, the incorporation of toxicity/chemical information, spatiotemporal data, risk assessment, and model performance, efficiency, and interpretation. Importantly, we link to software to encourage application and testing on other datasets. This review can enable more informed analyses of environmental mixtures. We stress training for early career scientists as well as innovation in statistical methodology as an ongoing need. Ultimately, we direct efforts to the common goal of reducing harmful exposures to improve public health.
Affiliation(s)
- Bonnie R. Joubert
  - Division of Extramural Research and Training, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, NC 27709, USA
- Marianthi-Anna Kioumourtzoglou
  - Department of Environmental Health Sciences, Columbia University Mailman School of Public Health, New York, NY 10032, USA
- Toccara Chamberlain
  - Division of Extramural Research and Training, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, NC 27709, USA
- Hua Yun Chen
  - Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois Chicago, Chicago, IL 60612, USA
- Chris Gennings
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Mary E. Turyk
  - Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois Chicago, Chicago, IL 60612, USA
- Marie Lynn Miranda
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, South Bend, IN 46556, USA
- Thomas F. Webster
  - Department of Environmental Health, Boston University School of Public Health, Boston, MA 02118, USA
- David B. Dunson
  - Department of Statistical Science, Duke University, Durham, NC 27710, USA
- Brent A. Coull
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
8
Kowal DR. Bayesian subset selection and variable importance for interpretable prediction and classification. JOURNAL OF MACHINE LEARNING RESEARCH : JMLR 2022; 23:108. [PMID: 38105917 PMCID: PMC10723825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Subset selection is a valuable tool for interpretable learning, scientific discovery, and data compression. However, classical subset selection is often avoided due to selection instability, lack of regularization, and difficulties with post-selection inference. We address these challenges from a Bayesian perspective. Given any Bayesian predictive model ℳ, we extract a family of near-optimal subsets of variables for linear prediction or classification. This strategy deemphasizes the role of a single "best" subset and instead advances the broader perspective that often many subsets are highly competitive. The acceptable family of subsets offers a new pathway for model interpretation and is neatly summarized by key members such as the smallest acceptable subset, along with new (co-) variable importance metrics based on whether variables (co-) appear in all, some, or no acceptable subsets. More broadly, we apply Bayesian decision analysis to derive the optimal linear coefficients for any subset of variables. These coefficients inherit both regularization and predictive uncertainty quantification via ℳ. For both simulated and real data, the proposed approach exhibits better prediction, interval estimation, and variable selection than competing Bayesian and frequentist selection methods. These tools are applied to a large education dataset with highly correlated covariates. Our analysis provides unique insights into the combination of environmental, socioeconomic, and demographic factors that predict educational outcomes, and identifies over 200 distinct subsets of variables that offer near-optimal out-of-sample predictive accuracy.
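The summaries named in this abstract (the smallest acceptable subset, and importance based on whether variables appear or co-appear across acceptable subsets) can be sketched once a family of acceptable subsets is in hand. The sketch below assumes that family has already been computed by some other procedure, and it reads "importance" as the proportion of acceptable subsets containing a variable, which is an assumption rather than the paper's stated definition. The function and variable names are hypothetical.

```python
from collections import Counter

def summarize_acceptable(subsets):
    """Return the smallest subset and each variable's appearance proportion."""
    subsets = [frozenset(s) for s in subsets]
    smallest = min(subsets, key=len)  # ties resolve to the first listed
    counts = Counter(v for s in subsets for v in s)
    importance = {v: c / len(subsets) for v, c in counts.items()}
    return smallest, importance

def co_importance(subsets, a, b):
    """Proportion of acceptable subsets in which a and b appear together."""
    subsets = [frozenset(s) for s in subsets]
    return sum(1 for s in subsets if a in s and b in s) / len(subsets)

# Hypothetical family of acceptable subsets over variables x1..x3.
family = [{"x1", "x2"}, {"x1"}, {"x1", "x3"}]
smallest, importance = summarize_acceptable(family)
print(smallest, importance, co_importance(family, "x1", "x2"))
```

Under this reading, a proportion of 1 marks a variable that appears in all acceptable subsets, a value strictly between 0 and 1 marks one that appears in some, and a variable absent from the dictionary appears in none.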
Affiliation(s)
- Daniel R. Kowal
  - Department of Statistics, Rice University, Houston, TX 77005, USA