1
Bodnar LM, Kirkpatrick SI, Roberts JM, Kennedy EH, Naimi AI. Is the Association Between Fruits and Vegetables and Preeclampsia Due to Higher Dietary Vitamin C and Carotenoid Intakes? Am J Clin Nutr 2023; 118:459-467. [PMID: 37321543 PMCID: PMC10447882 DOI: 10.1016/j.ajcnut.2023.06.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/19/2022] [Revised: 04/03/2023] [Accepted: 06/08/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND Diets dense in fruits and vegetables are associated with a reduced risk of preeclampsia, but pathways underlying this relationship are unclear. Dietary antioxidants may contribute to the protective effect.
OBJECTIVE We determined the extent to which the effect of dietary fruit and vegetable density on preeclampsia is because of high intakes of dietary vitamin C and carotenoids.
METHODS We used data from 7572 participants in the Nulliparous Pregnancy Outcomes Study: monitoring mothers-to-be (8 United States medical centers, 2010‒2013). Usual daily periconceptional intake of total fruits and total vegetables was estimated from a food frequency questionnaire. We estimated the indirect effect of ≥2.5 cups/1000 kcal of fruits and vegetables through vitamin C and carotenoid on the risk of preeclampsia. We estimated these effects using targeted maximum likelihood estimation and an ensemble of machine learning algorithms, adjusting for confounders, including other dietary components, health behaviors, and psychological, neighborhood, and sociodemographic factors.
RESULTS Participants who consumed ≥2.5 cups of fruits and vegetables per 1000 kcal were less likely than those who consumed <2.5 cups/1000 kcal to develop preeclampsia (6.4% compared with 8.6%). After confounder adjustment, we observed that higher fruit and vegetable density was associated with 2 fewer cases of preeclampsia (risk difference: -2.0; 95% CI: -3.9, -0.1)/100 pregnancies compared with lower density diets. High dietary vitamin C and carotenoid intake was not associated with preeclampsia. The protective effect of high fruit and vegetable density on the risk of preeclampsia and late-onset preeclampsia was not mediated through dietary vitamin C and carotenoids.
CONCLUSIONS Evaluating other nutrients and bioactives in fruits and vegetables and their synergy is worthwhile, along with characterizing the effect of individual fruits or vegetables on preeclampsia risk.
Affiliation(s)
- Lisa M Bodnar
- Department of Epidemiology, School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States; Department of Obstetrics, Gynecology, and Reproductive Sciences, Magee-Womens Research Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States.
- Sharon I Kirkpatrick
- School of Public Health Sciences, University of Waterloo, Waterloo, Ontario, Canada
- James M Roberts
- Department of Epidemiology, School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States; Department of Obstetrics, Gynecology, and Reproductive Sciences, Magee-Womens Research Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Edward H Kennedy
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, United States
- Ashley I Naimi
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia, United States
2
Bodnar LM, Cartus AR, Kirkpatrick SI, Himes KP, Kennedy EH, Simhan HN, Grobman WA, Duffy JY, Silver RM, Parry S, Naimi AI. Machine learning as a strategy to account for dietary synergy: an illustration based on dietary intake and adverse pregnancy outcomes. Am J Clin Nutr 2020; 111:1235-1243. [PMID: 32108865 PMCID: PMC7266693 DOI: 10.1093/ajcn/nqaa027] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 10/09/2019] [Accepted: 01/31/2020] [Indexed: 02/07/2023]
Abstract
BACKGROUND Conventional analytic approaches for studying diet patterns assume no dietary synergy, which can lead to bias if incorrectly modeled. Machine learning algorithms can overcome these limitations.
OBJECTIVES We estimated associations between fruit and vegetable intake relative to total energy intake and adverse pregnancy outcomes using targeted maximum likelihood estimation (TMLE) paired with the ensemble machine learning algorithm Super Learner, and compared these with results generated from multivariable logistic regression.
METHODS We used data from 7572 women in the Nulliparous Pregnancy Outcomes Study: monitoring mothers-to-be. Usual daily periconceptional intake of total fruits and total vegetables was estimated from an FFQ. We calculated the marginal risk of preterm birth, small-for-gestational-age (SGA) birth, gestational diabetes, and pre-eclampsia according to density of fruits and vegetables (cups/1000 kcal) ≥80th percentile compared with <80th percentile using multivariable logistic regression and Super Learner with TMLE. Models were adjusted for confounders, including other Healthy Eating Index-2010 components.
RESULTS Using logistic regression, higher fruit and high vegetable densities were associated with 1.1% and 1.4% reductions in pre-eclampsia risk compared with lower densities, respectively. They were not associated with the 3 other outcomes. Using Super Learner with TMLE, high fruit and vegetable densities were associated with fewer cases of preterm birth (-4.0; 95% CI: -4.9, -3.0 and -3.7; 95% CI: -5.0, -2.3), SGA (-1.7; 95% CI: -2.9, -0.51 and -3.8; 95% CI: -5.0, -2.5), and pre-eclampsia (-3.2; 95% CI: -4.2, -2.2 and -4.0; 95% CI: -5.2, -2.7) per 100 births, respectively, and high vegetable densities were associated with a 0.9% increase in risk of gestational diabetes.
CONCLUSIONS The differences in results between Super Learner with TMLE and logistic regression suggest that dietary synergy, which is accounted for in machine learning, may play a role in pregnancy outcomes. This innovative methodology for analyzing dietary data has the potential to advance the study of diet patterns.
Affiliation(s)
- Lisa M Bodnar
- Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Obstetrics, Gynecology, and Reproductive Sciences, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Magee-Womens Research Institute, Pittsburgh, PA, USA
- Abigail R Cartus
- Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Sharon I Kirkpatrick
- School of Public Health and Health Systems, University of Waterloo, Waterloo, Ontario, Canada
- Katherine P Himes
- Department of Obstetrics, Gynecology, and Reproductive Sciences, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Magee-Womens Research Institute, Pittsburgh, PA, USA
- Edward H Kennedy
- Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Hyagriv N Simhan
- Department of Obstetrics, Gynecology, and Reproductive Sciences, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Magee-Womens Research Institute, Pittsburgh, PA, USA
- William A Grobman
- Department of Obstetrics and Gynecology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Jennifer Y Duffy
- Department of Obstetrics & Gynecology, School of Medicine, University of California, Irvine, Irvine, CA, USA
- Robert M Silver
- Department of Obstetrics and Gynecology, University of Utah, Salt Lake City, UT, USA
- Samuel Parry
- Department of Obstetrics and Gynecology, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
- Ashley I Naimi
- Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
3
Naimi AI, Balzer LB. Stacked generalization: an introduction to super learning. Eur J Epidemiol 2018; 33:459-464. [PMID: 29637384 DOI: 10.1007/s10654-018-0390-z] [Citation(s) in RCA: 99] [Impact Index Per Article: 16.5] [Received: 11/22/2017] [Accepted: 03/28/2018] [Indexed: 12/24/2022]
Abstract
Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into a host of methods among which is the "Super Learner". Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of Super Learner by epidemiologists has been hampered by limitations in understanding conceptual and technical details. We work step-by-step through two examples to illustrate concepts and address common concerns.
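The procedure this abstract describes (V-fold cross-validation followed by an optimal weighted combination of candidate predictions) can be illustrated with a minimal sketch. This is not the authors' code or the SuperLearner package: the three toy candidate learners, the simulated data, the mean squared error objective, and the coarse grid search over convex weights (used here in place of a proper constrained optimizer) are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical candidate library: each function fits on (X, y) and returns a predictor.
def fit_mean(X, y):
    mu = y.mean()
    return lambda Xn: np.full(len(Xn), mu)

def fit_linear(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ beta

def fit_quadratic(X, y):
    expand = lambda Z: np.column_stack([np.ones(len(Z)), Z, Z ** 2])
    beta, *_ = np.linalg.lstsq(expand(X), y, rcond=None)
    return lambda Xn: expand(Xn) @ beta

LIBRARY = [fit_mean, fit_linear, fit_quadratic]

def cross_validated_predictions(X, y, library, v=5, seed=0):
    """Return an n x k matrix of out-of-fold predictions, one column per learner."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), v)
    Z = np.empty((len(y), len(library)))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        for j, fit in enumerate(library):
            Z[held_out, j] = fit(X[train], y[train])(X[held_out])
    return Z

def simplex_grid(k, steps):
    """Yield all k-tuples of non-negative integers that sum to `steps`."""
    if k == 1:
        yield (steps,)
        return
    for i in range(steps + 1):
        for rest in simplex_grid(k - 1, steps - i):
            yield (i,) + rest

def super_learner_weights(Z, y, steps=20):
    """Pick the convex weights on the grid that minimize cross-validated MSE."""
    best_w, best_mse = None, np.inf
    for counts in simplex_grid(Z.shape[1], steps):
        w = np.array(counts) / steps
        mse = np.mean((y - Z @ w) ** 2)
        if mse < best_mse:
            best_w, best_mse = w, mse
    return best_w, best_mse

# Simulated data with a quadratic signal (an assumption for the demo).
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(300, 1))
y = 1.0 + 0.5 * X[:, 0] + 1.5 * X[:, 0] ** 2 + rng.normal(0, 0.3, 300)

Z = cross_validated_predictions(X, y, LIBRARY)
weights, cv_mse = super_learner_weights(Z, y)

# Refit every candidate on the full data and combine with the learned weights.
final_fits = [fit(X, y) for fit in LIBRARY]

def predict(Xn):
    return np.column_stack([f(Xn) for f in final_fits]) @ weights

print("weights:", weights, "cross-validated MSE:", round(cv_mse, 4))
```

Because the grid includes the corner points that put all weight on a single learner, the combined cross-validated error in this sketch can be no worse than that of the best individual candidate. A real analysis would likely use a richer library and a dedicated constrained solver.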
Affiliation(s)
- Ashley I Naimi
- Department of Epidemiology, University of Pittsburgh, 130 DeSoto Street 503 Parran Hall, Pittsburgh, PA, 15261, USA.
- Laura B Balzer
- Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, USA
4
Scharfstein D, McDermott A, Díaz I, Carone M, Lunardon N, Turkoz I. Global sensitivity analysis for repeated measures studies with informative drop-out: A semi-parametric approach. Biometrics 2017; 74:207-219. [DOI: 10.1111/biom.12729] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 05/01/2016] [Revised: 04/01/2017] [Accepted: 04/01/2017] [Indexed: 11/29/2022]
Affiliation(s)
- Daniel Scharfstein
- Johns Hopkins Bloomberg School of Public Health; Baltimore, Maryland U.S.A
- Aidan McDermott
- Johns Hopkins Bloomberg School of Public Health; Baltimore, Maryland U.S.A
- Iván Díaz
- Department of Healthcare Policy and Research, Weill Cornell Medicine; New York, New York U.S.A
- Marco Carone
- University of Washington School of Public Health; Seattle, Washington U.S.A
- Ibrahim Turkoz
- Janssen Research and Development, LLC, Titusville; New Jersey U.S.A
5
van der Laan M, Gruber S. One-Step Targeted Minimum Loss-based Estimation Based on Universal Least Favorable One-Dimensional Submodels. Int J Biostat 2016; 12:351-78. [PMID: 27227728 PMCID: PMC4912007 DOI: 10.1515/ijb-2015-0054] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/15/2022]
Abstract
Consider a study in which one observes n independent and identically distributed random variables whose probability distribution is known to be an element of a particular statistical model, and one is concerned with estimation of a particular real valued pathwise differentiable target parameter of this data probability distribution. The targeted maximum likelihood estimator (TMLE) is an asymptotically efficient substitution estimator obtained by constructing a so called least favorable parametric submodel through an initial estimator with score, at zero fluctuation of the initial estimator, that spans the efficient influence curve, and iteratively maximizing the corresponding parametric likelihood till no more updates occur, at which point the updated initial estimator solves the so called efficient influence curve equation. In this article we construct a one-dimensional universal least favorable submodel for which the TMLE only takes one step, and thereby requires minimal extra data fitting to achieve its goal of solving the efficient influence curve equation. We generalize these to universal least favorable submodels through the relevant part of the data distribution as required for targeted minimum loss-based estimation. Finally, remarkably, given a multidimensional target parameter, we develop a universal canonical one-dimensional submodel such that the one-step TMLE, only maximizing the log-likelihood over a univariate parameter, solves the multivariate efficient influence curve equation. This allows us to construct a one-step TMLE based on a one-dimensional parametric submodel through the initial estimator, that solves any multivariate desired set of estimating equations.
Affiliation(s)
- Susan Gruber
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Healthcare Institute