1
Bachman SP, Brown MJM, Leão TCC, Nic Lughadha E, Walker BE. Extinction risk predictions for the world's flowering plants to support their conservation. New Phytol 2024; 242:797-808. [PMID: 38437880 DOI: 10.1111/nph.19592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Accepted: 01/23/2024] [Indexed: 03/06/2024]
Abstract
More than 70% of all vascular plants lack conservation status assessments. We aimed to address this shortfall in knowledge of species extinction risk by using the World Checklist of Vascular Plants to generate the first comprehensive set of predictions for a large clade: angiosperms (flowering plants, c. 330 000 species). We used Bayesian Additive Regression Trees (BART) to predict the extinction risk of all angiosperms using predictors relating to range size, human footprint, climate, and evolutionary history, and applied a novel approach to estimate the uncertainty of individual species-level predictions. From our model predictions, we estimate that 45.1% of angiosperm species are potentially threatened, with a lower bound of 44.5% and an upper bound of 45.7%. Our species-level predictions, with associated uncertainty estimates, do not replace full global or regional Red List assessments, but can be used to prioritise predicted threatened species for full Red List assessment and to fast-track predicted non-threatened species for Least Concern assessments. Our predictions and uncertainty estimates can also guide fieldwork, inform systematic conservation planning, and support global plant conservation efforts and targets.
2
Abstract
It is crucial in clinical trials to investigate treatment effect consistency across subgroups defined by patient baseline characteristics. However, there may be treatment effect variability across subgroups due to small subgroup sample size. Various Bayesian models have been proposed to incorporate this variability when borrowing information across subgroups. These models rely on the underlying assumption that patients with similar characteristics will have similar outcomes to the same treatment. Patient populations within each subgroup must subjectively be deemed similar enough (Pocock, 1976) to borrow response information across subgroups. We propose utilizing the machine learning method of Bayesian Additive Regression Trees (BART) to provide a method for subgroup borrowing that does not rely on an underlying assumption of homogeneity between subgroups. BART is a data-driven approach that utilizes patient-level observations. The amount of borrowing between subgroups automatically adjusts as BART learns the covariate-response relationships. Modeling patient-level data rather than treating the subgroup as a single unit minimizes assumptions regarding homogeneity across subgroups. We illustrate the use of BART in this context by comparing its performance with that of existing subgroup borrowing methods in a simulation study and a case study in non-small cell lung cancer. The application of BART in the context of subgroup analyses alleviates the need to subjectively choose how much information to borrow based on subgroup similarity. Having the amount of borrowing analytically determined and controlled based on the similarity of individual patient-level characteristics allows for more objective decision making in the drug development process, with many other applications including basket trials.
Affiliation(s)
- Jane Pan
- Department of Biostatistics, University of California, Los Angeles, California, USA
- Veronica Bunn
- Statistical and Quantitative Sciences, Takeda Pharmaceuticals, Cambridge, Massachusetts, USA
- Bradley Hupf
- Statistical and Quantitative Sciences, Takeda Pharmaceuticals, Cambridge, Massachusetts, USA
- Jianchang Lin
- Statistical and Quantitative Sciences, Takeda Pharmaceuticals, Cambridge, Massachusetts, USA
3
Morris RS, Tignanelli CJ, deRoon-Cassini T, Laud P, Sparapani R. Improved Prediction of Older Adult Discharge After Trauma Using a Novel Machine Learning Paradigm. J Surg Res 2021; 270:39-48. [PMID: 34628162 DOI: 10.1016/j.jss.2021.08.021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/26/2021] [Revised: 07/16/2021] [Accepted: 08/27/2021] [Indexed: 12/22/2022]
Abstract
BACKGROUND: The ability to reliably predict outcomes after trauma in older adults (age ≥ 65 y) is critical for clinical decision making. Using novel machine-learning techniques, we sought to design a nonlinear, competing risks paradigm for prediction of older adult discharge disposition following injury. MATERIALS AND METHODS: The National Trauma Databank (NTDB) was used to identify patients aged 65 y or older between 2007 and 2014. Training was performed on an enriched cohort of diverse patients. Factors included age, comorbidities, length of stay, and physiologic parameters to predict in-hospital mortality and discharge disposition (home versus skilled nursing/long-term care facility). Length of stay and discharge status were analyzed via competing risks survival analysis with Bayesian additive regression trees and a multinomial mixed model. RESULTS: The resulting sample size was 47,037 patients. Admission Glasgow Coma Scale (GCS) score and age were important in predicting mortality and discharge disposition. As GCS decreased, patients were more likely to die (risk ratio increased by an average of 1.4 per 2-point drop in GCS, P < 0.001). As GCS decreased, patients were also more likely to be discharged to a skilled nursing or long-term care facility (risk ratio decreased by 0.08 per 2-point decrease in GCS, P < 0.001). The area under the curve for prediction of discharge home was higher in the competing risks model than in the traditional multinomial mixed model (0.73 versus 0.43). CONCLUSIONS: Predicting older adult discharge disposition after trauma is improved using machine learning over traditional regression analysis. We confirmed that a nonlinear, competing risks paradigm enhances prediction on any given hospital day post injury.
Affiliation(s)
- Rachel S Morris
- Department of Surgery, Medical College of Wisconsin, Milwaukee, Wisconsin.
- Christopher J Tignanelli
- Department of Surgery, University of Minnesota, Minneapolis, Minnesota; Department of Surgery, North Memorial Medical Center, Robbinsdale, Minnesota; Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota
- Purushottam Laud
- Institute for Health and Equity, Medical College of Wisconsin, Milwaukee, Wisconsin
- Rodney Sparapani
- Institute for Health and Equity, Medical College of Wisconsin, Milwaukee, Wisconsin
4
Sparapani RA, Rein LE, Tarima SS, Jackson TA, Meurer JR. Non-parametric recurrent events analysis with BART and an application to the hospital admissions of patients with diabetes. Biostatistics 2020; 21:69-85. [PMID: 30059992 DOI: 10.1093/biostatistics/kxy032] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/07/2017] [Accepted: 04/23/2018] [Indexed: 11/12/2022]
Abstract
Much of survival analysis is concerned with absorbing events, i.e., subjects can only experience a single event such as mortality. This article is focused on non-absorbing or recurrent events, i.e., subjects are capable of experiencing multiple events. Recurrent events have been studied by many; however, most rely on the restrictive assumptions of linearity and proportionality. We propose a new method for analyzing recurrent events with Bayesian Additive Regression Trees (BART) avoiding such restrictive assumptions. We explore this new method via a motivating example of hospital admissions for diabetes patients and simulated data sets.
Affiliation(s)
- Rodney A Sparapani
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Lisa E Rein
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Sergey S Tarima
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Tourette A Jackson
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- John R Meurer
- Institute for Health and Equity, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
5
Nethery RC, Mealli F, Sacks JD, Dominici F. Evaluation of the health impacts of the 1990 Clean Air Act Amendments using causal inference and machine learning. J Am Stat Assoc 2020; 1:1-12. [PMID: 33424062 PMCID: PMC7788006 DOI: 10.1080/01621459.2020.1803883] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/21/2019] [Revised: 07/13/2020] [Accepted: 07/20/2020] [Indexed: 10/23/2022]
Abstract
We develop a causal inference approach to estimate the number of adverse health events that were prevented due to changes in exposure to multiple pollutants attributable to a large-scale air quality intervention/regulation, with a focus on the 1990 Clean Air Act Amendments (CAAA). We introduce a causal estimand called the Total Events Avoided (TEA) by the regulation, defined as the difference in the number of health events expected under the no-regulation pollution exposures and the number observed with-regulation. We propose matching and machine learning methods that leverage population-level pollution and health data to estimate the TEA. Our approach improves upon traditional methods for regulation health impact analyses by formalizing causal identifying assumptions, utilizing population-level data, minimizing parametric assumptions, and collectively analyzing multiple pollutants. To reduce model-dependence, our approach estimates cumulative health impacts in the subset of regions with projected no-regulation features lying within the support of the observed with-regulation data, thereby providing a conservative but data-driven assessment to complement traditional parametric approaches. We analyze the health impacts of the CAAA in the US Medicare population in the year 2000, and our estimates suggest that large numbers of cardiovascular and dementia-related hospitalizations were avoided due to CAAA-attributable changes in pollution exposure.
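The TEA definition in the abstract above can be written compactly. The following is a sketch in notation of our own choosing, not necessarily the notation of the paper: index units (for example, geographic regions) by $i = 1, \dots, N$, let $Y_i(\mathbf{t})$ denote the number of health events in unit $i$ under the multi-pollutant exposure vector $\mathbf{t}$, and let $\mathbf{T}_i^{\text{reg}}$ and $\mathbf{T}_i^{\text{no-reg}}$ be the observed with-regulation and counterfactual no-regulation exposures.

```latex
\mathrm{TEA} \;=\; \sum_{i=1}^{N} \Big\{ \mathbb{E}\big[\, Y_i(\mathbf{T}_i^{\text{no-reg}}) \,\big] \;-\; Y_i(\mathbf{T}_i^{\text{reg}}) \Big\}
```

Under this reading, a positive TEA means that more events would have been expected without the regulation than were observed with it. Per the abstract, the sum is restricted in practice to regions whose projected no-regulation features lie within the support of the observed with-regulation data.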
Affiliation(s)
- Rachel C Nethery
- Department of Biostatistics, Harvard T.H. Chan School of Public Health
- Fabrizia Mealli
- Department of Statistics, Computer Science, Applications, University of Florence
- Jason D Sacks
- National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency
6
Zeldow B, Lo Re V, Roy J. A semiparametric modeling approach using Bayesian additive regression trees with an application to evaluate heterogeneous treatment effects. Ann Appl Stat 2019; 13:1989-2010. [PMID: 33072236 DOI: 10.1214/19-aoas1266] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/11/2023]
Abstract
Bayesian Additive Regression Trees (BART) is a flexible machine learning algorithm capable of capturing nonlinearities between an outcome and covariates and interactions among covariates. We extend BART to a semiparametric regression framework in which the conditional expectation of an outcome is a function of treatment, its effect modifiers, and confounders. The confounders are allowed to have unspecified functional form, while treatment and effect modifiers that are directly related to the research question are given a linear form. The result is a Bayesian semiparametric linear regression model where the posterior distribution of the parameters of the linear part can be interpreted as in parametric Bayesian regression. This is useful in situations where a subset of the variables is of substantive interest and the others are nuisance variables that we would like to control for. An example of this occurs in causal modeling with the structural mean model (SMM). Under certain causal assumptions, our method can be used as a Bayesian SMM. Our methods are demonstrated with simulation studies and an application to a dataset involving adults with HIV/Hepatitis C coinfection who newly initiate antiretroviral therapy. The methods are available in an R package called semibart.
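One way to write the model described above, in hypothetical notation that is ours rather than the paper's: let $Y$ be the outcome, $A$ the treatment, $X$ the effect modifiers, and $L$ the confounders.

```latex
\mathbb{E}[\, Y \mid A, X, L \,] \;=\; \omega(L) \;+\; A \,\big( \psi_0 + \psi_1^{\top} X \big)
```

Here $\omega(\cdot)$ stands for the unspecified function of the confounders, which the abstract assigns to BART, and $(\psi_0, \psi_1)$ are the parameters of the linear part, whose posterior is interpreted as in parametric Bayesian regression. The specific product form $A(\psi_0 + \psi_1^{\top} X)$ is an illustrative assumption about one possible linear specification.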
Affiliation(s)
- Bret Zeldow
- Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave, Boston, Massachusetts 02115, USA
- Vincent Lo Re
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Jason Roy
- Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Piscataway, New Jersey 08854, USA
7
Nethery RC, Mealli F, Dominici F. Estimating population average causal effects in the presence of non-overlap: the effect of natural gas compressor station exposure on cancer mortality. Ann Appl Stat 2019; 13:1242-1267. [PMID: 31346355 PMCID: PMC6658123 DOI: 10.1214/18-aoas1231] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/29/2022]
Abstract
Most causal inference studies rely on the assumption of overlap to estimate population or sample average causal effects. When data suffer from non-overlap, estimation of these estimands requires reliance on model specifications, due to poor data support. All existing methods to address non-overlap, such as trimming or down-weighting data in regions of poor data support, change the estimand so that inference cannot be made on the sample or the underlying population. In environmental health research settings, where study results are often intended to influence policy, population-level inference may be critical, and changes in the estimand can diminish the impact of the study results, because estimates may not be representative of effects in the population of interest to policymakers. Researchers may be willing to make additional, minimal modeling assumptions in order to preserve the ability to estimate population average causal effects. We seek to make two contributions on this topic. First, we propose a flexible, data-driven definition of propensity score overlap and non-overlap regions. Second, we develop a novel Bayesian framework to estimate population average causal effects with minor model dependence and appropriately large uncertainties in the presence of non-overlap and causal effect heterogeneity. In this approach, the tasks of estimating causal effects in the overlap and non-overlap regions are delegated to two distinct models, suited to the degree of data support in each region. Tree ensembles are used to non-parametrically estimate individual causal effects in the overlap region, where the data can speak for themselves. In the non-overlap region, where insufficient data support means reliance on model specification is necessary, individual causal effects are estimated by extrapolating trends from the overlap region via a spline model. The promising performance of our method is demonstrated in simulations. 
Finally, we utilize our method to perform a novel investigation of the causal effect of natural gas compressor station exposure on cancer outcomes. Code and data to implement the method and reproduce all simulations and analyses are available on GitHub (https://github.com/rachelnethery/overlap).
Affiliation(s)
- Rachel C Nethery
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Fabrizia Mealli
- Department of Statistics, Informatics, Applications, University of Florence, Florence, Italy
- Francesca Dominici
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
8
Abstract
Medical therapy often consists of multiple stages, with a treatment chosen by the physician at each stage based on the patient's history of treatments and clinical outcomes. These decisions can be formalized as a dynamic treatment regime. This paper describes a new approach for optimizing dynamic treatment regimes that bridges the gap between Bayesian inference and existing approaches, like Q-learning. The proposed approach fits a series of Bayesian regression models, one for each stage, in reverse sequential order. Each model uses as a response variable the remaining payoff assuming optimal actions are taken at subsequent stages, and as covariates the current history and relevant actions at that stage. The key difficulty is that the optimal decision rules at subsequent stages are unknown, and even if these decision rules were known the relevant response variables may be counterfactual. However, posterior distributions can be derived from the previously fitted regression models for the optimal decision rules and the counterfactual response variables under a particular set of rules. The proposed approach averages over these posterior distributions when fitting each regression model. An efficient sampling algorithm for estimation is presented, along with simulation studies that compare the proposed approach with Q-learning.
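The reverse-order fitting described above is easiest to see in the Q-learning comparator that the abstract mentions. The block below is a minimal sketch of two-stage Q-learning with ordinary least squares, not the proposed Bayesian approach: it plugs in a single point estimate of the optimal later-stage payoff instead of averaging over posterior distributions. The function names, the linear working model, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def design(history, action):
    """Working-model design matrix: intercept, history, action, interaction."""
    return np.column_stack(
        [np.ones_like(history), history, action, history * action]
    )


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def q_values(beta, history, actions):
    """Predicted payoff for every candidate action, one column per action."""
    return np.column_stack(
        [design(history, np.full_like(history, a)) @ beta for a in actions]
    )


def q_learning_two_stage(h1, a1, h2, a2, payoff, actions=(0.0, 1.0)):
    """Fit stage 2 first, then stage 1 (reverse sequential order)."""
    # Stage 2: regress the final payoff on stage-2 history and action.
    beta2 = fit_ols(design(h2, a2), payoff)
    # Pseudo-outcome: predicted payoff if the best stage-2 action were taken.
    v2 = q_values(beta2, h2, actions).max(axis=1)
    # Stage 1: regress that pseudo-outcome on stage-1 history and action.
    beta1 = fit_ols(design(h1, a1), v2)
    return beta1, beta2


def best_action(beta, history, actions=(0.0, 1.0)):
    """Estimated optimal action at a stage for each history value."""
    return np.asarray(actions)[q_values(beta, history, actions).argmax(axis=1)]


def simulate(n, seed=0):
    """Hypothetical data: treating at stage 2 helps only when h2 > 0."""
    rng = np.random.default_rng(seed)
    h1 = rng.normal(size=n)
    a1 = rng.integers(0, 2, size=n).astype(float)
    h2 = h1 + 0.5 * a1 + rng.normal(scale=0.5, size=n)
    a2 = rng.integers(0, 2, size=n).astype(float)
    payoff = a2 * h2 + rng.normal(scale=0.1, size=n)
    return h1, a1, h2, a2, payoff
```

For example, `beta1, beta2 = q_learning_two_stage(*simulate(2000))` fits both stages, and `best_action(beta2, np.array([-1.0, 1.0]))` returns the estimated stage-2 decision for two patients with different histories. The step that the abstract's approach changes is the pseudo-outcome `v2`: where Q-learning uses one plug-in value, the proposed method is described as averaging over posterior distributions of the optimal rules and counterfactual responses.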
Affiliation(s)
- Ying Yuan
- Department of Biostatistics, MD Anderson Cancer Center
- Peter F Thall
- Department of Biostatistics, MD Anderson Cancer Center