1
Silber JH, Rosenbaum PR, Reiter JG, Jain S, Hill AS, Hashemi S, Brown S, Olfson M, Ing C. Exposure to Operative Anesthesia in Childhood and Subsequent Neurobehavioral Diagnoses: A Natural Experiment Using Appendectomy. Anesthesiology 2024; 141:489-499. [PMID: 38753986 PMCID: PMC11361557 DOI: 10.1097/aln.0000000000005075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024]
Abstract
BACKGROUND Observational studies of anesthetic neurotoxicity may be biased because children requiring anesthesia commonly have medical conditions associated with neurobehavioral problems. This study takes advantage of a natural experiment associated with appendicitis to determine whether anesthesia and surgery in childhood were specifically associated with subsequent neurobehavioral outcomes. METHODS This study identified 134,388 healthy children with appendectomy and examined the incidence of subsequent externalizing or behavioral disorders (conduct, impulse control, oppositional defiant, attention-deficit hyperactivity disorder) or internalizing or mood or anxiety disorders (depression, anxiety, or bipolar disorder) when compared to 671,940 matched healthy controls as identified in Medicaid data between 2001 and 2018. For comparison, this study also examined 154,887 otherwise healthy children admitted to the hospital for pneumonia, cellulitis, and gastroenteritis, of which only 8% received anesthesia, and compared them to 774,435 matched healthy controls. In addition, this study examined the difference-in-differences between matched appendectomy patients and their controls and matched medical admission patients and their controls. RESULTS Compared to controls, children with appendectomy were more likely to have subsequent behavioral disorders (hazard ratio, 1.04; 95% CI, 1.01 to 1.06; P = 0.0010) and mood or anxiety disorders (hazard ratio, 1.15; 95% CI, 1.13 to 1.17; P < 0.0001). Relative to controls, children with medical admissions were also more likely to have subsequent behavioral (hazard ratio, 1.20; 95% CI, 1.18 to 1.22; P < 0.0001) and mood or anxiety (hazard ratio, 1.25; 95% CI, 1.23 to 1.27; P < 0.0001) disorders. 
Comparing the difference between matched appendectomy patients and their matched controls to the difference between matched medical patients and their matched controls, medical patients had more subsequent neurobehavioral problems than appendectomy patients. CONCLUSIONS Although there is an association between neurobehavioral diagnoses and appendectomy, this association is not specific to anesthesia exposure and is stronger in medical admissions. Medical admissions, generally without anesthesia exposure, displayed significantly higher rates of these disorders than appendectomy-exposed patients.
2
Brumberg K, Small DS, Rosenbaum PR. Optimal refinement of strata to balance covariates. Biometrics 2024; 80:ujae061. [PMID: 38994639 DOI: 10.1093/biomtc/ujae061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 05/21/2024] [Accepted: 06/18/2024] [Indexed: 07/13/2024]
Abstract
What is the best way to split one stratum into two to maximally reduce the within-stratum imbalance in many covariates? We formulate this as an integer program and approximate the solution by randomized rounding of a linear program. A linear program may assign a fraction of a person to each refined stratum. Randomized rounding views fractional people as probabilities, assigning intact people to strata using biased coins. Randomized rounding is a well-studied theoretical technique for approximating the optimal solution of certain insoluble integer programs. When the number of people in a stratum is large relative to the number of covariates, we prove the following new results: (i) randomized rounding to split a stratum does very little randomizing, so it closely resembles the linear programming relaxation without splitting intact people; (ii) the linear relaxation and the randomly rounded solution place lower and upper bounds on the unattainable integer programming solution; and because of (i), these bounds are often close, thereby ratifying the usable randomly rounded solution. We illustrate using an observational study that balanced many covariates by forming matched pairs composed of 2016 patients selected from 5735 using a propensity score. Instead, we form 5 propensity score strata and refine them into 10 strata, obtaining excellent covariate balance while retaining all patients. An R package optrefine at CRAN implements the method. Supplementary materials are available online.
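The rounding step described in this abstract can be illustrated with a minimal sketch. It is not the optrefine implementation: the function name and the toy linear-programming solution below are hypothetical, and the sketch only shows how fractional people are treated as probabilities for biased coin flips, while people the relaxation already placed whole in one stratum are left alone.

```python
import random

def randomized_round(fractions, rng):
    """Assign each person to refined stratum 1 (True) or stratum 2 (False).

    fractions[i] is the share of person i that a linear-programming
    relaxation placed in stratum 1 (hypothetical input in [0, 1]).
    """
    assignment = []
    for f in fractions:
        if f <= 0.0:
            assignment.append(False)           # intact person, stratum 2
        elif f >= 1.0:
            assignment.append(True)            # intact person, stratum 1
        else:
            assignment.append(rng.random() < f)  # biased coin with P(stratum 1) = f
    return assignment

# Hypothetical relaxation output: only one person is split fractionally.
lp_solution = [1.0, 0.0, 1.0, 0.3, 0.0, 1.0]
rng = random.Random(0)
split = randomized_round(lp_solution, rng)

# Number of people whose assignment actually required a coin flip.
n_random = sum(0.0 < f < 1.0 for f in lp_solution)
print(split, n_random)
```

In this toy input only one of six people is randomized, which mirrors result (i) of the abstract: when few people are fractional, the rounded solution stays close to the relaxation.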
3
Jain S, Rosenbaum PR, Reiter JG, Ramadan OI, Hill AS, Hashemi S, Brown RT, Kelz RR, Fleisher LA, Silber JH. Mortality Among Older Medical Patients at Flagship Hospitals and Their Affiliates. J Gen Intern Med 2024; 39:902-911. [PMID: 38087179 DOI: 10.1007/s11606-023-08415-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 09/05/2023] [Indexed: 02/23/2024]
Abstract
BACKGROUND We define a "flagship hospital" as the largest academic hospital within a hospital referral region and a "flagship system" as a system that contains a flagship hospital and its affiliates. It is not known if patients admitted to an affiliate hospital, and not to its main flagship hospital, have better outcomes than those admitted to a hospital outside the flagship system but within the same hospital referral region. OBJECTIVE To compare mortality at flagship hospitals and their affiliates to matched control patients not in the flagship system but within the same hospital referral region. DESIGN A matched cohort study. PARTICIPANTS The study used hospitalizations for common medical conditions between 2018 and 2019 among older patients aged ≥ 66 years. We analyzed 118,321 matched pairs of Medicare patients admitted with pneumonia (N=57,775), heart failure (N=42,531), or acute myocardial infarction (N=18,015) in 35 flagship hospitals, 124 affiliates, and 793 control hospitals. MAIN MEASURES 30-day (primary) and 90-day (secondary) all-cause mortality. KEY RESULTS 30-day mortality was lower among patients in flagship systems versus control hospitals that are not part of the flagship system but within the same hospital referral region (difference= -0.62%, 95% CI [-0.88%, -0.37%], P<0.001). This difference was smaller in affiliates versus controls (-0.43%, [-0.75%, -0.11%], P=0.008) than in flagship hospitals versus controls (-1.02%, [-1.46%, -0.58%], P<0.001; difference-in-difference -0.59%, [-1.13%, -0.05%], P=0.033). Similar results were found for 90-day mortality. LIMITATIONS The study used claims-based data. CONCLUSIONS In aggregate, within a hospital referral region, patients treated at the flagship hospital, at affiliates of the flagship hospital, and in the flagship system as a whole, all had lower mortality rates than matched controls outside the flagship system. However, the mortality advantage was larger for flagship hospitals than for their affiliates.
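The difference-in-difference point estimate reported in this abstract follows by simple arithmetic from the two matched differences. The short sketch below reproduces only that point estimate; the variable names are illustrative, and the confidence interval and P value cannot be recovered from these summary numbers alone.

```python
# Matched 30-day mortality differences from the abstract, in percentage points.
flagship_vs_control = -1.02   # flagship hospitals minus matched controls
affiliate_vs_control = -0.43  # affiliates minus matched controls

# Difference-in-difference: how much larger the flagship advantage is
# than the affiliate advantage.
did = flagship_vs_control - affiliate_vs_control
did_rounded = round(did, 2)

print(f"difference-in-difference = {did_rounded:+.2f} percentage points")
```

The result, -0.59 percentage points, matches the value given in the abstract.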
4
Ramadan OI, Rosenbaum PR, Reiter JG, Jain S, Hill AS, Hashemi S, Kelz RR, Fleisher LA, Silber JH. Impact of Hospital Affiliation With a Flagship Hospital System on Surgical Outcomes. Ann Surg 2024; 279:631-639. [PMID: 38456279 PMCID: PMC10926994 DOI: 10.1097/sla.0000000000006132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
OBJECTIVE To compare general surgery outcomes at flagship systems, flagship hospitals, and flagship hospital affiliates versus matched controls. SUMMARY BACKGROUND DATA It is unknown whether flagship hospitals perform better than flagship hospital affiliates for surgical patients. METHODS Using Medicare claims for 2018 to 2019, we matched patients undergoing inpatient general surgery in flagship system hospitals to controls who underwent the same procedure at hospitals outside the system but within the same region. We defined a "flagship hospital" within each region as the major teaching hospital with the highest patient volume that is also part of a hospital system; its system was labeled a "flagship system." We performed 4 main comparisons: patients treated at any flagship system hospital versus hospitals outside the flagship system; flagship hospitals versus hospitals outside the flagship system; flagship hospital affiliates versus hospitals outside the flagship system; and flagship hospitals versus affiliate hospitals. Our primary outcome was 30-day mortality. RESULTS We formed 32,228 closely matched pairs across 35 regions. Patients at flagship system hospitals (32,228 pairs) had lower 30-day mortality than matched control patients [3.79% vs. 4.36%, difference=-0.57% (-0.86%, -0.28%), P<0.001]. Similarly, patients at flagship hospitals (15,571/32,228 pairs) had lower mortality than control patients. However, patients at flagship hospital affiliates (16,657/32,228 pairs) had similar mortality to matched controls. Flagship hospitals had lower mortality than affiliate hospitals [difference-in-differences=-1.05% (-1.62%, -0.47%), P<0.001]. CONCLUSIONS Patients treated at flagship hospitals had significantly lower mortality rates than those treated at flagship hospital affiliates. Hence, flagship system affiliation does not alone imply better surgical outcomes.
5
Jain S, Rosenbaum PR, Reiter JG, Ramadan OI, Hill AS, Silber JH, Fleisher LA. Assessing the Ambulatory Surgery Center Volume-Outcome Association. JAMA Surg 2024; 159:397-403. [PMID: 38265816 PMCID: PMC10809135 DOI: 10.1001/jamasurg.2023.7161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 10/01/2023] [Indexed: 01/25/2024]
Abstract
Importance In surgical patients, it is well known that higher hospital procedure volume is associated with better outcomes. To our knowledge, this volume-outcome association has not been studied in ambulatory surgery centers (ASCs) in the US. Objective To determine if low-volume ASCs have a higher rate of revisits after surgery, particularly among patients with multimorbidity. Design, Setting, and Participants This matched case-control study used Medicare claims data and analyzed surgeries performed during 2018 and 2019 at ASCs. The study examined 2328 ASCs performing common ambulatory procedures and analyzed 4751 patients with a revisit within 7 days of surgery (defined to be either 1 of 4735 revisits or 1 of 16 deaths without a revisit). These cases were each closely matched to 5 control patients without revisits (23 755 controls). Data were analyzed from January 1, 2018, through December 31, 2019. Main Outcomes and Measures Seven-day revisit in patients (cases) compared with the matched patients without the outcome (controls) in ASCs with low volume (less than 50 procedures over 2 years) vs higher volume (50 or more procedures). Results Patients at a low-volume ASC had a higher odds of a 7-day revisit vs patients who had their surgery at a higher-volume ASC (odds ratio [OR], 1.21; 95% CI, 1.09-1.36; P = .001). The odds of revisit for patients with multimorbidity were higher at low-volume ASCs when compared with higher-volume ASCs (OR, 1.57; 95% CI, 1.27-1.94; P < .001). Among patients with multimorbidity in low-volume ASCs, for those who underwent orthopedic procedures, the odds of revisit were 84% higher (OR, 1.84; 95% CI, 1.36-2.50; P < .001) vs higher-volume centers, and for those who underwent general surgery or other procedures, the odds of revisit were 36% higher (OR, 1.36; 95% CI, 1.01-1.83; P = .05) vs a higher-volume center. The findings were not statistically significant for patients without multimorbidity. 
Conclusions and Relevance In this observational study, the surgical volume of an ASC was an important indicator of patient outcomes. Older patients with multimorbidity should discuss with their surgeon the optimal location of their care.
6
Rosenbaum PR. A second evidence factor for a second control group. Biometrics 2023; 79:3968-3980. [PMID: 37563803 DOI: 10.1111/biom.13921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Accepted: 07/24/2023] [Indexed: 08/12/2023]
Abstract
In an observational study of the effects caused by a treatment, a second control group is used in an effort to detect bias from unmeasured covariates, and the investigator is content if no evidence of bias is found. This strategy is not entirely satisfactory: two control groups may differ significantly, yet the difference may be too small to invalidate inferences about the treatment, or the control groups may not differ yet nonetheless fail to provide a tangible strengthening of the evidence of a treatment effect. Is a firmer conclusion possible? Is there a way to analyze a second control group such that the data might report measurably strengthened evidence of cause and effect, that is, insensitivity to larger unmeasured biases? Evidence factor analyses are not commonly used with a second control group: most analyses compare the treated group to each control group, but analyses of that kind are partially redundant; so, they do not constitute evidence factors. An alternative analysis is proposed here, one that does yield two evidence factors, and with a carefully designed test statistic, is capable of extracting strong evidence from the second factor. The new technical work here concerns the development of a test statistic with high design sensitivity and high Bahadur efficiency in a sensitivity analysis for the second factor. A study of binge drinking as a cause of high blood pressure is used as an illustration.
7
Lasater KB, Rosenbaum PR, Aiken LH, Brooks-Carthon JM, Kelz RR, Reiter JG, Silber JH, McHugh MD. Explaining racial disparities in surgical survival: a tapered match analysis of patient and hospital factors. BMJ Open 2023; 13:e066813. [PMID: 37169502 PMCID: PMC10186454 DOI: 10.1136/bmjopen-2022-066813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 04/26/2023] [Indexed: 05/13/2023] Open
Abstract
OBJECTIVES Evaluate whether hospital factors, including nurse resources, explain racial differences in Medicare black and white patient surgical outcomes and whether disparities changed over time. DESIGN Retrospective tapered-match. SETTING 571 hospitals at two time points (Early Era 2003-2005; Recent Era 2013-2015). PARTICIPANTS 6752 black patients and three sets of 6752 white controls selected from 107 001 potential controls (Early Era). 4964 black patients and three sets of 4964 white controls selected from 74 108 potential controls (Recent Era). INTERVENTIONS Black patients were matched to white controls on demographics (age, sex, state and year of procedure), procedure (demographics variables plus 136 International Classification of Diseases (ICD)-9 principal procedure codes) and presentation (demographics and procedure variables plus 34 comorbidities, a mortality risk score, a propensity score for being black, emergency admission, transfer status, predicted procedure time). OUTCOMES 30-day and 1-year mortality. RESULTS Before matching, black patients had more comorbidities, higher risk of mortality despite being younger and underwent procedures at different percentages than white patients. Whites in the demographics match had lower mortality at 30 days (5.6% vs 6.7% Early Era; 5.4% vs 5.7% Recent Era) and 1-year (15.5% vs 21.5% Early Era; 12.3% vs 15.9% Recent Era). Black-white 1-year mortality differences were equivalent after matching patients with respect to presentation, procedure and demographic factors. Black-white 30-day mortality differences were equivalent after matching on procedure and demographic factors. Racial disparities in outcomes remained unchanged between the two time periods spanning 10 years. All patients in hospitals with better nurse resources had lower odds of 30-day (OR 0.60, 95% CI 0.46 to 0.78, p<0.010) and 1-year mortality (OR 0.77, 95% CI 0.65 to 0.92, p<0.010) even after accounting for other hospital factors. 
CONCLUSIONS Survival disparities among black and white patients are largely explained by differences in demographic, procedure and presentation factors. Better nurse resources (eg, staffing, work environment) were associated with lower mortality for all patients.
8
Jain S, Rosenbaum PR, Reiter JG, Ramadan OI, Hill AS, Hashemi S, Brown RT, Kelz RR, Fleisher LA, Silber JH. Defining Multimorbidity in Older Patients Hospitalized with Medical Conditions. J Gen Intern Med 2023; 38:1449-1458. [PMID: 36385407 PMCID: PMC10160274 DOI: 10.1007/s11606-022-07897-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/03/2022] [Accepted: 10/26/2022] [Indexed: 11/17/2022]
Abstract
BACKGROUND The term "multimorbidity" identifies high-risk, complex patients and is conventionally defined as ≥2 comorbidities. However, this labels almost all older patients as multimorbid, making this definition less useful for physicians, hospitals, and policymakers. OBJECTIVE Develop new medical condition-specific multimorbidity definitions for patients admitted with acute myocardial infarction (AMI), heart failure (HF), and pneumonia. We developed three medical condition-specific multimorbidity definitions as the presence of single, double, or triple combinations of comorbidities - called Qualifying Comorbidity Sets (QCSs) - associated with at least doubling the risk of 30-day mortality for AMI and pneumonia, or one-and-a-half times for HF patients, compared to typical patients with these conditions. DESIGN Cohort-based matching study. PARTICIPANTS One hundred percent Medicare Fee-for-Service beneficiaries with inpatient admissions between 2016 and 2019 for AMI, HF, and pneumonia. MAIN MEASURES Thirty-day all-location mortality. KEY RESULTS We defined multimorbidity as the presence of ≥1 QCS. The new definitions labeled fewer patients as multimorbid with a much higher risk of death compared to the conventional definition (≥2 comorbidities). The proportions of patients labeled as multimorbid using the new definition versus the conventional definition were: for AMI 47% versus 87% (p value<0.0001), HF 53% versus 98% (p value<0.0001), and pneumonia 57% versus 91% (p value<0.0001). Thirty-day mortality was higher among patients with ≥1 QCS compared to ≥2 comorbidities: for AMI 15.0% versus 9.5% (p<0.0001), HF 9.9% versus 7.0% (p <0.0001), and pneumonia 18.4% versus 13.2% (p <0.0001). CONCLUSION The presence of ≥2 comorbidities identified almost all patients as multimorbid. 
In contrast, our new QCS-based definitions selected more specific combinations of comorbidities associated with substantial excess risk in older patients admitted for AMI, HF, and pneumonia. Thus, our new definitions offer a better approach to identifying multimorbid patients, allowing physicians, hospitals, and policymakers to more effectively use such information to consider focused interventions for these vulnerable patients.
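The contrast between the two definitions can be sketched as a set-membership check. The qualifying sets below are invented purely for illustration and are not the QCSs derived in the paper; the function names are likewise hypothetical. The sketch assumes only what the abstract states: a patient is multimorbid under the new definition when at least one single, double, or triple qualifying combination is fully present.

```python
def is_multimorbid_conventional(comorbidities):
    """Conventional definition: two or more comorbidities."""
    return len(set(comorbidities)) >= 2

def is_multimorbid_qcs(comorbidities, qualifying_sets):
    """QCS-based definition: at least one qualifying set is fully present."""
    present = set(comorbidities)
    return any(qcs <= present for qcs in qualifying_sets)

# Hypothetical qualifying comorbidity sets (NOT taken from the paper).
example_qcs = [
    frozenset({"dementia"}),                                # single
    frozenset({"diabetes", "renal_failure"}),               # double
    frozenset({"copd", "heart_failure", "malnutrition"}),   # triple
]

# Two comorbidities, but no qualifying combination.
patient_a = ["hypertension", "diabetes"]
# Contains the hypothetical double combination.
patient_b = ["diabetes", "renal_failure", "hypertension"]

print(is_multimorbid_conventional(patient_a), is_multimorbid_qcs(patient_a, example_qcs))
print(is_multimorbid_conventional(patient_b), is_multimorbid_qcs(patient_b, example_qcs))
```

Patient A is multimorbid only under the conventional rule, which is the pattern the abstract reports: the QCS-based definition labels fewer patients, selecting for specific high-risk combinations.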
9
Ramadan OI, Rosenbaum PR, Reiter JG, Jain S, Hill AS, Hashemi S, Kelz RR, Fleisher LA, Silber JH. Redefining Multimorbidity in Older Surgical Patients. J Am Coll Surg 2023; 236:1011-1022. [PMID: 36919934 PMCID: PMC11411458 DOI: 10.1097/xcs.0000000000000659] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/16/2023]
Abstract
BACKGROUND Multimorbidity in surgery is common and associated with worse postoperative outcomes. However, conventional multimorbidity definitions (≥2 comorbidities) label the vast majority of older patients as multimorbid, limiting clinical usefulness. We sought to develop and validate better surgical specialty-specific multimorbidity definitions based on distinct comorbidity combinations. STUDY DESIGN We used Medicare claims for patients aged 66 to 90 years undergoing inpatient general, orthopaedic, or vascular surgery. Using 2016 to 2017 data, we identified all comorbidity combinations associated with at least 2-fold (general/orthopaedic) or 1.5-fold (vascular) greater risk of 30-day mortality compared with the overall population undergoing the same procedure; we called these combinations qualifying comorbidity sets. We applied them to 2018 to 2019 data (general = 230,410 patients, orthopaedic = 778,131 patients, vascular = 146,570 patients) to obtain 30-day mortality estimates. For further validation, we tested whether multimorbidity status was associated with differential outcomes for patients at better-resourced (based on nursing skill-mix, surgical volume, teaching status) hospitals vs all other hospitals using multivariate matching. RESULTS Compared with conventional multimorbidity definitions, the new definitions labeled far fewer patients as multimorbid: general = 85.0% (conventional) vs 55.9% (new) (p < 0.0001); orthopaedic = 66.6% vs 40.2% (p < 0.0001); and vascular = 96.2% vs 52.7% (p < 0.0001). Thirty-day mortality was higher by the new definitions: general = 3.96% (conventional) vs 5.64% (new) (p < 0.0001); orthopaedic = 0.13% vs 1.68% (p < 0.0001); and vascular = 4.43% vs 7.00% (p < 0.0001). 
Better-resourced hospitals offered significantly larger mortality benefits than all other hospitals for multimorbid vs nonmultimorbid general and orthopaedic, but not vascular, patients (general surgery difference-in-difference = -0.94% [-1.36%, -0.52%], p < 0.0001; orthopaedic = -0.20% [-0.34%, -0.05%], p = 0.0087; and vascular = -0.12% [-0.69%, 0.45%], p = 0.6795). CONCLUSIONS Our new multimorbidity definitions identified far more specific, higher-risk pools of patients than conventional definitions, potentially aiding clinical decision-making.
10
Silber JH, Rosenbaum PR, Reiter JG, Jain S, Ramadan OI, Hill AS, Hashemi S, Kelz RR, Fleisher LA. The Safety of Performing Surgery at Ambulatory Surgery Centers Versus Hospital Outpatient Departments in Older Patients With or Without Multimorbidity. Med Care 2023; 61:328-337. [PMID: 36929758 PMCID: PMC10079624 DOI: 10.1097/mlr.0000000000001836] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/18/2023]
Abstract
BACKGROUND Surgery for older Americans is increasingly being performed at ambulatory surgery centers (ASCs) rather than hospital outpatient departments (HOPDs), while rates of multimorbidity have increased. OBJECTIVE To determine whether there are differential outcomes in older patients undergoing surgical procedures at ASCs versus HOPDs. RESEARCH DESIGN Matched cohort study. SUBJECTS Of Medicare patients, 30,958 were treated in 2018 and 2019 at an ASC undergoing herniorrhaphy, cholecystectomy, or open breast procedures, matched to similar HOPD patients, and another 32,702 matched pairs undergoing higher-risk procedures. MEASURES Seven and 30-day revisit and complication rates. RESULTS For the same procedures, HOPD patients displayed a higher baseline predicted risk of 30-day revisits than ASC patients (13.09% vs 8.47%, P < 0.0001), suggesting the presence of considerable selection on the part of surgeons. In matched Medicare patients with or without multimorbidity, we observed worse outcomes in HOPD patients: 30-day revisit rates were 8.1% in HOPD patients versus 6.2% in ASC patients ( P < 0.0001), and complication rates were 41.3% versus 28.8%, P < 0.0001. Similar patterns were also found for 7-day outcomes and in higher-risk procedures examined in a secondary analysis. Similar patterns were also observed when analyzing patients with and without multimorbidity separately. CONCLUSIONS The rates of revisits and complications for ASC patients were far lower than for closely matched HOPD patients. The observed initial baseline risk in HOPD patients was much higher than the baseline risk for the same procedures performed at the ASC, suggesting that surgeons are appropriately selecting their riskier patients to be treated at the HOPD rather than the ASC.
11
Rosenbaum PR. Sensitivity analyses informed by tests for bias in observational studies. Biometrics 2023; 79:475-487. [PMID: 34505285 DOI: 10.1111/biom.13558] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/10/2021] [Revised: 08/01/2021] [Accepted: 08/19/2021] [Indexed: 11/29/2022]
Abstract
In an observational study, the treatment received and the outcome exhibited may be associated in the absence of an effect caused by the treatment, even after controlling for observed covariates. Two tactics are common: (i) a test for unmeasured bias may be obtained using a secondary outcome for which the effect is known and (ii) a sensitivity analysis may explore the magnitude of unmeasured bias that would need to be present to explain the observed association as something other than an effect caused by the treatment. Can such a test for unmeasured bias inform the sensitivity analysis? If the test for bias does not discover evidence of unmeasured bias, then ask: Are conclusions therefore insensitive to larger unmeasured biases? Conversely, if the test for bias does find evidence of bias, then ask: What does that imply about sensitivity to biases? This problem is formulated in a new way as a convex quadratically constrained quadratic program and solved on a large scale using interior point methods by a modern solver. That is, a convex quadratic function of N variables is minimized subject to constraints on linear and convex quadratic functions of these variables. The quadratic function that is minimized is a statistic for the primary outcome that is a function of the unknown treatment assignment probabilities. The quadratic function that constrains this minimization is a statistic for subsidiary outcome that is also a function of these same unknown treatment assignment probabilities. In effect, the first statistic is minimized over a confidence set for the unknown treatment assignment probabilities supplied by the unaffected outcome. This process avoids the mistake of interpreting the failure to reject a hypothesis as support for the truth of that hypothesis. The method is illustrated by a study of the effects of light daily alcohol consumption on high-density lipoprotein (HDL) cholesterol levels. 
In this study, the method quickly optimizes a nonlinear function of N = 800 variables subject to linear and quadratic constraints. In the example, strong evidence of unmeasured bias is found using the subsidiary outcome, but, perhaps surprisingly, this finding makes the primary comparison insensitive to larger biases.
12
Brumberg K, Ellis DE, Small DS, Hennessy S, Rosenbaum PR. Using natural strata when examining unmeasured biases in an observational study of neurological side effects of antibiotics. J R Stat Soc Ser C Appl Stat 2023. [DOI: 10.1093/jrsssc/qlad010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023]
Abstract
Fluoroquinolones are widely prescribed antibiotics that carry a US Food and Drug Administration warning about possible side-effects on the central and peripheral nervous system. We compare 436,891 patients with sinusitis treated with fluoroquinolones to two control groups treated with azithromycin or amoxicillin. In addition to looking for nervous system complications, we look for evidence of bias using outcomes for which an effect was not anticipated. The comparison uses ‘natural strata’ that form control groups proportional in size to the treated group and balance many covariates beyond those that define the strata. The main technical contribution is a new method for near-optimal construction of natural strata with multiple groups. The online supplement material contains proofs, details, and information about the R package natstrat and replication.
13
Ye T, Small DS, Rosenbaum PR. Dimensions, power and factors in an observational study of behavioral problems after physical abuse of children. Ann Appl Stat 2022. [DOI: 10.1214/22-aoas1611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
14
Silber JH, Rosenbaum PR, Reiter JG, Hill AS, Jain S, Wolk DA, Small DS, Hashemi S, Niknam BA, Neuman MD, Fleisher LA, Eckenhoff R. Alzheimer's Dementia After Exposure to Anesthesia and Surgery in the Elderly: A Matched Natural Experiment Using Appendicitis. Ann Surg 2022; 276:e377-e385. [PMID: 33214467 PMCID: PMC8437105 DOI: 10.1097/sla.0000000000004632] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 10/22/2022]
Abstract
OBJECTIVE The aim of this study was to determine whether surgery and anesthesia in the elderly may promote Alzheimer disease and related dementias (ADRD). BACKGROUND There is a substantial conflicting literature concerning the hypothesis that surgery and anesthesia promote ADRD. Much of the literature is confounded by indications for surgery or has small sample size. This study examines elderly patients with appendicitis, a common condition that strikes mostly at random after controlling for some known associations. METHODS A matched natural experiment of patients undergoing appendectomy for appendicitis versus control patients without appendicitis using Medicare data from 2002 to 2017, examining 54,996 patients without previous diagnoses of ADRD, cognitive impairment, or neurological degeneration, who developed appendicitis between ages 68 through 77 years and underwent an appendectomy (the "Appendectomy" treated group), matching them 5:1 to 274,980 controls, examining the subsequent hazard for developing ADRD. RESULTS The hazard ratio (HR) for developing ADRD or death was lower in the Appendectomy group than controls: HR = 0.96 [95% confidence interval (CI) 0.94-0.98], P < 0.0001, (28.2% in Appendectomy vs 29.1% in controls, at 7.5 years). The HR for death was 0.97 (95% CI 0.95-0.99), P = 0.002, (22.7% vs 23.1% at 7.5 years). The HR for developing ADRD alone was 0.89 (95% CI 0.86-0.92), P < 0.0001, (7.6% in Appendectomy vs 8.6% in controls, at 7.5 years). No subgroup analyses found significantly elevated rates of ADRD in the Appendectomy group. CONCLUSION In this natural experiment involving 329,976 elderly patients, exposure to appendectomy surgery and anesthesia did not increase the subsequent rate of ADRD.
|
15
|
Rosenbaum PR, Rubin DB. Propensity Scores in the Design of Observational Studies for Causal Effects. Biometrika 2022. [DOI: 10.1093/biomet/asac054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
Summary
The design of any study, whether experimental or observational, that is intended to estimate the causal effects of a treatment condition relative to a control condition, refers to those activities that precede any examination of outcome variables. As defined in our 1983 article (Rosenbaum & Rubin, 1983), the propensity score is the unit-level conditional probability of assignment to treatment versus control given the observed covariates; so, the propensity score explicitly does not involve any outcome variables, in contrast to other summaries of variables sometimes used in observational studies. Balancing the distributions of covariates in the treatment and control groups by matching or balancing on the propensity score is therefore an aspect of the design of the observational study. In this invited comment on our 1983 article, we review the situation in the early 1980’s, and we recall some apparent paradoxes that propensity scores helped to resolve. We demonstrate that it is possible to balance an enormous number of low-dimensional summaries of a high-dimensional covariate, even though it is generally impossible to match individuals closely for all of the components of a high-dimensional covariate. In a sense, there is only one crucial observed covariate, the propensity score, and there is one crucial unobserved covariate, the ‘principal unobserved covariate’. The propensity score and the principal unobserved covariate are equal when treatment assignment is strongly ignorable, that is, unconfounded. Controlling for observed covariates is a prelude to the crucial step from association to causation, the step that addresses potential biases from unmeasured covariates. 
The design of an observational study also prepares for the step to causation: by selecting comparisons to increase the design sensitivity, by seeking opportunities to detect bias, by seeking mutually supportive evidence affected by different biases, by incorporating quasi-experimental devices such as multiple control groups, and by including the economist’s instruments. All of these considerations reflect the formal development of sensitivity analyses that were largely informal prior to the 1980s.
|
16
|
Rosenbaum PR. A statistic with demonstrated insensitivity to unmeasured bias for 2 × 2 × S tables in observational studies. Stat Med 2022; 41:3758-3771. [PMID: 35607846 DOI: 10.1002/sim.9446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2021] [Revised: 04/08/2022] [Accepted: 05/10/2022] [Indexed: 11/10/2022]
Abstract
Are weak associations between a treatment and a binary outcome always sensitive to small unmeasured biases in observational studies? This possibility is often discussed in epidemiology. The familiar Mantel-Haenszel test for a 2 × 2 × S contingency table exaggerates sensitivity to unmeasured biases when the population odds ratios vary among the S strata. A statistic built from several components, here from the S strata, is said to have demonstrated insensitivity to bias if it uses only those components that provide indications of insensitivity to bias. Briefly, such a statistic is a d-statistic. There are 2^S − 1 candidate statistics with S strata, and a d-statistic considers them all. To have level α, a test based on a d-statistic must pay a price for its double use of the data, but as the sample size increases, that price becomes small, while the gain may be large. The price is paid by conditioning on the limited information used to identify components that are insensitive to a bias of specified magnitude, basing the test result on the information that remains after conditioning. In large samples, the d-statistic achieves the largest possible design sensitivity, so it does not exaggerate sensitivity to unmeasured bias. A simulation verifies that the large sample result has traction in samples of practical size. A study of sunlight as a cause of cataract is used to illustrate issues and methods. Several extensions of the method are discussed. An R package dstat2x2xk implements the method.
|
17
|
Rosenbaum PR. A New Transformation of Treated-Control Matched-Pair Differences for Graphical Display. AM STAT 2022. [DOI: 10.1080/00031305.2022.2063944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
|
18
|
Yu R, Rosenbaum PR. Graded Matching for Large Observational Studies. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2058001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
19
|
Jain S, Rosenbaum PR, Reiter JG, Hill AS, Wolk DA, Hashemi S, Fleisher LA, Eckenhoff R, Silber JH. Risk of Parkinson's disease after anaesthesia and surgery. Br J Anaesth 2022; 128:e268-e270. [PMID: 35101245 PMCID: PMC9074782 DOI: 10.1016/j.bja.2021.12.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Revised: 12/22/2021] [Accepted: 12/27/2021] [Indexed: 11/19/2022] Open
|
20
|
Yu R, Small DS, Rosenbaum PR. The information in covariate imbalance in studies of hormone replacement therapy. Ann Appl Stat 2021. [DOI: 10.1214/21-aoas1448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
21
|
Zhang B, Small DS, Lasater KB, McHugh M, Silber JH, Rosenbaum PR. Matching One Sample According to Two Criteria in Observational Studies. J Am Stat Assoc 2021; 118:1140-1151. [PMID: 37347087 PMCID: PMC10281706 DOI: 10.1080/01621459.2021.1981337] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/14/2020] [Revised: 07/20/2021] [Accepted: 09/08/2021] [Indexed: 10/20/2022]
Abstract
Multivariate matching has two goals: (i) to construct treated and control groups that have similar distributions of observed covariates, and (ii) to produce matched pairs or sets that are homogeneous in a few key covariates. When there are only a few binary covariates, both goals may be achieved by matching exactly for these few covariates. Commonly, however, there are many covariates, so goals (i) and (ii) come apart, and must be achieved by different means. As is also true in a randomized experiment, similar distributions can be achieved for a high-dimensional covariate, but close pairs can be achieved for only a few covariates. We introduce a new polynomial-time method for achieving both goals that substantially generalizes several existing methods; in particular, it can minimize the earthmover distance between two marginal distributions. The method involves minimum cost flow optimization in a network built around a tripartite graph, unlike the usual network built around a bipartite graph. In the tripartite graph, treated subjects appear twice, on the far left and the far right, with controls sandwiched between them, and efforts to balance covariates are represented on the right, while efforts to find close individual pairs are represented on the left. In this way, the two efforts may be pursued simultaneously without conflict. The method is applied to our on-going study in the Medicare population of the relationship between superior nursing and sepsis mortality. The match2C package in R implements the method.
|
22
|
Kelz RR, Sellers MM, Niknam BA, Sharpe JE, Rosenbaum PR, Hill AS, Zhou H, Hochman LL, Bilimoria KY, Itani K, Romano PS, Silber JH. A National Comparison of Operative Outcomes of New and Experienced Surgeons. Ann Surg 2021; 273:280-288. [PMID: 31188212 PMCID: PMC6898745 DOI: 10.1097/sla.0000000000003388] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/22/2022]
Abstract
OBJECTIVE To determine whether outcomes achieved by new surgeons are attributable to inexperience or to differences in the context in which care is delivered and patient complexity. BACKGROUND Although prior studies suggest that new surgeon outcomes are worse than those of experienced surgeons, factors that underlie these phenomena are poorly understood. METHODS A nationwide observational tapered matching study of outcomes of Medicare patients treated by new and experienced surgeons in 1221 US hospitals (2009-2013). The primary outcome studied is 30-day mortality. Secondary outcomes were examined. RESULTS In total, 694,165 patients treated by 8503 experienced surgeons were matched to 68,036 patients treated by 2119 new surgeons working in the same hospitals. New surgeons' patients were older (25.8% aged ≥85 vs 16.3%, P < 0.0001) with more emergency admissions (53.9% vs 25.8%, P < 0.0001) than experienced surgeons' patients. Patients of new surgeons had a significantly higher baseline 30-day mortality rate compared with patients of experienced surgeons (6.2% vs 4.5%, P < 0.0001; OR 1.42 (1.33, 1.52)). The difference remained significant after matching the types of operations performed (6.2% vs 5.1%, P < 0.0001; OR 1.24 (1.16, 1.32)) and after further matching on a combination of operation type and emergency admission status (6.2% vs 5.6%, P = 0.0007; OR 1.12 (1.05, 1.19)). After matching on operation type, emergency admission status, and patient complexity, the difference between new and experienced surgeons' patients' 30-day mortality became indistinguishable (6.2% vs 5.9%, P = 0.2391; OR 1.06 (0.97, 1.16)). CONCLUSIONS Among Medicare beneficiaries, the majority of the differences in outcomes between new and experienced surgeons are related to the context in which care is delivered and patient complexity rather than new surgeon inexperience.
|
23
|
Karmakar B, Small DS, Rosenbaum PR. Reinforced Designs: Multiple Instruments Plus Control Groups as Evidence Factors in an Observational Study of the Effectiveness of Catholic Schools. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2020.1745811] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/24/2022]
|
24
|
Lasater KB, McHugh MD, Rosenbaum PR, Aiken LH, Smith HL, Reiter JG, Niknam BA, Hill AS, Hochman LL, Jain S, Silber JH. Evaluating the Costs and Outcomes of Hospital Nursing Resources: a Matched Cohort Study of Patients with Common Medical Conditions. J Gen Intern Med 2021; 36:84-91. [PMID: 32869196 PMCID: PMC7458128 DOI: 10.1007/s11606-020-06151-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 07/24/2020] [Accepted: 08/12/2020] [Indexed: 11/26/2022]
Abstract
BACKGROUND Nursing resources, such as staffing ratios and skill mix, vary across hospitals. Better nursing resources have been linked to better patient outcomes but are assumed to increase costs. The value of investments in nursing resources, in terms of clinical benefits relative to costs, is unclear. OBJECTIVE To determine whether there are differential clinical outcomes, costs, and value among medical patients at hospitals characterized by better or worse nursing resources. DESIGN Matched cohort study of patients in 306 acute care hospitals. PATIENTS A total of 74,045 matched pairs of fee-for-service Medicare beneficiaries admitted for common medical conditions (25,446 sepsis pairs; 16,332 congestive heart failure pairs; 12,811 pneumonia pairs; 10,598 stroke pairs; 8858 acute myocardial infarction pairs). Patients were also matched on hospital size, technology, and teaching status. MAIN MEASURES Better (n = 76) and worse (n = 230) nursing resourced hospitals were defined by patient-to-nurse ratios, skill mix, proportions of bachelors-degree nurses, and nurse work environments. Outcomes included 30-day mortality, readmission, and resource utilization-based costs. KEY RESULTS Patients in hospitals with better nursing resources had significantly lower 30-day mortality (16.1% vs 17.1%, p < 0.0001) and fewer readmissions (32.3% vs 33.6%, p < 0.0001), yet costs were not significantly different ($18,848 vs $18,671, p = 0.133). The greatest outcomes and cost advantage of better nursing resourced hospitals were in patients with sepsis who had lower mortality (25.3% vs 27.6%, p < 0.0001). Overall, patients with the highest risk of mortality on admission experienced the greatest reductions in mortality and readmission from better nursing at no difference in cost. CONCLUSIONS Medicare beneficiaries with common medical conditions admitted to hospitals with better nursing resources experienced more favorable outcomes at almost no difference in cost.
|
25
|
Jain S, Rosenbaum PR, Reiter JG, Hoffman G, Small DS, Ha J, Hill AS, Wolk DA, Gaulton T, Neuman MD, Eckenhoff RG, Fleisher LA, Silber JH. Using Medicare claims in identifying Alzheimer's disease and related dementias. Alzheimers Dement 2020; 17:10.1002/alz.12199. [PMID: 33090695 PMCID: PMC8296851 DOI: 10.1002/alz.12199] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/25/2020] [Revised: 08/25/2020] [Accepted: 08/29/2020] [Indexed: 12/11/2022]
Abstract
INTRODUCTION This study develops a measure of Alzheimer's disease and related dementias (ADRD) using Medicare claims. METHODS Validation resembles the approach of the American Psychological Association, including (1) content validity, (2) construct validity, and (3) predictive validity. RESULTS We found that four items (a Medicare claim recording ADRD 1 year ago, 2 years ago, 3 years ago, and a total stay of 6 months in a nursing home) exhibit a pattern of association consistent with a single underlying ADRD construct, and that the presence of any two of these four items predicts a direct measure of cognitive function and also future claims for ADRD. DISCUSSION Our four items are internally consistent with the measurement of a single quantity. The presence of any two items does a better job than a single claim when predicting both a direct measure of cognitive function and future ADRD claims.
|