Original Research

Mortality burden from variation in provision of surgical care in emergency general surgery: a cohort study using the National Inpatient Sample

Abstract

Background The decision to undertake a surgical intervention for an emergency general surgery (EGS) condition (appendicitis, diverticulitis, cholecystitis, hernia, peptic ulcer, bowel obstruction, ischemic bowel) involves a complex consideration of factors, particularly in older adults. We hypothesized that identifying variability in the application of operative management could highlight a potential pathway to improve patient survival and outcomes.

Methods We included adults aged 65+ years with an EGS condition from the 2016–2017 National Inpatient Sample. Operative management was determined from procedure codes. Each patient was assigned a propensity score (PS) for the likelihood of undergoing an operation, modeled from patient and hospital factors: EGS diagnosis, age, gender, race, presence of shock, comorbidities, and hospital EGS volumes. Low and high probability for surgery was defined using a PS cut-off of 0.5. We identified two model-concordant groups (no surgery-low probability, surgery-high probability) and two model-discordant groups (no surgery-high probability, surgery-low probability). Logistic regression estimated the adjusted OR (AOR) of in-hospital mortality for each group.

Results Of 375 546 admissions, 21.2% underwent surgery. Model-discordant care occurred in 14.6%; 5.9% had no surgery despite a high PS and 8.7% received surgery with low PS. In the adjusted regression, model-discordant care was associated with significantly increased mortality: no surgery-high probability AOR 2.06 (1.86 to 2.27), surgery-low probability AOR 1.57 (1.49 to 1.65). Model-concordant care showed a protective effect against mortality (AOR 0.83, 0.74 to 0.92).

Conclusions Nearly one in seven EGS patients received model-discordant care, which was associated with higher mortality. Our study suggests that streamlined treatment protocols can be applied in EGS patients as a means to save lives.

Level of evidence III.

What is already known on this topic

  • Variation in surgical care is generally interpreted as evidence of uncertainty among providers regarding optimal strategies. There is likely significant variation in the use of operative management in emergency general surgery (EGS); the extent of variation would expose an opportunity to improve outcomes and could be used to quantify the impact of decision aids.

What this study adds

  • Our study attempts to address this key gap in knowledge, using national data to (1) quantify the extent of variation in operative decision-making in EGS, and (2) determine if this variation represents an opportunity to improve guidance for operative decision-making.

How this study might affect research, practice or policy

  • Our study suggests that nearly one in seven EGS patients received care that differed from what would be expected given their disease and comorbidities, and that this care was associated with higher mortality. Streamlined treatment protocols could be applied in EGS care as a means to save lives.

Introduction

In the USA, older individuals bear a disproportionate burden of the mortality, morbidity, and costs associated with emergency general surgery (EGS).1–4 Decisions around whether to operate are not always clear in this population. Even for seemingly straightforward cases such as cholecystectomy, some literature suggests that surgery is safe and possibly preferred in the older patient,5–7 whereas other literature shows that older patients face up to 10-fold higher perioperative mortality.8 An article by Kaufman et al compared operative and non-operative management in EGS conditions and showed that the effect on mortality varied over conditions and across time, suggesting that operative management is not universally favored and varies as a function of both disease type and comorbidity burden.9 Currently, the most used tool to support decision-making in these scenarios is the American College of Surgeons National Surgical Quality Improvement Program (NSQIP) surgical risk calculator.10 11 Unfortunately, the NSQIP risk calculator provides the estimated risk of mortality and complications after operative management, but no equivalent evidence exists to estimate the risks of non-operative management. Because there are no easily applicable decision aids to help clinicians decide between operative and non-operative management, there is likely significant variation in the use of operative management.

Variation in surgical care is generally interpreted as evidence of uncertainty among providers regarding optimal strategies.12 Accordingly, reducing variation through dissemination of best-practice evidence is a key strategy for improving quality of care and improving patient outcomes across surgical specialties.12 13 One illustrative example of the value of reducing variation can be seen in trauma care, which parallels EGS in many ways including the urgent nature and high mortality risk. Reducing variation in trauma care has been accomplished through standardization of diagnostic testing, transfer and triage protocols, clinical protocols, and multidisciplinary involvement, and this has been shown to reduce complications, mortality, and length of care.14 15 Given the uncertainty of the benefits of surgery in older adults presenting with EGS diseases, we hypothesize that there may be considerable variation in the choice to pursue operative management. The presence and extent of variation would expose an opportunity to improve outcomes as it would indicate that providers have uncertainty about optimal management, and could be used to quantify the impact of decision aids.

Our study attempts to address this key gap in knowledge, using national data to (1) quantify the extent of variation in operative decision-making in EGS, and (2) determine if this variation represents an opportunity to improve guidance for operative decision-making. To do this, we developed a propensity score for the likelihood of having an operative procedure for an EGS condition. The propensity score would represent the ‘average’ operative decision, and we could then test how often patients received care that was discordant with the average decision. We had two hypotheses: (1) we would see variation in the use of operative management, and (2) care discordant with the ‘average’ decision would be associated with increased mortality.

Methods

Study population

This study uses the 2016 and 2017 National Inpatient Sample (NIS) from the Healthcare Cost and Utilization Project, supported by the Agency for Healthcare Research and Quality.16 The NIS includes a weighted sample of hospital admissions across the USA and approximates a 20% stratified sample that is representative of hospitals throughout the USA. In addition to demographics, the NIS includes up to 40 diagnosis codes and 25 procedure codes per inpatient admission, allowing for extraction of multiple procedures. The NIS includes data from inpatient stays, not individual patients, and therefore records of events and diagnoses before or after the stay are not available and not included in this analysis.

We included any hospitalization of a patient aged 65 years or older who had a non-elective admission for one of eight EGS conditions with complete data. Patients with a traumatic mechanism of injury were excluded from the population. These conditions were identified using International Classification of Diseases, 10th revision (ICD-10) diagnosis codes, using a method developed by Guttman et al.17 These conditions included appendicitis, cholecystitis, diverticulitis, hernia, intestinal ischemia, bowel obstruction, peptic ulcer disease, and perforated intestine. To allow risk stratification, each patient was categorized as having one EGS diagnosis. Patients with more than one EGS diagnosis present on the admission were categorized as having the disease type with the highest unadjusted mortality within the population. We classified patients as having operative management if any of the following procedure codes were associated with this hospitalization: appendectomy, gallbladder surgery, colon surgery, laparotomy, laparoscopy, adhesiolysis, small bowel surgery, hernia, and ulcer surgery. We based this categorization on Smith et al’s prior work.18 Although older data were used, these data allowed use of pre-existing ICD coding strategies which have already been published and validated so that these could be compared across existing studies.
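The single-diagnosis rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout (`conditions`, `died`), the function names, and the toy data are hypothetical, and the choice to count a multi-diagnosis admission toward each of its conditions when computing unadjusted mortality is an assumption, since the text does not specify it.

```python
from collections import defaultdict

def mortality_by_condition(admissions):
    """Unadjusted in-hospital mortality rate for each EGS condition.

    admissions: list of dicts with 'conditions' (list of str) and 'died' (bool).
    Assumption: an admission with several conditions counts toward each one.
    """
    deaths = defaultdict(int)
    totals = defaultdict(int)
    for adm in admissions:
        for cond in adm["conditions"]:
            totals[cond] += 1
            deaths[cond] += int(adm["died"])
    return {cond: deaths[cond] / totals[cond] for cond in totals}

def assign_primary_condition(conditions, mortality):
    """Pick the listed condition with the highest unadjusted mortality."""
    return max(conditions, key=lambda c: mortality[c])

# Hypothetical toy data, for illustration only.
toy = [
    {"conditions": ["hernia"], "died": False},
    {"conditions": ["hernia", "intestinal ischemia"], "died": True},
    {"conditions": ["intestinal ischemia"], "died": True},
    {"conditions": ["hernia"], "died": False},
]
rates = mortality_by_condition(toy)
primary = [assign_primary_condition(a["conditions"], rates) for a in toy]
```

In this toy example the admission coded with both hernia and intestinal ischemia is categorized as intestinal ischemia, because that condition has the higher unadjusted mortality.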

Study variables and analysis

The main outcome of interest was in-hospital mortality, identified using the NIS discharge disposition variable. Additional variables collected included patient factors and hospital factors. Patient factors included age, sex, race, the presence of shock (ICD-10 code R57 or T811), Elixhauser comorbidities,19 and a frailty score using a deficit accumulation method incorporating 38 diagnoses, designed for ICD-10 codes and used by Lai et al.20 Race was included in the propensity score as a surrogate for sociologic constructs as it is known that race-related and socioeconomic disparities exist in the provision of EGS operations.21–24 Individual component diagnoses for the frailty score were identified, as well as a composite frailty score. This frailty score can range from 0 (least frail) to 1 (most frail); for analysis, the frailty score was divided into quintiles. Hospital factors included hospital region (Northeast, Midwest, South, and West) and hospital teaching status (rural, urban non-teaching, urban teaching). Given that multiple years of NIS were used and hospitals could not be identified between years, hospital volumes were not included. Distributions of factors associated with having an operation are reported. Given the large sample size present in this dataset, we do not present p values for group comparisons. Differences between subgroups are shown using standardized differences.
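Two of the quantities in this paragraph can be illustrated with a short sketch. It assumes that the composite frailty score is the proportion of the 38 deficits present (the usual form of a deficit accumulation index, and consistent with the stated 0 to 1 range), and it uses the common standardized difference formula for a binary variable. The function names are hypothetical, and neither formula is confirmed by the text as the exact one implemented.

```python
import math

N_DEFICITS = 38  # number of frailty diagnoses stated in the text

def frailty_score(deficits_present):
    """Deficit accumulation score: share of the 38 deficits that are present.

    Assumption: the composite score is count / 38, giving a 0-to-1 range.
    """
    return len(set(deficits_present)) / N_DEFICITS

def standardized_difference(p1, p2):
    """Standardized difference between two proportions of a binary variable."""
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    return (p1 - p2) / math.sqrt(pooled_var)

# Hypothetical example: a patient with two coded deficits, and a variable
# present in 60% of one group versus 40% of the other.
example_score = frailty_score(["dementia", "hypertension"])
example_diff = standardized_difference(0.60, 0.40)
```

Unlike a p value, the standardized difference does not grow with sample size, which is why it suits a dataset of this scale.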

Propensity score

We developed a score for patients’ propensity to undergo an operative procedure. This score included EGS disease type (with each disease included as a categorical variable within the model), hospital factors (region, teaching status), age, sex, the presence of shock, and the Elixhauser comorbidities.25 26 We also included all diagnoses identified for the above frailty deficit accumulation score, which encompassed 38 diagnoses (see online supplemental file for ICD-10 codes used).20 Use of the variables which make up these composite indices ensures the inclusion of aging-related variables which influence a surgeon’s or patient’s decision to undergo an operation. The deficit accumulation frailty method has been described and validated as a method of operationalizing the accumulation of health issues which can compromise function.27–29 Examples of diagnoses which were included in our model were the presence of fluid and electrolyte imbalances, dementia, hypertension, arrhythmias, osteoporosis, and history of cerebrovascular infarct (see online supplemental file). We present a receiver operating characteristic curve with classification to determine the cut-off point at which the propensity score had the highest accuracy.
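Choosing a cut-off by classification accuracy can be sketched as below. The sketch assumes propensity scores have already been estimated. The function names, the candidate grid, and the toy data are hypothetical, and treating a score equal to the cut-off as "high" is an assumption.

```python
def accuracy_at_cutoff(scores, had_surgery, cutoff):
    """Share of admissions whose predicted class matches the observed treatment.

    Assumption: a score at or above the cut-off predicts surgery.
    """
    correct = sum((s >= cutoff) == bool(y) for s, y in zip(scores, had_surgery))
    return correct / len(scores)

def best_cutoff(scores, had_surgery, candidates):
    """Return the candidate cut-off with the highest accuracy (first on ties)."""
    return max(candidates, key=lambda c: accuracy_at_cutoff(scores, had_surgery, c))

# Hypothetical toy data: propensity scores and observed treatment (1 = surgery).
scores = [0.05, 0.20, 0.30, 0.35, 0.45, 0.55, 0.70, 0.90]
had_surgery = [0, 0, 1, 0, 0, 1, 1, 1]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

cutoff = best_cutoff(scores, had_surgery, candidates)
```

In this toy example the scan selects 0.5, with seven of the eight admissions classified correctly.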

Model concordance and identification of subgroups

We identified four surgical decision subgroups based on (1) the likelihood of having an operation using a propensity score cut-off of 0.5 and (2) whether an operation was performed. This defined two model-concordant groups: patients who did not undergo surgery who had a low probability for surgery based on their propensity score (no surgery-low probability), and patients who underwent surgery with a high probability for surgery based on their propensity score (surgery-high probability). Similarly, we also defined two model-discordant groups: patients who did not undergo surgery with a high probability of receiving surgery (no surgery-high probability), and patients who underwent surgery with a low probability of surgery (surgery-low probability). Factors associated with membership in these groups are described (model concordant vs. model discordant and between each of the four surgical decision groups). Differences between model-concordant and model-discordant groups are also displayed using standardized differences.
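The four-group assignment can be written as a small function. This is a sketch only: the group labels follow the text, but the function names are hypothetical, and treating a score of exactly 0.5 as high probability is an assumption because the text does not say how ties at the cut-off were handled.

```python
def decision_group(propensity, had_surgery, cutoff=0.5):
    """Assign one of the four surgical decision groups named in the text.

    Assumption: a propensity score at or above the cut-off is 'high probability'.
    """
    high = propensity >= cutoff
    if had_surgery:
        return "surgery-high probability" if high else "surgery-low probability"
    return "no surgery-high probability" if high else "no surgery-low probability"

def is_concordant(group):
    """True when the observed treatment matches the model's prediction."""
    return group in {"no surgery-low probability", "surgery-high probability"}

# Hypothetical example: a patient with a high score who was not operated on.
example_group = decision_group(0.82, had_surgery=False)
```

Here `example_group` is "no surgery-high probability", one of the two model-discordant groups.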

Regression modeling on in-hospital mortality

An adjusted logistic regression was created for the effect of subgroup membership on in-hospital mortality. We used the no surgery-low probability group as the reference group. Because relationships between factors and mortality would likely differ from the relationships between factors and the propensity to choose an operative management strategy, the logistic regression was further adjusted for age, female sex, presence of shock, frailty quintile, EGS diagnosis type, race, and hospital factors.
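For readers less familiar with how logistic regression output becomes an odds ratio, the conversion can be sketched as below. The function name and example values are hypothetical. The sketch assumes a Wald 95% CI on the log-odds scale, which is the usual default but is not stated in the text.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with a Wald confidence interval (95% for z = 1.96)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient for one subgroup indicator, relative to the
# reference group (no surgery-low probability in this study's model).
or_, lower, upper = odds_ratio_with_ci(beta=math.log(2), se=0.1)
```

An odds ratio above 1 with a lower bound above 1 indicates higher odds of in-hospital death than in the reference group, after adjustment for the other covariates in the model.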

Regulatory research

Stata/MP, V.17.0 (StataCorp, College Station, Texas, USA) was used, with use of the elixhauser,19 psmatch2,30 and stddiff programs.

Results

We identified 365 514 individuals in NIS who met our inclusion criteria. Over half of the population (56.7%) was female. The median age was 77 years (IQR 71–84). Just over one-fifth of the population (n=77 785, 21.3%) had an operative procedure and 4.0% (n=14 558) died during the hospitalization. Diverticulitis was the most frequently encountered diagnosis, encompassing 38.9% of the cohort, followed by bowel obstruction (19.3%), peptic ulcer (18.1%), and cholecystitis (13.3%). Table 1 shows the distribution of variables for the entire population, as well as between the operatively and non-operatively managed patients. Standardized differences and p values are shown to demonstrate the differences in factors between the operative and non-operative groups. The operative group was younger (median age 75 (IQR 70–82) vs 78 (71–84)), with a higher proportion of appendicitis, cholecystitis, hernia, ischemic bowel, and perforated viscus but lower proportions of diverticulitis and peptic ulcer. Patients with higher levels of frailty were more likely to be treated non-operatively.

Table 1
Study Population

Our propensity score for the likelihood of having an operation had an area under the receiver operating characteristic curve of 0.87. The plot of the density curve of the propensity score by treatment group is presented in figure 1. In this figure, we also highlight the cut-off point which was used to create the four subgroups: no surgery-low probability, surgery-high probability, surgery-low probability, and no surgery-high probability. The no surgery-low probability was the most common category, and this is used as the reference group for subsequent comparisons. The distribution of propensity scores and grouping is presented in figure 2.
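The area under the ROC curve reported above has a direct interpretation: it is the probability that a randomly chosen operated patient has a higher propensity score than a randomly chosen non-operated patient. A minimal sketch of that pairwise calculation follows. The function name and toy data are hypothetical, and this is not the routine the authors ran.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    Counts how often an operated patient outscores a non-operated patient;
    tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: propensity scores and observed treatment (1 = surgery).
toy_scores = [0.05, 0.20, 0.30, 0.35, 0.45, 0.55, 0.70, 0.90]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
```

A value of 0.5 would mean the score discriminates no better than chance, and 1.0 would mean perfect separation of operated from non-operated patients.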

Figure 1

Receiver operating characteristic (ROC) curve for propensity model.

Figure 2

Density graph for surgical decision groups. PS, propensity score.

We compared patients with model-concordant and model-discordant care. Model-concordant care was associated with an unadjusted in-hospital mortality of 3.3%, compared with model-discordant care mortality of 7.9%. Model-concordant care was higher in patients with higher levels of frailty (quintiles 4 and 5) as well as in patients with diverticulitis and peptic ulcer disease (table 2). Model-discordant care was higher in patients with shock, cholecystitis, bowel obstruction, ischemic bowel, and perforated viscus. Patients who were black and Hispanic had higher proportions of concordant care than those who were white, Asian, Native American, or categorized in another racial category.

Table 2
Model-concordant versus model-discordant care

Examining EGS group subtypes commonly seen in model-discordant care, the reason for discordance varied (table 3). When model-discordant care was provided in patients with cholecystitis and perforated viscus, this tended to be distributed in the no surgery-high probability group. However, when model-discordant care was provided for patients with ischemic bowel and bowel obstructions, it was distributed in the surgery-low probability group. Examining surgical decision subgroups within race and ethnicity, patients who were white or black were more likely to have operative care that was model discordant, whereas patients from Hispanic and Asian subgroups were more likely to have non-operative discordant care.

Table 3
Four-group distribution of factors

In table 4, model-concordant provision of surgery (surgery-high probability) was associated with lower odds of in-hospital death (OR 0.70, 95% CI 0.64 to 0.77) compared with the no surgery-low probability group. Model-discordant treatment was associated with increased odds of death for both groups: no surgery-high probability (OR 1.81, 95% CI 1.66 to 1.97) and surgery-low probability (OR 1.39, 95% CI 1.32 to 1.47).

Table 4
Logistic regression, in-hospital mortality

Discussion

In our study of over 300 000 older adults with an EGS condition, over one-fifth received surgery, and 4% died during the hospitalization. Patients whose care was consistent with their predicted treatment had better outcomes; those who received discordant care were more likely to die in hospital. Our findings demonstrated not only that there is variation in the provision of operative management but also that when patients received care which fell outside of the ‘average’ treatment paradigm, mortality was higher. This information is important for both providers and the EGS community, as these data support important ongoing work to develop improved standards for EGS practice.

Patients who received surgery tended to be younger, less frail, and have specific EGS conditions: appendicitis, gallbladder disease, hernia, ischemic bowel, and perforated viscus. Patients with diverticulitis and peptic ulcer disease were more likely to be treated non-operatively. Multiple prior studies have shown that medically complex patients or those with high levels of frailty are more likely to be treated non-operatively.9 31–33 For patients who are unlikely to survive their EGS condition no matter the treatment, non-operative management is a reasonable approach. However, non-operative management can also weaken an already frail patient and increase the risk of a subsequent surgery. In studies of frail patients with appendicitis and cholecystitis using the National Readmissions Database, nearly one in five patients treated non-operatively at the index admission eventually failed non-operative management and required a later surgery, which resulted in higher rates of complication and higher mortality than in those who had an operation performed at the index admission.32 33

Multiple factors influence surgical decision-making in EGS conditions and can influence the surgeon’s recommendations and patients’ decisions. Instrumental qualitative work on perspectives of older patients and surgeons regarding high-stakes surgical decision-making was published by Nabozny et al, who showed that even though patients and surgeons highly value quality of life, this notion is difficult to incorporate into surgical decisions.34 Many view this as a simple choice between life and death, where choosing surgery is a surrogate for choosing life, even though surgery is not synonymous with survival.34 For surgeons, conversations are often framed by the structure of an informed consent, which is a poor framework for decision-making because it relies on disclosure of discrete procedural complications rather than prioritizing alignment with individual patient preferences.34 35 Surgeons struggle with framing these difficult conversations, and in some circumstances, it can be ‘easier to just operate than to explain to the family why surgery is not the right treatment.’34

Unfortunately, the current lack of data on outcomes after non-operative management limits our ability to use data to show whether there is clinical equipoise between operative and non-operative management. Clinical experience and single-center studies tend to favor operative management in those who can tolerate the operation. For those who are referred to surgery but cannot tolerate an operation, outcomes are poor. A single-center retrospective study examined outcomes after non-operative EGS management and showed high mortality: 11% at 30 days and 23% at 1 year. When surgical consultants deemed the patient too high risk for an operative procedure, the 1-year mortality was 53%.36 However, these single-center studies may have selection bias, as these are patients who have been selected for a surgical consultation. Population-based long-term studies may show more potential for equipoise in certain clinical scenarios. A study by Kaufman et al used Medicare data to examine patient populations using a matched instrumental variable analysis and found that mortality over 180 days varied by condition. Although operative management in hepatobiliary conditions was associated with a lower risk of mortality at 30, 90, and 180 days, in upper gastrointestinal and colorectal conditions, the opposite was true.9

This study had several key limitations, mostly related to the use of administrative data. The main limitation is that the NIS does not have any clinical information about disease severity, functional status, other terminal diseases, or patient preferences, which could have influenced treatment decisions and outcomes; likewise, no information was extracted about non-operative management strategies such as use of antibiotics or drains. In addition, these data are older, from 2016 to 2017; these data allowed use of ICD code methodologies that were developed previously and avoided changes in decision-making that occurred during the COVID-19 pandemic. To limit the effect of the lack of functional status, our propensity score included not only the Elixhauser diagnoses but also the 38 diagnoses that make up a deficit accumulation frailty score. The deficit accumulation method is a reproducible method which can quantify frailty, which can serve well as a proxy for development of functional deficits and accelerated aging.27–29 37 Another potential limitation of our study is the inability to account for variation in provider and hospital characteristics, such as medical and surgical expertise and biases. Our study was also only able to examine in-hospital mortality as the outcome of interest, which limits our understanding of the overall impact of treatment decisions on long-term patient quality of life. In some circumstances, such as the patient who is unlikely to survive an operation, non-operative management despite a high probability for surgery may be the preferred treatment strategy despite high mortality. Because our study goal was to outline existing variation in surgical practice for the field of EGS, examining nuances between varying disease types was also outside the scope of this study. It is likely that the variation between practices and the effects of that variation differ greatly between EGS disease types.

In summary, the aim of this study was to characterize variation in use of operative management for EGS conditions in the USA; our secondary aim was to determine if this variation had subsequent clinical consequences. Our findings highlight that there is variation in provision of operative management in EGS conditions, and that patients had a higher risk of mortality if they received care which was not concordant with the treatment which was predicted. By demonstrating the consequences of variation in care, our study underscores the need for further studies that examine whether standardized treatment protocols or decision aids can reduce variation and improve care. Our study suggests that there is an opportunity to improve operative decision-making in EGS care through reduction of variation.