Under traditional fee-for-service reimbursement, a primary care practice has limited financial incentive to reduce hospitalizations among their patient population. Value-based care is a healthcare delivery and payment model that rewards providers for the quality and cost efficiency of care. For example, a primary care practice might enter a financial contract with an insurance company that involves reimbursing the practice for limiting the total healthcare expenditures for patients attributed to that practice.
Hospital admissions are a substantial component of total US healthcare spending. Reducing hospital readmissions during the 30-day post-discharge window is an opportunity to improve patient outcomes while reducing costs. Under fee-for-service, there is little incentive to invest in post-discharge care management to reduce readmission rates. However, value-based care models offer an opportunity for primary care providers to receive reimbursement for the savings obtained by reducing readmissions.
Research Question
For a program designed to reduce the prevalence of hospital readmissions, it is important to assess the impact of that program in order to determine if the cost of the program justifies the reduction in healthcare spend and improvement in patient outcomes.
In this analysis, we use simulated data to demonstrate estimating the impact of a primary care practice incorporating post-discharge care management on the probability of all-cause hospital readmission during 30 days.
Data Overview
Simulated Data Description
For this analysis, we simulate data on practices, patients, and hospitalizations.
In a real-world scenario, we would expect the following:
independent practices will have varying numbers of providers
the number of patients per practice will vary based on the number of providers and the payer max (e.g. heavy medicare practices tend to have smaller patient panels)
the timing of the intervention will vary across practices
the rate of both hospitalization and readmission will vary by both patient and practice characteristics
There are also many factors beyond care management which may impact the probability of readmission, including:
patient clinical factors such as primary diagnosis and comorbidities
social drivers of heath including access to housing, social support, and transportation
quality of primary care
The simulated data for this analysis incorporates many of these characteristics.
Treatment Assigment
For the purposes of this study, we assume that practices selected if and when to participate in the treatment of adoption care management. If practices were assigned to participate in the trial, then randomization would indicate there are no systematic differences among other pre-treatment variables that could be confounding. In the absence of randomization, it requires much stronger assumptions and specialized methods to ensure balance among the treatment and control practices.
Practice Data
This is a glimpse of the simulated practice data.
Code
#summary(prac$intervention_start)prac %>%slice(1:8) %>%gt() %>%cols_label(practice_id ="Prac ID",n_providers ="# Providers",n_medicare_pt ="# Medicare Pts",n_medicaid_pt ="# Medicaid Pts",n_private_pt ="# Private Pts",n_pt ="# Patients",has_intervention ="Case Mgmt",intervention_start ="Case Mgmt Start" ) %>%tab_header("Practice Level Data") %>%tab_footnote("This is the treatment variable", cells_column_labels(has_intervention)) %>%tab_footnote("The intervention start date varies by practice between '2023-01-01' and '2025-11-01'", cells_column_labels(intervention_start)) %>%fmt_number(starts_with("n_"), decimals =0)
Practice Level Data
Prac ID
# Providers
# Medicare Pts
# Medicaid Pts
# Private Pts
# Patients
Case Mgmt1
Case Mgmt Start2
1
5
3,180
523
1,538
5,241
1
2023-06-01
2
6
2,253
236
5,599
8,088
1
2023-06-01
3
2
974
197
1,774
2,945
1
2023-09-01
4
5
709
974
7,942
9,625
0
NA
5
3
1,284
462
2,635
4,381
1
2025-10-01
6
3
412
347
4,840
5,599
1
2024-11-01
7
4
1,357
204
6,135
7,696
1
2025-08-01
8
1
202
18
1,410
1,630
0
NA
1 This is the treatment variable
2 The intervention start date varies by practice between '2023-01-01' and '2025-11-01'
Patient Data
Code
pat %>%group_by(insurance, esrd, ssdi) %>%sample_n(2) %>%ungroup() %>%gt() %>%tab_header("Simulated Patient Level Data") %>%cols_label(practice_id ="Prac ID",patient_id ="Patient ID",insurance ="Insurance",age ="Age",sex ="Sex",esrd ="ESRD",ssdi ="SSDI" ) %>%tab_footnote("Patient ID is nested within Practice ID", cells_column_labels(patient_id))
Simulated Patient Level Data
Prac ID
Patient ID1
Insurance
Age
Sex
ESRD
SSDI
48
1429
medicaid
59
M
0
0
46
2755
medicaid
40
F
0
0
20
286
medicare
65
M
0
0
1
1171
medicare
74
F
0
0
48
1
medicare
61
M
0
1
44
1422
medicare
21
M
0
1
23
1581
medicare
34
M
1
0
31
28
medicare
72
F
1
0
49
2614
private
64
F
0
0
45
3630
private
20
F
0
0
1 Patient ID is nested within Practice ID
Hospitalization and Readmission Data
Code
hosp %>%slice(1:10) %>%gt() %>%tab_header("Simulated Hospitalization Level Data") %>%cols_label(practice_id ="Prac ID",patient_id ="Patient ID",hosp_date ="Hospitalization Date",readmit ="Readmission" ) %>%tab_footnote("Outcome of Interest: All-cause readmission within 30 days of discharge", cells_column_labels(readmit)) %>%tab_footnote("Two patients hospitalized on the same day may have different values for the treatment of interest (received post-discharge care management)",cells_column_labels(hosp_date))
Simulated Hospitalization Level Data
Prac ID
Patient ID
Hospitalization Date1
Readmission2
1
17
2024-07-01
0
1
18
2023-05-01
0
1
38
2022-09-01
0
1
104
2022-04-01
1
1
114
2023-06-01
0
1
125
2022-02-01
0
1
131
2025-01-01
0
1
138
2023-05-01
0
1
155
2025-06-01
0
1
159
2023-09-01
1
1 Two patients hospitalized on the same day may have different values for the treatment of interest (received post-discharge care management)
2 Outcome of Interest: All-cause readmission within 30 days of discharge
Other data issues not considered
Real-world data would likely have additional complications such as:
ambiguity in attributing patients to practices
many hospitalizations for the same patient within the study period
dual insurance coverage or self-pay
lags in the implementation of the intervention
other relevant patient diagnoses
For simplicity, we don’t consider any of the above in this simulation.
Compute Patient Outcome
This analysis joins the three simulated data sets together and has one row per patient hospitalization. Readmission is the binary outcome of interest. For simplicity, there is at most one record per patient.
Code
patient_outcome <- pat %>%left_join(prac, by ="practice_id") %>%left_join(hosp, by =c("practice_id", "patient_id")) %>%mutate(has_intervention_pt =case_when( has_intervention ==0~0,# if the patient is hospitalized before the practice adopts care managment,# the patient has a 0 for treatment even if the practice subsequently adopts care management intervention_start > hosp_date ~0,TRUE~1 ),hosp_date_years =as.integer(hosp_date -as.Date("2022-01-01")) /365, )
Analysis Dataset
Code
patient_outcome %>%select(practice_id, patient_id, insurance, age, sex, esrd, ssdi, hosp_date, has_intervention_pt, readmit) %>%filter(!is.na(hosp_date)) %>%group_by(readmit) %>%sample_n(3) %>%ungroup() %>%gt() %>%tab_header("Analysis Data Set", subtitle ="One row per patient for patients that have been hospitalized") %>%cols_label(practice_id ="Prac ID",patient_id ="Patient ID",insurance ="Insurance",age ="Age",sex ="Sex",esrd ="ESRD",ssdi ="SSDI",hosp_date ="Hospitalization Date",has_intervention_pt ="Case Management",readmit ="Readmission" ) %>%tab_footnote("Treatment of Interest", cells_column_labels(has_intervention_pt)) %>%tab_footnote("Outcome of Interest", cells_column_labels(readmit))
Analysis Data Set
One row per patient for patients that have been hospitalized
Prac ID
Patient ID
Insurance
Age
Sex
ESRD
SSDI
Hospitalization Date
Case Management1
Readmission2
4
8309
private
7
M
0
0
2024-08-01
0
0
13
5116
private
7
F
0
0
2025-06-01
1
0
57
1327
medicare
78
M
0
0
2025-04-01
1
0
23
862
medicare
82
M
0
0
2024-05-01
0
1
23
1275
medicare
73
F
0
0
2025-08-01
0
1
61
1096
medicare
74
M
0
0
2022-09-01
0
1
1 Treatment of Interest
2 Outcome of Interest
Exploratory Analysis
Before fitting a statistical model to estimate the effect of interest, we provide some data summaries and visualizations to understand the relationships among the features.
Code
# hospitalization ratepatient_outcome %>%count(is_hosp =!is.na(hosp_date)) %>%mutate(p = n /sum(n)) %>%gt() %>%tab_header("Distribution of Hospitalization",subtitle = glue::glue("Among the {comma(nrow(patient_outcome))} patients in the study") ) %>%fmt_number(n, decimals =0) %>%fmt_percent(p, decimals =1) %>%cols_label(is_hosp ="Hospitalized?",n ="# Patients",p ="% Patients" ) %>%tab_footnote("TRUE if a patient is hospitalized at least once from Jan 2022 through Dec 2025",cells_column_labels(is_hosp) )
Distribution of Hospitalization
Among the 420,344 patients in the study
Hospitalized?1
# Patients
% Patients
FALSE
405,205
96.4%
TRUE
15,139
3.6%
1 TRUE if a patient is hospitalized at least once from Jan 2022 through Dec 2025
Code
# readmission ratepatient_outcome %>%filter(!is.na(hosp_date)) %>%count(readmit) %>%mutate(p = n /sum(n)) %>%gt() %>%tab_header("30-day readmission",subtitle = glue::glue("Among the {comma(nrow(patient_outcome %>% filter(!is.na(hosp_date))))} hospitalized in the study") ) %>%fmt_number(n, decimals =0) %>%fmt_percent(p, decimals =1) %>%cols_label(readmit ="Readmission?",n ="# Patients",p ="% Patients" ) %>%tab_footnote("TRUE if a patient is readmitted within 30 days of discharge",cells_column_labels(readmit) )
30-day readmission
Among the 15,139 hospitalized in the study
Readmission?1
# Patients
% Patients
0
13,271
87.7%
1
1,868
12.3%
1 TRUE if a patient is readmitted within 30 days of discharge
This table shows the bivariate relationship between treatment (care management) and outcome (readmission). Patients receiving care management readmit at a lower rate.
Code
# readmission rate by treatmentpatient_outcome %>%filter(!is.na(hosp_date)) %>%count(has_intervention_pt, readmit) %>%mutate(has_intervention_pt =factor(has_intervention_pt,levels =c(0, 1),labels =c("No Care Management", "Care Management"))) %>%group_by(has_intervention_pt) %>%mutate(p = n /sum(n)) %>%gt() %>%tab_header("30-day readmission",subtitle = glue::glue("Among the {comma(nrow(patient_outcome %>% filter(!is.na(hosp_date))))} hospitalized in the study") ) %>%fmt_number(n, decimals =0) %>%fmt_percent(p, decimals =1) %>%cols_label(readmit ="Readmission?",n ="# Patients",p ="% Patients" ) %>%# tab_footnote(# "Note: patients with care managment treatment have a lower rate of readmission",# cells_column_labels(p)# ) %>%tab_style(style =cell_fill(color ="green"),locations =cells_body(columns = p, rows = (readmit ==1& has_intervention_pt =="Care Management") ) ) %>%tab_style(style =cell_fill(color ="yellow"),locations =cells_body(columns = p, rows = (readmit ==1& has_intervention_pt =="No Care Management") ) )
30-day readmission
Among the 15,139 hospitalized in the study
Readmission?
# Patients
% Patients
No Care Management
0
9,267
86.5%
1
1,449
13.5%
Care Management
0
4,004
90.5%
1
419
9.5%
This plot shows the change in propensity to receive treatment, \(P(T = 1)\), over time. The incremental nature of rolling out a care management intervention across practices means that more practices will have adopted the intervention by the end of the study period.
Two implications for modeling:
we will only include years 2023-2025 in the model as there are not treatment cases during 2022
we need to assess if time is related to \(Y\) in addition to \(T\) to determine if it is a confounder
Code
# proportion of treated and untreated over timepatient_outcome %>%filter(!is.na(hosp_date)) %>%select(has_intervention_pt, hosp_date) %>%ggplot(aes(x = hosp_date, y = has_intervention_pt)) +geom_jitter(alpha =0.1, width =0, height =0.1) +stat_smooth(method ="glm", method.args =list(family = binomial), se =FALSE) +scale_y_continuous(breaks =c(0, 1), labels =c("No Care Mgmt", "Care Mgmt")) +labs(title ="Propensity to receive treatment over time",subtitle ="Baseline trend from logistic model over time",x ="Hospitalization Date",y =NULL )
These plot shows baseline readmission rates over time. The first plot shows the trend for all patients. The second plot shows separate trends for treated and control patients.
The underlying decreasing trend indicates that time of hospitalization is a confounder. Time of hospitalization is related to both:
\(T\): the propensity to be in the treatment group increases with time due to the incremental rollout, and
\(Y\): the decreasing probability of readmission may be related to other macro-trends beyond care management
Code
# readmission rates over timeg_outcome_trend <- patient_outcome %>%filter(!is.na(hosp_date)) %>%ggplot(aes(x = hosp_date, y = readmit)) +geom_jitter(alpha =0.1, width =0, height =0.1) +stat_smooth(method ="glm", method.args =list(family = binomial), se =FALSE) +scale_y_continuous(breaks =c(0, 1), labels =c("No Readmit", "Readmit")) +labs(title ="Rate of Readmissions over time",subtitle ="Baseline trend from logistic model over time",x ="Hospitalization Date",y =NULL )g_outcome_trend
Here, we check on the bivariate relationships between both ESRD status and insurance and the outcome.
Code
# hospitalization rate by insurancepatient_outcome %>%filter(!is.na(hosp_date)) %>%count(insurance, readmit) %>%group_by(insurance) %>%mutate(p = n /sum(n)) %>%ungroup() %>%gt() %>%tab_header("Readmission rate by insurance",subtitle ="Among Hospitalized Patients") %>%fmt_number(n, decimals =0) %>%fmt_percent(p, decimals =1)
Readmission rate by insurance
Among Hospitalized Patients
insurance
readmit
n
p
medicaid
0
1,083
90.2%
medicaid
1
118
9.8%
medicare
0
4,850
83.0%
medicare
1
993
17.0%
private
0
7,338
90.6%
private
1
757
9.4%
Code
# readmission rate by ESRDpatient_outcome %>%filter(!is.na(hosp_date)) %>%count(esrd, readmit) %>%group_by(esrd) %>%mutate(p = n /sum(n)) %>%ungroup() %>%gt() %>%tab_header("Readmission rate by ESRD status",subtitle ="Among Hospitalized Patients") %>%fmt_number(n, decimals =0) %>%fmt_percent(p, decimals =1)
Readmission rate by ESRD status
Among Hospitalized Patients
esrd
readmit
n
p
0
0
12,968
88.1%
0
1
1,746
11.9%
1
0
303
71.3%
1
1
122
28.7%
We saw that medicare patients have a higher rate of readmission. Aggregating to the practice level, we see that practices with a higher medicare mix also have higher readmission rates.
Code
# per practice: medicare mix vs readmit ratemedicare_outcome_df <- patient_outcome %>%group_by(practice_id) %>%summarise(medicare_mix =sum(insurance =="medicare") /n(),readmit_rate =sum(readmit[!is.na(hosp_date)]) /sum(!is.na(hosp_date)) )ggplot(medicare_outcome_df, aes(x = medicare_mix, y = readmit_rate)) +geom_point(alpha =0.5) +labs(title ="Readmit rate vs Medicare Mix",subtitle ="One observation per practice",x ="% Patients with Medicare",y ="Readmission Rate" ) +expand_limits(y =0) +scale_y_continuous(breaks =pretty_breaks(8), label = percent)
Model
Inclusion Criteria
any patient hospitalized since 2023-01-01
patient does not have ESRD diagnosis
Model Considerations
allow for baseline decreasing rate of readmissions over time
allow for practice and patient level covariates
allow for variation among practices beyond what is captured in practice-level covariates
accomodate the hierarchical dependency that arises from patients being grouped within practices
Includes hospitalized patients meeting inclusion criteria
Treated (T)
Readmission (Y)
# Patients
% of Patients
0
0
6,068
53.9%
0
1
878
7.8%
1
0
3,914
34.8%
1
1
393
3.5%
Generalized Linear Mixed Model
We fit a generalized linear mixed model that models the probability of readmission for individual \(i\) in practice \(j\). This is an extension of a generalized linear model (with logit link) that allows for random practice-level effects.
\(\alpha_j\) is the random effect for practice \(j\)
\(T\) is the treatment of interest
\(X_2\) is a time covariate (time since beginning of study)
\(X_3\) is age
\(X_4\) is sex
\(X_5\) is ssdi status
There are several approaches to fit this model. We could:
fit an approximateion to the marginal log-likelihood using lme4::glmer()
fit a fully Bayesian model using brms()
use generalized estimating equations instead of a multilevel
Here, we choose glmer() but other options are reasonable.
Frequentist Results
Code
m2 <-glmer( readmit ~ (1| practice_id) + has_intervention_pt + hosp_date_years + age + sex + ssdi,family = binomial,data = model_data)# other options: profile, bootm2_confint <-confint(m2, method ="Wald") %>%as_tibble(rownames ="covariate") %>%rename(lower95 =`2.5 %`,upper95 =`97.5 %` )m2_coeff <-summary(m2)$coefficients %>%as_tibble(rownames ="covariate") %>%rename(se =`Std. Error`)m2_results <- m2_coeff %>%inner_join(m2_confint, by ="covariate") %>%select(covariate, Estimate, se, lower95, upper95) %>%mutate(odds_ratio =exp(Estimate),lower95_odds =exp(lower95),upper95_odds =exp(upper95) ) %>%select(covariate, odds_ratio, lower95_odds, upper95_odds) %>%mutate(covariate =recode_values(covariate,"(Intercept)"~"Intercept","has_intervention_pt"~"Treatment","hosp_date_years"~"Years from study beginning","age"~"Age","sexM"~"Male Indicator","ssdi"~"SSDI Status" ))gt(m2_results) %>%fmt_number(c(odds_ratio, lower95_odds, upper95_odds), decimals =3) %>%cols_label(covariate ="Covariate",odds_ratio ="Odds Ratio",lower95_odds ="Lower 95% CI",upper95_odds ="Upper 95% CI" ) %>%tab_style(style =cell_fill(color ="green"),locations =cells_body(columns = odds_ratio, rows = (covariate =="Treatment") ) ) %>%tab_footnote("Primary Effect of Interest",cells_body(columns = odds_ratio, rows = (covariate =="Treatment") ) ) %>%tab_footnote("CIs produced via Wald method for simplicity", cells_column_labels(c(lower95_odds, upper95_odds)) ) %>%tab_footnote("Produced by exponentiating the raw coefficients",cells_column_labels(odds_ratio) )
Covariate
Odds Ratio1
Lower 95% CI2
Upper 95% CI2
Intercept
0.069
0.054
0.087
Treatment
3 0.699
0.602
0.811
Years from study beginning
0.989
0.920
1.064
Age
1.015
1.012
1.017
Male Indicator
0.958
0.851
1.078
SSDI Status
1.803
1.469
2.214
1 Produced by exponentiating the raw coefficients
2 CIs produced via Wald method for simplicity
3 Primary Effect of Interest
Bayesian Results
The results of the bayesian analysis obtained via brms are consistent with the frequentist likelihood-based approach using lmer4::glmer().
Code
mbayes <-brm( readmit ~ (1| practice_id) + has_intervention_pt + hosp_date_years + age + sex + ssdi,family =bernoulli(link ="logit"),data = model_data,prior =c(prior(normal(0, 2.5), class = b),prior(normal(0, 2.5), class = Intercept),prior(exponential(1), class = sd) ),chains =4,iter =2000, # 1000 warmup + 1000 sampling per chain by defaultwarmup =1000,cores =4,seed =42)# posterior predictive check of mean readmission rate#pp_check(mbayes, type = "stat", stat = "mean")fixef_results <-fixef(mbayes) %>%as_tibble(rownames ="covariate") %>%mutate(odds_ratio =exp(Estimate),lower95_odds =exp(`Q2.5`),upper95_odds =exp(`Q97.5`) ) %>%select(covariate, odds_ratio, lower95_odds, upper95_odds) %>%mutate(covariate =recode_values(covariate,"Intercept"~"Intercept","has_intervention_pt"~"Treatment","hosp_date_years"~"Years from study beginning","age"~"Age","sexM"~"Male Indicator","ssdi"~"SSDI Status" ))gt(fixef_results) %>%tab_header("Fixed Effects from Bayesian Model",subtitle ="Model fit using brms interface to Stan") %>%fmt_number(c(odds_ratio, lower95_odds, upper95_odds), decimals =3) %>%cols_label(covariate ="Covariate",odds_ratio ="Odds Ratio",lower95_odds ="Lower 95% CI",upper95_odds ="Upper 95% CI" ) %>%tab_style(style =cell_fill(color ="green"),locations =cells_body(columns = odds_ratio, rows = (covariate =="Treatment") ) ) %>%tab_footnote("Primary Effect of Interest",cells_body(columns = odds_ratio, rows = (covariate =="Treatment") ) ) %>%tab_footnote("Posterior Credible Intervals", cells_column_labels(c(lower95_odds, upper95_odds)) ) %>%tab_footnote("Produced by exponentiating the raw coefficients",cells_column_labels(odds_ratio) )
Fixed Effects from Bayesian Model
Model fit using brms interface to Stan
Covariate
Odds Ratio1
Lower 95% CI2
Upper 95% CI2
Intercept
0.069
0.053
0.087
Treatment
3 0.698
0.598
0.811
Years from study beginning
0.989
0.916
1.070
Age
1.015
1.012
1.017
Male Indicator
0.957
0.848
1.075
SSDI Status
1.795
1.451
2.205
1 Produced by exponentiating the raw coefficients
2 Posterior Credible Intervals
3 Primary Effect of Interest
Conclusion
In this simulation, post-discharge care management is associated with a 30% reduced odds of readmission. Is this effect causal? For causal inference, we would need to assume ignorability of the treatment assignment. Equivalently, we would need to assume all confounding covariates were included. This study was simulated, so we are certain that all confounding covariates have been included. However, in practice, there are likely other covariates impacting treatment selection and readmission. Therefore, we would want to employ other methods such as matching in order to make causal claims.