In the example used in [[IV the Bayesian way]], we used the default [[Weakly informative prior distributions|weakly informative priors]]. But what were they, and what did they mean? First, let's check the priors that `brms` assigned to each parameter with the following line:
```r
brms::get_prior(model_1 + model_2 + brms::set_rescor(rescor = TRUE), data = df)
# This will produce a long list because it covers fixed effects too.
# Alternatively, you can run brms::stancode(model_multivar) to see the priors in the generated Stan code.
```
Now that we know the priors for each parameter, we can replicate the `brms` model using another library and function where the model specification is more explicit: `rethinking/ulam`. `ulam` uses `cmdstanr` as its backend and provides a clean interface.^[CmdStanR (Command Stan R) is a lightweight interface to [[Stan]] for R users, providing an alternative to the traditional [RStan](https://mc-stan.org/rstan/) interface. See [here](https://mc-stan.org/cmdstanr/articles/cmdstanr.html) for details.] The code for the same model (with a slightly different implementation) using `ulam` is as follows:
```r
# Center the variables in the data frame to replicate brms behavior
df$centered_state_tax_per_pack <- df$state_tax_per_pack - mean(df$state_tax_per_pack)
df$centered_price_per_pack <- df$price_per_pack - mean(df$price_per_pack)

# Explicit Bayesian model with priors copied from the Stan code of the brms model
set.seed(2024)
show_me_everything <-
  rethinking::ulam(
    alist(
      # Models for each response (each stage of the IV model)
      price_per_pack ~ normal(mu_price, sigma_price),
      pack_sales_per_capita ~ normal(mu_sales, sigma_sales),
      # Linear predictors for each outcome
      mu_price <- b_price_Intercept +
        b_tax * centered_state_tax_per_pack +
        r_state_price[state_idx] +
        r_year_price[year_idx],
      mu_sales <- b_sales_Intercept +
        b_price * centered_price_per_pack +
        r_state_sales[state_idx] +
        r_year_sales[year_idx],
      # State and year effects for each outcome
      r_state_price[state_idx] ~ normal(0, sigma_state_price),
      r_state_sales[state_idx] ~ normal(0, sigma_state_sales),
      r_year_price[year_idx] ~ normal(0, sigma_year_price),
      r_year_sales[year_idx] ~ normal(0, sigma_year_sales),
      # Priors for the intercepts and slopes
      b_price_Intercept ~ student_t(3, 1.8, 2.5),
      b_sales_Intercept ~ student_t(3, 95.6, 43),
      c(b_price, b_tax) ~ normal(0, 2.5),
      # Priors for the standard deviations of residuals and state and year effects
      c(sigma_year_price, sigma_state_price, sigma_price) ~ student_t(3, 0, 2.5),
      c(sigma_year_sales, sigma_state_sales, sigma_sales) ~ student_t(3, 0, 43)
    ),
    data = list(
      price_per_pack = df$price_per_pack,
      pack_sales_per_capita = df$pack_sales_per_capita,
      centered_state_tax_per_pack = df$centered_state_tax_per_pack,
      centered_price_per_pack = df$centered_price_per_pack,
      year_idx = as.integer(as.factor(df$year)),
      state_idx = as.integer(as.factor(df$state))
    ),
    chains = 4,
    cores = 4,
    iter = 4000,
    warmup = 2000
  )
rethinking::precis(show_me_everything)
```
This model makes explicit the priors that `brms` chose for us. There are still implementation differences, some easier to address than others. For example, the priors for the coefficients can also be defined separately and scaled as follows to mimic the behavior of `brms` more closely:
```r
b_tax ~ normal(0, 2.5 * sd(price_per_pack)/sd(centered_state_tax_per_pack)),
b_price ~ normal(0, 2.5 * sd(pack_sales_per_capita)/sd(centered_price_per_pack)),
```
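Where do the numbers in these `student_t` priors come from? By default, `brms` derives the intercept and standard-deviation prior scales from the response itself, roughly `student_t(3, median(y), mad(y))` for the intercept and `student_t(3, 0, mad(y))` for standard deviations, with the scale floored at 2.5. A minimal sketch of that rule, using simulated data (the `rnorm` parameters below are placeholders standing in for the real cigarette data, not taken from it):

```r
# Sketch of brms's default prior-scale heuristic for a response y:
# student_t(3, median(y), mad(y)), with the scale floored at 2.5.
set.seed(1)
y <- rnorm(1000, mean = 95, sd = 40)  # stand-in for pack_sales_per_capita

# mad() in R is already scaled to be consistent with sd for normal data
intercept_prior <- sprintf("student_t(3, %.1f, %.1f)", median(y), max(2.5, mad(y)))
sd_prior       <- sprintf("student_t(3, 0, %.1f)", max(2.5, mad(y)))
print(intercept_prior)
print(sd_prior)
```

If this heuristic holds, it explains why the sales priors are centered near the sample median of sales with a scale near its MAD, while the tighter price priors hit the 2.5 floor.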
In addition, the `brms` model took into account the correlation between the residuals of the two response variables, price and sales. This is currently missing in the alternative `ulam` implementation of the model. In the `brms` [[IV the Bayesian way|results shown in the main chapter]], the residual correlation `rescor(priceperpack,packsalespercapita)` at the bottom of the output is small. This indicates that there is little unmodeled dependence between the two response variables, but some remains, and it can still affect the estimates. While there are these and other differences between the two implementations (`brms` and `ulam`), the results of the `ulam` model above are quite close to our original results from `brms`:
```r
mean sd 5.5% 94.5% rhat ess_bulk
b_price_Intercept 2.75 0.23 2.38 3.13 1.02 117.95
b_sales_Intercept 94.00 3.84 87.85 99.96 1.01 210.52
b_tax 1.21 0.01 1.19 1.23 1.00 5671.45
b_price -7.98 0.56 -8.88 -7.11 1.00 721.56
sigma_price 0.20 0.00 0.19 0.20 1.00 13742.46
sigma_state_price 0.13 0.01 0.11 0.15 1.00 7815.67
sigma_year_price 1.59 0.17 1.35 1.87 1.00 7620.34
sigma_sales 13.84 0.20 13.53 14.16 1.00 7964.89
sigma_state_sales 21.31 2.19 18.13 25.04 1.00 5947.39
sigma_year_sales 15.98 2.05 12.93 19.39 1.00 1658.37
```
Our coefficient of interest, `b_price`, indicates a reduction in annual per capita cigarette consumption of 7.98 packs on average for a $1 increase in price. This is consistent with the `brms` estimate of 7.82 packs (mean) and the frequentist estimate of 7.90 packs.
Using the `ulam` implementation of the model above, the priors can be changed to see their effect on the results. If we had substantive prior information about any of the parameters in this model, such as the effect of price on sales, we would have specified it. Even so, it takes little effort to change any of the priors in the code and observe the consequences for learning purposes. We leave this exercise to the reader.
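A further extension closes the gap noted earlier: the residual correlation that `brms` estimated with `rescor = TRUE` can, in principle, be added to the `ulam` model by replacing the two univariate likelihoods with a joint multivariate normal, following the `multi_normal`/`lkj_corr` pattern that the `rethinking` package uses for instrumental variables. A sketch of only the lines that would change inside the `alist` (the `exponential(1)` priors on `Sigma` are a common simplification here, not a copy of the `brms` defaults):

```r
# Joint likelihood for the two responses, replacing the two normal() lines;
# Rho plays the role of the residual correlation that rescor = TRUE estimates
c(price_per_pack, pack_sales_per_capita) ~ multi_normal(
  c(mu_price, mu_sales), Rho, Sigma
),
# Priors for the residual correlation matrix and the residual standard
# deviations (Sigma replaces sigma_price and sigma_sales)
Rho ~ lkj_corr(2),
Sigma ~ exponential(1),
```

With this change, `precis` would additionally report the off-diagonal element of `Rho`, which can be compared directly against the `rescor` row in the `brms` output.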
> [!info]- Last updated: November 4, 2024