It is sometimes desirable to run one or more Bayesian models repeatedly across multiple simulated datasets. Examples:

  1. Validate the implementation of a Bayesian model using simulation-based calibration (SBC; Cook, Gelman, and Rubin 2006; Talts et al. 2020).
  2. Simulate a randomized controlled experiment to explore frequentist properties such as power and Type I error.

This vignette focuses on (1). The goal is to simulate multiple datasets from the model below, analyze each dataset, and assess how often the estimated credible intervals for beta capture the true value of beta from the simulation.

lines <- "data {
  int <lower = 1> n;
  vector[n] x;
  vector[n] y;
}
parameters {
  real beta;
}
model {
  y ~ normal(x * beta, 1);
  beta ~ normal(0, 1);
}"
writeLines(lines, "model.stan")
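
As an optional check before building the pipeline, we can confirm that the Stan program parses. The sketch below uses cmdstanr (which stantargets already relies on); compile = FALSE skips building the executable because the pipeline will compile the model later.

library(cmdstanr)
# Parse model.stan without compiling; any syntax errors are reported here.
cmdstan_model("model.stan", compile = FALSE)$check_syntax()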

Next, we define a pipeline that simulates multiple datasets and fits the model to each one. Below, we commit to 10 replications: 5 batches with 2 replications per batch. (In practice, the total number of replications should be far larger, on the order of hundreds or thousands.) We also supply custom variable names and summary functions to return the 95% credible intervals for beta, and we use the data_copy argument to copy the true value of beta from each dataset into the results, which will help us assess the credible intervals.

# _targets.R
library(targets)
library(stantargets)
options(crayon.enabled = FALSE)
tar_option_set(memory = "transient", garbage_collection = TRUE)

generate_data <- function(n = 10L) { # n: sample size per simulated dataset
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

list(
  tar_stan_mcmc_rep_summary(
    model,
    "model.stan",
    generate_data(), # Runs once per rep.
    batches = 5, # Number of branch targets.
    reps = 2, # Number of model reps per branch target.
    data_copy = "true_beta", # Append scalars from data to the output data frame.
    variables = "beta",
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    ),
    log = R.utils::nullfile()
  )
)
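
Before running anything, it can help to test the simulation function interactively. (This assumes generate_data() has been pasted into your R session; the data are random, so your values will differ.)

# Inspect the structure of a single simulated dataset.
str(generate_data())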

We now have a pipeline that runs the model 10 times: 5 batches (branch targets) with 2 replications per batch.

tar_visnetwork(targets_only = TRUE)

Run the computation with tar_make().

tar_make()
#> ● run target model_batch
#> ● run target model_file_model
#> Compiling Stan program...
#> ● run branch model_data_b0b9380a
#> ● run branch model_data_ffcdb73c
#> ● run branch model_data_b968a03a
#> ● run branch model_data_f8763cb2
#> ● run branch model_data_0bfdabdc
#> ● run branch model_model_5d061b58
#> Model executable is up to date!
#> ● run branch model_model_a9336683
#> Model executable is up to date!
#> ● run branch model_model_bde6a6d6
#> Model executable is up to date!
#> ● run branch model_model_384f982f
#> Model executable is up to date!
#> ● run branch model_model_0d59666a
#> Model executable is up to date!
#> ● run target model
#> ● end pipeline

The result is an aggregated data frame of summary statistics, where the .rep column distinguishes among individual replicates. We have the credible intervals for beta in columns q2.5 and q97.5. And thanks to data_copy, we also have true_beta, the value of beta used to generate the dataset in each simulation rep.

tar_load(model)
model
#> # A tibble: 10 x 7
#>    variable   q2.5 q97.5 true_beta .rep     .file      .name
#>    <chr>     <dbl> <dbl>     <dbl> <chr>    <chr>      <chr>
#>  1 beta     -0.311 1.40      0.768 e028a25a model.stan model
#>  2 beta     -0.533 1.24      0.491 49d9679e model.stan model
#>  3 beta      0.382 2.11      0.631 c2034aa2 model.stan model
#>  4 beta     -0.821 0.901    -0.139 9c310b5f model.stan model
#>  5 beta     -0.300 1.45      0.875 e9c6eb3c model.stan model
#>  6 beta     -0.889 0.851     0.557 06b6d8f4 model.stan model
#>  7 beta      0.360 2.04      1.50  ac50f4a6 model.stan model
#>  8 beta      1.67  3.44      2.15  20619c83 model.stan model
#>  9 beta     -1.64  0.142    -0.679 11ad0797 model.stan model
#> 10 beta     -0.403 1.36      0.987 7aa1dd5a model.stan model
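
As an optional visual check (assuming the ggplot2 package is installed), we can plot each 95% credible interval against the true value of beta: an interval covers the truth exactly when it crosses the identity line.

library(ggplot2)
ggplot(model, aes(x = true_beta)) +
  geom_errorbar(aes(ymin = q2.5, ymax = q97.5)) + # one interval per rep
  geom_abline(intercept = 0, slope = 1) + # interval covers the truth if it crosses this line
  labs(x = "true value of beta", y = "95% credible interval for beta")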

Now, let’s assess how often the estimated 95% credible intervals capture the true values of beta. If the model is implemented correctly, the coverage value below should be close to 95%. (Ordinarily, we would increase the number of batches and reps per batch and run the batches in parallel; see the sketch after the coverage output below.)

library(dplyr)
model %>%
  summarize(coverage = mean(q2.5 < true_beta & true_beta < q97.5))
#> # A tibble: 1 x 1
#>   coverage
#>      <dbl>
#> 1        1
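
As noted above, a serious simulation study would use far more batches, and the batch targets can run in parallel. One option is tar_make_clustermq() from targets; the sketch below assumes the clustermq package is installed and a scheduler is configured for your system.

# In _targets.R, pick a clustermq scheduler first, e.g. for a local machine:
# options(clustermq.scheduler = "multicore")
tar_make_clustermq(workers = 2) # build up to 2 targets at a time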

For maximum reproducibility, we should express the coverage assessment as a new target in the pipeline, downstream of the aggregated model summaries. (A variant that wraps the computation in a custom function is sketched after the pipeline below.)

# _targets.R
library(targets)
library(stantargets)
tar_option_set(packages = "dplyr") # make summarize() and %>% available to the coverage target

generate_data <- function(n = 10L) { # n: sample size per simulated dataset
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

list(
  tar_stan_mcmc_rep_summary(
    model,
    "model.stan",
    generate_data(),
    batches = 5, # Number of branch targets.
    reps = 2, # Number of model reps per branch target.
    data_copy = "true_beta", # Append scalars from data to the output data frame.
    variables = "beta",
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    ),
    log = R.utils::nullfile()
  ),
  tar_target(
    coverage,
    model %>%
      summarize(coverage = mean(q2.5 < true_beta & true_beta < q97.5))
  )
)
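
If the coverage computation grows more elaborate, it can move into its own function defined in _targets.R (or a script sourced from there) so the target command stays short. A sketch, with assess_coverage() as a hypothetical helper:

# Hypothetical helper: interval coverage from the aggregated summaries.
assess_coverage <- function(model) {
  dplyr::summarize(model, coverage = mean(q2.5 < true_beta & true_beta < q97.5))
}
# The target then becomes: tar_target(coverage, assess_coverage(model))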

The new coverage target should be the only outdated target, and it should be connected to the upstream model target.

tar_visnetwork(targets_only = TRUE)
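
Equivalently, tar_outdated() lists the outdated targets programmatically; here we would expect it to return only "coverage".

tar_outdated()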

When we run the pipeline, only the coverage assessment should run. That way, we skip all the expensive computation of simulating datasets and running MCMC multiple times.

tar_make()
#> ✔ skip target model_batch
#> ✔ skip target model_file_model
#> ✔ skip branch model_data_b0b9380a
#> ✔ skip branch model_data_ffcdb73c
#> ✔ skip branch model_data_b968a03a
#> ✔ skip branch model_data_f8763cb2
#> ✔ skip branch model_data_0bfdabdc
#> ✔ skip branch model_model_5d061b58
#> ✔ skip branch model_model_a9336683
#> ✔ skip branch model_model_bde6a6d6
#> ✔ skip branch model_model_384f982f
#> ✔ skip branch model_model_0d59666a
#> ✔ skip target model
#> ● run target coverage
#> ● end pipeline
tar_read(coverage)
#> # A tibble: 1 x 1
#>   coverage
#>      <dbl>
#> 1        1

Multiple models

tar_stan_mcmc_rep_summary() and similar functions allow you to supply multiple Stan models. If you do, each model will share the same collection of simulated datasets. Below, we add a new model2.stan file to the stan_files argument of tar_stan_mcmc_rep_summary(). In the coverage summary, we group by .name to compute a coverage statistic for each model.
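
The pipeline below assumes a model2.stan file already exists. For this sketch, we simply copy the original program; in a real study, model2.stan would contain a competing specification.

# Create a second Stan file for illustration by copying the first.
# In practice, write a genuinely different model here.
file.copy("model.stan", "model2.stan", overwrite = TRUE)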

# _targets.R
library(targets)
library(stantargets)
tar_option_set(packages = "dplyr") # make group_by() and summarize() available to the coverage target

generate_data <- function(n = 10L) { # n: sample size per simulated dataset
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

list(
  tar_stan_mcmc_rep_summary(
    model,
    c("model.stan", "model2.stan"), # another model
    generate_data(),
    batches = 5,
    reps = 2,
    data_copy = "true_beta",
    variables = "beta",
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    ),
    log = R.utils::nullfile()
  ),
  tar_target(
    coverage,
    model %>%
      group_by(.name) %>%
      summarize(coverage = mean(q2.5 < true_beta & true_beta < q97.5))
  )
)

In the graph below, notice how targets model_model and model_model2 are both connected to model_data upstream. Downstream, model is equivalent to dplyr::bind_rows(model_model, model_model2), and it has the special columns .name and .file to distinguish among the models.

tar_visnetwork(targets_only = TRUE)
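
After this pipeline runs, the aggregated model data frame can be split by model, for example (assuming the pipeline has completed):

library(dplyr)
tar_read(model) %>%
  count(.name) # number of simulation reps summarized for each Stan model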

References

Cook, Samantha R., Andrew Gelman, and Donald B. Rubin. 2006. “Validation of Software for Bayesian Models Using Posterior Quantiles.” Journal of Computational and Graphical Statistics 15 (3): 675–92. http://www.jstor.org/stable/27594203.

Talts, Sean, Michael Betancourt, Daniel Simpson, Aki Vehtari, and Andrew Gelman. 2020. “Validating Bayesian Inference Algorithms with Simulation-Based Calibration.” http://arxiv.org/abs/1804.06788.