It is sometimes desirable to run one or more Bayesian models repeatedly across multiple simulated datasets. Examples:

1. Validate the implementation of a Bayesian model using simulation-based calibration (SBC; Cook, Gelman, and Rubin 2006; Talts et al. 2020).
2. Simulate a randomized controlled experiment to explore frequentist properties such as power and Type I error.

This vignette focuses on (1). The goal is to simulate multiple datasets from the model below, analyze each dataset, and assess how often the estimated credible intervals for `beta` capture the true value of `beta` from the simulation.

```
lines <- "data {
  int <lower = 1> n;
  vector[n] x;
  vector[n] y;
}
parameters {
  real beta;
}
model {
  y ~ normal(x * beta, 1);
  beta ~ normal(0, 1);
}"
writeLines(lines, "model.stan")
```

Next, we define a pipeline to simulate multiple datasets and fit each dataset with the model. Below, we commit to 10 replications: 5 batches with 2 replications per batch. (In practical situations, the total number of replications should be far larger, in the hundreds or thousands.) We also supply custom variable names and summary functions to return the 95% credible intervals for `beta`. In addition, we use the `data_copy` argument to copy the true value of `beta` from the data to the results, which will help us assess the credible intervals.

```
# _targets.R
library(targets)
library(stantargets)
options(crayon.enabled = FALSE)
tar_option_set(memory = "transient", garbage_collection = TRUE)
generate_data <- function(n = 10L) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}
list(
  tar_stan_mcmc_rep_summary(
    model,
    "model.stan",
    generate_data(), # Runs once per rep.
    batches = 5, # Number of branch targets.
    reps = 2, # Number of model reps per branch target.
    data_copy = "true_beta", # Append scalars from data to the output data frame.
    variables = "beta",
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    ),
    log = R.utils::nullfile()
  )
)
```

We now have a pipeline that runs the model 10 times: 5 batches (branch targets) with 2 replications per batch.

```
tar_visnetwork(targets_only = TRUE)
```

Run the computation with `tar_make()`:

```
tar_make()
#> ● run target model_batch
#> ● run target model_file_model
#> Compiling Stan program...
#> ● run branch model_data_b0b9380a
#> ● run branch model_data_ffcdb73c
#> ● run branch model_data_b968a03a
#> ● run branch model_data_f8763cb2
#> ● run branch model_data_0bfdabdc
#> ● run branch model_model_5d061b58
#> Model executable is up to date!
#> ● run branch model_model_a9336683
#> Model executable is up to date!
#> ● run branch model_model_bde6a6d6
#> Model executable is up to date!
#> ● run branch model_model_384f982f
#> Model executable is up to date!
#> ● run branch model_model_0d59666a
#> Model executable is up to date!
#> ● run target model
#> ● end pipeline
```

The result is an aggregated data frame of summary statistics, where the `.rep` column distinguishes among individual replicates. We have the credible intervals for `beta` in columns `q2.5` and `q97.5`. And thanks to `data_copy`, we also have `true_beta`, the value of `beta` used to generate the dataset in each simulation rep.

```
tar_load(model)
model
#> # A tibble: 10 x 7
#>    variable   q2.5  q97.5 true_beta .rep     .file      .name
#>    <chr>     <dbl>  <dbl>     <dbl> <chr>    <chr>      <chr>
#>  1 beta     -0.311  1.40      0.768 e028a25a model.stan model
#>  2 beta     -0.533  1.24      0.491 49d9679e model.stan model
#>  3 beta      0.382  2.11      0.631 c2034aa2 model.stan model
#>  4 beta     -0.821  0.901    -0.139 9c310b5f model.stan model
#>  5 beta     -0.300  1.45      0.875 e9c6eb3c model.stan model
#>  6 beta     -0.889  0.851     0.557 06b6d8f4 model.stan model
#>  7 beta      0.360  2.04      1.50  ac50f4a6 model.stan model
#>  8 beta      1.67   3.44      2.15  20619c83 model.stan model
#>  9 beta     -1.64   0.142    -0.679 11ad0797 model.stan model
#> 10 beta     -0.403  1.36      0.987 7aa1dd5a model.stan model
```

Now, let’s assess how often the estimated 95% credible intervals capture the true values of `beta`. If the model is implemented correctly, the coverage value below should be close to 95%. (Ordinarily, we would increase the number of batches and reps per batch and run the batches in parallel.)

```
library(dplyr)
model %>%
  summarize(coverage = mean(q2.5 < true_beta & true_beta < q97.5))
#> # A tibble: 1 x 1
#>   coverage
#>      <dbl>
#> 1        1
```
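As a quick sanity check outside the pipeline, note that this particular model is conjugate: with prior `beta ~ normal(0, 1)` and likelihood `y ~ normal(x * beta, 1)`, the posterior is normal with precision `1 + sum(x^2)` and mean `sum(x * y) / (1 + sum(x^2))`. The sketch below (not part of the pipeline, and the sample sizes are arbitrary choices) replicates the coverage experiment analytically, without Stan or MCMC, so many more replications are cheap:

```r
# Sketch: analytic coverage check for the conjugate normal model.
# Assumptions: n = 100 observations per dataset, 1000 replications.
set.seed(1)
n <- 100
x <- seq(from = -1, to = 1, length.out = n)
covered <- replicate(1000, {
  true_beta <- rnorm(1)                  # draw beta from its prior
  y <- rnorm(n, x * true_beta, 1)        # simulate data from the model
  precision <- 1 + sum(x^2)              # posterior precision
  post_mean <- sum(x * y) / precision    # posterior mean
  post_sd <- sqrt(1 / precision)         # posterior standard deviation
  interval <- qnorm(c(0.025, 0.975), post_mean, post_sd)
  interval[1] < true_beta && true_beta < interval[2]
})
mean(covered) # should be close to 0.95
```

Because the datasets are drawn from the prior predictive distribution, the 95% posterior intervals should cover the truth about 95% of the time, which is exactly the property the Stan pipeline estimates by brute force.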

For maximum reproducibility, we should express the coverage assessment as a new target at the end of the pipeline.

```
# _targets.R
library(targets)
library(stantargets)
library(dplyr)
generate_data <- function(n = 10L) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}
list(
  tar_stan_mcmc_rep_summary(
    model,
    "model.stan",
    generate_data(),
    batches = 5, # Number of branch targets.
    reps = 2, # Number of model reps per branch target.
    data_copy = "true_beta", # Append scalars from data to the output data frame.
    variables = "beta",
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    ),
    log = R.utils::nullfile()
  ),
  tar_target(
    coverage,
    model %>%
      summarize(coverage = mean(q2.5 < true_beta & true_beta < q97.5))
  )
)
```

The new `coverage` target should be the only outdated target, and it should be connected to the upstream `model` target.

```
tar_visnetwork(targets_only = TRUE)
```

When we run the pipeline, only the coverage assessment should run. That way, we skip all the expensive computation of simulating datasets and running MCMC multiple times.

```
tar_make()
#> ✔ skip target model_batch
#> ✔ skip target model_file_model
#> ✔ skip branch model_data_b0b9380a
#> ✔ skip branch model_data_ffcdb73c
#> ✔ skip branch model_data_b968a03a
#> ✔ skip branch model_data_f8763cb2
#> ✔ skip branch model_data_0bfdabdc
#> ✔ skip branch model_model_5d061b58
#> ✔ skip branch model_model_a9336683
#> ✔ skip branch model_model_bde6a6d6
#> ✔ skip branch model_model_384f982f
#> ✔ skip branch model_model_0d59666a
#> ✔ skip target model
#> ● run target coverage
#> ● end pipeline
```

```
tar_read(coverage)
#> # A tibble: 1 x 1
#>   coverage
#>      <dbl>
#> 1        1
```

`tar_stan_mcmc_rep_summary()` and similar functions allow you to supply multiple Stan models. If you do, each model will share the same collection of datasets. Below, we add a new `model2.stan` file to the `stan_files` argument of `tar_stan_mcmc_rep_summary()`. In the coverage summary below, we group by `.name` to compute a coverage statistic for each model.
```
# _targets.R
library(targets)
library(stantargets)
library(dplyr)
generate_data <- function(n = 10L) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}
list(
  tar_stan_mcmc_rep_summary(
    model,
    c("model.stan", "model2.stan"), # another model
    generate_data(),
    batches = 5,
    reps = 2,
    data_copy = "true_beta",
    variables = "beta",
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    ),
    log = R.utils::nullfile()
  ),
  tar_target(
    coverage,
    model %>%
      group_by(.name) %>%
      summarize(coverage = mean(q2.5 < true_beta & true_beta < q97.5))
  )
)
```

In the graph below, notice how the targets `model_model` and `model_model2` are both connected to `model_data` upstream. Downstream, `model` is equivalent to `dplyr::bind_rows(model_model, model_model2)`, and it will have special columns `.name` and `.file` to distinguish among all the models.

```
tar_visnetwork(targets_only = TRUE)
```
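To make the `.name` grouping concrete, here is a toy illustration with made-up numbers: the interval endpoints and `true_beta` values below are hypothetical stand-ins for pipeline output, chosen so the two models get different coverage.

```r
library(dplyr)
library(tibble)

# Hypothetical per-model summaries, mimicking the columns of the
# combined `model` data frame. The numbers are illustrative only.
model_model <- tibble(
  .name = "model",
  q2.5 = c(-1, 0.2),
  q97.5 = c(1, 2),
  true_beta = c(0.5, 1.5)
)
model_model2 <- tibble(
  .name = "model2",
  q2.5 = c(-2, 0.5),
  q97.5 = c(0, 2.5),
  true_beta = c(0.3, 3)
)

# The combined data frame stacks the per-model results,
# and `.name` identifies which model produced each row.
combined <- bind_rows(model_model, model_model2)
out <- combined %>%
  group_by(.name) %>%
  summarize(coverage = mean(q2.5 < true_beta & true_beta < q97.5))
out # coverage is 1 for "model" and 0 for "model2" with these toy numbers
```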

Cook, Samantha R., Andrew Gelman, and Donald B. Rubin. 2006. “Validation of Software for Bayesian Models Using Posterior Quantiles.” *Journal of Computational and Graphical Statistics* 15 (3): 675–92. http://www.jstor.org/stable/27594203.

Talts, Sean, Michael Betancourt, Daniel Simpson, Aki Vehtari, and Andrew Gelman. 2020. “Validating Bayesian Inference Algorithms with Simulation-Based Calibration.” http://arxiv.org/abs/1804.06788.