Powerful simulation pipelines with {targets}


Will Landau

Modeling and simulation

Credit: https://xkcd.com/303/, Randall Munroe

Disruption and repetition

Clinical trial simulation

Drug development

Clinical trials

Responsibilities

Clinical trial simulation: goals

Quantify decision-making ability.

Compare alternative designs.

Clinical trial simulation: steps

  1. Simulate virtual patients.
  2. Analyze with models.
  3. Repeat thousands of times.
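As a sketch of those three steps (simulate_patients() and analyze() are hypothetical stand-ins, not functions from the talk):

```r
# Hypothetical stand-ins for the simulate and analyze steps.
simulate_patients <- function(n) {
  # Virtual patients: placeholder exponential event times in years.
  data.frame(id = seq_len(n), years = rexp(n, rate = 0.2))
}

analyze <- function(patients) {
  # Placeholder analysis: median event time (a real rep fits the joint model).
  median(patients$years)
}

# Repeat: thousands of replications in a real study, 4 shown here.
replicate(4, analyze(simulate_patients(100)))
```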

Clinical trial simulation: challenges

Complicated R code.

Long execution times.

Frequent changes.

Example trial

Hypothetical trial

Extremely rare and severe autoimmune liver disease.

Phase 2A study, first look at efficacy.

Double blind: novel therapy vs standard of care.

Randomize up to 100 patients (1:1).

Bayesian joint model

Longitudinal endpoint: log bilirubin

\[ \begin{aligned} \color{orange}{\eta_i(t)} = x_i(t) \beta + b_i \end{aligned} \]

         \(\color{orange}{\eta_i(t)}\): expected log bilirubin for patient \(i\) at time \(t\).

Survival endpoint: years without liver transplant or death

\[ \begin{aligned} h_i(t) = h_0(t, \lambda) \exp \left (\color{orange}{\eta_i(t)} \alpha + d_i \color{blue}{\theta} \right ) \end{aligned} \]

         \(d_i\): 1 for treatment, 0 for control.

         \(\color{blue}{\theta}\): log hazard ratio of treatment vs control.

Decision rules

\[ \begin{aligned} g(y) = \text{P}(\exp(\color{blue}{\theta}) < 0.75 \ | \ y) \end{aligned} \]

Declare efficacy if:

\[ \begin{aligned} g(y) > 0.8 \end{aligned} \]

Declare futility if:

\[ \begin{aligned} g(y) < 0.4 \end{aligned} \]
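In code, the decision rules reduce to a posterior proportion and two thresholds; theta_draws below is a hypothetical set of posterior draws, not output from the talk's model:

```r
# Hypothetical posterior draws of the log hazard ratio theta.
set.seed(1)
theta_draws <- rnorm(4000, mean = -0.4, sd = 0.2)

# g(y) = P(exp(theta) < 0.75 | y), estimated as a posterior proportion.
g <- mean(exp(theta_draws) < 0.75)

# Efficacy if g(y) > 0.8, futility if g(y) < 0.4, otherwise keep going.
decision <- if (g > 0.8) "efficacy" else if (g < 0.4) "futility" else "continue"
```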

Clinical trial design

10-year accrual period.

Futility interim at \(n\) events (liver transplant or death).

Goal: simulate to propose a suitable \(n\).

Simulate futility

Simulate 10000 futile trials for each of \(n = 40, 50, 60, 70\).

A fixed hazard ratio of 1 would be unrealistic.

Instead, draw from a distribution that captures futility.

Use the posterior distribution from a historical trial.

Historical futility data

Datasets pbc and pbcseq from the survival package.

Indication: primary biliary cholangitis (PBC).

Mayo Clinic trial of D-penicillamine vs placebo.

D-penicillamine did not improve survival.
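As a quick frequentist cross-check (the talk's historical model is the Bayesian joint model, not this), a Cox regression on pbc reproduces the null treatment effect:

```r
library(survival)

# In pbc, status == 2 codes death and trt is 1 (D-penicillamine) or 2 (placebo);
# unrandomized patients have trt == NA and are dropped.
fit <- coxph(Surv(time, status == 2) ~ factor(trt), data = pbc)
exp(coef(fit))  # hazard ratio near 1: no survival benefit
```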

Estimated hazard ratio

R code

Functions

\[ \Huge {\color{blue}{y}} = {\color{purple}{f}}({\color{teal}{x}}) \]

Everything that exists is an object.
Everything that happens is a function call.


— John Chambers

Top-level R functions

\[ \Huge {\color{blue}{y}} = {\color{purple}{f}}({\color{teal}{x}}) \]


Function              Description
draw_hazard_ratios()  From historical model.
trial()               Simulate one trial rep.
plot_results()        Visualize results.

Example simulation replication


trial(hazard_ratio = 0.5, events = 50)
#> # A tibble: 1 × 4
#>   events efficacy enrolled years
#>    <dbl>    <dbl>    <dbl> <dbl>
#> 1     50    0.998       83  8.52

Example simulation replication


trial(hazard_ratio = 1, events = 50)
#> # A tibble: 1 × 4
#>   events efficacy enrolled years
#>    <dbl>    <dbl>    <dbl> <dbl>
#> 1     50   0.0324       72  7.35

All together, informally

Step 1: draw historical hazard ratios.

hazard_ratios <- draw_hazard_ratios(10000)

Step 2: run simulations.

library(purrr)  # pmap(), list_rbind()
library(tidyr)  # expand_grid()

results <- hazard_ratios |>
  expand_grid(events = c(40, 50, 60, 70)) |>
  pmap(trial) |>
  list_rbind()

Step 3: plot results.

plot_results(results)

Workflow challenges

Implementing the joint model

rstanarm::stan_jm(
  formulaLong = log_bilirubin ~
    study_arm + albumin + years_measured +
    (1 | patient_id),
  formulaEvent = Surv(years_survived, event) ~
    study_arm,
  time_var = "years_measured",
  dataLong = data_longitudinal,
  dataEvent = data_survival
)

Backend: Stan


Computational demands of Stan/HMC

HMC = Hamiltonian Monte Carlo.

Physics simulation samples from the posterior distribution.

Computation time

system.time(draw_hazard_ratios())
#>    user  system elapsed 
#> 658.306   2.424 177.468 


system.time(trial(1, events = 70))
#>    user  system elapsed 
#> 115.568   0.726  32.261 


32 seconds x 10000 reps x 4 scenarios \(>\) 2 weeks!
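The back-of-envelope arithmetic:

```r
seconds <- 32 * 10000 * 4  # one rep x reps x scenarios
seconds / (60 * 60 * 24)   # about 14.8 days: more than 2 weeks of sequential compute
```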

Numbered notebooks?

What about data?

Complexity?

Keeping results up to date?

Pipelines

What is a pipeline?

A pipeline is a formal group of tasks.

Each task has dependencies, a command, and output.

A task can only modify its own output.

A task cannot eventually depend on itself.

Pipeline tools


Examples: Make, Airflow, Nextflow, Prefect, and targets.

targets: a pipeline tool for R

Retains the comfort of your local R session.

Focuses on your functions, not your scripts.

Integrates with Quarto.

Runs distributed computing.

Trial simulation pipeline

Project-oriented workflow


fs::dir_tree()
#> _targets.R
#> R
#> ├── draw_hazard_ratios.R
#> ├── plot_results.R
#> ├── trial.R
#> └── utils.R

R/ scripts contain functions.

_targets.R defines the pipeline.

Define the pipeline in _targets.R

library(targets)

tar_option_set(...)

tar_source()

list(
  tar_target(hazard_ratios, ...),
  tar_target(trials, ...),
  ...
)

Options and parallel computing

library(crew.cluster)

controller <- crew_controller_sge(
  workers = 500,
  cores = 4,
  seconds_idle = 120
)

tar_option_set(
  packages = c("rstanarm", ...),
  controller = controller
)

Inputs

tar_target(
  name = events,
  command = c(40, 50, 60, 70)
)


Equivalent to:

function() {
  result <- c(40, 50, 60, 70)
  saveRDS(result, "_targets/objects/events")
}

Inputs

tar_target(
  name = hazard_ratios,
  command = draw_hazard_ratios(10000)
)


Equivalent to:

function(draw_hazard_ratios) {
  result <- draw_hazard_ratios(10000)
  saveRDS(result, "_targets/objects/hazard...")
}

Simulations

tar_target(
  name = trials,
  command = trial(hazard_ratios, events),
  pattern = cross(hazard_ratios, events)
)


For each combination of hazard ratio and event threshold:

function(trial, hazard_ratio, events) {
  result <- trial(hazard_ratio, events)
  saveRDS(result, "_targets/objects/trials...")
}

Results

tar_target(
  name = plot,
  command = plot_results(trials)
)


Equivalent to:

function(plot_results, trials) {
  result <- plot_results(trials)
  saveRDS(result, "_targets/objects/plot")
}

Results

tarchetypes::tar_quarto(
  name = quarto,
  path = "results.qmd"
)


Equivalent to:

function(path, plot) {
  quarto::quarto_render(path)
  c("results.html", path)
}

Quarto report

Graph visualization

tar_visnetwork(targets_only = TRUE)


Implicit dependencies

tar_target(
  name = plot,
  command = plot_results(trials)
)

Running the pipeline

Run the pipeline

tar_make(as_job = TRUE)

Auto-scaled distributed computing


Track progress


tar_progress_summary()[, 1:3]
#> # A tibble: 1 × 3
#>   skipped dispatched completed
#>     <int>      <int>     <int>
#> 1       0        500     12613

Track progress


tar_visnetwork(targets_only = TRUE)

Monitor workers

library(crew.cluster)
monitor <- crew_monitor_sge()

monitor$jobs()[, c(1, 3, 4, 9)]
#> # A tibble: 2 × 4
#>   job_number name   state slots
#>   <chr>      <chr>  <chr> <chr>
#> 1 131853812  crew-… r     4    
#> 2 131853813  crew-… r     4

Simulation results

tar_read(trials, branches = seq_len(8))
#> # A tibble: 8 × 4
#>    events efficacy enrolled years
#>     <dbl>    <dbl>    <dbl> <dbl>
#>  1     40   0.055        58  6.31
#>  2     50   0.0685       77  7.99
#>  3     60   0.0602       81  7.93
#>  4     70   0.153        93  9.01
#>  5     40   0.220        65  6.03
#>  6     50   0.479        73  6.43
#>  7     60   0.918        87  8.91
#>  8     70   0.832        89  8.99
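From output like this, the operating characteristic of interest is the rate of false efficacy declarations per event threshold. A sketch using only the eight branches printed above (the real summary would use all 10000 reps per threshold):

```r
library(dplyr)
library(tibble)

# The eight branches printed above, re-entered by hand.
trials <- tribble(
  ~events, ~efficacy, ~enrolled, ~years,
       40,    0.055,        58,   6.31,
       50,    0.0685,       77,   7.99,
       60,    0.0602,       81,   7.93,
       70,    0.153,        93,   9.01,
       40,    0.220,        65,   6.03,
       50,    0.479,        73,   6.43,
       60,    0.918,        87,   8.91,
       70,    0.832,        89,   8.99
)

# Rate of (false) efficacy declarations, g(y) > 0.8, per event threshold.
trials |>
  group_by(events) |>
  summarize(false_go = mean(efficacy > 0.8))
```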

Plot

tar_read(plot)

Quarto report

Add a new target

tar_target(
  name = years,
  command = plot_years(trials)
)

Skip up-to-date targets


tar_make()
#> ...
#> ✔ skipping targets (40002 so far)...
#> ▶ dispatched target years

Years until interim

tar_read(years)

Change the results plot

tar_target(
  name = plot,
  command = plot_results(trials, height = 7)
)

Change an event threshold

tar_target(
  name = events,
  command = c(40, 50, 60, 75)
)

Change an upstream function

Up-to-date pipeline

tar_outdated()
#> character(0)


tar_visnetwork(targets_only = TRUE)

Conclusions

Interim at 40 events looks promising.

Need to simulate the full design with interims and a final analysis.

Recap

Benefits of pipeline tools


  1. Time savings
  2. Reproducibility
  3. Peace of mind

Thanks!


Slides: wlandau.github.io/useR2025

Sources

  • Brilleman, S. L., M. J. Crowther, M. Moreno-Betancur, J. Buros Novik, and R. Wolfe. 2018. “Joint Longitudinal and Time-to-Event Models via Stan.” In StanCon 2018. https://github.com/stan-dev/stancon_talks/.
  • Dickson, E. R., T. R. Fleming, R. H. Wiesner, W. P. Baldus, C. R. Fleming, J. Ludwig, and J. T. McCall. 1985. “Trial of Penicillamine in Advanced Primary Biliary Cirrhosis.” New England Journal of Medicine 312 (16): 1011–15. https://doi.org/10.1056/NEJM198504183121602.
  • Goodrich, B., J. Gabry, I. Ali, and S. L. Brilleman. 2024. “rstanarm: Bayesian applied regression modeling via Stan.” https://mc-stan.org/rstanarm.
  • Landau, W. M., K. Kunzmann, Y. Sidi, and C. Stock. 2024. “brms.mmrm: Bayesian MMRMs using ‘brms’.” R package version 1.1.1. https://openpharma.github.io/brms.mmrm/.
  • Landau, W. M. 2021. “The targets R package: a dynamic Make-like function-oriented pipeline toolkit for reproducibility and high-performance computing.” Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959.
  • Gould, A. L., M. E. Boye, M. J. Crowther, J. G. Ibrahim, G. Quartey, S. Micallef, and F. Y. Bois. 2015. “Joint modeling of survival and longitudinal non-survival data: current methods and issues. Report of the DIA Bayesian joint modeling working group.” Statistics in Medicine 34 (14): 2181–95. https://doi.org/10.1002/sim.6141.
  • Modrák, M., A. H. Moon, S. Kim, P. Bürkner, N. Huurre, K. Faltejsková, A. Gelman, and A. Vehtari. 2023. “Simulation-Based Calibration Checking for Bayesian Computation: The Choice of Test Quantities Shapes Sensitivity.” Bayesian Analysis. https://doi.org/10.1214/23-BA1404.
  • Therneau, T. M., and P. M. Grambsch. 2000. Modeling Survival Data: Extending the Cox Model. New York: Springer.

Image credit