Leveraging {targets} and {crew} to simulate clinical trials


Will Landau

Agenda

  1. Clinical trial design and simulation
  2. Example simulation project
  3. targets
  4. crew
  5. Q & A

Trial design: optimization and balance

Clinical trial simulation

Example trial and simulation

  • Randomized, controlled, parallel, non-adaptive phase 3 study.
  • Randomize half the patients to drug, half to placebo.
  • The main outcome (primary endpoint) is continuous, and a higher score is healthier.
  • Declare efficacy if the p-value is < 0.05 from a 1-sided hypothesis test.
  • Goal of the simulation: determine the minimum number of patients required to demonstrate superiority to placebo with 90% power and Type I error less than 5%.

Simulation code: R functions

  • A data analysis is a sequence of transformations.
  • Functions are great tools to express those transformations.
  • simulate_dataset() accepts a simulation scenario (with sample size and assumed efficacy level) and returns a simulated dataset.
  • analyze_dataset() accepts a simulated dataset and returns a one-row data frame with a p-value (and optionally posterior probabilities, etc.).
  • simulate_trial() chains simulate_dataset() and analyze_dataset() together.

simulate_dataset()

simulate_dataset <- function(
  mean_response_drug,
  sample_size
) {
  # Unique patient identifiers.
  patient_id <- paste0(
    "patient_",
    sample.int(n = 1e9, size = sample_size, replace = FALSE)
  )
  # 1:1 randomization: half the patients to drug, half to placebo.
  study_arm <- rep(
    x = c("drug", "placebo"),
    each = sample_size / 2
  )
  # Continuous primary endpoint: higher is healthier.
  response_drug <- rnorm(
    n = sample_size / 2,
    mean = mean_response_drug,
    sd = 4.25
  )
  response_placebo <- rnorm(
    n = sample_size / 2,
    mean = 1,
    sd = 4.25
  )
  tibble(
    patient_id = patient_id,
    study_arm = study_arm,
    response = c(response_drug, response_placebo)
  )
}
library(tibble)
dataset <- simulate_dataset(
  mean_response_drug = 2,
  sample_size = 800
)
dataset
#> # A tibble: 800 × 3
#>    patient_id        study_arm response
#>    <chr>             <chr>        <dbl>
#>  1 patient_545860557 drug         0.950
#>  2 patient_68238256  drug         0.524
#>  3 patient_197379753 drug         7.05 
#>  4 patient_577319868 drug        -1.46 
#>  5 patient_291086621 drug         8.60 
#>  6 patient_851227638 drug        -0.202
#>  7 patient_96276021  drug        13.2  
#>  8 patient_2032492   drug        -5.26 
#>  9 patient_439198057 drug        -2.48 
#> 10 patient_947918099 drug         6.99 
#> # ℹ 790 more rows
#> # ℹ Use `print(n = ...)` to see more rows
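A quick sanity check (not from the original slides) confirms the 1:1 allocation:

dplyr::count(dataset, study_arm)
#> # A tibble: 2 × 2
#>   study_arm     n
#>   <chr>     <int>
#> 1 drug        400
#> 2 placebo     400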

analyze_dataset()

analyze_dataset <- function(dataset) {
  dataset %>%
    # Put placebo first so the "drug" coefficient is the treatment effect.
    mutate(
      study_arm = factor(
        study_arm,
        levels = c("placebo", "drug")
      )
    ) %>%
    lm(formula = response ~ study_arm) %>%
    summary() %>%
    coefficients() %>%
    as.data.frame() %>%
    filter(grepl("^study_arm", rownames(.))) %>%
    # One-sided p-value from a normal approximation to the t statistic.
    mutate(
      p_value = pnorm(
        q = `t value`,
        lower.tail = FALSE
      )
    ) %>%
    pull(p_value) %>%
    tibble(p_value = .)
}
library(dplyr)
analyze_dataset(dataset)
#> # A tibble: 1 × 1
#>   p_value
#>     <dbl>
#> 1  0.0169
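As a cross-check (not in the deck), the equal-variance two-sample t test reproduces the analysis up to the normal approximation analyze_dataset() applies to the t statistic:

# With alphabetical factor levels, the difference is drug minus placebo,
# so alternative = "greater" matches the one-sided test above.
t.test(
  response ~ study_arm,
  data = dataset,
  alternative = "greater",
  var.equal = TRUE
)$p.value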

simulate_trial()

simulate_trial <- function(
  mean_response_drug,
  sample_size
) {
  dataset <- simulate_dataset(
    mean_response_drug = mean_response_drug,
    sample_size = sample_size
  )
  analyze_dataset(dataset)
}
simulate_trial(
  mean_response_drug = 2,
  sample_size = 800
)
#> # A tibble: 1 × 1
#>   p_value
#>     <dbl>
#> 1  0.0208

Scenarios
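The scenario grid is the same tribble the scaled-out pipeline defines later in _targets.R:

library(tibble)
scenarios <- tribble(
  ~efficacy, ~mean_response_drug, ~sample_size,
  "strong",   2,                   700,
  "strong",   2,                   800,
  "null",     1,                   700,
  "null",     1,                   800
)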

Procedure

  • Use simulate_trial() to simulate thousands of replications of each scenario (each row of the grid above).
  • Within each scenario, calculate the proportion of replications that declare efficacy (p-value < 0.05).

Goals

  1. Determine the lowest sample size with power ≥ 90% and Type I error < 5%.
  2. Manage the computational demands of large simulations.
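Without a pipeline tool, goal 1 reduces to a plain Monte Carlo loop like the sketch below (estimate_power() is an assumed helper, not from the talk); goal 2 is what targets and crew address.

# Naive power estimate for one scenario: slow, sequential, no caching.
estimate_power <- function(mean_response_drug, sample_size, reps = 1000) {
  p_values <- replicate(
    n = reps,
    expr = simulate_trial(
      mean_response_drug = mean_response_drug,
      sample_size = sample_size
    )$p_value
  )
  mean(p_values < 0.05)
}
estimate_power(mean_response_drug = 2, sample_size = 800)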

A reproducible analysis pipeline tool

Demanding computation in R: simulation and beyond

  • Clinical trial simulation
  • Bayesian data analysis: Stan, JAGS, NIMBLE, greta, SBC
  • Network meta-analysis
  • PK/PD: nlmixr, mrgsolve
  • Statistical genomics
  • Machine learning: keras, tensorflow, torch, tidymodels
  • Permutation tests
  • Database queries: DBI
  • Big data ETL

Typical notebook-based simulation

Messy reality: managing data

Messy reality: managing change

Make-like pipeline tools

  • Orchestrate moving parts.
  • Skip up-to-date results.
  • Scale the computation.
  • Manage output data.

targets

  • Designed for R.
  • Encourages good function-oriented programming habits.
  • Automatic dependency detection.
  • Behind-the-scenes data management.
  • Distributed computing.

Resources

Get started with targets

fs::dir_tree()
#> .
#> ├── _targets.R # Create with use_targets() and then modify by hand.
#> ├── R
#> │   ├── analyze_dataset.R     # Write completely by hand.
#> │   ├── simulate_dataset.R    # Write completely by hand.
#> │   └── simulate_trial.R      # Write completely by hand.


  1. Write functions that produce datasets, models, and summaries.
  2. Call use_targets() to generate code files for targets.
  3. Edit _targets.R by hand to define a test pipeline (start small).
  4. Use tar_manifest() and tar_visnetwork() to inspect the pipeline.
  5. Use tar_make() to run the pipeline.
  6. Inspect the results with tar_read() or tar_load().
  7. Scale up the pipeline from the small test case.
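For step 2, a single call scaffolds the project (use_targets() writes a template _targets.R to edit by hand):

library(targets)
use_targets()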

_targets.R file

library(targets)

tar_option_set(
  packages = c("dplyr", "tibble")
)

tar_source()

list(
  tar_target(
    name = simulations_scenario_1,
    command = simulate_trial(
      mean_response_drug = 2,
      sample_size = 700
    )
  ),
  tar_target(
    name = simulations_scenario_2,
    command = simulate_trial(
      mean_response_drug = 2,
      sample_size = 800
    )
  ),

  tar_target(
    name = results,
    command = bind_rows(
      simulations_scenario_1,
      simulations_scenario_2
    )
  )
)

Inspect the pipeline

tar_manifest()
#> # A tibble: 3 × 2
#>   name                   command                                
#>   <chr>                  <chr>                                  
#> 1 simulations_scenario_1 "simulate_trial(mean_response_drug = 2…
#> 2 simulations_scenario_2 "simulate_trial(mean_response_drug = 2…
#> 3 results                "bind_rows(simulations_scenario_1, sim…

tar_outdated()
#> [1] "simulations_scenario_1" "simulations_scenario_2"
#> [3] "results" 

tar_visnetwork()

Run the pipeline

tar_make()
#> ▶ start target simulations_scenario_1
#> ● built target simulations_scenario_1 [0.118 seconds]
#> ▶ start target simulations_scenario_2
#> ● built target simulations_scenario_2 [0.009 seconds]
#> ▶ start target results
#> ● built target results [0.001 seconds]
#> ▶ end pipeline [0.209 seconds]

Results in the data store

tar_read(results)
#> # A tibble: 2 × 1
#>   p_value
#>     <dbl>
#> 1  0.0236
#> 2  0.0716
fs::dir_tree()
#> ├── _targets
#> │   ├── meta
#> │   │   ├── meta
#> │   │   ├── process
#> │   │   └── progress
#> │   ├── objects
#> │   │   ├── results
#> │   │   ├── simulations_scenario_1
#> │   │   └── simulations_scenario_2
#> │   └── user
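Beyond tar_read(), the metadata under _targets/meta/ is queryable from R (a quick look, not shown on the slide):

tar_meta() # one row per target: hashes, runtimes, warnings, errors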

Change a command

# _targets.R file:
library(targets)
tar_option_set(
  packages = c("dplyr", "tibble")
)
tar_source()
list(
  tar_target(
    name = simulations_scenario_1,
    command = simulate_trial(
      mean_response_drug = 2,
      sample_size = 750 # changed
    )
  ),
  # ... remaining targets unchanged ...
)
tar_visnetwork()
tar_outdated()
#> [1] "simulations_scenario_1" "results" 

tar_make()
#> ▶ start target simulations_scenario_1
#> ● built target simulations_scenario_1 [0.057 seconds]
#> ✔ skip target simulations_scenario_2
#> ▶ start target results
#> ● built target results [0.001 seconds]
#> ▶ end pipeline [0.137 seconds]

Change a function

# R/analyze_dataset.R file:
analyze_dataset <- function(dataset) {
  dataset %>%
    # ... unchanged code ...
    mutate(
      p_value = pt( # Use Student t.
        q = `t value`,
        df = 18.37, # degrees of freedom
        lower.tail = FALSE
      )
    ) %>%
    pull(p_value) %>%
    tibble(p_value = .)
}
tar_visnetwork()
tar_outdated()
#> [1] "simulations_scenario_1" "simulations_scenario_2"
#> [3] "results" 

tar_make()
#> ▶ start target simulations_scenario_1
#> ● built target simulations_scenario_1 [0.055 seconds]
#> ▶ start target simulations_scenario_2
#> ● built target simulations_scenario_2 [0.009 seconds]
#> ▶ start target results
#> ● built target results [0 seconds]
#> ▶ end pipeline [0.144 seconds]

Evidence of reproducibility

tar_make()
#> ✔ skip target simulations_scenario_1
#> ✔ skip target simulations_scenario_2
#> ✔ skip target results
#> ✔ skip pipeline [0.04 seconds]

tar_outdated()
#> character(0)

tar_visnetwork()

Extending targets and scaling out

Cloud storage

# _targets.R file:
library(targets)

tar_option_set(
  packages = c("dplyr", "tibble"),
  repository = "aws",
  resources = tar_resources(
    aws = tar_resources_aws(
      bucket = "YOUR_BUCKET",
      prefix = "YOUR_PROJECT_NAME"
    )
  ),
  cue = tar_cue(file = FALSE) # optional
)

tar_source()

# More code below...
# R console on a different computer:
tar_meta_download()
tar_read(results)
#> # A tibble: 2 × 1
#>   p_value
#>     <dbl>
#> 1  0.0236
#> 2  0.0716
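One assumption worth making explicit (not shown on the slide): targets reaches S3 through the paws packages, so standard AWS credentials must be available on every machine, for example:

# Hypothetical placeholders; use a secure credential mechanism in practice.
Sys.setenv(
  AWS_ACCESS_KEY_ID = "YOUR_KEY",
  AWS_SECRET_ACCESS_KEY = "YOUR_SECRET",
  AWS_REGION = "us-east-1"
)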

Target factories

#' @title Example target factory in an R package.
#' @export
#' @description A target factory to analyze data.
#' @return A list of 3 target objects to:
#'   1. Track the file for changes,
#'   2. Read the data in the file, and
#'   3. Analyze the data.
#' @param file Character of length 1, path to the file.
target_factory <- function(file) {
  list(
    tar_target_raw("file", file, format = "file", deployment = "main"),
    tar_target_raw("data", quote(read_data(file)), format = "fst_tbl", deployment = "main"),
    tar_target_raw("model", quote(run_model(data)), format = "qs")
  )
}

Target factories simplify pipelines


# _targets.R
library(targets)
library(yourExamplePackage)
list(
  target_factory("data.csv")
)


# R console
tar_manifest()
#> # A tibble: 3 × 2
#>   name  command          
#>   <chr> <chr>            
#> 1 file  "\"data.csv\""   
#> 2 data  "read_data(file)"           
#> 3 model "run_model(data)"

Example: literate programming

  • Goal: do the hard computation upstream, then show the results in downstream literate programming documents using tar_read() and tar_load().
  • tar_quarto() and tar_render() render documents as targets in the pipeline.

Example report.qmd


---
title: "Results"
format: html
---

```{r}
library(targets)
tar_read(results)
```
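Inside the chunk, tar_load(results) is an equivalent idiom (usage note, not from the deck): it loads the target into the session as an object named results.

tar_load(results)
results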

Quarto in the _targets.R file

library(targets)
library(tarchetypes)

tar_option_set(
  packages = c("dplyr", "tibble")
)

tar_source()

list(
  tar_target(
    name = simulations_scenario_1,
    command = simulate_trial(
      mean_response_drug = 2,
      sample_size = 700
    )
  ),
  tar_target(
    name = simulations_scenario_2,
    command = simulate_trial(
      mean_response_drug = 2,
      sample_size = 800
    )
  ),

  tar_target(
    name = results,
    command = bind_rows(
      simulations_scenario_1,
      simulations_scenario_2
    )
  ),
  
  tar_quarto(report, "report.qmd")
)

Run the report in the pipeline

tar_visnetwork()
tar_outdated()
#> [1] "report"

tar_make()
#> ✔ skip target simulations_scenario_1
#> ✔ skip target simulations_scenario_2
#> ✔ skip target results
#> ▶ start target report
#> ● built target report [1.808 seconds]
#> ▶ end pipeline [1.91 seconds]
browseURL("report.html")

Scale out with tar_map_rep()

  • Need to compare different sample sizes and different efficacy scenarios.
  • Need thousands of simulation replications to estimate operating characteristics (power, Type I error).
  • The targets package supports flexible static branching and dynamic branching.
  • tarchetypes::tar_map_rep() is a target factory that uses static branching for scenarios and dynamic branching for replications within scenarios.

Scaled out _targets.R file

library(targets)
library(tarchetypes)
library(tibble)

tar_option_set(
  packages = c("dplyr", "tibble")
)

scenarios <- tribble(
  ~efficacy, ~mean_response_drug, ~sample_size,
  "strong",   2,                   700,
  "strong",   2,                   800,
  "null",     1,                   700,
  "null",     1,                   800
)

tar_source()
list(
  tar_map_rep(
    name = simulations,
    command = simulate_trial(
      mean_response_drug = mean_response_drug,
      sample_size = sample_size
    ),
    values = scenarios,
    batches = 25, # branch targets per scenario
    reps = 40, # reps per branch target
    names = all_of(c("efficacy", "sample_size")),
    columns = all_of(c("efficacy", "sample_size"))
  ),

  tar_target(
    name = results,
    command = simulations %>%
      group_by(efficacy, sample_size) %>%
      summarize(success = mean(p_value < 0.05))
  ),

  tar_quarto(report, "report.qmd")
)

Branching structure

tar_visnetwork()
  • Each square “pattern” target is a dynamic target with a simulation scenario.
  • Each simulation scenario has 25 dynamic branches.
  • Each dynamic branch runs 40 simulation replications.
  • In total: 4 scenarios × 25 branches × 40 reps = 4,000 simulated trials.

Run the scaled out pipeline

tar_make()
#> ▶ start target simulations_batch
#> ● built target simulations_batch [0.001 seconds]
#> ▶ start branch simulations_strong_700_81ce2d93
#> ● built branch simulations_strong_700_81ce2d93 [0.16 seconds]
#> ▶ start branch simulations_strong_700_4d5726ca
#> ● built branch simulations_strong_700_4d5726ca [0.147 seconds]
#> ...
#> ▶ start target simulations
#> ● built target simulations [0.003 seconds]
#> ▶ start target results
#> ● built target results [0.004 seconds]
#> ▶ start target report
#> ● built target report [1.742 seconds]
#> ▶ end pipeline [19.171 seconds]

Aggregated simulations

tar_read(simulations)
#> # A tibble: 4,000 × 7
#>    p_value efficacy sample_size tar_batch tar_rep    tar_seed tar_group
#>      <dbl> <chr>          <dbl>     <int>   <int>       <int>     <int>
#>  1 0.00436 strong           700         1       1  1242392391         3
#>  2 0.00383 strong           700         1       2  1005013755         3
#>  3 0.00468 strong           700         1       3   848869470         3
#>  4 0.00932 strong           700         1       4   407471040         3
#>  5 0.0131  strong           700         1       5  2136134101         3
#>  6 0.00887 strong           700         1       6 -1329548112         3
#>  7 0.00481 strong           700         1       7  1808542408         3
#>  8 0.0207  strong           700         1       8  1569885781         3
#>  9 0.00588 strong           700         1       9 -1140470982         3
#> 10 0.0709  strong           700         1      10    89908551         3
#> # ℹ 3,990 more rows
#> # ℹ Use `print(n = ...)` to see more rows

Results

tar_read(results)
#> # A tibble: 4 × 3
#>   efficacy sample_size success
#>   <chr>          <dbl>   <dbl>
#> 1 null             700   0.021
#> 2 null             800   0.027
#> 3 strong           700   0.875
#> 4 strong           800   0.906
  • A sample size of 800 achieves at least 90% power (0.906 in the strong scenario) while keeping Type I error below 5% (0.027 in the null scenario).
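A note on Monte Carlo accuracy (not on the slide): each scenario has 25 × 40 = 1,000 replications, so a success proportion near 0.9 carries a standard error of roughly 0.0095.

sqrt(0.9 * (1 - 0.9) / 1000)
#> [1] 0.009486833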

Problem

  • Real simulations can take hours to run.
  • Targets run sequentially by default.

Solution

  • crew: a framework for asynchronous and distributed computing.
  • Plugs into targets pipelines to run steps in parallel.

A distributed worker launcher for asynchronous tasks

Parallel/async tools before crew

What is crew?

  • mirai is a sleek and sophisticated task scheduler by Charlie Gao.
  • mirai uses NNG via Charlie’s nanonext package to achieve incredible speed and scale.
  • The purpose of crew is to extend mirai to the full variety of computing platforms that can run parallel workers in a local network.

Why crew?

  1. Fast
    • crew is fast since mirai is asynchronous and ultra-efficient.
  2. Frugal
    • crew launches new workers when the task load increases.
    • Workers can exit when the task load decreases (e.g. configurable maximum idle time).
  3. Friendly
    • A simple controller interface: push tasks, pop results (demonstrated on the next slide).

crew interface

# Set up the session.
library(crew)
controller <- crew_controller_local(workers = 2, seconds_idle = 10)
controller$start()

# Submit a task.
controller$push(name = "example", command = Sys.sleep(10))

# Collect the result.
while (is.null(result <- controller$pop())) Sys.sleep(0.001)
print(result)
#> # A tibble: 1 × 11
#>   name    command       result    seconds      seed error trace warnings ...
#>   <chr>   <chr>         <list>      <dbl>     <int> <chr> <chr> <chr>    ...
#> 1 example Sys.sleep(10) <lgl [1]>    10.0 319445426 NA    NA    NA       ...

# Terminate the controller, including the dispatcher process.
controller$terminate()
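Instead of polling pop() in a loop, the controller can block until tasks resolve (a sketch assuming crew's wait() method with its default arguments):

controller$wait() # block until all pushed tasks are done
result <- controller$pop()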

Push or pop tasks at any time

run_task <- function(...) {...}

index <- 0
n_tasks <- 10000
while (index < n_tasks || !controller$empty()) {
  if (index < n_tasks) {
    index <- index + 1
    controller$push(
      command = run_task(),
      data = list(run_task = run_task)
    )
  }
  # pop() returns NULL when no task has finished yet.
  result <- controller$pop()
  if (!is.null(result)) print(result)
}

crew.cluster & controller groups

# Local process controller
local <- crew_controller_local(name = "name_local", workers = 2)

# Sun Grid Engine controller
sge <- crew.cluster::crew_controller_sge(
  name = "name_sge",
  seconds_launch = 60,
  workers = 50,
  sge_cores = 4,
  sge_memory_gigabytes_required = 2L,
  seconds_idle = 30,
  seconds_exit = 2,
  sge_log_output = "logs/",
  script_lines = paste0("module load R/", getRversion()),
  verbose = TRUE
)

# Controller group
controller <- crew_controller_group(local, sge)

# Submit a task to whichever controller.
controller$push(command = Sys.sleep(10))

# Submit to a specific controller.
controller$push(command = Sys.sleep(10), controller = "name_sge")

Custom launcher plugins


How to write a launcher plugin

custom_launcher_class <- R6::R6Class(
  classname = "custom_launcher_class",
  inherit = crew::crew_class_launcher,
  public = list(
    launch_worker = function(call, launcher, worker, instance) {
      bin <- file.path(R.home("bin"), "R")
      processx::process$new(
        command = bin,
        args = c("-e", call),
        cleanup = FALSE
      )
    },
    terminate_worker = function(handle) {
      handle$kill()
    }
  )
)

Controller object creator

crew_controller_custom <- function(
  name = "custom controller name",
  workers = 1L,
  host = NULL,
  port = NULL,
  tls = crew::crew_tls(mode = "automatic"),
  seconds_interval = 0.5,
  seconds_timeout = 10,
  seconds_launch = 30,
  seconds_idle = Inf,
  seconds_wall = Inf,
  tasks_max = Inf,
  tasks_timers = 0L,
  reset_globals = TRUE,
  reset_packages = FALSE,
  reset_options = FALSE,
  garbage_collection = FALSE,
  launch_max = 5L
) {
  client <- crew::crew_client(
    name = name,
    workers = workers,
    host = host,
    port = port,
    tls = tls,
    seconds_interval = seconds_interval,
    seconds_timeout = seconds_timeout
  )
  launcher <- custom_launcher_class$new(
    name = name,
    seconds_launch = seconds_launch,
    seconds_idle = seconds_idle,
    seconds_wall = seconds_wall,
    tasks_max = tasks_max,
    tasks_timers = tasks_timers,
    reset_globals = reset_globals,
    reset_packages = reset_packages,
    reset_options = reset_options,
    garbage_collection = garbage_collection,
    launch_max = launch_max,
    tls = tls
  )
  controller <- crew::crew_controller(client = client, launcher = launcher)
  controller$validate()
  controller
}

Custom launcher plugin in action

# Create a controller with the launcher you defined.
controller <- crew_controller_custom(workers = 2, seconds_idle = 10)

# Start the controller, including the dispatcher process.
controller$start()

# Submit a task.
controller$push(name = "example", command = Sys.sleep(10))

# Collect the result.
while (is.null(result <- controller$pop())) Sys.sleep(0.001)
print(result)
#> # A tibble: 1 × 11
#>   name    command       result    seconds      seed error trace warnings ...
#>   <chr>   <chr>         <list>      <dbl>     <int> <chr> <chr> <chr>    ...
#> 1 example Sys.sleep(10) <lgl [1]>    10.0 319445426 NA    NA    NA       ...

# Terminate the controller, including the dispatcher process.
controller$terminate()

crew with Shiny

  • Thanks to Daniel Woodie for sparking an early version of this app.
  • Click the button to submit a 5-second task.
  • Submit as many tasks as you like.
  • A time stamp refreshes every second (thanks to asynchronicity).
  • Each task creates a random phyllotaxis using the aRtsy package.

crew with Shiny: UI

# app.R file:
library(crew)
library(shiny)
library(ggplot2)
library(aRtsy)

run_task <- function() {
  Sys.sleep(5)
  canvas_phyllotaxis(
    colors = colorPalette(name = "random", n = 3),
    iterations = 1000,
    angle = runif(n = 1, min = - 2 * pi, max = 2 * pi),
    size = 1,
    p = 1
  )
}

status_message <- function(n) {
  paste(format(Sys.time()), "tasks in progress:", n)
}

ui <- fluidPage(
  actionButton("task", "Submit a task (5 seconds)"),
  textOutput("status"),
  plotOutput("result")
)

crew with Shiny: server (1/2)

server <- function(input, output, session) {
  # reactive values and outputs
  reactive_result <- reactiveVal(ggplot())
  reactive_status <- reactiveVal("No task submitted yet.")
  reactive_poll <- reactiveVal(FALSE)
  output$result <- renderPlot(reactive_result(), height = 600, width = 600)
  output$status <- renderText(reactive_status())
  
  # crew controller
  controller <- crew_controller_local(workers = 4, seconds_idle = 10)
  controller$start()
  onStop(function() controller$terminate())

  # button to submit a task
  observeEvent(input$task, {
    controller$push(
      command = run_task(),
      data = list(run_task = run_task),
      packages = "aRtsy"
    )
    reactive_poll(TRUE)
  })

crew with Shiny: server (2/2)

  # event loop to collect finished tasks
  observe({
    req(reactive_poll())
    invalidateLater(millis = 100)
    result <- controller$pop()$result
    if (!is.null(result)) reactive_result(result[[1]])
    reactive_status(status_message(n = length(controller$tasks)))
    reactive_poll(controller$nonempty())
  })
}

shinyApp(ui = ui, server = server)

crew parallelizes targets pipelines

  • Implicit parallel computing.
  • Run conditionally independent targets in parallel as needed.
  • Automatically wait for upstream dependencies to finish.
  • The dependency graph governs these decisions.

Dependency graph

tar_visnetwork()
  • Each square “pattern” target is a dynamic target with a simulation scenario.
  • Each simulation scenario has 25 dynamic branches.
  • Each dynamic branch runs 40 simulation replications.

First target runs

Then simulations run in parallel

Then simulations aggregate

Then operating characteristics

Then the report

Pipeline done

How to use crew with targets

  1. Supply a crew controller in _targets.R.
tar_option_set(controller = crew_controller_local(...))
  2. Run the pipeline with tar_make() in the R console.
tar_make()

_targets.R with crew

library(crew)
library(targets)
library(tarchetypes)
library(tibble)

tar_option_set(
  packages = c("dplyr", "tibble"),
  controller = crew_controller_local(
    workers = 4,
    seconds_idle = 30
  )
)

scenarios <- tribble(
  ~efficacy, ~mean_response_drug, ~sample_size,
  "strong",   2,                   700,
  "strong",   2,                   800,
  "null",     1,                   700,
  "null",     1,                   800
)

tar_source()
list(
  tar_map_rep(
    name = simulations,
    command = simulate_trial(
      mean_response_drug = mean_response_drug,
      sample_size = sample_size
    ),
    values = scenarios,
    batches = 25, # branch targets per scenario
    reps = 40, # reps per branch target
    names = all_of(c("efficacy", "sample_size")),
    columns = all_of(c("efficacy", "sample_size"))
  ),

  tar_target(
    name = results,
    command = simulations %>%
      group_by(efficacy, sample_size) %>%
      summarize(success = mean(p_value < 0.05))
  ),

  tar_quarto(report, "report.qmd")
)

On a Sun Grid Engine (SGE) cluster

library(crew.cluster)
library(targets)
library(tarchetypes)
library(tibble)

tar_option_set(
  packages = c("dplyr", "tibble"),
  controller = crew_controller_sge(
    seconds_launch = 60,
    workers = 50,
    sge_cores = 4,
    sge_memory_gigabytes_required = 2L,
    seconds_idle = 30,
    sge_log_output = "logs/",
    script_lines = paste0(
      "module load R/",
      getRversion()
    )
  )
)

scenarios <- tribble(
  ~efficacy, ~mean_response_drug, ~sample_size,
  "strong",   2,                   700,
  "strong",   2,                   800,
  "null",     1,                   700,
  "null",     1,                   800
)

tar_source()
list(
  tar_map_rep(
    name = simulations,
    command = simulate_trial(
      mean_response_drug = mean_response_drug,
      sample_size = sample_size
    ),
    values = scenarios,
    batches = 25, # branch targets per scenario
    reps = 40, # reps per branch target
    names = all_of(c("efficacy", "sample_size")),
    columns = all_of(c("efficacy", "sample_size"))
  ),

  tar_target(
    name = results,
    command = simulations %>%
      group_by(efficacy, sample_size) %>%
      summarize(success = mean(p_value < 0.05))
  ),

  tar_quarto(report, "report.qmd")
)

Help wanted

Resources


Presentation

Tools

Special thanks