crew is an all-inclusive wrapper around
mirai to manage workers and tasks from one place. However,
you can let mirai manage the tasks and just use
crew to manage workers. With crew’s
customizable launcher plugin system, along
with the pre-built plugins in crew.aws.batch and
crew.cluster, you can deploy your mirai tasks
to a wide range of computing environments.
How it works
First, create a crew controller with the “default” compute
profile and seconds_idle = Inf.1
library(crew)
controller <- crew_controller_local(profile = "default", seconds_idle = Inf)
Next, launch one or more workers.
controller$launch(n = 1)
Submit a mirai task normally.2
library(mirai)
task <- mirai(1 + 1, .compute = "default")
The task will start as soon as the worker connects to the controller. When the task completes, you can get the result with:
task$data
#> [1] 2
Below, a “completed” count greater than zero confirms that the task actually ran on the controller.3
controller$client$status()
#> connections cumulative awaiting executing completed
#>           1          1        0         0         1
To stop the workers, either close the local R session or terminate the controller.
controller$terminate()
Parallel functional programming
The pattern is the same with mirai-powered parallel
purrr. First, create the controller and launch the workers.
library(crew)
controller <- crew_controller_local(profile = "default", seconds_idle = Inf)
controller$launch(n = 4)
Then, use the controller’s compute profile in mirai’s parallel purrr functions.
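As a minimal sketch of that step, assuming purrr >= 1.1.0, whose in_parallel() dispatches tasks to the mirai daemons of the "default" compute profile (here, the crew workers launched above):

```r
# Sketch: assumes purrr >= 1.1.0 and that the crew workers launched above
# serve the "default" mirai compute profile. in_parallel() sends each call
# to those daemons instead of running it in the local session.
library(purrr)

results <- map(1:4, in_parallel(\(x) x + 1))
unlist(results)
#> [1] 2 3 4 5
```

As before, terminate the controller when the work is done so the workers shut down cleanly.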
Asynchronous parallel functional programming
mirai::mirai_map() schedules functional programming
tasks without blocking the R session. The pattern is analogous to the
purrr case.
controller <- crew_controller_local(profile = "default", seconds_idle = Inf)
controller$launch(n = 4)
tasks <- mirai_map(seq_len(4), \(x) Sys.sleep(10))
tasks
#> < mirai map [0/4] > # The tasks are still running.
When all the tasks finish, collect their results with tasks[].
Auto-scaling
crew can automatically scale workers in response to
demand from mirai tasks. To enable this, we configure the
controller differently:
controller <- crew_controller_local(
profile = "default",
seconds_idle = 30, # Workers will terminate after 30 seconds of idleness.
workers = 4 # No more than 4 workers will run at one time.
)
The autoscale() method runs an asynchronous
later loop that launches new workers in the background.
controller$autoscale()
later-based auto-scaling is not compatible with either of the
functional programming sections above, but it can accommodate individual
tasks.
task <- mirai(1 + 1)
# After waiting a few seconds:
task$data
#> [1] 2
To deactivate the auto-scaling loop:
controller$descale()
Caveats and limitations
- If you follow the patterns in this vignette, do not submit or
collect tasks directly through the controller (e.g. controller methods
push(), map(), walk(), pop(), or collect()). Those methods rely on the task counters in controller$client$status() (from mirai::info()), which increment with every task in the compute profile, regardless of how the task was submitted. If you submit any tasks outside the controller (e.g. through mirai::mirai()), then you must submit and collect all other tasks outside the controller as well.
- Due to the constraints of
later, the auto-scaling later loop is only compatible with individually-launched tasks at the top level of the call stack (outside function calls) or in Shiny apps. controller$autoscale() will not work with parallel purrr or mirai_map() unless those functions manually call later::run_now() to trigger crew’s auto-scaling.
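The first caveat above can be sketched as follows. This is a hypothetical illustration, assuming a controller with the "default" compute profile and at least one launched worker, as in the earlier sections:

```r
# Sketch: once any task goes through mirai directly, submit and collect
# *all* tasks through mirai -- never mix in controller$push()/pop(),
# because those methods would misread the shared task counters.
library(mirai)

task_a <- mirai(1 + 1, .compute = "default")
task_b <- mirai(2 + 2, .compute = "default")

# Collect through mirai as well, e.g. by blocking until each resolves:
call_mirai(task_a)$data
call_mirai(task_b)$data
```

The point is consistency: every task in the compute profile is submitted and collected the same way, so no component draws wrong conclusions from the counters in controller$client$status().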