In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The NNG-powered mirai R package is a sleek and sophisticated scheduler that efficiently processes these intense workloads. The crew package extends mirai with a unifying interface for third-party worker launchers. Inspiration also comes from the future, rrq, clustermq, and batchtools packages.
Installation
| Type | Source | Command |
|---|---|---|
| Release | CRAN | install.packages("crew") |
| Development | GitHub | remotes::install_github("wlandau/crew") |
| Development | R-universe | install.packages("crew", repos = "https://wlandau.r-universe.dev") |
Documentation
The documentation website at https://wlandau.github.io/crew/ includes a function reference and tutorial vignettes.
Risks
The crew package has unavoidable risks, and the user is responsible for safety, security, and computational resources. Please read the software license and the vignette about specific known risks.
Similar work
- mirai: a powerful R framework for asynchronous tasks built on NNG. The purpose of crew is to extend mirai to different computing platforms for distributed workers.
- rrq: a task queue for R based on Redis.
- rrqueue: predecessor of rrq.
- clustermq: sends R function calls as jobs to computing clusters.
- future: a unified interface for asynchronous evaluation of single tasks and map-reduce calls on a wide variety of backend technologies.
- batchtools: tools for computation on batch systems.
- targets: a Make-like pipeline tool for R.
- later: delayed evaluation of synchronous tasks.
- promises: minimally-invasive asynchronous programming for a small number of tasks within Shiny apps.
- callr: initiates R processes from other R processes.
- High-performance computing CRAN task view.
Thanks
The crew package incorporates insightful ideas from the following people.
- Charlie Gao created mirai and nanonext and graciously accommodated the complicated and demanding feature requests that made crew possible.
- Rich FitzJohn and Robert Ashton developed rrq.
- Gábor Csárdi developed callr and wrote an edifying blog post on implementing task queues.
- Kirill Müller created the workers prototype, an initial effort that led directly to the current implementation of crew. crew would not exist without Kirill’s insights about orchestration models for R processes.
- Henrik Bengtsson. Henrik’s future package ecosystem demonstrates the incredible power of a consistent R interface on top of a varying collection of high-performance computing technologies.
- Michael Schubert. Michael’s clustermq package supports efficient high-performance computing on traditional clusters, and it demonstrates the value of a central R6 object to manage an entire collection of persistent workers.
- David Kretch. The paws R package is a powerful interface to Amazon Web Services, and the documentation clearly communicates the capabilities and limitations of AWS to R users.
- Adam Banker co-authored paws with David Kretch.
- David Neuzerling. David’s lambdr package establishes a helpful pattern to submit and collect AWS Lambda jobs from R.
- Mark Edmondson. Mark maintains several R packages to interface with Google Cloud Platform, such as googleCloudStorageR and googleCloudRunner, and he started the conversation around helping targets submit jobs to Google Cloud Run.
- Joe Cheng sparked the integration of crew with promises.
Code of Conduct
Please note that the crew project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Citation
To cite package ‘crew’ in publications use:
Landau WM (2023). _crew: A Distributed Worker Launcher Framework_.
https://wlandau.github.io/crew/, https://github.com/wlandau/crew.
A BibTeX entry for LaTeX users is
@Manual{,
title = {crew: A Distributed Worker Launcher Framework},
author = {William Michael Landau},
year = {2023},
note = {https://wlandau.github.io/crew/, https://github.com/wlandau/crew},
}