Debugging targets pipelines
Will Landau
Why targets?
- Manage computationally demanding work in R:
  - Bayesian data analysis: JAGS, Stan, NIMBLE, greta
  - Deep learning: keras, tensorflow, torch
  - Machine learning: tidymodels
  - PK/PD: nlmixr, mrgsolve
  - Clinical trial simulation: rpact, Mediana
  - Statistical genomics
  - Social network analysis
  - Permutation tests
  - Database queries: DBI
  - Big data ETL
Typical notebook-based project
Messy reality: managing data
Messy reality: managing change
targets
- Designed for R.
- Encourages good programming habits.
- Automatic dependency detection.
- Behind-the-scenes data management.
- Distributed computing.
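To make automatic dependency detection concrete, here is a minimal sketch of a pipeline script. The file name `_targets.R` is the targets default; the functions `simulate_data()` and `fit_model()` and the target names are hypothetical and only serve as illustration.

```r
# _targets.R: a hypothetical two-target pipeline.
library(targets)

# Hypothetical helper functions. In a real project these would
# usually live in separate files under R/.
simulate_data <- function(n) {
  data.frame(x = seq_len(n), y = 2 * seq_len(n) + rnorm(n))
}

fit_model <- function(data) {
  lm(y ~ x, data = data)
}

# The pipeline is a list of targets. The command for `model` mentions
# the symbol `data`, so targets infers that `model` depends on `data`
# without any explicit declaration.
list(
  tar_target(data, simulate_data(n = 10)),
  tar_target(model, fit_model(data))
)
```

Running `tar_make()` in the project directory would build `data` first and then `model`, and later runs skip targets whose code and upstream data have not changed.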
Resources
Extensions to {targets}
- Ecosystem of packages to support literate programming, Bayesian data analysis, etc. in targets.
- Compatible with other tools such as renv, Quarto, R Markdown, Shiny, pins, and vetiver.
Debugging: challenges
- R code is easiest to debug in the interactive console.
- To ensure reproducibility and to manage heavy computation, a pipeline is automated and non-interactive.
- External callr::r() process
- Data management
- Environment management
- High-performance computing
- Error handling
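The external-process point can be observed directly. The sketch below, which assumes a throwaway project in a temporary directory and a hypothetical target named `worker_pid`, records the process ID that runs the target: once with the default `tar_make()` and once with `callr_function = NULL`, which runs the pipeline in the current R session.

```r
library(targets)

# Use a throwaway project directory (an assumption for this demo).
project <- file.path(tempdir(), "callr_demo")
dir.create(project, showWarnings = FALSE)
old_wd <- setwd(project)

# Write a hypothetical one-target pipeline to _targets.R.
tar_script({
  library(targets)
  list(
    # Rerun on every tar_make() so both runs record a fresh process ID.
    tar_target(worker_pid, Sys.getpid(), cue = tar_cue(mode = "always"))
  )
}, ask = FALSE)

# Default: the pipeline runs in a separate callr::r() process.
tar_make()
external_pid <- tar_read(worker_pid)

# callr_function = NULL: the pipeline runs in the current session.
tar_make(callr_function = NULL)
local_pid <- tar_read(worker_pid)

setwd(old_wd)
```

Comparing `external_pid` and `local_pid` against `Sys.getpid()` shows which run happened outside the interactive session, which is why interactive tools such as `browser()` need `callr_function = NULL`.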
Debugging: techniques
- Finish the pipeline anyway.
- Inspect error messages.
- Debug functions.
- Check for system issues.
- Pause the pipeline with browser().
- Pause the pipeline with the targets debug option.
- Save a targets workspace.
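Several of these techniques can be combined in one small experiment. The following is a sketch under stated assumptions, not a recommended setup: the project lives in a temporary directory, and the function `risky_mean()` and the targets `good` and `bad` are hypothetical, with `bad` failing on purpose.

```r
library(targets)

# Use a throwaway project directory (an assumption for this demo).
project <- file.path(tempdir(), "debug_demo")
dir.create(project, showWarnings = FALSE)
old_wd <- setwd(project)

tar_script({
  library(targets)
  # Finish the pipeline anyway: an errored target returns NULL instead
  # of stopping the run. Also request a workspace for targets that error.
  tar_option_set(error = "null", workspace_on_error = TRUE)

  # Hypothetical function with a deliberate failure mode.
  risky_mean <- function(x) {
    if (anyNA(x)) stop("x contains missing values")
    mean(x)
  }

  list(
    tar_target(good, risky_mean(c(1, 2, 3))),
    tar_target(bad, risky_mean(c(1, NA, 3)))
  )
}, ask = FALSE)

tar_make()

# Inspect error messages recorded in the pipeline metadata.
errors <- tar_meta(fields = error, complete_only = TRUE)
print(errors)

# List the workspaces that were saved for errored targets.
print(tar_workspaces())

# In an interactive session, tar_workspace(bad) would load the saved
# dependencies of `bad` so the failing command can be rerun by hand.

setwd(old_wd)
```

To pause inside a target instead, one option is to set `tar_option_set(debug = "bad")` in `_targets.R` (or place `browser()` in the function) and then call `tar_make(callr_function = NULL)` so the pipeline runs in the interactive session.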