Standardize a tidy input dataset.
Usage
hb_data(
data,
response,
study,
study_reference,
group,
group_reference,
patient,
covariates
)
Arguments
- data
A tidy data frame or tibble with the data.
- response
Character of length 1, name of the column in data with the response/outcome variable. data[[response]] must be a continuous variable, and it should be the change from baseline of a clinical endpoint of interest, as opposed to just the raw response. Treatment differences are computed directly from this scale, so please supply change from baseline unless you are absolutely certain that treatment differences computed directly from this quantity are clinically meaningful.
- study
Character of length 1, name of the column in data with the study ID.
- study_reference
Atomic of length 1, element of the study column that indicates the current study. (The other studies are historical studies.)
- group
Character of length 1, name of the column in data with the group ID.
- group_reference
Atomic of length 1, element of the group column that indicates the control group. (The other groups may be treatment groups.)
- patient
Character of length 1, name of the column in data with the patient ID.
- covariates
Character vector of column names in data with the baseline covariates. These can be continuous, categorical, or binary. Regardless, historicalborrow derives the appropriate model matrix.
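To make the covariates argument more concrete, the sketch below shows how base R expands a mix of continuous, binary, and categorical columns into a numeric model matrix. It uses stats::model.matrix on toy data with made-up column names (age, smoker, region). This is only an illustration of the general idea and is not necessarily how historicalborrow builds its model matrix internally.

```r
# Toy data; the column names are hypothetical and not taken from the package.
toy <- data.frame(
  age = c(54, 61, 47, 58),                      # continuous covariate
  smoker = c(0, 1, 0, 1),                       # binary covariate
  region = factor(c("EU", "US", "ASIA", "EU"))  # categorical covariate
)

# Numeric columns pass through unchanged. The factor is expanded into
# indicator columns, one per non-baseline level.
x <- stats::model.matrix(~ age + smoker + region, data = toy)
colnames(x)
```

The categorical column contributes one indicator column per level other than the first, so a covariate with many levels can add many columns to the model.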
Value
A standardized tidy data frame with one row per patient and the following columns:
- response: continuous response/outcome variable. (Should be change from baseline of an outcome of interest.)
- study_label: human-readable label of the study.
- study: integer study index with the max index equal to the current study (at study_reference).
- group_label: human-readable group label (e.g. treatment arm name).
- group: integer group index with an index of 1 equal to the control group (at group_reference).
- patient_label: original patient ID.
- patient: integer patient index.
- covariate_*: baseline covariate columns.
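The indexing rules for study and group can be sketched in base R as follows. The helper index_levels() is hypothetical and not part of historicalborrow. It places the reference label last (for studies) or first (for groups) and converts labels to integer codes. Sorting the non-reference labels alphabetically is an assumption made for this sketch; the package may order them differently.

```r
# Hypothetical helper, not part of historicalborrow: put the reference label
# first or last among the levels, then convert labels to integer codes.
index_levels <- function(labels, reference, reference_last = FALSE) {
  # Assumption: non-reference labels are ordered alphabetically.
  others <- sort(setdiff(unique(labels), reference))
  levels <- if (reference_last) c(others, reference) else c(reference, others)
  as.integer(factor(labels, levels = levels))
}

study_label <- c("trial1", "trial2", "trial3", "trial1")
group_label <- c("arm2", "arm1", "arm1", "arm3")

# The current study ("trial1") gets the max index.
study <- index_levels(study_label, reference = "trial1", reference_last = TRUE)
# The control group ("arm1") gets index 1.
group <- index_levels(group_label, reference = "arm1")

data.frame(study_label, study, group_label, group)
```

Keeping the label columns next to the integer indexes, as in the returned data frame, lets you map an index in a summary table back to the original study or arm name.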
Details
Users do not normally need to call this function. It mainly serves to expose the indexing behavior of study and group levels to aid in interpreting summary tables.
Examples
data <- hb_sim_independent(n_continuous = 1, n_study = 2)$data
data <- dplyr::select(
  data,
  study,
  group,
  patient,
  response,
  tidyselect::everything()
)
colnames(data) <- c("trial", "arm", "subject", "change", "cov1", "cov2")
data$trial <- paste0("trial", data$trial)
data$arm <- paste0("arm", data$arm)
hb_data(
  data = data,
  response = "change",
  study = "trial",
  study_reference = "trial1",
  group = "arm",
  group_reference = "arm1",
  patient = "subject",
  covariates = c("cov1", "cov2")
)
#> # A tibble: 400 × 9
#> response study_label group_label patient_label study group patient
#> <dbl> <chr> <chr> <int> <int> <int> <int>
#> 1 1.91 trial2 arm1 101 1 1 101
#> 2 3.27 trial2 arm1 102 1 1 102
#> 3 3.04 trial2 arm1 103 1 1 103
#> 4 2.01 trial2 arm1 104 1 1 104
#> 5 1.61 trial2 arm1 105 1 1 105
#> 6 0.912 trial2 arm1 106 1 1 106
#> 7 0.885 trial2 arm1 107 1 1 107
#> 8 1.33 trial2 arm1 108 1 1 108
#> 9 1.04 trial2 arm1 109 1 1 109
#> 10 -1.81 trial2 arm1 110 1 1 110
#> # ℹ 390 more rows
#> # ℹ 2 more variables: covariate_cov1 <dbl>, covariate_cov2 <dbl>