This vignette describes a scoring method similar to that of Mogg and Bradley (1999): the difference in mean reaction times (RTs) between the probe-at-test and probe-at-control conditions, calculated over correct responses only, after removing RTs below 200 ms and above 520 ms. The method is applied to Visual Probe Task (VPT) data.
Load the included VPT dataset and inspect its documentation.
data("ds_vpt", package = "splithalfr")
?ds_vpt
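The columns used in this example are UserID (identifies participants), block_type (type of block; "assess" marks assessment blocks), patt (whether the probe was at test, "yes" or "no"), response (1 for a correct response), and rt (reaction time in milliseconds).
Select only trials from assessment blocks.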
ds_vpt <- subset(ds_vpt, block_type == "assess")
Writing a scoring method for splithalfr requires implementing two functions: a sets function, which describes which sets of data should be split into halves, and a score function, which calculates a score.
The sets function receives data from a single participant and returns a list with one dataset per condition. In this case, we will generate two data frames: one with data from trials with probe-at-test (patt == "yes") and one with data from trials with probe-at-control (patt == "no").
vpt_fn_sets <- function (ds) {
  return (list(
    # Probe-at-test
    patt_yes = subset(ds, patt == "yes"),
    # Probe-at-control
    patt_no = subset(ds, patt == "no")
  ))
}
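To see what the sets function returns, it can be applied to a small hand-made data frame. The sketch below assumes the column names used in this vignette; the `toy` data frame and its values are invented for illustration and are not taken from ds_vpt.

```r
# Toy data for one hypothetical participant (values are made up)
toy <- data.frame(
  UserID   = 1,
  patt     = c("yes", "no", "yes", "no", "yes"),
  response = c(1, 1, 0, 1, 1),
  rt       = c(350, 410, 300, 150, 600)
)

# Same sets function as in the vignette
vpt_fn_sets <- function(ds) {
  list(
    patt_yes = subset(ds, patt == "yes"),
    patt_no  = subset(ds, patt == "no")
  )
}

toy_sets <- vpt_fn_sets(toy)
names(toy_sets)           # "patt_yes" "patt_no"
nrow(toy_sets$patt_yes)   # 3 probe-at-test trials
nrow(toy_sets$patt_no)    # 2 probe-at-control trials
```

Note that the sets function does no filtering on accuracy or RT; it only groups trials by condition, so that each group can later be split into halves.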
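The score function receives these two data frames from a single participant and, for each, selects the correct responses (response == 1), removes RTs below 200 ms and above 520 ms, and calculates the mean of the remaining RTs.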
Finally, it returns the difference between the two mean RTs.
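vpt_fn_score <- function (sets) {
  # Probe-at-test: correct responses only, RTs within [200, 520] ms
  rt_yes <- subset(sets$patt_yes, response == 1)$rt
  rt_yes <- rt_yes[rt_yes >= 200 & rt_yes <= 520]
  # Probe-at-control: correct responses only, RTs within [200, 520] ms
  rt_no <- subset(sets$patt_no, response == 1)$rt
  rt_no <- rt_no[rt_no >= 200 & rt_no <= 520]
  # Mean RT probe-at-control minus mean RT probe-at-test
  return (mean(rt_no) - mean(rt_yes))
}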
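The arithmetic of the score function can be checked by hand on a tiny example. In the sketch below, `toy_sets` is a hypothetical list with invented values that mimics the output of the sets function for one participant.

```r
# Hypothetical sets for one participant (values invented for illustration)
toy_sets <- list(
  patt_yes = data.frame(response = c(1, 1, 0, 1), rt = c(300, 400, 350, 600)),
  patt_no  = data.frame(response = c(1, 1, 1),    rt = c(150, 380, 420))
)

# Same score function as in the vignette
vpt_fn_score <- function(sets) {
  rt_yes <- subset(sets$patt_yes, response == 1)$rt
  rt_yes <- rt_yes[rt_yes >= 200 & rt_yes <= 520]
  rt_no <- subset(sets$patt_no, response == 1)$rt
  rt_no <- rt_no[rt_no >= 200 & rt_no <= 520]
  mean(rt_no) - mean(rt_yes)
}

# patt_yes: correct RTs are 300, 400, 600; 600 is out of range, so the mean is 350
# patt_no:  correct RTs are 150, 380, 420; 150 is out of range, so the mean is 400
toy_score <- vpt_fn_score(toy_sets)
toy_score  # 400 - 350 = 50
```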
By combining the sets and score functions, a score for a single participant can be calculated. For instance, the score of UserID 1 can be calculated via the statement below.
vpt_fn_score(vpt_fn_sets(subset(ds_vpt, UserID == 1)))
To calculate scores for each participant, call sh_apply with four arguments: the dataset, the name of the column that identifies participants, the sets function, and the score function.
The sh_apply function returns a data frame with one row per participant and two columns: one that identifies participants (“UserID” in this example) and one named “score” that contains the output of the score function.
vpt_scores <- sh_apply(ds_vpt, "UserID", vpt_fn_sets, vpt_fn_score)
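Conceptually, scoring each participant amounts to applying the score function to the output of the sets function, once per participant. The base R sketch below illustrates that idea on invented data; it is not how sh_apply is implemented, and the `toy` data frame is hypothetical.

```r
# Toy data for two hypothetical participants (values are made up)
toy <- data.frame(
  UserID   = c(1, 1, 1, 1, 2, 2, 2, 2),
  patt     = c("yes", "yes", "no", "no", "yes", "yes", "no", "no"),
  response = 1,
  rt       = c(300, 400, 380, 420, 250, 350, 310, 330)
)

# Same sets and score functions as in the vignette
vpt_fn_sets <- function(ds) {
  list(patt_yes = subset(ds, patt == "yes"),
       patt_no  = subset(ds, patt == "no"))
}
vpt_fn_score <- function(sets) {
  rt_yes <- subset(sets$patt_yes, response == 1)$rt
  rt_yes <- rt_yes[rt_yes >= 200 & rt_yes <= 520]
  rt_no <- subset(sets$patt_no, response == 1)$rt
  rt_no <- rt_no[rt_no >= 200 & rt_no <= 520]
  mean(rt_no) - mean(rt_yes)
}

# One data frame per participant, then one score per data frame
per_participant <- split(toy, toy$UserID)
toy_scores <- data.frame(
  UserID = names(per_participant),
  score  = vapply(per_participant,
                  function(ds) vpt_fn_score(vpt_fn_sets(ds)),
                  numeric(1))
)
toy_scores$score  # 50 for participant 1, 20 for participant 2
```

In this sketch the UserID column holds character values, because it is built from the names of the list returned by split.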
It is recommended to check your scoring method by calculating the score of a representative participant via a different approach. For splithalfr tests, the author has done so via Excel. Note that in the example dataset, some participants (such as UserID 28) did not have any correct responses in the patt == "yes" condition with RTs within the range [200, 520]. For these participants, a score could not be calculated.
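The reason a score cannot be calculated in that situation is that the mean of an empty vector is not a number in R. The sketch below shows this with invented values and then lists the affected participants in a hypothetical scores table. Whether sh_apply stores such scores as NA or NaN is not stated here; is.na() flags both, so the check works either way.

```r
# With no RTs left after filtering, the mean of an empty vector is NaN
rt_yes <- c(150, 600, 700)
rt_yes <- rt_yes[rt_yes >= 200 & rt_yes <= 520]
length(rt_yes)       # 0
mean_yes <- mean(rt_yes)
is.nan(mean_yes)     # TRUE

# Any difference involving NaN is also NaN
toy_score <- 400 - mean_yes

# Hypothetical scores table with one participant without a score
toy_scores <- data.frame(UserID = c(1, 2, 3), score = c(50, toy_score, 20))
missing_ids <- toy_scores$UserID[is.na(toy_scores$score)]
missing_ids          # 2
```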
To calculate split-half scores for each participant, call sh_apply with an additional split_count argument, which specifies how many splits should be calculated. For each participant and split, the splithalfr will randomly divide the dataset of each element of sets into two halves that differ at most by one in size. When called with a split_count argument that is higher than zero, sh_apply returns a data frame with the following columns:
Since for some participants a score could not be calculated, the split scores are missing for these participants as well.
vpt_splits <- sh_apply(ds_vpt, "UserID", vpt_fn_sets, vpt_fn_score, 1000)
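The random division of one set into two halves that differ in size by at most one can be sketched in base R as follows. This is an illustration of the idea only, not the package's implementation; `split_rows` is a hypothetical helper and the data are invented.

```r
set.seed(1)  # for a reproducible illustration

# Hypothetical helper: randomly split the rows of ds into two halves
# whose sizes differ by at most one
split_rows <- function(ds) {
  n <- nrow(ds)
  shuffled <- sample(seq_len(n))  # random permutation of the row indices
  in_first <- seq_len(n) %in% shuffled[seq_len(floor(n / 2))]
  list(half_1 = ds[in_first, , drop = FALSE],
       half_2 = ds[!in_first, , drop = FALSE])
}

# One set with an odd number of trials (values are made up)
toy_set <- data.frame(response = 1,
                      rt = c(310, 320, 330, 340, 350, 360, 370))
halves <- split_rows(toy_set)
nrow(halves$half_1)  # 3
nrow(halves$half_2)  # 4
```

Applying such a split to every element of sets, and then scoring each half with the score function, would yield one pair of scores per participant and split.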
Next, the output of sh_apply can be analyzed in order to estimate reliability. The package provides functions that calculate mean Spearman-Brown (mean_sb_by_split) and Flanagan-Rulon (mean_fr_by_split) coefficients. If these functions encounter missing values in the data provided to them, they give a warning and then remove the missing data pairwise before calculating reliability.
# Spearman-Brown
mean_sb_by_split(vpt_splits)
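For a single split, the Spearman-Brown coefficient is obtained from the correlation r between the two half scores as 2r / (1 + r). The sketch below computes it in base R on invented half scores, removing missing data pairwise; `spearman_brown` is a hypothetical helper, and the details of mean_sb_by_split may differ.

```r
# Hypothetical half scores for four participants in a single split;
# the last participant has a missing score in the first half
score_1 <- c(1, 2, 3, NA)
score_2 <- c(1, 3, 2, 5)

# Hypothetical helper: Spearman-Brown corrected correlation,
# using only participants with both half scores present
spearman_brown <- function(x, y) {
  r <- cor(x, y, use = "pairwise.complete.obs")
  2 * r / (1 + r)
}

r_halves <- cor(score_1, score_2, use = "pairwise.complete.obs")
sb <- spearman_brown(score_1, score_2)
r_halves  # 0.5
sb        # 2 * 0.5 / 1.5, about 0.667
```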