This is the practical part of a tutorial manuscript for this package, which you can find in full on PsyArXiv.
Using the codebook package locally in RStudio
knit_by_pkgdown <- !is.null(knitr::opts_chunk$get("fig.retina"))
knitr::opts_chunk$set(
warning = TRUE, # show warnings during codebook generation
message = TRUE, # show messages during codebook generation
error = TRUE, # do not interrupt codebook generation in case of errors,
# usually better for debugging
echo = TRUE # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())
pander::panderOptions("table.split.table", Inf)
So, let us load some data. In this tutorial, I will walk you through using the “bfi” dataset made available in the psych package (Revelle et al. 2010; Revelle et al. 2016; Goldberg et al. 1999). The bfi dataset is already very well documented in the psych R package, but using the codebook package, we can add automatically computed reliabilities, graphs, and machine-readable metadata to the mix. The dataset is already available in R, but this will not usually be the case for your own data. Therefore, I have uploaded it to the Open Science Framework, where you can also find many other publicly available datasets. A new package in R, rio (Chan and Leeper 2018), makes loading data from websites as easy as loading local data. You can import the dataset directly from the Open Science Framework by writing
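A call along the following lines should do it; the OSF URL is only a placeholder for the download link of the file you uploaded, and the format has to be stated explicitly because OSF download links carry no file extension.
codebook_data <- rio::import("https://osf.io/xxxxx/download", format = "csv")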
R Markdown documents have to be reproducible and self-contained. Therefore, it is not enough for a dataset to be loaded locally; you have to load the dataset at the beginning of the document. We can also use the document interactively, although this will not work so well for the codebook package. To see how this works, execute the line you just added by pressing Command + Enter (if you are on a Mac) or Ctrl + Enter (on other platforms).
Did it work? RStudio has a nice data viewer you can use to check. In the environment tab on the top right, you should see “codebook_data”. Click that row. A spreadsheet view of the dataset opens in RStudio. As you can see, it is pretty uninformative. Just long columns of numbers with variable names like A4. Are we talking aggressiveness, agreeableness, or the German industrial norm for paper size? The lack of useful metadata is palpable. What can the codebook package do with this? Click the Knit button again. This time, it will take longer. Once the result is shown in the viewer tab, scroll through it. You can see that a few warnings let us know that the package saw items that might form part of a scale, but there was no aggregated scale. You will also see graphs of the distribution for each item and summary statistics.
The last codebook we generated could already be useful if the variables had meaningful names and self-explanatory values. Yet, this is not often the case. What we need is more metadata: labels for variables and values, a dataset description, and so on. The codebook package can use metadata that are stored in R attributes. So, what are attributes and how do metadata get there? Attributes in R are most commonly used to store the type of a variable. A datetime in R is just a number with two attributes (a time zone and a class). However, these attributes can just as easily store other metadata. The Hmisc (Harrell, 2018), haven (Wickham & Miller, 2018), and rio (Chan & Leeper, 2018) packages, for example, use them to store labels. The haven and rio packages set these attributes when importing data from SPSS or Stata files. However, it is also easily possible to add metadata ourselves:
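For example, we could attach a label attribute to a single item by hand. This is only a sketch; the wording for the C5 item, “Waste my time.”, is taken from the data dictionary we import below.
attr(codebook_data$C5, "label") <- "Waste my time."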
Here, we assigned a new label to a variable. Because it is inconvenient to write the above repeatedly, the labelled package (Larmarange, 2018) adds a few convenience functions. Load the labelled package by writing the following in your codebook.rmd.
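library(labelled) # provides var_label() and val_labels()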
Now, let us label the C5 item.
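Using the var_label function (again with the C5 item wording from the data dictionary), this becomes a one-liner.
var_label(codebook_data$C5) <- "Waste my time."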
We can also label values in this manner:
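For instance, we might label only the endpoints of the response scale; you do not have to label every value. The endpoint labels below are an assumption based on the Likert labels used later in this tutorial.
val_labels(codebook_data$C5) <- c("Very Inaccurate" = 1, "Very Accurate" = 6)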
Write these labelling commands after loading the dataset and click “Knit” again. As you can now see in the viewer pane, the C5 variable has gained a label at the top, and the lowest and highest values on the X axis are now labelled too. If the prospect of adding such labels for every single variable in this way seems tedious, do not worry. Many researchers already have a codebook in the form of a spreadsheet and want to import this. The bfi dataset in the psych package is a good example of this, because it comes with a tabular data dictionary. On the line after loading the bfi data, type the following to import the data dictionary.
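Again, the URL below is only a placeholder for the download link of the dictionary file you are importing.
dict <- rio::import("https://osf.io/yyyyy/download", format = "csv")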
To see what you just loaded, click the “dict” row in the environment tab in the top right panel. As you can see, the dictionary has information on the constructs on which this item loads and on the direction with which it should load on the construct. Let us now make these metadata usable by the codebook package. We will often need to slightly reshape data to help us do this. To make this easier, we will use the suite of packages called the tidyverse. Load them by typing the following.
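library(tidyverse) # loads dplyr (select, filter, pull) and the %>% pipe, among others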
Now, we want to use the variable labels that are already in the dictionary. Because we want to label many variables at once, we need a list of variable labels. Instead of assigning one label to one variable as above, we assign many labels to the whole dataset from a named list. Here, each element of the list is one item that we want to label.
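Written out by hand, such a named list might look like this (a sketch using two item wordings from the dictionary; in practice you would have to list all items).
var_label(codebook_data) <- list(
  C5 = "Waste my time.",
  C1 = "Am exacting in my work."
)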
But we already have a list of variables and labels in our data dictionary that we can use, so we do not have to tediously write out this list. We have to slightly reshape it though, because right now, it is in the form of a rectangular data frame, not a named list. To do so, we use a convenience function from the codebook package called dict_to_list. This function expects to receive a data frame with two columns: the first should be the variable names, the second the variable labels. To select these columns, we use the select function from the tidyverse packages. We also use a special operator, called a pipe, which looks like this: %>%. It allows us to write R code from left to right, passing along the thing we are working on. This allows us to read the code below almost like an English sentence. We take the dict dataset, then we select the variable and label columns, then we use the dict_to_list function. We assign the result of this operation to become the variable labels of codebook_data. Add the following line after importing the dictionary.
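Assuming the dictionary stores the variable names and item wordings in columns called variable and label (adjust these if your dictionary uses different column names), the line looks like this.
var_label(codebook_data) <- dict %>%
  select(variable, label) %>%
  dict_to_list()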
Did it work? Click codebook_data in the Environment tab again. You should see the variable labels below the variable names now. You can also click Knit again, and you will see that your codebook now contains the variable labels. They are both part of the plots and part of the codebook table at the end. You cannot see this, but they are also part of the metadata that can be found using, for example, Google Dataset Search.
So far, so good. But, you may have noticed that education is simply a number. Are these years of education? The average is 3, so that does not seem likely. No, these are actually levels of education. In the dict data frame, you can see that there are value labels for the levels of this variable. However, these levels of education are abbreviated, and you can probably imagine that it would be hard for an automated program to understand how these map to the values in our dataset. So, let us try to do a little better. We again use a function from the labelled package, but this time it is not var_label, but val_labels. And unlike var_label, it expects not just one label, but a named vector, with a name for each value that you want to label. You do not need to label all values. Named vectors are created using the c() function. Add the following lines right after the last one.
val_labels(codebook_data$gender) <- c("male" = 1, "female" = 2)
val_labels(codebook_data$education) <- c("in high school" = 1,
"finished high school" = 2,
"some college" = 3,
"college graduate" = 4,
"graduate degree" = 5)
Did it work? Click the Knit button to find out. If it worked, the bars in the graphs for education and gender should now be labelled.
Now, on to the many Likert items. They all have the same value labels. We could assign them the same way as we did for gender and education, tediously repeating the lines for each variable, but the lazy programmer prefers a function for such situations. Creating a function is actually really simple. Just pick a name, ideally one you will remember it by – I went with add_likert_labels – and assign to it the keyword function, followed by two different kinds of parentheses. The first, normal parentheses, surround the variable x. The x here is a placeholder for the many variables we plan to apply this function to in the next step. Next, we open curly braces to show that we now intend to write out what we plan to do with said variable x. Then, inside the curly braces, we use the val_labels function from above and assign a named vector.
add_likert_labels <- function(x) {
val_labels(x) <- c("Very Inaccurate" = 1,
"Moderately Inaccurate" = 2,
"Slightly Inaccurate" = 3,
"Slightly Accurate" = 4,
"Moderately Accurate" = 5,
"Very Accurate" = 6)
x # return the relabelled variable
}
A function is just a tool; we have not used it yet. We want to use it only on the Likert items, so we now need a list of them. An easy way is to subset the dict data frame to take only those variables that are part of the Big6. To do so, we use the filter and pull functions from the dplyr package.
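Assuming the dictionary stores the construct in a column called Big6, with an empty entry for non-scale variables such as age or education, we can pull out the relevant item names like so.
likert_items <- dict %>%
  filter(Big6 != "") %>%
  pull(variable)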
Now, we want to apply our new function to these items. We again use a function from the dplyr package, called mutate_at. It expects a list of variables and a function to apply to each. We have both! So, we now add value labels to all Likert items in codebook_data.
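codebook_data <- codebook_data %>%
  mutate_at(likert_items, add_likert_labels)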
Did it work? Click Knit again. Now, all items should also have value labels. However, this is pretty repetitive. Can we group the items by the factor that they are supposed to load on? And while we are at it, how can the metadata on keying that is in our dictionary become part of the dataset?
The codebook package relies on a simple convention to be able to summarise psychological scales, such as the Big Five dimension extraversion, which are aggregates across several items. You probably aggregated scales before. Below, we assign a new variable, extraversion, to the result of selecting all extraversion items in the data and passing them to the aggregate_and_document_scale function. This function takes the mean of its inputs, and assigns a label to the result, so that we can still tell which variables it is an aggregate of.
First, we need to reverse items that load negatively on the Extraversion factor, such as “Don’t talk a lot.” To do so, I suggest following a simple convention already when coming up with names for your items, namely the format scale_numberR (e.g., bfi_extra_1R for a reverse-coded extraversion item, bfi_neuro_2 for a neuroticism item). That way, the analyst always knows how an item relates to a scale. In the data we just imported, this information is encoded in our data dictionary. Let us rename the reverse-coded items, so that we cannot forget about it. Start by grabbing all items with a negative keying from our dictionary with the following lines.
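Assuming the dictionary codes the keying direction in a column called Keying, with -1 marking reverse-keyed items, the lines could read as follows.
reversed_items <- dict %>%
  filter(Keying == -1) %>%
  pull(variable)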
You can see in your environment tab that the names A1, C4, C5, and so on, are now stored in the reversed_items vector. We can now refer to this vector using the rename_at function, which applies a function to all variables we list. Here, we use the super simple function add_R, which does exactly what its name says: it appends an R to a variable name.
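codebook_data <- codebook_data %>%
  rename_at(reversed_items, add_R)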
Click codebook_data in the environment tab, and you will see that some variables have been renamed to A1R, C4R, C5R, and so on. Now, it could be ambiguous whether the suffix R means “should be reversed before aggregation” or “has already been reversed”. With the help of metadata in the form of labelled values, there is no potential for confusion. We can reverse the underlying values but keep the value labels intact. So, if somebody responded “Very Accurate”, that stays the case, but the underlying value will switch from 6 to 1 for a reversed item. Because you usually import data where this has not yet been done, the codebook package makes it easy to bring the data into this shape. A command using dplyr functions and the reverse_labelled_values function can easily remedy this.
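One way to write such a command (a sketch that reverses every variable whose name ends in a digit followed by R):
codebook_data <- codebook_data %>%
  mutate_at(vars(matches("\\dR$")), reverse_labelled_values)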
All this statement does is find variable names that end in a number (\\d is the regular expression code for a digit) followed by R, and reverse their values.
With the next lines, we assign a new variable, extraversion, to the result of selecting all extraversion items in the data and passing them to the aggregate_and_document_scale function.
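codebook_data$extraversion <- codebook_data %>%
  select(E1R:E5) %>%
  aggregate_and_document_scale()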
Try knitting! Adding further scales is really easy: just repeat the above lines while changing the name of the scale and the items.
codebook_data$openness <- codebook_data %>%
select(O1:O5R) %>%
aggregate_and_document_scale()
codebook_data$conscientiousness <- codebook_data %>%
select(C1:C5R) %>%
aggregate_and_document_scale()
codebook_data$agreeableness <- codebook_data %>%
select(A1R:A5) %>%
aggregate_and_document_scale()
codebook_data$neuroticism <- codebook_data %>%
select(N1R:N5R) %>%
aggregate_and_document_scale()
Adding scales that integrate smaller scales is easy too. The data dictionary mentions the Giant Three. Let us add one, Plasticity, which subsumes Extraversion and Openness.
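Keeping the lowercase naming used for the other scales, a sketch for Plasticity could look like this; it simply selects the extraversion and openness items together.
codebook_data$plasticity <- codebook_data %>%
  select(E1R:E5, O1:O5R) %>%
  aggregate_and_document_scale()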
By the way, writing E1R:E5 only works if the items are all in order in your dataset. But maybe you mixed items across constructs; then you will need a different way to select them. You can simply list all items, writing select(E1R, E2R, E3, E4, E5). This can get tedious when listing many items. Another solution is to write select(starts_with("E")). This is pretty elegant, but here it will not work, because we would try to average education with the extraversion items (both start with E). This is why we should try to give items descriptive stems such as extraversion_ or bfik_extra. With longer stems like these, confusion is unlikely and we can refer to groups of items by their stems. If you have named your items in such a sparse fashion anyway, another solution is to use a regular expression, as we saw above. In our scenario, select(matches("^E\\dR?$")) would work.
Lastly, you might want to sign your work and add a few descriptive words about the entire dataset. You could simply edit the R Markdown document to add a description, but unfortunately, this information would not become part of the machine-readable metadata. Metadata (or attributes) of the dataset are a lot less persistent than metadata about variables. Hence, you should add them right before calling the codebook function. Enter the following lines above the call to codebook(codebook_data). Adding metadata about the dataset is very simple: we simply wrap the metadata function around codebook_data and assign a value to a field. The fields name and description are required. If you do not edit them, they will be automatically generated based on the data frame name and its contents. To overwrite them, use the following commands.
metadata(codebook_data)$name <- "25 Personality items representing 5 factors"
metadata(codebook_data)$description <- "25 personality self report items taken from the International Personality Item Pool (ipip.ori.org)[...]"
It’s also good practice to give datasets a canonical identifier. This way, if a dataset is described in multiple locations, we know it’s the same dataset. Here, I could have simply used the URL of the R package from which I took the data, but URLs can change. Instead, I generated a persistent digital object identifier (DOI) on the OSF and specified it here.
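The field to set is called identifier; the DOI below is a placeholder that you would replace with the identifier minted for your own OSF project.
metadata(codebook_data)$identifier <- "https://doi.org/10.17605/OSF.IO/XXXXX" # placeholder DOI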
Of course, it’s also a good idea to let others know who they can contact about the dataset, how to cite it, and where to find more information. That’s why I set the attributes creator, citation, and url below.
metadata(codebook_data)$creator <- "William Revelle"
metadata(codebook_data)$citation <- "Revelle, W., Wilt, J., and Rosenthal, A. (2010) Individual Differences in Cognition: New Methods for examining the Personality-Cognition Link In Gruszka, A. and Matthews, G. and Szymura, B. (Eds.) Handbook of Individual Differences in Cognition: Attention, Memory and Executive Control, Springer."
metadata(codebook_data)$url <- "https://CRAN.R-project.org/package=psych"
Lastly, it is useful to note when and where the data was collected, and when it was published. Ideally, you would make more specific information available here, but this is all I know about the BFI dataset.
metadata(codebook_data)$datePublished <- "2010-01-01"
metadata(codebook_data)$temporalCoverage <- "Spring 2010"
metadata(codebook_data)$spatialCoverage <- "Online"
These attributes are documented in more depth on https://schema.org/Dataset. You can also add attributes that are not documented there, but they will not become part of the machine-readable metadata. Click Knit again. In the viewer tab, you can see that the metadata section of the codebook has been populated with your additions.
Having added all the variable-level metadata, you might want to reuse the marked-up data elsewhere, or share it with collaborators or the public. You can most easily export it using the rio package (Chan and Leeper 2018), which permits embedding the variable metadata in the dataset file for those formats that support it. The only way to keep all metadata in one file is by staying in R:
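rio::export(codebook_data, "bfi.rds") # the file name is up to you; the .rds format keeps all attributes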
The variable-level metadata can also be transferred to SPSS and Stata files. Please note that this export is based on reverse-engineering the SPSS and Stata file structure, so the resulting files should be tested before sharing.
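For example (again, the file names are just suggestions):
rio::export(codebook_data, "bfi.sav") # SPSS
rio::export(codebook_data, "bfi.dta") # Stata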
Now, you might want to share your codebook with others. In the project folder you created in the beginning, there is now a codebook.html file. You can email it to collaborators, or you can upload it to the OSF file storage. However, if you want Google Dataset Search to index your dataset, this is not good enough. The OSF will not render your HTML files for security reasons, and Google will not index the content of your emails (at least not publicly). For those who are familiar with GitHub or who already have their own website, uploading the HTML file to their own website should be easy. For those who want to learn about GitHub Pages, there are several guides available.

The very simplest way to publish the HTML for the codebook that I could find is the following. Rename codebook.html to index.html. Sign up on netlify.com. After creating an account, you can drag and drop the folder containing the codebook to the Netlify web page (make sure it does not contain anything you do not want to share, such as the raw data). Netlify will upload the files and create a random URL like estranged-armadillo.netlify.com. You can change this to say something more meaningful like bfi-study.netlify.com in the settings. Now, visit the URL to see that everything is working.

The last step is to publicly share a link to the codebook, so that search engines can find out that it exists. You could tweet the link with the hashtag #codebook, and it also makes sense to add a link from the repository where you are sharing the raw data or the related study’s supplementary material. I added a link to the bfi-study codebook on the OSF (https://osf.io/k39bg/), where I had also shared the data. That was it! Depending on the speed of the search engine crawler, the dataset should be findable on Google Dataset Search within 3 days to 3 weeks.
To see the resulting codebook, open the “Example with manual labelling” vignette.
Dataset name: 25 Personality items representing 5 factors
25 personality self report items taken from the International Personality Item Pool (ipip.ori.org)[…]
Metadata for search engines
Date published: 2010-01-01
Creator: William Revelle
gender
0 missing values.
name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
gender | gender | integer | 1. male, 2. female | 0 | 2800 | 2800 | 1.67 | 0.47 | 1 | 1 | 2 | 2 | 2 | ▃▁▁▁▁▁▁▇ |
education
223 missing values.
name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
education | education | integer | 1. in high school, 2. finished high school, 3. some college, 4. college graduate, 5. graduate degree | 223 | 2577 | 2800 | 3.19 | 1.11 | 1 | 3 | 3 | 4 | 5 | ▂▂▁▇▁▂▁▃ |
age
0 missing values.
name | label | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
age | age | integer | 0 | 2800 | 2800 | 28.78 | 11.13 | 3 | 20 | 26 | 35 | 86 | ▁▇▆▃▂▁▁▁ |
extraversion
Reliability: ω_ordinal [95% CI] = 0.8 [0.78; 0.81].
Missing: 87.
Index | Estimate |
---|---|
Omega | 0.7673 |
Omega Psych Tot | 0.795 |
Omega Psych H | 0.6522 |
Omega Ordinal | 0.795 |
Cronbach Alpha | 0.7609 |
Greatest Lower Bound | 0.7978 |
Alpha Ordinal | 0.7929 |
Positive correlations: 10 out of 10 (100%)
Detailed output
##
## Information about this analysis:
##
## Dataframe: res$dat
## Items: E1R, E2R, E3, E4, E5
## Observations: 2713
## Positive correlations: 10 out of 10 (100%)
##
## Estimates assuming interval level:
##
## Omega (total): 0.77
## Omega (hierarchical): 0.65
## Revelle's omega (total): 0.8
## Greatest Lower Bound (GLB): 0.8
## Coefficient H: 0.78
## Cronbach's alpha: 0.76
## Confidence intervals:
## Omega (total): [0.75, 0.78]
## Cronbach's alpha: [0.75, 0.78]
##
## Estimates assuming ordinal level:
##
## Ordinal Omega (total): 0.8
## Ordinal Omega (hierarch.): 0.79
## Ordinal Cronbach's alpha: 0.79
## Confidence intervals:
## Ordinal Omega (total): [0.78, 0.81]
## Ordinal Cronbach's alpha: [0.78, 0.81]
##
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.
##
## Eigen values: 2.565, 0.768, 0.643, 0.561, 0.464
## Loadings:
## PC1
## E1R 0.700
## E2R 0.780
## E3 0.691
## E4 0.758
## E5 0.644
##
## PC1
## SS loadings 2.565
## Proportion Var 0.513
##
## vars n mean sd median trimmed mad min max range skew kurtosis
## E1R 1 2713 4.03 1.63 4 4.14 1.48 1 6 5 -0.38 -1.09
## E2R 2 2713 3.86 1.61 4 3.93 1.48 1 6 5 -0.22 -1.15
## E3 3 2713 4.00 1.35 4 4.07 1.48 1 6 5 -0.47 -0.46
## E4 4 2713 4.42 1.46 5 4.59 1.48 1 6 5 -0.83 -0.31
## E5 5 2713 4.42 1.34 5 4.57 1.48 1 6 5 -0.78 -0.09
## se
## E1R 0.03
## E2R 0.03
## E3 0.03
## E4 0.03
## E5 0.03
name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
E1R | Don’t talk a lot. | numeric | 6. Very Inaccurate, 5. Moderately Inaccurate, 4. Slightly Inaccurate, 3. Slightly Accurate, 2. Moderately Accurate, 1. Very Accurate | 23 | 2777 | 2800 | 4.03 | 1.63 | 1 | 3 | 4 | 5 | 6 | ▃▅▁▆▅▁▇▇ |
E2R | Find it difficult to approach others. | numeric | 6. Very Inaccurate, 5. Moderately Inaccurate, 4. Slightly Inaccurate, 3. Slightly Accurate, 2. Moderately Accurate, 1. Very Accurate | 16 | 2784 | 2800 | 3.86 | 1.61 | 1 | 3 | 4 | 5 | 6 | ▃▅▁▇▅▁▇▆ |
E3 | Know how to captivate people. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 25 | 2775 | 2800 | 4 | 1.35 | 1 | 3 | 4 | 5 | 6 | ▂▃▁▃▇▁▇▃ |
E4 | Make friends easily. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 9 | 2791 | 2800 | 4.42 | 1.46 | 1 | 4 | 5 | 6 | 6 | ▁▂▁▂▃▁▇▆ |
E5 | Take charge. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 21 | 2779 | 2800 | 4.42 | 1.33 | 1 | 4 | 5 | 5 | 6 | ▁▂▁▂▅▁▇▅ |
openness
Reliability: ω_ordinal [95% CI] = 0.68 [0.66; 0.7].
Missing: 74.
Index | Estimate |
---|---|
Omega | 0.6104 |
Omega Psych Tot | 0.6634 |
Omega Psych H | 0.5148 |
Omega Ordinal | 0.6825 |
Cronbach Alpha | 0.6025 |
Greatest Lower Bound | 0.6696 |
Alpha Ordinal | 0.6751 |
Positive correlations: 10 out of 10 (100%)
Detailed output
##
## Information about this analysis:
##
## Dataframe: res$dat
## Items: O1, O2R, O3, O4, O5R
## Observations: 2726
## Positive correlations: 10 out of 10 (100%)
##
## Estimates assuming interval level:
##
## Omega (total): 0.61
## Omega (hierarchical): 0.51
## Revelle's omega (total): 0.66
## Greatest Lower Bound (GLB): 0.67
## Coefficient H: 0.65
## Cronbach's alpha: 0.6
## Confidence intervals:
## Omega (total): [0.59, 0.63]
## Cronbach's alpha: [0.58, 0.63]
##
## Estimates assuming ordinal level:
##
## Ordinal Omega (total): 0.68
## Ordinal Omega (hierarch.): 0.68
## Ordinal Cronbach's alpha: 0.68
## Confidence intervals:
## Ordinal Omega (total): [0.66, 0.7]
## Ordinal Cronbach's alpha: [0.66, 0.69]
##
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.
##
## Eigen values: 1.98, 0.936, 0.825, 0.664, 0.595
## Loadings:
## PC1
## O1 0.666
## O2R 0.604
## O3 0.730
## O4 0.432
## O5R 0.673
##
## PC1
## SS loadings 1.980
## Proportion Var 0.396
##
## vars n mean sd median trimmed mad min max range skew kurtosis
## O1 1 2726 4.82 1.13 5 4.96 1.48 1 6 5 -0.90 0.42
## O2R 2 2726 4.30 1.56 5 4.45 1.48 1 6 5 -0.60 -0.79
## O3 3 2726 4.44 1.22 5 4.56 1.48 1 6 5 -0.77 0.30
## O4 4 2726 4.90 1.22 5 5.10 1.48 1 6 5 -1.21 1.07
## O5R 5 2726 4.52 1.33 5 4.67 1.48 1 6 5 -0.74 -0.24
## se
## O1 0.02
## O2R 0.03
## O3 0.02
## O4 0.02
## O5R 0.03
name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
O1 | Am full of ideas. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 22 | 2778 | 2800 | 4.82 | 1.13 | 1 | 4 | 5 | 6 | 6 | ▁▁▁▂▅▁▇▇ |
O2R | Avoid difficult reading material. | numeric | 6. Very Inaccurate, 5. Moderately Inaccurate, 4. Slightly Inaccurate, 3. Slightly Accurate, 2. Moderately Accurate, 1. Very Accurate | 0 | 2800 | 2800 | 4.29 | 1.57 | 1 | 3 | 5 | 6 | 6 | ▂▃▁▅▃▁▇▇ |
O3 | Carry the conversation to a higher level. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 28 | 2772 | 2800 | 4.44 | 1.22 | 1 | 4 | 5 | 5 | 6 | ▁▁▁▂▆▁▇▅ |
O4 | Spend time reflecting on things. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 14 | 2786 | 2800 | 4.89 | 1.22 | 1 | 4 | 5 | 6 | 6 | ▁▁▁▁▃▁▆▇ |
O5R | Will not probe deeply into a subject. | numeric | 6. Very Inaccurate, 5. Moderately Inaccurate, 4. Slightly Inaccurate, 3. Slightly Accurate, 2. Moderately Accurate, 1. Very Accurate | 20 | 2780 | 2800 | 4.51 | 1.33 | 1 | 4 | 5 | 6 | 6 | ▁▂▁▃▅▁▇▇ |
conscientiousness
Reliability: ω_ordinal [95% CI] = 0.77 [0.76; 0.78].
Missing: 93.
Index | Estimate |
---|---|
Omega | 0.733 |
Omega Psych Tot | 0.7711 |
Omega Psych H | 0.6184 |
Omega Ordinal | 0.7707 |
Cronbach Alpha | 0.7293 |
Greatest Lower Bound | 0.7662 |
Alpha Ordinal | 0.7695 |
Positive correlations: 10 out of 10 (100%)
Detailed output
##
## Information about this analysis:
##
## Dataframe: res$dat
## Items: C1, C2, C3, C4R, C5R
## Observations: 2707
## Positive correlations: 10 out of 10 (100%)
##
## Estimates assuming interval level:
##
## Omega (total): 0.73
## Omega (hierarchical): 0.62
## Revelle's omega (total): 0.77
## Greatest Lower Bound (GLB): 0.77
## Coefficient H: 0.74
## Cronbach's alpha: 0.73
## Confidence intervals:
## Omega (total): [0.72, 0.75]
## Cronbach's alpha: [0.71, 0.75]
##
## Estimates assuming ordinal level:
##
## Ordinal Omega (total): 0.77
## Ordinal Omega (hierarch.): 0.77
## Ordinal Cronbach's alpha: 0.77
## Confidence intervals:
## Ordinal Omega (total): [0.76, 0.78]
## Ordinal Cronbach's alpha: [0.76, 0.78]
##
## Note: the normal point estimate and confidence interval for omega are based on the procedure suggested by Dunn, Baguley & Brunsden (2013) using the MBESS function ci.reliability, whereas the psych package point estimate was suggested in Revelle & Zinbarg (2008). See the help ('?scaleStructure') for more information.
##
## Eigen values: 2.42, 0.827, 0.682, 0.566, 0.504
## Loadings:
## PC1
## C1 0.666
## C2 0.715
## C3 0.669
## C4R 0.745
## C5R 0.680
##
## PC1
## SS loadings 2.420
## Proportion Var 0.484
##
## vars n mean sd median trimmed mad min max range skew kurtosis
## C1 1 2707 4.51 1.24 5 4.65 1.48 1 6 5 -0.86 0.32
## C2 2 2707 4.36 1.32 5 4.50 1.48 1 6 5 -0.74 -0.14
## C3 3 2707 4.30 1.29 5 4.41 1.48 1 6 5 -0.69 -0.12
## C4R 4 2707 4.45 1.37 5 4.59 1.48 1 6 5 -0.60 -0.62
## C5R 5 2707 3.69 1.63 4 3.74 1.48 1 6 5 -0.06 -1.22
## se
## C1 0.02
## C2 0.03
## C3 0.02
## C4R 0.03
## C5R 0.03
name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
C1 | Am exacting in my work. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 21 | 2779 | 2800 | 4.5 | 1.24 | 1 | 4 | 5 | 5 | 6 | ▁▁▁▂▅▁▇▅ |
C2 | Continue until everything is perfect. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 24 | 2776 | 2800 | 4.37 | 1.32 | 1 | 4 | 5 | 5 | 6 | ▁▂▁▂▆▁▇▅ |
C3 | Do things according to a plan. | integer | 1. Very Inaccurate, 2. Moderately Inaccurate, 3. Slightly Inaccurate, 4. Slightly Accurate, 5. Moderately Accurate, 6. Very Accurate | 20 | 2780 | 2800 | 4.3 | 1.29 | 1 | 4 | 5 | 5 | 6 | ▁▂▁▂▆▁▇▅ |
C4R | Do things in a half-way manner. | numeric | 6. Very Inaccurate, 5. Moderately Inaccurate, 4. Slightly Inaccurate, 3. Slightly Accurate, 2. Moderately Accurate, 1. Very Accurate | 26 | 2774 | 2800 | 4.45 | 1.38 | 1 | 3 | 5 | 6 | 6 | ▁▂▁▅▅▁▇▇ |
C5R | Waste my time. | numeric | 6. Very Inaccurate, 5. Moderately Inaccurate, 4. Slightly Inaccurate, 3. Slightly Accurate, 2. Moderately Accurate, 1. Very Accurate | 16 | 2784 | 2800 | 3.7 | 1.63 | 1 | 2 | 4 | 5 | 6 | ▃▆▁▇▅▁▇▆ |