The functions sha1, sha256, sha512, md4, md5 and ripemd160 bind to the respective digest functions in OpenSSL’s libcrypto. Both binary and string inputs are supported and the output type will match the input type.
md5("foo")
[1] "acbd18db4cc2f85cedef654fccc4a4d8"
md5(charToRaw("foo"))
[1] ac bd 18 db 4c c2 f8 5c ed ef 65 4f cc c4 a4 d8
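The type-matching rule can be checked directly. The following is a minimal sketch, assuming the openssl package is installed; to_hex is a hypothetical helper introduced only for this illustration and is not part of the package.

```r
library(openssl)

from_string <- md5("foo")             # character input
from_raw    <- md5(charToRaw("foo"))  # raw input

is.character(from_string)  # character in, character out
is.raw(from_raw)           # raw in, raw out
length(from_raw)           # an MD5 digest is 16 bytes

# Hypothetical helper: render a raw digest as a lowercase hex string
to_hex <- function(bytes) {
  paste(sprintf("%02x", as.integer(unclass(bytes))), collapse = "")
}

# Both forms encode the same digest
identical(to_hex(from_raw), as.character(unclass(from_string)))
```

The unclass() calls only strip any class attribute the package may attach to its return values, so the comparison is made on plain vectors.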
Functions are fully vectorized for the case of character vectors: a vector with n strings will return n hashes.
# Vectorized for strings
md5(c("foo", "bar", "baz"))
[1] "acbd18db4cc2f85cedef654fccc4a4d8" "37b51d194a7513e45b56f6524f2d51f2"
[3] "73feffa4b7f6bb68e44cf984c85f6e88"
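As a quick sanity check of the vectorized behaviour, the sketch below (assuming the openssl package is installed; the variable names are illustrative) confirms that the result has one hash per input string and that each element equals the hash of that string on its own.

```r
library(openssl)

inputs <- c("foo", "bar", "baz")
hashes <- md5(inputs)

# One hash per input string
length(hashes) == length(inputs)

# Hash each string separately and compare element by element
one_by_one <- vapply(
  inputs,
  function(s) as.character(unclass(md5(s))),
  character(1),
  USE.NAMES = FALSE
)
identical(as.character(unclass(hashes)), one_by_one)
```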
Besides character and raw vectors, we can pass a connection object (e.g. a file, socket or URL). In this case the function will stream-hash the binary contents of the connection.
# Stream-hash a file
myfile <- system.file("CITATION")
md5(file(myfile))
Hashing....
[1] e4 4f 1b 99 e3 2f 27 e0 a7 e6 a0 0a 36 07 0e 1b
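One way to convince ourselves that the streamed hash covers the whole file is to compare it with tools::md5sum from base R, which reports the MD5 of a file as a hex string. This is a minimal sketch, assuming the openssl package is installed; the temporary file and its contents are made up for the example.

```r
library(openssl)

# Write a small temporary file with known contents (illustrative only)
path <- tempfile()
writeBin(charToRaw("foo"), path)

# Stream-hash the file through a connection; the result is a raw vector
streamed <- md5(file(path))

# Convert the raw digest to lowercase hex for comparison
streamed_hex <- paste(sprintf("%02x", as.integer(unclass(streamed))), collapse = "")

# Base R's file checksum should agree with the streamed hash
reference_hex <- unname(tools::md5sum(path))
identical(streamed_hex, reference_hex)

unlink(path)
```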
The same works for URLs. The hash of the R-3.1.1-win.exe below should match the one in md5sum.txt:
# Stream-hash from a network connection
md5(url("http://cran.us.r-project.org/bin/windows/base/old/3.1.1/R-3.1.1-win.exe"))
Similar functionality is also available in the digest package, but with a slightly different interface:
# Compare to digest
library(digest)
digest("foo", "md5", serialize = FALSE)
[1] "acbd18db4cc2f85cedef654fccc4a4d8"
# Other way around
digest(cars, skip = 0)
[1] "310a2ceeffc1930c91d7a101d596356d"
md5(serialize(cars, NULL))
[1] 31 0a 2c ee ff c1 93 0c 91 d7 a1 01 d5 96 35 6d