This is a basic introduction to scrypt, a hash function or, more accurately, a key derivation function (KDF). I assume most of my audience is here to understand why scrypt is used and the basics of how it works. My goal is to explain it in a general sense, so I will omit proofs and implementation details and focus on the high-level principles.
Scrypt is a slow-by-design hash function. Its purpose is to take some input data and create a fingerprint of that data, but to do it very slowly. A good real-world example is how Qvault uses it: to take a password and derive a 256-bit secret key.
For example, let’s pretend your password is password1234. Using scrypt, we can deterministically stretch that password into a 256-bit key.
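As a minimal sketch of that step, here is how the derivation might look with Python's standard `hashlib.scrypt`. The salt and the cost parameters (`n`, `r`, `p`) are assumptions chosen for illustration; they are not Qvault's actual settings, and a real application would use a random salt stored alongside the encrypted data.

```python
import hashlib

# The example password from the text, plus a fixed demo salt (an assumption;
# real applications should generate a random salt and store it).
password = b"password1234"
salt = b"example-salt-16b"

# Illustrative cost parameters, not Qvault's real configuration.
# dklen=32 asks for 32 bytes of output, i.e. a 256-bit key.
key = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)

print(len(key) * 8)  # key size in bits
print(key.hex())     # the key as 64 hexadecimal characters
```

Running it twice with the same password, salt, and parameters gives the same key, which is what lets the vault be reopened later.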
That 256-bit key can now be used as the secret key to encrypt and decrypt data with the AES-256 cipher.
Most encryption algorithms, including AES-256, require a key of a specific length. By hashing the password, we get a fixed-size key of the right length, no matter how short or long the password is.
Furthermore, we chose scrypt over a faster hash like SHA-256 for two reasons:
- It is slow
- It uses memory as well as CPU resources
The reason we want a slow hash is to make it harder for an attacker to guess the user’s password. An attacker trying to brute-force their way into a vault is simply guessing passwords over and over. AES-256 is very fast, and so is a hash like SHA-256, so without a slow step the attacker could try a huge number of passwords per second on a modern computer.
Because an attacker must run scrypt on each guess before attempting to decrypt the vault, the attack becomes so slow that guessing a reasonably strong password is nearly impossible. On a relatively powerful desktop computer it takes ~1.5 seconds to hash a Qvault password, because we have set the memory and computational requirements fairly high.
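To get a feel for the difference in cost per guess, the rough sketch below times a single SHA-256 hash against a single scrypt call. The salt and parameters are assumptions for illustration and are much lighter than the ~1.5 second setting described above; the exact numbers will vary from machine to machine.

```python
import hashlib
import time

def time_once(fn):
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

guess = b"password1234"
salt = b"example-salt-16b"  # hypothetical demo salt

# One guess with a fast hash versus one guess with scrypt.
fast = time_once(lambda: hashlib.sha256(guess).digest())
slow = time_once(
    lambda: hashlib.scrypt(guess, salt=salt, n=2**14, r=8, p=1, dklen=32)
)

print(f"SHA-256: {fast:.6f} s per guess")
print(f"scrypt:  {slow:.6f} s per guess")
# Guard against a zero reading on very coarse timers.
print(f"scrypt is roughly {slow / max(fast, 1e-9):,.0f}x slower here")
```

Every extra fraction of a second per guess is multiplied by the number of passwords the attacker has to try.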
Like all cryptographic hash functions, scrypt has the following properties:
- Deterministic (Same input produces the same output every time)
- Fixed-size output
- Irreversible (An attacker can’t feasibly recover the input from the output)
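The first two properties are easy to check directly. The sketch below uses a hypothetical `derive` helper with an assumed salt and assumed parameters; irreversibility can't be demonstrated in a few lines, so it is not shown.

```python
import hashlib

def derive(password: bytes) -> bytes:
    """Hypothetical helper: scrypt with a demo salt and illustrative parameters."""
    return hashlib.scrypt(
        password, salt=b"example-salt-16b", n=2**14, r=8, p=1, dklen=32
    )

a = derive(b"password1234")
b = derive(b"password1234")
c = derive(b"a much, much longer passphrase than the first one")

print(a == b)          # True: same input, same output
print(len(a), len(c))  # 32 32: output size does not depend on input size
print(a == c)          # False: different inputs give different outputs
```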
Additionally, scrypt has the following properties:
- Computationally expensive and slow (It takes a long time for a computer to run the hash)
- Memory intensive (Potentially several gigabytes of RAM is used to run the hash)
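As a rough guide to where the memory goes, scrypt's main working buffer takes about 128 × N × r bytes, where N and r are its cost parameters. The helper below is a back-of-the-envelope sketch based on that approximation; the parameter values are examples, not Qvault's settings.

```python
def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate size of scrypt's main working buffer: 128 * n * r bytes."""
    return 128 * n * r

# How the estimate grows as the cost parameter n is raised (r fixed at 8).
for n in (2**14, 2**17, 2**20):
    mib = scrypt_memory_bytes(n, r=8) / 2**20
    print(f"n=2**{n.bit_length() - 1}, r=8 -> about {mib:,.0f} MiB")
```

With r=8, raising n to 2**20 already puts the estimate at about 1 GiB, which is how the gigabyte-scale figures above come about.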
Thanks for reading! Here are some additional resources:
Follow us on Medium! https://medium.com/qvault
By Lane Wagner