Intro
If you're into Cryptography For Beginners, you're in the right place. Maybe you're just getting into Rails and want to add a user login/logout function. Maybe you're just really into cryptographic hashing algorithms. I sure am! If you're really a beginner, a little website called Khan Academy has some especially great videos on cryptography.
This isn't really a tutorial, but when using BCrypt, always remember to uncomment the gem in your Rails Gemfile.
source 'https://rubygems.org'
git_source(:github) { |repo| "https://github.com/#{repo}.git" }
ruby '2.6.1'
...
# Use Active Model has_secure_password
# gem 'bcrypt', '~> 3.1.7'
gem 'bcrypt', '~> 3.1.7'
...
Now, you can do cool things like this:
# Hashes the password (bcrypt is a one-way hash, not encryption)
my_password = BCrypt::Password.create('iLOVEdogs123')
=> "$2a$10$kypbnGGCpJ7UQlysnqzJG.6H.dUewn7UPVWA3Ip.E.8U4jlVnFNnu"
# Tests if input matches
my_password == 'iLOVEdogs123'
=> true
my_password == 'ilovedogs12'
=> false
# Not super important for this post, but this is why the above crypt begins with "$2a$"
my_password.version
=> "2a"
# "Cost" factor - how much work (and so how much time) each hash takes
my_password.cost
=> 10
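As a rough illustration of where version and cost come from, the hash string itself can be pulled apart with plain string operations. This is only a sketch: it assumes the usual bcrypt layout of a version, a two-digit cost, a 22-character salt and a 31-character checksum, and the variable names are mine rather than anything from the gem.

```ruby
# Split a bcrypt hash string into its parts using plain string operations.
# The sample hash is the one shown above; the variable names are illustrative.
hash_string = '$2a$10$kypbnGGCpJ7UQlysnqzJG.6H.dUewn7UPVWA3Ip.E.8U4jlVnFNnu'

# Splitting on "$" yields ["", version, cost, salt_and_checksum].
_, version, cost, rest = hash_string.split('$')

salt     = rest[0, 22]   # first 22 characters: the encoded salt
checksum = rest[22, 31]  # remaining 31 characters: the encoded hash itself

puts "version:  #{version}"
puts "cost:     #{cost.to_i}"
puts "salt:     #{salt}"
puts "checksum: #{checksum}"
```

So the salt and the cost travel inside the stored string, which is how the gem can read them back later without any extra columns.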
Now there are a lot of things you can do with your app using BCrypt - authentication, authorization, login, logout, etc.! But there are already a lot of blog posts and helpful docs on how to code that. I am more interested in the behind the scenes action.
History
BCrypt is a hashing algorithm that was designed by Niels Provos and David Mazières of the OpenBSD Project in 1999. The B stands for...
Blowfish!
Blowfish is a symmetric-key block cipher, designed by Bruce Schneier in 1993. Now there's a more modern version called Twofish, but we don't care about that here!
It's pretty amazing that their original proposal over 20 years ago was titled "A Future-Adaptable Password Scheme" because it seems to have truly endured the test of time.
They envisioned an algorithm with computational cost that would increase as hardware improved. In other words, as computers get faster and better able to guess passwords, hashing should get slower, or have more "cost." Notice in the following lines:
pw = BCrypt::Password.create('password123')
=> "$2a$10$kypbnGGCpJ7UQlysnqzJG.6H.dUewn7UPVWA3Ip.E.8U4jlVnFNnu"
pw.cost
=> 10
cost is 10. If cost increases, speed decreases, but the speed with which a hacker can guess your passwords also decreases. For example, an attacker using Ruby could check ~140,000 passwords a second with MD5 but only ~450 passwords a second with bcrypt. BCrypt allows you to configure cost depending on how important the speed/security tradeoff is to you. Here's a nice video that shows some examples of different cost factors.
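If you want a feel for the MD5 side of that comparison on your own machine, here is a small sketch using only Ruby's standard library. The 0.2 second window and the guess strings are arbitrary choices of mine, and the number you get will depend on your hardware, so don't expect it to match the figure quoted above.

```ruby
require 'digest'

# Count how many MD5 digests this machine computes in a short time window.
# The 0.2 second window and the candidate strings are arbitrary choices.
window = 0.2
count = 0
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
while Process.clock_gettime(Process::CLOCK_MONOTONIC) - start < window
  Digest::MD5.hexdigest("guess#{count}")
  count += 1
end
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
rate = count / elapsed

puts "MD5 guesses per second on this machine: #{rate.round}"
```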
General Hash Function Background
In general, a hash algorithm or function takes data (e.g., the password) and maps it to "fixed-size values," or creates a "digital fingerprint," or hash, of it. This hash is not the same thing as the Ruby Hash class, but they are related: Ruby's Hash uses a hash function internally to find its keys. You can picture a hashing algorithm as producing key-value pairs of passwords and their hashes, but you wouldn't want to store or save them like that! The process is never truly "reversible," in the sense that if I hashed a list of passwords, and all you had was a list of the resulting hashes, the only way you could "hack" my passwords would be through something like brute force search. But you could never take a hashed value and return it to its original form!
This is why modular arithmetic and the XOR gate/operator are so fundamental to cryptography and to understanding the algorithm behind the BCrypt magic. XOR stands for "exclusive or," and it is a logical relationship between A and B where one, and only one, must be true. So A XOR B returns true if A is true or B is true, but not both. More generally, the XOR gate returns true or 1 if there is an odd number of 'true' inputs, and so is also thought of as addition mod 2. To me, modular arithmetic is the clearest way to think about why one-way functions like hashes are "irreversible" - 12 mod 7 = 5 and 40 mod 7 = 5, but even if you know the output is 5 and the algorithm is mod 7, you mathematically cannot determine the original number in any way other than essentially guessing.
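Both ideas are easy to poke at in plain Ruby. The sketch below checks that XOR on single bits agrees with addition mod 2, and lists several inputs that all land on 5 under mod 7. It is a toy illustration of information being thrown away, not a model of what BCrypt actually computes.

```ruby
# XOR on single bits, and its equivalence to addition mod 2.
pairs = [[0, 0], [0, 1], [1, 0], [1, 1]]
pairs.each do |a, b|
  puts "#{a} XOR #{b} = #{a ^ b}   (#{a} + #{b}) mod 2 = #{(a + b) % 2}"
end
xor_is_add_mod_2 = pairs.all? { |a, b| (a ^ b) == (a + b) % 2 }

# Many inputs share one output under mod 7, so the output alone
# cannot tell you which input produced it.
preimages_of_5 = (0..50).select { |n| n % 7 == 5 }
puts "n mod 7 == 5 for: #{preimages_of_5.inspect}"
```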
These are all pretty fundamental, basic ideas behind BCrypt, which is clearly way more complicated. But it's interesting to me how BCrypt has mostly built upon these ideas in a way that others haven't.
For example, some other common "general purpose" hash functions (MD5, SHA1, SHA2, SHA3) are fast, and that speed is exactly what makes them a poor choice for storing passwords. A modern server can calculate the MD5 hash of about 330MB every second. For lowercase, alphanumeric, 6 character passwords, you can try every single option in about 40 seconds.
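To make that last number concrete, here is the arithmetic as a quick sketch. The only inputs are the ones in the paragraph above (36 possible characters, 6 positions, about 40 seconds); the guess rate is simply what those figures imply, not something I measured.

```ruby
# Size of the search space for 6-character passwords drawn from
# lowercase letters and digits (26 + 10 = 36 symbols).
alphabet_size = 26 + 10
length = 6
keyspace = alphabet_size**length

# Guess rate implied by the "about 40 seconds" figure.
seconds = 40
implied_rate = keyspace / seconds.to_f

puts "candidate passwords: #{keyspace}"
puts "implied guesses per second: #{implied_rate.round}"
```

That works out to roughly 2.2 billion candidates, or about 54 million guesses per second.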
So how does BCrypt use all of this?
BCrypt never stores or recovers the plaintext password. It hashes whatever is entered and compares hashes. If I save my password as 'password123' and enter 'password' on the login page of my Rails app, BCrypt hashes the entered string and compares the result to the saved hash for authentication. For example, when I save my password as 'password123', BCrypt does the following:
pw = BCrypt::Password.create('password123')
=> "$2a$10$kypbnGGCpJ7UQlysnqzJG.6H.dUewn7UPVWA3Ip.E.8U4jlVnFNnu"
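and if I entered an incorrect password, such as just 'password', hashing it gives a completely different result (each call to create also generates a fresh random salt, so even the same password would produce a different string here):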
not_pw = BCrypt::Password.create('password')
=> "$2a$10$vx6htugaV4KRG2ucXc8iHOo/Ch4FRfM7aa6Tpc79j9ecPo9U6APsu"
Which finally allows BCrypt to compare:
pw == 'password123'
=> true
pw == 'password'
=> false
You could never take either of those long crypts starting with $2a$10$ and go backwards. It might seem obvious, but it's cool to really think - it's not just that you can't go backwards because you're a human and not a computer. No one, no computer, not even an omnipotent/omniscient being could go backwards!! It would be truly impossible for BCrypt to even have a function or method that would take a hashed password and somehow reverse it and return the original password. All it can do is compare hashed values.
But good news for hackers, bad news for you - hackers don't need to be omniscient or mathematically rigorous here. In fact, the brute force solution will, theoretically, always work. If you think about a world where we had unlimited time and resources - probably anything would be hackable. So if you're ever feeling existential and sad about your own mortality, at least know that it means you can't be hacked as easily.
Rainbow Tables
But there's still a lot that hackers can do to use their time more wisely than a simple brute force attack. A rainbow table attack uses a list of possible passwords hashed with the same algorithm to compare. So you could theoretically compose a list of common passwords, hash them with an algorithm like BCrypt, and compare those hashes to the actual list of password hashes. The solution to this - salts! Kind of.
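Here is a minimal sketch of the precomputed-list idea described above. It uses unsalted SHA-256 from Ruby's standard library as a stand-in, since precomputation only pays off against unsalted hashes, and the password list and "leaked" hash are made up. A real rainbow table is a more compact structure than this plain lookup hash, but the attack has the same shape.

```ruby
require 'digest'

# Attacker's precomputation: hash each common password once, keyed by hash.
common_passwords = %w[123456 password qwerty iLOVEdogs123 letmein]
lookup = common_passwords.to_h { |pw| [Digest::SHA256.hexdigest(pw), pw] }

# A leaked, unsalted hash (hypothetical) is reversed with one lookup.
leaked_hash = Digest::SHA256.hexdigest('qwerty')
recovered = lookup[leaked_hash]

puts "recovered: #{recovered.inspect}"
```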
Sneaky Salt!
There are a lot of puns we can make here about salts, but all jokes aside, a salt is simply "random data that is used as an additional input to a one-way hash function." Salts add yet another layer of randomness to our already pretty random crypt. So, according to the bcrypt Ruby gem docs, we have something like this:
hash(password) #=> <unique gibberish>
hash(salt + password) #=> <really unique gibberish>
There is also the concept of a pepper, which is again fun for pun purposes, but not explicitly used in BCrypt.
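Going back to the salt idea, here is a small sketch of the hash(salt + password) pattern using the standard library. SHA-256 and the salted_hash helper are stand-ins of mine to show the effect of a salt; bcrypt generates and mixes in its salt in its own way.

```ruby
require 'digest'
require 'securerandom'

# Hash a password together with a random salt. SHA-256 is a stand-in here;
# bcrypt generates and mixes in its salt differently.
def salted_hash(password, salt = SecureRandom.hex(16))
  [salt, Digest::SHA256.hexdigest(salt + password)]
end

salt_a, hash_a = salted_hash('iLOVEdogs123')
salt_b, hash_b = salted_hash('iLOVEdogs123')

# Same password, different salts: the hashes differ.
puts hash_a == hash_b
# Same password, same salt: the hash is reproduced.
puts salted_hash('iLOVEdogs123', salt_a)[1] == hash_a
```

Because every record gets its own salt, a single precomputed list no longer matches anything.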
But! Salts on their own are not enough. They help prevent rainbow table attacks, but salts will not necessarily prevent dictionary or brute force attacks. Attackers can throw a list of potential passwords at each individual password:
hash(salt + "aardvark") =? <really unique gibberish>
hash(salt + "abacus") =? <really unique gibberish>
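The same loop in runnable form, as a sketch: the salt is stored in plaintext next to the hash, so an attacker who has stolen one record can just re-hash a word list with that salt. The salt, the word list and the use of SHA-256 are all made up for the example.

```ruby
require 'digest'

# A stolen record: the salt is stored in plaintext next to the hash.
# Both values are made up for this sketch.
salt = 'c0ffee'
stolen_hash = Digest::SHA256.hexdigest(salt + 'abacus')

# The attacker re-hashes every dictionary word with that record's salt.
dictionary = %w[aardvark abacus abbey]
cracked = dictionary.find { |word| Digest::SHA256.hexdigest(salt + word) == stolen_hash }

puts "cracked: #{cracked.inspect}"
```

With a fast hash this loop flies through millions of words, which is the gap a slow hash is meant to close.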
So the really revolutionary aspect of BCrypt is just that it's slow! Hash algorithms aren't usually designed to be slow; they're designed to turn a lot of data into secure fingerprints as quickly as possible. Furthermore, BCrypt is so impressive because it is truly adaptable - the work factor feature allows you to determine how 'expensive,' 'costly,' or 'slow' the hashing algorithm will be. So as hardware gets faster, you can make BCrypt slower, and keep up with Moore's Law!
In my earlier example, the default work factor or cost was set to 10:
pw = BCrypt::Password.create('password123')
=> "$2a$10$kypbnGGCpJ7UQlysnqzJG.6H.dUewn7UPVWA3Ip.E.8U4jlVnFNnu"
pw.cost
=> 10
Using a work factor of 12, BCrypt hashes the password yaaa in about 0.3 seconds. MD5 takes less than a microsecond. This might seem like nothing, but think of it as cracking a password every 40 seconds vs. every 12 years. You might not need that kind of security, and you might need a faster algorithm - which is why it's great that BCrypt allows you to choose your balance of speed and security.
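One thing worth knowing when picking a work factor: the cost is an exponent, not a multiplier. bcrypt repeats its expensive key setup 2**cost times, so each +1 roughly doubles the time per hash. A quick sketch of that arithmetic (the iterations helper is just for illustration):

```ruby
# bcrypt's cost is a base-2 exponent: the expensive key setup
# is repeated 2**cost times.
def iterations(cost)
  2**cost
end

[4, 10, 12, 14].each do |cost|
  puts "cost #{cost}: #{iterations(cost)} iterations"
end

# Relative slowdown when moving from cost 10 to cost 12.
slowdown = iterations(12) / iterations(10)
puts "cost 12 vs cost 10: #{slowdown}x the work"
```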
A Final Tutorial Note
If you ever need it - here's how the docs say to change the default cost factor, using either BCrypt::Engine.cost = new_value or specifying :cost as an additional argument in creation.
Option 1:
BCrypt::Engine.cost = 8
BCrypt::Password.create('password').cost
#=> 8
Option 2:
BCrypt::Password.create('password', :cost => 6).cost #=> 6
Conclusion
BCrypt is cool and safe. Unless you have very advanced and specific speed-security balance needs, learn to love it!
Top comments (21)
Wow, This was the best article I have ever read about Bcrypt.
You could also add that it is one of the password hashing algorithms recommended by OWASP:
owasp.org/www-project-cheat-sheets...
You could also write your own article but you haven’t
I don't see the point of writing an article on something that is already explained well in this post and that has a lot of resources online.
I also don't see the point of your comment.
Overall it is a very good article about the topic. But I find it a bit misleading when you say:
MD5, yes, that one definitely shouldn't be around anymore in anything related with security.
SHA1 might be stronger than MD5, but its days are also done since the collision attacks discovery back in 2017.
SHA2 and SHA3 otherwise are still strong options for data integrity and other security features that revolves around it. But yes, you shouldn't use them for password "encryption" when we have better options as bcrypt.
Disagree. Not misleading whatsoever....
Don't roll your own authentication systems, they are not safe. Use battle tested / third party services like Auth0 or OAuth.
SaaS doesn't work in all cases. I have had to do my own when I was working on a fully offline platform, and understanding how such encryption works helps a lot when implementing authentication, plus it wouldn't hurt to learn how things work.
Agree with you, but it is known that rolling your own authentication system can lead to security issues. Yes, learning new things like BCrypt is good as well. 👍
Wrong
BCrypt has been out there since 1999 and nowadays computers are much faster at figuring out problems than ever. As you can see, it is better to let experts resolve this adaptation issue for you.
Just imagine someone hacked your BCrypt setup? How are you going to solve the issue? Imagine how it will be to migrate to a modern encryption like Argon2 or Argon2id?
The answer: Delegate, delegate, delegate.
Have you read the original white paper on BCrypt from 1999? It sounds like you haven't, and possibly haven't even read my post fully, because Auth0 uses bcrypt. The entire purpose, foundation, and legacy of the algorithm is based on exactly what you are saying - that adjusting the cost factor has an almost perfectly adaptable relationship with advancements in computing speed. BCrypt vs. Argon2 is an interesting question, but is entirely separate from whether or not to use third party auth. My post is about the algorithm itself, not necessarily about who is using it.
also note that bcrypt (the ruby gem) gives you a different output every time for the same password. This is because "bcrypt-ruby automatically handles the storage and generation of these salts for you."
source: github.com/codahale/bcrypt-ruby
This would prevent rainbow table attacks.
I actually quoted and cited the ruby gem readme in this post. I covered the definition of a salt, and actually bcrypt handling the generation/storage does not change the fact that a salt will always yield a unique result. The important fact here is that it only gives two different hashes because you aren't saving either instance of password creation. Once a password is created and saved, it will always have the same hash:
The question of rainbow table attacks also misses the point - for longer explanation please read this article that I also linked by the gem creator: codahale.com/how-to-safely-store-a...
Note that bcrypt does not always start with $2a. The 2 is the identifier for bcrypt. After that comes a revision. Nobody uses the original, as it had some serious flaws. There is some weird 2x/2y thing which isn't widely adopted. (Mostly in PHP I think.)
In 2014 another issue was found and thus 2b now exists.
See also en.wikipedia.org/wiki/Bcrypt#Versi...
Mansplain much?
Thanks for sharing your delicate "fishsoup".
... and as so often in real life: You should learn to be aware in which situation you use which specific tool.
(and keep in mind that nearly no human invention will lack some points for further improvement. ;))
Ew
I don't think hash(salt + password) is what bcrypt does. Salts are not hashed, they are kept in plaintext. Read this: stackoverflow.com/a/6832628/9868445
I started learning Ruby a few weeks ago, so I will definitely bear this algorithm in mind when I build my first application. Great tutorial!