Passwords. Every CEO's nightmare, surely, is to be woken with the news that their company's password data is up for grabs on the Internet for a price.
OK - let's get down to brass tacks. How does this happen?
"Ahah", eager new developer with a brilliant idea for a site thinks. "Let's avoid the pitfalls of #1 in this series and make our users have passwords."
Fine. Quick bit of SQL and their web framework of choice, pop the user and their password in the appropriate table, and they can go
SELECT * FROM users WHERE username="$user" AND password="$password" and check if it's valid. Simple, right?
Sidebar right here: the seasoned veterans screaming at the other issues with that snippet, just hush. We'll get to that in a couple of articles' time, OK? :D
And here's problem #1. Their database contains the user's unencrypted password. In the event that someone outside the company gains access to the contents of that table, their CEO will be on the phone about 30 seconds after the events in the first paragraph. And they can't assume it won't happen. All the security in the world (and it's a fair bet so far they don't have that!) won't protect them from a disgruntled employee, a social engineering attack, or a misconfigured firewall.
"Easy", they think. "I'll just encrypt the password, then I can decrypt and compare when the user logs in." And worse, they'll probably invent some really clever (I really need the sarcasm emoji about now) code to do it.
And here we are with problems #2 and #3!
Let's deal with #3 first. Rolling your own encryption code. Just. Don't. The odds of our hero coming up with something that doesn't have a fundamental flaw in it are small. And if they don't understand the issues involved, what the risks are, and what the encryption function needs to do, their idea of an encryption algorithm might be a bit better than
#2 is the bigger problem, and it's an 'ahah' moment when I can get users to realise that it's connected to 'no, I can't tell you your password' and to why sites force them to change it if they've forgotten it. If someone has access to your DB and your code, and therefore the decryption algorithm and key (and if they have access to your system, you had better assume they do), they have all it takes to decrypt every one of your passwords, and we're back to problem #1 and the phone ringing at 4am.
The key takeaway here is we don't need to know what the password is.
We just need to know that what the user typed encrypts to the same thing.
I need this to be an 'ahah' moment for you if you don't know it already.
Pick a good one-way algorithm, where turning the password into its 'encrypted' (actually, hashed) form is fast, and breaking the algorithm to get it back is prohibitively hard. Think of it as the algorithmic equivalent of tearing up a piece of paper into lots of pieces - that's quick. Putting it back into one whole piece is not.
All you have to do, then, is hash the user's password when they create it, pop it in your users table, and do the same to what they give you when they log in. Compare the two, and bingo.
But, as a commenter points out, and I did in fact know (insufficient caffeine, yer honour), if we don’t pick our hashing algorithm correctly, we have problem #4.
As it stands, two people with the same password have the same hash. Equally, if you know what the algorithm is you can turn up with a table of candidate hashes (basically, run a list of likely passwords through the hash algorithm, which is cheap) and look for matches.
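To make that concrete, here is a minimal Python sketch of problem #4. It assumes an unsalted SHA-256 hash purely for illustration; the user names, the passwords and the `unsalted_hash` helper are all made up for the example.

```python
import hashlib


def unsalted_hash(password: str) -> str:
    """Hash a password with no salt (the flawed approach described above)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical leaked users table: username -> stored hash.
leaked = {
    "alice": unsalted_hash("hunter2"),
    "bob": unsalted_hash("hunter2"),
    "carol": unsalted_hash("correct horse battery staple"),
}

# Same password, same hash: alice and bob are visibly sharing one.
print(leaked["alice"] == leaked["bob"])  # True

# The attacker hashes a list of likely passwords once...
candidates = ["123456", "password", "hunter2", "qwerty"]
lookup = {unsalted_hash(p): p for p in candidates}

# ...and then simply looks every leaked hash up in that table.
cracked = {user: lookup[h] for user, h in leaked.items() if h in lookup}
print(cracked)  # {'alice': 'hunter2', 'bob': 'hunter2'}
```

Carol only survives because her password wasn't on the candidate list, not because the scheme protected her.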
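Solution: pick an algorithm that uses a ‘salt’: a random value, different for every user, that’s hashed together with the password and stored alongside the resulting hash. Two users with the same password now get different hashes, and a precomputed table of candidate hashes is useless.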
We are firmly in “do not roll your own” territory here. Any library worth its salt (see what I did there?) will have both hash and compare functions (e.g. PBKDF2) to save you getting it wrong. And maybe now your CEO can rest a little easier on this front.
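A minimal sketch of what that looks like, using the PBKDF2 function in Python's standard library. The function names, the `iterations$salt$hash` storage format and the iteration count are my assumptions for illustration, not a recommendation; in real code, prefer whatever password-hashing helper your framework already gives you.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumption: illustrative work factor, tune for your hardware
SALT_BYTES = 16


def hash_password(password: str) -> str:
    """Return a string to store in the users table: iterations$salt$hash."""
    salt = secrets.token_bytes(SALT_BYTES)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return f"{ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-hash the login attempt with the stored salt and compare."""
    iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations),
    )
    # Constant-time comparison, so timing doesn't leak how close a guess was.
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


stored = hash_password("hunter2")
print(verify_password("hunter2", stored))  # True
print(verify_password("hunter3", stored))  # False
```

Note that the password itself is never stored, and hashing the same password twice gives two different strings because each call draws a new salt.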
Final lesson: don’t code tired, and have someone else review your code before it gets anywhere near production. :) (thanks ARMB)
As a corollary to this: if a site can give you your current password back when you've forgotten it? Run in the other direction.
At the very least do NOT trust it with any data you care about being exposed to a wider audience.
Sidebar 2: yes, oh seasoned veterans. This is all an oversimplification. But clearly, some folks - lots of folks - don't even get this far.
It is a constant source of wonder and bewilderment to me how often haveibeenpwned gets updated with new passwords that have clearly been leaked from large, popular sites that don't follow even these minimal best practices. The negligence that lets that happen probably should be a criminal offence, and the CTOs who presided over it should be barred from ever holding such an office again.