By the time you have read this, I will have given a talk at WordPress Leeds (the folks who organise the meetup are fantastic, and you should definitely attend if you are ever in the area) covering some aspects of HTTP security and how you can make your websites less hackable. I chose the click-baity subtitle of "ways to not get hacked".

It felt a bit of a shame for the slide deck to only see the light of day during that meetup, so I thought I'd share it with the world. Plus, the folks who attended could then sit back and watch me being a wally, rather than frantically taking notes (even if those notes were on the fact that I was a complete wally).
I wrote the slides using reveal.js (see the README in the GitHub repo for the deck on how to get it up and running), so they should run on any computer with a browser and a connection to the web.
What This Talk Is Not
This talk is not designed to be too technical. There are a few easy wins for folks who aren't technical at all (some of which are technology agnostic, and some are WordPress specific). We'll start with the easy wins.
Security = Catchup (Sometimes)
Sometimes security is a game of cat and mouse, as more hacks and vulnerabilities are found, reported, and patched out. It's a full-time job staying at the forefront of web security - just ask Troy Hunt or Scott Helme.
A few days before I gave this talk, a critical vulnerability in a popular jQuery plugin was reported on. It had been "in the wild" (that's security talk for "we, kinda, knew about it") for three years.
There's also this comment from another article of mine on this very site:
Imagine you want to steal a car. You case a street and check out each car, one by one. You look for any visible means of entry, but you're also looking for any physical locks on the steering wheel, etc. You also need to know which models are easier to hot-wire.
Now imagine that you have to park your car along a street where a lot of thefts have taken place. To ensure that your car isn't going to be picked out, you make sure that you have put any valuables away in the glove box or trunk; locked your car; placed a physical lock on the steering wheel; engaged the immobiliser; armed your alarm; etc.
In security, you need to be looking for the ways that someone could break into your app. You want to find as many as possible and put things in place to stop others from exploiting them.
I would say that every web developer should know about the OWASP Top 10 security risks, at the very least. You could easily lose a day or two doing a deep dive on the OWASP site (just like anyone could with TV Tropes) and still only scratch the surface.
Plugins
We developers know that there's always a library for that. Whether it's an npm package, a NuGet package, or just some code lying around on Stack Overflow (you don't copy-paste from Stack Overflow without understanding what the code does first, do you?), it is really easy to fall into the trap of having hundreds of lines of code which are required to run your website - most of which is either bloated or indecipherable.
If you need to include someone else's code (via a plugin or whatever), make sure that it is small, short, and sticks to the point. It should, as Charles Emerson Winchester III says:

I do one thing at a time. I do it very well. And then I move on
Single Responsibility Principle
This goes hand-in-hand with the Principle of Least Privilege. Basically, if you need to give someone or something access to your application (or blog, or website, or black ops site), give them the LEAST possible access: just enough to get the work done, and no more.
In the world of blogging:
- Writers should NOT have admin access
- Editors, too
- Similar to admin best practices for computers and security
Multi-Factor Auth
Multi-Factor authentication is when you need more than one thing in order to log in. Classic single-factor authentication is a username and password combo - it's a single thing that you know.
Other factors can include something that you have (e.g. an authenticator app), something that you are (e.g. a fingerprint), etc. The more of these that you enable, the more things that a malicious person will need to have in order to break in. The trade-off is that it also makes logging in a little more work for the legitimate user.
I use a combination of Authy and YubiKeys to provide a second factor for all of my accounts. These devices use a time-based algorithm to create a one-time code which needs to be entered at login.
Because it's time-based, it's almost impossible to brute force (almost; but it's hundreds of times better than not having it).
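To make the time-based bit less magic, here's a minimal sketch of how those one-time codes are derived (this follows the TOTP algorithm from RFC 6238; the base32 secret below is a made-up example, not a real account key):

```python
import base64
import hmac
import struct
import time


def totp(secret_b32: str, interval: int = 30, digits: int = 6) -> str:
    """Derive a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // interval      # which 30-second window we're in
    msg = struct.pack(">Q", counter)            # counter as a big-endian 64-bit int
    digest = hmac.new(key, msg, "sha1").digest()
    offset = digest[-1] & 0x0F                  # "dynamic truncation" step
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Both your authenticator app and the server run this same calculation;
# the code changes every 30 seconds, so a stolen one is stale almost immediately.
print(totp("JBSWY3DPEHPK3PXP"))
```

Because the server and your device share the secret and the clock, nothing secret ever crosses the wire at login time - only the short-lived six-digit result does.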
How HTTP Works
As a terribly reduced example, let's imagine that I want to load the homepage for the BBC.
- I tell my browser to go to https://bbc.co.uk/
- My browser sends an HTTP request to my Internet Service Provider
- My ISP helps to figure out where the server for the BBC is
- It then forwards my browser's request for the website through a series of routers (I mean a series of tubes, obviously)
- It finally gets to the server running the BBC website
- The server generates the HTML for the homepage, puts it in an HTTP response, sends it back to me
- The HTML takes the reverse of the same route it took to get there, in order to get back to me
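The steps above can be sketched in miniature. This example spins up a throwaway web server on localhost and fetches a page from it using only Python's standard library - the page content and server are invented for the demo, and note that everything here travels as plain, readable text, which is exactly the problem HTTPS solves:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HomePage(BaseHTTPRequestHandler):
    """A stand-in 'BBC server' that returns a tiny HTML homepage."""

    def do_GET(self):
        body = b"<html><body>Hello from the homepage</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), HomePage)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "browser" side: send an HTTP request, get an HTTP response back.
with urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/") as resp:
    status = resp.status
    html = resp.read().decode()

server.shutdown()
print(status, html)
```

Everything in that exchange - the request line, the headers, the HTML - is plain text on the wire.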
At any point, someone can intercept the request and see its contents. They can do the same with the response. Unless it's encrypted using HTTPS, that is.
If it's not encrypted, then anyone in the chain between me and the BBC server can change the details of the request or the response. This is sometimes called a Man-in-the-Middle attack.
Encrypting the message isn't fool-proof, because the same folks between me and the BBC server can get the keys used to encrypt our messages. But they have to be watching the requests from the very beginning, which isn't easy to do.
See the comment by Russ:
Encrypting the message isn't fool-proof, because the same folks between me and the BBC server can get the keys used to encrypt our messages. But they have to be watching the requests from the very beginning, which isn't easy to do.
is this true? I was under the impression that things like TLS have both parties use their own private/public key pairs, something like diffie hellman to get a shared secret, then a KDF to derive a key that's never transmitted over the wire.
I was slightly wrong in what I'd said, as I was conflating two things together. Whilst it's not possible to intercept and decrypt TLS messages as they fly through the web, it is possible to decrypt them after the fact - given enough compute power and time.
Let's Encrypt
Regardless of how you feel about the big-name Certificate Authorities, they have made some big mistakes recently. Symantec made a number of big foul-ups, which has meant that they have exited the CA business.
Let's Encrypt is a free service which offers TLS certificates (no one has used SSL since the 90s, but the acronym has stuck) to anyone. You need to do a little work to get them to auto-renew, but your hosting provider might be able to help with that. I know that DreamHost (one of the hosting providers that I use) can help you to set up auto-renewing certificates.
Sub-Resource Integrity
Each time that you access a webpage, a bunch of HTML is sent over to your browser. That HTML (unless the website is all plain text) will include some links to external resources - things like images, style sheets, and JavaScript, usually. Before the browser can display the page, it needs to go and get these external resources ("external" here means external to the browser that you have loaded the website on).
If that external resource is JavaScript, then you need to know whether someone has managed to futz with it, either in transit (on the way to your browser) or at its origin (where you are downloading it from). That's where Sub-Resource Integrity comes in.
SRI is essentially a hash of the file contents. When you include a script (or style sheet) in your HTML page, you can tell the browser that it should expect that script to have a specific fingerprint (in this case a SHA-256 hash). If the downloaded script's hash doesn't match the SHA-256 hash, then the browser will not attempt to run it.
think of "hashing" something as taking its contents and putting them through a mathematical formula. Everything which goes through the formula should produce a different answer to everything else
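As a sketch, here's how you could compute an integrity value yourself with Python's standard library (`sri_hash` is a hypothetical helper name, not part of any library; the JavaScript snippet is just a stand-in file):

```python
import base64
import hashlib


def sri_hash(script: bytes, algo: str = "sha256") -> str:
    """Build the value for a <script integrity="..."> attribute:
    the named hash of the file contents, base64-encoded."""
    digest = hashlib.new(algo, script).digest()
    return f"{algo}-{base64.b64encode(digest).decode()}"


js = b'console.log("hello");'
print(sri_hash(js))  # e.g. "sha256-..." - changes if even one byte of the file does
```

You'd then put that value in your HTML, something like `<script src="thing.js" integrity="sha256-..." crossorigin="anonymous"></script>`, and the browser refuses to run the file if the downloaded bytes hash to anything else.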
Content Security Policy
There's an old programmers joke which goes:
The three hardest things in computer science are naming things and off-by-one errors
But that was written before CSP was invented.
Essentially, CSP is a white-list of places that a browser is allowed to load content (HTML, scripts, images, style sheets, etc.) from when loading your website. If your website (or something it loads) attempts to load something from a website not on this list, your browser will stop loading the website.
Just read that last sentence again:
If your website (or something it loads) attempts to load something from a website not on this list, your browser will stop loading the website.
That's exactly right. CSP can break your website. If you don't get it right (and there's no way that you'll get it right on the first try), then you'll stop your website (or web application) from working.
This is a server-side thing, and isn't for the faint of heart. It also requires knowledge of how your web-server and reverse proxy work. Each web-server handles this differently, so you'll need to look up how to do this yourself (unfortunately), or find a boffin ("boffin" is an old British slang term which means "a person with knowledge or a skill considered to be complex or arcane") to help you.
An example of a set of CSP rules is:
content-security-policy:
upgrade-insecure-requests;
default-src 'self';
connect-src 'self' https://cdn.jsdelivr.net https://api.unsplash.com;
script-src 'self' https://cdnjs.cloudflare.com https://code.jquery.com;
img-src 'self' https://pbs.twimg.com https://imagegen.podchaser.com https://*.unsplash.com https://casper.ghost.org https://*.gravatar.com;
style-src 'self' https://cdnjs.cloudflare.com/ https://maxcdn.bootstrapcdn.com;
frame-src 'self' https://html5-player.libsyn.com/;
font-src 'self' https://maxcdn.bootstrapcdn.com;
object-src 'none'
This tells the browser that it can only load the following things from the specified places:
- Scripts:
- the origin of the website (self)
- https://cdnjs.cloudflare.com
- https://code.jquery.com
- Images:
- the origin of the website (self)
- https://pbs.twimg.com (Twitter)
- https://imagegen.podchaser.com (Podchaser)
- https://*.unsplash.com (Unsplash)
- https://casper.ghost.org (Ghost)
- https://*.gravatar.com (Gravatar)
- etc.
It also says that the browser should upgrade-insecure-requests (which means that it should attempt to load everything over HTTPS).
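The header itself is just directives joined with semicolons, each directive being a name followed by its allowed sources. As a sketch (using a trimmed-down version of the example policy above, not a policy you should copy as-is):

```python
# Each directive maps to the list of sources the browser may load from.
directives = {
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://cdnjs.cloudflare.com", "https://code.jquery.com"],
    "object-src": ["'none'"],
}

# Join each directive's sources with spaces, and the directives with "; ".
csp = "; ".join(f"{name} {' '.join(sources)}" for name, sources in directives.items())
print(csp)
# → default-src 'self'; script-src 'self' https://cdnjs.cloudflare.com https://code.jquery.com; object-src 'none'
```

Building the string from a structure like this makes it much easier to review (and to spot the source you forgot) than hand-editing one very long header line.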
Secure Headers
These are things added at the web-server or reverse proxy level (like CSP), and are designed to make your websites and web applications more secure for the end user.
We already know a little about CSP.
HSTS tells the browser to always come to our website using HTTPS, and to never try over HTTP.
Content Type (the X-Content-Type-Options header) is a way of stopping malicious people from telling the browser that the image it has loaded is a piece of JavaScript (for example) by changing its MIME type (MIME types are ways of identifying files without having to rely on the file extension).
XSS Protection helps to stop Cross-Site Scripting attacks. This one is a little complex, but one example would be to imagine that you have a comments box where users can leave comments. A malicious user could put some JavaScript into that comment box, and that script would be run in the browser when the comment is loaded. That's JavaScript we didn't write or want to run on our website.
It's a little more complex than that, but that's the basic gist.
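Pulling those together, here's a sketch of what that set of headers might look like on a response (the values are illustrative defaults, not a drop-in policy for any particular site - tune each one to your own setup):

```python
# Illustrative secure response headers; your web-server or reverse proxy
# would attach these to every response it sends.
secure_headers = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",  # HSTS
    "X-Content-Type-Options": "nosniff",          # don't second-guess MIME types
    "X-XSS-Protection": "1; mode=block",          # legacy XSS filter hint
    "Content-Security-Policy": "default-src 'self'",  # start strict, loosen as needed
}

for name, value in secure_headers.items():
    print(f"{name}: {value}")
```

How you attach them depends on your stack - nginx, Apache, and application frameworks each have their own way of adding response headers - but the header names and value formats are the same everywhere.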
Top comments (8)
is this true? I was under the impression that things like TLS have both parties use their own private/public key pairs, something like diffie hellman to get a shared secret, then a KDF to derive a key that's never transmitted over the wire.
You're absolutely right. I'll need to think of how to change this slightly.
I wanted to somehow point out that, given enough compute time, the encryption can be brute forced after the fact. I think I'll just leave that bit out.
But I will edit this post/notes to strike through that bit.
Yeah, encryption keys are only valid for n numbers of operations; that number changes depending on the bit size of the key, but it's a pretty large number. But yeah, getting into that's kinda the nitty gritty. Even still, brute forcing even the smaller of AES keys (128 bits) takes a long time - not sure how accurate this is to today's compute, but from: eetimes.com/document.asp?doc_id=12.... in the, uh, scientific notation of years!
Thank you again for follow-up security post, Jamie.
I am trying to understand by rephrasing.
In "Single Responsibility Principle", is the rule of thumb to "black list" everyone and open up access one by one?
Did I understand it correctly?
😲 I honestly didn't know!
Effectively, yes. Think of your employer. Does everyone in the world have access to your company building? I'd suspect that only those who need to be there have access to it.
What about the server rooms? Assuming that you have on prem servers, of course. Is everyone at your work given admin access to the resources on your network? (please don't answer that one, just think about it).
You don't want to give everyone access to everything.
Yeah. The Secure Sockets Layer protocol had too many potential flaws and was replaced with Transport Layer Security. From an end user's perspective it's the same thing, though.
Thanks Jamie. Analogies did help solidify the concept 😀
You're welcome
That is a very interesting post.
I am always looking for good security-related resources, so that I can push them to the developers I am working with and try to evangelise to as many developers as I can.
I wrote some on my own, but as you said, it is such a large topic, that it is very important to rely on other resources. :-)