Cryptography for programmers 1: Basics

#security #startup #typescript #webdev

The other day I read an article about how many popular Android apps have basic cryptographic vulnerabilities in their code. The analysis was done with Columbia University's Crylogger, an open source dynamic analysis tool that detects cryptographic vulnerabilities. I was a bit surprised by the results (every analysed app broke at least one of the 26 crypto rules the authors chose), but not very much. I have seen more than one of these rules broken in projects I have participated in.

These rules do not come intuitively to people who have never learned about cryptography and do not understand the basics, and most programmers have never learned cryptography in a structured way. When the average programmer wants to implement some cryptography in their project, they do the same as they would with any other problem: they look for the solution on Stack Overflow. The problem with cryptographic code is that even if it seems to be working, that does not mean it is secure. Cryptographic code that is perfectly safe in one context can be useless in another.

Another common issue is that new projects often don't worry much about security and leave it for the future. But there are cases where it is too costly to refactor later on. If you store user passwords with an insecure hash, you cannot fix it later without making users reset their passwords or handling two different types of hash, so the mistake stays. Storing the passwords securely from the beginning, however, is very easy and would not have taken any extra time.
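To give an idea of how little code secure password storage takes, here is a minimal sketch using the scrypt function from Node's built-in crypto module. The helper names, the `salt:hash` storage format and the lengths are my own assumptions for illustration, not a recommendation to copy as-is; password handling gets a proper treatment in part 3.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Illustrative sizes (assumptions): 16-byte salt, 64-byte derived hash.
const SALT_LENGTH = 16;
const HASH_LENGTH = 64;

// Hypothetical helper: returns "saltHex:hashHex" to store next to the user record.
function hashPassword(password: string): string {
  const salt = randomBytes(SALT_LENGTH); // fresh random salt for every password
  const hash = scryptSync(password, salt, HASH_LENGTH);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

// Hypothetical helper: recomputes the hash with the stored salt and compares.
function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  if (!saltHex || !hashHex) return false;
  const expected = Buffer.from(hashHex, "hex");
  if (expected.length !== HASH_LENGTH) return false; // reject malformed records
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), HASH_LENGTH);
  return timingSafeEqual(actual, expected); // constant-time comparison
}
```

Because the salt is random, hashing the same password twice gives two different stored strings, and both still verify.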

The aim of this series is to teach programmers the basics of cryptography, so that their projects do not become part of these statistics. Cryptography can seem intimidating to get into. My aim here is not to teach the specifics of cryptographic algorithms, or to go into the details and nuances of academic crypto. My aim is to make it as practical as possible, while giving an idea of what the best practices are and which mistakes to avoid. For those who want to dive deeper into the whys of what I say, I will link to sources that cover the specifics.

I will be providing code examples in TypeScript / Node for some of the sections, but I don't recommend using the code as-is without reading the entire series, understanding the code, and thinking about whether it really fits your problem. I will also be providing incomplete and incorrect code that will be improved in later sections of the series. By the end of the series, readers should be able to understand and apply the 26 crypto rules chosen by Columbia for Crylogger.

The series will be organized in 4 parts:

In part 1 (this one) I will give some basic general rules for writing secure cryptography code.

In part 2 I will talk about block cryptography, the kind of cryptography that is most used for encryption/decryption of data. And we will discuss secure ways of generating cryptographic keys and randomness.

In part 3 I will talk about authentication, password and login management, and we will put it all together to implement a login with JWT.

In part 4 I will talk about public key cryptography and the basics of internet protocols (SSL/TLS, SSH, ...).

The basic principles of cryptography

1. Avoid unnecessary complexity

Cryptographic systems exist inside a bigger system; they are not isolated components. The more complex a system is, the more likely it is that there is a vulnerability somewhere, and your system is only as secure as its most insecure component. Let's say you are very proud because you are using state-of-the-art cryptographic algorithms, but the way you generate or store the key is insecure. Then, however good the cryptography is, an attacker can easily get the key and trivially decrypt all your data. The more parts of your code that require security, and the more complex they are, the more likely it is that someone, at some point, will mess it up.
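As a rough illustration of the weak-link point above, here is a sketch contrasting two ways of producing a 256-bit key in Node. Both function names are hypothetical and made up for this example; the first one is deliberately wrong. Key generation is covered properly in part 2.

```typescript
import { randomBytes } from "node:crypto";

// INSECURE (shown only as a counter-example): Math.random() is not a
// cryptographically secure generator, so its output may be predictable.
function insecureKey(): Buffer {
  const bytes = new Uint8Array(32);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = Math.floor(Math.random() * 256);
  }
  return Buffer.from(bytes);
}

// Better: 32 bytes (256 bits) from the operating system's secure random source.
function generateKey(): Buffer {
  return randomBytes(32);
}
```

Both functions return 32 bytes that look equally random to the eye, which is exactly why this kind of mistake goes unnoticed: the strongest cipher does not help if the key can be guessed.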

2. Kerckhoffs's principle: Only the key should be a secret

In cryptographic security, only the encryption / decryption key (or the private key in asymmetric crypto) should be secret. This means the system should be designed so that it stays secure even if an attacker knows everything about it except the key. The thing most commonly kept secret, in the hope that it will increase security, is the code. Some people go as far as using obscure or unusual encryption algorithms, so that an attacker will not know how the data is encrypted. The reality is that the code of a project is not a secret, since every programmer on your team has it. If the security of your system depends on the secrecy of the code, you are potentially giving too much power to too many people.

The right way to do it is to keep the key secret, in a way that only a very select trusted group of employees can access it (and ideally they can't read or copy it, only use it), and that is the only secret that your system security should rely on. It is not that unusual for employees of companies to be offered money by hackers in exchange for information or access.
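One simple way to keep the key out of the code is to load it from the environment at startup, so the repository can be read by anyone without revealing the secret. The sketch below assumes a hypothetical environment variable `APP_ENCRYPTION_KEY` holding a 32-byte key as 64 hex characters; both the name and the format are assumptions of mine. Note that an environment variable can still be read by whoever has access to the server, so it does not meet the "use but not read" ideal described above, which would need a dedicated key management service or hardware module.

```typescript
// Hypothetical variable name (assumption); the key is expected as 64 hex characters.
const KEY_ENV_VAR = "APP_ENCRYPTION_KEY";

// Reads the key from the given environment and fails loudly if it is
// missing or malformed, instead of silently falling back to a default.
function loadKey(env: Record<string, string | undefined> = process.env): Buffer {
  const hex = env[KEY_ENV_VAR];
  if (hex === undefined || !/^[0-9a-fA-F]{64}$/.test(hex)) {
    throw new Error(`${KEY_ENV_VAR} must be set to 64 hex characters (32 bytes)`);
  }
  return Buffer.from(hex, "hex");
}
```

The design choice here is that nothing in the code is secret: the algorithm, the variable name and the expected key format can all be public, and the system is no weaker for it.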

3. Don't roll your own crypto

This is likely the most famous rule of cryptography. It means that you should never program encryption algorithms yourself, let alone create new encryption algorithms custom-made for your project. Inventing your own cryptography would be like building your own plane engine. Plane engines have undergone a long testing process, and they have gone through their ups and downs (hehe). In the same way, trusted cryptographic algorithms have survived the test of time for many years. Even algorithms created by cryptography experts need to undergo intense scrutiny, and survive it for years, before they are used and trusted.

This also applies to different implementations of the same algorithm. Even if you have read about RSA on the internet and you understand the maths behind it, that does not mean that your implementation will be safe. The popular cryptography libraries (such as OpenSSL) have also undergone the test of time, and have been updated to fix the bugs that were found in them, many of which you would never think about when implementing the algorithm yourself. Even when an algorithm is theoretically safe, the way you implement it can leak information. These leaks can be used in what we call side-channel attacks, which do not target the theoretical security of the algorithm, but things such as the time it takes to perform operations, the sound the computer makes, or the energy it consumes.
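A small example of a timing side channel is comparing a secret token character by character. The sketch below contrasts a naive comparison with one built on `timingSafeEqual` from Node's crypto module. The function names are hypothetical, and hashing both inputs first (so that they have equal length) is just one possible approach that I am assuming here.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Leaky: returns at the first differing character, so the response time
// hints at how many leading characters of a guess were correct.
function naiveEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Safer sketch: hash both values to a fixed length, then compare the
// digests in constant time with the library function.
function safeEqual(a: string, b: string): boolean {
  const digestA = createHash("sha256").update(a).digest();
  const digestB = createHash("sha256").update(b).digest();
  return timingSafeEqual(digestA, digestB);
}
```

Both functions give the same answers; the difference is only in how long they take to give them, which is the kind of detail a homemade implementation tends to miss.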

The job of the programmer when implementing cryptographic code is to use secure, up-to-date libraries (and to keep them updated), and to provide the right parameters to the right algorithms for the job. In the next parts we will discuss what exactly we mean by the right parameters. We will see that even when using secure implementations of secure algorithms, we can still mess up if we don't know what we are doing. Most of the 26 rules checked in the study above deal with passing insecure parameters to crypto libraries and using out-of-date algorithms.
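To make "the right parameters" a bit more concrete, here is a sketch of encrypting and decrypting a string with Node's built-in crypto module, using AES-256-GCM with a fresh random IV per message. The choice of algorithm, the function names and the `iv + tag + ciphertext` layout are assumptions of mine for illustration; the following parts explain the reasoning behind choices like these.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const ALGORITHM = "aes-256-gcm"; // authenticated encryption, needs a 32-byte key
const IV_LENGTH = 12; // bytes; must never repeat for the same key
const TAG_LENGTH = 16; // bytes of authentication tag

// Hypothetical helper: returns iv + tag + ciphertext in one buffer.
function encrypt(key: Buffer, plaintext: string): Buffer {
  const iv = randomBytes(IV_LENGTH); // new random IV for every message
  const cipher = createCipheriv(ALGORITHM, key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Hypothetical helper: throws if the key is wrong or the message was modified.
function decrypt(key: Buffer, message: Buffer): string {
  const iv = message.subarray(0, IV_LENGTH);
  const tag = message.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = message.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv(ALGORITHM, key, iv, { authTagLength: TAG_LENGTH });
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

The library does all the hard work here. The programmer's decisions are the algorithm name, the key size, and how the IV is produced, and those are precisely the places where things go wrong.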

Let's learn how to not do that!

Top comments (10)

Graham • Sep 18 '20

I'm looking forward to the remaining parts! Thanks for this one!

Sergi Canal • Sep 23 '20

Thanks Graham, I am glad that people are liking it, since it is my first post! I have spent some time on the second part, and I am quite proud of it. Check it out! 😉

Graham • Sep 23 '20

Woo! Here I go :D

Rafael Ribeiro • Sep 23 '20

This is exactly the kind of post that I come here for. Thanks a lot for your effort, and I look forward to reading the next parts.

Sergi Canal • Sep 23 '20

Thank you Rafael, I appreciate it. You can check out the next part which is out now! I am quite proud of it 😄

Junxiao Shi • Sep 24 '20

I was guilty of R-10 and R-19, but have fixed them.
I still have R-04 and R-21, but that is out of my hands because the network protocol I'm working with is defined that way.