This guide is for people who want to learn about SSL and TLS. These are protocols that guard the contents of an HTTP session from third parties and eavesdroppers. The combination of TLS (or SSL) and HTTP creates an HTTPS connection. But in order to understand this you need to remember what a socket is.
Sockets are widely known in computer science and the methods for sockets are named the same in almost all programming languages. As such they usually have methods called
recv(). The image below shows how they are called:
When I use the term "server" in this post, I only mean the computer that called listen() on a socket. It does not have to be a web server or HTTP server.
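To make the socket calls concrete, here is a minimal sketch in Python. It assumes Python's standard socket module (other languages use very similar names), and the helper name echo_once is made up for this example. The "server" is just the side that calls listen() and accept(); the "client" connects to it and gets its own bytes echoed back.

```python
import socket
import threading


def echo_once(message: bytes) -> bytes:
    """Send `message` to a tiny local server and return what it echoes back."""
    # Server side: socket() -> bind() -> listen() -> accept().
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _addr = server.accept()  # blocks until a client connects
        with conn:
            conn.sendall(conn.recv(1024))  # echo the received bytes back

    worker = threading.Thread(target=serve)
    worker.start()

    # Client side: socket() -> connect() -> send() -> recv().
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply


if __name__ == "__main__":
    print(echo_once(b"hello"))
```

Nothing here is encrypted. TLS sits between calls like these and the network, so the same bytes travel in encrypted form.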
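TLS can also be used to secure other protocols such as SMTP, IMAP and FTP. (SSH is a notable exception: it has its own encryption layer and does not use TLS.) In fact, any protocol that runs over a socket can theoretically be encrypted with TLS, since TLS only involves signed certificates and does not care which protocol it carries.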
SSL is the old version and TLS is the new version. This timeline shows the progression from the original SSL standard to the newest version of the standard, TLS 1.3:
Despite the fact that no SSL version has been supported since 2015, the name "SSL" is commonly used to refer to TLS. But in this post, SSL will refer to the original version of SSL.
In the case of websites and HTTPS, when a protocol is deprecated, browsers will not use the deprecated protocol anymore.
For public key cryptography (the mathematics behind the SSL/TLS protocols and their encryption), there exists an entire class of standards collectively called PKCS (Public-Key Cryptography Standards). There are several standards in this family:
- PKCS#1 - RSA public and private keys used in SSL/TLS.
- PKCS#3 - Diffie-Hellman key exchange.
- PKCS#7 - Describes the binary syntax that certificates are encoded into.
- PKCS#8 - Describes the private key format. They're in the PEM format which looks like this:
-----BEGIN PRIVATE KEY-----
MIIBVgIBADANBgkqhkiG9w0BAQEFAASCAUAwggE8AgEAAkEAq7BFUpkGp3+LQmlQ
Yx2eqzDV+xeG8kx/sQFV18S5JhzGeIJNA72wSeukEPojtqUyX2J0CciPBh7eqclQ
2zpAswIDAQABAkAgisq4+zRdrzkwH1ITV1vpytnkO/NiHcnePQiOW0VUybPyHoGM
/jf75C5xET7ZQpBe5kx5VHsPZj0CBb3b+wSRAiEA2mPWCBytosIU/ODRfq6EiV04
lt6waE7I2uSPqIC20LcCIQDJQYIHQII+3YaPqyhGgqMexuuuGx+lDKD6/Fu/JwPb
5QIhAKthiYcYKlL9h8bjDsQhZDUACPasjzdsDEdq8inDyLOFAiEAmCr/tZwA3qeA
ZoBzI10DGPIuoKXBd3nk/eBxPkaxlEECIQCNymjsoI7GldtujVnr1qT+3yedLfHK
srDVjIT3LsvTqw==
-----END PRIVATE KEY-----
The private key can also be password-protected which would make it look like this:
-----BEGIN ENCRYPTED PRIVATE KEY-----
MIIBrzBJBgkqhkiG9w0BBQ0wPDAbBgkqhkiG9w0BBQwwDgQImQO8S8BJYNACAggA
MB0GCWCGSAFlAwQBKgQQ398SY1Y6moXTJCO0PSahKgSCAWDeobyqIkAb9XmxjMmi
hABtlIJBsybBymdIrtPjtRBTmz+ga40KFNfKgTrtHO/3qf0wSHpWmKlQotRh6Ufk
0VBh4QjbcNFQLzqJqblW4E3v853PK1G4OpQNpFLDLaPZLIyzxWOom9c9GXNm+ddG
LbdeQRsPoolIdL61lYB505K/SXJCpemb1RCHO/dzsp/kRyLMQNsWiaJABkSyskcr
eDJBZWOGQ/WJKl1CMHC8XgjqvmpXXas47G5sMSgFs+NUqVSkMSrsWMa+XkH/oT/x
P8ze1v0RDu0AIqaxdZhZ389h09BKFvCAFnLKK0tadIRkZHtNahVWnFUks5EP3C1k
2cQQtWBkaZnRrEkB3H0/ty++WB0owHe7Pd9GKSnTMIo8gmQzT2dfZP3+flUFHTBs
RZ9L8UWp2zt5hNDtc82hyNs70SETaSsaiygYNbBGlVAWVR9Mp8SMNYr1kdeGRgc3
7r5E
-----END ENCRYPTED PRIVATE KEY-----
(This is why AWS and other cloud computing services give you private keys in a file with a
A PEM file just has -----BEGIN followed by a label and then -----. Then it has a base64 encoded message based on the label, followed by -----END, the label and -----. The label could be a public key, a private key, certificate or other cryptographic object.
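Since the PEM layout is so regular, it only takes a few lines to pull one apart. The sketch below uses only Python's standard library. The parse_pem helper and the "DEMO DATA" label are made up for illustration, and a real parser would be stricter about which characters and labels it accepts.

```python
import base64
import re
from typing import Tuple

# -----BEGIN <label>-----, base64 body, -----END <same label>-----
PEM_PATTERN = re.compile(
    r"-----BEGIN (?P<label>[A-Z0-9 ]+)-----\s*"
    r"(?P<body>[A-Za-z0-9+/=\s]+?)\s*"
    r"-----END (?P=label)-----"
)


def parse_pem(text: str) -> Tuple[str, bytes]:
    """Return (label, decoded bytes) for the first PEM block in `text`."""
    match = PEM_PATTERN.search(text)
    if match is None:
        raise ValueError("no PEM block found")
    # The body is base64 wrapped over several lines; remove the whitespace.
    body = "".join(match.group("body").split())
    return match.group("label"), base64.b64decode(body)


# A made-up PEM block whose payload is plain text instead of a key.
demo = (
    "-----BEGIN DEMO DATA-----\n"
    "aGVsbG8g\n"
    "d29ybGQ=\n"
    "-----END DEMO DATA-----\n"
)

if __name__ == "__main__":
    print(parse_pem(demo))
```

For a real private key or certificate, the decoded bytes are a binary (DER) structure, not readable text.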
- PKCS#10 - Defines the message that you send to the CA to request a certificate.
- PKCS#12 - Describes how to bundle the private key, certificate and other security resources into a file called a SafeBag. SafeBags have the
Certificates are electronic documents that are used to improve security of SSL/TLS by proving that you own the public key inside the certificate. Public and private keys are also used in SSL/TLS. Certificates are transmitted in a chain. The parent certificate is transmitted after the child, and then a few higher certificates might come next until a root certificate is transmitted that ends the chain.
Root certificates are certificates which are at the end (top) of the chain. They are unquestionably trusted. Some root certificates for HTTPS are included in browsers.
Each certificate has a key pair associated with it. A key pair consists of a public key and a private key, and it proves that you really are who you claim to be. This is better than using a password because these days it is feasible to brute-force weak or short passwords, whereas keys are long, fixed-length sequences of random-looking bytes.
Each certificate in the chain is in the X.509 file format. This is what you will find in an X.509 certificate:
First the certificate itself:
- Version Number
- Serial Number
- Signature Algorithm ID
- Issuer Name
- Validity period
  - Not Before
  - Not After
- Subject name
- Subject Public Key Info
  - Public Key Algorithm
  - Subject Public Key
- Certificate Key Usage
- Certificate Policy
- Certificate Basic Constraints
- etc. (more extensions here)
Then the certificate signature algorithm followed by the signature.
A certification authority (CA) issues certificates to people or organizations. When a certificate is issued, that means it has been signed by the CA. CAs are grouped together in a logical unit called a public key infrastructure (PKI). Usually each company makes its own PKI. Unless the company has the right to issue certificates to third parties, the certificates issued from a company's PKI are usually for its own internal use.
The CA will revoke a certificate if it finds out that the certificate's private key has been stolen (e.g. the customer told them that). The X.509 standard defines a special blacklist called a certificate revocation list (CRL), a list of certificates that have been revoked.
Usually certificates also have a certificate policy (CP). A CP is a document that describes the conventions that the PKI adheres to when issuing or managing a certificate. This document (Digicert) is what a CP would look like in real life.
When two computers make a connection (usually after one computer calls accept() on a socket that it is listen()ing on), they run a standard procedure called a handshake. The whole process typically takes a few hundred milliseconds.⚑
First the client sends a ClientHello message. The ClientHello message includes the TLS version to use, the list of cipher suites (encryption algorithms specifically for TLS) that are supported, and a string of random bytes called the client random.
Then the server sends a ServerHello message. It contains the server's TLS certificate, a cipher suite from the client's list that the server wants to use, and another random string of bytes called the server random.
The client checks if the certificate can be trusted. In the case of web browsers, they have a list of root certificates' public keys that can be trusted. (See note about self-signed certificates below.)⚑⚑
The client gets the server's public key from the certificate. It creates a premaster key and encrypts it with the server's public key, so that only the server's private key can decrypt it. When the server receives the encrypted premaster key, it decrypts it. Now both computers have the premaster key.
Both the client and the server generate the same session key using the client random, server random and premaster key. The generated session keys must be identical within the same TLS session. If the two sides end up with different session keys, the handshake fails, which can mean that someone has tampered with the TLS session.
Session keys are used to encrypt and decrypt the data that is transferred with recv(). The session key can only be used during the lifetime of the socket connection (which is, in the case of HTTP, the HTTP session). When the connection is closed, the session key is invalid and cannot be used anymore.
The client sends a finished message encrypted with the session key. Then the server sends a finished message too, encrypted with the session key.
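The key agreement steps above can be sketched in a few lines of Python. This is a toy under loud assumptions: the RSA key pair uses tiny textbook primes (p=61, q=53) with no padding, the premaster value is only 2 bytes instead of the 48 bytes TLS uses, and the names handshake_demo and derive_master_secret are made up for this example. The derivation follows the PRF that TLS 1.2 defines in RFC 5246, where the 48-byte result is called the master secret; the real protocol expands it further into the actual encryption keys.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Toy RSA key pair built from p=61 and q=53. Far too small for real use.
N, E, D = 3233, 17, 2753


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_SHA256 expansion used by the TLS 1.2 PRF (RFC 5246, section 5)."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def derive_master_secret(premaster: bytes, client_random: bytes,
                         server_random: bytes) -> bytes:
    """Mix the premaster key with both randoms into a 48-byte secret."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(premaster, seed, 48)


def handshake_demo() -> Tuple[bytes, bytes]:
    # ClientHello and ServerHello each carry 32 random bytes.
    client_random = os.urandom(32)
    server_random = os.urandom(32)

    # Client: create a premaster value, encrypt it with the public key (N, E).
    premaster = int.from_bytes(os.urandom(2), "big") % N
    encrypted = pow(premaster, E, N)

    # Server: decrypt it with the private exponent D.
    decrypted = pow(encrypted, D, N)

    # Both sides derive the key independently from the same three inputs.
    client_key = derive_master_secret(
        premaster.to_bytes(2, "big"), client_random, server_random)
    server_key = derive_master_secret(
        decrypted.to_bytes(2, "big"), client_random, server_random)
    return client_key, server_key


if __name__ == "__main__":
    client_key, server_key = handshake_demo()
    print(client_key == server_key)
```

Because the two randoms change on every connection, the derived key is different for every session even if the same premaster value were reused.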
If any of these steps fail (including the validation algorithm below) then the SSL/TLS connection must not be considered secure. Web browsers would typically abort the connection.
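In practice you rarely implement the handshake yourself. As a sketch of what "abort on failure" looks like in code, this is roughly how a client could use Python's standard ssl module. The function names are my own, and the minimum version of TLS 1.2 is just an assumption for the example.

```python
import socket
import ssl
from typing import Optional


def make_client_context() -> ssl.SSLContext:
    """Build a context that verifies certificates and rejects old protocols."""
    # The default context loads the system's root certificates and turns on
    # certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse SSL and early TLS
    return ctx


def negotiated_version(host: str, port: int = 443) -> Optional[str]:
    """Connect to `host`, run the TLS handshake, and report the version used."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        # wrap_socket() runs the handshake. If certificate validation fails,
        # it raises ssl.SSLCertVerificationError and no data is exchanged.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

Calling negotiated_version("example.com") needs network access, so treat the host name as a placeholder.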
At this point, the data that is sent or received is combined with the session key using a hashing algorithm to generate a message authentication code (MAC). Its purpose is to ensure the integrity of the data being transmitted. The sender attaches the MAC to the data, and the receiver recomputes it from the session key and the data it received. If the two MACs don't match, that means the data has been tampered with in transit.
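Here is a minimal sketch of the MAC idea using HMAC-SHA256 from Python's standard library. The helper names are made up. Real TLS uses dedicated MAC keys and also mixes in a sequence number, and newer cipher suites build the integrity check into the encryption itself, so this only shows the principle.

```python
import hashlib
import hmac


def make_mac(key: bytes, data: bytes) -> bytes:
    """Combine the key and the data into a fixed-size authentication tag."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_mac(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_mac(key, data), tag)


if __name__ == "__main__":
    key = b"k" * 32  # stand-in for a key derived during the handshake
    tag = make_mac(key, b"GET / HTTP/1.1")
    print(verify_mac(key, b"GET / HTTP/1.1", tag))       # untouched data
    print(verify_mac(key, b"GET /admin HTTP/1.1", tag))  # tampered data
```

An attacker who changes the data cannot produce a matching tag without knowing the key.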
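⚑In case you are wondering: under the hood, the key exchange uses either RSA (the variant described here, where the client encrypts the premaster key with the server's public key) or Diffie-Hellman, depending on the cipher suite. TLS 1.3 only supports Diffie-Hellman-based key exchange.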
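For the curious, this is the Diffie-Hellman idea as a toy sketch. The prime 23 and generator 5 are classroom-sized values that I picked for illustration. Real TLS uses very large groups or elliptic curves.

```python
import secrets
from typing import Tuple

P = 23  # toy prime modulus, far too small for real use
G = 5   # generator


def dh_keypair() -> Tuple[int, int]:
    """Pick a random private exponent and compute the matching public value."""
    private = secrets.randbelow(P - 2) + 1  # in the range 1..P-2
    public = pow(G, private, P)
    return private, public


def dh_shared(private: int, peer_public: int) -> int:
    """Combine my private exponent with the other side's public value."""
    return pow(peer_public, private, P)


if __name__ == "__main__":
    client_private, client_public = dh_keypair()
    server_private, server_public = dh_keypair()
    # Only the public values cross the network, yet both sides agree.
    print(dh_shared(client_private, server_public)
          == dh_shared(server_private, client_public))
```

Unlike the RSA variant, no secret is ever sent over the wire, not even in encrypted form.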
⚑⚑This is the certificate validation algorithm:
- First the public key algorithm is checked.
- The current date and time is tested against the certificate's validity period.
- The revocation status is checked using a certificate revocation list from the browser or some other place.
- The issuer name on the certificate is checked to make sure it matches the subject name on the higher certificate (if any).
- Checks that the subject name is in the permitted subtrees list of all higher certificates and that it's not in any of their excluded subtrees lists.
- Checks that all of the certificate's policies are in the permitted policies of the higher certificate.
- Makes sure that this certificate's policies are being used properly, in other words, only make SSL connections with certificates that have the SSL policy.
- The certificate's path length is checked to make sure it's not longer than a maximum path length specified in this or a higher certificate.
- The key usage extension is checked to make sure it's allowed to sign certificates.
- Extensions are validated if there are any.
- All of these steps are repeated for the next highest certificate in the chain until it reaches a root certificate. Root certificates are not validated and are automatically trusted.
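A few of the checks above can be sketched with a simplified model. Everything here is an assumption made for illustration: the Cert class is a stand-in rather than a real X.509 parser, roots are trusted by name instead of by public key, and signature, policy and name constraint checks are left out completely.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Set


@dataclass
class Cert:
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime
    serial: int
    is_ca: bool = False                 # allowed to sign other certificates
    max_path_len: Optional[int] = None  # intermediates allowed below this CA


def validate_chain(chain: List[Cert], trusted_roots: Set[str],
                   revoked: Set[int], now: datetime) -> bool:
    """chain[0] is the leaf; each following cert issued the one before it."""
    for depth, cert in enumerate(chain):
        if cert.subject in trusted_roots:
            return True   # reached a trusted root: it is not validated further
        if not (cert.not_before <= now <= cert.not_after):
            return False  # outside the validity period
        if cert.serial in revoked:
            return False  # listed on the revocation list
        if depth + 1 >= len(chain):
            return False  # chain ended without reaching a trusted root
        parent = chain[depth + 1]
        if cert.issuer != parent.subject:
            return False  # issuer name must match the parent's subject name
        if not parent.is_ca:
            return False  # parent is not allowed to sign certificates
        if parent.max_path_len is not None and depth > parent.max_path_len:
            return False  # too many intermediates below this CA
    return False


if __name__ == "__main__":
    start = datetime(2020, 1, 1, tzinfo=timezone.utc)
    end = datetime(2030, 1, 1, tzinfo=timezone.utc)
    root = Cert("Root CA", "Root CA", start, end, 1, is_ca=True)
    mid = Cert("Mid CA", "Root CA", start, end, 2, is_ca=True, max_path_len=0)
    leaf = Cert("example.com", "Mid CA", start, end, 3)
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(validate_chain([leaf, mid, root], {"Root CA"}, set(), now))
```

The loop mirrors the last step of the list: the same checks run on each certificate until a trusted root is reached.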
In Chrome or Firefox or any other browser it's just a matter of clicking the padlock on the URL and then clicking on "More Information" or "Certificate". Then you should see a dialog that looks like this:
This is a graphical browser of the certificate's properties. If you go to "Details" you can see all of the certificate fields in every certificate in the chain:
These are some of the types of certificates you can get:
Domain Validated (DV) - Prove you control the domain. This usually means the CA will do a WHOIS search on your domain.
This is the only certificate type for which installation onto a server can be automated. The other types require manual ID verification and cannot be automated.
Organization Validated (OV) - Prove that you are a legal organization who owns the domain. Requires submitting legal documents and a manual check by a human. This is not available for individuals.
Extended Validation (EV) - Prove that you are a real person or organization who owns the domain. Requires submitting legal documents and a manual check by a human. Also requires your location as the CA needs to make sure you are physically present to control the domain.
Usually to request a certificate you have to fill in a certificate signing request (CSR), create a key pair for it and sign it with the private key. Each CA requires different information to be sent in the CSR depending on the certificate you are requesting. Just to see the kind of information that would be requested, here is the information that Digicert requires.
You also need to read the certification practice statement (CPS) each CA has that details its particular procedure for issuing and managing certificates. For example, this is Digicert's CPS.
The good news is if you're just trying to get a TLS certificate for your website, you can get free DV certificates from Let's Encrypt or Cloudflare.
You could also create a certificate and sign it yourself using the openssl program, but I don't recommend this because the major browsers will alert the users of your site that you self-signed your certificate, which means your users will think your site is not trustworthy. That warning does not mean a self-signed certificate can never be trusted.
Contrary to popular belief, a self-signed certificate can be made secure, as this Stack Overflow answer explains:
It's worth noting that in addition to purchasing a certificate (as mentioned above), you can also create your own for free; this is referred to as a "self-signed certificate". The difference between a self-signed certificate and one that's purchased is simple: the purchased one has been signed by a Certificate…
Since all of these acronyms can be difficult to remember, especially if you are reading this for the first time, I have compiled a list of SSL/TLS and certificate acronyms that I used in this post:
|PKI||public key infrastructure|
|CRL||certificate revocation list|
|MAC||message authentication code|
|CSR||certificate signing request|
|CPS||certification practice statement|
That's all folks. If you spot an error here let me know so I can correct it.