
Lucas Rafaldini


Back to Basics #1 - The HTTP Protocol

To kick off this blog, I decided to write a series of posts on topics that are basic but essential for beginning developers. As a first topic, I thought it would be cool to talk about the famous HTTP, which everyone uses and few people understand. So, let's dive into the guts of the protocol.

What is a protocol?

Do you remember being invited to a party and finding out it was a costume party? Do you remember someone saying that, for instance, a judge or a POTUS should not use bad words (at least, not in public)? This is what we call "decorum" and, roughly speaking, we can say that it is a social protocol. A protocol is basically something that everyone agrees on, often without saying anything explicitly, so everyone understands that things must happen one way and only this way, or they will not happen at all.

Bringing this concept to the Web, HTTP (short for Hypertext Transfer Protocol) is an agreement that establishes a decorum for the transfer of data between Clients and Servers, so that no one speaks a different "language" when exchanging information.

How does it work?

HTTP is an application layer protocol of the TCP/IP stack. That may sound unintelligible to a beginner, but here we go: the protocol works on top of the Client-Server model, in which a Client (a computer, a cell phone, a smartwatch, etc.) requests resources from a Server (well, I don't have many analogies for that, but for now we'll say it's a computer that stores files in general). When you access a website, you are asking a server for one or more files that make up that website; the server checks whether your request is valid and, if it is, sends these files to you; so, you finally see the site on your screen. The page usually arrives as an HTML file (short for HyperText Markup Language), which your browser renders.
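To make the exchange concrete, here is a minimal Python sketch of one request and one response travelling as plain text. It is only an illustration under assumptions: the Client and the Server are two ends of a local socket pair inside the same program, and the names `serve_once`, `demo_exchange` and the host `example.test` are hypothetical.

```python
import socket


def serve_once(server_sock):
    """Hypothetical Server: read one request and answer with a tiny HTML page."""
    request = server_sock.recv(4096).decode("ascii")
    body = "<h1>Hello</h1>"
    response = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html; charset=UTF-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
        f"{body}"
    )
    server_sock.sendall(response.encode("ascii"))
    return request


def demo_exchange():
    """Send one GET from the Client end and return (request, response) as text."""
    # socketpair() gives two connected sockets, standing in for a real network.
    client_sock, server_sock = socket.socketpair()
    try:
        client_sock.sendall(
            b"GET / HTTP/1.1\r\n"
            b"Host: example.test\r\n"
            b"Connection: close\r\n"
            b"\r\n"
        )
        request = serve_once(server_sock)
        response = client_sock.recv(4096).decode("ascii")
        return request, response
    finally:
        client_sock.close()
        server_sock.close()


if __name__ == "__main__":
    sent, received = demo_exchange()
    print(sent)
    print(received)
```

A real browser and web server do the same thing over a TCP connection, with much more care for large messages, errors and timeouts.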

Request Types

The first line of every request identifies its type, also called its method. The main methods are:

  • GET: Request to get data from the server, whether that data is a page, an image or the result of some action taken by the Client;

  • HEAD: Similar to GET, but the Server returns only the response headers to the Client, without the body;

  • POST: This type of request can be understood as the opposite of GET. The Client sends data to the server, be it login data, image uploads or even a message on a service like WhatsApp, for instance;

  • PUT: Similar to POST, with the difference that it is used when the Client wants to replace a record that already exists at a known address, such as when you change your username on a social network. The Client sends the complete new version of the record, which overwrites the old one. When only part of a record should change, so that the Client does not need to send all your data again, the PATCH method is used instead, carrying only the new name;

  • DELETE: Request to delete some record on the server.
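As a rough illustration of how a Server might treat each method, the sketch below applies them to an in-memory dictionary of records. Everything in it is hypothetical (the `handle` function, the record layout, the choice of status codes), and it models PUT as replacing the whole record, which is how the HTTP specification defines that method.

```python
SUPPORTED = ("GET", "HEAD", "POST", "PUT", "DELETE")


def handle(store, method, record_id=None, data=None):
    """Apply one request to `store` and return (status_code, body)."""
    if method not in SUPPORTED:
        return 405, None  # Method Not Allowed
    if method == "POST":
        # POST creates a new record; the Server picks its id.
        new_id = max(store, default=0) + 1
        store[new_id] = data
        return 201, new_id
    if record_id not in store:
        return 404, None  # Not Found
    if method == "GET":
        return 200, store[record_id]
    if method == "HEAD":
        return 200, None  # same status as GET, but no body
    if method == "PUT":
        store[record_id] = data  # replace the whole record
        return 200, data
    # Only DELETE is left at this point.
    del store[record_id]
    return 204, None


if __name__ == "__main__":
    records = {}
    print(handle(records, "POST", data={"name": "ana"}))
    print(handle(records, "PUT", 1, {"name": "bia"}))
    print(handle(records, "GET", 1))
    print(handle(records, "DELETE", 1))
```

A real server would also let PUT create a record that does not exist yet; the sketch leaves that out to stay short.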

Request Headers

Every request, therefore, has a type, and that type is specified on the first line of its header. An example of a header would be:

GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: pt-BR,pt;q=0.8,en-US;q=0.5,en;q=0.3
Connection: close
Cookie:

The first line of this header identifies the type (method) of the request. As we can see, it is a GET. Still on the first line, we can see the path inside the Server (Host) to which the request is being sent. In this case, the Client wants to access the root page, "/". The third and last item on the first line tells the Server which version of the HTTP protocol is being used, in this case HTTP/1.1.
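The three parts of that first line can be pulled apart with a few lines of Python. This is just a sketch; `parse_request_line` is a hypothetical helper name, not part of any library.

```python
def parse_request_line(line):
    """Split a request line such as "GET / HTTP/1.1" into its three parts."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"not a valid request line: {line!r}")
    method, path, version = parts
    return method, path, version


if __name__ == "__main__":
    method, path, version = parse_request_line("GET / HTTP/1.1")
    print(method)   # GET
    print(path)     # /
    print(version)  # HTTP/1.1
```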

The rest of the header content can vary. Each line always follows the structure Name: Value.

The most common data sent in a header are:

  • Host: Domain name of the server, that is, the domain to which you are trying to send your request, such as www.facebook.com or www.google.com;

  • User-Agent: This piece of data is used to identify to the Server (Host) who is making the request, containing data such as the Client's browser, version and operating system;

  • Accept: Tells the Server (Host) what types of data the Client is able to understand. This is essential for the Server to be able to respond to the Client's request in a way that it understands. Examples of this data are text/plain and application/json;

  • Content-Type: This data is similar to the aforementioned Accept. The difference is that in this case we are identifying what type of data is being sent in the body of the request. The text/plain and application/json examples also apply;

  • Connection: Defines whether or not the connection to the Server (Host) should be kept open for future requests. When the connection is maintained, keep-alive is sent; when not, close;

  • Cookie: Cookies help maintain the user session and identify the Client. As HTTP is a stateless protocol, it cannot keep a "state" of the Client between requests. So, after logging in to a website, you would need to log in again for every task that requires a logged-in user. To spare the Client this redundant process, we use Cookies. Roughly speaking, a Cookie is nothing more than a small name=value string that the Server hands to the Client and the Client sends back with each request. It usually carries a session identifier that lets the Server look up who you are; sensitive data such as your password should not be stored in it;

  • Referer: This data provides to the Server (Host) the address from which the request is originated, which can be used for tracking and analyzing usage of a website, for instance;

  • Origin: This data is quite similar to Referer, but it carries a simplified version of where the request comes from. While Referer usually sends the full URL, including the path (such as https://www.example.com/home/login), Origin sends only the scheme, host and port, something like https://www.example.com.
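Because every header line follows the same Name: Value shape, reading them into a dictionary takes little code. The sketch below uses a hypothetical `parse_headers` helper and lowercases the names, since header names are case-insensitive in HTTP.

```python
def parse_headers(lines):
    """Turn "Name: Value" lines into a dict keyed by lowercase header name."""
    headers = {}
    for line in lines:
        # Split on the first colon only: values may contain colons too.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


if __name__ == "__main__":
    example = [
        "Host: www.google.com",
        "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0",
        "Connection: close",
    ]
    parsed = parse_headers(example)
    print(parsed["host"])        # www.google.com
    print(parsed["connection"])  # close
```

This simple version keeps only the last value when a header appears more than once, which is enough for the examples in this post.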

Everything that goes up must come down

And it is no different with requests. Just as the Client needs to send its request, identifying itself and following the Header standards, the Server (Host) needs to send an answer to this request, which we call the Response.

An example of a Response is:

HTTP/1.1 200 OK
Date: Tue, 02 Jun 2020 21:45:35 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=31536000
Server: Apache/2.2.14
X-Frame-Options: SAMEORIGIN
Set-Cookie:
Connection: close
Content-Length: 194814

As we can see, the first line is very similar to the first line of the Header sent by the Client, with the difference that here the Client receives the HTTP version followed by the Response Status (a numeric code and a short description).

There are dozens of status codes, and a developer who is responsible for sending them from the server has to investigate which one is most appropriate. A developer on the Client's side can usually get a clear answer with a quick search for the code number.

In any case, it is important to have general knowledge about these codes so that we are not caught completely off guard when an unfamiliar code appears. To make memorization easier, we can classify the codes by their first digit:

  • Starting with 1 (101, 102, etc.): Returns information about the request, whether it was accepted or whether the process continues;

  • Starting with 2 (201, 202, etc.): Indicates that the request was successfully executed; the specific code varies with the kind of request that was executed;

  • Starting with 3 (301, 302, etc.): Indicates that a redirection is needed before the execution of the request can be completed;

  • Starting with 4 (401, 402, etc.): Indicates that there is an error on the part of the Client in the request;

  • Starting with 5 (501, 502, etc.): Indicates that there is an error on the part of the Server (Host) while handling the request.
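The first-digit rule is easy to express in code. In this sketch, `describe_status` and the wording of the family names are my own choices for illustration; only the grouping by first digit comes from the protocol.

```python
STATUS_FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(status_line):
    """Split a status line and classify its code by the first digit."""
    # Example: "HTTP/1.1 404 Not Found" -> version, code, reason phrase.
    version, code, reason = status_line.split(" ", 2)
    family = STATUS_FAMILIES.get(int(code) // 100, "unknown")
    return int(code), reason, family


if __name__ == "__main__":
    print(describe_status("HTTP/1.1 200 OK"))
    print(describe_status("HTTP/1.1 404 Not Found"))
```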

Other data returned in a Response are:

  • Date: Date and time at which the response was generated;

  • Expires: Information about when the content should be considered out of date. The value -1 indicates that the content expires immediately after being sent;

  • Cache-Control: Provides information about cache policies;

  • Server: Provides information about the server;

  • Set-Cookie: Provides Cookies sent from the server to the Client;

  • X-Frame-Options: Provides information for the browser to render (or not) a page in <frame> or <iframe>.

What about the body?

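Of course, in addition to the headers, requests and responses can have a body. The body comes after the headers, separated from them by a blank line, and carries the actual content being transferred: the form fields or JSON sent by a POST in a request, or the HTML page, image or file in a response. Requests such as GET usually have no body. Keep in mind that, over plain HTTP, both headers and body travel unencrypted, so sensitive data in either of them can cause security problems if intercepted.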
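On the wire, a single blank line marks where the headers end and the body begins. The sketch below splits a raw message at that point; the `split_message` helper and the sample login request are hypothetical.

```python
def split_message(raw):
    """Split raw HTTP bytes into (list of header lines, body bytes)."""
    # Lines end with CRLF, so the blank line shows up as CRLF CRLF.
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("iso-8859-1").split("\r\n"), body


if __name__ == "__main__":
    raw_request = (
        b"POST /login HTTP/1.1\r\n"
        b"Host: example.test\r\n"
        b"Content-Type: application/json\r\n"
        b"Content-Length: 15\r\n"
        b"\r\n"
        b'{"user": "ana"}'
    )
    header_lines, body = split_message(raw_request)
    print(header_lines[0])  # POST /login HTTP/1.1
    print(body)             # b'{"user": "ana"}'
```

Here Content-Length matches the 15 bytes of the body, which is how the receiver knows when the message is complete.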

HTTP x HTTPS

HTTPS (Hypertext Transfer Protocol Secure) is nothing more than a more secure version of HTTP. It combines HTTP with the SSL (Secure Sockets Layer) / TLS (Transport Layer Security) protocol, created to protect the transmission of sensitive information such as personal, payment and login data. SSL / TLS allows data to be transmitted over an encrypted connection and lets the Client verify, through a certificate, that the Server (Host) is authentic; the Server can optionally require the same of the Client. HTTPS became especially necessary with the massive increase in the use of Wi-Fi connections, where it is possible to intercept the data between the Client and the Wi-Fi router, for instance. Such attacks are known as Man in the Middle attacks.
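In Python, the standard `ssl` module shows these two guarantees quite directly. The sketch below is only an outline: `make_tls_context` and `open_https_socket` are hypothetical helper names, and the second function needs a real network connection, so it is defined here but not called.

```python
import socket
import ssl


def make_tls_context():
    """Build a TLS context with the library's default, secure settings."""
    # By default the Server must present a valid certificate and the
    # certificate must match the host name the Client asked for.
    return ssl.create_default_context()


def open_https_socket(host, port=443):
    """Open a TCP connection and wrap it in TLS (requires network access)."""
    context = make_tls_context()
    raw_sock = socket.create_connection((host, port))
    # From here on, everything written to the socket is encrypted.
    return context.wrap_socket(raw_sock, server_hostname=host)


if __name__ == "__main__":
    ctx = make_tls_context()
    print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
    print(ctx.check_hostname)                    # True
```

The HTTP text itself does not change under HTTPS; the same request and response lines simply travel inside the encrypted connection.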

Did you like the article? Do you think there was a missing question to be addressed?

Please comment and let me know if this content helped you in any way.

See you later, alligator!

Originally posted in Portuguese here
