As backend engineers, we really like our black boxes.
We don't enjoy getting into the nitty-gritty details.
One of the topics that backend engineers avoid learning is communication protocols.
Sure, you know what HTTP is and how it works on the surface.
But do you know how packets get transferred?
Don't worry, it's not the end of the world.
In this article, I will go over one of the main internet protocols, the one that underpins much of the World Wide Web.
You will learn:
- What the TCP protocol is
- The TCP packet format
- The different stages of the TCP protocol
- Features of the TCP protocol
What is the TCP protocol?
TCP, which stands for Transmission Control Protocol, is one of the main protocols in the internet protocol suite. It runs on top of IP (Internet Protocol) to ensure reliable transmission of packets.
TCP fixes many issues that arise when you use IP on its own, such as lost, out-of-order, duplicate, and corrupted packets.
Since TCP is most commonly used on top of IP, the pair is often referred to as TCP/IP.
Now that we know the basics of TCP, let us go a bit deeper and discuss the packet format of TCP.
TCP Packet Format
The figure above shows the difference between the TCP and the IP packet. As you can see, the TCP header has a lot more fields; they exist to ensure reliability.
Let us briefly go over each field in the TCP header:
- Source Port Number — The sending device port.
- Destination Port Number — The receiving device port.
- Sequence Number — Identifies the position of the segment's first data byte in the stream. Each side chooses a random initial sequence number when the connection is opened, and the number then advances by the number of bytes sent.
- Acknowledgement Number — This field holds the sequence number of the next byte the receiver expects. The acknowledgement field is defined only if the ACK flag is set.
- Offset — Size of the TCP header, counted in 32-bit words.
- Reserved — This field must always be zero; it is reserved for future use.
Flags field — TCP uses control flags to manage data flow in specific situations. The six original flags are listed here (later additions such as ECE and CWR deal with congestion notification):
- URG — Indicates that the segment carries urgent data and that the Urgent Pointer field is valid.
- ACK — Indicates that the acknowledgement number is valid.
- PSH — Indicates that data should be passed to the application as soon as possible.
- RST — Resets the connection.
- SYN — Synchronizes sequence numbers to initiate a connection.
- FIN — Means that the sender of the flag has finished sending data.
- Window Size — This field is used by the receiver to indicate to the sender how much data it can accept.
- Checksum — Lets the receiver detect whether the header or the data was damaged in transit.
- Urgent Pointer — Only meaningful if the URG flag is set. In that case, this field is an offset from the sequence number indicating the last urgent data byte.
- Options/Padding — Specifies various TCP options, padded so that the header ends on a 32-bit boundary.
- Data — Contains the information that you're sending.
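To make the field layout above more concrete, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with Python's `struct` module. The names `TcpHeader` and `parse_tcp_header` and the sample segment are made up for illustration; options are not parsed, and only the six flags described above are decoded.

```python
import struct
from typing import NamedTuple

# Bit masks for the six flags described above (low six bits of the flags field).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int  # in bytes
    flags: list
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed 20-byte part of a TCP header (options are not parsed)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # "!" means network (big-endian) byte order; H = 16 bits, I = 32 bits.
    src, dst, seq, ack, off_flags, window, checksum, urg = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    header_len = (off_flags >> 12) * 4  # the offset counts 32-bit words
    flags = [name for name, bit in FLAG_BITS.items() if off_flags & bit]
    return TcpHeader(src, dst, seq, ack, header_len, flags, window, checksum, urg)


# A hand-built SYN segment (illustrative values): port 54321 -> 80, sequence number 1000.
sample = struct.pack("!HHIIHHHH", 54321, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header)
```

The checksum here is left at zero because the sketch only reads fields; it does not verify them.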
Let's go through the process of transmitting a packet using TCP.
Before we send data using TCP, we first have to establish a connection.
The client and server must agree on the connection's starting state, in particular each other's initial sequence numbers.
They go through a process known as the TCP three-way handshake.
Let's go through the process:
- The client sends a packet with the SYN flag set to 1 along with its random initial sequence number.
- The server sends a packet with both the SYN and ACK flags set to 1. It carries the server's own random initial sequence number and acknowledges the client's sequence number plus one.
- Finally, the client sends a packet with the ACK flag set to 1, indicating that it acknowledges the server's SYN.
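The three steps above can be sketched as a small simulation that tracks only flags, sequence numbers, and acknowledgement numbers. This is not a real network exchange; `Segment` and `three_way_handshake` are hypothetical names used for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Segment:
    flags: set
    seq: int
    ack: int = 0


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged while opening a connection."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment({"SYN"}, seq=client_isn)
    # 2. Server -> client: SYN+ACK. A SYN consumes one sequence number,
    #    so the server acknowledges client_isn + 1 (modulo 2**32).
    syn_ack = Segment({"SYN", "ACK"}, seq=server_isn, ack=(syn.seq + 1) % 2**32)
    # 3. Client -> server: ACK of the server's SYN.
    ack = Segment({"ACK"}, seq=syn_ack.ack, ack=(syn_ack.seq + 1) % 2**32)
    return [syn, syn_ack, ack]


client_isn = random.randrange(2**32)
server_isn = random.randrange(2**32)
for segment in three_way_handshake(client_isn, server_isn):
    print(sorted(segment.flags), "seq =", segment.seq, "ack =", segment.ack)
```

In everyday code you never build these segments yourself: the operating system performs the handshake when you call something like `socket.create_connection`.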
Data Transfer Phase
Now that the client and server are acquainted with each other, they are ready to exchange data.
What they send is up to the application, but there are two very important rules that every segment must follow:
- Every segment that carries data must be acknowledged: the receiver replies with an ACK confirming what it has received.
- If the sender doesn't get an ACK in time, it assumes the receiver did not receive the segment and retransmits it.
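The two rules above can be illustrated with a deliberately simplified "stop-and-wait" sketch: send one chunk, wait for its ACK, and retransmit if none arrives. Real TCP keeps many segments in flight at once and uses timers, so treat this only as a model of the idea. The function name and the `drop_attempts` parameter are assumptions made for the example.

```python
def send_reliably(chunks, drop_attempts, max_tries=5):
    """Stop-and-wait sketch: send a chunk, wait for its ACK, retransmit on loss.

    `drop_attempts` is a set of global attempt numbers (0-based) whose
    segment is "lost"; it stands in for an unreliable network.
    Returns (received_chunks, total_attempts).
    """
    received = []
    attempt = 0
    for chunk in chunks:
        for _ in range(max_tries):
            lost = attempt in drop_attempts
            attempt += 1
            if lost:
                # No ACK arrives; a real sender would notice via a timeout.
                continue
            received.append(chunk)  # the receiver got the segment and ACKs it
            break
        else:
            raise TimeoutError(f"gave up on {chunk!r} after {max_tries} tries")
    return received, attempt


delivered, attempts = send_reliably([b"he", b"ll", b"o"], drop_attempts={1, 2})
print(b"".join(delivered), attempts)
```

Here the second chunk is lost twice, so three chunks take five transmissions, yet the receiver still ends up with the complete data.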
Once the client and server have finished exchanging data, the connection can be closed.
Let's go through this process:
- Either the client or the server can initiate the termination by sending a segment with the FIN flag set to 1.
- The other side responds with the FIN and ACK flags set to 1, acknowledging the FIN and signalling that it has finished sending too. (It may also send the ACK and its own FIN as two separate segments.)
- The initiator sends one final ACK, and the connection is closed.
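From application code you don't set the FIN flag directly; the operating system sends it when you shut down or close a socket. The sketch below runs a tiny uppercase-echo server on the loopback interface and shows what the teardown looks like from both sides. The server logic and names such as `run_server` are assumptions made for the example.

```python
import socket
import threading


def run_server(listener: socket.socket, log: list) -> None:
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # b"" means the peer has sent its FIN
                log.append("server saw end of stream")
                break
            conn.sendall(data.upper())
    # Leaving the with-block closes conn, which sends the server's FIN.


listener = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks a free port
port = listener.getsockname()[1]
log = []
thread = threading.Thread(target=run_server, args=(listener, log), daemon=True)
thread.start()

# The three-way handshake happens inside create_connection.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)
    client.shutdown(socket.SHUT_WR)  # sends FIN: "I have finished sending"
    leftover = client.recv(1024)     # b"" once the server's FIN arrives

thread.join(timeout=5)
listener.close()
print(reply, leftover, log)
```

An empty result from `recv` is how an application learns that the other side has finished sending.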
Features of TCP
Now that you know most of the nitty-gritty details of the TCP protocol, let's go over some of the features that TCP provides.
Every TCP connection is stateful, meaning the client and server both keep track of the connection's state, such as sequence numbers and window sizes. This is both good and bad; it depends on your application's use case. Some apps need a stateful connection while others don't.
TCP has several different methods for controlling congestion. They exist so that senders don't overload the network between them. Methods for controlling congestion include:
- TCP Slow Start
- Congestion Avoidance
- Fast Retransmit
- Fast Recovery
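To give a feel for how the methods listed above interact, here is a rough sketch of how a sender's congestion window might evolve round trip by round trip. It follows simplified Reno-style rules and is not a faithful implementation of any real TCP stack; the function name, its parameters, and the numbers are all illustrative.

```python
def simulate_cwnd(rounds, ssthresh, loss_rounds=frozenset()):
    """Trace the congestion window (in segments) for each round trip.

    Simplified rules:
      * slow start: cwnd doubles each round while below ssthresh
      * congestion avoidance: cwnd grows by 1 per round after that
      * loss detected by duplicate ACKs (fast retransmit): halve the window
        and carry on in congestion avoidance (fast recovery)
    """
    cwnd = 1
    history = []
    for rnd in range(rounds):
        history.append(cwnd)
        if rnd in loss_rounds:
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)
        else:
            cwnd += 1
    return history


print(simulate_cwnd(rounds=10, ssthresh=8, loss_rounds={6}))
# [1, 2, 4, 8, 9, 10, 11, 5, 6, 7]
```

The window grows quickly at first, then cautiously, and is cut back when a loss signals that the network is congested.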
As we have seen in the previous section, every segment must be acknowledged in TCP. This makes sure no packets are lost, and together with sequence numbers it lets the receiver handle packets that arrive out of order.
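As a sketch of the receiving side, the function below rebuilds a byte stream from segments that arrive out of order or more than once, using only their sequence numbers. It ignores sequence number wraparound and partially overlapping segments, which real TCP has to handle; `reassemble` and the sample data are made up for the example.

```python
def reassemble(segments, initial_seq):
    """Rebuild the byte stream from (seq, payload) pairs in arrival order.

    Returns (stream, next_expected_seq). Duplicates are dropped, and
    out-of-order segments are buffered until the gap before them is filled.
    """
    expected = initial_seq
    pending = {}  # seq -> payload for segments that arrived early
    stream = bytearray()
    for seq, payload in segments:
        if seq < expected or seq in pending:
            continue  # duplicate: already delivered or already buffered
        pending[seq] = payload
        # Deliver every segment that is now contiguous with the stream.
        while expected in pending:
            chunk = pending.pop(expected)
            stream += chunk
            expected += len(chunk)
    return bytes(stream), expected


arrivals = [(100, b"Hel"), (106, b"TCP"), (103, b"lo "), (100, b"Hel")]
print(reassemble(arrivals, initial_seq=100))
# (b'Hello TCP', 109)
```

The second value returned is what the receiver would put in the Acknowledgement Number field: the next byte it expects.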
TCP is used pretty much everywhere, but you would very rarely use it raw. Most of the time you would use another protocol that is built on top of TCP, such as HTTP, WebSockets, SMTP, or FTP.
Many developers feel overwhelmed by the sheer number of protocols, but the secret is to study from first principles. Most of these protocols are built on either TCP or UDP; once you know these two, the others are much easier to learn.
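To see that a protocol like HTTP really is just text carried over a TCP connection, the sketch below starts a throwaway HTTP server on the loopback interface using Python's standard library and talks to it with a plain socket. The handler, the response body, and the request are assumptions made for the example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello over TCP\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example's output quiet


server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: the OS picks a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Speak HTTP by hand over a raw TCP socket.
with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
    sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
    response = b""
    while chunk := sock.recv(4096):  # read until the server closes the connection
        response += chunk

server.shutdown()
server.server_close()

status_line, _, rest = response.partition(b"\r\n")
print(status_line.decode())
```

Everything an HTTP library does for you sits on top of this: it opens a TCP connection, writes a request, and reads the reply.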
Before we wrap up: if you enjoy my content, feel free to follow me on Twitter @tamerlan_dev.
Thanks for reading. If I missed anything, feel free to leave a comment and start a discussion.