DEV Community

Dave Saunders

What is TCP?

TCP (Transmission Control Protocol) is a protocol that lets machines communicate over a network, and it is one of the foundations on which the internet is built.

One of the most useful characteristics of TCP is that it is 'resilient', which means it can cope with an unreliable network without losing any data.

This is possible because the machine receiving the data 'acknowledges' that it has received it. If the sender doesn't get this acknowledgement, it knows to re-send the data.

A server receiving and acknowledging a cat picture

How does it work?

When we send data over a network, we normally group that data into 'packets':

A client sending data to the server as 'packets'

At the beginning of each packet is the 'header': a section of metadata used to describe what is in the packet:

A packet layout, showing the header followed by the data below

The TCP header has a field called the 'Sequence Number'.

The Sequence Number is used to keep track of how many bytes have been sent and received by the two machines talking to each other.

  • The machine sending the data sets the Sequence Number to the number of bytes that have been sent in total during this conversation. It then sends the next block of data.
  • The machine receiving 'acknowledges' receipt of the data by adding the number of bytes it has received to the Sequence Number, and passing the result back to the sender. It does this by setting a value in another part of the TCP header: the 'Acknowledgement Number'.
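To make the header a bit more concrete, here is a minimal Python sketch that packs and unpacks the fixed 20-byte part of a TCP header, which is where the Sequence Number and Acknowledgement Number live. The helper names (`build_header`, `read_seq_and_ack`) and the port numbers are made up for illustration, and the checksum is simply left as 0, so this is not something you could put on a real network as-is.

```python
import struct

# Fixed 20-byte part of a TCP header (no options), in network byte order:
# source port, destination port, sequence number, acknowledgement number,
# data offset/reserved, flags, window size, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")


def build_header(src_port, dst_port, seq, ack, flags=0, window=65535):
    """Pack a minimal TCP header; the checksum is left as 0 for illustration."""
    data_offset = 5 << 4  # header length of five 32-bit words, no options
    return TCP_HEADER.pack(src_port, dst_port, seq, ack, data_offset, flags, window, 0, 0)


def read_seq_and_ack(header):
    """Return the (Sequence Number, Acknowledgement Number) pair."""
    fields = TCP_HEADER.unpack(header[:TCP_HEADER.size])
    return fields[2], fields[3]


header = build_header(50000, 80, seq=1000, ack=2000)
print(len(header))               # 20
print(read_seq_and_ack(header))  # (1000, 2000)
```

Both numbers are 32-bit fields, which is why they are packed with the `I` format code.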

A diagram showing the server receiving data and replying by increasing the Sequence Number

This simple mechanism allows us to detect when data has gone missing; if the sender sends 20 bytes, but the receiver only acknowledges 10, we know that some data has not been received.
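The arithmetic behind that check is small enough to sketch in a few lines of Python. The function names here are hypothetical, and the starting value of 100 is just an example; the only real-world detail added is that the numbers wrap around at 2**32 because the header fields are 32 bits wide.

```python
SEQ_SPACE = 2**32  # Sequence and Acknowledgement Numbers are 32-bit fields


def acknowledge(seq, received_bytes):
    """Receiver side: Acknowledgement Number = Sequence Number + bytes received."""
    return (seq + received_bytes) % SEQ_SPACE


def missing_bytes(seq, sent_bytes, ack):
    """Sender side: how many of the bytes just sent were not acknowledged."""
    expected_ack = (seq + sent_bytes) % SEQ_SPACE
    return (expected_ack - ack) % SEQ_SPACE


seq = 100                           # example Sequence Number before sending
ack = acknowledge(seq, 10)          # the receiver only got 10 of the 20 bytes
print(ack)                          # 110
print(missing_bytes(seq, 20, ack))  # 10
```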

There are various mechanisms for dealing with this, but the simplest is for the sender to re-send the data.
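As a rough sketch of the "just re-send it" approach, the Python below keeps sending the same data until the expected acknowledgement comes back. The `channel` function is a stand-in invented for this example: it pretends the first few transmissions are lost. A real TCP implementation relies on timers and more sophisticated strategies, so treat this as an illustration of the idea only.

```python
def send_with_retries(data, seq, channel, max_attempts=5):
    """Send `data` until the receiver acknowledges all of it."""
    expected_ack = seq + len(data)
    for attempt in range(1, max_attempts + 1):
        ack = channel(seq, data)   # returns an ack number, or None if lost
        if ack == expected_ack:
            return attempt         # number of transmissions that were needed
    raise TimeoutError("no acknowledgement received")


def make_lossy_channel(drop_first):
    """Hypothetical channel that loses the first `drop_first` transmissions."""
    state = {"calls": 0}

    def channel(seq, data):
        state["calls"] += 1
        if state["calls"] <= drop_first:
            return None            # the packet (or its ack) went missing
        return seq + len(data)     # the receiver acknowledges every byte

    return channel


attempts = send_with_retries(b"cat picture", seq=100, channel=make_lossy_channel(2))
print(attempts)  # 3
```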

Randomising the Sequence Number

For security reasons, we don't want the Sequence Number to be predictable. If it were, an attacker who could guess it would be able to inject forged packets and 'hijack' the communication.

So, even though the Sequence Numbers track how many bytes have been exchanged, they don't start from 0. Instead, both parties generate random starting values for their Sequence Number.
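Here is a minimal Python sketch of that idea: pick an unpredictable 32-bit starting value, then measure the bytes sent relative to it. The function names are made up for this example, and real TCP stacks use more elaborate schemes to choose the starting value than a single random draw, so this only shows the principle.

```python
import secrets

SEQ_SPACE = 2**32  # the Sequence Number is a 32-bit field


def initial_sequence_number():
    """Pick an unpredictable starting value for the Sequence Number."""
    return secrets.randbelow(SEQ_SPACE)


def bytes_sent(isn, current_seq):
    """Bytes sent so far, measured relative to the random starting value."""
    return (current_seq - isn) % SEQ_SPACE


isn = initial_sequence_number()
seq = (isn + 500) % SEQ_SPACE  # the Sequence Number after sending 500 bytes
print(bytes_sent(isn, seq))    # 500
```

The modulo keeps the count correct even when the Sequence Number wraps past the end of the 32-bit range.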

But if these numbers are random, how do both machines agree on them?

This is where the 'TCP Handshake' comes in...

The TCP Handshake

When a connection is started, the first machine decides on its random 'Sequence Number' and sends it to the other party.

This transmission is called the SYN, because it is intended to 'SYNchronise' the Sequence Number.

The other side knows this is a SYN request, because we turn on a special bit in the packet header when sending it.

In this example, we've chosen 1234 as the random 'Sequence Number'.

TCP Handshake part 1: The client sends a random Sequence Number

The other party must now ACK or 'ACKnowledge' the Sequence Number.

It does this by adding 1 to it, and passing it back in the 'Acknowledgement Number' section of the header (it also sets a special ACK bit in the header).

It now generates its own random Sequence Number and sends it back to the other machine.

In this example, it generated 5678 as its Sequence Number:

TCP Handshake part 2: The server replies by adding 1 to the Sequence Number and sending its own random number

Finally, the original party acknowledges the other machine's Sequence Number, also by adding 1 to it and sending it back in the 'Acknowledgement Number' section of the header:

TCP Handshake part 3: The client acknowledges the server's Sequence Number

The Handshake is now complete, and the two machines can start communicating.

You might also see the TCP Handshake written as SYN, SYN/ACK, ACK, referring to the order the handshake packets are sent in.
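The three steps can be sketched in Python using the same example numbers as above (1234 and 5678). Packets are modelled as plain dictionaries and the function names are invented for this illustration; the flag values are the SYN and ACK bits from the TCP header. Wrap-around of the 32-bit numbers is ignored to keep the example short.

```python
SYN, ACK = 0x02, 0x10  # flag bits in the TCP header


def syn(client_isn):
    """Step 1: the client sends its starting Sequence Number."""
    return {"flags": SYN, "seq": client_isn, "ack": 0}


def syn_ack(syn_packet, server_isn):
    """Step 2: the server acknowledges it (+1) and sends its own number."""
    return {"flags": SYN | ACK, "seq": server_isn, "ack": syn_packet["seq"] + 1}


def final_ack(syn_ack_packet):
    """Step 3: the client acknowledges the server's Sequence Number (+1)."""
    return {
        "flags": ACK,
        "seq": syn_ack_packet["ack"],
        "ack": syn_ack_packet["seq"] + 1,
    }


step1 = syn(1234)
step2 = syn_ack(step1, 5678)
step3 = final_ack(step2)
print(step2["ack"])  # 1235
print(step3["ack"])  # 5679
```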
