Pankhudi Bhonsle

Deep dive into WebSockets

Hello there! In this post, we'll be doing a conceptual deep dive into WebSockets.

To begin with, let's first understand the use case of WebSockets.

In a client-server architecture, when the client wants to talk to the server, it makes an HTTP request and the server sends back a response.

HTTP Request-Response Cycle:

Each time the client needs something, it sends a request and the server replies with a response.

Now suppose some data changes on the server. With plain request-response, the server has no way to push that update to the client in real time; the client would have to keep asking the server over and over (polling). Meh! Bad idea!
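
To get a feel for why constant re-requesting is wasteful, here is a tiny simulation. It is only a sketch: the `poll` function and the `timeline` of data versions are made up for illustration, and no real network is involved.

```python
def poll(server_versions, interval=1):
    """Simulate a client that asks the server every `interval` ticks.

    server_versions[t] is the data version held by the server at tick t.
    Returns (requests_made, versions_seen).
    """
    requests_made = 0
    versions_seen = 0
    last_seen = None
    for tick in range(0, len(server_versions), interval):
        requests_made += 1                 # every poll costs one request
        current = server_versions[tick]
        if current != last_seen:           # only some polls return news
            versions_seen += 1
            last_seen = current
    return requests_made, versions_seen


# Hypothetical timeline: the data changes only twice in ten ticks.
timeline = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3]
print(poll(timeline))              # (10, 3)
print(poll(timeline, interval=5))  # (2, 2)
```

Polling on every tick spends 10 requests to learn about 3 versions. Polling less often is cheaper, but in this run the client has not yet seen version 3 by the end of the timeline.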

What if I told you there is a better way to get these real-time updates, with the server able to push updates to its clients? Enter WebSockets!

Some real life use cases of WebSockets:

  • 🏏 Sports updates: Well, I am not into Cricket but when the Indian team plays, my friend surrenders himself to the pace of the game. So if you are planning to include sports updates in your application, WebSockets can keep your users up to speed.

  • 🎮 Multiplayer Games: Usually multiplayer games have a server that stores the game state. When a player makes a move - it is sent to the server - that updates the game state. The updates are instantly relayed to all other players via WebSockets.

  • 💡 Collaboration apps: Want to build the next draw.io or Figma-like collaborative tool? You can follow the pattern above to let users collaborate and instantly push each change to everyone else.
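
The "server keeps the state and relays every update" pattern from the games and collaboration examples can be sketched in a few lines. This is an in-memory stand-in, not a real WebSocket server: `GameServer`, `connect`, `handle_move` and the player names are all hypothetical, and each inbox list plays the role of one open WebSocket connection.

```python
class GameServer:
    """In-memory stand-in for a server that owns the shared state."""

    def __init__(self):
        self.state = []     # ordered list of (player, move)
        self.inboxes = {}   # player name -> updates pushed to that player

    def connect(self, player):
        # In a real app this would be a completed WebSocket handshake.
        self.inboxes[player] = []

    def handle_move(self, player, move):
        self.state.append((player, move))
        # Push the update to every *other* connected player.
        for other, inbox in self.inboxes.items():
            if other != player:
                inbox.append({"from": player, "move": move})


server = GameServer()
for name in ("ana", "ben", "cy"):
    server.connect(name)

server.handle_move("ana", "e4")
print(server.inboxes["ben"])  # [{'from': 'ana', 'move': 'e4'}]
print(server.inboxes["ana"])  # []
```

The point is the direction of the traffic: ben and cy receive the move without ever asking for it.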

Okay, so what are WebSockets anyway?

The Wikipedia definition says: 🤖

WebSocket is a computer communications protocol, providing full-duplex communication channels over a single TCP connection.

Let's try breaking it into 3 pieces:

"WebSocket is a computer communications protocol..." - meaning it defines a set of rules that let two or more entities exchange information.

"...providing full-duplex communication channels..." - meaning communication is bidirectional, and both sides can send and receive at the same time.

"...over a single TCP connection." - and all of this over just one TCP connection. Why does that matter, you ask? Well, you save the time and resources that would otherwise go into setting up a new TCP connection for every exchange!
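
Here is a small sketch of what "full-duplex over a single connection" means in practice. It does not use WebSockets at all: a local `socket.socketpair()` stands in for the one TCP connection, and the `talk` helper and the message strings are made up for the demo. Each side sends its own messages while a second thread reads what the other side is sending.

```python
import socket
import threading


def talk(sock, to_send, expected_count):
    """Send `to_send` while concurrently reading `expected_count` lines."""
    received = []

    def reader():
        buf = b""
        while buf.count(b"\n") < expected_count:
            chunk = sock.recv(1024)
            if not chunk:          # peer closed the connection
                break
            buf += chunk
        received.extend(buf.decode("utf-8").splitlines())

    reader_thread = threading.Thread(target=reader)
    reader_thread.start()          # reading starts before we send anything
    for line in to_send:
        sock.sendall((line + "\n").encode("utf-8"))
    reader_thread.join()
    return received


client, server = socket.socketpair()   # one connection, two endpoints
results = {}


def run(name, sock, to_send):
    results[name] = talk(sock, to_send, expected_count=2)


threads = [
    threading.Thread(target=run, args=("client", client, ["move:e4", "move:d4"])),
    threading.Thread(target=run, args=("server", server, ["score:1-0", "score:2-0"])),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
client.close()
server.close()

print(results["client"])  # ['score:1-0', 'score:2-0']
print(results["server"])  # ['move:e4', 'move:d4']
```

Neither side waits for a "turn", which is the difference from the request-response cycle described earlier.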

So, what does the WebSocket handshake look like?

The client sends an HTTP/1.1 request with an Upgrade header. The server responds with status code 101 (Switching Protocols). Bidirectional communication can happen now.

Let's examine what the request and response look like:

Request from client:

    GET /endpoint HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Sec-WebSocket-Version: 13

Response from server:

    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
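
Besides `Upgrade` and `Connection`, a real handshake (as specified in RFC 6455) also carries a random `Sec-WebSocket-Key` header from the client, and the server answers with a `Sec-WebSocket-Accept` header derived from it. That is how the client knows the server actually understood the WebSocket request. A minimal sketch of that derivation follows; the function name `accept_value` is my own, while the GUID and the sample key come from the RFC.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    # Concatenate key and GUID, SHA-1 hash the result, then base64-encode it.
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455
print(accept_value("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```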

What's an Upgrade header?

The Upgrade header field provides a simple mechanism
for transition from HTTP/1.1 to some other application-layer
protocols upon the existing transport-layer connection.
For example, Upgrade to HTTP/2.0 or Upgrade to IRC/6.9

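You can read more about it in the Upgrade header section of RFC 2616. (That RFC has since been superseded by newer HTTP RFCs, but the Upgrade header is still part of HTTP/1.1.)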

Technically, the client can "indicate a preference" that it wants to upgrade, if possible. Whether to upgrade or not is determined by the server.

Summarising the handshake: 🤝

  • The client tells the server it wants to upgrade the connection to WebSocket.
  • The server analyses the request and, let's say, chooses to upgrade it.
  • To convey that, the server responds with 101 Switching Protocols.
  • Once the handshake is successful, the client and server can send data back and forth for an indefinite amount of time, until either of them closes the connection.
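
The server's side of this decision can be sketched as a small function that looks at the request headers and picks a status line. This is a simplified illustration, not a complete server: `parse_headers` and `choose_status` are hypothetical names, the checks cover only the headers needed for a WebSocket handshake, and a real server may choose different error codes.

```python
from typing import Optional


def parse_headers(raw_request: str) -> dict:
    """Return header names (lowercased) mapped to their values."""
    headers = {}
    for line in raw_request.split("\r\n")[1:]:   # skip the request line
        if not line:
            break                                # blank line ends the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def choose_status(raw_request: str) -> Optional[str]:
    """Pick a status line, or None if this is not an upgrade request."""
    headers = parse_headers(raw_request)
    connection_tokens = [
        token.strip() for token in headers.get("connection", "").lower().split(",")
    ]
    asks_for_websocket = (
        headers.get("upgrade", "").lower() == "websocket"
        and "upgrade" in connection_tokens
    )
    if not asks_for_websocket:
        return None                              # handle as ordinary HTTP
    if ("sec-websocket-key" not in headers
            or headers.get("sec-websocket-version") != "13"):
        return "400 Bad Request"                 # incomplete handshake
    return "101 Switching Protocols"


request = "\r\n".join([
    "GET /endpoint HTTP/1.1",
    "Host: server.example.com",
    "Upgrade: websocket",
    "Connection: Upgrade",
    "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==",
    "Sec-WebSocket-Version: 13",
    "",
    "",
])
print(choose_status(request))  # 101 Switching Protocols
```

Note that the client only asks; the function, standing in for the server, has the final say.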

--

Looking at the request response above, I'll confess WebSockets look very HTTPish to me, and seemingly I wasn't alone.
But let us address that too.

WebSocket protocol is not HTTP!

Understand that the WebSocket handshake uses the HTTP protocol, and hence the request/response above was in HTTP format. But they are still two different application-layer protocols, meaning they have different rules and semantics. After the handshake is successful, the connection is established and HTTP steps out of the picture: both sides read and write WebSocket messages (sent as small binary frames) directly over the same TCP connection.
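
To make "a different set of rules" concrete: after the handshake, data no longer travels as HTTP messages with headers and status lines. RFC 6455 defines compact binary frames instead. Below is a rough sketch of that framing for short text messages only. The function names are mine, and it skips fragmentation, control frames and payloads over 125 bytes.

```python
from typing import Optional


def encode_text_frame(text: str, mask_key: Optional[bytes] = None) -> bytes:
    """Build a single-frame text message (payload up to 125 bytes)."""
    payload = text.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("this sketch only handles payloads up to 125 bytes")
    first_byte = 0x80 | 0x1                    # FIN bit + opcode 0x1 (text)
    if mask_key is None:                       # server-to-client: unmasked
        return bytes([first_byte, len(payload)]) + payload
    if len(mask_key) != 4:
        raise ValueError("mask key must be exactly 4 bytes")
    # Client-to-server frames XOR the payload with a 4-byte masking key.
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return bytes([first_byte, 0x80 | len(payload)]) + mask_key + masked


def decode_text_frame(frame: bytes) -> str:
    """Decode a frame produced by encode_text_frame."""
    if frame[0] & 0x0F != 0x1:
        raise ValueError("not a text frame")
    length = frame[1] & 0x7F
    if frame[1] & 0x80:                        # mask bit set
        key, body = frame[2:6], frame[6:6 + length]
        body = bytes(b ^ key[i % 4] for i, b in enumerate(body))
    else:
        body = frame[2:2 + length]
    return body.decode("utf-8")


frame = encode_text_frame("Hello")
print(frame.hex())               # 810548656c6c6f
print(decode_text_frame(frame))  # Hello
```

Seven bytes for "Hello", with no headers in sight. Compare that with the size of the HTTP request used for the handshake.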

Now, why on Earth does the WebSocket handshake use the HTTP format?
Well, the WebSocket protocol was designed to enable bidirectional communication while staying compatible with the existing HTTP infrastructure, such as servers, proxies and filtering. Isn't that beautiful!

Hopefully this post helped you understand WebSockets a bit better. Let me know in the comments if you have any questions or feedback! Happy Coding! 👩‍💻
