Intro
Disclaimer: I didn't know much about WebSockets a week ago. All the experience I had with them was a chat application I developed back in 2016 using SailsJS, a JS framework that tried to be a Ruby on Rails implementation. So I decided to research this technology and consumed multiple resources, which I link throughout this blog post and in each section.
Websockets are a way to handle full-duplex (two-way) communications. They are very useful for building applications that need real-time data, like a chat application or a stock dashboard. Websockets arrived as an improvement over previous solutions like:
- Polling
The simplest solution consists of making HTTP requests at a set interval.
- Long Polling
Desperate Times Call for Desperate Measures
This just means that the client calls the HTTP endpoint and the server does not resolve the request immediately, but waits until a message needs to be delivered to the client. This approach worked but had a lot of disadvantages, like timeouts and latency, and it sounds very hacky to begin with.
- HTTP streaming and Server-sent events (SSE)
These solutions work well but are more suited to one-way communications, as in use cases where the server needs to send a notification to its users, e.g. a new post was created.
Websocket servers can (but don't have to) be run alongside a normal HTTP server, since WebSockets use a different protocol: `ws`, and `wss` (for secured connections, like `HTTPS`).
Websockets are a simple protocol from the user's perspective, as the API mainly revolves around 3 different events (there is also an `error` event for failures):
- open
- close
- message
Connecting (Handshake)
All WebSocket connections start with a handshake, always initiated by the client, that if successful upgrades the connection from `HTTP` to the `ws`/`wss` protocol.
Notes:
A client can create as many connections as it desires.
Arbitrary headers cannot be set in the browser when setting up a WebSocket connection.
All the behavior below is defined at the browser level by W3C's HTML5 WebSocket API specification and at the protocol level by RFC 6455, "The WebSocket Protocol". It is all managed by the system (the browser, in JavaScript), so you don't need to worry about it; the following is for information purposes only.
The handshake process consists of the following steps:
- The client sends the following headers on a standard `http(s)` request:
GET ws://127.0.0.1:8000/ HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: hXcK3GM6B2r6yj/4L0Vuqw==
Origin: http://localhost:3000
Sec-WebSocket-Version: 13
The `Sec-WebSocket-Key` header is a random value (16 random bytes, base64-encoded). The server takes this string, appends the special string `258EAFA5-E914-47DA-95CA-C5AB0DC85B11` to it, hashes the result with SHA-1, encodes the hash in base64 and returns it in the `Sec-WebSocket-Accept` header.
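As a rough illustration of that computation, here is a small Node.js sketch. The helper names `generateKey` and `computeAccept` are made up for this example (they are not part of any library), and the sample key and expected value in the comment are the ones given in RFC 6455, not the ones from the capture shown here.

```javascript
const crypto = require('node:crypto');

// Fixed GUID defined by RFC 6455.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: what a client does to build Sec-WebSocket-Key
// (16 random bytes, base64-encoded).
function generateKey() {
  return crypto.randomBytes(16).toString('base64');
}

// Hypothetical helper: what a server does to build Sec-WebSocket-Accept.
function computeAccept(key) {
  return crypto
    .createHash('sha1')
    .update(key + WS_GUID)
    .digest('base64');
}

// Sample key from RFC 6455
console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
// → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

In practice you will rarely write this yourself, since the browser and any WebSocket server library take care of it.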
If the server accepts the connection, it will respond with the following headers:
HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Sec-WebSocket-Accept: rXzSb8mhB4ljxko8kbyiCohJ4Fc=
Upgrade: websocket
The connection is considered successful only when the server:
- Replies with status code `101`
- Includes a `Connection` header with the value `Upgrade`
- Includes an `Upgrade` header with the value `websocket`
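To make these conditions concrete, here is a minimal sketch of the check a client could run on the handshake response. `isHandshakeAccepted` is a hypothetical helper, `headers` is assumed to be a plain object with lower-cased header names, and `expectedAccept` is assumed to be the value the client derived from its own `Sec-WebSocket-Key` (RFC 6455 also asks the client to compare it, so the sketch includes that check as well).

```javascript
// Hypothetical helper: checks a handshake response the way a client would.
// `headers` is assumed to be a plain object with lower-cased header names.
function isHandshakeAccepted(status, headers, expectedAccept) {
  // Header values are compared case-insensitively.
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  // Connection may carry several comma-separated tokens.
  const connection = (headers['connection'] || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());

  return (
    status === 101 &&
    upgrade === 'websocket' &&
    connection.includes('upgrade') &&
    headers['sec-websocket-accept'] === expectedAccept
  );
}

// Usage with the response shown above (the accept value is passed in as-is):
const response = {
  connection: 'Upgrade',
  upgrade: 'websocket',
  'sec-websocket-accept': 'rXzSb8mhB4ljxko8kbyiCohJ4Fc=',
};
console.log(isHandshakeAccepted(101, response, 'rXzSb8mhB4ljxko8kbyiCohJ4Fc='));
// → true
```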
Example in JavaScript
All of that complexity is already handled by the browser, and establishing a WebSocket connection is quite straightforward:
const socket = new WebSocket('wss://example.com/socket');
socket.addEventListener('open', (event) => {
  console.log('WebSocket connection established!');
});
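In a real application you will probably want to listen to the other lifecycle events too. The following is only a sketch: `attachLogging` is a hypothetical helper that accepts anything with `addEventListener` (a browser `WebSocket`, for example) and a logging function.

```javascript
// Hypothetical helper: wires the lifecycle events of a WebSocket-like
// object (anything with addEventListener) to a logging function.
function attachLogging(socket, log = console.log) {
  socket.addEventListener('open', () => log('open'));
  socket.addEventListener('message', (event) => log('message', event.data));
  socket.addEventListener('close', (event) => log('close', event.code, event.reason));
  // Besides open, message and close, browsers also emit `error`.
  socket.addEventListener('error', () => log('error'));
}

// In a browser:
// attachLogging(new WebSocket('wss://example.com/socket'));
```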
Message exchange
Once a connection has been established, it is kept alive both in the server and the client.
Both the server and the client can send messages at any moment, as either `text` or `binary` data (Blob or ArrayBuffer objects). Sending binary data is marginally faster, since `text` data requires a UTF-8 conversion, but not by a big factor because this conversion is quite fast nowadays.
Example in JavaScript
Sending messages is as simple as:
socket.send(message); // e.g. '{"a": 1}'
And receiving them:
socket.addEventListener('message', (event) => {
  console.log('Received message:', event.data);
});
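Since most applications exchange structured data, one common pattern is to wrap messages in JSON. This is just a sketch of that idea: `encodeMessage`, `decodeMessage` and the `{ type, payload }` envelope are assumptions made for this example, not something the WebSocket API prescribes.

```javascript
// Hypothetical helpers for exchanging JSON over text WebSocket messages.
function encodeMessage(type, payload) {
  return JSON.stringify({ type, payload });
}

function decodeMessage(data) {
  // Binary messages arrive as Blob or ArrayBuffer; only text is handled here.
  if (typeof data !== 'string') return null;
  try {
    return JSON.parse(data);
  } catch {
    return null; // not valid JSON
  }
}

// socket.send(encodeMessage('chat', { text: 'hello' }));
// socket.addEventListener('message', (event) => {
//   const message = decodeMessage(event.data);
//   if (message) console.log(message.type, message.payload);
// });
```

Returning `null` for anything that cannot be parsed keeps a single malformed message from throwing inside the event handler.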
Closing
A WebSocket connection can be closed at any moment by either the client or the server.
From the client's perspective, it's very straightforward:
socket.close();
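`close()` also accepts an optional status code and a reason. Browsers only allow the code `1000` or a code in the `3000`-`4999` range, and the reason is limited to 123 bytes of UTF-8. A small sketch that checks these limits before closing (`closeWithReason` is a hypothetical helper, not part of the API):

```javascript
// Hypothetical helper: closes a WebSocket-like object with a code and a
// reason, checking the limits browsers enforce on close() beforehand.
function closeWithReason(socket, code = 1000, reason = '') {
  const codeOk = code === 1000 || (code >= 3000 && code <= 4999);
  const reasonBytes = new TextEncoder().encode(reason).length;
  if (!codeOk) throw new RangeError('code must be 1000 or 3000-4999');
  if (reasonBytes > 123) throw new RangeError('reason must be at most 123 bytes');
  socket.close(code, reason);
}

// closeWithReason(socket, 4000, 'User logged out');
```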
From the server:
It can be a bit trickier: if you wait for the clients to disconnect, some of them might hang, which will eventually cause performance issues. Two ways to address this are shown in the following code (taken from Stack Overflow; `terminate()` and `wss.clients` come from the `ws` package for Node.js):
- Hard shutdown
With this approach, we ask the connection to close and, if it is still open or closing on the next tick, terminate it without waiting for the client to disconnect
// Soft close
socket.close();
process.nextTick(() => {
  if ([socket.OPEN, socket.CLOSING].includes(socket.readyState)) {
    // Socket still hangs, hard close
    socket.terminate();
  }
});
- Soft shutdown
Here, we give the clients some time to disconnect before terminating them
// First sweep, soft close
wss.clients.forEach((socket) => {
  socket.close();
});
setTimeout(() => {
  // Second sweep, hard close
  // for everyone who's left
  wss.clients.forEach((socket) => {
    if ([socket.OPEN, socket.CLOSING].includes(socket.readyState)) {
      socket.terminate();
    }
  });
}, 10000);
Auth and security
Unfortunately, the WebSocket protocol doesn't suggest a way to handle authentication, but many people have come up with many solutions over time.
As noted earlier, the most traditional way to authenticate on the modern web, sending a header with a token like a `JWT`, is not possible with WebSockets, since custom headers are not supported in the browser.
This leaves us with solutions like the one presented in this document:
- Sending credentials as the first message in the WebSocket connection
This method is fully reliable but moves authentication to the application layer and can expose you to information leaks if you're not careful enough. Another negative point is that it allows anyone to open WebSocket connections to your server.
- Adding credentials in the WebSocket URI as a query parameter
Another method is to send the token as a query parameter when opening the WebSocket connection, something like: `wss://localhost:3000/ws?token=myToken`
The downside is that this information might end up in the `logs` of your system. One way to mitigate this is to use single-use tokens, but the industry tends to consider this risk unacceptable.
- Setting a cookie on the domain of the WebSocket URI
This solution is also reliable as long as your WebSocket server runs on the same domain as your `http` server. If this is not the case, it won't be possible, since the Same-Origin Policy doesn't allow setting a cookie on a different origin.
But there are two ways to overcome this problem:
1. Move the WebSocket server to a subdomain of the main `http` server, e.g: `websocket.example.com`
2. Use an `iframe` running on the same WebSocket domain to set the cookie
If you go with this approach, you need to consider that as of today (2023-07-18) Google Chrome won't work with it unless you also set the `SameSite` and `Secure` properties, e.g.
document.cookie = 'my_token=token; SameSite=None; Secure;'
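Going back to the query-parameter option, the single-use tokens mentioned there could look roughly like the sketch below. Everything in it is an assumption made for illustration: the in-memory `Map`, the `issueToken` and `redeemToken` names, and the 30-second lifetime. A real deployment would more likely use a shared store.

```javascript
const crypto = require('node:crypto');

// Hypothetical in-memory store of single-use tokens:
// token -> { userId, expiresAt }
const pendingTokens = new Map();

// Called from an already authenticated HTTP endpoint, right before the
// client opens the WebSocket connection.
function issueToken(userId, ttlMs = 30000, now = Date.now()) {
  const token = crypto.randomBytes(32).toString('hex');
  pendingTokens.set(token, { userId, expiresAt: now + ttlMs });
  return token;
}

// Called during the WebSocket handshake with the `token` query parameter.
// Returns the user id, or null if the token is unknown, expired or used.
function redeemToken(token, now = Date.now()) {
  const entry = pendingTokens.get(token);
  pendingTokens.delete(token); // a token can only be redeemed once
  if (!entry || entry.expiresAt < now) return null;
  return entry.userId;
}
```

Even if such a token ends up in a log file, it is useless once it has been redeemed or has expired.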
Cross-Site WebSocket Hijacking
One thing to be aware of with the cookie-based authentication solution is that WebSocket connections are not restrained by the `same-origin policy`, which opens an attack vector called `Cross-Site WebSocket Hijacking`.
This means that if you set a cookie for the `websocket domain`, this cookie will be sent when connecting to the server no matter which website the WebSocket connection is established from. This is a security problem, since malicious websites can take advantage of the fact that the user already has a cookie set for the domain to also subscribe to messages from the WebSocket server.
One way to mitigate this problem is to always validate the `Origin` of the client before establishing the connection.
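A minimal sketch of such a check, assuming you can read the `Origin` header of the handshake request on the server. The allow-list contents and the `isAllowedOrigin` name are hypothetical:

```javascript
// Hypothetical allow-list; replace with the origins that serve your front end.
const ALLOWED_ORIGINS = new Set(['http://localhost:3000', 'https://example.com']);

// `origin` is the value of the Origin header from the handshake request.
function isAllowedOrigin(origin, allowed = ALLOWED_ORIGINS) {
  // Non-browser clients may omit the header; this sketch rejects those too.
  if (typeof origin !== 'string') return false;
  // Exact match only: prefix or substring checks are easy to bypass.
  return allowed.has(origin);
}

console.log(isAllowedOrigin('http://localhost:3000')); // → true
console.log(isAllowedOrigin('https://example.com.evil.test')); // → false
```

If the origin is not allowed, the server should reject the handshake instead of upgrading the connection.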
You can find much more information in the Cross-Site WebSocket Hijacking article linked in the resources below.
Resources
Binary vs text messages
https://stackoverflow.com/questions/7730260/binary-vs-string-transfer-over-a-stream
Shutting down WebSocket connections in the server
https://stackoverflow.com/questions/41074052/how-to-terminate-a-websocket-connection/49791634#49791634
Authentication in WebSockets
https://websockets.readthedocs.io/en/stable/topics/authentication.html
Cross-Site WebSocket Hijacking
https://christian-schneider.net/CrossSiteWebSocketHijacking.html
Websocket framed protocol
https://sookocheff.com/post/networking/how-do-websockets-work/