Fun with Socat & WebSocket

urandom
Updated on ・8 min read

Do you have weird desires to do things in shell that shouldn't be done in shell? Your opinion doesn't matter; I have already written the blog post :P

Before we venture any further, please be reminded that the wonderful websocat exists, so there is really no reason to do this in shell (except for fun :)


Why would anyone do this?

Live reloads for web projects are pretty standard nowadays. If you are already using socat as a SimpleHTTPServer, it is only logical to also use socat to notify the client side to reload :D

In this case, a complete WebSocket server is not necessary. I only need to send messages to the client and don't need to handle incoming messages, which makes the data flow much easier to handle (easier in socat+shell, that is; it probably doesn't make much difference in other languages).

I was mostly referencing this blog post by another nice gentleman on implementing a WebSocket server in Ruby, so for those who want to learn more about the boring details, you know where to go.

The journey

If you have sniffed a WebSocket request and response before, you can probably recognize that a WebSocket request is actually a normal HTTP GET request. The response starts with a header section, just like a normal HTTP response, followed by the payload. For our server-to-client-only scenario, it looks just like (and in essence, is) Server-sent events, with data dripping to the client unidirectionally.

The Headers look just like your usual HTTP header
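
For reference, the request side of the handshake can be reproduced by hand. The sketch below prints a minimal upgrade request. The function name `handshake_request`, the host, and the port are assumptions made for illustration, and the key is just a sample value (a browser generates a random one).

```shell
#!/bin/sh
# Print a minimal WebSocket upgrade request for the key given as $1.
# Host and path are placeholders; every header line ends with CRLF.
handshake_request() {
  printf 'GET / HTTP/1.1\r\n'
  printf 'Host: localhost:8080\r\n'
  printf 'Connection: Upgrade\r\n'
  printf 'Upgrade: websocket\r\n'
  printf 'Sec-WebSocket-Version: 13\r\n'
  printf 'Sec-WebSocket-Key: %s\r\n' "$1"
  printf '\r\n'
}

# Show the request with the carriage returns stripped for readability.
handshake_request 'dGhlIHNhbXBsZSBub25jZQ==' | tr -d '\r'
```

Piping the raw output into something like socat - TCP:localhost:8080 should be enough to poke at a server without a browser.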

To handle the request, we can use socat's SYSTEM address type to echo out the headers:

#!/bin/sh

websocket_script="
  echo 'HTTP/1.1 101 Switching Protocols';
  echo 'Sec-WebSocket-Accept:  UdMEX53kyT/LBV+MbgNRheSRFvQ=';
  echo 'Connection: Upgrade';
  echo 'Upgrade: websocket';
  echo '';
"
socat TCP-LISTEN:8080,crlf SYSTEM:"$websocket_script"

But if you actually try this, you will find that life is never that simple.

[Image: surprised pikachu face]

You see, there is a special header called Sec-WebSocket-Accept that the browser validates, to ensure that our life isn't that easy.
According to MDN, the algorithm to generate it is:

The server takes the value of the Sec-WebSocket-Key sent in the handshake request, appends 258EAFA5-E914-47DA-95CA-C5AB0DC85B11, takes SHA-1 of the new value, and is then base64 encoded.

Well, that seems doable: we can grep the key from the request headers and use printf to append the magic string, then we have sha1sum to get the hash and base64 for the encoding. How hard could it be?

[Image: welp]

Well... I have no idea why head -n1 got stuck and couldn't close the stream [1]. But grep already has -m NUM to stop reading a file after NUM matching lines, so that should suffice.

[Image: hmm, I don't think that works well]

hum... I think base64 takes binary as input. But that looks like hex?

xxd to the rescue

It was indeed hex. A quick google tells us we can use xxd to convert between hex and binary.

$ printf "a\n" | xxd -p
610a
$ printf "610a" | xxd -r -p
a

Now we have everything we need to get the hashed key:

[Image: xxd at work]
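
Put together as a function, the pipeline might look like the sketch below. The name `ws_accept` is made up for this example, and the key is the sample key from RFC 6455, for which the expected answer is published.

```shell
#!/bin/sh
# Compute Sec-WebSocket-Accept for the Sec-WebSocket-Key given as $1.
ws_accept() {
  printf '%s%s' "$1" '258EAFA5-E914-47DA-95CA-C5AB0DC85B11' |
    sha1sum |      # prints 40 hex digits followed by "  -"
    cut -c1-40 |   # keep only the hex digits
    xxd -r -p |    # hex -> raw bytes
    base64         # raw bytes -> base64
}

ws_accept 'dGhlIHNhbXBsZSBub25jZQ=='
# RFC 6455 lists s3pPLMBiTxaQ9kYGzzhZRbK+xOo= as the answer for this key
```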

And plugging it into the other echo statements should just work™:

#!/bin/sh

websocket_magic_string=258EAFA5-E914-47DA-95CA-C5AB0DC85B11
websocket_script="
  echo 'HTTP/1.1 101 Switching Protocols';
  accept=\$(
    grep -m1 'Sec-WebSocket-Key' |
    cut -d' ' -f2 |
    xargs printf '%s$websocket_magic_string' |
    sha1sum | xxd -r -p | base64
  );
  echo \"Sec-WebSocket-Accept: \$accept\";
  echo 'Connection: Upgrade';
  echo 'Upgrade: websocket';
  echo '';
"

socat TCP-LISTEN:8080,crlf SYSTEM:"$websocket_script"

[Image: everyone likes a green light]

Now with the easy part out of the way, let's see how to format a WebSocket compliant payload.

The rules are as follows:

print a special bitmask as char

if message.length < 126
  print message.length as char

else if message.length < 2**16
  print 126 as char # i.e. the 126th ascii character
  for each byte of (message.length as unsigned 16-bit integer)
    print the byte as char

else
  print 127 as char
  for each byte of (message.length as unsigned 64-bit integer)
    print the byte as char

print the message as chars

(again, you can read the blog post I am referencing for details)

The first part is easy enough: just print a char [2]. However, there are no types or type casts in shell; we only have raw bytes, and raw bytes that happen to be printable.

printf to the rescue

printf can print both printable and non-printable characters via escape sequences, which makes it especially useful for emitting raw bytes. Just as printf '\n' prints a newline, running printf "\\$(printf %o '65')" prints "A": the inner printf converts 65 to octal (101), and the outer one interprets \101 as a byte. This gives us the ability to print the bitmask and handle the first case.

while IFS= read -r line; do
  printf "\\201"; # first frame byte 10000001: FIN bit set, opcode 0x1 (text)

  # use %s so that % and \ in the message are not interpreted by printf
  size=$(printf '%s' "$line" | wc -c);

  if { test $size -lt 126; }; then
    printf "\\$(printf %o $size)";

  ...
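
To see what the first case produces on the wire, the frame for a short message can be dumped with od. This is a sketch with a made-up helper name (`frame_short`), and it only covers messages shorter than 126 bytes.

```shell
#!/bin/sh
# Frame a short (< 126 bytes) text message given as $1.
frame_short() {
  printf '\201'                       # first byte, 10000001
  size=$(printf '%s' "$1" | wc -c)    # payload length in bytes
  printf "\\$(printf %o $size)"       # length as a single byte
  printf '%s' "$1"                    # payload
}

# Dump the frame as hex bytes: 81 (first byte), 02 (length), 68 69 ("hi")
frame_short 'hi' | od -An -tx1
```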

For the second part, we need to coerce a string of decimal digits into two chars representing the decimal value.
For example, we should convert "16737" into "Aa".

----------------------------
|   dec  |  65 * 256 + 97  |
----------------------------
|  char  |   A          a  |
----------------------------

You may have noticed that we essentially need to convert the decimal value into binary bytes (as opposed to a single char, meaning the printf trick doesn't work anymore).
We will need to change the radix of the number, so brush up your math skills and prepare for some fun.

dc to the rescue

Instead of showing off my non-existent math skills here, let's have a basic desk calculator do the job. And by desk calculator I mean dc (which stands for desk calculator); it conveniently has commands to control the input/output radix. With a simple dc -e "16 o $size p" you too can safely forget how to do math!

# 16 o => set 16 as output radix
# 11 => input value (by default base 10)
# p => print out the result
$ dc -e "16 o 11 p"
B

Well, we need 2 chars, but a single hex digit is only 4 bits. What if our input size is small, say under 256? dc would give us at most 2 hex digits, not 4!

Luckily we are trying to represent an unsigned integer, which means the leading bits are zeros. If we prepend enough zeros and take from the end, we don't even need to calculate how many zeros to prepend.

$ echo 'AA' | xargs printf '%04d%s' 0 | tail -c4
00AA
# or
$ { printf '%04d' '0'; printf 'AA'; } | tail -c4
00AA

Now we have a hex string. If only there were a way to convert between hex and binary...

$ dc -e "16 o 16737 p" |
> xargs printf "%04d%s" 0 |
> tail -c4 |
> xxd -r -p
Aa
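
As an aside, the same two bytes can be produced without dc or xxd by letting the shell do the division. This is an alternative sketch using POSIX arithmetic expansion, not the approach taken in this post, and `u16be` is a made-up name.

```shell
#!/bin/sh
# Print $1 (0..65535) as two raw bytes, most significant byte first.
u16be() {
  hi=$(( $1 / 256 ))   # upper 8 bits
  lo=$(( $1 % 256 ))   # lower 8 bits
  printf "\\$(printf %o "$hi")\\$(printf %o "$lo")"
}

u16be 16737   # 65 * 256 + 97, prints "Aa" (without a trailing newline)
```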

With that, followed by printing the actual text, we are now able to send a WebSocket-compliant payload.

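while IFS= read -r line; do
  # use %s so that % and \ in the message are not interpreted by printf
  size=$(printf '%s' "$line" | wc -c);

  printf "\\201"; # first frame byte 10000001: FIN bit set, opcode 0x1 (text)

  if { test $size -lt 126; }; then
    # length fits into the second byte
    printf "\\$(printf %o $size)";

  elif { test $size -lt 65536; }; then
    # 126, then the length as a 16-bit unsigned integer (4 hex digits)
    printf "\\$(printf %o 126)";
    dc -e "16 o $size p" |
    xargs printf '%04d%s' 0 |
    tail -c4 |
    xxd -r -p

  else
    # 127, then the length as a 64-bit unsigned integer (16 hex digits)
    printf "\\$(printf %o 127)";
    dc -e "16 o $size p" |
    xargs printf '%016d%s' 0 |
    tail -c16 |
    xxd -r -p

  fi

  printf '%s' "$line";
done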
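
When debugging the loop, it helps to know what the header is supposed to look like. The sketch below prints the expected header as hex for a given payload size, using printf's %x conversion instead of dc. `expected_header` is a hypothetical helper, not part of the server.

```shell
#!/bin/sh
# Print the frame header for a payload of $1 bytes as a hex string.
expected_header() {
  if [ "$1" -lt 126 ]; then
    printf '81%02x\n' "$1"       # length fits in the second byte
  elif [ "$1" -lt 65536 ]; then
    printf '817e%04x\n' "$1"     # 0x7e = 126, then a 16-bit length
  else
    printf '817f%016x\n' "$1"    # 0x7f = 127, then a 64-bit length
  fi
}

expected_header 5       # 8105
expected_header 200     # 817e00c8
expected_header 70000   # 817f0000000000011170
```

The first bytes of the loop's output, viewed through od -An -tx1, should match these values.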

The last 90%

Now that the main chunk of code is done, we can go through some remaining "minor" issues.

Input

Normally socat only accepts two addresses, so we get only 2 out of "STDIO", "TCP-LISTEN", and "SYSTEM". Since the server itself is a must, that leaves us choosing between stdin and our script.
Fortunately, a forked command inherits its parent's file descriptors. We can duplicate stdin to a new file descriptor and reference it inside the command.

A drawback is that, because stdin is a stream, if we have multiple copies of "SYSTEM" (by running the server with the fork option), only one of them will actually receive each message. Of course we could instead write to a normal file and have the commands tail -f that file, or even use socat to run another UDP server. However, I find these solutions less appealing and, in this use case, I can live with only having a single client.
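
The file descriptor trick can be tried in isolation, without socat. In this sketch, `sh -c` stands in for the command that socat would fork, and the function name `fd_demo` is made up.

```shell
#!/bin/sh
# 3<&0 duplicates the current stdin onto fd 3 before the child starts;
# the child inherits fd 3 and can read from it with <&3.
fd_demo() {
  printf 'reload\n' | sh -c 'cat <&3' 3<&0
}

fd_demo   # prints "reload"
```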

CRLF

We commonly use socat TCP with the crlf option, which means socat will insert \r before each \n. But if you think this is a common use case, you are in for a surprise (or I am very interested in what you commonly use socat for).

Since we are sending integers as chars, when our message size is 10 we will be sending the 10th ASCII character, the newline character. socat will then make our life miserable by inserting an extra \r in front of it. So we have to drop the crlf option and attach \r ourselves.

Portability

Since xxd isn't portable [3], we will have to find a replacement. As we previously discovered, printf is fully capable of printing raw bytes, but we have to split the hex string into one-byte (two-digit) chunks in advance.

We can use sed 's/../&\n/g' to replace every two characters with themselves plus a newline, or we could use the simpler fold -w2 to do the same.

$ echo '1234' | sed 's/../&\n/g'
12
34

# or
$ echo '5678' | fold -w2
56
78
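
Feeding those chunks to the printf trick gives a small stand-in for xxd -r -p. This is a sketch, with `hex_to_bin` as a made-up name; it uses a while loop where the final script uses xargs, and it assumes the input is an even number of hex digits ending in a newline.

```shell
#!/bin/sh
# Read hex digits on stdin and write the corresponding raw bytes.
hex_to_bin() {
  fold -w2 | while read -r pair; do
    # hex pair -> octal -> byte, e.g. 41 -> 101 -> "A"
    printf "\\$(printf %o "0x$pair")"
  done
}

echo '4161' | hex_to_bin   # prints "Aa"
```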

Finally, the end result

#!/bin/sh

websocket_magic_string=258EAFA5-E914-47DA-95CA-C5AB0DC85B11
websocket_script="{
printf 'HTTP/1.1 101 Switching Protocols\r\n';
printf 'Sec-WebSocket-Accept: ';
grep -m1 'Sec-WebSocket-Key' |
  sed -e 's/\r//g' |
  cut -d' ' -f2 |
  xargs printf '%s$websocket_magic_string' |
  sha1sum |
  head -c 40 |
  fold -w2 |
  xargs -I{} printf '%o\n' '0x{}' |
  xargs -I{} printf '\\\\{}' |
  base64 |
  xargs printf '%s\r\n';
printf 'Connection: Upgrade\r\n';
printf 'Upgrade: websocket\r\n';
printf '\r\n';
cat <&3;
}"

while IFS= read -r line; do
  printf '\201'; # first frame byte 10000001: FIN bit set, opcode 0x1 (text)

  size=$(printf "%s" "$line" | wc -c);
  if { test $size -lt 126; }; then
    printf "\\$(printf %o $size)";
    extended_payload_length_size=0;
  elif { test $size -lt 65536; }; then
    printf "\\$(printf %o 126)";
    extended_payload_length_size=4;
  else
    printf "\\$(printf %o 127)";
    extended_payload_length_size=16;
  fi

  {
    printf '%016d' '0';
    dc -e "16o${size}p" | tr -d "\n";
  } |
    tail -c $extended_payload_length_size |
    fold -w2 |
    xargs -I{} printf '%o\n' '0x{}' |
    xargs -I{} printf '\{}';

  printf '%s' "$line";
done |
  socat TCP-LISTEN:8080,reuseaddr,fork SYSTEM:"$websocket_script" 3<&0

That's it. For those who want to implement client-to-server message handling, maybe you will be able to do that with a dual type address and some named pipes. But then again, why would anyone do that ;)


  1. The reason was related to grep buffering its output when the destination isn't a tty. I still lack a thorough understanding of this; maybe I will write a blog post when I can wrap my head around it better.

  2. A char in this post is 8-bit and unsigned.

  3. I wasn't concerned about proper metrics like POSIX compliance. Instead, I just wanted it to run on the docker alpine/socat image.
