Katsuyuki Omuro for Fastly

Supercharge your API with realtime push powers using Fastly Fanout

More than ever before, users expect applications and websites to be “realtime”—to see new information as soon as it becomes available on the server side, without needing to ask for a refresh. Fastly Fanout is here to augment your API with the power of realtime push, enabling you to power these applications using your existing HTTP origin instead of needing to maintain a complicated and dedicated messaging infrastructure.


Realtime is everywhere these days. Users live in a world where websites and apps constantly show updated data in a dynamic, interactive way. Your users need to see the newest chat message, the newest stock price, and the newest in-stock item, and they need to see these updates right now, without asking for a refresh.

Realtime apps need realtime APIs

APIs power more of the world every day. We already live in a world where you can shorten a URL, translate text into various languages, get social media updates, order goods from retailers, and even control the lights in your house, all using web APIs. As demand for realtime web applications continues to grow, support for realtime push in APIs will become essential to meet it.

At first glance, the requirements of realtime push seem to be at odds with the traditional “one-request, one-response” model of HTTP messaging. However, by utilizing long-lived TCP connections, realtime APIs can and do exist. For example, Twitter’s streaming endpoint is such an API. With HTTP streaming, the response from the server is sent without a Content-Length header, and instead uses Transfer-Encoding: chunked. The response stays open forever, and the server continues to append to the response as data becomes ready on the server.

GET /1.1/statuses/simple.json HTTP/1.1
Host: stream.twitter.com
Authorization: OAuth [...]

HTTP/1.1 200 OK
connection: close
content-type: application/json
date: Wed, 19 Oct 2022 00:32:07 GMT
server: tsa
transfer-encoding: chunked
x-connection-hash: 6028cbda...6f0443a6

{"created_at": "Wed Oct 19 00:32:06 +0000 2022", "id": 588862464202297345, "text": "tweet 1", ... }
{"created_at": "Wed Oct 19 00:32:07 +0000 2022", "id": 588862468392255488, "text": "tweet 2", ... }

... (ad infinitum)
Twitter’s streaming API hangs open after the request and adds data about tweets to the end of the response as they become ready.

Streaming HTTP responses are arguably a hack that subverts the conventional one-request, one-response model that HTTP was designed to serve. It’s also the underlying mechanism of the popular Server-Sent Events technology, supported by all modern browsers.
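To make the framing concrete, here is a simplified sketch of how a client might pull event payloads out of Server-Sent Events stream text. It's written in Python purely for illustration, and parse_sse is our own name, not a library function; a real SSE parser also handles the event:, id:, and retry: fields, which we skip here.

```python
def parse_sse(stream_text):
    """Extract event payloads from Server-Sent Events stream text.

    Each event is a run of `data:` lines terminated by a blank line;
    multiple data lines within one event are joined with newlines.
    Other SSE fields (event:, id:, retry:) are ignored in this sketch.
    """
    events = []
    data_lines = []
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())
        elif line == "" and data_lines:
            # a blank line terminates the current event
            events.append("\n".join(data_lines))
            data_lines = []
    return events

chunk = 'data: {"value": 2}\n\ndata: {"value": 3}\n\n'
print(parse_sse(chunk))  # ['{"value": 2}', '{"value": 3}']
```

The browser's built-in EventSource API does this parsing for you; the sketch just shows why a streamed, never-closing response is enough to carry discrete events.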

HTTP long-polling is another common technique, used for example by Facebook until very recently for its chat platform, where the server’s response hangs open only until the next "event" happens or a timeout period passes. At that point the server responds and ends the connection. The client then initiates another request right away, waiting for the next event. This effectively creates a data channel for the server to send push updates, but in a way that uses HTTP more conventionally:

Long-polling flow diagram
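The flow above can be sketched as a simple client loop. This is illustrative Python, not a real HTTP client: fetch_event is a stand-in for a request that blocks until the server has an event (returning it) or times out (returning None).

```python
import itertools

def long_poll(fetch_event, handle, max_requests=None):
    """Run the long-polling loop: issue a blocking request, process the
    event if one arrived, then immediately open the next request.

    fetch_event() stands in for an HTTP request that hangs until the
    server has an event (returning it) or times out (returning None).
    """
    count = itertools.count()
    while max_requests is None or next(count) < max_requests:
        event = fetch_event()  # blocks until an event or a timeout
        if event is not None:
            handle(event)
        # then loop: re-request right away, so the server always has an
        # open request to respond to when the next event happens

# Simulated transport: two timeouts, then one event, then nothing
responses = iter([None, None, {"msg": "hello"}])
received = []
long_poll(lambda: next(responses, None), received.append, max_requests=4)
print(received)  # [{'msg': 'hello'}]
```

The key property is that from the server's perspective there is almost always a pending request to answer, which is what makes the channel feel like push.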

Avoiding HTTP techniques may be an option as well. WebSocket, for example, is now supported in all modern browsers; it does away with the creative use of HTTP and allows messages to flow in both directions. HTTP techniques are still handy though, so we'll focus on those in this post and cover WebSockets in the next one.

So how can we implement an HTTP API to support realtime push?

Implementing a realtime API

There are lots of off-the-shelf components designed to push data from the server to the client, like Ably, Firebase, and Pusher. While these are all fine solutions for their use case, they tend to be end-to-end: they are hosted solutions, or they provide a server library to use on your stack. They may use a private protocol, and sometimes they require the use of their client SDKs. These may indeed be the products of choice if you're building an entire application stack yourself and don't plan to expose your APIs publicly.

However, if it’s an API that you’re trying to offer, this is not the best strategy for you. You need control over your API surface. You want to speak HTTP directly and perhaps even document the API using tools such as OpenAPI, rather than require the use of a specific SDK.

Of course, you can go ahead and roll your own stack, but that often means maintaining a complicated and dedicated messaging infrastructure. And with all these long-lived connections, the problem of server resources is one you’ll need to tackle, not to mention scalability as demand for your product increases. If you’re Twitter or Facebook, you may have the resources to build such a project and maintain it, but even if you do, it often ends up not being very CDN-friendly, as you’ll still have to handle the traffic at your origin.

With realtime becoming a pressing need worldwide, it was clear that we needed a better solution for APIs.

The realtime solution for APIs: Fastly Fanout

Fanout - now part of Fastly

Earlier this year, Fastly announced that Fanout (https://fanout.io) would be joining Fastly, and that we would be offering Fanout as a product. I actually joined Fastly as part of that acquisition, and now that we've wired it all up, I can show you how to use Fanout with a Fastly service.

Under the hood, Fastly Fanout is based on Pushpin, an open source project that Fanout started (and is still under active development).

Fanout runs on Fastly's edge servers across our global network and forwards requests to your origin that serves your API. It’s your API, so you get full control over its API surface, and we don’t require your users to use a specific server or client SDK. It runs at the edge, so it solves the problems with persistent connections and with scalability of origin traffic. And it supports HTTP streaming (e.g., Server-Sent Events), HTTP long-polling, and WebSocket connections (more on WebSocket in the next article).

But more fundamentally, Fanout has a different design philosophy from other push services, which makes it really great for building APIs. Not only did we want to give you full control over your API, we wanted to use standard, interoperable components. We also wanted to make sure our solution maintained the proper separation of concerns on your team.

Suppose your API design team has defined a realtime API at /stream. You then have engineering roles that build the backend (and perhaps the frontend too), as well as operations roles that decide on things like your infrastructure and caching.

Your API

Fastly Fanout enters this diagram in the middle, like this:

Fanout Proxy

This architecture allows you to keep your original API design. Fanout handles the long-lived connections and the publishing of data, and lets the origin focus on the business logic. Your API consumers don't even need to be aware of Fanout.

It also frees you from technological and vendor lock-in. As all of these components are standard and interoperable, you’re free to:

  • make API design changes as you wish
  • make engineering decisions, such as to switch your origin's programming language or framework, or to switch database engines
  • make operations decisions, like moving your origin to a cloud provider, or switching from Fastly Fanout to running Pushpin on-prem

The off-the-shelf, end-to-end solutions we mentioned earlier would have encroached on every layer here, but Fanout’s design allows for each of the concerns to stay with their respective roles.

This is all possible because we define the backend interaction with Fanout to use an open integration protocol. We call this protocol GRIP (Generic Realtime Intermediary Protocol), which we publish as an open standard for anyone to implement. Fastly Fanout is simply one such implementation. Fastly and Fanout have always been dedicated to open-source and open standards, and our common stance here is one of the reasons we have always gotten along so well! Fastly will continue to support both Pushpin and GRIP.

This architecture is truly empowering! Because Fanout runs at the edge, realtime activity can happen at any of our numerous POPs nearest the user.

Fanout holds all the connections

We take on all of the heavy lifting of holding your long-lived connections around the world and publishing messages to individual clients, so your origin doesn’t have to. We free you from the scalability issues of having to sustain numerous connections, as well as from the burden of having to maintain a complex messaging infrastructure.

Enabling Fanout

Fanout is a feature of Fastly that can be enabled per service. Once it's enabled for your service, handle the request in Compute@Edge and call req.handoff_fanout("backend") for each request that you wish to pass through the Fanout proxy to your backend (supported in Rust programs for now, with other languages coming soon). This design gives you flexibility and granular control over which requests you handle with Fanout.

If you simply want to pass all requests through Fanout, then we have a Fanout forward starter kit that you can use. Obtain the Fastly CLI, create a Compute@Edge project from the starter kit as follows, and put it in front of your origin server. See Compute@Edge Starter Kits for more information.

$ mkdir my-project
$ cd my-project
$ fastly compute init --from=https://github.com/fastly/compute-starter-kit-rust-fanout-forward

Realtime response lifecycle with GRIP

As mentioned earlier, Fanout and your origin interact using GRIP. When a client makes a request through the Fanout proxy, Fanout forwards it to the origin.

Flow diagram - Request lifecycle with Fanout

The origin response is what makes Fanout interesting. The origin is free to make a runtime decision about whether the response should use realtime push. It may of course elect to always use realtime, but it can, for example, check a query parameter, an Accept header, or any condition you wish. The origin uses GRIP to indicate to the proxy what to do (such as to hold the response open as an HTTP stream and subscribe the connection to channels foo and bar), and may also provide an initial payload depending on the circumstances. After that, the origin is free to drop the connection, and if the GRIP instruction contains a hold, Fanout will keep that connection open.

You may be wondering how complicated GRIP is. It's actually super simple:

HTTP/1.1 200 OK
Content-Type: text/plain
Grip-Hold: stream
Grip-Channel: foo, bar

[start stream]

The Grip-Hold and Grip-Channel headers above instruct Fanout to hold the response open as an HTTP stream and to subscribe the connection to channels foo and bar.

If you call this endpoint now, you’ll see the following, and the response will hang open.

$ curl https://your-service-name.edgecompute.app/stream
[start stream]
(response hangs)
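On the origin side, producing that response is just a matter of setting headers. Here is a small framework-agnostic sketch in Python (the function name and shape are ours, not part of any SDK) that builds the headers and initial body for a GRIP streaming hold:

```python
def grip_stream_response(channels, initial_body=""):
    """Build a (headers, body) pair for a GRIP streaming hold.

    Grip-Hold tells the Fanout proxy to keep the connection open as an
    HTTP stream; Grip-Channel subscribes it to the given channels.
    """
    headers = {
        "Content-Type": "text/plain",
        "Grip-Hold": "stream",
        "Grip-Channel": ", ".join(channels),
    }
    return headers, initial_body

headers, body = grip_stream_response(["foo", "bar"], "[start stream]\n")
print(headers["Grip-Channel"])  # foo, bar
```

Whatever web framework your origin uses, the only requirement is that these headers end up on the response; the proxy takes care of everything else.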

At any later time, when the origin is ready to publish a message to one of these channels, it can signal to the proxy using a GRIP Push:

Flow diagram - POST a push

This is done by making an HTTP POST to the GRIP publishing endpoint:

POST /service/ABCDEFGHI0123456789/publish/ HTTP/1.1
Host: api.fastly.com
Fastly-Key: YOUR_FASTLY_KEY
Content-Type: application/json

{
  "items": [
    {
      "channel": "foo",
      "formats": {
        "http-stream": {
          "content": "hello\n"
        }
      }
    }
  ]
}

On Fastly Fanout, the publishing endpoint is at api.fastly.com/service/<service-id>/publish/. Make a POST to it using your Fastly service ID, and provide a Fastly API token that has the publish scope for your service. The endpoint accepts up to 64KB of POST data per call, and it accepts an array of items, so you can publish multiple items at once. Each item must specify the channel (foo in this example) as well as the data.

The data is provided as a keyed object, and we allow the data to be specified in any number of formats. This example is for the http-stream format (HTTP streaming), but we also support http-response (HTTP long-polling) and ws-message (WebSocket). This design allows various clients to listen on different transports for updates on the same channel.
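As a sketch, here is how an origin might assemble a publish body that reaches both streaming and long-polling listeners on the same channel. This is illustrative Python, and the helper name is ours; the http-response format can also carry response metadata, which we omit here.

```python
import json

def build_publish_body(channel, text):
    """Assemble a GRIP publish body that delivers one message to
    listeners on a channel across two transports:
    - http-stream: appended to held streaming responses
    - http-response: returned as the body of held long-polling responses
    """
    return json.dumps({
        "items": [
            {
                "channel": channel,
                "formats": {
                    "http-stream": {"content": text},
                    "http-response": {"body": text},
                },
            }
        ]
    })

body = build_publish_body("foo", "hello\n")
print(json.loads(body)["items"][0]["channel"])  # foo
```

Because each listener's transport is chosen when its connection is held, the publisher doesn't need to know or care how any given client is connected.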

Performing the publish shown above inserts a new line, hello, into the output of our example stream, and the response continues to hang open.

$ curl https://your-service-name.edgecompute.app/stream
[start stream]
hello
(response hangs)

Example: Adding realtime to an API endpoint

Now that we’ve talked about how Fanout works, it’s a great opportunity to show how it can be used to take an existing API endpoint and add realtime powers.

Imagine an example API that represents a counter, identified by an ID. You can get the current value of a counter by calling it with the GET method, and you can increment it by calling with the POST method. This is implemented using the following PHP code:

if (preg_match('#^/counter/(\d+)/?#', $_SERVER['REQUEST_URI'], $matches)) {
  $counter_id = $matches[1];
  if ($_SERVER['REQUEST_METHOD'] === 'GET') {
    $value = getCounterValue($counter_id);
    echo json_encode(['value' => $value]) . "\n";
    die;
  }
  if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $value = incrementCounter($counter_id);
    echo json_encode(['value' => $value]) . "\n";
    die;
  }
}

We’ll put Fastly Fanout in front of it, and then we can invoke it like this:

$ curl https://fastly-fanout-blog.edgecompute.app/counter/1
{"value": 1}
$ curl -X POST https://fastly-fanout-blog.edgecompute.app/counter/1
{"value": 2}
$ curl https://fastly-fanout-blog.edgecompute.app/counter/1
{"value": 2}
$ 

Now let’s add realtime to this GET endpoint. As discussed earlier, Fanout lets us do this conditionally. So this time, let’s check for the existence of an Accept header with the value text/event-stream. In that case we use GRIP to ask Fanout to hold an HTTP stream open and subscribe the connection to a channel named after the counter’s ID:

  if ($_SERVER['REQUEST_METHOD'] === 'GET') {
    $value = getCounterValue($counter_id);

    // -- Start of new code
    if ($_SERVER['HTTP_ACCEPT'] === 'text/event-stream') {
      header('Content-Type: text/event-stream');
      header('Grip-Hold: stream');
      header('Grip-Channel: counter-' . $counter_id);
      echo 'data: ' . json_encode(['value' => $value]) . "\n\n";
      die;
    }
    // -- End of new code

    echo json_encode(['value' => $value]) . "\n";
    die;
  }

This is what happens if we call this endpoint now. Note that if we call it with the Accept: text/event-stream header, we see the “data” prefix and the response will hang open.

% curl https://fastly-fanout-blog.edgecompute.app/counter/1
{"value": 2}
% curl -H "Accept: text/event-stream" https://fastly-fanout-blog.edgecompute.app/counter/1
data: {"value": 2}
(response hangs)

We’ll add publishing to our POST endpoint now. In the code that handles POST, we build the data structure for publishing, and send it to the GRIP publishing endpoint.

  if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $value = incrementCounter($counter_id);

    // -- Start of new code
    $headers = [
      'Fastly-Key: ' . getenv('FASTLY_KEY'),
      'Content-Type: application/json',
    ];
    $content = [
      'items' => [
        [
          'channel' => 'counter-' . $counter_id,
          'formats' => [
            'http-stream' => [
              'content' => 'data: ' . json_encode(['value' => $value]) . "\n\n",
            ],
          ],
        ],
      ]
    ]; 
    $context = stream_context_create([
      'http' => [
        'method' => 'POST',
        'header' => join("\r\n", $headers),
        'content' => json_encode($content),
      ]
    ]);
    file_get_contents(
      'https://api.fastly.com/service/' . getenv('FASTLY_SERVICE_ID') . '/publish/',
      false,
      $context
    );    
    // -- End of new code

    echo json_encode(['value' => $value]) . "\n";
    die;
  }

Let’s try calling this! It’s best to leave the previous request hanging, and use a new terminal window.

$ curl -X POST https://fastly-fanout-blog.edgecompute.app/counter/1
{"value": 3}
$ curl -X POST https://fastly-fanout-blog.edgecompute.app/counter/1
{"value": 4}
$ 

The updates stream through to the first window as new lines:

$ curl -H "Accept: text/event-stream" https://fastly-fanout-blog.edgecompute.app/counter/1
data: {"value": 2}

data: {"value": 3}

data: {"value": 4}

(response hangs)

So there we have it! We've used GRIP to add realtime to an API by expanding on the existing origin code. This was done in a PHP codebase, a language we'd rarely expect to see doing realtime. With Fanout, we're able to do this without holding our own connections and without building a messaging architecture, using an API that we've defined, all standard components, all standard protocols, and an open standard, GRIP, to tie it all together.

Check out the code for this example, available for viewing on Glitch at https://glitch.com/edit/#!/fanout-blog-demo, and behind Fanout at https://fastly-fanout-blog.edgecompute.app/.

Conclusion

As more of the world demands apps that update in real time, your web APIs need realtime push as well. Fastly Fanout is here to help you add realtime push to your APIs, allowing you to leverage your existing HTTP origin and design principles, while also relieving you from having to sustain numerous connections and having to maintain a complex messaging infrastructure.

We’re certain that this opens a new world of possibility and we can’t wait to see what you’ll be creating with your new realtime powers. Try it out and let us know what you think.

Additional reading

Fanout on Fastly Developer Hub
https://developer.fastly.com/learning/concepts/real-time-messaging/fanout

Fanout Documentation
https://docs.fanout.io/

Pushpin - Open Source GRIP Proxy
https://pushpin.org/

Pushpin on GitHub
https://github.com/fanout/pushpin

GRIP - Generic Realtime Intermediary Protocol
https://pushpin.org/docs/protocols/grip/
