A common pattern in the PHP landscape is shifting from a monolith to an architecture that consists of (micro)services. In this shift, chances are you'll still have one main entry point application that calls a number of services to build up a view or apply some actions.
With this shift, we have three main wishes that seem conflicting:
- We want our website to be fast.
- We want to add features that our users didn't even know they wanted.
- We want to do that in a nicely separated and modular way, splitting our landscape into (micro)services that expose APIs.
In PHP, these wishes run into the following obstacles:
- PHP is still very much a synchronous language; support for asynchronous operations and multi-threading is limited. This means that HTTP calls to your services are typically executed sequentially. Time spent on HTTP calls adds up quickly, and with enough distinct services that can easily result in slow pages or endpoints.
- One of the core concepts of PHP is that all runtime data, like variables and request data, is cleaned up after the handling of a request is done. That also goes for resources, like opened files or opened connections. Because of this, you can't share an HTTP connection across requests, even if both client and server agree to keep the connection alive.
- Setting up a TCP connection is quite expensive. A new connection with TLS, which has become a default for basically all digital communication, is even more expensive. Although there are a lot of factors in play (ping, CPU usage, crypto configuration, etc.), you can expect an overhead of a couple of milliseconds for a TCP connection and about 10-100 milliseconds for a TLS connection.
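You can observe this overhead yourself with PHP's curl extension, which exposes the connection and handshake timings. A minimal sketch (the URL is just an example):

```php
<?php declare(strict_types=1);

// Measure how much of a request is spent on connection setup.
$ch = curl_init('https://www.example.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);

// Time until the TCP connection was established.
$connect = curl_getinfo($ch, CURLINFO_CONNECT_TIME);
// Time until the TLS handshake finished (includes TCP setup).
$tls = curl_getinfo($ch, CURLINFO_APPCONNECT_TIME);
// Total time for the whole request.
$total = curl_getinfo($ch, CURLINFO_TOTAL_TIME);
curl_close($ch);

printf("TCP: %.1f ms, TCP+TLS: %.1f ms, total: %.1f ms\n",
    $connect * 1000, $tls * 1000, $total * 1000);
```

Comparing the TCP and TCP+TLS numbers against the total makes it clear how much of a short request is pure connection setup.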
We're doing more and more HTTP requests, sequentially, opening a new connection every time, with a lot of lost time as a result.
The ext-http extension (pecl_http) adds persistent handle reuse to PHP. You can specify a handle identifier when creating a client, and the client will reuse that handle and any connections it had open. Because the identifier survives across script executions, a later request can pick up the same handle.
An example of how you can use it:
<?php declare(strict_types=1);

use http\Client;

$client = new Client(null, 'handle-name');

$request = new Client\Request(
    'GET',
    'https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png'
);

$client->enqueue($request);
$client->send();

/** @var Client\Response $response */
$response = $client->getResponse();
...
As you can see, the library introduces Request and Response classes similar to those defined in the PSR-7 standard. Unfortunately, they don't actually implement the PSR interfaces, so you'll have to do some mapping yourself.
A small benchmark with this example shows a nice improvement: The first request takes about 60ms, but after that the request time drops to about 15ms.
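The benchmark was nothing more sophisticated than timing the example with microtime(). A sketch, assuming ext-http is installed; run it repeatedly to see the reused connection kick in:

```php
<?php declare(strict_types=1);

use http\Client;

// Time a single request on a persistent handle. The first run pays
// for TCP + TLS setup; later runs reuse the open connection.
$start = microtime(true);

$client = new Client(null, 'handle-name');
$client->enqueue(new Client\Request('GET', 'https://www.example.com/'));
$client->send();

printf("Request took %.1f ms\n", (microtime(true) - $start) * 1000);
```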
To support concurrent requests, you'll have to make sure you're not using the same handle-name at the same time. This introduces a new challenge, because now you have to maintain a list of available handles somehow. For the test, I made a random integer between 0 and 100 part of the handle-name. That gave decent results, but of course it stops working under higher load.
Although I haven't tried it yet, I think you could solve this by storing handle identifiers in a queue (e.g. a FIFO queue in Redis): you retrieve the handle that has been idle the longest, and create a new handle if none is available.
In an ideal world the library would take care of this, but doing it yourself for now seems like an OK compromise in most situations.
Although this solution is very specific to PHP, the principle applies in all languages, including those that support async processing. Connection reuse reduces request time and also lowers the load on both client and server.
Are you reusing connections where possible? If not, that might be an easy way to improve performance and resource usage.