Dave Cridland

Posted on Apr 12, 2019 • Edited on Nov 18, 2019

Versioning in APIs

It's Christmas Eve, in 1991. All around the world, children are reaching fever pitch in excitement. Many adults are brimming with enthusiasm too.

But Mark Crispin is not. Instead, he sends an email to the "ietf-822" mailing list, detailing what's wrong with the newly proposed MIME-Version header field. It's almost unusable in its current form, he says. And what is one supposed to do, he asks, on receiving a version other than 1.0? "WHAT DOES IT MEAN???" he asks, in a rare burst of all-caps.

He ends by saying that if the community ever needs to change the version, they'd be better off making a whole new protocol - and he'll fight strenuously against any move to simply change the version number.

Mark's rant

A Brief History of MIME

A small detour is in order here.

MIME, or more formally, the Multipurpose Internet Mail Extensions (or "Multimedia"; like many IETF acronyms, the expansion has changed over time), was created in the early 1990s as a replacement for the often arbitrary methods of "improving" Internet Mail.

Competing proposals, like X.400, had much richer, more complex message formats available than the Internet's plain text. The concept of Unicode wasn't around yet, but the idea of different character sets and different ways of representing them certainly was - but again, Internet Mail just supported ASCII. Proprietary email protocols - of which there were many - also supported rich multimedia messaging. The wider community had introduced exciting things like "uuencode" and "binhex" to work around the lack.

So throughout the early 1990s, a concerted effort was made to expand Internet Mail to include all these kinds of rich messaging - formatted text, images, and attachments.

By the time I started on email, some four or five years later, MIME was the hot new thing, and many of us had to painstakingly learn how to eyeball Quoted Printable in order to read messages.

MIME went on to become the basis of multimedia on the web - HTTP, despite starting as a "Hypertext Transfer Protocol", would be better named "MIME Transfer Protocol". As the core of both the web and email - the two most widely used application-level protocols by far - MIME can lay a reasonable claim to being the most widely used design in history.

It's now nearly three decades - almost thirty years - since MIME was designed. Perhaps it's still too early to tell, but it really looks like Mark was right.

It's still mandatory to include MIME-Version: 1.0.

Protocols and APIs

OK, so I'm this far into the article, and I've mentioned "versions", but not "APIs".

When we were young, we called APIs "Protocols". These days, that word seems reserved for things lower in the stack. The design principles remain the same, however. What works for the design of HTTP, or even IP, works just as well for your blogging API.

So forgive me for being a little old-fashioned, and any time you see "Protocol", just read it as "API" if it makes you happier.

A Whole New World

I haven't, I admit, been working with email recently. But I'd be willing to bet that despite it being mandatory, nobody checks the MIME-Version. As Mark said all those years ago, what would you even do if the check failed?

Mark suggested that since any change to MIME-Version would be a breaking change, the best solution was a whole new protocol. Since MIME is defined as a set of header fields, you'd just use different ones. MIME uses - aside from its MIME-Version, anyway - a set of header fields which begin Content-. Mark suggested we'd have to switch to something like Body- - a clean break.

The effect of that is that you've not "changed version" in any semantic sense, you have "changed protocol" entirely. Sometimes, the best way to solve this is indeed to call your protocols after arbitrary numbers - but don't kid yourself that these are "versions" in any useful sense of the word.

So if you've rooted your protocol at /blogging/v1/ and you want to change something crucial, you'll be changing everything to /blogging/v2/ - and all older clients will break (or continue using the old protocol, maybe forever). Expecting every client and every server to change at once is known as a "Fork Lift Upgrade" - from when these things were baked into hardware you'd need to physically change - and is generally considered a bad thing.

Extensions

Sharp eyes amongst you will have noted that MIME never stopped using Content-Type and friends. Yet MIME didn't stop being developed after 1992, when the original RFCs were published. Revised RFCs were published twice afterward, in 1993 and 1996, and it gained whole new features, like communicating presentation information, in 1997.

These all relied upon Rule One:

Receivers ignore extra stuff they don't understand.

In JSON terms, if an object you get back from an API has some additional keys that are new to you, don't worry about them. They're unimportant.
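As a rough illustration of Rule One on the receiving side, here is a minimal Python sketch. The `Post` type, its fields, and the `parse_post` helper are all invented for this example; the only point is that keys the client has never heard of are dropped quietly rather than treated as errors.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Post:
    """The fields this (hypothetical) client knows about."""
    id: int
    title: str
    body: str = ""


def parse_post(raw: str) -> Post:
    """Build a Post from JSON, silently dropping keys we don't recognise."""
    data = json.loads(raw)
    known = {f.name for f in fields(Post)}
    return Post(**{k: v for k, v in data.items() if k in known})


# A newer server has added "reactions"; this older client is unaffected.
post = parse_post(
    '{"id": 7, "title": "Hello", "body": "Hi", "reactions": {"like": 3}}'
)
print(post)
```

A strict parser that rejected the unknown "reactions" key would turn a harmless server upgrade into a breaking change.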

Many of the X.nnn protocols and their derivatives modify this a bit, by allowing a sender to say whether an extension is Critical or not. Critical extensions, if not understood, mean the message as a whole isn't understood and generates an error. Non-critical extensions can simply be ignored.
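The critical/non-critical distinction might be sketched like this in Python. The message layout (an "extensions" map whose entries carry a "value" and an optional "critical" flag) is an assumption made up for this example, not the encoding any X.nnn protocol actually uses.

```python
from typing import Any

# Extensions this hypothetical receiver implements.
SUPPORTED = {"expiry"}


class UnsupportedCriticalExtension(Exception):
    """Raised when a critical extension is not understood."""


def accept_extensions(message: dict[str, Any]) -> dict[str, Any]:
    """Return the extensions we understand; reject unknown critical ones."""
    understood = {}
    for name, ext in message.get("extensions", {}).items():
        if name in SUPPORTED:
            understood[name] = ext["value"]
        elif ext.get("critical", False):
            # Unknown and critical: the whole message must be refused.
            raise UnsupportedCriticalExtension(name)
        # Unknown and non-critical: Rule One applies, so ignore it.
    return understood


message = {
    "extensions": {
        "expiry": {"value": 60},
        "sparkle": {"value": True},  # unknown, non-critical: ignored
    }
}
print(accept_extensions(message))
```

Marking "sparkle" as critical in the message above would make `accept_extensions` raise instead of returning.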

IMAP, Mark Crispin's Internet Message Access Protocol, was built around an unchanged Rule One. Servers can add data (in specific places - a lot of IMAP's structures are positional) without ill-effect. IMAP4rev1 has been in use since 1996.

XMPP is built entirely around Rule One. Clients and servers routinely throw around traffic they don't understand. Servers usually, in fact, don't understand all the traffic they forward between clients. XMPP has been in use since 1999 (and is also still version 1.0).

Indeed, the "End to end principle" is a specific case of Rule One - receivers that are not the endpoint just pass data through unchanged.

HTTP has had something of a chequered history - both HTTP/0.9 and HTTP/1.0 were designed outside of the usual pool of experts, and suffered as a result. HTTP/1.1 was largely cross-compatible with HTTP/1.0, but fixed most of the issues. HTTP/2 is a complete redesign. The version negotiation in HTTP/1.0 was never actually used in practice - all the changes have been made through either Rule One, or by negotiation.

Negotiation

Sometimes it's nice to know if the traffic you want to send will be understood by the receiver. Other times, you want to tell the receiver to send back additional data.

The solution for this still can be a version - but again, it's a fork-lift, an all or nothing binary switch.

Instead, keep the switches small. IMAP pioneered a model where the server advertises "capability strings", and clients parse those, and then either negotiate specific options or just know they can use a particular capability. You can think of these as explicit or implicit negotiation - which to use depends on the complexity of the feature and how many round-trips you can stomach.
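To make the capability model concrete, here is a small Python sketch of the client side. The response line is a made-up example in the shape of an untagged IMAP CAPABILITY response, and `parse_capabilities` is a helper invented here, not part of any IMAP library.

```python
def parse_capabilities(line: str) -> set[str]:
    """Parse an untagged CAPABILITY response into a set of upper-cased names."""
    parts = line.split()
    if parts[:2] != ["*", "CAPABILITY"]:
        raise ValueError("not a CAPABILITY response")
    # Capability names are case-insensitive, so normalise them.
    return {p.upper() for p in parts[2:]}


caps = parse_capabilities("* CAPABILITY IMAP4rev1 IDLE CONDSTORE AUTH=PLAIN")

# Implicit negotiation: IDLE is advertised, so the client may simply use it.
use_idle = "IDLE" in caps
# Not advertised, so the client sticks to baseline behaviour.
use_qresync = "QRESYNC" in caps

print(use_idle, use_qresync)
```

Each capability is its own small switch: a server can gain or lose one without any client needing to care about the rest.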

Similarly, in XMPP, any entity - client or server - can ask any other what it supports, and move on to explicit or implicit negotiation. Still, much of the time, a strategy of "just use it and see if it works" is just fine - a sort of unilateral implicit negotiation, if you will - or "suck it and see".

HTTP-based protocols - like REST APIs - don't lend themselves well to complex negotiation, but both Rule One and implicit negotiation work well. An overview endpoint can be used to give server-side capabilities and initial endpoints, and requests can use header fields or parameters to indicate supported capabilities and request their use. You can also borrow X.nnn's "Critical Extensions", of course.
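One way this could look for a REST-style API is sketched below in Python, with plain functions standing in for HTTP handlers. Everything specific here is an assumption for illustration: the capability names, the endpoint paths, and the "X-Capabilities" request header are invented, not taken from any standard.

```python
# Capabilities this hypothetical server implements.
SERVER_CAPABILITIES = {"posts", "comments", "reactions"}


def overview() -> dict:
    """Overview endpoint: advertise capabilities and initial endpoints."""
    return {
        "capabilities": sorted(SERVER_CAPABILITIES),
        "endpoints": {"posts": "/blogging/posts"},
    }


def get_post(headers: dict[str, str]) -> dict:
    """Return a post, adding optional data only if the client asked for it."""
    requested = {
        c.strip()
        for c in headers.get("X-Capabilities", "").split(",")
        if c.strip()
    }
    # Capabilities the server doesn't know are ignored (Rule One again).
    active = requested & SERVER_CAPABILITIES
    post = {"id": 7, "title": "Hello"}
    if "reactions" in active:
        post["reactions"] = {"like": 3}
    return post


print(overview())
print(get_post({}))                                     # old client
print(get_post({"X-Capabilities": "reactions, polls"}))  # newer client
```

The old client and the newer one hit the same URI, and neither needs to know what the other supports.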

Done right, individual servers and clients can be upgraded entirely independently.

And don't for a moment think that none of this matters because it's HTTP and your client code is downloaded from your site. If it's a public API, that won't be true, and if it's a private API, people will still leave their browsers open all the time if your site is any good.

And versions once more

What you now have is not one protocol, monolithic and versioned, but dozens of small, specialised protocols. A blogging protocol might have one protocol for posts, another for comments, as a simple example. Posts might contain a set of content portions, and images might well be a different thing to text.

Usually, the best way to adapt these to new needs is to add new features, either following Rule One, or by negotiation as needed. But the cost of a whole new extension is much smaller than the cost of a whole new protocol, and since "Naming Things" remains one of the two hardest problems in Computer Science (alongside Cache Invalidation and Off By One Errors), naming protocols with a simple version number does make a lot of sense.

In XMPP, protocol extensions work by XML namespaces - loosely, we name protocols by a URI. For official extensions, we use a URN form that includes a version number. Each new version is an entirely different protocol - clients that understood, say, "urn:xmpp:mam:0" won't be able to understand "urn:xmpp:mam:1" at all - but servers can easily support both, and often do during transitional periods.
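A server supporting both during a transition might dispatch on the namespace, roughly as in this Python sketch. The two namespace URNs come from the text above; the handler functions, their return values, and the bare `<query/>` element are simplified placeholders rather than real XMPP processing.

```python
import xml.etree.ElementTree as ET


def handle_mam_v0(query: ET.Element) -> str:
    return "handled as mam:0"


def handle_mam_v1(query: ET.Element) -> str:
    return "handled as mam:1"


# Each namespace names an entirely separate protocol with its own handler.
HANDLERS = {
    "urn:xmpp:mam:0": handle_mam_v0,
    "urn:xmpp:mam:1": handle_mam_v1,
}


def dispatch(xml_text: str) -> str:
    """Route an element to a handler based purely on its XML namespace."""
    query = ET.fromstring(xml_text)
    # ElementTree renders a namespaced tag as "{namespace}localname".
    tag = query.tag
    namespace = tag[1:].partition("}")[0] if tag.startswith("{") else ""
    handler = HANDLERS.get(namespace)
    if handler is None:
        return "unsupported"
    return handler(query)


print(dispatch('<query xmlns="urn:xmpp:mam:0"/>'))
print(dispatch('<query xmlns="urn:xmpp:mam:1"/>'))
print(dispatch('<query xmlns="urn:xmpp:mam:9"/>'))
```

Nothing in the dispatch treats "1" as newer than "0"; the number is just part of the name.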

Much of the time, we're able to incorporate changes in a way that doesn't force a "namespace bump" on us. Many protocol extensions survive their entire development period on "version 0". Sometimes, this is by horrible abuse of XML namespaces, mind - where we change XML schema arbitrarily, knowing that in practice, things just work. But occasionally, we need to make a breaking change.

So while it's well worth avoiding changing versions even when they're at the extension level, at least if you do need to do so, it's useful that it's so fine-grained.

But remember - these are no different to just using a different, entirely new, protocol. For REST APIs, changing a "v1" to a "v2" is no different from changing the endpoint URI in any other way.

But they're not versions

Real versions - like what's now called "Semantic Versioning" (and was used for soname versioning for years) - contain all sorts of information about backwards compatibility and so on.

These are very useful and effective for software library versioning, and there's a temptation to assume that this model works well for protocols too.

It does not.

The evidence of MIME is that a successful, well-designed, extensible protocol will never change version - whereas protocol extensibility and feature negotiation can make a simple design last decades.

The model of extensible protocols with feature negotiation has been developed and refined many times over the past few decades. As a concept, it has consistently stood the test of time.

You might not be designing your protocol or API to last decades, of course. But just as you should use semantic versioning for software libraries even if you're not planning on maintaining them for decades, you should design extensible protocols as a matter of course.

Top comments (10)

zakwillis • Jan 2 '20

Hi Dave, trying to understand this.

Are you stating that the endpoint, depending upon versions, should be:
/shoppingbasket/user/v1/fred (original)
/shoppingbasket/user/v2/fred (this json/DTO object may have more attributes).
? I think you are.

Thinking of this in .Net Core/.Net Web API, these are separate endpoints and they have to be because they are returning different objects. Like Martin says, seems like double the pain but I get your idea.

I have seen certain APIs do something like this (JIRA)
server/version/shoppingbasket/fred

This makes sense as you could maintain separate code bases but simply deploy the URL without screwing up the routing. In your client request, the base URL can be configured. Clients know what they are signing up for.

Anyway, interesting thoughts.

Dave Cridland • Jan 2 '20

I'm really saying you don't want to do v2. Have /shoppingbasket/user/fred return an object, and the caller be forgiving about new attributes. If you need truly breaking changes, then you do need a v2, but it's a last resort and an indication you've broken something.

HTTP (or at least REST) doesn't make this sort of thing easy.

zakwillis • Jan 3 '20

Hi Dave, I thought about this some more. It makes sense. Automappers can be strict or optimistic, so they can rigidly enforce a data structure in the client.

My main concern with your approach is;

We are assuming no fields/substructures are ever removed. A protocol is simply a common set of rules for communication. The parts of speech can be part of a protocol toolkit and can never be removed. In the data based example, the data is not a protocol - it is subject to change, can see fields removed. In our day and age, we are becoming ever sensitive to risk upset to others and are constantly changing policies etc.

What might have to happen, is the protocol (endpoint in your system) delivers simply a validation object and a version object. This tells the client what it has to conform to. There could be an additional validation end point. This is quite neat because it lets the validator end point say which formats it conforms to?

REST is itself, a protocol, yet the packets are unknown to the protocol. A bit like a pipe doesn't care what is shoved down it, as long as it fits. Plumbers know that a soil pipe is a protocol, a mains pipe provides water - they don't test what goes through it when they connect a toilet or a tap. A manufacturer can quite rightly say, the dishwasher was never designed to run on the waste pipe protocol.

zakwillis • Jan 2 '20

Understood Dave. Will think about it some more. Thanks.

Martin Häusler • Jul 2 '19

An interesting take on the problem. As a server-side developer, supporting two versions of an API that are actually different (breaking changes) is very difficult. Twice the number of endpoints, twice the number of DTOs, twice the bugs, twice the headache... Breaking changes also mean potentially very little source code sharing between the two versions; translating to twice the maintenance effort too. The argument "if you ever do a /v2 then it's already a different endpoint" really struck home for me. Gonna have to think about that some more.

Sebastian Vargr • Jan 17 '20 • Edited

This is why I like the GraphQL API format.

The client says "I want Y in X shape", and the server complies if it can.

That way the server adapts to whatever the client wants, instead of the other way around.

Furthermore incompatibilities only occur if the server stops supporting whatever shape the client is requesting, and even better the API url can stay the same.

Héctor López • Apr 13 '19

"These are very useful and effective for software library versioning, and there's a temptation to assume that this model works well for protocols too.

It does not."

I believe this is something you only realize when you've gone through and suffered from it. Guilty.

Dave Cridland • Apr 13 '19

Pretty much every major protocol and API has tried to do this. MIME has its fixed header field. XMPP has 'version="1.0"' dutifully exchanged between every peer, every time. HTTP was standardized at 1.1 only to distinguish it from the (rather borked) 1.0. Even IMAP went through a series of versioned revisions (though later versions are themselves actually capabilities), and Mark was one of the first to identify this as a problem.

If you're only seeing this in hindsight, you're in really good company.

Tony Metzidis • Jan 9 '20

great review. how do you feel JS has handled versioning e.g. ES5, ES6 ?

Zia • Apr 12 '19

Good explanation