It's Christmas Eve, in 1991. All around the world, children are reaching fever pitch in excitement. Many adults are brimming with enthusiasm too.
But Mark Crispin is not. Instead, he sends a long email to the "ietf-822" mailing list, detailing what's wrong with the newly proposed
MIME-Version header field. It's almost unusable in its current form, he says. And what is one supposed to do, he asks, if you get a different version to
1.0? "WHAT DOES IT MEAN???" he asks, in a rare burst of all-caps.
He ends by saying that if the community ever needs to change the version, they'd be better off making a whole new protocol - and he'll fight strenuously against any move to simply change the version number.
A small detour is in order here.
MIME, or more formally, the Multipurpose Internet Mail Extensions (or "Multimedia"; like many IETF acronyms, the expansion has changed over time), was created in the early 1990s as a replacement for the often arbitrary methods of "improving" Internet Mail.
Competing proposals, like X.400, had much richer, more complex message formats available than the Internet's plain text. The concept of Unicode wasn't around yet, but the idea of different character sets and different ways of representing them certainly was - but again, Internet Mail just supported ASCII. Proprietary email protocols - of which there were many - also supported rich multimedia messaging. The wider community had introduced exciting things like "uuencode" and "binhex" to work around the lack.
So throughout the early 1990s, a concerted effort was made to expand Internet Mail to include all these kinds of rich messaging - formatted text, images, and attachments.
By the time I started on email, some four or five years later, MIME was the hot new thing, and many of us had to painstakingly learn how to eyeball Quoted Printable in order to read messages.
MIME went on to become the basis of multimedia on the web - HTTP, despite starting as the "Hypertext Transfer Protocol", would be better named the "MIME Transfer Protocol". As the core of both the web and email - the two most widely used application-level protocols by far - MIME can lay a reasonable claim to being the most widely used design in history.
It's now nearly three decades since MIME was designed. Perhaps it's still too early to tell, but it really looks like Mark was right.
It's still mandatory to include
OK, so we're this far into the article, and I've mentioned "versions", but not "APIs".
When we were young, we called APIs "Protocols". These days, that word seems reserved for things lower in the stack. The design principles remain the same, however. What works for the design of HTTP, or even IP, works just as well for your blogging API.
So forgive me for being a little old-fashioned, and any time you see "Protocol", just read it as "API" if it makes you happier.
I haven't, I admit, been working with email recently. But I'd be willing to bet that despite it being mandatory, nobody checks the MIME-Version. As Mark said all those years ago, what would you even do if the check failed?
Mark suggested that since any change to MIME-Version would be a breaking change, the best solution was a whole new protocol. Since MIME is defined as a set of header fields, you'd just use different ones. MIME uses - aside from MIME-Version itself, anyway - a set of header fields which begin "Content-". Mark suggested we'd have to switch to something like "Body-" - a clean break.
The effect of that is that you've not "changed version" in any semantic sense, you have "changed protocol" entirely. Sometimes, the best way to solve this is indeed to call your protocols after arbitrary numbers - but don't kid yourself that these are "versions" in any useful sense of the word.
So if you've rooted your protocol at "/blogging/v1/" and you want to change something crucial, you'll be changing everything to "/blogging/v2/" - and all older clients will break (or continue using the old protocol, maybe forever). Expecting every client and every server to change at once is known as a "Fork Lift Upgrade" - from when these things were baked into hardware you'd need to physically change - and is generally considered a bad thing.
Sharp eyes amongst you will have noted that MIME didn't stop using "Content-Type" and friends, either. Yet MIME didn't stop being developed after 1992, when the original RFCs were published. New RFCs were published twice afterward - in 1993 and 1996 - and it developed whole new features, like communicating presentation information, in 1997.
These all relied upon Rule One:
- Receivers ignore extra stuff they don't understand.
In JSON terms, if an object you get back from an API has some additional keys that are new to you, don't worry about them. They're unimportant.
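As a minimal sketch of Rule One in Python: the receiver below picks out the keys it knows and silently ignores the rest. The shape of the post and the field names ("title", "body", "reactions") are hypothetical, invented purely for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    body: str


def parse_post(raw: str) -> Post:
    """Build a Post from JSON, ignoring any keys this client doesn't know."""
    data = json.loads(raw)
    # Pick out only the fields we understand. Unknown keys are never an error.
    return Post(title=data["title"], body=data.get("body", ""))


# A newer server has added a "reactions" key this client has never heard of.
newer_response = '{"title": "Hello", "body": "First post", "reactions": {"+1": 3}}'
post = parse_post(newer_response)
print(post)
```

The thing to avoid is the opposite habit: validating responses against a strict schema that rejects unknown keys, which turns every server-side addition into a breaking change.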
Many of the X.nnn protocols and their derivatives modify this a bit, by allowing a sender to say whether an extension is Critical or not. If a receiver doesn't understand a Critical extension, it must treat the message as a whole as not understood, and generate an error. Non-critical extensions can simply be ignored.
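The critical/non-critical distinction might be sketched like this in Python. The message layout (a list of extensions, each with a "name" and an optional "critical" flag) and the extension names are assumptions made up for the example, not taken from any real X.nnn encoding.

```python
class UnsupportedCriticalExtension(Exception):
    """Raised when a message carries a critical extension we don't understand."""


# Hypothetical set of extensions this receiver implements.
KNOWN_EXTENSIONS = {"signature", "expiry"}


def accept_extensions(extensions: list) -> list:
    """Return the extensions we understand, rejecting unknown critical ones."""
    understood = []
    for ext in extensions:
        if ext["name"] in KNOWN_EXTENSIONS:
            understood.append(ext)
        elif ext.get("critical", False):
            # Unknown and critical: the whole message must be rejected.
            raise UnsupportedCriticalExtension(ext["name"])
        # Unknown and non-critical: plain Rule One, just ignore it.
    return understood


message_extensions = [
    {"name": "expiry", "value": "2030-01-01"},
    {"name": "sparkles", "critical": False},
]
accepted = accept_extensions(message_extensions)
print([ext["name"] for ext in accepted])
```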
IMAP, Mark Crispin's Internet Message Access Protocol, was built around an unchanged Rule One. Servers can add data (in specific places - a lot of IMAP's structures are positional) without ill effect. IMAP4rev1 has been in use since 1996.
XMPP is built entirely around Rule One. Clients and servers routinely throw around traffic they don't understand. Servers usually, in fact, don't understand all the traffic they forward between clients. XMPP has been in use since 1999 (and is also still version 1.0).
Indeed, the "End to end principle" is a specific case of Rule One - receivers that are not the endpoint just pass data through unchanged.
HTTP has had something of a chequered history - both HTTP/0.9 and HTTP/1.0 were designed outside of the usual pool of experts, and suffered as a result. HTTP/1.1 was largely cross-compatible with HTTP/1.0, but fixed most of the issues. HTTP/2 is a complete redesign. The version negotiation in HTTP/1.0 wasn't actually ever used in practice - all the changes have been made through either Rule One, or by negotiation.
Sometimes it's nice to know if the traffic you want to send will be understood by the receiver. Other times, you want to tell the receiver to send back additional data.
The solution for this still can be a version - but again, it's a fork-lift, an all or nothing binary switch.
Instead, keep the switches small. IMAP pioneered a model where the server advertises "capability strings", and the clients would parse those, and either negotiate specific options or just know they could use a particular capability. You can think of these as explicit or implicit negotiation - which to use depends on the complexity of the feature and how many round-trips you can stomach.
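A rough Python sketch of the client side of this, using an IMAP-style capability line. The parsing is deliberately simplified (a real IMAP client has a lot more to cope with), capability names are compared case-insensitively here, and the function names are my own for the example.

```python
def parse_capabilities(line: str) -> set:
    """Parse an IMAP-style '* CAPABILITY ...' line into a set of names."""
    tokens = line.split()
    if tokens[:2] != ["*", "CAPABILITY"]:
        raise ValueError("not a capability line")
    # Normalise case so lookups don't depend on how the server spelled things.
    return {token.upper() for token in tokens[2:]}


def wait_strategy(capabilities: set) -> str:
    """Implicit negotiation: if IDLE is advertised, just use it; else poll."""
    return "idle" if "IDLE" in capabilities else "poll"


caps = parse_capabilities("* CAPABILITY IMAP4rev1 IDLE NAMESPACE")
print(wait_strategy(caps))
```

Each capability is its own small switch: a server that gains or loses one changes only the behaviour that depends on it, and everything else carries on as before.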
Similarly, in XMPP, any entity - client or server - can ask any other what it supports, and move on to explicit or implicit negotiation. Still, much of the time, a strategy of "just use it and see if it works" is just fine - a sort of unilateral implicit negotiation, if you will - or "suck it and see".
HTTP based protocols - like REST APIs - don't lend themselves well to a complex negotiation, but both Rule One and implicit negotiation work well. An overview endpoint can be used to give server-side capabilities and initial endpoints, and requests can use header fields or parameters to indicate supported capabilities and request their use. You can also borrow X.nnn's "Critical Extensions", of course.
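Here's one way that might look for a REST API, sketched in Python without any actual network traffic. Everything specific is hypothetical: the overview document's layout, the capability names, and the "X-Blog-Capabilities" header are all invented for the example.

```python
# What a (hypothetical) overview endpoint might return.
OVERVIEW = {
    "endpoints": {"posts": "/blogging/posts", "comments": "/blogging/comments"},
    "capabilities": ["posts.drafts", "comments.threading"],
}

# What this particular client build knows how to use.
CLIENT_SUPPORTS = {"comments.threading", "posts.scheduling"}


def build_request(overview: dict, resource: str) -> dict:
    """Describe a GET request that asks only for mutually supported features."""
    advertised = set(overview.get("capabilities", []))
    usable = sorted(CLIENT_SUPPORTS & advertised)
    headers = {}
    if usable:
        # Hypothetical request header naming the capabilities we want used.
        headers["X-Blog-Capabilities"] = ", ".join(usable)
    return {"method": "GET", "url": overview["endpoints"][resource], "headers": headers}


request = build_request(OVERVIEW, "comments")
print(request)
```

An older server that doesn't list capabilities at all simply gets a request with no extra header, and a server that sees capability names it doesn't recognise can fall back on Rule One.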
Done right, individual servers and clients can be upgraded entirely independently.
And don't for a moment think that because it's HTTP, your client code is downloaded from your site, so it doesn't matter - if it's a public API, that won't be true, and if it's a private API people will still leave their browsers open all the time if your site is any good.
What you now have is not one protocol, monolithic and versioned, but dozens of small, specialised protocols. A blogging protocol might have one protocol for posts, another for comments, as a simple example. Posts might contain a set of content portions, and images might well be a different thing to text.
Usually, the best way to adapt these to new needs is to add new features, either following Rule One, or by negotiation as needed. Occasionally you'll need a whole new extension instead, but the cost of that is much smaller than the cost of a whole new protocol. And since "Naming Things" remains one of the two hardest problems in Computer Science (alongside Cache Invalidation and Off By One Errors), naming protocols with a simple version number does make a lot of sense.
In XMPP, protocol extensions work by XML namespaces - loosely, we name protocols by a URI. For official extensions, we use a URN form that includes a version number. Each new version is an entirely different protocol - clients that understood, say, "urn:xmpp:mam:0" won't be able to understand "urn:xmpp:mam:1" at all - but servers can easily support both, and often do during transitional periods.
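On the server side, supporting both during a transition amounts to little more than a lookup table keyed by namespace. A Python sketch, in which the handler functions and their return values are placeholders of my own and not real XMPP processing:

```python
from typing import Optional


def handle_mam_0(payload: str) -> str:
    # Placeholder for the older protocol's processing.
    return f"mam:0 handled {payload}"


def handle_mam_1(payload: str) -> str:
    # Placeholder for the newer, incompatible protocol's processing.
    return f"mam:1 handled {payload}"


# Each namespace names an entirely separate protocol; the trailing number
# is just part of the name, not something the server compares or orders.
HANDLERS = {
    "urn:xmpp:mam:0": handle_mam_0,
    "urn:xmpp:mam:1": handle_mam_1,
}


def dispatch(namespace: str, payload: str) -> Optional[str]:
    """Route a request by namespace; None means 'not supported here'."""
    handler = HANDLERS.get(namespace)
    if handler is None:
        return None
    return handler(payload)


print(dispatch("urn:xmpp:mam:0", "query"))
print(dispatch("urn:xmpp:mam:1", "query"))
```

Dropping the old protocol later is a one-line change to the table, and nothing about the newer handler has to care that the older one ever existed.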
Much of the time, we're able to incorporate changes in a way that doesn't force a "namespace bump" on us. Many protocol extensions survive their entire development period on "version 0". Sometimes, this is by horrible abuse of XML namespaces, mind - where we change XML schema arbitrarily, knowing that in practice, things just work. But occasionally, we need to make a breaking change.
So while it's well worth avoiding changing versions even when they're at the extension level, at least if you do need to do so, it's useful that it's so fine-grained.
But remember - these are no different to just using a different, entirely new, protocol. For REST APIs, changing a "v1" to a "v2" is no different from changing the endpoint URI in any other way.
Real versions - like what's now called "Semantic Versioning" (and was used for soname versioning for years) - contain all sorts of information about backwards compatibility and so on.
These are very useful and effective for software library versioning, and there's a temptation to assume that this model works well for protocols too.
It does not.
The evidence of MIME is that a successful, well-designed, extensible protocol will never change version - whereas protocol extensibility and feature negotiation can make a simple design last decades.
The model of extensible protocols with feature negotiation has been developed and refined many times over the past few decades. As a concept, it has consistently stood the test of time.
You might not be designing your protocol or API to last decades, of course. But just as you should use semantic versioning for software libraries even if you're not planning on maintaining them for decades, you should design extensible protocols as a matter of course.