James Turner

Posted on May 25, 2019

Should behavioural changes be considered breaking changes under SemVer?

#discuss #healthydebate #semver

While writing a different post, I had a thought about point 8 of the Semantic Versioning (SemVer) specification.

8. Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API. It MAY include minor and patch level changes. Patch and minor version MUST be reset to 0 when major version is incremented.

The term "backwards incompatible change" is interesting here because it doesn't specifically reference interfaces and contracts between two pieces of code - a "backwards incompatible change" can just as easily be a behavioural change of the code base.

Consider the following for an example of a behavioural code change:

Initial code:



function encodeText(text) {
    return text.replace('a', 'xyz');
}

New code:



function encodeText(text) {
    return text.replace('a', 'xyz').replace('b', 'foo');
}

Contractually, the code is the same as it is still called encodeText, still takes one argument and still returns a string. Its behaviour on the other hand isn't as now additional text is encoded too.

Let's take it one step further, what about bug fixes?

For a bug fix, you are changing one set of behaviours for another (unintended behaviour to intended behaviour). The original bug may never have been intended but given it is now in the wild, it might now be depended to act that way.

You might be thinking that is a little extreme and I agree. If bug fixes were considered breaking changes because they are a behavioural change, there would be a lot more major versions for libraries.

So question time:

Should behavioural changes be counted as a type of breaking change?
If yes to above, should bug fixes be counted as a breaking change?

Top comments (13)

rhymes • May 25 '19

If encodeText is part of the API of the program then it's a breaking change. Because now you replace "foo" with something else.

Regarding bug fixes, it depends on the scope I guess. If the bug fix changes the behavior so much that with the old version the program users were relying on something that was faulty and now it's doing the right thing but that thing has a different outcome, it's a breaking change.

James Turner • May 25 '19

What makes it hard is that fixing any bug would change the behaviour as the code goes from bug to bug free. If it had no behavioural change, the bug couldn't be fixed.

It is totally a trade off though as it would get ridiculous jumping major versions for every bug fix. I guess the intention of SemVer isn't to be that pedantic though.

I have written code before where I captured exceptions to look at the specific exception message to perform an action. I'm sure the developers would have never considered the exception message a breaking change. I consider this now bad code but it does show how we can depend on the unintentional.

Dian Fay • May 27 '19

The difference is that the behavior of a bug is unanticipated. If I call your encodeText function, it's because I want to transform 'abbreviate' into 'xyzbbrevixyzte'. If it throws when I pass in 'abbŗévīaţè' and you fix it to handle expanded character sets, you're not interfering with existing usages so there's no reason for it to be more than a semver-patch bump. If, however, the fix means that the function no longer works for length-1 strings for arcane Unicode reasons, it's still a bugfix but one with a breaking change involved which requires a semver-major bump.

If encodeText suddenly turns 'abbreviate' into 'xyzfoofoorevixyzte' instead, that's either a breaking change (semver-major) or a bug which should be fixed with a semver-patch bump.

rhymes • May 28 '19 • Edited

I just read this old issue report that might be relevant. Sometimes backports introduce breaking changes that aren't foreseen :D

Consider defaulting Application.confidential to false #1142

rhymes commented on Sep 10, 2018

Recently we had an app break because of the following reasons:

version 4.4.0 was released with the following changelog message: Backport security fix from 5.x for token revocation when using public clients
we applied the security fix knowing revocation wasn't an issue for our use case
the security fix went to production without deep testing
Android users started complaining, iOS users were fine
after reverting and debugging we noticed a difference between the two: Android clients didn't send the client secret, iOS did
the security fix set confidential to true which seems to disable clients without client secret

Setting it to false fixed the issue.

My question is: shouldn't a breaking change like this be clearer?

The changelog doesn't mention it, the upgrade guide says to add the migration but it doesn't clearly state: "hey, the default will break clients without client secret key"

Expected behavior

Either the default should change or the documentation should be clearer about this change.

Thank you

View on GitHub

James Turner • May 29 '19

Wow, that is a subtle breaking change!

Sam Rose • May 27 '19

I think I'd say yes, in most cases. If it's something that has come to be depended on, it's nice to give people the option to depend on a version that still has the old behaviour and update to the new behaviour when they're ready. :)

Sam Rose • May 27 '19

Further still, what do you think about performance changes? If you change nothing about an API call except making it O(n) instead of O(1), is that a breaking change?

James Turner • May 27 '19

I was thinking about that too - technically if someone wrote some really bad consuming code that was doing tasks in parallel and now your update significantly speeds up one aspect which breaks how other things interact with it, arguably that is a breaking change.

Sure, it is a super unfair case for the developer who wrote the update but it really shows you how ambiguous the phrase "breaking change" is!

I think in the end, it is trying to follow the spirit of the change to say whether it is breaking or not. Changing the arguments of a public function, that is totally a breaking change. Changing the functional behaviour of a function (like my example in the post) is a breaking change. Fixing a bug where a null pointer exception is being thrown, that shouldn't be a breaking change. Improving performance shouldn't be a breaking change.

Next we find out that someone is dependent on the number of bytes used in the software and the update broke their workflow... I guess that would make sense in the 70s - 90s where memory size was a direct problem.

Timothy McGrath • May 25 '19

I think a behavioral change should definitely be a breaking change. If the new behavior (or fixed behavior) isn't the same as a client previously expected you definitely could break a client.

This is especially tricky because the client won't even get a compile error when they update to your new version. It might even be worth deprecating the old method and adding a new one for this reason...

James Turner • May 26 '19

I guess though it is hard to determine whether a client previously expected it - for all we know, the client is actually checking the stack trace of exceptions (for whatever reason that would be useful) to determine how their code should operate. Or bug fixes, where is the line that any bug fix isn't a breaking change because they are behavioural change? Is there no line? Does it make the "patch" versions of SemVer redundant if every behavioural change doesn't work?

I get to an extent, if someone is using reflection to go through the code, it is more on them than on the code they are consuming. With something like exceptions though, all the properties on them are non-reflection from the consumer's point of view.

James Turner • May 26 '19

Interesting - yeah, in cases like an API where you can track how people used the API and what the responses were like, it gives you the tools to be confident about how people use it and what it would take to break any rules.

From the perspective of an open source library though, you really don't have either of those as options. Really everything is on the table except maybe using reflection to access internal code (as that hardly seems fair and is against the spirit of consuming the library). Something as seemingly trivial as the error message on an exception object could change the flow of the consumer's code. I don't think that is obviously a good idea but I guess what I am getting at is the grey area of "breaking change".

It might start with something like an exception message to then you find out some code is depending on the actual stack trace of an exception (again, I think this would be a really bad idea BUT it is a valid property on an exception - in C# at least - without needing to use reflection).

Going back to your example of actually monitoring, capturing and replaying activity against an API - that is quite fascinating. I mean, it completely makes sense but at the same time, seems like quite an undertaking. Would love to know more, anything you can share (eg. Is it a custom solution you made? Something off-the-shelf? Do you need to worry about sensitive data? etc) would be great.

View full discussion (13 comments)