I got the inspiration for this article from an amazing tech talk by Ilya Sazonov and Fedor Sazonov. If you know Russian, go check it out. It's worth it.
In this article, I cover:
- Why should you care about forward compatible enum values?
- What are the ways to achieve it?
- How can the Jackson library help you out?
Suppose we're developing a service that consumes data from some input (e.g. Apache Kafka, RabbitMQ, etc.), deduplicates messages, and produces the result to some output. Look at the diagram below that describes the process.
As you can see, the service resolves deduplication rules by the platform field value.
- If the platform is WEB, deduplicate all the messages in a 2-hour window.
- If the platform is MOBILE, deduplicate all the messages in a 3-day window.
- Otherwise, proceed with the message flow as-is.
We’re not discussing the technical details behind the deduplication process. It could be Apache Flink, Apache Spark, or Kafka Streams. Anyway, it’s out of the scope of this article.
Regular enum issue
It seems like a simple Java enum is a perfect candidate to map the platform field. Look at the code example below.
public enum Platform {
WEB, MOBILE
}
We get a strongly typed platform field, which helps to catch possible errors at compile time. Also, we can add new enum values easily.
Isn’t that a brilliant approach? Actually, it is not. We’ve forgotten the third rule of deduplication. It says that an unknown platform value should not trigger any deduplication but proceed with the message flow as-is. Then what happens if the service consumes a message like the one below?
{
"platform": "SMART_TV",
...
}
There is a new platform called SMART_TV, and no one has warned us that we need to deal with the new value. After all, the third rule of deduplication should cover this scenario, right? In reality, we would get a deserialization error that leads to unexpected termination of the message flow.
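The failure mode can be reproduced without Jackson. The sketch below uses only the JDK: Enum.valueOf rejects an undeclared name with an IllegalArgumentException, which is analogous to (though not the same exception as) what a JSON mapper reports when it cannot bind SMART_TV to the enum. The class StrictEnumDemo and the helper isRejected are made-up names for illustration.

```java
// A plain enum, as in the article.
enum Platform { WEB, MOBILE }

final class StrictEnumDemo {

    // Returns true when the plain enum cannot represent the raw value.
    // Enum.valueOf throws IllegalArgumentException for undeclared names.
    static boolean isRejected(String raw) {
        try {
            Platform.valueOf(raw);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected("WEB"));      // false
        System.out.println(isRejected("SMART_TV")); // true
    }
}
```

In a real pipeline nobody catches that exception on purpose, so the message with the new platform never reaches the output.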
UNKNOWN value antipattern
What can we do about it? Sometimes developers tend to add a special UNKNOWN value to handle such errors. Look at the code example below.
public enum Platform {
WEB, MOBILE, UNKNOWN
}
Everything seems great. If the platform field has some unexpected string, just map it to the Platform.UNKNOWN value. However, it means that the output message topic receives a corrupted platform field. Look at the diagram below.
Though we haven’t applied any deduplication rules, the client received an erased value of the platform field. Sometimes that’s OK, but not in this case. The deduplication-service is just a middleware that should not make any unexpected modifications to the submitted message flow. Therefore, the UNKNOWN value is not an option.
Besides, the presence of UNKNOWN has some design drawbacks as well. Since it’s an actual value, one can accidentally use it where it doesn't belong. For example, you may want to traverse all existing enum values with Platform.values(). But UNKNOWN is not one that you wish to use in your code. As a matter of fact, avoid introducing UNKNOWN enum values at all costs.
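Both drawbacks can be seen in a few lines of plain Java. This is only a sketch: PlatformWithUnknown, its parse method, and the roundTrip helper are hypothetical names, and roundTrip merely imitates a middleware that reads the field and writes it back.

```java
import java.util.Arrays;

enum PlatformWithUnknown {
    WEB, MOBILE, UNKNOWN;

    // Maps any unrecognized input to UNKNOWN, discarding the original string.
    static PlatformWithUnknown parse(String raw) {
        for (PlatformWithUnknown candidate : values()) {
            if (candidate != UNKNOWN && candidate.name().equals(raw)) {
                return candidate;
            }
        }
        return UNKNOWN;
    }
}

final class UnknownValueDemo {

    // Imitates the middleware: parse the incoming field, then emit it again.
    static String roundTrip(String raw) {
        return PlatformWithUnknown.parse(raw).name();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("WEB"));      // WEB
        System.out.println(roundTrip("SMART_TV")); // UNKNOWN (original value lost)
        // The placeholder also shows up when iterating over "all platforms".
        System.out.println(Arrays.asList(PlatformWithUnknown.values())); // [WEB, MOBILE, UNKNOWN]
    }
}
```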
String typing
What if we don't use an enum at all but just deal with the plain String value? In that case, the clients can assign any string to the platform field without breaking the pipeline. That's a valid approach if you don't have to introduce any logic based on the provided value. But we have deduplication rules that depend on the platform, meaning that string literals like "WEB" or "MOBILE" would be repeated throughout the code.
The problems don't end here. Imagine that the client sent additional requirements for resolving the platform field:
- The value should be treated as case insensitive. So, "WEB", "web", and "wEB" strings are all treated as the WEB platform.
- Leading and trailing spaces should be omitted. It means that the " mobile " value is trimmed to the "mobile" string and converted to the MOBILE platform.
Now the code may look like this:
var resolvedPlatform = message.getPlatform().toUpperCase().trim();
if ("WEB".equals(resolvedPlatform)) {
...
}
else if ("MOBILE".equals(resolvedPlatform)) {
...
}
...
Firstly, this code snippet is rather smelly. Secondly, the compiler cannot track possible errors due to string typing usage. So, there is a higher chance of making a mistake.
As you can see, string typing solves the issue with the forward compatibility, but still it’s not a perfect approach.
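If you do stay with strings, the normalization is worth isolating in one place. The sketch below is an assumption about how that could look, not part of the original service: PlatformNames, normalize, and isWeb are hypothetical names. It also passes Locale.ROOT to toUpperCase, because the no-argument overload uses the JVM's default locale, and under a Turkish locale the letter "i" in "mobile" upper-cases to a dotted "İ" that would not match "MOBILE".

```java
import java.util.Locale;

final class PlatformNames {
    static final String WEB = "WEB";
    static final String MOBILE = "MOBILE";

    private PlatformNames() {
    }

    // Trims surrounding whitespace and upper-cases with a fixed locale,
    // so the result does not depend on the JVM's default locale.
    static String normalize(String raw) {
        return raw.trim().toUpperCase(Locale.ROOT);
    }

    static boolean isWeb(String raw) {
        return WEB.equals(normalize(raw));
    }

    static boolean isMobile(String raw) {
        return MOBILE.equals(normalize(raw));
    }

    public static void main(String[] args) {
        System.out.println(normalize("  mobile ")); // MOBILE
        System.out.println(isWeb("wEB"));           // true
        System.out.println(isMobile("SMART_TV"));   // false
    }
}
```

The constants remove the repeated literals, but the compiler still cannot tell you that a branch for a new platform is missing.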
Forward compatible enums
Thankfully, Jackson provides a great mechanism to deal with unknown enum values much more cleanly. First, we should create an interface Platform. Look at the code snippet below.
public interface Platform {
String value();
}
As you can see, the implementations encapsulate the string value that the client passed through the input message queue.
Then we declare a regular enum implementation as a nested type. Look at the code example below.
public enum Enum implements Platform {
WEB, MOBILE;
public static Enum parse(String rawValue) {
if (rawValue == null) {
throw new IllegalArgumentException("Raw value cannot be null");
}
var trimmed = rawValue.toUpperCase().trim();
for (Enum enumValue : values()) {
if (enumValue.name().equals(trimmed)) {
return enumValue;
}
}
throw new IllegalArgumentException("Cannot parse enum from raw value: " + rawValue);
}
@Override
@JsonValue
public String value() {
return name();
}
}
That's a regular Java enum we've seen before. Though there are some details I want to point out:
- The Jackson @JsonValue annotation tells the library to serialize the whole object as the result of a single method invocation. Meaning that Jackson always serializes Platform.Enum as the result of the value() method.
- We're going to use the static parse method to obtain the enum value from the raw String input.
And now we're creating another Platform implementation to carry unexpected platform values. Look at the code example below.
@Value
public class Simple implements Platform {
String value;
@Override
@JsonValue
public String value() {
return value;
}
}
@Value is an annotation from the Lombok library. It generates equals, hashCode, toString, getters, and an all-args constructor, and marks all the fields as private and final.
Just a dummy container for the raw string.
After adding the Enum and Simple implementations, let's also create a static factory method to create the Platform from the provided input. Look at the code snippet below.
public interface Platform {
String value();
static Platform of(String value) {
try {
return Platform.Enum.parse(value);
}
catch (IllegalArgumentException e) {
return new Simple(value);
}
}
}
The idea is trivial. Firstly, we’re trying to create the Platform as a regular enum value. If parsing fails, then the method returns the Simple wrapper instead.
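To try the factory in isolation, here is a dependency-free sketch of the same idea. It is an approximation rather than the article's exact code: the Jackson annotations are left out, Simple is written by hand instead of being generated by Lombok, Locale.ROOT is my own addition to keep upper-casing independent of the default locale, and PlatformDemo is a hypothetical driver class.

```java
import java.util.Locale;
import java.util.Objects;

interface Platform {
    String value();

    // Tries the known constants first and falls back to a raw-string wrapper.
    static Platform of(String value) {
        try {
            return Enum.parse(value);
        } catch (IllegalArgumentException e) {
            return new Simple(value);
        }
    }

    enum Enum implements Platform {
        WEB, MOBILE;

        static Enum parse(String rawValue) {
            if (rawValue == null) {
                throw new IllegalArgumentException("Raw value cannot be null");
            }
            String normalized = rawValue.trim().toUpperCase(Locale.ROOT);
            for (Enum candidate : values()) {
                if (candidate.name().equals(normalized)) {
                    return candidate;
                }
            }
            throw new IllegalArgumentException("Cannot parse enum from raw value: " + rawValue);
        }

        @Override
        public String value() {
            return name();
        }
    }

    // Hand-written stand-in for the Lombok @Value class: keeps the raw string untouched.
    final class Simple implements Platform {
        private final String value;

        Simple(String value) {
            this.value = value;
        }

        @Override
        public String value() {
            return value;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Simple && Objects.equals(value, ((Simple) other).value);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(value);
        }

        @Override
        public String toString() {
            return "Simple(" + value + ")";
        }
    }
}

final class PlatformDemo {
    public static void main(String[] args) {
        System.out.println(Platform.of(" web "));            // WEB
        System.out.println(Platform.of("SMART_TV"));         // Simple(SMART_TV)
        System.out.println(Platform.of("SMART_TV").value()); // SMART_TV
    }
}
```

Known values come back as the very same enum constants, so identity comparison keeps working, while unknown values keep their original spelling.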
Finally, time to bind all the things together. Look at the Jackson deserializer code below.
class Deserializer extends StdDeserializer<Platform> {
protected Deserializer() {
super(Platform.class);
}
@Override
public Platform deserialize(JsonParser p, DeserializationContext ctx) throws IOException {
return Platform.of(p.getValueAsString());
}
}
Look at the whole Platform declaration below to sum it all up.
public interface Platform {
String value();
static Platform of(String value) {
try {
return Platform.Enum.parse(value);
}
catch (IllegalArgumentException e) {
return new Simple(value);
}
}
public enum Enum implements Platform {
WEB, MOBILE;
public static Enum parse(String rawValue) {
if (rawValue == null) {
throw new IllegalArgumentException("Raw value cannot be null");
}
var trimmed = rawValue.toUpperCase().trim();
for (Enum enumValue : values()) {
if (enumValue.name().equals(trimmed)) {
return enumValue;
}
}
throw new IllegalArgumentException("Cannot parse enum from raw value: " + rawValue);
}
@Override
@JsonValue
public String value() {
return name();
}
}
@Value
public class Simple implements Platform {
String value;
@Override
@JsonValue
public String value() {
return value;
}
}
class Deserializer extends StdDeserializer<Platform> {
protected Deserializer() {
super(Platform.class);
}
@Override
public Platform deserialize(JsonParser p, DeserializationContext ctx) throws IOException {
return Platform.of(p.getValueAsString());
}
}
}
When we parse a message that carries the platform field, we should attach the deserializer to that field.
class Message {
...
@JsonDeserialize(using = Platform.Deserializer.class)
private Platform platform;
}
Such a setup gives us two opportunities. On the one hand, we can split the message flow according to the platform value and still apply a regular Java enum. Look at the code example below.
var resolvedPlatform = message.getPlatform();
if (Platform.Enum.WEB.equals(resolvedPlatform)) {
...
}
else if (Platform.Enum.MOBILE.equals(resolvedPlatform)) {
...
}
...
On the other hand, the deserializer wraps all unexpected values in a Platform.Simple object, and Jackson serializes the output result as a plain string. Meaning that the client will receive the unexpected platform value as-is. Look at the diagram below to clarify the point.
As a matter of fact, this pattern allows us to keep using enums as a convenient language tool and also push the unexpected string values forward without data loss and pipeline termination. I think that it's brilliant.
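The three deduplication rules from the beginning of the article can then be expressed as a switch over the known constants. The sketch below is one possible shape, under a few assumptions: DeduplicationRules and windowFor are hypothetical names, the Platform type is a condensed, Jackson-free version of the one above, and unknown values are carried by a lambda instead of the Simple class to keep the example short.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.Optional;

interface Platform {
    String value();

    // Condensed factory: known constants first, otherwise keep the raw string.
    static Platform of(String raw) {
        if (raw != null) {
            String normalized = raw.trim().toUpperCase(Locale.ROOT);
            for (Enum known : Enum.values()) {
                if (known.name().equals(normalized)) {
                    return known;
                }
            }
        }
        // Platform has a single abstract method, so a lambda can carry the raw value.
        return () -> raw;
    }

    enum Enum implements Platform {
        WEB, MOBILE;

        @Override
        public String value() {
            return name();
        }
    }
}

final class DeduplicationRules {

    // Returns the deduplication window, or empty when the message passes through as-is.
    static Optional<Duration> windowFor(Platform platform) {
        if (platform instanceof Platform.Enum) {
            switch ((Platform.Enum) platform) {
                case WEB:
                    return Optional.of(Duration.ofHours(2));
                case MOBILE:
                    return Optional.of(Duration.ofDays(3));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(windowFor(Platform.of("web")));      // Optional[PT2H]
        System.out.println(windowFor(Platform.of(" MOBILE "))); // Optional[PT72H]
        System.out.println(windowFor(Platform.of("SMART_TV"))); // Optional.empty
    }
}
```

With this classic switch statement, a missing constant is something an IDE or linter can flag. On Java 14 and later, a switch expression over the enum has to be exhaustive, so the compiler itself rejects a forgotten case.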
Conclusion
Jackson is a great tool with lots of de/serialization strategies. Don't reject enum as a concept if values may vary. Look closely and see whether the library can overcome the issues.
That's all I wanted to tell you about forward compatible enum values. If you have questions or suggestions, leave your comments down below. Thanks for reading!
Resources
- Enum in API — The deceit of illusory simplicity by Ilya Sazonov and Fedor Sazonov
- Jackson library
- Apache Kafka
- RabbitMQ
- Apache Flink
- Apache Spark
- Kafka Streams
- @JsonValue
- @Value Lombok annotation
Top comments (5)
Pretty neat. The reason I'm choosing enums over strings is not capitalization, though. You can easily use string constants and use equalsignorecase to get around that:
Half as smelly as your initial example, and the actual check will be as concise as the check in your last snippet.
What I use enums for is switches with IDE (or other linter) exhaustiveness checks, which are not possible in your last snippet. So the enums give you zero benefit over string constants, but introduce significant boilerplate (complexity, maintenance, yadda yadda).
Here's an idea to make it more useful:
Now, if you get around to adding SMART_TV, but forget to extend the switch, the linter will tell you.
Plus you get rid of if else if else if else ....
And if you use the new switch expression case WEB -> { ... }, it will be the compiler complaining, no linter needed.
@vlaaaaad thank you! Your mention about switch-expression is meaningful
Thank you for the great article.
should if (rawValue != null) { be if (rawValue == null) { ?
Yeah, that's a typo. Thanks for noticing. Fixed that
This is a good one! Thanks dude