I'm in a bit of a conundrum, and I can't seem to find anywhere online where someone has addressed it. Here's what I'm dealing with:
I'm using a third-party library to parse some documents into objects. Unfortunately for me, the documents are not XML or JSON, hence the need for this third-party library. Doubly unfortunate, the schema of this document type comes in three different versions, mostly the same but slightly different. These differences are represented in the objects that this third-party library returns.
Here's an example of what I'm talking about. Three POCO classes, each representing an order version:
// Order V1
public class ThirdPartyOrderV1
{
    public decimal Amount { get; set; }
    public ThirdPartyOrderItemV1[] LineItems { get; set; }
    public ThirdPartyOrderAddressV1 BillingAddress { get; set; }
    public ThirdPartyOrderAddressV1 ShippingAddress { get; set; }
    public string SomeV1ThingIDontCareAbout { get; set; }
    public decimal AnotherV1ThingIDontCareAbout { get; set; }
    // ...
}

public class ThirdPartyOrderItemV1
{
    public int ItemID { get; set; }
    public int Quantity { get; set; }
    // ...
}

public class ThirdPartyOrderAddressV1
{
    public string Name { get; set; }
    public string Street { get; set; }
    // ...
}

// Order V2
public class ThirdPartyOrderV2
{
    public decimal Amount { get; set; }
    public ThirdPartyOrderItemV2[] LineItems { get; set; }
    public ThirdPartyOrderAddressV2 BillingAddress { get; set; }
    public ThirdPartyOrderAddressV2 ShippingAddress { get; set; }
    public object SomeV2ThingIDontCareAbout { get; set; }
    // ...
}

public class ThirdPartyOrderItemV2
{
    public int ItemID { get; set; }
    public int Quantity { get; set; }
    // ...
}

public class ThirdPartyOrderAddressV2
{
    public string Name { get; set; }
    public string Street { get; set; }
    // ...
}

// Order V3
public class ThirdPartyOrderV3
{
    public decimal Amount { get; set; }
    public ThirdPartyOrderItemV3[] LineItems { get; set; }
    public ThirdPartyOrderAddressV3 BillingAddress { get; set; }
    public ThirdPartyOrderAddressV3 ShippingAddress { get; set; }
    // ...
}

public class ThirdPartyOrderItemV3
{
    public int ItemID { get; set; }
    public int Quantity { get; set; }
    public string Metadata { get; set; }
    // ...
}

public class ThirdPartyOrderAddressV3
{
    public string Name { get; set; }
    public string Street { get; set; }
    // ...
}
Notice that they look almost identical. And for my purposes, they are identical, in that I don't need to consume the properties that differ between them. When we first wrote this project, we didn't have time to think too deeply about it, so we (horror of horrors) copied and pasted our logic after the parse:
using System;

public class Program
{
    public static void Main()
    {
        var parser = new ThirdPartyOrderParser();
        var order = parser.ParseOrder("text");

        // One branch (and one Ingest method) per document version.
        if (order is ThirdPartyOrderV1)
        {
            IngestOrderV1(order as ThirdPartyOrderV1);
        }
        else if (order is ThirdPartyOrderV2)
        {
            IngestOrderV2(order as ThirdPartyOrderV2);
        }
        else if (order is ThirdPartyOrderV3)
        {
            IngestOrderV3(order as ThirdPartyOrderV3);
        }
    }

    // The three methods below are identical apart from the parameter type.
    private static void IngestOrderV1(ThirdPartyOrderV1 order)
    {
        Console.WriteLine("Order total: " + order.Amount);
    }

    private static void IngestOrderV2(ThirdPartyOrderV2 order)
    {
        Console.WriteLine("Order total: " + order.Amount);
    }

    private static void IngestOrderV3(ThirdPartyOrderV3 order)
    {
        Console.WriteLine("Order total: " + order.Amount);
    }
}
All three of those IngestOrderV* methods run the same logic on the same properties of the order, but each handles one version individually and they share no code. Not ideal, and not maintainable. Now we have some time to go back and find a better way to do it. Obviously, what we'd like to do is something like this:
public void IngestOrder(string text)
{
    var order = thirdPartyLib.ParseOrder(text);
    var orderTotal = order.Amount;
    var totalQuantity = order.LineItems.Sum(i => i.Quantity);
}
At this point, I would love to throw the developers of this third-party library under the bus for not using any polymorphism in their document classes. But I can't. Here's why:
- This library parses hundreds of different document types, so I'm sure these classes are auto-generated by their system. That generator would probably have a dickens of a time coming up with an effective base class or interface, since some of their documents have changed far more from version to version than the one I care about.
- I need to handle different versions in a common way. That's not true for all their users, who probably more often than not are using this library to only parse the version that their company has adopted, so for most of this library's user base, this is a non-issue.
- Declaring an IThirdPartyOrder interface sounds fine and dandy, except that the properties are not simple value types. You would need IThirdPartyOrderItem and IThirdPartyOrderAddress, etc. And if you replace all the properties of all the version objects with their non-specific types like that, you lose all binding between a V3Item and a V3Order. The developer would then always have to cast the properties to a specific version every time they want to reference a version-specific aspect of the object. This class has hundreds of custom-typed properties, so that gets really hairy really fast.
OK, so enough whining (and forgiving). What can I do with this situation where these objects are the same "shape" but don't share an interface? Here are some ideas I've thought about and/or played with:
Get creative with dynamic
If you treat the object coming back from the parser as dynamic, you can access all the properties without caring about type bindings. So, as long as the properties have the same names, we can get away with something like this:
using System;

public class Program
{
    public static void Main()
    {
        var parser = new ThirdPartyOrderParser();

        // Member access on a dynamic value is resolved at runtime, not compile time.
        dynamic order = parser.ParseOrder("text");
        Console.WriteLine("Order total: " + order.Amount);
        // ...
    }
}
Doing this is awesome. We can now handle any object that comes our way, as long as it has properties with the names we specify. Effectively, we're accessing the object as we would in JavaScript land. Write it once and we're good to go, right?
Well, there are a few huge downsides to this. JavaScript returns undefined for properties that don't exist on our object, but .NET throws a hard exception (a RuntimeBinderException) if we try to reference a non-existent property. So basically you're deferring the property checking to runtime instead of compile time. You can get around this by casting the dynamic to an object again, then using reflection to check if the property exists prior to calling it, but that's really tedious. Delaying type and property checking until runtime is extremely dangerous. It's removing the safety from your gun.
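The reflection check mentioned above could look roughly like the sketch below. SampleOrderV1, OrderProbe and TryGetDecimal are hypothetical names made up for illustration; only Type.GetProperty and PropertyInfo.GetValue are real framework calls.

```csharp
using System;
using System.Reflection;

// Stand-in for one of the library's generated classes (illustration only).
public class SampleOrderV1
{
    public decimal Amount { get; set; }
}

public static class OrderProbe
{
    // Returns true (and the value) only when `source` exposes a public,
    // readable decimal property called `name`. No exception is thrown
    // when the property is missing.
    public static bool TryGetDecimal(object source, string name, out decimal value)
    {
        value = 0m;
        if (source == null)
        {
            return false;
        }

        var property = source.GetType().GetProperty(name);
        if (property == null || !property.CanRead || property.PropertyType != typeof(decimal))
        {
            return false;
        }

        value = (decimal)property.GetValue(source);
        return true;
    }
}
```

A call such as OrderProbe.TryGetDecimal(order, "Amount", out var amount) then replaces the bare dynamic access, at the cost of one helper like this per property type you care about, which is where the tedium comes from.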
Not only is it unsafe, we also lose all IntelliSense if we go this route. Refactoring becomes a huge pain, and debugging issues in production becomes a problem.
Finally: you can't invoke extension methods dynamically. So, maybe you know that the LineItems property is always going to be a collection, but the compiler doesn't. It has no idea that LineItems even exists, let alone its type. Heck, it doesn't even know what order is. So, without a bunch of casting and magical trickery, you can't iterate through your properties or call LINQ or other extension methods.
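For what it's worth, the "bunch of casting" can be fairly small. Here is a minimal sketch, assuming LineItems is always some kind of collection; SampleOrder, SampleItem and DynamicOrderReader are hypothetical stand-ins, not part of the third-party library.

```csharp
using System.Collections;
using System.Linq;

// Stand-ins for the generated classes (illustration only).
public class SampleItem
{
    public int ItemID { get; set; }
    public int Quantity { get; set; }
}

public class SampleOrder
{
    public decimal Amount { get; set; }
    public SampleItem[] LineItems { get; set; }
}

public static class DynamicOrderReader
{
    public static int TotalQuantity(dynamic order)
    {
        // Casting to the non-generic IEnumerable gives the compiler a static
        // type again, so LINQ extension methods can be bound at compile time.
        IEnumerable items = (IEnumerable)order.LineItems;

        // Each element is still dynamic: Quantity is looked up at runtime and
        // the call fails there if an item has no such property.
        return items.Cast<dynamic>().Sum(item => (int)item.Quantity);
    }
}
```

This brings LINQ back, but every property access inside the lambda is still unchecked until runtime, so the safety concerns above remain.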
Other "dynamic" approaches
A few other ideas that came to my mind: This library states that since orders are one-way POCOs (no backreferences), they can all easily be serialized to JSON, XML, YAML, etc. So maybe I could serialize the object to JSON and traverse through the object using the Json.NET JObject traversal method:
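using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Program
{
    public static void Main()
    {
        var parser = new ThirdPartyOrderParser();
        var order = parser.ParseOrder("text");

        // Round-trip through JSON so the order can be traversed by property name.
        var json = JsonConvert.SerializeObject(order);
        var obj = JObject.Parse(json);
        Console.WriteLine("Order total: " + obj["Amount"]);
    }
}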
Like with the dynamic idea above, we won't get IntelliSense or compile-time type checking. But checking for property existence is a lot easier, and you don't have to get creative with reflection. And you get LINQ back. Plus, if I really wanted to, I could write a class (say, ThirdPartyOrderCommon) that has the same common structure as the third-party order classes and do a strongly-typed deserialization to get what I want.
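To make the "property existence plus LINQ" point concrete, here is a rough sketch. Note the swap: it uses the built-in System.Text.Json JsonNode API rather than Json.NET's JObject, purely so it runs without a package reference. SampleOrder, SampleItem and JsonOrderReader are hypothetical names.

```csharp
using System.Linq;
using System.Text.Json;
using System.Text.Json.Nodes;

// Stand-ins for the generated classes (illustration only).
public class SampleItem
{
    public int ItemID { get; set; }
    public int Quantity { get; set; }
}

public class SampleOrder
{
    public decimal Amount { get; set; }
    public SampleItem[] LineItems { get; set; }
}

public static class JsonOrderReader
{
    public static JsonObject ToJsonObject(object order)
    {
        // Serialize with the runtime type so any order version is accepted.
        string json = JsonSerializer.Serialize(order, order.GetType());
        return JsonNode.Parse(json).AsObject();
    }

    // A plain key lookup replaces the reflection check.
    public static bool HasProperty(JsonObject order, string name)
    {
        return order.ContainsKey(name);
    }

    public static int TotalQuantity(JsonObject order)
    {
        JsonNode lineItems = order["LineItems"];
        if (lineItems == null)
        {
            return 0; // property missing or null in the JSON
        }

        // JsonArray is enumerable, so LINQ works on it directly.
        return lineItems.AsArray().Sum(item => item["Quantity"].GetValue<int>());
    }
}
```

The property names are still magic strings, so a typo only surfaces at runtime, but a missing property shows up as a false or a null rather than an exception.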
But on the other hand, like I said, this order object is HUGE and contains hundreds of property types. That's a lot of code for me to write and maintain (and duplicate). Plus, it feels wrong to serialize an object just to translate it from one type to another (maybe AutoMapper would be a better candidate for that?).
So, I'm stumped
Surely, I'm not the only one who has run into something like this. What do you all do?
Top comments (3)
AutoMapper is cleaner, but you should still create a set of interfaces covering just the common data points you need. Also, you don’t need to cast to dynamic just to use reflection, though reflection tends to be expensive, so you take a performance hit.
If you’ll forgive my crossing from C# into Java (where I feel more comfortable), this is how I would approach this issue:
It’s probably a bit of work, but it’ll at least remove the differences you are blocked by. In case it’s needed, the same/mirrored way can be used to convert back into any version of theirs.
EDIT:
Scratch all that. There is a far simpler way, based on the following pseudocode:
FooV1 fooV1 = fromJson(toJson(fooV2), FooV1)
Basically, given that you only care about the overlapping fields, you can just translate between the different versions by using JSON as a go-between. That way, you still retain the benefits of static typing without any real effort.
My two cents.
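In C#, the JSON go-between from the pseudocode above might be sketched like this. It uses System.Text.Json so that it runs without extra packages; with Json.NET the equivalent calls would be JsonConvert.SerializeObject and JsonConvert.DeserializeObject. All type names here (VendorOrderV3, CommonOrder, JsonBridge, Reshape) are hypothetical.

```csharp
using System.Text.Json;

// Stand-ins for one generated version (illustration only).
public class VendorItemV3
{
    public int ItemID { get; set; }
    public int Quantity { get; set; }
    public string Metadata { get; set; }
}

public class VendorOrderV3
{
    public decimal Amount { get; set; }
    public VendorItemV3[] LineItems { get; set; }
}

// Hand-written target holding only the overlapping fields.
public class CommonItem
{
    public int ItemID { get; set; }
    public int Quantity { get; set; }
}

public class CommonOrder
{
    public decimal Amount { get; set; }
    public CommonItem[] LineItems { get; set; }
}

public static class JsonBridge
{
    // Serializes `source` by its runtime type and reads the JSON back as
    // TTarget. Properties that TTarget does not declare are skipped.
    public static TTarget Reshape<TTarget>(object source)
    {
        string json = JsonSerializer.Serialize(source, source.GetType());
        return JsonSerializer.Deserialize<TTarget>(json);
    }
}
```

After JsonBridge.Reshape<CommonOrder>(order), the rest of the code works against one statically typed class. The price is a full serialize and parse per order, and properties are matched by name only, so a rename in a future version silently leaves a property at its default value.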
The simplest solution is the one suggested by Alain Van Hout.
But depending on your constraints, probably a refactor of the code to use a business object is better.
You would be independent of the format used to save the data, meaning that you could in the future change your third-party library, if needed.
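A business-object refactor along those lines might start out like the sketch below. It assumes only the order amount and total quantity matter downstream; Order, OrderMapper and the Vendor* stand-in classes are hypothetical names, and only two versions are shown.

```csharp
using System;
using System.Linq;

// Stand-ins for two generated versions (illustration only).
public class VendorItemV1 { public int ItemID { get; set; } public int Quantity { get; set; } }
public class VendorOrderV1 { public decimal Amount { get; set; } public VendorItemV1[] LineItems { get; set; } }
public class VendorItemV3 { public int ItemID { get; set; } public int Quantity { get; set; } public string Metadata { get; set; } }
public class VendorOrderV3 { public decimal Amount { get; set; } public VendorItemV3[] LineItems { get; set; } }

// Our own business object: the only order type the rest of the code sees.
public sealed class Order
{
    public Order(decimal amount, int totalQuantity)
    {
        Amount = amount;
        TotalQuantity = totalQuantity;
    }

    public decimal Amount { get; }
    public int TotalQuantity { get; }
}

public static class OrderMapper
{
    // The single place that knows about vendor versions.
    public static Order FromVendor(object vendorOrder)
    {
        switch (vendorOrder)
        {
            case VendorOrderV1 v1:
                return new Order(v1.Amount, v1.LineItems.Sum(i => i.Quantity));
            case VendorOrderV3 v3:
                return new Order(v3.Amount, v3.LineItems.Sum(i => i.Quantity));
            default:
                throw new NotSupportedException(
                    "Unknown order version: " + vendorOrder?.GetType().Name);
        }
    }
}
```

The per-version duplication does not disappear, but it shrinks to one mapping per version, each checked by the compiler, and the ingestion logic itself is written once against Order.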
Tip: