In last week’s article, we took a high-level look at how programming to an interface helps us write more modular code. By targeting abstractions of dependencies, we decouple our code from specific implementations. In other words, when we design our code to interact with placeholders for subsystems instead of accessing them directly, we gain an element of freedom: we become able to swap out parts of our system for compatible alternatives, when we need to, with minimal effort.
To show this in action, we looked at a simple example where we replaced the expectation of a List with that of an IEnumerable. In this week’s article, we’ll look at a more tangible use of this technique.
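As a quick refresher, that idea can be sketched in a few lines. The method and values below are illustrative, not from last week’s example; the point is only that targeting IEnumerable instead of List accepts any compatible sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Recap
{
    // Accepting IEnumerable<int> rather than List<int> means any
    // compatible sequence (a List, an array, a LINQ query) can be passed in.
    public static int SumValues(IEnumerable<int> values) => values.Sum();

    public static void Main()
    {
        Console.WriteLine(SumValues(new List<int> { 1, 2, 3 })); // 6 – a List works
        Console.WriteLine(SumValues(new[] { 4, 5, 6 }));         // 15 – so does an array
    }
}
```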
Let’s imagine we’re building a data viewing platform. We want to be able to pull in already-existing data from third-party systems to visualise – anything from financial reports, to short-term issue management, to long-term project roadmaps. To do this, we start building a data-import subsystem.
During its development, we create a small sample dataset to work with. It consists of 15 items. We build our import system to retrieve those 15 items, loop through them, and project each one into our own system’s format. We store the mappings (i.e. the results of the conversion) in a Dictionary, where we use the converted object’s identifier as a key and the mapped data as its value. We use a Dictionary for near-immediate access to each object in the further processing that follows the mapping process.
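A simplified sketch of that first, sequential version might look something like this. The SourceItem and MappedItem types and the MapItem function are hypothetical stand-ins for the real import code:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the third-party data and our own format.
public record SourceItem(int Id, string Payload);
public record MappedItem(int Id, string Converted);

public class Importer
{
    // A pure mapping function: the output depends only on the input.
    public static MappedItem MapItem(SourceItem item) =>
        new MappedItem(item.Id, item.Payload.ToUpperInvariant());

    public static void Main()
    {
        var source = new List<SourceItem>
        {
            new(1, "alpha"), new(2, "beta"), new(3, "gamma")
        };

        // Keyed by identifier for near-immediate lookup in later processing.
        var mappings = new Dictionary<int, MappedItem>();
        foreach (var item in source)
            mappings[item.Id] = MapItem(item);

        Console.WriteLine(mappings.Count); // 3
    }
}
```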
After a week of hard work, the system’s ready. We decide to test it with live data. The UI loads, we find the button to start the data import, and we click it.
And then we wait...
And wait a bit more...
And then we start wondering whether the process has hung somewhere – it was much faster during development.
After further investigation, we find the main source of the delay is the loop when mapping data. Each item takes (on average) 4ms to transform. Our development dataset has 15 items, meaning the process takes 60ms. However, our live dataset has 5,000 entries. A processing time of (approximately) 4ms per item means the whole conversion takes 20,000ms – or 20 seconds – to complete. And that’s on top of other overheads in the request process.
Luckily, we use a pure function when mapping data: the transformation is completely independent and doesn’t rely on anything else. Instead of looping through using foreach, we can use Parallel.ForEach. This would take advantage of our server’s multiple processor cores to convert multiple items at the same time.
There’s a catch, though: Dictionary isn’t thread-safe for writing – attempting to add data from more than one thread can cause an exception to be thrown. But we can use ConcurrentDictionary instead, which is designed for safe access from multiple threads.
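Putting those two changes together, the parallel version might be sketched like this – again with hypothetical item data and mapping logic standing in for the real import code:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public class ParallelImporter
{
    // The same pure mapping idea: output depends only on the input.
    public static string MapItem(string payload) => payload.ToUpperInvariant();

    public static void Main()
    {
        var source = Enumerable.Range(1, 5000)
                               .Select(i => (Id: i, Payload: $"item-{i}"))
                               .ToList();

        // ConcurrentDictionary supports writes from multiple threads at once.
        var mappings = new ConcurrentDictionary<int, string>();

        // Parallel.ForEach spreads the mapping work across processor cores.
        Parallel.ForEach(source, item =>
        {
            mappings[item.Id] = MapItem(item.Payload);
        });

        Console.WriteLine(mappings.Count); // 5000
    }
}
```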
You might remember that storing the mapped data in a Dictionary wasn’t the final step: further processing took place afterwards. If we’d written code that explicitly expected the Dictionary type, we’d have to modify the signature of every method where it was subsequently passed in as an argument. But if these methods expected an IDictionary, no further changes would be necessary; both Dictionary and ConcurrentDictionary implement the IDictionary interface.
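For example, a downstream method written against the interface rather than the concrete type might look like this (the method itself is illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public class Report
{
    // Because this expects IDictionary, it accepts either implementation
    // without any change to its signature.
    public static int CountEntries(IDictionary<int, string> mappings) =>
        mappings.Count;

    public static void Main()
    {
        var plain = new Dictionary<int, string> { [1] = "a" };
        var concurrent = new ConcurrentDictionary<int, string>();
        concurrent[2] = "b";

        Console.WriteLine(CountEntries(plain));      // 1
        Console.WriteLine(CountEntries(concurrent)); // 1
    }
}
```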
When building our example system, we used a small dataset. The overall responsiveness during the development phase meant we didn’t anticipate long delays in typical live usage. We resolved the problem by introducing multithreaded processing. But this meant we had to make a small change to the data structures used in the mapping process.
Implementation changes can be unexpected and don’t always come from specification updates. Programming to an interface could mean some changes don’t ripple out, saving you from having to modify parts of your systems that aren’t explicitly connected.
Thanks for reading!
This article is from my newsletter. If you found it useful, please consider subscribing. You’ll get more articles like this delivered straight to your inbox (once per week), plus bonus developer tips too!