In relational databases it's common practice to create constraints to lock down the data and its relationship to other tables. So, a
Users table might have an
Subscription table gets a non-nullable
User column because every Subscription must be associated with a User. Yeah, that also sounds reasonable.
But I'm here to argue against constraints! To some that might sound heretical (or at least very stupid), but hear me out:
We use constraints for two reasons: For performance (it's a lot quicker to search a uniquely-indexed column), and to guarantee data integrity (so we physically can’t enter invalid data, e.g. the business requires a subscription to be tied to a user).
Here's the problem: We programmers often optimize too early, and SQL constraints don't actually express the business requirements!
Let me break that down a bit:
On the topic of performance, there are of course valid cases where constraints speed up queries by several magnitudes, but unless we see that performance problem and its subsequent remediation with a constraint we run the risk of prematurely optimizing. Fewer constraints would make the code more malleable, and create less friction for making subsequent changes to the database structure.
All I'm proposing here is that we look for opportunities to defer database-optimizations until they are truly needed, to counteract our tendency for premature optimizations.
I propose no business-constraints at all. Does that sound barking mad? The more I work with SQL the more I get bit by half-baked "business constraints" that almost-kinda-sort-of duplicates what the code already handles. What does it really give us to mark a
Usercolumn as non-nullable, when the code already ensures a
Subscriptionis only ever created in relation to a
User? It amounts to no help at all because constraints are simply not the actual source of truth.
"But wait" I hear some protest, "it's so simple to add non-nullable to the
Usercolumn". Yes, but it still does harm, because it fundamentally isn't the source of truth for the business requirement.
At this point we need to discuss the drawbacks of constraints: The more of them there are the more difficult the system becomes to test and adjust.
More difficult to test, because to test a
Subscriptionwe now also have to create a
User, and associate them correctly. Even if a test has nothing to do with the
Userentity it must be created. And then imagine another handful of entities that have to be put together just right to create a
Subscription, it quickly becomes very painful to work with! Then we might start hiding all that test-complexity in entity factories, but those are just abstractions on top of complexity that we have caused for ourselves. Why not get rid of the complexity entirely?
And more difficult to adjust because introducing new features ends up requiring careful migrations to keep the constraints synced with code: What if we actually can have subscriptions without a user? What if an email column doesn't have to be unique? If the data-storage layer imposes artificial constraints it slows down quickly experimenting with such ideas.
I think many of us create constraints because we were trained and/or told to do so. But none of us are realistically implementing 100% of business rules in SQL, so, where do we draw the line? Constraints are of course good when they give real, tangible performance benefits, but what would happen if we loosen up the system to let those entities be more malleable and easier to work with and test?
At the end of the day only you know your domain, so there are no general answers to my assertions. What I'm trying to get at though, is a certain level of dogma I've felt that sneaks into relational storage design, where it is taken as fact that we must create constraints that sounds business-logic-compatible because that's just how we do it. Instead, I propose ONLY adding a constraint where and when it is truly and provably necessary. That way databases are allowed to focus just on storing and retrieving data, which is the role they truly excel at.