Jon Lauridsen

Posted on Apr 13, 2022 • Edited on Aug 8, 2025

Database constraints are (often) harmful

#database #design #postgres #sql

Relational database design encourages constraints to enforce structure: mark User.email as unique, make Subscription.user_id non-nullable, etc. No-brainers, right? Every user has an email, every subscription belongs to a user.

But that is a harmful way to develop. I know that sounds heretical and/or stupid, but hear me out:

Constraints typically exist for two reasons: data integrity (e.g. preventing invalid data like a subscription without a user) and performance (e.g. speeding up indexed lookups).

The problem on both fronts is we tend to optimize too early.

1. Data Integrity

This is where premature constraints do the most damage. The reality is that no amount of non-null fields, unique indices, or stored procedures actually capture the full business logic. That's why you're not writing your application directly in SQL.

We all agree business rules belong in the application layer. So the real question is: why try to enforce a partial version of those rules in the storage layer?

2. Performance

Index constraints are less damaging but still introduce friction. Yes, they can speed up queries and should be used when solving actual performance problems, but unless you're fixing a real bottleneck you're introducing rigidity too early.

Even a simple index adds overhead to schema changes. It may be minor, but it's still friction.

Performance constraints are valid, but they should be deferred until needed. Start flexible.

Constraints Aren't Free

At the core, this post is making one point: constraints make systems harder to test and evolve.

Testing: Want to test a Subscription? If it's constrained, you also need to set up a User, even if the test has nothing to do with users. That adds noise and complexity. Factory abstractions can hide it, but that's self-inflicted overhead that could be avoided if the database allowed user-less subscriptions.
Change: Want to support multiple users with the same email? Sure, update the code, but with a uniqueness constraint you also need a migration. Why that extra coupling? Or it could be you just want to experiment with the idea behind a feature flag? The constraint blocks your experiment before it even starts.

It can be heretical to say out loud, but: the more constraints, the more brittle the system.

Trust the Code

At the end of the day, data integrity constraints exist where we don't trust the code. Think about it: if your code already guarantees that a Subscription is only created with a User, what value does a non-null constraint add?

It's just duplication. And the database is not the source of truth for business logic.

The code should be trusted. Let it handle rules and enforcement. Let the database focus on storing data.

I think many developers add constraints because it feels tidy. Constraints give a (false) sense of "hard correctness". But unless you're modeling all business rules in the DB (and you aren't), why model any of them?

So don't add foreign keys by default. Drop the half-baked business-rule constraints. Defer performance optimizations until a graph tells you it's time.

There's nothing wrong with joining data in memory as long as that's manageable.

Let the code express the business. Let the DB persist the data.

Photo by Sam Moqadam on Unsplash

Top comments (21)

Tsvetan Ganev • Apr 13 '22

Interesting take. I've heard the same arguments about foreign keys being evil because they degraded performance. If we follow this logic, we don't need RDBS at all and all applications should store their data in plain binary files. I agree not all software systems require strong data consistency/integrity but we can't scrap 50 years of DB development so easily. Critical systems that work with money in some form must be modeled strictly on the data layer as well as on the application layer.

Jon Lauridsen • Apr 13 '22 • Edited

I see your angle. What I find necessary is to challenge the dogma I sense of always creating constraints. I want to trust the code to handle business requirements and I see no issue joining manually.

I do see where you’re coming from with your financial comment and it could be true: The data storage constraints can act as a kind of extra bookkeeping check, further enhancing correctness on top of test-automation. It's sound logic. But I suspect I’d still want to see how far we can guarantee correctness in the code, with appropriate layers of test automation.

Maurice Borgmeier • Apr 13 '22

In defense of constraints for data integrity: If you need to absolutely ensure that a given value is unique in a column (e.g., your email) and you're operating in an environment where multiple entities change the data (e.g., multiple threads/servers) at the same time, a database constraint is the way to go to avoid race conditions and other funky distributed systems behavior.

Constraints can make testing more complex and impact performance at scale. However, not any workload needs to be scalable to the moon (premature optimization), and often constraints can be a cheap (as in economical) way to solve a problem.

FJones • Apr 16 '22

The database is not the tool to prevent race conditions. It's an easier solution, and works as a stopgap, absolutely, but vulnerability to race conditions on the data layer are an architectural flaw that should be remedied elsewhere.

I often notice this with asynchronous microservices - where the solution is an execution engine or message broker, not relying on the database to resolve race conditions.

Bradley Ross • Sep 27 '23

When it comes to software, I am a belt, suspenders, and two pairs of pants type of person. Everybody makes mistakes sometime, and assuming that the other person was supposed to handle it is a recipe for disaster.

I had a case where a large number of records with information on cases contained NULL in the case identifier. The person responsible for that code didn't believe me and told me to prove it by providing ten examples. I gave him a list of four hundred cases. I had the report on my desk and the reason the number was so low was that the SQL query was limited to 400 cases. The response of the database administrator was that he had only asked for ten cases, and he thought that I was trying to embarrass him by listing so manycases. What you need to do is insert the constraints and review all the errors it picks up.

In another case, there was a web page on LinkedIn that didn't work. Many people were unable reorder interest groups on the profile page. People were complaining loudly on several social media sites. Requests to the help desk received the following message: "We are giving top priority and will notify you when a solution is found." I examined the code for the page and determined that the servlet was using two SQL statements: one had an inner join and the other had an outer join. They were joining the two result sets by row number. I notified the help desk and had them validate it by fixing the problem with my profile using the help desk's interface with the database. I wrote this up and sent it to the help desk who repeated the previous request. I asked them to reread my email, and they said that they were sorry for not reading it. The person also said that they had been told to give the above response and ignore it.

If you really want to do a good job, you have to be willing to look for things that make you look bad. Most people don't want to do that, and the managers tend to be worse.

You have to assume that other portions of the system may contain errors.
If you check for errors in other programs, the ability of other programs to totally corrupt the database is limited.
The optimizing routines for the DBMS are designed to use the constraints to run faster. More constraints mean that more information about the data is available, and better optimizations are possible.

Jon Lauridsen • Oct 19 '23 • Edited

Hi Bradley,

Thank you for your insights, and I think I recognize your angle. Your examples do highlight the benefits of using a database's powerful tools, but I also feel they implicitly tell a story where the code is not doing its job. If an empowered team owns its solution maybe there isn't a database administrator at all. Maybe a great domain needs to be modeled, which can encapsulate that complexity.

I'm reminded of this scene from the movie Contact, where sometimes letting go of the seatbelt results in a smoother ride. It can seem scary, and of course I don't know your context so ultimately only you get to decide what's the best way forward. I'm happy to simply delight in the wonderful differences there are between our contexts, but if it matters then I do think there's more to what you describe between the lines that's worth exploring deeper.

But I enjoyed reading your angle because it does illuminate different contexts we all find ourselves in. And thanks for reading the article.

Ben Sinclair • Apr 14 '22

I see your points, and even agree in principle, but disagree with most of them in the real world.

A lot of assumptions people have about anything are wrong. For instance, a lot of services use a unique email as an identifier, but there's no one-to-one mapping between people and email addresses. I have lots (ben@my.work.address, ben@my.play.address...), some people share one (support@my.work.address, thesmithfamily@example.com) and enforcing uniqueness then becomes a business problem instead or you'll get a buttload of users who're breaking your Ts&Cs.

In the real world, code gets deployed which may or may not have been tested well, and tests may or may not cover every eventuality. The QA user buys negative PI beers.

Without constraints, and especially without transactions, data will get corrupted. It's like using C to manage strings as character arrays. You'll screw it up at least once while you're writing your program and see some sort of delicious segfault. If you have the chance to use a string type which handles all that for you, why not do it?

The difference between my string example and the RDBMS of the question is that the database persists longer than your new code. Errors in it go unnoticed, code crashes part-way through processing a "DROP user" and you have dangling subscriptions. Then you need to write a new function called removeOrphanSubscriptions which you call on cron every night at the software witching hour.

And the problem with optimisations is that choosing to not use a constraint because it migh degrade performance is the optimisation, not the other way around.

Jon Lauridsen • Apr 14 '22

I hadn't considered transactions as part of my argument, we're definitely in agreement that transactions form a backbone of reliable writes. There should never be a dangling Subscription because a service crashed halfway through, and if what I wrote implies I'm against transactions then please point out the offending text so I can fix it :)

I think we both agree that the correct place to prevent buying negative beer should be in the code, right? Maybe we'd argue exactly where and how to do it, but from what you write I think we'd both argue the code should be responsible. The code is the responsible system. You describe the case where a mistake in the code would let an error through that'd get caught in the database, and yes you're right constraints might catch such errors, but it's kind of based on luck: Because business requirements can't be (or certainly shouldn't be) modeled entirely in the database, there are no guarantees that errors in the code layer will be caught in the database. It is possible to get lucky and catch an error, but in my opinion & experience it can just as easily be a regex-constraint that's gone out of sync with the code and now produces hard-to-verify errors because of some edge-case where a certain combination of characters result in a database error. Hope is not a strategy, and if the code should be responsible I'm all for focusing all efforts on that layer instead of adding partial safety nets where they don't really belong.

Perhaps constraints as you describe it made a lot more sense when code was managed by different people than the database, because those constraints allow the database-context to establish rules to make sense of its own data. But if one team manages both code and DB? I'm really suspicious of the value a constraint adds in that context, other than taking up even more minutes to writing and verifying a migration.

That said I'm not here to tell anyone how to work, you make good points and as long as we both can work on systems we enjoy I'm all the happier :)

FJones • Apr 16 '22

I definitely agree with keeping business constraints out of the database. One common issue we face with business constraints is testability. "This should only run once per day" translates to "let's add the date to our unique key". As a result, we can't run the same process more than once per day, because it clashes with the business constraint in the database, while the code path for testing bypasses those checks intentionally.

Database constraints should verify internal integrity. Even not-null checks are, at times, a hindrance both in operation and in changes of logic. Ensure that relationships are valid, ensure that types are valid - don't go beyond that.

player8472 • Oct 29 '23

Whenever I read about something like this I ask myself if the person who write it actually ever worked on a production system that doesn't have constraints on the database side.

I have (twice in two jobs by now) and can only tell you:
It is not pretty.

Yeah, sure the programmers love it for the most part as they get less "weird" exceptions.
But fact is: Programmers forget stuff, database architects forget stuff.
You don't want to know how many hours I've wasted on finding out why a statistical analysis doesn't deliver the desired result on Production when it does on the DEV and Integration Environments.

And it is almost always because the user was able to create multiple identical entries...
Or entries which shouldn't be able to exists in the first place as there is no entity for the entered foreign key.
Or because some idiot didn't properly implement data deletion and there are invalid Foreign Key references.

And yes, I know the argument: "But we do it codeside" - Truth is: 90% of you don't!
You write Hibernate, Rails or whatever you are using to communicate with the database.
Database Architects and Developers don't implement constraints to make your life hard, but to prevent you from making mistakes.

Not using constraints (in classic RDBMS - some modern db-systems have mechanisms to take care of crap like this) is like just empty catching exceptions.
Makes your life easier until it doesn't...

Jon Lauridsen • Jan 3 '24

Thanks for your reply! I too have been in places that weren't pretty, with database and code constraints diverging in various subtle ways in various legacy systems. And yeah it's hard to get on top of, and I happen to have also debugged my way into cases that could've been solved by adding more constraints. But the complexity of navigating tightly coupled entities where one has to traverse numerous foreign-key relations just to raise the entity I actually need is incredibly energy-sapping. And it's brittle, because fundamentally the DB cannot enforce all the business rules anyway, so there's no guarantee I'm spawning a truly representative entity in my test-code unless I go through the code to raise those related entities.

And if the code is easy enough to use and we trust it to do the right thing then we might as well only express those constraints in code.

End of the day it's fine we disagree, the Internet would be boring if we didn't :) I hope you see much success with your decisions.

Jon Lauridsen • Apr 14 '22

I hear you state "With constraints you can trust the data".

But I don't think that can be true outside trivial cases, because business requirements aren't enforced in the database. You can constrain some business requirements (e.g. Subscription requires a User), but such constraints are simultaneously non-exhaustive (so the code still needs to validate if you don't trust what it's in the database) and aren't authoritative (because the code is where business requirements are supposed to be modeled).

If instead we say the database's role is only to store data (that is: a single service reads and writes from it), then there is plenty trust in the data because you know exactly what the service writes. A lack of trust there implies someone is messing with production data manually, in which case you either need to change that culture or add validation to all read-actions to ensure business requirement correctness (because database constraints would at best give a shallow appearance of correctness but might stil fail catastrophically inside the code because constraints don't model all business requirements).

I'd heavily suggest changing the culture though.

Aaron Reese • Apr 14 '22

WOW!
Where to start....
in a properly structured RDBMS where business entities are modelled correctly, constraints are never going to slow down your application. There are plenty of other reasons (missing indexes, partitioning, non SARGable queries) but check constraints and foreign keys are not going to be it.
SOMEONE needs to be responsible for the integrity of the business data. I agree that overly strict RDBMS constraints can APPEAR to slow down development but in reality they do not; they are like the canary in the mine and act as an early warning system that the business entities are not correctly modeled. If you don't do it at the database level you have to do it somewhere else. Most business solutions these days will have multiple entry points: Desktop, mobile and external applications. It is now possible to move the integrity checks to the API layer if it is common to all applications but DO NOT leave it up to the front end devs to encapsulate the business rules.
Strict RDBMS integrity rules FORCE changes in business requirements to be properly considered and investigated. You can't create a Subscription without the User, but without referential integrity built into the database model who is responsible for checking that you can delete the customer and not leave the Subscription orphaned.
I am going to assume that your example of the unique constraint for a User being the email address was contrived, but I fear it may not have been. There is a difference between the entity key and a unique constraint although there are often interchangable. the entity key should be unique AND idempotent (i.e. cannot change) AND ubiquitous (i.e. every entity must have one). For many reasons this will often become a NONCE key; either a UUID or an IDENTITY INT; it has no business significance outside the database. This fundamental failure of understanding is why you MUST engage a database architect and not leave entity definitions to the developers (sorry devs, but strings aren't strings. They are CHARS, VARCHARS, NVARCHARS, TEXT and have a length and a collation - otherwise you waste space and performance in my database). As a Database guy, I would not expect to know all the subtelties of your programming domain so why should you expect to know all the GOTACHAs in mine.
At the end of the day it will be the database people that are tasked with sorting out any crap data that creeps in through the front end and API layers.

NOONE puts data directly into my tables, all APIs call stored procedures for C_UD and will use a stored procedure or view for the R(ead). This is the only way to really validate that the data being entered is fully kosher. You say that we don't do turing complete code in SQL - Sorry. I do (MSSQL) and in the Oracle world PL/SQL is specifically designed to do this.

Jon Lauridsen • Apr 15 '22 • Edited

It is now possible to move the integrity checks to the API layer if it is common to all applications but DO NOT leave it up to the front end devs to encapsulate the business rules.

Fully agree here. I didn't even realize I was implying anything but keeping the API layer responsible for business requirements.

As for the rest, I really do like your clear angle and rebuttal, even if I disagree. I oppose the sentence "my database", because I want to break down silos and optimize for shared ownership of the product. I suppose you could just as well argue I'm an API-centric chauvinist who wants the world to revolve around the API and you wouldn't be entirely wrong, I consider the DB "just storage" and ultimately not worth super-specializing into (unless you're the vendor building one!) because it doesn't provide customer value directly.

So I think we're fundamentally pitted against each other because I want developers to not care about CHARS, VARCHARS, etc. Those are implementation details to be changed at a whim to cater to whatever business requirements provide value.

Thankfully it's fine we're "pitted against each other" because I love hearing your angle, it's so different from mine and we can only learn from our differences. It might be best we don't work on the same project haha 😅, but thank you for taking the time to elucidate your angle.

Aaron Reese • Apr 15 '22

Its not a war, at the end of the day we are all on the same team trying to provide value to the business. My point is that constraints in the RDBMs database are there for a reason; it prevents rogue application code from destroying fundamental business data. If you are not going to use them you might as well use a document database or flat text files. Just like your C# / java / python classes are trying to model the real world, so is the relational database. As it is the last line of defense, it HAS to be correct. If front end and middleware developers are creating workarounds then this is an indication that the database model is out of date.
I once did a lignting talk at our local geek-meet about how code-first ORMs (specifically Entity Framework, but the same applies to Eloquent) can help you to rapidly prototype the application and supporting database, but the data types it uses are not optimised (unless you use Fluent, in which case you might as well use DDL in Management Studio) Using a repository layer and Active Record pattern can help but it is still no substitue for a properly configured database.

Some of the crap I have seen:
Dates stored in datetime fields.
Date and time stored in SEPARATE datetime fields.
dates stored as strings
Y/N held in VARCHAR(1000)
Phone number 1, Phone number 2, Phone number 3 ....

FJones • Apr 16 '22

A few points:

Performance is indeed a touchy subject, and I do agree that database constraints should (at least theoretically) have similar or even better performance than application layer constraints - because the database is in fact optimized specifically for that.
Modelling business entities is not the job of the data layer. I know, people will disagree on this, but I firmly believe that the data layer should not be opinionated about business logic. It is storage, and it exists to model data structures, not business structures.
As I said in my other comment, the constraints the data layer should be responsible for are relationships (in a relational DBMS), and internal integrity (that is, valid types, limits, and so on).
While I agree that a lot of people eschew an API layer, that's bad practice as well. There should absolutely be an API core that models the actual business logic.

JoelBonetR 🥇 • Apr 14 '22 • Edited

Didn't read all the comments (sorry if I duplicate a concern here).

A priority in software is robustness.
You can have a bug executing a query (so you'll need to rollback).
You can have a bug in code that alters the data wrongly.

Having a layer to ensure robustness in each step is preferrable, specially when working with users that pay you money.

You can define a model that can or cannot fullfill the formalization (I would expect, at least 1, 2, 3 and BC), but those normal forms are here for a reason plus they imply constraints.

If you really need to do something fast you can just use an ORM (Code First) without specifying any constraint and migrate in a single command instead doing the create tables in SQL commands.

I would anyway encourage you to analyze the business needs for each feature before coding a single line or executing a single query.

Analyzing the business needs you can find inconsistences and you can define and get to know the big picture and the forecast for the roadmap so you can take early decisions. Let's use the email column as example.

You can state that every user will have an email and that emails are unique and it will be a good assumption, but have you asked and defined with client if they are about to let (at any point in the future) the users to change their email from their account?

If it's a yes, simply add an auto increment ID and let the email as required but not as PK.
If it's a no, then you can use email as PK or ignore the client statement and do the thing defined in the line above, it will work anyway and you protected your future yourself from a possible bad day throwing alter tables and sync-ing the code with the new model.

Jon Lauridsen • Apr 14 '22

If I may push back a bit, what does it matter if the client may consider letting users change their email some time in the future? If they say yes, would you design for functionality that isn't needed now? What if they say yes, but never get to it? What if they say yes, but actually what they need next year is something else? What if they say no, but actually meant yes?

We can both agree it's crucial to talk to our customers :), but I'll assert that in the majority of cases a discussion about "any point in the future" is too vague and too theoretical to be of use. I'd rather suggest we build what is needed right now: A User table with an email column. That's it. In data-storage terms I don't care if it's unique or not as I'll handle that in the more flexible code, all I know in data-storage terms is I need to store an email and right now the need is for it to be associated with a User. No constraints to mimic business logic, no concessions for theoretical futures, just what's needed to make the app work right now.

Aaron Reese • Apr 14 '22

I don't care if it's unique or not as I'll handle that in the more flexible code,

Now you have to write it several times if there are multiple entry paths to the database

all I know in data-storage terms is I need to store an email and right now the need is for it to be associated with a User

So let's model the database with a UserContactMethod table:
ID (Primary Key)
UserID (Foreign key to the User table)
Contact method (enumerated value of phone/email/Twitter handle)
ContactMethodValue

Not a lot of effort to design up front and now caters for 0..N email address against a single User, the same email against multiple users and automatic extendability for other contact methods.. NOT modelling this according to the real world entities will likely result in significant remodeling in the future.

Dude Bro • Sep 14 '25 • Edited

Business logic and data integrity aren't the same thing... A subscription cannot exist without a linked user because that is fundamentally corrupt data. That way of modeling the data is objectively incorrect. So this argument of 'why use constraints at all if you can't shove all the business logic into the database' is a strawman - It's not business logic.

Want to test a Subscription? If it's constrained, you also need to set up a User, even if the test has nothing to do with users.

Again... a subscription without a user is corrupt data. Period. Doesn't matter if you're in relational databases or not. If you have set it up this way, you have modeled the data incorrectly. A constraint for this is a feature, and your theoretical test without a user is just bad, meaningless, and incorrect.

That adds noise and complexity.

Nope. Not at all. It's completely trivial.

Want to support multiple users with the same email? ...with a uniqueness constraint you also need a migration.

I'm going to ignore the silliness of this example. How is that a bad thing? Yes, you have to be intentional about changing your data model. It prevents you from breaking your application based on previous assumptions. And when you do make this migration, your updated code will be lean. Not bloated garbage because you have corrupt data.

Think about it: if your code already guarantees that a Subscription is only created with a User, what value does a non-null constraint add?

Lol. Oh yes, the myth of perfect code. Your 'perfect code' is going to change. You also realize that in production environments, more than just the application interacts with the database, correct? Your logic here does not work in reality.

It's just duplication

Wrong. Once again, correct data modeling is not business logic. And anyone competent who has worked in production would quickly find out that just 'trusting the code' does not work.

I highly suggest anyone reading this article who doesn't know better to summarily dismiss all of it. People like this are why so many RDBMS are a nightmare to work in.

Jon Lauridsen • Oct 8 '25 • Edited

Thank you for participating here. You're making two errors that keep repeating in these discussions:

First, you conflate logical correctness with mechanical enforcement. Allowing a NULL foreign key doesn't create corrupt data; it explicitly allows that relationship to not exist which is not the same thing. Whether that trade-off makes sense depends on context, not dogma. In many architectures, the database is not the single source of truth, and in those cases constraints introduce more friction. You're describing one particular worldview as universal, and it isn't. You may prefer monolithic database-centric systems, but that isn't inherently more correct than alternatives.

Second, your "it's trivial" claims only hold in trivial systems. Many of us who have run large-scale integration tests, seeded data for analytics, or done partial restores know how fast "just add a user" snowballs. Constraints add friction that compounds. They can still be worth it, and my article is about considering that friction not permanently rejecting constraints, but pretending constraints are free or always desirable is lazy thinking.

Your "anyone competent" line reads as gatekeeping. Competence of course also includes knowing when not to apply a tool. Most software does not scale, and its almost sole forcing function is running out of time and money. It's not that database constraints automatically cause software projects to fail, but most software operates on tight margins and so every source of friction deserves scrutiny to help those projects survive and adapt. In that light it is wise to apply constraints sparingly, when and where they really matter.

View full discussion (21 comments)