O/RM is the art of fitting a square peg into a round hole. Your SQL database is based upon records and relations, while objects are based upon OOP. OOP is little more than a software development mass psychosis, may I add. However, even ignoring the fact that OOP basically destroyed everything that was good in this world, O/RM is still rubbish tech by itself, and the whole idea of "mapping objects to database records" needs to stop, because at the end of the day O/RM is like adding brain damage to your database.
For instance, using the repository pattern makes it very tempting to "save" your objects. Ignoring that the repository pattern is little more than a thin layer on top of CRUD, "saving" your objects results in all sorts of difficult to solve problems. For instance, imagine two people editing the same record simultaneously. Let's create a class to illustrate the problem.
class Customer
{
    string Name;
    string Address;
    string PhoneNo;
}
Imagine John and Jane working on the same object at the same time. John updates the customer's Address, and Jane updates the customer's PhoneNo. They both start working on the object at the same time, and have therefore "fetched" the object from the database. John updates the Address field, and clicks "save". Jane updates the PhoneNo field, and clicks "save". Since Jane got her customer object from the "repository" before John saved it, the Address is now reverted back to its old value, the value it had before John updated it. This is referred to as a "race condition" (more precisely, a lost update), because the one to save the object last "wins".
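The scenario above can be reproduced in a few lines. What follows is only a minimal sketch, assuming an in-memory SQLite table and two hypothetical helpers, `fetch` and `save`, standing in for a repository that loads and stores whole objects; it is not taken from any particular O/RM.

```python
import sqlite3

def make_db():
    """Create an in-memory table holding one customer (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table customers "
        "(id integer primary key, Name text, Address text, PhoneNo text)"
    )
    db.execute("insert into customers values (1, 'Acme', 'Old Street 1', '555-0000')")
    return db

def fetch(db, customer_id):
    """Load the whole record into a dict, the way a repository loads an object."""
    row = db.execute(
        "select Name, Address, PhoneNo from customers where id = ?", (customer_id,)
    ).fetchone()
    return dict(zip(("Name", "Address", "PhoneNo"), row))

def save(db, customer_id, customer):
    """Write every field back, changed or not: the classic object 'save'."""
    db.execute(
        "update customers set Name = ?, Address = ?, PhoneNo = ? where id = ?",
        (customer["Name"], customer["Address"], customer["PhoneNo"], customer_id),
    )

db = make_db()
john = fetch(db, 1)  # both users fetch before either one saves
jane = fetch(db, 1)

john["Address"] = "New Street 2"
save(db, 1, john)

jane["PhoneNo"] = "555-1111"
save(db, 1, jane)    # also writes back Jane's stale copy of Address

final = fetch(db, 1)
print(final)  # {'Name': 'Acme', 'Address': 'Old Street 1', 'PhoneNo': '555-1111'}
```

John's new address is gone, even though Jane never touched that field.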
Solving the above problem requires what's referred to as database locking mechanisms. SQL Server, for instance, solves this by allowing users to create RowVersion columns on tables. This is called "optimistic locking", and it tends to propagate into the GUI, resulting in leaky abstractions and forcing the developer to add tons of garbage code to his GUI layer to make sure objects are truly saved when the user tries to save them.
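To make the idea concrete, here is a rough sketch of optimistic locking. It assumes a plain integer `Version` column in SQLite as a stand-in for SQL Server's RowVersion, and the `fetch` and `save` helpers are hypothetical names made up for the illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "create table customers "
    "(id integer primary key, Address text, PhoneNo text, Version integer)"
)
db.execute("insert into customers values (1, 'Old Street 1', '555-0000', 1)")

def fetch(db, customer_id):
    """Load the record together with the version it had when it was read."""
    row = db.execute(
        "select Address, PhoneNo, Version from customers where id = ?", (customer_id,)
    ).fetchone()
    return dict(zip(("Address", "PhoneNo", "Version"), row))

def save(db, customer_id, customer):
    """Save all fields, but only if nobody saved since we fetched.

    Returns True when the row was updated, False when the version check failed.
    """
    cursor = db.execute(
        "update customers set Address = ?, PhoneNo = ?, Version = Version + 1 "
        "where id = ? and Version = ?",
        (customer["Address"], customer["PhoneNo"], customer_id, customer["Version"]),
    )
    return cursor.rowcount == 1

john = fetch(db, 1)
jane = fetch(db, 1)

john["Address"] = "New Street 2"
john_saved = save(db, 1, john)   # True: the version still matches

jane["PhoneNo"] = "555-1111"
jane_saved = save(db, 1, jane)   # False: John's save bumped the version

print(john_saved, jane_saved)  # True False
```

John's address survives this time, but Jane's save is rejected, and something in the layers above has to decide what to tell her.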
There are different versions of database record locking techniques, but they're all basically garbage, and require tons of garbage code to propagate all the way out into the UI and the human being actually editing records. You are now left with a software "solution" where something as fundamental as updating a single field on a single database record might fail, forcing the end user to "revert" his changes and "reload" the original object. 25% of your code is now safeguarding code, trying to prevent the user from applying race conditions to your data, and this garbage code penetrates from your database, through your middleware, into the UI layer of your app.
Ignoring the fact that 98% of software developers don't even know how to apply database record locking, and simply ignore it, resulting in garbage data in your database: even if you apply a "perfect" piece of safeguarding code for such scenarios, you still end up with garbage code, where something as fundamental as changing a database record might result in the end user having a modal form in his face.
"We couldn't save your object, do you want to force it, reload the original object, or cancel?"
Most end users wouldn't even understand what to do here, and probably simply call in sick, staying home the rest of the week - Or bother some tech guy, who'd need an hour to simply figure out why the object couldn't be saved. Hence, the use of O/RM in your database layer, resulted in an end user having half his day destroyed, not able to do his job.
We have a word for this, and it's called madness!
The big joke
The big joke in this equation is that you do not need locking if you respect the database for what it is. Your database is a relational database system, and it allows for updating individual fields, on individual records, on individual tables. To type it out with code, imagine John's change resulting in the following.
update customers set Address = 'xyz' where ...
Then imagining Jane's change resulting in the following.
update customers set PhoneNo = 'xyz' where ...
It's literally that simple. You've now eliminated all sources for potential race conditions. You no longer need locking, and you can basically SHIFT+DELETE 25% of your codebase, ending up with orders of magnitude safer code. No more "synchronisation code" in your UI layer, middleware, or database.
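As a sketch of what the two statements above look like from application code, the snippet below builds an update that touches only the fields that changed. The `update_fields` helper and the column whitelist are assumptions made for the illustration, and SQLite stands in for whatever database you use.

```python
import sqlite3

# Hypothetical whitelist: column names cannot be bound as SQL parameters,
# so they are checked against this set before going into the statement.
ALLOWED_COLUMNS = {"Name", "Address", "PhoneNo"}

def update_fields(db, customer_id, changes):
    """Update only the columns named in `changes`, leaving all others untouched."""
    if not changes or set(changes) - ALLOWED_COLUMNS:
        raise ValueError("changes must name at least one known column")
    assignments = ", ".join(f"{column} = ?" for column in changes)
    db.execute(
        f"update customers set {assignments} where id = ?",
        (*changes.values(), customer_id),
    )

db = sqlite3.connect(":memory:")
db.execute(
    "create table customers "
    "(id integer primary key, Name text, Address text, PhoneNo text)"
)
db.execute("insert into customers values (1, 'Acme', 'Old Street 1', '555-0000')")

update_fields(db, 1, {"Address": "New Street 2"})  # John's change
update_fields(db, 1, {"PhoneNo": "555-1111"})      # Jane's change

row = db.execute("select Address, PhoneNo from customers where id = 1").fetchone()
print(row)  # ('New Street 2', '555-1111')
```

Neither statement carries a stale copy of the other field, so the order in which they arrive does not matter for this case.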
The technical term for this is "record slicing", and implies doing partial updates on records. However, this is fundamentally incompatible with best practice O/RM design patterns, such as Active Records, Repository Pattern, etc, etc, etc. I realise that some O/RM libraries do implement this slicing technique, but then you're no longer using your O/RM library as an O/RM library, but rather a sadly implemented substitute for SQL, and it becomes a terribly implemented functional programming language, with a sub-optimal database serialisation layer, violating every single "best practice" you were taught as you started using O/RM libraries. The reasons this is impossible to implement in O/RM without violating OOP best practices, can be described as follows ...
Can I have half your object please?
When phrased such as above, it is easily understood. OOP is based upon classes and strong typing. Supplying "half an object" to OOP simply doesn't make sense, and is literally impossible. Hence, if you want to apply record slicing in your O/RM, you're no longer using your O/RM as an O/RM, but rather something else, and you're no longer using your OOP language as OOP, but rather like a badly implemented functional programming language.
If you want to see how we solved this in Aista, you can watch the following video where I demonstrate record slicing, completely eliminating the problem.
If you want to study how Hyperlambda solves this problem, you can register a one month trial cloudlet below, and reproduce what I did myself, by creating your own frontend in a couple of minutes implementing record slicing.
Top comments (41)
I can imagine the following situation (with or without ORM, it doesn't matter):
The manager tells John to update the user name to "Pepe" and the boss tells Jane to update the user name to "Juan". Both proceed at the same time and hit the button at the same time (that's unlikely, since "the same time" would have to mean the same absolute time, but let's imagine it's possible).
Both hit the button at the same time "SUBMIT".
Which name will the user have? You can't know without looking at the DB.
Let's imagine the difference between John and Jane hitting the button is 1ms, with John being the first one and Jane the second one.
Which name will the user have? Again you can't know without looking at the DB. Maybe the request of Jane took more time to process than the one of John. Which you can solve with transactions but even then, maybe the request of Jane took more time to reach the server and thus the result will be "Pepe" and not "Juan".
The solution you propose against ORMs is pretty common when using ORMs as well (the pattern, I mean): you simply send the update request for the properties that changed relative to the fetched object you already have.
Let's convert the following example you shared into something less moronic and more realistic:
Imagine John and Jane working on the same object at the same time. John updates the customer's Address, and Jane updates the customer's PhoneNo. They both start working on the object at the same time, and have therefore "fetched" the object from the database.
John updates the Address field, and clicks "save". Jane updates the PhoneNo field, and clicks "save".
Since Jane only changed the phone number, her request is only meant to change the phone number, and since John only changed the address, his request updates just the address.
This leads to a better pattern in which you reduce network traffic and get more specific requests that are faster and lighter to process.
No drama, everybody wins, no race conditions affecting different fields on the same schema and, most importantly, this has nothing to do with ORMs but with validation.
When you validate an object in the update process you set the values as optional (except of course the primary key). If there's something on any field, validate it with the data type, length and whatever. Then send the update to the ORM.
validate against:
Update requests get:
Then the request to the ORM looks like:
and the next req will be
Now everything put together:
And there you go! Both requests can be executed at the same time, asynchronously etc without the issue mentioned in the post and using an ORM, which brings some extra benefits in security and development speed and maintenance.
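The approach described in this comment can be sketched roughly as follows. This is an illustration only, written in Python rather than the commenter's JavaScript; the field names, length limits and the `validate_update` helper are all hypothetical, and the final call into an ORM is left out because it depends on the library.

```python
# Hypothetical maximum lengths per updatable field.
MAX_LENGTHS = {"Name": 100, "Address": 180, "PhoneNo": 20}

def validate_update(payload):
    """Validate an update request where every field except `id` is optional.

    Returns (id, changes), where `changes` holds only the fields that were
    actually sent, so the caller can update just those.
    """
    if not isinstance(payload.get("id"), int):
        raise ValueError("id is required and must be an integer")
    changes = {}
    for field, value in payload.items():
        if field == "id":
            continue
        if field not in MAX_LENGTHS:
            raise ValueError(f"unknown field: {field}")
        if not isinstance(value, str) or len(value) > MAX_LENGTHS[field]:
            raise ValueError(f"invalid value for {field}")
        changes[field] = value
    if not changes:
        raise ValueError("nothing to update")
    return payload["id"], changes

# John's and Jane's requests each carry only the field they changed.
print(validate_update({"id": 1, "Address": "New Street 2"}))
print(validate_update({"id": 1, "PhoneNo": "555-1111"}))
```

The resulting `changes` dict would then be handed to whatever partial-update call the ORM in use provides.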
I do not use OOP in most situations, I just use Objects for convenience hence you can create a new post about using half OOP in those situations and I'll probably agree but it's just that I don't care. I use a multi-paradigm language (JavaScript) so I use the paradigm that's more convenient for each task.
The same way, when you read something from a model, you get an object (that's ok) with the fields you actually need, not with everything. E.g. you may not need the created_at, updated_at etc. of each record to show movies from Matthew McConaughey on a public view so, why ask for them in the first place?
Having a user without information on optional fields doesn't make the "user" any less of a "user", and treating required fields (on create) as optional (on update or get) also doesn't harm the overall ORM philosophy.
The only thing I can say generically against ORMs is that If you are a master at SQL, you can probably get more performant queries by writing them yourself.
In repository pattern, mapping is not the responsibility of the repository. It’s the responsibility of your controllers 🤷🏻‍♀️
It's not the tool (ORM) but how you use it so again, the premise IMO is incorrect, hence it is the conclusion as well.
Hope it bring some light to the topic 😉
It is not very common, it's extremely rare. Most people implement object.Save(), which updates everything.
This is the correct way to do things, and the way Hyperlambda does it. But it is fundamentally incompatible with ORM libs without violating the entire "philosophy" of how an ORM lib should be used. Most who actually do implement it, implement it by retrieving the object, applying the changes, and (sigh!) invoking updatedObject.Save(). I don't even need to tell you why that's bad I presume ...
There doesn't exist a single ORM that implements this. Because it is a violation of the "Design Patterns" ORMs are built on. Don't get me wrong, some ORMs do implement this, but the devs using this feature of ORMs are no longer using their ORM as an ORM, but rather as "something else". Regardless, once you start using your ORM like this, you've got no use for your ORM, and you might as well throw it out. EF, the most popular ORM for .Net had existed for more than a decade before they even implemented the ability to do this ...
Your code is fundamentally incompatible with OOP. If you're using such code, you're not using OOP, and you're not using ORM, since the latter implies an OOP model ...
Hope it bring some light to the topic 😉
We may have very different experience with projects, languages, tools and so on, but I only saw this on a couple of "dinosaur" projects in which they handled those things through various abstractions (nonsense) and sometimes even wrongly.
I've always worked like that explained on the comment above for reasons I feel obvious.
As far as I'm concerned I don't use classes in JS, just Objects (as everything is an Object and the Object Object itself is a convenient data structure to use) and also don't think there's a single thing to rule them all so If some thing needs a paradigm for any reason good! go for it but I won't stick to a single thing.
I may need to take a think on that to see why is it incompatible with OOP.
AFAIK there are optional props in most languages that implement OOP, i.e.
Maybe @codenameone can bring some light here as I'm not using Java since some years ago.
At first glance I think those are different topics, on one side we have Object Mapping (to a model) and on the other side we have object property propagation (how many of those properties you need to propagate and/or use on each place).
I don't see any way in which this is incompatible with OOP. SQL itself isn't OOP and we use ORMs to bridge the gap. They aren't ideal OOP and one of the big mistakes people make with such tools is to build them as if they are (excessive inheritance in object model, etc.).
There's no need to be religious about OOP or functional programming. Each has its strengths and we each find the balance in the languages best suited for the job.
I'm good with null personally even though we have Optional. That maps directly to the way the DB stores the data.
Save doesn't update everything in tools like Hibernate. It instruments the object and knows the specific fields we set. You don't need any optional or anything for this to work seamlessly. It generates relatively long and verbose queries but they are more efficient as a result.
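The idea of an object that remembers which fields were set can be sketched as below. This is a toy illustration of dirty tracking in general, not Hibernate's actual mechanism; `TrackedRecord` and `pending_update` are hypothetical names, and whether a real ORM emits only the changed columns depends on the tool and its configuration.

```python
class TrackedRecord:
    """Holds loaded column values and remembers which ones were changed."""

    def __init__(self, **loaded):
        # Bypass our own __setattr__ so loading does not count as a change.
        object.__setattr__(self, "_values", dict(loaded))
        object.__setattr__(self, "_dirty", set())

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for column names.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name not in self._values or self._values[name] != value:
            self._values[name] = value
            self._dirty.add(name)

    def pending_update(self, table, key):
        """Return (sql, params) covering only the changed columns, or None."""
        columns = sorted(self._dirty)
        if not columns:
            return None
        assignments = ", ".join(f"{column} = ?" for column in columns)
        params = [self._values[column] for column in columns] + [key]
        return f"update {table} set {assignments} where id = ?", params

customer = TrackedRecord(Name="Acme", Address="Old Street 1", PhoneNo="555-0000")
customer.PhoneNo = "555-1111"
print(customer.pending_update("customers", 1))
# ('update customers set PhoneNo = ? where id = ?', ['555-1111', 1])
```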
Bingo; ORM is an OOP pattern. No OOP, no ORM. JavaScript is not OOP. You're not using an ORM, you never have been using an ORM, and assuming you don't start typing in a strict typed programming language any time soon, you probably never will use an ORM either - What you're doing is using the good parts from JavaScript to circumvent the problem of optimistic/pessimistic database locks ...
I took the sentence of ORM being an OOP pattern as true at first sight but something was crackling in my head.
Take that definition instead: ORM is a programming technique for converting data between incompatible type systems in relational databases and objects in programming languages. This creates, in effect, a "virtual object database" that can be used from within the programming language.
You have an RDB with its types and its tuples and so on, and you create Objects with properties that map those.
As long as the programming language accepts Objects it does not need to live in an object-oriented implementation, and even if objects didn't exist you could use any data structure to replicate the same.
Also, this has nothing to do with strictly typed programming languages; a varchar(180) will never map to a programming language type. That's why you add validations and logic in between, to tackle certain security concerns, to ensure your data is consistent in the DB, and to prevent errors or your data being cut off when trying to store it.
I get my Model mapped to Objects, hence the Object-Relation Mapping is fulfilled. So yes, I'm using an ORM.
O in ORM implies “Objects”, as in strongly and strict typed OOP. No strict types, no ORM. Many will disagree with me here, but ORM is fundamentally a design pattern for OOP languages. No (strictly typed) OOP, no ORM …
You’re using something else. Don’t believe me, check out the Java guy commenting here … 😉
If you invent something like Array Oriented Programming and ArrayHandlers instead classes, while adding functions to Array indexes instead methods it will not make Array exclusive of this weird thingy.
If I'm using JavaScript (imagine I don't use TS) and without implementing OOP as paradigm for everything will not make my ORM be less ORM. The OOP implementation is in the ORM itself and I don't need to follow the same, I just need to use what it provides.
About array programming languages? It's already been with us for 6 decades. It's called Lisp 😉
You're using the "good parts" from JS. JS is not OOP. I'll probably have 95% of the world disagreeing with me here, but the truth is the truth regardless of how many argue against it ...
JS is a multi-paradigm language, and OOP fits inside this "multi". It not being "pure" doesn't mean it's not capable of handling OOP. Nor do you need pure OO to provide an ORM.
Also, Lisp has its code represented as linked lists; that does not mean it is array oriented.
Bring it up with Martin Fowler 😉
Maybe you should ask Alan Kay instead, don't you feel it more appropriate?
😉
I agree. Unfortunately 99% of the world still believe OOP is equivalent with C++ and Java ... 😕
I built Hyperlambda by drawing inspiration from Lisp and Alan Kay in fact. You should watch Alan Kay's speech at OOPSLA 97 BTW ...
By the rest of the post and comments, you seem to be in this 99% isn't it?
Again those are different concepts.
On one side there is OOP (a programming paradigm) and on the other side we have programming language implementations which can be more or less accurate to fulfill the paradigm and can or cannot be a re-interpretation of the original paradigm.
I'm not picky about these things. As far as I'm concerned, if Java and C++ did an OOP implementation that's not aligned with the original idea of OOP, the people behind it must have had good reasons to proceed that way (and they had, in fact).
I'll search that in youtube and save it for later 😁 can't recall if I watched it but even if I did it was long time ago.
His main thesis is that we are still in the stone age. The name is "The computer revolution has not happened yet" ... 😉
Well there's always iterations.
Look at the ML niche, analog computers are getting stronger on that field of study for good reasons.
Update is a problematic statement and most of us avoid it so the chance of a race like that is pretty rare. This is true regardless of what you use, immutable data is better and much easier to optimize at scale.
Sometimes we do need it though and in this case ORMs handle locking almost seamlessly. No 25% of code or anything really. You can enjoy optimistic or pessimistic locking in ORMs.
About performance. Since caching is the mother of all performance optimizations by giving up on ORM you give up on Level-2 distributed caching. There goes 90% of your performance down the drain when compared to ORM.
I get that you don't like OOP. I disagree but that's subjective. But ORMs can be far superior to the DB on which they are situated. They are much easier to refactor and adapt to new SQL features and can be more performant. That isn't the case with coded SQL.
There are valid cases to be made that the abstraction leads to ignorance and people who don't benchmark the resulting SQL. But that can be said about almost any technology.
So you're proposing that I shouldn't be able to update records in my database? With anything but an ORM, updates are not particularly problematic, they're really quite simple in fact. I added an SQL update statement in my OP to illustrate how easy in fact ...
i rest my case ... ;)
Both of these imply code propagating into the UI layer such as I illustrate with the "We couldn't save your object, do you want to force it, reload from storage, or cancel" modal dialogue box. Either that, or silently failing the update ...
Not true. Not having an ORM doesn't imply anything else besides really "not having an ORM". Everything that an ORM does, is just as easily implemented in everything that's not an ORM ...
Updates are easier with an ORM but updates are inherently bad since they are mutable state and can fail (regardless of ORM). They should be minimized in a good DB strategy. This scales better and provides logging for recovery/tracking.
Everything will end up failing somewhere if you have two users updating the same row. This was true when I was writing fat client SQL in the 90s and it's true now. You seem to gloss over these inherent facts as if races don't happen. Two users send a bad update, then who wins?
ORM fails and solves that dilemma for us. In your approach the failure will be in bad data. A user will have his work overwritten and lost forever. That's why we don't do updates.
You can't cache at scale without an abstraction. You either create your own abstraction and deal with all the bugs/complexity/scale. Distributed caching is one of the hardest problems in our industry. Doing it without an existing framework is a whole lot of work and is very likely to include many serious bugs.
No, updates are inherently more difficult with ORMs, and the reason originates from OOP being an inherently bad idea. ORMs just further emphasise how bad of an idea OOP was in the first place.
I don't disagree, I like immutable data. However, I've got a client_account table. I've got 11,000 log items referencing one of my client accounts. I need to update one field on my client_account record. What do you propose I do? Tell my users "Sorry mate, we can't update the phone number on client accounts because we're using O/RM" ...?
Not true. This is an "ORM only problem". If you disagree, feel free to explain why, with an example use case. Most database systems today are ACID compatible, and there are never any problems with such things in ACID ...
You don't need an ORM to abstract. Below is an example from Hyperlambda.
Adding Redis to the mix makes it distributed. And yes, the above Hyperlambda is thread safe ...
I think you had a bad experience with a specific ORM tool. Hibernate does have edge cases that you need to understand to leverage it properly but it's pretty amazing and lets you just invoke native SQL whenever you want. So even if you do find such an edge case, you can always do an update with SQL.
I do disagree though. Hibernate updates are fast and generate code that would be inconvenient to handcode.
Again. Nothing to do with ORM.
Personally I have a separate table of user properties which is where we add meta data and delete old entries. But if you want you can use an update. This is one of the cases where update makes sense. It's a rare operation that will happen once in a blue moon. The likelihood of collision is very low.
That's key-value caching. Are you comparing this to ORM level-2 cache?
Say you pulled an SQL value out of the DB and placed it into Redis. Now every single place where you do that handcoded update you need to remember to invalidate the Redis key or you might have a serious problem. Are you using the same transactional context between SQL and Redis?
What if the transaction fails? Do things go into Redis or not?
All of those things are handled seamlessly with an ORM and a transaction manager.
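To illustrate the bookkeeping being described, here is a small sketch of a hand-coded update that invalidates a cached value only after the SQL transaction has committed. It assumes a plain dict as a stand-in for Redis and SQLite as the database; `get_phone` and `set_phone` are hypothetical helpers. It does not give SQL and the cache a shared transactional context, which is the harder problem raised above.

```python
import sqlite3

cache = {}  # stand-in for a Redis client; keys map to cached column values

db = sqlite3.connect(":memory:")
db.execute("create table customers (id integer primary key, PhoneNo text not null)")
db.execute("insert into customers values (1, '555-0000')")
db.commit()

def cache_key(customer_id):
    return f"customer:{customer_id}:PhoneNo"

def get_phone(customer_id):
    """Read through the cache, falling back to the database on a miss."""
    key = cache_key(customer_id)
    if key not in cache:
        row = db.execute(
            "select PhoneNo from customers where id = ?", (customer_id,)
        ).fetchone()
        cache[key] = row[0]
    return cache[key]

def set_phone(customer_id, phone):
    """Update the column, then drop the cached value once the commit succeeded."""
    with db:  # commits on success, rolls back and re-raises on an exception
        db.execute(
            "update customers set PhoneNo = ? where id = ?", (phone, customer_id)
        )
    # Only reached after a successful commit; a failed update leaves the cache alone.
    cache.pop(cache_key(customer_id), None)

print(get_phone(1))   # 555-0000 (now cached)
set_phone(1, "555-1111")
print(get_phone(1))   # 555-1111 (cache was invalidated, value re-read)
```

Every other hand-coded update touching the same column would need the same invalidation step, which is the maintenance burden the comment points at.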
Redis is great. I use it a lot. But it's a key-value store. It isn't your SQL database.
You have a lot of valid points. You’re still fundamentally wrong, because you don’t see the fundamentally bad idea of treating RDBMS things within the context of a strictly typed OOP language, AKA ORM…
That's an argument I can agree with. ORMs are a leaky abstraction that marries SQL with OOP. Two very different things. This causes most of the problems with ORMs.
Having said that the alternatives for OOP programmers are worse:
I'm not an OOP purist. I think pragmatism trumps everything. Java isn't pure OOP either which is one of the things I like about it.
Check out Dapper for .Net ... 😉
I had you pinned as more of a jooq person...
Hyperlambda is my choice of kicks ...
You're probably using outdated tech... in the past 15 years I've been in Novell, IBM, Amazon... Never had an issue with race conditions by using ORMs. We've had thousands to millions of requests per second. Nobody updates the whole row on every property change, so I suspect you're using outdated tech or a poorly implemented homegrown solution.
Entity Framework "best practices" typically result in this. EF is by far the most popular ORM on the .Net platform, but the problem isn't related to any specific ORM, it's related to "best practice design patterns" a lot of us have been taught over decades of software development, such as repository pattern and active records, using view models and db models. I have seen such issues resulting from nHibernate, Entity Framework, even Dapper, and tons of other ORM libs.
As to outdated; CosmosDB, the arguably newest contribution to NoSQL database systems doesn't even have partial record updates. With NoSQL database systems the problem grows exponentially ...
NoSQL is much "newer" than SQL - Just sayin' ... ;)
When I say race conditions though, I don't mean literal race conditions in code. To understand these race conditions, look at the use case of John and Jane updating the same record.
I tend to agree with @joelbonetr that this isn't actually about ORMs.
Let's just look at your video, which looks at the HTTP requests and not the SQL queries. A decade ago, Google created RequestFactory in GWT, which would make HTTP requests basically looking like yours (much more complex because it models a true object graph, allowing object reuse, reference cycles, etc. and it's not just CRUD, a bit like GraphQL if you squint; anyway) but on the server side it would load the entity, "patch" it, and then save, so it would totally fit with ORMs.
Now let me imagine a different situation: John and Jane make adjustments to "data", and some fields' value depend on other fields (it could just be the traditional cascading dropdowns or similar filters like "this field cannot have this value if that other field has that value"; or it could be more "functional"/business, without hard rules encoded in the software: the user sees the data and adjusts one field based on what it sees on the screen in the other fields). Now if you only "patch" one field at a time, losing the context in which that modification was made, what guarantee do you have that the data still is meaningful? how about validation rules across fields?
Users don't look at the actual data, they look at a snapshot that was fetched moments ago. If their intention is to only modify one field, then sure your solution will work ; if they check each field and fill in others based on that context, then they'd rather either have a lock on the record preventing any modification by another user, or have to deal with "optimistic locking", i.e. "someone made a change to that record since you fetched it", with a conflict resolution screen or whatever (offer to save as-is and have the user look at the audit log to see what had changed and possibly redo the changes)
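A tiny example may help show the cross-field problem. The rule and the `ContactByPhone` field below are invented purely for illustration: each user's single-field patch is valid against the snapshot that user was looking at, yet the combined record breaks the rule.

```python
def is_consistent(record):
    """Hypothetical cross-field rule: contacting by phone requires a phone number."""
    return not record["ContactByPhone"] or bool(record["PhoneNo"])

stored = {"PhoneNo": "555-0000", "ContactByPhone": False}

# Both users work from the same snapshot of the record.
john_view = dict(stored)
jane_view = dict(stored)

john_patch = {"ContactByPhone": True}  # fine: John sees a phone number on screen
jane_patch = {"PhoneNo": ""}           # fine: Jane sees phone contact switched off

# Each patch passes validation against the snapshot its author was looking at.
john_ok = is_consistent({**john_view, **john_patch})
jane_ok = is_consistent({**jane_view, **jane_patch})

# Applied one after the other as single-field updates, both go through.
stored.update(john_patch)
stored.update(jane_patch)

print(john_ok, jane_ok, is_consistent(stored))  # True True False
```

Catching this requires validating against the current row at write time, or one of the locking strategies mentioned above.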
If you really want to solve the issue of concurrent modifications (while still allowing them), then you'll need changes reflected in real time on the screen, similar to how you can collaborate in a Google Doc or Google Spreadsheet.
What you are describing is intelligent and interesting, but does not invalidate the primary argument, or change anything. ORMs are still garbage tech, solving the wrong problem, with the wrong solution ...
"Damn kids, get out of my lawn!"
"Music in my times was better"
I've implemented ORMs before Hibernate and EclipseLink existed. There are multiple ways of avoiding updating the whole object or race conditions. You can even have distributed caching in the middle with row versioning for a seamless experience. After over 20 years, I've started making things simpler, most of my microservices are dumb CRUDs. I see your point, but I understand the other side as well. Use whatever you want, and let others use whatever they want... nobody can learn in the head of another.
I do too, but don't tell anyone ;)
Really, my point with these articles, is to give people "forgiveness" for creating simple code. Simple code is almost always superior to complex code ...
Not almost always... I'd say always, period.
Hehe, agree! But yet again, don't tell anyone, I still want people to believe they can make me change my mind by arguing ... ;)
With today's javascriptization of the dev world (at least web dev), databases are mostly used as a fancy typed document storage with magical querying capabilities. And in document storage you always read everything and write everything. ORMs map pretty well to this.
I'm not saying you are wrong though, just that for many people it might be hard to understand the issue.
Word ... 😕
So subtle 🤣🤣🤣
I try ... 😉😊
JSON Patch is a great way to implement partial updates quite easily (and it can be achieved without altering your frontend).
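For readers unfamiliar with it, a JSON Patch document (RFC 6902) is a JSON array of operations such as `{"op": "replace", "path": "/PhoneNo", "value": "..."}`. The sketch below handles only top-level `replace` operations and turns them into a dict of changed columns; `patch_to_changes` and the column whitelist are hypothetical, and a real implementation would cover the other operations and JSON Pointer escaping.

```python
import json

ALLOWED_COLUMNS = {"Name", "Address", "PhoneNo"}  # hypothetical whitelist

def patch_to_changes(patch_json):
    """Turn a JSON Patch made of top-level 'replace' operations into column changes."""
    changes = {}
    for operation in json.loads(patch_json):
        if operation.get("op") != "replace":
            raise ValueError("only 'replace' is supported in this sketch")
        path = operation.get("path", "")
        # A top-level path looks like "/PhoneNo": one leading slash, no nesting.
        if not path.startswith("/") or "/" in path[1:]:
            raise ValueError(f"unsupported path: {path!r}")
        column = path[1:]
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column!r}")
        changes[column] = operation["value"]
    return changes

jane_patch = '[{"op": "replace", "path": "/PhoneNo", "value": "555-1111"}]'
print(patch_to_changes(jane_patch))  # {'PhoneNo': '555-1111'}
```

The resulting dict names only the fields the client asked to change, which is what a partial update needs.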
Thx Tamas, I try :)
I have brain damage and I love it, so ORM giving my database brain damage just means it's even easier for me to understand and work with.