Introduction to Object-relational mapping: the what, why, when and how of ORM

#database #objectrelationalmapping #orm #oop

If you have ever used a relational database for persisting your data and an object-oriented programming language for your application, then Object-relational mapping paradigm is definitely something you should be familiar with. If not then read on as this article hopes to provide an introduction to the concept of ORM by answering the basic questions like why is there a need for an ORM, what an ORM is, the benefits and drawbacks of ORM, first steps needed to set up an ORM and the patterns of setting it up.

Understanding the need for ORM

Object-oriented programming languages (OOP) are great in combining variables and functions into classes and objects. Imagine this as objects in the real world (cars) each with their properties (wheels) and behaviours (drive). On the other hand, relational databases (also called Relational Database Management System or RDBMS) are powerful in their relationships between individual tables through the use of foreign keys. Imagine this as relations in the real world between one entity (passengers) and another (plane seats) linked together by a unique identifier (plane ticket ID). Today, the majority of applications are written in an OOP language and at the same time persist their data in the relational databases. Consequently, a need for a more harmonic way to communicate between them arose.

The clash between objects and relations is a highly complex problem as these two represent fundamentally different paradigms. The differences vary from the basic data structures, through differences in manipulations and transactions to conceptual differences. A more umbrella term that covers both specific and philosophical contrasts between them is Object-relational impedance mismatch (also Paradigm Mismatch).

It is important to state that the difference between the object-oriented and relational concept is intentional as they are optimised to do what they are best at. The object-oriented programming language is powerful in its ability to describe the real world using objects as the fundamental data type and the four principles of OOP - encapsulation, abstraction, inheritance and polymorphism. On the other hand, relational databases are exceptional in persisting data and thanks to the nature of the relationship between entities (most commonly represented by tables) through primary and foreign keys, they ensure data integrity through referential integrity. They also use Structured Query Language (SQL) designed for fast data retrieving and manipulation. Since both concepts are widely used, especially together, Object Relational Mapping tools can blur the line between the OOP language and RDBMS a little bit and allow your application to make the best of both sides.

Once upon a time, there was Object Relational Mapping (ORM)...

ORM is a concept that lets you manipulate data from a database through an object-oriented paradigm. In other words, it allows you to call and manipulate data from the database using your language of choice instead of SQL. The process of converting data into objects is called hydration and usually involves converting column values into object properties. That is why ORM libraries are language-specific (here is a list of ORM libraries). That is only the basic concept, the ORM libraries are much more powerful, especially when your application and database get more complex.

Here are the benefits that ORM can bring...

Greatly supplements raw SQL as the library encapsulates SQL queries into simpler methods, allowing you to interact with the object directly. This is highly beneficial as it saves time from repetitive SQL queries. Moreover, it also means that you do not have to know SQL too well but keep in mind that understanding how a relational database works will allow you to understand the magic that is happening under the hood.
Interacts with your database in your favourite OOP language.
Allows for the use of your database of choice without the need to worry about the different SQL dialects.
Built-in features that could greatly save your time (for example optimistic and pessimistic locks).
Improves the maintainability of your code by having a clear data structure overview in classes and objects and enables check for data types.
Using SQLite in development but MySQL in production? No problem, the link between database and application is loose so changes on either side are easier to implement.

...and here are some drawbacks

ORMs take a bit of time to get used to, these libraries are definitely on the more complex end so getting the hang of them might take a bit of time. Consequently setting them up correctly to take advantage of their full potential can also be time-consuming.
There is a lot of magic happening under the hood which makes the understanding of what is happening difficult. To not lose control, it is good to double-check the documentation and do the research to make sure it’s doing what you want. *Without deeper understanding and correctly calling the methods, the SQL queries can be more performance heavy than writing SQL directly, especially when more data leads to a larger number of SQL queries (the so-called N+1 problem).

That being said, you do not have to use ORM for the OOP and RDBMS to coexist. There is a heated debate between those for and against ORM (I won’t go into the details here, but here is a cool article looking at the dispute through values or an article on why ORMs are so hated by Martin Fowler). You need to decide for yourself, whether using ORM will improve your application or instead bring unnecessary complexity. One way to avoid the problem altogether is to just use a non-relational database and/or switch to functional programming language...

So how do I set up ORM? Easy as 1, 2, 3...

The first step is to select and install a library that implements the Object-relational mapping paradigm based on the OOP language that you are using
The second step is to create a connection (in ORM terms a session) between the ORM and your database. This step can be found in the documentation of the ORM library. Here is an example of this step documented for SQLAlchemy, which is an ORM library for Python.
The third step is to set up the mapping itself. What I mean by that is creating the entity classes that are then linked to your relational tables and connecting them to each other. This process should again be nicely documented in the ORM library’s documentation. Here is an example of SQLAlchemy documentation to inspire you. Unsurprisingly, there are few ways on how to approach the mapping itself, these are called ORM Patterns. The two most common ones are active record and data mapper.

ORM Patterns: Active Record vs Data Mapper

The ORM Patterns can be viewed as philosophies on how to map data between tables and objects. Think of it as how the ORM layer between the table and objects is behaving.

With Active Record, a table is represented by a class (usually called entities) where properties of the class more or less correspond directly with the columns of the table. Therefore, an object instance is tied to a single row in the table. The biggest difference from the Data Mapper paradigm is that in addition to the data the entities also contain methods that operate on them (save, delete, insert...), allowing for much closer binding between data and objects. The major benefit of an active record is its simplicity and quick set up as what you see in one is likely represented in the other. However, because of the tighter coupling between data and methods, the object contradicts the single responsibility principle (here is a nice summary by Uncle Bob). Moreover, testing these bound objects is difficult and with increasing complexity, changes in one side can have an unwanted impact on the other. That is why Active Record is most suitable for CRUD applications. ORM libraries that support Active Record are Ruby on Rails (Ruby), Laravel’s Eloquent (PHP), Symphony’s Propel (PHP), Django’s ORM (Python).

As for Data Mapper, the set up is similar to the active record, however, the objects do not contain data manipulation methods (we cannot call a save method on the object to persist the data). Instead, the objects access the Data Mapper layer that transfers the information to the persistent database and vice versa (in Java’s Hibernate this is called the Entity Manager). Thanks to this separation, the objects do not need to know how the data are saved into the database and they do not need to inherit the ORM methods thus following the single responsibility principle. Resulting from this detachment is also a much stricter process of interaction with the database which allows for more formal process and stricter control over database access. ORM libraries that support Data Mapper are Hibernate (Java), Doctrine 2 (PHP), SQL Alchemy (Python), Entity Framework (MS .NET), Prisma (Golang).

While Active Record is trying to blur the line between them as much as possible and creates a direct link, the Data Mapper creates a true middle layer that isolates the persistent database from the app’s business logic.

...and the OOP and RDBMS lived happily ever after

At the end of the day, ORM is just a tool. A powerful, magical tool that can bridge the differences between two very different worlds and allow you, the user, to interact with relations as if they were objects in the language of your choice. However, just with any tool, it requires practice and understanding, as relying only on the magic can lead to dangerous performance issues. Whether you decide to implement ORM or not, keep in mind that the decision is entirely up to you, so make sure that you understand what your application needs, especially in the future, to keep OOP, RDBMS and you happy ever after. The end.

Top comments (6)

Philippe Bruno • Mar 29 '21

Thanks Tina for the great explanation and hand drawn illustrations! One question though that everybody talking about ORMs seem to avoid: this mapping is nice for simple queries, but how does it look like with complex queries with multiple joins?

Tina • Mar 29 '21

Hey Philippe, thanks for a great question :)

If you are asking about the "visual" aspect of it and hence the maintainability, I admit, it can get pretty messy :) Though some ORM tools provide solution for that as well, like php's Doctrine 2 has this QueryBuilder that makes the the complex queries clearer.

now if you are asking about the execution of complex queries, then I would say, never blindly trust the ORM and always check what it is doing under the hood. Is it sending one query or sending 100 from my one line? Luckily, most of the ORMs do allow you to adjust your queries so it does what you want, but it does require some time spent with the documentation.

Philippe Bruno • Mar 29 '21

Hi Tina, thanks for your quick reply. From your article, my understanding is that each row from a table gets translated into an instance of a class where the various columns are represented as properties. With a simple table, I can totally picture this in my mind. But what about a case where you have a many-to-many relationship like students can be enrolled in multiple classes and a class has many enrolled students... What would be the resulting mapping? If each student is an instance of a Student class, how would you map the fact the student can be enrolled in one or more classes? Do you have a property of the student class that would be some sort of array of classes? I am lost. Maybe this question could be an opportunity for you to write a part two to an otherwise excellent article.

Tina • May 23 '21

Hey Philippe, sorry for not such a quick reply this time.

The beauty of ORM is that the properties do not have to be columns but can be an instance of a different table. So, in your example, a student would be of class Student and would have property class, which would be an instance of a Class. and through that property, you would be able to access student.class.name or whichever other properties the Class class has. The ORM uses various relationship properties to ensure that these relationships are behaving correctly.. Here, I am sending you a link to the SQL Alchemy documentation that shows the different relationships, including many-to-many.