A stored procedure is a set of SQL statements that is stored on the database server and is available to be executed by name. Stored procedures are the cause of “religious wars” in the world of relational databases where some DB users live and die by them and other DB users consider them an anti-pattern. They are certainly a tool that has the potential to be mis-used and create performance issues and blockers to scalability. As a DBA/Developer/Architect, should you be using them or not? Let’s dive in and try to answer that question.
When I see stored procedures used in databases, they fall into one of the following categories:
- A stored procedure is used for every single CRUD operation against the DB
- Stored procedures are used in certain situations for special handling, including:
- Running multi-statement transactions so that commit/rollback logic can be controlled at the DB level
- Running operations that involve chatty transactions so that round-trips between the app and DB can be avoided
- Centralization of logic that is called by several systems
- Encapsulation of operations to which highly granular security permissions need to be granted
- 10,000-line monsters containing business reams of logic
There are various arguments for and against the use of stored procedures. Let’s examine a few:
When you write a stored procedure, the query execution plan is stored/cached on the server which saves time when the stored proc is executed. This has become less of a factor over the years as DB technology has progressed. SQL optimizers have gotten better at storing plans for “dynamic” SQL and storing those plans for re-use. By making use of prepared statements in data access code, the same benefit of execution plan caching can often be gained.
There are opportunities to grant an application or user permissions to execute a stored procedure but restrict access to the underlying table(s). Or, you could grant access to a procedure that INSERTs data into a table but deny access to a procedure that UPDATEs or DELETEs data from a table.
If every CRUD operation performed against the database is contained in stored procedures, it’s relatively easy to have an understanding of the query patterns used in the database. As a DBA, getting your arms around the query patterns (both reads and writes) that are being executed against a database is a huge step in being able to manage and optimize a database -- especially when inheriting a new system with which you have little familiarity.
SQL Injection Attacks are a classic security vulnerability. If you’re not familiar with them, google “little bobby tables” and check out the famous XKCD comic that pokes fun at them. Because stored procedures are typically parameterized, they provide a level of protection against unsanitized SQL inputs. To be fair, I have also seen stored procedures exploited with SQL Injection attacks; they’re not a magic fix for SQL injection attacks, and smart developers still take precautions to sanitize data and use parameterization correctly.
For certain database operations that require several roundtrips from the app to the database, there can be a performance boost by planting all the necessary logic within a stored procedure to handle the entire operation without ever “leaving the database.”
By storing SQL statements together in an encapsulated, named element which can be executed by various processes, there is an opportunity to re-use logic by application code, reporting services, and other DB clients that may be simpler than encapsulating that logic in an application tier (for example, in a microservice). In a system that is embracing microservices, this argument becomes largely irrelevant.
There is a subtle temptation when writing stored procedures to add little bits of logic that really don’t belong in the database. When there is business logic shared between the services code and the database (in stored procedures), it is harder to have a holistic view of a software system’s business logic; things can easily be missed which can lead to unexpected bugs.
In the case of 10,000-line monsters mentioned above, this embedding of business logic is taken to an extreme. This leads to highly unwieldy and difficult to maintain code. This is probably the most infamous use of stored procedures and one of the anti-patterns dissenters love to cite.
The code used in stored procedures often moves away from standard ANSI-SQL and into implementation-specific syntax. This makes it harder to move between DB systems. If the interaction between applications and databases is all plain-vanilla SQL, it is relatively easy to migrate an app from using any database system to any other database system.
While it is good practice to keep database schemas (tables, indexes, views, etc.) in source control, when the schema contains stored proc definitions, the management of the schema becomes much more complex. Often, changes to stored procedures are tightly bound to Data Access code, so managing versions of schemas and their appropriate code builds can be challenging. And, doing seamless upgrades of applications can cause complexities as well.
An ORM (Object Relational Mapper) is a piece of software designed to abstract away the manual mapping of domain objects (used in code) with the corresponding table structures (used in databases). By employing an ORM, you (in theory) save yourself a lot of trouble writing boring, repetitive data access layer code. ORMs typically write SQL statements dynamically and execute them directly. It is often hard, or at least involves a decent investment in time and effort, to get ORMs to use stored procedures which subtracts from their intended utility.
Note: I’m not actually a huge fan of ORMs, but I understand their value. My personal rule when deploying ORMs is to do so in a way where it’s easy to let the ORM do the data mapping when it’s efficient, but to allow the insertion of manual data-access logic for cases when the ORM is making poor choices. In other words, if the ORM makes your life easier 80% of the time, then leverage it for that 80% -- but don’t tie yourself to the ORM for that other painful 20%.
I admit that in the part of my career where I was writing apps using the Microsoft stack and SQL Server that I was a huge proponent and user of stored procs. I was also a huge Microsoft snob and had no plans of moving off of SQL Server to any other database platform. Fifteen years later, with a more mellow attitude and a little more hard-won experience under my belt, I am much more of a believer in building systems that use the right tools for the right jobs. The database is there to provide durability, handle concurrency, provide consistency, and generally take away the stress of storing data. Put the data in the database. Put logic that surrounds the data access in a data API and expose that to your applications.
In general, I believe that the trend regarding stored procedures is to move away from them -- especially when architecting systems that need to be highly available and massively scalable. The trends you will find around these types of architectures are the use of microservices, the adoption of various types of horizontally scalable data platforms (Distributed SQL, NoSQL, ElasticSearch, Spark, Snowflake, etc.), and a general desire to decouple business logic from data operations.
I also believe that the potential pros that can be gained from the use of stored procedures can also be gained in other ways -- through the use of microservices and through good coding practices; however, the potential cons are much harder to overcome.
I am hesitant to go so far as to call stored procedures a “crutch.” But I do think that if a system architect were to adopt a policy of explicitly barring the use of stored procedures in favor of other mechanisms that support reuse and decoupling of data and logic concerns that it would be a decision I would consider healthy.
Disclaimer: I work for Cockroach Labs and in CockroachDB, we don’t support stored procedures. The discussion around whether we should support stored procedures comes up fairly often. It would certainly make data migrations easier. And, for some of the reasons noted above in this article, stored procedures can certainly add some value. But for the most part, we seem to take the stance that most modern systems are not adopting stored procedures, and so, this feature never seems to get high enough prioritization to get done. Instead, we seem to take on features that enable adoption of the more forward-thinking trends (cloud native, microservices, containerization, serverless). We may support them in the future, and if we do, I hope we will put some guardrails around their usage to gently nudge users away from using them in potentially limiting ways.
Since the use of stored procedures is, as noted above, a bit of a religious war, I’d love to hear any comments you have on the pros and cons of stored procs and whether you think they are a tool that has a place in modern application architectures.