
Does Phoenix make database CRUD too easy?

Michael Herold ・ 5 min read

The Phoenix framework logo, an orange firebird.

In 2014, the Phoenix framework emerged onto the web development scene as a fast, productive, and concurrent solution for the modern web. Its focus was on massive concurrency, thanks to the power of the Elixir language and Erlang's BEAM virtual machine. The original branding stressed the productive aspect, positioning the framework against the venerable king of productivity, Ruby on Rails. Those who are familiar with Rails will see its inspiration everywhere. The way Phoenix auto-reloads code in development mode makes it just as easy to use as Rails. The router grammar requires a good squint to see the difference between the two. This inspiration certainly makes for a quick learning experience for Rails developers. But I wonder: at what cost? Does Phoenix make it too easy to structure your application like a Rails application? Is this convenience to the detriment of using the full power and expressivity of Elixir?

Phoenix in the early days

Originally, the core team clearly optimized Phoenix for productivity and similarity to Rails. There were modules called "models" that you were encouraged to treat similarly to Rails models. Concerns were mixed in the model modules (say that three times fast) between querying, persistence, and business logic. The first application that I spiked out in those days looked a lot like a Rails application, despite my reading through Programming Phoenix. I understood that I should split business logic out into separate applications and only use Phoenix as the web interface to the application. However, the umbrella application concept put me into "architecture astronaut" mode: I could either ignore it and "be productive" or endlessly spin my wheels.

I admit, many of my problems in Phoenix prior to the 1.3 release likely stemmed from my simultaneously learning Elixir and Phoenix; I didn't know enough of either to be very effective with the toolset. Compounded by the fact that I was only using it on side projects, I wanted to continue to build and grow my skills rather than take the time to design my application as a piece of software first and a web interface second. I think this is often the same problem that people have early on in their Rails days: they want to build and "ship stuff," not deliberately build maintainable software.

Contexts and Domain-Driven Design

With the release of v1.3.0, Phoenix radically shifted the way the default generators worked. In his Lonestar ElixirConf 2017 keynote, Chris McCord described the new default architecture for Phoenix applications, which revolved around contexts. Based in Domain-Driven Design, contexts (or, as people are loath to say, bounded contexts) give a place for your business logic to live and define the interface between different components of your system. Does your application have a system of accounts with many bits of logic around them? Okay, those likely belong in an Accounts context. Are you interfacing with social media as a facet of your business? Great, you probably should have a SocialMedia context.
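As a rough sketch (the module and function names here are illustrative, not taken from the generators), a context is just a plain module that exposes business operations and hides the persistence details:

```elixir
# Hypothetical Accounts context: the web layer calls these functions
# instead of reaching for Ecto directly.
defmodule MyApp.Accounts do
  alias MyApp.Accounts.User
  alias MyApp.Repo

  # Fetch a user or raise if it does not exist.
  def get_user!(id), do: Repo.get!(User, id)

  # Validate the attributes and persist a new user.
  def create_user(attrs) do
    %User{}
    |> User.changeset(attrs)
    |> Repo.insert()
  end
end
```

A controller then calls `MyApp.Accounts.create_user(params)` and never needs to know how, or where, the data is stored.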

Contexts are a great first step toward wrangling the architecture of a Phoenix application. Instead of a Phoenix-is-your-architecture structure, you can move toward a semblance of loosely connected pieces that work together. These form natural boundaries between the components of your application. The boundaries can later be used as seams by which you can slice-and-dice your application into several smaller pieces, should that be something you need to do. The Phoenix maintainers' decision to change the default generators to use this design pattern was a very positive move. However, the part that worries me is that they chose a specific piece of the default Phoenix stack as the interface between different contexts: Ecto, the stack's database library and query Domain-Specific Language (DSL).

An Ecto-based architecture

Because of the decision to use Ecto as the boundary between generated contexts, that is the natural direction that people will take when learning Phoenix. Ecto is a wonderful library and very fun to work with. It helps you be more thoughtful about your database design. You have to design queries since you don't get a lot for "free" like in Rails' ActiveRecord. Ecto also removes many of the ways that you can accidentally hamstring yourself (I'm looking at you, N+1 queries!). Its thoughtful architecture forces you to be explicit about what you want from a query. However, because Ecto is a database library, the coordination of work between your contexts is fundamentally coupled to interacting with a database.
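For example (the schema and field names here are hypothetical), Ecto only loads associations you explicitly ask for, which is what rules out accidental N+1 queries:

```elixir
import Ecto.Query

# Comments are fetched in a second, batched query only because we asked
# for them with preload: — nothing is lazily loaded behind our back.
query =
  from p in MyApp.Blog.Post,
    where: p.published,
    order_by: [desc: p.inserted_at],
    preload: [:comments]

posts = MyApp.Repo.all(query)
```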

There is nothing inherently wrong with this decision. If you are making a simple CRUD application, this pattern will serve you well and will help you move quickly. However, how many "simple CRUD applications" continue to stay simple? How many of them continue to be CRUD applications? Every web app that I've ever worked on has increasingly gravitated away from the happy place where you are building a "glorified spreadsheet". As time goes on, more and more of your business logic has nothing to do with CRUD and has everything to do with munging that data. The business logic is happier without knowing it works with a database.

Alternatives?

This mild existential crisis that I had with the architecture of my Elixir/Phoenix application was spurred by reading an excellent commentary on building a Multi-User Dungeon (MUD) in Erlang and the thought and care that the author put into designing a process-based architecture for the application. It made me wonder: is Ecto really the best choice for the interface "language" between contexts? Is there something else we should use that gives us the nice affordances of Ecto.Changesets and their validation, but doesn't intrinsically tie us to the database?

I have considered using Formex as the mediation point between contexts. It interfaces well with Phoenix.HTML and is pretty lightweight in what you need to know to use it for interacting with the application. I have also considered rolling my own way of doing things, but that usually isn't the wise choice. I'm wondering if anyone else out there has thought about this? Do you have any way of breaking the tie between your traditional database and your application? I'd love to hear from you in the comments!

The post Does Phoenix make database CRUD too easy? appeared first on Michael Herold.


Michael Herold

@michaelherold

Husband. Father. Lead Application Engineer at Flywheel. Maintainer of Hashie. Open source contributor. Speaker. Hiker.

Discussion

 

I'm neither a Ruby nor an Erlang/Elixir developer so I can't speak to most of this, but I am going to go ahead and challenge the idea that 'business logic is happier without knowing it works with a database' :)

There are arguments for ensuring that the logic is portable without regard for a particular flavor of (relational, since if you're going NoSQL you're already locked in) database, most commonly that you're building an enterprise product which will be deployed in environments that already run aSQL with admins who don't want to have to learn and manage bSQL. But it's not a foregone conclusion.

Databases and SQL are better at many things than higher-level general purpose programming languages. When user input isn't required, it makes no sense to pull information out of a database, operate on it in application code, and put it back when SQL could accomplish the same goals in a fraction of the time. Database constraints too are the only way to truly enforce data validity and integrity, so you're already performing some business logic functions in the database. This stuff never really happens in a vacuum, and coming to grips with that is a lot more sustainable than trying to pretend otherwise.

 

I've thought about your comment for a few days now. Here are my thoughts on the topic, not in the order in which you presented them since I think my thoughts will be more coherent that way. Hopefully, that's okay! :)

When user input isn't required, it makes no sense to pull information out of a database, operate on it in application code, and put it back when SQL could accomplish the same goals in a fraction of the time.
[...]
admins who don't want to have to learn and manage bSQL.

I'm curious about this. It sounds to me like you're suggesting you use things like triggers or PL/pgSQL (in the Postgres world, of course!) to directly update data within the database? Is that something you do? How does it work out? Have you run into any maintenance issues there?

I've often thought that triggers would be a nice thing to use, but splitting your logic between the database and the application feels like it would lead to a maintenance headache in the form of increased cognitive load. When there is a bug in the application, you then have to look in multiple places for the source of the bug. In addition, there's more "surface area" for onboarding new developers onto the application.

Maybe I misunderstood and you're not suggesting that. If I did misunderstand, I'd love to hear what else you meant!

I am going to go ahead and challenge the idea that 'business logic is happier without knowing it works with a database' :)
[...]
Database constraints too are the only way to truly enforce data validity and integrity, so you're already performing some business logic functions in the database.

Between these two statements, I think that you and I mean two different things when we say "business logic". To me, business logic consists of all the things that you can't (easily, at least ... I know you can do some magic things in procedural RDBMS languages) express within the confines of SQL and/or relational algebra.

Business logic is why we have applications instead of spreadsheets.

Data integrity is a separate, related set of logical changes to the data that is input by the humans and other systems into your application. Constraints help protect you from errors. They help you clean your data as it changes over time (for instance, foreign key constraints with ON DELETE CASCADE).

Data integrity is there to help you reduce the cognitive load in your business logic and, hopefully, make it so you can write your business logic how a human thinks about it, not how you have to tell a computer to do it.

Does that make sense? Thank you for the thoughtful comment!

 

I make extensive use of views, triggers, functions, and stored SQL scripts (the last of which my data access framework treats ~identically to functions; to bring this back to Elixir momentarily, Moebius is built toward similar goals by Massive's original developer). Lately I've applied tobyhede's Postgres event sourcing model in a couple of cases, and of course trying to do that outside SQL would be impractical at absolute best.

I definitely agree that splitting business logic can lead to headaches -- which is why I try to put as much of it as possible in the database and write fairly minimal higher-level APIs on top! All you need for that is a simple web framework, while migration tools like sqitch make it practical to iterate on the database the same way you do on the higher-level code. Of course this all does require some facility with SQL, or the time and willingness to learn.

Testing is also a concern. pgTAP exists but so far ensuring API routes exercise the database fully and maintaining good test coverage for those has served me well enough.

Data integrity is there to help you reduce the cognitive load in your business logic and, hopefully, make it so you can write your business logic how a human thinks about it, not how you have to tell a computer to do it.

If you ride that train of thought a little further, the boundary will start to get really, really blurry. It's a false dichotomy: business logic as you presently conceive of it is a set of rules which describe things happening to information. So are integrity constraints: even your example of CASCADE on a foreign key is a rule which declares that when a tuple is deleted, tuples in another table with an attribute referencing it must also be deleted. That's low-level business logic!

To pontificate a little: web developers often have a weakness for synecdoche, taking 'application' to mean strictly a subset of data processing and user-facing components written in higher-level languages. When you think instead of a complete system assembled to achieve a purpose, the database is more than simply a storage device: it's an organizing principle. The database describes how information is structured and what can or can't be stored. Data storage considerations shape the architecture of higher-level components (how many XControllers have you run into where X is a table?). It's a mistake to think of the database as a wholly separate concern. If something would be faster or simpler as a trigger and you don't have a pressing reason to minimize database usage, make it a trigger! That's why the database developers gave you the option!

 

I think there's a misconception here. Ecto itself does not need to be tied to your database.

I actually use Ecto a lot to validate data and enforce a schema, without even mentioning the database.
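For anyone who hasn't seen this pattern, here is a minimal sketch of database-free Ecto (`embedded_schema` is real Ecto API, while the `Signup` module and its fields are invented for illustration):

```elixir
defmodule Signup do
  use Ecto.Schema
  import Ecto.Changeset

  # No table behind this schema; it's just a typed struct.
  embedded_schema do
    field :email, :string
    field :age, :integer
  end

  # Validate raw params without touching a Repo.
  def changeset(params) do
    %__MODULE__{}
    |> cast(params, [:email, :age])
    |> validate_required([:email])
    |> validate_number(:age, greater_than: 0)
  end
end
```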

There's even an open issue at the Ecto repository proposing to split the ecto package into ecto and ecto_sql, with the latter adding database access functionality on top of the former's schema and changeset management. So in order to use Ecto the way it works today, you will need to include both packages, and if you don't need the database access functionality you can just ditch the second package. All of this in the next version (3.0).

So no, I don't think Ecto is a bad choice for linking the domains, as it is positioning itself as a data validation library that can be enhanced with database access. :)

 

That's really interesting, I hadn't seen that issue yet.

Do you use an intermediate layer of Ecto changesets to map between contexts? Using embedded schemas and other such patterns has led to some headaches for me. I posted a question on the Elixir Forum that I never got a reply to (sadly, I can't link to it because they appear to be down right now) around this very problem.

I'd love to hear how you approach the problem of decoupling from the database while still using Ecto! It sounds like Ecto is moving in an interesting direction.

 

Well, I usually don't :D

But I use Ecto to normalize some parameters and returns that have nothing to do with the database (like returns from external APIs). This gives me confidence that I could move my persistence layer without that much change in my domain's boundaries.
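A schemaless-changeset sketch of that idea (the types map and sample response are invented): `cast/4` coerces the raw values and `apply_action/2` returns either the normalized map or an error changeset, with no Repo anywhere:

```elixir
import Ecto.Changeset

# The shape we expect back from the external API.
types = %{name: :string, followers: :integer}
api_response = %{"name" => "phoenix", "followers" => "1200"}

result =
  {%{}, types}
  |> cast(api_response, Map.keys(types))
  |> validate_required([:name])
  |> apply_action(:insert)

# On success, result is {:ok, %{name: "phoenix", followers: 1200}};
# note the string "1200" was cast to an integer along the way.
```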

I'm interested to hear how using embedded schemas backfired with you, though. :)

Elixir Forum is back up, so I can link you directly to the issue.

Basically, I'm presenting a schema that embeds two other schemas to use in a single form within Phoenix.HTML.Form.form_for/4. This schema is then translated into an Ecto.Multi for committing. However, if there's an error, I lose some of the context when translating the errors from the Ecto.Multi back into the schema-of-schemas.

This is all to work around what I see as a deficiency in the Phoenix.HTML.FormData implementation for Ecto.Changeset, specifically form_for_errors/1. It doesn't play nicely with the error state for my construct.

There is likely a better way to do what I'm trying to do, but I'm still wrapping my head around this. :)