How To Write A Custom Elixir Schema Validation

This week I ventured into new territory and wrote a custom schema validation for the Elixir/Phoenix project I've been working on.

Ecto.Changeset has a good number of prebuilt validators that will accomplish most tasks, but if it's necessary to validate in a way the built-in validators don't cover, that can be done with a custom validation. A custom validation function can be just about anything; the only requirement is that it returns a changeset, just like a built-in validation does. This example deals with three fields: revenue, expense, and net_gain. Ultimately, the validation is that revenue minus expense equals net_gain:
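The listing itself isn't reproduced here, so what follows is only a minimal sketch of what such a function could look like. The Report module, its embedded schema, and the error message are assumptions made for illustration, and the function is public only so it can be called directly:

```elixir
# Fetches Ecto for a standalone script; inside a Mix/Phoenix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

# Hypothetical schema, used only to have something to validate.
defmodule Report do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :revenue, :integer
    field :expense, :integer
    field :net_gain, :integer
  end

  # Custom validation: takes a changeset in, hands a changeset back.
  def mathematical_validation(changeset) do
    revenue = get_field(changeset, :revenue)
    expense = get_field(changeset, :expense)
    net_gain = get_field(changeset, :net_gain)

    case revenue - expense == net_gain do
      # The math checks out, pass the changeset along untouched.
      true -> changeset
      # The math is off, attach an error to :net_gain (message is an assumption).
      false -> add_error(changeset, :net_gain, "must equal revenue minus expense")
    end
  end
end

good =
  %Report{}
  |> Ecto.Changeset.change(%{revenue: 100, expense: 40, net_gain: 60})
  |> Report.mathematical_validation()

bad =
  %Report{}
  |> Ecto.Changeset.change(%{revenue: 100, expense: 40, net_gain: 10})
  |> Report.mathematical_validation()

IO.inspect({good.valid?, bad.valid?})
```

Note that this version assumes all three fields are present; a nil value would make the subtraction raise.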

This function utilizes two helper functions: get_field/3 and add_error/4. These are Ecto.Changeset helpers and can be accessed by adding import Ecto.Changeset to the top of the module. The docs list a few other functions that could be convenient in different scenarios, like get_change/3, fetch_field/2, and fetch_change/2, but for this purpose get_field/3 fits the bill.
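To make the difference between those helpers concrete, here is a small sketch. The Report schema is again hypothetical: revenue already lives on the struct, while expense arrives as a change:

```elixir
# Fetches Ecto for a standalone script; inside a Mix/Phoenix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

# Hypothetical schema for illustration.
defmodule Report do
  use Ecto.Schema

  embedded_schema do
    field :revenue, :integer
    field :expense, :integer
    field :net_gain, :integer
  end
end

import Ecto.Changeset

# revenue comes from the existing data, expense from the incoming params.
changeset = cast(%Report{revenue: 100}, %{expense: 40}, [:revenue, :expense, :net_gain])

# get_field/3 looks in the changes first, then falls back to the data.
IO.inspect(get_field(changeset, :revenue))   # 100
IO.inspect(get_field(changeset, :expense))   # 40

# get_change/3 only looks at the changes.
IO.inspect(get_change(changeset, :revenue))  # nil

# fetch_field/2 also reports where the value came from.
IO.inspect(fetch_field(changeset, :revenue)) # {:data, 100}
IO.inspect(fetch_field(changeset, :expense)) # {:changes, 40}

# fetch_change/2 returns :error when the field has no change.
IO.inspect(fetch_change(changeset, :revenue)) # :error
IO.inspect(fetch_change(changeset, :expense)) # {:ok, 40}
```

Because get_field/3 falls back to the existing data, it suits a validation that should also hold when only one of the three fields is being updated.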

In both branches of the case statement, a changeset is returned. If the math checks out, the validation passes and the changeset is simply passed on. If the math does not match up, an error is tacked onto the :errors field of the changeset and the changeset is returned.

With the custom validation function taking in (and ultimately returning) the changeset, it can be piped in anywhere in a schema's validation pipeline, like any other Ecto-provided validation:
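A sketch of such a pipeline, assuming a hypothetical Report schema and a changeset/2 function (neither name comes from the original project):

```elixir
# Fetches Ecto for a standalone script; inside a Mix/Phoenix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Report do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :revenue, :integer
    field :expense, :integer
    field :net_gain, :integer
  end

  def changeset(report, attrs) do
    report
    |> cast(attrs, [:revenue, :expense, :net_gain])
    |> validate_required([:revenue, :expense, :net_gain])
    # The custom validation slots in like any built-in one.
    |> mathematical_validation()
  end

  defp mathematical_validation(changeset) do
    revenue = get_field(changeset, :revenue)
    expense = get_field(changeset, :expense)
    net_gain = get_field(changeset, :net_gain)

    case revenue - expense == net_gain do
      true -> changeset
      false -> add_error(changeset, :net_gain, "must equal revenue minus expense")
    end
  end
end

valid = Report.changeset(%Report{}, %{revenue: 100, expense: 40, net_gain: 60})
invalid = Report.changeset(%Report{}, %{revenue: 100, expense: 40, net_gain: 10})

IO.inspect({valid.valid?, invalid.valid?})
```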

One caveat I found is that preceding validations in the pipeline are not automatically recognized by a custom validation. For instance, if validate_required/3 found an empty field, an error would be added to the changeset, and the changeset would keep working its way down the pipeline of validations. When that empty field gets to the custom validation function, it needs to be handled in some way, or it could raise unexpected errors (subtracting with a nil value raises an ArithmeticError, for example).

This can be dealt with by incorporating a changeset.valid? check into the function. valid? is set to false as soon as any error is added to the changeset:
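One way this could look is sketched below. The tagged tuples are my own device for telling the two failure cases apart in the else block, and the Report schema and changeset/2 function are hypothetical:

```elixir
# Fetches Ecto for a standalone script; inside a Mix/Phoenix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Report do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :revenue, :integer
    field :expense, :integer
    field :net_gain, :integer
  end

  def changeset(report, attrs) do
    report
    |> cast(attrs, [:revenue, :expense, :net_gain])
    |> validate_required([:revenue, :expense, :net_gain])
    |> mathematical_validation()
  end

  defp mathematical_validation(changeset) do
    revenue = get_field(changeset, :revenue)
    expense = get_field(changeset, :expense)
    net_gain = get_field(changeset, :net_gain)

    # The subtraction is only evaluated once the valid? check has passed.
    with {:valid, true} <- {:valid, changeset.valid?},
         {:math, true} <- {:math, revenue - expense == net_gain} do
      changeset
    else
      # An earlier validation already failed, so skip the math entirely.
      {:valid, false} -> changeset
      {:math, false} -> add_error(changeset, :net_gain, "must equal revenue minus expense")
    end
  end
end

# expense is missing: validate_required reports it and the math is never attempted.
missing = Report.changeset(%Report{}, %{revenue: 100, net_gain: 60})
IO.inspect(Keyword.keys(missing.errors)) # prints [:expense]
```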

The logic shifts from a case statement to a with expression, to better handle the multiple checks. If you're not familiar with the with expression, I wrote a pretty detailed explanation of it. Back to the logic: the catch with simply incorporating changeset.valid? is that it's a broad check. If there are any errors at all, it will return false. It works as long as the validate_required check before it covers only the three fields the custom validation cares about.

But, if an engineer were to come in later and add another field to the validate_required check, one that has nothing to do with the custom validation after it, it could result in a false positive in the first bit of logic of mathematical_validation/1. For example:
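The sketch below illustrates the scenario. The first_name field comes from the text; the Report schema, the changeset/2 function, and the tagged-tuple style are assumptions:

```elixir
# Fetches Ecto for a standalone script; inside a Mix/Phoenix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Report do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :first_name, :string
    field :revenue, :integer
    field :expense, :integer
    field :net_gain, :integer
  end

  def changeset(report, attrs) do
    report
    |> cast(attrs, [:first_name, :revenue, :expense, :net_gain])
    # first_name was added later and is unrelated to the math.
    |> validate_required([:first_name, :revenue, :expense, :net_gain])
    |> mathematical_validation()
  end

  defp mathematical_validation(changeset) do
    revenue = get_field(changeset, :revenue)
    expense = get_field(changeset, :expense)
    net_gain = get_field(changeset, :net_gain)

    with {:valid, true} <- {:valid, changeset.valid?},
         {:math, true} <- {:math, revenue - expense == net_gain} do
      changeset
    else
      {:valid, false} -> changeset
      {:math, false} -> add_error(changeset, :net_gain, "must equal revenue minus expense")
    end
  end
end

# first_name is blank AND the math is wrong (100 - 40 != 10).
changeset = Report.changeset(%Report{}, %{revenue: 100, expense: 40, net_gain: 10})

# Only the blank first_name is reported; the bad math goes unnoticed.
IO.inspect(Keyword.keys(changeset.errors)) # prints [:first_name]
```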

In the above example, if the first_name field was empty, an error would be added to the changeset via validate_required. The first check of mathematical_validation/1 would fail, resulting in the logic portion of the validation not being checked at all, even though the error was not in any of the three fields that function is concerned about.

A more verbose strategy would be to check for the presence of the fields explicitly before stepping into the logic to be performed, like this:
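A sketch of that explicit approach, under the same assumptions as before (hypothetical Report schema and changeset/2 function, made-up error message):

```elixir
# Fetches Ecto for a standalone script; inside a Mix/Phoenix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Report do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :first_name, :string
    field :revenue, :integer
    field :expense, :integer
    field :net_gain, :integer
  end

  def changeset(report, attrs) do
    report
    |> cast(attrs, [:first_name, :revenue, :expense, :net_gain])
    |> validate_required([:first_name, :revenue, :expense, :net_gain])
    |> mathematical_validation()
  end

  defp mathematical_validation(changeset) do
    revenue = get_field(changeset, :revenue)
    expense = get_field(changeset, :expense)
    net_gain = get_field(changeset, :net_gain)

    # Check only the fields this validation depends on, then do the math.
    with false <- is_nil(revenue),
         false <- is_nil(expense),
         false <- is_nil(net_gain),
         true <- revenue - expense == net_gain do
      changeset
    else
      # One of the three fields is nil: leave reporting to validate_required.
      true -> changeset
      # All three are present but the math is off.
      false -> add_error(changeset, :net_gain, "must equal revenue minus expense")
    end
  end
end

# Blank first_name and wrong math: both problems are now reported.
both = Report.changeset(%Report{}, %{revenue: 100, expense: 40, net_gain: 10})
IO.inspect(Keyword.keys(both.errors))

# A missing expense no longer reaches the subtraction.
missing = Report.changeset(%Report{}, %{first_name: "Ada", revenue: 100, net_gain: 60})
IO.inspect(Keyword.keys(missing.errors))
```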

Now, before the mathematical validation is performed, it is checking that all the necessary fields are present. There is some redundancy with the validate_required function that is called before this custom function in the pipeline. This is where a rabbit-hole could start. Either the redundancy is accepted, or steps could be taken to remove the validate_required check entirely and add appropriate error messages to each of the nil checks in mathematical_validation/1 individually.

Properly handling the incoming errors is probably the trickiest part of writing a custom validation. But once the error handling has been considered, literally any logic can be used as a validation as long as a changeset is returned, and that's pretty powerful.

This post is part of an ongoing This Week I Learned series. I welcome any critique, feedback, or suggestions in the comments.

Discussion (2)

Max Katz

Great write-up! One other way I would maybe handle the need to check for those fields' existence is by having another function clause for mathematical_validation/1:

defp mathematical_validation(changeset = %Ecto.Changeset{errors: errors}) when length(errors) > 0, do: changeset

defp mathematical_validation(changeset) do

This cleans up the actual logic for this function, of course this wouldn't work I guess if you didn't have the validate_required check before it in the pipeline, so it all kinda depends on the requirement. Mostly I just wanted to say hiiii, long time no talk :)

Noel Worden Author

Yeah, I think I started with something like this, but the feedback in the PR was that if other, unrelated validations were inserted in the wrong place in the pipeline then the errors field might show a false positive, leading to this custom function not running at all.

But, it obviously wasn't too bad of an idea if that's what you thought too!

Sorry for the late response, kind of dropped the ball.

