Previously in Grokking Applicatives we discovered Applicatives and, more specifically, invented the apply function. We did this by considering the example of validating the fields of a credit card. The apply function allowed us to easily combine the results of validating the number, expiry and CVV individually into a Result<CreditCard, string> that represented the validation status of an entire CreditCard. You might also remember that we somewhat glossed over the error handling when multiple fields were invalid. We took the easy road and just returned the first error that occurred.
An unhappy customer
In the spirit of agile we decided to ship our previous implementation because, well, it was better than nothing. A short while later, customers started complaining. All the complaints were along these lines.
"I entered my credit card details on your site and it took three attempts before it was finally accepted. I submitted the form and each time it gave me a new error. Why couldn't you tell me about all the errors at once?"
To see this more clearly consider a customer that enters the following data, in JSON form.
{
    "number": "a bad number",
    "expiry": "invalid expiry",
    "cvv": "not a CVV"
}
The first time they submit the form they get an error like "'a bad number' is not a valid credit card number". So they fix that and resubmit. Then they get a message like "'invalid expiry' is not a valid expiry date". So they fix that and submit a third time and still receive an error along the lines of "'not a CVV' is not a valid CVV". Pretty annoying!
We should be able to do better and return all of the errors at once. We even pointed out previously that all of the field level validation functions were independent of each other. So there was no good reason not to run all of them and aggregate any errors. We were just being lazy!
A better validation Applicative
Let's start by updating the signature of validateCreditCard to signify our new desire to return all the validation errors that we find.
let validateCreditCard (card: CreditCard): Result<CreditCard, string list>
The only change here is that we're now returning a list of error messages rather than a single one. How should we update our implementation to satisfy this new signature?
Let's return to the apply function that we defined before and see if we can just fix it there. It would be very nice if all we had to do was modify apply and leave validateCreditCard otherwise unchanged.
For reference here's the apply function that we wrote last time, the one that returns the first error it encounters.
let apply a f =
    match f, a with
    | Ok g, Ok x -> g x |> Ok
    | Error e, Ok _ -> e |> Error
    | Ok _, Error e -> e |> Error
    | Error e1, Error _ -> e1 |> Error
We can see from this that it's only the final case where we have multiple errors to deal with, and so it's only there that we need to fix things. The simplest fix is to just concatenate both errors. This has the effect of building up a list of errors each time we call apply with invalid data. Let's see what that looks like.
let apply a f =
    match f, a with
    | Ok g, Ok x -> g x |> Ok
    | Error e, Ok _ -> e |> Error
    | Ok _, Error e -> e |> Error
    | Error e1, Error e2 -> (e1 @ e2) |> Error
That was easy: we just used @ to concatenate the two lists in the case where both sides were Error. Everything else remained the same.
Let's walk through validating the credit card step by step with the bad data that the customer supplied earlier. First we call Ok (createCreditCard) |> apply (validateNumber card.Number). This hits the third case of the pattern match in apply because f is Ok, but the argument a is Error. That gives us something like Error [ "Invalid number" ], whose type is still Result<string -> string -> CreditCard, string list>.
We then pipe this like |> apply (validateExpiry card.Expiry). This hits the final case in the pattern match because now both f and a are Error. This means the @ operator is used to concat the errors together, with the errors already held in f on the left and the new error from a on the right, to create something like Error [ "Invalid number"; "Invalid expiry" ]. The type of this is now Result<string -> CreditCard, string list> because we just need to supply a CVV to finish creating the CreditCard.
So in the final step we do exactly that and pipe this result like |> apply (validateCvv card.Cvv). Just like the last step we hit the case where both f and a are Error and so we concat them. Now we've got something with the type Result<CreditCard, string list>, as we wanted, with a value like Error [ "Invalid number"; "Invalid expiry"; "Invalid CVV" ].
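The walkthrough can be reproduced as a small script. This is only a minimal sketch: the CreditCard record, createCreditCard and the three always-failing validators are stand-ins assumed for illustration, not the definitions from the previous post.

```fsharp
// Assumed shape of the domain type; the real one is defined in the previous post.
type CreditCard =
    { Number: string
      Expiry: string
      Cvv: string }

let createCreditCard number expiry cvv =
    { Number = number; Expiry = expiry; Cvv = cvv }

// The accumulating apply from this post.
let apply a f =
    match f, a with
    | Ok g, Ok x -> g x |> Ok
    | Error e, Ok _ -> e |> Error
    | Ok _, Error e -> e |> Error
    | Error e1, Error e2 -> (e1 @ e2) |> Error

// Hypothetical validators that always fail, mimicking the customer's bad input.
let validateNumber (_: string) : Result<string, string list> = Error [ "Invalid number" ]
let validateExpiry (_: string) : Result<string, string list> = Error [ "Invalid expiry" ]
let validateCvv (_: string) : Result<string, string list> = Error [ "Invalid CVV" ]

// Each step applies one more validated field to the partially applied constructor.
let step1 = Ok createCreditCard |> apply (validateNumber "a bad number")
let step2 = step1 |> apply (validateExpiry "invalid expiry")
let step3 = step2 |> apply (validateCvv "not a CVV")

printfn "%A" step3
```

Because the final case computes e1 @ e2, with e1 coming from f, the errors end up in the order the fields were applied: number, then expiry, then CVV.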
A small compile time error
You might have spotted that we've actually changed the type of the apply function now. By using the @ operator F# has inferred that the errors must be a list. So now the signature of apply is Result<T, E list> -> Result<T -> V, E list> -> Result<V, E list>.
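To make the inferred type visible, apply can be written with explicit annotations. The sketch below does that and then uses it with a non-string error type; FieldError is a hypothetical union invented for this example.

```fsharp
// Same behaviour as the accumulating apply, with the inferred types spelled out.
let apply
    (a: Result<'T, 'E list>)
    (f: Result<('T -> 'V), 'E list>)
    : Result<'V, 'E list> =
    match f, a with
    | Ok g, Ok x -> Ok (g x)
    | Error e, Ok _ -> Error e
    | Ok _, Error e -> Error e
    | Error e1, Error e2 -> Error (e1 @ e2)

// Hypothetical error type, only here to show that 'E need not be string.
type FieldError =
    | Missing
    | Malformed

let failedFunction : Result<(int -> string), FieldError list> = Error [ Missing ]
let failedArgument : Result<int, FieldError list> = Error [ Malformed ]

// Both sides are Error, so the two lists are concatenated.
let combined = failedFunction |> apply failedArgument
```

The annotations add nothing the compiler did not already infer; they simply document that the error side must be a list of a single element type.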
We now have an apply that works for Result<T, E list>. That is, it works for any results where the errors are contained in a list, rather than being single values like a string. There are a couple of interesting points to make about this:
1. The errors in the list can be any type, providing they're all of the same type.
2. All of our validated results must now have a list of errors if we want to use them with apply.
Point 1 is useful because it allows us to model our errors in more meaningful ways than just using strings, although for the rest of this post we'll keep using string in order to keep it simple. Modelling errors deserves a blog post of its own.
Point 2, however, causes us a little problem we have to solve here. Our original field level validation functions are still returning Result<string, string>, so they no longer work with our new version of apply.
We have two choices when it comes to fixing this issue. We could keep the functions as they are and transform their outputs by wrapping the error of the result, if it exists, in a list, which might look something like this.
let validateCreditCard (card: CreditCard): Result<CreditCard, string list> =
    let liftError result =
        match result with
        | Ok x -> Ok x
        | Error e -> Error [ e ]

    Ok (createCreditCard)
    |> apply (card.Number |> validateNumber |> liftError)
    |> apply (card.Expiry |> validateExpiry |> liftError)
    |> apply (card.Cvv |> validateCvv |> liftError)
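As an aside, liftError does not have to be written by hand: FSharp.Core already provides Result.mapError and List.singleton, which compose to the same thing. A small sketch, where validateCvv is a hypothetical single-error validator standing in for the original field level functions:

```fsharp
// Hypothetical single-error validator (the rule itself is an assumption).
let validateCvv (cvv: string) : Result<string, string> =
    if cvv.Length = 3 then Ok cvv else Error "Invalid CVV"

// Hand-written version, as in the post.
let liftError result =
    match result with
    | Ok x -> Ok x
    | Error e -> Error [ e ]

// Equivalent version built from FSharp.Core helpers.
let liftErrorViaMap result =
    result |> Result.mapError List.singleton

let byHand = validateCvv "12" |> liftError
let viaMap = validateCvv "12" |> liftErrorViaMap
let valid = validateCvv "123" |> liftErrorViaMap
```

Both versions leave an Ok untouched and wrap a single error in a one-element list.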
The other choice is to update those field validation functions so that they return Result<string, string list> as required. It might be tempting to take the first choice, and if we had no control over those functions we'd have to do that. However, by letting those field level functions return a list we give them the flexibility to do more complex validation and potentially indicate multiple errors.
For instance the validateNumber function could indicate both a problem with the length and the presence of invalid characters like this.
open System // needed for Char.IsDigit

let validateNumber num: Result<string, string list> =
    // Start with the length check.
    let errors =
        if String.length num > 16 then
            [ "Too long" ]
        else
            []

    // Then add the character check on top of any existing errors.
    let errors =
        if num |> Seq.forall Char.IsDigit then
            errors
        else
            "Invalid characters" :: errors

    if errors |> Seq.isEmpty then
        Ok num
    else
        Error errors
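The same idea can be written a little more compactly with a list expression that yields one message per failed check. This is just an alternative sketch of the function above, not a different rule set; note that it reports the errors in the order the checks are listed, whereas prepending with :: reports the most recent check first.

```fsharp
open System

let validateNumber (num: string) : Result<string, string list> =
    // Each failed check contributes one message; passing checks contribute nothing.
    let errors =
        [ if String.length num > 16 then
              yield "Too long"
          if not (num |> Seq.forall Char.IsDigit) then
              yield "Invalid characters" ]

    if List.isEmpty errors then Ok num else Error errors

let good = validateNumber "4111111111111111"
let badChars = validateNumber "12ab"
let badBoth = validateNumber "1234-5678-9012-3456-7"
```

Here badBoth fails both checks, so a single call reports both problems at once.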
Using Result<T, E list> throughout gives us a more composable and flexible API that allows us to refactor the errors returned from those functions in the future without affecting the rest of the program.
So given that they're functions in our domain, we'll take that approach. Let's give that a try and see what it looks like all together when using this new version of apply.
let validateNumber num: Result<string, string list> =
    if String.length num > 16 then
        Error [ "Too long" ]
    else
        Ok num

let validateExpiry expiry: Result<string, string list> =
    // validate expiry and return all errors we find
    Ok expiry // placeholder so the example compiles; the real checks are elided

let validateCvv cvv: Result<string, string list> =
    // validate cvv and return all cvv errors we find
    Ok cvv // placeholder so the example compiles; the real checks are elided

let validateCreditCard (card: CreditCard): Result<CreditCard, string list> =
    Ok (createCreditCard)
    |> apply (validateNumber card.Number)
    |> apply (validateExpiry card.Expiry)
    |> apply (validateCvv card.Cvv)
Lovely job! Apart from a couple of small changes to lift the errors up into lists within validateNumber etc., the rest has stayed the same. In particular, the body of validateCreditCard is completely unchanged.
Do I have to use list for the errors?
The only requirement we've placed on the error type is that we can use the @ operator to concat the errors together. So as long as the errors are concat-able we can use a different type here. The fancy category theory name for this is a semigroup. A semigroup is anything that has an associative concat operator defined for it. A common type to use here is a NonEmptyList, because we know that if the result is an Error then there'll be at least one item in the list.
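To illustrate, here is a minimal sketch of apply over a hand-rolled NonEmptyList. The type and its helper functions are assumptions made up for this example; libraries that ship a non-empty list will have their own API.

```fsharp
// Hypothetical NonEmptyList: a guaranteed head plus a possibly empty tail.
type NonEmptyList<'a> = { Head: 'a; Tail: 'a list }

module NonEmptyList =
    let singleton x = { Head = x; Tail = [] }

    // Associative concatenation, which is all a semigroup asks for.
    let append a b =
        { Head = a.Head
          Tail = a.Tail @ (b.Head :: b.Tail) }

// apply only needs "some way to concat errors", here NonEmptyList.append.
let apply a f =
    match f, a with
    | Ok g, Ok x -> Ok (g x)
    | Error e, Ok _ -> Error e
    | Ok _, Error e -> Error e
    | Error e1, Error e2 -> Error (NonEmptyList.append e1 e2)

let pair (x: int) (y: int) = (x, y)

let result =
    Ok pair
    |> apply (Error (NonEmptyList.singleton "first"))
    |> apply (Error (NonEmptyList.singleton "second"))
```

The only line that changed compared to the list version is the final case, where @ is swapped for the semigroup's own concat.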
A tale of two applicatives
We've seen two implementations of apply for Result now. Can we have both? Unfortunately not really, at least not both defined in the Result module of F#. In order to do this F# would have to be able to decide which one to use based on whether the error type supported concat, which might not even be obvious without an explicit type annotation. Even then we might get undesired results because strings support concat, but it's unlikely we want to concat the individual error messages into one long string.
How should we decide which one is correct then? Well, we don't have to. We can define another type called Validation which has a Success case and a Failure case, similar to Ok and Error for Result. The difference is that for Validation we can define apply using the version we've created in this post, which accumulates errors, and for Result use the apply function that short circuits and returns the first error, which we saw in the last post. Luckily for us the excellent FSharpPlus library has already done exactly that.
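To see the two behaviours side by side, here is a hand-rolled sketch. The Validation type below is an assumption written for this example and fixes the error side to a list; it is not the FSharpPlus definition, whose shape and API may differ.

```fsharp
// Hypothetical Validation type, not the one shipped by FSharpPlus.
type Validation<'a, 'e> =
    | Success of 'a
    | Failure of 'e list

module Validation =
    // Accumulates errors from both sides.
    let apply a f =
        match f, a with
        | Success g, Success x -> Success (g x)
        | Failure e, Success _ -> Failure e
        | Success _, Failure e -> Failure e
        | Failure e1, Failure e2 -> Failure (e1 @ e2)

module ResultShortCircuit =
    // Keeps only the first error, like the apply from the previous post.
    let apply a f =
        match f, a with
        | Ok g, Ok x -> Ok (g x)
        | Error e, _ -> Error e
        | Ok _, Error e -> Error e

let add (x: int) (y: int) = x + y

let accumulated =
    Success add
    |> Validation.apply (Failure [ "bad x" ])
    |> Validation.apply (Failure [ "bad y" ])

let firstOnly =
    Ok add
    |> ResultShortCircuit.apply (Error "bad x")
    |> ResultShortCircuit.apply (Error "bad y")
```

With separate types, the choice between accumulating and short circuiting is made by picking the type, so the compiler never has to guess which apply was meant.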
What did we learn
We've seen that applicatives are a great tool to have at our disposal when writing validation code. They allow us to write validation functions for each field and then easily compose these to create functions for validating larger structures made up of those fields.
We've also seen that whilst applicative computations are usually independent of each other, there's nothing to guarantee that a particular implementation of apply will make full use of this. Specifically, when working with validation we want to make sure that apply accumulates all errors, and so we should make sure to use a type like Validation from FSharpPlus to get this behaviour.
Top comments (6)
I am not sure it is really a good idea to change the individual validation functions to return a list of errors when they return a single error. If one day one of the functions needs to return 2 errors, then (and only then) I will change its signature, but not earlier. Why are you (and others) so relaxed about (wrong) function signatures in this case?
Hi Deyanp, I don't think there is a universally "right" signature for representing the errors returned from a validation function. Like I point out above, you can either leave it as returning a single error and then do the conversion inside the parent validation function to map it into a list of one error so the types line up with apply, or you can decide to refactor the field level function to return a list. You should do whatever makes most sense for your situation. However, I've personally found that always returning a list of errors makes the code more consistent, easier to compose and less fragile in the face of future changes. On a more philosophical note, a single error is just a special case of the more general situation whereby there might be several disjoint things wrong with the data. Therefore I think a list of errors is actually the more natural representation. I'm struggling to think of a case whereby I really need to enforce some semantics that only one error could ever be returned at a time. I guess you can make the YAGNI argument against prematurely converting to a list but the change seems fairly localised here.
Hi Matt,
I still cannot get my head around why a list of errors would be the more natural representation ....
List can be regarded as an "Effect" as per Scott Wlaschin's "Effect World". Why would you change the signature of a function to return a List instead of a single string, when it really validates a single thing?
In my functions I am trying to have the most correct and minimal/most primitive types in the signature. I do not have my functions return an Option if they don't need to. I also do not return Result if not needed. Actually, when I look at a function and see that it has a Result return type but only returns Ok, then I go and change the signature and remove the Result.
This is obviously philosophical and in no means challenging your excellent (series of) article(s), I am just trying to wrap my head around the question why you and so many other people seem to be relaxed about function signatures when it comes to applicative validation ... I would Result.mapError List.singleton in the orchestration function ...
Best regards,
Deyan
Don't worry about the critique, I like the discussion, it's part of the reason I wanted to start posting on here.
I think there are cases when having a function return a degenerate value is fine. For example, if I was writing C# and I was implementing an interface that required me to return Task<string>, but in my particular implementation I didn't need to do any async work then I would just implement it with Task.FromResult("not async"). So viewed from that angle you could think of Result<'a, 'e list> as the most permissive interface that a validation function could have. Therefore by using this consistently it allows looser coupling between the validation functions working at different levels of the data.
You said:
It's worth noting that whilst we're validating a single piece of data, it's not the Ok value that we're turning into a list here, it's the Error case, so what the signature is really saying is that there might be multiple validation errors for this single piece of data.
As for lists being effects, that is certainly one way to interpret them. If I'm not mistaken it represents the effect of running several computations. A Result is also an effect, one that represents a computation that might fail. So from that viewpoint a Result<'a, 'e list> could be interpreted as "this function will run several validation computations on this single piece of data and if they're all fine it will just return the single piece of data otherwise it will return the outputs from all of the failed validation computations". That to me sounds like quite a general statement about how validation works and therefore makes for a type that can encapsulate many different validation computations.
Thanks for your patience, Matt! I think it will be over after you read the below ;)
I understand that you want to extend the output set of possible values of the function in order to generalize its interface to theoretically handle more than 1 error.
The fact of the matter is that the current version of the function(s) is returning only a single error though, and a Result<_, string list> may not be really justified by the function when looking at it in isolation.
The Result<_, string list> is solely required so that this function can be used in a parent/"orchestration" function utilizing applicative validation. So the "innocent" function knows this parent/orchestration context now ...
Let me ask you a similar question - imagine you have an orchestration/workflow function returning Result<_, OrchestrFunctionErrorDU>, and invoking 2 simple "worker" functions that are reusable in different contexts. Which variant of the 2 below would you choose?
OPTION 1
OPTION 2
The difference is that in Option 1 the worker functions do not know about the higher level context and its error DU, and return some primitive error type (in this case string). In Option 2 they do know about the higher-level error DU, and they use it by pretending (imho) to be able to return both error cases, but of course returning only one of them.
I think you would choose option 2 ... I would choose option 1 but still trying to understand why people would choose option 2 ...
P.S. if you need the result CE (from Scott Wlaschin) to make the above work in fsx here it is:
Yes, in this case I would take option 1, because OrchestrFunctionErrorDU has nothing to do with the lower level functions. However, I don't think this is the same case, although it might appear to be.
My primary motivation for choosing to use Result<'a, 'e list> is not because the parent function needs an error list in order for apply to work, but instead because it's a good representation of a validation computation in its own right. The argument being that when validating any data it's not unreasonable to expect to encounter several errors.
It just also happens that in this case it does make the parent composition easier too, but that's a secondary reason for doing it that way.
If I were actually modelling the errors then I would likely use a DU to describe the errors cases for each field and then unify those with a DU at the parent level (similar to your example). In that case I would only lift those errors into the parent DU within the parent function, because as you rightly say the child validation function should have no knowledge of that parent level DU type.
So I normally expect to do some error mapping in the parent function, but in the case of whether or not to return a list of errors I choose to use a list in the child function because above all else I believe it is a better api for that function, which allows that function to evolve more independently in the future.
I guess another way to think about this is the locality of any future changes. If I were to change the validateChild function in the future and it now started returning several errors instead of one I would be forced to also go and fix validateParent which called that. However, in the scenario where I'm unifying errors in the parent ParentErrorDU type then that is something that is defined at the same abstraction level as validateParent (probably in the same F# module) and so if I change that type then I'm going to have to fix the error mappings in validateParent but I'm OK with that because that is a change at the same level of abstraction. I don't however have to go and change any of the child validation functions just because ParentErrorDU changed.
So by returning a list of errors from the child and doing any parent level DU mapping in the parent we've achieved high cohesion and low coupling even though they look like they're contradictory design choices.
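A rough sketch of that arrangement might look like the following. The names validateChild, validateParent and ParentErrorDU come from the discussion above; the ChildError cases, validateName and the rules themselves are assumptions made up for illustration.

```fsharp
open System

// Hypothetical child level errors; the child knows nothing about the parent DU.
type ChildError =
    | TooLong
    | InvalidCharacters

// Parent level DU that unifies the child errors with its own.
type ParentErrorDU =
    | ChildInvalid of ChildError
    | NameMissing

let apply a f =
    match f, a with
    | Ok g, Ok x -> Ok (g x)
    | Error e, Ok _ -> Error e
    | Ok _, Error e -> Error e
    | Error e1, Error e2 -> Error (e1 @ e2)

// The child already returns a list, so it can grow new errors independently.
let validateChild (value: string) : Result<string, ChildError list> =
    let errors =
        [ if value.Length > 16 then
              yield TooLong
          if not (value |> Seq.forall Char.IsDigit) then
              yield InvalidCharacters ]

    if List.isEmpty errors then Ok value else Error errors

let validateName (name: string) : Result<string, ParentErrorDU list> =
    if String.IsNullOrWhiteSpace name then Error [ NameMissing ] else Ok name

// Only the parent lifts child errors into ParentErrorDU.
let validateParent (name: string) (child: string) : Result<string * string, ParentErrorDU list> =
    Ok (fun n c -> (n, c))
    |> apply (validateName name)
    |> apply (validateChild child |> Result.mapError (List.map ChildInvalid))
```

If validateChild later reports more errors, validateParent does not change; if ParentErrorDU changes, only the mapping inside validateParent does.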