I’m actually very torn about this part of the project. Let me explain.
When I first took on this project, I knew validation was going to be extremely important, so after adding the base64 functionality, I went through and validated each input the user could send. The first, and to me the most important, was the base64 image string. While I had a bit more experience working with base64 images, I wasn’t very confident in my ability to validate them. Even now I don’t think the app does a great job of checking that a given string is valid base64, much less one that can be decoded into an image. Base64 image strings often come wrapped in a data URI, with a tag in front (for example "data:image/png;base64,") that indicates the data type of the encoded data. So remember the Regex I talked about before? It looks for strings with these tags and pulls out the data after them. The problem is that base64 encoding itself outputs only a string of characters, without any of these tags and in particular without the data type tag. I currently don’t know of a mechanism to identify whether a bare base64 string is actually an image or something else (potentially malicious commands). Honestly, just writing my thoughts down makes me realize that this is a potential security concern. I have to do more research on it and hopefully find a better solution than what I have now.
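One approach worth researching here is to decode the string strictly and then compare the first few bytes against the well-known signatures ("magic bytes") of common image formats. The sketch below is not the code from this project; it is written in Python purely for illustration, the names `sniff_image_type`, `DATA_URI_PREFIX` and `IMAGE_SIGNATURES` are made up, and it only covers PNG, JPEG and GIF.

```python
import base64
import binascii
import re
from typing import Optional

# Optional data-URI tag such as "data:image/png;base64," in front of the payload.
DATA_URI_PREFIX = re.compile(r"^data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,")

# Leading "magic bytes" for a few common image formats (illustrative, not exhaustive).
IMAGE_SIGNATURES = {
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_type(payload: str) -> Optional[str]:
    """Return the detected image MIME type, or None if the payload is rejected."""
    match = DATA_URI_PREFIX.match(payload)
    declared = match.group("mime") if match else None
    encoded = payload[match.end():] if match else payload

    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        raw = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError):
        return None

    for mime, signatures in IMAGE_SIGNATURES.items():
        if raw.startswith(signatures):
            # If the caller declared a type, it has to agree with the bytes.
            if declared is not None and declared != mime:
                return None
            return mime
    return None
```

A matching signature only shows that the data starts like an image. Fully decoding it with an image library, and rejecting anything that fails to decode, is the stronger check.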
Thankfully, a few of the other inputs are significantly easier to validate, though one I spent a long time on was the image transformation endpoint. The other endpoints took either just the base64 image or the base64 image plus a primitive to validate. Image transformation took an object in which at least one of the four transformations had to be specified for the request to be valid. This made validation a bit tricky: I had to check whether any of those attributes was present in the object, figure out which one it was, and then validate it properly. It was a complicated process, but I ultimately figured it out. Validation for the other endpoints was significantly simpler, because validating a primitive is much less work.
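The "at least one of four attributes, each with its own rule" check described above can be sketched with a lookup table of per-field validators. This is a minimal illustration in Python, not the project's actual code: the four transformation names (`resize`, `rotate`, `grayscale`, `flip`), their value rules, and the function names are all hypothetical stand-ins.

```python
def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_positive_int(value):
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


# Hypothetical transformation names mapped to a check for their value.
TRANSFORM_VALIDATORS = {
    "resize": lambda v: isinstance(v, dict)
    and _is_positive_int(v.get("width"))
    and _is_positive_int(v.get("height")),
    "rotate": lambda v: _is_number(v) and -360 <= v <= 360,
    "grayscale": lambda v: isinstance(v, bool),
    "flip": lambda v: v in ("horizontal", "vertical"),
}


def validate_transformations(options):
    """Return a list of error messages; an empty list means the object is valid."""
    if not isinstance(options, dict):
        return ["transformations must be an object"]

    errors = []
    # Report attributes that are not one of the known transformations.
    for key in options:
        if key not in TRANSFORM_VALIDATORS:
            errors.append(f"unknown transformation {key!r}")

    # At least one known transformation has to be present.
    present = [key for key in options if key in TRANSFORM_VALIDATORS]
    if not present:
        errors.append("at least one transformation is required")

    # Validate each transformation that was actually supplied.
    for key in present:
        if not TRANSFORM_VALIDATORS[key](options[key]):
            errors.append(f"invalid value for {key!r}")
    return errors
```

Keeping the rules in one table means adding a fifth transformation is a one-line change, and the "which attribute is it?" question is answered by the dictionary lookup rather than a chain of conditionals.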
So what has me torn is the amount of work I’ve put into my validation versus its actual validity (hehe, word play). I’ve put several hours into these solutions and I’m proud of the code I wrote, but from a security perspective, at the very least for the base64 images, I don’t know if this is code I would want to rely on in production or put between myself and bad actors. Yeah, the validation is there mostly for people who send bad data by accident or simply don’t know how to use the API, but it also has to defend against people purposefully trying to break the system, and I don’t think I put in enough engineering time to solve that problem.
Regardless, I think for the next part of this project, I’ll rip out my validation solution and use a more trusted package to do validation.
(Shout-out to @swyx for letting me know that I shouldn’t be rolling my own validation. Honestly, I wouldn’t have even thought of it if he hadn’t pointed that out.)
Here are commits for some of the validation code I wrote:
In the next article in the series, I’ll talk about actually implementing the image manipulation for the API.