There are so many Helm charts out there, and it's very tempting to just pick one and go, but making a rash decision can come back and haunt you later down the road.
I have to admit that the official Helm repository has come a long way: it used to serve mostly half-baked Charts, but by now most of them even work!
So, before you blindly pick a Helm Chart and incorporate it into your project, I recommend going over the list below and making sure that you do your due diligence.
tldr: Things you should keep in mind when choosing a Helm Chart:
- What does a good implementation look like?
- Is this the right repo for this application?
- Read the documentation - Readme.md
- Is this project active enough?
- Is this chart stable?
- Will you have to modify it, or can you use it as is?
- Is the Chart overkill for my requirements?
- What about security?
- Take it for a test drive.
A Helm Chart is there to make your life easier. It allows you to use public or official knowledge on how to optimally implement an application. Helm Charts also tend to be very generalized to support multiple use cases, and by doing so may lead you to an implementation that is not "right" for your requirements.
To avoid this pitfall, and to be able to make the right decision when picking a Chart, you need some understanding of the underlying solution.
The first thing I do, before even thinking about Kubernetes, is take the time to research how the application works.
- What does a successful implementation look like?
- Is there a way to make it highly available? If so, what are my options?
- What about security? TLS, Hardening, AAA
- What are my options for running this solution at scale?
Only then do I move on to Kubernetes. With the building blocks in mind, my decision is more informed and less cargo-culted.
I then look at how the Chart handles high availability, application versions, connectivity, and scale:
Some Charts implement redundancy in ways that may or may not fit your overall architecture. Sometimes the Chart is written in a way that is geared towards a different use case.
One example is Redis Cluster vs. Redis Sentinel. These are two different implementations of HA for Redis, and there are separate Helm Charts for each.
There are cases where charts support specific versions of an application and could be a little behind the latest releases.
Chart elements such as ConfigMaps and Secrets that contain the application's settings may break with the latest release.
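One way to spot a lagging Chart is to compare the `appVersion` field from its `Chart.yaml` with the release you actually plan to run. The snippet below is a minimal sketch, assuming plain dotted numeric versions; the function names and the sample version strings are made up for illustration.

```python
import re


def parse_version(text):
    """Turn a dotted version such as 'v7.10.2' into a tuple of ints (7, 10, 2)."""
    # Assumption: versions are plain dotted numbers, optionally prefixed with 'v'.
    # Anything after the numeric part (for example '-rc1') is ignored by this sketch.
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def chart_is_behind(chart_app_version, wanted_version):
    """Return True when the Chart packages an older release than the one you want."""
    chart = parse_version(chart_app_version)
    wanted = parse_version(wanted_version)
    # Pad the shorter tuple with zeros so that '6.8' and '6.8.0' compare as equal.
    width = max(len(chart), len(wanted))
    chart += (0,) * (width - len(chart))
    wanted += (0,) * (width - len(wanted))
    return chart < wanted


if __name__ == "__main__":
    # Hypothetical values: appVersion copied from Chart.yaml vs. the release you need.
    print(chart_is_behind("6.8.0", "7.10.2"))
```

If the check reports that the Chart is behind, read the application's release notes before overriding the image tag, since the settings templated into the Chart may no longer match.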
How will your applications find and work with each other?
The Kafka Chart used to support internal connections but failed to solve the problem of exposing the brokers outside the cluster. That is no longer the case, but it's an excellent example of a potential time sink that could be avoided.
A quick look at how the Kubernetes services are configured should give you a clear view of how to interact with the application.
Most Charts are built for scale, but that may look a little different for your use case. Make sure that your view of "scale" corresponds with the Chart.
Well, I'm betting that you started your journey with a quick Google search for a Chart. Then one of four things happened:
- You ended up on the official Helm repository by clicking on the first result.
- The top result is a blog post from the company that created the service (elastic.co, for example). It may lead you to their official Helm repo.
- A collection of very random GitHub projects.
- No real results.
I usually go back and forth between scenario #1 and #2, and try to figure out which one answers my requirements better. Both are good starting points.
If I end up on the third scenario, it usually means that I'm only going to use the projects as inspiration and try not to use them "as is".
If there are no results at all, well, you have a lot of work ahead of you.
Every good Chart has an informative and useful Readme file. Even if the documentation is very short, it could give you valuable information or point you in a better direction.
This Chart is primarily intended to be used for YARN and MapReduce job execution where HDFS is just used as a means to transport small artifacts within the framework and not for a distributed filesystem. Data should be read from cloud based datastores such as Google Cloud Storage, S3 or Swift.
Taken from the stable/hadoop Chart.
See, if you want to use it for storing large data sets, it may not be the best choice.
The stable/jenkins Chart has an excellent example of a Readme file (and is a good Chart overall). It shows which options you can override, hints at possible problems, and lists the features you can use.
Sometimes a Chart plainly states that it's deprecated and contains a link to a different one.
A Readme file is there to inform and document but also for marketing and establishing trust. Can you trust this Chart?
While you're looking at the Readme file on GitHub, look for other clues on the page:
- How many watchers, stars, and forks does this repo have?
- Read some of the issues, and see if they are addressed.
- Is the project maintainer looking at pull requests?
- When was the last commit?
- Does it seem like this Chart is actively maintained?
Just as when choosing an external library, the points above are great indicators of whether you should use this Chart or not.
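If you evaluate Charts often, the checklist above can be turned into a rough screening helper. The snippet below is only a sketch: the `RepoSnapshot` fields are numbers you would copy from the repository page by hand, and the thresholds (180 idle days, 50 stars, half of the issues open) are arbitrary assumptions, not established rules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RepoSnapshot:
    """Numbers you can read off the repository page."""
    stars: int
    open_issues: int
    closed_issues: int
    last_commit: date


def maintenance_warnings(repo, today, max_idle_days=180, min_stars=50):
    """Return human-readable red flags; an empty list means none were found."""
    warnings = []
    idle_days = (today - repo.last_commit).days
    if idle_days > max_idle_days:
        warnings.append(f"no commits for {idle_days} days")
    if repo.stars < min_stars:
        warnings.append(f"only {repo.stars} stars")
    total_issues = repo.open_issues + repo.closed_issues
    # Skip the ratio check when the repo has no issues at all.
    if total_issues and repo.open_issues / total_issues > 0.5:
        warnings.append("more than half of the issues are still open")
    return warnings


if __name__ == "__main__":
    # Hypothetical numbers for an imaginary Chart repository.
    snapshot = RepoSnapshot(stars=12, open_issues=30, closed_issues=10,
                            last_commit=date(2019, 1, 1))
    for warning in maintenance_warnings(snapshot, today=date(2020, 1, 1)):
        print("warning:", warning)
```

Treat the output as a prompt to look closer, not as a verdict. A quiet repository can simply mean that the Chart is finished.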
In our current state of DevOps, everything is in Beta or Alpha, and experimental projects are running in production for lack of a better option.
Helm Charts are no different, and you may find yourself looking at "Incubator" Charts instead of "Stable."
Even under "stable", you may see an indication that the Chart is not 100% production-ready.
Another favorite of mine is the fabulous PR blog post from a vendor announcing that their official Chart, or even an Operator, is just around the corner. Too bad the post dates back a year, and the Chart is still in "alpha."
Look for the safer option, but don't rule out anything just because of a label. It may be safer to bet on a new way of doing something than using the wrong solution.
The steps below may help you decide for yourself if this Chart is stable enough for your needs.
While Charts are written to support as many use cases as possible (making the templates completely unreadable along the way), they can't solve every problem.
Maybe you have a unique configuration that requires additional settings? Alternatively, suppose that you need to add elements that are just not there in the templates.
If you realize that it will take too many modifications to make the Chart operable, think about writing your own.
Every company has its own tolerance for failure, its own resources, and its own outlook. Some Charts launch a full-blown service architecture worthy of supporting Amazon on Black Friday (not really), but in your case, it will be running the application on an on-prem Kubernetes cluster that caters to a much smaller set of users.
Is the overhead worth it? From hardware resources to supporting the Chart itself, try to keep it simple.
This is a huge subject on its own, and a while ago I wrote a short article about reviewing Helm and Kubernetes code with a security mindset. I highly recommend you go over it. There is some overlap, but it should give you a good idea of what you need to keep in mind when evaluating the Helm Chart.
This one is obvious - Take the Helm Chart for a test drive. Take a few hours and try to implement it on your Dev cluster or a local machine.
You can learn a lot by trying to implement a Chart and trying to configure it for your use case.
It can give you a glimpse of the effort it will take to customize the Chart and help you estimate the project scope better. In some cases, you may be able to rule the Chart out quickly and save yourself time down the road.
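A test drive can start with a handful of commands before anything is installed for real. The sketch below assumes Helm 3 is on your `PATH` and that the Chart's repository has already been added; the release name, the values file name, and both Python function names are placeholders of mine.

```python
import shutil
import subprocess


def build_test_drive_commands(release, chart, values_file=None):
    """Build the Helm 3 commands for a first look at a Chart."""
    extra = ["-f", values_file] if values_file else []
    return [
        ["helm", "show", "readme", chart],  # print the Chart's Readme
        ["helm", "show", "values", chart],  # list the default values you can override
        ["helm", "template", release, chart, *extra],  # render the manifests locally
        # Simulate an install; this still talks to the cluster in your current context.
        ["helm", "install", release, chart, "--dry-run", *extra],
    ]


def run_test_drive(release, chart, values_file=None):
    """Run each command in order and stop at the first failure."""
    if shutil.which("helm") is None:
        raise RuntimeError("helm was not found on PATH")
    for command in build_test_drive_commands(release, chart, values_file):
        print("$", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder names: adjust the release, Chart, and values file to your case.
    for command in build_test_drive_commands("trial", "stable/jenkins", "my-values.yaml"):
        print(" ".join(command))
```

Reading the rendered output of `helm template` is often the quickest way to judge how much of the Chart you would need to override.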