DEV Community

Hafsa Jabeen for Codesphere Inc.

Posted on • Originally published at codesphere.com on

Canary Deployments: Release, Observe, Shift, & Repeat

Introduction

According to Flexera’s “State of the Cloud” report, 94% of enterprises around the globe used the cloud as part of their operations in some way as of 2019. Times are changing fast, and more and more organizations are shifting to cloud-native applications, and rightfully so. Cloud-native applications offer several advantages over traditional ones: you can harness the full potential of cloud infrastructure, scale with little effort, and save money, all without compromising on the quality of your product. But why are we talking about this and not canary deployments? Here is why.

One of the basic pillars of cloud-native development is continuous integration and continuous deployment (CI/CD), which allows quick iterations and updates. This is where canary releases come in as a natural companion for cloud-native development. You can use them when introducing new features or changes by releasing them to only a handful of users at first. This lets you carefully monitor the effect of those changes before you commit to a full-scale release. In this article, we talk about actionable steps you can take to set up efficient canary deployments, and about the potential benefits and challenges that come with them.

What is a canary release?

Canary releases take their name from an old practice of coal miners in the 19th and early 20th centuries. As canaries are more sensitive to dangerous gases than humans, miners would bring caged canaries into the mines with them. If the levels of gases like carbon monoxide or methane were high, the canaries would show signs of distress or even die before the levels became hazardous for humans. This acted as an early warning sign for miners to evacuate the mine in time and take appropriate safety measures.

Thankfully, canary releases are far more sophisticated and don't involve any animal cruelty, but the idea remains the same: early detection and fixing of errors. You release a new version of your software to a handful of people first, sometimes your own employees or some other subset of your users, before rolling it out to everyone. This lets you gather valuable feedback, find potential bugs and errors, and reduce the negative impact you might face if something were to go wrong. All in all, the canary release is a risk mitigation approach to software updates.

A visual representation of canary deployment

What is the difference between canary release and canary deployment?

By now you might be a little confused about the difference between canary releases and canary deployments. These terms are often used interchangeably. However, there is a subtle distinction between the two.

A canary release is an early or test version of an application. An even/odd version numbering convention is often used to differentiate between stable and unstable versions. Companies sometimes publish canary versions of a product for people to download and use. For example, if you look at the Python releases, you will see a set of stable versions and pre-release versions, and users can opt in to any of them. The goal is the same: monitoring the new version to detect issues before making a full release.

On the other hand, a canary deployment means deploying the new version of the application to a duplicate environment or to a few servers first, and splitting the users. A small number of users chosen by you are directed to the canary version, and the rest keep using the old version. Later on, depending on the success or failure of the new release, you can shift all users to the new version or roll everyone back to the old one. There are two common approaches to do that:

  • Rolling Deployment: It involves deploying the canary version to a few servers at a time, in steps. After you release the changes, a small number of people will be able to see and use them. If it works fine, you release the canary version to the next set of servers, and keep doing so until it is deployed on all the servers. This deployment approach is quite straightforward and easy to manage but can cause some downtime.
  • Side-by-side Deployment: In this deployment approach you duplicate the production environment and deploy the canary version in the new environment. On the basis of pre-defined rules, a small group of users is redirected to the new version and the performance is monitored. If the user experience is smooth, all users are gradually shifted to the new version. The upside is that there is almost zero downtime and rollbacks are easy. However, it requires sophisticated load balancing and routing mechanisms.
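
The "pre-defined rules" in a side-by-side deployment can be as simple as a stable hash of the user ID. The following is a minimal sketch, not a description of any particular load balancer: the names `bucket_for` and `route` and the 100-bucket scheme are assumptions made for illustration.

```python
import hashlib


def bucket_for(user_id: str, buckets: int = 100) -> int:
    """Map a user ID to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def route(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of users to the canary, the rest to stable.

    The same user always lands in the same bucket, so raising
    canary_percent only ever adds users to the canary group.
    """
    return "canary" if bucket_for(user_id) < canary_percent else "stable"


if __name__ == "__main__":
    for uid in ("alice", "bob", "carol"):
        print(uid, route(uid, canary_percent=10))
```

Because the assignment depends only on the user ID, a returning user keeps seeing the same version, and shifting traffic gradually is a matter of raising one number.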

If you want to take advantage of the upsides of this deployment approach with little effort, Codesphere offers this feature as a part of its zero-configuration deployment services.

However, coming back to canary deployments vs. releases, the choice between these terms often comes down to personal preference or to how they are used in your organization. In this article, we will be using both terms interchangeably.

So, are these just blue-green deployments?

Although they might seem similar and both aim to provide flexibility and agility, the development philosophy and the way they work are different. In blue-green deployments, two identical environments are maintained, referred to as blue and green. At any given time, only one of them is active and receives traffic, while the other one is idle. When developers deploy a new version of the application, they deploy it to the inactive environment. After that, rigorous testing is done to make sure it works fine, and then all traffic is re-routed to the environment running the new version. The other environment then becomes inactive and is used for future updates.

On the other hand, in a canary deployment, the two versions serve production traffic in parallel and the shift is gradual. Canary releases are hence considered more agile than blue-green deployments because of their fast feedback and incremental nature.

How to implement canary releases effectively?

The focus of the canary release is on gradual rollouts and user feedback, which complements the “Building in Public” philosophy, where a company shares its journey and development progress with its audience and community. This approach, with its emphasis on transparency and community involvement, fits well with the canary release objective of engaging early adopters and power users to provide valuable feedback during the controlled deployment phase. By implementing canary releases effectively, deployment teams can ensure fast issue detection and smoother updates, and build a strong sense of collaboration and trust with their user base.

Having a detailed plan of action and clear goals in mind is crucial when setting up these iterative and controlled software releases, which involve careful execution, monitoring, and analysis of the whole process. Let’s dive into the nitty-gritty of setting up an effective canary release.

Deciding the goal, duration, and users of the Canary release

The first step in implementing a successful canary release is setting clear goals. It means knowing what specific metrics or indicators you want to monitor to determine the success or failure of the release. The next step is choosing the users and duration of the canary release. The duration varies depending on the goals and objectives, but here are a few ways to choose the users for a canary deployment.

  • You can choose a diverse but random subset of users.
  • Another way of choosing users is through their geographical region.
  • Early adopters can be a great subset of users as well. You can give users the option to install the canary version and likely get reliable feedback.
  • Another option is to release the canary version for the employees and internal users before making it available to other people.

Sometimes, a mix of different strategies works best for a canary release.
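
A mix of the strategies listed above can be sketched as a single predicate. This is only an illustration under assumed names: the `User` fields, `is_canary_user`, and the parameters `regions` and `random_percent` are hypothetical and not part of any specific platform.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    region: str
    is_employee: bool = False   # internal user
    opted_in: bool = False      # early adopter who asked for the canary


def in_random_sample(user_id: str, percent: int) -> bool:
    """Stable pseudo-random sample: the same user always gets the same answer."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


def is_canary_user(user: User, *, regions: frozenset = frozenset(),
                   random_percent: int = 0) -> bool:
    """Combine internal users, early adopters, regions, and a random subset."""
    if user.is_employee or user.opted_in:
        return True
    if user.region in regions:
        return True
    return in_random_sample(user.user_id, random_percent)


if __name__ == "__main__":
    visitor = User("u-123", region="eu")
    print(is_canary_user(visitor, regions=frozenset({"eu"}), random_percent=5))
```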

Monitoring and analyzing canary deployments

The main goal of a canary release is to find potential bugs, errors, or development problems before they reach everyone. Hence, it is important to have monitoring tools in place to observe and track key performance indicators such as error rates, response time, and resource usage. You can also set up alerts for issue detection to stay on top of the performance of your canary deployment.
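
As a rough sketch of what such an alert could look like, the snippet below compares the canary's error rate and 95th-percentile response time against the control version. The `Sample` type and the thresholds (`max_error_increase`, `max_latency_ratio`) are assumptions chosen for illustration; real values depend on your goals and your monitoring stack.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class Sample:
    latency_ms: float
    ok: bool  # False if the request ended in an error


def summarize(samples: list) -> dict:
    """Compute error rate and p95 latency for a batch of request samples."""
    error_rate = sum(not s.ok for s in samples) / len(samples)
    # quantiles(..., n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles([s.latency_ms for s in samples], n=20)[-1]
    return {"error_rate": error_rate, "p95_latency_ms": p95}


def alerts(canary: list, baseline: list,
           max_error_increase: float = 0.01,
           max_latency_ratio: float = 1.2) -> list:
    """Return the names of the indicators on which the canary regressed."""
    c, b = summarize(canary), summarize(baseline)
    triggered = []
    if c["error_rate"] > b["error_rate"] + max_error_increase:
        triggered.append("error_rate")
    if c["p95_latency_ms"] > b["p95_latency_ms"] * max_latency_ratio:
        triggered.append("p95_latency")
    return triggered
```

Comparing against the control version rather than against fixed numbers keeps the check meaningful when overall load changes during the rollout.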

Getting and Analysing user feedback during canary deployments

The next step involves gathering and managing user feedback. You should encourage users involved in the canary release to provide feedback on their experience with the new version. This feedback is invaluable for evaluating how users perceive the newly introduced changes. You can encourage users to give feedback by communicating the purpose and goals of the canary release, providing multiple feedback channels, and handing out incentives. The feedback should then go through a structured analysis process to identify pain points, problems, and suggestions.

Rolling back unsuccessful Canary deployments

When you deploy a canary, there is always a chance that things might not go as planned. In such cases, you have to quickly roll back the canary release. One way to do that while avoiding complicated routing configurations is to use feature flags. You use them to introduce a new version to a subset of users within your current production environment. The upside of using feature flags is that if the new release causes any issues, you can quickly disable the newly introduced feature without having to restart the application. At Codesphere, for example, we use feature flags for all new features introduced to the platform.
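
A minimal in-memory sketch of this idea follows. It is not how Codesphere or any flag service implements flags; the `FeatureFlags` class, the flag name `new_checkout`, and `render_checkout` are hypothetical, and a real setup would typically read flag state from a shared store so every instance sees a change at once.

```python
import hashlib


class FeatureFlags:
    """Toy feature-flag store with a percentage rollout per flag."""

    def __init__(self) -> None:
        self._rollout = {}  # flag name -> percent of users (0..100)

    def set_rollout(self, name: str, percent: int) -> None:
        self._rollout[name] = max(0, min(100, percent))

    def disable(self, name: str) -> None:
        """Kill switch: turn the feature off for everyone, no restart needed."""
        self._rollout[name] = 0

    def is_enabled(self, name: str, user_id: str) -> bool:
        percent = self._rollout.get(name, 0)  # unknown flags are off
        digest = hashlib.sha256(f"{name}:{user_id}".encode("utf-8")).hexdigest()
        return int(digest, 16) % 100 < percent


def render_checkout(flags: FeatureFlags, user_id: str) -> str:
    # Both code paths ship in the same build; the flag picks one at runtime.
    if flags.is_enabled("new_checkout", user_id):
        return "new checkout"
    return "old checkout"


if __name__ == "__main__":
    flags = FeatureFlags()
    flags.set_rollout("new_checkout", 10)   # canary: about 10% of users
    print(render_checkout(flags, "u-1"))
    flags.disable("new_checkout")           # roll back instantly
    print(render_checkout(flags, "u-1"))
```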

Determining the success or failure of a Canary deployment

You can use user feedback, user engagement, error rates, the effect of the canary deployment on business metrics, a comparison of user experience with the control group, and rollback rates to determine the success or failure of your canary deployment. Success is determined by meeting predefined goals, positive user feedback, improved performance, and an overall positive impact on key metrics. In contrast, a failed canary release may be characterized by critical issues, negative user feedback, increased error rates, or a regression in performance compared to the previous version. In its most extreme form, canary testing could be used as a continuous A/B test, where only features that improve the defined funnel goals are ever carried into the main version.
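
For the A/B-test style of evaluation, one possible way to compare a funnel goal between the control and canary groups is a two-proportion z-test. This is a sketch under assumptions: the function names, the three verdict strings, and the 5% significance level `alpha` are illustrative choices, and conversion is only one of the signals listed above.

```python
from math import sqrt
from statistics import NormalDist


def conversion_p_value(control_conv: int, control_n: int,
                       canary_conv: int, canary_n: int) -> float:
    """One-sided p-value for 'the canary converts better than the control'."""
    p_control = control_conv / control_n
    p_canary = canary_conv / canary_n
    pooled = (control_conv + canary_conv) / (control_n + canary_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / canary_n))
    if se == 0:
        return 1.0  # no variation at all, so no evidence either way
    z = (p_canary - p_control) / se
    return 1 - NormalDist().cdf(z)


def verdict(control_conv: int, control_n: int,
            canary_conv: int, canary_n: int, alpha: float = 0.05) -> str:
    p = conversion_p_value(control_conv, control_n, canary_conv, canary_n)
    if p < alpha:
        return "promote"         # canary is significantly better
    if p > 1 - alpha:
        return "roll back"       # canary is significantly worse
    return "keep observing"      # not enough evidence yet


if __name__ == "__main__":
    print(verdict(control_conv=1000, control_n=10000,
                  canary_conv=150, canary_n=1000))
```

Note that checking such a test repeatedly while data is still arriving increases the chance of a false positive, so it helps to fix the observation window in advance.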

Upsides of canary releases

Apart from mitigating the risk associated with deploying new features, there are several other benefits of using canary deployments.

  • Early Error Detection: By releasing the update to a limited number of users, canary deployments assist in the early detection of bugs, errors, and performance issues. The development team can then fix these issues before rolling out the update to everyone.
  • Smooth Transition: By following the incremental development process, canary releases allow a gradual rollout, ensuring a smooth transition for users. This approach prevents sudden, disruptive changes and helps maintain a positive user experience.
  • Testing: Canary releases are also highly beneficial for A/B testing. They allow for a quick evaluation of the experimental version's performance, helping teams make data-driven decisions about its effectiveness and user acceptance.
  • Improved Quality: Canary deployments focus a lot on feedback and analyze it to get rid of any issues and improve the software. So, you end up with a high-quality and user-centric product.
  • Reduced Downtime: Canary releases offer multiple feedback stages and allow developers to fix any detected issues. Hence, there are fewer chances of disruptions because of any unforeseen problems during full-scale deployment.
  • Easy Rollbacks: With canary deployment, you can easily revert to the old version at any instant. This feature allows a swift recovery and minimal to no disruption to users.
  • Increased User Trust: The transparency of canary releases, especially when aligned with the "Building in Public" philosophy, fosters trust and engagement with users, making them feel part of the product development process.

Problems and limitations of Canary releases

While there is no denying that canary deployments come with several benefits, there are a few challenges involved as well.

  • Limited representation: As a canary deployment is only available to a small number of users or infrastructure, the response may not fully cover the diversity of your user base. As a result, it is difficult to infer that all your users will have the same experience as that of the canary group.
  • Complexity: Releasing a canary version does require you to maintain a clone environment of your live production environment for gradual rollouts. This requirement adds extra work and complexity to the development process and requires time, resources, and effort.
  • Extra Monitoring: On top of maintaining two environments, you have to closely monitor both of them as well. Effective monitoring and analysis demand robust tooling and a structured process, which can add overhead to your monitoring system.
  • User Disruption: Despite the controlled nature of canary deployments and having risk mitigation plans, there is a possibility of a negative user experience, especially during the early rollouts. Users in the canary group can encounter issues with their workflows, which can lead to confusion and frustration.
  • Database and Schema changes: The challenge is that the database should be able to work with the control and canary version simultaneously. Otherwise, you can face issues such as data inconsistency between the canary and control groups, data migration complexity, and challenges with rollback procedures.
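
One common way to handle the database challenge is to make schema changes additive first, so that the control and canary versions can share one database during the rollout. The sketch below uses an in-memory SQLite database; the `users` table, the `display_name` column, and the function names are made up for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create the schema as the control version knows it, then expand it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # Expand step: a nullable column is additive, so existing queries keep working.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    return conn


def control_insert(conn: sqlite3.Connection, name: str) -> None:
    # Control version: names its columns explicitly and ignores the new one.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def control_names(conn: sqlite3.Connection) -> list:
    return [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]


def canary_insert(conn: sqlite3.Connection, name: str, display_name: str) -> None:
    conn.execute("INSERT INTO users (name, display_name) VALUES (?, ?)",
                 (name, display_name))


def canary_display_name(conn: sqlite3.Connection, user_id: int) -> str:
    name, display_name = conn.execute(
        "SELECT name, display_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    # Rows written by the control version have no display_name: fall back.
    return display_name or name
```

Dropping or renaming the old column is postponed until the canary has been fully promoted, which also keeps a rollback to the old version possible in the meantime.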

However, effective planning, robust monitoring, and clear communication can help mitigate some of these challenges and ensure a smoother canary release process.

Can you do a Canary deployment with Kubernetes?

While Kubernetes permits basic deployment strategies like RollingUpdate, it does not have specific built-in functionality for canary deployments.

Kubernetes provides the foundation and tools to implement canary deployments using custom configurations and integrations with other tools. You can implement canary release patterns effectively and manage the traffic between old and new versions based on various criteria. However, it can introduce additional complexity to the already laborious deployment process and can incur extra costs because of extra resource usage. Another major downside is that Kubernetes does not offer native support for advanced canary deployment strategies like automatic rollback based on certain performance metrics.
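
The missing piece, a gradual shift with automatic rollback, is essentially a small control loop. The sketch below shows only that loop and deliberately does not call any Kubernetes API: `set_canary_weight` and `canary_is_healthy` are hypothetical callables that you would have to implement against your own routing layer and metrics, and the stage percentages are arbitrary.

```python
from typing import Callable, Sequence


def progressive_rollout(set_canary_weight: Callable[[int], None],
                        canary_is_healthy: Callable[[], bool],
                        stages: Sequence[int] = (5, 25, 50, 100)) -> bool:
    """Shift traffic to the canary stage by stage; roll back on the first failure.

    Returns True if the canary reached the final stage, False if it was rolled back.
    In a real rollout you would wait for an observation window between
    setting the weight and checking health; that wait is omitted here.
    """
    for weight in stages:
        set_canary_weight(weight)
        if not canary_is_healthy():
            set_canary_weight(0)  # automatic rollback: all traffic back to stable
            return False
    return True


if __name__ == "__main__":
    applied = []
    promoted = progressive_rollout(applied.append, lambda: True)
    print(promoted, applied)
```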

Release Management & Canary Deployments with Codesphere

Codesphere aims to make software releases, an otherwise stressful process, a breeze for everyone. The platform provides several built-in features to assist development and release management, along with the flexibility to tailor the process to the specific needs of each project. You have access to features like version control integration, automated testing, and CI/CD pipelines that simplify managing a new software release.

Codesphere workspace displaying the canary and control versions

Codesphere allows you to connect each domain with a single or multiple workspaces. If you decide to do a canary release, you will have three options you can choose from depending on your preference:

  1. Connect the main production workspace and the test version to a single domain. Your traffic will then be split evenly between the two versions. Even if a visitor refreshes or revisits, they will be directed to the same version as the first time, so as not to confuse users.
  2. To have more control over the traffic, clone the workspace that should receive most of the traffic and link the clones to the same domain. For example, if you want only 25% of the traffic to go to the canary version and 75% to go to the control version, you could add two more copies of the main production environment with a few clicks and link them to your domain along with one canary version. Since you can clone a workspace in seconds on Codesphere, this takes little time or effort.
  3. The last option is to link the control version and the canary version to different domains. In your frontend, you could allow users to opt in to experimental features and, based on their decision, redirect them either to the domain with the control version or to the one with the experimental version.
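
The arithmetic behind the second option can be written down in a few lines, assuming (as the first two options suggest) that traffic is split evenly across all workspaces connected to a domain. The function names are made up for this sketch and are not part of any Codesphere API.

```python
from fractions import Fraction


def workspaces_for_share(canary_share: Fraction) -> tuple:
    """Return (canary_workspaces, control_workspaces) for an even split.

    With equal traffic per workspace, a canary share of a/b needs
    a canary workspaces and b - a control workspaces.
    """
    share = Fraction(canary_share)
    if not 0 < share < 1:
        raise ValueError("canary_share must be strictly between 0 and 1")
    return share.numerator, share.denominator - share.numerator


def control_clones_needed(canary_share: Fraction) -> int:
    """Extra copies of the production workspace, beyond the one you already have."""
    _, control = workspaces_for_share(canary_share)
    return control - 1


if __name__ == "__main__":
    # 25% canary: 1 canary workspace, 3 control workspaces, i.e. 2 extra clones.
    print(workspaces_for_share(Fraction(1, 4)), control_clones_needed(Fraction(1, 4)))
```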

Codesphere has a built-in resource usage feature you can use to monitor your canary release, and soon-to-be-released features like uptime alerts will help you watch your experimental release closely. In addition, you can integrate any monitoring service, depending on the specific requirements of your project.

Conclusion

To sum it up, canary deployment is a useful risk-minimizing software release strategy. It allows organizations to release updates with reduced risk and gather valuable user feedback early on. The controlled rollout to a limited subset of users or infrastructure enables quick issue detection and iterative improvements. It also enables development teams to deliver a high-quality end product by taking into account the reported issues during several release stages and resolving them. Through effective implementation of canary deployments, organizations can ensure more reliable, user-centric, and innovative applications. While some complexities and resource overhead may arise, careful planning, transparent communication, and robust monitoring can help prevent such problems.

Top comments (1)

Alexander Voll • Edited

So cool to see how Codesphere makes accessible to everyone what was traditionally only feasible for huge companies and teams with unlimited amounts of money!