Flagger – Canary deployments on Kubernetes
Fabian Piau | Tuesday May 19th, 2020 - 07:56 PM
Updated October 17th, 2020: Use newer versions (Helm 3, Kubernetes 1.18, Istio 1.7, Flagger 1.2).
This article is the second one of the series dedicated to Flagger. In a nutshell, Flagger is a progressive delivery tool that automates the release process for applications running on Kubernetes. It reduces the risk of introducing a new software version in production by gradually shifting traffic to the new version while measuring metrics and running conformance tests.
Make sure you have a local Kubernetes cluster running with the service mesh Istio. If you don’t, read the first article: Flagger – Get Started with Istio and Kubernetes.
In this second guide, we will focus on the installation of Flagger and run multiple canary deployments of the application Mirror HTTP Server (MHS). Remember that this dummy application can simulate valid and invalid responses based on the request. This is exactly what we need to test the capabilities of Flagger. We will cover both happy (rollout) and unhappy (rollback) scenarios.
This is a hands-on guide and can be followed step by step on macOS. It will require some adjustments if you are using a Windows or Linux PC. It is important to note that this article only touches on the underlying concepts & technologies, so if you are not familiar with Docker, Kubernetes, Helm or Istio, I strongly advise you to check some documentation yourself before continuing.
Installing Flagger
Let’s install Flagger by running these commands. We install Flagger in its own namespace flagger-system.
# add the Flagger Helm repository if it is not already there
helm repo add flagger https://flagger.app
# the namespace must exist before installing the chart into it
kubectl create namespace flagger-system
kubectl apply -f https://raw.githubusercontent.com/weaveworks/flagger/master/artifacts/flagger/crd.yaml
helm upgrade -i flagger flagger/flagger \
--namespace=flagger-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://prometheus.istio-system:9090
Reference: Flagger Install on Kubernetes
Flagger depends on Istio telemetry and Prometheus (in this case, we assume Istio is installed in the istio-system namespace).
All parameters are available on the Flagger readme file on GitHub.
We don’t specify a version for Flagger, which means it will use the latest available in the repo (1.2.0 at the time of writing).
After a few seconds, you should get a message confirming that Flagger has been installed. From the Kube dashboard, verify that the new flagger-system namespace has been created and the Flagger pod is running.
Experiment 0 – Initialize Flagger with MHS v1.1.1
Mirror HTTP Server has multiple versions available. To play with Flagger’s canary deployment feature, we will switch between versions 1.1.1, 1.1.2 and 1.1.3 of MHS (the latest version at the time of writing).
Before deploying MHS, let’s create a new namespace called application; we don’t want to use the default one at the root of the cluster (this is good practice). The name is a bit generic but sufficient for this tutorial; in general you would use the name of the team or of a group of features.
Do not forget to activate Istio on this new namespace:
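For example (creating the namespace is only needed if it does not exist yet; the istio-injection label is what tells Istio to inject its sidecar proxy into the pods of this namespace):
kubectl create namespace application
kubectl label namespace application istio-injection=enabled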
To deploy MHS via Flagger, I created a Helm chart.
This “canary flavored” chart was based on the previous chart without Flagger, which was itself created with the helm create mhs-chart command and then adapted. In this “canary flavored” chart, I made some extra adaptations: it uses 2 replicas instead of 1 to make it more realistic and pins the version to 1.1.1. I also added the canary resource, where the magic happens.
Clone the chart repo:
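For example (the repository URL below is an assumption based on the chart directory name; use the link from the article if it differs):
git clone https://github.com/fabianpiau/mhs-canary-chart.git
cd mhs-canary-chart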
And install MHS:
helm install mhs --namespace application ./mhs
After a few moments, if you look at the dashboard, you should see 2 replicas of MHS in the application namespace.
It is important to note that no canary analysis has been performed and the version has been automatically promoted. It was not a “real” canary release.
Why? Because Flagger needs to initialize itself the first time we do a canary deployment of the application. So make sure the version you are deploying with Flagger the first time is fully tested and works well!
You could also guess this auto-promotion happened because there was no initial version of the application in the cluster. Although this is obviously a good reason, it’s important to note that, even if a previous version had been deployed before (e.g. 1.1.0), the canary version 1.1.1 would still have been automatically promoted without analysis.
You can still check the canary events with:
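The Canary resource created by the chart is a regular Kubernetes object, so describing it shows its events:
kubectl -n application describe canary/mhs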
You should have a similar output without a canary analysis:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Synced 2m29s flagger mhs-primary.application not ready: waiting for rollout to finish: observed deployment generation less then desired generation
Normal Synced 92s (x2 over 2m30s) flagger all the metrics providers are available!
Normal Synced 92s flagger Initialization done! mhs.application
Or you can also directly check the log from Flagger:
kubectl -n flagger-system logs $FLAGGER_POD_NAME
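If you don’t know the exact pod name, you can target the deployment instead (assuming the Helm release created a deployment named flagger, which is the default):
kubectl -n flagger-system logs deployment/flagger -f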
If you take a closer look at the Kube dashboard, you should see some mhs and mhs-primary resources:
- mhs-primary are the primary instances (= the non-canary ones). Flagger automatically adds the -primary suffix to differentiate them from the canary instances.
- mhs are the canary instances. They exist only during the canary deployment and will disappear once the canary deployment ends. That’s why, in the screenshot above, you don’t see any mhs canary pods (i.e. 0 / 0 pod).
Why this naming convention? I asked the Flagger team directly, and there is a technical constraint behind it.
Flagger is now initialized properly and MHS is deployed to your cluster. You can use the terminal to confirm MHS is accessible (thanks to the Istio Gateway):
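For example (assuming Docker Desktop exposes the Istio ingress gateway on localhost port 80; adjust the host and port to your setup):
curl -i http://localhost/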
You should receive an HTTP 200 OK response:
x-powered-by: Express
date: Sun, 17 May 2020 16:47:33 GMT
x-envoy-upstream-service-time: 10
server: istio-envoy
transfer-encoding: chunked
And the following request, which asks MHS to simulate a server error, should return an HTTP 500 response:
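Here is a sketch of such a request. I assume the X-Mirror-Code header drives the response status, as covered in the first article; check the MHS documentation if your version uses a different mechanism.
curl -i -H 'X-Mirror-Code: 500' http://localhost/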
x-powered-by: Express
date: Sun, 17 May 2020 16:48:09 GMT
x-envoy-upstream-service-time: 12
server: istio-envoy
transfer-encoding: chunked
Experiment 1 – MHS v1.1.2 canary deployment
We are going to install the newer version 1.1.2. You need to manually edit the file mhs-canary-chart/mhs/values.yaml and replace tag: 1.1.1 with tag: 1.1.2.
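If you prefer doing this edit from the command line, here is one way with macOS/BSD sed (adjust the path depending on the directory you are in):
sed -i '' 's/tag: 1.1.1/tag: 1.1.2/' mhs-canary-chart/mhs/values.yaml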
Then:
helm upgrade mhs --namespace application ./mhs
While the canary deployment is in progress, it’s very important to generate some traffic to MHS. Without traffic, Flagger will consider that something went wrong with the new version and will roll back automatically to the previous one. Obviously, you don’t need this extra step in a production environment that continuously receives real traffic.
Run this loop command in another terminal to generate artificial traffic:
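A minimal example, assuming the gateway is still reachable on localhost (stop it with Ctrl+C once the deployment is over):
while true; do curl -s -o /dev/null http://localhost/; sleep 1; done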
Check the Kube dashboard; at some point you should see the canary pod with the new version 1.1.2:
Check the canary events with the same command as before:
After a while (about 6 minutes) you should have a similar event output:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Synced 30m flagger mhs-primary.application not ready: waiting for rollout to finish: observed deployment generation less then desired generation
Normal Synced 29m (x2 over 30m) flagger all the metrics providers are available!
Normal Synced 29m flagger Initialization done! mhs.application
Normal Synced 10m flagger New revision detected! Scaling up mhs.application
Normal Synced 9m16s flagger Starting canary analysis for mhs.application
Normal Synced 9m16s flagger Advance mhs.application canary weight 10
Normal Synced 8m16s flagger Advance mhs.application canary weight 20
Normal Synced 7m16s flagger Advance mhs.application canary weight 30
Normal Synced 6m16s flagger Advance mhs.application canary weight 40
Normal Synced 5m16s flagger Advance mhs.application canary weight 50
Normal Synced 4m16s flagger Copying mhs.application template spec to mhs-primary.application
Normal Synced 3m16s flagger Routing all traffic to primary
Normal Synced 2m16s flagger (combined from similar events): Promotion completed! Scaling down mhs.application
The canary release completed successfully. You now have version 1.1.2 installed on all the primary pods and the canary pod has been removed.
Why did this deployment take about 6 minutes? Because it includes a 5-minute canary analysis. During this analysis, traffic was routed progressively to the canary pod. The canary traffic increased in steps of 10% every minute until it reached 50% of the global traffic. The analysis is configurable and defined in the canary.yaml file that was added to the chart.
Below is the analysis configuration:
analysis:
  # stepper schedule interval
  interval: 1m
  # max traffic percentage routed to canary - percentage (0-100)
  maxWeight: 50
  # canary increment step - percentage (0-100)
  stepWeight: 10
  # max number of failed metric checks before rollback (global to all metrics)
  threshold: 5
  metrics:
    - name: request-success-rate
      # percentage before the request success rate metric is considered as failed (0-100)
      thresholdRange:
        min: 99
      # interval for the request success rate metric check
      interval: 30s
    - name: request-duration
      # maximum req duration P99 in milliseconds before the request duration metric is considered as failed
      thresholdRange:
        max: 500
      # interval for the request duration metric check
      interval: 30s
The canary analysis relies on the 2 basic metrics provided out of the box by Istio / Prometheus (request success rate and request duration). It is possible to define your own custom metrics. In that case, your application will need to expose a Prometheus endpoint that includes them, and you will be able to update the Flagger analysis configuration to use them with your own PromQL query. Note this goes beyond the scope of this hands-on guide, which uses only the built-in metrics.
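To give an idea, below is a minimal sketch based on Flagger’s MetricTemplate resource. The metric name, namespace and PromQL query are illustrative only and assume your application exports a http_requests_total counter:
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: error-rate
  namespace: flagger-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.istio-system:9090
  query: |
    100 - sum(rate(http_requests_total{namespace="{{ namespace }}", status!~"5.*"}[{{ interval }}]))
    / sum(rate(http_requests_total{namespace="{{ namespace }}"}[{{ interval }}])) * 100
The template can then be referenced from the analysis section with a templateRef and its own thresholdRange:
  metrics:
    - name: error-rate
      templateRef:
        name: error-rate
        namespace: flagger-system
      thresholdRange:
        max: 1
      interval: 1m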
Experiment 2 – MHS v1.1.3 faulty deployment
Again, you need to manually edit the file mhs-canary-chart/mhs/values.yaml and replace tag: 1.1.2 with tag: 1.1.3.
Then:
helm upgrade mhs --namespace application ./mhs
We generate some artificial traffic:
This time, we also generate invalid traffic to make sure the request success rate is going down!
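For instance, keep the previous loop running and start a second one that repeats the error-triggering request from earlier (still assuming the X-Mirror-Code header):
while true; do curl -s -o /dev/null -H 'X-Mirror-Code: 500' http://localhost/; sleep 1; done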
Check the canary events with the same command as before:
After a while (about 6 minutes) you should have a similar event output:
Normal Synced 7m23s (x2 over 19m) flagger Advance mhs.application canary weight 10
Normal Synced 7m23s (x2 over 19m) flagger Starting canary analysis for mhs.application
Warning Synced 6m23s flagger Halt mhs.application advancement success rate 57.14% < 99%
Warning Synced 5m24s flagger Halt mhs.application advancement success rate 0.00% < 99%
Warning Synced 3m24s flagger Halt mhs.application advancement success rate 71.43% < 99%
Warning Synced 2m24s flagger Halt mhs.application advancement success rate 50.00% < 99%
Warning Synced 84s flagger Halt mhs.application advancement success rate 63.64% < 99%
Warning Synced 24s flagger Rolling back mhs.application failed checks threshold reached 5
Warning Synced 24s flagger Canary failed! Scaling down mhs.application
And you are still on version 1.1.2.
Flagger decided not to go ahead and promote version 1.1.3 as it could not complete a successful analysis and the error threshold was reached, i.e. 5 failed checks (indeed, each time, about 50% of the requests were ending up in an HTTP 500 response). Flagger simply redirected all traffic back to the primary instances and removed the canary pod.
Congratulations, you’ve come to the end of this second tutorial!
Observations
Before we clean up the resources we’ve created, let’s wrap up with a list of observations:
- Deleting a deployment will delete all pods (canary / primary), so we don’t end up with orphan resources.
- Prometheus is required. Without it, the canary analysis won’t work.
- It is not possible to re-trigger a canary deployment of the same version if it has just failed. It forces you to bump up the version (even if it was a configuration and not a code issue).
- The Flagger off-boarding process is not as simple as removing the canary resource from the chart and deploying a new version. If you delete the canary resource, Flagger won’t trigger the canary process: it will change the version in mhs and remove mhs-primary, but mhs has 0 pods, so it will make your service unavailable! You need to be careful and adopt a proper manual off-boarding process. Recently, the Flagger team added a property revertOnDeletion you can enable to avoid this issue (see the snippet after this list). You can read the documentation to know more about this canary finalizer.
- After multiple deployments, it seems that some events can be missing: the Kubernetes describe command accumulates them (x<int> over <int>m), sometimes the order is not preserved and/or some events are not showing up. You can look at the phase status instead (terminal statuses are Initialized, Succeeded and Failed). The best is to look directly at the logs of the Flagger pod, as these are always accurate and complete.
- The canary analysis should be configured to run for a short period of time (i.e. no more than 30 minutes) to leverage continuous deployment and avoid releasing a new version while a canary deployment for the previous one is still in progress. If you want to perform canary releases over longer periods, Flagger may not be the best tool.
- Finally, it’s important to remember that the first time you deploy with Flagger (like in experiment 0 above), the tool needs to initialize itself (Initialized status) and will not perform any analysis.
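As a reference, the revertOnDeletion switch mentioned in the off-boarding observation is a single field on the Canary spec. A minimal sketch (the names match the chart used in this tutorial):
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: mhs
  namespace: application
spec:
  # revert the target deployment to its original state when the canary
  # resource is deleted, instead of leaving mhs scaled down to 0 pods
  revertOnDeletion: true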
Cleaning up resources
Now that the tutorial is complete, you can remove the MHS application and its namespace.
kubectl delete namespaces application
We recommend that you leave Flagger and Istio in place to save time in the next tutorial. If however you’d like to remove everything now, then you can run the following commands.
Remove Flagger:
kubectl delete namespaces flagger-system
Remove Istio and Prometheus:
istioctl manifest generate --set profile=demo | kubectl delete -f -
kubectl delete namespaces istio-system
What’s next?
The next article will focus on the Grafana dashboard provided out of the box with Flagger, which is a nice addition: you don’t need to manually run any kubectl commands to check the result of your canary deployments. Stay tuned! In the meantime, you can stop the Kubernetes cluster by unchecking the Kubernetes box in the Docker Desktop settings and restarting Docker Desktop. Your computer deserves another break.