DEV Community

Chris Harris


Apache Superset and Azure - multi-container application deployment

In this article you’ll learn the steps to deploy Apache Superset on Azure using Azure Kubernetes Service.
 

Background

I work on improving the Python Developer Experience on Microsoft Azure and I've spoken with lots of Python developers from around the world. One thing I've heard repeatedly is that many Python devs like to run production code in containers. Another is that many apps require multiple containers.

I also like to do some data analysis on the side and recently ran across Apache Superset which describes itself as a "modern data exploration and data visualization platform". Coincidentally, Superset has a lot of Python code and can be deployed in containers (nine of them at current count!)
Superset image from https://github.com/apache/superset

There are several options for running container-based apps on Azure. Because Superset requires so many containers, I decided to use Azure Kubernetes Service (AKS). In a future post I plan to share how you can use Azure's new managed service for containers, Azure Container Apps, instead.

While I did not find an existing document or post to help me complete my journey in a straightforward manner, here are several docs that helped:

 

The set up

Some of the syntax below may vary with your choice of OS or shell.

Prerequisites you will need:

 

Get the Superset app and create your AKS Cluster

  1. Clone the Apache Superset repo

    git clone https://github.com/apache/superset
    
    cd superset
    

     

  2. Sign in to the Azure CLI
    You can sign in by using the az login command

    az login
    

     

  3. Create a new AKS cluster with ACR integration
    Azure Container Registry (ACR) is kind of like Docker Hub. It's a place to store your container images so you can run your application in your Azure Kubernetes Service (AKS) cluster. We will use it to store the Superset container images.
    Use az acr create to create an ACR named [yourname]supersetacr in a resource group called supersetrg.
    Replace [yourname] below with a name of your choice.

    # Create a Resource Group to hold your ACR and AKS resources
    # Feel free to use a location closer to you
    az group create -n supersetrg --location westus2
    
    # Create an Azure Container Registry
    az acr create -n [yourname]supersetacr -g supersetrg --sku basic
    
    # Create an AKS cluster with ACR integration
    az aks create -n supersetaks -g supersetrg --generate-ssh-keys --attach-acr [yourname]supersetacr
    

     

    When you create the ACR, you will see a blob of JSON. Here are a couple of important values to notice:

    ...
    "location": "westus2",
    "loginServer": "[yourname]supersetacr.azurecr.io",
    "name": "[yourname]supersetacr",
    ...
    "provisioningState": "Succeeded",
    "publicNetworkAccess": "Enabled",
    "resourceGroup": "supersetrg",
    "sku": {
      "name": "Basic",
      "tier": "Basic"
    },
    ...
    

     

    It will take a few minutes to create the AKS cluster. When it's done you will see an even larger blob of JSON. Here are a couple of important values to notice in this one:

    ...
      "osDiskSizeGb": 128,
      "osDiskType": "Managed",
      "osSku": "Ubuntu",
      "osType": "Linux",
    ...
      "provisioningState": "Succeeded",
    ...
      "vmSize": "Standard_DS2_v2",
    ...
    "azurePortalFqdn": "supersetak-supersetrg-2223f9-06aacbbd.portal.hcp.westus2.azmk8s.io",
    ...
    "fqdn": "supersetak-supersetrg-2223f9-06aacbbd.hcp.westus2.azmk8s.io",
    ...
    "kubernetesVersion": "1.22.6",
    ...
    "nodeResourceGroup": "MC_supersetrg_supersetaks_westus2",
    ...
    

     
    Additional ways to integrate ACR with AKS: 3 Ways to integrate ACR with AKS
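
ACR names must be globally unique and, as far as I know, are limited to 5-50 letters and digits, so a quick local check before running az acr create can save a failed attempt. The sketch below is only an illustration: `valid_acr_name` is a helper name I made up, not part of the Azure CLI, while `az acr check-name` is the CLI command that reports whether a name is still available.

```shell
# Hypothetical helper (not an Azure CLI command): returns 0 when the name
# fits the ACR naming rules assumed here (5-50 characters, letters and digits only).
valid_acr_name() {
  name="$1"
  len=${#name}
  [ "$len" -ge 5 ] && [ "$len" -le 50 ] || return 1
  case "$name" in
    *[!A-Za-z0-9]*) return 1 ;;   # reject anything that is not a letter or digit
  esac
  return 0
}

# Usage (substitute the name you chose for [yourname]supersetacr):
#   valid_acr_name "examplesupersetacr" && az acr check-name -n "examplesupersetacr" -o table
```

The local check only covers the format of the name; availability still has to be confirmed by Azure itself.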
     

  4. Push the Superset container images into your ACR
    Superset uses two container images, which you can see in the superset repo in the /helm/superset/values.yaml file:

    image: 
      repository: apache/superset
    ...
    initImage: 
      repository: busybox
    

     

    While ACR can technically pull images directly from Docker Hub, the throttling that Docker Hub has recently implemented means it could take a while (sometimes a long while) for the images to get into your ACR. Instead we're going to pull the images to your machine and then push them to ACR. Don't forget to replace [yourname] with the name you chose previously.

    # First - login to your ACR so Docker can push to it
    az acr login -n [yourname]supersetacr
    
    # Pull the Superset image
    docker pull apache/superset
    # Tag the image using your ACR login server name 
    docker tag apache/superset [yourname]supersetacr.azurecr.io/superset
    # Push the image to ACR
    docker push [yourname]supersetacr.azurecr.io/superset
    
    # Pull the Busybox image
    docker pull busybox
    # Tag the image using your ACR login server name
    docker tag busybox [yourname]supersetacr.azurecr.io/busybox
    # Push the image to ACR
    docker push [yourname]supersetacr.azurecr.io/busybox
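
Before moving on, it can be worth confirming that both images actually landed in your registry. Here is a minimal sketch that wraps `az acr repository list`; the function name `acr_has_repo` is my own invention, and it assumes you pass the registry name without the .azurecr.io suffix.

```shell
# Hypothetical helper: prints "yes" if the repository exists in the registry, "no" otherwise.
acr_has_repo() {
  registry="$1"   # registry name, without .azurecr.io
  repo="$2"       # repository to look for, e.g. superset
  # With -o tsv, az prints one repository name per line
  if az acr repository list -n "$registry" -o tsv | grep -qxF "$repo"; then
    echo yes
  else
    echo no
  fi
}

# Usage (replace the registry name with your own):
#   acr_has_repo "[yourname]supersetacr" superset
#   acr_has_repo "[yourname]supersetacr" busybox
```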
    

     

  5. Create a my_values.yaml file to override defaults
    In the previous step we gave the images new tags, and now we need to override the defaults in /helm/superset/values.yaml so our new images will be used.
    Add the following to a new file called my_values.yaml, being sure to replace [yourname] with the name you chose previously:

    image:
      repository: [yourname]supersetacr.azurecr.io/superset
    
    initImage:
      repository: [yourname]supersetacr.azurecr.io/busybox
    

     

    While we're in the my_values.yaml file, we also need to expose the Superset website. To do that, change the service type from ClusterIP to LoadBalancer, and set the port to 80 to make it easier to browse. Add these lines to my_values.yaml as well:

    # Set type to 'LoadBalancer' so we can browse it
    service:
      type: LoadBalancer
      port: 80
    

     

    And unless you're planning on deploying to production, I recommend that you load up some examples to play with. Just make one more addition to the my_values.yaml file:

    # Load Superset Examples
    init:
      loadExamples: true
    

     

    Here is the entire my_values.yaml file:

    image:
      repository: [yourname]supersetacr.azurecr.io/superset
    
    initImage:
      repository: [yourname]supersetacr.azurecr.io/busybox
    
    # Set type to 'LoadBalancer' so we can browse it
    service:
      type: LoadBalancer
      port: 80
    
    # Load Superset Examples
    init:
      loadExamples: true
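
If you would rather not edit the [yourname] placeholders by hand, the same file can be generated from the shell. This is just a sketch: `write_my_values` is a hypothetical helper name, and it assumes you pass in the ACR name you created earlier.

```shell
# Hypothetical helper: writes the override file shown above for a given ACR name.
write_my_values() {
  acr_name="$1"                     # your registry name, without .azurecr.io
  out_file="${2:-my_values.yaml}"   # optional output path
  cat > "$out_file" <<EOF
image:
  repository: ${acr_name}.azurecr.io/superset

initImage:
  repository: ${acr_name}.azurecr.io/busybox

# Set type to 'LoadBalancer' so we can browse it
service:
  type: LoadBalancer
  port: 80

# Load Superset Examples
init:
  loadExamples: true
EOF
}

# Usage (replace the registry name with your own):
#   write_my_values "[yourname]supersetacr"
```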
    

     

Deploy Superset to your AKS cluster

  1. Get the credentials to your AKS cluster so helm can deploy to it
    Use the az aks get-credentials command to download credentials for your AKS cluster:

    az aks get-credentials -n supersetaks -g supersetrg
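
To double-check that kubectl (and therefore helm) is now pointed at the new cluster, you can compare the current context with the cluster name. A small sketch, assuming the context is named after the cluster, which is what az aks get-credentials does by default; `using_context` is a made-up helper name.

```shell
# Hypothetical helper: succeeds when kubectl's current context matches the expected name.
using_context() {
  expected="$1"
  [ "$(kubectl config current-context)" = "$expected" ]
}

# Usage:
#   using_context supersetaks && kubectl get nodes
```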
    

     

  2. Deploy to AKS using helm
    First update your helm chart dependencies using the helm dependency update command:

    helm dependency update helm/superset
    

     

    Then install your helm chart using the helm upgrade --install command, which installs the release if it does not exist yet and upgrades it otherwise:

    helm upgrade --install --values my_values.yaml superset helm/superset
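
The deployment takes a few minutes to settle. One way to follow along from the terminal is to count the pods that are not yet Running or Completed. The sketch below assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...) and the current namespace; `pods_not_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: prints how many pods are neither Running nor Completed.
pods_not_ready() {
  # Column 3 of `kubectl get pods --no-headers` is the STATUS column
  kubectl get pods --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Usage:
#   pods_not_ready        # repeat until it prints 0
#   helm status superset  # shows the state of the release itself
```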
    

     

Check it out!

  1. View your AKS cluster in the Azure Portal and open Superset
    Some Azure services make it very easy for you to quickly jump from your CLI to view the resource in the Azure web portal; fortunately, AKS is one of those services.

    Open the portal to your AKS cluster with the az aks browse -n supersetaks -g supersetrg command.
    This will take you to the Workloads page where you will see the status of your Kubernetes workloads. Give it five minutes or so for everything to get deployed and to turn green.
    AKS Workloads page

    Then click on Services and ingresses.

    On the Services and ingresses page, you will find the External IP for your Superset website:
    AKS Services and ingresses page

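    Click the External IP to open Superset. Because my_values.yaml sets the service port to 80, you do not need to add a port number (the chart's default port is 8088). If everything worked, you will see the Superset login page below, where you can log in with the username admin and the password admin.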
    Superset login page 
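
If you prefer the terminal to the portal, the external IP can also be read with kubectl. This sketch assumes the service is called superset (which I would expect for a release named superset, but check with kubectl get services) and that it listens on port 80 as configured in my_values.yaml; `superset_url` is a made-up helper name.

```shell
# Hypothetical helper: prints the Superset URL once the load balancer has an external IP.
superset_url() {
  svc="${1:-superset}"   # assumed service name for a release called "superset"
  ip="$(kubectl get service "$svc" -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
  [ -n "$ip" ] || return 1   # no external IP assigned yet
  echo "http://$ip"
}

# Usage:
#   url="$(superset_url)" && curl -s "$url/health"
```

Superset serves a simple health check at /health, so the curl call is a quick way to confirm the site is answering before you open a browser.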

Conclusion

While I haven't tried out everything yet, I have connected to an external database and created charts and dashboards. I'm happy to say that performance has been great so far and I'm looking forward to seeing what I can create!
Here's one of the dashboards you can explore when you load the examples mentioned above:

USA Births Names Dashboard

Cleaning up the resources

Since AKS is not free, when you are done testing Superset you may want to delete the resources.

Earlier you created a resource group called supersetrg to hold your ACR and AKS resources. When you created your AKS cluster, a second resource group was created automatically with a name similar to this: MC_supersetrg_supersetaks_westus2.

You will need to delete both of these resource groups to prevent incurring additional costs.

# View all of your resource groups
az group list -o table

# Delete the resource groups for your AKS cluster
az group delete -g supersetrg
az group delete -g MC_supersetrg_supersetaks_westus2
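
If you want to script the cleanup, a small wrapper can skip groups that are already gone and avoid the confirmation prompt. This is a sketch rather than the only way to do it; `delete_group_if_exists` is a hypothetical helper name, while `az group exists` and the `--yes` and `--no-wait` flags are part of the Azure CLI.

```shell
# Hypothetical helper: deletes a resource group only if it still exists.
delete_group_if_exists() {
  group="$1"
  if [ "$(az group exists -n "$group")" = "true" ]; then
    # --yes skips the confirmation prompt; --no-wait returns without waiting for completion
    az group delete -n "$group" --yes --no-wait
    echo "deleting $group"
  else
    echo "$group not found"
  fi
}

# Usage:
#   delete_group_if_exists supersetrg
#   delete_group_if_exists MC_supersetrg_supersetaks_westus2
```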

 

Next steps
