DEV Community

Apache Pulsar on AKS quick setup for development

Haris Secic
software developer doing some architecture
・6 min read

Thanks to the folks at kafkaesque-io, there's https://github.com/kafkaesque-io/pulsar-helm-chart, which makes the setup much easier.

If you love Azure or Microsoft skip this part

I dislike Azure because of all the time I've wasted trying to configure it, only to eventually give up and either use whatever was possible or tell my employer that it can't be done on Azure.

Some examples include:

  1. It's impossible to set up MongoDB as a Docker image via Container Instances, which would have made a dev server easy. In short: you can't attach an Azure file share in a way that mongo can use. It creates a couple of folders and then gives up with "permission denied". Neo4j, however, handled it fine.
  2. Network. I had great pain setting up Spring Boot and Micronaut Neo4j clients, only to realise that App Services cannot keep a TCP connection idle for more than 230 seconds, and that Azure's default load balancer (or server, or whatever sits under these connections) doesn't close them properly. Instead it just drops them, leaving things like SDN/RX, which rely on TCP connections being closed properly, thinking the connection is still there. So you'll see a connection pending for 20 seconds until your app figures out that no Neo4j connection in the pool actually works, drops them and creates new ones. Acquire time has nothing to do with this, because the connections still look alive to the client. The only way to detect the problem is a transaction timeout, where the client knows that if X amount of time isn't enough for a transaction, there's a connection problem. The easy fix was to set a connection lifetime in the client settings so that connections end within 230 seconds (about 3.8 minutes). I set it to something like 228s, just to be sure a pooled connection gets replaced before it's dropped and ends up pending.
  3. Anything you touch or do with it burns. Every piece of software comes with instructions for AWS, and a lot of them for GCP. A LOOOOOOOOOOOOOOT come without any info on Azure, or with a special note listing a trillion steps to enable it somehow, with some of the features not available.

I really did consider quitting my job and changing careers just to be safe from the hell Microsoft puts its Azure clients through. Not to mention the Blob Storage hell and prices.

Preps

However, a job is a job, depression goes away and you still need to get stuff done. The latest thing was Apache Pulsar. I needed to get it up and running just to keep working, and I was blocking the whole team.

The good thing is that the kafkaesque.io guys put together a helm chart that has Azure included.

First of all, I want to mention that you should have at least 6 nodes in the AKS pool, on machines with 2 CPUs and 4GB RAM. Because of the company's subscription and the previous AKS setup, I had to remove the AKS cluster and create a new one with only 5 nodes of 2 CPUs + 4GB RAM. That turned out to be good enough.

Docker to speed up generating tokens prior to deploy

I actually used Docker to run my own standalone Pulsar and generate the keys upfront. If you do it that way, generate all the keys in Docker. You can jump into the Pulsar container like this:

docker container exec -it <<pulsar container id>> /bin/bash

Then generate everything as explained in the Authorisation section of the kafkaesque.io helm chart repo. I was too lazy to explore it in depth, so I just ran:

bin/pulsar tokens create-key-pair --output-private-key my-private.key --output-public-key my-public.key
bin/pulsar tokens create --private-key file:///pulsar/my-private.key --subject admin > admin.jwt
bin/pulsar tokens create --private-key file:///pulsar/my-private.key --subject superuser > superuser.jwt
bin/pulsar tokens create --private-key file:///pulsar/my-private.key --subject websocket > websocket.jwt
bin/pulsar tokens create --private-key file:///pulsar/my-private.key --subject proxy > proxy.jwt

By default, those guys set everything up to work with these names, so it's really important that the key files are named my-private.key and my-public.key. This assumes you want to configure as little as possible.
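Since the subjects matter, it can help to peek inside a token before shipping it anywhere. Below is a minimal sketch that decodes the payload of a JWT file with plain shell tools. The function name jwt_payload is my own invention, it assumes a base64 that understands -d (GNU coreutils does), and it only decodes the claims without verifying the signature.

```shell
# Print the decoded payload (claims) of a JWT file, e.g. admin.jwt.
# Hypothetical helper: it only decodes, it does NOT verify the signature.
jwt_payload() {
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(tr -d '\n' < "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the '=' padding; restore it so base64 accepts the input.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}
```

Running jwt_payload admin.jwt should print a small JSON object whose "sub" field is the subject passed to --subject.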

Next, you want to have all of those files on your PC for easier use:

docker cp <<your pulsar docker id>>:/pulsar/my-private.key my-private.key
docker cp <<your pulsar docker id>>:/pulsar/my-public.key my-public.key
docker cp <<your pulsar docker id>>:/pulsar/admin.jwt admin.jwt
docker cp <<your pulsar docker id>>:/pulsar/proxy.jwt proxy.jwt
docker cp <<your pulsar docker id>>:/pulsar/websocket.jwt websocket.jwt
docker cp <<your pulsar docker id>>:/pulsar/superuser.jwt superuser.jwt

The reason for copying the files one by one is so that you can see every file used here. You can use a shorthand command if you like.
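One possible shorthand is a small loop. This is just a sketch: the function name copy_pulsar_files is made up, and it assumes the files sit in /pulsar inside the container, as in the commands above.

```shell
# Copy the key pair and the four tokens out of a Pulsar container.
# Hypothetical helper; $1 is the container id or name shown by `docker ps`.
copy_pulsar_files() {
  for f in my-private.key my-public.key admin.jwt proxy.jwt websocket.jwt superuser.jwt; do
    # Stop at the first file that fails to copy.
    docker cp "$1:/pulsar/$f" "$f" || return 1
  done
}
```

Call it as copy_pulsar_files <<your pulsar docker id>> from the folder where you want the files to land.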

I also had the Kubernetes plugin for Visual Studio Code, which made it easy to look up my Azure AKS clusters, right-click on one and choose Merge into Kubeconfig. With Docker this was super easy, since you can then switch to AKS via Docker.

Why? Well, I'm too lazy to configure stuff, so this was easy. Also, on Windows Docker has a tray icon with a right-click Kubernetes option, where your AKS cluster shows up in the list after the previous step and you can switch to it with one click.

Push secrets to AKS

If you skipped the Docker part, please note that it's important to keep the same file names and secret names as used here. The chart is already set up to work with these names, and for development you probably won't care enough to change them, so:

kubectl create secret generic token-public-key --from-file=my-public.key --namespace pulsar
kubectl create secret generic token-private-key --from-file=my-private.key --namespace pulsar
kubectl create secret generic token-admin --from-file=admin.jwt --namespace pulsar
kubectl create secret generic token-superuser --from-file=superuser.jwt --namespace pulsar
kubectl create secret generic token-websocket --from-file=websocket.jwt --namespace pulsar
kubectl create secret generic token-proxy --from-file=proxy.jwt --namespace pulsar
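The commands above assume the pulsar namespace already exists and that all six files are in the current folder. Here is a rough sketch that checks both before pushing anything. The function name push_pulsar_secrets is hypothetical; the secret and file names are the same ones used above.

```shell
# Create the namespace if it is missing, then create one secret per file.
# Hypothetical helper; $1 is the namespace (defaults to pulsar).
push_pulsar_secrets() {
  ns="${1:-pulsar}"
  kubectl get namespace "$ns" >/dev/null 2>&1 || kubectl create namespace "$ns" || return 1
  # "secret-name:file-name" pairs, matching the commands above.
  for pair in token-public-key:my-public.key token-private-key:my-private.key \
              token-admin:admin.jwt token-superuser:superuser.jwt \
              token-websocket:websocket.jwt token-proxy:proxy.jwt; do
    name="${pair%%:*}"
    file="${pair#*:}"
    # Fail early instead of pushing a partial set of secrets.
    [ -f "$file" ] || { echo "missing $file" >&2; return 1; }
    kubectl create secret generic "$name" --from-file="$file" --namespace "$ns" || return 1
  done
}
```

Note that kubectl create secret fails if a secret with that name already exists, so on a second run you would have to delete the old secrets first.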

Installation - for development

For development means no persistent storage. Why? Well, it's too expensive.

If you want you can add your Container Registry from Azure first:

helm registry login <<yourazure>>.azurecr.io --username <<listed on azure>> --password <<so is your_pass>>

Next, add helm repo:

helm repo add kafkaesque https://helm.kafkaesque.io
helm repo update


Why update? Well, I have some form of OCD when it comes to updates, so I like to update as soon as I add a new repo. Hope that's fine.

If you want to enable authorisation, you need to add this extra line at the top of the example development YAML:

enableTokenAuth: yes

The result looks like this (I also removed Pulsar functions because I don't need them):

persistence: no
enableAntiAffinity: no
#this is missing in their repo file
enableTokenAuth: yes

zookeeper:
  resources:
    requests:
      memory: 512Mi
      cpu: 0.3 
  configData:
    PULSAR_MEM: "\"-Xms512m -Xmx512m -Dcom.sun.management.jmxremote -Djute.maxbuffer=10485760 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:+DisableExplicitGC -XX:+PerfDisableSharedMem -Dzookeeper.forceSync=no\""

bookkeeper:
  replicaCount: 2
  resources:
    requests:
      memory: 512Mi
      cpu: 0.3 
  configData:
    PULSAR_MEM: "\"-Xms512m -Xmx512m -XX:MaxDirectMemorySize=512m -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+UseG1GC -XX:MaxGCPauseMillis=10 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintHeapAtGC -verbosegc -XX:G1LogLevel=finest\""

broker:
  component: broker
  replicaCount: 1
  resources:
    requests:
      memory: 512Mi
      cpu: 0.3 
  configData:
    PULSAR_MEM: "\"-Xms512m -Xmx512m -XX:MaxDirectMemorySize=512m -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem\""

autoRecovery:
  resources:
    requests:
      memory: 1Gi
      cpu: 1

proxy:
  replicaCount: 1 
  resources:
    requests:
      memory: 512Mi
      cpu: 0.3 
  wsResources:
    requests:
      memory: 512Mi
      cpu: 0.3
  configData:
    PULSAR_MEM: "\"-Xms512m -Xmx512m -XX:MaxDirectMemorySize=512m\""

Now install the chart with these values. A little heads-up (for users on Windows and helm 3+, I think, but maybe for all of you): the docs in the GitHub repo linked above don't pass a release name to the install command, so add it like this:

helm install pulsar --namespace pulsar kafkaesque/pulsar --values 'C:\Users\<<YourUsername>>\Desktop\test_helm\dev_values.yml'


The first pulsar in the command is the name of the installation inside your AKS (helm calls this a release). At the time of writing it's missing from the example in the repo.
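If you end up tweaking the values file a few times, a re-runnable variant may be handy. This is only a sketch: the wrapper name install_pulsar is made up, and it relies on helm upgrade --install, which installs the release if it doesn't exist and upgrades it otherwise.

```shell
# Install the chart, or upgrade the release if it is already installed.
# Hypothetical helper; $1 is the path to the values file.
install_pulsar() {
  helm upgrade --install pulsar kafkaesque/pulsar \
    --namespace pulsar \
    --values "$1"
}
```

For example: install_pulsar dev_values.yml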

Wait a bit and do a quick check with "kubectl get pods --namespace pulsar". Once everything is initialised, you can use

kubectl expose service pulsar-proxy --namespace pulsar --type=LoadBalancer --name pulsar-exposed

This will get you an External IP for using Pulsar from outside AKS. Use that IP with port 6650 or 8080 and admin.jwt (I think this one is for basic use).
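For a quick smoke test, something like the sketch below could work. The function names are made up, and it assumes the proxy forwards admin REST calls on port 8080 and that the token you pass is allowed to list tenants. If admin.jwt gets rejected, superuser.jwt would be my next guess.

```shell
# Print the external IP of the exposed service (empty until Azure assigns one).
pulsar_external_ip() {
  kubectl get service pulsar-exposed --namespace pulsar \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# List tenants through the admin REST API using a token file.
# Hypothetical helper; $1 is the IP, $2 the token file (admin.jwt by default).
pulsar_list_tenants() {
  curl --fail --silent --show-error \
    -H "Authorization: Bearer $(tr -d '\n' < "${2:-admin.jwt}")" \
    "http://$1:8080/admin/v2/tenants"
}
```

Usage would be pulsar_list_tenants "$(pulsar_external_ip)" admin.jwt, run from the folder that holds the token files.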

Good luck.
