André

Posted on Apr 25, 2020

Syncing Kubernetes services on MicroK8s for local development with Python

#kubernetes #microk8s #python

Disclaimer: This was something I did because I was bored. There are probably better ways to do this (please share them if you know any!), and your mileage most definitely will vary.

The situation

Kubernetes is a common tool for deploying applications and services, and tools like Minikube or MicroK8s are great for developing applications locally with it.

However, there are still situations where developing outside the cluster has its advantages. One is less overhead: there is no need to sync files and directories or attach to remote debuggers. I like having as few layers as possible between my code and my debugger, and sometimes even Docker development itself is a pain I try to avoid.

The downside is that the application can't reach services running in the cluster by their name. That is a real issue for me: one of the projects I'm working on uses Kubernetes to provision resources dynamically, and each of those resources sits behind a service name I need to access, which sometimes makes it a pain to work with locally.

So, recently, I started researching solutions. Minikube is a good start, but since it is recommended to run it in a VM, it has too much overhead for some of the workflows I'm developing. So I tried MicroK8s, which is great, since it runs locally with sufficient isolation (it is installed as a Snap package on Linux). However, neither Minikube nor MicroK8s has any solution for DNS "outside the cluster" (that I know of).

Some suggested solutions

Usually the solution to this (at least on Linux) boils down to:

Run a local DNS server, or dnsmasq, configured to point the service host names to your machine;
Manually edit your machine's hosts file (/etc/hosts) to keep things in sync yourself. For example, Minikube's Ingress guide explicitly suggests you edit your hosts file.

Neither is great. For example, using dnsmasq or a local DNS server may interfere with VPNs, meaning you may not be able to access your cluster, or your VPN's services, or both; while manually editing hosts files is a pain, and requires administrator privileges.

An alternative is to use something like Telepresence (which sounds really promising, mind), but if you're developing outside containers, it may also mess with your DNS settings in such a way that VPNs will not work either.

What I want

This is what I want:

Be able to develop and debug my application without having to build an image and run it in a cluster (within reason);
Still be able to access the services which are deployed in the cluster by their name;
Not mess up the VPNs I need to access enterprise resources.

Since DNS is out of the question, I'm left with hosts files, so I'll use those.

Hosts files

A hosts file, briefly, is a text file containing a "table" that maps each IP address to one or more host names, like this:

127.0.0.1 localhost
192.168.1.1 my-router

These files exist on Windows, macOS, and Linux, and once updated, they are used for host name resolution, typically before your DNS server is consulted (note that there are exceptions to this, which I'll not cover here).
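To make the format concrete, here is a minimal sketch of how such a table could be read in Python. The function name parse_hosts and the sample text are made up for illustration; this is not how the operating system's resolver is implemented.

```python
def parse_hosts(text):
    """Return a dict mapping each host name to its IP address."""
    table = {}
    for line in text.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        entry = line.split('#', 1)[0].strip()
        if not entry:
            continue
        # The first field is the IP address, the rest are host names.
        ip, *names = entry.split()
        for name in names:
            # Keep the first mapping seen for a given name.
            table.setdefault(name, ip)
    return table


sample = """\
127.0.0.1 localhost
192.168.1.1 my-router router  # two names for one address
"""
print(parse_hosts(sample))
```

Each line may carry several names for the same address, which is why the sketch loops over everything after the first field.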

Types of Kubernetes Services

There are several ways of exposing a Deployment's containers with a Service. Here are some:

NodePort, which exposes the Service on a static port on each node (much like port forwarding, or the -p option for docker run)
ClusterIP, which assigns the Service a cluster-internal IP address (this is the default type)
LoadBalancer, which exposes the Service externally through a load balancer, typically one provisioned by a cloud provider.

There are also Ingress resources, which you can use to expose Services outside the cluster under specific host names. You should preferably use these if you want to expose services to the outside.

Demonstrating the problem

Assume you have an Nginx web server you want to expose:

$ kubectl create deployment nginx --image nginx
deployment.apps/nginx created

This will create a deployment and a Pod for Nginx:

$ kubectl get pods
NAME                                         READY   STATUS      RESTARTS   AGE
nginx-f89759699-2thhf                        1/1     Running     2          7h3m

To expose this deployment's Pods (to the cluster's pods and to the outside), we need a Service. We can create one like this:

$ kubectl expose deploy nginx --port 80 --name nginx-svc
service/nginx-svc exposed

You can now check that it was created:

$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
nginx-svc    ClusterIP   10.152.183.235   <none>        80/TCP    39s

If you request that IP address (note that it will be different in your case), you should see the Nginx welcome page:

$ curl 10.152.183.235
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

Now, the same nginx-svc is available to other Pods on the cluster via its name. You can test this by running another Pod on the cluster:

$ kubectl run -i -t --rm --image alpine name
If you don't see a command prompt, try pressing enter.
/ # apk add curl
<snip>
/ # curl nginx-svc
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

But if you do it outside the Cluster (i.e., on the host):

$ curl nginx-svc
curl: (6) Could not resolve host: nginx-svc

It fails, as there is no DNS entry for the service name outside the cluster.
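The same check can be made from Python, which is closer to what an application running on the host would experience. This is only a sketch: can_resolve is a hypothetical helper name, and the result depends entirely on your machine's resolver configuration.

```python
import socket


def can_resolve(name):
    """Return True if the system resolver finds an address for name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False


# Expected to be False on the host until nginx-svc is made resolvable.
print(can_resolve('nginx-svc'))
```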

The solution

If you add the IP address of the service to the hosts file, it will work:

$ echo '10.152.183.235 nginx-svc' | sudo tee -a /etc/hosts
10.152.183.235 nginx-svc
$ curl nginx-svc
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

Mind that this works because MicroK8s sets up a network interface and routes to allow traffic from your host machine to the cluster, and vice versa.

Anyway, this is great, but it means that I need to edit the hosts file manually. That's boring, so let's automate this:

Automating hosts file synchronization

We need two things for each service: its name and its IP address inside the cluster. I could use kubectl and shell scripts to retrieve the information I need, but I decided to use Python instead, which also lets me use the kubernetes library (installable with pip). After that, we can do something like this:

import kubernetes.client
import kubernetes.config

# The first thing we need to do is configure the library with our credentials. This will load them from ~/.kube/config.
kubernetes.config.load_kube_config()

core = kubernetes.client.CoreV1Api()

# Get the services from the server (I'll assume the default namespace)
services = core.list_namespaced_service(namespace='default').items

for service in services:
    print(f"Service {service.metadata.name} has IP {service.spec.cluster_ip}")

Running this, we get:

Service kubernetes has IP 10.152.183.1
Service nginx-svc has IP 10.152.183.235
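Before writing anything to a file, it may help to turn those pairs into hosts-file lines in a separate, testable step. The sketch below assumes plain (name, cluster_ip) tuples rather than the library's objects, and hosts_lines is a made-up name. It also skips services without a usable address, since headless services report the string "None" as their cluster IP.

```python
def hosts_lines(services):
    """Turn (name, cluster_ip) pairs into hosts-file lines."""
    lines = []
    for name, cluster_ip in services:
        # Skip services with no address: headless ones report "None",
        # and some types have no cluster IP at all.
        if not cluster_ip or cluster_ip == 'None':
            continue
        lines.append(f'{cluster_ip} {name}')
    return lines


pairs = [
    ('kubernetes', '10.152.183.1'),
    ('nginx-svc', '10.152.183.235'),
    ('some-headless-svc', 'None'),  # hypothetical headless service
]
print('\n'.join(hosts_lines(pairs)))
```

With the objects returned by the library, the pairs could be built as (s.metadata.name, s.spec.cluster_ip) for each service s.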

It's a good start! Now, we need to add this information to the hosts file. I'll do this in a very naive way to keep things simple, so:

import kubernetes.client
import kubernetes.config

# The first thing we need to do is configure the library with our credentials. This will load them from ~/.kube/config.
kubernetes.config.load_kube_config()

core = kubernetes.client.CoreV1Api()

# Get the services from the server (I'll assume the default namespace)
services = core.list_namespaced_service(namespace='default').items

# Open the hosts file in append mode, so that we can write to it.
# ATTENTION: DO NOT RUN THIS, IT MIGHT DAMAGE YOUR HOSTS FILE!
with open('/etc/hosts', 'a') as hosts_writer:
    # Write an empty line, in case the file doesn't end with a new line (in case you do run this)
    hosts_writer.write('\n')

    # The format of the hosts file is [IP] [Host Name], so we can write to it like this:
    for service in services:
        # Each entry needs its own line, so end it with a newline.
        hosts_writer.write(f'{service.spec.cluster_ip} {service.metadata.name}\n')

Again, do not run this! You might damage your hosts file, which can cause you trouble.

But, assuming you do manage to run this (sudo kind of breaks things with Kubernetes, thankfully), your hosts file should be updated. However, if you did manage to run this, do not run this repeatedly, as this will keep appending new lines to the hosts file.

To make this a bit safer, you could perhaps make the script mark added lines with a comment, and then remove them before adding new ones.
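Here is a rough sketch of that idea, written as a pure function over the file's text so it can be tried without touching /etc/hosts. The marker string and the name sync_hosts are hypothetical, and this is not necessarily what the script in the repository does.

```python
MARKER = '# k8s-hosts-sync'  # hypothetical marker; any unique comment works


def sync_hosts(current, entries):
    """Return hosts-file text with previously marked lines replaced.

    current: the existing file content
    entries: dict mapping host name -> IP address
    """
    # Keep every line that was not added by a previous run.
    kept = [line for line in current.splitlines()
            if not line.rstrip().endswith(MARKER)]
    # Tag each new line so the next run can find and remove it.
    added = [f'{ip} {name} {MARKER}' for name, ip in sorted(entries.items())]
    return '\n'.join(kept + added) + '\n'


before = '127.0.0.1 localhost\n10.152.183.200 old-svc # k8s-hosts-sync\n'
after = sync_hosts(before, {'nginx-svc': '10.152.183.235'})
print(after)
```

Running the function again on its own output with the same entries returns the same text, so repeated runs no longer pile up lines. Writing the result back to the hosts file still requires administrator privileges.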

Going further

Having the basic version of the script, we can go a bit further:

Add support for Ingress resources, such that they are also accessible from the host (if they have a defined host, of course);
Run this periodically in the Cluster, with a Cron job;
Add extra validation checks for sanity;
Add filters such that we only expose some hosts.
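As an illustration of the last item, filtering could be as simple as matching service names against shell-style patterns. This is a sketch under that assumption; select_hosts and the example patterns are invented here and may differ from what the repository implements.

```python
from fnmatch import fnmatchcase


def select_hosts(names, patterns):
    """Keep only the names matching at least one shell-style pattern."""
    return [name for name in names
            if any(fnmatchcase(name, pattern) for pattern in patterns)]


print(select_hosts(['kubernetes', 'nginx-svc', 'db-svc'], ['*-svc']))
# prints ['nginx-svc', 'db-svc']
```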

You can check what I did at this GitHub Repository.

Thanks for reading!

Top comments (1)

Ramiro Berrelleza • Apr 27 '20

Pretty cool! We are building an open source project that I think solves the 3 points you outline in the "What I Want" section. You should try it out! github.com/okteto/okteto