In Kubernetes, Nodes, Pods and Services all have their own IP addresses.
If you want to run the examples from this guide yourself, then you should create the following Deployment:
kubectl create deployment --image=nginx nginx-app --replicas=2
It's also expected that you have basic knowledge of Kubernetes, ReplicaSets and Deployments.
Listing all our created Pods with kubectl get pods, we get
NAME READY STATUS RESTARTS AGE
nginx-app-d6ff45774-7m9zg 1/1 Running 0 10d
nginx-app-d6ff45774-xp2tt 1/1 Running 0 9d
Let's find the IP of one of our Pods with
kubectl describe pod/nginx-app-d6ff45774-7m9zg
which returns
...
Name: nginx-app-d6ff45774-7m9zg
Namespace: default
Node: minikube/192.168.49.2
Start Time: Sun, 21 Nov 2021 12:09:04 +0100
Labels: app=nginx-app
pod-template-hash=d6ff45774
Status: Running
IP: 172.17.0.3
IPs:
IP: 172.17.0.3
Controlled By: ReplicaSet/nginx-app-d6ff45774
...
Here, we can read that one of the two NGINX Pods has the IP address 172.17.0.3, and this address is only reachable from inside the cluster.
The Pod IP address is dynamic: it will change every time the Pod is restarted or a new one is created as a replacement. For example, when a Pod crashes or is deleted and a ReplicaSet brings up a replacement, the new Pod has a different IP address from the terminated one. This makes Pod IP addresses unstable, which can result in application errors. The same goes for Nodes: if a Node dies, its Pods die with it, and the Deployment will create new ones.
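You can observe this ephemerality yourself: deleting one of the Pods makes the ReplicaSet immediately create a replacement with a new name and a new IP address (the Pod name below is from my cluster, yours will differ):
kubectl delete pod nginx-app-d6ff45774-7m9zg
kubectl get pods -o wide
The -o wide flag adds an IP column to the output, so you can compare the replacement Pod's address with the old one.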
Services
A Service in Kubernetes is an abstraction used to group Pods behind a single, consistent endpoint. As Pods can be recreated at any time, they are attached to Services to get a stable IP address. When a network request is made to a Service, it selects all Pods in the cluster matching the Service's selector, chooses one of them, and forwards the request to it. This consistent endpoint is tied to the lifespan of the Service and will not change while the Service is alive.
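To make this concrete, here is a sketch of what such a Service object looks like written out as YAML. We will create an equivalent Service with kubectl expose further below, so there is no need to apply this manifest yourself:
apiVersion: v1
kind: Service
metadata:
  name: nginx-app
spec:
  selector:
    app: nginx-app     # matches the app=nginx-app label on our Pods (see the describe output above)
  ports:
    - port: 80         # the Service's own, stable port
      targetPort: 80   # the container port traffic is forwarded to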
Running
kubectl get services
returns
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 10d
This is a default Service created during the initialization of our cluster. Every Kubernetes cluster comes with a special Service that provides a way for internal applications to talk to the API server. (Remember, running kubectl commands is talking to the API server.)
Kubernetes Service Types
There are different Service types used in Kubernetes. These Services differ in how they expose Pods internally or externally and in how they handle traffic. The different Service types are described below.
ClusterIP
ClusterIP exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster. ClusterIP is the default Service type if none is specified.
Let's create a ClusterIP Service exposing our Pods inside the cluster with a consistent IP address
kubectl expose deploy/nginx-app --port=80
where
- expose - creates a Service that exposes a Kubernetes resource (e.g. Pod, Deployment, ReplicaSet, ...)
- deploy/nginx-app - our nginx-app Deployment from before
- --port=80 - the port that the Service should serve on
One noteworthy additional flag is the --target-port flag. Compared to --port, --target-port is the port on the container that the Service should direct traffic to. If no --target-port is specified, it defaults to the same value as --port.
E.g. kubectl expose deployment my-app --port 80 --target-port 8080 would result in Service port 80 targeting container port 8080.
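If you want to see how these flags ended up in the Service object, you can print the Service that kubectl expose generated. Below is a trimmed sketch of the relevant part of the output; your clusterIP will differ:
kubectl get service nginx-app -o yaml
...
spec:
  clusterIP: 10.108.45.199
  ports:
  - port: 80          # from --port
    protocol: TCP
    targetPort: 80    # defaults to --port, since --target-port was not given
  selector:
    app: nginx-app
  type: ClusterIP
...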
Running kubectl get services
now returns
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 105
nginx-app ClusterIP 10.108.45.199 <none> 80/TCP 8s
Our two Pods are now accessible within the cluster using the 10.108.45.199 IP address.
We can test that by running an additional Pod in our cluster, whose only purpose is to run a curl command against that IP:PORT address. In theory, curl should return the HTML of the NGINX welcome page, since the NGINX container by default serves a welcome website on port 80.
curl
cURL, which stands for client URL, is a command-line tool that developers use to transfer data to and from a server. At its most fundamental, cURL lets you talk to a server by specifying the location (in the form of a URL) and the data you want to send.
E.g. running curl http://example.com would return the HTML structure of example.com. You could also get more information, such as headers, ...
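For example, to fetch only the response headers instead of the HTML body, you can ask cURL to make a HEAD request (output shortened; the exact headers depend on the server):
curl -I http://example.com
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
...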
To run an additional Pod in our cluster without a deployment, you can run a one time Pod with a shell:
kubectl run -it network-shell --image=praqma/network-multitool --restart=Never sh; kubectl delete pod network-shell
where:
- kubectl run - create and run a particular image in a Pod, without a Deployment
- -it - run the container in the Pod in interactive mode, which allows you to interact with the container's shell
- network-shell - our temporary Pod name
- --image praqma/network-multitool - a minimalistic Docker image (18 MB in size) with some network tool commands, including curl
- sh - the command we want to run in interactive mode, here to access the shell in the network-shell container
- --restart=Never - a Pod without a Deployment is created (default value = "Always", which creates a Deployment)
The --rm flag, which you may find online, does not seem to work on every system.
Therefore, by combining the run and delete commands with ; (e.g. kubectl run ... ; kubectl delete ...), we create a one-time Pod, access its shell, and automatically remove it afterwards.
Attention
All flags after sh will be passed as flags to sh. Order matters.
From within that one-time Pod, you may now run
curl 10.108.45.199
which returns the NGINX welcome page HTML structure
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
...
NodePort
NodePort exposes the Service on a static port on each Node, which makes a Pod reachable from outside the cluster. It receives external requests from clients or users and maps them to the Service's port. The node port can be set manually, in which case it must be in the range 30000-32767, or it is auto-assigned if not specified.
Let's expose our Pod to the outside world:
kubectl expose deploy/nginx-app --port=80 --name=nginx-app-np --type=NodePort
Under the hood, when creating a NodePort, a ClusterIP is also created, which establishes the connection to the Pods in the cluster. The ClusterIP is not shown as a separate Service; instead, the NodePort connects to its ClusterIP, which then connects to the Pod IPs.
Running kubectl get service
:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 106d
nginx-app ClusterIP 10.108.45.199 <none> 80/TCP 8s
nginx-app-np NodePort 10.100.142.119 <none> 80:32358/TCP 2s
which returns six columns: NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S) and AGE.
We now see our new NodePort Service nginx-app-np with the type NodePort. But if you look closely at nginx-app-np's CLUSTER-IP column, you see that it has an IP address instead of being empty. This is the IP address of the hidden ClusterIP that was created with our NodePort Service.
The PORT(S) value for our nginx-app-np NodePort Service is 80:32358, where 80 is the hidden ClusterIP's port number and 32358 is the NodePort's external port number. The --port flag's sole purpose in the NodePort's creation was to specify the hidden ClusterIP's port (and also the target port on the container, if not specified with --target-port).
Now, when somebody connects from outside the cluster on port 32358, the traffic gets forwarded to the hidden ClusterIP on 10.100.142.119:80 and finally to the Pod, which exposes its application on port 80 as well.
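Note that kubectl expose picks the node port for you from the allowed range. If you want to pin it, you have to write the Service manifest yourself. A rough sketch equivalent to our nginx-app-np Service, with the node port fixed to the value we happened to get, would look like this (for illustration only, there is no need to apply it):
apiVersion: v1
kind: Service
metadata:
  name: nginx-app-np
spec:
  type: NodePort
  selector:
    app: nginx-app
  ports:
    - port: 80          # the hidden ClusterIP's port
      targetPort: 80    # the container port
      nodePort: 32358   # the externally reachable port on every Node (must be within 30000-32767)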
You should be able to run the same curl
command from your host computer:
- minikube users - If you use minikube, remember that you're running Kubernetes from a VM. This VM has an IP address on your machine, which you can retrieve with the minikube ip command.
- microk8s users - If you use microk8s (on Linux), you should be able to use localhost, since Kubernetes is natively installed without a VM. If on Mac or Windows, you need to retrieve the VM's IP address.
Running minikube ip on my machine returns the 192.168.49.2 IP address, which is my cluster's IP address.
Therefore, executing curl 192.168.49.2:32358 (or curl localhost:32358 for microk8s users) once again returns the NGINX welcome HTML structure <!DOCTYPE html><html><head><title>Welcome to nginx!</title>...
The same goes for the hidden ClusterIP created with the NodePort. Run the one-time Pod again:
kubectl run -it network-shell --image=praqma/network-multitool --restart=Never sh; kubectl delete pod network-shell
Use the ClusterIP address from our nginx-app-np NodePort row, 10.100.142.119, and run the curl command on port 80, i.e. curl 10.100.142.119:80.
NodePorts aren’t often ideal for public Services. They use non-standard ports, which are unsuitable for most HTTP traffic. You can use a NodePort to quickly set up a service for development use or to expose a TCP or UDP service on its own port. When serving a production environment to users, you’ll want to use LoadBalancer or Ingress.
LoadBalancer
The LoadBalancer Service automatically integrates with load balancers provided by cloud providers. The load balancer spreads incoming traffic evenly across the Kubernetes cluster's Nodes, and the cluster then routes it on to the Pods and their container instances.
The advantage of using a load balancer here is that if a Node fails, the traffic is directed to the remaining healthy Nodes, which reduces the effect on users.
Under the hood, when creating a LoadBalancer, a NodePort and a ClusterIP are also created, providing the external and internal connections.
For local development, Docker Desktop provides a built-in LoadBalancer.
If you're on minikube or microk8s, LoadBalancers won't work and the service will just say "pending", but NodePort and ClusterIP work.
A LoadBalancer Service can be created by using the --type=LoadBalancer flag.
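For example, if your environment provides a load balancer, you can create the (optional) nginx-app-lb Service that is referenced later in this guide with:
kubectl expose deploy/nginx-app --port=80 --name=nginx-app-lb --type=LoadBalancer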
It is also important to note that a "Load Balancer" is not something Kubernetes provides out of the box. If you are using a cloud provider (like AWS, GKE, DigitalOcean, etc.), your Kubernetes installation is managed and a load balancer is provided as part of the offering. But load balancers are not present in a bare-metal cluster that you provision on your own. There are workarounds provided by testing tools, like minikube. Other than that, you will need to manage external IPs manually or deploy a third-party solution, like MetalLB.
Domain Name System (DNS) in Kubernetes
DNS maps an easily remembered name such as "example.com" to an IP address that might change over time. Every network has a DNS server, but Kubernetes implements its own DNS within the cluster to make connecting to containers a simple task.
Before Kubernetes version 1.11, the Kubernetes DNS service was based on kube-dns. Version 1.11 introduced CoreDNS to address some security and stability concerns with kube-dns.
Each time you deploy a new service or pod, the Kubernetes DNS sees the calls made to the Kube API and adds DNS entries for the new service or pod. Then other containers within a Kubernetes cluster can use these DNS entries to access the service or pod.
It may also happen that only Services, and not Pods, receive a DNS entry.
Service
Services get a DNS entry consisting of the Service name, followed by the namespace, followed by svc.cluster.local.
Syntax
<service-name>.<namespace>.svc.<cluster-domain>.local
When we created the nginx-app Service, an nginx-app.default.svc.cluster.local entry was added to the Kubernetes DNS, where default is the default Kubernetes namespace (more about namespaces later) and cluster.local is the default cluster domain.
Pod
Similarly pods get entries of pod IP address appended to the namespace and then appended to .pod.cluster.local
. The pod IP addresses dots 172.12.0.3
must be replaced with dashes 172-12-0-3
.
Syntax
`<pod-ip-address>.<namespace>.pod.<cluster-domain>.local`
When we created our nginx-app Deployment, one of the created Pods, nginx-app-d6ff45774-7m9zg, got assigned the 172.17.0.3 IP address. The entry 172-17-0-3.default.pod.cluster.local was then added to the Kubernetes DNS.
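Assuming Pod DNS is enabled in your cluster (how to check follows in the next section) and the Pod still holds that IP address, you can verify the entry from the one-time Pod:
nslookup 172-17-0-3.default.pod.cluster.local
which should answer with the 172.17.0.3 address.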
How to check if Pod DNS is enabled in your Cluster?
CoreDNS disables Pod DNS by default in a Kubernetes cluster, but many local Kubernetes tools (e.g. minikube) and cloud Kubernetes services (e.g. AWS EKS) have it enabled to stay compatible with kube-dns.
CoreDNS uses the server block syntax (same as NGINX config) for its configuration. The pods property in the configuration (also known as POD-MODE) determines whether Pod DNS is enabled or not.
The three valid pods values are:
- disabled - Default. Do not process Pod requests, always returning NXDOMAIN (the domain name can't be resolved by a DNS server).
- insecure - Default for minikube, AWS, ... . Always return an A record with the IP from the request, even if the Pod IP address doesn't exist. This option is provided for backward compatibility with kube-dns.
- verified - Return an A record if there exists a Pod in the same namespace with a matching IP. This option requires substantially more memory than insecure mode, since it will maintain a watch on all Pods.
The insecure setting is vulnerable to abuse if used maliciously in conjunction with wildcard SSL certs. This means that if you handle traffic with Pod DNS entries instead of Services, you're vulnerable to DNS spoofing. But this should not be the case, since we learned that you should handle traffic with Services, as they have non-ephemeral IP addresses.
DNS spoofing in this context means that a malicious Pod intercepts incoming traffic and returns malicious code or applications instead of the original ones.
As long as you know that every developer in the cluster handles traffic with Services and not Pods, you should not be too worried about pods insecure, as it exists for backward compatibility with kube-dns and is pretty much the industry standard (e.g. AWS EKS uses insecure).
Let's inspect our Kubernetes CoreDNS configuration
kubectl get -n kube-system configmaps/coredns -o yaml
which should return something along the lines of
...
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
...
We can see that we have Pod DNS enabled.
Resolving DNS entries
The diagram below shows a pair of Services and a pair of Pods that are deployed across two different Namespaces. Below are the DNS entries that you can expect to be available for these objects, assuming Pod DNS is enabled for your cluster.
Any Pod looking for a Service within the same namespace can just use the common name of the Service (e.g. ninja) instead of the fully qualified domain name (FQDN, the most complete domain name that identifies a host or server), e.g. hollowapp.namespace1.svc.cluster.local. Kubernetes will add the proper DNS suffix to the request if one is not given (explained in Kubernetes & Unix networking below).
To test this with our nginx-app
setup, run again the command to create the one time Pod:
kubectl run -it network-shell --image=praqma/network-multitool --restart=Never sh; kubectl delete pod network-shell
We know that we currently have three Services: the ClusterIP Service nginx-app, the NodePort Service nginx-app-np, and the (optional) LoadBalancer Service nginx-app-lb.
In our one-time Pod, run
nslookup nginx-app
where nginx-app is the name of our Service and nslookup is a command for obtaining the IP address of a domain name (and vice versa) from DNS servers.
The results show the ClusterIP address of the Service. Also, pay close attention to the FQDN, which follows the service DNS syntax that we've previously seen.
$ nslookup nginx-app
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: nginx-app.default.svc.cluster.local
Address: 10.111.70.23
Kubernetes & Unix networking
The following section explains some basics of Unix networking, which help to further understand Kubernetes DNS and Services.
We will look at nslookup
and how it can resolve nginx-app
without the need for the fully qualified domain name.
The resolv.conf file
DNS resolution inside a container - like any Linux system - is driven by the /etc/resolv.conf
config file.
You can check your local (Linux only) host machine's resolv.conf file with
cat /etc/resolv.conf
Returning
nameserver 127.0.0.53
options edns0 trust-ad
Here:
- nameserver - points to the IP address of a DNS server, which in turn translates domain names into IP addresses. The IP 127.0.0.53 is a local DNS cache resolver, i.e. a temporary database that stores information about previous DNS lookups. Whenever you visit a website, your OS and web browser keep a record of the domain and the corresponding IP address, increasing performance compared to always querying a remote DNS server.
- options edns0 trust-ad - options that are insignificant for our purposes
But the resolv.conf file has one more important part, the search directive.
Let's have a look at a more complete resolv.conf file:
search eng.matata.com dev.matata.com labs.matata.com
options ndots:2
nameserver 10.436.22.8
- search eng.matata.com dev.matata.com labs.matata.com - contains a list of domain names, each of which will be appended to a domain name query. (E.g. when querying for banana, the search directive would try banana.eng.matata.com, banana.dev.matata.com, ...)
- options ndots:2 - the number of dots a queried name must contain before it is treated as an absolute name (e.g. chef.example.com has two dots). If the number of dots reaches or surpasses the ndots value, the remote DNS server is first asked to resolve the domain name, and only if that fails is the search directive used to perform a local DNS resolution. Names with fewer dots go through the search directive first.
- nameserver 10.436.22.8 - when using the search directive, forward the query about A records, CNAMEs, ... to the DNS server at the 10.436.22.8 IP address.
If a domain name can't be resolved by a DNS server (i.e. no A record, CNAME, ... exists), then it will return an NXDOMAIN (non-existing domain) error response.
A-Record vs CNAME record
DNS servers use records to point a domain name to an IP address.
An A record maps a name to one or more IP addresses. A CNAME maps a domain name to another domain name.
It's usual that an A record is created for a root domain, e.g. example.com -> 102.192.22.102, and a CNAME record for subdomains such as banana.example.com -> example.com. It's then up to the server to handle requests for the two domains and serve the right content for each.
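You can also ask nslookup for a specific record type to see how a name is mapped (banana.example.com is the hypothetical record from the example above, so substitute a real domain when trying it):
nslookup -type=A example.com
nslookup -type=CNAME banana.example.com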
Example: Querying for hakuna
Using the previous resolv.conf file, imagine we know that a CNAME exists for hakuna.dev.matata.com and we want to find the IP of the server.
hakuna has fewer dots than ndots:2, so the resolver first performs a local search.
Running
nslookup hakuna
executes a local search in the following order:
- hakuna.eng.matata.com -> fails with NXDOMAIN
- hakuna.dev.matata.com -> success and stops here, the DNS server returns the record for hakuna.dev.matata.com
- hakuna.labs.matata.com -> never queried, since the search already succeeded
Example: Querying for aws.amazon.com
aws.amazon.com has two dots and thus reaches the ndots:2 threshold, so it's considered an absolute name (= FQDN) and is first resolved with a remote DNS server; only if that fails is it resolved locally with the search directive.
Resolving nginx-app
Let's inspect the /etc/resolv.conf of a Pod, using the one-time Pod again.
kubectl run -it network-shell --image=praqma/network-multitool --restart=Never sh; kubectl delete pod network-shell
Inside the Pod, run
cat /etc/resolv.conf
which should return
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
Here 10.96.0.10
is our nameserver IP address, which in theory should be CoreDNS. We can investigate that nameserver IP address by running
kubectl get service -n kube-system
where
- -n - shorthand for --namespace, allows you to specify the namespace in which to look for Services
- kube-system - the namespace for Services, Pods, ... created by the Kubernetes system
That command should return
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 114d
We have an IP address match! But hold up a second. Why is the Service name kube-dns and not coredns?
The system nameserver Service name was kept as kube-dns for backwards compatibility, even though it now uses CoreDNS under the hood.
We can verify this by running
kubectl get pods -l k8s-app=kube-dns
which returns:
NAME READY STATUS RESTARTS AGE
coredns-558bd4d5db-rvvst 1/1 Running 2 114d
The -l flag stands for labels and is covered later on. But basically, the command returns the Pods which are served by the kube-dns Service.
Now that we understand our 10.96.0.10 IP address: the resolver applies the search directive, since nginx-app has fewer dots than the options ndots:5 setting, and tries the entries in order.
- nginx-app.default.svc.cluster.local -> success and stops here, DNS returns the ClusterIP address of the Service
- nginx-app.svc.cluster.local -> never queried, since the search already succeeded
- nginx-app.cluster.local -> never queried
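You can confirm this from the one-time Pod: the fully qualified name resolves to the same ClusterIP as the short name, while a suffix the search never needed does not resolve on its own (a sketch; your addresses will differ):
nslookup nginx-app.default.svc.cluster.local   # returns the same ClusterIP as nslookup nginx-app
nslookup nginx-app.svc.cluster.local           # fails with NXDOMAIN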