To achieve high availability and make better use of resources, pods should be spread across multiple failure domains / availability zones. You can use the topologySpreadConstraints feature in OpenShift to achieve this.
Here I will explain how to do this by labeling the nodes and defining a topology spread constraint for the pods.
1. Label the nodes as per the zones they are located in
oc label node worker0 zone=az0
oc label node worker1 zone=az1
oc label node worker2 zone=az1
oc label node worker3 zone=az2
oc label node worker4 zone=az2
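To verify that the labels were applied, you can display the zone label as an extra column when listing the nodes:
oc get nodes -L zone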
2. Define the topology spread constraint in the deployment YAML
apiVersion: apps/v1
kind: Deployment
metadata:
spec:
  …
  template:
    metadata:
      …
    spec:
      containers:
      …
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: app1
      …
where,
maxSkew: the maximum permitted difference in the number of matching pods between any two zones. It controls how strict the pod distribution should be.
topologyKey: the node label whose value identifies the zone. In our example, it is zone.
whenUnsatisfiable: how the scheduler handles a pod that cannot satisfy the spread constraint. ScheduleAnyway places the pod anyway while minimizing the skew, whereas DoNotSchedule would leave it pending.
labelSelector: the pods to count for the distribution, selected by their labels. In this example, the pods are created with the label app: app1 using the spec.template.metadata.labels field in the same deployment YAML, as shown below.
apiVersion: apps/v1
kind: Deployment
metadata:
spec:
  …
  template:
    metadata:
      …
      labels:
        app: app1
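For reference, here is how the two snippets fit together in a single manifest. This is a minimal sketch; the deployment name, replica count, selector, and container image (nginx) are placeholders I have added for illustration and are not part of the original example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app1          # placeholder name
spec:
  replicas: 5         # five replicas, matching the example distribution below
  selector:
    matchLabels:
      app: app1
  template:
    metadata:
      labels:
        app: app1     # label referenced by the labelSelector below
    spec:
      # Spread the app1 pods across the zones defined by the "zone" node label
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: app1
      containers:
      - name: app1
        image: nginx  # placeholder image
You can apply it in the usual way, for example with oc apply -f deployment.yaml.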
After making these changes, the pods will be evenly distributed across the zones. Within a zone, the worker node is selected based on the capacity available on the worker nodes, so it is quite possible that one worker node ends up with more pods than another worker node in the same zone.
In the picture below, we can see how the 5 pods are distributed across the 3 availability zones.
- Pod0 is assigned to az0, which contains only one worker node, worker0.
- Pod1 and Pod2 are assigned to az1, which contains two worker nodes, worker1 and worker2. Both pods land on worker1 based on their resource requirements and the capacity available.
- Pod3 and Pod4 are assigned to az2, which contains two worker nodes, worker3 and worker4. Both pods land on worker4 based on their resource requirements and the capacity available.
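You can check the placement yourself by listing the pods together with the node each one was scheduled to (the label selector assumes the app: app1 label used in the deployment above):
oc get pods -l app=app1 -o wide
The NODE column of the output shows which worker node each pod landed on.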