Sawit M.

Posted on Jan 27, 2020

LFS258 [3/15]: Kubernetes Architecture

#lfs258 #kubernetes #docker #devops

TL;DR

Main Components
- Master Nodes are the nodes that run the control-plane pods. A single node is enough, but production clusters usually have three or more. Pods that run on a master node include kube-apiserver, etcd, coredns, kube-controller-manager, and cloud-controller-manager.
- Worker Nodes are the nodes that run workload pods.
- Services make pods easier to consume: they act as a load balancer, provide HA, or expose pod ports to traffic from outside the cluster.
- Controllers keep the objects they are responsible for in the state we declared. For example, if we ask for 2 instances of a pod and for any reason only 1 is left, the controller starts another instance to get back to 2. This kind of controller is called a ReplicaSet.
- Pods are the smallest unit Kubernetes can control. A Pod wraps containers and provides them with an IP, storage access, a loopback interface, and so on. A pod can hold several containers, but usually only one is the main app; the others are supporting apps called sidecars.
- Containers are applications packaged together with their dependencies, so they can run anywhere a container runtime is available.
- Network is what makes pod-to-pod communication possible, whether the pods are on the same node or on different nodes. It can rely on the infrastructure, as in GKE, or on overlay network software.
Component Review
Mesos is another container orchestrator, from Apache.

Main Components

Kubernetes is designed around the Twelve-Factor App principles: an application is split into small modules by function, and the modules talk to each other through APIs.

To reduce complexity, Kubernetes uses kube-apiserver as the central point of the control plane for receiving API calls. It is effectively the frontend of the control plane.

The diagram above shows the high-level architecture of a Kubernetes cluster. It consists of Master Nodes and Worker Nodes, with the API as the central point that receives commands from users. A command is passed on to the relevant modules in the Kubernetes master to be carried out. Commands that involve container workloads are sent on to the Worker Nodes, and when a new container has to be created, the worker node downloads the image from an image registry before running it.

Kubernetes components can be grouped as follows:

Master Nodes
Worker Nodes
Services
Controllers
Pods
Containers
Network

Master Nodes

Master nodes host the modules that make up the control plane, listed below.

`kube-apiserver`

It is the central point for managing a Kubernetes cluster. It:

acts as the master process of the cluster
serves as the frontend for querying the state of objects in the cluster
accepts create, delete, modify, and state-query requests for objects from inside and outside the cluster
validates those requests
is the frontend for etcd (pronounced "et-see-dee"), the key-value database cluster of Kubernetes; kube-apiserver is the only component that talks to etcd

Starting with v1.16, as an alpha feature, server-initiated traffic can be separated from user-initiated traffic. Until this matures, most Kubernetes network plugins mix the two kinds of traffic, which affects the performance, capacity, and security of the system.

`kube-scheduler`

A process that runs the algorithm for assigning Pods to worker nodes. It takes into account:

Quota restrictions: the CPU and memory limits set for the cluster and for namespaces
Taints and tolerations: preferences that state which pods a node accepts and which nodes a pod may run on
Labels: metadata on nodes and pods, stored as key-value pairs, that makes nodes and pods easier to manage
Worker node resources: current load and forecast load, according to the algorithm

A Pod whose status is Pending usually means kube-scheduler has not been able to find a worker node for it to run on.
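The filtering part of this decision can be sketched in Python. This is a simplified illustration and not kube-scheduler's actual algorithm: the `Node` and `Pod` classes, their fields, and `feasible_nodes` are hypothetical names, and only CPU, memory, and taints are considered.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_millis: int              # unallocated CPU in millicores
    free_mem_mib: int                 # unallocated memory in MiB
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    name: str
    cpu_millis: int                   # requested CPU
    mem_mib: int                      # requested memory
    tolerations: set = field(default_factory=set)


def feasible_nodes(pod, nodes):
    """Return the names of nodes that could run the pod (hypothetical filter step)."""
    result = []
    for node in nodes:
        fits = (node.free_cpu_millis >= pod.cpu_millis
                and node.free_mem_mib >= pod.mem_mib)
        tolerated = node.taints <= pod.tolerations   # every taint must be tolerated
        if fits and tolerated:
            result.append(node.name)
    return result


nodes = [
    Node("worker-1", free_cpu_millis=500, free_mem_mib=1024),
    Node("worker-2", free_cpu_millis=4000, free_mem_mib=8192, taints={"gpu-only"}),
]
web = Pod("web", cpu_millis=1000, mem_mib=512)
trainer = Pod("trainer", cpu_millis=1000, mem_mib=512, tolerations={"gpu-only"})

print(feasible_nodes(web, nodes))       # no node qualifies, so the pod would stay Pending
print(feasible_nodes(trainer, nodes))
```

In this sketch `web` finds no node because `worker-1` is too small and `worker-2` carries a taint it does not tolerate, which is the situation that leaves a Pod in Pending.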

`Etcd Database`

etcd is the persistent storage that holds the state of the cluster and the network, along with any other data that must be kept permanently.

etcd is a distributed database built on a B+tree key-value store. Data is only ever appended; old copies are flagged and removed later by etcd's own internal compaction process. We can work with etcd using curl or any ordinary HTTP library.

Any change to data in etcd must reference the current version of that data. If several requests try to update the same data at the same time, kube-apiserver serializes them. The first request succeeds and thereby changes the data's version; the next request gets a 409 error because the version it wants to update no longer matches the current one. The client that sent the update has to handle this error itself.
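The version check described above is a form of optimistic concurrency. The following Python sketch imitates it with an in-memory store; `Store`, `Conflict`, and `update_with_retry` are hypothetical names for illustration and are not part of the etcd or Kubernetes APIs.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 response."""


class Store:
    """Toy in-memory store; each key holds a (version, value) pair."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key, (0, None))

    def update(self, key, expected_version, value):
        current_version, _ = self.get(key)
        if expected_version != current_version:
            raise Conflict(f"{key}: expected v{expected_version}, found v{current_version}")
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def update_with_retry(store, key, change, attempts=3):
    """Re-read the current version and retry when another writer got in first."""
    for _ in range(attempts):
        version, value = store.get(key)
        try:
            return store.update(key, version, change(value))
        except Conflict:
            continue
    raise Conflict(f"{key}: gave up after {attempts} attempts")


store = Store()
store.update("replicas", 0, 2)        # first writer succeeds, version becomes 1

try:
    store.update("replicas", 0, 5)    # second writer still holds version 0
except Conflict as err:
    print("conflict:", err)

new_version = update_with_retry(store, "replicas", lambda v: v + 1)
print(new_version, store.get("replicas"))
```

The retry helper shows one way a client might handle the error: read the object again, reapply the change to the fresh copy, and resubmit.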

Although etcd is a highly durable distributed database, using a tool such as kubeadm to upgrade the cluster may still have a small impact on it.

`kube-controller-manager`

The core control loop daemon. It talks to kube-apiserver to check the state of the cluster; if an object's state does not match its desired state, it calls the controller responsible for that object to bring the object back to the desired state.

Examples of controllers include endpoint, namespace, and replication.
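A control loop of this kind boils down to comparing desired state with observed state and acting on the difference. Here is a minimal Python sketch for a replica count; `reconcile`, `start_pod`, and the pod names are hypothetical and only illustrate the idea.

```python
import itertools


def reconcile(desired_replicas, running_pods, start_pod, stop_pod):
    """Drive the observed pod list toward the desired count and return the new list."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:
        pods.append(start_pod())
    while len(pods) > desired_replicas:
        stop_pod(pods.pop())
    return pods


ids = itertools.count(1)


def start_pod():
    # Pretend to launch a pod and return its name.
    return f"web-{next(ids)}"


stopped = []

# One instance is left but two are desired: the loop starts one more.
pods = reconcile(2, ["web-0"], start_pod, stopped.append)
print(pods)

# Scaling down to one instance stops the newest pod.
pods_after_scale_down = reconcile(1, pods, start_pod, stopped.append)
print(pods_after_scale_down, stopped)
```

A real controller runs this comparison continuously against the state reported by kube-apiserver rather than once against a local list.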

`cloud-controller-manager`

Still a beta feature as of v1.16, cloud-controller-manager takes over tasks that used to be handled by kube-controller-manager. Splitting it out reduces the impact on the core application when cloud provider APIs change, and makes changes that support cloud provider APIs faster to deliver.

To use cloud-controller-manager we first pick a cloud provider. There are two kinds:

in-tree: the provider is on the list in Kubernetes core, so it can simply run as a DaemonSet
out-of-tree: the provider is not on that list; we can implement it ourselves, using Developing Cloud Controller Manager and Kubernetes Cloud Controller Manager as examples

Then we add options to these processes:

kubelet needs the option --cloud-provider=external
cloud-controller-manager needs the option --cloud-provider=[YOUR_CLOUD_PROVIDER]

Then restart the cluster.

Worker Nodes

Servers that run workload pods. They must always run the basic applications kubelet and kube-proxy, plus a container runtime such as docker or rkt.

`kubelet`

A process that talks to the container engine and to kube-apiserver in order to:

make sure the containers that should run on the node are running
create, modify, and delete resources such as volumes and secrets as instructed by kube-apiserver
report the status of Pods and resources back to kube-apiserver

kubelet also handles topology-aware resource assignment. It calls other components, such as the Topology Manager, for the information it needs to allocate resources like CPUs and hardware accelerators in a way that suits the topology. This feature is not enabled by default because it is still alpha.

If the cluster is set up with kubeadm, kubelet and the container runtime are the only processes that run as ordinary processes on worker and master nodes. A process manager such as supervisord or systemd monitors them.

`kube-proxy`

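A process that manages network connectivity to containers. By default it does this with iptables rules, and it also supports two other modes:

userspace: kube-proxy watches Services and Endpoints and proxies the traffic itself through a randomly chosen port
ipvs: uses Linux IPVS to distribute traffic to containers (the course material lists this as an alpha feature, although upstream Kubernetes marked IPVS mode generally available in v1.11)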

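Kubernetes does not ship with a cluster-wide logging system. Fluentd, another CNCF project, is the usual choice for a unified logging layer that ships logs to Elasticsearch. Together with Kibana this combination is commonly called the EFK stack; the ELK stack uses Logstash in place of Fluentd.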

Services

Components that make decoupling easier. A Service groups resources together and reconnects to them when they are added, removed, or restarted.

Services are responsible for the following:

Connect Pods together: a flexible, scalable agent that groups pods and reconnects as pods come and go
Expose Pods to Internet: in NodePort and LoadBalancer mode, a service distributes traffic across the pods in the same group
Decouple Settings: a single point of contact for a group of pods that serve the same function
Define Pod Access Policy: enforce policies on access to pods

ClusterIP example

App is the main container, which runs the Pod's logic
Logger is a sidecar that ships logs to the log server
pause is a container used to reserve the IP before the other containers start. It is not visible in Kubernetes, only at the container runtime level, for example with docker or crictl
ClusterIP svc 1 is a service that can both be exposed as a NodePort and be connected to an Ingress Controller to receive traffic from outside the cluster
ClusterIP svc 2 is a service that connects to backend pods on the same server

Controllers

Controllers are a very important part of cluster orchestration, and we can also write our own. A controller works as follows:


Source: github.com

Note:

Reflector: talks to the Kubernetes API to watch the resource it is responsible for (a Kind such as namespace, endpoint, serviceaccount, or pod) and pushes what it sees into the Delta FIFO queue (a FIFO queue that supports deletion). Each entry has one of 3 statuses: Added, Updated, or Deleted
Informer: pops objects from the Delta FIFO queue and hands them to the Indexer to be stored, receives the object's index back, then sends an event carrying that index to the controller for further processing
Indexer: puts the object into a thread-safe store and returns its index for later use
Workqueue: hands indexes out to workers, with support for rate limiting, delayed queueing, and timed queueing
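The flow between these four parts can be imitated in a few lines of Python. This is only a toy model of the idea: the real components live in the Go client-go library, and every name below (`reflector_push`, `informer_step`, `worker_step`, the dictionary layout) is hypothetical.

```python
from collections import deque

delta_fifo = deque()     # (event_type, object) pairs pushed by the reflector
store = {}               # the indexer's store: key -> latest object
workqueue = deque()      # keys waiting for a worker


def reflector_push(event_type, obj):
    """Record a change observed from the API server."""
    delta_fifo.append((event_type, obj))


def informer_step():
    """Pop one delta, update the store, and enqueue the object's key."""
    event_type, obj = delta_fifo.popleft()
    key = f"{obj['namespace']}/{obj['name']}"
    if event_type == "Deleted":
        store.pop(key, None)
    else:                        # Added or Updated
        store[key] = obj
    if key not in workqueue:     # do not queue the same key twice
        workqueue.append(key)


def worker_step(handled):
    """Take one key and look the object up; None means it was deleted."""
    key = workqueue.popleft()
    handled.append((key, store.get(key)))


reflector_push("Added", {"namespace": "default", "name": "web", "replicas": 1})
reflector_push("Updated", {"namespace": "default", "name": "web", "replicas": 3})
while delta_fifo:
    informer_step()

handled = []
while workqueue:
    worker_step(handled)
print(handled)
```

Because workers receive only a key and read the latest object from the store, the two events for `default/web` collapse into a single unit of work that sees the newest state.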

Reference: How to Create a Kubernetes Custom Controller Using client-go

Pods

Kubernetes cannot interact with containers directly. The smallest unit it can interact with is the Pod.

Pods are designed to wrap containers. Normally one Pod has one container (one-process-per-container architecture), but a Pod can hold several containers, which then share an IP, a loopback interface, and a shared filesystem.

The containers in a Pod start at the same time, so there is no way to know which one will finish starting first. InitContainers let us order container startup to some extent.

Typical reasons for putting several containers in one Pod are logging, proxying, and adapters. These containers do not do the Pod's main job; they support the main container and make its work easier. We call them sidecar, ambassador, or adapter containers.

Containers

Although Kubernetes cannot manage containers directly, it can manage their resources, through:

resources in the PodSpec, like this:

resources:
  limits:
    cpu: "1"
    memory: "4Gi"
  requests:
    cpu: "0.5"
    memory: "500Mi"

ResourceQuota, which sets soft and hard limits for a namespace. From v1.12 on, the beta features scopeSelector and priorityClassName also make it possible to set priorities within a namespace.

apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-high
  spec:
    hard:
      cpu: "1000"
      memory: 200Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["high"]
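To make the quantities in the two manifests concrete, the Python sketch below converts them to plain numbers and checks a set of pod requests against the hard limits. It is an illustration under stated assumptions, not how the Kubernetes quota admission works internally: the helper names are hypothetical, only the `m`, `Ki`, `Mi`, and `Gi` suffixes are handled, and the quota's `cpu` and `memory` keys are compared against the sum of requests.

```python
from decimal import Decimal

BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as '0.5' or '250m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity):
    """Convert a memory quantity with a Ki/Mi/Gi suffix (or plain bytes) to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-len(suffix)]) * factor)
    return int(quantity)


def fits_quota(pod_requests, hard):
    """Check pod count and summed requests against the quota's hard limits."""
    total_cpu = sum(cpu_to_millicores(r["cpu"]) for r in pod_requests)
    total_mem = sum(memory_to_bytes(r["memory"]) for r in pod_requests)
    return (len(pod_requests) <= int(hard["pods"])
            and total_cpu <= cpu_to_millicores(hard["cpu"])
            and total_mem <= memory_to_bytes(hard["memory"]))


hard = {"cpu": "1000", "memory": "200Gi", "pods": "10"}    # from the ResourceQuota above
request = {"cpu": "0.5", "memory": "500Mi"}                # from the PodSpec above

print(cpu_to_millicores("0.5"), memory_to_bytes("500Mi"))
print(fits_quota([request] * 10, hard), fits_quota([request] * 11, hard))
```

With these values an eleventh pod is rejected by the `pods: "10"` limit long before CPU or memory runs out.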

Init Containers

As we saw under Pods, the containers in a pod are started at the same time. To order startup we could use LivenessProbes, ReadinessProbes, and StatefulSets, but they make our YAML files more complex.

Kubernetes has another feature called initContainers, which names the containers that must run to completion before the main container starts.

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', 'until nslookup myservice; do echo waiting for myservice; sleep 2; done;']
  - name: init-mydb
    image: busybox:1.28
    command: ['sh', '-c', 'until nslookup mydb; do echo waiting for mydb; sleep 2; done;']

If a container in initContainers fails, Kubernetes restarts it until it succeeds, and only then starts the main container.

Containers listed in initContainers are like any other containers. We can give them storage access and security settings independent of the main container, so they can run commands the main container cannot, to configure things before the main container starts.
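The startup order can be modelled with a short Python simulation. It is a sketch of the behaviour described here, not of the kubelet's implementation: `start_pod` is a hypothetical function, and the `max_restarts` cap is an assumption added only to keep the example finite.

```python
def start_pod(init_containers, app_containers, max_restarts=5):
    """Run init containers one at a time; start app containers only if all succeed."""
    log = []
    for name, run in init_containers:
        for attempt in range(1, max_restarts + 1):
            if run():
                log.append(f"{name} completed (attempt {attempt})")
                break
            log.append(f"{name} failed (attempt {attempt})")
        else:
            # The init container never succeeded, so nothing else is started.
            log.append("pod not started")
            return log
    log.extend(f"{name} started" for name in app_containers)
    return log


# Pretend the service only becomes resolvable on the third lookup.
dns_answers = iter([False, False, True])

log = start_pod(
    [("init-myservice", lambda: next(dns_answers)), ("init-mydb", lambda: True)],
    ["myapp-container"],
)
print("\n".join(log))
```

The second init container does not run until the first has completed, and the main container appears in the log only after both are done.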

Network

A network is indispensable when building a Kubernetes cluster. Anyone who has created VMs on an IaaS cloud platform will be familiar with it. Pods in Kubernetes can be compared to VMs: the network hands out IPs to Pods and routes traffic between Pods on each node.

In container orchestration, the network has to handle the following:

container-to-container communication: in Kubernetes, containers live inside Pods, so the Pod already takes care of this
pod-to-pod communication: communication between pods, whether on the same node or on different nodes
External-to-pod communication: connecting Pods to the outside world, which Kubernetes covers with the service concept

So in Kubernetes, the network only has to take care of pod-to-pod communication.

Further Reading

Illustrated Guide To Kubernetes Networking by Tim Hockin, one of the lead Kubernetes developers

CNI Network Configuration File

Container Network Interface (CNI) is the standard specification for container networking. In Kubernetes it is used to assign IPs to Pods.

Since v1.6.0 the goal of kubeadm has been to use CNI when setting up a Kubernetes cluster, though users may have to recompile to do so.

CNI is a specification that comes with libraries for creating and removing container networking. Its aim is to be a common standard through which each network solution talks to the container runtime. It currently supports Amazon ECS, SR-IOV, Cloud Foundry, and others.

An example of a CNI network configuration file:

{
    "cniVersion": "0.2.0",
    "name": "mynet",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.22.0.0/16",
        "routes": [
            { "dst": "0.0.0.0/0" }
        ]
    }
}

This config defines a bridge interface named cni0 with the subnet 10.22.0.0/16.
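To see what that subnet means in practice, the configuration can be loaded with Python's standard `json` and `ipaddress` modules. This is just an inspection sketch; the assumption that the bridge takes the first usable address and pods get the following ones is mine and is labeled in the code.

```python
import ipaddress
import json

conf = json.loads("""
{
    "cniVersion": "0.2.0",
    "name": "mynet",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.22.0.0/16",
        "routes": [{ "dst": "0.0.0.0/0" }]
    }
}
""")

subnet = ipaddress.ip_network(conf["ipam"]["subnet"])
hosts = iter(subnet.hosts())

gateway = next(hosts)        # assumption: the bridge takes the first usable address
first_pod_ip = next(hosts)   # assumption: pods are numbered from the next address

print(conf["bridge"], subnet.num_addresses, gateway, first_pod_ip)
```

A /16 gives 65536 addresses in total, far more than a single node needs, which is why real clusters often carve a smaller range per node.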

Pod-to-Pod Communication

The Kubernetes requirements for pod-to-pod communication are:

All Pods must be able to communicate with each other, even across nodes
Every node must be able to communicate with every pod
No NAT (Network Address Translation) may take place inside the cluster

These requirements can be read as "every IP, whether it belongs to a Pod or a Node, must be reachable from every other without NAT". This can be achieved at the infrastructure level, as in GKE, or in software with a software-defined overlay network solution such as Weave, Flannel, Calico, or Romana.

Further Reading

Install Pod Networking

Cluster Networking

Network Add-ons

Component Review

Every component communicates with kube-apiserver
Only kube-apiserver can talk to etcd
We can interact with etcd using the etcdctl command
We can view the status and configuration of Calico using the calicoctl command
Felix is the Calico daemon that runs on every server in the cluster (deployed as a DaemonSet). It monitors network interfaces, programs routes, sets up ACLs, and reports node status
BIRD is a dynamic IP routing daemon that Felix uses to read routing state and distribute it to the other nodes in the cluster

Node

Kubernetes treats a Node as an API object that is created outside the cluster. Master nodes must run Linux, but worker nodes can run Linux or Windows Server 2019.

If kube-apiserver cannot reach the kubelet on a node for 5 minutes, NodeLease schedules the node for deletion from the cluster: the node's status changes from Ready to not ready, and its Pods are then gradually moved to other nodes.
When kube-apiserver can reach the kubelet again, the node's status returns to Ready and the node goes back on the list of nodes that can have pods scheduled.
Steps to remove a node from the cluster:
- On the master node: kubectl delete node <Node_Name>. During this step the pods are gradually moved off the node.
- On the deleted node: kubeadm reset, to wipe the cluster data. Some iptables rules may remain and have to be checked and removed by hand.

Pods: Single IP per Pod

A Pod can be described as the home of a group of containers and their data volumes. All containers in the Pod use the same IP, which is held by a container named pause.

Containers in the same Pod can communicate through:

the loopback interface
files written to a shared data volume
inter-process communication (IPC)

Kubernetes has supported both IPv4 and IPv6 since v1.16; when creating services, one must be created for each address family separately.

Services: Container to Outside Path

The figure shows an example of a NodePort service, which exposes a container's port at the node level so that clients outside the cluster can use it.

When creating a service we also specify the endpoint, that is, which Pods and which port traffic should go to. Traffic is then routed using iptables or ipvs, depending on how the cluster was originally set up.

The service is monitored by a watch-loop in kube-controller-manager, which adds or removes endpoints according to the pods running at that moment.

An example of creating a NodePort service:

kind: Service 
apiVersion: v1 
metadata:
  name: hostname-service 
spec:
  type: NodePort
  selector:
    app: echo-hostname 
  ports:
    - nodePort: 30163
      port: 8080 
      targetPort: 80
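How the selector in this manifest turns into a list of endpoints can be sketched in Python. The pod list, the IP addresses, the `ready` flag, and the `endpoints_for` function are all hypothetical; the real bookkeeping is done by the endpoint controller.

```python
service = {
    "selector": {"app": "echo-hostname"},
    "ports": [{"nodePort": 30163, "port": 8080, "targetPort": 80}],
}

# Hypothetical pods; only the labels and readiness matter for selection.
pods = [
    {"name": "echo-1", "ip": "10.22.1.5", "labels": {"app": "echo-hostname"}, "ready": True},
    {"name": "echo-2", "ip": "10.22.2.7", "labels": {"app": "echo-hostname"}, "ready": False},
    {"name": "db-1", "ip": "10.22.1.9", "labels": {"app": "db"}, "ready": True},
]


def endpoints_for(service, pods):
    """Return ip:targetPort for every ready pod whose labels match the selector."""
    selector = service["selector"]
    target_port = service["ports"][0]["targetPort"]
    return [
        f"{pod['ip']}:{target_port}"
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


print(endpoints_for(service, pods))
```

Traffic arriving on any node at port 30163 reaches the service on port 8080 and is forwarded to port 80 of one of the selected pods; pods that are not ready or carry other labels are left out.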

Mesos


Source: mesos.apache.org

Mesos is another popular container orchestrator. Its main functions are not very different from those of Kubernetes: a central API for control, scheduling of work onto nodes, and cluster state kept in persistent storage, for which Kubernetes uses etcd and Mesos uses ZooKeeper.

If we look back at systems such as OpenStack or CloudStack, they all have master nodes acting as the control plane, schedule work onto nodes, keep state in persistent storage, and manage networking. They all share the same foundations.

What sets Kubernetes apart is its strong emphasis on fault tolerance, self-discovery, and scaling, built on an API-driven mindset.
