What is Kubernetes
Kubernetes (commonly stylized as k8s) is an open-source container orchestration system that aims to provide a simple yet efficient platform for automating the deployment, scaling, and operation of application containers across clusters of hosts.
Architecturally, Kubernetes is made up of a set of components that together provide mechanisms for deploying, maintaining, and scaling applications.
The components are designed to be loosely coupled and extensible so that they can support a wide variety of workloads.
This extensibility is provided in large part by the Kubernetes API, which is used by internal components as well as by extensions and containers that run on Kubernetes.
Kubernetes consists mainly of the following core components:
- etcd is used as Kubernetes' backing store for all cluster data
- apiserver provides a single entry point for resource operations, along with mechanisms for authentication, authorization, access control, API registration, and discovery
- controller manager is responsible for maintaining the state of the cluster, handling fault detection, automatic scaling, rolling updates, and so on
- scheduler is responsible for scheduling resources, assigning Pods to appropriate machines according to a predetermined scheduling policy
- kubelet is responsible for maintaining the life cycle of containers, as well as managing Volumes and networking
- Container runtime is responsible for image management and for running Pods and containers (via the CRI)
- kube-proxy is responsible for providing in-cluster service discovery and load balancing for Kubernetes Services
In addition to the core components, there are some recommended Add-ons:
- kube-dns is responsible for providing DNS services for the entire cluster
- Ingress Controller provides external network access for services
- Heapster provides resource monitoring
- Dashboard provides a GUI
- Federation provides cluster management across availability zones
- Fluentd-elasticsearch provides cluster log collection, storage, and query
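On most clusters (for example, those set up with kubeadm), the core components and add-ons run as Pods in the kube-system namespace, so a quick way to see what your cluster is actually running, assuming kubectl is already configured against it, is:
# Control-plane components and add-ons usually live in kube-system
[root@nebula ~]# kubectl get pod -n kube-system -o wide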
Kubernetes and Databases
Database containerization has been a hot topic recently. What benefits can Kubernetes bring to databases?
- Fault recovery: Kubernetes restarts database applications when they fail, or migrates them to other healthy nodes in the cluster
- Storage management: Kubernetes offers a variety of storage-management solutions so that databases can adopt different storage systems transparently
- Load balancing: Kubernetes Services provide load balancing by distributing external network traffic evenly across database replicas
- Horizontal scalability: Kubernetes can scale the number of replicas based on the resource utilization of the current database cluster, thereby improving overall resource utilization (see the sketch below)
Currently, many databases, such as MySQL, MongoDB, and TiDB, run well on Kubernetes.
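As a concrete illustration of the horizontal-scalability point above, scaling a containerized database up or down is a single command. This is only a sketch: the StatefulSet name mysql below is hypothetical, and in practice the database itself must support adding replicas.
# Scale a hypothetical database StatefulSet from 3 to 5 replicas
[root@nebula ~]# kubectl scale statefulset mysql --replicas=5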
NebulaGraph Database on Kubernetes
NebulaGraph is a distributed, open-source graph database composed of graphd (the query engine), storaged (data storage), and metad (meta data). Kubernetes brings the following benefits to NebulaGraph:
- Kubernetes balances the workload across the different replicas of graphd, metad, and storaged. The three components can discover each other through the DNS service provided by Kubernetes (see the lookup sketch after this list).
- Kubernetes encapsulates the details of the underlying storage through StorageClass, PVC, and PV, regardless of whether the backing storage is a cloud disk or a local disk.
- Kubernetes can deploy a NebulaGraph cluster within seconds and upgrade the cluster automatically and transparently.
- Kubernetes supports self-healing: it restarts a crashed replica without intervention from an operations engineer.
- Kubernetes scales the cluster horizontally based on cluster utilization to improve NebulaGraph performance.
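For example, the DNS-based discovery mentioned above can be checked from inside the cluster. The service name nebula-metad below is an assumption (the actual name depends on the chart that deploys the cluster); the point is that each component reaches the others through a stable DNS name rather than a fixed IP.
# Resolve the (assumed) metad headless service from a throwaway pod
[root@nebula ~]# kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- nslookup nebula-metad.default.svc.cluster.local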
We will walk through the details of deploying NebulaGraph with Kubernetes in the following sections.
Deploy
Software And Hardware Requirements
The following are the software and hardware requirements for the deployment in this post:
- The operating system is CentOS-7.6.1810 x86_64.
- Virtual machine configuration: 4 CPU + 8G memory + 50G system disk + 50G data disk A + 50G data disk B
- The Kubernetes cluster version is v1.16.
- Use local PV as data storage.
Cluster Topology
Following is the cluster topology:
Components to Be Deployed
- Install Helm
- Prepare local disks and install local volume plugin
- Install NebulaGraph cluster
- Install ingress-controller
Install Helm
Helm is the Kubernetes package manager, similar to yum on CentOS or apt-get on Ubuntu, and it makes deploying applications to Kubernetes easier. This article does not give a detailed introduction to Helm; read the Helm Getting Started Guide to learn more about it.
Download and Install Helm
Install Helm with the following commands in your terminal:
[root@nebula ~]# wget https://get.helm.sh/helm-v3.0.1-linux-amd64.tar.gz
[root@nebula ~]# tar -zxvf helm-v3.0.1-linux-amd64.tar.gz
[root@nebula ~]# mv linux-amd64/helm /usr/bin/helm
[root@nebula ~]# chmod +x /usr/bin/helm
View the Helm Version
You can check the Helm version with the helm version command; the output looks like the following:
version.BuildInfo{ Version:"v3.0.1", GitCommit:"7c22ef9ce89e0ebeb7125ba2ebf7d421f3e82ffa", GitTreeState:"clean", GoVersion:"go1.13.4" }
Prepare Local Disks
Configure each node as follows:
Create Mount Directory
[root@nebula ~]# sudo mkdir -p /mnt/disks
Format Data Disks
[root@nebula ~]# sudo mkfs.ext4 /dev/diskA
[root@nebula ~]# sudo mkfs.ext4 /dev/diskB
Mount Data Disks
[root@nebula ~]# DISKA_UUID=$(blkid -s UUID -o value /dev/diskA)
[root@nebula ~]# DISKB_UUID=$(blkid -s UUID -o value /dev/diskB)
[root@nebula ~]# sudo mkdir /mnt/disks/$DISKA_UUID
[root@nebula ~]# sudo mkdir /mnt/disks/$DISKB_UUID
[root@nebula ~]# sudo mount -t ext4 /dev/diskA /mnt/disks/$DISKA_UUID
[root@nebula ~]# sudo mount -t ext4 /dev/diskB /mnt/disks/$DISKB_UUID
[root@nebula ~]# echo UUID=`sudo blkid -s UUID -o value /dev/diskA` /mnt/disks/$DISKA_UUID ext4 defaults 0 2 | sudo tee -a /etc/fstab
[root@nebula ~]# echo UUID=`sudo blkid -s UUID -o value /dev/diskB` /mnt/disks/$DISKB_UUID ext4 defaults 0 2 | sudo tee -a /etc/fstab
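Before moving on, it is worth confirming that both data disks are mounted where the provisioner will later look for them:
# Both disks should appear mounted under /mnt/disks/<UUID>
[root@nebula ~]# df -h | grep /mnt/disks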
Deploy Local Volume Plugin
[root@nebula ~]# curl -L -o v2.3.3.zip https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/archive/v2.3.3.zip
[root@nebula ~]# unzip v2.3.3.zip
Modify the v2.3.3/helm/provisioner/values.yaml file.
#
# Common options.
#
common:
  #
  # Defines whether to generate service account and role bindings.
  #
  rbac: true
  #
  # Defines the namespace where provisioner runs
  #
  namespace: default
  #
  # Defines whether to create provisioner namespace
  #
  createNamespace: false
  #
  # Beta PV.NodeAffinity field is used by default. If running against pre-1.10
  # k8s version, the `useAlphaAPI` flag must be enabled in the configMap.
  #
  useAlphaAPI: false
  #
  # Indicates if PVs should be dependents of the owner Node.
  #
  setPVOwnerRef: false
  #
  # Provisioner clean volumes in process by default. If set to true, provisioner
  # will use Jobs to clean.
  #
  useJobForCleaning: false
  #
  # Provisioner name contains Node.UID by default. If set to true, the provisioner
  # name will only use Node.Name.
  #
  useNodeNameOnly: false
  #
  # Resync period in reflectors will be random between minResyncPeriod and
  # 2*minResyncPeriod. Default: 5m0s.
  #
  #minResyncPeriod: 5m0s
  #
  # Defines the name of configmap used by Provisioner
  #
  configMapName: "local-provisioner-config"
  #
  # Enables or disables Pod Security Policy creation and binding
  #
  podSecurityPolicy: false
#
# Configure storage classes.
#
classes:
- name: fast-disks # Defines name of storage classes.
  # Path on the host where local volumes of this storage class are mounted
  # under.
  hostDir: /mnt/fast-disks
  # Optionally specify mount path of local volumes. By default, we use same
  # path as hostDir in container.
  # mountDir: /mnt/fast-disks
  # The volume mode of created PersistentVolume object. Default to Filesystem
  # if not specified.
  volumeMode: Filesystem
  # Filesystem type to mount.
  # It applies only when the source path is a block device,
  # and desire volume mode is Filesystem.
  # Must be a filesystem type supported by the host operating system.
  fsType: ext4
  blockCleanerCommand:
  #  Do a quick reset of the block device during its cleanup.
  #  - "/scripts/quick_reset.sh"
  #  or use dd to zero out block dev in two iterations by uncommenting these lines
  #  - "/scripts/dd_zero.sh"
  #  - "2"
  #  or run shred utility for 2 iteration.s
    - "/scripts/shred.sh"
    - "2"
  #  or blkdiscard utility by uncommenting the line below.
  #  - "/scripts/blkdiscard.sh"
  # Uncomment to create storage class object with default configuration.
  # storageClass: true
  # Uncomment to create storage class object and configure it.
  # storageClass:
  #   reclaimPolicy: Delete # Available reclaim policies: Delete/Retain, defaults: Delete.
  #   isDefaultClass: true # set as default class
#
# Configure DaemonSet for provisioner.
#
daemonset:
  #
  # Defines the name of a Provisioner
  #
  name: "local-volume-provisioner"
  #
  # Defines Provisioner's image name including container registry.
  #
  image: quay.io/external_storage/local-volume-provisioner:v2.3.3
  #
  # Defines Image download policy, see kubernetes documentation for available values.
  #
  #imagePullPolicy: Always
  #
  # Defines a name of the service account which Provisioner will use to communicate with API server.
  #
  serviceAccount: local-storage-admin
  #
  # Defines a name of the Pod Priority Class to use with the Provisioner DaemonSet
  #
  # Note that if you want to make it critical, specify "system-cluster-critical"
  # or "system-node-critical" and deploy in kube-system namespace.
  # Ref: https://k8s.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical
  #
  #priorityClassName: system-node-critical
  # If configured, nodeSelector will add a nodeSelector field to the DaemonSet PodSpec.
  #
  # NodeSelector constraint for local-volume-provisioner scheduling to nodes.
  # Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
  nodeSelector: {}
  #
  # If configured KubeConfigEnv will (optionally) specify the location of kubeconfig file on the node.
  # kubeConfigEnv: KUBECONFIG
  #
  # List of node labels to be copied to the PVs created by the provisioner in a format:
  #
  #  nodeLabels:
  #    - failure-domain.beta.kubernetes.io/zone
  #    - failure-domain.beta.kubernetes.io/region
  #
  # If configured, tolerations will add a toleration field to the DaemonSet PodSpec.
  #
  # Node tolerations for local-volume-provisioner scheduling to nodes with taints.
  # Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
  tolerations: []
  #
  # If configured, resources will set the requests/limits field to the Daemonset PodSpec.
  # Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
  resources: {}
#
# Configure Prometheus monitoring
#
prometheus:
  operator:
    ## Are you using Prometheus Operator?
    enabled: false
    serviceMonitor:
      ## Interval at which Prometheus scrapes the provisioner
      interval: 10s
      # Namespace Prometheus is installed in
      namespace: monitoring
      ## Defaults to what is used if you follow CoreOS [Prometheus Install Instructions](https://github.com/coreos/prometheus-operator/tree/master/helm#tldr)
      ## [Prometheus Selector Label](https://github.com/coreos/prometheus-operator/blob/master/helm/prometheus/templates/prometheus.yaml#L65)
      ## [Kube Prometheus Selector Label](https://github.com/coreos/prometheus-operator/blob/master/helm/kube-prometheus/values.yaml#L298)
      selector:
        prometheus: kube-prometheus
Modify hostDir: /mnt/fast-disks and # storageClass: true to hostDir: /mnt/disks and storageClass: true respectively, then run:
# Installing
[root@nebula ~]# helm install local-static-provisioner v2.3.3/helm/provisioner
# List local-static-provisioner deployment
[root@nebula ~]# helm list
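After the chart is installed, you can check that the provisioner pods are running and that the fast-disks storage class defined in values.yaml was created; local PVs for the mounted disks should appear shortly after the pods start:
[root@nebula ~]# kubectl get pod | grep local-volume-provisioner
[root@nebula ~]# kubectl get storageclass
[root@nebula ~]# kubectl get pv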
Deploy Nebula Graph Cluster
Download nebula helm-chart Package
# Downloading nebula
[root@nebula ~]# wget https://github.com/vesoft-inc/nebula/archive/master.zip
# Unzip
[root@nebula ~]# unzip master.zip
Label Kubernetes Slave Nodes
We need to set scheduling labels on the worker nodes of the Kubernetes cluster. In this deployment, we label 192.168.0.2, 192.168.0.3, and 192.168.0.4 with nebula: "yes".
Detailed operations are as follows:
[root@nebula ~]# kubectl label node 192.168.0.2 nebula="yes" --overwrite
[root@nebula ~]# kubectl label node 192.168.0.3 nebula="yes" --overwrite
[root@nebula ~]# kubectl label node 192.168.0.4 nebula="yes" --overwrite
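You can verify the labels before installing the chart:
# The NEBULA column should show "yes" for the three worker nodes
[root@nebula ~]# kubectl get node -L nebula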
Modify the Default Values for nebula helm chart
Following is the directory list of nebula helm-chart:
master/kubernetes/
└── helm
├── Chart.yaml
├── templates
│ ├── configmap.yaml
│ ├── deployment.yaml
│ ├── _helpers.tpl
│ ├── ingress-configmap.yaml
│ ├── NOTES.txt
│ ├── pdb.yaml
│ ├── service.yaml
│ └── statefulset.yaml
└── values.yaml
2 directories, 10 files
We need to adjust the value of MetadHosts in master/kubernetes/helm/values.yaml, replacing the IP list with the IPs of the three k8s workers in our environment.
MetadHosts:
  - 192.168.0.2:44500
  - 192.168.0.3:44500
  - 192.168.0.4:44500
Install Nebula via Helm
# Installing
[root@nebula ~]# helm install nebula master/kubernetes/helm
# Checking
[root@nebula ~]# helm status nebula
# Checking nebula deployment on the k8s cluster
[root@nebula ~]# kubectl get pod | grep nebula
nebula-graphd-579d89c958-g2j2c 1/1 Running 0 1m
nebula-graphd-579d89c958-p7829 1/1 Running 0 1m
nebula-graphd-579d89c958-q74zx 1/1 Running 0 1m
nebula-metad-0 1/1 Running 0 1m
nebula-metad-1 1/1 Running 0 1m
nebula-metad-2 1/1 Running 0 1m
nebula-storaged-0 1/1 Running 0 1m
nebula-storaged-1 1/1 Running 0 1m
nebula-storaged-2 1/1 Running 0 1m
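If the chart requests persistent storage for metad and storaged (via volumeClaimTemplates), each of those replicas should now have a local PV bound to it, which can be checked with:
[root@nebula ~]# kubectl get pvc | grep nebula
[root@nebula ~]# kubectl get pv | grep Bound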
Deploy Ingress-controller
The ingress-controller is one of the add-ons of Kubernetes. Kubernetes exposes internally deployed services to external users through the ingress-controller. The ingress-controller also provides load balancing, distributing external traffic across the different replicas of an application in the k8s cluster.
Select a Node to Deploy Ingress-controller
[root@nebula ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
192.168.0.1 Ready master 82d v1.16.1
192.168.0.2 Ready <none> 82d v1.16.1
192.168.0.3 Ready <none> 82d v1.16.1
192.168.0.4 Ready <none> 82d v1.16.1
[root@nebula ~]# kubectl label node 192.168.0.4 ingress=yes
Edit the ingress-nginx.yaml deployment file.
apiVersion: v1
kind: Namespace
metadata:
  name: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: tcp-services
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: udp-services
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-ingress-serviceaccount
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: nginx-ingress-clusterrole
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - endpoints
      - nodes
      - pods
      - secrets
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
      - "networking.k8s.io"
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - "extensions"
      - "networking.k8s.io"
    resources:
      - ingresses/status
    verbs:
      - update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: nginx-ingress-role
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - pods
      - secrets
      - namespaces
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - configmaps
    resourceNames:
      # Defaults to "<election-id>-<ingress-class>"
      # Here: "<ingress-controller-leader>-<nginx>"
      # This has to be adapted if you change either parameter
      # when launching the nginx-ingress-controller.
      - "ingress-controller-leader-nginx"
    verbs:
      - get
      - update
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - create
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: nginx-ingress-role-nisa-binding
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nginx-ingress-role
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: nginx-ingress-clusterrole-nisa-binding
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nginx-ingress-clusterrole
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
      app.kubernetes.io/part-of: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/part-of: ingress-nginx
      annotations:
        prometheus.io/port: "10254"
        prometheus.io/scrape: "true"
    spec:
      hostNetwork: true
      tolerations:
        - key: "node-role.kubernetes.io/master"
          operator: "Exists"
          effect: "NoSchedule"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                      - ingress-nginx
              topologyKey: "ingress-nginx.kubernetes.io/master"
      nodeSelector:
        ingress: "yes"
      serviceAccountName: nginx-ingress-serviceaccount
      containers:
        - name: nginx-ingress-controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller-amd64:0.26.1
          args:
            - /nginx-ingress-controller
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=default/graphd-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx
            - --annotations-prefix=nginx.ingress.kubernetes.io
            - --http-port=8000
          securityContext:
            allowPrivilegeEscalation: true
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
            # www-data -> 33
            runAsUser: 33
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 10
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 10
Deploying ingress-nginx.
# Deployment
[root@nebula ~]# kubectl create -f ingress-nginx.yaml
# View deployment
[root@nebula ~]# kubectl get pod -n ingress-nginx
NAME READY STATUS RESTARTS AGE
nginx-ingress-controller-mmms7 1/1 Running 2 1m
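Because the controller runs with hostNetwork: true, it listens directly on the node labelled ingress=yes. If the pod is not Running, its logs are the first place to look:
[root@nebula ~]# kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx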
Access Nebula Graph Cluster in Kubernetes
Check which node ingress-nginx is running on:
[root@nebula ~]# kubectl get node -l ingress=yes -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
nebula.node23 Ready <none> 1d v1.16.1 192.168.8.23 <none> CentOS Linux 7 (Core) 7.6.1810.el7.x86_64 docker://19.3.3
Access NebulaGraph Cluster:
[root@nebula ~]# docker run --rm -ti --net=host vesoft/nebula-console:nightly --addr=192.168.8.23 --port=3699
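The connection above works because the controller proxies raw TCP according to its --tcp-services-configmap=default/graphd-services argument; that ConfigMap is presumably rendered by the nebula chart's ingress-configmap.yaml template listed earlier, and a TCP services ConfigMap of this kind maps an external port such as 3699 to a Service in the form "<port>: <namespace>/<service>:<service-port>". You can inspect what was actually generated with:
[root@nebula ~]# kubectl get configmap graphd-services -o yaml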
FAQ
How to deploy a Kubernetes cluster?
Please refer to the Official Doc on deployment of high-availability Kubernetes clusters.
You can also refer to Installing Kubernetes with Minikube to learn how to deploy a local Kubernetes cluster with minikube.
How to modify the NebulaGraph cluster parameters?
When using helm install, you can use --set to override the default variables in values.yaml. Please refer to the Helm documentation for details.
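For example, the MetadHosts list edited earlier can be overridden directly on the command line instead of editing the file, assuming the value names defined in master/kubernetes/helm/values.yaml (list values use Helm's curly-brace syntax):
[root@nebula ~]# helm install nebula master/kubernetes/helm \
    --set "MetadHosts={192.168.0.2:44500,192.168.0.3:44500,192.168.0.4:44500}"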
How to observe the nebula cluster status?
You can use the kubectl get pod | grep nebula command or the Kubernetes dashboard.
How to use other disk types?
Please refer to the Storage Classes doc.
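For example, on a cloud environment you could define a StorageClass backed by cloud disks and point the chart at it instead of local PVs; the AWS EBS provisioner below is only an illustration, and the chart value that selects the storage class should be checked in master/kubernetes/helm/values.yaml.
# A minimal cloud-disk StorageClass sketch (AWS EBS gp2 as an example)
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
EOF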
Originally published at https://nebula-graph.io.