Jim Hatcher

Running CockroachDB on k8s - with tweaks for Production

Running CockroachDB (CRDB) on Kubernetes (k8s) is a complementary affair. CRDB provides high availability at the data layer, and k8s provides high availability at the infrastructure layer.

For enterprises that are already leveraging k8s to run their applications & microservices and who have experience running and administering k8s, it can make sense to also run the database within k8s.

The docs for running CRDB on k8s on the Cockroach Labs' documentation site have an excellent set of steps for getting CRDB up and running. They are all you need to run a demo of CRDB on k8s.

However, if you're planning on running CRDB on k8s in Production, there are a few other things you'll probably want to do. In this blog I'll explain how to go about them.

Expose CRDB outside of the k8s cluster

You can follow the docs for deploying CRDB in a single-region k8s deployment. When you deploy CRDB on k8s (through any of the available methods -- operator, helm chart, or via yaml configs), there are a few services created for you.

For instance, after installing CRDB via the CRDB k8s operator on GKE, I have the following services:

$ kubectl get svc
NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                        AGE
cockroachdb          ClusterIP   None            <none>        26258/TCP,8080/TCP,26257/TCP   59s
cockroachdb-public   ClusterIP   10.108.12.200   <none>        26258/TCP,8080/TCP,26257/TCP   59s
kubernetes           ClusterIP   10.108.0.1      <none>        443/TCP                        7m5s

In k8s, there are four potential types of services:

  • ClusterIP - this is the default if another type is not explicitly specified in the service's manifest; this type of service is internal to the k8s cluster
  • NodePort - this is a way to expose the service outside of the k8s cluster using a custom port on each of the cluster nodes - this can be OK for development clusters, but it's not typically how you want to do things in Production because you have to use non-standard ports
  • LoadBalancer - this is another way to expose services to apps running outside of the k8s cluster; it's a better fit for Production deployments because you can use the standard ports, but you need a process that can assign a public IP to the load balancer; if you're running a managed k8s service (e.g., EKS, GKE, AKS, or OpenShift), this is handled for you, but if you're running on OSS k8s, you have to handle it yourself
  • ExternalName - this is a way of assigning an external DNS name to a k8s service and is not really applicable for what we're talking about here.

SQL Port

You'll notice that in the services listed in our CRDB k8s cluster, we have one called "cockroachdb" of type "ClusterIP" which has no Cluster IP assigned. This is called a "headless" service. Its purpose is to serve as the service associated with the statefulset called "cockroachdb" (giving each pod a stable DNS entry); it is not intended to be used by any internal or external apps to access the CRDB cluster. You can see the reference to this service in the statefulset's manifest:

$ kubectl get sts cockroachdb -o yaml | grep "serviceName"
  serviceName: cockroachdb

The other cockroach service, called "cockroachdb-public", is also of type ClusterIP but has a Cluster IP assigned to it. The point of this service is to be used by apps running inside the k8s cluster that want to access CRDB.
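For example, an app pod in the same cluster could point its Postgres driver at that service name directly -- something like this (the user, password, database, and cert mount path here are hypothetical):

# hypothetical in-cluster connection string -- the host is simply the service's DNS name;
# the sslrootcert path depends on where you mount the CA cert in your app pod
postgresql://appuser:apppassword@cockroachdb-public.default.svc.cluster.local:26257/appdb?sslmode=verify-full&sslrootcert=/certs/ca.crt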

In the CRDB docs, you'll see a section called "Use the built-in SQL Client" and you can see that they leverage this service:

kubectl exec -it cockroachdb-client-secure \
-- ./cockroach sql \
--certs-dir=/cockroach/cockroach-certs \
--host=cockroachdb-public

This is a perfectly acceptable way to set up some basic things in the cluster via the SQL prompt (like creating the first users and verifying that basic read/write capabilities are working). However, this is not the mechanism you'd want your apps to use to access the CRDB cluster in Production -- especially if your apps are running outside the k8s cluster. I'll talk about the right way to do this a little later on.

There is also a third service listed called "kubernetes"; this is the default k8s API server service and is not used to access CRDB at all.

When you're running CRDB, there are three access points into the CRDB nodes:

  1. the SQL client port (26257 by default)
  2. the DB Console port (8080 by default), and
  3. the port used by other nodes to do node-to-node interactions (26258 by default).

All three of these ports are exposed by the "cockroachdb-public" service that we've been looking at.

DB Console port

We can technically get to the DB Console on each CRDB node from any pod running inside our k8s cluster, but that would involve running curl commands which aren't very useful.

Just to illustrate what I'm talking about, you can do something like this:

$ kubectl exec -it cockroachdb-0 -- curl -Lk http://cockroachdb-public:8080/
Defaulted container "db" out of: db, db-init (init)
<!DOCTYPE html>
<html>
    <head>
        <title>Cockroach Console</title>
        <meta charset="UTF-8">
        <link href="favicon.ico" rel="shortcut icon">
    </head>
    <body>
        <div id="react-layout"></div>
        <script src="bundle.js" type="text/javascript"></script>
    </body>
</html>

You can see that we do actually get some HTML back from our curl command, but this is a lame way to interact with a website! So, in the CRDB docs, they recommend using the kubectl port-forward command to expose this service to the computer where your kubectl command is running:

$ kubectl port-forward service/cockroachdb-public 8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080

Once you've run this command, you can go to http://localhost:8080/ in the browser on your computer and access the DB Console. The kubectl command proxies your input to the CRDB nodes and their output back to your browser.

Again, this is a perfectly acceptable way to access the CRDB nodes for a demo and just to make sure they're running OK. But, for Production, you don't want to have everybody on your team port-forwarding into your nodes in order to monitor what's happening in CRDB.

Using an external load balancer

In order to access the SQL client and DB Console from outside the cluster, the best way to go is to create a k8s service of type LoadBalancer.

Create a yaml file like this:

apiVersion: v1
kind: Service
metadata:
  annotations:
    # put annotations here that affect how your cloud provider creates the load balancer, e.g. on EKS:
    # service.beta.kubernetes.io/aws-load-balancer-internal: "true"
  labels:
    app: cockroachdb
  name: cockroachdb-lb
spec:
  # if you don't specify the type, it will default to ClusterIP which won't expose the services outside of the k8s cluster
  type: LoadBalancer
  selector:
    # this selector is the label associated with your CRDB pods
    # if you're not sure -- run this: kubectl get pods --show-labels
    app: cockroachdb
  ports:
  - name: https
    port: 8080
    protocol: TCP
    targetPort: 8080
  - name: tcp
    port: 26257
    protocol: TCP
    targetPort: 26257

Then, you can create the service by applying that yaml file with:

kubectl apply -f cockroachdb-lb.yaml

Notice that if you list your services right after creating it, you'll see a service called "cockroachdb-lb" with an External IP of "<pending>":

$ kubectl get svc
NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                          AGE
cockroachdb          ClusterIP      None            <none>        26258/TCP,8080/TCP,26257/TCP     33m
cockroachdb-lb       LoadBalancer   10.108.4.152    <pending>     8080:31016/TCP,26257:31395/TCP   4s
cockroachdb-public   ClusterIP      10.108.12.200   <none>        26258/TCP,8080/TCP,26257/TCP     33m
kubernetes           ClusterIP      10.108.0.1      <none>        443/TCP                          39m

If you wait a few seconds and try again, you'll see that an External IP value is assigned:

$ kubectl get svc
NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                          AGE
cockroachdb          ClusterIP      None            <none>           26258/TCP,8080/TCP,26257/TCP     35m
cockroachdb-lb       LoadBalancer   10.108.4.152    34.139.126.177   8080:31016/TCP,26257:31395/TCP   97s
cockroachdb-public   ClusterIP      10.108.12.200   <none>           26258/TCP,8080/TCP,26257/TCP     35m
kubernetes           ClusterIP      10.108.0.1      <none>           443/TCP                          41m

Because I'm running in GKE, Google Cloud handles creating a load balancer for me. If you look in the GCP Cloud Console, you can see the load balancer details.
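If you prefer the command line to the Cloud Console, the forwarding rule that GKE created should also show up with something like this (assuming the gcloud CLI is pointed at the same project):

$ gcloud compute forwarding-rules list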

If I describe the LB svc, I can look at the endpoints that have been exposed by the service:

$ kubectl describe svc cockroachdb-lb
Name:                     cockroachdb-lb
Namespace:                default
Labels:                   app=cockroachdb
Annotations:              cloud.google.com/neg: {"ingress":true}
Selector:                 app=cockroachdb
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.108.4.152
IPs:                      10.108.4.152
LoadBalancer Ingress:     34.139.126.177
Port:                     https  8080/TCP
TargetPort:               8080/TCP
NodePort:                 https  31016/TCP
Endpoints:                <none>
Port:                     tcp  26257/TCP
TargetPort:               26257/TCP
NodePort:                 tcp  31395/TCP
Endpoints:                <none>
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason                Age    From                Message
  ----    ------                ----   ----                -------
  Normal  EnsuringLoadBalancer  3m12s  service-controller  Ensuring load balancer
  Normal  EnsuredLoadBalancer   2m36s  service-controller  Ensured load balancer

We can see here that there are no endpoints assigned. That's not good!

The way that the LB gets associated with pods is via the "selector" in its spec. My current selector is looking for pods with a label of app: cockroachdb.

Let's see what labels our pods are actually using:

$ kubectl get pods --show-labels
NAME            READY   STATUS    RESTARTS   AGE   LABELS
cockroachdb-0   1/1     Running   0          37m   app.kubernetes.io/component=database,app.kubernetes.io/instance=cockroachdb,app.kubernetes.io/name=cockroachdb,controller-revision-hash=cockroachdb-7b9668cd75,crdb=is-cool,statefulset.kubernetes.io/pod-name=cockroachdb-0
cockroachdb-1   1/1     Running   0          37m   app.kubernetes.io/component=database,app.kubernetes.io/instance=cockroachdb,app.kubernetes.io/name=cockroachdb,controller-revision-hash=cockroachdb-7b9668cd75,crdb=is-cool,statefulset.kubernetes.io/pod-name=cockroachdb-1
cockroachdb-2   1/1     Running   0          37m   app.kubernetes.io/component=database,app.kubernetes.io/instance=cockroachdb,app.kubernetes.io/name=cockroachdb,controller-revision-hash=cockroachdb-7b9668cd75,crdb=is-cool,statefulset.kubernetes.io/pod-name=cockroachdb-2

A better choice for the selector label would be app.kubernetes.io/name=cockroachdb.
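Before updating the service, you can sanity-check that this label actually matches the CRDB pods; this should list the three cockroachdb pods shown above:

$ kubectl get pods -l app.kubernetes.io/name=cockroachdb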

Let's edit the yaml to include that value and re-apply it:

apiVersion: v1
kind: Service
metadata:
  annotations:
    # put annotations here that affect how your cloud provider creates the load balancer, e.g. on EKS:
    # service.beta.kubernetes.io/aws-load-balancer-internal: "true"
  labels:
    app: cockroachdb
  name: cockroachdb-lb
spec:
  # if you don't specify the type, it will default to ClusterIP which won't expose the services outside of the k8s cluster
  type: LoadBalancer
  selector:
    # this selector is the label associated with your CRDB pods
    # if you're not sure -- run this: kubectl get pods --show-labels
    app.kubernetes.io/name: cockroachdb
  ports:
  - name: https
    port: 8080
    protocol: TCP
    targetPort: 8080
  - name: tcp
    port: 26257
    protocol: TCP
    targetPort: 26257

Notice that in the yaml I enter app.kubernetes.io/name: cockroachdb (key and value separated by a colon), rather than app.kubernetes.io/name=cockroachdb as it appears in the label listing.

kubectl apply -f cockroachdb-lb.yaml

Now, let's look at our endpoints again:

$ kubectl describe svc cockroachdb-lb
Name:                     cockroachdb-lb
Namespace:                default
Labels:                   app=cockroachdb
Annotations:              cloud.google.com/neg: {"ingress":true}
Selector:                 app.kubernetes.io/name=cockroachdb
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.108.4.152
IPs:                      10.108.4.152
LoadBalancer Ingress:     34.139.126.177
Port:                     https  8080/TCP
TargetPort:               8080/TCP
NodePort:                 https  31016/TCP
Endpoints:                10.104.0.4:8080,10.104.1.8:8080,10.104.2.7:8080
Port:                     tcp  26257/TCP
TargetPort:               26257/TCP
NodePort:                 tcp  31395/TCP
Endpoints:                10.104.0.4:26257,10.104.1.8:26257,10.104.2.7:26257
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason                Age                  From                Message
  ----    ------                ----                 ----                -------
  Normal  EnsuringLoadBalancer  49s (x2 over 8m52s)  service-controller  Ensuring load balancer
  Normal  EnsuredLoadBalancer   44s (x2 over 8m16s)  service-controller  Ensured load balancer

Yay! We have endpoints.

Now, if I try to access the nodes via the SQL port or via the DB Console, I should be able to do so from outside the cluster.

I can go to a browser and access https://34.139.126.177:8080/ and it works. (You'll want to substitute the actual value of your LB's External IP here.)
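If you'd rather check from a terminal first, the DB Console also exposes a health endpoint you can hit through the load balancer (again, substitute your own External IP; the -k flag skips certificate verification, which we'll fix below):

$ curl -k 'https://34.139.126.177:8080/health?ready=1'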

Also, I can access the nodes on the SQL port. For example:

$ cockroach sql --url 'postgres://roach:Q7gc8rEdS@34.139.126.177:26257/defaultdb?sslmode=require'
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Client version: CockroachDB CCL v22.2.5 (aarch64-apple-darwin21.2, built 2023/02/16 16:37:38, go1.19.4)
# Server version: CockroachDB CCL v22.2.2 (x86_64-pc-linux-gnu, built 2023/01/04 17:23:00, go1.19.1)

warning: server version older than client! proceed with caution; some features may not be available.

# Cluster ID: 7539a31a-fc44-4f89-a154-cc60f8aaeddd
#
# Enter \? for a brief introduction.
#
roach@34.139.126.177:26257/defaultdb>

The SQL port in CRDB needs to be exposed by a L4/TCP load balancer (which is what we're doing above). The DB Console port can also be exposed this way (as we've demonstrated), but since it's an HTTP/HTTPS access point, it could also be exposed through an L7 endpoint, like k8s' Ingress. I'm not going to demonstrate that here in this blog, but it can certainly be done.
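Just to give a flavor of what that might look like (not a full walkthrough), here's a minimal sketch assuming an nginx ingress controller; the ingress class, host name, and annotation are assumptions on my part, and TLS handling (passthrough vs. termination at the ingress) will depend on your environment:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cockroachdb-console
  annotations:
    # the CRDB pods serve the DB Console over HTTPS, so tell nginx to speak HTTPS to the backend
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  rules:
  - host: crdb-console.myproject.mydomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: cockroachdb-public
            port:
              number: 8080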

Fixing the Node Certs

One more thing to note here: in my example above, I'm able to connect to my server using the IP address because I specified sslmode=require. This tells my Postgres driver/client that I want to use TLS to encrypt traffic to/from the cluster, but that I don't want to do any hostname verification checks. We don't recommend connecting this way because it leaves your cluster susceptible to man-in-the-middle (MITM) attacks.

In order to connect the "right way", I need to connect using sslmode=verify-full and specify the ca.crt used to sign all the certs used in the CRDB cluster.

I can get a list of the certs used by my cluster by asking k8s to list out all the secrets being used:

$ kubectl get secrets
NAME               TYPE     DATA   AGE
cockroachdb-ca     Opaque   1      57m
cockroachdb-node   Opaque   3      57m
cockroachdb-root   Opaque   3      57m

If I look into the details of the cockroachdb-node secret, I can see the various cert and key files that it contains:
(note that I'm using jq which you can install on your Mac using brew install jq)

$ kubectl get secrets cockroachdb-node -o json | jq '.data | map_values(@base64d)' | awk '{gsub(/\\n/,"\n")}1'
{
  "ca.crt": "-----BEGIN CERTIFICATE-----
MIIDJTCCAg2gAwIBAgIQC+85luldQT9+ctIxQ1BitjANBgkqhkiG9w0BAQsFADAr
MRIwEAYDVQQKEwlDb2Nrcm9hY2gxFTATBgNVBAMTDENvY2tyb2FjaCBDQTAeFw0y
MzAyMjUyMDEzMzhaFw0zMzAzMDUyMDEzMzhaMCsxEjAQBgNVBAoTCUNvY2tyb2Fj
aDEVMBMGA1UEAxMMQ29ja3JvYWNoIENBMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8A
MIIBCgKCAQEAp7fVT7JMbrzC0J9UeqN3r5YeW1FyYpVfpGWRiQICK8ZPv8NnzsaQ
SgOig83c9wax/wHP+xK4ISoTPMLc75eM+YKoN5fU17Ki28iopJwgIakCjSXJxcAv
cN0H6cn6BemL+qb9RS7Pffu8ohJKyLsNk7a/8xMNKUAPhgmBAYws4SOhG68/f1je
Lk8hsPrqVlCDBGPwVQdhCYkKvavLA7qG0D/+F+FfNI7a/qldqn/u74DN69gie5w4
37bB1IecleX3Ks0Ype+AiNzcdllUBC22ttVREpymVj7K24ti5DeyGPeHND5F/q6F
o8a/apYMPr+hbbPgsMjoreHlcCwgxk/zEwIDAQABo0UwQzAOBgNVHQ8BAf8EBAMC
AuQwEgYDVR0TAQH/BAgwBgEB/wIBATAdBgNVHQ4EFgQUabb2eIdtS1cn3QY/pNrk
v9Kyz8swDQYJKoZIhvcNAQELBQADggEBAI29Fz3SBzkSYdvhRWsXVjRL3XnteIvJ
GwwdIXgEx/Uxc+QXnOGRF6yKqMAzJhU15qP0u1LgqHq56tkeTmQAeApMg6VTa2wj
HibW77O8w8anukv5ThXeGs52FYTVzzQ/ao+y3R9cfyHQleoecohiFXYJ0RLKmj7n
ywZ9CocP6VnRklMyegpNBp9VWnnKTsMOs+lEaGzPDiJBdPJ0Ym9946jwaojb1st3
pnApAgN/32Ak9bTrBVf6Zl2zj6n6rLD294+EMScpvVqqIqA4iJh9cpGbIEu2TO4x
QrjTl5aBbP7e4VWQnVOSZgmeTJUnFm4L2kR53yFonmys0ZJ/14z0acw=
-----END CERTIFICATE-----
",
  "tls.crt": "-----BEGIN CERTIFICATE-----
MIID+jCCAuKgAwIBAgIQCjaSSuwS1yLAEuoysn7ZUDANBgkqhkiG9w0BAQsFADAr
MRIwEAYDVQQKEwlDb2Nrcm9hY2gxFTATBgNVBAMTDENvY2tyb2FjaCBDQTAeFw0y
MzAyMjUyMDEzMzhaFw0yODAzMDEyMDEzMzhaMCMxEjAQBgNVBAoTCUNvY2tyb2Fj
aDENMAsGA1UEAxMEbm9kZTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEB
AKT4igRxUZE5p7NDkDeSWqQjENX7W3tOTXoON1GyIjf8/j1xyQN2i/AMMFAdb5P9
f8mBFzsYes/WLgXWlPZQOal2MKJAOKJ1AYywKeZ+AqCYftIJlqm/1A/EdNn74Mv1
ykNU5f2YxdBAnl8MOIrIvWeghwzKv1PSYTiUDBFti9TNsQAvrwtXC8vrfir9rnz3
8j8QP1RMzQkySRUSsik0GGD/YMW5leTsEQKYxI+clkH7YM1pOUhw6b3SHbkZlYkO
arsgv2qlnjMUN4j/6HqtOyzu5wjyOBXKxccGwNtIJB3Xq0w3wYN1E3TWDmi9jY1c
T64w9KGgXLC8NR46MqjvfM0CAwEAAaOCASAwggEcMA4GA1UdDwEB/wQEAwIFoDAd
BgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwHwYDVR0jBBgwFoAUabb2eIdt
S1cn3QY/pNrkv9Kyz8swgckGA1UdEQSBwTCBvoIJbG9jYWxob3N0ghJjb2Nrcm9h
Y2hkYi1wdWJsaWOCGmNvY2tyb2FjaGRiLXB1YmxpYy5kZWZhdWx0gixjb2Nrcm9h
Y2hkYi1wdWJsaWMuZGVmYXVsdC5zdmMuY2x1c3Rlci5sb2NhbIINKi5jb2Nrcm9h
Y2hkYoIVKi5jb2Nrcm9hY2hkYi5kZWZhdWx0gicqLmNvY2tyb2FjaGRiLmRlZmF1
bHQuc3ZjLmNsdXN0ZXIubG9jYWyHBH8AAAEwDQYJKoZIhvcNAQELBQADggEBACAr
6MQL8dGjbhufGRcGjpKE/ctmwpoARvfFIvCg1S5/ZXPJTz4A9fp1B0bxKoHopaOO
6F2EH9B6qH6g3cFbD6au+QXc5f/kgxVJVJOewCOUDLRjH1i3Fcnd/zxvywQU6cIs
ArfwWW+XoifirNQ6MwqxtPVzjMectGQUs1IpdDwLReO6eS9pFo4kHMZiJi5XTgjJ
krDFMbFUW8qnul1w3UrxgikXeLKnuIDnegPpX4Xk0yYF1ycxA46ZORV+DybP3DG8
F6lH6wA3uF2E62Z/52XH7UUvtUaAIK937vbxXosufD8KwbXCNEcojlSDYtuKhtCq
KcywMKGrVgdtd/nwxy4=
-----END CERTIFICATE-----
",
  "tls.key": "-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEApPiKBHFRkTmns0OQN5JapCMQ1ftbe05Neg43UbIiN/z+PXHJ
A3aL8AwwUB1vk/1/yYEXOxh6z9YuBdaU9lA5qXYwokA4onUBjLAp5n4CoJh+0gmW
qb/UD8R02fvgy/XKQ1Tl/ZjF0ECeXww4isi9Z6CHDMq/U9JhOJQMEW2L1M2xAC+v
C1cLy+t+Kv2ufPfyPxA/VEzNCTJJFRKyKTQYYP9gxbmV5OwRApjEj5yWQftgzWk5
SHDpvdIduRmViQ5quyC/aqWeMxQ3iP/oeq07LO7nCPI4FcrFxwbA20gkHderTDfB
g3UTdNYOaL2NjVxPrjD0oaBcsLw1HjoyqO98zQIDAQABAoIBAQCGBtY6ncXi8rBo
V6/HNkQlrcdz0W6VUxxm2T3gRZS/X+8+BD+HbLxsHbrym7eWyBEVqKcy/8RnLl7d
p2QGaU8vejIw33QjqGPF5SlldWK1Dq+Z/OhGqO6kkLtOjfAoRFw7L7Jawc+UTatd
FRSqzEP0+No/bkja1MTfrofPcOx1ygiTHsSm3JHy+rh/bxRxeU9J5JBWUD1KeRS4
FRsYqf7tgv6KzBktRRs29q/HeU4up0S9HyjbE9emc99g6ZfX2dpmqoDW0kBjo729
x0XP2KxmSGeAogTmpVBz6RjoDuCUAbtUjMAbpbDRJJqnm6R8fIj1e+mDpSwOS4QN
dikzHQiBAoGBANacPfviPU81ddowy1HjPEko4C4Qy6brmPWuaeA6QUUL/MR+QrYN
Usp4B7d8lsLnZEdHyeszDnxPaAzj4rE7uDhSSMJmfflNqQVmWR6jByQ8GgzDFS5/
Re3LYR26DJMHBNGZqCxQ7us7Aqc0+YeDT8/wlniOAlndvXQ7l1Tt7nGZAoGBAMTJ
fk7Cs81SaalQQ05O7jfjwvi6kX+ISbBnBB6LyDCNkBoCwRKIHsiDIkKlFhGgwvim
+K/Ugsg8GuqYb5qd1ag+Kb8ykQpbMjkvr6Mh1bArN3KWSQTaiFko7nJLLd7P2H0V
WzrD/OUD0J2NKkzQLJcxuS8hc5YRj0DGWqzCVw1VAoGAAmj+yTVhOuJ+0FR79A95
PdkXq2zE3LsInLm4tqvwz7WywQIp/aForJ1seMMNbmLq3WIRAnMwVnUN1hc5FIR3
LSq/Zm+AOqyEmWrs1Us/aUjDgiEuu7byMhl2nb7ZJU2O4Eu5d8Xw6PNgtEAEDWGM
I+mvxurRW/EBj6ybpniFlQECgYB/DXzQSyMdeI0htOGPyKRDT3lNb9+K0KqLCyf8
tNEuj+eu84JGfb4qRYg0MTQbc4kOU3eSxokd0LisKHk+AZO1yVTYzkQYxKKbi29B
yxGVaYGmKOPCD3oi3qt8/Y8DIXyr3cMGIQ3BqwHhBwh9iZaQk5j1lgpzpKix8J8Q
lXTw9QKBgQCNKN1p9UNxpX19ni7Mg+MVPRkZgC68DBXYxNDnAexFjlatV+dIo03u
SxGBsB8Y1q4AbqHwWe8lSp+erADzeWtkYD4u9BSZl4WD50Mbev/Fut9dGJwnI+BJ
0ldr96qyslFD1RitRl5Xc6gOTcF4Bt/O5GRo5+2F4fDwJm6+dYIjJA==
-----END RSA PRIVATE KEY-----
"
}


I'm going to copy and paste the ca.crt output into a file called ca.crt.
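(Or, instead of copy/pasting, you can pull it straight out of the secret; on older macOS versions you may need base64 -D instead of -d:)

$ kubectl get secret cockroachdb-node -o jsonpath="{.data['ca\.crt']}" | base64 -d > ca.crt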

OK, now I can try to connect the right way:

$ cockroach sql --url 'postgres://roach:Q7gc8rEdS@34.139.126.177:26257/defaultdb?sslmode=verify-full&sslrootcert=ca.crt'
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
ERROR: failed to connect to `host=34.139.126.177 user=roach database=defaultdb`: failed to write startup message (x509: certificate is valid for 127.0.0.1, not 34.139.126.177)
Failed running "sql"

Notice that this time I pass my ca.crt file and use sslmode=verify-full, and I get an error saying that my node cert is not valid for 34.139.126.177.

To fix this, I need to re-issue my node certs so that they include the load balancer IP. If you want to create a DNS record like crdb.myproject.mydomain.com, now is a good time to do that, because we'll want to include that domain name on our new certs, too.

The process for generating the node certs is described in the CockroachDB docs.

Depending on how you installed CRDB in k8s in the first place, you might already have these various certs and keys in your local file system, or you might have to extract them from the k8s secrets (as we did above for the ca.crt file).

You don't need to re-create the ca key/crt pair, but you want to recreate the CRDB node cert. When you get to that step, add the IP address and DNS name (if you have one) as additional parameters to this command:

$ cockroach cert create-node \
> localhost 127.0.0.1 \
> cockroachdb-public \
> cockroachdb-public.default \
> cockroachdb-public.default.svc.cluster.local \
> *.cockroachdb \
> *.cockroachdb.default \
> *.cockroachdb.default.svc.cluster.local \
> 34.139.126.177 \
> crdb.myproject.mydomain.com \
> --certs-dir=certs \
> --ca-key=my-safe-directory/ca.key

This will create a node.crt/node.key pair. I am renaming those to tls.crt/tls.key since that's what they're called in my installation.
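For example (paths assume the certs directory used above):

$ mv certs/node.crt certs/tls.crt
$ mv certs/node.key certs/tls.key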

You can verify that the tls.crt file includes our IP and DNS name by running this command:

$ openssl x509 -in certs/tls.crt -noout -text | egrep -A 2 'X509v3 Subject Alternative Name'
            X509v3 Subject Alternative Name: 
                DNS:localhost, DNS:cockroachdb-public, DNS:cockroachdb-public.default, DNS:cockroachdb-public.default.svc.cluster.local, DNS:*.cockroachdb, DNS:*.cockroachdb.default, DNS:*.cockroachdb.default.svc.cluster.local, DNS:crdb.myproject.mydomain.com, IP Address:127.0.0.1, IP Address:34.139.126.177
    Signature Algorithm: sha256WithRSAEncryption

Notice that our IP and domain name are listed.

Now we need to do the following (a sketch of the commands follows the list):

  1. Delete the existing secret
  2. Re-create the secret with the new cert/key files
  3. Restart the pods so they will pick up the new secrets/certs
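
Using the secret and statefulset names from this deployment, those steps look roughly like this (an operator-managed cluster may reconcile or restart things slightly differently):

# 1. delete the existing node secret
$ kubectl delete secret cockroachdb-node

# 2. re-create it from the new files (already renamed to tls.crt/tls.key)
$ kubectl create secret generic cockroachdb-node \
    --from-file=ca.crt=certs/ca.crt \
    --from-file=tls.crt=certs/tls.crt \
    --from-file=tls.key=certs/tls.key

# 3. restart the pods so they pick up the new certs
$ kubectl rollout restart statefulset/cockroachdb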

After doing so, if I try to connect using the "right way", I am able to do so successfully:

$ cockroach sql --url 'postgres://roach:Q7gc8rEdS@34.139.126.177:26257/defaultdb?sslmode=verify-full&sslrootcert=certs/ca.crt'
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Client version: CockroachDB CCL v22.2.5 (aarch64-apple-darwin21.2, built 2023/02/16 16:37:38, go1.19.4)
# Server version: CockroachDB CCL v22.2.2 (x86_64-pc-linux-gnu, built 2023/01/04 17:23:00, go1.19.1)

warning: server version older than client! proceed with caution; some features may not be available.

# Cluster ID: 7539a31a-fc44-4f89-a154-cc60f8aaeddd
#
# Enter \? for a brief introduction.
#
roach@34.139.126.177:26257/defaultdb> 

Summary

When running CRDB on k8s in Production, you'll want to expose the CRDB nodes externally by using a load balancer. And, you'll want to re-create your node certs to reference the load balancer details.

Happy CRDBing!
