José Miguel Parrella

Posted on Feb 13, 2020

Dapr, the hard way

#dapr #kubernetes #distributedsystems

This weekend I wanted to catch up with Dapr, the Distributed Application Runtime. As a sysadmin by trade, I have little knowledge of application model theory and limited hands-on experience with some of the challenges that Dapr seeks to address, so I was more inclined to figure out how things are put together, which proved useful for understanding the value of Dapr.

TL;DR: what the Dapr software does is stand up a localhost endpoint next to your application. This endpoint, which speaks HTTP and gRPC, provides a standard, simple API (the Dapr spec) through which your application and any other application that interacts with yours can invoke methods, store and retrieve state, publish and subscribe to events, and more.

Why dapr?

The reasons why you'd want such a thing are best described on the website and in the Azure Friday two-part interview with the team, but I'll give it a shot.

At first glance, none of the things Dapr can do are necessarily new to developers. Everyone consumes events and stores state. What's new is the ability to enable large, complex, distributed applications without a lot of focus on the (traditionally hard) implementation details.

Historically, developers have brought things like session stores into their applications through:

1) Framework facilities
2) Middleware
3) External services

Examples of framework facilities that can achieve what Dapr does include the session management capabilities in Laravel for PHP developers, or things like Catalyst::Plugin::Session for Perl developers.

Those facilities offer the most idiomatic option for developers, but they come at a cost: it's harder to part ways with the framework; in some cases the facility isn't ready for a distributed world (the most simplistic ones are instance-bound, live in memory, etc.); and even with the more sophisticated ones, someone still has to write and maintain the code that persists state, interacts with the backends, etc.

Middlewares are a generalization of this. They tend to be less language dependent, but they also tend to require more specialized operator knowledge. Some have rich language bindings, like the Java bindings for many Apache projects and Java-adjacent middlewares. With externally managed services, the issue tends to be around the need to import and keep track of SDKs, which can introduce challenges from poor developer experience to severe API lag, missing documentation and more.

Imagine a world where your application can be written in any language (or many) and where you can always count on a local endpoint that offers you the building blocks to make your application highly decoupled and distributed (including in Kubernetes).

This would be without taking any SDK dependencies (and perhaps even dropping some!) and just using HTTP/gRPC and JSON. All you need to know as a developer is how to target the Dapr spec.

And even if ultimately all state must be persisted somewhere, and/or a pub/sub server must be somehow available, you can delegate all of these decisions to someone else in your team, and those things can change from underneath you without you changing your application: want to use a managed CosmosDB instance instead of a bring-your-own Redis one? Dapr can do that.

Dapr and `podman`

I'm running Debian sid and I wanted to run dapr on my local machine. I followed the installation instructions (the install script basically fetches the latest release from GitHub) and ran dapr init. This failed (at least in version 0.3.0) because I don't have docker on my machine. Since I wanted to use podman, I took a look at the code to see how I could make that happen.

In standalone mode (i.e., not Kubernetes) dapr init does three key things: check that docker is installed, fetch the daprd binary and prepare the runtime. So I went ahead and changed the logic for detecting and calling docker (see #257) in my cloned repo, and rebuilt dapr with go build.

Once I ran dapr init with the resulting binary, I had local containers running Dapr's placement service and Redis as the default state store in standalone mode:

$ dapr init 
⌛  Making the jump to hyperspace...
✅  Downloading binaries and setting up components...
✅  Success! Dapr is up and running

$ podman ps | grep dapr
231a26a31000  docker.io/library/redis:latest   redis-server          4 minutes ago  Up 4 minutes ago  0.0.0.0:6379->6379/tcp    dapr_redis
f49eb9753492  docker.io/daprio/dapr:latest                           4 minutes ago  Up 4 minutes ago  0.0.0.0:50005->50005/tcp  dapr_placement

dapr init also fetches daprd, which is the component that actually starts alongside each application instance. The Redis container supports it (for storing state) and the placement container is used for actors -- more on this later.

Dapr in action

The easiest way to see Dapr in action in standalone mode is to run one of the samples (don't forget to install sample dependencies with npm install), for example:

$ dapr run --app-id alpha --log-level error --app-port 3000 -- node app.js
ℹ️  Starting Dapr with id alpha. HTTP Port: 34651. gRPC Port: 34655
✅  You're up and running! Both Dapr and your app logs will appear here.
== APP == Node App listening on port 3000!

As you can see, I pass the command that starts my Node.js application as an argument to dapr run, and I use --app-port to tell daprd which port the application listens on. What I can do now is query the Dapr endpoint at localhost:34651: call my Node.js application's methods through Dapr, and persist and query state. I can do this over HTTP or gRPC, with the dapr CLI or a tool like curl:

$ dapr invoke --app-id alpha --method neworder --payload '{"data": { "orderId": "42" } }'
$ curl http://localhost:34651/v1.0/invoke/alpha/method/order ; echo
{"orderId":"42"}
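
The same two calls can be made from any language with an HTTP client. Here is a minimal sketch in Node.js, assuming the sidecar's HTTP port is the 34651 reported by dapr run above. The helper name invokeRequest is made up for illustration and is not part of Dapr; it only builds the request so it can be inspected before being sent.

```javascript
// Hypothetical helper (not part of Dapr): describes a service-invocation request.
// URL shape follows the curl call above: /v1.0/invoke/<app-id>/method/<method>
function invokeRequest(daprPort, appId, method, payload) {
  const url = `http://localhost:${daprPort}/v1.0/invoke/${appId}/method/${method}`;
  if (payload === undefined) {
    // No payload: a plain GET, like the curl example.
    return { url, options: { method: "GET" } };
  }
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

const daprPort = 34651; // assumed: taken from the `dapr run` output above
const create = invokeRequest(daprPort, "alpha", "neworder", { data: { orderId: "42" } });
const read = invokeRequest(daprPort, "alpha", "order");

console.log(create.url);
console.log(read.url);
```

With Node.js 18 or later (or a fetch polyfill such as node-fetch), fetch(create.url, create.options) would send the request.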

The interesting thing, when you read the code of the /neworder method in my application, is that it also calls Dapr for persistence (and yes, daprPort is passed as an environment variable), which makes the entire programming model decoupled and simple:

app.post('/neworder', (req, res) => {
    const state = [{
      key: "order",          // the key that the /order method reads back
      value: req.body.data   // e.g. {"orderId": "42"}
    }];

    fetch(stateUrl, { // http://localhost:${daprPort}/v1.0/state/${stateStoreName}
        method: "POST",
        body: JSON.stringify(state),
        headers: {
            "Content-Type": "application/json"
        }
    }).then((response) => {
    ...
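
For reference, the state endpoint the handler talks to takes an array of key/value pairs on write and a key in the URL path on read. Below is a minimal sketch of both requests. The store name statestore, the fallback port 3500 and the helper names are assumptions for illustration; the port is read from the DAPR_HTTP_PORT environment variable when it is set.

```javascript
// Hypothetical helpers (not part of Dapr) describing requests to the state endpoint.
function stateUrl(daprPort, storeName) {
  return `http://localhost:${daprPort}/v1.0/state/${storeName}`;
}

// Writes are a POST whose body is an array of { key, value } objects.
function saveStateRequest(daprPort, storeName, entries) {
  return {
    url: stateUrl(daprPort, storeName),
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(entries),
    },
  };
}

// Reads are a GET with the key appended to the store URL.
function getStateRequest(daprPort, storeName, key) {
  return {
    url: `${stateUrl(daprPort, storeName)}/${encodeURIComponent(key)}`,
    options: { method: "GET" },
  };
}

const daprPort = process.env.DAPR_HTTP_PORT || 3500; // 3500 is an assumed fallback
const storeName = "statestore"; // assumed store name

const save = saveStateRequest(daprPort, storeName, [
  { key: "order", value: { orderId: "42" } },
]);
const load = getStateRequest(daprPort, storeName, "order");

console.log(save.options.body); // [{"key":"order","value":{"orderId":"42"}}]
console.log(load.url);
```

Keeping the request description separate from the fetch call makes it easy to log or unit-test what would be sent without a sidecar running.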

And this also means that instead of curl or a CLI, I could have written a consumer in any other language. In fact, Dapr's components extend to pub-sub, input/output bindings (which are great for triggers and other events), state and secret stores, tracing exporters, OAuth authorization and, perhaps most dramatically, Virtual Actors support, which helps abstract the implementation details of things such as concurrency control.

Check out all the supported integrations in the components-contrib repo!
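
As one example of those other building blocks, publishing an event is again a single HTTP call to the sidecar. The sketch below is hedged: the topic name orders, the component name pubsub, the port 3500 and the helper name are all made up, and the path shape depends on the Dapr version. Early releases used /v1.0/publish/<topic>, while later ones put the pub/sub component name in the path as /v1.0/publish/<pubsubname>/<topic>.

```javascript
// Hypothetical helper (not part of Dapr): builds a publish request for the sidecar.
// pubsubName is optional because early Dapr releases had no component name in the path.
function publishRequest(daprPort, topic, event, pubsubName) {
  const base = `http://localhost:${daprPort}/v1.0/publish`;
  const url = pubsubName ? `${base}/${pubsubName}/${topic}` : `${base}/${topic}`;
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(event),
    },
  };
}

// Assumed values for illustration only.
const early = publishRequest(3500, "orders", { orderId: "42" });
const later = publishRequest(3500, "orders", { orderId: "42" }, "pubsub");

console.log(early.url);
console.log(later.url);
```

Which of the two URLs applies depends on the release you run, so check the API reference for your version before relying on either.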

Beyond standalone `daprd`

Of course, things get far more interesting when you deploy this to Kubernetes. A simple dapr init --kubernetes will use your current kubeconfig to deploy the Dapr elements to your cluster. You can deploy a Redis chart, or use a managed Redis service or another state store. I ended up with a Dapr-enabled cluster:

$ kubectl get pods  | grep dapr
dapr-operator-68f7dcb454-zjhdj           1/1     Running   0          4d1h
dapr-placement-6d77d54dc6-ww5rb          1/1     Running   0          4d1h
dapr-sidecar-injector-86d6ccf956-7r85k   1/1     Running   0          4d1h

The distributed calculator sample is a great way to see Dapr in action in a Kubernetes cluster. You'll get a React-based calculator that can persist state to Dapr and that calls services in multiple languages for each operation, all over Dapr. Once you deploy this sample, you'll end up with a bunch of services:

$ kubectl get svc
NAME                        TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)            AGE
addapp-dapr                 ClusterIP      10.0.66.128    <none>           80/TCP,50001/TCP   4d1h
calculator-front-end        LoadBalancer   10.0.15.166    158.51.155.210   80:30366/TCP       4d1h
calculator-front-end-dapr   ClusterIP      10.0.79.13     <none>           80/TCP,50001/TCP   4d1h
divideapp-dapr              ClusterIP      10.0.160.36    <none>           80/TCP,50001/TCP   4d1h
...

You'll notice that each service has a -dapr sidecar, and that the React frontend doesn't know the cluster IP or service names of each operation's corresponding service. This is a key aspect of Dapr beyond standalone mode: Kubernetes-aware service discovery, along with mDNS capabilities for non-Kubernetes environments, that developers don't need to implement in their code.
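
From the caller's point of view, this means addressing other services by Dapr app id through its own sidecar. Here is a sketch of that idea, assuming the app ids are the service names above without the -dapr suffix (addapp, divideapp), that each app exposes a method named after its operation, and that the sidecar listens on HTTP port 3500. The method names and the port are assumptions, not taken from the output above.

```javascript
// Assumed mapping from calculator operation to Dapr app id and method name.
const operations = {
  add: { appId: "addapp", method: "add" },
  divide: { appId: "divideapp", method: "divide" },
};

// The caller only ever talks to its own sidecar on localhost;
// resolving the app id to the right pod is left to Dapr.
function operationUrl(daprPort, operation) {
  const target = operations[operation];
  if (!target) {
    throw new Error(`unknown operation: ${operation}`);
  }
  return `http://localhost:${daprPort}/v1.0/invoke/${target.appId}/method/${target.method}`;
}

console.log(operationUrl(3500, "add"));
console.log(operationUrl(3500, "divide"));
```

Note that no cluster IP or Kubernetes service name appears anywhere in the caller's code.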

And what do you say? Passing around the output of something like jc? Using dapr as the backend for something like deskconn? Running dapr run --app-id ncapp --app-port 3000 -- socat -v tcp-l:3000,fork exec:'/bin/cat'? Sure! Why not?

Summary

Dapr offers an excellent path to getting rid of redundant logic with brittle implementations and enabling large teams to start doing things like service invocation, decoupled state stores and event-driven programming while reducing the third-party code and SDK footprint in the codebase.

It could be particularly exciting when coupled with OAM/Rudr (watch Mark Russinovich's Dapr, Rudr, OAM interview at Microsoft Ignite) to help further separate developer and operator concerns in Kubernetes clusters.

I learned a thing (or five) in the process of checking it out. Give it a try, reach out to the team and say hi!
