Arik

Posted on Sep 10, 2023 • Edited on Sep 12, 2023

Let's build a code execution engine

#programming #go #distributed #code

Have you ever wondered what happens behind the scenes when you hit "Run" on a code snippet in online development environments like Go Playground or OneCompiler?

By the end of this post, you will have the frontend and backend of a very bare-bones implementation of such a platform.

If you just want to see the code, you can find it here.

It's important to note that while skeletal, the implementation is by no means a "toy". We will address the most important considerations for building the core requirement of such a platform. Namely:

Security - We are letting users execute arbitrary code on our servers, so we need a way to isolate the code execution and limit the potential for abuse as much as possible.
Scalability - We need a way to scale our system as the number of users grows.
Limits - We want to limit the resources allocated to any single code execution so that it doesn't tax our servers or hurt other users' experience.

What are we going to use

In order to address all the considerations mentioned above, we are going to use Tork to do all the heavy lifting for us.

In a nutshell, Tork is a general purpose, distributed workflow engine that I've been working on for the past couple of months.

It uses Docker containers for the execution of workflow tasks which addresses point #1 and point #3 for us - we'll see exactly how in a minute.

It also supports a distributed setup to scale task processing to an arbitrary number of worker nodes which addresses point #2.

There are two ways we can go about this:

Download and install vanilla Tork and write a thin API service that sits between the client and Tork. We don't necessarily want to expose Tork's native API to clients, because we want tight control over which parameters are sent to Tork. The advantage of this approach is that the "middleware" server can be written in a language of your choice.
Extend Tork to expose our custom API endpoint and disable all other endpoints. This requires knowledge of Go programming.

For the purposes of this demo I'll be going with option #2.
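For comparison, here is a rough sketch of what the thin service from option #1 could look like. It is not part of the demo and it makes a few assumptions: the `clientRequest` type, the `allowed` list, the `proxy` helper and the `buildJob` callback are all hypothetical names, and the payload that Tork's native API expects is deliberately left to `buildJob` rather than guessed at here. The point is only that the client never gets to pick images, commands or limits.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// clientRequest is the only shape the public API accepts.
type clientRequest struct {
	Language string `json:"language"`
	Code     string `json:"code"`
}

// allowed is a hypothetical allowlist of languages the service exposes.
var allowed = map[string]bool{"python": true, "go": true, "bash": true}

// validate rejects anything outside the allowlist.
func validate(r clientRequest) error {
	if !allowed[r.Language] {
		return fmt.Errorf("unknown language: %q", r.Language)
	}
	return nil
}

// proxy validates the request and forwards a server-built job to upstream.
// buildJob is a placeholder for whatever payload the upstream API expects.
func proxy(upstream string, buildJob func(clientRequest) ([]byte, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req clientRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if err := validate(req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		body, err := buildJob(req)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		resp, err := http.Post(upstream, "application/json", bytes.NewReader(body))
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		// Relay the upstream status and body back to the client.
		w.WriteHeader(resp.StatusCode)
		_, _ = io.Copy(w, resp.Body)
	}
}

func main() {
	fmt.Println(validate(clientRequest{Language: "bash", Code: "echo hi"})) // <nil>
	fmt.Println(validate(clientRequest{Language: "perl", Code: "print 1"})) // unknown language: "perl"
}
```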

OK, let's write some code

You'll need:

Docker installed on the machine that you're running the demo on.
Go 1.19 or later

Create a new directory for the project:



mkdir code-execution-demo
cd code-execution-demo

Initialize the project:



go mod init example.com/code-execution-demo

Get the Tork dependency:



go get github.com/runabol/tork

Create a main.go file at the root of the project with the minimum boilerplate necessary to start Tork:



package main

import (
    "fmt"
    "os"

    "github.com/runabol/tork/cli"
    "github.com/runabol/tork/conf"
)

func main() {
    // Load the Tork config file (if it exists)
    if err := conf.LoadConfig(); err != nil {
        fmt.Println(err)
        os.Exit(1)
    }

    // Start the Tork CLI
    app := cli.New()
    if err := app.Run(); err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
}

Start Tork:



go run main.go

If all goes well, you should see something like this:



 _______  _______  ______    ___   _ 
|       ||       ||    _ |  |   | | |
|_     _||   _   ||   | ||  |   |_| |
  |   |  |  | |  ||   |_||_ |      _|
  |   |  |  |_|  ||    __  ||     |_ 
  |   |  |       ||   |  | ||    _  |
  |___|  |_______||___|  |_||___| |_|
...

Let's use the RegisterEndpoint hook to register our custom endpoint:



package main

import (
    "fmt"
    "net/http"
    "os"

    "github.com/runabol/tork/cli"
    "github.com/runabol/tork/conf"
    "github.com/runabol/tork/middleware/web"
)

func main() {
    // removed for brevity

    app.RegisterEndpoint(http.MethodPost, "/execute", handler)

    // removed for brevity
}

func handler(c web.Context) error {
    return c.String(http.StatusOK, "OK")
}

Start Tork in standalone (not distributed) mode:



go run main.go run standalone

Call the new endpoint from another terminal window:



% curl -X POST http://localhost:8000/execute
OK

So far so good.

Let's assume the client is going to send us the following JSON object:



{
  "language":"python|bash|go|etc.",
  "code":"the source code to execute"
}

Let's write a struct that we can bind these values to:



type ExecRequest struct {
    Code     string `json:"code"`
    Language string `json:"language"`
}



func handler(c web.Context) error {
  req := ExecRequest{}
  if err := c.Bind(&req); err != nil {
    c.Error(http.StatusBadRequest, err)
    return nil
  }

  return c.JSON(http.StatusOK, req)
}

At this point we just echo the request back to the user. But it's a good stepping stone to make sure the binding logic works. Let's try it:



% curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute

{"code":"echo hello world","language":"bash"}

OK, next we need to convert the request to a Tork task:



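// Assumes these imports: "github.com/pkg/errors" (for errors.Errorf)
// and "github.com/runabol/tork/input".
func buildTask(er ExecRequest) (input.Task, error) {
        var image string
        var run string
        var filename string

        switch er.Language {
        case "":
                return input.Task{}, errors.Errorf("required: language")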
        case "python":
                image = "python:3"
                filename = "script.py"
                run = "python script.py > $TORK_OUTPUT"
        case "go":
                image = "golang:1.19"
                filename = "main.go"
                run = "go run main.go > $TORK_OUTPUT"
        case "bash":
                image = "alpine:3.18.3"
                filename = "script"
                run = "sh ./script > $TORK_OUTPUT"
        default:
                return input.Task{}, errors.Errorf("unknown language: %s", er.Language)
        }

        return input.Task{
                Name:    "execute code",
                Image:   image,
                Run:     run,
                Files: map[string]string{
                        filename: er.Code,
                },
        }, nil
}

So we are doing three things here essentially:

Map the language field to a Docker image.
Write the code to an appropriate file in the container depending on the language.
Run the necessary command to execute the code in the container.
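As the list of languages grows, the switch statement can get long. A possible alternative, sketched here with only the standard library, is to keep the three values in a lookup table. The `langRuntime` type and `lookupRuntime` function are hypothetical names. The images and commands are copied from the switch above.

```go
package main

import "fmt"

// langRuntime describes how to run a snippet for one language.
type langRuntime struct {
	Image    string // Docker image to run the snippet in
	Filename string // file the snippet is written to inside the container
	Run      string // shell command that executes the file
}

// runtimes mirrors the cases of the switch statement in buildTask.
var runtimes = map[string]langRuntime{
	"python": {"python:3", "script.py", "python script.py > $TORK_OUTPUT"},
	"go":     {"golang:1.19", "main.go", "go run main.go > $TORK_OUTPUT"},
	"bash":   {"alpine:3.18.3", "script", "sh ./script > $TORK_OUTPUT"},
}

// lookupRuntime returns the runtime for a language or an error if the
// language is missing or unsupported.
func lookupRuntime(language string) (langRuntime, error) {
	if language == "" {
		return langRuntime{}, fmt.Errorf("required: language")
	}
	rt, ok := runtimes[language]
	if !ok {
		return langRuntime{}, fmt.Errorf("unknown language: %s", language)
	}
	return rt, nil
}

func main() {
	rt, err := lookupRuntime("python")
	fmt.Println(rt.Image, rt.Filename, err) // python:3 script.py <nil>

	_, err = lookupRuntime("cobol")
	fmt.Println(err) // unknown language: cobol
}
```

Adding a language then becomes a one-line change to the table, and `buildTask` only has to copy the three fields into the task.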

Let's use it in our handler:



task, err := buildTask(req)
if err != nil {
  c.Error(http.StatusBadRequest, err)
  return nil
}

And finally, let's submit the job:



input := &input.Job{
  Name:  "code execution",
  Tasks: []input.Task{task},
}

job, err := engine.SubmitJob(c.Request().Context(), input)
if err != nil {
  return err
}

fmt.Printf("job %s submitted!\n", job.ID)

Let's try to run our updated handler:



go run main.go run standalone



curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute

If all goes well, you should see something like this in the logs:



job 5488620e9bc34e09b6ec3677ea28a067 submitted!

Next, we want to get the execution output so we can return it to the client. But since Tork operates asynchronously, we need a way to have Tork tell us when the job is done (or has failed).

This is where JobListener comes in:



result := make(chan string)

listener := func(j *tork.Job) {
  if j.State == tork.JobStateCompleted {
    result <- j.Execution[0].Result
  } else {
    result <- j.Execution[0].Error
  }
}

// pass the listener to the submit job call
job, err := engine.SubmitJob(c.Request().Context(), input, listener)
if err != nil {
  return err
}

return c.JSON(http.StatusOK, map[string]string{"output": <-result})

Since the job listener does not run in the same goroutine as our handler, we need a way to pass the result back to the handler. Luckily, Go has a really convenient construct called a channel, which does exactly that.

OK, let's see if this works:



curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute
{"output":"hello world\n"}

Nice!

Security

Let's update our Task definition to enforce a stricter set of security constraints:



input.Task{
  Name:  "execute code",
  Image: image,
  Run:   run,
  Limits: &input.Limits{
    CPUs:   ".5",  // no more than half a CPU
    Memory: "20m", // no more than 20MB of RAM
  },
  Timeout:  "5s",             // terminate container after 5 seconds
  Networks: []string{"none"}, // disable networking
  Files: map[string]string{
    filename: er.Code,
  },
}

Let's disable Tork's built-in endpoints:

Create a file named config.toml in the root of your project with the following contents:



# config.toml
[coordinator.api]
endpoints.health = true
endpoints.jobs = false
endpoints.tasks = false
endpoints.nodes = false
endpoints.queues = false
endpoints.stats = false

Now when you start the project you should see that Tork picked up the config:



% go run main.go run standalone          
7:08PM INF Config loaded from config.tom
...

Frontend

Let's try to get the frontend to talk to our backend:



git clone git@github.com:runabol/code-execution-demo.git
cd code-execution-demo/frontend
npm i
npm run dev

And open http://localhost:3000

Scaling Up

The last point we are left to address is scalability.

There are many ways to tweak Tork for scalability, but for our purposes here we'll keep it simple and do the bare minimum of starting a RabbitMQ broker which will allow us to distribute the task processing:

Start a RabbitMQ broker. Example:



docker run \
  -d \
  --name=tork-rabbit \
  -p 5672:5672 \
  -p 15672:15672 \
  rabbitmq:3-management

Next, add the following to your config.toml:



[broker]
type = "rabbitmq"

[broker.rabbitmq]
url = "amqp://guest:guest@localhost:5672"

Stop Tork if it's currently running.

Start the Tork Coordinator:



go run main.go run coordinator

From a separate terminal window start a worker. You can also start additional workers if you like:



go run main.go run worker

Using curl or the frontend, try submitting a code snippet.

Conclusion

Hope you enjoyed this tutorial as much as I did.

The full source code can be found on GitHub.
