DEV Community

Anton Gubarev


RBAC with OPA

Open Policy Agent (OPA) is a tool that unifies policy enforcement across different technologies and systems. It removes the need to introduce a separate authorization approach for Kubernetes, for Kafka, or for each microservice. The tool is part of CNCF and is used by companies such as Netflix, Cloudflare and Pinterest.
Let me briefly explain how it works. OPA is written in Go and can be run as an agent/daemon or embedded as an extensible Go library. Access policies are written in the Rego language.
Imagine that your company has several hundred microservices and you need to restrict access to those that contain sensitive data, such as users' personal data. You can of course implement authorization in the handlers of each service: verify the token and store the access policies in the service itself. But you would have to repeat this implementation tens or hundreds of times, once per service, which is expensive and difficult.
It is much easier to implement the authorization mechanism once and let other services call it as needed. That seems simple and obvious. But things get complicated when access is needed not only to service handlers but also to the database (developers cannot investigate problems without access to the production database), to a namespace in the Kubernetes cluster, and so on. Imagine how many different systems you would have to visit to grant access.
OPA is a universal tool that can be integrated with other tools such as SSH, Kubernetes, Envoy and many others.
The picture below shows how it works.
[Image: diagram of how OPA works]
OPA can be used in many ways. In this article I will describe one of them: how to evaluate access policies against data stored in a SQL database, using the RBAC approach. I store the access policies in the database because this is flexible and makes it easy to build APIs and frontends to manage them.

Scheme

The main entities in the mechanism are:

  • User. Identified by a unique login. Policy subject.
  • Group. A user may belong to a single group. Policy subject.
  • Action. The operation the subject wants to perform.
  • Role. A set of actions that are allowed to everyone who has this role. This simplifies work with permissions: instead of issuing the same set of permissions many times, you assign the role once.
  • Service. This is the object of the authorization system. Rights are granted to work with the service and its resources (databases, S3 buckets, secrets in Vault and others).
--- groups
CREATE TABLE IF NOT EXISTS public.groups
(
    id serial,
    name character varying(250) NOT NULL,
    CONSTRAINT groups_pkey PRIMARY KEY (id),
    CONSTRAINT groups_name_key UNIQUE (name)
);

I made the name unique so that there are no accidental duplicates.

--- users
CREATE TABLE IF NOT EXISTS public.users(
    id serial,
    login varchar(250) NOT NULL,
    name varchar(250) NOT NULL,
    group_id integer references groups(id),
    CONSTRAINT users_pkey PRIMARY KEY (id),
    -- the login identifies the user, so it must not repeat
    CONSTRAINT users_login_key UNIQUE (login)
);

A user is identified by a unique login and belongs to one group. If you want, you can extend the example in this article to support more than one group per user.

--- roles
CREATE TABLE IF NOT EXISTS public.roles
(
    id serial,
    name text NOT NULL,
    actions character varying(100)[] NOT NULL,
    CONSTRAINT roles_pkey PRIMARY KEY (id)
);

A role is a simple entity that stores an array of available permissions. I will bind subjects to a role.

--- service
CREATE TABLE IF NOT EXISTS public.service
(
    id serial,
    name character varying NOT NULL,
    owner_id integer,
    -- the primary key is the id column declared above
    CONSTRAINT service_pkey PRIMARY KEY (id),
    CONSTRAINT service_name_key UNIQUE (name),
    CONSTRAINT service_owner_id_fkey FOREIGN KEY (owner_id)
        REFERENCES public.users (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE RESTRICT
);

The service name is unique. Each service also has an owner: the user who is chiefly responsible for it and who is, in effect, its admin.

CREATE TABLE IF NOT EXISTS public.rules
(
    id serial NOT NULL,
    user_id integer,
    group_id integer,
    role_id integer NOT NULL,
    service_id integer NOT NULL,
    CONSTRAINT rules_pkey PRIMARY KEY (id),
    CONSTRAINT rules_group_id_fkey FOREIGN KEY (group_id)
        REFERENCES public.groups (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE CASCADE,
    CONSTRAINT rules_role_id_fkey FOREIGN KEY (role_id)
        REFERENCES public.roles (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE RESTRICT,
    CONSTRAINT rules_service_id_fkey FOREIGN KEY (service_id)
        REFERENCES public.service (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE CASCADE,
    CONSTRAINT rules_user_id_fkey FOREIGN KEY (user_id)
        REFERENCES public.users (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE CASCADE
);

I bound each rule to its subject through the separate fields user_id and group_id, so a rule belongs either to a user or to a group. First, this simplifies the example and makes it easier to see how OPA works. Second, it lets foreign keys keep the data in the database consistent. In real systems it is better to use a pair of fields, subject_type and subject_id, which makes it easier to add new subject types (organization, team, another service, etc.).
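
The subject_type/subject_id variant mentioned above could be modeled roughly as follows. This is only a sketch: the Rule and Subject structs, the SubjectType constants and the AppliesTo helper are hypothetical names chosen for illustration and are not part of the schema in this article.

```go
package main

import "fmt"

// SubjectType names the kind of entity a rule is bound to.
// The constants are illustrative; more kinds can be added later.
type SubjectType string

const (
	SubjectUser  SubjectType = "user"
	SubjectGroup SubjectType = "group"
)

// Subject identifies who is asking for access.
type Subject struct {
	Type SubjectType
	ID   int
}

// Rule mirrors a row of a hypothetical rules table that stores
// subject_type/subject_id instead of separate user_id and group_id columns.
type Rule struct {
	SubjectType SubjectType
	SubjectID   int
	RoleID      int
	ServiceID   int
}

// AppliesTo reports whether the rule is bound to any of the given subjects,
// for example the user and the group the user belongs to.
func (r Rule) AppliesTo(subjects ...Subject) bool {
	for _, s := range subjects {
		if r.SubjectType == s.Type && r.SubjectID == s.ID {
			return true
		}
	}
	return false
}

func main() {
	rule := Rule{SubjectType: SubjectGroup, SubjectID: 7, RoleID: 1, ServiceID: 3}

	// User 42 belongs to group 7, so the group rule applies.
	fmt.Println(rule.AppliesTo(Subject{SubjectUser, 42}, Subject{SubjectGroup, 7}))

	// User 7 shares the numeric id but is a different kind of subject.
	fmt.Println(rule.AppliesTo(Subject{SubjectUser, 7}))
}
```

The trade-off is that a single subject_id column cannot carry a foreign key to several tables at once, so the consistency that the separate user_id and group_id fields get from the database has to be checked in the application.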

Rego rule

Now I need to implement a rule that succeeds if at least one of the following conditions is met:

  • The user is the owner of the service. Owners are allowed any action.
  • The user has permission for the requested action on the service.
  • The user's group has permission for the requested action on the service.
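
Before moving to Rego, the three conditions can be sketched in plain Go to make the decision logic explicit. The Store type below is a hypothetical in-memory stand-in for the database tables, and the sample data is invented for illustration; the real checks in this article are done by Rego and SQL.

```go
package main

import "fmt"

// Request carries the three values the policy receives as input.
type Request struct {
	UserID      int
	ServiceName string
	Action      string
}

// Store is a hypothetical in-memory stand-in for the database tables.
type Store struct {
	Owners     map[string]int              // service name -> owner user id
	UserGroup  map[int]int                 // user id -> group id
	UserPerms  map[int]map[string][]string // user id -> service -> allowed actions
	GroupPerms map[int]map[string][]string // group id -> service -> allowed actions
}

func contains(actions []string, action string) bool {
	for _, a := range actions {
		if a == action {
			return true
		}
	}
	return false
}

// IsAllowed returns true as soon as one of the three conditions holds
// and false otherwise, which mirrors a default-deny policy.
func (s Store) IsAllowed(r Request) bool {
	// 1. The owner of the service may perform any action.
	if owner, ok := s.Owners[r.ServiceName]; ok && owner == r.UserID {
		return true
	}
	// 2. The user has a role that contains the action.
	if contains(s.UserPerms[r.UserID][r.ServiceName], r.Action) {
		return true
	}
	// 3. The user's group has a role that contains the action.
	if g, ok := s.UserGroup[r.UserID]; ok && contains(s.GroupPerms[g][r.ServiceName], r.Action) {
		return true
	}
	return false
}

func main() {
	s := Store{
		Owners:     map[string]int{"billing": 1},
		UserGroup:  map[int]int{3: 10},
		UserPerms:  map[int]map[string][]string{2: {"billing": {"read"}}},
		GroupPerms: map[int]map[string][]string{10: {"billing": {"deploy"}}},
	}
	fmt.Println(s.IsAllowed(Request{1, "billing", "delete"})) // owner
	fmt.Println(s.IsAllowed(Request{2, "billing", "read"}))   // user permission
	fmt.Println(s.IsAllowed(Request{3, "billing", "deploy"})) // group permission
	fmt.Println(s.IsAllowed(Request{3, "billing", "read"}))   // no match
}
```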

There are two ways to give OPA the data it needs:

  • Load all user, group and rule data from the database into OPA and refresh it every few minutes.
  • Query the database directly from the Rego rules.

The second way looks preferable because:

  • The data is always up to date and never has to be refreshed.
  • There is no need to keep a large amount of data in memory, which in some cases can become huge.

So our rule will look like this:

package auth

default is_allowed = false

is_allowed {
    is_user_service_owner(input.userId, input.serviceName)
}

is_allowed {
    has_user_permission(input.userId, input.serviceName, input.action)
}

is_allowed {
    has_user_group_permission(input.userId, input.serviceName, input.action)
}

Some arguments are ids and others are unique names. In real life you cannot always get the data in the form you would like, so this makes the example closer to reality.
By default is_allowed is false. This means that if none of the conditions below is met, the overall result is false too. Let's move on to the implementation of the custom functions.

Rego custom functions

OPA allows us to extend its set of built-in functions. I will show how this works using has_user_permission as an example. The other functions differ only in the SQL query inside and can be implemented by analogy.

import (
    "fmt"

    "github.com/open-policy-agent/opa/ast"
    "github.com/open-policy-agent/opa/rego"
    "github.com/open-policy-agent/opa/types"
    // plus the project's own storage package, which declares HasPermission
)

// RegoHasUserPermission exposes a storage lookup to Rego as has_user_permission.
type RegoHasUserPermission struct {
    userPermission storage.HasPermission
}

func NewRegoHasUserPermission(userPermission storage.HasPermission) *RegoHasUserPermission {
    return &RegoHasUserPermission{
        userPermission: userPermission,
    }
}

// GetFunction returns an option for rego.New that registers the custom function.
func (r *RegoHasUserPermission) GetFunction() func(*rego.Rego) {
    return rego.Function3(&rego.Function{
        Name: "has_user_permission",
        // (number, string, string) -> boolean
        Decl: types.NewFunction(types.Args(types.N, types.S, types.S), types.B),
        // The answer comes from the database, so it is not memoized.
        Memoize: false,
    }, func(_ rego.BuiltinContext, userId, serviceName, action *ast.Term) (*ast.Term, error) {
        userIDInt, err := termToInt(userId)
        if err != nil {
            return nil, fmt.Errorf("OPA function RegoHasUserPermission, arg `userId`: %w", err)
        }

        serviceNameStr, err := termToString(serviceName)
        if err != nil {
            return nil, fmt.Errorf("OPA function RegoHasUserPermission, arg `serviceName`: %w", err)
        }

        actionStr, err := termToString(action)
        if err != nil {
            return nil, fmt.Errorf("OPA function RegoHasUserPermission, arg `action`: %w", err)
        }

        // The call matches the HasPermission interface, which takes no context.
        result, err := r.userPermission.HasUserPermission(*userIDInt, *serviceNameStr, *actionStr)
        if err != nil {
            return nil, fmt.Errorf("OPA function RegoHasUserPermission: %w", err)
        }
        return ast.BooleanTerm(result), nil
    })
}

I created a structure that receives the storage as an interface.

type HasPermission interface {
    HasUserPermission(userId int, serviceName, action string) (bool, error)
}

Its implementation is shown further below. I declared the has_user_permission function with three arguments:

  • The first has type types.N, a number. The user id is passed here.
  • The other two have type types.S, a string. They carry the service name and the action name.

To convert the argument values received while the Rego function is executed, I implemented two helpers, termToString and termToInt, which convert a string term into a Go string and a numeric term into an int.

func termToString(arg *ast.Term) (*string, error) {
    astStringVal, ok := arg.Value.(ast.String)
    if !ok {
        return nil, fmt.Errorf("cannot convert term to string: %s", arg.String())
    }
    stringVal := string(astStringVal)

    return &stringVal, nil
}

func termToInt(arg *ast.Term) (*int, error) {
    number, ok := arg.Value.(ast.Number)
    if !ok {
        return nil, fmt.Errorf("cannot convert term to number: %s", arg.String())
    }
    intval, ok := number.Int()
    if !ok {
        return nil, fmt.Errorf("cannot convert term to int: %s", arg.String())
    }

    return &intval, nil
}

Once the arguments have been received without errors, all that remains is to run the query against the database. The custom function above uses the HasPermission interface. I implemented it as follows.

func (pg *UserPermissionControllerPg) HasUserPermission(userId int, serviceName, action string) (bool, error) {
    rule := struct {
        Id int `db:"id"` // sqlx maps columns through the db tag
    }{}
    // Column names follow the table definitions above: service.id and service.name.
    query := `SELECT r.id FROM rules AS r
        LEFT JOIN service AS s ON r.service_id=s.id
        LEFT JOIN roles AS rr ON r.role_id=rr.id
        WHERE r.user_id=$1
            AND s.name=$2
            AND $3=ANY(rr.actions)
        LIMIT 1`

    if err := pg.db.GetContext(context.Background(), &rule, query, userId, serviceName, action); err != nil {
        // No matching rule is not an error: the user simply has no permission.
        if errors.Is(err, sql.ErrNoRows) {
            return false, nil
        }
        return false, fmt.Errorf("HasUserPermission: %w", err)
    }

    return true, nil
}

In this method, I verify that the requested action is present in one of the user's roles. As you remember, there is a set of rules that are tied to a user and a service, and each rule is tied to a role. The query looks for at least one record in the rules table that binds the user and the service to a role containing the action. If such a record exists, the user has the permission.
The Rego functions is_user_service_owner and has_user_group_permission can be implemented by analogy. The only difference is the query used to find the rule.
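
As a rough illustration of that analogy, the two remaining lookups might be written like this. It is a sketch under assumptions: the queries use the table and column names from the CREATE TABLE statements in this article, the $n placeholders assume PostgreSQL, and the exists helper and both function names are hypothetical. It uses the standard database/sql package rather than the db wrapper from the article's code.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// A user owns a service when service.owner_id points at the user.
const isUserServiceOwnerQuery = `SELECT EXISTS (
	SELECT 1 FROM service AS s
	WHERE s.owner_id = $1 AND s.name = $2)`

// A group permission is found through the user's group_id:
// users -> rules (by group) -> service and roles.
const hasUserGroupPermissionQuery = `SELECT EXISTS (
	SELECT 1 FROM rules AS r
	JOIN users AS u ON u.group_id = r.group_id
	JOIN service AS s ON r.service_id = s.id
	JOIN roles AS rr ON r.role_id = rr.id
	WHERE u.id = $1 AND s.name = $2 AND $3 = ANY(rr.actions))`

// exists runs a SELECT EXISTS query and returns its single boolean column.
func exists(ctx context.Context, db *sql.DB, query string, args ...interface{}) (bool, error) {
	var found bool
	if err := db.QueryRowContext(ctx, query, args...).Scan(&found); err != nil {
		return false, fmt.Errorf("exists: %w", err)
	}
	return found, nil
}

// IsUserServiceOwner reports whether the user is the owner of the service.
func IsUserServiceOwner(ctx context.Context, db *sql.DB, userID int, serviceName string) (bool, error) {
	return exists(ctx, db, isUserServiceOwnerQuery, userID, serviceName)
}

// HasUserGroupPermission reports whether the user's group may perform the action.
func HasUserGroupPermission(ctx context.Context, db *sql.DB, userID int, serviceName, action string) (bool, error) {
	return exists(ctx, db, hasUserGroupPermissionQuery, userID, serviceName, action)
}

func main() {
	// Opening a real connection needs a driver and a DSN, which are outside
	// the scope of this sketch, so main only prints the queries.
	fmt.Println(isUserServiceOwnerQuery)
	fmt.Println(hasUserGroupPermissionQuery)
}
```

SELECT EXISTS always returns exactly one row, so there is no sql.ErrNoRows case to handle here.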

Integration

It only remains to integrate the resulting mechanism into the application.

type RegoAuthController struct {
    rego *rego.Rego
}

func NewRegoController(
    policiesPath []string,
    ruleName string,
    funcs ...func(*rego.Rego)) *RegoAuthController {
    opts := []func(*rego.Rego){
        rego.Query(ruleName),
        // rego.Load takes the paths and an optional file filter; nil loads everything.
        rego.Load(policiesPath, nil),
    }
    // Appending an empty slice is a no-op, so no length check is needed.
    opts = append(opts, funcs...)

    return &RegoAuthController{
        rego: rego.New(opts...),
    }
}

I created a structure that holds a reference to the Rego object, and the constructor performs its initial configuration. In particular, you need to pass the paths to the files in which the rules are stored. In our example there is one file, but you can create as many as you like, for example one per domain. In addition, the constructor accepts extra functions that configure OPA. I need this to pass in the custom function described above.

func (p *RegoAuthController) IsAllowed(ctx context.Context, input map[string]interface{}) (bool, error) {
    query, err := p.rego.PrepareForEval(ctx)
    if err != nil {
        return false, fmt.Errorf("IsAllowed PrepareForEval: %v", err)
    }
    resultSet, err := query.Eval(ctx, rego.EvalInput(input))
    if err != nil {
        return false, fmt.Errorf("eval rego query: %v", err)
    }

    if len(resultSet) == 0 {
        return false, errors.New("undefined result")
    }

    return resultSet.Allowed(), nil
}

This method runs the rule and passes it the arguments as a map. In the rule that I showed earlier, these values are read from the structure called input.
The rest of the code should be clear. The prepared query is evaluated. Because a query can produce several results, a result set is returned. If, in addition to is_allowed, the query included another rule, for example is_owner, its result would also be present in the response. We don't need that here, so the single condition is checked with a call to resultSet.Allowed().
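
To make the input structure concrete, here is a small sketch of the document that the rule reads as input. The buildInput helper and the sample values (user 42, service "billing", action "read") are hypothetical; only the three key names come from the rule shown earlier.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildInput assembles the document that the policy reads as `input`.
// The keys must match the names used in the rule:
// input.userId, input.serviceName and input.action.
func buildInput(userID int, serviceName, action string) map[string]interface{} {
	return map[string]interface{}{
		"userId":      userID,
		"serviceName": serviceName,
		"action":      action,
	}
}

func main() {
	input := buildInput(42, "billing", "read")

	// Print the map as JSON to show the shape of the input document.
	out, err := json.MarshalIndent(input, "", "  ")
	if err != nil {
		fmt.Println("marshal input:", err)
		return
	}
	fmt.Println(string(out))
}
```

A misspelled key does not produce an error in Rego: the reference is simply undefined, the conditions fail, and the default false applies.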

Now RegoAuthController can be used in the handler that receives the authorization request, or anywhere else. It depends on the specific application. Below is an example of what it might look like.

func (h *Handler) Handle(...) error {
    // The constructor takes only the storage, as declared above.
    regoHasUserPermissionFunction := policy.NewRegoHasUserPermission(hasPermission)
    authController := policy.NewRegoController(
        []string{h.policyDir + "/auth.rego"},
        "data.auth.is_allowed",
        regoHasUserPermissionFunction.GetFunction(),
    )

    // IsAllowed takes the context and the input map, nothing else.
    isAllowed, err := authController.IsAllowed(ctx, map[string]interface{}{
        "userId":      in.UserId,
        "serviceName": in.ServiceName,
        "action":      in.Action,
    })
    if err != nil {
        return fmt.Errorf("handler: %w", err)
    }
    out.IsAllowed = isAllowed

    return nil //nolint:govet
}

In this example, I created a new RegoAuthController object and configured it:

  • Passed the file that contains the rules (its contents are shown earlier in the article). If necessary, you can pass several files.
  • Specified which rule to evaluate: data.auth.is_allowed, which consists of the package name and the rule name.
  • Passed the custom function. I configured it beforehand with a ready connection to the database, so that the connection is not opened again on each request.

After that it only remains to call the IsAllowed method and pass the parameters from the request. I omit some implementation details, such as establishing a database connection and working with the HTTP framework, because they are not related to the topic and vary greatly from project to project.

Conclusion

I have shown the whole mechanism. It can be extended further with whatever functionality you need. For example, rights can be granted to other entities, such as an organization or other services. I hope that the presented approach will help you with your tasks. If you have any questions, ask them in the comments or send me an email.
