Ian Kleats

Posted on Feb 21, 2020

GRANDstack Access Control - Basics and Concepts

#graphql #neo4j #grandstack #node

Hey there. Thank you for joining me on a journey of exploration and discovery into unlocking some of the most powerful features of the GRANDstack! By the end of this series, we'll be able to implement fine-grained discretionary access control features into the GraphQL endpoint generated by neo4j-graphql-js.

Cool, right? I thought so too.

Before we dive in...

First off, this series assumes some basic familiarity with GraphQL concepts and the GRANDstack itself (GraphQL, React, Apollo, Neo4j Database). The most important of those topics will be GRANDstack's support for complex nested filtering. Luckily, there's a good blog post to get you up to speed.

Second, this is not a full-fledged tutorial... yet. The posts in this series are as much a learning log, documenting these concepts as they are developed in real time, as they are an invitation for others to think about and share their own approaches. Learning can be messy. Let's get messy together.

And back to the action...

Ok, let's start small. You know what's small? A boring old To-Do app.

(Wait, you promised an epic journey of awesomeness and are giving me some crappy To-Do app?!?!? For now at least, yes.)

We've heard about this thing called the GRANDstack. It has a lot of synergy out of the box. All you really need to get your backend up is your GraphQL type definitions (i.e. the data model). neo4j-graphql-js will generate the executable schema from there, which can be served by apollo-server.

Ignoring the custom mutation you might use for user login, your type definitions might look like:

const typeDefs = `
type User {
  ID: ID!
  firstName: String
  lastName: String
  email: String!
  todoList: [Task] @relation(name: "TO_DO", direction: "OUT")
}
type Task {
  ID: ID!
  name: String!
  details: String
  location: Point
  complete: Boolean!
  assignedTo: User @relation(name: "TO_DO", direction: "IN")
}
`;

Cool beans. We have Users that can be assigned Tasks. Our tasks even take advantage of neo4j-graphql-js Spatial Types that could be useful in the future!
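
For the curious, wiring those type definitions into a running endpoint might look roughly like the sketch below. Treat it as a sketch under assumptions rather than the canonical setup: the `startServer` name, the environment variable names, the fallback credentials, and the port are all made up for illustration, and it assumes the 4.x `neo4j-driver` package alongside `neo4j-graphql-js` and `apollo-server`.

```javascript
// Sketch only: turns GraphQL type definitions into a served endpoint.
// startServer, the env var names, and the default port are illustrative assumptions.
async function startServer(typeDefs, port = 4000) {
  // Required here so the packages are only needed once the server actually starts.
  const { makeAugmentedSchema } = require('neo4j-graphql-js');
  const { ApolloServer } = require('apollo-server');
  const neo4j = require('neo4j-driver');

  // Connection details are placeholders; point them at your own database.
  const driver = neo4j.driver(
    process.env.NEO4J_URI || 'bolt://localhost:7687',
    neo4j.auth.basic(
      process.env.NEO4J_USER || 'neo4j',
      process.env.NEO4J_PASSWORD || 'neo4j'
    )
  );

  // neo4j-graphql-js generates the queries, mutations, and filter arguments.
  const schema = makeAugmentedSchema({ typeDefs });

  // The driver goes into the context so generated resolvers can reach Neo4j.
  const server = new ApolloServer({ schema, context: { driver } });

  const { url } = await server.listen(port);
  console.log(`GraphQL API ready at ${url}`);
  return server;
}
```

Calling `startServer(typeDefs)` with the definitions above should be all the backend code this To-Do app needs for now.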

Let's run it and...

What went wrong?

Oh, your app works great. That is, if you wanted Bob down the street to see that you need to stop by the pharmacy to pick up some hemorrhoid cream.

We could use the @additionalLabels directive on Task to keep each task accessible to only one User, but that's kind of limited. What if your mom was going to the pharmacy anyway? Maybe you want certain people to be able to see certain tasks.

Maybe you want discretionary access control.

Unfortunately, I am not aware of any clear cut fine-grained access control options for GRANDstack out of the box. If I were, this post would not exist. On the bright side, we get to explore the possibilities together!

Filter to the rescue!

I might have mentioned how GRANDstack does have out-of-the-box support for complex nested filtering. Could this be the answer we seek? (HINT: I think so.)

Nested filtering means that we can filter the results of any field within our query by the fields of its related types. Those fields of its related types could lead to yet other filterable related types. Ad infinitum.

I don't actually think we need to go on forever. We just need to realize that the access control list for our business data is itself a graph connected to our primary data model.

We could do this with an arbitrarily complex authorization layer, but instead we're going to keep it simple. Let's reduce the access control structure to a single relationship that sits between the User and Task types. Our updated type definitions might look like:

const typeDefs = `
type User {
  userId: ID!
  firstName: String
  lastName: String
  email: String!
  taskList: [Task] @relation(name: "TO_DO", direction: "OUT")
  visibleTasks: [Task] @relation(name: "CAN_READ", direction: "IN")
}
type Task {
  taskId: ID!
  name: String!
  details: String
  location: Point
  complete: Boolean!
  assignedTo: User @relation(name: "TO_DO", direction: "IN")
  visibleTo: [User] @relation(name: "CAN_READ", direction: "OUT")
}
`;

The following filter arguments could then form the basis for locking down our assets:

query aclTasks($user_id: ID!){
  Task(filter: {visibleTo_some: {userId: $user_id}}) {
    ...task fields
  }
  User {
    taskList(filter: {visibleTo_some: {userId: $user_id}}) {
      ...task fields
    }
  }
}
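
To make the semantics concrete, here is a minimal plain-JavaScript sketch of what `visibleTo_some` is asking for: keep a Task only if at least one User on its `visibleTo` list matches. The sample data and the `visibleToSome` helper are made up for illustration. In the real stack, neo4j-graphql-js translates the filter into Cypher instead of filtering in memory.

```javascript
// Hypothetical in-memory data standing in for the graph.
const tasks = [
  { taskId: 't1', name: 'Pick up prescription', visibleTo: [{ userId: 'me' }, { userId: 'mom' }] },
  { taskId: 't2', name: 'Mow the lawn', visibleTo: [{ userId: 'me' }, { userId: 'bob' }] },
  { taskId: 't3', name: 'Plan surprise party', visibleTo: [] },
];

// Mimics filter: {visibleTo_some: {userId: $user_id}}.
// A task passes when at least one related user has the requested userId.
function visibleToSome(taskList, userId) {
  return taskList.filter((task) =>
    task.visibleTo.some((user) => user.userId === userId)
  );
}

console.log(visibleToSome(tasks, 'bob').map((task) => task.name));
// [ 'Mow the lawn' ]
```

Bob gets the lawn task and nothing else, which is the behavior we want from the filtered query.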

If there are other filters that need to be applied, we can wrap them all with an AND clause:

query aclTasks($user_id: ID!){
  Task(filter: {AND: [{visibleTo_some: {userId: $user_id}},
                     {location_distance_lt: {...}}]}) {
    ...task fields
  }
}
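
If you build the filter object in JavaScript before sending the query, that wrapping can be sketched as a small helper. `withAcl` is a hypothetical name of my own, not something neo4j-graphql-js provides, and the `complete: false` filter is just an example of "some other filter".

```javascript
// Hypothetical helper: combines the ACL filter with whatever filter the caller already has.
function withAcl(userId, existingFilter) {
  const aclFilter = { visibleTo_some: { userId } };

  // Nothing else to apply, so the ACL filter stands on its own.
  if (!existingFilter || Object.keys(existingFilter).length === 0) {
    return aclFilter;
  }

  // Otherwise both conditions must hold.
  return { AND: [aclFilter, existingFilter] };
}

console.log(JSON.stringify(withAcl('u1', { complete: false })));
// {"AND":[{"visibleTo_some":{"userId":"u1"}},{"complete":false}]}
```

The returned object is what would go into the `filter` argument of the Task query.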

Moving ahead in our journey

Oh, I'm sorry. Did I miss something? Your nosy neighbor Bob can still see your pharmaceutical needs, can't he? He's savvy enough to submit his own queries without those filters. That dog!

Next time we'll need to figure out how to use a new schema directive to automate the transformation of our GraphQL filter arguments. This will do more to keep Bob out and also keep the queries on the client side a little cleaner. Till then!

Top comments (2)

pmualaba • Mar 14 '20 • Edited

Hello Ian.
Implementing a granular permission system for GRANDstack is a great initiative!

While I was reading through the concepts, I was thinking that, in my opinion, there are two major approaches to implementing this. The first is at the instance level, the second is at the schema level. Your implementation combines both, if I understand it correctly.
The permissions are first declared in the schema, but at the end they are persisted in the graph on instance level as relations to User Nodes. This may lead to the Neo4j supernode syndrome (many incoming relations from millions of nodes to the user nodes). Also this approach creates a mass of extra relations in order to persist the permissions system. It's like a permission system graph layer on top of a data graph layer. This also leads to graph equivalents of ALTER TABLE when the permissions change. All the concerned permission relations will have to be deleted or recreated.

Does it not make more sense to implement everything at the schema level? This means that each Schema field on a node should declare its own permissions using RBAC at the Schema level: Ex. Task.name has permissions:
{
read: ['everyone'],
create: ['devops', 'developers', 'managers'],
update: ['owner', 'admins'],
delete: ['owner', 'admins']
}
Then these permissions should be validated before querying the graph. The permissions should generate/modify the GraphQL/Cypher query in such a way that the query to be executed already describes the constraints/filters that enforce the permissions declared in the Schema. That way no extra permissions graph layer needs to be created in the graph, no full-graph-scan equivalents of ALTER TABLE are necessary, and there is no danger of the supernode syndrome. Permissions remain flexible at the Schema level. Query performance will also be better, since fewer nodes and relations are involved.
The gist of Schema level permissions is that it is the Query itself that is generated/templated with the correct embedded enforced schema permissions. That "permissions-aware" query just hits the "open" data graph, and returns only the nodes/relations that the user can access.

Any thoughts on this approach?

Ian Kleats • Mar 14 '20

Thanks for the thoughts! It's a lot to respond to, so please forgive me if I skip anything. There are also four subsequent articles that might answer some of the issues for you.

To one of your last comments: "That 'permissions-aware' query just hits the 'open' data graph, and returns only the nodes/relations that the user can access". Exactly the point!

I've spent a lot of time over the past year digging through the neo4j-graphql-js source code, and I've found it's kind of challenging to extend in the way you've described. This proof-of-concept just uses the filter argument to modify WHERE clauses instead of doing a broader re-write of the neo4j-graphql-js internals.

I think what you bring up are more issues with specific access control structures, not with the implementation of the schema directives. I apologize if my example gave the impression I was advocating one specific structure over another; it was merely for illustration.

Under the simplest RBAC structures, you could definitely do something like what you've laid out. It could even be easily done with the TranslationRule approach I'm putting forward by referencing the JWT claims from the request context. However, this is already mostly solved by the existing support for graphql-auth-directives aside from the over-fetching aspect, so it wasn't what was motivating me.

With anything more complicated, I might be missing something, but I'm still trying to wrap my head around how what you suggested would be applied in an instance where there is heterogeneity of roles for a single user within a single node type (i.e. a User has Owner rights on some Tasks, Editor rights on others, Public rights on many more, and Forbidden/Undefined on the rest).

If such a user were to query the GraphQL endpoint with query { Task { somePublicField, someProtectedField } }, we would have to do one of the following:

a) Store a list of all User IDs qualifying for each permission on each field of the node, with the UserID still showing up in the WHERE clause (i.e. filter argument, so able to be accomplished by the current implementation w/o doing more tweaking of Cypher strings).
b) Store object references and user-specific claims for all relevant nodes as part of the JWT and perform some type of UNION by claim-level (but where is this information persisted in the first place? If in the same Neo4j instance, then likely also accessible through a filter argument).
c) Store this information as additional labels, nodes, and relationships on the graph, which can be referenced by pattern-matching WHERE clauses (i.e. possible with the filter argument again).

The level of complexity of whatever access control structure that is implemented is at the discretion (haha, so pun-ish me) of the implementer. I used a very simple example because it is very obvious, not one I'd recommend or use myself necessarily. The point was that, whatever directive support I tried to create, I wanted to be very unopinionated -- ensuring that it could support just about any implementation someone else would dream up.

The supernode issue is a consideration for any graph data model, and it's really up to the implementer to be mindful of it and figure out how to overcome it. (And frankly, by the time you're at the scale where supernodes become a problem, you're probably not going to be running a simple GRANDstack anyway.) But... if Neo4j weren't good for modeling permissions, it probably wouldn't be one of the top highlighted use-cases for the database.

Hit me back if you have more questions/thoughts or if I've completely missed the mark on your points.