Note: This post requires some basic knowledge of what GraphQL live queries are and how Relay works. Therefore, I recommend first reading my previous articles about live queries, GraphQL Live Queries with Socket.io and Collecting GraphQL Live Query Resource Identifier with GraphQL Tools, as well as the awesome series Relay: the GraphQL client that wants to do the dirty work for you.
The Relay GraphQL specification has some nice properties that could greatly benefit live query implementations:
- Unique global identifiers
- The Query.node field
As each Node.id field should resolve to a globally unique identifier that can be passed to Query.node(id:) for querying the given resource, a live query engine can leverage this for re-executing only parts of a query document instead of the whole document.
Example Schema
interface Node {
  id: ID!
}

type Position2D {
  x: Float!
  y: Float!
}

type Token implements Node {
  id: ID!
  label: String!
  position: Position2D!
}

type MapGrid implements Node {
  id: ID!
  position: Position2D!
  columnWidth: Float!
  columnHeight: Float!
}

type Map implements Node {
  id: ID!
  grid: MapGrid
  tokens: [Token!]!
}

type Query {
  node(id: ID!): Node
}
Example Live Query
query map($id: ID!) @live {
  map: node(id: $id) {
    ... on Map {
      id
      grid {
        id
        position {
          x
          y
        }
        columnWidth
        columnHeight
      }
      tokens {
        id
        label
        position {
          x
          y
        }
      }
    }
  }
}
The live query engine could then build the following operations for efficiently re-executing only parts of the query document instead of the full document, after a globally unique ID has been invalidated:
Token
query node($id: ID!) {
  node(id: $id) {
    ... on Token {
      id
      label
      position {
        x
        y
      }
    }
  }
}
Given a token with a globally unique id (Token.id) of Token:1, an invalidation and re-execution of the ad-hoc query could be scheduled via liveQueryStore.invalidate("Token:1").
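With the InMemoryLiveQueryStore from the graphql-live-query project, such an invalidation could e.g. be triggered from a mutation resolver after the underlying resource changed. A minimal sketch, assuming a made-up tokenUpdatePosition mutation and data layer:

import { InMemoryLiveQueryStore } from "@n1ru4l/in-memory-live-query-store";

const liveQueryStore = new InMemoryLiveQueryStore();

const resolvers = {
  Mutation: {
    // Hypothetical mutation that moves a token on the map.
    tokenUpdatePosition: async (_source, args, context) => {
      const token = await context.db.updateTokenPosition(args.id, args.position);
      // Invalidating the globally unique id schedules re-execution of the
      // ad-hoc node query for all live query operations that selected this token.
      liveQueryStore.invalidate(`Token:${args.id}`);
      return token;
    },
  },
};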
MapGrid
query node($id: ID!) {
  node(id: $id) {
    ... on MapGrid {
      id
      position {
        x
        y
      }
      columnWidth
      columnHeight
    }
  }
}
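A naive way of constructing such ad-hoc node operations is plain string assembly. This is only a sketch; a real implementation would extract and reuse the selection set from the parsed live query document instead:

// Builds an ad-hoc `Query.node(id:)` operation for re-executing only the
// selections of a given type. `typeName` and `selections` would be derived
// from the original live query document.
const buildNodeQuery = (typeName: string, selections: string) => /* GraphQL */ `
  query node($id: ID!) {
    node(id: $id) {
      ... on ${typeName} {
        ${selections}
      }
    }
  }
`;

// Usage for the Token partial above:
const tokenNodeQuery = buildNodeQuery("Token", "id label position { x y }");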
The result of such a partial execution can then be published in some patch format:
Token Sample JSON payload
{
  "data": {
    "id": "Token:1",
    "label": "Some Orc",
    "position": {
      "x": 10,
      "y": 10
    }
  },
  "path": ["map", "tokens", 0],
  "hasNext": true
}
MapGrid Sample JSON payload
{
  "data": {
    "id": "Map:1:MapGrid",
    "position": {
      "x": 10,
      "y": 10
    },
    "columnWidth": 50,
    "columnHeight": 50
  },
  "path": ["map", "grid"],
  "hasNext": true
}
On the client, we definitely need some middleware for applying the deltas, similar to @n1ru4l/graphql-live-query-patch.
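At its core, such a middleware only has to write the payload's data at the position described by path into the locally cached execution result. A hypothetical sketch (this is not the actual @n1ru4l/graphql-live-query-patch API):

type LiveExecutionPatch = {
  data: unknown;
  path: Array<string | number>;
  hasNext: boolean;
};

// Mutates the cached result by writing `payload.data` at `payload.path`.
const applyExecutionPatch = (cachedData: any, payload: LiveExecutionPatch) => {
  let parent = cachedData;
  for (const key of payload.path.slice(0, -1)) {
    parent = parent[key];
  }
  parent[payload.path[payload.path.length - 1]] = payload.data;
  return cachedData;
};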
For bigger queries this can drastically reduce the payload that must be sent over the wire.
A JSON patch (or similar) middleware could optimize the payload even further, so that only deltas need to be sent.
E.g. if a token's position changed, the delta could look similar to this:
{
  "patch": [
    { "op": "replace", "path": "/position/x", "value": 5 },
    { "op": "replace", "path": "/position/y", "value": 5 }
  ],
  "path": ["map", "tokens", 0],
  "hasNext": true
}
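On the server, such a delta could be produced by diffing the previous and the current partial execution result with any RFC 6902 (JSON patch) implementation. A sketch, assuming the fast-json-patch library:

import { compare } from "fast-json-patch";

// `previousData`/`nextData` are the results of the previous and the current
// execution of the partial node query for the invalidated resource.
const buildDeltaPayload = (
  previousData: object,
  nextData: object,
  path: Array<string | number>
) => ({
  patch: compare(previousData, nextData),
  path,
  hasNext: true,
});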
What about lists?
As always, lists are a huge pain point of real-time applications, and they are currently still not properly addressed by the InMemoryLiveQueryStore implementation.
The Relay connection specification, however, might help with building a proper abstraction for invalidating pagination.
First of all, one should clarify whether pagination is actually needed. In the example above, one could argue whether it is necessary.
On the one hand, we could have a small map with only 10-20 token objects; pagination does not make sense for that. On the other hand, we could have a list of millions of items (imagine Google Maps). There, a connection might be handy, and the connection args might include information about the visible area and the zoom level, so that the most important items can be returned based on that.
But that is not really the issue for us right now. The most important question is: how can we efficiently add and remove items?
Let's first take a look at how I tackled this in the past using GraphQL subscriptions, with a Subscription.notesUpdates field that publishes payloads for manually updating the existing connection in the client cache.
type Query {
  notes(first: Int, after: String): NoteConnection!
}

type NoteConnection {
  edges: [NoteEdge!]!
  pageInfo: PageInfo!
}

type NoteEdge {
  cursor: String!
  node: Note!
}

type Note implements Node {
  id: ID!
  documentId: ID!
  title: String!
  content: String!
  contentPreview: String!
  createdAt: Int!
  viewerCanEdit: Boolean!
  viewerCanShare: Boolean!
  access: String!
  isEntryPoint: Boolean!
  updatedAt: Int!
}

type NotesUpdates {
  """
  A node that was added to the connection.
  """
  addedNode: NotesConnectionEdgeInsertionUpdate
  """
  A note that was updated.
  """
  updatedNote: Note
  """
  A note that was removed.
  """
  removedNoteId: ID
}

type NotesConnectionEdgeInsertionUpdate {
  """
  The cursor of the item before which the node should be inserted.
  """
  previousCursor: String
  """
  The edge that should be inserted.
  """
  edge: NoteEdge
}

type Subscription {
  notesUpdates(endCursor: String!, hasNextPage: Boolean!): NotesUpdates!
}
The corresponding client code has been implemented like this:
const subscription = requestSubscription<tokenInfoSideBar_NotesUpdatesSubscription>(
  environment,
  {
    subscription: TokenInfoSideBar_NotesUpdatesSubscription,
    variables: {
      endCursor: data.notes.pageInfo.endCursor,
      hasNextPage: data.notes.pageInfo.hasNextPage,
    },
    updater: (store, payload) => {
      if (payload.notesUpdates.removedNoteId) {
        const connection = store.get(data.notes.__id);
        if (connection) {
          ConnectionHandler.deleteNode(
            connection,
            payload.notesUpdates.removedNoteId
          );
        }
      }
      if (payload.notesUpdates.addedNode) {
        const connection = store.get(data.notes.__id);
        if (connection) {
          const edge = store
            .getRootField("notesUpdates")
            ?.getLinkedRecord("addedNode")
            ?.getLinkedRecord("edge");
          if (edge) {
            // We need to copy the fields, as the Subscription.notesUpdates.addedNode.edge
            // record will be mutated when the next subscription result arrives.
            const record = store.create(
              // prettier-ignore
              `${data.notes.__id}-${edge.getValue("cursor")}-${++newEdgeIdCounter.current}`,
              "NoteEdge"
            );
            record.copyFieldsFrom(edge);
            if (payload.notesUpdates.addedNode.previousCursor) {
              ConnectionHandler.insertEdgeBefore(
                connection,
                record,
                payload.notesUpdates.addedNode.previousCursor
              );
            } else if (
              // In case we don't have a previous cursor and there is no next page,
              // the edge must be appended as the last list item.
              connection.getLinkedRecord("pageInfo")?.getValue("hasNextPage") ===
                false
            ) {
              ConnectionHandler.insertEdgeAfter(connection, record);
            }
          }
        }
      }
    },
  }
);
const TokenInfoSideBar_NotesUpdatesSubscription = graphql`
  subscription tokenInfoSideBar_NotesUpdatesSubscription(
    $endCursor: String!
    $hasNextPage: Boolean!
  ) {
    notesUpdates(endCursor: $endCursor, hasNextPage: $hasNextPage) {
      removedNoteId
      updatedNote {
        id
        title
        isEntryPoint
      }
      addedNode {
        previousCursor
        edge {
          cursor
          node {
            id
            documentId
            title
          }
        }
      }
    }
  }
`;
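For completeness, the server side published these payloads whenever a note changed. A simplified sketch of publishing an insertion event (the pubsub instance and the encodeCursor helper are assumptions; only the payload shape is dictated by the schema above):

// Hypothetical publish call after a new note has been persisted.
await pubsub.publish("notesUpdates", {
  notesUpdates: {
    addedNode: {
      // A `null` previousCursor means: append at the end (if there is no next page).
      previousCursor: null,
      edge: {
        cursor: encodeCursor(newNote.id),
        node: newNote,
      },
    },
    updatedNote: null,
    removedNoteId: null,
  },
});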
So the three important events are:
- A node got added
- A node got deleted
- A node got updated
The latter can actually already be addressed by a simple invalidation via the globally unique note id (e.g. Note:1):
liveQueryStore.invalidate("Note:1")
The resulting patch for an updated node within a connection (using the Token example again) could look like this:
{
  "data": {
    "id": "Token:1",
    "label": "Some Orc",
    "position": {
      "x": 33,
      "y": 33
    }
  },
  "path": ["map", "paginatedTokens", "edges", 0, "node"],
  "hasNext": true
}
So updating does not necessarily have to be covered by our live connection abstraction. The only crucial thing is that we always need to know the index of the item in the edges array. I am not sure whether we can guarantee this. Any recommendations are welcome!
So if we want to implement this with live queries, we have to come up with a solution for (1) a node got added and (2) a node got deleted.
Let's jump back to our Token example and model it as a connection.
Token modeled with the connection spec
type TokenEdge {
  cursor: String!
  node: Token!
}

type TokenConnection {
  pageInfo: PageInfo!
  edges: [TokenEdge!]!
}

extend type Map {
  paginatedTokens(first: Int, after: String): TokenConnection!
}
Maybe the TokenEdge.cursor field could be the source of truth for this? If we can identify where an item must be added or deleted based on the cursor, that might make sense.
If we want to add a new item, we can do so by inserting it AFTER the item with a specific cursor.
If we want to remove an item, we can do so by removing the item WITH a specific cursor.
Another thing one might need is re-sorting items. This could be achieved with a list of remove and add instructions for all affected items.
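E.g. moving Token:2 to the front of the list could be expressed as a removal followed by an insertion. The instruction shape below is an assumption that mirrors the patch formats further down:

const moveToFrontInstructions = [
  { type: "removeEdge", cursor: "TokenConnection|TokenEdge|Token:2" },
  // A `null` afterCursor could indicate insertion at the start of the list.
  { type: "insertEdge", afterCursor: null, cursor: "TokenConnection|TokenEdge|Token:2" },
];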
The question now is: How can we model this abstraction in live query land?
Live query with a connection instead of a simple list
query map($id: ID!) @live {
  map: node(id: $id) {
    ... on Map {
      id
      grid {
        id
        position {
          x
          y
        }
        columnWidth
        columnHeight
      }
      paginatedTokens {
        edges {
          cursor
          node {
            id
            label
            position {
              x
              y
            }
          }
        }
      }
    }
  }
}
Diffing the whole connection might be super expensive, so the easiest solution might be to add some kind of imperative API for notifying the store that an item was added to or removed from a connection.
// The cursor string is composed of three parts:
// 1. Connection name
// 2. Edge resource type name
// 3. Edge node resource id
// We could also obfuscate this for the client. For simplicity, I kept it a plain string :)
const cursor = "TokenConnection|TokenEdge|Token:1";
liveQueryStore.triggerEdgeRemoval(cursor);

const afterEdgeCursor = cursor;
const newEdgeCursor = "TokenConnection|TokenEdge|Token:2";
liveQueryStore.triggerEdgeInsertion(afterEdgeCursor, newEdgeCursor);
If the live query store is aware of the cursor format and can act based on its contents, it can generate the patches that should be sent to the client.
E.g. for the edge removal flow via the "TokenConnection|TokenEdge|Token:1" cursor, the store can first look for all operations that select the TokenConnection type, then check which of those connections include the TokenEdge that has a node with the id Token:1, and send a patch for this item's removal to the affected clients.
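Since the store defines the cursor format itself, extracting those parts is trivial (sketch):

// Splits "TokenConnection|TokenEdge|Token:1" into its three components.
const parseCursor = (cursor: string) => {
  const [connectionTypeName, edgeTypeName, nodeId] = cursor.split("|");
  return { connectionTypeName, edgeTypeName, nodeId };
};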
Patch for removing a token:
{
  "connectionPatch": {
    "type": "removeEdge",
    "cursor": "TokenConnection|TokenEdge|Token:1"
  },
  "path": ["map", "paginatedTokens"],
  "hasNext": true
}
For the edge insertion task, the store can perform the steps above for the afterEdgeCursor ("TokenConnection|TokenEdge|Token:1"), and then additionally load the new edge node ("TokenConnection|TokenEdge|Token:2") via the partial operation we generated earlier:
query node($id: ID!) {
  node(id: $id) {
    ... on Token {
      id
      label
      position {
        x
        y
      }
    }
  }
}
Patch for adding a new token after another token:
{
  "connectionPatch": {
    "type": "insertEdge",
    "afterCursor": "TokenConnection|TokenEdge|Token:1",
    "edge": {
      "cursor": "TokenConnection|TokenEdge|Token:2",
      "node": {
        "id": "Token:2",
        "label": "foo bars",
        "position": {
          "x": 20,
          "y": 20
        }
      }
    }
  },
  "path": ["map", "paginatedTokens"],
  "hasNext": true
}
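On the client, a patch middleware could apply these connection patches to the cached connection object. A minimal sketch; the types only mirror the payloads above:

type Edge = { cursor: string; node: unknown };
type Connection = { edges: Array<Edge> };
type ConnectionPatch =
  | { type: "removeEdge"; cursor: string }
  | { type: "insertEdge"; afterCursor: string | null; edge: Edge };

const applyConnectionPatch = (connection: Connection, patch: ConnectionPatch) => {
  if (patch.type === "removeEdge") {
    connection.edges = connection.edges.filter(
      (edge) => edge.cursor !== patch.cursor
    );
  } else {
    // A `null` afterCursor (or an unknown cursor) inserts at the start of the list.
    const index =
      patch.afterCursor === null
        ? -1
        : connection.edges.findIndex((edge) => edge.cursor === patch.afterCursor);
    connection.edges.splice(index + 1, 0, patch.edge);
  }
};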
If the list is initially empty, we don't have an afterCursor, so we might need to use null instead to indicate that.
But what if the connection returns different data based on the connection arguments or even the viewer scope? E.g. an admin user might see all tokens, while a normal user might only see tokens that are marked as visible.
If we encode this information in the cursor, that might work. I will update this post once I have gathered some more thoughts on it.
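Purely illustrative, such a scoped cursor could look like this:

// Connection arguments and viewer scope encoded into the cursor (illustrative only).
const scopedCursor = "TokenConnection(role:admin,first:10)|TokenEdge|Token:1";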
Another open topic is actual pagination: how does this scale if we have to fetch more items? That is something I need to think about more as well.
If you have any ideas regarding any of the above, please contact me!
Right now this is all just theory, but I might take a stab at implementing this soon over here: https://github.com/n1ru4l/graphql-live-query