GraphQL can be an incredibly useful tool for decoupling orchestration and schema from your back-end implementation, be that APIs, direct database calls or even a simple in-memory store. One issue that presents itself fairly quickly, though, is tracking what is actually happening when your graph queries are executed.
Query Issues
Adding an additional layer introduces an entirely new suite of issues to address, the most common of which are outlined below.
Number of IO Operations
A single graph query can look deceptively simple, yet be configured to call many APIs, read from and write to many areas of a data store, or even call out to third-party providers. This can occur due to an overly complicated back-end implementation, or simply because the front end is pulling too much data.
Overall Duration (Performance)
Often this is a side-effect of the above, but one IO operation can be all it takes to drag a query down. That one API call that runs a poorly optimized SQL query can take 30+ seconds given the right conditions or data set.
Errors
Logging is incredibly helpful for tracking your errors, but producing statistics on which queries generate errors, along with individual traces, is a much more complex task to achieve from logs alone.
Monitoring Your Setup
Apollo GraphQL is a very common community-driven GraphQL implementation, primarily for NodeJS. In addition to the core libraries, they offer a free SaaS platform (with paid additional functionality available) for monitoring your GraphQL implementation: Apollo Studio. It offers solutions to the issues above, as well as the following functionality:
- Track schema changes (with notifications)
- Explore your schema and run queries
- Report on schema usage
Integrating with Apollo Studio
For users of the Apollo Server GraphQL implementation (for NodeJS), integration is pretty straightforward. There are also third-party providers for Java and Python implementations, but that's where support ends. The link above also details how to create a custom integration, and that's where this article picks up. The process involves: importing the protobuf schema, converting performance stats to the Apollo trace format, signing the message, and finally handing the send off to a background process for batching.
Generating Apollo Studio Classes for Protobuf
There are a number of Protobuf implementations for .NET Core, but I like protobuf-net as it's a nice, clean, Apache 2.0-licensed implementation. It is also supported by protogen, a great online generator that will output protobuf-net classes ready for use (via its CSharp profile). If you open the latest schema from the link here, you can simply paste it into the generator.
NOTE: At the time of writing, [(js_preEncoded)=true] isn't supported by the generator and can be removed from the proto schema.
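For a rough idea of what comes out of the generator, the output is a set of plain protobuf-net contract classes along these lines. This is heavily trimmed and purely illustrative; the field numbers and member names shown here are not copied from the real generated file, so generate your own from the schema.

using ProtoBuf;

// Illustrative shape only - the real file is generated from the reports schema.
[ProtoContract]
public partial class Trace
{
    [ProtoMember(4, Name = @"start_time")]
    public System.DateTime StartTime { get; set; }

    [ProtoMember(14, Name = @"root")]
    public Node Root { get; set; }

    [ProtoContract]
    public partial class Node
    {
        [ProtoMember(1, Name = @"response_name")]
        public string ResponseName { get; set; }

        // ...further members, plus nested Error, Http and Location types
    }
}

Because these are ordinary [ProtoContract] classes, protobuf-net's Serializer can write them straight to a stream when it's time to send a report.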
Converting to Apollo Studio Format
In order to get data in a suitable format for Apollo, you can enable Apollo Tracing enrichment of your responses in GraphQL.NET.
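As a quick sketch of enabling this (assuming GraphQL.NET's DocumentExecuter and the GraphQL.Instrumentation extensions; the exact setup varies between versions, and depending on yours the field instrumentation middleware may also need registering on the schema), it looks something like the following:

using System;
using System.Threading.Tasks;
using GraphQL;
using GraphQL.Instrumentation;
using GraphQL.Types;

public static class TracingExecutor
{
    public static async Task<ExecutionResult> ExecuteWithTracingAsync(ISchema schema, string query)
    {
        var start = DateTime.UtcNow;
        var result = await new DocumentExecuter().ExecuteAsync(new ExecutionOptions
        {
            Schema = schema,        // your ISchema instance
            Query = query,          // the incoming GraphQL query text
            EnableMetrics = true    // collect per-resolver timings
        });

        // Writes the collected metrics to result.Extensions["tracing"] in the
        // Apollo Tracing format, which is what the converter below consumes.
        result.EnrichWithApolloTracing(start);
        return result;
    }
}

With tracing enabled, what follows is a large code dump of how I put together a conversion system from these trace extensions to the classes generated above: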
// Usings added for completeness; the generated protobuf classes (Trace, etc.)
// will also need their namespace importing, depending on how you generated them.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using GraphQL;
using GraphQL.Instrumentation;
using Microsoft.Extensions.Configuration;
using Newtonsoft.Json;

public class MetricsToTraceConverter
{
    public Trace? CreateTrace(ExecutionResult result)
    {
        ApolloTrace? trace = result.Extensions != null && result.Extensions.ContainsKey("tracing") ? (ApolloTrace)result.Extensions["tracing"] : null;
        var resolvers = trace?.Execution.Resolvers?
            .OrderBy(x => string.Join(":", x.Path), new ConfigurationKeyComparer())
            .ToArray();
        var rootTrace = resolvers?.FirstOrDefault(x => x.Path.Count == 1);
        if (rootTrace == null && result.Errors == null)
            return null;
        int resolverIndex = 1;
        var rootErrors = result.Errors?.Where(x => x.Path != null && x.Path.Count() == 1).ToArray();
        var rootNode = rootTrace != null && resolvers != null
            ? CreateNodes(rootTrace.Path, CreateNodeForResolver(rootTrace, rootErrors), resolvers, ref resolverIndex, GetSubErrors(rootTrace.Path, result.Errors?.ToArray()))
            : new Trace.Node();
        if (rootTrace == null && result.Errors != null)
        {
            foreach (var executionError in result.Errors)
                rootNode.Errors.Add(CreateTraceError(executionError));
        }
        return new Trace
        {
            StartTime = trace?.StartTime ?? DateTime.Now,
            EndTime = trace?.EndTime ?? DateTime.Now,
            DurationNs = (ulong)(trace?.Duration ?? 0),
            http = new Trace.Http { method = Trace.Http.Method.Post, StatusCode = result.Errors?.Any() == true ? (uint)HttpStatusCode.BadRequest : (uint)HttpStatusCode.OK },
            Root = rootNode
        };
    }

    private static Trace.Node CreateNodeForResolver(ApolloTrace.ResolverTrace resolver, ExecutionError[]? executionErrors)
    {
        var node = new Trace.Node
        {
            ResponseName = resolver.FieldName,
            Type = resolver.ReturnType,
            StartTime = (ulong)resolver.StartOffset,
            EndTime = (ulong)(resolver.StartOffset + resolver.Duration),
            ParentType = resolver.ParentType
        };
        if (executionErrors != null)
        {
            foreach (var executionError in executionErrors)
                node.Errors.Add(CreateTraceError(executionError));
        }
        return node;
    }

    private static Trace.Error CreateTraceError(ExecutionError executionError)
    {
        var error = new Trace.Error
        {
            Json = JsonConvert.SerializeObject(executionError),
            Message = executionError.Message
        };
        if (executionError.Locations != null)
            error.Locations.AddRange(executionError.Locations.Select(x => new Trace.Location { Column = (uint)x.Column, Line = (uint)x.Line }));
        return error;
    }

    private static ExecutionError[]? GetSubErrors(List<object> path, ExecutionError[]? errors)
    {
        return errors
            ?.Where(x => x.Path != null && x.Path.Count() > path.Count && x.Path.Take(path.Count).SequenceEqual(path))
            .ToArray();
    }

    private static Trace.Node CreateNodes(List<object> path, Trace.Node node, ApolloTrace.ResolverTrace[] resolvers,
        ref int resolverIndex, ExecutionError[]? executionErrors)
    {
        bool isArray = node.Type.StartsWith("[") && node.Type.TrimEnd('!').EndsWith("]");
        if (isArray)
        {
            if (resolverIndex < resolvers.Length)
            {
                var resolver = resolvers[resolverIndex];
                while (resolver.Path != null && resolver.Path.Count == path.Count + 2 && resolver.Path.Take(path.Count).SequenceEqual(path))
                {
                    var index = (int)(resolver.Path[^2]);
                    var subPath = path.Concat(new object[] { index }).ToList();
                    var previousIndex = resolverIndex;
                    node.Childs.Add(CreateNodes(subPath,
                        new Trace.Node
                        {
                            Index = Convert.ToUInt32(index),
                            ParentType = node.Type,
                            Type = node.Type.TrimStart('[').TrimEnd('!').TrimEnd(']')
                        }, resolvers, ref resolverIndex, GetSubErrors(subPath, executionErrors)));
                    // Avoid infinite loop if the worst happens and we don't match any items for this index (HOW?!?!?)
                    if (resolverIndex == previousIndex)
                        resolverIndex++;
                    if (resolverIndex >= resolvers.Length)
                        break;
                    resolver = resolvers[resolverIndex];
                }
            }
        }
        else
        {
            if (resolverIndex < resolvers.Length)
            {
                var resolver = resolvers[resolverIndex];
                while (resolver.Path != null && resolver.Path.Count == path.Count + 1 && resolver.Path.Take(path.Count).SequenceEqual(path))
                {
                    var errors = executionErrors?.Where(x => x.Path.SequenceEqual(resolver.Path)).ToArray();
                    resolverIndex++;
                    node.Childs.Add(CreateNodes(resolver.Path, CreateNodeForResolver(resolver, errors), resolvers,
                        ref resolverIndex, GetSubErrors(resolver.Path, executionErrors)));
                    if (resolverIndex >= resolvers.Length)
                        break;
                    resolver = resolvers[resolverIndex];
                }
            }
        }
        return node;
    }
}
So what is all of this doing? Here's an overview:
- Retrieve the tracing data from the execution result (added by enabling Apollo Tracing enrichment of results).
- Order the resolvers hierarchically (ConfigurationKeyComparer does this beautifully), so that we can consume them in order and avoid expensive full scans of the resolver traces.
- Find the root trace (should be the first item).
- Collect all root-level errors from the execution result.
- Construct a node hierarchy from the node paths - this is fairly complicated, but it's easier to see how it works with a sample of data (view one at runtime, or see the sketch after this list).
- If there are no traces but there are errors, add those to the root node.
- Return a Trace object (built from the protobuf-generated classes), ready to be queued and sent in a batch.
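To make the path-to-hierarchy step more concrete, here's a minimal usage sketch. The resolver paths in the comments are illustrative of the Apollo Tracing format (field names, with integer indices for list items), not real captured data; executionResult is the enriched ExecutionResult from earlier, and traceBuffer / operationName stand in for a hypothetical batching component covered in the next article:

// Illustrative ordering of resolver paths for a query like { posts { title } }:
//   ["posts"]              -> root node (return type "[Post]")
//   ["posts", 0, "title"]  -> field node under the list-index-0 node
//   ["posts", 1, "title"]  -> field node under the list-index-1 node
// CreateNodes walks these in order, adding an Index node per list element
// and a child node per resolver beneath it.

var converter = new MetricsToTraceConverter();
Trace? trace = converter.CreateTrace(executionResult);

if (trace != null)
{
    // Hand the trace off for batching; building the report envelope, signing it
    // and sending it to Apollo Studio is covered in the next article.
    traceBuffer.Add(operationName, trace); // hypothetical batching component
}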
Up Next
In the next article, we'll look at how to generate and send the full report class to Apollo Studio.