Skopos: Monitor Critical API Workflows

What is Skopos?

Skopos is an open-source API monitoring tool designed for testing multi-step API workflows and running groups of tests in parallel.

Meet the Team that built Skopos

Nykaela Dodson
Hans Elde
Katherine Ebel
Gagan Sapkota

What is an API?

APIs are at the heart of every piece of modern software in use, and they act as the connective tissue that binds together application services.

Consider a weather application: when a user clicks a button to check the weather in Seattle, the weather app sends a GET request to the weather API. Next, the weather API sends back a response with the requested data. The weather app then uses that data to display the current weather to the user.
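
To make that request/response cycle concrete, here is a minimal TypeScript sketch of what the weather app's call might look like. The endpoint URL and response fields are hypothetical, not a real weather API.

```typescript
// A sketch of the request/response cycle described above.
// The endpoint URL and response fields are hypothetical, not a real weather API.
async function getSeattleWeather(): Promise<void> {
  const response = await fetch("https://api.example-weather.com/current?city=Seattle");
  if (!response.ok) {
    throw new Error(`Weather API responded with status ${response.status}`);
  }
  // e.g. { city: "Seattle", tempF: 54, conditions: "Rain" }
  const data = await response.json();
  console.log(`Current weather in ${data.city}: ${data.tempF}°F, ${data.conditions}`);
}

getSeattleWeather().catch(console.error);
```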

API Example

It is common for an application to rely on multiple API endpoints.

What happens when APIs fail?

When one API fails, other dependent APIs may fail too with cascading failures that can have unfortunate consequences. Let’s take a look at one example where an API failure had a noticeable impact.

In 2019, UberEats customers were able to order huge amounts of free food because PayTm, one of the payment services UberEats integrates with, changed one of its API endpoints. Before the change, the endpoint returned the same response every time a customer placed an order with insufficient funds to pay for it. After the change, the endpoint was no longer idempotent [1]: after the first attempt to place an order with insufficient funds, subsequent attempts received a new, unexpected error message from the PayTm endpoint.

Because of the way UberEats had integrated their application with PayTm, this unexpected error message allowed orders to go through, even though payment had not actually been processed successfully. Before UberEats realized the error, many customers had already ordered thousands of dollars worth of food for free.

UberEats Example

It is important to identify issues like this as early as possible when they occur in production. Fortunately, API failures can be detected, because they tend to manifest in a few different ways:

  1. Data Payload: As we just saw with UberEats, the data payload sent back in the API's response might be unexpected.

  2. Response Time: The failure might show up as an unacceptable response time.

  3. Status Code: The HTTP status code might be wrong.

To catch failures as quickly as possible, companies invest in API monitoring tools that track the performance of API endpoints and look for these signs of failure. API monitoring tools detect API failures by making requests to endpoints at specified intervals and checking the validity of their response data [2].

Whenever there is a mismatch between the expected response data and the actual response data, the endpoint is considered to have failed.

What is API Monitoring?

API monitoring is the process of making requests to API endpoints at set intervals and comparing the response to expected values to check both the availability of API endpoints and the validity of their response data. The goal of API monitoring is to spot issues that may affect users as early as possible.

We think it's helpful to view how API monitoring tools work in terms of a set of core functionalities shared by nearly all of them. At a broad level, these break down into four distinct steps: Definition, Execution, Scheduling, and Notification. An API monitoring tool allows users to define tests, execute those tests on a schedule, and notify various targets when the tests fail.

Definition

Definition

To define a test, a user provides the information necessary for communicating with the API endpoint. This might include the HTTP method, endpoint URL, headers, and request body. The user then defines assertions that compare the expected and actual responses: an assertion is how the user specifies what status code they expect, what response time is acceptable, or what they expect in the response body. Definition functionality is generally offered through either a graphical user interface or a command-line tool.
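
To make this concrete, here is a minimal sketch of what a test definition with assertions could look like as a plain TypeScript object. The field names and structure are illustrative assumptions, not Skopos' actual test schema.

```typescript
// Illustrative shape only -- not Skopos' actual test schema.
interface Assertion {
  property: "status" | "latency" | "body"; // which part of the response to check
  comparison: "equals" | "lessThan" | "contains";
  expected: string | number;
  path?: string; // e.g. a path into the response body
}

interface TestDefinition {
  name: string;
  method: "GET" | "POST" | "PUT" | "DELETE";
  url: string;
  headers?: Record<string, string>;
  body?: unknown;
  assertions: Assertion[];
}

const checkoutTest: TestDefinition = {
  name: "Create order",
  method: "POST",
  url: "https://api.example.com/orders",
  headers: { "Content-Type": "application/json" },
  body: { itemId: 42, quantity: 1 },
  assertions: [
    { property: "status", comparison: "equals", expected: 201 },
    { property: "latency", comparison: "lessThan", expected: 500 }, // milliseconds
    { property: "body", comparison: "equals", path: "order.status", expected: "created" },
  ],
};
```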

Execution

Execute

An API monitoring tool then has to execute the tests. This means the tool sends requests to the specified API endpoint, receives a response, and checks assertions associated with that test against the received response.

Scheduling

Schedule

The tool must also be able to schedule the tests to run. In this case, scheduling is not restricted to time-based scheduling but also refers to arranging for tests to execute from different geographical locations or setting a deployment trigger to execute tests as part of a CI/CD pipeline. For example, tests scheduled based on time can be set to execute every minute, every 15 minutes, or every hour depending on the use case and how quickly API failures should be responded to.

Notification

Notify

When APIs return unexpected responses, the monitoring tool must be able to alert interested parties about the failures. This could include internal notifications that alert self-healing processes, or external notifications to users of the monitoring tool through integrations with PagerDuty or Slack.

While API monitoring tools have these core commonalities, they are not generally one-size-fits-all products, and these functionalities – definition, execution, scheduling, and notification – can be fine-tuned to suit different use cases. Knowing which features to target with an API monitoring tool depends on which approach to API monitoring makes the most sense for that specific use case.

Introducing Skopos

Skopos Logo
Here are some of the key functionalities we hoped to provide with Skopos to simulate workflows that rely on multiple APIs.

Multi-Step Tests

Multi-Step

Multi-step tests are designed to simulate complex workflows that consume multiple APIs. They are typically used when different services of an application need to communicate with one another in sequence over API calls to complete common functionality, such as an API endpoint that requires authentication through a token [3]. API call chaining like this is particularly prevalent in a microservices architecture. Consider a user workflow that includes adding an item to a cart, making a payment, scheduling a delivery, and updating the database.

When all of these steps work as expected, the test passes. However, when one or more of these steps do not work as expected, for example, if the payment step fails, the test would fail.

Parallel Test Execution

Parallel

Another approach is parallel test execution. Simulating multi-step workflows requires running tests sequentially, but the disadvantage is that it takes extra time. For example, executing three tests that take 200 milliseconds each would take 600 milliseconds sequentially, while the same three tests sent in parallel would take only 200 milliseconds. When the tests target different endpoints, this feature can save time when running a large number of tests [4].

Executing tests in parallel can save computing time when the requests do not depend on each other. Furthermore, making parallel requests can also be used for load testing because you can configure multiple tests to make a request to the same API endpoint in parallel to see how the endpoint performs.
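
Here is a minimal sketch of that timing difference, where runTest is a hypothetical stand-in for sending one test's request and checking its assertions:

```typescript
// runTest is a hypothetical stand-in for "send the request and check assertions".
async function runTest(name: string): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 200)); // simulate a ~200 ms API call
  console.log(`${name} finished`);
}

// Sequential: roughly 600 ms for three 200 ms tests.
async function runSequentially(): Promise<void> {
  await runTest("test 1");
  await runTest("test 2");
  await runTest("test 3");
}

// Parallel: roughly 200 ms, since the requests overlap.
async function runInParallel(): Promise<void> {
  await Promise.all([runTest("test 1"), runTest("test 2"), runTest("test 3")]);
}
```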

multi step collections run in parallel


While collections could be executed in parallel,
multi-step tests within each collection should be executed sequentially.

Skopos is open source, has multi-step and parallel test functionality, and offers a user-friendly GUI that meets the needs of our use case.

Skopos’ Core Application Components

For Skopos’ core application, we knew we needed to build the definition functionality and provide a way to set up multi-step tests. An important part of this is referencing values from previous tests, which was challenging because those values are not accessible until the previous test has completed.

We also noticed we were working with a large amount of data, and storing and retrieving that data quickly became a complex challenge. Let's look at the first challenge: how we reference values downstream.

Creating Reference Flags

We decided to group tests that reference values from other tests into what we call a collection. This way, when the tests within a collection are run sequentially, the values needed downstream will be available by the time they are needed.

Collections in Parallel


A value from the response of test 1 can be accessed by test 2 or 3,
but not by tests in another collection, such as tests 4, 5, and 6.

How does a test know that the user wants to interpolate a previous value? We created a reference flag, @{{}}, to mark where values should be interpolated. This is how we solved the challenge of letting a user specify which previous values should be inserted into a test later on.
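
Here is an illustrative sketch of the idea: a test references a value from an earlier test with the @{{}} flag, and a small interpolation step substitutes the real value once it is available. The path syntax inside the flag and the helper function are simplified assumptions, not Skopos' actual implementation.

```typescript
// Illustrative only: the path syntax inside @{{...}} and this helper are simplified
// stand-ins for how a test might refer to a previous test's response.
const profileTest = {
  name: "Fetch profile",
  method: "GET",
  url: "https://api.example.com/profile",
  // The token is not known until the "Log in" test has run.
  headers: { Authorization: "Bearer @{{Log in.body.token}}" },
};

// Replace each @{{...}} flag with a value looked up from previously completed tests.
function interpolate(template: string, previousResponses: Record<string, unknown>): string {
  return template.replace(/@\{\{(.+?)\}\}/g, (_match, path: string) => {
    const value = path
      .trim()
      .split(".")
      .reduce<unknown>(
        (acc, key) => (acc as Record<string, unknown> | undefined)?.[key],
        previousResponses
      );
    return String(value ?? "");
  });
}

// interpolate(profileTest.headers.Authorization, { "Log in": { body: { token: "abc123" } } })
// => "Bearer abc123"
```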

Data Storage and Fetching

During development, we started by working with REST endpoints; however, this became limiting. Sometimes we could not fetch all of the data we needed from one endpoint (under-fetching), and other times we fetched more data than we needed (over-fetching).

First, we tried to adjust our queries to target the specific data we needed; however, the queries grew in complexity, and the custom endpoints we added started to drift away from a proper REST implementation.

Apollo Stack

We decided to use GraphQL because it allowed us to retrieve the precise data we needed. In particular, we added Apollo Server to the backend of our application. Any components that would need to communicate with the database could then use Apollo Client, and the backend running Apollo Server would act as the single gateway to the database.
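
As a rough sketch, a frontend component could ask Apollo Server for exactly the fields it needs with a query like the one below. The types and field names here are hypothetical, not Skopos' actual GraphQL schema.

```typescript
import { ApolloClient, InMemoryCache, gql } from "@apollo/client";

// The type and field names are hypothetical, not Skopos' actual GraphQL schema.
const client = new ApolloClient({
  uri: "http://localhost:4000/graphql", // assumed Apollo Server endpoint
  cache: new InMemoryCache(),
});

// Ask for exactly the fields the UI needs -- no more, no less.
const GET_COLLECTION_TESTS = gql`
  query GetCollectionTests($collectionId: ID!) {
    collection(id: $collectionId) {
      id
      title
      tests {
        id
        name
        url
      }
    }
  }
`;

async function loadCollection(collectionId: string) {
  const { data } = await client.query({
    query: GET_COLLECTION_TESTS,
    variables: { collectionId },
  });
  return data.collection;
}
```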

This was a great start, however, while implementing our data model, we ended up making frequent updates to our schema. To mitigate this issue, we decided to use Prisma. Prisma is an object-relational mapper, or ORM. It allowed us to not only interact with the database as if it were an object but also update and migrate our schema with ease.
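
For example, with Prisma Client, fetching a collection together with its tests becomes a typed method call rather than hand-written SQL. The model and field names below are assumptions for illustration, not necessarily Skopos' actual Prisma schema.

```typescript
import { PrismaClient } from "@prisma/client";

// The Collection and Test models (and their fields) are assumptions for illustration.
const prisma = new PrismaClient();

async function getTestsForCollection(collectionId: number) {
  // Fetch a collection and its related tests without writing SQL by hand.
  return prisma.collection.findUnique({
    where: { id: collectionId },
    include: { tests: true },
  });
}
```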

Full Data Stack

Here is our final data stack: Apollo Client communicating from the frontend, and a backend running Apollo Server with GraphQL and Prisma, which then communicates with our PostgreSQL database.

Building Execution Functionality

One core problem we faced when implementing the execution functionality was handling the complexity of the code for making API calls. We decided to group this functionality into what we call the collection runner.

Here are the steps for making this possible:

  1. A POST request is sent to an Express endpoint on the collection runner (a minimal sketch of this entry point follows the list).
  2. Data for the tests that belong to the collection is fetched from the database.
  3. Requests are processed by interpolating the values the user has referenced in the test with our reference flag, @{{}}.
  4. The first request is ready to be sent to the specified API endpoint.
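
Here is a minimal sketch of what that entry point could look like; the route, request shape, and runCollection helper are illustrative assumptions rather than Skopos' actual code.

```typescript
import express from "express";

// A sketch of the collection runner's entry point. The route and the
// runCollection helper are illustrative assumptions, not Skopos' actual code.
const app = express();
app.use(express.json());

// Hypothetical helper: fetch the collection's tests, interpolate @{{}} reference
// flags, send each request in order, and check its assertions.
async function runCollection(collectionId: number): Promise<{ collectionId: number; passed: boolean }> {
  return { collectionId, passed: true }; // placeholder result
}

app.post("/run-collection/:collectionId", async (req, res) => {
  const collectionId = Number(req.params.collectionId);
  try {
    const results = await runCollection(collectionId);
    res.status(200).json(results);
  } catch (error) {
    res.status(500).json({ error: String(error) });
  }
});

app.listen(3003, () => console.log("Collection runner listening on port 3003"));
```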

Collection Runner Logic

Once a response is received from the API, the assertions for that test are checked. If they fail, interested parties are notified. If the assertions pass, the process is repeated with the next request.

After this process has been repeated for each request in the collection, the execution phase is complete.

As you can see, this requires complex logic and keeping track of variables that change at different parts of this process. We started looking at implementing a state machine to run collections and keep track of this logic.

A state machine transitions through predefined states;
context values can be updated and used throughout these states.

State Machine

A state machine helps declaratively model application logic. It defines the states an application can exist in and the actions that take the machine from one state to another [5]. State machines also keep track of context: values saved to the machine's context can be read and updated from different states. We decided to take this approach and began moving our complex execution logic to XState, a library for creating and running state machines.

Here is how we implemented it:

  1. A POST request is sent to an Express endpoint on the collection runner.
  2. The Collection Runner Machine moves into the initializing state, where a collectionRunId is generated that is later used to save results to the database. For each test in the collection, child state machines handle the logic of processing the request, sending the request, and making assertions on the response (see the simplified sketch after this list). Each child state machine has its own states and context, which are sent back to the parent machine in an event.
  3. The Request Processor Machine is invoked, and values for tests that reference previous requests are interpolated.
  4. The Request Runner Machine is invoked; it makes requests to the specified API endpoint, waits for the response, and passes the data to the parent machine to be saved as responses.
  5. The Assertion Runner Machine is invoked, which uses the API's response to evaluate the assertions defined for the current test. Assertion results are then saved to the collection runner's context.
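
The sketch below shows a heavily simplified version of this pattern with XState: a parent machine that initializes a run, invokes a service for each test, accumulates results in context, and ends in a complete or failed state. The state names, context fields, and inline service are illustrative, not Skopos' actual machines.

```typescript
import { createMachine, assign, interpret } from "xstate";

// A heavily simplified sketch of a collection runner machine (XState v4).
// State names, context fields, and the inline service are illustrative.
interface RunnerContext {
  collectionRunId: string | null;
  remainingTests: string[];
  results: boolean[];
}

const collectionRunnerMachine = createMachine<RunnerContext>({
  id: "collectionRunner",
  initial: "initializing",
  context: { collectionRunId: null, remainingTests: ["test 1", "test 2"], results: [] },
  states: {
    initializing: {
      // Generate an id used later to save results to the database.
      entry: assign<RunnerContext>({ collectionRunId: () => `run-${Date.now()}` }),
      always: "running",
    },
    running: {
      // In the real tool, child machines process the request, send it, and check
      // assertions; here a single promise stands in for all three.
      invoke: {
        src: (ctx: RunnerContext) => {
          console.log(`Running ${ctx.remainingTests[0]} for ${ctx.collectionRunId}`);
          return Promise.resolve(true); // pretend the test passed
        },
        onDone: {
          target: "checkingProgress",
          actions: assign<RunnerContext, any>({
            results: (ctx, event) => [...ctx.results, event.data as boolean],
            remainingTests: (ctx) => ctx.remainingTests.slice(1),
          }),
        },
        onError: "failed",
      },
    },
    checkingProgress: {
      always: [
        { target: "complete", cond: (ctx: RunnerContext) => ctx.remainingTests.length === 0 },
        { target: "running" },
      ],
    },
    complete: { type: "final" },
    failed: { type: "final" },
  },
});

interpret(collectionRunnerMachine)
  .onTransition((state) => console.log(state.value))
  .start();
```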

XState Walkthrough

This process repeats until all tests in a collection have been completed. At that point, the collection runner machine moves to the complete state and the collection run is done.

XState

Let's take a look at how we handled failures. If an assertion does not pass, if an error occurs at any point during a state machine's invocation, or if the parent Collection Runner Machine does not receive an event from a child state machine, the parent machine enters the failed state. At that point, the remaining tests in the collection do not run, and an appropriate error message is sent to any interested parties.

In summary, the collection runner receives a POST request with a collectionId and invokes multiple state machines that move through a sequence of states to complete the tests in a collection. This approach makes multi-step tests possible while keeping the complex logic that comes with them organized.

At this point, we had implemented a large part of our API monitor's definition and execution components. Skopos could now define and execute multi-step tests, but it could not yet perform parallel testing, scheduling, or notification. We decided to solve most of these problems at the infrastructure level.

Building Skopos’ Cloud Infrastructure

Full Architecture

As you might remember, every API monitoring tool needs four key components: definition, execution, scheduling, and notification. Here we take a look at the infrastructure for each of those components and how they fit together in our final architecture.

Definition

definition architecture

Visualizing our cloud architecture in the context of our core API monitoring functionality: our definition component.

The frontend build file is stored in an S3 bucket. The backend is hosted in a Docker container on an EC2 instance, fronted by an Elastic Load Balancer. The database is hosted on RDS.

Execution

Collection Runner

Visualizing our cloud architecture in the context of our core API monitoring functionality: our execution component.

The collection runner was already able to execute multi-step tests; however, we also wanted to execute tests in parallel. To do this, we needed a way to run multiple instances of the collection runner at the same time. We explored two ways to accomplish this: a single-tenant solution and a multi-tenant solution.

For a single-tenant solution, we would provision multiple Virtual Machines, each hosting its own instance of the collection runner. Then we would add a load balancer to direct traffic to available collection runner instances. This approach would enable parallel testing, however, it has a few downsides. First, provisioning and managing each Virtual Machine adds complexity. Second, we would be wasting resources because the collection-runner is a relatively lightweight process. It would not utilize the full resources of even the cheapest Amazon EC2 instance.

For the multi-tenant solution, containers are a great fit. One benefit of cloud infrastructure is the ability to take advantage of multi-tenancy, where multiple instances of an application share the same computing resources [6]. To implement this approach, we could run multiple instances of the collection runner in Docker containers, with several containers residing on a single machine. A load balancer would then direct traffic to different containers to enable parallel testing. Because the single-tenant approach came with significant drawbacks, we decided on a multi-tenant solution.

We could implement the multi-tenant approach in two ways: AWS Fargate or ECS on EC2. Fargate is a serverless container service that abstracts away the complexity of managing Virtual Machines, and AWS dynamically scales Fargate tasks depending on demand. However, Fargate's autoscaling can take up to 15 seconds to spin up new containers, which would increase compute time. It is also a pay-as-you-go service, and because the collection runner makes multiple HTTP requests and waits in between to receive responses, using Fargate would mean paying for that wait time. For our needs, a better solution was to run containers on EC2 instances, so that we pay for the machine rather than for compute time.

To implement this, each EC2 instance houses multiple containers. Although we still need to provision and manage Virtual Machines, we can use their resources more efficiently. In summary, we host multiple instances of the collection runner in Docker containers on EC2 instances and use a load balancer to direct traffic to different containers.

At this point, we had both the definition and the execution component in the cloud.

Scheduling

Scheduling

Visualizing our cloud architecture in the context of our core API monitoring functionality: our scheduling component.

For scheduling, we considered two main options for implementation: cron jobs and AWS EventBridge.

The first option was to run cron jobs from a Node.js process. This approach would not require any additional infrastructure, because the cron job logic could be co-located with the backend. However, coupling the scheduling functionality to the backend introduces a vulnerability: if the node storing the cron jobs were to go down, the scheduled tasks stored on that node would be lost. Because of this, we then considered AWS EventBridge.

AWS EventBridge is a serverless event bus that can receive and route events based on user-defined rules, which can include cron expressions that trigger events on a schedule. Although using EventBridge adds a component to our architecture, it decouples the scheduling functionality from existing processes and prevents the loss of schedules from potential node failures. Because our aim with Skopos was to provide a reliable API monitoring tool, we decided to use EventBridge.

At this point, we started implementing the scheduling functionality with EventBridge. However, we found that EventBridge communicates over HTTPS, while the collection runner communicates over HTTP. To allow the two to communicate, we would either need to acquire and manage SSL certificates for the collection runner or place an intermediary between EventBridge and the collection runner. We opted to use a Lambda function as the intermediary: EventBridge invokes the Lambda function, which then sends a request to the collection runner.
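
A minimal sketch of such a Lambda intermediary is shown below, assuming the collection runner exposes an HTTP endpoint like POST /run-collection/:collectionId and the runner's address is provided through an environment variable; both names are hypothetical.

```typescript
// A sketch of the Lambda intermediary between EventBridge and the collection runner.
// COLLECTION_RUNNER_URL and the /run-collection/:collectionId route are hypothetical.
// Assumes a Node.js 18+ runtime, where fetch is available globally.
interface ScheduleEvent {
  collectionId: number; // passed in through the EventBridge rule's input
}

export const handler = async (event: ScheduleEvent): Promise<{ statusCode: number }> => {
  const runnerUrl = process.env.COLLECTION_RUNNER_URL; // e.g. the load balancer's HTTP address
  const response = await fetch(`${runnerUrl}/run-collection/${event.collectionId}`, {
    method: "POST",
  });
  return { statusCode: response.status };
};
```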

Notification

Notification

Visualizing our cloud architecture in the context of our core API monitoring functionality: our notification component.

For notifications, we chose AWS' Simple Notification Service, or SNS. SNS makes it straightforward to send notifications to various services, such as PagerDuty and email.

When setting up a schedule, the backend creates an SNS topic and its subscribers. The user can add PagerDuty and email endpoints as subscribers to the topic corresponding to the monitor. If a failure occurs, the execution component publishes a message to the SNS topic, and the subscribers receive the message.

We also wanted Skopos to send notifications through Slack; however, Slack requires a specific data payload that SNS could not accommodate. We therefore chose to send Slack notifications directly from the collection runner: when the execution component publishes a failure message to SNS, it also sends a notification to Slack if the user has added a Slack webhook to the monitor.
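
Here is a rough sketch of that notification path, assuming the topic ARN and optional Slack webhook URL are stored with the monitor; the names and payload shapes are illustrative.

```typescript
import { SNSClient, PublishCommand } from "@aws-sdk/client-sns";

// A sketch of how the collection runner might report a failure.
// The topic ARN, webhook URL, and message shape are illustrative assumptions.
const sns = new SNSClient({ region: "us-east-1" });

async function notifyFailure(topicArn: string, message: string, slackWebhookUrl?: string) {
  // Publish to the monitor's SNS topic; PagerDuty and email subscribers receive it.
  await sns.send(new PublishCommand({ TopicArn: topicArn, Message: message }));

  // Slack needs its own payload shape, so it is called directly instead of through SNS.
  if (slackWebhookUrl) {
    await fetch(slackWebhookUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: message }),
    });
  }
}
```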

Conclusion

Full Architecture

Here is an overview of our AWS infrastructure. The frontend, built with React, is hosted in an S3 bucket. It communicates with the backend running Apollo Server, which is hosted in a Docker container and managed by ECS. The backend communicates with the PostgreSQL database, hosted on RDS.

The collection runner instances, which you can see at the bottom of the diagram, are hosted in Docker containers running on EC2 instances and managed by ECS. The frontend communicates with the collection runner to execute tests on demand, and the collection runner communicates with the backend to fetch collection data.

Scheduling is handled by AWS EventBridge. The backend server communicates with EventBridge to create a rule for each schedule, and EventBridge triggers a Lambda function that calls the collection runner to execute a collection of tests on schedule.

Finally, notifications are implemented with SNS. When creating a schedule, the backend creates an SNS topic and subscribers. When a failure occurs during execution, the collection runner publishes a message to the corresponding SNS topic and the subscribers are notified of the failure.

Thank you for taking the time to read about Skopos. I hope you enjoyed learning about API monitoring and building a cloud-based API monitoring tool!

We are looking for our next opportunity. If you like our project, have further questions, or think we might be a good fit for your team, please reach out!

Nykaela Dodson
Hans Elde
Katherine Ebel
Gagan Sapkota

Take a look at the full case study here or watch our presentation here

Footnotes


  1. https://twitter.com/GergelyOrosz/status/1502947315279187979 

  2. https://www.splunk.com/en_us/data-insider/what-is-api-monitoring.html 

  3. https://docs.datadoghq.com/synthetics/multistep?tab=requestoptions 

  4. https://www.techtarget.com/searchsoftwarequality/tip/How-and-why-to-do-parallel-testing 

  5. https://deepsource.io/blog/using-state-machine-to-write-bug-free-code/ 

  6. https://digitalguardian.com/blog/saas-single-tenant-vs-multi-tenant-whats-difference 
