DEV Community

James Heggs


GCP DevOps Certification - Pomodoro Eleven

Data processing SLIs

There is a high likelihood that you'll be working with a platform that processes user-provided or user-generated information to deliver a service.

Google recommends four different types of SLIs for that use case:

  • Freshness
  • Correctness
  • Coverage
  • Throughput

Let's review each of them. Much like previous blogs in this series, Google uses the term "valid", this time in regard to the data, and "proportions" to express things as percentages.

Freshness

The data "freshness" can be considered as the proportion of valid data updated more recently than a give threshold.

Given that definition, an implementation requires two choices: which of the data this system processes is valid for the SLI, and when the timer for measuring freshness starts and stops.

Image showing freshness calculation
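
As a rough sketch of that calculation: assuming each valid record exposes a last-updated timestamp (the function and variable names here are illustrative, not from Google's SRE tooling), the freshness SLI is just the fraction of records updated within the threshold.

```python
from datetime import datetime, timedelta, timezone

def freshness_sli(last_updated, threshold, now=None):
    """Proportion of valid records whose last update is within `threshold` of `now`.

    last_updated: iterable of timezone-aware datetimes (assumed input shape).
    threshold: timedelta defining "fresh enough".
    """
    records = list(last_updated)
    now = now or datetime.now(timezone.utc)
    fresh = sum(1 for t in records if now - t <= threshold)
    return fresh / len(records)

# Fixed "now" so the example is deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
updates = [now - timedelta(minutes=m) for m in (1, 5, 30, 120)]
print(freshness_sli(updates, timedelta(hours=1), now=now))  # 3 of 4 within the hour -> 0.75
```

The two design choices from the definition map directly onto the code: which records go into `last_updated` (validity) and what event stamps the timestamp (when the timer starts and stops).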

Correctness

When speaking of data correctness, it's important to note that users often have independent ways of checking the validity of data from your systems. As a result, a data correctness SLI is important for maintaining trust with your users.

The data "correctness" can be considered as the proportion of valid data producing correct output.

Given that definition, an implementation requires two choices: which of the data this system processes is valid for the SLI, and how to determine whether the output is "correct".

Correctness of data
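
A minimal sketch of that idea, assuming you can replay a set of records with known-good ("golden") expected outputs through the pipeline and collect `(actual, expected)` pairs (all names here are illustrative):

```python
def correctness_sli(output_pairs):
    """Proportion of valid records whose pipeline output matches a trusted expected value.

    output_pairs: iterable of (actual, expected) tuples, e.g. built by replaying
    a golden dataset through the pipeline.
    """
    pairs = list(output_pairs)
    correct = sum(1 for actual, expected in pairs if actual == expected)
    return correct / len(pairs)

# Three of four records produced the output the golden data predicts.
checks = [("a", "a"), ("b", "b"), ("x", "y"), ("d", "d")]
print(correctness_sli(checks))  # 0.75
```

How the `expected` values are sourced is the hard part in practice; the comparison itself is the easy part.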

Coverage

A coverage SLI is useful in scenarios where users have an expectation of when the data will be made available to them.

Data "coverage" can be considered as the proportion of valid data processed successfully.

Given that definition, an implementation requires two choices: which of the data this system processes is valid, and whether each piece of data was processed successfully.

Coverage SLI
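
One way to sketch this, assuming each valid record has an identifier and you can enumerate the identifiers that were processed successfully (a hypothetical setup, not a prescribed implementation):

```python
def coverage_sli(valid_ids, processed_ids):
    """Proportion of valid records that appear in the successfully-processed set.

    valid_ids / processed_ids: record identifiers (assumed input shape).
    """
    valid = set(valid_ids)
    return len(valid & set(processed_ids)) / len(valid)

print(coverage_sli(range(100), range(95)))  # 95 of 100 records processed -> 0.95
```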

Throughput

Throughput SLIs for data are useful in scenarios where a latency SLI might not be the right fit, e.g. if processing latency varies a lot (peak times versus quiet times), a data throughput SLI might be more applicable.

Data "throughput" can be considered as the proportion of time where the data processing rate is faster than a threshold.

For this SLI to work you have to turn events into a rate over time, i.e. how much work was processed per unit of time, such as bytes per second.

Any metric that scales at the same rate as the cost of processing should work for tracking this SLI, e.g. a big data file would need a longer time or more processing power to process.
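
Putting that together as a sketch: assuming you bucket the work into equal-length time windows and record bytes processed per window (names and shapes are illustrative), the SLI is the fraction of windows whose rate meets the threshold.

```python
def throughput_sli(bytes_per_window, window_seconds, threshold_bytes_per_sec):
    """Proportion of time windows where the processing rate met the threshold.

    bytes_per_window: bytes processed in each equal-length window (assumed shape).
    """
    rates = [b / window_seconds for b in bytes_per_window]
    good = sum(1 for r in rates if r >= threshold_bytes_per_sec)
    return good / len(rates)

# Four 60-second windows; rates are 500, 100, 300 and 150 B/s,
# so two of four windows sustained at least 200 B/s.
print(throughput_sli([30_000, 6_000, 18_000, 9_000], 60, 200))  # -> 0.5
```

Bytes per second is just one choice; any unit that scales with the cost of processing (records, rows, events) works the same way here.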
