DEV Community

Ismael Velasco

Posted on • Originally published at ismaelvelasco.dev on

How many emissions in a gigabyte of data?

The Big Picture

In my post on the green potential of event-driven architectures and AsyncAPI I gave a high-level overview of how to estimate the quantity of CO2 emissions generated by API traffic, and particularly REST APIs. Using the most conservative, lowest estimate of CO2/gigabyte I could find, I arrived at a figure equivalent to driving all the cars in Shanghai for a year. Every year. Offsetting that would need the equivalent of doubling all the trees in Ireland. Every year.

This is the Big Picture we need to keep our focus on, which can be summarised as:

  1. The more data we transmit or consume, the greater our digital carbon footprint
  2. The amount of data currently transmitted is a significant contributor to climate change
  3. Anything we can do to reduce data transmission in volume, frequency or duration, whether through greener infrastructure and architecture or through back-end and front-end design patterns such as green-by-default and green-mode design, we should do in order to shift the current dismaying climate trends.


At a more granular level, however, the Big Picture above gives you your motivation and theory of change, but it is not enough to translate vision into measurable objectives.

The challenge of measurement

Use cases

You need some kind of unit of measure to benchmark your product's current digital emissions, so that you can monitor and measurably improve them. This could be purely at engineering team level, incorporated into your CI pipeline so that ticket implementations stay within a carbon budget; or it could be part of a company-wide digital Lifecycle Analysis (LCA), environmental management system (EMS), or B Corp certification process.
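As a rough illustration of the CI use case, a carbon budget check could look something like the sketch below. The function names, the conversion factor and the budget are all hypothetical placeholders rather than part of any particular tool; the factor is passed in explicitly so that whichever scientifically backed figure you adopt is applied consistently.

```python
def estimated_grams_co2(transfer_bytes: int, kg_co2_per_gb: float) -> float:
    """Estimate emissions in grams for a transfer of the given size."""
    gigabytes = transfer_bytes / 1_000_000_000  # decimal gigabytes
    return gigabytes * kg_co2_per_gb * 1000  # kg -> g


def within_budget(transfer_bytes: int, kg_co2_per_gb: float, budget_grams: float) -> bool:
    """True if the estimated emissions do not exceed the carbon budget."""
    return estimated_grams_co2(transfer_bytes, kg_co2_per_gb) <= budget_grams


def ci_exit_code(transfer_bytes: int, kg_co2_per_gb: float, budget_grams: float) -> int:
    """Return 0 when within budget and 1 otherwise, for use as a CI exit status."""
    return 0 if within_budget(transfer_bytes, kg_co2_per_gb, budget_grams) else 1


# Hypothetical inputs: a 2.5 MB page, a placeholder factor and a placeholder budget.
PAGE_BYTES = 2_500_000
FACTOR_KG_PER_GB = 0.03  # placeholder; swap in the figure you have chosen
BUDGET_GRAMS = 0.05      # placeholder budget per page view
```

In a pipeline step you might call `sys.exit(ci_exit_code(PAGE_BYTES, FACTOR_KG_PER_GB, BUDGET_GRAMS))` so that an over-budget change fails the build.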

So how much CO2 is generated by 1GB of data? This is a case in which why you measure is more important than how you measure.

Focus on the why

Bearing in mind the Big Picture above, you benchmark in order to make measurable progress, and you choose a metric in order to monitor and communicate that progress. What really matters in most cases is not so much the precision of your unit as the trajectory of your product and organisation.

This is not to say that the precision of your measurements is inconsequential, or that expert disagreement on exact bounds or ranges means there are no bounds or ranges you need to respect to remain rooted in evidence!

But it is to say that, for most organisational purposes, as long as you are within the ranges indicated by reputable science (meaning you pick a scientifically backed metric, even if it differs from another scientifically backed metric), the precise calculus of your units of measurement is less important than how much and how fast your emissions are improving (or worsening).

Much better to have a measure that turns out to be inaccurate but quantifies a 50% annual improvement in emissions and consistent progress over 5 years, than a measure of superior exactitude that demonstrates 10-20% more emissions every year over the same 5 years.
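One way to see why trajectory matters more than the exact unit: if the same per-GB factor is applied consistently, the relative change from year to year is the same whichever factor you picked. The sketch below uses made-up traffic figures and two arbitrary placeholder factors purely to illustrate that point, and it assumes the improvement comes from reducing data volume.

```python
def year_on_year_change(gb_per_year, kg_co2_per_gb):
    """Relative change in estimated emissions between consecutive years."""
    emissions = [gb * kg_co2_per_gb for gb in gb_per_year]
    return [(cur - prev) / prev for prev, cur in zip(emissions, emissions[1:])]


# Hypothetical traffic: halving data volume every year.
traffic_gb = [1000.0, 500.0, 250.0]

# Two placeholder factors, roughly an order of magnitude apart.
with_low_factor = year_on_year_change(traffic_gb, 0.004)
with_high_factor = year_on_year_change(traffic_gb, 0.03)

for low, high in zip(with_low_factor, with_high_factor):
    print(f"{low:+.0%} vs {high:+.0%}")
```

The absolute totals differ a great deal between the two factors, but the direction and rate of travel do not.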


Having said that, there is no exact and constant GB/kWh/CO2 correspondence, which is why there is quite a bit of scientific and policy debate in terms of arriving at precise figures.

Calculating data emissions

Global averages

The most conservative recent figure, from McGovern, is 0.015 kWh per GB and 0.0042 kg of CO2 per GB. The International Energy Agency (IEA) estimated in 2020 a footprint of 0.06 kWh/GB and 0.478 kg CO2/kWh, which would result in roughly 0.029 kg of CO2 per GB streamed.
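The arithmetic behind these averages is a single multiplication: energy per GB times the carbon intensity of the electricity used. A minimal sketch using only the figures quoted above (the function and variable names are my own):

```python
def kg_co2_per_gb(kwh_per_gb: float, kg_co2_per_kwh: float) -> float:
    """Emissions per GB = energy per GB x carbon intensity of that energy."""
    return kwh_per_gb * kg_co2_per_kwh


# IEA 2020 figures quoted above: just under 0.03 kg of CO2 per GB.
iea_kg_per_gb = kg_co2_per_gb(0.06, 0.478)

# Working backwards from the McGovern figures gives the grid intensity they imply.
mcgovern_implied_kg_per_kwh = 0.0042 / 0.015

print(f"IEA: {iea_kg_per_gb:.4f} kg CO2/GB")
print(f"McGovern implies {mcgovern_implied_kg_per_kwh:.2f} kg CO2/kWh")
```

Note that the two sources differ in both the energy per GB and the implied grid intensity, which is part of why the per-GB results sit so far apart.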

So that shouldn't be too difficult, right? Just choose one of the metrics above, and start calculating your CO2/GB benchmarks.

Hardware factors

Except it is a bit more nuanced than that, if you're after precision. The emissions of the same GB of data differ by device type (e.g. mobile or PC) and by signal type (e.g. 3G, 4G, Wi-Fi), with energy use estimated at 0.1-0.2 kWh/GB for 4G mobile, so a lot more than the metrics above.
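To get a feel for how much the signal type moves the result, the sketch below converts energy ranges into emissions ranges. Applying the 0.478 kg CO2/kWh figure to the 4G range is my own simplifying assumption for illustration, and the dictionary keys are just labels.

```python
GRID_KG_CO2_PER_KWH = 0.478  # IEA 2020 figure quoted earlier, assumed to apply here too

# Energy per GB as (low, high) in kWh, taken from the figures in the text.
KWH_PER_GB = {
    "IEA 2020 average": (0.06, 0.06),
    "4G mobile": (0.1, 0.2),
}


def kg_co2_range(kwh_range, kg_co2_per_kwh=GRID_KG_CO2_PER_KWH):
    """Convert a (low, high) energy range per GB into an emissions range per GB."""
    low, high = kwh_range
    return (low * kg_co2_per_kwh, high * kg_co2_per_kwh)


for label, kwh_range in KWH_PER_GB.items():
    low, high = kg_co2_range(kwh_range)
    print(f"{label}: {low:.3f} to {high:.3f} kg CO2/GB")
```

Under these assumptions a GB over 4G comes out at roughly two to three times the IEA average figure.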

It follows that if your product involves the Internet of Things, the intensity per GB will likewise vary depending on whether you're using a smart watch, a fridge, smart glasses, or an implant in your leg.

The problem is that, once you've accounted for device type, not all device brands and models within each device type are created equal, so depending on the age, brand and model, your 1GB might produce completely different emissions.

The same goes for the data communication protocol (e.g. HTTP, USSD, MQTT) used to transmit the 1GB of data to your device.

Got it. So if we just apply different CO2/GB metrics per device type, per signal type and per communication protocol, then, as the Brits disconcertingly say: Bob's your uncle.

Now can we go measure?

Software factors

Well, say you have 2 identical devices running on the same signal type via the same communication protocol. Different configurations, installed software, operating systems, etc. will affect the electricity consumption of those otherwise identical machines upon receiving 1 GB of data.

Browsing 1GB of data in a text-based terminal browser on a minimalist Linux distribution like Porteus, which is small enough to fit on an old USB stick and run entirely from system RAM, will be very different from browsing 1GB of data on Windows with 100 Chrome tabs open and 10 desktop applications running in the background.


So let's imagine you have 2 identical machines, identically configured, in identical conditions, with identical hardware and software running. Will your elusive GB now be equivalent?

Use case factors

Well actually, no. Emissions also differ according to the user behaviour that GB of data is meant to elicit. 1 GB has been estimated to be equivalent to 600 web pages, or 30 minutes of HD video (caveat emptor: previous paragraphs apply!). By now, you've read enough to be pretty sure that your identical quantity of data will produce different emissions in these two scenarios... but can you guess whether 30 minutes of data- and CPU-intensive video or 600 super optimised web page visits is worse for the climate?

If we take the average time spent on a web page to be 52 seconds (it varies hugely between industries and between websites), then 600 pages is upwards of 8 hours on your machine viewing its screen. That is more than 16 times longer than a 30 minute video.

Imagine that 30 minute video was transmitted in HD, not hosted with a green cloud provider, with no steps taken to optimise delivery. Meanwhile those web pages were downloaded in a single optimised request via a super optimised CDN from a green cloud provider. Clearly the emissions generated by the GB of video traffic would be incomparably higher than those of your fantastically optimised and delivered 600 pages.

Except, all that reckless, high emission, 30 minute video traffic, would STILL use less electricity than keeping your computer monitor, CPU, background processes, etc, in use for 8 hours.

Taking everything into account so far, and inaccurately working from a global average electricity mix, streaming a GB of video will produce around 18g of CO2. In contrast, a laptop that has a life of three years will generate, including embodied carbon, 107g of CO2 per hour of use (https://www.hempoffset.com/2022/05/31/whats-the-carbon-footprint-of-your-digital-content/), so around 856g of CO2 over 8 hours: nearly 50 times the 30 minute video's footprint for a single GB of data.
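The comparison can be reproduced in a few lines. This sketch uses only the inputs quoted above (600 pages, 52 seconds per page, 107g of CO2 per laptop-hour, 18g per GB of video); the function names are my own. With unrounded inputs the numbers come out slightly higher than the rounded ones in the text, but the conclusion is the same: device time dominates by a factor of around 50.

```python
PAGES_PER_GB = 600
SECONDS_PER_PAGE = 52
LAPTOP_G_CO2_PER_HOUR = 107  # includes embodied carbon, per the linked source
VIDEO_G_CO2_PER_GB = 18


def browsing_hours(pages: int = PAGES_PER_GB, seconds_per_page: float = SECONDS_PER_PAGE) -> float:
    """Total screen time, in hours, needed to view the given number of pages."""
    return pages * seconds_per_page / 3600


def laptop_grams_co2(hours: float, g_per_hour: float = LAPTOP_G_CO2_PER_HOUR) -> float:
    """Laptop footprint in grams for the given hours of use."""
    return hours * g_per_hour


hours = browsing_hours()              # a bit under 9 hours
exact_g = laptop_grams_co2(hours)     # using the unrounded duration
rounded_g = laptop_grams_co2(8)       # using the rounded 8 hours

print(f"{hours:.1f} h of browsing, about {exact_g:.0f} g CO2")
print(f"ratio to the video: {exact_g / VIDEO_G_CO2_PER_GB:.0f}x "
      f"(or {rounded_g / VIDEO_G_CO2_PER_GB:.0f}x with 8 h)")
```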

Grid intensity

So now we're there, right? You just have to use the same machine to stream the same 30 minute video over the same network for your 1GB of data to translate into a single CO2 emissions metric.

Not by a long shot!


The emissions of your extremely frustrating GB will vary with the location and time at which you stream your 30 minute video, even on the same device, over the same network and protocol, hosted on the same server.

Emissions fluctuate in accordance with the intensity of the electricity grid at the time you stream a video. Your GB will produce more emissions when the grid is at high intensity, and less when the grid is being powered by renewable resources, so the same GB of data will produce more or less CO2 at a different time of day. There's a variety of APIs, like Electricity Maps (global), carbonintensity.org.uk (UK) and many many more, that you can use to measure your emissions in relation to grid usage, and make your websites and applications not just carbon aware but carbon intelligent.
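As a sketch of what measuring against grid intensity might look like, the code below reads a current-intensity reading from the UK Carbon Intensity API and applies it to a GB of data. The endpoint URL and the payload shape reflect my understanding of that API's public documentation and should be checked against the current docs; the sample values are illustrative, and the 0.06 kWh/GB factor is the IEA figure quoted earlier.

```python
import json
from urllib.request import urlopen

# Assumed endpoint for the current half-hour reading; verify against the API docs.
API_URL = "https://api.carbonintensity.org.uk/intensity"


def fetch_intensity_payload(url: str = API_URL, timeout: float = 10) -> dict:
    """Fetch the raw JSON payload (performs a network request when called)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def current_intensity_g_per_kwh(payload: dict) -> float:
    """Extract gCO2/kWh, preferring the measured value over the forecast."""
    reading = payload["data"][0]["intensity"]
    value = reading.get("actual")
    if value is None:  # the measured value may not be available yet
        value = reading["forecast"]
    return float(value)


def grams_co2(gigabytes: float, kwh_per_gb: float, intensity_g_per_kwh: float) -> float:
    """Estimated emissions for a transfer at the given grid intensity."""
    return gigabytes * kwh_per_gb * intensity_g_per_kwh


# Illustrative payload in the assumed response shape (values are made up).
SAMPLE_PAYLOAD = {
    "data": [
        {
            "from": "2018-01-20T12:00Z",
            "to": "2018-01-20T12:30Z",
            "intensity": {"forecast": 266, "actual": 263, "index": "moderate"},
        }
    ]
}

intensity = current_intensity_g_per_kwh(SAMPLE_PAYLOAD)
print(f"1 GB at {intensity:.0f} gCO2/kWh: {grams_co2(1, 0.06, intensity):.1f} g CO2")
```

Swapping `SAMPLE_PAYLOAD` for `fetch_intensity_payload()` would use a live reading instead of the sample.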

An example of one implementation is the fantastic https://branch.climateaction.tech website. If you look top right, you will see it has 4 performance modes, giving you most functionality at low grid intensity times, and reducing the defaults at high grid intensity times. So in low grid intensity mode you get full colour images displayed by default; in moderate you get them in monochrome; and in high intensity mode you have to click in order to see the image. The "Live" mode is a smart mode that automatically switches depending on grid intensity.
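The mode-switching idea can be sketched in a few lines. This is not the Branch site's actual code: the mode names, the index labels (modelled on the bands I believe the UK Carbon Intensity API reports) and the fallback behaviour are all assumptions made for illustration.

```python
# How images are served for each grid intensity band (hypothetical names).
IMAGE_MODE_BY_BAND = {
    "very low": "full_colour",
    "low": "full_colour",
    "moderate": "monochrome",
    "high": "click_to_load",
    "very high": "click_to_load",
}


def image_mode(user_choice: str, grid_index: str) -> str:
    """Pick an image mode from the user's setting and the current grid band.

    "live" follows the grid; any other choice is treated as a fixed band.
    Unknown values fall back to the most frugal mode.
    """
    band = grid_index if user_choice.strip().lower() == "live" else user_choice
    return IMAGE_MODE_BY_BAND.get(band.strip().lower(), "click_to_load")


print(image_mode("live", "moderate"))  # follows the grid
print(image_mode("low", "very high"))  # user has pinned the low-intensity mode
```

Falling back to the most frugal mode when the band is unknown is one design choice; defaulting to the user's last known setting would be another.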


This simple JavaScript file shows how they use the UK grid API to measure intensity: https://github.com/climateaction-tech/branch-theme/blob/master/js/gridintensity.browser.min.js

Another implementation would be https://codecarbon.io/, which is more granular and calculates the emissions not just of your website or application in aggregate, but of your actual code, in accordance with grid intensity. This allows you to create carbon-aware queueing jobs, which basically allocate the most compute-intensive jobs to the lowest grid intensity times.
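The scheduling idea itself can be illustrated without any particular library. The sketch below does not use CodeCarbon's API; it simply pairs the most energy-hungry jobs with the lowest-intensity slots in a forecast. The job names, energy estimates and forecast values are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Slot(NamedTuple):
    start: str                  # start time of the slot, e.g. "02:00"
    intensity_g_per_kwh: float  # forecast grid intensity for the slot


def schedule_jobs(job_kwh: Dict[str, float], forecast: List[Slot]) -> Dict[str, str]:
    """Assign the most energy-hungry jobs to the lowest-intensity slots.

    One job per slot; if there are more jobs than slots, the least
    energy-hungry jobs are left unscheduled.
    """
    slots = sorted(forecast, key=lambda slot: slot.intensity_g_per_kwh)
    jobs = sorted(job_kwh.items(), key=lambda item: item[1], reverse=True)
    return {name: slot.start for (name, _), slot in zip(jobs, slots)}


# Hypothetical forecast and job energy estimates.
forecast = [Slot("02:00", 120.0), Slot("12:00", 210.0), Slot("18:00", 320.0)]
job_kwh = {"nightly-report": 0.2, "video-transcode": 3.0, "cache-warm": 0.5}

plan = schedule_jobs(job_kwh, forecast)
print(plan)
```

A real queue would also need deadlines and slot capacity, which this sketch deliberately ignores.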

Track to improve, gradually refine

So is there one answer to the question of how many CO2 emissions are in 1 GB that we can use as a consistent metric? Alas, no. Which is to say, tools like Ecograder in my first posts give you a nice starting point to benchmark and communicate your progress, but if you want to be thorough and maximally impactful, you need to drill down much more in both your measurements and implementations, with the tools and examples I've offered.

The most important thing, of course, is that whatever metrics and tools you choose, quick and dirty or orchestrated and precise, they help you surface your direction of travel in emissions, and empower you to reduce them month on month, year on year.
