The Cost of Data with Vaidehi Joshi

Vaidehi Joshi ・5 min read

To find all of the resources listed in this talk, as well as the full transcript of this talk, check out costofdata.dev.

Vaidehi is a senior engineer at DEV, where she builds community and helps improve the software careers of millions. She enjoys building and breaking code, but loves creating empathetic engineering teams a whole lot more. She is the creator of basecs and baseds, two writing series exploring the fundamentals of computer science and distributed systems. She also co-hosts the Base.cs Podcast, and is a producer of the BaseCS and Byte Sized video series.

A super brief outline of this talk is as follows:

  1. [Introduction]: The what + why of data centers
  2. [Middle]: Investigating the environmental impact of cloud providers
  3. [Middle]: Exploring how the impact of running data centers will scale over time
  4. [Middle]: Highlighting some advancements made in this sector
  5. [Conclusion]: Providing actionable items for developers

Here is a much more detailed outline, which goes into greater depth:

[Introduction]: The what + why of data centers
Everything that we do on the web has to be stored somewhere. Many of us work with databases every day, and some of us even do impromptu ops work, focusing on making sure our servers stay alive in an effort to keep our apps up and running. But even though we all know this theoretically, most of us never have to think beyond our own databases or servers.

This talk takes a macroscopic view of what all of this data actually looks like in reality. All of our app's data lives in a data center, somewhere in the world. Whether we build our own, or use a cloud provider, we're relying on that infrastructure to maintain our app's uptime and store our users' content. These physical buildings are often out-of-sight—but that doesn't mean that they should be out of mind!

[Middle]: Investigating their environmental impact
Data centers require large footprints: they are physically huge buildings, but they also require a large amount of energy to provide constant, uninterrupted service. Current research estimates that data centers worldwide use ~200 terawatt hours (TWh) a year and demand somewhere between 1% and 3% of the world’s electricity (I’ve found conflicting reports on the exact number). In terms of consuming energy, this puts data centers in the same bucket as the aviation industry!
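As a rough back-of-the-envelope check on how the ~200 TWh estimate relates to a percentage of world electricity, here is a minimal Python sketch. The global total used below is an assumed round number chosen for illustration, not a figure from the talk, and the function names are hypothetical.

```python
def share_percent(part_twh: float, total_twh: float) -> float:
    """Return part_twh as a percentage of total_twh."""
    if total_twh <= 0:
        raise ValueError("total_twh must be positive")
    return 100.0 * part_twh / total_twh


def implied_total_twh(part_twh: float, percent: float) -> float:
    """Return the total that would make part_twh equal to `percent` of it."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return part_twh * 100.0 / percent


DATA_CENTER_TWH = 200.0        # estimate quoted in the paragraph above
ASSUMED_GLOBAL_TWH = 20_000.0  # ASSUMPTION: illustrative round number only

# Share of the assumed global total used by data centers.
print(f"{share_percent(DATA_CENTER_TWH, ASSUMED_GLOBAL_TWH):.1f}% of assumed total")

# Global totals that the 1% and 3% ends of the range would each imply.
for pct in (1.0, 3.0):
    print(f"{pct:.0f}% implies a total of {implied_total_twh(DATA_CENTER_TWH, pct):,.0f} TWh")
```

The result depends heavily on which totals go into the calculation, which may be one reason the reported percentages conflict.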

Data centers require so much energy because servers create heat, so they need to be cooled down; unfortunately, many data centers are built in warm climates, which makes them pretty inefficient. Another harsh reality is where these data centers get their energy from. While some cloud providers have shifted to using 100% renewable energy, not all of them have yet. Amazon Web Services (AWS), for example, has committed to a long-term goal of using 100% renewable energy, but one of its most popular regions, US East, is still fueled by coal and natural gas. And, to make matters more complicated, cloud providers don’t exactly make it easy for consumers to know whether their data is being stored in a green facility; many of them are simply not transparent and do not release information on their data centers.

[Middle]: Exploring how the impact of running data centers will scale over time
When we stop to think about how this will scale over time, this problem can seem overwhelming and daunting. Researchers estimate that 28 billion devices will be connected to the internet in 2020, and the amount of data we create is ever-growing.

As our climate changes, there are other threats to the infrastructure that is going to be needed to power all of those devices that are connected to the internet. Researchers at the University of Oregon have predicted that, by 2030, approximately 235 data centers will be impacted by a 1-foot rise in sea levels.

The data center problem is multifaceted: data centers take energy to power and to cool, and their numbers are growing. They are impacting the physical environment, and they are also at high risk of being impacted by climate change themselves.

[Middle]: Highlighting some advancements made in this sector
But, there’s hope yet. Data centers also happen to be the home of some of the most interesting technological advancements in our industry! More and more data centers are being moved to cooler climates, and scientists are coming up with better, more effective cooling solutions. In Stockholm, researchers have found a way to recover data center waste heat and are now reusing that same energy to heat homes in the city!

[Conclusion]: Providing actionable items for developers
So, knowing all of this, what can we do, as developers? Here are some actionable items that each of us can do today:
• Figure out where your data lives!
• Figure out if it’s green! (thegreenwebfoundation.org is a great resource for this.)
• If your data doesn’t live in a green zone, consider migrating your data to a different location or provider. (Admittedly, I know that this is not easy!)
• When provisioning a new server/database, don’t provision it in a zone that is not green.
• Build things that make this knowledge easily accessible! (I love the cloud sustainability console Chrome extension, built by Paul Johnston, which highlights AWS zones that are green in the console.) He also imagined a CLI tool that would allow us to see how much energy is used, how much of it is renewable, and how much carbon is released for every new instance on a cloud vendor! This doesn’t exist yet, but other tools like this would be great to build for the community.
• If you work at a small company, draw attention to this issue internally! At my previous company, I brought this up and we started the discussion of what it would take to migrate away from the AWS US East zone.
• If you are part of a large company, especially one that has a large enterprise account, pressure your cloud provider to be transparent about where their energy comes from for powering their data centers.
• If you work for a cloud provider, push them to use clean energy in their data centers (lots of employees have already done great work on this!)
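To make the "don’t provision in a zone that is not green" item and the imagined CLI a little more concrete, here is a minimal Python sketch of what the core of such a tool might look like. Everything in it is hypothetical: the region names, the renewable fractions, the carbon intensities, and the function names are placeholders invented for illustration, not real provider data or an existing tool. A real version would need figures published by the cloud provider.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass(frozen=True)
class Region:
    name: str
    renewable_fraction: float   # 0.0 to 1.0, share of energy from renewables
    grid_kg_co2_per_kwh: float  # carbon intensity of the non-renewable share


# HYPOTHETICAL placeholder data: made-up regions and made-up numbers.
REGIONS = {
    "example-green-1": Region("example-green-1", 1.0, 0.0),
    "example-mixed-1": Region("example-mixed-1", 0.4, 0.5),
}


def yearly_footprint(region: Region, instance_watts: float) -> dict:
    """Estimate yearly energy, renewable energy, and carbon for one instance."""
    energy_kwh = instance_watts * HOURS_PER_YEAR / 1000.0
    non_renewable_kwh = energy_kwh * (1.0 - region.renewable_fraction)
    return {
        "energy_kwh": energy_kwh,
        "renewable_kwh": energy_kwh - non_renewable_kwh,
        "kg_co2": non_renewable_kwh * region.grid_kg_co2_per_kwh,
    }


def check_region(name: str, min_renewable: float = 1.0) -> Region:
    """Return the region if it meets the renewable threshold, else raise."""
    region = REGIONS.get(name)
    if region is None:
        raise KeyError(f"no energy data for region {name!r}")
    if region.renewable_fraction < min_renewable:
        raise ValueError(
            f"{name} is only {region.renewable_fraction:.0%} renewable; "
            "refusing to provision here"
        )
    return region


# Example: report the footprint of a 100 W instance in each placeholder region.
for region in REGIONS.values():
    print(region.name, yearly_footprint(region, instance_watts=100.0))
```

A guard like `check_region` could run before any provisioning script, so that choosing a non-green zone becomes an explicit decision rather than a default.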

Here is a download link to the talk slides (PDF)

This talk will be presented as part of CodeLand:Distributed on July 23. After the talk is streamed as part of the conference, it will be added to this post as a recorded video.

Vaidehi Joshi


Writing words, writing code. Sometimes doing both at once. Señiorita engineer at Forem.


Yay!!!! I feel like a kid in a candy store. lol!


I am loving how well-researched the data is in this talk





An incredibly useful talk - I'm glad for the focus on our responsibility regarding the true costs of data. Awesome job, @vaidehijoshi !! 🚀


Wow, excellent talk! Thank you.


Ah! I have that byte sized sticker!


Awesome talk, never really thought about this even as I was working with databases. Companies must be held to a standard of using alternative and clean sources of energy, and programmers should be made more aware of their decisions!


Agreed! And we as the creators of technology can hopefully influence companies in the right direction :)


Brilliant talk @vaidehijoshi - really gives pause for thought, both for the tech sector, but in any sector and all of our personal lives. The time series chart showing year on year % consumption was really quite scary.


Thank you so much! I put so much work into this talk, so hearing feedback like your comment makes me feel all warm and happy inside! ✨


Intriguing talk. 20% of projected demand by end of decade!!! Will definitely be thinking about it.


That's a very informative and interesting talk! I had no idea that servers are so costly!


Thanks for sharing this talk! I love that you were so informative, and also hopeful about what we can do to improve this and be better as developers. :)


I love this talk, it's a unique perspective on the tech industry as a whole!


I was wondering about this as I study for my cloud exam. Thanks for making this talk!


This is amazing! I haven't thought about any of this before. Thanks for giving this talk, @vaidehijoshi !


This is a great talk, thank you! I never thought about how data affects our climate.


Being new to dev of any kind this was an amazing talk! I never considered where data is stored! Thank you for putting this into perspective and creating a resource page!


@vaidehijoshi Awesome topic, fascinating presentation, thank you!


Another great talk about computing and climate change by Nabil Hassein: deconstructconf.com/2018/nabil-has...


Use them to heat homes in Sweden!


As a Swede, I approve of this suggestion.


Turns out that Sweden has already done this! 😉 Check out this article.

😺 Yes! Thank you for posting the article.


Hi Vaidehi! 👋

Did you find that researching the energy usage of data centers came pretty easily, or did you have to go down rabbit holes because maybe the companies aren't so transparent about this?


It was easy in a sense because the work done in Paul + Anne's whitepaper was immensely helpful! But it's also just extremely hard to find data around this, which made it hard. So, all in all, I knew where to start, but it was hard to get data and concrete statements from certain providers (especially Alibaba!).


Wow, 20% of projected energy. Surprising.


How soon do you think data centers will produce their own energy, rather than buying offsets?


I would be surprised if this happens, since it requires a lot of your own infrastructure set up! But maybe it will, who knows. I think the more likely scenario is that private companies start providing "green electricity" alternatives (but I also don't know how hard that would be to do from a legal perspective 😉)!


I read about a man that kept hearing a noise, non-stop, that ended up being the data center near him.


What about the redundant storage?


I usually don't think about this part so this will be interesting!