How large is the dev.to production database 🤔


This October, for Hacktoberfest, I personally adopted the dev.to project.

Browsing through the codebase, I'm loving how organised and simple the code looks. The people and bots 😅 involved in the code certainly make it seem easy, which is very encouraging.

Now I have this burning curiosity nagging at the back of my mind:

How large is the dev.to production database? 100GB? 500GB? 2TB 😲?

Let's try guessing the size and let @ben and the @devteam tell us the answer in the end.

My guess is ≃ 250GB


Cheers 🥂
@kudapara

 

I believe all uploaded images/videos are stored on a CDN, so I will not include them in my estimate. But let's get a few other numbers to try and get a better estimate.

  • This article has an ID of 186129, so I'll round to 200k articles for easy math.
  • The text in an article doesn't take up much space at all. I would guess most articles are a few KB in size, and the largest ones less than 100KB. Let's go with a generous midpoint estimate of 50KB per article.
  • The most popular articles have a couple hundred comments, but most are probably around 10 or so. Comments are much smaller than articles, so let's go with an estimate of 1KB per comment.
  • The homepage for logged-out visitors says "239,226 humans who code", so let's round up to 250k registered accounts. I couldn't even begin to put an accurate estimate on the size of each account record, so let's just say 10KB to account for a bio, linked URLs, etc.

So let's do the math!

(200,000 articles * 50KB) + ((200,000 articles * 10 comments) * 1KB) + (250,000 users * 10KB) = 14.5 GB

I'm gonna put my official guess at 25 GB. Text-based media takes up much less space than you might expect!
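For anyone who wants to plug in their own numbers, here is a rough Python sketch of the same back-of-the-envelope math. The function name `estimate_gb` and its parameters are made up for illustration, the inputs are the guesses from this comment rather than real dev.to figures, and it assumes decimal units (1 GB = 1,000,000 KB).

```python
# Back-of-the-envelope estimate using the guessed figures from this comment.
# Assumption: decimal units, so 1 GB = 1,000,000 KB.
KB_PER_GB = 1_000_000


def estimate_gb(articles, kb_per_article, comments_per_article,
                kb_per_comment, users, kb_per_user):
    """Return a rough text-only database size in GB (hypothetical helper)."""
    article_kb = articles * kb_per_article
    comment_kb = articles * comments_per_article * kb_per_comment
    user_kb = users * kb_per_user
    return (article_kb + comment_kb + user_kb) / KB_PER_GB


total = estimate_gb(
    articles=200_000, kb_per_article=50,
    comments_per_article=10, kb_per_comment=1,
    users=250_000, kb_per_user=10,
)
print(f"{total:.1f} GB")  # prints "14.5 GB"
```

This leaves out indexes, join tables, and any audit or version history, which is part of why the official guess is padded above the raw total.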

 

Wondering if these guys use event sourcing? If they do, then the event store would be HUGE! The table size would increase rapidly. The events with large payloads would be PostUpdated and CommentUpdated.

However, I cannot estimate the numbers.

 

Yeah, I would guess something around that too.

In the math I would also factor in that this is a Rails app, which basically means there is a lot of cruft in the DB, and if they use gems like PaperTrail, it's even worse. That's why I would land at around 25GB, even though I would say the raw content should not exceed 10GB.

 

That's very sound maths. You, sir, have my respect. I had a bit more conservative an estimate, based on a blogging company I once contracted for. My estimate was 72GB, rounded.

 

My guess is ≃ 250GB

This is the closest guess yet (being the first guess), but it is not correct.

 
 


 

From what I saw while browsing through the codebase, I don't think they use event sourcing (though I could be wrong).

 
 
 
 
 

Thank you all 👏🏾🖤👏🏾

 
 
 
 

You've made some really good points there. I checked on the internet for additional information about the issue and found most people will go along with your views on this website.


Kudakwashe Paradzayi
Extreme Programmer 👨‍💻 × Fullstack Javascript Developer 💪🏽 × Hackerman 😎