
How large is the dev.to production database 🤔


This October, as part of Hacktoberfest, I personally adopted the dev.to project.

Browsing through the codebase, I'm loving how organised and simple the code looks. The people and bots 😅 who have worked on it certainly make it seem easy, which is very encouraging.

Now I have this burning curiosity nagging at the back of my mind:

How large is the dev.to production database? 100GB? 500GB? 2TB 😲?

Let's try guessing the size and let @ben and the @devteam tell us the answer in the end.

My guess is ≃ 250GB


Cheers 🥂
@kudapara

 

I believe all uploaded images/videos are stored on a CDN, so I will not include them in my estimate. But let's get a few other numbers to try and get a better estimate.

  • This article has an ID of 186129, so I'll round to 200k articles for easy math.
  • The text in an article doesn't take up much space at all. I would guess most articles are a few KB in size, and the largest ones less than 100KB. Let's go with a median estimate of 50KB per article.
  • The most popular articles have a couple hundred comments, but most are probably around 10 or so. Comments are much smaller than articles, so let's go with an estimate of 1KB per comment.
  • The homepage for logged-out visitors has "239,226 humans who code", so let's round up to 250k registered accounts. I couldn't even begin to put an accurate estimate on the size of each account record, so let's just say 10KB to account for a bio, linked URLs, etc.

So let's do the math!

(200,000 articles * 50KB) + ((200,000 articles * 10 comments) * 1KB) + (250,000 users * 10KB) = 14.5 GB
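
For anyone who wants to plug in their own guesses, the arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope helper, not anything from the dev.to codebase: the function name and parameters are made up for illustration, and it assumes decimal units (1 GB = 1,000,000 KB).

```python
# Back-of-the-envelope estimate using the guessed numbers from this comment.
# Assumption: decimal units, i.e. 1 GB = 1,000,000 KB.
KB_PER_GB = 1_000_000


def estimate_db_size_gb(articles, kb_per_article,
                        comments_per_article, kb_per_comment,
                        users, kb_per_user):
    """Return a rough database size in GB from per-record size guesses."""
    article_kb = articles * kb_per_article
    comment_kb = articles * comments_per_article * kb_per_comment
    user_kb = users * kb_per_user
    return (article_kb + comment_kb + user_kb) / KB_PER_GB


estimate = estimate_db_size_gb(
    articles=200_000, kb_per_article=50,
    comments_per_article=10, kb_per_comment=1,
    users=250_000, kb_per_user=10,
)
print(f"{estimate:.1f} GB")  # 14.5 GB
```

Changing any single guess (say, 5KB per article instead of 50KB) shows quickly how sensitive the total is to the article size assumption.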

I'm gonna put my official guess at 25 GB. Text-based media takes up much less space than you might expect!

 

I wonder if they use event sourcing? If so, the event store would be HUGE! The table size would increase rapidly. The events with the largest payloads would be PostUpdated and CommentUpdated.

However, I cannot estimate the numbers.

 

Yeah, I would guess something around that too.

I would also factor in that this is a Rails app, which basically means it has a lot of junk in the DB, and if it uses gems like PaperTrail, it's even worse. That's why I would land at around 25GB, even though I would say the actual content should not exceed 10GB.

 

That's very sound maths. You, sir, have my respect. I had a somewhat more conservative estimate, based on a blogging company I once contracted for. My estimate was 72GB, rounded.

 

My guess is ≃ 250GB

This is the closest guess yet (being the first guess), but it is not correct.

 
 

I wonder if they use event sourcing? If so, the event store would be HUGE! The table size would increase rapidly. The events with the largest payloads would be PostUpdated and CommentUpdated.

However, I cannot estimate the numbers.

 

I don't think they use event sourcing, from what I saw while browsing through the codebase (though I could be wrong).

 
 
 
 
 

Thank you all 🖤

 
 
 
 

You've made some really good points there. I checked on the internet for additional information about the issue and found most people will go along with your views on this website.


Kudakwashe Paradzayi
Extreme Programmer 👨‍💻 × Fullstack Javascript Developer 💪🏽 × Hackerman 😎