Ben Halpern

Posted on Feb 2, 2017 • Edited on Apr 14, 2018

Making dev.to Incredibly fast

#webperf #meta

It makes me smile when someone raves about how fast this website loads, because that's no accident. We put a lot of effort into making it so. It is the sort of thing that usually goes unnoticed, but when your readers are developers, there's a better chance they notice and appreciate it. I have written about this in the past, but it's worth re-examining because these ideas are always evolving.

From the beginning, the idea was that if we could make everything insanely fast, every other UX consideration was going to be a lot easier going forward. Performance of a webpage is the most important UX consideration for me. Nobody wants to spend their time staring at a blank screen. Regardless of the little touches, folks are going to want to come back to a site that isn't going to waste their time on a white screen.

But speed is not free. In order to ensure performance, a project must have a set of reasonable constraints. Focusing on performance early helped map out the site's architecture to fit within the constraints that ensure great time-to-render speeds. Future decisions are less likely to have to weigh performance, because the performance consideration is center to the whole structure of the website.

Most performance issues on the web come down to understanding where the bottlenecks are. From there we need to determine what tradeoffs we can make to eliminate them. Tradeoffs are really hard to communicate between people with different concerns. I believe a big part of this project's success has been because I had a pretty good grasp of the top-to-bottom problem and how to implement it. It would have been really hard to explain to a designer right off the bat why we can't always have custom content on a page or have a designer explain to me why they need it.

Design constraints

What can we understand about the application to determine what decisions we can make about architecture? For dev.to, I took to thinking that this is a very read-heavy application, and there was a great opportunity to perform a lot of caching. It was an opportunity to be fairly minimalist in features and min-max the hell out of what's important: Content consumption. It is also vital to have an understanding of the available infrastructure tools. Don't build a project around infrastructure that nobody is providing as a reasonable service.

Fundamentally, maximizing web performance is about caching and latency mitigation. This application is a series of blog posts. There is a lot of dynamic read-write action any time a post is updated or a comment is made, etc. But most of the behavior is that of a user coming to the site, reading, or glancing over the content, and leaving. In order to best service this behavior, we make good use of edge caching. That means that if you visit this site from New York, you will be served a static, gzipped HTML page right from New York. If you visit from London, you get the London version of the cached page. If you visit from Nigeria, you get Spain's version, as that is the closest node in that case. This is a far better experience than when a site serves all its content from Utah, or something.

If you cannot serve fully static content, the same principles would apply with region-based data replication. As our the feature set of our site grows, I am excited to expand our capabilities in this sense, but starting with optimal performance in a simple way was key for a one-person operation. We use Fastly for our CDN edge-caching needs. We really like their service. Full disclosure, Fastly is now a sponsor of ours, but we made that happen because we love them, not the other way around.

Initial page request

A typical request first returns a fully-cached, pre-gzipped HTML page served from the closest Fastly node if one is available. Additional "dynamic" info, such as which comments you've liked, etc. go through a second, lightweight, request that hits the application database and returns the relevant information. A lot of this data is, itself, cached, but its cache keys are based on the specific user session, so we could not return this from the edge as of now. We plan to store more of this data on the client as well so we do not have to fetch from the source as often, but that is not implemented as of yet.

Eliminating render-blocking styles and scripts

Once a page is delivered the next important detail is rendering it as quickly as possible. This, in my opinion, is where a lot of sites, even ones serving totally static content from the edge via a CDN, often fall short. The key to improving render times is to eliminate all render-blocking latency. That means no external CSS requests, no custom fonts, and no synchronous JavaScript.

Yes, all the CSS relevant for the initial render come over the wire in a <style> tag, and no part of the page's structure is dependent on any JavaScript execution. This ensures that even when a user comes to the site with a totally clear cache, they still see all the site's content virtually instantaneously. All the site's style logic is organized into stylesheets as per convention, and make use of SCSS for convenience and cleanliness, but relevant CSS is printed to the page at runtime, and everything is, of course, cached on subsequent requests.

Images are automatically optimized for compression and served from the most efficient format depending on the browser (webp for Chrome, jpeg for Safari, etc.). This is a service provided by Cloudinary. Cloudinary also fully leverages HTTP2 where possible so we do not have to think about it.

Large cover images have their background color set to the approximate color of the image, so when the image loads, it is a nicer transition. I played around with a very low-res, blurred image inlined into the page, but that itself added a extra page weight and increased the time it took for the rendering engine to do its job. There may be other solutions for this in the future.

I am excited to play around with the newish native <picture> element, which should help us serve even smaller images. However, we have not yet made use of that.

The bottom line

Performance on the web is the most important user experience consideration. When people click a link to this site, I don't want them to habitually go to another tab while the page loads. I want them to be confident in clicking through on mobile and expecting to get a very fast page load time, even if network conditions are spotty. Try hitting a page with lots of external CSS and custom fonts on the network. You're going to have a bad time. I want our readers all over the world to have the same lightning-quick page request times regardless of their devices, networks, and distance from our main region. I want developers who put their blog posts up on our site to trust that we are continuously drilling to improve the important parts of user experience.

I hope this was helpful. Please feel free to ask questions or leave comments below.

Top comments (36)

Max Stoiber • Feb 2 '17

Have you seen Grade by @benhowdle? You might be able to show a gradient instead of a solid background color while the images are loading, which doesn't have an impact on the loading time but looks gorgeous!

Ben Halpern • Feb 2 '17

Interesting. But is that data available before the image is loaded? The background color example here is to provide a color background on initial render, because the image won't show up at first.

But the gradient look still could be the right call, it would just be done on the sever once ahead of time instead of calculated on the client. That is the general idea with most of the app's structure. More work at write time, less work at read time.

Max Stoiber • Feb 2 '17

Yes exactly, I didn't mean to use that exact library client-side, but to take inspiration from the algorithm and use it on the server.

It's certainly a lot nicer than just a plain color!

Ben Halpern • Feb 2 '17

Ah yes. Right on, good call.

marcellothearcane • Feb 22 '21

Now moved to benhowdle89.github.io/grade/

Warren • Sep 6 '17

And I found something interesting. Hovering to any dev.to links loads the page contents even if you didn't click it.

Andrii Los • Nov 16 '17

Not only. I think it also uses Service Workers for that. And also have some delay before it will actually start to fetch because if you move around link very fast you will see that after every request is red hence canceled. While if you move and keep your mouse for another ms more, it will end up in success :)

So it looks like that.
prntscr.com/hb4pxp

kaelscion • Oct 29 '18 • Edited

See, this is why I love this site. Not because of the performance per sey, but because the way this performance is achieved is just thrown out there for all of us to enjoy. Most times you have to ask the team behind an open source project how they did stuff, and the answer is usually "Well, only our sponsors and Patreon supporters are privy to that information wink wink hint hint". But with @ben and the team it's like "Hey, our site is fast as balls because (detailed description of all steps). Cheers." Like it's no big deal. Gotta love that kind of transparency! 😁😁

swyx • Feb 3 '18

i am very interested in this topic but felt that this article was a bit too high level. i was left wanting more detail. in particular would be very interested in objective tools to measure and identify bottlenecks in page speed. what websites do you use? what are the numbers that really matter? would love more on that. maybe pick a website you use and do a "what i would do if i were you" post?

Attila Szeremi⚡ • Aug 27 '18

Don't inline styles make HTML load slower on uncached load of the user logged-in view? I believe you'd be able to rely on your CDN less, and have to do more work on your backend in this case.
And isn't HTTP2 meant to load CSS for you as if CSS were inlined, with the benefit of being able to cache the CSS and HTML separately?

Ben Halpern • Aug 27 '18

Yes and yes to an extent.

It's possible that we should remove this strategy for logged-in users. A huge amount of the traffic is still fresh users via Google or elsewhere.

HTTP2 Server Push would possibly help in the way you're describing but it's still not clear whether that tech is all the way there to avoid headaches.

I think in the next little while we will see how we plan to evolve our strategy.

Attila Szeremi⚡ • Aug 27 '18

I'm saying the first thing because I'm surprised the website is so blazing fast even if I'm logged in. Okay I see that if I disable JS, there's some basic stuff loaded that probably would be the safe for all users, but even so, the rest of it seems to load so fast; the database calls in the backend (on first look there didn't seem to be a lot of caching there, but I might have seen wrong), and also the fact that Ruby is quite a slow language as far as I know. I tend to have speed issues running Laravel, but then again I'm a cheapskate that runs it on $5-10 Digital Ocean servers :P

Steve Steiner • Mar 10 '17

I apologize if this is answered somewhere else but what do you use to serve the blog, blogging software-wise?

Lloyd Hanneman • Apr 26 '17

also wondering this!

Jess Lee • May 22 '17

We built our own CMS 🙃

Ben Halpern • May 24 '17

With Rails, specifically.

Rich Smith • May 24 '17

Even more impressive that it is as fast as it is then!

Supun Kavinda • Feb 6 '21

Hi Ben, you mentioned "A typical request first returns a fully-cached, pre-gzipped HTML page served from the closest Fastly node if one is available."

How do you make Fastly cache this response? (Headers?)
And, let's say you cached a dev.to post. Then, the author edits it. How do you update the cached page?

Thanks for the great article. I'm waiting for a response :)

Pothi Kalimuthu • Feb 2 '17

I have wondered how dev.to could be this fast. Thanks for sharing insights on how you approached the whole thing!

Ben Halpern • Feb 2 '17

Nice! There is more I didn't cover here, that is mentioned in some of those older posts. Of course, this covers some of the stuff not covered before.

I'll write about this more often. I'm constantly looking for new ways to improve, or at least new ways to think about and express why this stuff is important.