Kevin Naidoo

Scaling moderately sized websites

For the majority of web applications, scaling to the level of Facebook, Twitter, or Netflix is not a common requirement. Yet whenever the word "scale" comes up, those sites are what spring to mind, and we often make unnecessarily complex infrastructure decisions just because they work for the large tech giants.

In this post, I'm going to list a few techniques that are usually not overly complicated to implement:

  1. Load balancing - this is generally a must: always have more than one node so that traffic and workload are distributed evenly, and so that there is a fallback when one node experiences downtime. http://www.haproxy.org/ is usually a good choice for this purpose.

  2. Caching - while it's not possible to cache everything, there's always a large percentage of your website/app that can be cached for ten minutes, an hour, or a day. It all depends on the type of content, but the longer you can cache without negatively affecting content quality, the better. Good caching servers include Redis (https://redis.io/) and Memcached (https://memcached.org/); a minimal cache-aside sketch follows this list.

  3. Read/write databases - most database servers, such as MySQL and PostgreSQL, can replicate your data across multiple instances. You would usually designate one primary (write) database and one or more replicas that serve reads; see the routing sketch after the list.

  4. NoSQL - with SQL databases you can usually handle millions of queries without much trouble simply by indexing your data and optimising queries. There are cases, however, where data such as logs doesn't need complex joins, and moving it off your primary database server into a NoSQL server such as MongoDB can greatly improve the performance of your stack (a small logging example follows the list).

  5. Microservices - some processes, such as image manipulation, can be quite heavy on resources. Instead of putting that pressure on your main application's infrastructure, you can isolate the component into a separate service on a different node, or use something like serverless functions (AWS Lambda); a resizing sketch follows the list.

  6. A solid error tracking tool - e.g. https://newrelic.com/platform/application-monitoring or https://sentry.io/. When stuff breaks, you want to know as quickly as possible, because downtime at scale can be costly. With these sorts of tools you can constantly monitor your application's health and react more quickly and efficiently when something goes wrong; a minimal Sentry setup is sketched after the list.

  7. Say no to ORMs! - sure, most of the time ORMs are clean and easy to work with and scale relatively well. On high-traffic sites, however, I find you often have to drop down to raw SQL so that you can optimise queries properly (see the example after the list).

  8. SQL caching - similar to point 2 above, but here I'm referring to creating cache tables or pushing data into a NoSQL database, Elasticsearch, and so forth. The idea: the backend usually has some sort of reporting, such as a monthly sales forecast, built on complex queries that take time and resources to process. You can usually get away with running these queries in the background once or twice a day, when there's less strain on your infrastructure, and caching the result. When an admin pulls the report, the dashboard performs a much simpler read query instead of 20 people running the same heavy report over and over again (a sketch of this follows the list).

  9. Don't be afraid to spread your wings a bit. By that I mean: if your stack is built on a high-level language such as Ruby, Python, or (to some extent) PHP, it can be beneficial in terms of cost and resources to extract certain parts of your application and rewrite them in higher-performance languages such as Golang, Rust, C++, and so forth.
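
Below are a few rough Python sketches of some of these points. They're illustrations under assumptions (placeholder hostnames, keys, and table names), not drop-in code.

First, the cache-aside pattern from point 2, using redis-py. `render_homepage` and the cache key are made-up placeholders for your own expensive view or query.

```python
# Cache-aside sketch with redis-py (pip install redis).
import json

import redis

r = redis.Redis(host="localhost", port=6379, db=0)


def render_homepage():
    # Stand-in for an expensive query or template render.
    return {"items": ["a", "b", "c"]}


def get_homepage():
    cached = r.get("cache:homepage")
    if cached is not None:
        return json.loads(cached)  # cache hit: skip the expensive work

    data = render_homepage()
    r.setex("cache:homepage", 3600, json.dumps(data))  # keep for 1 hour
    return data
```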
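
For point 3, a minimal read/write routing sketch with psycopg2. The hostnames, credentials, and the `orders` table are assumptions; in a real app this routing usually lives in your framework or connection pooler.

```python
# Route writes to the primary and reads to a replica (pip install psycopg2-binary).
import psycopg2

primary = psycopg2.connect(host="db-primary", dbname="shop", user="app", password="secret")
replica = psycopg2.connect(host="db-replica", dbname="shop", user="app", password="secret")


def create_order(user_id, total):
    # Writes always go to the primary.
    with primary, primary.cursor() as cur:
        cur.execute(
            "INSERT INTO orders (user_id, total) VALUES (%s, %s) RETURNING id",
            (user_id, total),
        )
        return cur.fetchone()[0]


def list_orders(user_id):
    # Reads can be served by a replica (be mindful of replication lag).
    with replica.cursor() as cur:
        cur.execute("SELECT id, total FROM orders WHERE user_id = %s", (user_id,))
        return cur.fetchall()
```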
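
For point 4, a sketch of shipping request logs to MongoDB instead of your primary SQL database, using pymongo. The connection URI, collection, and field names are purely illustrative.

```python
# Write logs to MongoDB (pip install pymongo).
from datetime import datetime, timezone

from pymongo import DESCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")
logs = client["myapp"]["request_logs"]


def log_request(path, status, duration_ms):
    logs.insert_one({
        "path": path,
        "status": status,
        "duration_ms": duration_ms,
        "ts": datetime.now(timezone.utc),
    })


def slowest_requests(limit=10):
    # No joins needed: simple filters and sorts cover this kind of data.
    return list(logs.find().sort("duration_ms", DESCENDING).limit(limit))
```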
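
For point 5, a rough sketch of an isolated image-resizing function in the AWS Lambda style, using Pillow and boto3. The bucket layout and event shape are assumptions; a real deployment would be wired up via S3 events or an API gateway.

```python
# Resize images in a separate serverless function (pip install pillow boto3).
import io

import boto3
from PIL import Image

s3 = boto3.client("s3")


def handler(event, context):
    bucket = event["bucket"]
    key = event["key"]

    original = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    img = Image.open(io.BytesIO(original)).convert("RGB")
    img.thumbnail((800, 800))  # the heavy work happens here, not in the main app

    out = io.BytesIO()
    img.save(out, format="JPEG")
    s3.put_object(Bucket=bucket, Key=f"thumbnails/{key}", Body=out.getvalue())
    return {"thumbnail": f"thumbnails/{key}"}
```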
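
For point 6, a minimal Sentry setup for a Python app. The DSN is a placeholder (use the one from your Sentry project), and `risky_operation` is just a stand-in.

```python
# Report errors to Sentry (pip install sentry-sdk).
import sentry_sdk

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",
    traces_sample_rate=0.1,  # sample 10% of transactions for performance data
)


def risky_operation():
    raise RuntimeError("something broke")


try:
    risky_operation()
except Exception:
    sentry_sdk.capture_exception()  # the error shows up in your Sentry dashboard
    raise
```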
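
For point 7, an example of dropping from the ORM to hand-written SQL for one hot query, assuming a Django app purely for illustration (the same idea applies to any ORM). The table names and the PostgreSQL interval syntax are assumptions.

```python
# Hand-tuned SQL for a hot path instead of an ORM-generated query.
from django.db import connection


def top_products(limit=20):
    with connection.cursor() as cur:
        cur.execute(
            """
            SELECT p.id, p.name, SUM(oi.quantity) AS sold
            FROM products p
            JOIN order_items oi ON oi.product_id = p.id
            WHERE oi.created_at >= NOW() - INTERVAL '30 days'
            GROUP BY p.id, p.name
            ORDER BY sold DESC
            LIMIT %s
            """,
            [limit],
        )
        return cur.fetchall()
```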
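
Finally, for point 8, a sketch of the "cache table" idea: a background job runs the heavy reporting query once or twice a day and writes the result into a small summary table, and the dashboard only ever reads that table. Again with psycopg2, and the table and column names are made up.

```python
# Precompute a heavy report into a summary table (run the rebuild from cron).
import psycopg2

conn = psycopg2.connect(host="db-primary", dbname="shop", user="app", password="secret")


def rebuild_monthly_sales_report():
    # Run once or twice a day, during off-peak hours.
    with conn, conn.cursor() as cur:
        cur.execute("TRUNCATE monthly_sales_report")
        cur.execute(
            """
            INSERT INTO monthly_sales_report (month, total)
            SELECT date_trunc('month', created_at), SUM(total)
            FROM orders
            GROUP BY 1
            """
        )


def get_monthly_sales_report():
    # What the dashboard actually runs: a trivial read of the precomputed table.
    with conn.cursor() as cur:
        cur.execute("SELECT month, total FROM monthly_sales_report ORDER BY month")
        return cur.fetchall()
```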

In conclusion: most of the time you'll find the bottleneck is in SQL, so a good replication strategy, perhaps some NoSQL, and lots of caching can save you a ton of pain.
