ymc9 for ZenStack

For Most of Us, Simplification Rules

Late one night in February 2018, after switching all our customers' workloads from the old data processing system to the new one, I felt an overwhelming sense of relief. Despite the challenges that lay ahead in monitoring stability and fixing bugs, the feeling that we were now standing on a greatly simplified infrastructure was exhilarating.

You would have shared my euphoria if you knew how challenging our old platform was. It consisted of a homebrewed Erlang cluster ingesting data into HDFS, and a multi-pass Hadoop ETL that crunched raw data into an intermediary format and loaded it into a manually sharded MySQL cluster of over 60 instances. At query time, a computation cluster used its magic to combine SQL queries with distributed in-memory map-reduce to serve OLAP requests. Anything going wrong along the chain turned into a nightmare and meant unpleasant overtime work to backtrack and fix it.

The new data platform replaced all this nonsense with a homogeneous Elasticsearch cluster of 4x nodes. Data was ingested straight into it after minimal processing, and most queries were handled natively by the cluster without external computation. It was simpler, faster, and much more reliable.

This is not a story about software architecture and tech stack choices. Instead, it's a personal reflection on my five years as CTO of a SaaS company. What my team and I suffered through, strove for, and achieved made me increasingly believe in one thing - for most of us, simplification rules.


Complexity: A Full-Stack Pain from Code Quality to Company Culture

You know what? If you have a messy system, your customers feel it right away. A few days ago, when I checked out of a hotel, the front desk lady had great trouble charging my credit card on file. Since she seemed so helpless, I took a peek at her screen and saw a colorful text-based UI that looked like it was developed at least 20 years ago. No wonder charging a card was difficult. One of the top reasons such systems survive to this day is that they're so complex that nobody dares to change them.

Our PMs used to struggle constantly with developers over strange constraints that hindered their ability to design new features. This struggle was contagious and quickly spread to the marketing, sales, and customer success teams. Eventually, even some customers formed the perception that our system was unwieldy. Excessive complexity is like dense air pumped into an organization - invisible, but everybody feels it and is slowed down by it.

A funny side effect is that the more complexity you have, the more time you spend in meetings dealing with its consequences, and the less time you have left to fix the root cause - a death spiral towards a dysfunctional organization.

It's mind-boggling to think that Facebook has over 70K employees - as if building social media were like manufacturing cars, where the labor cost is proportional to the scale. It's also startling that Twitter still stands after a 90% RIF. That is very unfortunate for the people impacted, but it demonstrates how much potential every organization has for reducing complexity.

Simplification Comes from Constant Challenging

I hope never to use microservices again.

One Saturday morning roughly four years ago, one of my dev leads and I met to discuss redesigning an old system. He drew a pretty architecture diagram on the whiteboard in one go, consisting of many independent services and pieces of middleware. What I said was:

"Dude, I would hire you (again) if this were an interview. But we're doing real work here. You're showing me something that looks universally correct. Are you convinced we need all these goodies? We'll end up having more subsystems than engineers!"

Adopting a complex methodology without internalizing it is the No. 1 source of architectural disaster. We're so unaccustomed to asking what we can subtract instead of add.

Our product used to be built on Spring Boot microservices. The services were deployed on the container orchestration platform Rancher. Spring Boot has its de facto official "discovery and load balancing service" called Eureka, which caused us so much pain because it had a bad memory and constantly "forgot" about the services it managed.

What's interesting is that Rancher, as an orchestration platform, has an internal DNS service, and DNS has been used for discovery and load balancing for decades. So I tried hard to convince my dev lead that we could live without Eureka, and it worked very well after we made the change.

Unfortunately, it was just a band-aid on a bigger mess. Microservices were a mistake. The pain, confusion, and useless debate they brought far exceeded the gain. After switching to far fewer coarse-grained services, my team's productivity reached a whole new level.
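
For readers wondering what "living without Eureka" can look like, here is a rough sketch of DNS-based discovery with client-side round-robin. It is not our actual code: the helper names (`pickAddress`, `createServiceLocator`) and the hostname in the usage comment are made up for illustration, and it assumes the platform's internal DNS returns one A record per healthy instance of a service.

```typescript
import { promises as dns } from "node:dns";

// A resolver takes a hostname and returns its A-record addresses.
type Resolver = (hostname: string) => Promise<string[]>;

// Pure round-robin choice: the n-th call gets the n-th address, wrapping around.
function pickAddress(addresses: string[], turn: number): string {
  if (addresses.length === 0) {
    throw new Error("DNS returned no addresses");
  }
  // Sort first so the rotation is stable even if the DNS server shuffles records.
  const sorted = [...addresses].sort();
  return sorted[turn % sorted.length];
}

// Hypothetical helper: returns a function that resolves the service name on
// every call (no client-side caching) and rotates through the instances.
function createServiceLocator(
  hostname: string,
  resolve: Resolver = (h) => dns.resolve4(h),
): () => Promise<string> {
  let turn = 0;
  return async () => {
    const addresses = await resolve(hostname);
    return pickAddress(addresses, turn++);
  };
}

// Usage (the hostname is an assumption, not a real naming scheme):
//   const locateOrders = createServiceLocator("orders.internal");
//   const ip = await locateOrders();
```

The point of the sketch is how little is needed: no registry to run, no heartbeats to lose, just a lookup against something the platform already provides. A real setup would still want timeouts and retries around the call.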

People often believe that complexity comes from overthinking. It's quite the contrary. It's actually the result of laziness, the kind of laziness that makes people follow whatever others are doing or saying. Achieving simplicity, on the other hand, takes courage, out-of-the-box thinking, and diligent work.

Your Team Can Do Fewer Things Much Better

To build a product efficiently, you need to let your team's skills consolidate into a small number of really essential things. After a few years, for our backend team, these skills became:

  • Node.js + TypeScript
  • MySQL + ORM
  • Elasticsearch

These are not the coolest technologies on the market, but when your team as a whole is proficient in these few things, your company will have the confidence to tackle most new challenges and consistently deliver good quality. By focusing on fewer things, you also get the opportunity to make long-term investments in deeper optimization and custom tooling, which tend to yield significant returns down the road.


Software engineering is the art of managing complexity. Good engineering leaders chase simplification relentlessly.


P.S. We're building ZenStack, a toolkit that supercharges Prisma ORM with a powerful access control layer and unleashes its full potential for full-stack development.
