Our core beliefs are the principles that we've learned through experience or subscribed to from others in order to quickly make decisions and value judgements. They occasionally need to be re-evaluated, but without them, we'd spend all of our time analyzing instead of acting.
Over time, as a software developer, you'll adopt and develop software development core beliefs. It's important to be conscious of them, so that you can contextualize their application, debate them with your peers, and know when to re-evaluate them.
Here are some of the web architecture core beliefs that I've developed and adopted in my 20 years as a software developer. Each of these could be a blog post, and probably already are. My former colleagues may be able to remember specific incidents and hard lessons learned that cemented some of these beliefs. 😬
Core beliefs, be they about software or life, are personal; I'm not asking you to agree with or adopt any of these, but if something rings true, certainly feel free.
> Software that is not being actively developed is dead, and it's resource intensive to resurrect dead software. If your process involves doing this repeatedly, you're being hugely inefficient.

(05:44 AM - 04 Jan 2021)
In the web world, software ages very quickly. Dependencies are out of date in months, and with a culture of small packages, your software may have a lot of dependencies.
Every time you set a piece of software down, it's going out of date. Bit rot is real. The platform that you built it for moves on, your embedded dependencies have software vulnerabilities published, your services change APIs, etc. It'll take some effort to bring your software back up to date the next time you need to work on it.
When choosing dependencies, it's usually a bad idea in the web world to use anything that hasn't been updated in the last 6 months. You've been warned.
Are there exceptions? Sure. What if your software targets a stable platform, has few or no dependencies, and is very small? Again, context is king.
The KISS principle ("keep it simple, stupid") has been around for decades. It states that complicated systems are much more prone to failure than simple ones.
There are a lot of these software principles. You may want to adopt some of them if they ring true to you.
I also like Gall's Law: a complex system that works is invariably found to have evolved from a simple system that worked.
The thing is, we're engineers, and we over-do it sometimes. Over-engineering is always a temptation for software developers who love their craft. It's probably the most common software development sin. There's a time and a place, which we'll get to in a bit.
Client-side web applications have to hold some data, and the client is almost never the source of truth. That means that any data you maintain in a client is part of a cache and should be treated that way.
> There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.

Leon Bambrick (14:20 - 01 Jan 2010)
You have to be very deliberate with cached data. Here are some guidelines:
- Only cache data that you're very likely to use.
- Don't cache data that the user doesn't have permissions to see.
- Avoid making decisions based on cached data.
- Invalidate cached data when the source of truth updates.
- Don't update the cache from user actions without marking it as pending.
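The last two guidelines can be sketched as a tiny client-side store that treats server data as a cache. This is a minimal illustration, not a real library; the names (`ClientCache`, `setPending`, and so on) are made up for the example.

```typescript
// Hypothetical client-side store that treats all server data as a cache.
type CacheEntry<T> = {
  value: T;
  pending: boolean; // true while an optimistic update awaits server confirmation
  fetchedAt: number;
};

class ClientCache<T> {
  private entries = new Map<string, CacheEntry<T>>();

  // Store data confirmed by the source of truth.
  set(key: string, value: T): void {
    this.entries.set(key, { value, pending: false, fetchedAt: Date.now() });
  }

  // Optimistic update from a user action: marked pending until confirmed.
  setPending(key: string, value: T): void {
    this.entries.set(key, { value, pending: true, fetchedAt: Date.now() });
  }

  // The server accepted the change; the entry is no longer speculative.
  confirm(key: string): void {
    const entry = this.entries.get(key);
    if (entry) entry.pending = false;
  }

  // The source of truth changed (e.g. a server push); drop the stale copy.
  invalidate(key: string): void {
    this.entries.delete(key);
  }

  get(key: string): CacheEntry<T> | undefined {
    return this.entries.get(key);
  }
}
```

The `pending` flag is the point: UI code can render optimistic data immediately while still knowing it hasn't been confirmed, and invalidation is an explicit operation rather than something that happens by accident.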
Dealing properly with client state data is a very large topic, but thinking of it as a cache first can help you make the right choices for how to use and maintain that data.
When writing a library or API, 100% test coverage should be a priority. Yes, it's doable. In the worst case, it's time-neutral, and the benefits are great and sometimes non-obvious.
> @fritzy Oh also: 100% test coverage is worth the cost. No, it's not perfect, and yes you'll still ship bugs, but it makes fixing those bugs possible, because you might know what the code does.
>
> Good code is easy to test fully. Being hard to test is a smell.

isaacs (@izs) (07:24 AM - 22 Jan 2021)
I'm not an expert when it comes to testing on the client-side of a web application, but I have seen significant coverage of the data-management and component side done to great effect.
Tests not only validate that your code is following the intended purpose, they also document that intended purpose in great detail. They give future contributors a starting point when looking at a piece of functionality -- they can use your tests to suss out how to generate the required inputs, where in the code functionality lives, and exactly what your expectations for the code were. Good test coverage reduces the cognitive overhead of someone else working on your code.
Tests give you a solid foundation when refactoring -- if your coverage is good enough, you'll know where you need to keep working and when you're done refactoring. Tests give you the confidence to make big changes.
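As an illustration of "good code is easy to test fully," here's a small pure function (entirely hypothetical) with two branches. Covering it completely takes only a handful of assertions: one for the error path, a couple for the happy path.

```typescript
// Hypothetical example: a tiny pure function that's trivial to cover fully.
function slugify(title: string): string {
  const trimmed = title.trim();
  if (trimmed === "") {
    throw new Error("title must not be empty");
  }
  return trimmed
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics to dashes
    .replace(/^-|-$/g, "");      // strip any leading/trailing dash
}
```

Pure functions like this, with no hidden state or I/O, are the easy case; when a function resists this kind of testing, that's the smell the tweet above is talking about.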
All of your business logic for a web app should be in the server. UX logic can live on the client, and data management logic can live in the database, but you're asking for trouble if you do any business logic anywhere except your server process. Ideally, it should be in its own layer within that process.
When you spread business logic out, it becomes hard to find and maintain, and you greatly increase the complexity of your app. You're also opening yourself up to security vulnerabilities (authorization logic living in a web client is a vulnerability in itself).
So what is business logic? Any logic that changes or accesses data directly related to the purpose of the application.
It's okay to use a stored procedure for actions that are not specific to the application, like maintenance, migrations, sharding, etc. It's okay to have client logic for displaying views and generating API calls. If you're checking to see whether a user can make an API call before making the API call, that may be a sign that something is amiss.
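Here's a minimal sketch of what "its own layer" can look like: a plain server-side function that enforces the rule, independent of any route handler or UI check. All of the names (`publishArticle`, `Role`, and so on) are invented for the example.

```typescript
// Hypothetical business-logic layer function. The authorization rule is
// business logic, so it lives here -- not in the client, not in the database.
type Role = "viewer" | "editor";
type User = { id: string; role: Role };
type Article = { id: string; published: boolean; authorId: string };

function publishArticle(user: User, article: Article): Article {
  // Only editors, or the article's own author, may publish.
  if (user.role !== "editor" && user.id !== article.authorId) {
    throw new Error("not allowed to publish this article");
  }
  return { ...article, published: true };
}
```

The client is free to hide the "Publish" button for viewers as a UX nicety, but the check that actually matters runs in this one server-side function, where it can be tested directly.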
Keeping all of the business logic on one server layer makes it more testable, makes it easier to swap out database and API layers, and reduces the number of things a developer has to think about while making changes or debugging.
Any code that you're not using in your project is technical debt, and any API that you haven't tested is a vulnerability. It's a trope at this point for developers to brag, with a wink, about how many lines of code they've deleted rather than how many they've written. It goes along with the KISS principle. Don't be overly clever in order to reduce your line count, but generally less is more.
One of the reasons I generally don't think GraphQL is a great solution is that you're deploying more capability than you're using. You're giving clients and users a full query language, of which they'll use only a small portion in most cases. When it doesn't work, or when it surfaces a denial-of-service vulnerability because someone can write a query that performs poorly, that's a bug you're going to have to fix.
There are valid use cases for exposing a query interface like GraphQL, and there are ways of limiting its surface area and getting complete test coverage, but all of those things take considered effort. Here is an Apollo GraphQL Guide for dealing with some of these problems.
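One of the standard mitigations is limiting query depth (libraries such as graphql-depth-limit exist for this). Stripped of any real GraphQL machinery, the idea is just recursion over the query's selection tree; the `Selection` shape below is a stand-in for a parsed query, not an actual GraphQL AST.

```typescript
// Illustrative depth limit for a query tree. "Selection" is a simplified
// stand-in for a parsed GraphQL selection set.
type Selection = { name: string; children: Selection[] };

function depth(sel: Selection): number {
  if (sel.children.length === 0) return 1;
  return 1 + Math.max(...sel.children.map(depth));
}

// Reject queries that nest deeper than the server is willing to execute.
function enforceMaxDepth(root: Selection, max: number): void {
  const d = depth(root);
  if (d > max) {
    throw new Error(`query depth ${d} exceeds limit ${max}`);
  }
}
```

This kind of guard narrows the surface area you're deploying, which is exactly the trade-off the paragraph above is about: a full query language is capability you must now defend.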
Every line of code and every piece of functionality you provide is an opportunity for bugs and increases maintenance. If you and your users aren't going to use it, don't include it.
> If you're developing an MVP, only use technologies that you're familiar with, only create features that are needed for one user persona, & don't create abstractions or clever logic to ease future dev. The problems you create are good problems to have if you're successful.

(05:00 AM - 04 Jan 2021)
This Twitter thread pretty much covers my view here. It's a bit of a hot take, but it goes along with some of my other core beliefs and the KISS principle.
Basically, the time for innovation is not during the initial version of a project. You should do your experimentation separately, either as isolated prototypes and experiments, or incrementally after you prove to the business that the MVP has value.
You should write code for people first, not computers. It's important that team members and your future self be able to easily see what your code is meant to do. Follow language idioms, don't take shortcuts just to shave off a few lines, and don't try to combine your logic to fulfill many purposes in one clever bit of code.
Instead, clearly handle your logic in an idiomatic way that walks through the problem step by step. Simplicity is handling the core logic and edge cases with clear code. Don't go out of your way to save keystrokes; typing is not the time-consuming part of writing software, thinking is. You can always go back and make things more terse later if you've thought of a clearer way that handles more edge cases.
Your goal is to make your code appear pedestrian, despite solving the problems brilliantly. This may take some iteration and a lot of thinking. You'll know you're doing well if both senior and junior developers understand your code and pull requests at a glance. Make sure to give similar feedback when reviewing pull requests for other developers.
For example, creating a clever mix-in function for a language that only supports direct inheritance will make the code hard to follow, and make the software more difficult to debug. Keep things simple.
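A small (and admittedly contrived) contrast of the same logic written densely versus written to be read. Both functions are correct; the second is the "pedestrian" version that walks through the problem.

```typescript
type LineItem = { price: number; qty: number };

// Dense: correct, but the reader has to unpack the reduce in their head.
const totalClever = (items: LineItem[]): number =>
  items.reduce((t, i) => t + i.price * i.qty, 0);

// Pedestrian: the same logic, stepped through explicitly.
function totalClear(items: LineItem[]): number {
  let total = 0;
  for (const item of items) {
    total += item.price * item.qty; // each line's contribution to the order
  }
  return total;
}
```

At this scale the difference is small; the point is the habit. As the logic grows edge cases, the step-by-step version stays debuggable while the one-liner turns into a puzzle.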
Warning: This section is about the CAP Theorem, so feel free to skip it. Do not operate heavy machinery while taking CAP Theorem.
Multi-master replication will always have compromises. The CAP theorem states that between **C**onsistency, **A**vailability, and **P**artition tolerance, you can never maintain all three at once.
Well-implemented distributed databases document their claims and don't claim to violate the CAP theorem. These claims are typically AP (available and partition-tolerant) or CP (consistent and partition-tolerant). On a practical level, all distributed databases aim to be partition-tolerant, safely recovering from node and network failures.
A partition is when one or more nodes are either down or can't communicate with all other nodes. These partitions can happen in unexpected ways, where the boundaries of communication are different for any given node.
An AP database keeps data available during network partitions, but can produce conflicts and inconsistencies such as dangling foreign references or incomplete transactions. A CP system does not produce conflicts (usually; check your guarantees) and keeps data consistent, but some reads or writes can be blocked during a network partition. Relational databases are typically CP, while document stores are more often AP.
Conflict resolution is a business logic problem, and needs to be solved specifically in the context of the data being stored. If two writers are editing the same article during a network partition, a last-write-wins strategy could delete one editor's changes. In that particular case, you may want to resolve the conflicts manually with a document-merge strategy. An AP database often resolves conflicts with a last-write-wins strategy by default, so be mindful.
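Last-write-wins is simple enough to sketch in a few lines, which is also what makes its failure mode easy to see: the losing revision is silently discarded.

```typescript
// Last-write-wins, the default conflict strategy in many AP stores,
// as a plain function. The losing revision's text is silently dropped --
// fine for some data, data loss for collaborative editing.
type Revision = { text: string; updatedAt: number };

function lastWriteWins(a: Revision, b: Revision): Revision {
  return a.updatedAt >= b.updatedAt ? a : b;
}
```

Whether that silent drop is acceptable is exactly the business-logic question: for a "last seen" timestamp it's fine, for an article body you probably want a merge or a manual resolution step instead.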
MongoDB famously had ridiculous performance and replication claims when it first launched in 2009. It took quite a few years to clean up. I learned a lot in those days from Aphyr's posts, exploring database clustering claims with his testing tool, Jepsen. Beware of marketing-driven engineering products!
In short, when selecting a distributed database product, make sure you understand their claims. There's no magic bullet when it comes to horizontal scaling.
You can scale pretty high with a single application server and a single database server, but you should also have a plan for scaling up later. Implementing a scaling solution during development of an MVP introduces unnecessary complexity (violating several of my previous core beliefs) and is an example of premature optimization.
Depending on your business needs, you could scale horizontally, making it so that you could have any number of application servers and database servers (see the multi-master core belief above -- this can be complicated to get right). Horizontal scaling makes sense if you need to support many writes per second, and your users and data aren't logically siloed.
You could also scale through siloed sharding. If your product primarily manages logically grouped users (a common case for sharding), each group can have a dedicated set of application servers and databases. Interactions between user groups are limited. This is one of the easier ways to scale.
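The routing for siloed sharding can be as small as a stable hash from a group to a shard. This is a toy sketch with invented names, not a production hashing scheme (real systems usually want consistent hashing or an explicit lookup table so shards can be rebalanced).

```typescript
// Hypothetical group-to-shard router for siloed sharding. A stable hash of
// the group id sends all of a group's users and data to the same shard.
function shardFor(groupId: string, shardCount: number): number {
  let hash = 0;
  for (const ch of groupId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple stable string hash
  }
  return hash % shardCount;
}
```

Because the hash is deterministic, every request for a given group lands on the same set of servers, which is what keeps the silos independent.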
If your user numbers are capped, like in an IT intranet application, simply writing an efficient application is enough to scale to tens of thousands of active users. You should still design your application in such a way that it doesn't matter how many instances you run at once, for reasons of uptime and staged releases.
Keep in mind, running multiple instances of your API against a single database server (or a single master with many read replicas) will only scale to a point. Eventually you'll exceed your ability to write data to the database and will need to shard your users/data or use a distributed database.
Knowing how you're going to scale up ahead of time will help you make implementation decisions during the MVP stage of your application. Remember, having to scale up is a good problem to have, and you'll have more resources to implement your plan later.
At one of my first full-time programming jobs 20 years ago, I spoke with a senior developer about a problem I was having configuring the project for a niche QA use case. They pointed me to a CLI program they wrote to manage that configuration. He was the only one using this tool so far, so I was shocked to see how nice it was. He'd taken the time to add self-documenting command-line arguments and a clear README that covered its purpose and usage.
Why had he bothered to take the time to make this script so easy to use when he was the only user? Because developer experience matters, even just for himself, and he suspected others might find it useful eventually. At the time, I would have just slapped together a script, made all of the options constants that I would have to manually edit each time, and moved on.
It left an impression, and I began taking more care in the experience of the software I was writing, because it usually paid dividends in time. I was never embarrassed to share my little tools and libraries with others.
Years later, when I started regularly writing open source code, I took the time to see what successful open source projects included in their README. It's made a huge difference for my career.
When I started this blog post, I asked people what their core software beliefs were. I got some pretty good answers. Feel free to send me any additional thoughts or feedback below or at @fritzy. For more, follow me here, on Twitter, and on github.com/fritzy.
In this post, I set out to explore and share some of my web architecture beliefs. Some of them are objective fact (like multi-master databases), some of them are value statements (DX matters), but they're all things that I keep in mind in order to quickly architect quality software.
A similar article was published during the writing of this. There's a lot there that I agree with, but certainly not all of it. I don't endorse all of their views, but it's the kind of exercise that I'm encouraging with this article.