Spencer Pauly

Posted on Aug 18, 2020 • Edited on Apr 20, 2021

Why I switched away from Google Firestore

#startup #database #javascript #node

Brought to you by engine.so - a tool to instantly create a public self-service knowledge base for your customers with Notion.

Google Firestore is Google's proprietary NoSQL document database. Paired with the rest of the Firebase suite, such as Cloud Functions, Firebase Auth, and Firebase Storage, it can look like a pretty attractive tech stack for startups or solo developers looking to get an app up and running quickly.
This is exactly what I thought 9 months ago when choosing a tech stack for my mobile app. Firestore had some advantages that I was attracted to. It had a generous free tier, an auto-scaling NoSQL data model, and some sweet integrations with the other Firebase services. If you feel like you're in this happy bubble with this technology now, here's a word of advice…
Make sure you're aware of the downsides of Firestore.

The Three Big Reasons I Won't Use Firestore Again

1. Proprietary Database

We've all heard the term "vendor lock-in". Well, Firestore is the epitome of this idea. If you think this won't be an issue because your product is simple or small, I'll tell you right now that even with the simplest apps Firestore's vendor lock-in creeps up. I experienced this when trying to do the simple task of deploying a DEV and a PROD version of the database. This is a huge challenge with Firestore.

The first hurdle you run into is the fact that you can't have multiple Firestore databases associated with a project. Therefore you have to create separate project-dev and project-prod projects. This isn't too hard initially, and is probably a good design pattern in general, but now your development experience gets 2x as complex. You have to decide whether each project gets its own Firebase Auth, and what about Cloud Functions, storage buckets, etc.? And there are no tools to automate any of this deployment, so if you want to just "copy over" your database data, functions, and auth users to production, you have to do that manually. Some of these operations can be done through the Firebase CLI, but the more important ones, like migrating data, can't be.
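One way to keep the two-project split manageable in application code is to resolve the project settings in a single place. This is only a minimal sketch: the project IDs, bucket names, and the `firebaseConfigFor` helper are made up for illustration and are not part of any Firebase API.

```javascript
// Hypothetical per-environment settings; the project IDs are placeholders.
const ENVIRONMENTS = new Map([
  ["dev", { projectId: "myapp-dev", storageBucket: "myapp-dev.appspot.com" }],
  ["prod", { projectId: "myapp-prod", storageBucket: "myapp-prod.appspot.com" }],
]);

// Resolve the settings for an environment name, failing loudly on typos
// so a dev build can never silently fall back to production.
function firebaseConfigFor(envName) {
  const config = ENVIRONMENTS.get(envName);
  if (!config) {
    throw new Error(`Unknown environment: ${envName}`);
  }
  return config;
}

console.log(firebaseConfigFor("dev").projectId);
console.log(firebaseConfigFor("prod").projectId);
```

The resolved object would then be handed to whatever initializes the SDK. The sketch only covers choosing the project; it does nothing for copying data, functions, or auth users between projects, which is the part that stays manual.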

Assuming you get production and development environments set up, now you have 20 other issues that crop up. How do you do automated backups? How do you export data from one database to another in an automated way to refresh staging servers? How can you get a local version of this database running to test with? The answer to all these questions is that… it's complicated. These more complicated use cases are hard to do because this database isn't open source, so there's no community around it making tools for these things.

Some of these issues aren't unique to Firestore; they apply to any proprietary database vendor. This is why I'll never choose a proprietary database again. There are times to try out the latest and greatest thing, but when it comes to the integrity, security, and accessibility of your company's most important asset (your data), I'd say 10 times out of 10 that it's a better choice to use an open-source solution that's been battle-tested.

2. Firestore Optimizes for Itself, NOT You

This part really annoyed me while using Firestore. It's the fact that Firestore has two features that are consistently at odds with each other.

Firestore charges per document when you read/write to the database.
Firestore's querying abilities are very primitive, so more complicated filtering, sorting, or merging of data MUST be done client-side.

This deadly combination means that if you have to do a more complicated query (which is almost unavoidable), you will need to overfetch the data, and then filter it in a Cloud Function or on the client side before using it. This isn't just wasteful of network bandwidth and client-side processing time; because of Firestore's pricing model, it ends up costing you more money as well. The biggest result I've seen from this:

My database collections and available query operations end up defining what features I implement in my product, rather than my customers deciding it.
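To make the cost concrete, here is a rough simulation in plain JavaScript (no Firebase SDK involved). It assumes a hypothetical `reviews` collection and a substring filter the database can't express, so every document is fetched and counted as a billed read before the filtering happens in my own code. The `fetchAll` helper is a stand-in I made up, not a real API.

```javascript
// Hypothetical data: 1,000 reviews, of which only a few match the filter.
const reviews = Array.from({ length: 1000 }, (_, i) => ({
  id: `review-${i}`,
  rating: (i % 5) + 1,
  text: i % 50 === 0 ? "great trail, muddy in spring" : "nice hike",
}));

// Stand-in for a per-document billed fetch:
// every returned document counts as one read.
function fetchAll(collection) {
  return { docs: collection, billedReads: collection.length };
}

// A substring match is the kind of filter that has to run after the fetch.
const { docs, billedReads } = fetchAll(reviews);
const matches = docs.filter((r) => r.rating === 1 && r.text.includes("muddy"));

console.log(`billed reads: ${billedReads}, documents actually used: ${matches.length}`);
// prints "billed reads: 1000, documents actually used: 20"
```

In this made-up data set I pay for 1,000 reads to use 20 documents, and the gap only grows as the collection grows.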

Now I'm going to play devil's advocate for a second, because I understand why Firestore is set up this way. Firestore is built for one purpose: to make it very difficult for you to write a bad query. Query time scales with the number of results returned rather than the size of the collection, so almost every query you can make stays fast no matter how much data you store. This is great because it means your database processing time is short and clients are getting results very quickly. But…

Did you catch that?

Firestore is built to make processing cheap on the server side. But guess what? You pay per document, so whether a query takes 1ms or 100ms doesn't matter to your wallet. This means that Firestore is optimizing to lower its own costs, not yours. And since you have to overfetch data and manually filter it on the client side, you actually end up with a more expensive and slower query overall. This is why I moved away from Firestore. After seeing that this was their business model, it proved to me that there's no way I want to try to scale with this product.

3. A NoSQL Database Likely Isn't Right For You

One thing that initially attracted me to Firestore was its NoSQL data model. There are other options for NoSQL, such as MongoDB or AWS DynamoDB, but Firestore provided a really nice auto-scaling out-of-the-box solution that I liked right away. Until I didn't like it anymore.

You see, most data for the typical web or mobile application is going to be highly relational. A typical application will probably have users, as well as things that relate to the users in some way. And these things likely relate to other things as well. Etc, etc. And they might be viewed in a list, or indexed, or queried to see all the things that a user has created. For managing these basic use cases, Firestore is okay, but once it gets more complicated Firestore breaks down.

The NoSQL solution to these problems includes things like data duplication, fan-out writes, etc. These principles take more development time to implement than having a SQL database to begin with. If you're looking towards Firestore as a solution, you're probably looking for something that saves development time, because that's Firebase's selling point, but Firestore is more akin to taking on time-debt that you have to pay off later. To illustrate some really painful hurdles I had to develop around, I'll give some quick examples from my project:

Users can create reviews. A user's profile picture and username are attached to each review they create. This is needed because the frontend shows a list of reviews. If we had to fetch all the reviews and then make a second query for each review to get the user's profile picture and username, that 1 query would become N+1 queries. This is called the N+1 problem. Then a user changes their name. Now you have to code a Cloud Function that notices that change, searches through every review (could be millions), and changes that user's display name on each one that carries their old name. This is a lot of programming for something that in a SQL database would be a feature out-of-the-box.
Users need to choose a username when they sign up. I want to make sure two users don't have the same username (ignoring capitalization). The solution to this problem in a Firestore NoSQL way? I had to add a lowercaseUsername field to every single user. When a user wants to change their username, it converts it to lowercase, then queries if it exists already and if not it changes their username. This is a total pain if your app is in production already, because backfilling every user document to add a lowercaseUsername field requires development time to write a single-use function to execute this migration. I found I had to backfill data all the time and eventually it just got too hard to work with.
Users can follow Trails. Trails can have multiple Users following them. This creates a many-to-many relationship between these objects. Managing this in Firestore was beyond tedious. It's somewhat straightforward when you only have to think about creating data, but having to deal with updating and deleting it too creates a ton of complexity.
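The first example above can be sketched with plain in-memory data standing in for collections. The field names and helper functions are made up for illustration, and this is not Firestore SDK code. With the username copied onto each review, a rename has to touch every review that user wrote; with a reference to the user, as in a SQL schema, it is one write and the name is joined in at read time.

```javascript
// Denormalized shape: the author's name is copied onto every review.
const denormalizedReviews = [
  { id: "r1", userId: "u1", username: "spencer", body: "Great trail" },
  { id: "r2", userId: "u2", username: "dana", body: "Too steep" },
  { id: "r3", userId: "u1", username: "spencer", body: "Nice views" },
];

// Fan-out rename: rewrite every review carrying the copied name.
// Returns the number of review documents that had to be written.
function renameDenormalized(reviews, userId, newName) {
  let writes = 0;
  for (const review of reviews) {
    if (review.userId === userId) {
      review.username = newName;
      writes += 1;
    }
  }
  return writes;
}

// Normalized shape: reviews only reference the user.
const users = new Map([
  ["u1", { username: "spencer" }],
  ["u2", { username: "dana" }],
]);
const normalizedReviews = denormalizedReviews.map(({ id, userId, body }) => ({ id, userId, body }));

// Rename is a single write to the user record.
function renameNormalized(userTable, userId, newName) {
  userTable.get(userId).username = newName;
  return 1;
}

// The name is looked up (joined) when the list is read.
function listReviewsWithAuthors(reviews, userTable) {
  return reviews.map((r) => ({ ...r, username: userTable.get(r.userId).username }));
}

const fanOutWrites = renameDenormalized(denormalizedReviews, "u1", "spencerp");
const singleWrite = renameNormalized(users, "u1", "spencerp");
console.log({ fanOutWrites, singleWrite });
console.log(listReviewsWithAuthors(normalizedReviews, users));
```

With three reviews the difference is 2 writes versus 1. With millions of reviews, the fan-out version is the Cloud Function I described above.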

As you can see, there are so many situations where a NoSQL database trips you up and turns into a development time-sink. SQL databases are so scalable and powerful now that they will serve your needs much better. And guess what? If you want the best of both worlds you can use BOTH. Put your relational data in a SQL database, and put your non-relational data (like millions of live chat messages, for example) in a NoSQL database, and get the benefits of both with the tradeoffs of neither.

Is Firestore Right for You?

I still like a couple things about Firestore. Their client SDK that managed client-side offline-support was convenient, and for querying simple data that's non-relational in nature I would still consider it. But unless I know my project has a fixed completion date and won't run into any of the limitations mentioned above, I can't recommend it.

So What's The Alternative to Firestore?

If you're like me and you enjoy getting the nested JSON response from your database, then you should consider using GraphQL. I switched to GraphQL paired with a SQL Database and found it to be the perfect balance where I get everything I liked from before in terms of easy querying, but then I still can query the database directly if I want to do something more involved. I also found that speed was still comparable, and I can add read-replicas if my database begins to slow down as it scales.
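As a rough illustration of the nested JSON you get back, here is how the reviews list could be requested through GraphQL. The schema is an assumption: a `reviews` table with a `user` relationship, plus the column names, are made up for this sketch, and the endpoint is a placeholder rather than a real server.

```javascript
// Hypothetical schema: a `reviews` table with an object relationship `user`.
const REVIEWS_QUERY = `
  query ReviewsWithAuthors($limit: Int!) {
    reviews(limit: $limit, order_by: { created_at: desc }) {
      id
      body
      user {
        username
        profile_picture_url
      }
    }
  }
`;

// Build the HTTP request for a GraphQL endpoint without sending it.
function buildReviewsRequest(endpoint, limit) {
  return {
    url: endpoint,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query: REVIEWS_QUERY, variables: { limit } }),
    },
  };
}

// Sending it is one fetch call (global fetch exists in Node 18+ and browsers).
// Defined here for completeness; it is not called in this sketch.
async function fetchReviews(endpoint, limit) {
  const { url, options } = buildReviewsRequest(endpoint, limit);
  const response = await fetch(url, options);
  const { data, errors } = await response.json();
  if (errors) throw new Error(errors[0].message);
  return data.reviews;
}

// Placeholder endpoint, not a real server.
const request = buildReviewsRequest("https://example.invalid/v1/graphql", 20);
console.log(request.options.body);
```

Each review comes back with its author nested inside it, from one request, with no duplicated username stored on the review itself.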

For other use cases, here are my recommendations:

If you want something that's just an easy bucket to put data into then consider checking out something like Contentful: https://www.contentful.com/

If you want something that gives you an easy-to-use open-source UI to make CRUD APIs on top of an open-source Postgres database, consider GraphQL with Hasura + Postgres: https://hasura.io/

If you want a SQL database where you don't have to deal with data duplication, but also don't want to use GraphQL or manage database scaling, consider AWS Aurora:
https://aws.amazon.com/rds/aurora/

Check me out: https://spencerpauly.com

Top comments (43)

Pablo Hdz Martín • Aug 18 '20

What I like the most is the reactive database. Something changes in the database and is changed automatically if you are subscribed in the front end.
How can I get this feature without Firestore? Ty

Spencer Pauly • Aug 19 '20

Yep, I liked this feature too. Right now I use GraphQL subscriptions in Hasura to get this same functionality. It's pretty convenient to set them up in Hasura too, which is usually a more painful process if you write your own GraphQL server.

Vasyl Boroviak • Aug 18 '20

MongoDB has it built-in. They call it "change streams" I think.

Tobias Nickel • Aug 19 '20

Thanks, that is what I have been looking for for a while. I was impressed by the live sync feature in Meteor.js, but hated the rest of that framework.

Thomas Götzsche • Dec 23 '20

This is one of the main selling points for sure. When you do mobile, one of the most difficult things to handle is bad network. With this feature you don't need to worry about http codes, timeouts, retries, etc.

Sagar Bhattacharya • Aug 19 '20

you can look into pouchdb , couchdb pouchdb.com/

Himujjal Upadhyaya • Aug 22 '20

46kB gzipped!! something like pouchdb-'lite' could have been more popular.
Still better than firebase though.

Alex Vidotti Fornazieri • Oct 30 '24

I think this can be implemented fairly easily with WebSockets. The problem, I guess, is that it needs a machine running full time, but if you're on SQL you can run a socket API on the same machine where you installed Postgres or something else. Most NoSQL databases, as others say, have native streams for this, but it's all basically a WebSocket implementation.

Don Morris • Aug 19 '20

I have been using Firebase realtime db and firestore for over 2 years now. About 6 months ago I added an amazon MySQL instance to store orders and advertising data for amazon sellers. As you can imagine a seller can have hundreds if not thousands of records a day and firestore just isn’t robust enough for data warehousing. Overall I’m pretty happy with firestore for most scenarios but it just can’t replace a sql db.

As I read your article I did get the feeling part of your frustration was due to your design of your data handling and was not necessarily issues with firestore.

Jonathan Marbutt • Sep 6 '20

I agree with you Don, I did a pretty large application that was ready for production that hit some snags with Firebase but reverted back to SQL for now. I am now evaluating using Firebase in conjunction with SQL to get the best of both worlds. I think a lot of larger applications need multiple data stores for different needs. They are just merely a tool in your tool belt and not the only tool. I am evaluating using Firebase more as my front end query store with SQL being the version of truth and for large analytics type jobs.

How do you keep everything in sync?

Don Morris • Sep 8 '20

We don't have to sync. The data is distinct between the 3 databases we have. Let me explain...

Our app helps amazon sellers optimize their product listings and analyze their sales. Most of our data is stored in either firestore or the realtime db. Firestore was not in production when we began so a lot of our core data is in the realtime db. We want to move all of it into firestore but it is not a trivial task.

Product orders and advertising performance data is loaded every 6 hours into the sql db. A seller (our users) can have thousands of orders each day. Their advertising data can be even greater. We originally tried to use firestore for this data but it was like trying to put in a nail with a screwdriver instead of a hammer. We have a lot of analytics and this is much easier and runs faster in sql.

In summary, the order data and advertising metrics don't need synchronization to our operational data such as product listings.

Jonathan Marbutt • Sep 8 '20

Thanks that makes a lot of sense. We went down the road of moving everything to Firebase but hit the snags with needing some sql things. I feel like my pattern would be the opposite than yours, SQL first since we have a lot of legacy there and push to firebase on an api trigger or something.

Rune Jeppesen • Aug 19 '20

Why not have 1 document with all usernames in lowercase in it?
Regarding the NOSQL it is true that modelling the structure of your data is more.. challenging and should be done based on 'demand'. Changing a username 'should' be expensive as it is such a rare event.
firebase.google.com/docs/firestore... ?

Bresson B • Jul 18 '21

"Changing a username 'should' be expensive as it is such a rare event" is a proprietary, domain decision and not for the database to decide. To state that such an event is "expensive" and "rare" imposes an artificial constraint.

Rune Jeppesen • Jul 19 '21

NoSQL data modeling is typically driven by application-specific access patterns

I wrote "Should" based on my experience modeling in NoSQL.

With great power comes great responsibility.. with NoSQL you can have a post document containing the post, all comments and all the usernames.
If usernames are changed "too often" you should have references to users instead.

My guess is that the extra speed when loading a post from not reading all commenters user profiles is well worth the rare and expensive username change event

Rodrigo de Souza Marques • Aug 19 '20 • Edited

My suggestion: Parse Server - parseplatform.org
1) OpenSource
2) Node.Js + Express + MongoDb or PostgreSQL
3) REST API, GraphQL or SDK / libraries for various languages / technologies
4) Allows the use of relationships between tables using Pointer or Relation. No duplicate data.
5) Another 30 query operators, including relational queries.

It has practically all the features of Firebase (since Google copied most of them there in 2016.).

Garret • Aug 25 '20 • Edited

I think Firestore is a great database and I actually think the things you dislike about it are a big plus to it.

As far as the first point goes, it being proprietary does not really matter to me at all. You should structure your code in a way that, if you do want to change databases, you can do so relatively easily, basically by rewriting the queries/writes in the new database's form. For example, I have a small example Node app on my GitHub; if I wanted to change from MongoDB to some other database, all I would really have to do is change this repository file. This is because I only have one table/repository, so depending on how many tables you have it can be a bit more work, but it's not hard at all in my eyes.
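A rough sketch of the repository idea described here, with an in-memory store standing in for a real database. The class and function names are illustrative only and are not taken from the GitHub project mentioned above.

```javascript
// In-memory implementation; a MongoDB-, Firestore-, or Postgres-backed class
// would expose the same methods so callers never see the storage details.
class InMemoryUserRepository {
  constructor() {
    this.users = new Map();
  }

  async save(user) {
    this.users.set(user.id, { ...user });
  }

  async findById(id) {
    const user = this.users.get(id);
    return user ? { ...user } : null;
  }
}

// Application code depends only on the repository's methods.
async function registerUser(repository, id, username) {
  if (await repository.findById(id)) {
    throw new Error(`User ${id} already exists`);
  }
  await repository.save({ id, username });
  return repository.findById(id);
}

const repository = new InMemoryUserRepository();
const registered = registerUser(repository, "u1", "spencer");
registered.then((user) => console.log(user));
```

Swapping databases then means writing another class with the same `save` and `findById` methods. How much that helps depends on how many query shapes the application really needs.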

The point about it being hard to develop/test with is also not really true in my eyes. It is pretty simple to create a development environment and production environment which uses 2 separate firebase applications (which is how it should be they should never rely on the same stuff). Firebase also provides local emulators to run these services locally on your machine to make it quite easy if you set up your application correctly. Here is the introduction to the emulator's post on firebase.

As far as pricing structure paying for only document reads/writes/deletes you do is a big plus in my eyes as you are only paying for what you use and not over-provisioning database capacity in which you are not using. Sure they do not give you any discounts for having faster queries, however, the pricing model would be much more confusing and annoying if you had to factor in query time as well.

I do not personally use Firestore that often (only really in test projects as I am deep into the AWS ecosystem). So I cannot say how annoying it might be to query data. However, I do have a ton of experience in DynamoDB and have helped a company quite recently switch off of Postgres and they saw a huge improvement not only in query speeds but in cost as well because they are no longer paying for over-provisioned databases.

I will admit trying to learn how to properly use NoSQL databases is a bit hard because you have to rethink how you model your data and it is not always the same from database to database in the NoSQL world, however, it brings some huge benefits over SQL especially in performance at scale and clustering data across multiple servers. Though of course, some cases like analytics are better on SQL where you may have an extreme variety of query patterns as you build NoSQL for the query whereas you build SQL for the data. Hence the performance gains on NoSQL, especially at scale.

Jozef Maxted 👻 • Aug 19 '20

This is shaping up to look like a pretty great alternative to Firebase powered by Postgres: supabase.io/

Tobias Nickel • Aug 19 '20

thanks for sharing your experiences.

you say that you worked for 9 months with firebase. can you estimate how much would be saved if you had used an other db to begin with? so how much time of that do you feel was wasted and is saved with this article for all of us?

Spencer Pauly • Aug 19 '20

Honestly firestore was probably a time-save for the first month or so of the project. It was running and hooked up to my frontend day 1 which is huge. The issue though is that the first 90% of what you want to do w/ firestore will be easy but that last 10% will be incredibly difficult. I'd say if you're building a personal project and not a business and that project has a concrete "finished" point, then that sounds like a good use-case for firestore.

If you think you'll be working on the project for more than 1-2 months, I think it would be worth your time to look elsewhere. For my project personally I started hitting walls after month 1. I think around month 2-3 is where if I had chosen another solution I would've been developing faster.

Alastair Pitts • Dec 6 '20

Just on the 1st point, you can use Terraform and the firebase cli to do the automated setup of a new environment.

You can create a Google project, set up Firestore and create storage buckets via TF. Then use the firebase cli to deploy indexes & cloud functions.

Dwayne Charrington • Aug 21 '20

The biggest benefit of Firebase that isn't easily replaced with something else is authentication. Sure, you have Auth0 and other auth providers, but Auth0 is super expensive. The database is easy to replace, even MongoDB is a nice easy option if you want something like Firestore.

Every time I go to switch, I go down the problematic rabbit hole of implementing auth and remember just how painful OAuth authentication truly is. The most promising option appears to be Supabase, which is eventually going to support every Firebase feature that you can host yourself.

Brad • Aug 19 '20

Great article pointing out all things Firebase!

My team once talked to a Firebase engineer from Google who said there are really 2 primary groups of serious users for Firebase:

The group that builds their product around Firebase
The group that builds using Firebase for prototyping

Odds are if you're not in these two groups, Firebase probably isn't the best choice for your project.

If you're in the 2nd group, you will probably end up migrating off the platform at some point, which is ok as long as you take that into account. It's better to build fast, get feedback and go from there than to spend time and money building for a scale that you never reach or need.

The 1st group are the ones that build around the limitations of the platform, which arguably could be done with other solutions, but is done with Firebase.
One company we ran into, which was in the first group, essentially "tacked on" multiple databases onto the Real Time Database for different use-cases. So they had an SQL DB for managing/reviewing/aggregating data, but used the RTD as their "client-side" DB for example. (Firestore wasn't an option at the time)
