Here are my notes on Firestore, and what I wish I knew. This is a rundown, so I've put bits in bold so you can scan.
Firebase is one of the most popular "backendless" services. The whole idea is: let's provide everything we need for our applications as direct services that make setting up a backend server redundant. Firestore continues this: let's just connect right to our database from the client. The client here is our website, or a mobile app, a desktop client, whatever.
Firestore is a document store database, which means everything is an amorphous document that you set values on. It is schemaless, meaning each document can have whatever fields you want on it; you aren't setting up columns or properties ahead of time.
Every document is in a collection, kind of like a table, but you can think of it as a directory with the documents as files. In fact, they are named exactly like directory paths, /collection-name/document-id, and can keep going: you can have sub-collections like /collection-name/document-id/subcollection-name/subdocument-id. In truth, there is absolutely no relation between the sub-document and the document; deleting the document won't even delete the sub-document. The nesting merely serves as a way to organize.
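To make the path idea concrete, here is a minimal sketch in plain JavaScript (no Firebase SDK involved). It relies on the alternating collection/document pattern described above: an odd number of segments points at a collection, an even number at a document. The helper name describePath and the example paths are made up for illustration.

```javascript
// Classify a Firestore-style path by counting its segments.
// Paths alternate: collection / document / collection / document ...
function describePath(path) {
  // Drop empty parts produced by leading or trailing slashes.
  const segments = path.split('/').filter(Boolean);
  if (segments.length === 0) {
    throw new Error('empty path');
  }
  const isDocument = segments.length % 2 === 0;
  return {
    kind: isDocument ? 'document' : 'collection',
    id: segments[segments.length - 1],
    // Everything before the last segment, or null at the top level.
    parent: segments.slice(0, -1).join('/') || null,
  };
}
```

So describePath('/users/alice/posts') reports a collection named posts whose parent path is users/alice. The parent here is only a naming relationship, which matches the point above: nothing ties the lifetime of the sub-collection to the document.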
Again, you connect directly to Firestore from the client / front end. You can update the value of a document, you can read a document, and you can query for multiple documents in a collection. Also, you can subscribe to a document, collection, or collection query, and get real-time updates in the form of "snapshots".
Queries are really simple. We're not doing SQL queries here, just things like "is this field equal to this value" and "is this field bigger than this value". You can combine multiple query statements, but many combinations require creating special composite indexes. Fortunately, when you run a query that needs an index that doesn't exist, the library is smart enough to throw an error and give you an exact link to click on to create the index. But probably 80% of the time, when I make an index, it turns out I don't actually need it. If you're making a lot of indexes in Firestore, it's a red flag that you're laying out your data wrong.
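As a rough sketch of what a combined query looks like, assuming the namespaced client API (the same style as firestore.collection(...).doc(...)) and an already initialized firestore instance passed in by the caller. The collection name posts and the fields published and likes are invented for the example.

```javascript
// Build a query that mixes an equality filter with a range filter.
// `firestore` is assumed to be an initialized Firestore instance.
function popularPublishedPosts(firestore) {
  return firestore
    .collection('posts')
    .where('published', '==', true) // "is this field equal to this value"
    .where('likes', '>', 10)        // "is this field bigger than this value"
    .orderBy('likes', 'desc')
    .limit(20);
}
```

Running it would look something like popularPublishedPosts(firestore).get().then((snap) => snap.forEach((doc) => console.log(doc.id, doc.data()))). Mixing an equality filter on one field with a range filter on another is the kind of combination that asks for a composite index, so expect the error-with-a-link the first time you run it.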
Generally, there is no processing on writes and reads, so whatever the client writes goes directly into the database. Whatever is in the database, the client reads. However, there is a way to trigger cloud functions when a document is created or written. This lets you simulate processing but, importantly, it happens after the fact, so there's a quasi-state between writing and processing where things can be out of whack. The function gets scheduled somewhere by Firebase; in my tests it ran a few seconds after the data was saved.
And that's basically it. Firestore is purposefully super simple. If you need more, you're using the wrong tool.
Don't overthink Firestore. It exists in a larger offering called "Google Cloud Platform". It has a lot of friends to pick up its slack, and it's not trying too hard, so you shouldn't either.
It's one of the fastest ways I know of going from zero to a full "backend" solution. It's amazing for prototyping, amazing for simple applications, and great as a front for a larger system, as a sort of caching layer. It's also pretty cheap, given the alternatives. Just make sure you aren't doing a lot of writes.
Connecting directly from the client is as easy as you'd think it should be. It feels like cheating, and it's worth the price of admission. The documentation can be pretty obtuse, but once you figure out that you're mostly basing everything off of a document reference, firestore.collection('collection-name').doc('doc-id'), you begin to fly. And once you figure out snapshots, you're golden.
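Here is a small sketch of the reference-then-snapshot pattern, assuming the namespaced client API, where a document snapshot exposes an exists property and a data() method. The helper names docRef, readDoc and watchDoc are mine, not part of the SDK.

```javascript
// Everything starts from a document reference.
// `firestore` is assumed to be an initialized Firestore instance.
function docRef(firestore, collectionName, docId) {
  return firestore.collection(collectionName).doc(docId);
}

// One-off read: resolves to the document's data, or null if it is missing.
async function readDoc(ref) {
  const snapshot = await ref.get();
  return snapshot.exists ? snapshot.data() : null;
}

// Live subscription: onChange is called with the data of each new snapshot.
// Returns the unsubscribe function that onSnapshot hands back.
function watchDoc(ref, onChange) {
  return ref.onSnapshot((snapshot) => {
    onChange(snapshot.exists ? snapshot.data() : null);
  });
}
```

In use: const stop = watchDoc(docRef(firestore, 'posts', 'hello'), (post) => render(post)), where render is whatever updates your UI, and you call stop() when the view goes away. If you are on the newer modular API the calls are shaped differently, so treat this as an illustration of the idea rather than copy-paste code.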
It's fast. Special caching and indexing make it feel like real-time, especially when you subscribe to changes: they seem to arrive instantly.
It has a live interface to view and edit the data. It's amazing for prototyping because you can see real-time changes. If your client is subscribing to the data, you can make the change in the interface and see it live. Absolute filth for live demos to upper management. "Oh, you want to change this name? Already done."
Another "feature" is that it forces you to simplify your data organization. This can be bad and good, but I've found that it gives you all you need for most simple applications. So if you're struggling with data layout, it's a red flag that you either need a separate solution for that feature, or you're thinking about it all wrong.
The simplicity will quickly box you in, though. This isn't a relational database. It's a specialized non-relational database, and you will have to learn a different way of thinking that matches it, specifically.
With no real way of doing relations, you're going to make a lot of reads and round-trips. This is by design, and probably fine, but we're not able to do anything fancy with joins or graph queries. This also means being very, very aware of how you set up your data. In many places, you will be duplicating data and will have to keep that duplicated data up to date yourself, in a system that, frankly, doesn't want to help you.
Another big issue is that there is no schema, at all. You cannot be sure of data consistency. That means you have to set up defenses in your client code when you read, and/or set up post-processing cloud function triggers to ensure sanity. Expect to write a lot of object.title || '' or to use a schema validation library. It is really pushing the schema work to the client. This is a pretty awful indictment when you compare it to services offering things like GraphQL that handle this inherently.
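One way to keep the object.title || '' checks from spreading everywhere is to funnel every read through a single normalizing function. This is a minimal sketch in plain JavaScript; the post shape (title, tags, likes) is a hypothetical example, not something Firestore defines.

```javascript
// Turn whatever came out of the database into a shape the UI can trust.
// Missing or wrongly typed fields fall back to safe defaults.
function normalizePost(data) {
  const raw = data || {};
  return {
    title: typeof raw.title === 'string' ? raw.title : '',
    tags: Array.isArray(raw.tags)
      ? raw.tags.filter((tag) => typeof tag === 'string')
      : [],
    likes: Number.isFinite(raw.likes) ? raw.likes : 0,
  };
}
```

You would call it right where the snapshot is unpacked, e.g. normalizePost(snapshot.data()). A schema validation library does the same job with less hand-written code once you have more than a couple of document types.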
And given that a nefarious user could, with some console scripting, update whatever they have access to with whatever data they wanted, you can't rely on any of it. Again, security becomes a big question.
Finally: pricing is weird. On the surface, Firestore has a straightforward pricing model: you pay a certain amount per read and per write. But things can get problematic. It's hard to figure out what special functions like subscribing to collection queries do, and it's hard to know what cloud functions might trigger either a read or a write. At best, you'll be able to estimate based on projected users and activity. At worst, like most cloud services, you'll have to do some linear regressions after you've run it for a while to see how much your app will cost when it scales. But, hey, this is why we have CFOs, right?
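For the "estimate based on projected users and activity" step, a back-of-the-envelope calculator is about as far as you can get up front. The sketch below assumes rates are quoted per 100,000 operations and takes them as inputs; it does not know any real prices, and it ignores storage, bandwidth, free quotas, and whatever your cloud functions and subscriptions do behind your back.

```javascript
// Rough monthly cost from projected activity.
// All rates are supplied by the caller; look up current prices yourself.
function estimateMonthlyCost({
  users,
  readsPerUserPerDay,
  writesPerUserPerDay,
  pricePer100kReads,
  pricePer100kWrites,
  daysPerMonth = 30,
}) {
  const reads = users * readsPerUserPerDay * daysPerMonth;
  const writes = users * writesPerUserPerDay * daysPerMonth;
  const cost =
    (reads / 100000) * pricePer100kReads +
    (writes / 100000) * pricePer100kWrites;
  return { reads, writes, cost };
}
```

With placeholder rates of 1 and 2 currency units (not real prices), 1,000 users doing 50 reads and 5 writes a day comes to 1,500,000 reads, 150,000 writes, and a cost of 18. The useful part is less the number than seeing which input dominates it.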
Here are some strategies for success with Firestore:
First thing, first thing: Set up pricing limits. Make something reasonable. Da pricing calculator don't care that your script went into an infinite loop, your credit card is still on the line. Big daddy Google ain't messing around. Though to be fair, I've heard of people messing this up and Google has been lenient. Still, put the limit in.
As a general rule, separating public and private data is great. In Firebase it's even more greaterific. This lets you write public rules that are super simple, and private rules that are focused and well crafted. It's also a good line on which to duplicate data, e.g. have private user profiles with all their information, and public profiles, in a separate collection, with only a subset of fields.
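The duplication step can be as small as an allow-list of fields. A minimal sketch in plain JavaScript follows; the field names and the idea of separate private and public profile collections are hypothetical stand-ins for your own layout.

```javascript
// Only these fields are ever copied into the public profile.
const PUBLIC_FIELDS = ['displayName', 'avatarUrl', 'bio'];

// Derive the public copy from the private profile by allow-listing fields,
// so anything new added to the private side stays private by default.
function toPublicProfile(privateProfile) {
  const publicProfile = {};
  for (const field of PUBLIC_FIELDS) {
    if (privateProfile[field] !== undefined) {
      publicProfile[field] = privateProfile[field];
    }
  }
  return publicProfile;
}
```

Whenever the private profile changes, you write toPublicProfile(profile) to the matching document in the public collection. An allow-list rather than a deny-list is a deliberate choice here: forgetting to update it hides data instead of leaking it.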
I abandoned making writes at all directly from the client. Instead, I set up an App Engine REST API to do this. You could easily do serverless functions, an API Gateway with Cloud Run, whatever. Yes, this throws out the idea of fully backendless as you're literally making a backend. But I find it much, much easier to ensure security and data schema in custom functions. Best to do this after the prototyping phase. Get the functionality down, then move on to data consistency and security — batten down the hatches, so to speak.
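The payoff of routing writes through your own backend is that you can reject bad input before it ever reaches Firestore. Here is a sketch of the kind of check I mean, in plain JavaScript with no framework; the fields title and body are made up, and a real endpoint would also check who is making the request.

```javascript
// Validate an incoming request body before the server writes it.
// Returns { ok: true, value } with a cleaned object, or { ok: false, errors }.
function validateNewPost(input) {
  if (input === null || typeof input !== 'object' || Array.isArray(input)) {
    return { ok: false, errors: ['input must be an object'] };
  }
  const errors = [];
  const allowed = ['title', 'body'];
  for (const key of Object.keys(input)) {
    if (!allowed.includes(key)) errors.push(`unexpected field: ${key}`);
  }
  if (typeof input.title !== 'string' || input.title.trim() === '') {
    errors.push('title is required');
  }
  if (typeof input.body !== 'string') {
    errors.push('body must be a string');
  }
  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: { title: input.title.trim(), body: input.body } };
}
```

Only the value from a successful result gets written, so the documents in the collection all have the same shape and the defensive code on the read side can relax.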
An alternative to this is to have special collections set up for writes, and then have serverless functions trigger and process them. This is a fine system, but you're essentially just doing the same thing as above.
If you start thinking of Firestore as essentially a specialized, fast, subscribable read-only data source, it's easy to imagine gradually putting it in between your existing services and clients. Gradually put data into it and move clients to read from it. Slowly your backend becomes write-only, decreasing complexity and cost.
The security rules are such little jerks that you need to be sure they are working. Don't leave it to the interface; it's just waiting to fail.
The client bends over backward to make every operation idempotent, so you can't get the database in a weird state. But on the server, you need to use transactions to make sure you don't leave the database in a weird state. Also, it can save on pricing as a single transaction sort of bundles costs up. But, uh, don't quote me on that last part, it's not well defined.
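A read-modify-write is the classic case where a server-side transaction matters. The sketch below assumes a server-side Firestore instance (for example from the Admin SDK) passed in as db, and uses its runTransaction with the transaction's get and update; the posts collection and likes field are invented for the example.

```javascript
// Increment a counter safely: read and write happen inside one transaction,
// so two concurrent callers can't both read the same old value and lose a like.
// `db` is assumed to be an initialized server-side Firestore instance.
async function addLike(db, postId) {
  const ref = db.collection('posts').doc(postId);
  return db.runTransaction(async (tx) => {
    const snapshot = await tx.get(ref);
    if (!snapshot.exists) {
      throw new Error(`post ${postId} does not exist`);
    }
    const likes = (snapshot.data().likes || 0) + 1;
    tx.update(ref, { likes });
    return likes; // becomes the resolved value of runTransaction
  });
}
```

The function passed to runTransaction may be run more than once if the document changes underneath it, so keep side effects other than tx writes out of it.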
Google has done an impressive job creating videos to help you understand how to organize your data. I've dealt with a lot of document store and non-relational databases and they still very much helped me think in Firestore.
Welp, that's all I got. Overall, I really like Firestore. It lets me get building fast. But it's a special thing, and it's best to understand it before you jump in.
It's well-executed; most of the issues are from trade-offs that I can't fault them on. Although the rules system really needs a lot of work.
Going forward, the horizon looks good for some other backendless technologies. AWS is coming around the corner with Amplify, and any GraphQL service is going to have a huge benefit here since we're already giving up REST. But it remains to be seen if the simplicity, cost, and speed of Firebase will shine through.
Meet me on twitter: @deadwisdom
I'm available for consulting services on architecture and web development.