Manav Misra
MongoDB 1️⃣-Few/Many

Photo by Everton Vila on Unsplash

I love 💙 MongoDB and 💯 reach for it before SQL and its limited paradigms whenever feasible. No, I'm not saying NoSQL is 💯 > SQL, but most of the time it probably is, unless the data is truly independent and only exists because of its relationships. However, that's a convo for another day.

A common challenge of using MongoDB (unlike in SQL) is the decision fatigue that comes with figuring out the relationships.

What Not To Do

One bad solution to this problem is to just use NoSQL like it's SQL.

Just put all the things into separate collections (a la SQL tables) and then use references (Mongoose populate or native $lookup) for all the things. In that case, just don't use NoSQL. 🙃

References

I'm putting my references at the top because my post really does little more than regurgitate 🤮 and remix what's already been said by official MongoDB folks.

  1. MongoDB Schema Design Best Practices
  2. MongoDB Schema Design Best Practices (Video)
  3. Schema Design Anti-Patterns - Part 1

You should review these for full insights. I am just summarizing and remixing.

This post just focuses on relationships, and not other considerations that are covered in the aforementioned resources ☝️.

TLDR

Information that tends to be viewed/accessed together should stay together. Ergo, favor embedding over references/populate/$lookup.

1️⃣-Few

Prefer embedding for one-to-few relationships.

What's 'few?' A couple of hundred.

A user with a few different addresses and/or social media accounts. This is a 'one-to-few.'

It's reasonable that I would click on the user and want to view all of the details together at the same time.

I don't need to create a separate view in my front-end just to view the addresses. Just show me the user and all of the details including addresses in one view. This means it's one read to get what I want.

In addition, it's unlikely that I need to make frequent updates to a user's addresses.

Embed an array of address subdocuments into the user documents. Do not make it a reference like you would in SQL.
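Here is a minimal sketch of what that embedded shape might look like. It uses a plain JavaScript object to stand in for one document in a hypothetical `users` collection; the field names and sample values are made up for illustration, and there is no database involved.

```javascript
// A plain object standing in for one document in a hypothetical `users` collection.
// All field names and values are illustrative assumptions.
const user = {
  _id: "u1",
  name: "Ada",
  // One-to-few: the addresses live inside the user document itself.
  addresses: [
    { label: "home", street: "1 Main St", city: "Springfield" },
    { label: "work", street: "9 Office Park", city: "Springfield" },
  ],
  socials: [{ network: "github", handle: "ada" }],
};

// Everything the profile view needs arrives with the single user document,
// so rendering the addresses needs no second query or join.
function renderProfile(doc) {
  const lines = doc.addresses.map((a) => `${a.label}: ${a.street}, ${a.city}`);
  return [doc.name, ...lines].join("\n");
}

console.log(renderProfile(user));
```

The point is the shape: one document, one read, and the whole view is ready.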

But......

Needing to access an object on its own is a compelling reason not to embed it.

Let's imagine 🚲s: an e-commerce site that sells products as well as the individual parts that make up said products.

Chances are, I don't need to click a link to a product and at the same time see a comprehensive listing of all of the associated part details. Instead, I might have links on the web view where I can click on a part number and then see details for a specific part.

It's also a reasonable use case that I might be shopping for 1️⃣ of the individual parts and not a product. In other words, I need to view a product in isolation and also view a part in isolation.

In this case, unlike a user's addresses, it makes sense for these document types to be presented by themselves.

Furthermore, prices change on both products and on individual parts.

Keep them as separate collections and do use references.
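As a rough sketch of the referenced layout, the arrays below stand in for two hypothetical collections, `parts` and `products`. The `partIds` field and the `withParts` helper are assumptions for illustration; `withParts` is only an in-memory stand-in for what populate or $lookup would do against a real database.

```javascript
// Plain arrays standing in for two hypothetical collections.
const parts = [
  { _id: "p1", name: "Chain", price: 25 },
  { _id: "p2", name: "Saddle", price: 40 },
  { _id: "p3", name: "Pedal", price: 15 },
];

const products = [
  // The product stores only part ids (references), not copies of the parts.
  { _id: "b1", name: "City Bike", price: 450, partIds: ["p1", "p2"] },
];

// In-memory stand-in for populate/$lookup: resolve ids to part documents.
function withParts(product, allParts) {
  const byId = new Map(allParts.map((p) => [p._id, p]));
  return { ...product, parts: product.partIds.map((id) => byId.get(id)) };
}

// A price change touches exactly one part document...
parts[0].price = 30;

// ...and every product that references it sees the new price on the next read.
console.log(withParts(products[0], parts).parts.map((p) => `${p.name}: ${p.price}`));
```

Each part can be shown on its own page, and a price update happens in one place instead of in every product that contains the part.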

1️⃣-Many

What's 'many?' More than a 'few'. A few thousand. More specifically, an unbounded array. This is especially important because an array that keeps growing will eventually run into MongoDB's 16MB document limit.
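For a feel of the numbers, here is a back-of-the-envelope sketch. It is only an approximation: it measures JSON text rather than real BSON, and the `item` shape (roughly 500 bytes) is a made-up example.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // MongoDB's 16MB document limit

// Rough proxy only: JSON length is not the same as BSON size, but it is
// close enough to show the order of magnitude.
function approxBytes(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

// A hypothetical embedded item of roughly 500 bytes.
const item = { author: "someone", rating: 4, text: "x".repeat(450) };

const perItem = approxBytes(item);
const roughCapacity = Math.floor(MAX_DOC_BYTES / perItem);

console.log(`~${perItem} bytes each, so roughly ${roughCapacity} items before the limit`);
```

Tens of thousands of small items sounds like plenty, but an unbounded array has no ceiling by definition, and documents that large are slow to read and write well before they hit the hard limit.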

Just use references. Don't embed.
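One common way to reference in this situation is to have each child point back at its parent, so the parent document never grows. The post's sources cover the options in detail; the sketch below is just one of them, using hypothetical `customers` and `orders` collections and an in-memory filter in place of a real query.

```javascript
// Hypothetical collections: one customer, many orders.
const customers = [{ _id: "c1", name: "Ada" }];

// Each order carries a reference back to its customer, so the customer
// document stays the same size no matter how many orders accumulate.
const orders = [];
for (let i = 1; i <= 5000; i++) {
  orders.push({ _id: `o${i}`, customerId: "c1", total: i });
}

// Stand-in for a query like "orders for this customer, newest first, limit N".
function recentOrders(allOrders, customerId, limit) {
  return allOrders
    .filter((o) => o.customerId === customerId)
    .slice(-limit)
    .reverse();
}

const page = recentOrders(orders, "c1", 10);
console.log(page.length, page[0]._id);
```

The view only ever pulls the page of children it needs, rather than dragging thousands of embedded items along with every read of the parent.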


There are also use cases for 1️⃣-Squillions mentioned in the aforementioned resources. Not covering that here.


Favor Embedding

If not sure, then favor embedding. Unlike SQL, it's 🆗 to have duplicate data, if needed.

Duplication is only an issue if the embedded data needs to be updated frequently. In that case, separate with references. But unless you are sure, favor embedding.

With the users and addresses scenario ☝️, a user doesn't frequently update their addresses (or do they? 👇🏾), social media, etc., so embed.

Frequent Updates and/or Isolated Views? References!

For 🚲 products/parts, prices of each might fluctuate somewhat frequently, and, as discussed previously, I probably don't need to look 👀 at all of that information together, anyway. They would be separate views. Use references.

It All Depends 😵‍💫

What post on development would be complete without these infamous words?

Think all the way back ☝️ to our example regarding users and addresses. The assumption was that users were the focal point of the application.

What if it's more about the addresses and who is living at what address instead? Say, for a housing complex with tenants.

What we might do is flip it 🙃. Embed the users into their addresses. As folks move in and out, we'll just be updating that embedded users array for an address document. 🆒.

The assumption here is that most of the reads pertain to addresses and not users by themselves. We also assume that the users' individual data doesn't need to be updated frequently.

If this were not the case and there was equal reading/updating of both individual users and addresses, then we should use references, as long as we are sure that there is a need. When in doubt, favor embedding.
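The flipped, address-centric design might be sketched like this. The object stands in for one document in a hypothetical `addresses` collection, and `moveIn` / `moveOut` are made-up helpers; with a real database these would be array updates on the address document (for example with the `$push` and `$pull` update operators).

```javascript
// A plain object standing in for one document in a hypothetical `addresses` collection.
const address = {
  _id: "a1",
  unit: "4B",
  street: "1 Main St",
  // Flipped: tenants are embedded in the address instead of the other way around.
  tenants: [{ name: "Ada" }, { name: "Grace" }],
};

// Moving in or out only rewrites the embedded array on this one document.
function moveIn(doc, tenant) {
  return { ...doc, tenants: [...doc.tenants, tenant] };
}

function moveOut(doc, name) {
  return { ...doc, tenants: doc.tenants.filter((t) => t.name !== name) };
}

const updated = moveOut(moveIn(address, { name: "Linus" }), "Ada");
console.log(updated.tenants.map((t) => t.name));
```

Same data as the user-centric version, just organized around whatever the app actually reads most.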

Authors Of Pain 📚

Consider authors and books data. What is the app about? Is it an authors directory listing for publishers? Then embed books into authors. An author is not going to have an unbounded array situation where they are cranking out books like Tweets. We are assuming a couple of hundred here. Not several thousand.

Or, is it more about finding books for users/readers? Then, embed authors into books. Again, a book is not going to have an unbounded array of authors.

What if it's both? What if the site serves both audiences? I might want to have a view showing all of the authors with the embedded 📚 and/or also just browse books and not worry as much about the author.

Then, we should use a reference.

Unsure? Embed! In NoSQL, data schemas are flexible, and it's easier to fix a bad design decision later in the process. This is not so with SQL.

Hybrid Approaches

Still, favor embedding if not sure.

Use a hybrid approach in situations such as movies and reviews. Here, it doesn't make sense to access reviews outside of the context of a movie. I am also facing an unbounded array situation where a bunch of jerks online will all feel obligated to pour their thoughts out about some movie such that over time there are thousands and thousands of reviews. They're kind of like jerks that pour their thoughts out on MongoDB schemas!

When I click on a movie, do I want to read all of that 💩? No! But...I can use a hybrid approach where I embed the last few reviews in the movie and keep an array of references to all of the rest. Best of both worlds!
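A minimal sketch of the hybrid idea, with one swap worth naming: instead of keeping an array of review ids on the movie, each review here points back at its movie, which keeps the movie document bounded. The `recentReviews` field, the `MAX_EMBEDDED` cutoff, and the `addReview` helper are all assumptions for illustration, and plain objects stand in for the collections.

```javascript
const MAX_EMBEDDED = 3; // hypothetical cutoff for "the last few"

// Stand-ins for a hypothetical `movies` document and a separate `reviews` collection.
const movie = { _id: "m1", title: "Some Movie", recentReviews: [] };
const reviews = [];

// Every review goes to the full collection; the movie keeps only the newest few.
function addReview(movieDoc, allReviews, review) {
  allReviews.push({ ...review, movieId: movieDoc._id });
  movieDoc.recentReviews = [review, ...movieDoc.recentReviews].slice(0, MAX_EMBEDDED);
}

for (let i = 1; i <= 10; i++) {
  addReview(movie, reviews, { _id: `r${i}`, text: `Review ${i}` });
}

console.log(movie.recentReviews.map((r) => r._id));
console.log(reviews.length);
```

The movie page renders from a single document, and the full review history is only fetched when someone actually asks for it.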


What you use together, store together.
