Use case
Lately, I needed to create an analysis page on the app to count how many shoes a shop has. The database set up is MongoDB so I looked for a good database design suitable for analysis.
My analysis is
- Yearly, monthly, daily count of shoes
- A shoe has the maker name and color
After searching, I found Time Series can be done on MongoDB from this article. Still, needed a bit more solid production-level tutorial, and luckily I found it here on Dev Community by the awesome Riccardo Cardin!
Solution
Assuming these datasets we have for the shoes counters, I started implementing it and it was ready.
Dataset
Makers: ['Nike', 'Adidas', 'Onitsuka Tiger']
Colors: ['White', 'Black', 'Red']
The way I set up the Analysis DB is to have the collection Shoes
containing a document/year. When the document first created, all counters are set to 0 then incremented daily.
DB Shoes Collection document
{
created_at: ISODate("2019-01-01T00:00:10.000Z"),
year: 2019,
nike: 15000,
adidas: 10000,
tiger: 8000,
black: 25000,
white: 1000,
red: 8000,
// months is an array of 12 elements
months: [
{
created_at: ISODate("2019-01-01T00:00:10.000Z"),
nike: 200,
adidas: 1000,
tiger: 80,
black: 200,
white: 1000,
red: 80,
// days is an array of days/month
days: [
{
created_at: ISODate("2019-01-01T00:00:10.000Z"),
nike: 20,
adidas: 100,
tiger: 8,
black: 120,
white: 8,
red: 0,
}
// rest of days
]
}
// rest of months
]
};
Advantages
- This structure is powerful because the document size doesn't change after creation so IO processes are not as expensive, this document size would be around 35 Kbs
- Fetching the document and getting what you need (e.g. Nike shoes number on 2/8/2019) can be tricky at first sight, but thanks to Functional programming, create 1 function and reuse it
- The structure is neat and very well-organized.
Drawbacks
Let's say we want to use some filtering on the Graph to find Black Adidas shoes count on 15/4/2019. The above DB structure fails to give us any information about overlapping joined counters, they're just atomic mere counters.
The only way around it I could think of is creating new counters with every possible joint. This is absolutely a terrible idea, hence, decided to put down the Time Series DB for now and find another design for this use case.
Top comments (1)
there is now native support for time series starting from mongodb 5.0; did you test that yet? mongodb.com/developer/how-to/new-t...