EDIT: I started typing this hours ago, then went on a food break and didn’t see the 2 previous answers.
Hi @Joao_Pinela,
My answer might not be the only truth, but let’s try.
First, I would say that having MANY collections in MongoDB is generally a bad idea. It’s better to have a few very large collections with many documents in them than MANY MANY collections with a limited number of documents in each.
Also, having many collections makes any aggregation over the entire data set a lot more complex, because you would have to $unionWith all the collections to calculate the average temperature, for example.
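To give an idea of what that looks like, here is a minimal sketch that builds such a pipeline as plain Python dicts. The collection names (`sensor_1`, `sensor_2`, ...) and the `temp` field are assumptions for illustration, not something from the original question.

```python
def average_across_collections(collections, field="temp"):
    """Build a pipeline that averages `field` over several collections.

    The pipeline is meant to run on collections[0]; every other collection
    is pulled in with its own $unionWith stage before the final $group.
    """
    if not collections:
        raise ValueError("need at least one collection name")
    # One $unionWith stage per extra collection: this is the part that grows.
    pipeline = [{"$unionWith": {"coll": name}} for name in collections[1:]]
    # A single group over everything gives the overall average.
    pipeline.append({"$group": {"_id": None, "avgTemp": {"$avg": f"${field}"}}})
    return pipeline


# Hypothetical per-sensor collections, only here to show the shape.
names = [f"sensor_{i}" for i in range(1, 4)]
pipeline = average_across_collections(names)
```

With PyMongo this would be run as something like `db[names[0]].aggregate(pipeline)`. With thousands of collections the pipeline gets thousands of stages, which is exactly the problem.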
If you HAVE to split your data set into a FEW collections, I would split on something with much lower cardinality so the number of collections stays completely under control.
For example, I would use the year or month_year:
sensor_temp_2020
sensor_temp_2021
OR
sensor_temp_01_2021
sensor_temp_02_2021
At least here, if you need to calculate the average temperature for 2020 and 2021, it’s trivial with the first option and more complicated with the second.
If you need the averages per month, either option works: in that case, both aggregations are trivial.
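As a rough sketch of both naming schemes and of the per-month average on a yearly collection: the helper names and the `ts` / `temp` field names below are my assumptions, so adapt them to your documents.

```python
from datetime import datetime


def yearly_collection(ts):
    """Collection name for the 'one collection per year' option."""
    return f"sensor_temp_{ts.year}"


def monthly_collection(ts):
    """Collection name for the 'one collection per month' option."""
    return f"sensor_temp_{ts.month:02d}_{ts.year}"


def monthly_average_pipeline(ts_field="ts", temp_field="temp"):
    """Average temperature per calendar month, for use on a yearly collection."""
    return [
        # $month extracts 1..12 from a date field.
        {"$group": {"_id": {"$month": f"${ts_field}"},
                    "avgTemp": {"$avg": f"${temp_field}"}}},
        {"$sort": {"_id": 1}},
    ]


reading_time = datetime(2021, 2, 14, 9, 30)
year_name = yearly_collection(reading_time)
month_name = monthly_collection(reading_time)
```

On a monthly collection the same average is just a single `$group` with `_id: None`, so both layouts stay simple for this query.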
I think it all comes down to “how are you going to query your data?”
Another GREAT pattern for IoT data with too many documents is the bucket pattern.
Basically, instead of storing 1 temperature per document, you store the entire day or month of temperatures in a single document using arrays. This can reduce your number of documents very significantly. But don’t make jumbo documents either: a few hundred KB at most would be my recommendation.
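Here is a small sketch of the idea in plain Python, grouping raw readings into one document per sensor per day. The document shape (`sensor_id`, `day`, `count`, `sum_temp`, `measurements`) is only one possible layout that I made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def bucket_by_day(readings):
    """Group (sensor_id, timestamp, temp) tuples into one document per sensor per day."""
    buckets = defaultdict(list)
    for sensor_id, ts, temp in readings:
        buckets[(sensor_id, ts.date())].append({"ts": ts, "temp": temp})

    docs = []
    for (sensor_id, day), measurements in sorted(buckets.items()):
        temps = [m["temp"] for m in measurements]
        docs.append({
            "sensor_id": sensor_id,
            # Stored as a datetime at midnight, since BSON has no date-only type.
            "day": datetime(day.year, day.month, day.day),
            # Pre-computed values so a daily average needs no array scan.
            "count": len(temps),
            "sum_temp": sum(temps),
            "measurements": sorted(measurements, key=lambda m: m["ts"]),
        })
    return docs


readings = [
    ("s1", datetime(2021, 1, 1, 0, 0), 20.0),
    ("s1", datetime(2021, 1, 1, 0, 1), 22.0),
    ("s1", datetime(2021, 1, 2, 0, 0), 18.0),
]
docs = bucket_by_day(readings)
```

Three readings become two documents here. With one reading per minute, a daily bucket holds 1440 measurements, so check the resulting document size against your own data before picking day or month as the bucket.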
Also, I would use Online Archive to automatically archive the old values into S3. This reduces costs but keeps that data queryable: federated queries still let you query both the “hot” data in Atlas and the archived data.
I hope this helps.
Cheers,
Maxime.