Time series and data modeling

Hi Team,

I’m working on an IoT project built with MongoDB and Node.js, and I’m not an expert in either IoT or MongoDB.

We have one large collection that contains the data from all the machines.
The machines do not send raw sensor data; they send the values of variables.
Users choose which variables to request from the machines, and they can add or remove variables at any time.

We use the size-based bucket pattern, but the time series array is not pre-populated with default values for the variables, because we do not know in advance which variables the machines will send us.
I read that inserting new elements into an array can lead to performance problems and that a best practice is to “pre-populate anything you can”.
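
To make the question concrete, here is a rough sketch of the kind of upsert a size-based bucket implies. The field names (machineId, count, first, last, samples), the helper buildBucketUpsert, and the bucket size of 200 are placeholders for illustration, not our real schema. Each measurement is pushed onto the array, so the array grows instead of being pre-populated.

```typescript
// Illustrative only: all field names and the default bucket size are assumptions.
type VariableValues = Record<string, number | string | boolean>;

interface BucketUpsert {
  filter: { machineId: string; count: { $lt: number } };
  update: {
    $push: { samples: { ts: Date; values: VariableValues } };
    $min: { first: Date };
    $max: { last: Date };
    $inc: { count: number };
  };
  options: { upsert: boolean };
}

// Builds the arguments for one updateOne call that appends a measurement
// to the first bucket of this machine that still has room.
function buildBucketUpsert(
  machineId: string,
  ts: Date,
  values: VariableValues,
  bucketSize: number = 200
): BucketUpsert {
  return {
    // Only match buckets that are not full yet.
    filter: { machineId, count: { $lt: bucketSize } },
    update: {
      // The variables present in `values` can differ from one sample to the next.
      $push: { samples: { ts, values } },
      $min: { first: ts },
      $max: { last: ts },
      $inc: { count: 1 },
    },
    // If no bucket with free space exists, a new one is created.
    options: { upsert: true },
  };
}

const op = buildBucketUpsert(
  "machine-42",
  new Date("2024-01-01T00:00:00Z"),
  { temperature: 21.5, rpm: 1200 }
);

// With the official Node.js driver this would be sent as:
// await collection.updateOne(op.filter, op.update, op.options);
```

Because the set of variables is only known when a sample arrives, there is nothing obvious to pre-populate in samples, which is the part I am unsure about.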

How would you solve this? Which schema design would work best for this case?

Thank you