Hello,
I have a set of raw MongoDB collections and, for reporting purposes, need to create a set of aggregated collections. Each aggregated document will hold several computed values (measures) together with the fields they are grouped by (dimensions), i.e. a computed pattern with a structure similar to a star schema:
[
  {
    "measure": 123,
    "dim_1": {
      "dim1.col1": "xyz"
    },
    "dim_2": {
      "dim2.col1": "abc"
    }
  },
  {
    "measure": 234,
    "dim_1": {
      "dim1.col1": "yyz"
    },
    "dim_2": {
      "dim2.col1": "def"
    }
  }
]
Please suggest whether the MongoDB Aggregation Framework is the right choice here, or whether I should do all the transformations and computations with the Pandas library and then simply insert the results into a MongoDB collection using PyMongo. Which method is more efficient?
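To make the comparison concrete, here is a rough sketch of the two options as I understand them. The collection names (`raw_orders`, `agg_orders`) and field names (`category`, `region`, `amount`) are made up for illustration and are not my real schema, and I used plain `col1` keys inside the dimensions to keep the sketch simple. The first function only builds the pipeline; the second does the same grouping client-side with Pandas.

```python
def build_pipeline(target_collection):
    """Option 1: let the server group the data and write the result itself."""
    return [
        # Group by the two (hypothetical) dimension fields and sum the measure.
        {"$group": {
            "_id": {"category": "$category", "region": "$region"},
            "measure": {"$sum": "$amount"},
        }},
        # Reshape into the measure/dimension layout; _id (the group key) is kept.
        {"$project": {
            "measure": 1,
            "dim_1": {"col1": "$_id.category"},
            "dim_2": {"col1": "$_id.region"},
        }},
        # Write to the target collection (needs MongoDB 4.2+); matching on the
        # kept _id means a re-run replaces existing documents.
        {"$merge": {
            "into": target_collection,
            "whenMatched": "replace",
            "whenNotMatched": "insert",
        }},
    ]


def aggregate_with_pandas(raw_docs):
    """Option 2: pull the raw documents to the client and group with Pandas."""
    import pandas as pd  # imported here so option 1 does not depend on it

    df = pd.DataFrame(raw_docs)
    grouped = df.groupby(["category", "region"], as_index=False)["amount"].sum()
    return [
        {
            "measure": float(row.amount),  # plain float, so PyMongo can encode it
            "dim_1": {"col1": row.category},
            "dim_2": {"col1": row.region},
        }
        for row in grouped.itertuples(index=False)
    ]


pipeline = build_pipeline("agg_orders")

# Usage with PyMongo would look roughly like this (not executed here):
#   db = pymongo.MongoClient()["reporting"]
#   db.raw_orders.aggregate(pipeline)                         # option 1
#   docs = aggregate_with_pandas(db.raw_orders.find({}, {"_id": 0}))
#   db.agg_orders.insert_many(docs)                           # option 2
```

With option 1 the data never leaves the server, while option 2 has to transfer every raw document to the client and send the results back.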
I would appreciate your input.
Thanks!