How to perform multiple group bys in a aggregation query

Hope you can help I am new to aggregation queries.

I have a nested data structure that I want to group to produce statistical output. I have a set of orders where the order is for a country, product and product code. An order looks like:

db.orders.findOne();
{
    "_id" : ObjectId("5efc6db38cb109193e41c4d3"),
    "createdDate" : ISODate("2020-06-25T02:06:25.428Z"),
    "data" : {
        "nested" : {
            "country" : "France"
        },
        "product" : "Product 4",
        "latest" : {
            "sub" : {
                "code" : "Code 3"
            }
        }
    }
}

I have an aggregation query that groups by country, product and code.

db.getCollection('orders').aggregate([
{
    $unwind :{
        path: "$data.nested.country"
    }
},
{
   $group: {
       _id: { country: "$data.nested.country", product: "$data.product", code: "$data.latest.sub.code" }
   }
}
])

This produces output such as:

{ "_id" : { "country" : "Slovenia", "product" : "Product 3", "code" : "Code 7" } }
{ "_id" : { "country" : "Japan", "product" : "Product 1", "code" : "Code 9" } }
{ "_id" : { "country" : "Japan", "product" : "Product 4", "code" : "Code 4" } }
{ "_id" : { "country" : "China", "product" : "Product 1", "code" : "Code 1" } }
{ "_id" : { "country" : "France", "product" : "Product 3", "code" : "Code 4" } }   
{ "_id" : { "country" : "Japan", "product" : "Product 4", "code" : "Code 8" } }
{ "_id" : { "country" : "Japan", "product" : "Product 4", "code" : "Code 5" } }
{ "_id" : { "country" : "Slovenia", "product" : "Product 4", "code" : "Code 4" } }
{ "_id" : { "country" : "Slovenia", "product" : "Product 4", "code" : "Code 7" } }
{ "_id" : { "country" : "Slovenia", "product" : "Product 4", "code" : "Code 2" } }
{ "_id" : { "country" : "China", "product" : "Product 4", "code" : "Code 8" } }
{ "_id" : { "country" : "France", "product" : "Product 4", "code" : "Code 4" } }
{ "_id" : { "country" : "Japan", "product" : "Product 2", "code" : "Code 3" } }
{ "_id" : { "country" : "Japan", "product" : "Product 2", "code" : "Code 6" } }
{ "_id" : { "country" : "Japan", "product" : "Product 2", "code" : "Code 3" } }
{ "_id" : { "country" : "Slovenia", "product" : "Product 2", "code" : "Code 9" } }
{ "_id" : { "country" : "China", "product" : "Product 2", "code" : "Code 6" } }

I want to group this data by country and then product and then code so, for example, Japan would have a list of products i.e. Product 4, Product 2 inside each there would be a list of codes so “Product 4”: [“Code 8”,“Code 5”,“Code 3”,“Code 6”,“Code 2”] etc. Since an order can be made for a Product with a particular code more than once for a country I need I think a map of codes and the counts for each code.

{ "_id" : { "country": "Japan", products: [{"product":"Product 2","codes":[{"code":"Code 3","count":2},{"code":"Code 6","count":1]}]

Hello, @Sean_Barry! Welcome to the community!

You can achieve what you want with two sequential $group stages

db.orders.aggregate([
  {
    $group: {
      _id: {
        country: '$data.nested.country',
        product: '$data.product',
      },
      productCodes: {
        $push: '$data.latest.sub.code',
      },
      uniqueCodes: {
        $addToSet: '$data.latest.sub.code',
      }
    }
  },
  {
    $group: {
      _id: '$_id.country',
      country: {
        $first: '$_id.country',
      },
      products: {
        $push: {
          product: '$_id.product',
          codes: {
            $map: {
              // run $filter+$size operations per each code
              input: '$uniqueCodes',
              as: 'code',
              in: {
                code: '$$code',
                count: {
                  $size: {
                    $filter: {
                      // collect same codes into one array
                      // to be able to count them per product 
                      input: '$productCodes',
                      cond: {
                        $eq: ['$$code', '$$this'],
                      },
                    },
                  },
                },
              },
            },
          },
        },
      },
    },
  },
  // cleanup
  {
    $unset: ['_id'],
  }
]);

Sample dataset:

db.orders.insertMany([
  {
    '_id' : 1,
    'data' : {
      'nested' : {
        'country' : 'France'
      },
      'product' : 'Product 1',
      'latest' : {
        'sub' : {
          'code' : 'Code A'
        }
      }
    }
  },
  {
    '_id' : 2,
    'data' : {
      'nested' : {
        'country' : 'France'
      },
      'product' : 'Product 2',
      'latest' : {
        'sub' : {
          'code' : 'Code B'
        }
      }
    }
  },
  {
    '_id' : 3,
    'data' : {
      'nested' : {
        'country' : 'Canada'
      },
      'product' : 'Product 1',
      'latest' : {
        'sub' : {
          'code' : 'Code B'
        }
      }
    }
  },
  {
    '_id' : 4,
    'data' : {
      'nested' : {
        'country' : 'Ukraine'
      },
      'product' : 'Product 2',
      'latest' : {
        'sub' : {
          'code' : 'Code B'
        }
      }
    }
  },
  {
    '_id' : 5,
    'data' : {
      'nested' : {
        'country' : 'Canada'
      },
      'product' : 'Product 1',
      'latest' : {
        'sub' : {
          'code' : 'Code A'
        }
      }
    }
  },
  {
    '_id' : 6,
    'data' : {
      'nested' : {
        'country' : 'Canada'
      },
      'product' : 'Product 1',
      'latest' : {
        'sub' : {
          'code' : 'Code A'
        }
      }
    }
  }
]);

Aggregation output on sample dataset:

[
  {
    "country" : "France",
    "products" : [
      {
        "product" : "Product 2",
        "codes" : [
          {
            "code" : "Code B",
            "count" : 1
          }
        ]
      },
      {
        "product" : "Product 1",
        "codes" : [
          {
            "code" : "Code A",
            "count" : 1
          }
        ]
      }
    ]
  },
  {
    "country" : "Canada",
    "products" : [
      {
        "product" : "Product 1",
        "codes" : [
          {
            "code" : "Code B",
            "count" : 1
          },
          {
            "code" : "Code A",
            "count" : 2
          }
        ]
      }
    ]
  },
  {
    "country" : "Ukraine",
    "products" : [
      {
        "product" : "Product 2",
        "codes" : [
          {
            "code" : "Code B",
            "count" : 1
          }
        ]
      }
    ]
  }
]
1 Like

Thank you very much. That’s great. Works perfectly. I couldn’t have come up with this solution. There’s definitely a lot going on. I thought I might just need another $group.

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.