Watch keynotes and sessions from MongoDB.live, our virtual developer conference.

How to group on multiple columns and sum up individual columns

Hi Friends,
I am new to mongoDB and it’s really exciting.

I have a scenario where i need to group on common fields(3) span across collections and have to perform summing up all the numeric columns by column wise and all string/text columns need to make “Nan” or “Null”.

Help me to form the query…
attaching two sample collections, great regards from my side.

The common fields across the collections are, X, Y_DT and Z. and other columns either numeric, string/text.

Hi @Murali_Muppireddy, you would want to do something similar to the following:

{
    $group: {
        "_id": {
            X: "$X",
            Y_DT: "$Y_DT",
            Z: "$Z"
        },
        adj: {"$sum": "$adj"},
        bjc: {"$sum": "$bjc"},
        ...
    }
} 

Any non numeric value will be converted to 0 for the purposes of the $sum operation.

1 Like

Thank @Doug_Duncan, I could able to get the expected results…
with the following query…

 db.test650.aggregate(
{
    $group: {
        "_id": {
            X: "$X",
            Y_DT: "$Y_DT",
			Z: "$Z"
         },
		adj: {$sum: "$adj" },   
		bjc: {$sum: "$bjc" },
		jbc: {$sum: "$jbc" },
		mnk: {$sum: "$mnk"}
    }
 }
)

But in case of bigger collection with many number of fields, say like 500, it will be very complex to write all the filed names and sumup, is there any generic way we can write some thing like looping rest of the fields(other than grouping ones)?

Seems a bit crazy :slight_smile:

For the crazy request - crazy solution! :laughing:

db.test.aggregate([
 {
   $group: {
     // specify group-by fields here
     _id: {
       x: '$x',
       y: '$y',
       z: '$z',
     },
     docsEntries: {
       $push: {
         $objectToArray: '$$CURRENT',
       },
     },
   },
 },
 {
   $addFields: {
     docKeysList: {
       $map: {
         input: {
           $arrayElemAt: ['$docsEntries', 0],
         },
         in: '$$this.k',
       },
     },
   },
 },
 {
   $addFields: {
     // gather fields, values of which we will sum-up
     filteredKeyList: {
       $filter: {
         input: '$docKeysList',
         cond: {
           $not: {
             // specify ignored fields, that you do not want to calculate
             // at minimum, here should be _id and grouping keys
             $in: ['$$this', ['_id', 'x', 'y', 'z']],
           },
         },
       },
     },
   },
 },
 {
   $addFields: {
     // collect all entries (values of same key from all docs in the group)
     // in the single array
     groupedEntries: {
       $map: {
         input: '$filteredKeyList',
         as: 'filteredKey',
         in: {
           $reduce: {
             input: '$docsEntries',
             initialValue: [],
             in: {
               $let: {
                 vars: {
                   targetDocEntry: {
                     $filter: {
                       input: '$$this',
                       as: 'docEntry',
                       cond: {
                         $eq: ['$$docEntry.k', '$$filteredKey'],
                       },
                     },
                   },
                 },
                 in: {
                   $concatArrays: ['$$value', '$$targetDocEntry'],
                 },
               },
             },
           },
         },
       },
     },
   },
 },
 {
   $addFields: {
     calculatedEntries: {
       $map: {
         // we need to ned to return { k, v } for each key, so we can
         // transform it to single object with custom prop names
         input: '$groupedEntries',
         as: 'groupedEntry',
         in: {
           k: {
             $let: {
               vars: {
                 item: {
                   $arrayElemAt: ['$$groupedEntry', 0],
                 },
               },
               in: {
                 // here the custom prop name is calculated
                 // feel free to change the logic if needed
                 $concat: ['total_', '$$item.k'],
               },
             },
           },
           v: {
             $reduce: {
               input: '$$groupedEntry',
               initialValue: 0,
               in: {
                 $add: ['$$value', {
                   $convert: {
                     input: '$$this.v',
                     to: 'double',
                     // Change NaN to 0 for onError prop,
                     // if your props can contain various value types
                     // and you cant to calculate only number values
                     // for every document field
                     onError: NaN,
                     onNull: 0,
                   },
                 }],
               },
             },
           },
         },
       },
     },
   },
 },
 {
   $addFields: {
     results: {
       $arrayToObject: '$calculatedEntries',
     },
   },
 },
 {
   $project: {
     results: true,
   },
 },
]).pretty();

And for the following documents:

db.test.insertMany([
  { x: 1, y: 1, z: 1, propA: 5, propB: '5', propC: 't5' },
  { x: 1, y: 1, z: 2, propA: 10, propB: '10', propC: 't10' },
  { x: 1, y: 1, z: 2, propA: 15, propB: '15', propC: 't15' },
]);

The aggregation will return:

[
  {
    "_id" : {
      "x" : 1,
      "y" : 1,
      "z" : 1
    },
    "results" : {
      "total_propA" : 5,
      "total_propB" : 5,
      "total_propC" : NaN
    },
  },
  {
    "_id" : {
      "x" : 1,
      "y" : 1,
      "z" : 2
    },
    "results" : {
      "total_propA" : 25,
      "total_propB" : 25,
      "total_propC" : NaN
    },
  },
]

The solution is not ideal and can be improved, but that is another crazy story :sweat_smile:

I made this just to proove, that this is possible with Mongo v4.2.