How to query aggregate to get result only object instead array?

I have Transaction collection like this:

{
  _id: ObjectId("5f1284078a7dd8a6b9140c97")
  company: ObjectId("5f127ce1e354f37df698f55e")
  person: ObjectId("5f1284078a7dd8a6b9140c95")
  actionTransaction: "Purchase"
}

I have Companies like this :

{
  _id: ObjectId("5f127ce1e354f37df698f55e")
  name:  "MongoDB Inc"
  phoneNumber: "+1111111111"
  members: [
              {
                  _id: ObjectId("5f1284078a7dd8a6b9140c95")
                  title: "CEO"
                  role: "ADMIN"
              },
              {
                  _id: ObjectId("5f1284078a7ff8a6b9300d93")
                  title: "CHAIRMAN"
                  role: "Others"
              }
          ]
}

and I have ‘People’ collection like this:

{
   _id: ObjectId("5f1284078a7dd8a6b9140c95")
   name: "Foo Bar"
   gender: "Male"
}

when I do query like this:

db.Transaction.aggregate(
  [
      {  $lookup:
        {
          from: "companies",
          localField: "company",
          foreignField: "_id",
          as: "company"
        }
      },
      {  $lookup:
        {
          from: "people",
          localField: "person",
          foreignField: "_id",
          as: "person"
        }
      },
  ]
)

I got result an Array like this :

[
   _id: ObjectId("5f1284078a7dd8a6b9140c97")
   company: [
       {
         _id: ObjectId("5f127ce1e354f37df698f55e")
         name:  "MongoDB Inc"
        phoneNumber: "+1111111111"
        members: [
              {
                  _id: ObjectId("5f1284078a7dd8a6b9140c95")
                  title: "CEO"
                  role: "ADMIN"
              },
              {
                  _id: ObjectId("5f1284078a7ff8a6b9300d93")
                  title: "CHAIRMAN"
                  role: "Others"
              }
           ]
        }
   ],
   person: [
         {
           _id: ObjectId("5f1284078a7dd8a6b9140c95")
           name: "Foo Bar"
           gender: "Male"
         }
    ]
]

you can see each relations there the result on that fields are Array,
and I really one also on that Person field on Transaction Collection gonna get the role and title which from Company but I have no idea to filter it into that Person field

this is the illustration the result I want just an Object:

// Transaction
{
  company : { COMPANY_DATA },
  transactions: [
       member: {
                  _id: ObjectId("5f1284078a7dd8a6b9140c95")
                  name: "Foo Bar"
                  gender: "Male"
                  title: "CEO"
                  role: "ADMIN"
               },
       actionTransaction: "Purchase"

 ]

}

is it possible doing this ? any idea for doing it to possibble??

1 Like

Considering the output, it looks like you want to have an array of transactions, that were made by some company members, grouped by company.

I have extended your aggregation and now it gives the exact output as you’ve described:

db.transactions.aggregate([
  {
    $lookup: {
      from: 'companies',
      localField: 'company',
      foreignField: '_id',
      as: 'company'
    }
  },
  {
    $lookup: {
      from: 'people',
      localField: 'person',
      foreignField: '_id',
      as: 'person'
    }
  },
  // the following two $unwind stages
  // used to transform array into object
  {
    $unwind: '$person',
  },
  {
    $unwind: '$company',
  },
  {
    // find a company member, that matches
    // to a given person
    $addFields: {
      matchedMember: {
        $arrayElemAt: [{
          $filter: {
            input: '$company.members',
            cond: {
              $eq: ['$$this._id', '$person._id'],
            }
          },
        }, 0]
      }
    }
  },
  {
    $addFields: {
      person: {
        $mergeObjects: ['$person', '$matchedMember'],
      }
    }
  },
  {
    // group all transactions of members,
    // that belong to a certain company
    $group: {
      _id: '$company._id',
      company: {
        $first: {
          // select fields, that you want to have
          // it the output
          _id: '$company.id',
          name: '$company.name',
          phoneNumber: '$company.phoneNumber'
        },
      },
      transactions: {
        $push: {
          member: '$person',
          actionTransaction: '$actionTransaction',
        }
      }
    }
  },
  {
    // cleanup
    $project: {
      _id: false,
    }
  }
]).pretty();

Have a look at $mergeObjects, $filter pipeline operators and $group, $unwind pipeline stages for more details.

Keep in mind, if you do not limit your query, the aggregation will tend to be slower and slower with each document, inserted into ‘transactions’ collection. Alternatively, you may want to denormalize your transaction documents like this:

db.transactions.insertOne({
  _id: ObjectId("5f1284078a7dd8a6b9140c97"),
  company: {
    _id: ObjectId("5f127ce1e354f37df698f55e"),
    name:  "MongoDB Inc",
    phoneNumber: "+1111111111",
  },
  member: {
    // this member._id may be used to join data 
    // from 'people' collection
    _id: ObjectId("5f1284078a7dd8a6b9140c95"),
    title: "CEO",
    role: "ADMIN"
  },
  actionTransaction: "Purchase"
});

Or even like this:

db.transactions.insertOne({
  _id: ObjectId("5f1284078a7dd8a6b9140c97"),
  company: {
    _id: ObjectId("5f127ce1e354f37df698f55e"),
    name:  "MongoDB Inc",
    phoneNumber: "+1111111111",
  },
  person: {
    _id: ObjectId("5f1284078a7dd8a6b9140c95"),
    name: "Foo Bar",
    gender: "Male",
    title: "CEO",
    role: "ADMIN"
  },
  actionTransaction: "Purchase"
});

This way you can use simpler queries and take advantage of indexes. It will make your queries much more faster, but you will have to take care about the data consistency across collections: if, say, person.name changes in one collection, you will need to update it in another one (immediately or with background process).

1 Like

yess, if I update data on organization or person, I should have to update both :(, that is why I make the collection transaction like that, it take more logic also to update on both collection :frowning: , so if my collection collection like that, it will be very slow ? :frowning: any option beside on your options above ??

also the People collection is use on everywhere on my mongo, , if user want to update, I should update many collections ?

btw the query u answer is not include role, and status of person, I want on transaction.person include their each roles, and status :smiley:

is that any option for this query ? because the title and role not recorded there ?

Hello, @Virtual_Database!

For the most cases, a collection is much more often being read than written. That means, that having denormalized data model will provide good benefits, even though, you will have to write more code to take care of the data consistency across collections. So, usually, it is worth it :wink:

To get the number of reads/writes - execute this in the shell:

db.your_collection.stats().wiredTiger.cursor

It will output this object (I removed and reordered some props for simplicity):

{
  ...
  "insert calls" : 2,
  "modify calls" : 0,
  "remove calls" : 0,
  "update calls" : 0,
  "search calls" : 6,
  ...
}

Roughly:
Number of reads = search calls.
Number of writes = insert+modify+update+remove calls.

Let’s assume, your have 3 collections:

db.players.insertMany([
  { _id: 'P1', name: 'Bob', age: 22, country: 'Australia' },
  { _id: 'P2', name: 'Bill', age: 26, country: 'New Zeland' },
]);

db.teams.insertMany([
  {
    _id: 'T1',
    name: 'team A',
    players: [
      { _id: 'P1', name: 'Bob' },
    ]
  }
]);

db.coaches.insertMany([
  { _id: 'C1', coachesPlayers: ['P1', 'P2']}
]);

Notice, that documents from ‘players’ collection are used in ‘teams’ and in ‘coaches’ collections.
‘coaches’ collection uses normalized model (only _id as reference to document from other collection is used);
‘teams’ collection uses de-normalized model (beside immutable _id there is also player’s name included);

The benefit of having ‘name’ prop beside ‘_id’ in ‘teams’ collection is that you may not need to join players to teams to get player names, which will improve performance. But you will need to make sure that all collections would have same value for player’s name.

  • If this ‘name’ is not super important you can synchronize it on schedule (once per 20 minutes or so, if you do plenty of writes).
  • If it is crucial to have the most recent player’s name everywhere - you would need to use transactions. You updates will be slower, but if you do not update your fields very often - you may not notice a big difference :wink:

Now, a bit about consistency maintenance:

  • If you will need to update player’s name - you will need to update 2 collections: ‘teams’ and ‘players’, but not ‘coaches’.
  • If you will need to update player’s age - you will need to update only document in ‘players’ collection. No need to touch others, because they do not use player’s age in their documents.

But, it worth to day, that with MongoDB, you will have to manually take care of the data consistency, if you have 100% normalized data. For example, if you delete any player from ‘players’ collection, you will have to update 3 collections, because they use a bit or more of player’s data!

Yes, you can take the free course on the MongoDB’s aggregation framework and try to extend and optimize your pipeline.