Slow Aggregate for large collection

Aggregate below is very slow (30 minutes). Any tips?

{
“command”: {
“aggregate”: “contacts”,
“pipeline”: [
{
“$match”: {
“isDeleted”: false,
“tenant_id”: {
“$oid”: “5ec2a723a73af34fd5964c93”
},
“$or”: [
{
“emails”: {
“$exists”: true,
“$not”: {
“$size”: 0
}
}
},
{
“cellphones”: {
“$exists”: true,
“$not”: {
“$size”: 0
}
}
}
]
}
},
{
“$lookup”: {
“from”: “events”,
“let”: {
“cId”: “$_id”
},
“pipeline”: [
{
“$match”: {
“$expr”: {
“$and”: [
{
“$eq”: [
“$contact_id”,
“$$cId”
]
}
]
}
}
},
{
“$group”: {
“_id”: “$contact_id”,
“60900d520548ab3d4f3c657f”: {
“$sum”: {
“$cond”: [
{
“$and”: [
{
“$eq”: [
“$channel”,
“form”
]
},
{
“$eq”: [
“$event”,
“submitted”
]
},
{
“$eq”: [
“$form_id”,
{
“$oid”: “6000194aae91a5ea3fa9975b”
}
]
}
]
},
1,
0
]
}
}
}
}
],
“as”: “events”
}
},
{
“$unwind”: {
“path”: “$events”,
“preserveNullAndEmptyArrays”: true
}
},
{
“$addFields”: {
“60900d520548ab3d4f3c657f”: “$events.60900d520548ab3d4f3c657f”
}
},
{
“$project”: {
“events”: 0
}
},
{
“$lookup”: {
“from”: “events”,
“let”: {
“cId”: “$_id”
},
“pipeline”: [
{
“$match”: {
“$expr”: {
“$and”: [
{
“$eq”: [
“$contact_id”,
“$$cId”
]
}
]
}
}
},
{
“$group”: {
“_id”: “$contact_id”,
“60900d520548ab80b13c657e”: {
“$sum”: {
“$cond”: [
{
“$and”: [
{
“$eq”: [
“$channel”,
“form”
]
},
{
“$eq”: [
“$event”,
“submitted”
]
},
{
“$eq”: [
“$form_id”,
{
“$oid”: “604f5890d5460d89d93949df”
}
]
}
]
},
1,
0
]
}
}
}
}
],
“as”: “events”
}
},
{
“$unwind”: {
“path”: “$events”,
“preserveNullAndEmptyArrays”: true
}
},
{
“$addFields”: {
“60900d520548ab80b13c657e”: “$events.60900d520548ab80b13c657e”
}
},
{
“$project”: {
“events”: 0
}
},
{
“$lookup”: {
“from”: “events”,
“let”: {
“cId”: “$_id”
},
“pipeline”: [
{
“$match”: {
“$expr”: {
“$and”: [
{
“$eq”: [
“$contact_id”,
“$$cId”
]
}
]
}
}
},
{
“$group”: {
“_id”: “$contact_id”,
“60900d520548ab3cd73c657d”: {
“$sum”: {
“$cond”: [
{
“$and”: [
{
“$eq”: [
“$channel”,
“form”
]
},
{
“$eq”: [
“$event”,
“submitted”
]
},
{
“$eq”: [
“$form_id”,
{
“$oid”: “604a0401f4ac575280617d95”
}
]
}
]
},
1,
0
]
}
}
}
}
],
“as”: “events”
}
},
{
“$unwind”: {
“path”: “$events”,
“preserveNullAndEmptyArrays”: true
}
},
{
“$addFields”: {
“60900d520548ab3cd73c657d”: “$events.60900d520548ab3cd73c657d”
}
},
{
“$project”: {
“events”: 0
}
},
{
“$lookup”: {
“from”: “events”,
“let”: {
“cId”: “$_id”
},
“pipeline”: [
{
“$match”: {
“$expr”: {
“$and”: [
{
“$eq”: [
“$contact_id”,
“$$cId”
]
}
]
}
}
},
{
“$group”: {
“_id”: “$contact_id”,
“60900d520548abc18b3c657c”: {
“$sum”: {
“$cond”: [
{
“$and”: [
{
“$eq”: [
“$channel”,
“form”
]
},
{
“$eq”: [
“$event”,
“submitted”
]
},
{
“$eq”: [
“$form_id”,
{
“$oid”: “607f0e6c21f8060d785195cd”
}
]
}
]
},
1,
0
]
}
}
}
}
],
“as”: “events”
}
},
{
“$unwind”: {
“path”: “$events”,
“preserveNullAndEmptyArrays”: true
}
},
{
“$addFields”: {
“60900d520548abc18b3c657c”: “$events.60900d520548abc18b3c657c”
}
},
{
“$project”: {
“events”: 0
}
},
{
“$match”: {
“$or”: [
{
“60900d520548ab3d4f3c657f”: {
“$gte”: 1
}
},
{
“60900d520548ab80b13c657e”: {
“$gte”: 1
}
},
{
“60900d520548ab3cd73c657d”: {
“$gte”: 1
}
},
{
“60900d520548abc18b3c657c”: {
“$gte”: 1
}
}
]
}
},
{
“$project”: {
“_id”: 1
}
},
{
“$count”: “count”
},
{
“$unwind”: “$count”
}
],
“allowDiskUse”: true,
“cursor”: {},
“lsid”: {
“id”: {
“$binary”: “5iRNOC0dSTaKQySLPvB5uQ==”,
“$type”: “03”
}
},
“$clusterTime”: {
“clusterTime”: {
“$timestamp”: {
“t”: 1620056396,
“i”: 6
}
},
“signature”: {
“hash”: {
“$binary”: “PHh4eHh4eD4=”,
“$type”: “00”
},
“keyId”: {
“$numberLong”: “6902231878846119939”
}
}
},
“$db”: “production”
},
“planSummary”: [
{
“IXSCAN”: {
“isDeleted”: 1,
“tenant_id”: 1,
“cookies”: 1
}
}
],
“numYields”: 1286555,
“queryHash”: “B03447D9”,
“planCacheKey”: “C5207446”,
“ok”: 0,
“errMsg”: “Error in $cursor stage :: caused by :: operation was interrupted because a client disconnected”,
“errName”: “ClientDisconnect”,
“errCode”: 279,
“reslen”: 311,
“locks”: {
“ReplicationStateTransition”: {
“acquireCount”: {
“w”: 4124187
}
},
“Global”: {
“acquireCount”: {
“r”: 4124187
}
},
“Database”: {
“acquireCount”: {
“r”: 4124187
}
},
“Collection”: {
“acquireCount”: {
“r”: 4124188
}
},
“Mutex”: {
“acquireCount”: {
“r”: 2837632
}
}
},
“protocol”: “op_msg”,
“millis”: 1800325
}

Hi @Admin_MlabsPages_mLa,

This aggregation is very complex and have lots of inefficient operators:

  1. Many $lookups with expressive joins
  2. Non selective matches like $or
  3. $groups and $unwinds with $addfields
  4. Many stages in general.

I don’t know why you expect it to be performant or why do you need such a complex aggregation.

Usually it indicates on a normalised schema design not fitted for MongoDB where you have to join many relationships to get to an application like document.

Unfortunately, the only recommendations I might have is to have to do an index on

tenant_id : 1,
IsDeleted : 1,
emails : 1,
cellphones : 1,
_id : 1

And index any lookup related fields, but I don’t expect a dramatic change :frowning:

Thanks
Pavel