Hi Stennie
So it’s release 4.4 I’m after. I’m not after a fix for general sorts running very slowly; sorts on find work perfectly well. The bug I mentioned relates to aggregate queries not selecting the correct index to sort on, which results in the entire collection being reloaded (AIUI) and completely explains the problems we are having. I really just need to know when the next release of Mongo is scheduled, as I assume the referenced bugs will be addressed in it.
We have implemented a workaround for now, as recommended in a bug linked to 7568 (https://jira.mongodb.org/browse/SERVER-21471), by replacing our aggregate call with find. This is not a viable long-term option without a lot of code upheaval, which I would rather not have to do.
The issue manifests itself (at least for us) on a collection of 34,000,000 documents. Weirdly, if we run an aggregate query that returns many documents (8,782,333 to be precise), the query completes in approx. 8 seconds. But if we then run the same query against a different ObjectId, such that it returns a very small number of documents (27), it takes 2 minutes to return.
We have a collection containing documents that reference other objects in an array of ObjectIds, e.g.:
{
    ...
    groups: [
        ObjectId("5ce283422ab79c000f9040f5"),
        ObjectId("5e9d01c5a5db2000075764fe")
    ]
    ...
}
and we wish to run a query that returns every document that has a specific ObjectId in the groups array:
db.collection.aggregate([ { $match : { $and: [ { groups: ObjectId("5ce283422ab79c000f9040f5") } ] } }, {$sort: {_id: 1}} ] )
8,782,333 documents
7.92 seconds
db.collection.aggregate([ { $match : { $and: [ { groups: ObjectId("5e9d01c5a5db2000075764fe") } ] } }, {$sort: {_id: 1}} ] )
27 documents
127 seconds
Now I know that this is not a generic sort issue, as the following equivalent find queries demonstrate:
db.collection.find({ $and: [ {groups: ObjectId("5ce283422ab79c000f9040f5")} ]}).sort({_id: 1})
8,782,333 documents
5.13 seconds
db.collection.find({ $and: [ {groups: ObjectId("5e9d01c5a5db2000075764fe")} ]}).sort({_id: 1})
27 documents
0.18 seconds
That is 127 seconds down to 0.18 seconds just by replacing the aggregate with the find, with both using sort.
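Since the difference appears to come down to index selection in aggregate, one thing that may be worth testing is the `hint` option of aggregate, which asks the planner to use a named index. This is only a sketch: it assumes an index with key pattern { groups: 1 } exists (the index definitions are not listed here), and the helper name buildGroupPipeline and the variable assumedHint are made up for illustration.

```javascript
// Hypothetical helper: builds the same $match + $sort pipeline used above
// for a given group id.
function buildGroupPipeline(groupId) {
  return [
    { $match: { groups: groupId } },
    { $sort: { _id: 1 } },
  ];
}

// ASSUMPTION: an index with this key pattern exists on the collection.
const assumedHint = { groups: 1 };

// Only runs inside the mongo shell, where `db` and `ObjectId` are defined.
if (typeof db !== "undefined") {
  const cursor = db.collection.aggregate(
    buildGroupPipeline(ObjectId("5e9d01c5a5db2000075764fe")),
    { hint: assumedHint }
  );
  print(cursor.itcount() + " documents");
}
```

Whether the hinted plan avoids the slow path would need to be checked with explain on the real data.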
MongoDB: 4.2, Atlas cluster, M30 tier, replica set, not sharded (yet)
‘Explain’ output for the 2 aggregate queries: I could see no discernible difference between the 2.
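To make two explain outputs easier to compare side by side, a small helper can reduce each winning plan to a list of stage names and index names. This is a minimal sketch: the function names summarizePlan and winningPlanOf are made up, and it assumes the explain result carries the plan either under a top-level `queryPlanner` or under a `$cursor` entry in `stages`, which may differ between server versions.

```javascript
// Recursively collect "STAGE" or "STAGE(indexName)" labels from a plan node,
// following the inputStage / inputStages links found in explain output.
function summarizePlan(node, out = []) {
  if (!node || typeof node !== "object") return out;
  if (node.stage) {
    out.push(node.indexName ? `${node.stage}(${node.indexName})` : node.stage);
  }
  if (node.inputStage) summarizePlan(node.inputStage, out);
  if (Array.isArray(node.inputStages)) {
    node.inputStages.forEach((child) => summarizePlan(child, out));
  }
  return out;
}

// Locate the winning plan in an aggregate explain result (assumed layouts).
function winningPlanOf(explain) {
  if (explain.queryPlanner) return explain.queryPlanner.winningPlan;
  const cursorStage = (explain.stages || []).find((s) => s.$cursor);
  return cursorStage ? cursorStage.$cursor.queryPlanner.winningPlan : null;
}

// Only runs inside the mongo shell, where `db` and `ObjectId` are defined.
if (typeof db !== "undefined") {
  ["5ce283422ab79c000f9040f5", "5e9d01c5a5db2000075764fe"].forEach((id) => {
    const explain = db.collection.explain("queryPlanner").aggregate([
      { $match: { $and: [{ groups: ObjectId(id) }] } },
      { $sort: { _id: 1 } },
    ]);
    print(id + ": " + summarizePlan(winningPlanOf(explain)).join(" > "));
  });
}
```

If both queries print the same chain and it names the _id index rather than an index on groups, that would be consistent with the sort index being preferred over the match index.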