Peng_Huang
(Peng Huang)
December 14, 2020, 11:42pm
1
Data:
student class
s1 ["english"]
s2 ["maths"]
The field class is indexed, and values of it are all in array.
Query1:
db.students.explain('executionStats').aggregate([
{ $match: { $expr: { $setIsSubset: [ "$class", [ "english" ] ] } } }
])
,
Query2:
db.students.explain('executionStats').aggregate([
{ $match: { "class": "english" } }
])
The first one does a COLLSCAN, while the second does a IXSCAN with indexBounds class : [\"maths\", \"maths\"]
.
Wondering if there is anyway to achieve the setIsSubset functionality, which can utilize the index efficiently.
1 Like
Peng_Huang
(Peng Huang)
December 29, 2020, 10:03pm
2
Can someone please answer this question? Thank you
Stennie_X
(Stennie)
December 30, 2020, 1:25am
3
Hi @Peng_Huang ,
The $expr
operator does not support multikey indexes, so isn’t a good choice for the first step of this aggregation pipeline by itself.
However, you can include multiple $match
stages in your pipeline. Combining your two stages would result in $expr
only being applied to documents matched via the index:
db.students.aggregate([
{ $match: { "class": "english" } },
{ $match: { $expr: { $setIsSubset: [ "$class", [ "english" ] ] } } }
])
If you look at the explain()
results, you’ll notice the query planner optimises these two matches into a single query with $expr
used as a filter.
Excerpt from explain() output for the above query
parsedQuery: {
'$and': [
{ class: { '$eq': 'english' } },
{ '$expr': { '$setIsSubset': [ '$class', [Object] ] } }
]
},
winningPlan: {
stage: 'FETCH',
filter: {
'$expr': { '$setIsSubset': [ '$class', { '$const': [Array] } ] }
},
inputStage: {
stage: 'IXSCAN',
keyPattern: { class: 1 },
indexName: 'class_1',
isMultiKey: true,
multiKeyPaths: { class: [ 'class' ] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { class: [ '["english", "english"]' ] }
}
},
Regards,
Stennie