Get indexed field values (cached) without reading whole documents from disk

I have a collection that contains documents with _id, indexed_attributes and big_data_blob. The index size is small enough to fit into the cache but all database does not. I use projection to get just the document _id and indexed_attributes field values with db.myobjects.find({},{"indexed_attributes":1}). Looking at db.serverStatus().wiredTiger.cache["bytes read into cache"] shows that every time whole documents are read into cache and therefore the query takes more time than it would need to just read the indexed field.

Is there some other method to read the content of the indexed fields more efficiently?

Hi @Tomaz_Beltram and welcome back :smiley: !

Looks like you are trying to make a covered query.

Can you please confirm which indexes exist in this collection?

If you really want to know what mongod is doing to resolve your query, you can use explain. If you have a covered query, then you won’t see a FETCH stage in your winning plan.

To make this query db.myobjects.find({},{"indexed_attributes":1}) covered, you would need to have the index {'indexed_attributes': 1, '_id': 1} because you are returning both these values here as ‘_id’ is present by default and needs to be explicitly removed in the projection if you don’t want it. But if you do return it, it needs to be in your index to make it a covered query. You will probably also need to add a filter using your index in the query to trigger the use of the index.

This example is a bit silly but…

db.coll.find({b: 1},{b: 1, _id:0}).explain()

This is what it should look like in the explain plan:

"winningPlan" : {
			"stage" : "PROJECTION_COVERED",
			"transformBy" : {
				"b" : 1,
				"_id" : 0
			},
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"b" : 1,
					"_id" : 1
				},
				"indexName" : "b_1__id_1",
				"isMultiKey" : false,
				"multiKeyPaths" : {
					"b" : [ ],
					"_id" : [ ]
				},
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 2,
				"direction" : "forward",
				"indexBounds" : {
					"b" : [
						"[1.0, 1.0]"
					],
					"_id" : [
						"[MinKey, MaxKey]"
					]
				}
			}

I hope this helps.

Cheers,
Maxime.

1 Like

Hi Maxime,
Thanks for your quick reply and pointing me to the covered query. The _id and indexed_attributes fields are both indexed in my database. In may query the filter was missing and therefore COLLSCAN was used.

"winningPlan" : {
  "stage" : "PROJECTION_SIMPLE",
  "transformBy" : {
    "indexed_attributes" : 1,
    "_id" : 0
  },
  "inputStage" : {
    "stage" : "COLLSCAN",
    "direction" : "forward"
  }
}

If I add a filter to limit the query then I get stage PROJECTION_COVERED. However I noticed that it uses FETCH instead of index also in case that the filter matches all documents, e.g. {“indexed_attributes”: {$exists: true}}. Thanks again for your help.
wbr Tomaz

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.