Index Usage for Update Query Performance

Amalka_Pathirage · February 8, 2021, 5:33pm

I’m looking to execute an update query in a Production Mongo DB. Version 3.0.14. DB cluster with Primary and 4 Replicas with i3.2xlarge.
There are 2.2 M records to be updated. Can this have a considerable performer impact on the DB instances .

Following are the indexes.

 "Type" : 1.0,

"app" : 1.0,

"event" : 1.0,

"createdAt" : -1.0

This is a similar sample query

db.getCollection('<collection name>').update(

{"Type" : "<Type>",

    "app" : "<App>",

    "event" : "<testEvent>",

    $and:[{"createdAt":{$gte: ISODate("2020-10-26T00:00:00.000Z")}},

    {"createdAt":{$lte: ISODate("2020-12-26T00:00:00.000Z")}}]},

{$set:{"isRead" : true}},

{multi:true}

)

Prasad_Saya · February 8, 2021, 6:28pm

Hello @Amalka_Pathirage, welcome to the MongoDB Community forum.

The update operation has a query filter with multiple fields. To access the documents efficiently based upon these fields a Compound Index is required. Indexing on individual fields (single field indexes) wont be of that much use - at most one or maybe even two indexes will be used.

You can verify if the query uses an index (or indexes) by generating a query plan using the db.collection.explain on the update method. For the details of the explain’s output see Explain Results.

Also see Indexing Strategies. This has multiple topics, and Create Queries that Ensure Selectivity is a topic of interest, that how you can organize the fields in a compound index. With a large dataset, a compound index with four fields occupies space on disk and memory (RAM) while in operation - see the topic Ensure Indexes Fit in RAM.

I suggest you try some indexing (as suggested above) and tests in a test environment with test datasets and see what the explain results show. Based upon the findings you can figure what indexing will be of use.