MongoDB update operation performance

Update performance varies significantly depending on whether the field already holds the value being set: repeating an update with the same value is much faster than the first run. MongoDB version: 4.2.6

Run 1:

no. of documents: 1 million

time taken to update 1M documents: 13m31.227s

db.test.find().forEach(function(doc){db.test.update({_id:doc._id}, {$set:{counter: 400}})})

Run 2:

no. of documents: 1 million

time taken to update 1M documents: 7m41.080s

db.test.find().forEach(function(doc){db.test.update({_id:doc._id}, {$set:{counter: 400}})})

Run 3: After restarting mongod and clearing buff/cache manually

no. of documents: 1 million

time taken to update 1M documents: 7m41.080s

db.test.find().forEach(function(doc){db.test.update({_id:doc._id}, {$set:{counter: 400}})})

Run 4: Setting counter to new value

no. of documents: 1 million

time taken to update 1M documents: 13m44.284s

db.test.find().forEach(function(doc){db.test.update({_id:doc._id}, {$set:{counter: 500}})})

Run 5:

no. of documents: 1 million

time taken to update 1M documents: 7m42.356s

db.test.find().forEach(function(doc){db.test.update({_id:doc._id}, {$set:{counter: 500}})})
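
To put the runs side by side, the wall-clock times can be converted to seconds and a per-document cost. This is only a minimal sketch in plain JavaScript (it should run in the mongo shell or Node); the helper name `toSeconds` is illustrative, and the inputs are the Run 1 and Run 2 timings listed above.

```javascript
// Convert a `time`-style duration such as "13m31.227s" into seconds.
function toSeconds(duration) {
  var m = /^(\d+)m([\d.]+)s$/.exec(duration);
  if (!m) {
    throw new Error("unexpected duration format: " + duration);
  }
  return parseInt(m[1], 10) * 60 + parseFloat(m[2]);
}

var DOCS = 1000000;                     // documents updated per run

var fresh = toSeconds("13m31.227s");    // Run 1: counter set for the first time
var repeat = toSeconds("7m41.080s");    // Run 2: counter already holds the value

var freshMsPerDoc = fresh / DOCS * 1000;    // roughly 0.81 ms per update
var repeatMsPerDoc = repeat / DOCS * 1000;  // roughly 0.46 ms per update
var ratio = fresh / repeat;                 // roughly 1.76x slower on the fresh run
```

So the repeat runs are not free: each update still costs about half a millisecond of round trip, and the fresh runs add roughly 0.35 ms per document on top of that.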

Does MongoDB perform additional checks when setting the value of a field? What could cause such a performance difference for the same update operation?

Hi @astro,

That's a very interesting observation.

I assume that since the document stored in WiredTiger does not actually need to change, nothing has to be written to disk for it. That shortens the overall acknowledgement time, which lets the client issue its updates faster.

Having said that, I haven’t analysed the code to verify this, so it’s just an educated guess…
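
One way to check this from the shell is to tally what each update reports back. A rough sketch, assuming the legacy mongo shell, where `update()` returns a WriteResult with `nMatched` and `nModified` fields; `tallyUpdates` is just an illustrative name, not an existing helper.

```javascript
// Re-run the same per-document update and tally what each WriteResult reports.
// `coll` is a collection handle such as db.test (assumption: legacy mongo shell).
function tallyUpdates(coll, value) {
  var totals = { matched: 0, modified: 0 };
  coll.find().forEach(function (doc) {
    var res = coll.update({ _id: doc._id }, { $set: { counter: value } });
    totals.matched += res.nMatched;    // documents the filter selected
    totals.modified += res.nModified;  // documents the server reports as changed
  });
  return totals;
}

// In the mongo shell, for example:
//   printjson(tallyUpdates(db.test, 400));
```

If `modified` stays well below `matched` on a repeat run, the server is treating those updates as no-ops.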

Let me know if that makes sense.

Best
Pavel

Thanks, @Pavel_Duchovny.

This has helped. In the fresh run I see a lot of update calls making it into the logs (above 100ms), but there are only a few in the logs when the field value is already set.

If writing to disk is skipped in the second run, shouldn’t there be no calls in the logs at all?

@astro,

The log records every command whose execution time exceeds 100ms, regardless of what it does in the storage engine.

Therefore, those commands are less likely to cross 100ms than updates that set a new value, but some still do, which is why you see a few in the logs. You will probably see that the document was modified, as the query layer did update it from its standpoint.

Best
Pavel