Question regarding large transaction and limited WiredTiger cache size

Hello devs, we have a use case where we insert thousands (>20000) of documents (each one small, <1kb) into the same collection within one transaction. Setup is a one-node replica set.

We made the following observations:

  • If the cache size is quite small, mongotop shows that insert activity stops once the default dirty-cache eviction trigger (20% of the WiredTiger cache size) is reached, and MongoDB consumes a lot of CPU. Sometimes this finishes after a few minutes, sometimes it runs for a long time. During that time almost no other DB queries are handled, so the system is effectively blocked.
  • If the cache size is large enough, the transaction runs smoothly in a few seconds.

Questions:

  1. As written in https://www.mongodb.com/blog/post/performance-best-practices-transactions-and-read--write-concerns transactions with more than 1000 documents should be avoided but we want to ensure DB consistency. Are we overlooking a possibility to keep the insert consistent with multiple transactions (and not needing to implement a manual rollback)?
  2. The required portion of the cache size seems to be significantly larger than the data to be inserted, especially if the collection already contains some data. Is it related to the index size? Do you have any idea how to calculate the required cache size beforehand? Then we could adjust the cache size or forbid the insert.
  3. Do you have other ideas how to handle this use case?

Hi @fran_28,

Welcome to the MongoDB community!

The WiredTiger cache is the main component that translates the block-level disk representation into the in-memory structure on which your queries and CRUD operations run.

By default it takes around 50% of the machine's RAM, which is sufficient for most use cases and should not be changed. One reason is that the filesystem cache holds the data in its compressed on-disk format, and leaving enough room for it generally gives better access than handing more and more memory to WT.

The engine tries to keep the cache under 80% full and the dirty portion under 5%. Once the dirty cache reaches 20%, application threads are busy evicting cache rather than serving queries, which is why your instance almost halts.
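
To make these percentages concrete, here is a minimal sketch that turns them into byte limits. The 5% and 20% values are the ones quoted above; the default cache formula (the larger of 50% of (RAM - 1 GB) and 256 MB) is taken from the MongoDB documentation as I understand it, and the function names are hypothetical helpers, not part of any MongoDB API.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wt_cache_bytes(ram_bytes):
    """Assumed default cache size: larger of 50% of (RAM - 1 GiB) and 256 MiB."""
    return max((ram_bytes - GIB) // 2, 256 * MIB)


def dirty_thresholds(cache_bytes, dirty_target_pct=5, dirty_trigger_pct=20):
    """Convert the dirty-cache percentages into byte limits (rounded down)."""
    return {
        "dirty_target_bytes": cache_bytes * dirty_target_pct // 100,
        "dirty_trigger_bytes": cache_bytes * dirty_trigger_pct // 100,
    }


if __name__ == "__main__":
    # Example: a machine with 4 GiB of RAM.
    cache = default_wt_cache_bytes(4 * GIB)
    limits = dirty_thresholds(cache)
    print(f"cache size:    {cache / MIB:.1f} MiB")
    print(f"dirty target:  {limits['dirty_target_bytes'] / MIB:.1f} MiB")
    print(f"dirty trigger: {limits['dirty_trigger_bytes'] / MIB:.1f} MiB")
```

The point of the arithmetic is that the dirty trigger is only a fifth of the cache, so a single large transaction has far less headroom than the configured cache size suggests.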

I would say you need to find the resource that causes this to reach 20% and scale it (disk, RAM, CPU) rather than increasing the cache.

Transactions do come with a price, as the mechanics to isolate reads are expensive and require extra cache. Keeping transactions small or throttling your transaction rate should ease that.

Consider testing different isolation levels as well and maybe combine documents into single objects.

Best
Pavel

Hi @Pavel_Duchovny,

thank you for your detailed answer. My follow-up questions are:

  1. I have seen that we can set WT parameters at runtime. In case we know that a large transaction is starting, is it safe to temporarily increase the “eviction_dirty_trigger” property to 40 or 60 and reset it after the transaction is done?
  2. Can we better estimate if a transaction will fit into the cache or not? As mentioned in point 2. of my previous post, I have the impression that it makes a difference if running the same transaction on an empty collection or one that already contains millions of documents. Does a part of the index also need to fit into the dirty part of the cache while the transaction is running?

Hi @fran_28,

As I mentioned before, changing those internal values is not recommended if you can scale the environment or tune your workload.

Those values are set this way to guard you from driving your database to places where it can abruptly stop or get corrupted. Playing with them without a deep inspection by MongoDB engineering might yield unexpected results.

The best way to tune your workload is by load testing and trying various write concerns and read isolation levels. Lowering the number of documents per transaction should not lower your consistency if you implement retry logic.
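
As a rough illustration of smaller transactions with retry logic, the sketch below splits the documents into batches and retries each batch a few times. `insert_in_batches` and the `insert_batch` callable are hypothetical names; in a real setup `insert_batch` might wrap a driver call such as PyMongo's `session.with_transaction(...)`, which is an assumption about your stack. Note that batches committed before a permanent failure stay committed, so the caller still has to be able to resume or clean up.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def insert_in_batches(docs, insert_batch, batch_size=1000,
                      max_retries=3, backoff_s=0.1):
    """Insert `docs` batch by batch, retrying each failed batch.

    `insert_batch` is a hypothetical callable that writes one batch inside
    its own transaction and raises an exception if the transaction fails.
    Returns the number of documents written.
    """
    written = 0
    for batch in chunked(docs, batch_size):
        for attempt in range(1, max_retries + 1):
            try:
                insert_batch(batch)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up; earlier batches remain committed
                time.sleep(backoff_s * attempt)  # simple linear backoff
        written += len(batch)
    return written


if __name__ == "__main__":
    # Stand-in for a real transactional insert: just collects the documents.
    store = []
    total = insert_in_batches([{"n": i} for i in range(2500)], store.extend)
    print(total, len(store))
```

The batch size of 1000 follows the guideline from the blog post linked in the first question; the retry count and backoff are arbitrary placeholders to tune under load testing.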

Best
Pavel
