Insert latency in replica set

We have two clusters, 800-802 and 9000-9002, each with a PSS (primary-secondary-secondary) architecture, that receive a heavy influx of writes (~25k messages/sec) into dated schemas. Applications are facing serious write latency: 800 is about 200 million records behind and 9000 is about 100 million records behind.
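To put the backlog figures in perspective, here is a rough back-of-the-envelope sketch of how long each cluster would need to catch up. The ingest rate and backlog sizes come from the numbers above (assuming the ~25k messages/sec applies to each cluster); the sustained write rate of 30k/sec is purely a hypothetical placeholder, not something measured on these clusters.

```python
def drain_time_seconds(backlog, ingest_rate, write_rate):
    """Seconds needed to clear `backlog` records while new records keep arriving."""
    headroom = write_rate - ingest_rate  # records/sec available for catching up
    if headroom <= 0:
        return float("inf")  # the backlog never shrinks without spare capacity
    return backlog / headroom


INGEST_RATE = 25_000         # messages/sec, from the post
ASSUMED_WRITE_RATE = 30_000  # hypothetical sustained insert rate after tuning

for name, backlog in (("800", 200_000_000), ("9000", 100_000_000)):
    seconds = drain_time_seconds(backlog, INGEST_RATE, ASSUMED_WRITE_RATE)
    print(f"cluster {name}: about {seconds / 3600:.1f} hours to catch up")
```

The point of the sketch is that the catch-up time depends on the headroom above the ingest rate, not on the raw write rate, so even a modest improvement matters only if it pushes sustained throughput above ~25k/sec.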

· 800-802 - OS: Linux sles12sp3, SATA SSD, CPU: Ivy Bridge, RAM: 250GB, Disk: 4TB (2.5TB used), MongoDB v4.0.3, w:0 j:true

· 9000-9002 - OS: Linux sles12sp3, SATA SSD, CPU: Ivy Bridge, RAM: 250GB, Disk: 9TB (5TB used), MongoDB v4.0.3, w:0 j:false (although this setting is not safe or recommended, the app users would rather take that risk than have millions of records dropped due to insert latency)

My recommendation is to set inserts on both clusters to w:1 j:false. 800 may see better performance, since its writes would no longer wait on the on-disk journal, while 9000 is likely to take a further hit, since each write would now wait for acknowledgment from the primary.
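For reference, one way the w:1 j:false setting could be applied is through the connection string, using the standard `w` and `journal` URI options. This is only a sketch: the host names and replica set name below are made up for illustration, and the applications may well set the write concern per operation instead.

```python
from urllib.parse import urlencode


def build_mongo_uri(hosts, replica_set, w, journal):
    """Build a MongoDB connection string carrying an explicit write concern."""
    options = urlencode({
        "replicaSet": replica_set,
        "w": w,                           # acknowledgment level (1 = primary only)
        "journal": str(journal).lower(),  # URI options expect "true" / "false"
    })
    return f"mongodb://{','.join(hosts)}/?{options}"


# Hypothetical host names and replica set name for the 800-802 cluster.
uri = build_mongo_uri(
    ["db800:27017", "db801:27017", "db802:27017"],
    "rs800",
    w=1,
    journal=False,
)
print(uri)
# mongodb://db800:27017,db801:27017,db802:27017/?replicaSet=rs800&w=1&journal=false
```

Setting it in the URI makes the write concern the default for every client that uses that string, which keeps both clusters consistent without touching each insert call.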

Questions:

  1. Will switching to RAID10 provide a significant performance improvement even for SSDs?

  2. Will transferring the journal file to a different volume help?

  3. Given that this is a write-heavy application, are there any cache settings that can be adjusted?

In my experience, yes, RAID10 can give a significant performance boost even on SSDs. I've used RAID10 and RAID50 on SSD drives, and both have worked really well. I suggest you try it out: there are benchmarking tools you can use to fine-tune your setup, and then you can make an educated choice about which configuration to run in production.
Sorry, I don't have experience with this in a MongoDB-specific situation, so I can't help with 2 and 3. My gut feeling is that moving the journal to a different drive would help, but I haven't done that myself.
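As a starting point for that kind of comparison, here is a minimal sketch that times small synchronous writes in a given directory, so the same script can be pointed at the current data volume, a RAID10 volume, or a candidate journal volume. It is a crude stand-in, not a model of how MongoDB actually writes; a dedicated tool such as fio will give far more representative numbers. The write count and payload size are arbitrary assumptions.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies(directory, writes=200, size=4096):
    """Time `writes` appends of `size` bytes, each followed by an fsync."""
    payload = os.urandom(size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)  # scratch file on the target volume
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before continuing
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


# Replace the directory with a path on the volume you want to test.
samples = fsync_latencies(tempfile.gettempdir(), writes=50)
print(f"median: {statistics.median(samples) * 1000:.3f} ms")
print(f"max:    {max(samples) * 1000:.3f} ms")
```

Running it against each candidate volume under otherwise identical conditions gives a quick first signal before investing in a full benchmark.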