MongoDB - replication - RAID

If MongoDB has replication and I replicate the data on each shard three times, why do I need or want RAID to mirror data on the host? If that data is already redundant, aren’t I just giving up performance? Assuming I would be using RAID-1, aren’t I giving up half of my write performance and storage space? (I’m optimizing for writes, not reads.)

Aren’t 30 shards better than 15 for write performance? Of course losing all three replicas of one shard would be a catastrophe. Please advise.

Hi @Matthew_Zimmerman1

A comment on RAID. Without RAID, when you lose a disk (the most commonly failed component), your node goes offline and you will need to resync from scratch once the disk is replaced. Conversely, with RAID that node would remain online and suffer some performance degradation while the array resynchronises.

RAID 10 is the recommended level for I/O-intensive workloads such as databases, MongoDB included.

As for sharding, the recommendation I received was to scale up, then scale out. But this is going to depend on where your bottlenecks end up.

Lack of write performance is my issue. Currently, in order to max out the CPU/disk for writing, I have 29 shards (3 replicas each) on 4 hosts with 22 disks each, so each shard member is on its own disk. I’m getting about a million records inserted per hour on a collection with 34 indexes.

True, without RAID I will have to restore/resync after a disk failure, but with RAID-1 I’d only be able to have 15 shards, as I’d be “wasting” half of those disks in terms of write performance (and storage space in the DB too).

I guess to ask the question another way: if I’m already replicated out to 3 hosts, why should I also replicate on the host?
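For anyone following the numbers, here is a rough sketch of the disk arithmetic behind the 29 vs. 15 comparison. It uses only the figures quoted above (4 hosts, 22 disks each, 3 replicas per shard) and assumes every disk is available to a shard member, which ignores OS disks, config servers and the rule that the replicas of a shard should sit on different hosts. The function name `max_shards` is made up for illustration.

```python
# Figures quoted in the thread.
HOSTS = 4
DISKS_PER_HOST = 22
REPLICAS_PER_SHARD = 3


def max_shards(disks_per_member: int) -> int:
    """Shards that fit if every replica set member gets `disks_per_member` disks.

    Assumption: all disks are usable for shard members (hypothetical
    simplification, not a statement about the real deployment).
    """
    total_disks = HOSTS * DISKS_PER_HOST
    members = total_disks // disks_per_member
    return members // REPLICAS_PER_SHARD


no_raid = max_shards(1)  # one plain disk per member
raid1 = max_shards(2)    # two mirrored disks (RAID-1) per member

print(no_raid, raid1)  # 29 14
```

Under these assumptions RAID-1 leaves room for 14 shards, which is close to the "about 15" mentioned above, so mirroring roughly halves the number of independent write targets.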

Hi @Matthew_Zimmerman1

I don’t think there is a right/wrong answer to your question, really. It’s a matter of tradeoffs, and what your priority is.

As @chris mentioned, RAID would help availability on an individual node. If there is an issue in the storage layer of a node, you would not need to do maintenance on the database side. Thus RAID helps keep the node from going offline or having to do an initial sync, which could be an expensive operation that your app can’t afford.

On the other hand, not using RAID may help with throughput, as you have mentioned. If throughput is the must-have feature of your app, then the tradeoff is giving up redundancy within the individual nodes, which increases their chances of being disrupted.
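To get a feel for the size of that tradeoff, here is a very crude back-of-the-envelope sketch. The disk count comes from the thread, but the failure rate and rebuild time are hypothetical placeholders, not measured values, and the model assumes disks fail independently. Plug in your own numbers.

```python
DISKS = 88                   # 4 hosts x 22 disks, from the thread
ASSUMED_AFR = 0.02           # hypothetical annual failure rate per disk
ASSUMED_REBUILD_DAYS = 1.0   # hypothetical time to rebuild a mirror


def resyncs_without_raid(disks: int, afr: float) -> float:
    """Expected member resyncs per year: every disk failure takes a member down."""
    return disks * afr


def resyncs_with_raid1(disks: int, afr: float, rebuild_days: float) -> float:
    """Expected member resyncs per year with mirrored pairs.

    A member is lost only if the partner disk also fails while the
    mirror is still rebuilding (rough approximation).
    """
    pairs = disks // 2
    first_failures = pairs * 2 * afr
    partner_fails_during_rebuild = afr * rebuild_days / 365
    return first_failures * partner_fails_during_rebuild


no_raid = resyncs_without_raid(DISKS, ASSUMED_AFR)
with_raid = resyncs_with_raid1(DISKS, ASSUMED_AFR, ASSUMED_REBUILD_DAYS)
print(f"no RAID: {no_raid:.2f} resyncs/year, RAID-1: {with_raid:.6f} resyncs/year")
```

With these placeholder values you would expect a couple of full resyncs a year without RAID and almost none with it, so the real question is whether your cluster can absorb an initial sync that often.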

In conclusion, if availability is your main concern, then I would say that you’re not wasting anything by using RAID, even though it comes at the expense of speed. Conversely, if throughput is your main concern, sacrificing reliability for speed may be a good tradeoff. I don’t think there’s a single correct answer. It depends on what you need from the system.

Best regards,