Limits on data size?

Hello

I see that MongoDB Atlas isn’t allowing me to select a data size of more than 4TB per shard.

Is that a hard limit?
If I wanted to spread my data of 40 TB across 4 nodes, is it not possible with Atlas because of that limit?
If that is the case, as a workaround, can I run multiple instances per node and spread those instances across multiple nodes? Our data is flexible and doesn’t have to reside on a single instance.

Hi Satya,

MongoDB offers horizontal scale-out using sharding: While a single ‘Replica Set’ (aka a shard in a sharded cluster) cannot exceed 4TB of physical storage, you can use as many shards as you want in your MongoDB Atlas sharded cluster.

For example, if you allocated 2TB per shard, a twenty-shard cluster would have a total of 40TB of physical space (each shard is a replica set, so all of that data would be stored redundantly for high availability).
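To make the arithmetic above concrete, here is a minimal Python sketch of the shard-count calculation. The function name `shards_needed` and the constant `MAX_TB_PER_SHARD` are hypothetical names chosen for illustration; the 4TB figure is simply the per-shard limit mentioned in this thread and may not match current Atlas limits.

```python
import math

# Per-shard physical storage limit as stated in this thread (assumption:
# check the current Atlas documentation before relying on it).
MAX_TB_PER_SHARD = 4


def shards_needed(total_tb: float, per_shard_tb: float) -> int:
    """Return the minimum number of shards needed to hold total_tb of
    physical data when each shard is allocated per_shard_tb."""
    if per_shard_tb <= 0 or per_shard_tb > MAX_TB_PER_SHARD:
        raise ValueError(
            f"per_shard_tb must be in (0, {MAX_TB_PER_SHARD}], got {per_shard_tb}"
        )
    if total_tb < 0:
        raise ValueError("total_tb must not be negative")
    # Round up: a partially filled shard still counts as a whole shard.
    return math.ceil(total_tb / per_shard_tb)


if __name__ == "__main__":
    print(shards_needed(40, 2))  # 20 shards at 2TB each
    print(shards_needed(40, 4))  # 10 shards at the 4TB maximum
```

So for 40TB of physical data, four nodes would not be enough under this limit; you would need at least ten shards at 4TB each, before accounting for compression or growth headroom.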

By the way, MongoDB compresses data by default, meaning that your logical data size can in practice greatly exceed these physical storage numbers.

Cheers
-Andrew

@Andrew_Davidson, thank you for that information. Do you have any information on what compression ratios I can expect?

Hi @SatyaKrishna,

Compression ratios depend on your data. Most data sets are highly compressible, and with the default compression the space savings are often more than 50%. There’s an older blog post on New Compression Options in MongoDB 3.0 which is still generally applicable, but ultimately you should test with a representative data set.
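If you do load a representative data set, one rough way to estimate the ratio is to compare the uncompressed and on-disk sizes that MongoDB reports for a collection. The sketch below is only an illustration: the helper name `compression_summary` and the sample numbers are hypothetical, and it assumes a stats document with the `size` (uncompressed bytes) and `storageSize` (bytes allocated on disk) fields, as returned by collection statistics.

```python
def compression_summary(stats: dict) -> dict:
    """Estimate compression from a collection stats document.

    Assumes stats contains:
      size        - total uncompressed size of the documents, in bytes
      storageSize - storage allocated to the collection on disk, in bytes
    Indexes are not included, and storageSize can include reusable free
    space, so treat the result as an estimate only.
    """
    logical = stats["size"]
    on_disk = stats["storageSize"]
    if logical <= 0 or on_disk <= 0:
        raise ValueError("size and storageSize must both be positive")
    return {
        "ratio": logical / on_disk,               # e.g. 2.5 means 2.5:1
        "saved_fraction": 1 - on_disk / logical,  # share of space saved
    }


if __name__ == "__main__":
    # Hypothetical numbers for illustration, not measured results.
    sample_stats = {"size": 10_000_000_000, "storageSize": 4_000_000_000}
    summary = compression_summary(sample_stats)
    print(f"ratio: {summary['ratio']:.2f}:1")
    print(f"saved: {summary['saved_fraction']:.0%}")
```

With a driver such as PyMongo, a stats document of this shape can typically be obtained with something like `db.command("collStats", "your_collection")`; the collection name there is a placeholder.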

If you need help designing a large cluster or scale-out plan with MongoDB Atlas, I’d encourage you to contact our sales team (or your Account Executive, if known). One of our experienced Solution Architects can likely provide more specific advice for your use case and growth plans.

Regards,
Stennie