Choosing an appropriate Shard key

I am building a sharded infrastructure, but I am stuck on selecting an appropriate shard key. Can anyone help me choose an appropriate shard key for my data? Also, I have a total of 147 collections. Is there any way to shard all of them in one go, or do I have to shard each collection manually, one by one?

@Qamber_Ali Welcome to the forum!

I would start by reviewing the documentation on Choosing a shard key and then follow up with specific questions for your use case. The most appropriate shard key will depend on your use case and data distribution, so you’ll need to consider several factors: the shard key field(s) should appear in every document, have high cardinality and an even distribution (without being monotonically increasing), and support your most common queries.
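As a rough illustration of those criteria, here is a minimal Python sketch that profiles a candidate field on a sample of documents held in memory. The function name `profile_shard_key`, the field names, and the sample data are all hypothetical, and it assumes the field's values are hashable and mutually comparable. It is only a first-pass sanity check on a sample, not a substitute for understanding your real workload and query patterns.

```python
from collections import Counter

def profile_shard_key(docs, field):
    """Summarise how well `field` meets basic shard key criteria on a sample."""
    values = [d[field] for d in docs if field in d]
    counts = Counter(values)
    total = len(docs)
    present = len(values)
    # Share of documents holding the most frequent value (hot-spot risk).
    top_share = max(counts.values()) / present if present else 0.0
    # True when values never decrease in sample order, a hint of a monotonic key.
    monotonic = present > 1 and all(a <= b for a, b in zip(values, values[1:]))
    return {
        "coverage": present / total if total else 0.0,  # fraction of docs with the field
        "cardinality": len(counts),                     # number of distinct values
        "top_value_share": top_share,
        "monotonic_in_sample": monotonic,
    }

# Hypothetical sample documents, listed in insertion order.
sample = [
    {"_id": 1, "customer_id": "c7", "country": "PK"},
    {"_id": 2, "customer_id": "c2", "country": "PK"},
    {"_id": 3, "customer_id": "c9", "country": "PK"},
    {"_id": 4, "customer_id": "c4", "country": "AU"},
]
for candidate in ("_id", "customer_id", "country"):
    print(candidate, profile_shard_key(sample, candidate))
```

In this made-up sample, `_id` is flagged as monotonic and `country` has low cardinality with most documents sharing one value, so `customer_id` would be the candidate worth examining further against your common queries.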

Once you have candidate shard key(s), you could post a follow-up comment here if more specific advice is needed. It would be helpful to describe your concerns around how the candidate shard keys may (or may not) suit your use case.

Collections have to be individually sharded, but you could certainly automate this if you know what the desired shard key is. Before sharding all collections, I would consider whether it actually makes sense to do so. A sharded cluster can contain both sharded and unsharded collections, and some smaller or lower traffic collections may not benefit from sharding.
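One way to automate this is to keep a mapping from collection name to its chosen shard key and generate one `shardCollection` command per entry. The sketch below assumes Python with a PyMongo `MongoClient` connected to a `mongos`. The helper names, the database name `mydb`, and the collection names and keys are hypothetical. Depending on your MongoDB version you may also need to enable sharding on the database first, and a non-empty collection needs an index that supports the shard key before it can be sharded.

```python
def build_shard_commands(database, shard_keys):
    """Return one shardCollection command document per collection."""
    commands = []
    for collection, key in sorted(shard_keys.items()):
        # The command name must be the first key in the command document.
        commands.append({"shardCollection": f"{database}.{collection}", "key": key})
    return commands

def apply_shard_commands(client, commands):
    """Run each command on the admin database (client is assumed to point at a mongos)."""
    for cmd in commands:
        client.admin.command(cmd)

# Hypothetical mapping: list only the collections that actually benefit from sharding.
shard_keys = {
    "orders": {"customer_id": 1},
    "events": {"_id": "hashed"},
}
commands = build_shard_commands("mydb", shard_keys)
for cmd in commands:
    print(cmd)
```

Building the commands separately from running them lets you review the full list before anything is applied, and collections left out of the mapping simply stay unsharded.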

Regards,
Stennie
