Not clear on number of chunks the balancer can migrate in one balancer round

Hi,

The lecture notes on “Balancing”, states that:

“The balancer can migrate chunks in parallel.
A given shard cannot participate in more than one migration at a time.”

So it seems that the number of balancer round is depending on number of chunks in each shard.

But the next sentence the lecture nodes states the following:

“So take the floor of n divided by 2, where n is the number of shards, and you have the number of chunks that can be migrated in a balancer round.”

So, for example is I have 6 shards, the blanacer can migrate only 3 chunks.

The first couple of statements seem to say that number balancer rounds would depend on number of chunks within a shard that need to migrate. But the later statement, make this based on the number of shards.

It seems to me that the formula to determine how many rounds of balancer, should depend on both:

  • Number of chunks need to migrate.
  • Number of shards that these chunks can move to.

Can you elaborate on that?

thanks,
Roya Yazdani

Hi @Roya_70671,

The number of balancer round does not depend on the number of chunks in each shard rather it depends on how uneven the distribution of chunks are.

For instance, Let’s say we have three shards - “shard A, shard B, shard C” with each one of them having 3 chunks.

So, in total we have 9 chunks but there wouldn’t be any balancing round here because the chunks are evenly distributed on each shard.

Yes, this is the maximum number of chunks that can be migrated in a given balancer round. Now for any given configuration of the sharded cluster the number of balancer rounds you would be needing depends on how unevenly the chunks are distributed in the cluster.

Hope it helps!

Please feel free to get back to us if you have any query.

Thanks,
Shubham Ranjan
Curriculum Support Engineer

Hi @Shubham_Ranjan,

Thanks for the explanation.

I am still not clear. I understand that in your example, we have 3 shards, with 3 chunks. Yes, that is even distributed number of chunks, but if one of those chunks becomes dramatically larger than the others, then it is not longer an evenly distributed environment. I am guessing that at that point, the balancer may attempt to split that chunk and try to move the new chunks to other shards to distribute the load evenly.

Is this a correct assumption?

Thanks,
Roya

Hi @Roya_70671,

Yes, after an insert operation if the size of the chunk becomes greater than the maximum chunk size then the balancer will split it. Let’s say this happened on the shard C.

Now, this is what the distribution of chunk looks like :

shard A : 3, shard B : 3, shard C : 4

In this setup, the migration of chunk will not take place. The reason is that there is a cost associated with the balancer moving the chunk and we should minimise it to improve the overall performance.

MongoDB uses a threshold value to decide if it should move a chunk for a particular setup. For a particular sharded collection, it finds the difference between the number of chunks for a shard with most chunks and a shard with fewest chunks.

If the difference exceeds the threshold then migration takes place.

Number of Chunks Migration Threshold
Fewer than 20 2
20-79 4
80 and greater 8

Hope it helps!

Please feel free to get back to us if you have any other question.

Happy Learning :slight_smile:

Thanks,
Shubham Ranjan
Curriculum Services Engineer