Slow chunk movements when there are already MANY chunks in the config database

We run the community version 4.4.1 and we split our data over weekly collections.

We have 4 new collections per week, and some 1200M records are stored in them every week.

Each collection is pre-split into 3600 chunks, which are moved evenly over 108 shards (each shard consisting of 3 shard servers) while the collection is freshly created and thus entirely empty (no documents yet). The auto-balancer is of course disabled.
So per week we have 4x3600 = 14400 chunks.
We now have some 19 weeks of 4 collections/week, and the target is to keep 52 weeks of collections.
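For reference, the per-collection setup looks roughly like this (a minimal sketch in the mongo shell; the database name, collection name, shard key and shard name below are placeholders, not our real ones):

  sh.stopBalancer()                                    // auto-balancer stays disabled
  sh.enableSharding("mydb")
  sh.shardCollection("mydb.events_w50_a", { k: 1 })    // freshly created, still empty

  // pre-split the empty collection into the desired number of chunks ...
  sh.splitAt("mydb.events_w50_a", { k: 1000 })         // one splitAt() per chunk boundary

  // ... and spread the empty chunks evenly over the shards ourselves
  db.adminCommand({
    moveChunk: "mydb.events_w50_a",
    find: { k: 1000 },
    to: "shard0042"
  })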

Thus now we have some 19x14400 = 273K chunks

Chunk movements were very fast when we started with this setup, but they have become increasingly slow, even when there is no other activity at all on the cluster (no loading, no queries):
Every chunk movement now takes roughly 40 seconds.
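To get a feel for where those 40 seconds go, one thing that can be inspected is the migration history that the shards report into the config database; as far as I can tell, the “details” subdocument of each “moveChunk.from” entry contains per-step timings. For example (the one-hour time window is just an example):

  // recent donor-side migration entries, with their step timings in "details"
  db.getSiblingDB("config").changelog.find(
    { what: "moveChunk.from", time: { $gt: new Date(Date.now() - 3600 * 1000) } },
    { time: 1, ns: 1, details: 1 }
  ).sort({ time: -1 }).limit(5).pretty()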

When I check the logs of our config servers for some indication of this slowness, I see entries like the following:

{"t":{"$date":"2020-12-10T11:08:27.462+01:00"},"s":"I",  "c":"COMMAND",  "id":51803,   "ctx":"conn1217790","msg":"Slow query",
"attr":{"type":"command","ns":"config.changelog","appName":"MongoDB Shell","command":{"aggregate":"changelog","pipeline":[{"$match":{"time":{"$gt":{"$date":"2020-12-09T10:08:26.775Z"}},
"what":"moveChunk.from","details.errmsg":{"$exists":false},"details.note":"success"}},{"$group":{"_id":{"msg":"$details.errmsg"},"count":{"$sum":1.0}}},{"$project":{"_id":{"$ifNull":["$_id.msg","Success"]},"count":"$count"}}],"fromMongos":false,"cursor":{"batchSize":101},"readConcern":{"provenance":"implicitDefault"},"writeConcern":{"w":1,"wtimeout":0,"provenance":"implicitDefault"},"lsid":{"id":{"$uuid":"5bad02ed-7a8d-419e-8b94-9750be30558d"},"uid":{"$binary":{"base64":"YtJ8CVGJPpojGlBhlVfpmkB+TWiGCwPUvkGEjp5tty0=","subType":"0"}}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1607594920,"i":1}},"signature":{"hash":{"$binary":{"base64":"F/KZAbai8D8Qcs/qQWKjLV0BTvM=","subType":"0"}},"keyId":6880911790037794833}},"$audit":{"$impersonatedUsers":[{"user":"mongo-admin", "db":"admin"}],"$impersonatedRoles":[{"role":"root","db":"admin"}]},"$client":{"application":{"name":"MongoDB Shell"},"driver":{"name":"MongoDB Internal Client","version":"4.4.1"},"os":{"type":"Linux","name":"Ubuntu","architecture":"x86_64","version":"18.04"},"mongos":{"host":"mongop_rtr1:27017","client":"10.100.22.100:36242","version":"4.4.1"}},"$configServerState":{"opTime":{"ts":{"$timestamp":{"t":1607594919,"i":55}},"t":7}},
"$db":"config"},
"planSummary":"COLLSCAN",
"keysExamined":0,
"docsExamined":473718,
"cursorExhausted":true,"numYields":473,"nreturned":1,"queryHash":"ABD7C887","planCacheKey":"5B43D47C","reslen":359,"locks":{"ReplicationStateTransition":{"acquireCount":{"w":475}},"Global":{"acquireCount":{"r":475}},"Database":{"acquireCount":{"r":475}},"Collection":{"acquireCount":{"r":475}},"Mutex":{"acquireCount":{"r":2}}},"readConcern":{"provenance":"implicitDefault"},"writeConcern":{"w":1,"wtimeout":0,"provenance":"implicitDefault"},"storage":{},"protocol":"op_msg",
"durationMillis":685}}

In there I recognize the “moveChunk” actions that we initiate, the “config.changelog” namespace, a “COLLSCAN” (collection scan?), “docsExamined”=473718, and, most suspicious, “keysExamined”=0.
Finally there is “durationMillis”=685.

Does this mean there is no suitable index for scanning the chunk-movement history in the config database?
“keysExamined”=0 sounds to me like no index is used at all, or however else I should interpret “keys”.
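In case it helps anyone reproduce this, the checks I would run (connected to a config server or a mongos) to see which indexes exist on config.changelog and whether the aggregation really does a COLLSCAN look roughly like this:

  // which indexes exist on config.changelog?
  db.getSiblingDB("config").changelog.getIndexes()

  // re-run the same kind of aggregation with explain to confirm the COLLSCAN
  db.getSiblingDB("config").changelog.explain("executionStats").aggregate([
    { $match: { time: { $gt: new Date("2020-12-09T10:08:26.775Z") },
                what: "moveChunk.from",
                "details.errmsg": { $exists: false },
                "details.note": "success" } },
    { $group: { _id: { msg: "$details.errmsg" }, count: { $sum: 1 } } }
  ])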

I find no hint in the documentation or on this forum about how to check in more detail what causes slow chunk movements.

Hoping that any of you can shed some light on this darkness :wink:

thx in advance!
Rob

I notice that many of these ‘moveChunk’ commands seem to have been queued, but never executed:

  • the “sh.status()” output shows that the chunk movements have not been done for all shards (some shards are missing chunks, while the default shard has excess chunks); a per-shard chunk count is sketched after the log sample below
  • any next attempt to run “moveChunk” is happily ignored, as far as I can tell: none of them cause any change in the “sh.status()” overview
  • the logs of the ‘mongos’ router show many “moveChunk” commands that are still being processed although we are no longer initiating them via “adminCommand”; these logs seem to indicate some long-running queued moveChunk commands that are now aborted with a “cursorExhausted=true” message:

Log sample:

{"t":{"$date":"2020-12-16T15:49:21.492+01:00"},"s":"I",  "c":"COMMAND",  "id":51803,   "ctx":"conn1736393","msg":"Slow query","attr": {"type":"command","ns":"config.changelog","appName":"MongoDB Shell","command":{"aggregate":"changelog","pipeline":[{"$match":{"time":{"$gt":{"$date":"2020-12-15T14:49:20.088Z"}}, 
"what":"moveChunk.from",
"$or":[{"details.errmsg":{"$exists":true}},{"details.note":{"$ne":"success"}}]}},{"$group":{"_id":{"msg":"$details.errmsg","from":"$details.from","to":"$details.to"},"count":{"$sum":1.0}}},{"$project": {"_id":{"$ifNull":
["$_id.msg","aborted"]},
"from":"$_id.from","to":"$_id.to","count":"$count"}}],"cursor":{},"lsid":{"id":{"$uuid":"e1c5cc31-476e-44bb-be55-db1687d3b7e4"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1608130173,"i":28}},"signature":{"hash":{"$binary":{"base64":"DYpcel5ZTCQ5+Sn+iFO/V5Pvz8g=","subType":"0"}},"keyId":6880911790037794833}},"$db":"config"},"nShards":1,
"cursorExhausted":true,"numYields":0,
"nreturned":0,"reslen":230,"protocol":"op_msg","durationMillis":802}}

These log entries keep on coming, as if there is an awfully long queue of chunk-movement attempts that stay queued for a long time and then get aborted for some reason (“cursorExhausted=true”?).
Any new attempt to move another chunk terminates within a few milliseconds, but nothing is effectively done when I check “sh.status()”.
So to me it looks like chunk movements get queued, but that queue is not processed (blocked for some reason?) and eventually these attempts are aborted.
=> how to get rid of this queue?
=> how to find the reason why the moveChunk commands don’t get executed?

rgds
Rob

OK, it turns out I had to add the new (since release 4.4) “forceJumbo: true” option to the moveChunk command, since my chunks are otherwise considered too big (jumbo) to move.
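For completeness, the moveChunk call with that option added looks like this (namespace, shard key value and target shard are placeholders):

  db.adminCommand({
    moveChunk: "mydb.events_w50_a",
    find: { k: 1000 },
    to: "shard0042",
    forceJumbo: true        // new in 4.4: allow moving chunks flagged as jumbo
  })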

-> resolved

