Can't add shards

Cannot add a shard due to NetworkInterfaceExceededTimeLimit.

Mongo Version: db version v4.9.0-alpha-1182-g2326fb8
Cluster Type: Sharded Cluster (4 shards, each running a single-node replica set)
OS builds: The config server and mongos run on x86; the shards run on a different build

I am able to ping and telnet between all of the nodes, in both directions.

When I try to add shards, I get:

"NetworkInterfaceExceededTimeLimit": Request 107 timed out, , deadline was 2021-02-28T12:54:08.250+00:00, op was RemoteCommand 107.

When I checked the logs, the command is marked as a slow query:

"msg":"Slow query","attr":{"type":"command","ns":"admin.shard1rs1/x.x.x.x:27017","appName":"MongoDB Shell",
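In case it helps with triage, log entries like the one above can be filtered with a few lines of JavaScript. This is only a rough sketch: `slowQueries` is a hypothetical helper name, and it assumes each log line is a complete JSON document with `msg` and `attr` fields, with the duration reported under `attr.durationMillis`.

```javascript
// Hypothetical helper: collect "Slow query" entries from structured
// (one JSON document per line) mongod/mongos log lines.
function slowQueries(logLines) {
  const entries = [];
  for (const line of logLines) {
    let doc;
    try {
      doc = JSON.parse(line);
    } catch (e) {
      continue; // skip lines that are not valid JSON
    }
    if (!doc || doc.msg !== "Slow query" || !doc.attr) {
      continue; // keep only slow-query entries that carry attributes
    }
    entries.push({
      type: doc.attr.type,
      ns: doc.attr.ns,
      // Assumption: the duration is stored under attr.durationMillis.
      durationMillis: doc.attr.durationMillis,
    });
  }
  return entries;
}
```

Passing it the lines of a log file (for example, the file contents split on newlines) returns one small object per slow query, which makes it easier to see which commands are hitting the timeout.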

Has anyone come across this issue? Any suggestions or clues would be really helpful. Thank you in advance!

I would prefer not to change the MongoDB version here. (These are custom builds.)

Hi Viraj,

What is the base MongoDB server version for your custom build? The v4.9.0-alpha reference looks like a nightly development/unstable release.

Regards,
Stennie

Hi @Stennie,

Yes. This is based on the master branch of the MongoDB source code. We built from the code as it was at that point in time.

The sharded cluster works perfectly fine on our DEV setup, but when we try to run it on prod on AWS, we hit this issue.

Hi @viraj_thakrar,

Builds from the master branch or development/unstable releases aren’t thoroughly tested yet, and are definitely not ready for production deployment.

I would start by ensuring that all of the components of your cluster are built with identical versions. The master branch includes work in progress, so mixing versions may lead to unexpected results.

If you have set up a novel environment, it is going to be challenging for someone to try to reproduce the problem, so any details would be helpful:

  • What are the differences between your environments and builds? Are the shards running on a different git checkout, different hardware architecture, … ?

  • What options did you use to build MongoDB?

  • Are there specific features or build optimisations you are trying to test?

  • What steps are you running in order to add shards? You mentioned there are four shards – how many were added successfully?
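For the last question, a minimal shell sketch along these lines would show both the add step and which shards the cluster currently knows about. It assumes you are connected to a mongos. `addShardAndList` is a hypothetical helper name (not part of the shell), and the connection string is the placeholder from the log excerpt above.

```javascript
// Hypothetical helper for the mongo shell, connected to a mongos.
// `sh` and `db` are the shell's built-in objects, passed in explicitly.
function addShardAndList(sh, db, connStr) {
  // connStr has the form "<replSetName>/<host>:<port>".
  const result = sh.addShard(connStr);
  // listShards runs against the admin database; adminCommand handles that.
  const shards = db.adminCommand({ listShards: 1 }).shards;
  return { result: result, shardIds: shards.map((s) => s._id) };
}

// Example call in the shell (placeholder host):
// addShardAndList(sh, db, "shard1rs1/x.x.x.x:27017");
```

The returned `shardIds` array lists every shard registered so far, which answers how many were added successfully before the timeout appeared.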

I recommend waiting for a tagged release (alpha or RC) if you want a more stable test environment.

Regards,
Stennie

Hi @Stennie,

Thank you for replying.

Yes, I totally understand your point about using the master branch or development releases. I managed to resolve the issues I was facing. My purpose is to test performance with a particular data set and with different cluster setups. I am using the same version on all cluster components.

I ran into the “NetworkInterfaceExceededTimeLimit” issue because of a very slow network between the devices I was using. I tried to manage it by increasing pingTimeouts and was trying to find out whether I could somehow increase the timeout that is used when we run .addShard(). I managed to handle this issue with some alternatives.

There is one thing I believe is a bug, which I would like to report specifically for Mongo Version: db version v4.9.0-alpha-1182-g2326fb8, as I came across it and the report could be helpful. The steps to reproduce the behaviour are:

  1. Apply hashed sharding on _id field
  2. Let balancing run for a while
  3. Check the shard distribution of the collection (db.collection.getShardDistribution())
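For anyone trying to reproduce this, the steps above can be sketched in the shell roughly as follows. This is an illustration under assumptions, not the exact commands that were run: the database and collection names are placeholders, both function names are hypothetical, and the distribution helper in the shell is named `getShardDistribution()`.

```javascript
// Sketch for the mongo shell, connected to a mongos.
// `sh` and `db` are the shell's built-in objects, passed in explicitly.
function shardWithHashedId(sh, dbName, collName) {
  const ns = dbName + "." + collName;
  sh.enableSharding(dbName);                 // enable sharding on the database
  sh.shardCollection(ns, { _id: "hashed" }); // step 1: hashed shard key on _id
  return ns;
}

function checkDistribution(db, collName) {
  // Step 3: report how the collection's data is spread across shards.
  return db.getCollection(collName).getShardDistribution();
}

// Example calls in the shell (placeholder names "testdb" and "events"):
// shardWithHashedId(sh, "testdb", "events");
// ... insert data and let the balancer run for a while (step 2) ...
// checkDistribution(db.getSiblingDB("testdb"), "events");
```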

Current Behaviour: Even though the collection has been sharded with a hashed key, checking the shard distribution with the above command outputs “Collection is not sharded”.

Expected Behaviour: It should display the shard distribution, as it does on other versions.

Thank you!

Hi @viraj_thakrar,

Thanks for confirming you were able to resolve the initial problem.

Bug reports can be filed directly in the SERVER project in the MongoDB Jira issue tracker: https://jira.mongodb.org. You can log in using the same account as the community forums.

If you are unable to file an issue I could raise one on your behalf, but generally it is best for you to be the reporter so you will be notified of any updates or requests for further information.

Regards,
Stennie
