Mongodb-mms-automation-agent Host Discovery Missing 1 Host

I have a sharded cluster (2 shards with 3 mongods each, 3 config servers, and 2 mongoses) which was deployed by MongoDB Ops Manager.

Now the status of one of the shard hosts is shown as a grey diamond (“Last Ping: Never”). However, the cluster works well, and the missing host sh-0-0 (shard 0, replica 0) is listed by both sh.status() and rs.status().

I found that the issue is caused by mongodb-mms-automation-agent host discovery. The automation agent log shows that:

[2021-01-15T10:13:03.299+0000] [agent.info] [monitoring/agent.go:Iterate:177] Received new configuration: Primary agent, Assigned 1 out of 10 plus 1 chunk monitor(s)
[2021-01-15T10:13:03.299+0000] [agent.info] [monitoring/agent.go:Run:266] Done. Sleeping for 55s…
[2021-01-15T10:13:03.299+0000] [discovery.monitor.info] [monitoring/discovery.go:discover:793] Performing discovery with 10 hosts
[2021-01-15T10:13:06.661+0000] [discovery.monitor.info] [monitoring/discovery.go:discover:850] Received discovery responses from 20/20 requests after 3s
[2021-01-15T10:13:58.298+0000] [agent.info] [monitoring/agent.go:Iterate:177] Received new configuration: Primary agent, Assigned 1 out of 10 plus 1 chunk monitor(s)
[2021-01-15T10:13:58.298+0000] [agent.info] [monitoring/agent.go:Run:266] Done. Sleeping for 55s…
[2021-01-15T10:13:58.298+0000] [discovery.monitor.info] [monitoring/discovery.go:discover:793] Performing discovery with 10 hosts

Since there are 11 hosts in total (2 * 3 + 3 + 2 = 11) but discovery runs with only 10, it seems that the mongodb-mms-automation-agent should be sending 22 requests instead of 20, which would explain why it doesn’t discover host sh-0-0. I also checked the automation agent execution command:

/mongodb-automation/files/mongodb-mms-automation-agent -mmsGroupId xxx -pidfilepath /mongodb-automation/mongodb-mms-automation-agent.pid -maxLogFileDurationHrs 24 -logLevel INFO -healthCheckFilePath /var/log/mongodb-mms-automation/agent-health-status.json -useLocalMongoDbTools -mmsBaseUrl http://ops-manager-svc.mongodb.svc.cluster.local:8080 -sslRequireValidMMSServerCertificates -mmsApiKey xxx -logFile /var/log/mongodb-mms-automation/automation-agent.log

I found no clue there as to how mongodb-mms-automation-agent discovers its target hosts. So what is the discovery mechanism, and how can I make the agent discover the missing host?
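
For reference, the count mismatch can be checked mechanically. The following is only a rough sketch: it parses the two discovery log lines quoted earlier and compares the reported host count with the 11 hosts in my topology. The function name and regular expressions are my own, and they assume the log wording stays exactly as it appears in the excerpt.

```python
import re

# Topology from this post: 2 shards x 3 mongods, 3 config servers, 2 mongoses.
EXPECTED_HOSTS = 2 * 3 + 3 + 2

# Patterns written against the log excerpt; the wording is assumed to be stable.
HOSTS_RE = re.compile(r"Performing discovery with (\d+) hosts")
RESPONSES_RE = re.compile(r"Received discovery responses from (\d+)/(\d+) requests")


def summarize_discovery(log_text):
    """Return the last reported host count and the last (received, sent) pair."""
    hosts = None
    responses = None
    for line in log_text.splitlines():
        match = HOSTS_RE.search(line)
        if match:
            hosts = int(match.group(1))
        match = RESPONSES_RE.search(line)
        if match:
            responses = (int(match.group(1)), int(match.group(2)))
    return hosts, responses


# Two lines copied from the agent log shown earlier in the post.
SAMPLE_LOG = """\
[2021-01-15T10:13:03.299+0000] [discovery.monitor.info] [monitoring/discovery.go:discover:793] Performing discovery with 10 hosts
[2021-01-15T10:13:06.661+0000] [discovery.monitor.info] [monitoring/discovery.go:discover:850] Received discovery responses from 20/20 requests after 3s
"""

hosts, responses = summarize_discovery(SAMPLE_LOG)
missing = EXPECTED_HOSTS - hosts
print(f"expected={EXPECTED_HOSTS} discovered={hosts} missing={missing} responses={responses}")
```

This only confirms that discovery is started with 10 hosts rather than 11. It does not show why sh-0-0 is left out, and the "two requests per host" reading of the 20/20 figure is my inference from the log, not something I found documented.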
