FTDC "serverStatus was very slow" followed by "[ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms"

MongoDB version: 3.2.8
The deployment is a replica set with an arbiter.

We observe the following issue during peak system utilization:

Primary MongoDB Server:
2020-03-25T00:06:18.640+0000 I COMMAND [ftdc] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after repl: 310, after storageEngine: 490, after tcmalloc: 490, at end: 4970 }
2020-03-25T00:06:25.992+0000 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms

Secondary MongoDB Server:
2020-03-25T00:06:26.657+0000 I REPL [ReplicationExecutor] Error in heartbeat request to rats2.sm2:33000; ExceededTimeLimit: Operation timed out
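
To make the `serverStatus was very slow` line easier to read, here is a rough Python sketch that turns the timing marks into per-section durations. It assumes each `after <section>: N` value is the cumulative elapsed time in milliseconds since the command started, which the monotonically increasing numbers suggest. The function name `section_deltas` is my own and not part of any MongoDB tooling.

```python
import re

# Timing portion of the serverStatus log line quoted above.
LOG_LINE = (
    "serverStatus was very slow: { after basic: 0, after asserts: 0, "
    "after backgroundFlushing: 0, after connections: 0, after dur: 0, "
    "after extra_info: 0, after globalLock: 0, after locks: 0, "
    "after network: 0, after opcounters: 0, after opcountersRepl: 0, "
    "after repl: 310, after storageEngine: 490, after tcmalloc: 490, "
    "at end: 4970 }"
)


def section_deltas(log_line):
    """Return [(section, ms_spent)] in log order.

    Assumption: every mark is cumulative elapsed milliseconds, so the time
    spent in a section is its mark minus the previous mark.
    """
    marks = re.findall(r"\b(?:after|at) (\w+): (\d+)", log_line)
    deltas = []
    previous = 0
    for name, cumulative in marks:
        elapsed = int(cumulative)
        deltas.append((name, elapsed - previous))
        previous = elapsed
    return deltas


# Print the three slowest steps, largest first.
slowest = sorted(section_deltas(LOG_LINE), key=lambda d: d[1], reverse=True)
for name, ms in slowest[:3]:
    print(f"{name}: {ms} ms")
# end: 4480 ms
# repl: 310 ms
# storageEngine: 180 ms
```

Under that assumption, about 4.5 of the 4.97 seconds fall between the `tcmalloc` mark and the end of the command, with the `repl` and `storageEngine` sections accounting for most of the rest.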

mongostat output for reference during the failure:

insert query update delete getmore command flushes mapped vsize  res faults qr|qw ar|aw netIn netOut conn  set repl                 time
   *18    *0    *42     *2       0     2|0       0  7.75G 18.9G 205M      1   0|0   0|0  237b  13.4k   25 TBFM  SEC 2020-04-09T11:24:15Z
   *16    *0   *143     *0       0     1|0       0  7.75G 18.9G 204M      2   0|0   0|0   79b  12.9k   25 TBFM  SEC 2020-04-09T11:24:16Z
   *11    *0   *130     *2       0    11|0       0  7.75G 18.9G 202M      1   0|0   0|0  805b  29.4k   25 TBFM  SEC 2020-04-09T11:24:17Z
    *4    *0   *233     *1       0     2|0       0  7.75G 18.9G 204M      2   0|0   0|0  237b  13.4k   25 TBFM  SEC 2020-04-09T11:24:18Z
    *3    *0   *232     *1       0     4|0       0  7.75G 18.9G 203M      0   0|0   0|0  353b  14.2k   25 TBFM  SEC 2020-04-09T11:24:19Z
    *2    *0    *80     *0       0     5|0       0  7.75G 18.9G 203M      0   0|0   0|0  311b  14.5k   25 TBFM  SEC 2020-04-09T11:24:20Z
   *79    *0   *328     *0       0     3|0       0  7.75G 18.9G 205M      5   0|0   0|0  295b  13.8k   25 TBFM  SEC 2020-04-09T11:24:21Z
   *14    *0   *258     *0       0     6|0       0  7.75G 18.9G 202M      2   0|0   0|0  369b  14.9k   25 TBFM  SEC 2020-04-09T11:24:22Z
    *2    *0   *170     *0       0    18|0       0  7.75G 18.9G 189M      0   0|1   0|0 1.29k  32.0k   25 TBFM  SEC 2020-04-09T11:24:23Z
    *1    *0    *13     *0       0     2|0       0  7.75G 18.9G 185M      1   0|1   0|0  137b  13.3k   25 TBFM  SEC 2020-04-09T11:24:24Z
insert query update delete getmore command flushes mapped vsize  res faults qr|qw ar|aw netIn netOut conn  set repl                 time
    *8    *0   *413     *0       0     2|0       0  7.75G 18.9G 183M      0   0|0   1|0  237b  13.4k   25 TBFM  SEC 2020-04-09T11:24:25Z
   *12    *0   *274     *0       0     1|0       0  7.75G 18.9G 184M      0   0|0   0|0   79b  12.9k   25 TBFM  SEC 2020-04-09T11:24:26Z
    *8    *0   *210     *0       0     2|0       0  7.75G 18.9G 181M      0   0|0   0|0  237b  13.4k   25 TBFM  SEC 2020-04-09T11:24:27Z
    *3    *0   *224     *0       0     2|0       0  7.75G 18.9G 180M      0   0|0   0|0  237b  13.4k   25 TBFM  SEC 2020-04-09T11:24:28Z
    *6    *0   *134     *0       0     6|0       0  7.75G 18.9G 181M      0   0|0   0|0  465b  14.6k   25 TBFM  SEC 2020-04-09T11:24:29Z
    *7    *0   *190     *0       0     4|0       0  7.75G 18.9G 181M      2   0|0   0|1  253b  14.1k   25 TBFM  SEC 2020-04-09T11:24:30Z
    *7    *0   *364     *0       0     3|0       0  7.75G 18.9G 181M      0   0|0   0|0  295b  13.8k   25 TBFM  SEC 2020-04-09T11:24:31Z
    *4    *0   *245     *0       0     8|0       0  7.75G 18.9G 179M      0   0|0   0|0  485b  15.7k   25 TBFM  SEC 2020-04-09T11:24:32Z
    *7    *0   *207     *0       0    10|0       0  7.75G 18.9G 176M      2   0|0   0|0  801b  16.5k   25 TBFM  SEC 2020-04-09T11:24:33Z
    *7    *0   *115     *0       0     8|0       0  7.75G 18.9G 175M      1   0|0   0|0  513b  28.0k   25 TBFM  SEC 2020-04-09T11:24:34Z
insert query update delete getmore command flushes mapped vsize  res faults qr|qw ar|aw netIn netOut conn  set repl                 time
    *6    *0   *181     *0       0     2|0       0  7.75G 18.9G 177M      2   0|0   0|0  237b  13.4k   25 TBFM  SEC 2020-04-09T11:24:35Z
   *12    *0   *329     *0       0     7|0       0  7.75G 18.9G 177M      0   0|0   0|0  455b  27.6k   26 TBFM  SEC 2020-04-09T11:24:36Z
   *13    *0   *119     *0       0     5|0       0  7.75G 18.9G 175M      0   0|0   0|0  429b  14.7k   25 TBFM  SEC 2020-04-09T11:24:37Z