Primary node in replica set down and 2 weeks of data lost

Hello,

I am facing an issue with my MongoDB cluster. Around the time I noticed some degraded performance, the primary node of the replica set crashed and never restarted. After another node was elected primary, I noticed that a lot of data was gone.

I have checked the backups (mongodump and oplogs) and found no trace of the data from the most recent days.
In the mongo logs I have noticed a lot of errors like this:
[LogicalSessionCacheRefresh] Failed to refresh session cache: WriteConcernFailed: waiting for replication timed out; Error details: { wtimeout: true } at rs1
Those errors coincide with the dates for which the data was lost.
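
For anyone who wants to check the same correlation in their own logs, here is a rough sketch that tallies these errors per day. It assumes each log line contains an ISO-8601 timestamp (as in the MongoDB 4.0 plain-text log format); the function name and the log path in the usage note are hypothetical, not something taken from my setup.

```python
import re
from collections import Counter

# Assumption: every relevant log line contains an ISO-8601 timestamp such as
# 2019-01-31T12:34:56 somewhere in it (search, not match, in case Docker wraps the line).
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2})T\d{2}:\d{2}:\d{2}")


def count_errors_per_day(lines, needle="WriteConcernFailed"):
    """Return a Counter mapping YYYY-MM-DD to the number of lines containing `needle`."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue  # not one of the errors we are interested in
        match = TIMESTAMP.search(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[match.group(1)] += 1
    return counts
```

It could be used along the lines of `count_errors_per_day(open("mongod.log"))`, where `mongod.log` stands in for the real log file, and the resulting dates compared against the gap in the data.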

Is there a way to recover the lost data? And why was the data not synced from memory to disk?

Thank you!

What is the:

  • OS
  • MongoDB Version
  • Topology

Hello Chris,

The cluster is dockerized with persistent disks. The MongoDB version is 4.0 and the container OS is Ubuntu 16.04.
The topology is as follows:
1 mongos
1 config server replica set (1 primary, 2 secondaries)
2 shard replica sets (1 primary and 2 secondaries each)