Resume of change stream was not possible, as the resume point may no longer be in the oplog

I am running a sharded MongoDB cluster with a single shard and use change streams to listen to change events in the database. I am getting the following error a couple of times per day:

MongoError: Error on remote shard shard-01-01:27017 :: caused by :: Resume of change stream was not possible, as the resume point may no longer be in the oplog.
at MessageStream.messageHandler (/home/node/app/node_modules/mongodb/lib/cmap/connection.js:266:20)
at MessageStream.emit (events.js:314:20)
at MessageStream.EventEmitter.emit (domain.js:486:12)
at processIncomingData (/home/node/app/node_modules/mongodb/lib/cmap/message_stream.js:144:12)
at MessageStream._write (/home/node/app/node_modules/mongodb/lib/cmap/message_stream.js:42:5)
at writeOrBuffer (_stream_writable.js:352:12)
at MessageStream.Writable.write (_stream_writable.js:303:10)
at TLSSocket.ondata (_stream_readable.js:713:22)
at TLSSocket.emit (events.js:314:20)
at TLSSocket.EventEmitter.emit (domain.js:486:12)
at addChunk (_stream_readable.js:303:12)
at readableAddChunk (_stream_readable.js:279:9)
at TLSSocket.Readable.push (_stream_readable.js:218:10)
at TLSWrap.onStreamRead (internal/stream_base_commons.js:188:23)

I am not sure what causes it or how to prevent it. Has anybody encountered this problem before, or does anyone have any pointers?

Thanks in advance

Hi @Jascha_Brinkmann,

Why use a sharded cluster with a single shard? Why not just a single replica set? Is it to be future proof because you are planning to scale or something like that?

As you have a single replica set, can you run the following command on your shard so we can get an idea of the health of your oplog?

test:PRIMARY> rs.printReplicationInfo()
configured oplog size:   990MB
log length start to end: 29251secs (8.13hrs)
oplog first event time:  Thu Sep 17 2020 12:30:26 GMT+0000 (UTC)
oplog last event time:   Thu Sep 17 2020 20:37:57 GMT+0000 (UTC)
now:                     Thu Sep 17 2020 20:38:00 GMT+0000 (UTC)

I suspect this is happening because you went through an election and your client is trying to resume the change stream but cannot: the oplog is too small, so the last known resume point has already been overwritten.
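If some missed events are acceptable, one way to cope on the client side is to detect this specific error and open a fresh change stream that starts from the current time instead of the lost resume point. The following is only a minimal sketch under assumptions: it matches on the message text quoted in this thread rather than on an error code, the helper names and callbacks (`isResumePointLost`, `watchWithRestart`, `onGap`, `onFatal`) are made up for illustration, and it assumes the old stream is no longer usable once the driver has emitted `error`.

```javascript
// Message fragment copied from the error reported in this thread.
const RESUME_LOST_TEXT = 'Resume of change stream was not possible';

// True when the error says the resume point has aged out of the oplog.
function isResumePointLost(err) {
  return Boolean(
    err &&
    typeof err.message === 'string' &&
    err.message.includes(RESUME_LOST_TEXT)
  );
}

// Opens a change stream on `collection` (anything exposing watch(), such as a
// driver Collection). If the stream fails because the resume point is gone,
// the application is told about the gap and a new stream is opened without a
// resume token, so it starts from "now". Any other error goes to onFatal.
// onChange, onGap and onFatal are hypothetical application callbacks.
function watchWithRestart(collection, onChange, onGap, onFatal) {
  const stream = collection.watch();
  stream.on('change', onChange);
  stream.on('error', (err) => {
    if (isResumePointLost(err)) {
      onGap(err); // events between the lost point and now were missed
      watchWithRestart(collection, onChange, onGap, onFatal);
    } else {
      onFatal(err);
    }
  });
  return stream;
}
```

The `onGap` callback is the place to trigger whatever re-sync the application needs (for example, re-reading the affected collection), since restarting from "now" silently skips the events that were overwritten.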

Cheers,
Maxime.

Hey @MaBeuLux88,
yes – it’s already foreseeable that we will have to shard at some point, so that’s why it’s already deployed as a sharded cluster.

This is the output of rs.printReplicationInfo():

configured oplog size:   1005.845458984375MB
log length start to end: 106384secs (29.55hrs)
oplog first event time:  Wed Sep 16 2020 15:19:27 GMT+0000 (UTC)
oplog last event time:   Thu Sep 17 2020 20:52:31 GMT+0000 (UTC)
now:                     Thu Sep 17 2020 20:52:37 GMT+0000 (UTC)

29hrs seems like plenty. Is there any reason why my resume point would fall outside that time frame?

That being said, I haven’t seen the error within the last 24 hours. The last time I saw the error was at Wed Sep 16 2020 13:18:41 GMT+0000 (UTC), which unfortunately is now before the first event in the oplog.

So it is possible that during peak times the log length start to end is considerably smaller than it was in the past 24 hours. I will check it again once I see the error.

Hey @Jascha_Brinkmann,

Your oplog is “only” 1GB. From what you provided here, it means that roughly every 30h you write 1GB of data to MongoDB (a mix of insert / update / delete operations), and each new operation overwrites the one from 30h ago.
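To put rough numbers on that, the figures from the rs.printReplicationInfo() output above can be turned into an average write rate, and from there into the oplog size needed for a longer window. This is only a back-of-the-envelope sketch: it assumes the oplog is full and the write rate is steady, which is exactly what does not hold during peaks, and the function names are made up for illustration.

```javascript
// Average oplog churn in MB per hour, given the configured size and the
// "log length start to end" in seconds. Assumes the oplog is full.
function oplogRateMBPerHour(oplogSizeMB, windowSecs) {
  return oplogSizeMB / (windowSecs / 3600);
}

// Oplog size (MB) needed to cover `targetHours` at a given churn rate.
function requiredOplogSizeMB(rateMBPerHour, targetHours) {
  return rateMBPerHour * targetHours;
}

// Values taken from the output posted earlier in this thread.
const rate = oplogRateMBPerHour(1005.845458984375, 106384);

// 72 hours is an arbitrary example target, e.g. to cover a long weekend.
const sizeFor72h = requiredOplogSizeMB(rate, 72);

console.log(`average rate: ${rate.toFixed(1)} MB/h`);
console.log(`size for a 72h window: ${Math.round(sizeFor72h)} MB`);
```

Because the result is an average, a short burst of writes can shrink the real window far below what this estimate suggests.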

It’s OK as is, but it could be larger if you want to be able to smoothly recover, on Monday morning, a node that failed at 10pm on Friday evening.

Also, if you run a large batch of insertions, updates or deletions (>1GB here), that’s going to bring this window down to only a few seconds (the time it takes the batch to execute). Suddenly it’s not healthy at all, because a node will be “lost” (unable to catch up) after only a few seconds of network partition or just a reboot for an OS update.

This “log length start to end” value must be monitored and should be as large as possible.
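Outside of Atlas, a simple way to keep an eye on this is to check the window periodically and alert when it drops below a threshold. The sketch below is an assumption-laden example: it expects an object shaped like the one the shell helper db.getReplicationInfo() returns (with a `timeDiff` field holding the window in seconds, as far as I know), and both the function name and the threshold are made up for illustration.

```javascript
// Returns the oplog window in hours and whether it meets the minimum.
// `info` is assumed to look like the result of db.getReplicationInfo(),
// i.e. it carries `timeDiff`, the window length in seconds.
function checkOplogWindow(info, minHours) {
  const hours = info.timeDiff / 3600;
  return {
    hours,
    healthy: hours >= minHours,
  };
}

// Example with the window reported in this thread (106384 seconds) and an
// arbitrary 24-hour minimum.
const status = checkOplogWindow({ timeDiff: 106384 }, 24);
if (!status.healthy) {
  console.log(`oplog window too small: ${status.hours.toFixed(2)}h`);
}
```

On a sharded cluster the replication info has to be read from the shard’s replica set members rather than through mongos, and running the check during peak load is what will reveal the short windows.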

Atlas provides metrics to monitor the oplog, such as Oplog GB/Hour and Replication Oplog Window.

Cheers,
Maxime