Change stream permanent oplog

We are using MongoDB as the event store in our event-sourced project. Our database runs as a single-member replica set, just to support transactions; we are not going to use multiple replicas, sharding, or any other advanced MongoDB features.

In order for our idea of projecting events into report tables to work, we need to be able to permanently resume Change Streams from any point in the history. We only need the history of insertions, nothing more.

To do that, we store the resume token and operation time of each insertion, along with its id, in a separate collection. When we want to resume from a certain insertion in the history, we query for the exact resume token and resume the stream from there.
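The bookkeeping described above can be sketched roughly like this, with a plain array standing in for the separate MongoDB collection; the field names (`insertId`, `resumeToken`, `operationTime`) and function names are hypothetical, not from the original post:

```javascript
// Sketch of the resume-token bookkeeping, with a plain array standing
// in for the separate MongoDB collection. Field names are hypothetical.

function recordToken(tokenLog, insertId, resumeToken, operationTime) {
  // Store the resume token observed for one insertion via the change stream.
  tokenLog.push({ insertId, resumeToken, operationTime });
}

function tokenFor(tokenLog, insertId) {
  // Look up the token to pass as resumeAfter/startAfter when
  // reopening the change stream from this insertion.
  const entry = tokenLog.find((e) => e.insertId === insertId);
  return entry ? entry.resumeToken : null;
}
```

The catch, as the thread explains, is that the looked-up token is only usable while the corresponding entry is still in the oplog.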

This causes a bunch of problems:

1. Storing the resume token and operation time of every insertion via a change stream is fragile and takes extra storage, but it works!

2. We recently hit the oplog size limit and found out that resumability depends on the oplog. So to be able to resume from any point in the history, we would have to keep the whole oplog for the lifetime of the application, but our oplog reaches its maximum capacity every 2 hours! It is simply not possible to save and keep the whole oplog.

What should we do? It seems there is no way other than implementing change-streams-from-history ourselves on top of MongoDB.

Hi @Masoud_Naghizade

Welcome to MongoDB community.

So you are correct: in order to use a change stream resume token, it must still be present in the oplog.

If your current oplog only covers 2 hours, consider increasing its size anyway (we recommend trying to have an oplog window of 12h+).

Additionally, in 4.4 you can define how large a retention window you need, and the oplog will try to grow to sustain it. Of course this requires a lot of disk space and can have performance overhead.

https://docs.mongodb.com/manual/reference/command/replSetResizeOplog/#replsetresizeoplog-minretentionhours
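For reference, both knobs are set with the `replSetResizeOplog` command linked above, run against the replica set primary in the shell. The specific numbers here (50 GB, 24 hours) are illustrative values, not recommendations from this thread:

```javascript
// Resize the oplog to 50 GB (size is given in megabytes) and, on
// MongoDB 4.4+, ask the server to retain at least 24 hours of oplog
// even if that means growing past the configured size.
db.adminCommand({
  replSetResizeOplog: 1,
  size: 51200,           // 50 GB, expressed in MB
  minRetentionHours: 24  // only honored on MongoDB 4.4 and later
})
```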

One question I have is why you need the ability to resume from any point in time? Maybe there is a better design.

Thanks
Pavel

In event sourcing, in order to have multiple reports (event projections), you need to be able to ask your event store to feed all events matching some criteria to your projector. If your projector fails to apply one event to the projection partway through, you need to be able to resume from that event. Other people use EventStore or Kafka as their event store, but MongoDB gives you the ability to live-query your event streams, and that's a huge benefit. Anyway, thanks for your fast reply, but I think I have to implement my own change streams on top of MongoDB.

Hi @Masoud_Naghizade,

Why not use our Kafka connector to implement it:
https://docs.mongodb.com/kafka-connector/current/
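As a rough illustration of what that looks like, a MongoDB Kafka source connector can be configured to publish only insert events from the event collection. The connector class, `copy.existing`, and `pipeline` properties come from the linked connector docs; the connector name, URI, database, and collection names below are hypothetical:

```json
{
  "name": "mongo-events-source",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "mongodb://localhost:27017/?replicaSet=rs0",
    "database": "eventstore",
    "collection": "events",
    "copy.existing": "true",
    "pipeline": "[{\"$match\": {\"operationType\": \"insert\"}}]"
  }
}
```

The `pipeline` filter matches the poster's requirement of only needing the history of insertions.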

Thanks
Pavel


Yeah, thanks. That was great and exactly the solution to my problem. Connecting MongoDB to Kafka topics is the way to stream my events into projectors. Thanks again.

Actually, there is a catch with connecting MongoDB changes to Kafka, which is explained here. If your change stream shuts down and falls behind the oplog window, the changes in between won't be delivered to Kafka; only the most recent changes get pushed. So this option best suits those who just want to see the most recent changes through Kafka with at-least-once delivery. Thanks anyway; I'm going to implement my own change stream driver on top of MongoDB change streams.

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.