ReadConcernMajority disabled raises MongoExecutionTimeoutException in Change Stream

Giovanni_Desiderio · November 19, 2020, 9:42pm

Hi all,

This post is based on one of my previous one regarding Change Event Stream
https://www.mongodb.com/community/forums/t/change-events-stream-stops-working-when-a-node-fails-in-replicaset/11043/2

I am still trying to keep working my watcher when one of my data-bearing node of a replica set PSA
goes down by setting in the file .conf of every node the following instruction: replication.enableMajorityReadConcern equals to false.

Here my java code with API mongodb-driver-sync 4.2.0-beta1:

	    MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=replica");
    	MongoDatabase database = mongoClient.getDatabase("test");		
    	MongoCollection<Document> collectionStream = database.getCollection("myCollection");
    	List<Bson> pipeline = Arrays.asList(Aggregates.match(Filters.and(Filters.in("operationType", Arrays.asList("insert", "update", "replace", "invalidate")))));
    	MongoCursor<ChangeStreamDocument<Document>> cursor = collectionStream.watch(pipeline).fullDocument(FullDocument.UPDATE_LOOKUP).iterator();
    	ChangeStreamDocument<Document> streamedEvent = cursor.next();
    	System.out.println("Streamed event: " + streamedEvent);

When my settings are like above, by stopping one of the primary or secondary node, I am always getting the following exception in console:

        Exception in thread "main" com.mongodb.MongoExecutionTimeoutException: Error waiting for snapshot not less than { ts: Timestamp(1605805914, 1), t: -1 }, current relevant optime is { ts: Timestamp(1605805864, 1), t: 71 }. :: caused by :: operation exceeded time limit
        	at com.mongodb.internal.connection.ProtocolHelper.createSpecialException(ProtocolHelper.java:239)
        	at com.mongodb.internal.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:171)
        	at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:359)
        	at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:280)
        	at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:100)
        	at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:490)
        	at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:71)
        	at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:259)
        	at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:202)
        	at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:118)
        	at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:110)
        	at com.mongodb.internal.operation.CommandOperationHelper.executeCommand(CommandOperationHelper.java:345)
        	at com.mongodb.internal.operation.CommandOperationHelper.executeCommand(CommandOperationHelper.java:336)
        	at com.mongodb.internal.operation.CommandOperationHelper.executeCommandWithConnection(CommandOperationHelper.java:222)
        	at com.mongodb.internal.operation.CommandOperationHelper$5.call(CommandOperationHelper.java:208)
        	at com.mongodb.internal.operation.OperationHelper.withReadConnectionSource(OperationHelper.java:583)
        	at com.mongodb.internal.operation.CommandOperationHelper.executeCommand(CommandOperationHelper.java:205)
        	at com.mongodb.internal.operation.AggregateOperationImpl.execute(AggregateOperationImpl.java:189)
        	at com.mongodb.internal.operation.ChangeStreamOperation$1.call(ChangeStreamOperation.java:325)
        	at com.mongodb.internal.operation.ChangeStreamOperation$1.call(ChangeStreamOperation.java:321)
        	at com.mongodb.internal.operation.OperationHelper.withReadConnectionSource(OperationHelper.java:583)
        	at com.mongodb.internal.operation.ChangeStreamOperation.execute(ChangeStreamOperation.java:321)
        	at com.mongodb.internal.operation.ChangeStreamOperation.execute(ChangeStreamOperation.java:60)
        	at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:178)
        	at com.mongodb.client.internal.ChangeStreamIterableImpl.execute(ChangeStreamIterableImpl.java:204)
        	at com.mongodb.client.internal.ChangeStreamIterableImpl.cursor(ChangeStreamIterableImpl.java:158)
        	at com.mongodb.client.internal.ChangeStreamIterableImpl.iterator(ChangeStreamIterableImpl.java:153)
        	at com.softstrategy.ProvaWatcher.ProvaWatcherApplication.main(ProvaWatcherApplication.java:34)

On the other hand, if I comment out the enableMajorityReadConcern in the every file .conf nodes as it is by default, that exception does not appear.
(Nonetheless, my watcher still does not stream…)

Hence my questions are the following two ones:

Why that exception is raised only when ReadConcern is set to false and when the node down is a data-bearing one?
Why when the arbiter node goes down, that exception is not raised regardless of ReadConcern setting?

Thanks in advance.

wan · December 2, 2020, 1:44am

Hi @Giovanni_Desiderio,

Thanks for linking the other thread for background information.

If majority read concern is set to false, change streams goes into a “speculative majority read”. In this case, change streams read at the current point in time and waits for it to be majority committed. If it cannot do majority commit, it throws an exception. When the secondary (only other data-bearing) node is unavailable the majority commit point cannot advance, and thus the current point in time will never become majority committed as well.

When the arbiter is unavailable, the majority commit point still advances. As the secondary (the other data-bearing node) able to advance the majority commit point. In this case, the system operates normally.

In your case, I’d recommend to substitute your arbiter for another secondary. This is because it requires a majority committed data (i.e. change stream) with good redundancy.

Regards,
Wan.

Giovanni_Desiderio · December 3, 2020, 10:27pm

Thank you for your detailed answer.
By the way, reading the documentation (https://docs.mongodb.com/manual/reference/read-concern-majority/), it seems that disabling ReadConcern majority should not affect in any way the change streaming watcher in mongodb 4.2 or higher.
On the contrary, your previuos answer seems to be totally in contrast with what is written in documentation.
Am I not right?

Regarding a phrase of your answer

change streams read at the current point in time and waits for it to be majority committed

which kind of commit are you talking about?

I suppose you are pointing to the WriteConcern parameter, meaning that a new data must be majority persisted, to be available for the streaming watcher.
But in my replica set configuration, the parameter WriteConcern is equal to 1.
Hence, I think if I had an insert/update operation, the commit of this operation, in order to be receveid by the majority of actors, would require just one actor.
That is, just the node that is the primary, in that moment.

What is wrong with my thougt?
Have I misunderstood anything?

Greetings,
Giovanni

wan · December 9, 2020, 11:13pm

Hi @Giovanni_Desiderio,

I think the documentation does not cover the situation where the data-bearing secondary is not available longer than the point-in-time majority commit. I’ll discuss this with the documentation team for clarification.

Change streams only notify on data changes that have persisted to a majority of data-bearing members in the replica set. This ensures that notifications are triggered only by majority-committed changes that are durable in failure scenarios.

Regards,
Wan.