Connection Reset Errors

I'm seeing an abundance of the errors below in a sharded cluster hosted in AWS. Any insight into how to debug them? I have tinkered with TCP keepalive on the servers (currently set to 120) and maxIdleTime on the client without any noticeable change.

MongoDB Server Version: 4.4.2
Java Driver: 'org.mongodb:mongodb-driver-reactivestreams:1.13.1'
'io.reactivex.rxjava3:rxjava:3.0.3'
Architecture: arm64

"stack_trace":"java.util.concurrent.ExecutionException: com.mongodb.MongoSocketReadException: Exception receiving message
    at java.base/java.util.concurrent.CompletableFuture.reportGet(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.get(Unknown Source)
    at com.creativeradicals.openio.pipeline.persist.feed.FeedMultiSaver.accept(FeedMultiSaver.java:107)
    at com.creativeradicals.openio.pipeline.persist.feed.FeedMultiSaver.accept(FeedMultiSaver.java:34)
    at com.creativeradicals.openio.rabbit.base.RequestConsumer.handleDelivery(RequestConsumer.java:42)
    at com.rabbitmq.client.impl.ConsumerDispatcher$5.run(ConsumerDispatcher.java:149)
    at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:104)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: com.mongodb.MongoSocketReadException: Exception receiving message
    at com.mongodb.internal.connection.InternalStreamConnection.translateReadException(InternalStreamConnection.java:569)
    at com.mongodb.internal.connection.InternalStreamConnection.access$1200(InternalStreamConnection.java:76)
    at com.mongodb.internal.connection.InternalStreamConnection$5.failed(InternalStreamConnection.java:520)
    at com.mongodb.internal.connection.AsynchronousChannelStream$BasicCompletionHandler.failed(AsynchronousChannelStream.java:235)
    at com.mongodb.internal.connection.AsynchronousChannelStream$BasicCompletionHandler.failed(AsynchronousChannelStream.java:203)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Unknown Source)
    at java.base/sun.nio.ch.Invoker$2.run(Unknown Source)
    at java.base/sun.nio.ch.AsynchronousChannelGroupImpl$1.run(Unknown Source)
    ... 3 common frames omitted
Caused by: java.io.IOException: Connection reset
    at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.finishRead(Unknown Source)
    at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.finish(Unknown Source)
    at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.onEvent(Unknown Source)
    at java.base/sun.nio.ch.EPollPort$EventHandlerTask.run(Unknown Source)
    ... 1 common frames omitted"}

See the discussion containing this suggestion and perhaps that will solve your problem…

Our application uses Docker, and it looks like the version of Java we are on (OpenJDK Runtime Environment AdoptOpenJDK, build 14.0.2+12) has already patched the bug mentioned in that thread. Is there any other way to debug these constant connection reset errors and socket exceptions?

Hmm, I don’t know an easy way … maybe we can ask @Jeffrey_Yemin

There’s no straightforward way to determine the root cause of connection reset errors. It’s not typically a driver bug that causes it. Rather, it’s either something happening in the MongoDB server or in the network between driver and server. I would look first at MongoDB server logs to see if there are any clues there. It’s possible that the server itself is closing the connection for some reason. If not, you’ll need to involve an expert in network administration, perhaps to employ a tool like Wireshark to figure out what’s happening, assuming that you can reproduce the error.

One other thought: if you're able to test outside of Docker, that would be one way to rule out Docker itself as a contributing factor.
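
While narrowing this down, it may help to count resets separately from other failures in the application logs so they can be lined up against the server logs by timestamp. Below is a minimal sketch of that idea using only the JDK. It walks the cause chain the same way the stack trace above is nested (ExecutionException, then the driver exception, then the IOException). The class and method names are hypothetical, and the RuntimeException in main only stands in for the driver's MongoSocketReadException so the example runs without the driver on the classpath. Matching on the message text "Connection reset" is an assumption based on the trace in this thread, not a documented contract.

```java
import java.io.IOException;
import java.util.Locale;
import java.util.concurrent.ExecutionException;

// Hypothetical helper, not part of the MongoDB driver.
final class ResetDetector {
    private ResetDetector() {}

    /** Returns the deepest cause in the chain; the depth cap guards against cyclic chains. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        int depth = 0;
        while (current.getCause() != null && depth++ < 32) {
            current = current.getCause();
        }
        return current;
    }

    /** True when the root cause is an IOException whose message mentions a connection reset. */
    public static boolean isConnectionReset(Throwable t) {
        Throwable root = rootCause(t);
        String msg = root.getMessage();
        return root instanceof IOException
                && msg != null
                && msg.toLowerCase(Locale.ROOT).contains("connection reset");
    }

    public static void main(String[] args) {
        // Same nesting as the trace above; RuntimeException stands in for the driver exception.
        Throwable failure = new ExecutionException(
                new RuntimeException("Exception receiving message",
                        new IOException("Connection reset")));
        System.out.println(isConnectionReset(failure)); // prints "true"
    }
}
```

Calling isConnectionReset in the catch block around the CompletableFuture.get() call would let you log resets with their own marker and timestamp, which makes it easier to check whether the server logged anything for the same connection at the same moment.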


Yes, simplification is a useful debugging tool.

Our cluster is running on ARM64 CentOS machines in AWS. I see that the compatibility specs don’t list CentOS under ARM64. Is it worth trying to switch the underlying Operating System to Ubuntu?
https://docs.mongodb.com/manual/administration/production-notes/