Consistency with replicas


I have an issue with GridFS (See for details) where writes on the secondary seem to be applied in arbitrary order. More specifically, the chunks of a given document are not yet readable while the files entry already is, even though (AFAIU) chunks are written before the files entry.

The documentation states that “MongoDB provides monotonic write guarantees, by default, for standalone instances and replica set”. Doesn’t that mean that writes on the primary are replicated in the same order on the secondaries if there is no sharding?
Or maybe they are, but that order is not what a reader of those writes observes?
I see that a causally consistent client session could be used to provide stronger guarantees, but I’d like to be sure I correctly understand the default behavior in my case.

Thanks in advance.

Env: mongo 4.2, cluster with 6 replicas, no sharding.

Hi @Francois_EE and welcome to the MongoDB Community :muscle:!

Write operations are replicated on a secondary node in the same order they appear in the oplog collection, which is also the order in which they were applied on the primary node.

But WHAT you can read depends on which read concern you are using and WHERE (== which node) you are reading from depends on your read preference.
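To make the WHAT / WHERE distinction concrete, here is a minimal sketch that builds a connection string with both settings spelled out. `readPreference`, `readConcernLevel`, `w` and `replicaSet` are standard connection string options; the host names, the replica set name `rs0` and the helper `build_uri` are made up for illustration.

```python
from urllib.parse import urlencode


def build_uri(hosts, replica_set, read_preference, read_concern_level, w):
    """Build a MongoDB connection string with explicit read/write settings.

    readPreference   -> WHERE reads go (primary, secondary, ...)
    readConcernLevel -> WHAT data a read is allowed to return
    w                -> how many nodes must acknowledge a write
    """
    options = {
        "replicaSet": replica_set,
        "readPreference": read_preference,
        "readConcernLevel": read_concern_level,
        "w": w,
    }
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


# Hypothetical hosts and replica set name, for illustration only.
uri = build_uri(
    ["node1:27017", "node2:27017"],
    "rs0",
    "secondaryPreferred",
    "majority",
    "majority",
)
print(uri)
```

The resulting string can be passed to any driver; the same settings can also be set per database, per collection or per operation instead of globally.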

As GridFS relies on 2 collections, fs.files and fs.chunks, if you want to write a big file “atomically” to your primary (and replicate it in a similar manner to your secondaries), you would have to use a multi-document transaction, which is the real “all or nothing” implementation that you are apparently chasing here.

I hope this helps :slight_smile:.


Thanks Maxime.
From what I can read, transactions do have an impact on performance. I do a lot of writes and few reads, so I’m sensitive to write performance.
From what I understand, with the default write concern and a read concern of majority, using a session would achieve the consistency I’m looking for. Is that correct?
From a performance standpoint, would it be OK to use a new session for each document upload to GridFS, or would it be better to share the same session for a set of GridFS uploads?

Thanks again.

Hi @Francois_EE,

Actually, I have to take part of my comment back because GridFS doesn’t support multi-document transactions for some reason. It’s actually the first thing in the GridFS doc. My bad :confused:.

But yes, you are correct: a causally consistent session will help you “read your own writes”, even if these read operations happen on a secondary node right after the insertion. Whether you actually get that guarantee depends on which read concern & write concern combo you are using.
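As a rough sketch of what this could look like with PyMongo (not tested against your cluster): the upload and the read-back share one session started with `causal_consistency=True`. The helper names `upload_and_read_back` and `make_bucket` are made up for illustration, and the sketch assumes “majority” for both read and write concern, since as far as I know that is the combination for which the causal guarantees are documented. Note that the guarantee only covers operations made through the same session.

```python
def upload_and_read_back(client, bucket, filename, data):
    """Upload a file to GridFS and read it back in one causally consistent session.

    `client` is expected to be a pymongo.MongoClient and `bucket` a
    gridfs.GridFSBucket; `data` is bytes or a file-like object.
    """
    with client.start_session(causal_consistency=True) as session:
        # Both operations carry the same session, so the read is ordered
        # after the write even if it is served by a secondary.
        file_id = bucket.upload_from_stream(filename, data, session=session)
        stream = bucket.open_download_stream(file_id, session=session)
        return file_id, stream.read()


def make_bucket(uri, db_name):
    """Hypothetical wiring: a client plus a GridFS bucket with majority concerns."""
    # Imported here so the helper above stays usable without these packages.
    from pymongo import MongoClient, ReadPreference, WriteConcern
    from pymongo.read_concern import ReadConcern
    from gridfs import GridFSBucket

    client = MongoClient(uri)
    db = client.get_database(
        db_name,
        read_concern=ReadConcern("majority"),
        write_concern=WriteConcern(w="majority"),
        read_preference=ReadPreference.SECONDARY_PREFERRED,
    )
    return client, GridFSBucket(db)
```

Typical usage would be `client, bucket = make_bucket(uri, "mydb")` followed by `upload_and_read_back(client, bucket, "report.bin", b"...")`, where the URI and the database name are placeholders.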

It’s all explained in this documentation:

Also, you will have to solve the replication delay that you have on your RS because it’s not healthy. If too many nodes are delayed, your read & write operations with “majority” will time out because they can’t be replicated fast enough.
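To get a rough idea of that delay, you can compare the `optimeDate` of each secondary with the primary’s in the output of the `replSetGetStatus` command. A minimal sketch, where the helper `replication_lag_seconds` and the sample data are made up for illustration (with PyMongo, the real document would come from `client.admin.command("replSetGetStatus")`):

```python
from datetime import datetime, timedelta


def replication_lag_seconds(status):
    """Return {member name: seconds behind the primary} for each secondary.

    `status` is a replSetGetStatus-shaped document; only the `members`
    fields `name`, `stateStr` and `optimeDate` are used.
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Made-up sample shaped like the relevant part of replSetGetStatus output.
t0 = datetime(2020, 1, 1, 12, 0, 0)
sample = {
    "members": [
        {"name": "node1:27017", "stateStr": "PRIMARY", "optimeDate": t0},
        {"name": "node2:27017", "stateStr": "SECONDARY",
         "optimeDate": t0 - timedelta(seconds=2)},
        {"name": "node3:27017", "stateStr": "SECONDARY",
         "optimeDate": t0 - timedelta(seconds=45)},
    ]
}
print(replication_lag_seconds(sample))
```

In the mongo shell, `rs.printSlaveReplicationInfo()` gives a similar summary on 4.2.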