Write Concerns & Transactions

Hi all!

Hope everyone’s doing well learning new stuff.

Read and write concerns are amazing for a newbie like me, but when I read “the timeout doesn’t mean that the write failed”, I understand that in most cases the write does eventually occur. That can be bad for some applications (we want the write to happen completely or not at all).


So my question is: is there a way to do a transactional write concern across replicas?

Thanks for letting me know,

Joris

https://docs.mongodb.com/manual/core/replica-set-write-concern/

Hopefully one of these answers your question.

  1. This is covered in the Write Concern Part 1 video, and this is what I wrote down :)

    Write Concern Levels

    0 – Don’t wait for acknowledgement – The write may succeed or fail; the
    application doesn’t care. For fire and forget applications.

    1 (default) – Wait for acknowledgment from the primary only

    2 or more – Wait for acknowledgement from the primary and one or more secondary members. Higher levels of write concern correspond to a stronger guarantee of write durability.

    “majority”: Wait for acknowledgement from a majority of the voting replica set members: simply divide the number of voting members by 2 (rounding down) and add 1. The nice thing about “majority” is that you don’t have to manually update your write concern if you increase the size of the replica set. There is also a write concern for replica set tags (advanced), whatever they are.

    So ideally “majority” would be your write concern of choice if you wanted some data durability.
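To make the “divide by 2 and add 1” arithmetic concrete, here is a minimal Python sketch. The function name `majority_needed` is made up for illustration and is not a MongoDB API. It also assumes every voting member is data-bearing (no arbiters), which is a simplification.

```python
def majority_needed(voting_members: int) -> int:
    """Return how many voting members must acknowledge a "majority" write.

    Assumption for this sketch: all voting members hold data (no arbiters).
    """
    if voting_members < 1:
        raise ValueError("a replica set needs at least one voting member")
    # Integer-divide by 2, then add 1: strictly more than half of the members.
    return voting_members // 2 + 1


for size in (3, 4, 5, 7):
    print(size, "voting members ->", majority_needed(size), "acknowledgements")
```

Note that growing a set from 3 to 4 members raises the majority from 2 to 3 without adding any fault tolerance, which is one reason odd-sized sets are usually preferred.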

  2. However, having re-read your question: if you want to know whether the last write was successful, this may be of use.

https://docs.mongodb.com/manual/reference/command/getLastError/#getLastError

Maybe it would be called in the event of a timeout only.

Thanks @NMullins for your answer, but I think that with any of these write concern levels (1), if the timeout expires, the application will get a failure response even if the write occurs.

If I want more info on this operation, the getLastError function you sent me (2) will give some. But isn’t that more about log analysis?

What I would really want is for the write not to occur on any of the replica nodes if the timeout expires, so the application can safely retry the write without creating potential duplicates. That’s called a transaction, right?

"Transactions provide an “all-or-nothing” proposition, stating that each work-unit performed in a database must either complete in its entirety or have no effect whatsoever. "

Make it an upsert: a create if the first write didn’t work, and an update if it did. If your key is unique on the insert, it will not matter; it will just update with the same information.

The write will occur once you give the primary the data, and you can’t stop that. What you can control is the level of data durability (or, as you have described it, “transaction”).

What you can do is control what you do if you don’t get a response back.

OK. Say, for instance, you insert a record with a unique key. You ask for a majority write concern, so that 2 out of the 3 nodes have to have accepted the write before you get a response back. You can ask for all the nodes if you want, which is what you seem to want.

Now say your application loses its connection to the database before you get a response. As far as the application is concerned, the insert failed; however, in the background MongoDB has happily gone along and committed that insert to the majority/all of the nodes.

The application takes that failure and decides to reinsert the record. Since it is an upsert, it will see that there is already a record with the unique key, and will just update the record with the same data.

If the record isn’t there because the data centre actually lost power (change data centre provider, btw), then your upsert will insert the record as desired and MongoDB will replicate it out once all the services are back up and running.

Once the MongoDB servers get their power back, they will finish off/roll back any transactions before they accept any new data.

Regardless of this, your application will need to wait for MongoDB to come back.
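The upsert-and-retry flow above can be sketched without a database. The snippet below is only a model: `FakeCollection` and `write_with_retry` are hypothetical names, and the in-memory dict stands in for a collection with a unique key. It shows why a retry after a lost acknowledgement does not create a duplicate.

```python
class FakeCollection:
    """In-memory stand-in for a collection with a unique key (not a MongoDB API)."""

    def __init__(self):
        self.docs = {}              # unique key -> document
        self.drop_next_ack = False  # simulate losing the acknowledgement

    def upsert(self, key, fields):
        # The write is applied first, as the primary would apply it...
        self.docs.setdefault(key, {}).update(fields)
        # ...and only then can the acknowledgement get lost on the way back.
        if self.drop_next_ack:
            self.drop_next_ack = False
            raise TimeoutError("no acknowledgement received")


def write_with_retry(coll, key, fields, attempts=3):
    """Retry an upsert until it is acknowledged; return the attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            coll.upsert(key, fields)
            return attempt
        except TimeoutError:
            continue  # safe to retry: repeating the upsert changes nothing
    raise RuntimeError("write was never acknowledged")


coll = FakeCollection()
coll.drop_next_ack = True  # first attempt: written, but the reply is lost
used = write_with_retry(coll, "order-1", {"status": "paid"})
print(used, coll.docs)  # 2 {'order-1': {'status': 'paid'}}
```

With a real driver, the same idea would be an update with the upsert option on the unique key, for example PyMongo's `update_one(filter, {"$set": fields}, upsert=True)`, rather than a plain insert.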

Your definition of transaction here is really a 2 phase commit.

The two-phase commit protocol is an atomic commitment protocol for distributed systems. As its name implies, it consists of two phases. The first is the commit-request phase, in which the transaction manager coordinates all of the transaction resources to commit or abort.

https://www.google.com/search?ei=3PZzXL-OH6yZ1fAP9tO3-AQ&q=two+phase+commit&oq=multi+phase+transaction&gs_l=psy-ab.1.0.0i71l8.0.0..3614...0.0..0.0.0…0…gws-wiz.UOupiTuMMOs

Maybe the forum supervisor can offer an opinion? I am out of thoughts :)

Hi @Joris_SAIDANI_37802,

Excellent question!

From MongoDB 4.0 onwards, we support replica set transactions.
Before I go into lots of detail about transactions, be advised that the write concern mechanism is orthogonal to transaction support.
This means that the way a WriteConcern is enforced / guaranteed / applied is independent of how transactions are applied.

That said, when combining transactions and write concerns, the expectations should be clarified.

Transactions are all-or-nothing ACID operations: a combined set of operations is either aborted (rolled back) or committed.
Transactions are prone to conflicts, and there is a conflict resolution mechanism, exposed to the client application, that will drive you to make decisions based on what the conflict is about:
https://docs.mongodb.com/manual/core/transactions/#transactions-in-applications

All write operations of a transaction, until an abort or commit, are bound to the primary node. This means that, within the set of operations of a transaction, all write operations occur on the primary until the abort or commit.
You can start a transaction asking for w: majority, but the write concern only applies once the transaction is committed, and it applies to all write operations of that transaction together, not to each individual operation.
If the write concern defined for a transaction (for example w: majority) is not satisfied, the usual expectations hold: the current primary has acknowledged the writes, potentially several, but the number of secondary nodes required to confirm the write concern was not reached within the defined wtimeout. Nevertheless, the data has been confirmed and committed on the primary node.

Now, a write concern in isolation, without considering read concerns, may not be enough to ensure full data availability. Writing to a majority of nodes ensures that the data we’ve provided, and committed in a transaction, reached all the designated nodes. However, in a distributed system like MongoDB, to guarantee the data we write we also need to make sure that, before writing, we read that data with ReadConcern.Majority, or ReadConcern.Snapshot in a transaction, to avoid any potential stale read.

Don’t forget, there is a tradeoff: when always reading from and writing to a majority of nodes, your operations will take longer to complete and acknowledge.
Within transactions, while write concerns are pretty straightforward, read concerns are a bit more elaborate in terms of the conflicts and resolution mechanisms that might take place.

We will soon be adding a Transactions module to the M220 courses, where we hope to clarify even more how these work together.

Stay tuned.

N.


Thanks for that.

  1. So the write concern is on the transaction, which can contain more than one action (remind me, which version did multiple atomic actions come in with? 3.6 or 4?).

  2. In v4, where it says you have multi-document transactions, does that mean within the same collection, or multiple documents across different collections?

  3. If the write concern defined for a transaction (for example w: majority) is not satisfied, the usual expectations hold: the current primary has acknowledged the writes, potentially several, but the number of secondary nodes required to confirm the write concern was not reached within the defined wtimeout. Nevertheless, the data has been confirmed and committed on the primary node.

    So the data could have replicated to the secondaries, just not necessarily within the wtimeout specified?

  4. Given the potential overhead of replica set transactions in v4, is this something that can be specified at application level, or is it a replica set setting? As far as I could see, not all actions within an application would require that level of redundancy.

  5. When does that Transactions section get added? I will hold off on doing M220 until then.

Joris, depending on which version of MongoDB you have: if it is 3.6, then this may be of use, specifically the section on retryable writes.

This has been an interesting topic, since it has made me appreciate things a bit more.


  1. 4.0:
https://docs.mongodb.com/manual/core/transactions/

  2. Multi-document transactions can be used across multiple operations, collections, databases, and documents.

  3. Correct. By default wtimeout is not set, and the write will block indefinitely:
https://docs.mongodb.com/manual/reference/write-concern/#wc-wtimeout

  4. By default, a transaction must complete within 60 seconds (the transactionLifetimeLimitSeconds default), and operations do not run in a transaction by default: you need to explicitly start one.
    The write overhead of transactions is generally quite low. All write operations within a transaction are replicated at the same time with the same write concern. The larger overhead is caused by memory back-pressure, given that MongoDB has to maintain a larger memory footprint for documents (and their versions over time) going back to the time the transaction was initiated. From the write perspective, there is little overhead.

  5. We don’t have a specific date yet, and we might do this module as a microcourse on the transactions topic alone. Also, bear in mind that transactions in MongoDB, given the document model, are the exception, not the rule. I recommend taking M220 regardless of the transaction support lessons/lectures.

I would recommend a different set of information on this topic than the percona blog post.

N.


Thank you. The point about the Percona post was the use of the session id.

A quick question on the visibility of data within the scope of a transaction.

If I start a transaction, make a change to a document, and find that document again, I will see the changed document, not the original document from before the start of the transaction.

Whereas queries outside of that transaction will see the pre-transaction version of the document, as per the online manual. Is that right?
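To pin down the visibility being asked about, here is a small Python model of it. This is only a sketch of the behaviour described in the post, not driver code: `FakeStore` and `FakeTransaction` are hypothetical classes, and the model ignores conflicts between concurrent writers.

```python
import copy


class FakeStore:
    """Committed data, as seen by readers outside any transaction."""

    def __init__(self, docs):
        self.committed = docs

    def find(self, key):
        # Return a copy so callers never hold a live reference to stored data.
        return copy.deepcopy(self.committed.get(key))


class FakeTransaction:
    """Hypothetical model: reads a snapshot plus its own uncommitted writes."""

    def __init__(self, store):
        self.store = store
        self.snapshot = copy.deepcopy(store.committed)  # view at start time
        self.pending = {}  # key -> fields changed inside this transaction

    def update(self, key, fields):
        self.pending.setdefault(key, {}).update(fields)

    def find(self, key):
        doc = dict(self.snapshot.get(key, {}))
        doc.update(self.pending.get(key, {}))  # own writes are visible
        return doc

    def commit(self):
        # All-or-nothing: publish every pending change in one step.
        for key, fields in self.pending.items():
            self.store.committed.setdefault(key, {}).update(fields)
        self.pending = {}


store = FakeStore({"acct": {"balance": 100}})
txn = FakeTransaction(store)
txn.update("acct", {"balance": 80})

inside = txn.find("acct")     # the transaction sees its own change
outside = store.find("acct")  # other readers still see the old version
txn.commit()
after = store.find("acct")    # visible to everyone once committed

print(inside, outside, after)
```

In this model the uncommitted change is visible only through the transaction object, and it becomes visible to outside readers only after `commit()`.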