
MongoDB: Document-based ACID vs Multi-Document ACID

Consider an application in which we have some docs (I use "doc" rather than "document" to distinguish them from MongoDB documents) and modifications are performed on them. The only requirement is that changes to multiple docs are applied atomically (either all of them are applied, or none). There are two ways to implement this:

  1. A transaction is started and all the changes to the docs are performed inside it. Then it is committed. Whenever we need a doc, we retrieve it by its ID.
  2. A new document that includes all the changes to the docs is inserted into MongoDB (example below). Since a single document is inserted atomically, there is no need for a transaction. We put an index on changes.docId, and whenever we want to retrieve a doc we find all the changes to it (via the index) and aggregate them to produce the doc.
{
	_id: ...,
	changes: [
		{docId: 1, change: ...},
		{docId: 10, change: ...},
		{docId: 5, change: ...},
		...
	]
}
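
To make the read path of the second solution concrete, here is a minimal in-memory sketch in Python. It does not talk to MongoDB: the `batches` list stands in for the collection, `build_index` mimics what an index on changes.docId would give us, and `rebuild_doc` is a hypothetical helper. The shape of `change` (a dict of field to new value) is my assumption, since the real change format is not specified above.

```python
from collections import defaultdict

# Hypothetical batch documents; each one would be a single atomic insert.
# Assumption: a "change" is a dict of field -> new value.
batches = [
    {"_id": 100, "changes": [
        {"docId": 1, "change": {"title": "draft"}},
        {"docId": 10, "change": {"owner": "ann"}},
    ]},
    {"_id": 101, "changes": [
        {"docId": 1, "change": {"title": "final", "pages": 3}},
        {"docId": 5, "change": {"owner": "bob"}},
    ]},
]


def build_index(batches):
    """Map each docId to the positions of the batches that mention it,
    roughly what an index on changes.docId would provide."""
    index = defaultdict(list)
    for pos, batch in enumerate(batches):
        for doc_id in {c["docId"] for c in batch["changes"]}:
            index[doc_id].append(pos)
    return index


def rebuild_doc(doc_id, batches, index):
    """Fold every change for doc_id, in insertion order, into one doc."""
    doc = {"_id": doc_id}
    for pos in index.get(doc_id, []):
        for c in batches[pos]["changes"]:
            if c["docId"] == doc_id:
                doc.update(c["change"])
    return doc


index = build_index(batches)
doc1 = rebuild_doc(1, batches, index)
print(doc1)
```

Note that reading doc 1 touches two batch documents here; in general a read touches as many batch documents as there were writes involving that doc.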

Note that since we need the history of changes, even in the first solution we keep the changed values inside the doc. Thus, in terms of storage space, the two solutions do not differ much (ignoring indexes, …).
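
For comparison, here is a rough in-memory sketch of the first solution, with the history kept inside each doc. It is only a stand-in for a real multi-document transaction, not MongoDB's transaction API: `store`, the `current`/`history` field names, and `apply_all_or_nothing` are all hypothetical, and the all-or-nothing behavior is imitated by staging changes on copies before writing them back.

```python
import copy

# Hypothetical store: _id -> doc. Each doc keeps its current values
# and the list of changes applied to it (the history).
store = {
    1: {"_id": 1, "current": {"title": "draft"}, "history": []},
    5: {"_id": 5, "current": {"owner": "ann"}, "history": []},
}


def apply_all_or_nothing(store, changes):
    """Apply every change or none of them.

    Changes are staged on copies; the store is only modified after
    every change has been applied to the staged copies without error.
    """
    staged = {}
    for c in changes:
        doc_id = c["docId"]
        if doc_id not in store:
            # Abort: nothing has been written to the store yet.
            raise KeyError(f"unknown doc {doc_id}")
        if doc_id not in staged:
            staged[doc_id] = copy.deepcopy(store[doc_id])
        staged[doc_id]["current"].update(c["change"])
        staged[doc_id]["history"].append(dict(c["change"]))
    store.update(staged)  # the "commit" step


# A batch that succeeds: both docs change together.
apply_all_or_nothing(store, [
    {"docId": 1, "change": {"title": "final"}},
    {"docId": 5, "change": {"owner": "bob"}},
])

# A batch that fails on its second change: doc 1 must stay untouched.
try:
    apply_all_or_nothing(store, [
        {"docId": 1, "change": {"title": "oops"}},
        {"docId": 42, "change": {"owner": "eve"}},
    ])
except KeyError:
    pass
```

Here a read is a single lookup by ID, and the history travels with the doc, which is why the storage footprint ends up close to that of the second solution.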

The question is: which of these solutions is better?

Some of my own thoughts on this question:

  • The second solution may be faster for writes (it does not need transaction handling across different documents and shards).
  • The first solution may be faster for reads (the second solution has to look up all the changes to the doc via the index, and those changes may be spread across different documents or even shards).
  • Assuming that reads are more prevalent than writes (although not by much), the first solution may be better if multi-document (and multi-shard) ACID transactions in MongoDB are efficient and low-cost. But if transaction handling adds a lot of overhead and requires a large amount of coordination among shards, the second solution may be better. So someone with in-depth knowledge of how MongoDB works internally may be able to provide a useful answer.