M103 and M201 contradict regarding Merging / Sorting in case of Sharded Cluster

Confusion regarding where the final Merging / Sorting of data happens (Mongos or Primary Shard?)

Lecture: Sharding Architecture, which is a part of Chapter 3 of course M103, at 3:18 states “Each individual shard is going to send the results back to Mongos. Mongos will gather the results and, may be, sort the results if the query mandated it. This stage is called SHARD_MERGE and it takes place on the Mongos. Once the Shard merging is complete, Mongos will return the results back to the client.
image

Lecture: Queries in a Sharded Cluster, which is also a part of Chapter 3 of course M103, at 1:20 states “Mongos opens a cursor against each of the targeted shards. Each cursor executes the query predicate and returns any data returned by the query for that shard. The mongos now has results from each targeted shard. The mongos merges all the results together to form the total set of documents that fulfils this query and then returns that set of documents to the client application.”

Thus, the above 2 lectures in course M103, give an impression that Merging / Sorting of data received from individual Shards, happens on Mongos. However course M201, states it otherwise

Lecture: Performance Consideration in Distributed Systems Part 2 , which is a part of Chapter 5 of course M201, at 4:15 states “Once the local sort is performed, then a final Sort Merge needs to occur at the Primary Shard of our database. Once that sort merge is performed, we return back the information to our client.”
image

Thus, as per course M201, Merging / Sorting of data happens on the Primary Shard.

Request you to please clarify where does Merging / Sorting happen in a Sharded Cluster??

Hi @PuneetC,

I apologise for this ambiguous explanation.
However, let me just check and get back to you with a better understanding as soon as possible.

Thanks,
Sonali

Hi @PuneetC,

The mongos routes query to a cluster and will establish a cursor on all target shards. Then, mongos will merge the data from each target shards and return the result documents to the client application. If the query is not sorted, the mongos instance will open a result cursor that “round robins” result from all cursors on the shards.
Sometimes sorting is performed on primary shard before mongos retrieves the result. To reduce latency in retrieving the results, a cursor is established in the shard nodes to perform sorting.

Changed in version 3.6: For aggregation operations that run on multiple shards, if the operations do not require running on the database’s primary shard, these operations may route the results back to the mongos where the results are then merged.

To sum up, the merge sort always occurs on the mongos for find operations. For aggregation operations, the merge-sort occurs on a random shard.
Please refer to the following doc for more information:

Please let me know, if you have any questions.

Thanks,
Sonali

Sorry but there is still ambiguity :

You mentioned:

However the link you provided states :

The mongos then merges the data from each of the targeted shards and returns the result document. Certain query modifiers, such as sorting, are performed on a shard such as the primary shard before mongos retrieves the results.

Hi @PuneetC,

Apologies for late response.
Please allow me some more time here, so that I can give you relevant and precise explanation.

Kind Regards,
Sonali