Cannot get $merge to work with node.js driver 3.5.9

Hi there,

I have been developing with node.js and mongo for over 4 years now. We have a production application that makes heavy use of the aggregation pipeline.

As of mongodb version 4.2, which is currently what we are using at altas as a 3 node replica set, you have added the $merge stage functionality to the agg pipeline.

I have spent the better part of a day trying to get this to work with our application to no avail. I get no errors and the new collection is not created.

Here is the query that gets called

cursor = exectuteOnCollection.aggregate(superQuery[i].pipeline,{“readPreference”:“ReadPreference.PRIMARY”,“batchSize”:5000,“allowDiskUse”:true}).toArray(fred)

I have tried with the callback and without the callback. I have tried with the toArray (although i think it is needed) and without.

Here is an example of the pipeline that is called:

[
   {
      "$match":{
         "$and":[
            {
               "$and":[
                  {
                     "REPORT_DATE":{
                        "$gte":"2020-01-10T05:00:00.000Z"
                     }
                  },
                  {
                     "REPORT_DATE":{
                        "$lte":"2020-07-08T04:59:59.999Z"
                     }
                  }
               ]
            }
         ]
      }
   },
   {
      "$match":{
         "$and":[
            {
               "WH_ID":{
                  "$nin":[
                     "TB"
                  ]
               }
            }
         ]
      }
   },
   {
      "$project":{
         "DIRECT_SECONDS":"$DIRECT_SECONDS",
         "_id":0,
         "GOAL_SECONDS":"$GOAL_SECONDS",
         "JOBCODEID":"$JOBCODEID",
         "REPORT_DATE":"$REPORT_DATE",
         "REPORT_DATE_LOCAL":"$REPORT_DATE_LOCAL",
         "WH_ID":"$WH_ID",
         "Priyanka":{
            "$sum":[
               "$GOAL_SECONDS"
            ]
         }
      }
   },
   {
      "$merge":"temp_21.1-MultiSeriessuperWidget_1594143071494"
   }
]

If I execute this pipeline in a 3rd party tool such as 3T, it works perfectly. It creates the collection and writes the data.

Also the driver 3.3 release notes says that functionality for $merge has been added.

Help is appreciated

Hi @Gerry_Brady,

It would be helpful for others to read and respond to your question if you could apply the proper code formatting to your post.

Cheers,

Jamie

Thanks Jamie, will do for next time.

Anyway, i figured out the issue and it was me! nothing to do with the library or syntax. The requirement was to append the new $merge stage when required.

When I did this, I inadvertently modified the query, hence it produced no results.

Sorry about this.

2 Likes

Glad you figured it out and thanks for sharing the solution!

Hi…I am having this same issue. Can you please tell me what you mean by th requirement to append the new merge stage when required? I have mine in the last stage of the pipeline and it works fine in Studio3T.

// Pipeline
	[
		// Stage 1
		{
			$match: {
			category: "project",
			            isTemplate: {$ne: true},
			            bPortfolio: true,
			            customerId: "1",
			            archived: {$exists: false},
			            deleted: {$exists: false}
			}
		},

		// Stage 2
		{
			$lookup: // Equality Match
			{
			from: "tasks", "localField":"_id", foreignField: "project_id", as: "project_tasks"
			              
			}
			
			// Uncorrelated Subqueries
			// (supported as of MongoDB 3.6)
			// {
			//    from: "<collection to join>",
			//    let: { <var_1>: <expression>, …, <var_n>: <expression> },
			//    pipeline: [ <pipeline to execute on the collection to join> ],
			//    as: "<output array field>"
			// }
		},

		// Stage 3
		{
			$unwind: {
			    path : "$project_tasks"
			}
		},

		// Stage 4
		{
			$match: {
			"project_tasks.deleted": {$exists: false}, 
			"project_tasks.dashHide": false
			}
		},

		// Stage 5
		{
			$project: {
			  _id: 1, portfolioName: "$name", portfolioStatus: "$status", 
			        portfolioDescription: "$description", portfolioDashHide: "$dashHide", customerId: "$customerId",
			        locations: "$locations", creator: "$creator", collaborators: "$collaborators", 
			        project_tasks: "$project_tasks",
			        numberOfAddresses: {$cond: { if: {$isArray: "$project_tasks.addresses" }, then: {$size: "$project_tasks.addresses"}, else: 0}}
			}
		},

		// Stage 6
		{
			$match: {
			    numberOfAddresses: {$gt: 0}
			}
		},

		// Stage 7
		{
			$unwind: {
			    path : "$project_tasks.addresses"
			}
		},

		// Stage 8
		{
			$project: {
			_id:1, portfolioName:1, portfolioStatus: 1, portfolioDescription: 1, portfolioDashHide: 1, customerId: 1, locations: 1, creator: 1, collaborators: 1, 
			                  "addressTask._id": "$project_tasks._id", "addressTask.card_id": "$project_tasks.card_id", "addressTask.dashHide": "$project_tasks.dashHide",
			                  "addressTask.tableDashHide": "$project_tasks.tableDashHide", "addressTask.position": "$project_tasks.position",
			                  "addressTask.color": "$project_tasks.color", "addressTask.name": "$project_tasks.name", "addressTask.markerSize":
			                    "$project_tasks.markerSize", "address.name": "$project_tasks.addresses.name", "address.cleanAddress": "$project_tasks.addresses.cleanAddress",
			                  "address.coordinates": "$project_tasks.addresses.coordinates", "address.addressComponents": "$project_tasks.addresses.addressComponents",
			                  "address.project_id": "$project_tasks.addresses.project_id", "address.collectionId": "$project_tasks.addresses.collectionId",
			                  "address.geoDone": "$project_tasks.addresses.geoDone"}
		},

		// Stage 9
		{
			$lookup: // Equality Match
			{
			    from: "projects", localField: "address.project_id", "foreignField": "_id", as: "projectAddress"
			    }
		},

		// Stage 10
		{
			$unwind: {
			    path : "$projectAddress"
			}
		},

		// Stage 11
		{
			$project: {
			_id: 1, portfolioName: 1, portfolioStatus: 1, portfolioDescription: 1, portfolioDashHide: 1, customerId: 1, locations: 1, creator: 1, collaborators: 1, 
			                  addressTask: 1, "address.name": "$address.name", "address.cleanAddress": "$address.cleanAddress",
			                  "address.coordinates": "$address.coordinates", "address.addressComponents": "$address.addressComponents",
			                  "address.project_id": "$address.project_id", "address.collectionId": "$address.collectionId",
			                  "address.geoDone": "$address.geoDone", "address.csvFields": "$projectAddress.csvFields",
			                        "address.csvFieldsArray": "$projectAddress.csvFieldsArray", "address.projectName": "$projectAddress.name"
			
			}
		},

		// Stage 12
		{
			$group: {
			_id: {portfolioId: "$_id", customerId: "$customerId", portfolioName: "$portfolioName", portfolioStatus: "$portfolioStatus", 
			    locations: "$locations", creator: "$creator", collaborators: "$collaborators", 
			                    portfolioDescription: "$portfolioDescription", portfolioDashHide: "$portfolioDashHide", addressTask: "$addressTask"},
			                  "addresses": {$push: "$address"}}
		},

		// Stage 13
		{
			$project: {
			"_id": "$_id.portfolioId", "name": "$_id.portfolioName", "status": "$_id.portfolioStatus", customerId: "$_id.customerId", 
			            locations: "$_id.locations", creator: "$_id.creator", collaborators: "$_id.collaborators", 
			                  "description": "$_id.portfolioDescription", "dashHide": "$_id.portfolioDashHide", "addressTask._id": "$_id.addressTask._id",
			                  "addressTask.card_id": "$_id.addressTask.card_id",
			                  "addressTask.dashHide": "$_id.addressTask.dashHide", "addressTask.tableDashHide": "$_id.addressTask.tableDashHide",
			                  "addressTask.position": "$_id.addressTask.position", "addressTask.color": "$_id.addressTask.color",
			                  "addressTask.name": "$_id.addressTask.name", "addressTask.addresses": "$addresses"}
		},

		// Stage 14
		{
			$group: {_id: {"_id": "$_id", customerId: "$customerId", name: "$name", status: "$status", description: "$description", dashHide: "$dashHide",
			locations: "$locations", creator: "$creator", collaborators: "$collaborators"},
			"addressTasks": {$push: "$addressTask"}}
		},

		// Stage 15
		{
			$project: {
			    _id: "$_id._id", name: "$_id.name", status: "$_id.status", description: "$_id.description", dashHide: "$_id.dashHide",
			                  customerId: "$_id.customerId", locations: "$_id.locations", creator: "$_id.creator", collaborators: "$_id.collaborators", addressTasks: 1
			}
		},

		// Stage 16
		{
			$merge: {
			     // The $merge operator must be the last stage in the pipeline
			     into: "portfolioAddresses",
			     //on: <identifier field> -or- [ <identifier field1>, ...],  // optional
			     //let: <variables>,                                         // optional
			     whenMatched: "replace",  // optional
			     whenNotMatched: "insert"                     // optional
			}
		},

	]

Hi Jane,

Not sure how much help I can be but I can explain to you where I went wrong.

The requirement of what I was developing was to add a $merge stage at the end of pipelines in certain circumstances. Basically there are situations where we want to write the query results to the db rather than returning the results to user.

As I mentioned in the post I am coding in node.js. My mistake was that once I added the $merge stage I would stringify the entire pipeline then parse it before executing the query. What this did was turn my dates from objects to strings, but the parsing does not do the reverse. So my dates stayed as strings

The effect was that there was no data being returned by the query, The reason that it worked for me in 3t was because I manually adjusted the dates when transposing the query to 3T.

The way I finally realized that this was happening was by manually hard-coding a small query in the code (basically just a limit of 10 documents on a collection) to be executed with a $merge. Once I saw that this worked, i quickly realized my error. The stringify.

In your place I would try the same. Maybe keep your $match, then hard code a $limit followed by a $merge. And see what happens.

By the way when you execute in your code do you get an error? Does the collection get created?

Let me know
Gerry

1 Like

Hi Gerry,

Thanks for that info!

It doesn’t throw an error. And I added a test method which returns the records right before the merge instead of adding the merge and the records do get returned so that’s what is really confusing me. And it works in Studio3T.

I will try your suggestions and let you know what I see.

Thanks again!
Jane

Hmmm…intersting.

Here are a couple of thoughts.

  1. if you are writing this in node.js, you have to use a method on your query call like .toArray() or .forEach() after your pipeline. This is used as a delimiter by mongo to know that they query is ended. You don’t need a callback but depending what you’re doing you may end up in a race condition if its not there. If you are not using node.js maybe there is something similar?

  2. Are you using a replica set? One important thing to know is even if your read preference is readSecondary, the $merge will write to your primary. There is no choice here. However, if you are trying to read your data immediately after insert, the read(with readPreference secondary) happens on the secondary. If you are running this programmatically, the read could be happening so fast that the data has not had time to replicate. 3t will write to primary but also could be reading primary as well.

We resolved this by setting read preference to primary on the occasions where used $merge. I am pretty sure you could also use writeconcern to ensure data is written to all replicas before you read.

  1. This may be really stupid and I doubt is the issue but in the query you posted there was an extra comma in your pipeline. But I am pretty sure this would have thrown an error.

Please let me know if any of this helps. If not and you have more to add, maybe I can help

Gerry

2 Likes

Hi Gerry,

You’re first thought was right…I wasn’t using a toArray in my NodeJS code because I wasn’t looking for the output at that time. I just used an async / await call. I added in the toArray and it works great.

Thanks for your help!
Jane

3 Likes

2 posts were split to a new topic: Problem with $merge using migrate-mongo