Chapter 2 Lab 2 - Aggregation code crashing

Hello,

Using mongo server 3.6.8.
Hopefully I can give an outline of what I am experiencing. I have answered the question correctly, mainly because the answer was at the start of my results set. Anyway I had built a pipeline up, however when I got to add the sort, the final part of my pipeline, mongo kept crashing.

I don’t believe the syntax of the sort was incorrect, but maybe my previous stages was causing a problem, so I will outline the stages, and if ok will give the error message. I appreciate once I saw the answer my pipeline could have dropped the later $match. Anyway:-

$match
$project
$match
$addFields
$sort
$limit

If I put the $sort into the pipeline on the normalisedRating, or even title, I would get the following error. I got the answer, but it annoys me the last part kept crapping out.

assert: command failed: {
“operationTime” : Timestamp(1542735075, 1),
“ok” : 0,
“errmsg” : “cant $subtract adouble from a string”,
“code” : 16556,
“codeName” : “Location16556”,
“$clusterTime” : {
“clusterTime” : Timestamp(1542735075, 1),
“signature” : {
“hash” : BinData(0,“nSIisNE9WJGaV+4+iGK1fWlM/rY=”),
“keyId” : NumberLong(“6595588990781358081”)
}
}
} : aggregate failed
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:403:5
DB.prototype._runAggregate@src/mongo/shell/db.js:260:9
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1212:12
@(shell):1:1

2018-11-20T17:31:22.986+0000 E QUERY [thread1] Error: command failed: {
“operationTime” : Timestamp(1542735075, 1),
“ok” : 0,
“errmsg” : “cant $subtract adouble from a string”,
“code” : 16556,
“codeName” : “Location16556”,
“$clusterTime” : {
“clusterTime” : Timestamp(1542735075, 1),
“signature” : {
“hash” : BinData(0,“nSIisNE9WJGaV+4+iGK1fWlM/rY=”),
“keyId” : NumberLong(“6595588990781358081”)
}
}
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:403:5
DB.prototype._runAggregate@src/mongo/shell/db.js:260:9
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1212:12
@(shell):1:1

So I figured it out.

  1. Dropping the $match before the $addFields, allowed the $sort to work, e.g. not crash.

  2. I also noticed that even though the year and Released are uttered in the question in the same breath, there is another year, which I saw was used. A bit naughty that, however it allowed me to drop the 2nd match.

It would be nice if the question for this lab, was phrased so that the “released” is no longer referred to, if the other year field is the one to use, since it can send you down a rabbit hole.

Hopefully I haven’t given too much away.

From what I see, mongo was not crashing but producing an error message, specifically: ‘cant $subtract a double from a string’.

By dropping the $match you removed the error probably because no documents reaches the $addFields the $substract which contained the error. You probably have one of the number in quotes so mongo think it is a string not a number.

{ ‘$subtract’ : [ ‘1521105’ , 5 ] }
vs
{ ‘$subtract’ : [ 1521105 , 5 ] }

1 Like

The problem was that the addition of the $sort was causing the crash, as soon as that stage was added the pipeline would crash.

So you are thinking that a document was reaching the $sort on normalisedRating and there was a piece of data in there it couldn’t handle.

I think this was a good lesson in the difference between a traditional relational database and a document database.

To think if I had used the year field and not used the released field for my year check, my stress levels would have been lower.

The reason that I refer to it as a crash, is that syntactically my pipeline was correct, and I would have hoped to have the error handled more gracefully. Having scanned the error code, I didn’t see anything that would have pointed me to the problem, and that is always a concern.

Thank you for your thoughts.

I’m having the exact same problem. I figured out how to calculate the normalized rating, all I need is to sort the results, so that I can know which is the lowest rating.
Just by adding a $sort stage, no matter which field I sort by (year, votes, title, anything) I get the exact same crash.
I’m not sure what is going on and how to fix it.
Removing the $match, as you did, seems not the riht solution to me, since there are requirements there that must be fullfilled (languages must contain english, year must be 1990 or above, etc).
Not sure how can I move on now…

What do you mean by crash?

Do you have any error messages?

@ gmauch

If you are seeing the exact same error message, then as the message says

“errmsg” : “cant $subtract adouble from a string”,

Go back and review your ‘normalizeRating’ stage and see that it correctly follows the format given in the handout:

// within a pipeline, it should look something like the following
/*
{
$add: [
1,
{
$multiply: [
9,
{
$divide: [
{ $subtract: [, <x_min>] },
{ $subtract: [<x_max>, <x_min>] }
]
}
]
}
]
}
*/

And notice that, as @ steevej-1495 says, numbers must be entered as numbers (that is, without quotes) and not as strings. HTH.

I didn’t remove the initial $match, I had 2 matches in my initial version of the pipeline, the second $match was performing a check on movies.released once I projected out the year, it was because of misunderstanding (see below).

It was because I was using the wrong field in the initial $match, to get the year value, which was including a piece of data that was blowing out the $sort later on. I initially used the movies.released date, since that contains a year, instead of using movies.year.

Are you using the “movies.year” field in your initial match ? If not use that, once I swapped over then the piece of data that was causing me a problem dropped out, and the $sort stage functioned as expected. Assuming that your initial match restricts on rating and languages as well, you should be fine.

I hope that helps.

PS - I did raise the issue about the potential for misunderstanding about which “year” to use, so hopefully it was resolved for this iteration of the course documents/lab notes.

Yes, I’m receiving the exact same message as above. I checked the number literals and no one is surrounded by quotes.
What strikes me as odd is that I have my aggregate pipeline working perfectly. It’s general outline is:

db.movies.aggregate( [
{$addFields: },
{$project: }
{$match: }
])

This runs as expected and generates nicely looking results without throwing exceptions of any kind. However if I add a simple {$sort:{title: 1}} then I have this "cant $subtract adouble from a string" exception. that seems odd since the $sort is not even by a numeric value. the same exception happens if I sort by year or votes or any other field!
I might be overlooking something, but if there is an error in the pipeline shouldn’t that happen before the $sort is added?

Thanks for the help!
Strange that I’m using the year, I never used the release date.
I added some more information in another reply on this same thread, but basically just adding a $sort will make things not work as expected. If there was a problem in the pipeline, I’d expect it to happen earlier and not just when I add a $sort stage.

The $sort can only work with one data type, so the data in the field has to be of the same type, if not it blows up.

If that isn’t right, I am sure someone will correct me :wink:

@ gmauch

I’m sorry, but I’m confused. Are you working on Lab 2.2 “Bringing it all together”? Why are you using $addFields? And I would suggest that you sort on the normalized rating that you have calculated which is the required parameter, correct?

Clearly some stage before the sort is adding something that the $sort is finding as a problem. I’d suggest that you work backwards here. Take your pipeline and start with just a $sort – I’m pretty sure that will work. Then add your $match and $sort, and so on. When you add a stage that crashes the aggregation, that will tell you where to look.

Hi all,
Can someone please guide me on this. My solution is as follows.
First stage is a $match, where im filtering on

  • language has English
  • year gte 1990
  • imdb.rating gte 1
  • imdb.votes gte 1

Next stage is $project
projecting, title, imdb.rating and scaled_votes where scaled_votes is calculated as the formula given in handout. Ive initalized x_max and x_min with the values given in the handout.
{
$add: [
1,
{
$multiply: [
9,
{
$divide: [
{ $subtract: ["$imdb.votes", x_min] },
{ $subtract: [x_max, x_min] }
]
}
]
}
]
}

Next stage is $addFields
normalized_rating:{$avg:["$scaled_votes","$imdb.rating"]}}

And finally
{$sort: {normalized_rating:1}}

But the answer I get is wrong. Can someone kindly point out the mistake in my approach.

@ sajitmk

As the instructions say “… and it was released in 1990 or after”. Notice that we’re looking for the field released year here, not year.

@ sajitmk

Also, in the future please open a new question – particularly for a new iteration of the course – in a new post. Thanks.

Thank you. That fixed it!

seems you revealed entire solution