Lab: Using Cursor-like Stages

I have completed a few courses already, but with this one I am really struggling over minor details.

When I run the following code, I expect one of the four movies listed to be the first result.
That is not the case. Could somebody point me in the right direction, please?

 db.movies.aggregate([
{ $match : {  cast: { $exists: true, $not: {$size: 0} } , "tomatoes.viewer.rating": {$gte:3} , countries: { $elemMatch : {$eq: "USA"} } }  }  ,
{ $project : {	_id: 0, 
		cast: 1,
		title: 1 ,
		countries : 1 ,
		rating : "$tomatoes.viewer.rating" ,
		num_favs : { $size : {$setIntersection: [  [
							  "Sandra Bullock",
							  "Tom Hanks",
							  "Julia Roberts",
							  "Kevin Spacey",
							  "George Clooney"] , "$cast" ] } }
	}
} ,
{ $sort : {num_favs: -1  , "tomatoes.viewer.rating":-1, title:-1}  } ,
{ $project : {title:1, countries:1, rating: 1,  num_favs: 1}} ,
{ $skip : 24 }

])

The issue is in your match statements.
Tips:

  1. Create a variable favorites and assign all your favorite cast members to it.
  2. Then match cast against the favorites list to keep only movies with at least one cast member from your list;
    you can use "cast": {$in: }
  3. Add the remaining conditions to your match.
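
The tips above could be sketched roughly as follows. The variable name favorites comes from the tip; the name pipeline and the commented-out shell call are assumptions about how you would run it, not something confirmed in this thread.

```javascript
// Favorite actors kept in one variable so the list is written only once.
const favorites = [
  "Sandra Bullock",
  "Tom Hanks",
  "Julia Roberts",
  "Kevin Spacey",
  "George Clooney"
];

// Keep US releases rated 3 or higher that have at least one favorite in the cast.
// An equality condition on an array field matches when any element equals the value.
const pipeline = [
  {
    $match: {
      "tomatoes.viewer.rating": { $gte: 3 },
      countries: "USA",
      cast: { $in: favorites }
    }
  }
];

// In the mongo shell (assumes a `movies` collection in the current database):
// db.movies.aggregate(pipeline).itcount()
```

Passing the variable itself (no quotes, no $ prefix) embeds the array literally in the stage.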

I hope you will find the answer :)

Thanks for your reply, appreciated.
I don't understand why I would need an extra filter requiring at least one cast member from my list.
OK, I would get fewer results, but if I sort by num_favs, rating, and title, shouldn't my top results be exactly the same?
Regards

@Henk_81354

Let's take this as a good place to hone your debugging skills. ;)

Break your pipeline down into each stage, and run each one, one at a time. Start with the first one – your $match – alone. Does that give you what you expected? Then add your next one – your $project – and run the two together. Again, does that look like what you expect? And so on.
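
One way to do this without retyping the pipeline each time is sketched below. The helper firstStages is hypothetical, and the three stages are an abbreviated stand-in for a real pipeline, not the exact one from the first post.

```javascript
// Abbreviated example stages (assumed for illustration).
const pipeline = [
  { $match: { "tomatoes.viewer.rating": { $gte: 3 }, countries: "USA" } },
  { $project: { _id: 0, title: 1, rating: "$tomatoes.viewer.rating" } },
  { $sort: { rating: -1, title: -1 } }
];

// Hypothetical helper: the first n stages, capped with a $limit so the output stays short.
function firstStages(stages, n, limit = 3) {
  return stages.slice(0, n).concat([{ $limit: limit }]);
}

// In the mongo shell (assumes a `movies` collection), inspect each prefix in turn:
// for (let n = 1; n <= pipeline.length; n++) {
//   printjson(db.movies.aggregate(firstStages(pipeline, n)).toArray());
// }
```

Looking at a handful of documents after every stage shows which fields still exist at that point, which is often where the surprise is.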

I think if you do that, you will quickly see why you’re not getting the result you expected. HTH.

Post back here if you still have problems.

I have similar thoughts to what @Henk_81354 was describing. It's quite frustrating, and yes, debugging in stages is a regular approach, but it still isn't allowing me to see the error. I would appreciate some insight.

Possibly I'm not getting a match on the cast.

db.movies.aggregate([
  {
    $match: {
      "tomatoes.viewer.rating": {$gte: 3},
      "countries": {$eq: "USA"},
      cast: {$in: [
        "Sandra Bullock",
        "Tom Hanks",
        "Julia Roberts",
        "Kevin Spacey",
        "George Clooney"]
      }
    }
  }
]).itcount()

gives 122, but when I run it as part of a longer pipeline, i.e.

var pipeline = [
  {
    $match: {
      "tomatoes.viewer.rating": {$gte: 3},
      "countries": {$eq: "USA"},
      cast: {$in: ["Sandra Bullock",
        "Tom Hanks",
        "Julia Roberts",
        "Kevin Spacey",
        "George Clooney"]
      }
    }
  },
  {
    $addFields: {
      num_fav: {
        $setIntersection: [ "$favorites", "$cast" ]
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "countries": 1,
      "tomatoes.viewer.rating": 1,
      "num_fav": {
        $size: {
          $ifNull: [
            "$num_fav",[{value: 0}]]
        }
      },
      "cast": 1
    }

and so on, the itcount() is 15325

All righty, a bit of a foolish error: pipeline wasn't updated in the shell from an old try. Now at least the count matches, but the intersection doesn't.

Specifically, I see Gravity in the first few results, which has both Clooney and Bullock, but the value of num_favs is 1.
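
One possible explanation, assuming favorites is a shell variable and not a field in the movie documents: the string "$favorites" is read as a field path, the field is missing, $setIntersection then yields null, and the $ifNull fallback [{value: 0}] is an array with one element, so $size reports 1 for every movie. The plain JavaScript below only imitates that behaviour; setIntersection and ifNull are simplified stand-ins written for this illustration, and the Gravity document is abbreviated.

```javascript
// Simplified stand-ins for the server-side operators (illustration only).
function setIntersection(a, b) {
  if (a == null || b == null) return null; // a null or missing argument gives null
  return [...new Set(a)].filter((x) => b.includes(x));
}
function ifNull(value, fallback) {
  return value == null ? fallback : value;
}

const favorites = ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"];

// Abbreviated, hypothetical document: note there is no `favorites` field.
const gravity = { title: "Gravity", cast: ["Sandra Bullock", "George Clooney"] };

// "$favorites" as a field path: missing field, so the one-element fallback is counted.
const asFieldPath = ifNull(setIntersection(gravity.favorites, gravity.cast), [{ value: 0 }]).length;

// The shell variable passed directly, with an empty array as the fallback.
const asVariable = ifNull(setIntersection(favorites, gravity.cast), []).length;

console.log(asFieldPath, asVariable); // 1 2
```

If that is what is happening, writing $setIntersection: [ favorites, "$cast" ] and using [] as the $ifNull fallback would be the corresponding change in the pipeline.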

Hi, thanks for your reply.

I understand that the number of favorites AND tomatoes.viewer.rating AND title need to be sorted descending, so number of favorites from 2 to 0, rating from 10 to 1, and title from Z to A.

The country of release should be USA, but that does not mean ONLY USA. So as long as USA is in the list of countries, that's fine. tomatoes.viewer.rating should be equal to 3 or higher.

I also made sure that cast exists and that there is at least one element in the array; otherwise there is no point in comparing the cast with the favorites array.

Debugging-wise, I did the following.
As a last stage, I have put in another match, so I can double-check my findings:

db.movies.aggregate([
{ $match : {  cast: { $exists: true, $not: {$size: 0} } , "tomatoes.viewer.rating": {$gte:3} , countries: { $elemMatch : {$eq: "USA"} } }  }  ,
{ $project: { _id:0, title:1 , countries:1 ,  "tomatoes.viewer.rating":1, cast: 1 ,  commonToBoth :  {$setIntersection: [  [
							  "Sandra Bullock",
							  "Tom Hanks",
							  "Julia Roberts",
							  "Kevin Spacey",
							  "George Clooney"] , "$cast" ] }     }  }    ,
							  
{ $match : {  .........  } }  ,							  

]).pretty()

So in the last stage of the pipeline, the 2nd $match, I put as follows:

countries: {$size:2} 

It seems I always have USA in my list, sometimes in first and sometimes in second place - check.

"tomatoes.viewer.rating": {$lte:3}

It seems I am only getting movies with rating 3 now - check.

commonToBoth: {$size:2}

My commonToBoth field shows 2 actors as expected (only names which are common to both lists) - check.

So, if I store the size of commonToBoth in num_favs, as in my very first post, and then sort as follows:

$sort : {num_favs: -1  , "tomatoes.viewer.rating":-1, title:-1} 

I would have expected the 25th entry to be the one I need.
I appreciate that I am overlooking something, but seriously, I am blind at the moment :(

Hi,
"num_favs" is not a field in the documents. It is actually the size of the array you get after intersecting favorites and cast. You can avoid projecting cast and countries.

Thanks again for your reply, appreciated.

“num_favs” is not a field in documents. it is actually the size of array, you get after intersecting (favorites and cast).
I am aware of this and that’s exactly what I did - see my very first code in this topic?

You can avoid projecting cast and countries
Of course, but in my latest post I did so for debugging purposes.
This would not affect the outcome though, would it?

Kind regards.

I used this method; however, the count is always 0. Does anyone know why?

{
  favs: {
    $setIntersection : ["$cast",[
      "Sandra Bullock",
      "Tom Hanks",
      "Julia Roberts",
      "Kevin Spacey",
      "George Clooney"]
    ]
  },
  num_favs: {
    "$size": { "$ifNull": [ "$favs", ] }
  }
}

I think your problem is that you refer to "$favs" in the same stage of the pipeline in which it is defined. Set up a new $addFields stage in which you add num_favs.
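
A minimal sketch of that suggestion, under the assumption that the $ifNull fallback is an empty array (the variable names favorites and stages are made up for this example): expressions in a stage are evaluated against the incoming document, so favs only becomes visible to the stage after the one that creates it.

```javascript
const favorites = [
  "Sandra Bullock",
  "Tom Hanks",
  "Julia Roberts",
  "Kevin Spacey",
  "George Clooney"
];

const stages = [
  // Stage 1: compute the intersection and store it as `favs`.
  { $addFields: { favs: { $setIntersection: ["$cast", favorites] } } },
  // Stage 2: `favs` now exists on the incoming document, so "$favs" resolves.
  // The empty-array fallback (an assumption) covers documents without a cast.
  { $addFields: { num_favs: { $size: { $ifNull: ["$favs", []] } } } }
];

// In the mongo shell (assumes a `movies` collection):
// db.movies.aggregate(stages)
```

With both fields in a single stage, "$favs" would point at a field the incoming document does not have, the fallback would always be used, and the size would be 0 for every movie.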

Thanks @Henk_81354, but I realised it's actually some paste issue in Compass. If I paste the array with all the cast names into it, it has some syntax issue, but when I put them all on one line it worked, even without the $ifNull.

Good to hear.
I am still not sure whether you would be able to get the num_favs out the way you do.
I believe you either need to get the size out in a later stage OR do it all in one go, like this:

			num_favs : { $size : {$setIntersection: [  [
							  "Sandra Bullock",
							  "Tom Hanks",
							  "Julia Roberts",
							  "Kevin Spacey",
							  "George Clooney"] , "$cast" ] } }

Yup, I did that in the end, but it only worked when I carefully placed it into the editor. If I just paste the array in, it gives me a syntax error expecting "[" :/

For some reason I had trouble with sorting on “tomatoes.viewer.rating”–it seemed to have no effect. But when I projected to a new property (e.g., “tomatoRating”), I could sort on that with no issue and got the right answer.

Does anyone know why sorting on “tomatoes.viewer.rating” didn’t seem to work?

I did not realize that - and I can’t answer you, but you helped me tremendously with your question.
See my very first question in this topic:

By changing “tomatoes.viewer.rating” to “rating” in the sort stage , I finally get the right answer as well.
I should have seen this before - but that’s the way it goes sometimes…
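
To illustrate why the rename matters in the pipeline from the first post: its $project keeps rating but drops tomatoes, so a later $sort on "tomatoes.viewer.rating" sees a missing value in every document and that key cannot distinguish any of them. The plain JavaScript below imitates this; getPath and sortDesc are simplified, hypothetical stand-ins for $sort, and the three documents are made up.

```javascript
// Documents as they would look after the first $project: rating was renamed.
const projected = [
  { title: "A", rating: 3.1, num_favs: 1 },
  { title: "B", rating: 4.5, num_favs: 1 },
  { title: "C", rating: 3.9, num_favs: 1 }
];

// Resolve a dotted path such as "tomatoes.viewer.rating"; undefined if any part is missing.
function getPath(doc, path) {
  return path.split(".").reduce((v, k) => (v == null ? undefined : v[k]), doc);
}

// Simplified stand-in for $sort with every key descending.
function sortDesc(docs, keys) {
  return [...docs].sort((x, y) => {
    for (const k of keys) {
      const a = getPath(x, k);
      const b = getPath(y, k);
      if (a === b) continue; // equal, or both missing: fall through to the next key
      return a < b ? 1 : -1;
    }
    return 0;
  });
}

const wrong = sortDesc(projected, ["num_favs", "tomatoes.viewer.rating", "title"]).map((d) => d.title);
const right = sortDesc(projected, ["num_favs", "rating", "title"]).map((d) => d.title);

console.log(wrong); // [ 'C', 'B', 'A' ]  rating ignored, ordered by title only
console.log(right); // [ 'B', 'C', 'A' ]  ordered by rating first
```

The same reasoning applies to any field renamed or dropped by an earlier stage: later stages only see the names the previous stage emitted.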

The hints to look at my match statements sent me completely in the wrong direction :(

Yeah, I kind of figured that was your problem, too. :) Glad to help.
