Lab - using-cursor-like stages

I am working on the lab "Cursor-like Stages" to find the movie title of the 25th record. My result seems to be one row off from the answer: my 26th record is one of the answer choices in the lab.

The following are my stages. Something seems to be wrong.

  1. Use $match to filter 'cast' for existing, non-empty arrays, and to keep documents with tomatoes.viewer.rating >= 3.
  2. Use $addFields to handle the " (" issue in the 'cast' field. (I found this step is not necessary in this lab: I get the same result with or without it.)
  3. Use $project and $setIntersection to obtain a new field named 'commonCasts', an array containing the cast members that match any element of the array 'favorites'.
  4. Use $addFields to add a new field called 'num_favs', the number of elements in 'commonCasts', i.e. how many favorites appear in the cast field of the movie.
  5. Use $match to retain records with 'num_favs' > 0. (Optional, since we sort later anyway.)
  6. Sort the output of the last step with $sort: {num_favs: -1, 'tomatoes.viewer.rating': -1, title: -1}
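As a rough cross-check of stages 3 to 6 outside the database, the counting and ordering can be sketched in plain JavaScript. The three sample movies, the flattened `rating` field, and the helper names are made up for illustration; only the favorites list comes from the pipeline in this thread.

```javascript
// Favorites list as used in the pipeline; the sample movies below are invented.
const favorites = ["Sandra Bullock", "Tom Hanks", "Julia Roberts", "Kevin Spacey", "George Clooney"];

// Rough equivalent of $setIntersection followed by $size for one document.
function countFavorites(cast, favs) {
  const favSet = new Set(favs);
  // Using a Set drops duplicates, as a set intersection would.
  const common = new Set(cast.filter((name) => favSet.has(name)));
  return common.size;
}

// Comparator mirroring {num_favs: -1, rating: -1, title: -1}.
function compareMovies(a, b) {
  if (a.num_favs !== b.num_favs) return b.num_favs - a.num_favs;
  if (a.rating !== b.rating) return b.rating - a.rating;
  if (a.title === b.title) return 0;
  // Plain string comparison, descending; leading articles are not ignored.
  return a.title < b.title ? 1 : -1;
}

const sample = [
  { title: "Movie A", rating: 3.5, cast: ["Tom Hanks", "Someone Else"] },
  { title: "Movie B", rating: 4.1, cast: ["Sandra Bullock", "George Clooney"] },
  { title: "Movie C", rating: 3.5, cast: ["Julia Roberts"] },
];

const ranked = sample
  .map((m) => ({ ...m, num_favs: countFavorites(m.cast, favorites) }))
  .filter((m) => m.num_favs > 0)
  .sort(compareMovies);

console.log(ranked.map((m) => m.title)); // [ 'Movie B', 'Movie C', 'Movie A' ]
```

This is only a model of the logic, not a replacement for running the pipeline against the real collection.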

Then "The Heat" is my 26th record instead of my 25th. Am I missing anything in the stages above? Any help or hints would be appreciated. Thank you so much.

Does it mean the sort on 'title' needs to ignore leading articles such as 'the' and 'a'?

I do not remember having to do that. Can you share the document of this given movie? We might be able to see what went wrong by seeing the data.

Hi @Qi.Chen,

I have initiated a discourse inbox message with you. Please share your aggregation pipeline so that we can work together to fix it.

Kind Regards,
Sonali

Hi Steevej-1495, You should see my pipeline. I have posted it on the forum. Thank you so much for your help. -Qi

Hi Steevej-1495, The following is my pipeline. (2 stages may be omitted, but I leave them there anyway.) Thank you so much for your help. -Qi

db.movies.aggregate([
{
$match: {tomatoes: {$exists: true}, 'tomatoes.viewer.rating': {$gte: 3}, cast: { $elemMatch: { $exists: true } } }
},

{
   $addFields:  {cast: {$map: {
                  input: "$cast",
                  as: "myCast",
                  in: {$arrayElemAt: [ { $split: [ "$$myCast", " (" ] }, 0 ] }
                            }
                     }, 
              }
},
{
    $project: {_id: 0, 'tomatoes.viewer.rating': 1, cast: 1, title: 1,  
    commonCasts:  {$setIntersection: [['Sandra Bullock', 'Tom Hanks', 'Julia Roberts', 'Kevin Spacey', 'George Clooney'], '$cast']} } 
},
{
   $addFields: {num_favs: {$size: '$commonCasts'}}
},
{
    $match: {num_favs: {$gt: 0}}
},
{
   $sort: {num_favs: -1, 'tomatoes.viewer.rating': -1, title: -1}
},

])

Hi Sonali,
I guess the reply I sent you a while ago is not displayed here because of the 'discourse inbox'. You should be able to see my pipeline code. I am sorry, I do not know how to create the discourse inbox. Your help is much appreciated.
-Qi

This is a slice of the command db.movies.findOne()

	"cast" : [
		"Jeanne d'Alcy",
		"Georges Méliès"
	],
	"tomatoes" : {
		"viewer" : {
			"rating" : 3.7,
			"numReviews" : 59
		}}
}
  • To start with, let’s re-write $match:

$match: {
// tomatoes: {$exists: true},
"tomatoes.viewer.rating": {"$gte": 3},
"cast": { "$elemMatch": { "$exists": true } }
}

  • tomatoes: {$exists: true} is implied by "tomatoes.viewer.rating". Hence, I commented it out.

  • I've double-quoted everything: if you use quotes, prefer straight double quotes (").

  • From what I see, $map is useless here; it would only be useful for data like

cast:[
"Paul Dirac (nobel of physics 192x)", 
"erwin schrodinger (nobel physics 1926)"
]

But I can’t see that shape anywhere.
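To see what that $map stage would do to data of this shape, here is a small plain-JavaScript analogue of the `$split` / `$arrayElemAt` expression from the pipeline. The function name is hypothetical, and the input mixes the example above with a cast entry from the sample document.

```javascript
// Plain-JavaScript analogue of
// {$arrayElemAt: [{$split: ["$$myCast", " ("]}, 0]} applied to every cast entry.
function stripParenthetical(cast) {
  // split(" (") leaves names without " (" untouched, so element 0 is the full name.
  return cast.map((name) => name.split(" (")[0]);
}

console.log(stripParenthetical(["Paul Dirac (nobel of physics 192x)", "Jeanne d'Alcy"]));
// [ 'Paul Dirac', "Jeanne d'Alcy" ]
```

Names without a parenthetical come out unchanged, which fits the observation that the stage makes no difference on this data.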

  • $project looks good; maybe rewrite "tomatoes.viewer.rating": 1
    as rating: "$tomatoes.viewer.rating" (the later $sort would then have to sort on rating).

  • If you need the 25th element, I’d think of this after sort: {$skip:24},{$limit:1}
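A minimal sketch of that idea, with hypothetical helper names: one function builds the two stages for the n-th document, and the other shows the equivalent operation on an in-memory array that is already sorted.

```javascript
// Builds the two trailing stages that keep only the n-th document (1-based).
function nthStages(n) {
  return [{ $skip: n - 1 }, { $limit: 1 }];
}

// In-memory analogue on an already sorted array.
function nthDocument(sortedDocs, n) {
  return sortedDocs.slice(n - 1, n); // empty array when there are fewer than n documents
}

// Thirty made-up documents, ranked 1 to 30.
const docs = Array.from({ length: 30 }, (_, i) => ({ rank: i + 1 }));

console.log(JSON.stringify(nthStages(25))); // [{"$skip":24},{"$limit":1}]
console.log(nthDocument(docs, 25));         // [ { rank: 25 } ]
```

In the shell, the stages returned by `nthStages(25)` would be appended after the $sort stage of the pipeline.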

You can play around a bit, and come back if necessary…

My 25th record is "Fantastic Mr. Fox", and my 26th record is "The Heat". My 28th record is "Recount". I just wonder what is wrong with my pipeline. Am I missing anything?

I removed //tomatoes:{$exists:true}, and my result is still the same. In the worst case, I will wipe out my pipeline and redo it to see what the result is. Thank you.

Hi Sonali_Mamgain, I have not seen your reply yet. Have you had a chance to read my pipeline code? Do you have any hints for me?

I replaced the single quotes with double quotes. The following is my new code. (The output is still the same.) -Qi

db.movies.aggregate([
{
$match: {“tomatoes.viewer.rating”: {$gte: 3}, cast: { $elemMatch: { $exists: true } } }
},
{
$project: {_id: 0, “tomatoes.viewer.rating”: 1, cast: 1, title: 1,
commonCasts: {$setIntersection: [[“Sandra Bullock”, “Tom Hanks”, “Julia Roberts”, “Kevin Spacey”, “George Clooney”], “$cast”]} }
},
{
$addFields: {num_favs: {$size: “$commonCasts”}}
},
{
$match: {num_favs: {$gt: 0}}
},
{
$sort: {num_favs: -1, “tomatoes.viewer.rating”: -1, title: -1}
},
])

@Qi.Chen

Please, use straight quotes, I can’t test that piece of code. See the difference:

" != ”

The left-hand one is OK. The right-hand one isn't parsed properly. Also,

Is probably unnecessary.

And don’t spend characters signing your name, it’s already on the post :joy:

And this is still missing:

You are missing the requirement: For movies released in the USA
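A hedged sketch of how that requirement might be added, and of why leaving it out can shift a title by one position. It assumes each movie stores its countries in an array field named `countries`; that field name does not appear in the thread, so check it against a real document first. The three sample documents are made up.

```javascript
// Assumption: release countries live in an array field named `countries`.
// Matching a scalar against an array field matches when the array contains it.
const matchStage = {
  $match: {
    countries: "USA",
    "tomatoes.viewer.rating": { $gte: 3 },
    cast: { $elemMatch: { $exists: true } },
  },
};

// Made-up, already sorted results illustrating the off-by-one effect.
const sorted = [
  { title: "First", countries: ["USA"] },
  { title: "Imported", countries: ["France"] },
  { title: "Second", countries: ["USA", "UK"] },
];

// 1-based position of a title in a list of documents.
const positionOf = (docs, title) => docs.findIndex((d) => d.title === title) + 1;

const usaOnly = sorted.filter((d) => d.countries.includes("USA"));

console.log(positionOf(sorted, "Second"));  // 3
console.log(positionOf(usaOnly, "Second")); // 2
```

One non-USA movie sorting ahead of the target is enough to move it from the 25th to the 26th position.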

The problem is not that he is not using straight quotes. The problem comes from the fact that he is not using the appropriate markup. The code and sample documents must be within a block that starts with a line of 3 single back ticks and ends with another line of 3 single back ticks. This is 3 single back ticks ```. Another solution is to use the html code or pre elements.

So @Qi.Chen, please use the appropriate markup so we can test faster and help you faster.
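For code that has already been pasted without the markup, a small helper along these lines could restore straight quotes before testing. This is only a sketch with a hypothetical function name, and it covers just the four common "tilted" quote characters.

```javascript
// Replaces curly double and single quotes with their straight equivalents.
function straightenQuotes(text) {
  return text
    .replace(/[\u201C\u201D]/g, '"')  // left and right double quotation marks
    .replace(/[\u2018\u2019]/g, "'"); // left and right single quotation marks
}

const pasted = "{\u201Ctomatoes.viewer.rating\u201D: {$gte: 3}}";
console.log(straightenQuotes(pasted)); // {"tomatoes.viewer.rating": {$gte: 3}}
```

Posting inside a code block remains the simpler fix, since it prevents the quotes from being altered in the first place.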

Hi Steevej-1485, Thank you very much for the advice. The double-quote (or single-quote) marks became tilted when I copied and pasted my code to the forum. I did not realize the trouble this causes for other readers. I will use pre block markup in the future.

Hi Santiago_Miranda,
Thank you so much for your hints and correction. Your help is very appreciated.

Hi Steevej-1495, That is exactly the root cause. Even though I read the question a few times, I never fully understood that USA should be treated as one of the requirements. I should read the question more carefully in the future. Thank you so much.