Lab - Using $lookup / Is my approach wrong?

In the homework:
Chapter 3: Core Aggregation - Combining Information
Lab - Using $lookup

Using this as a solution:

db.air_routes.aggregate([
{$match:{$or:[{airplane:{$eq:“380”}},
{airplane:{$eq:“747”}}]}},
{$lookup:{
from: “air_alliances”,
localField: “airline.name”,
foreignField: “airlines”,
as: “AlNm”
}},
{$match:{AlNm:{$elemMatch:{$exists:true}}}},
{$group:{_id:"$AlNm.name",AlCnt:{$sum:1}}}
])

I get an answer (maximum value for AlCnt), but I am told this is wrong. Why is that? I hope someone can’t point out my mistake?

Second and side question:
This homework has 3 possible answers and 3 trials.
Trying all possibilities without even thinking will inevitably lead to the solution. Am I missing something? (For example because my browser wouldn’t show it all?)

tl;dr

airplane:/380|747/,

Overall, the query is quite alright. But this:

Better be:

airplane:{$in:["380", "747"]},

In both cases though. we will have trouble. Let’s type this command: db.air_routes.find({},{_id:0, airplane:1}) to look at the data:

We get

{ "airplane" : "CR2" }
{ "airplane" : "142 146" }
{ "airplane" : "E70 CRJ M87" }
...

Planes are a string, not an array.

$in won’t go through the string word by word.

Still we could:

  • split the string, and get an array, and use $in

If we don’t split the string, something like this:
"CR2 747 380" won’t be matched.

The general, and better approach is to use regex, like so:

airplane:/380|747/,

I’m not saying this solves the query, just some ideas.

3 Likes

I see. This is interesting, I had indeed not looked that closely to the shape of the data. I was only thinking of an array and took it for granted. Well, I was wrong. Thank you for pointing out the miss.

@Michel_Bouchet, I have tried the same way you did and seems my approach is wrong. I am getting 10,8,5 planes for the given airlines in the same way you are getting. Are we missing anything?

Yes we are missing something. Santiago_Miranda rightly pointed out the mistake, read his answer above, you will see where the issue is. We are dealing with a string, not an array.

1 Like

Hello,

I do think the shape of the data in airplane e.g. “747 380”, should have been mentioned in the Lab notes, since once a person goes down the $in route, it doesn’t matter what they do, their aggregation pipeline is wrong.

They will never see the documents they are missing and their totals will be wrong.

Of course if it’s a lesson in beware your data, then fine, but I am not sure that should be the point.

PS - I know there were only three possible answers, and you had three try’s but no one likes to get things wrong.

Hi @NMullins!

You can use the @ to call in any of the content engineers. The course will probably be updated soon though. But I guess there is enough statistics (users reporting this problem) to argue you are right.

Hello,

I am not sure who would be the appropriate person to @ in these cases, especially for a course that is about to be refreshed.

This is the second time I have undertaken this particular course, and I try to avoid looking at my previous answers, but I fell foul of this regex search this time, I went with $in this time, and regex last time, probably because I spotted the slight data complexity with air_routes.airplane previously.

Try to warn the user of any “Data shape” gotcha’s, in any future labs.

Cheers :slight_smile:

The one I remember is @Sonali_Mamgain (sorry!). You can look em up in the users table under ☰, or as answerers on many questions.

Have a good day :slight_smile: