Chapter 4 : $facet solution is incorrect (despite it gives the correct answer)

If you look at the DB, you’ll see that the number of metacritic with a value of 100 is 11 :

db.movies.find({'metacritic': {$eq: 100}, 'imdb.rating': {$exists: true, $ne: ""}}).count()

In the given solution there is a sort on title and a $limit to 10 :slight_smile:

    {
      $sort: {
        metacritic: -1,
        title: 1
      }
    },
    {
      $limit: 10
    },

By chance this gives the correct answer, if you remove the $sort on the title, it won’t match the expected record, as the needed record is the 11th. If the common value with imdb.rating top10 would have been the last one after the sort, it would have been removed by the $limit.

If you set an imdb.rating of 9.7 to “The Wizard of Oz”, it won’t appear in the result as it will be at 11th position in the metacritic list ordered by the title.

Hello Laurent,

you are probably right. But in order to understand what is being asked, the statement of the problem must be rephrased to:

How many movies in the list of the top ten highest imdb.rating and in the list
of the top 10 highest metacritic are common to both? We should get
these results with exactly one access to the database.
Hint: What is the intersection?

So the problem is about the top 10 records only. This is why the LIMIT 10 is needed for both lists. The sort is vital to get the values from highest to the lowest.

There is an urgent need to rephrase a bunch of problem statements. This is a common complaint.

Jaime :space_invader: :coconut:

1 Like

@jhrozo there are 11 movies with metacritic value of 100, so how do you choose the top10 ?
In the case of the given solution, they choose by ordering with title which is incorrect when you want to AND with the imdb top 10 results.

@Laurent, in fact there are 2 sort statements, one for each list. Each sort statement has 2 arguments. The order of the arguments is crucial to obtain the wanted results. So when you use: {$sort: {metacritic: -1, title: 1}} you are telling MongoDB 2 things:

  1. order first by metacritic in descending order, and then
  2. while keeping the ordered metacritic, order the titles for each metacritic.

Suppose you have the following data set where the first column is the metacritic and the second is the title:

2, aaa
2, xxx
1, vvv
3, fff
1, bbb
4, kkk
4, lll
5, yyy

now, lets apply the first part of the sort: {$sort: {metacritic: -1}}, the resulting sub-set is:

5, yyy
4, lll
4, kkk
3, fff
2, xxx
2, aaa
1, vvv

as you can see, the titles are not sequentially ordered for each metacritic. Here is when the second argument comes into play and its function is to organize in ascending order the group of titles for each metacritic but not the whole set of movies. So the final ordering will be:

5, yyy
4, kkk
4, lll
3, fff
2, aaa
2, xxx
1, vvv

Now, lets apply a LIMIT sentence, say LIMIT 5:

5, yyy
4, kkk
4, lll
3, fff
2, aa

The same internal process happens in order to obtain the top 10 movies with highest imdb.rating.

Jaime :space_invader: :coconut:

The problem here is that there are 11 metacritic values at highest 100, do you see the problem for choosing 10 among 11 (then depending on the title) ?

No problem at all @Laurent_Roger. What is being asked is clear, that is, you must sort by descending metacritic and then organize alphabetically by tittle in ascending order and gather the top 10 documents which eventually would left out of the resulting set a document having high metacitic with a low alphabetically tittle order. Otherwise, if they wanted just the top 10 highest metacritic, the list would be like the one below:

[ { _id: ObjectId("573a13ddf29313caabdb4133"),
    title: 'Best Kept Secret',
    metacritic: 100 },
  { _id: ObjectId("573a13bef29313caabd5c06c"),
    title: 'Boyhood',
    metacritic: 100 },
  { _id: ObjectId("573a1397f29313caabce844d"),
    title: 'Fanny and Alexander',
    metacritic: 100 },
  { _id: ObjectId("573a1396f29313caabce4a9a"),
    title: 'The Godfather',
    metacritic: 100 },
  { _id: ObjectId("573a1396f29313caabce3f2c"),
    title: 'The Conformist',
    metacritic: 100 },
  { _id: ObjectId("573a1395f29313caabce2ae5"),
    title: 'Au Hasard Balthazar',
    metacritic: 100 },
  { _id: ObjectId("573a1395f29313caabce1f51"),
    title: 'The Leopard',
    metacritic: 100 },
  { _id: ObjectId("573a1395f29313caabce1bd0"),
    title: 'Lawrence of Arabia',
    metacritic: 100 },
  { _id: ObjectId("573a1394f29313caabce0808"),
    title: 'Sweet Smell of Success',
    metacritic: 100 },
  { _id: ObjectId("573a1394f29313caabcdf67a"),
    title: 'Journey to Italy',
    metacritic: 100 },
 **>>>   end of the top 10 highest metacritic.**

  { _id: ObjectId("573a1393f29313caabcdbe7c"),  **>>> left out document**
    title: 'The Wizard of Oz',
    metacritic: 100 },

Now, the list ordered by tittle:

[ { _id: ObjectId(“573a1395f29313caabce2ae5”),
title: ‘Au Hasard Balthazar’,
metacritic: 100 },
{ _id: ObjectId(“573a13ddf29313caabdb4133”),
title: ‘Best Kept Secret’,
metacritic: 100 },
{ _id: ObjectId(“573a13bef29313caabd5c06c”),
title: ‘Boyhood’,
metacritic: 100 },
{ _id: ObjectId(“573a1397f29313caabce844d”),
title: ‘Fanny and Alexander’,
metacritic: 100 },
{ _id: ObjectId(“573a1394f29313caabcdf67a”),
title: ‘Journey to Italy’,
metacritic: 100 },
{ _id: ObjectId(“573a1395f29313caabce1bd0”),
title: ‘Lawrence of Arabia’,
metacritic: 100 },
{ _id: ObjectId(“573a1394f29313caabce0808”),
title: ‘Sweet Smell of Success’,
metacritic: 100 },
{ _id: ObjectId(“573a1396f29313caabce3f2c”),
title: ‘The Conformist’,
metacritic: 100 },
{ _id: ObjectId(“573a1396f29313caabce4a9a”),
title: ‘The Godfather’,
metacritic: 100 },
{ _id: ObjectId(“573a1395f29313caabce1f51”),
title: ‘The Leopard’,
metacritic: 100 },
end of the top 10 highest metacritic ordered by title as required

{ _id: ObjectId(“573a1393f29313caabcdbe7c”), >>> left out document
title: ‘The Wizard of Oz’,
metacritic: 100 }

Jaime :space_invader: :coconut:

@jhrozo Then if “The Wizard of Oz” had a imdb.rating of 9.7, do you agree that it wouldn’t be common with the top highest metacritic ? Which is a incorrect answer to the question :wink:

@Laurent_Roger, first, I cannot agree as the highest rating is 100 and there are 11 movies to choose the top 10 among them.

Second, I would agree if instead of 11 movies having 100 rating only 9 have the 100 rating and the next movie with the highest rating is The Wizard of Oz.

Jaime :space_invader: :coconut:

let split this in smaller chunks :slight_smile:

  • Statement 1 : so you agree that there are 11 movies with 100 metacritic
  • Statement 2 : you cannot choose a unique list of top 10 in a list of length equal to 11

This is not the question here.

With the previous list of metacritic of 100 with sort criteria, “The Wizard of Oz” is not in the top 10

  • Statement 3 : “The Wizard of Oz” is not in the top 10 → from your post
  • Statement 4 : If “The wizard of Oz” had a imdb.rating of 9.7, he will be in the imdb.rating top 10

  • Statement 5 : Then “The wizard of Oz” is not in the top 10 of metacritic and in the imdb.rating top 10 despite he has a metacritic of 100.

  • Statement 6 : if you change the sort in metacritic top 10 (reverse order for title for example), then we will have 2 entries as result of movies in both top 10 (which is correct)

What is your opinion on each statement : true or false ?

@Laurent_Roger,

  • Statement 1 : so you agree that there are 11 movies with 100 metacritic
    :white_check_mark: It is a fact from the collection.

  • Statement 2 : you cannot choose a unique list of top 10 in a list of length equal to 11
    :negative_squared_cross_mark: there is not uniqueness at play, just a list that may a have 10 records with the same value for a specific field or a mix of equal and different values in descending order as is the case for the numeric column in
    5, yyy
    4, kkk
    4, lll
    3, fff
    2, aaa
    2, xxx
    1, vvv

  • Statement 3 : “The Wizard of Oz” is not in the top 10 → from your post
    :white_check_mark: This is a fact that comes after the execution of the query.

  • Statement 4 : If “The wizard of Oz” had a imdb.rating of 9.7, he will be in the imdb.rating top 10
    :white_check_mark: I would agree if instead of the 11 movies having a rating of 100 there is only 9 having the 100 rating and the next movie with the highest rating is The Wizard of Oz.

  • Statement 5 : Then “The wizard of Oz” is not in the top 10 of metacritic and in the imdb.rating top 10 despite he has a metacritic of 100.
    :negative_squared_cross_mark: It is not in the top 10 of metacritic query because the title “The Wizard of Oz” is alphabetically greater than “The Leopard”, and it does not appear in the list of the top 10 not even in the list of the top 12 imdb.rating as shown below using the pipeline

    [{$sort: {
    “imdb.rating”: -1}}, {$project: {
    title:1, “imdb.rating”:1
    }}, {$limit: 12
    }]

[ { _id: ObjectId("573a13fbf29313caabdeeea7"),
    title: 'Junjun',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdeedb3"),
    title: 'Lone Wolves',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee970"),
    title: 'I\'m Not Ashamed',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee889"),
    title: 'El faro de las orcas',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee642"),
    title: 'Patchwork',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdeee68"),
    title: 'Interruption',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdeee3b"),
    title: 'Noc Walpurgi',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee7fc"),
    title: 'Hundra�ringen 2',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee7db"),
    title: 'Kaivos',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee77c"),
    title: 'The Lion Guard: Return of the Roar',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee69d"),
    title: 'Killjoy\'s Psycho Circus',
    imdb: { rating: '' } },
  { _id: ObjectId("573a13fbf29313caabdee5e1"),
    title: 'Siccin 2',
    imdb: { rating: '' } } ]
  • Statement 6 : if you change the sort in metacritic top 10 (reverse order for title for example), then we will have 2 entries as result of movies in both top 10 (which is correct)
    :negative_squared_cross_mark: Nope, reversing the sort order for the title in one or the other or in both queries would not change the expected correct result for the problem.

Jaime :space_invader: :coconut:

I can see Laurent’s point here. In my opinion, the limit of 10 should NOT be applied. Rather, one needs to find the 10th rating (from highest), say it is the number R, and then collect ALL documents with rating >= R. Use this for the intersection. “top 10” can include more than 10 documents when there is a tie at 10th place.

The answer is indeed incorrect.
The request is “How many movies are in both the top ten highest rated movies”
The answer takes in account only movies with both ratings, not all movies:
$match: {
metacritic: { $gte: 0 },
“imdb.rating”: { $gte: 0 }
}

1 Like

Hi @Vasant_Shanbhogue @Martin_Terreni,

We understand that the lab statement is not correct given all the edge cases that might sprout up.
However, we are working on revamping the MongoDB Aggregation course to include more relatable and easy-to-understand lab statements, with an improved learning experience for our users.

For this version of the course, I would recommend referencing MongoDB Aggregation documentation to understand more about the stages and operators discussed in the course and practicing the examples given in the documentation.

Thanks for sharing your thoughts!!
Please feel free to reach out if you have any questions.

Kind Regards,
Sonali