Chapter 2: Basic Aggregation - Utility Stages Lab - Bringing it all together

I don’t understand the problem. Calculate an average rating for each movie (…) out of what? Each movie already has its average rating calculated. What values are used to calculate the requested average? If it is that single value, then the result is identical. What is the idea behind normalizing the number of votes? The rating is already normalized to 1-10 and is an average of those votes.

Hi @Sir_Elias,

The normalized_rating is the average of imdb.rating and scaled_votes. You can refer to the lab handouts for the formulae used in the field calculations.

Hint: First use a $match stage to filter the documents as per the given conditions. Then, in the following $project stage, calculate the field values.

Please feel free to reach out if you have any additional questions.

Kind Regards,
Sonali

The question is: what is the point of scaling the votes? If 10 people rate a movie 2.6 (avg) and 120 people rate another movie 4.6 (avg), then that avg score is what I need. Is imdb.rating something other than an average of votes?

I actually need clarification on that. The lab handout ends with:

// given we have the numbers, this is how to calculated normalized_rating
// yes, you can use $avg in $project and $addFields!
normalized_rating = average(scaled_votes, imdb.rating)

Now, does that mean the average of scaled_votes * imdb.rating, or does it mean two averages, one for each? The comma is what is confusing me.

Hi Jason, this comma confused me too. What did you discover?

Hi @Daniel_Carrasco,

The formula in the handouts is for calculating the scaled_votes:

  {
    $add: [
      1,
      {
        $multiply: [
          9,
          {
            $divide: [
              { $subtract: [<x>, <x_min>] },
              { $subtract: [<x_max>, <x_min>] }
            ]
          }
        ]
      }
    ]
  } 

You can use it as is in your pipeline.
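
For intuition, the same min-max scaling can be sketched in plain JavaScript. This is only an illustration of the arithmetic: the function name scaleVotes and the bounds used here are hypothetical, and the real x_min and x_max for the lab come from the handout.

```javascript
// Min-max scaling of a vote count onto the 1-10 range,
// mirroring the $add / $multiply / $divide expression above.
function scaleVotes(x, xMin, xMax) {
  return 1 + 9 * ((x - xMin) / (xMax - xMin));
}

// Hypothetical bounds, for illustration only (not the lab's values).
const xMin = 10;
const xMax = 1010;

console.log(scaleVotes(10, xMin, xMax));   // fewest votes maps to 1
console.log(scaleVotes(1010, xMin, xMax)); // most votes maps to 10
console.log(scaleVotes(510, xMin, xMax));  // halfway between maps to 5.5
```

The point of the scaling is that the raw vote count ends up on the same 1-10 range as imdb.rating, so the two numbers can be averaged meaningfully.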

Now, normalized_rating is the average of scaled_votes and imdb.rating. You can calculate the average using the $avg operator in the $project stage in your pipeline.
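
As a rough sketch, the $project stage could look something like the following. The X_MIN and X_MAX values are placeholders to be replaced with the numbers from the handout, x is taken to be imdb.votes, and the collection name movies and the variable names are assumptions made for illustration. The lab's $match conditions are left out.

```javascript
// Placeholders: substitute the x_min and x_max values from the lab handout.
const X_MIN = 0;       // hypothetical value
const X_MAX = 1000000; // hypothetical value

// Vote count scaled onto 1-10, using the handout formula with x = imdb.votes.
const scaledVotes = {
  $add: [
    1,
    {
      $multiply: [
        9,
        {
          $divide: [
            { $subtract: ["$imdb.votes", X_MIN] },
            { $subtract: [X_MAX, X_MIN] }
          ]
        }
      ]
    }
  ]
};

const projectStage = {
  $project: {
    title: 1,
    // $avg over an array of two expressions returns their mean,
    // i.e. (scaled_votes + imdb.rating) / 2, not their product.
    normalized_rating: { $avg: [scaledVotes, "$imdb.rating"] }
  }
};

// In the mongo shell this stage would follow the lab's $match stage, e.g.
// db.movies.aggregate([ /* $match stage here */, projectStage ])
```

So the comma in average(scaled_votes, imdb.rating) just separates the two values being averaged.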

You can refer to this specific example for more details:

Please feel free to reach out if you have any questions.

Kind Regards,
Sonali

I discovered that, as @Sonali_Mamgain explained below, it is a single average of the two values, scaled_votes and imdb.rating.

Thank you, it’s clear now.