Pipeline performance

I wrote a different pipeline from the one in the solution. They both return the same result, but my pipeline is kind of slow.
Below are the two pipelines.
As a reminder: the task was to find movies with a one-word title.


So what’s the performance issue there?

Hi @Motaz_69999,

Interesting observation!

$expr is your bottleneck. I’m guessing you’re noticing this because you’re doing an itcount() on the pipeline (which in itself is slow), but besides that, it appears that there are some undocumented performance issues/limitations with the $expr operator.

In my view, where possible:

  • avoid $expr or use it as a last resort/sparingly
    Or
  • use it with equality conditions on fields that have indexes (multikey indexes are not supported)
  • use it with operators that summarise data, e.g. $avg, $sum, etc.
  • use it when you’re running a query based on pre-aggregated results
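To make the first two points concrete, here is a minimal sketch of the three forms a $match stage can take. The field names (year, spent, budget) are hypothetical and only there for illustration; they are not taken from the lab.

```javascript
// Hypothetical field names (year, spent, budget), for illustration only.

// Plain query predicate: no $expr needed to compare a field with a constant.
const plainMatch = { $match: { year: 1999 } };

// The same condition written with $expr (an equality condition on one field).
const exprEquality = { $match: { $expr: { $eq: ["$year", 1999] } } };

// A typical case for $expr: comparing two fields of the same document.
const exprFieldCompare = { $match: { $expr: { $gt: ["$spent", "$budget"] } } };

console.log(JSON.stringify([plainMatch, exprEquality, exprFieldCompare], null, 2));
```

Where the plain predicate can express the condition, it is usually the simpler choice; the field-to-field comparison is the sort of case where $expr earns its place.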

Explain Plan:
I also ran explain() on both queries, but the solutionPipeline doesn’t yield useful results: in the explain output, the stages.$cursor.query field is empty and the nReturned value is incorrect. This looks like a bug in how the explain results handle $match on a computed field. @Sonali_Mamgain, feedback for your Core Dev colleagues :slight_smile:
The myPipeline however has an additional filter field which is also undocumented. I did report this a while back.

Lastly, with regard to itcount(), you can use the $count stage if you would like a much quicker count.
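As a rough sketch of the difference, the snippet below appends a $count stage to a pipeline. The withCount helper, the "total" field name, the placeholder $match stage and the movies collection name are all assumptions made for illustration; the placeholder is deliberately not the lab pipeline.

```javascript
// Hypothetical helper: returns a copy of the pipeline with a trailing $count
// stage, so the server returns one summary document instead of every match.
function withCount(pipeline, field = "total") {
  return [...pipeline, { $count: field }];
}

// Placeholder pipeline for illustration only (not the lab pipeline).
const examplePipeline = [{ $match: { year: 1999 } }];
const countingPipeline = withCount(examplePipeline);

console.log(JSON.stringify(countingPipeline));

// In the mongo shell, assuming a collection named `movies`:
//   db.movies.aggregate(examplePipeline).itcount()  // iterates the whole cursor on the client
//   db.movies.aggregate(countingPipeline)           // a single document holding the count
```

The helper copies the array rather than mutating it, so the original pipeline can still be run on its own to inspect the matching documents.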

PS: just so you’re aware, forum guidelines don’t permit us to post full or partial solutions to labs, so I’ve renamed the title of your thread to make it less obvious. And if you could trim the solutionPipeline part from your post too, that would be great. :wink:


Hi @Motaz_69999,

A few more additions here:

  1. Both queries return the same number of documents.
  2. The execution stats for the two pipelines differ. In myPipeline, the result is fetched by the query itself, so nReturned equals the final number of documents returned. With solutionPipeline, however, the $project and $match stages are listed later under stages, after the executionStats of the $cursor stage, so the nReturned reported there is not the final count.
  3. The solutionPipeline above is not complete and this is the reason for empty $cursor.query.

Please let me know if you have any questions.

Thanks,
Sonali