Ticket: Faceted Search

RE: pytest -m facets

I got the following error when running “pytest -m facets”, which invokes the function get_movies_faceted(filters, page, movies_per_page) in db.py, and I wonder how to fix it.

E pymongo.errors.OperationFailure: BSONObj size: 65162038 (0x3E24B36) is invalid. Size must be between 0 and 16793600(16MB) First element: runtime: [ { _id: 0, count: 2312 }, { _id: 60, count: 11448 }, { _id: 90, count: 24255 }, { _id: 120, count: 4768 }, { _id: “other”, count: 3231 } ]

The error is apparently caused by the following lines (I don’t know why allowDiskUse=True didn’t help):

facet_stage = {
    "$facet": {
        "runtime": [{
            "$bucket": {
                "groupBy": "$runtime",
                "boundaries": [0, 60, 90, 120, 180],
                "default": "other",
                "output": {
                    "count": {"$sum": 1}
                }
            }
        }],
        "rating": [{
            "$bucket": {
                "groupBy": "$metacritic",
                "boundaries": [0, 50, 70, 90, 100],
                "default": "other",
                "output": {
                    "count": {"$sum": 1}
                }
            }
        }],
        "movies": [{
            "$addFields": {
                "title": "$title"
            }
        }]
    }
}


pipeline = []
pipeline.append(facet_stage)

movies = list(db.movies.aggregate(pipeline, allowDiskUse=True))[0]

I guess it has been discussed here

It took me a while to understand; there is a very confusing line of code in the ticket.
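For anyone else hitting this, here is a minimal sketch of what seems to go wrong. It assumes the ticket's starter code has already added the $match and $sort stages to pipeline before the facet stage is appended (that part is not shown in this thread, and the sort key below is made up). Reassigning pipeline = [] throws those stages away, so $facet runs over the whole collection. $facet returns all of its sub-pipeline results in a single output document, and that document is bound by the 16 MB BSON limit, which allowDiskUse does not lift.

```python
# Assumption: earlier code in get_movies_faceted built these two stages.
match_stage = {"$match": {"cast": {"$in": ["Tom Hanks"]}}}
sort_stage = {"$sort": {"title": 1}}  # hypothetical sort key, for illustration
facet_stage = {"$facet": {"movies": [{"$addFields": {"title": "$title"}}]}}


def stage_names(pipeline):
    """Return the operator name of each stage, e.g. ['$match', '$facet']."""
    return [next(iter(stage)) for stage in pipeline]


# With the reset: the $match and $sort stages are discarded.
pipeline = [match_stage, sort_stage]
pipeline = []
pipeline.append(facet_stage)
with_reset = stage_names(pipeline)

# Without the reset: $facet only sees the matched movies.
pipeline = [match_stage, sort_stage]
pipeline.append(facet_stage)
without_reset = stage_names(pipeline)

print(with_reset)     # ['$facet']
print(without_reset)  # ['$match', '$sort', '$facet']
```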


Thanks a lot for your reply.

After I removed the line pipeline = [] and ran “pytest -m facets” again, the BSONObj size error mentioned above was gone, but other errors appeared:

=================================== FAILURES ===================================
_________ test_faceted_search_should_return_rating_and_runtime_buckets _________

client = <FlaskClient <Flask ‘mflix.factory’>>

@pytest.mark.facets
def test_faceted_search_should_return_rating_and_runtime_buckets(client):
    filter = {'cast': ['Tom Hanks']}
    result = get_movies_faceted(filter, 0, 20)
    # expecting the first entry in the returned tuple to be a dictionary with
    # the key 'movies'
    assert len(result[0]['movies']) == 20

E AssertionError: assert 51 == 20
E + where 51 = len([{’_id’: ObjectId(‘573a1399f29313caabcee607’), ‘awards’: ‘Won 6 Oscars. Another 40 wins & 47 nominations.’, ‘cast’: [’…’: [‘Leonardo DiCaprio’, ‘Tom Hanks’, ‘Christopher Walken’, ‘Martin Sheen’], ‘countries’: [‘USA’, ‘Canada’], …}, …])

tests/test_facets.py:20: AssertionError
----------------------------- Captured stdout call -----------------------------

________________ test_faceted_search_should_also_support_paging ________________

client = <FlaskClient <Flask ‘mflix.factory’>>

@pytest.mark.facets
def test_faceted_search_should_also_support_paging(client):
    filter = {'cast': ['Susan Sarandon'], }
    result = get_movies_faceted(filter, 3, 20)
    assert(len(result[0]['movies'])) == 3

E AssertionError: assert 63 == 3
E + where 63 = len([{’_id’: ObjectId(‘573a13b8f29313caabd4d633’), ‘awards’: ‘11 nominations.’, ‘cast’: [‘Emile Hirsch’, ‘Nicholas Elia’, …cast’: [‘Susan Sarandon’, ‘Geena Davis’, ‘Harvey Keitel’, ‘Michael Madsen’], ‘countries’: [‘USA’, ‘France’], …}, …])

tests/test_facets.py:32: AssertionError

From your error output, I see you got 51 records instead of 20. To fix this, add the skip and limit stages to your pipeline and try again.
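A rough sketch of the stage order, in case it helps. The helper names build_facet_pipeline and expected_page_size are made up for illustration, and so is the sort key. The skip value of page * movies_per_page is an assumption inferred from the two failing tests (page 0 expects 20 movies, page 3 of 63 matches expects 3); it is not taken from the ticket's code.

```python
def build_facet_pipeline(cast, page, movies_per_page, facet_stage):
    """Assemble the stages in the order match, sort, skip, limit, facet."""
    match_stage = {"$match": {"cast": {"$in": cast}}}
    sort_stage = {"$sort": {"title": 1}}  # hypothetical sort key
    skip_stage = {"$skip": page * movies_per_page}  # assumes zero-based pages
    limit_stage = {"$limit": movies_per_page}
    return [match_stage, sort_stage, skip_stage, limit_stage, facet_stage]


def expected_page_size(total_matches, page, movies_per_page):
    """Number of documents left for a page after skipping the earlier pages."""
    remaining = total_matches - page * movies_per_page
    return max(0, min(movies_per_page, remaining))


facet_stage = {"$facet": {"movies": [{"$addFields": {"title": "$title"}}]}}
pipeline = build_facet_pipeline(["Susan Sarandon"], 3, 20, facet_stage)

print(pipeline[2])                    # {'$skip': 60}
print(expected_page_size(63, 3, 20))  # 3
print(expected_page_size(51, 0, 20))  # 20
```

Because skip and limit come before $facet, the "movies" facet only ever receives one page of documents, which is what the two tests check.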


Thank you.

After changing the code to the following, it seems to work:

# pipeline = []
pipeline.append(skip_stage)
pipeline.append(limit_stage)
pipeline.append(facet_stage)

The issue is resolved. Thanks a lot!


You made my life easy! Thanks!

const queryPipeline = [
  matchStage,
  sortStage,
  skipStage,
  limitStage,
  facetStage
]
