Cursor iteration is very slow in pymongo

Yogita_Pal · March 17, 2020, 11:35am

My MongoDB query returns a command cursor. Iterating over the cursor to get the data is very slow: it takes around 90 seconds for 250 records. Can someone help?

I am using pymongo 3.9.0.

logwriter · March 17, 2020, 1:46pm

Hi @Yogita_Pal,

First of all, welcome to the MongoDB Community.
Could you share your code?

All the best,

Rodrigo (a.k.a. Logwriter)

Yogita_Pal · March 17, 2020, 6:01pm

cursor= db.collection.aggregate(
    [
        {
            '$addFields':
                {

                    'yearSubstring': {'$substr': ["$__json.created_on", 0, 10]},
                }
        },
        {
            '$match': {
                '$and': [{'yearSubstring': {'$gte': '12/01/2019'}}, {'yearSubstring': {'$lte': '12/02/2019'}}]
            }
        }

    ])

items = []
# The query itself takes around 4-5 seconds.
while cursor and cursor.alive:
    list_element = cursor.next()
    items.append(list_element)
# But this while loop, which reads each document from the cursor, takes around 100 seconds.

Bernie_Hackett · March 18, 2020, 9:12pm

Just call list(cursor) to create a list of the results.

Yogita_Pal · March 19, 2020, 3:56am

I had already tried that before and tried it again; it takes the same time. What is the standard benchmark for reading 1000 records from a MongoDB server using pymongo 3.9?

Shane · April 9, 2020, 11:20pm

90 seconds for 250 records does not seem normal to me.

"What is the standard benchmark for reading 1000 records from a MongoDB server using pymongo 3.9?"

Assuming these are small documents (<1KB) they can all be returned in a single network roundtrip to the server and the total time should be roughly equal to the network latency.

Although in general the answer depends on a number of factors:

What server are you running against? MongoDB Atlas? If so, what size cluster: Free tier M0, M5, M20, etc?
What is the network latency from the application to the server? 10ms? 500ms?
What is the average size of the returned documents? ~100 bytes, ~1KB, ~1MB?
How long does the server take to satisfy the query? Perhaps the query can be sped up with an index?
Was pymongo installed with the C extensions? These speed up pymongo’s BSON encoding/decoding. You can check with: python3 -c 'import bson;print(bson.has_c())'
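To help answer the latency and server-time questions above, here is a minimal sketch that measures how long it takes to receive the first document versus the whole result set. The helper name time_iteration is hypothetical (it is not part of pymongo), and a plain generator stands in for the real cursor so the snippet runs without a server.

```python
import time
from typing import Any, Iterable, List, Tuple


def time_iteration(cursor: Iterable[Any]) -> Tuple[List[Any], float, float]:
    """Drain an iterable; return (items, seconds to first item, total seconds).

    Hypothetical helper for illustration; works on any iterable,
    including a pymongo cursor.
    """
    start = time.perf_counter()
    items: List[Any] = []
    first_item_seconds = 0.0
    for doc in cursor:
        if not items:
            # Time elapsed until the first document arrived.
            first_item_seconds = time.perf_counter() - start
        items.append(doc)
    total_seconds = time.perf_counter() - start
    return items, first_item_seconds, total_seconds


# Stand-in for a real cursor (assumption: 250 small documents).
fake_cursor = ({"_id": i} for i in range(250))
docs, first_s, total_s = time_iteration(fake_cursor)
print(f"{len(docs)} docs, first after {first_s:.4f}s, all after {total_s:.4f}s")
```

With a real deployment you would pass the cursor returned by db.collection.aggregate(...) instead of fake_cursor. Because a pymongo cursor fetches further batches from the server as you iterate, a large gap between the two timings suggests the time is being spent waiting on the server or the network rather than in the Python loop itself.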

A final note: you can use cProfile (see "The Python Profilers" in the Python documentation) to determine where the CPU time (not I/O time) is being spent:

python3 -m cProfile -s time myscript.py
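
If profiling the whole script is too noisy, the same profiler can be wrapped around just the cursor-reading code. This is a rough sketch using only the standard library; profile_call and read_all are hypothetical names, and an in-memory iterator stands in for the pymongo cursor.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    stream = io.StringIO()
    # Sort by internal time, like "-s time" on the command line.
    pstats.Stats(profiler, stream=stream).sort_stats("tottime").print_stats(10)
    return result, stream.getvalue()


def read_all(cursor):
    """Materialize every item from the cursor into a list."""
    return list(cursor)


# Stand-in for a real cursor (assumption: any iterable works here).
result, report = profile_call(read_all, iter(range(1000)))
print(report)
```

Keep in mind that cProfile only shows where Python spends CPU time. If the report shows very little time overall while the wall-clock time is long, the program is most likely waiting on I/O.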