Query is performing poorly even after using the index

vinodkumar_Mallikarj · October 19, 2020, 2:10pm

I have run the below query

 db.getCollection('COL1').find(
   {  
      "type": "A",
      "dt_tm" : {
         $gte: ISODate("2020-08-17 00:00:00.000Z"),
         $lt: ISODate("2020-09-17 00:00:00.000Z")}
   },
   {_id:0, "type":1, "dt_tm":1 }
)

There is an index on the type and dt_tm column, query uses that index and returned 2886148 records from the IXSCAN stage[Time taken in this index stage is 283ms] , from this stage it goes to PROJECTION_COVERED returning the same records[Time taken in this stage is 82ms].

Then added one more column by name teams in the projection of the above query.

Below is the query

db.getCollection('COL1').find(
   {"type": "A",
      "dt_tm" : {
         $gte: ISODate("2020-08-17 00:00:00.000Z"),
         $lt: ISODate("2020-09-17 00:00:00.000Z")}
   },
   {_id:0, "type":1, "dt_tm":1, "teams":1}
)

Now executed the above query, Mongo uses the same index and returned 2886148 records from the IXSCAN stage[Time taken in this index stage is 771ms].

The time taken is increased here from 283ms to 771ms. Why is this difference since it has used the same index and returning the same record count from the IXSCAN stage…?

Here the FETCH stage is added additionally and it is returning the same records [Time taken is 681ms].

Next is the PROJECTION_SAMPLE stage , giving the same records but time taken is 742ms.

We can see that time taken is increased both in IXSCAN and the PROJECTION_SAMPLE stage, can any one help, why there is increase in the time…?

Prasad_Saya · October 19, 2020, 3:26pm

Hello @vinodkumar_Mallikarj,

In the first query, it was a case of Covered Query. The query was served by the index alone. That means, the index doesn’t have to lookup the documents for any additional details. The query filter and the projection are covered by the index fields.

Here the FETCH stage is added additionally and it is returning the same records

This is because the query had to fetch additional data, the field teams, from the document. The trip to access the document is your additional time and the FETCH stage in the query plan.

kevinadi · October 19, 2020, 10:56pm

Hi @vinodkumar_Mallikarj,

What @Prasad_Saya said is I believe the correct answer.

As an aside, if this is an actual query you use regularly, personally I would reevaluate the need to return 2886148 documents in a single query. It seems to be an excessive number, unless you’re doing a data export. A query returning an excessive amount of documents could interfere with your working set, leading to suboptimal performance in some cases.

Best regards,
Kevin

vinodkumar_Mallikarj · October 20, 2020, 5:49am

Then how to handle of large amount of data with aggregation…?

kevinadi · October 21, 2020, 5:13am

Hi @vinodkumar_Mallikarj

Then how to handle of large amount of data with aggregation…?

I’m not sure I understand what you mean. Are you asking on how to aggregate your data so that you don’t need to fetch millions of documents from the server?

If that is the question, it’s probably best to open a new thread with more details (what you need, example documents, example output, what you have attempted, etc.). I think it’s best to keep one question per thread so it’s not confusing to read, since this thread is all about query indexing.

Best regards,
Kevin

steevej · October 21, 2020, 12:35pm

If I may add, MongoDB university at https://university.mongodb.com/ offers very great courses. One in particular, M121, just for the aggregation framework.

system · October 26, 2020, 12:35pm

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.