Compound Indexes, prefixes

Hello!

In this lecture: Querying on Compound Indexes Part 2
In the video time interval: 2:15 ~ 3:20

I don’t understand the explanation.

It is said: since a prefix is not used, 50474 documents had to be examined because the last_name_first_name index cannot be used. This seems very strange (and inefficient) to me. Though I do not know the internals of mongoDB. It seems obvious that the searched piece of information could be found by properly looking at the keys first, without any need to pull out the whole document in each case.

For instance if the key {last_name: “Smith”, first_name: “Peter”} is found.
There is no chance this document will satisfy {first_name: “Sonia”}, so it should not be pulled out and read.

Also, though having a prefix would be better, it should still be possible to use the fact that the first name is ordered within a given last name range to perform some optimization and avoid looking at irrelevant documents.

Hi @Michel_Bouchet,

Lets try to understand this in pieces:

  1. When a query uses index, the explain plan will do an IXSCAN i.e despite of going through each document, only index keys will be examined in order to fetch the correct document.

If we have created an index {"last_name":1, "first_name": 1} on our collection and issue the following query:

db.test_collection.find({“last_name”: “Solomon”})

This query uses index prefix {"last_name": 1} to find the result. IXSCAN will happen here and only last_name key will be examined to fetch the result.

  1. When we are trying to run the query: db.test_collection.find({"first_name": "Sonia"})
    This query is not using any index so there will be a collection scan ie all the documents will be scanned in order to fetch the results. No index key will be scanned here as we do not have any index on first_name field.

Edit: Refer to this documentation link:

Please feel free to reach out if you have any questions.

Kind Regards,
Sonali

1 Like

Thank you for this explanation, but I am not sure you have read the question.
I know, because explained in the lecture how things happen.

I was only making a point that even without having a prefix it should still be possible to use the information available to avoid performing a brute force scan on all the documents.

Is this idea actually applied? I don’t know.

Not an answer, but I’d like to reformulate your doubt, just to double check we understand it @Michel_Bouchet

Imagine you create an index in firstname, lastname, and the documents of the index collection are:

//document 1
----FN1----
    ----LN1----
    ----LN2----
    ----LN3----
//document 2
----FN2----
    ----LN1----
    ----LN2----
    ----LN3----
// etc...

Each document, has also _id information to fetch the document (in case this is needed).

  • If we look db.coll.find({"first_name": "FN1"}, {"first_name":1,_id:0}) this is a good case scenario: query is covered.

  • If we run db.coll.find({"first_name": "FN1"}) there will be index scan and fetch.

That’s all clear. But…

db.coll.find({"last_name": "LN1"}) , why can’t this return documents with name FN1, FN2.

I just would remind that compound indexes with 32 keys are allowed: programming that flexibility you ask for would be a mess. Also you can expert developers forum.

I see, you may be right; that could be an interesting problem to think about though.

1 Like

Sure it would!

Hello Everybody,

@Michel_Bouchet. @Santiago_Miranda @Sonali_Mamgain

In same lecture, minute 3:36, it is said :

" If your application has two queries and one uses fields that are a subset of the other, you should build an index where one query uses the index prefix and the other uses all fields of the index. "

I’m not sure that I completly inderstund this, if I want to illustrate this by the following :

image

Field 2 is a subset of Field1,
We have to build an Index like showen in picture to have a good performance ratio when searching Field2 by using index prefix which is part of our Index

Thanks to clarify to me

Regards
Otman