Lesson 3.9 Understanding Explain Part I

In lecture video this is explained --> When we run below and we dont have index, there is COLLSCAN
expRun.find({“last_name”:“Johnson”, “address.state”:“New York”})

Then we create below index,
db.people.createIndex({last_name:1})

And after creating index if we run again below then we have IXSCAN
expRun.find({“last_name”:“Johnson”, “address.state”:“New York”})

Now we create below index,
db.people.createIndex({“address.state”: 1, last_name: 1})

and now if we re-run the below query , after above compound index creation
expRun.find({“last_name”:“Johnson”, “address.state”:“New York”})

then in winning plan, we have stage = IXSCAN and indexname used = “address.state_1_last_name_1”. My question is why this index was used, since in query we have used first column as address.state and second column as last_name?

The index address.state_1_last_name_1 is used as it is more efficient.

Note that the query

{"last_name":"Johnson", "address.state":"New York"}

is exactly the same as

{ "address.state":"New York", "last_name":"Johnson" }

This means the server can determine which documents to return by only looking at the index. In the other case, the server would have to look at all the documents that match the address.state part of the query to determine if the other part of the query is satisfied.

so in this case concept of index prefix is not used ?

In documentation it is mentioned that,

db.products.createIndex( { “category”: 1, “item”: 1 } )

This allows you both options. You can query on just category , and you also can query on category combined with item .
A single compound index on multiple fields can support all the queries that search a “prefix” subset of those fields.

Why do you thing that?

Just like an index on { address.state:1,last_name:1 } can satisfy query on just address.state and query on address.state combined with last_name. This is why

Lets say we have an index on collection as below
db.people.createIndex({“address.state”: 1, last_name: 1})
And we execute below queries,

  1. expRun.find({“last_name”:“Johnson”, “address.state”:“New York”})
  2. expRun.find({“address.state”:“New York”, “last_name”:“Johnson”})
    Both would use same index and both would have same performance

As I wrote earlier:

Same query means same index and same performance.

Hi @Mandar_Kulkarni1

In terms of what @steevej-1495 mentioned, it may be helpful to review the concept of a query shape. This captures the structure of the predicate you are using for your query, particularly the field names. These (the field names) rather than the values are what is useful to the query planner to determine what indexes are possible candidates to fulfil your query.

In terms of better understanding compound indexes, I often point people to my colleague Jesse’s old but still relevant blog post on compound indexes. I’d suggest reviewing this as it may help further clarify your questions.

Kindest regards,
Eoin

1 Like