I have a question about the contents of the key-value pairs that MongoDB generates from collection and index data when I call db.collection.insert_one() in pymongo.
As I understand it, in default MongoDB a document inserted by the user is converted into two key-value pairs (one for the collection and one for the index), and each pair is inserted into its own B+ tree created with the file schema. What exactly do those key-value pairs contain? And what happens to them when I create an index?
To find out, I ran a small experiment. I inserted 10 documents one by one, captured the cursor at __curfile_insert(WT_CURSOR* cursor) using GDB, and looked at the cursor's key and value data. Between the 5th and 6th inserts, I created an index to see whether the contents of the index key-value pair would change.
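To compare the captured bytes across inserts, it can help to turn the GDB output back into raw bytes. The following is only a minimal sketch: it assumes the key or value was dumped with GDB's `x/<n>xb` command (for example on the data pointer of the cursor's key or value item; the exact field to dump is an assumption here), and `parse_gdb_hex_dump` and the sample addresses are hypothetical names made up for illustration.

```python
import re


def parse_gdb_hex_dump(dump: str) -> bytes:
    """Collect the byte values from GDB `x/<n>xb` output into a bytes object."""
    out = bytearray()
    for line in dump.splitlines():
        # Each line looks like "0xADDR <optional symbol>:\t0x16\t0x00 ...".
        # The last colon ends the address label; the bytes follow it.
        _, sep, payload = line.rpartition(":")
        if not sep:
            continue  # not a dump line
        out.extend(int(tok, 16) for tok in re.findall(r"0x([0-9a-fA-F]{2})\b", payload))
    return bytes(out)


# Hypothetical dump (made-up address); 05 00 00 00 00 is an empty BSON document.
sample = "0x7fffe4001230:\t0x05\t0x00\t0x00\t0x00\t0x00"
print(parse_gdb_hex_dump(sample).hex())
```

The resulting bytes object can then be sliced or compared between the 1st and 6th insert instead of reading the hex dump by eye.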
- Settings
MongoDB version : 4.0.9
MongoDB storage engine : WiredTiger 3.1.1
MongoDB storage engine configurations :
collectionConfig blockCompressor : none
indexConfig prefixCompression : false
python version: 3.6
- python script
from pymongo import MongoClient
import bson
import pprint

client = MongoClient()
db = client['idx_tree_check']
admin_db = client.admin

# Ten test documents; note that qty is stored as a string.
document0 = { "item": "canvas_1", "qty": "500"}
document1 = { "item": "canvas_2", "qty": "700"}
document2 = { "item": "canvas_3", "qty": "1000"}
document3 = { "item": "canvas_4", "qty": "400"}
document4 = { "item": "canvas_5", "qty": "100"}
document5 = { "item": "canvas_6", "qty": "600"}
document6 = { "item": "canvas_7", "qty": "900"}
document7 = { "item": "canvas_8", "qty": "800"}
document8 = { "item": "canvas_9", "qty": "200"}
document9 = { "item": "canvas_10", "qty": "300"}

# First five inserts: only the default index exists.
print("write document0")
db.usertable.insert_one(document0)

print("write document1")
db.usertable.insert_one(document1)

print("write document2")
db.usertable.insert_one(document2)

print("write document3")
db.usertable.insert_one(document3)

print("write document4")
db.usertable.insert_one(document4)

# Create an ascending index on qty between the 5th and 6th insert.
print("create Index")
db.usertable.create_index([ ("qty", 1) ])

# Last five inserts: the qty index now exists as well.
print("write document5")
db.usertable.insert_one(document5)

print("write document6")
db.usertable.insert_one(document6)

print("write document7")
db.usertable.insert_one(document7)

print("write document8")
db.usertable.insert_one(document8)

print("write document9")
db.usertable.insert_one(document9)
- result using GDB
Sorry, I am a new user and can post only one image.
- 1st document insert : collection kv pair
As shown in the figure above, the key of the collection kv pair is the record id, and the value is the concatenation of the object id (primary key) and the document data.
- 1st document insert : index kv pair
For the kv pair inserted into the btree managed by the index, the key is equal to the object id created earlier. However, I do not understand what the value is.
- now I created index
- 6th document insert : collection kv pair
As shown in the figure above, the pattern is the same as in the first result: the key is the record id, and the value is the document data.
- 6th document insert : index kv pair
Before the experiment, I expected the key to change. However, the index key was the same as before: the object id of the document.
- 6th document insert : second index kv pair
Furthermore, I observed that one more kv pair is inserted into the index btree, but I cannot tell what it contains.
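To check the collection values byte by byte, one option is to build the expected BSON for a document by hand and compare it with what GDB shows. The sketch below follows the public BSON layout (int32 total length, then typed elements, then a terminating zero byte) using only the standard library. The helper names are hypothetical, the all-zero object id is a placeholder for the one the driver actually generated, and it assumes the driver writes `_id` as the first field, which would match the "object id followed by document data" pattern seen above.

```python
import struct


def cstring(s: str) -> bytes:
    """BSON field names are NUL-terminated UTF-8 strings."""
    return s.encode("utf-8") + b"\x00"


def string_element(name: str, value: str) -> bytes:
    """Type 0x02: int32 byte length (including the trailing NUL), then the bytes."""
    data = value.encode("utf-8") + b"\x00"
    return b"\x02" + cstring(name) + struct.pack("<i", len(data)) + data


def objectid_element(name: str, oid: bytes) -> bytes:
    """Type 0x07: the 12 raw object id bytes, no length prefix."""
    if len(oid) != 12:
        raise ValueError("an object id is exactly 12 bytes")
    return b"\x07" + cstring(name) + oid


def document(elements) -> bytes:
    """int32 total size (counting itself), the elements, then a 0x00 terminator."""
    body = b"".join(elements) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body


placeholder_oid = bytes(12)  # assumption: replace with the id seen in GDB
doc = document([
    objectid_element("_id", placeholder_oid),
    string_element("item", "canvas_1"),
    string_element("qty", "500"),
])
print(len(doc), doc.hex())
```

If the value captured for the collection kv pair starts with the same length prefix and element types, it is a plain BSON document whose first field is `_id`, rather than a separate object id glued in front of the document.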
Finally, here are my questions:
When inserting one document by calling db.collection.insert_one(),
q1-1 : what is the content of the key-value pair that goes into the btree managed by the collection file schema?
q1-2 : what is the content of the key-value pair that goes into the btree managed by the index file schema?
After I create an index and insert a document again,
q2-1 : what is the content of the key-value pair that goes into the btree managed by the collection file schema?
q2-2 : what is the content of the key-value pair that goes into the btree managed by the index file schema?
q2-3 : why are two index key-value pairs created?
Thank you.