My reservations about flat structure vs object tree structure.
That is documents with flat schema like
Flat = { _id : 1 , userName : "John" , userSurname : "Doe" }
versus documents with object tree structure like
Object = { _id : 1 , user : { Name : "John" , Surname : "Doe" }
My argument is that the Object is more code friendly than the Flat one.
1 - It is easier to $project in or out only the user information with Object
{ "$project" : { "user" : false } }
vs
{ "$project" : { "userName" : false , "userSurname" : false } }
If we throw schema evolution in, the $project does not have to change with the Object schema. It does with Flat.
2 - The code is cleaner with Object
A function that process user information could be called with Object.user and will work with generic field names like name and surname and will not have access to fields outside the user object. With Flat, you would need to pass each user field separately or pass the whole document. If you pass the field separately you may write code that works with generic field name but if you pass the document, the function may access fields that are not related to the user. But let say, you have another collection where the users are correspondant, so you would have a “from:” object and a “to:” object. The same function would work with “Object.from” and “Object.to”. Using Flat would be more complicated.
After all, O in jsOn is for object. The arguments in favour of object-oriented programming would go down in the drain if the flat structure would be better for programming. After, most of the time when you do SQL you map your multiple tables and records into Objects.
Now, as for size consideration, I was not convince that Flat was better. Intuitively, I was surprised that repeating the user prefix to make the Flat structure would result in a smaller size. After all, user, name and surname is 4 characters less userName and userSurname. So why using short keys would be better in one case and not in this one.
So I came up with a bigger document structure.
Object = {
_id: ObjectId("6550dd9de0d1abae5c24671b"),
user: {
name: 1,
surname: 2,
alias: 3,
phone: 4,
address: {
street: 5,
city: 6
}
}
}
versus
Flat = {
_id: ObjectId("6550d897e0d1abae5c246719"),
userName: 1,
userSurname: 2,
userAlias: 3,
userPhone: 4,
userAddressStreet: 5,
userAddressCity: 6
}
The average object size for Object is 114B while the Flat is 127B as indicated by $collStats. So my guess is that with only user Name and Surname the sweet spot where Object overhead is smaller was not reached.
Finally, even
{_id:1,user:{Name:"John",Surname:"Doe",Phone:1}}
is less characters than
{_id:1,userName:"John",userSurname:"Doe",userPhone:1}
To terminate, I want to say that all other points are valid and want to add to
this will be a double bonus because if someField use a sparse (usually smaller) index.
I am one of the click on the link you shared. Thanks again for sharing.