Hello, M320 class of Aug 2019 :-)

Hi, I’m Simon from the UK, and I’m so excited to be starting this new course on schema design and data modelling with MongoDB. I’ve worked in IT for nearly 20 years, most of that time as a developer working with relational databases, and if I’ve learned one thing from the previous MongoDB University courses that I’ve completed, it’s that large chunks of what I know about relational databases just don’t apply to document databases such as MongoDB.

I’m particularly looking forward to learning how to decide whether to embed all my data in documents in one collection, or whether to $lookup some of it from other collections (to normalise or not to normalise), but I’m sure I’ll also learn a bunch of other things that I haven’t even considered yet.

I’ve found these forums to be really useful in my previous courses. It’s good to know that if I hit something that I don’t understand, I can ask for help here, and there are both fellow students and MongoDB employees who can help. If you’re stuck, post a question here, and you never know, it might be something I can help with.

Oh, and I make no apology for spelling “normalise” with an S rather than a Z. Being a native of the UK, my first language is en-GB rather than en-US (in which it’s spelled “normalize”) so I may spell some words differently to how they’re spelled in the videos and course materials and other people’s contributions to the forum. Don’t get hung up on it, we’re all talking about the same thing :slight_smile:

Welcome to M320, I look forward to giving and receiving help in this forum.

Does anyone else want to introduce themselves? Don’t be shy…

Hey @Simon_39939

First off, welcome. Glad to be joining you on this course. I am from Canada so I can relate to your spelling of normalise; however, I tend to ignore my spellcheck for those words and use the ‘Z’ versions. lol. Kind of like how the US dollar is the world reserve currency, I just default to their spelling as well. Just personal preference is all. :slight_smile:

Best of luck to you in the course and enjoy the power of mongo’s rich document structure :exclamation:

Hi all,

I’m Baydr, also from the UK, and a developer with about 12 years of experience, mainly in .NET, but I have branched out to other technology stacks recently. I’m mainly interested in the relational capabilities of MongoDB. For example, you often hear MongoDB compared to relational databases in a way that implies they are mutually exclusive. I’m curious whether MongoDB is going to be, or has already become, more relational. If anyone has any insight into this I’d be keen to hear from you.

Thanks,
Baydr

Hey @Baydr_94892

You are right that there is an implication of that. However, I would not say they are mutually exclusive; each has its place and time (especially with how fast the environment is changing, e.g. IoT).

And to create those relations in MongoDB there are references! We can use $lookup and $graphLookup in an aggregation pipeline to ‘fill out’ those references.
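As a rough sketch of what ‘filling out’ a reference can look like, the pipeline below joins each class to its course with $lookup. The collection names (`classes`, `courses`) and the `courseId` field are made up for illustration, not taken from any real schema.

```javascript
// Hypothetical collections: each document in `classes` holds a `courseId`
// that references the `_id` of a document in `courses`.
const classesWithCourse = [
  {
    $lookup: {
      from: 'courses',        // collection to join against
      localField: 'courseId', // field on the class document
      foreignField: '_id',    // field on the course document
      as: 'course',           // matches are written here as an array
    },
  },
  // $lookup always produces an array; $unwind turns it into a single
  // embedded course. By default, classes with no matching course are dropped.
  { $unwind: '$course' },
];

// In mongosh or a driver this would be run as:
//   db.classes.aggregate(classesWithCourse)
console.log(JSON.stringify(classesWithCourse, null, 2));
```

The pipeline is just data, so it can be built and inspected without a running server.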

However, due to the flexible data model and rich document structure of MongoDB, it is not always wise to fully normalize your data.

Since MongoDB gives users the ability to embed documents inside outer documents OR use references in a relational type fashion we are left with the decisions and challenge below.

The key challenge in data modeling is balancing the needs of the application, the performance characteristics of the database engine, and the data retrieval patterns. When designing data models, always consider the application usage of the data (i.e. queries, updates, and processing of the data) as well as the inherent structure of the data itself.

With MongoDB we let the application tell the database how it should be set up, instead of having to shape the application around how the db is set up with a fully normalized data structure. This is very apparent when you consider the schema-less design: no two documents in a single collection need to have the same schema, although they often do for the sake of human clarity. The schemaless part does not imply there is no schema (since we can still do schema validation and use an ODM like Mongoose.js), just that a developer does not first have to create/update a schema before inserting a new document into a collection.
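To make the schema validation point a bit more concrete, here is a minimal sketch of a `$jsonSchema` validator. The `people` collection and its `name`/`age` fields are assumptions for illustration only.

```javascript
// Hypothetical options for a `people` collection; field names are illustrative.
const peopleOptions = {
  validator: {
    $jsonSchema: {
      bsonType: 'object',
      required: ['name'],
      properties: {
        name: { bsonType: 'string', description: 'must be a string' },
        age: { bsonType: 'int', minimum: 0 },
      },
    },
  },
  // 'error' rejects documents that fail validation; 'warn' only logs them.
  validationAction: 'error',
};

// In mongosh this would be applied with:
//   db.createCollection('people', peopleOptions)
console.log(Object.keys(peopleOptions.validator.$jsonSchema.properties));
```

Documents can still carry extra fields that the validator does not mention, so the flexibility is kept while the required parts are enforced.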

And now that MongoDB offers multi-document ACID transactions (on replica sets since v4.0, extended to sharded clusters in v4.2), it appears that MongoDB has bridged any gaps or reservations people may have coming from an RDBMS to MongoDB.

I am wondering if this article could be of some interest to you?
https://www.mongodb.com/scale/relational-vs-non-relational-database


Hi @natac13,

Thanks for the quick response. The decision I am trying to make is whether to repeat a subdocument of about 20 key-value pairs in 2 places in the same document, or to break it out into a separate collection and reference it from both places. I can write the application that will use the subdocument so that it only needs to reference the subdocument collection, so ‘joins’ or lookups aren’t required. My main concern is keeping them in sync when MongoDB has no referential integrity (that I am aware of). I like the idea of just keeping a single document with duplication of the subdocument, as it avoids orphaned data and a possibly lengthy clean-up process.

Thanks,

Baydr

@Baydr_94892

So with your example I would just like to add some labels to make it easy to converse about.

Let’s say you have a document that represents a Person, who has a Shipping Address and a Home Address (which are always the same in this example).

const Person = {
  name: 'Sean Campbell',
  age: 31,
  shippingAddress: {
    street: '5445 Paper st.',
    city: 'New York',
    state: 'New York',
    country: 'USA',
  },
  homeAddress: {
    street: '5445 Paper st.',
    city: 'New York',
    state: 'New York',
    country: 'USA',
  },
  field5: '...',
  // ...
}

The question then becomes: does the subdocument (the addresses) always get included in the query for Person? If so, then embed. However, if you are querying for Person without always needing to see the addresses, then you could take the relational approach and reference them.

However, for most cases with MongoDB I believe it is best to embed, as a rich document structure is/was the differentiating aspect of Mongo.

I understand that duplication is the scourge of any programmer, with the maxim

DRY - Don’t Repeat Yourself

However there are exceptions to every rule.
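On the worry about keeping the two copies in sync: a write to a single document is atomic in MongoDB, so both embedded copies can be changed in one update. Below is a minimal sketch, assuming the `shippingAddress` and `homeAddress` fields from the Person example; the helper name, the `people` collection and the sample address are hypothetical.

```javascript
// Hypothetical helper: builds one update that writes the same address
// to both embedded copies of the Person document.
function sameAddressUpdate(address) {
  return {
    $set: {
      shippingAddress: { ...address }, // separate copies, identical content
      homeAddress: { ...address },
    },
  };
}

const update = sameAddressUpdate({
  street: '12 Example Rd.',
  city: 'Toronto',
  state: 'Ontario',
  country: 'Canada',
});

// In mongosh or a driver this would be applied as:
//   db.people.updateOne({ _id: personId }, update)
console.log(JSON.stringify(update, null, 2));
```

Because both fields live in the same document, there is no window in which one copy has changed and the other has not.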

Hope that helps.


How about a real-life example from the application I am building: a Training Center database and sign-up application for Members.

Embed
So I have a Members collection that has data which is inserted by the admin, such as the first name, last name and email of all the Members of this program. Those Members can then register with the application. The user data generated from registration, like username and password, gets stored with the Member data since it is so closely related and accessed together.

Reference on _id
When it comes to the Courses and the Classes on those Courses I have a different setup. There are many Courses, each of which will have an ever-growing list of Classes. Therefore I chose to reference the Course document from the Class documents.
The access pattern of the application calls for upcoming Classes to be shown to logged-in Members. This query only deals with the Class info, like the date and who is attending, and does not need to know about the Course data, which has info such as how many hours it takes to complete the Course. Course info, on the other hand, is shown to admins only. If the Classes were embedded in the Course docs, that query would over-fetch data, since the admin only cares about the Course info, not individual Classes.
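Here is a rough sketch of that referencing setup. The field names (`courseId`, `hoursToComplete`, `attendees`) and the string ids are assumptions for illustration, and the filter function is only an in-memory stand-in for the real query.

```javascript
// Hypothetical document shapes; real _id values would typically be ObjectIds.
const course = { _id: 'course-1', title: 'First Aid', hoursToComplete: 16 };

const classes = [
  { _id: 'class-1', courseId: course._id, date: new Date('2019-09-01'), attendees: ['member-1'] },
  { _id: 'class-2', courseId: course._id, date: new Date('2019-10-01'), attendees: [] },
];

// In-memory stand-in for something like:
//   db.classes.find({ date: { $gte: now } }).sort({ date: 1 })
// Note that it never needs to touch the course document.
function findUpcoming(allClasses, now) {
  return allClasses
    .filter((c) => c.date >= now)
    .sort((a, b) => a.date - b.date);
}

const upcoming = findUpcoming(classes, new Date('2019-09-15'));
console.log(upcoming.map((c) => c._id)); // [ 'class-2' ]
```

Each Class only carries the small `courseId` reference, so the list of Classes can keep growing without making the Course document any bigger.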

Important
Again, the biggest thing that matters for schema and model design, as well as for some sharding concerns, is the access patterns of your application. How is it accessing the data? How often? How much of each document? You may find you are projecting away certain fields all the time, and therefore may reconsider your design.