Database design for a SaaS application

Hello Everyone,

While designing the database structure for a SaaS application, I have come across several ways to do it, and I am curious to know which approach works best with MongoDB Atlas.

We expect to have close to 500 tenants. As for how the tenants behave:

  1. We expect the same schema across all tenants, with no customization
  2. We expect more or less equal load across all tenants
  3. Based on the application, we expect 50-80 collections within each tenant

We have come across multiple approaches to designing the DB. Given the number of tenants, which one is the right approach?

  1. One database per tenant.
    a. With this approach, is there a limit on the number of databases that can be added to a cluster?
  2. One database for all tenants, where tenant-specific data is identified by a tenant ID in each collection.
    a. How will performance be affected when 2500 users use the application?
  3. One database for all tenants, with data sharded by tenant ID.
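To make the trade-off concrete, here is a rough sketch of what the numbers above imply for each layout. The tenant and collection counts come from this post; the `tenant_<id>` naming scheme, the function names, and the ID pattern are only assumptions for illustration, and the pattern is deliberately stricter than whatever MongoDB actually allows in a database name.

```python
import re

TENANTS = 500                      # from the post
COLLECTIONS_PER_TENANT = (50, 80)  # from the post: 50-80 collections each

# Deliberately conservative pattern (an assumption, stricter than MongoDB's
# real naming rules) so a tenant ID can never yield an invalid database name.
_SAFE_ID = re.compile(r"[A-Za-z0-9_]{1,40}")


def collections_in_cluster(tenants: int, per_tenant: int, db_per_tenant: bool) -> int:
    """Count the collections the cluster ends up holding under a given layout."""
    return tenants * per_tenant if db_per_tenant else per_tenant


def tenant_database_name(tenant_id: str) -> str:
    """Approach 1: route each tenant to its own database (hypothetical naming)."""
    if not _SAFE_ID.fullmatch(tenant_id):
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"


low, high = COLLECTIONS_PER_TENANT

# Approach 1: 500 databases holding 25,000 to 40,000 collections in total.
per_tenant_range = (
    collections_in_cluster(TENANTS, low, db_per_tenant=True),
    collections_in_cluster(TENANTS, high, db_per_tenant=True),
)

# Approaches 2 and 3: one database with 50 to 80 collections,
# each of which holds documents for all 500 tenants.
shared_range = (
    collections_in_cluster(TENANTS, low, db_per_tenant=False),
    collections_in_cluster(TENANTS, high, db_per_tenant=False),
)

print(per_tenant_range, shared_range)
```

So the question for approach 1 is really whether a cluster is comfortable with tens of thousands of collections (and their indexes), while for approaches 2 and 3 it is whether 50-80 large shared collections perform well.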

Could you please share your thoughts if you have come across similar use cases?

Hi, I’m curious whether you’ve made any decisions on this, as I’m evaluating MongoDB for a similar scenario with multi-tenancy and a moderate amount of normalization (i.e., multiple document types). My initial thought was to use one collection per tenant and store all of that tenant’s document types together in it. This appears to run against the MongoDB data modeling principle that each collection should represent a specific type of document, though it would seem to make sense for performance and for security, since it would isolate each customer’s data.
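As a rough illustration of the collection-per-tenant idea, the sketch below shows mixed document types living side by side in one tenant's collection, told apart by a discriminator field. The `doc_type` field name, the helper names, and the sample documents are all made up for the example; nothing here talks to a real database.

```python
from collections import Counter
from typing import Any, Optional

TYPE_FIELD = "doc_type"  # hypothetical discriminator field


def tagged(doc_type: str, fields: dict[str, Any]) -> dict[str, Any]:
    """Stamp a document with its type before it goes into the tenant's collection."""
    return {TYPE_FIELD: doc_type, **fields}


def type_filter(doc_type: str, query: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Build a query filter restricted to one document type."""
    return {TYPE_FIELD: doc_type, **(query or {})}


# What a single tenant's collection might contain (invented sample data).
tenant_docs = [
    tagged("customer", {"_id": 1, "name": "Ada"}),
    tagged("order", {"_id": 2, "customer_id": 1, "total": 40}),
    tagged("order", {"_id": 3, "customer_id": 1, "total": 15}),
]

# Every read has to name the type it wants, e.g. all orders for customer 1.
orders_for_customer = type_filter("order", {"customer_id": 1})

# Quick tally of how many documents of each type share the collection.
counts = Counter(doc[TYPE_FIELD] for doc in tenant_docs)
```

One consequence is that any index on such a collection covers every document type unless it is restricted somehow. MongoDB's partial indexes might help there, though that is untested here.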

I think you’d want to avoid putting all customers in the same database and collection(s), though, for several reasons:

  1. If you want to take a tenant “offline”, it would be difficult to extract or delete their individual documents, since all tenant data would be in a single repository
  2. There is a much greater risk of missing something at the query level that could expose one tenant’s data to another
  3. Every index would have to include the tenant ID as an additional key, making it heavier to maintain
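Points 2 and 3 can be sketched roughly as follows, assuming a shared collection where every document carries a `tenant_id` field. The field name and both helpers are hypothetical. The index spec is just a list of `(field, direction)` pairs in the shape PyMongo's `create_index` accepts, and no database connection is involved.

```python
from typing import Any, Optional

TENANT_FIELD = "tenant_id"  # hypothetical field name


def scoped(tenant_id: str, query: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Return a copy of the query that is forced to match a single tenant.

    Routing every read and write through a helper like this is one way to
    reduce the risk of forgetting the tenant condition in an individual query.
    """
    if not tenant_id:
        raise ValueError("tenant_id is required")
    result = dict(query or {})
    # Refuse a query that already targets a different tenant.
    if TENANT_FIELD in result and result[TENANT_FIELD] != tenant_id:
        raise PermissionError("query targets a different tenant")
    result[TENANT_FIELD] = tenant_id
    return result


def tenant_index(*keys: tuple[str, int]) -> list[tuple[str, int]]:
    """Prefix an index key spec with the tenant field (the extra key from point 3)."""
    return [(TENANT_FIELD, 1), *keys]


open_tickets = scoped("t1", {"status": "open"})
# {'status': 'open', 'tenant_id': 't1'}

by_date = tenant_index(("created_at", -1))
# [('tenant_id', 1), ('created_at', -1)]
```

A wrapper like this only covers simple filters; aggregation pipelines and joins would each need their own tenant check, which is part of why the risk in point 2 is hard to eliminate completely.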

I’d love to know what you’ve come up with, though. I’d also be curious whether you’ve found performance sufficient if you’re separating document types into collections and need a “lookup” to retrieve related data together.

Thank you