MongoDB Realm for large public data sets

I’m looking at using MongoDB Realm Sync as the backend for an app that will allow users to upload and share reviews of music albums, follow other users, and search for albums and artists.

Looking at the docs, I get a bit stuck on partition keys. Most of the data needs to be public (i.e. album, artist, review data), but if I partition it all into a ‘public’ realm, the cached database will grow to a huge size in local storage. Is there a better way of handling this, and am I understanding how partitioning works correctly? Also, is it true that the whole database will be downloaded/cached to the device? I assume this could be problematic with large data sets. Any docs you could point me to would be great!

The other way I was thinking of going about this was to partition data into realms for each user, and then either write a separate API using MongoDB Realm 3rd party services to get review, artist, or album data, or use MongoDB Realm functions to do basically the same thing while avoiding HTTP requests.

I’ve also been looking at the code for RealmSwift, and stumbled upon RealmApp.mongoClient, which I couldn’t find any documentation for. Is this a way to access this ‘public’ data (or any cluster data for that matter) without using Realm Sync? Or is this feature not fully fleshed out yet?

Sorry for the long post – I’m just trying to figure out whether MongoDB Realm will suit my use case or if I’ll have to go another route. Thanks so much!

Hi @Pierre_Rodgers,

I think the second approach you describe is the correct one.

Not all data within the Realm application has to be synced to the device. You can choose which databases or collections you are going to sync, and partition those by a logical device partition key.
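
To make the partitioning idea concrete, here is a small conceptual sketch in TypeScript. It is not the Realm SDK; it only models the rule that a device opening a realm for one partition value receives the documents carrying that value. The `_partition` field name, the `user=<name>` value format, and the `docsForPartition` helper are all assumptions made up for illustration.

```typescript
// Conceptual model only: this mimics partition-based sync, it does not call Realm.
interface Doc {
  _id: string;
  _partition: string; // hypothetical partition key field
  title: string;
}

// Returns the documents a device would receive for a single partition value.
function docsForPartition(docs: Doc[], partitionValue: string): Doc[] {
  return docs.filter((d) => d._partition === partitionValue);
}

// Hypothetical synced collection holding per-user data only.
const allDocs: Doc[] = [
  { _id: "1", _partition: "user=alice", title: "Alice's draft review" },
  { _id: "2", _partition: "user=bob", title: "Bob's draft review" },
  { _id: "3", _partition: "user=alice", title: "Alice's follow list" },
];

// Alice's device only gets the documents in her own partition.
const aliceDocs = docsForPartition(allDocs, "user=alice");
console.log(aliceDocs.map((d) => d._id));
```

With per-user partitions like this, the local cache grows with one user's data rather than with the whole public data set.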

The other parts, which do not need “offline-first” access, can be accessed through the Realm SDK, either by querying directly or by calling a function.

Pushing aggregations or text search to the Atlas platform, rather than running them on the device, will result in better performance.
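
As a rough sketch of what "pushing text search to Atlas" could look like, the TypeScript below builds an aggregation pipeline that uses the Atlas Search `$search` stage. The helper name `buildAlbumSearchPipeline`, the index name `default`, and the `title` and `artist` fields are assumptions for illustration; they depend on how your collection and search index are actually set up. The pipeline would then be passed to an `aggregate` call on the albums collection, for example from a Realm function, and it is worth checking the docs for whether `$search` is available in the context you run it from.

```typescript
// Hypothetical helper: builds a server-side album search pipeline.
// Index name and field names are assumptions, not taken from a real schema.
function buildAlbumSearchPipeline(query: string, limit: number = 20): object[] {
  return [
    // Full-text match on the album title, evaluated by Atlas Search.
    { $search: { index: "default", text: { query: query, path: "title" } } },
    // Cap the result size so only a small page reaches the device.
    { $limit: limit },
    // Return only the fields the search screen needs.
    { $project: { _id: 1, title: 1, artist: 1 } },
  ];
}

const pipeline = buildAlbumSearchPipeline("kind of blue", 5);
console.log(JSON.stringify(pipeline, null, 2));
```

The device only ever receives the few matching documents, so the full album catalogue never has to be synced locally.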

I think the mongoClient is a last-resort option for queries that are not available through the standard collection API.

Let me know if that covers your questions.

Best regards,
Pavel