Understanding Realm data management & Sync

Julien_Chouvet · April 28, 2021, 12:51pm

Hi all!

I’m developing an iOS app using Realm. I read some docs to understand the functioning of Realm data management, especially with the Sync functionality. However I still have a lack of understanding on some points.

My app allows users to choose different items from different categories and then generate some reports containing details about the selected items.

Here are two use cases happening in my app:

Use case #1

To select the items, the user navigates through different categories.

Let’s say I have only 2 levels of categories: the user goes to category 1, then 1.1, which displays a list of items, then goes to category 2 → 2.1, which displays a list of other items, and so on.

Currently, each time a category is opened, I call the following function, which opens a Realm and then calls initUserItemsObserver() to set up an observer that does some UI work.

private func initUserItemsRealm() {
    self.userItemRealm = try! Realm(configuration: RealmConstants.USER!.configuration(partitionValue: <partition_value>))
    self.userItems = self.userItemRealm?.objects(Items.self).sorted(byKeyPath: "name")
    self.initUserItemsObserver()
}

I want to know if there is a way to optimize my algorithm in terms of number of requests and sync time, because I have categories with hundreds/thousands of items.

Use case #2

Once a report is generated, I save it in a collection. The report object contains an array of _id values corresponding to the items.

When a user wants to see an old report, I use the _id of the items stored in the array to get the item objects and then generate the report.

My first question is: in that case, to get the item objects from their _id, is it better to open a Realm with an observer (as in initUserItemsRealm()) or to use AsyncOpen?

My second question is: In order to optimize the number of requests and the time, I was thinking of storing the information needed from the items directly in the report object (instead of storing only the _id in the array). Instead of storing directly the information in the report object, I was thinking of using the One-to-Many Relationship to access the information needed from the item objects. However, I read that when using the Relationships « Realm Database executes read operations lazily as they come », but as I will read all the item objects is there a benefit of using the Relationship?

My last question is about the lifecycle of the objects. As explained in the « Think Offline-first » paragraph, the changes received from the server are integrated into the local realm. If I understand well, does it mean that after the objects have been downloaded once to generate the report, if the user wants to generate the same report again, realm will take the objects from the local file instead of downloading them from the server?

Thanks for your help!

Jay · April 28, 2021, 4:39pm

Realm is an offline-first database. That means ALL of the data is stored locally and then sync’d at a later time, typically within milliseconds. So, if you’re performing a query, that’s a local function and it will return the data as fast as your drive will return it. Fortunately, Realm objects are lazily loaded, so even with thousands of items, the results will be populated ‘instantly’.

So… I am not sure what your algorithm is but your code looks great to me!

In a sync environment, you always open realm the first time with .asyncOpen. See Sync Changes Between Devices - iOS SDK

Thereafter you can access realm via the code in your question.
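
A minimal sketch of that two-step pattern, assuming the RealmSwift 10.x partition-based sync API used elsewhere in this thread. The function names openFirstTime and openAfterwards are made up for illustration, and user is assumed to be an already logged-in sync user.

```swift
import RealmSwift

// First launch: asyncOpen downloads the partition's data before
// handing back a Realm. (Function name is hypothetical.)
func openFirstTime(user: User, partitionValue: String,
                   completion: @escaping (Realm?) -> Void) {
    let config = user.configuration(partitionValue: partitionValue)
    Realm.asyncOpen(configuration: config) { result in
        switch result {
        case .success(let realm):
            completion(realm)
        case .failure(let error):
            print("Failed to open realm: \(error.localizedDescription)")
            completion(nil)
        }
    }
}

// Later opens: the local file already exists, so a synchronous open
// returns right away and sync carries on in the background.
func openAfterwards(user: User, partitionValue: String) throws -> Realm {
    let config = user.configuration(partitionValue: partitionValue)
    return try Realm(configuration: config)
}
```

Queries and observers then run against whichever Realm these return; neither of them triggers a download on its own.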

This sounds like you are asking about denormalizing your data. In a nutshell that means duplicating your data into smaller or different chunks to improve read performance. I am not really sure it’s necessary in this use case; a lot of that would depend on how long it takes to generate the report in the first place. If it takes 18ms for example, then denormalizing the data is not needed.

That technique is really powerful when you are dealing directly with a NoSQL database. While that is what MongoDB uses on the back end for storage, up front here in the drivers seat we are insulated from that and get to play with and query super flexible objects that represent that data in an object oriented way.

I think this wraps back to your first question; objects are not downloaded once in response to a read or a query. All objects exist on the local drive as well as on the server. So when you run a report, no additional information is downloaded as it’s already there.
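
As a rough illustration of the report lookup, here is how the stored _id values could be resolved against the local realm. The Item model and the items(for:in:) helper are assumptions made for this sketch (the real model in the question is called Items and its fields are not shown), and the demo uses an in-memory realm so it runs without any sync configuration.

```swift
import RealmSwift

// Hypothetical item model; only _id and name are assumed here.
class Item: Object {
    @objc dynamic var _id = ObjectId.generate()
    @objc dynamic var name = ""
    override static func primaryKey() -> String? { "_id" }
}

// Resolve the ids stored in a report against the local realm.
// Ids whose item no longer exists are skipped; order follows `ids`.
func items(for ids: [ObjectId], in realm: Realm) -> [Item] {
    ids.compactMap { realm.object(ofType: Item.self, forPrimaryKey: $0) }
}

// Demo on an in-memory realm so no sync setup is needed.
func demo() throws -> [String] {
    let config = Realm.Configuration(inMemoryIdentifier: "report-demo")
    let realm = try Realm(configuration: config)
    let a = Item(); a.name = "A"
    let b = Item(); b.name = "B"
    try realm.write { realm.add([a, b]) }
    return items(for: [b._id, a._id], in: realm).map { $0.name }
}
```

With a synced realm the lookup is the same; it reads from the local file either way.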

Back when Realm was not part of MongoDB, they had a feature called query-based (aka partial) sync, where the app would only download specific Realm data. That changed, and now it’s 100%. So keep that in mind: local first really means ‘local’; all of your data is stored locally and sync’d at a later time.

Julien_Chouvet · April 29, 2021, 6:16am

Thanks a lot for your answers @Jay! It’s a lot more clear now.

I still have 2 questions that came while reading your answers.

1 - When you say:

I’m wondering if it is “simply” stored on my iPhone, because as said in the link you provided, “Realm avoids copying data into memory except when absolutely required”. If that’s indeed the case, is there a way to limit the size of the data stored?

2 - My second question is about the Sync Runtime. Is this metric increased every time I open a realm with Realm(config:)? For example, if I open a realm with Realm(config:) and the data stored locally is the same as the data on the server (no changes have been made), does the Sync Runtime increase?
Same question when I init an observer, is the Sync Runtime increased until the observer is invalidated or just when data are downloaded (if some are)?

Andrew_Morgan · April 29, 2021, 9:52am

Hi Julien,

To answer #1: you can use Realm Sync Partitioning to control what data is synced to the device (typically based on the user and/or what they’ve asked to see). I’ve a new article that will hopefully go live tomorrow that covers various partitioning strategies – I’ll try to remember to circle back here with the link once it’s live (but if I forget, then it will appear in this list: https://www.mongodb.com/learn/?products=Mobile).

Cheers, Andrew.

Jay · April 29, 2021, 3:44pm

Let me elaborate a bit on question #1. I am sure @Andrew_Morgan will cover it more thoroughly but coding examples are always good.

Suppose you have a wine cataloging app. It stores information about wines: the grape (varietal), a rating, the country of origin, etc. In this use case, we’re going to use the country of origin as the _partitionKey. Here’s the object:

class WineClass: Object {
   @objc dynamic var _partitionKey = ""
   @objc dynamic var varietal = ""
   @objc dynamic var rating = ""
}

So a WineClass object from the United States may look like

WineClass
   _partitionKey = "US"
   varietal = "Cabernet Sauvignon"
   rating = "Excellent"

One from Italy may look like

WineClass
   _partitionKey = "Italy"
   varietal = "Nebbiolo"
   rating = "Good"

So as you can see, we have a single object, WineClass, that has different partitions. Note that in the big picture, a partition = a Realm. When you read a realm, you specify the partition you want to read:

let config = user.configuration(partitionValue: "Italy")
Realm.asyncOpen(configuration: config) { result in...

So only the wines from Italy are sync’d; the wines from the US will never touch your disk. So when I mentioned ALL data is sync’d, what was meant was that ALL data whose partitions you access from code is sync’d.

If the data on the server matches the local data, there’s nothing to sync. When you add an observer, it’s observing something that’s already been lazily loaded, and those objects were stored locally. In other words if you wanted to observe your wines for changes it would look something like this:

let wineResults = realm.objects(WineClass.self) // <- results from disk
notificationToken = wineResults.observe { changes in

The let wineResults line lazily loads the wines (from disk) and then the observer observes those results. There will be no time impact for the above code, so no, it would not impact sync’ing, since that’s automagically done in the background.
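
For completeness, a fuller version of that observer might look like the sketch below. It assumes the RealmSwift 10.x notification API; observeWines is a name invented for this example, and the realm passed in is assumed to be one you have already opened on a thread with a run loop (the main thread, typically).

```swift
import RealmSwift

class WineClass: Object {
    @objc dynamic var _partitionKey = ""
    @objc dynamic var varietal = ""
    @objc dynamic var rating = ""
}

// Returns the token; keep a strong reference to it for as long as
// notifications are wanted, then call invalidate() on it.
func observeWines(in realm: Realm) -> NotificationToken {
    let wineResults = realm.objects(WineClass.self) // lazily loaded from disk
    return wineResults.observe { changes in
        switch changes {
        case .initial(let wines):
            print("Loaded \(wines.count) wines")
        case .update(_, let deletions, let insertions, let modifications):
            print("Deleted: \(deletions), inserted: \(insertions), modified: \(modifications)")
        case .error(let error):
            print("Observer error: \(error)")
        }
    }
}
```

The observer fires for local writes and for changes that sync brings in from the server alike, since both end up in the same local file.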

Remember, data is not downloaded upon request; it’s local. Any partitions you accessed when opening Realm (as shown above) already have their data sync’d by the time you’re ready to use it.

Julien_Chouvet · April 30, 2021, 2:03pm

Thanks again for your answers!

I didn’t know that data is sync’d automatically in the background.

I just have one last question. In the Billing documentation, it is said:

Realm counts the total amount of time in which a client application user has an active connection to the sync server even if they are not transferring data at the time

What does an active connection to the sync server mean?
Does it mean that as long as my app has opened (at least) one Realm (and thus has a local version which is sync’ing in the background), I’m connected to the sync server?

Jay · April 30, 2021, 5:30pm

This is correct. For MongoDB Realm Sync:

Price: $0.08 / 1,000,000 runtime minutes ($0.00000008 / min)

Formula: (# Active Users) * (Sync time (min / user)) * ($0.00000008 / min)

Free Tier Threshold: 1,000,000 requests or 500 hours of compute or 10,000 hours of sync runtime (whichever occurs first)
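
To make the formula concrete, here is a small worked example in plain Swift. The price and the free-tier figure come from the numbers quoted above; the workload (1,000 users, each connected 60 minutes a day for 30 days) is made up purely for illustration, and the sketch does not model how the free tier is applied to a bill.

```swift
// Price per sync runtime minute, from the figures quoted above.
let pricePerMinute = 0.08 / 1_000_000

// Cost = active users * sync minutes per user * price per minute.
func syncCost(activeUsers: Int, minutesPerUser: Double) -> Double {
    Double(activeUsers) * minutesPerUser * pricePerMinute
}

// Hypothetical workload: 1,000 users, 60 min/day for 30 days each.
let monthlyCost = syncCost(activeUsers: 1_000, minutesPerUser: 60 * 30)
print(monthlyCost) // roughly 0.144 (dollars)

// The free tier's 10,000 hours of sync runtime, expressed in minutes.
let freeTierMinutes = 10_000 * 60
print(freeTierMinutes)
```

That workload adds up to 1,800,000 runtime minutes, which is why connection time, rather than the amount of data transferred, is the number to watch.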

Julien_Chouvet · May 5, 2021, 6:13am

Thanks for all your answers @Jay!