Realm Sync dataset with M2 gives no documents to client after first run

I’m having an issue where syncing a large number of documents to a Realm Sync client worked once, but NOT on subsequent attempts. The first attempt started fine (slowly, it’s an M2) but then slowed down and just …stopped! It only made it to ~26,000 documents synced. The document count is ~27,000 and the average document size is 262B, so the entire download is ~7MB. Surely even an M2 could perform this task more than once?

I then deleted the app’s documents and data from the client and ran it again.

Now, clients receive 0 documents from this collection (all others syncing normally). Is this a limit with the M2s or even above?

Hi Eric, a couple of questions.

  1. What are you syncing to (Swift, Kotlin/Java, React Native?)
  2. Are there any relevant Realm logs in your backend Realm app?

Hello, thanks for the reply. I’m using RealmSwift SDK.

I was unable to find any logs yesterday that I thought were relevant. Initially there were some instances where I needed to tidy up the dataset as its source is a .csv file. At first I thought that it could trip up the sync if the mongo encoding encountered some documents that didn’t fit the schema. In other words I did see a bunch of red MongoEncode errors that cleared out once I had done a better job of scrubbing the csv.

At the end of the day I had set up a new realm-app to continue testing to get a grip on the behaviour of sync when applied to relatively larger datasets (as above). I could observe at least 3 different behaviours on a fresh test app install.

  1. The documents were downloaded and available right away, with no incremental receives. (Sort of what I expected to happen).
  2. No documents were available at any time
  3. The documents would “trickle” in one at a time, at a rate where the full collection wouldn’t be available for a long span of time. 10s of minutes would be my estimate.

I got an alert email around that time saying Sync had “been paused” and needed attention. I assume that something I did during my experimentation was not good for the system…

Anyhow, I’m trying to accomplish at least two things in this spot I’m in now, which are to gain an excellent understanding of the sync behaviour when applied to larger datasets (like > 10,000 documents - which I’m told is a sort of “soft” limit), and to understand any unwritten caveats surrounding tiered clusters and their effect on sync performance.

Thanks again for getting back!

Eric

Further, I realized this morning that I may have begun conducting the download tests a bit too soon after re-enabling sync. I caught a glimpse of the “Copying documents” progress message in the blue bar on the Sync page in Atlas, so I attempted a fresh test app install and run only AFTER the Sync enabling process and document copying had completed. However, I am still observing the “trickle effect” on the client. I’m using a changesetPublisher with a .sink callback, and the result looks like this (truncated):

Removed 0  
Inserted 20268  
Updated 0  
Removed 0  
Inserted 4124  
Updated 0  
Removed 0  
Inserted 25  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 31  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 1

I feel like this is definitely throttling. I hope no one minds but I’ve updated the post title. I’m going to switch to an M10 and check the results.
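One way to put a number on the trickle in a log like the one above is to tally the "Inserted N" lines: how many callbacks fired, how many documents they carried in total, and how many of them delivered only a handful of documents. The following is a minimal sketch, not part of the original test app; the names `InsertTally` and `tallyInsertions` and the cutoff of 10 documents are made up for illustration.

```swift
import Foundation

/// Summary of the "Inserted N" lines in a changeset log.
struct InsertTally {
    var batches = 0        // callbacks that reported insertions
    var documents = 0      // total documents across those callbacks
    var smallBatches = 0   // callbacks below the (arbitrary) small-batch cutoff
}

/// Parses lines of the form "Inserted 123" and ignores everything else.
/// `smallThreshold` is an arbitrary cutoff chosen for illustration.
func tallyInsertions(in log: String, smallThreshold: Int = 10) -> InsertTally {
    var tally = InsertTally()
    for line in log.split(whereSeparator: \.isNewline) {
        let parts = line.split(separator: " ")
        guard parts.count >= 2, parts[0] == "Inserted", let count = Int(parts[1]) else { continue }
        tally.batches += 1
        tally.documents += count
        if count < smallThreshold { tally.smallBatches += 1 }
    }
    return tally
}

// First four callbacks of the M2 log above.
let sampleLog = """
Removed 0
Inserted 20268
Updated 0
Removed 0
Inserted 4124
Updated 0
Removed 0
Inserted 25
Updated 0
Removed 0
Inserted 1
Updated 0
"""
let insertTally = tallyInsertions(in: sampleLog)
print("\(insertTally.batches) batches, \(insertTally.documents) documents, \(insertTally.smallBatches) small")
```

A high share of small batches late in the run is what the "trickle" looks like in numbers; the same function can be pointed at the full, untruncated log.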

I want to know what each one of these callbacks with an insert is costing in terms of requests / data transfer / sync runtime and how that plays out in my billing (i.e. how much of my free tier gets used up by one of these experiments).

Edit: I can’t actually update the post title. I think it should read “Realm Sync M2 throttling experiments with collections > 10,000”

Results of the same experiment on an M10:

First run output, truncated (fresh install, new cluster, so Development Mode == Enabled)

Removed 0  
Inserted 225  
Updated 0  
Removed 0  
Inserted 885  
Updated 0  
Removed 0  
Inserted 8993  
Updated 0  
Removed 0  
Inserted 716  
Updated 0  
Removed 0  
Inserted 6  
Updated 0  
Removed 0  
Inserted 9  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 6  
Updated 0  
Removed 0  
Inserted 2  
Updated 0  
Removed 0  
Inserted 2  
Updated 0  
Removed 0  
Inserted 14  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 5  
Updated 0  
Removed 0  
Inserted 900  
Updated 0  
Removed 0  
Inserted 656  
Updated 0  
Removed 0  
Inserted 696  
Updated 0  
Removed 0  
Inserted 1179  
Updated 0  
Removed 0  
Inserted 60  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 1  
Updated 0  
Removed 0  
Inserted 19  
Updated 0  
Removed 0 

Second run, truncated (fresh install, Development Mode == Disabled):

Initially received 0
Removed 0
Inserted 27685
Updated 0

So with a brand new app and no data, and with Realm Sync not in Development Mode, the trickle stops. But the amount of time between

Initially received 0

and

Inserted 27685

was 31 seconds.
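A measurement like this can be made repeatable by recording the time of the initial callback and the time at which the cumulative insert count reaches the expected total. The sketch below is an assumption about how one might do that, not the code used in the test above; `DownloadStopwatch` is a hypothetical helper, and the timestamps are simulated so the example runs without a sync connection.

```swift
import Foundation

/// Tracks how long it takes for the cumulative insert count to reach an expected total.
struct DownloadStopwatch {
    let expectedCount: Int
    private(set) var received = 0
    private(set) var startedAt: Date?
    private(set) var finishedAt: Date?

    init(expectedCount: Int) { self.expectedCount = expectedCount }

    /// Call when the `.initial` changeset arrives.
    mutating func start(initialCount: Int, at now: Date = Date()) {
        startedAt = now
        received = initialCount
        if received >= expectedCount { finishedAt = now }
    }

    /// Call from each `.update` with the number of insertions it reported.
    mutating func record(insertions: Int, at now: Date = Date()) {
        received += insertions
        if finishedAt == nil, received >= expectedCount { finishedAt = now }
    }

    /// Seconds between the initial callback and reaching the expected total, or nil if not reached.
    var elapsed: TimeInterval? {
        guard let start = startedAt, let end = finishedAt else { return nil }
        return end.timeIntervalSince(start)
    }
}

// Simulated timestamps standing in for real callback times.
let t0 = Date(timeIntervalSince1970: 0)
var stopwatch = DownloadStopwatch(expectedCount: 27_685)
stopwatch.start(initialCount: 0, at: t0)
stopwatch.record(insertions: 27_685, at: t0.addingTimeInterval(31))
print(stopwatch.elapsed ?? -1)
```

In a real run, `start` would be called from the `.initial` case and `record` from the `.update` case of the changeset callback, leaving the `at:` argument at its default.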

My “uneducated guess” is that the most significant factor in overcoming this “trickle factor” seems to be disabling development mode.

So in ideal conditions:

  • M10
  • development mode off
  • excellent network conditions

And the demand of

  • 27685 documents, at
  • average size 226B each
  • Initial connection of fresh app with no realm sync data onboard

it takes 31 seconds to sync all of those documents down to the client.

Same setup on an M2 takes 88 seconds, but with trickling. This means my “uneducated guess” about development mode causing the trickle effect was incorrect. The M2 trickles when development mode is disabled as well.
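To compare the two runs on equal terms, the figures quoted above (27685 documents, 226B average, 31 seconds on the M10 and 88 seconds on the M2) can be turned into an effective throughput. This is plain arithmetic on those numbers; the `SyncRun` type is a hypothetical container for illustration, and the result covers document payload only, not protocol overhead.

```swift
import Foundation

/// Effective initial-sync throughput derived from document count, average size and elapsed time.
struct SyncRun {
    let label: String
    let documents: Int
    let averageBytes: Int
    let seconds: Double

    var totalBytes: Int { documents * averageBytes }
    var documentsPerSecond: Double { Double(documents) / seconds }
    var kilobytesPerSecond: Double { Double(totalBytes) / seconds / 1_000 }
}

let runs = [
    SyncRun(label: "M10", documents: 27_685, averageBytes: 226, seconds: 31),
    SyncRun(label: "M2", documents: 27_685, averageBytes: 226, seconds: 88),
]
for run in runs {
    let docs = Int(run.documentsPerSecond.rounded())
    let kB = Int(run.kilobytesPerSecond.rounded())
    print("\(run.label): \(docs) docs/s, \(kB) kB/s")
}
```

That works out to roughly 893 documents per second (about 202 kB/s) on the M10 and roughly 315 documents per second (about 71 kB/s) on the M2, both far below what the network alone would allow for a payload of about 6.3MB.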

This probably varies a lot with the transient conditions of the multi-tenant tier. Hope I wasn’t being too noisy for my neighbours.

So, for my own requirements, it looks like I need at least an M10 to support this particular use case.

I must apologize for hurling uneducated guesses and loosely researched conclusions around, but I think that’s what these forums are for. And if any specialists in the subject matter could correct me or support me anywhere, that would be GREATLY appreciated.

What does the iOS code look like for opening and measuring the download speed? Have you tried using asyncOpen when opening the realm? Have you taken a look at the Xcode profiler while running your tests?

Thanks for chiming in Ian

The code

try! Realm(configuration: user.configuration(partitionValue: "/places/NA"))
    .objects(Airport.self)
    .changesetPublisher
    .sink(
        receiveCompletion: { completion in
            dump(completion)
        },
        receiveValue: { changes in
            print("Callback time: \(Date())")
            self.lastUpdate = Date()
            switch changes {
            case .initial(let documents):
                print("Initially received \(documents.count)")
                self.documents = Array(documents)
            case .update(let documents, deletions: let deletions, insertions: let insertions, modifications: let modifications):
                // Deletion indices refer to the previous state, so remove from the highest index down.
                deletions.reversed().forEach { self.documents.remove(at: $0) }
                print("Removed \(deletions.count) airports")

                insertions.forEach { self.documents.insert(documents[$0], at: $0) }
                print("Inserted \(insertions.count) airports")

                // Replace modified objects in place.
                modifications.forEach { self.documents[$0] = documents[$0] }
                print("Updated \(modifications.count)")
            case .error(let err):
                fatalError(err.localizedDescription)
            }
        })
    .store(in: &cancellables)
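A side note on mirroring a changeset into a plain array, as the callback above does: the order in which indices are applied matters. The sketch below assumes the index semantics described in Realm's collection-notification documentation (deletion indices refer to the collection before the change, insertion and modification indices to the collection after it). `applyChanges` is a hypothetical helper that uses plain arrays so it can run without Realm.

```swift
/// Applies collection-change indices to a plain array mirror.
/// `deletions` index into the old array; `insertions` and `modifications` index into `latest`.
func applyChanges<T>(to mirror: inout [T], latest: [T],
                     deletions: [Int], insertions: [Int], modifications: [Int]) {
    // Highest index first, so earlier removals do not shift positions still to be removed.
    for index in deletions.sorted(by: >) { mirror.remove(at: index) }
    // Lowest index first, so each insert lands at its final position.
    for index in insertions.sorted() { mirror.insert(latest[index], at: index) }
    // Modified elements are replaced in place.
    for index in modifications { mirror[index] = latest[index] }
}

var mirrored = ["A", "B", "C", "D"]
let current = ["A", "C*", "E", "F"]   // B and D deleted, C modified, E and F inserted
applyChanges(to: &mirrored, latest: current,
             deletions: [1, 3], insertions: [2, 3], modifications: [1])
print(mirrored)
```

Removing index 1 before index 3 in this example would leave only three elements and make the second removal go out of range, which is why the deletions are applied from the highest index down.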

I prefer to stay away from the asyncOpen API as much as possible as I have spent too much time trying to integrate it with my application. In the end, that API has influenced my overall architecture in such a way that it is not needed.

I’m less than fluent with the profiler. I can give it a go and provide observations. Any suggestions on what exactly I should be looking for?

Thanks again,

Eric

This post’s title needs to be changed but I am unable to for some reason

I don’t think we can deterministically measure performance for any shared instance - they are there for development and not for any testing or production usage.

asyncOpen does download the initial seed realm in one big chunk as a performance improvement, so I would be interested to know how much time it takes to download the realm using asyncOpen on an M10. I’d also be interested to know how much time it takes to download the partition after terminating and re-initializing sync.

Unfortunately I had to shut down the M10 since I didn’t intend to keep it running. I can say anecdotally that the sync enabling copy process on the M10 was similar to the M2, probably on the order of 30 seconds.