Mongodump mongorestore slowness

Team,

mongodump and mongorestore completed in under 5 minutes for 20 GB of data.
When the data grew to 100 GB, mongodump took ~5 hours and mongorestore took ~48 hours.

Please let me know if this is expected.

Welcome to the community @Jitender_Dudhiyani!

The general problem you are describing is a likely outcome of using mongodump and mongorestore for backups. This approach does not scale well, as noted in the documentation:

mongodump and mongorestore are simple and efficient tools for backing up and restoring small MongoDB deployments, but are not ideal for capturing backups of larger systems.

It would be helpful if you could provide some more information about your environment:

  • What type of deployment do you have (standalone, replica set, or sharded cluster)?
  • What specific version of MongoDB server are you using?
  • How many GBs of RAM does the instance you are dumping from have?
  • Do you have any monitoring in place for metrics like memory and I/O during your dump & restore procedures?
  • Are you dumping data from an instance that is actively being used by your application?
  • What options are you using with mongodump and mongorestore?
  • Are you running mongodump and mongorestore local to the instance you are backing up (or restoring to), or over the network?

A mongodump operation requires reading all data to be dumped through the mongod process’ memory, so if your data has grown significantly beyond available RAM the process may become I/O bound. A mongodump backup can have a significant performance impact if your application is also actively trying to use the instance you are backing up.
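If you do continue with mongodump, compressing the output as it is written can reduce the amount of data hitting disk, which sometimes helps when the dump is I/O bound. A minimal sketch (the host and port below are placeholders, not values from this thread):

```shell
# Compress each collection dump as it is written (--gzip produces .gz files),
# reducing dump size and disk write volume. Host/port are placeholders.
mongodump --host myhost --port 27017 --gzip --out /backup/dump
```

Note that a gzipped dump must also be restored with `mongorestore --gzip`.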

The mongorestore operation will load all data and rebuild all indexes. The time to rebuild indexes will also grow with your data set.
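mongorestore also has a few knobs that shift where the time goes. A hedged sketch (host, port, and worker counts below are illustrative placeholders, not a tested recommendation for your data set):

```shell
# Restore multiple collections in parallel, with several insertion workers per
# collection. Higher worker counts only help if the target server has spare
# CPU and I/O capacity. Host/port and values are placeholders.
mongorestore --host myhost --port 27017 \
  --numParallelCollections=4 \
  --numInsertionWorkersPerCollection=4 \
  --dir=/backup/dump
```

There is also a `--noIndexRestore` option that skips index builds entirely, but the indexes would then have to be created manually afterwards, so it mainly moves the cost rather than removing it.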

If you want to improve both backup and restore times, I would recommend using an alternative supported backup method such as filesystem snapshots or an agent-based approach like MongoDB Cloud/Ops Manager. If you have monitoring in place, that may provide more insight into the resource limitations that currently impact your backup and restore procedures.

Regards,
Stennie

@Stennie_X - Please find my responses below:

  • What type of deployment do you have (standalone, replica set, or sharded cluster)?
    Sharded cluster

  • What specific version of MongoDB server are you using?
    4.2.1 Enterprise edition for Windows

  • How many GBs of RAM does the instance you are dumping from have?
    16 GB

  • Do you have any monitoring in place for metrics like memory and I/O during your dump & restore procedures?
I use the Spotlight tool to monitor Windows performance.

  • Are you dumping data from an instance that is actively being used by your application?
This is a Dev environment; the instance is NOT actively being used by the application.

  • What options are you using with mongodump and mongorestore ?
    mongodump --oplog --host server01 --port yyyyyy --out e:\mongodb_backup\shard04
    mongorestore --host server02 --port yyyyy --oplogReplay --dir=E:\mongodb_backup\shard04 --stopOnError

  • Are you running mongodump and mongorestore local to the instance you are backing up (or restoring to), or over the network?
Local execution of the commands on the server itself (not over the network).

@Stennie_X - Your help is appreciated. Please help me.

Hi @Jitender_Dudhiyani,

As noted earlier, mongodump is not the most efficient or scalable backup approach, as it requires all data to be read and dumped via the mongod process. mongorestore has to recreate all data files and rebuild all indexes, so a restore will also take longer than with a backup approach such as filesystem snapshots.

If your data to be backed up is significantly larger than RAM, the backup and restore time will increase with the growth in your data set.

If you are backing up a sharded cluster, there are more moving parts to coordinate, and mongodump is not a viable backup approach if you are also using sharded transactions in MongoDB 4.2+.
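On the coordination point: the documented procedure for backing up a self-managed sharded cluster includes stopping the balancer first, so chunk migrations do not run mid-backup. A sketch using the shell against a mongos (the host and port are placeholders):

```shell
# Stop the balancer before backing up a sharded cluster so chunk migrations
# cannot move data between shards during the backup. Run against a mongos.
mongo --host mongos-host --port 27017 --eval "sh.stopBalancer()"

# ... perform the per-shard and config server backups here ...

# Re-enable the balancer once the backup is complete.
mongo --host mongos-host --port 27017 --eval "sh.startBalancer()"
```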

I would recommend looking into alternative backup methods (filesystem snapshots or MongoDB Cloud/Ops Manager).

4.2.1 Enterprise edition for Windows

An aside not specifically related to backup: you should upgrade to the latest 4.2.x release (currently 4.2.8). Minor releases include bug fixes and stability improvements, and do not introduce any backward breaking changes. See Release Notes for MongoDB 4.2 for more details on issues fixed.

Regards,
Stennie

Also very disappointed in Mongo from a backup and recovery perspective. I have a .gz backup which is 30 GB; one of the collections in it is 400 GB uncompressed. I worked out that the restore for just that collection will take roughly 40 hours.
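For context, those figures imply a fairly low sustained restore rate. A quick back-of-the-envelope calculation using the 400 GB / 40 hour numbers from the post above (the arithmetic is just an illustration):

```shell
# 400 GB restored (inserts + index builds) in ~40 hours:
# 400 * 1024 MB / (40 * 3600 s) ≈ 2.8 MB/s sustained.
throughput=$(awk 'BEGIN { gb=400; hours=40; printf "%.1f", gb*1024/(hours*3600) }')
echo "~${throughput} MB/s sustained restore rate"
```

A rate that low usually points at index rebuild cost and I/O limits rather than raw insert speed, which is consistent with the advice earlier in the thread to prefer filesystem snapshots for data sets of this size.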