How can I back up some documents when the number of documents reaches a limit I set?

Hello!
This is a good night to have a good dream.

I wonder how to back up some of the documents in a collection when the number of documents reaches a limit I set.

I searched, but I only found “How to create a capped collection”.
A capped collection just deletes old documents when it reaches its maximum number of documents.

This is all I need.
When the number of documents in a collection reaches the limit I set:

  1. Back up some of the documents in the collection.
  2. Delete the backed-up documents from the collection.

Help me :cry:

Hi,

Backup part of documents in a collection.

Do you mean backup part of the collection?

As I understand it, you want to do something like a capped collection, but instead of deleting the old documents, you want to move them somewhere else. Is this correct?

If this is not correct, could you provide some examples of what you have in mind?

Best regards,
Kevin

No, not the back up part.

I was just saying that I need to know how to back up some of my documents when the number of documents in my collection reaches a certain limit. (The number of documents, not the size :slight_smile: )

So, basically, I am trying to back up some number of documents when I have more documents than I need, so that the file size does not become huge (and, of course, delete the backed-up documents from the original collection :slight_smile: )

Thanks in advance

You can use Change Streams.

This will let your application watch the collection and, through that, keep track of the number of documents in it. When the number exceeds a previously set limit, a process is started to back up (or write to another collection) a selected number of documents (based on criteria you choose).
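
As a rough sketch of this idea in Python with the PyMongo driver: the function names, the `limit` and `batch_size` parameters, and the choice of "oldest by `_id`" as the selection criterion are all assumptions for illustration, not something prescribed here. Note that change streams require a replica set or sharded cluster.

```python
def move_oldest(source, archive, batch_size):
    """Copy the oldest `batch_size` documents into `archive`, then delete them from `source`."""
    # Default ObjectId values sort roughly by creation time, so ascending _id
    # is used here as a stand-in for "oldest first" (an assumption).
    docs = list(source.find().sort("_id", 1).limit(batch_size))
    if not docs:
        return 0
    archive.insert_many(docs)  # copy first ...
    source.delete_many({"_id": {"$in": [d["_id"] for d in docs]}})  # ... then delete
    return len(docs)


def watch_and_archive(source, archive, limit, batch_size):
    """Follow insert events on `source`; archive a batch whenever the count exceeds `limit`."""
    pipeline = [{"$match": {"operationType": "insert"}}]
    with source.watch(pipeline) as stream:
        for _event in stream:
            # This runs a count on every insert, which can be costly on a busy collection.
            if source.count_documents({}) > limit:
                move_oldest(source, archive, batch_size)


# Hypothetical usage (database and collection names are placeholders):
#   from pymongo import MongoClient   # third-party: pip install pymongo
#   db = MongoClient("mongodb://localhost:27017")["mydb"]
#   watch_and_archive(db["events"], db["events_archive"], limit=100_000, batch_size=10_000)
```

The copy and the delete are two separate operations, so a failure between them would leave the batch in both collections; a real implementation would need to handle that case.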

Hi @DongHyun_Lee,

For self-managed deployments, using change streams (as suggested by @Prasad_Saya) is certainly one approach. However, do consider the potential impact of triggering a count every time a document is inserted or updated.

A more efficient approach would be to write your own scheduled task that runs periodically and exports documents according to your expiry rules before removing them. You can schedule the task (using O/S scheduling tools like cron) to run during off-peak hours on a suitable frequency (twice daily, daily, every 3 days, weekly, …) to minimise impact on a production deployment.
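
A minimal sketch of such a task in Python, assuming a PyMongo collection object is passed in. The function name, the JSON Lines output format, and the rule "export the oldest documents beyond `limit`, ordered by `_id`" are illustrative assumptions; your own expiry rules would replace them.

```python
import json


def export_and_remove(collection, limit, out_path):
    """If `collection` holds more than `limit` documents, append the oldest
    surplus documents to a JSON Lines file and delete them from the collection.
    Returns the number of documents exported."""
    surplus = collection.count_documents({}) - limit
    if surplus <= 0:
        return 0
    # Ascending _id approximates "oldest first" for default ObjectId keys (an assumption).
    docs = list(collection.find().sort("_id", 1).limit(surplus))
    with open(out_path, "a", encoding="utf-8") as f:
        for doc in docs:
            # default=str turns ObjectId and datetime values into plain strings;
            # readable, but not a lossless format for restoring the data.
            f.write(json.dumps(doc, default=str) + "\n")
    # Delete only after the export file has been written and closed.
    collection.delete_many({"_id": {"$in": [d["_id"] for d in docs]}})
    return len(docs)


# Hypothetical crontab entry to run a script containing this daily at 03:00
# (the script path is a placeholder):
#   0 3 * * * /usr/bin/python3 /path/to/archive_task.py
```

This loads the whole surplus into memory, so for a very large backlog it would be safer to process it in smaller batches.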

If you happen to be using MongoDB Atlas (or might consider doing so), we recently added a new Atlas Online Archive beta feature which archives data greater than an expiry date (based on rules you configure) into more cost-effective S3 storage. With Online Archive and Atlas Data Lake you can continue to query both live and archived data.

Regards,
Stennie

Yes, running a countDocuments query for each insert can take time.

The document count can be tracked within the application instead. For example, a variable can be used, and its value can be persisted once every n documents. Also, application servers have mechanisms to persist state (such as the variable's value) in the event of application failures.
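
One way this could look in Python is sketched below. The `InsertCounter` class, its JSON state file, and the `persist_every` parameter are hypothetical names made up for illustration; an application server's own state mechanism could take the place of the file.

```python
import json
import os


class InsertCounter:
    """Tracks a document count in memory and persists it every `persist_every` inserts."""

    def __init__(self, state_path, persist_every=100):
        self.state_path = state_path
        self.persist_every = persist_every
        self._unsaved = 0
        self.count = self._load()

    def _load(self):
        # Resume from the last persisted value, or start at 0 if there is none.
        if os.path.exists(self.state_path):
            with open(self.state_path, encoding="utf-8") as f:
                return json.load(f)["count"]
        return 0

    def _save(self):
        # Write to a temporary file, then rename, so a crash cannot leave a half-written file.
        tmp = self.state_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump({"count": self.count}, f)
        os.replace(tmp, self.state_path)
        self._unsaved = 0

    def add(self, n=1):
        """Call after each successful insert of `n` documents."""
        self.count += n
        self._unsaved += n
        if self._unsaved >= self.persist_every:
            self._save()

    def remove(self, n):
        """Call after archiving `n` documents; persists immediately."""
        self.count -= n
        self._save()

    def over(self, limit):
        return self.count > limit
```

After a crash the persisted value can lag behind the real count by up to `persist_every - 1` inserts, so an occasional countDocuments query (for example at startup) would still be needed to resynchronise it.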
