I wonder how to back up part of the documents in a collection when the number of documents reaches a limit I set.
I searched, but I only found “How to create a capped collection”.
A capped collection just deletes the old documents when the collection reaches its maximum number of documents.
All I need is this:
when the number of documents in a collection reaches the limit I set, back up some of the documents and remove them from the collection.
As I understand it, you want to do something like a capped collection, but instead of deleting the old documents, you want to move them somewhere else. Is this correct?
If not, could you provide some examples of what you have in mind?
I was just saying that I need to know how to back up part of the documents in my collection when the number of documents reaches a certain limit. (The number of documents, not the size.)
So, basically, I am trying to back up some documents whenever there are more documents than I need, in order to avoid a huge collection size (and, of course, delete the backed-up documents from the original collection).
This will let your application watch the collection (specifically, the number of documents in it); when the count exceeds a previously set limit, a process is started to back up (or write to another collection) a selected number of documents, based on some criteria you choose.
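As a minimal sketch of that idea using a PyMongo change stream (the connection string, database and collection names, and the 1000-document limit and 100-document batch size are all assumptions for illustration, not anything from this thread):

```python
def docs_to_archive(count, limit, batch):
    """How many documents to move once the count exceeds the limit
    (never more than one batch per trigger)."""
    return min(batch, max(0, count - limit))

def archive_batch(source, archive, n):
    """Move the oldest n documents (by _id order) into the archive collection."""
    docs = list(source.find().sort("_id", 1).limit(n))
    if docs:
        archive.insert_many(docs)
        source.delete_many({"_id": {"$in": [d["_id"] for d in docs]}})

def main():
    from pymongo import MongoClient  # only needed when actually connecting
    client = MongoClient("mongodb://localhost:27017")  # assumed connection string
    db = client["mydb"]
    source, archive = db["events"], db["events_archive"]
    LIMIT, BATCH = 1000, 100  # assumed threshold and batch size
    # Watch insert events only; each insert triggers a count check.
    with source.watch([{"$match": {"operationType": "insert"}}]) as stream:
        for _ in stream:
            n = docs_to_archive(source.count_documents({}), LIMIT, BATCH)
            if n:
                archive_batch(source, archive, n)

if __name__ == "__main__":
    main()
```

Note that change streams require a replica set or sharded cluster, and (as discussed below) counting on every insert has a cost.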
For self-managed deployments, using change streams (as suggested by @Prasad_Saya) is certainly one approach. However, do consider the potential impact of triggering a count every time a document is inserted or updated.
A more efficient approach would be to write your own scheduled task that runs periodically and exports documents according to your expiry rules before removing them. You can schedule the task (using O/S scheduling tools like cron) to run during off-peak hours on a suitable frequency (twice daily, daily, every 3 days, weekly, …) to minimise impact on a production deployment.
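Such a scheduled task could be a small script run from cron; here is a sketch in Python with PyMongo, exporting documents older than a cutoff to a JSON-lines file before deleting them (the database and collection names, the `createdAt` field, the 30-day expiry rule, and the output path are all assumptions):

```python
import json
from datetime import datetime, timedelta, timezone

EXPIRY_DAYS = 30  # assumed expiry rule

def cutoff_date(now, days):
    """Documents created before this datetime are due for export."""
    return now - timedelta(days=days)

def export_and_remove(coll, cutoff, out_path):
    """Append expired documents to a JSON-lines file, then delete them."""
    query = {"createdAt": {"$lt": cutoff}}  # assumes a createdAt field exists
    with open(out_path, "a") as out:
        for doc in coll.find(query):
            doc["_id"] = str(doc["_id"])  # make the ObjectId JSON-serialisable
            out.write(json.dumps(doc, default=str) + "\n")
    coll.delete_many(query)

def main():
    from pymongo import MongoClient  # only needed when actually connecting
    client = MongoClient("mongodb://localhost:27017")  # assumed connection string
    coll = client["mydb"]["events"]
    cutoff = cutoff_date(datetime.now(timezone.utc), EXPIRY_DAYS)
    export_and_remove(coll, cutoff, "/var/backups/events.jsonl")

if __name__ == "__main__":
    main()
```

A crontab entry such as `0 3 * * * /usr/bin/python3 /opt/scripts/archive_events.py` (hypothetical path) would then run it daily at 03:00, an off-peak hour for many deployments. The official `mongodump`/`mongoexport` tools are an alternative to a hand-written exporter.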
If you happen to be using MongoDB Atlas (or might consider doing so), we recently added a new Atlas Online Archive beta feature which archives data older than an expiry date (based on rules you configure) into more cost-effective S3 storage. With Online Archive and Atlas Data Lake you can continue to query both live and archived data.
Yes, running a countDocuments query for every insert can take time.
The document count can instead be tracked within the application: for example, a variable can be used, and its value persisted once every n documents. Application servers also have mechanisms to persist state (the variable's value) in the event of application failures.
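A sketch of that counting approach (the checkpoint interval and the persistence mechanism, a plain file here, are assumptions; a real application server might persist the value elsewhere):

```python
import os

class InsertCounter:
    """Tracks the number of inserts in memory and persists the count to a
    file once every `checkpoint_every` documents, so an approximate count
    survives application restarts."""

    def __init__(self, path, checkpoint_every=100):
        self.path = path
        self.checkpoint_every = checkpoint_every
        self.count = self._load()

    def _load(self):
        if os.path.exists(self.path):
            with open(self.path) as f:
                return int(f.read().strip() or 0)
        return 0

    def increment(self):
        """Call once per successful insert; returns the new count."""
        self.count += 1
        if self.count % self.checkpoint_every == 0:
            with open(self.path, "w") as f:
                f.write(str(self.count))
        return self.count
```

The application would call `increment()` after each insert and compare the returned count against its limit. After a crash, at most `checkpoint_every - 1` inserts go uncounted, which is acceptable when the limit check only needs to be approximate.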