
How to remove documents by certain date (like until yesterday)

I need to remove data from one of the collections in a dev environment, but only the data that was created up until yesterday.

The id field in my documents holds numeric values (like 1, 2, 3, etc.). I am working in the mongo shell and looking for help on this. There are no date fields in my JSON that I could use as filter criteria for the removal.

@nagaraju_voora

This may make things difficult.

However…

As this is a dev environment, and I am assuming by ‘yesterday’ you mean any given yesterday and not exactly Feb 18, could you not just clear the database and await fresh data?

I could be wrong, but I think you would need some date field stored in the documents you wish to remove by a certain date.

A timestamp would be ideal. If _id is the default ObjectId then you can use that (but I think you said this was an int, unless that was id and not _id):

sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time.
https://docs.mongodb.com/manual/reference/bson-types/#objectid
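As a rough sketch of the ObjectId idea: the first 4 bytes of an ObjectId are a big-endian Unix timestamp in seconds, so a cutoff date can be turned into a boundary ObjectId and compared against _id. The helper name objectid_hex_for and the example cutoff date are made up for illustration, and this only applies if _id really is a default ObjectId.

```python
from datetime import datetime, timezone


def objectid_hex_for(moment: datetime) -> str:
    """Return the 24-character hex of the smallest ObjectId for `moment`.

    The first 8 hex digits encode the Unix timestamp in seconds; the
    remaining 16 digits are zeroed so the value sorts before every real
    ObjectId generated in that same second.
    """
    seconds = int(moment.timestamp())
    return format(seconds, "08x") + "0" * 16


# Hypothetical cutoff: midnight UTC on Feb 19, 2020.
cutoff = datetime(2020, 2, 19, tzinfo=timezone.utc)
boundary = objectid_hex_for(cutoff)
print(boundary)
```

With PyMongo, the resulting string could then be wrapped as bson.ObjectId(boundary) and used in a filter such as {"_id": {"$lt": ObjectId(boundary)}} passed to delete_many. This does not help when _id holds plain integers.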

Possibly also using $natural in sort.
https://docs.mongodb.com/manual/reference/glossary/#term-natural-order

I am assuming you have a field called “created_on” and that it is a date field, so you can subtract 24 hours from the current time and find the documents whose date is less than or equal to that value.

{
    "created_on": {
        "$lte": new Date((new Date().getTime() - (24 * 60 * 60 * 1000)))
    }
}
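For anyone doing this from Python rather than the mongo shell, the same filter could be built roughly as follows. This is only a sketch: the function names are invented here, "created_on" is the assumed field from the post above, and the second helper is a variation that cuts off at midnight UTC, in case "until yesterday" means calendar days rather than a rolling 24 hours.

```python
from datetime import datetime, timedelta, timezone


def older_than_filter(field: str, now: datetime,
                      age: timedelta = timedelta(hours=24)) -> dict:
    """Filter for documents whose `field` is at least `age` old (rolling window)."""
    return {field: {"$lte": now - age}}


def before_today_filter(field: str, now: datetime) -> dict:
    """Filter for documents whose `field` falls before the start of `now`'s day."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return {field: {"$lt": midnight}}


# Fixed "now" so the example is reproducible.
now = datetime(2020, 2, 19, 15, 30, tzinfo=timezone.utc)
rolling = older_than_filter("created_on", now)
calendar = before_today_filter("created_on", now)
print(rolling)
print(calendar)
```

Either dictionary could be passed to a PyMongo call such as collection.delete_many(rolling); running find with the same filter first is a cheap way to check what would be removed.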

I need the ability to remove the documents for a given date… my documents in the MongoDB collection do not have any date-related elements.

I really do not understand. You do not have or want any field with a date in your documents, yet you want to be able to delete documents for a given date. This is comparable to saying: I want to delete all blue items, but my items have no colour. Impossible.


Not entirely impossible in this case. It depends. I gave two possible alternatives for when there is no dedicated timestamp field.

Given a replica set and using the oplog, I am sure there are other possibilities.

Is the dataset poorly designed for this operation? Almost certainly.


The $natural parameter returns items according to their natural order within the database. This ordering is an internal implementation feature, and you should not rely on any particular structure within it.

I think this can only be assumed if @nagaraju_voora was using Mongoose as an ODM, which adds a createdAt field when its timestamps option is enabled.

I think the best way is to add a created_at field or some other sort of date field. Keeping in mind this is only a dev environment, things should be easy to change, no?
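A minimal sketch of that suggestion, assuming the documents are written from Python: stamp each document with a created_at value just before it is inserted. The helper name with_created_at and the sample document are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def with_created_at(doc: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of `doc` with a created_at timestamp added.

    An existing created_at value is left untouched, so re-stamping a
    document does not overwrite its original creation time.
    """
    stamped = dict(doc)
    stamped.setdefault(
        "created_at", now if now is not None else datetime.now(timezone.utc)
    )
    return stamped


# Fixed timestamp so the example is reproducible.
fixed = datetime(2020, 2, 19, tzinfo=timezone.utc)
user = with_created_at({"id": 1, "name": "example"}, now=fixed)
print(user)
```

The stamped dictionary would then go to insert_one as usual, and later clean-ups can filter on created_at directly.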

Hey @nagaraju_voora, while you can do this, you really probably shouldn’t, at least not this way. I strongly agree with the previous posters that the preferred method would be to add a timestamp to your documents. That being said, you can probably do what you’ve asked using the oplog:


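import pymongo
from datetime import datetime, timedelta, timezone

client = pymongo.MongoClient()
# The oplog only exists on replica set members, and it is a capped
# collection, so older inserts may already have rolled off.
oplog = client.local.oplog.rs
db = client['dev']
# Use a timezone-aware UTC cutoff so it compares correctly against "wall".
previous_day = datetime.now(timezone.utc) - timedelta(days=1)

# Find every insert ("op": "i") into dev.users recorded before the cutoff.
ops = oplog.find({"ns": "dev.users", "op": "i", "wall": {"$lt": previous_day}})

for op in list(ops):
    # Use the stored _id as-is; wrapping it in ObjectId() fails for numeric ids.
    db["users"].delete_one({"_id": op['o']['_id']})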

This code is incredibly hacky and I do not recommend you run it in any environment, not even dev… but as a purely academic exercise of “is it possible”, then the answer is yes.

You can find out more about the oplog in the docs.


I think you just filled the:


What does a document in your collection look like?