Find unused collection from MongoDB

Hello, Is there any way to find when was the collection last (read/update/insert), in case collection, have no particular date field that maintains that record. I checked multiple forums but no luck.
MongoDB version using 3.4

Hi @AayushiMangal,

There are two options that I can think of for your specific use case:

  1. On a replica set, you can use the oplog.rs collection for update and insert operations only:
    use local
    db.oplog.rs.find({op: {$in: ["u","i"]}, ns: "<db>.<collection>"}).pretty()
  2. Audit filter on CRUD/DML (read, update and insert/create) operations. But note that this will impact write performance.

Hello 007_jb,

Thank you for your response! I tried for oplog collection, with the aggregation operation like this, but it is taking too long since data is much.

And regarding the audit filter, we will not be able to get the details of these operations from the past? I believe no.

db.oplog.rs.aggregate([
{
$group:
{
_id:0,
ns: {$addToSet: ‘$ns’},
op: {$addToSet: ‘$op’},
ts: {$addToSet: ‘$ts’},
}}
]).forEach(function(Doc){print(Doc.ns + “,” + Doc.op + “,” + Doc.ts);})

My purpose is to find out what is the collection that is not being used recently (read/write/insert) at all.

That’s correct. It starts working from the point you create it.

The oplog.rs collection is not an indexed collection (db.oplog.rs.getIndexes()) and in your code, there’s no grouping operation being applied and the forEach iteration is also a bottleneck. Revised code below:

db.oplog.rs.aggregate([
   {
      $match: {
	     op: {$in: ["u","i"]}
	  }
   },
   {
      $group: {
	     _id: {
		    ns: "$ns",
			op: "$op"
		 },
		 count: {$sum: 1},
		 ts_min: {$min: "$ts"},
		 ts_max: {$max: "$ts"}
	  }
   }
]).pretty()

And if it’s still taking a very long time, then I would suggest that you take a copy of the oplog, insert it into a standalone test db, apply appropriate indexes and re-run the query above.

Hello,

But we cannot create index on oplog.rs collection, I checked that and found that index creation is not possible for system collections.

Thank you so much for responding it…

I am aware of that and that was not what I suggested. I said, take a copy and push the data into a test collection in a standalone db, and create the index on this test collection.

Again, test the code I gave you first to see how long it takes before you consider pushing the oplog data to another test collection.

mongotop might also be useful

thank you so much, that is helpful!

Good! So what did you do in the end?

I took copy of data to different collection name and I will be able to work. Thank you!

You’re welcome!

Another one to consider is $indexStats on each collection… i.e. unused index somewhat indicates unused collection.