Hello.
Python 3.7.3
pymongo==3.10.1
Mongodb version 4.2.3
in the collection of 100 million documents and every 30min + 300K documents
How to make a request similar to MySQL:
UPDATE table SET status_flag=1 WHERE status_flag=0 LIMIT 300000;
???
This is udate all documents
my_collection.update_many({“status_flag”: 0}, {"$set": {“status_flag”: 1}}).limit(300000)
How to make a limit ?
Hello @Kompaniiets_Denis , welcome to the MongoDB forum.
You can use Bulk Write Operations for this batch update. Using PyMongo:
bulk_request = [ ]
for doc in collection.find( { 'status_flag': 0 } ).limit( 3 ):
bulk_request.append( UpdateOne( { '_id': doc['_id'] }, { '$set': { 'status_flag': 1 } } ) )
result = collection.bulk_write( bulk_request )
print(result.matched_count, result.modified_count)
PyMongo Definitions:
NOTE: Having an index on the query filter field status_flag
can improve the query performance.
index status_flag and ts_utc_start set
bulk_request = [ ]
for doc in col.find( { ‘status_flag’: count_old } ).sort( [ ( ‘ts_utc_start’, 1 ) ] ).limit( 10 ):
bulk_request.append( updateOne( { ‘_id’: doc[ ‘_id’ ] }, { ‘$set’: { ‘status_flag’: count_new } } ) )
result = col.bulk_write( bulk_request )
or
bulk_request.append( UpdateOne( { ‘_id’: doc[ ‘_id’ ] }, { ‘$set’: { ‘status_flag’: count_new } } ) )
gives an error message
Exception name ‘updateOne’ is not defined
or
Exception name ‘UpdateOne’ is not defined
Here is your Pyhton code with proper imports:
import pymongo
from pymongo import UpdateOne
client = pymongo.MongoClient()
collection = client.testDB.testColl
bulk_request = [ ]
for doc in collection.find( { 'status_flag': 0 } ).limit( 2 ):
bulk_request.append(UpdateOne( { '_id': doc['_id'] }, { '$set': { 'status_flag': 1 } } ) )
result = collection.bulk_write( bulk_request )
print(result.matched_count, result.matched_count)
1 Like
@Prasad_Saya thanks for the help
300K documents
for doc in col.find({“status_flag”: 0}).sort( [(“ts_utc_start”, 1)] ).limit(300000):
res = col.update_many({"_id": doc["_id"]}, {"$set": {“status_flag”: 1}})
completed in 2min 04 seconds
bulk_request = [ ]
for doc in col.find( { ‘status_flag’: 0 } ).limit( 300000 ):
bulk_request.append(UpdateOne( { ‘_id’: doc[’_id’] }, { ‘$set’: { ‘status_flag’: 1 } } ) )
result = collection.bulk_write( bulk_request )
completed in 24 seconds
This is a cool result.
1 Like
Stennie_X
(Stennie)
Split this topic
January 9, 2021, 7:45am
6