Errors running migration

I’m getting pymongo.errors.OperationFailure: you are over your space quota, using 519 MB of 512 MB when I run movie_last_updated_migration.py or the test. This is the code I modified in the script:

host = "mongodb+srv://m220student:m220password@mflix-nxu6f.gcp.mongodb.net/test"
mflix = MongoClient(host)["mflix"]

# TODO: Create the proper predicate and projection
# add a predicate that checks that the "lastupdated" field exists, and then
# checks that its type is a string
# a projection is not required, but may help reduce the amount of data sent
# over the wire!
predicate = {"lastupdated": {"$exists": "True"}}
projection = None

cursor = mflix.movies.find(predicate, projection)

# this will transform the "lastupdated" field to an ISODate() from a string
movies_to_migrate = []
for doc in cursor:
    doc_id = doc.get('_id')
    lastupdated = doc.get('lastupdated', None)
    movies_to_migrate.append(
        {
            "doc_id": ObjectId(doc_id),
            "lastupdated": parser.parse(lastupdated)
        }
    )

print(f"{len(movies_to_migrate)} documents to migrate")

try:
    # TODO: Complete the UpdateOne statement below
    # build the UpdateOne so it updates the "lastupdated" field to contain
    # the new ISODate() type
    bulk_updates = [UpdateOne(
        {"_id": movie.get("doc_id")},
        {"$set": {"lastupdated": movies_to_migrate}}
    ) for movie in movies_to_migrate]

    # here's where the bulk operation is sent to MongoDB
    bulk_results = mflix.movies.bulk_write(bulk_updates)
    print(f"{bulk_results.modified_count} documents updated")

except InvalidOperation:
    print("no updates necessary")
except Exception as e:
    print(str(e))

Look at what you are updating in $set. We’re using a for loop to iterate over movies_to_migrate to build our UpdateOne statements. You need to get the value of the lastupdated field from each record in movies_to_migrate, just like you’re doing with doc_id.
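
To make that hint concrete, here is a minimal sketch of the difference. It is not the course solution: plain (filter, update) tuples stand in for UpdateOne(filter, update) so the snippet runs without PyMongo, and the ids and dates are made-up sample values.

```python
from datetime import datetime

# Hypothetical sample of what movies_to_migrate might hold after the loop.
movies_to_migrate = [
    {"doc_id": "id-1", "lastupdated": datetime(2015, 8, 26, 0, 3, 45)},
    {"doc_id": "id-2", "lastupdated": datetime(2015, 9, 1, 12, 30, 0)},
]

# As posted: every update carries the whole list as the new field value.
wrong_updates = [
    ({"_id": movie["doc_id"]}, {"$set": {"lastupdated": movies_to_migrate}})
    for movie in movies_to_migrate
]

# As hinted: each update carries only that movie's own converted date.
right_updates = [
    ({"_id": movie["doc_id"]}, {"$set": {"lastupdated": movie["lastupdated"]}})
    for movie in movies_to_migrate
]
```

In the real script, each pair would be wrapped in UpdateOne and the resulting list passed to bulk_write.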

Also, your predicate is missing a check that the type of ‘lastupdated’ is a string.
Try adding a projection, as it will greatly reduce the running time of the script (it saved me a lot of time when re-running it).

Hi Cimes, thanks a lot, your hint on how to iterate over the movies and get the “lastupdated” field helped me a lot. Regards, Tilo

@: predicate = {"lastupdated": {"$exists": "True"}}
You should also check that the data type of this field is string.
@: {"$set": {"lastupdated": movies_to_migrate}}
I think you should set lastupdated to the ISODate value of each movie instead of the whole movies_to_migrate array.

Check the strange quotes around the lastupdated keys.
This code works OK:

predicate = {"lastupdated": {"$exists": True}, "lastupdated": {"$type": 'string'}}
projection = None

bulk_updates = [UpdateOne(
    {"_id": movie.get("doc_id")},
    {"$set": {'lastupdated': movie.get('lastupdated')}}
) for movie in movies_to_migrate]
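
A side note on the predicate above, offered as a sketch rather than a correction of the course solution: a Python dict literal that repeats a key keeps only the last value, so the $exists clause is silently dropped before the query is ever sent. Because a $type check only matches documents where the field is present, the query should still select the same documents, but putting both operators under one key keeps the intent explicit.

```python
# Duplicate key in a dict literal: the later value silently wins.
duplicate_key = {"lastupdated": {"$exists": True}, "lastupdated": {"$type": "string"}}

# Both operators under a single key, so neither one is dropped.
combined = {"lastupdated": {"$exists": True, "$type": "string"}}

print(duplicate_key)
print(combined)
```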

Solved…

Thanks, all.

PS: I forgot to update my host…

Hi, thanks to all for the hints. I’ve made the modifications:

predicate = {"lastupdated": {"$exists": "True"}, "lastupdated": {"$type": "string"}}
projection = {"lastupdated": 1, "_id": 1}

cursor = mflix.movies.find(predicate, projection)

# this will transform the "lastupdated" field to an ISODate() from a string
movies_to_migrate = []
for doc in cursor:
    doc_id = doc.get('_id')
    lastupdated = doc.get('lastupdated', None)
    movies_to_migrate.append(
        {
            "doc_id": ObjectId(doc_id),
            "lastupdated": parser.parse(lastupdated)
        }
    )


try:
    # TODO: Complete the UpdateOne statement below
    # build the UpdateOne so it updates the "lastupdated" field to contain
    # the new ISODate() type
    bulk_updates = [UpdateOne(
        {"_id": movie.get("doc_id")},
        {"$set": {"lastupdated": movie.get("lastupdated")}}
    ) for movie in movies_to_migrate]

    # here's where the bulk operation is sent to MongoDB
    bulk_results = mflix.movies.bulk_write(bulk_updates)
    print(f"{bulk_results.modified_count} documents updated")

except InvalidOperation:
    print("no updates necessary")
except Exception as e:
    print(str(e))

But I still get the quota error:

E pymongo.errors.OperationFailure: you are over your space quota, using 519 MB of 512 MB

mflix_venv/lib/python3.7/site-packages/pymongo/helpers.py:155: OperationFailure

And this error shows up every time I try to open the web app in any way; it’s completely unusable.

@carlosherrera I think the error is the value passed to $exists… you need to pass a boolean value, not a string. Try removing the quotes.

Hi @bigworks, thanks for your reply. I changed it as you pointed out, but I’m still getting the same error:

predicate = {"lastupdated": {"$exists": True}, "lastupdated": {"$type": "string"}}

projection = {"lastupdated": 1, "_id": 1}

Try removing the double quotes in the movie.get call… I’m new to Python, but I think when you use the get method you need to pass the property name in single quotes. I don’t know why… but it worked for me.
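
For what it is worth, Python treats straight single and straight double quotes identically, so switching between them should not change behaviour. What does break pasted code is typographic (curly) quotes, which is probably what retyping the string fixed. The sketch below illustrates this; the curly quotes are written as \u escapes only to keep the snippet itself plain ASCII.

```python
# Straight single and double quotes produce the same string.
same = {'lastupdated': 1} == {"lastupdated": 1}

# The same call as it looks after a copy/paste with curly quotes.
pasted = "movie.get(\u201cdoc_id\u201d)"

try:
    compile(pasted, "<pasted>", "eval")
    curly_ok = True
except SyntaxError:
    # Curly quotes are not valid string delimiters in Python source.
    curly_ok = False

print(same, curly_ok)
```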

Thanks, @bigworks. I too have very little experience in Python (1 year or so). I tried what you pointed out but still get the same error in the tests, and I get it as well when I run the application with python run.py and open the webpage. I think I’ll need to rebuild the database from scratch.

It would be best to restore the database. Also, one thing to point out in this code:

projection = {"lastupdated": 1, "_id": 1}

The _id is redundant here because it is included in the projection by default.

I am still working on this subject.
But in your code, I do not understand which instruction actually changes lastupdated from a string to ISODate().

Hi @PaulRym, as far as I can understand, the cast is made here:

import dateutil.parser as parser

movies_to_migrate.append(
    {
        "doc_id": ObjectId(doc_id),
        "lastupdated": parser.parse(lastupdated)
    }
)

I still don’t understand what produced the ISODate.
Shouldn’t this be done in UpdateOne, where I take the string and transform it into an ISODate? Shouldn’t we be doing that with $set?

Does the parser from dateutil transform lastupdated from a string to an ISODate?

Look at this thread on SO: https://stackoverflow.com/questions/7651064/create-an-isodate-with-pymongo

Indeed, the parser converts the string to a date, which is translated to an ISODate when inserted or updated.
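
A small sketch of that conversion step, under some assumptions: the standard library's datetime.fromisoformat is used here as a stand-in for dateutil's parser.parse so the snippet runs without third-party packages, and the input string is a made-up sample rather than a value from the mflix data. Both functions return a datetime.datetime, which is the Python type PyMongo stores as a BSON date (shown as ISODate in the shell).

```python
from datetime import datetime

# Hypothetical sample value; the real field contents may be formatted differently.
raw = "2015-08-26 00:03:45"

# Stand-in for parser.parse(raw): turn the string into a datetime object.
converted = datetime.fromisoformat(raw)

type_before = type(raw).__name__
type_after = type(converted).__name__
print(type_before, "->", type_after)
```

So the type change happens in Python while movies_to_migrate is being built; $set then simply writes the already converted value.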

Confirmed, I restored the database and everything worked sans problems.

The parse method converts the string to a date that is stored as an ISODate (check the import statements). So, when you build the movies_to_migrate array from the cursor, you are converting the str to a date for each record appended. When you iterate over the movies_to_migrate array, you are building the list of UpdateOne statements. I would imagine a list of a lot of UpdateOne statements (about 46K when I ran it). That list is what gets sent to bulk_write (MongoDB) to execute.
Edit…
I don’t know if there is a way to change the type when executing UpdateOne. If so, I also don’t know whether it is more efficient to do it in Python or in the db. I’m sure we’ll learn that along the way.

At last, after restoring it… it was “Passed”… thanks! :raised_hands: