Pymongo convert timestamp to date as a new field (UNSOLVED!)

I have many entities in my collection and I have to create a new date field in collection to use for future queries.

 {'_id': ObjectId('5afea920d326051990a7f337'), 'created_at': 'Fri May 18 10:21:07 +0000 2018', 'timestamp_ms': '1526638867739'}
    {'_id': ObjectId('5afea920d326051990a7f339'), 'created_at': 'Fri May 18 10:21:08 +0000 2018', 'timestamp_ms': '1526638868310'}
    {'_id': ObjectId('5afea972d326051c5c05bc11'), 'created_at': 'Fri May 18 10:22:30 +0000 2018', 'timestamp_ms': '1526638950799'}
    {'_id': ObjectId('5afea974d326051c5c05bc16'), 'created_at': 'Fri May 18 10:22:32 +0000 2018', 'timestamp_ms': '1526638952160'}
    {'_id': ObjectId('5afea974d326051c5c05bc17'), 'created_at': 'Fri May 18 10:22:32 +0000 2018', 'timestamp_ms': '1526638952841'}

I have many entities in my collection and I have to create a new date field in collection to use for future queries.

{'_id': ObjectId('5afea920d326051990a7f337'), 'created_at': 'Fri May 18 10:21:07 +0000 2018', 'timestamp_ms': '1526638867739'}
{'_id': ObjectId('5afea920d326051990a7f339'), 'created_at': 'Fri May 18 10:21:08 +0000 2018', 'timestamp_ms': '1526638868310'}
{'_id': ObjectId('5afea972d326051c5c05bc11'), 'created_at': 'Fri May 18 10:22:30 +0000 2018', 'timestamp_ms': '1526638950799'}
{'_id': ObjectId('5afea974d326051c5c05bc16'), 'created_at': 'Fri May 18 10:22:32 +0000 2018', 'timestamp_ms': '1526638952160'}
{'_id': ObjectId('5afea974d326051c5c05bc17'), 'created_at': 'Fri May 18 10:22:32 +0000 2018', 'timestamp_ms': '1526638952841'}

I need to convert timestamp_ms into date format like this:

{'_id': ObjectId('5afea920d326051990a7f337'), 'created_at': 'Fri May 18 10:21:07 +0000 2018', 'timestamp_ms': '1526638867739’, 'NewDate': '2018 05 18 10:21:07'}
{'_id': ObjectId('5afea920d326051990a7f339'), 'created_at': 'Fri May 18 10:21:08 +0000 2018', 'timestamp_ms': '1526638868310’, 'NewDate': '2018 05 18 10:21:08'}
{'_id': ObjectId('5afea972d326051c5c05bc11'), 'created_at': 'Fri May 18 10:22:30 +0000 2018', 'timestamp_ms': '1526638950799’, 'NewDate': '2018 05 18 10:22:30'}
{'_id': ObjectId('5afea974d326051c5c05bc16'), 'created_at': 'Fri May 18 10:22:32 +0000 2018', 'timestamp_ms': '1526638952160’, 'NewDate': '2018 05 18 10:22:32'}
{'_id': ObjectId('5afea974d326051c5c05bc17'), 'created_at': 'Fri May 18 10:22:32 +0000 2018', 'timestamp_ms': '1526638952841’, 'NewDate': '2018 05 18 10:22:32'}

I used this code (Python 3.6, pymongo 3.8, mongodb 4.0):

    pipeline = [
        {
            '$addFields': {
                'newDate': {
                    '$toDate': '$timestamp_ms'
                }
            }
        }
    ]
    cursor = collection.aggregate(pipeline)

But gives this error message:
pymongo.errors.OperationFailure: Error parsing date string '1526638867739'; 12: Unexpected character '9'

I am not sure that aggregate is the right method for this task datetime.strptime() can be better for 'created_at' but I haven’t figured out how to implement it into db.Mycollection_update_many().

Hi @enes_75322,

The reason your query did not work is $toDate required the $timestamp_ms field to be of type double and not a string, so we first need to convert it from string to double.
Please try the following pipeline:

[
    {
        '$project': {
            'yearMonthDayUTC': {
                '$convert': {
                    'to': 'double', 
                    'input': '$timestamp_ms'
                }
            }
        }
    }, {
        '$addFields': {
            'newDate': {
                '$toDate': '$yearMonthDayUTC'
            }
        }
    }
] 

Let me know if it does not work.

Kanika

Hi Kanika Singla,
I would like to say thank you for your reply. It works on MongoDB Compass and Python but I have a problem. Even though everything looks good, it hasn’t created new field! Do you have any suggestion for it.
Thanks
You can see the problem on screen shots.

  1. Run aggregation in Python

  2. Check new field on MongoDB Compass

Hi @enes_75322,

You need to update the collection to set this new field in every document. :slight_smile: You can use Bulk Writes for the same. Ticket: Migration does exactly the same.

Let me know if it helps.

Kanika

Hi Kanika Singla,
I’m using pipeline to update each document in the db, isn’t it! Do I need to change anything in my Python code! By the way it is in the first picture.
Thanks
PS: The link in your reply doesn’t work! Ticket: Migration Page not found

Hi Kanika,
I haven’t figured it out how to apply your solution into all documents in the mongoDB. Can you give me any example for it. Bulk write gives a batch error!
Thanks

Hi @enes_75322,

In M220P, we have a lecture in Chapter 3: Bulk Writes which explains in detail on how we should do bulk writes.

But, if you want to learn before hand, here is the documentation link for Python Driver: Bulk Write Operations.

I hope it helps.

Kanika

Hi Kanika,
Actually I checked the documentation before and spent almost one day for error codes unfortunately I haven’t figured it out. I’ll check the Chapter 3 and I hope that it can help me pon this matter. Thanks for your reply and if couldn’t figure it out in end of this week, I’ll bother you again. Sorry for it in advance.
Thanks

No worries @enes_75322 :smiley:

Good luck and let me know!

Kanika

Hi Kanika,
So far I learnt that $project limits the fields and $out create a copy of the collection so I couldn’t use them in my case.
I understood that I need to use collection.update with $set operator. I need to create a new field to store double version oftimestamp and update this field into date time . When I am doing this process, I need to create a condition for the objects that has already updated.
I try to use this code but it didn’t work so far.

collection.update({}, [{'$set':
                                     {'Date':
                                          {'$convert':
                                               {'to': 'double',
                                                'input': '$timestamp_ms'
                                                }
                                           }
                                      }
                                 },
                                {'$set':
                                     {'Date':
                                          {'$toDate': '$Date'
                                           }
                                      }
                                 }
                                ], multi=True) 

It gives this error
TypeError: document must be an instance of dict, bson.son.SON, or other type that inherits from collections.Mapping

Could you give me some feedback to fix it?
Thanks

The syntax for update is described at https://docs.mongodb.com/manual/reference/method/db.collection.update/

In the code you supplied you pass an array as the second argument. It should be a document.

Hi Steevej,
Even though I checked the documentation, I don’t understand it fully because of that I am a rookie. If changed the second part of the code from array to object, how I can add conditional part?
Thanks

As the documentation indicates the query is the first argument.

collection.update({1},{2}) 1 is the query part and in here I can put a conditional!

Hi Kanika and Steevej,
I spent very long time on this problem and this is my final solution:

collection.update({"FixedDate": {"$exists": False}},
                  [{"$set":
                        {"FixedDate":
                             {"$convert":
                                  {"input": "$timestamp_ms",
                                   "to": "double"
                                   }
                              }
                         }
                    },
                   {"$set":
                        {"NewDate":
                             {"$toDate": "$FixedDate"
                              }
                         }
                    }
                   ], multi=True)

I used array [ ] because of that it works like aggregation option in update option. You can see the explanation in here https://stackoverflow.com/a/56570099/4770670

This code give this error:
TypeError: document must be an instance of dict, bson.son.SON, or other type that inherits from collections.Mapping

Can you help me to figure out this problem. I don’t really understand the bulkWrite option. If you can give an example of it is also good for me.

I really need your help to solve this!! Thanks

As already mentioned :

I do not recall any field name starting with an uppercase.

I am sorry for the confusion. FixedDate is a new field and it isn’t originally in the records in mongoDB objects. I used it instead of yearMonthDayUTC in the Kanika’s answer.
At the beginning there isn’t any field such as FixedDate so the condition works for all documents. If I add new document, in this case it will apply only new documents.

Can you write the second part as an object because I don’t understand it!

The second argument you passed is of the form [ … ], that is an array. The error message

tells you that you need an document that is something like { … }. I will refer you again to the documentation: https://docs.mongodb.com/manual/reference/method/db.collection.update

I know that this is […] an array but when I check the example in here https://stackoverflow.com/a/56570099/4770670 and it says that I need to use […] array for aggregate !
Could you give an example for the second part! I need to change the type of timestamp_ms into double and after that I must create a new field for date!

Yes, aggregate() needs an array. But you are calling update().

I already pointed to the update() documentation, as for the aggregate() documentation see https://docs.mongodb.com/manual/reference/method/db.collection.aggregate/