Hello Everyone,
Today I ran into a roadblock in my development, and I can't seem to find anyone else having this issue online.
I'm attempting to use MongoDB Data Lake to import a CSV file into my cluster every week, about 1 million records.
For that I have set up an aggregation pipeline with some transformations and an $out stage at the end, like this:
{
  "$out": {
    "atlas": {
      "clusterName": "Cluster",
      "db": "source",
      "coll": "collection"
    }
  }
}
This follows the documentation here:
https://docs.mongodb.com/datalake/reference/pipeline/out
The $out stage is the last step. The pipeline works fine from a mongo client, but when I run it from Python:
def extract_new_movers_population():
    pipeline = extract_transform
    res = collection.aggregate(pipeline=pipeline, allowDiskUse=True)
    print(res)
I get this output:
pymongo.errors.OperationFailure: If an object is passed to $out it must have exactly 2 fields:
'db' and 'coll', full error: {'operationTime': Timestamp(1612190585, 1), 'ok': 0.0,
'errmsg': "If an object is passed to $out it must have exactly 2 fields:
'db' and 'coll'", 'code': 16994, 'codeName': 'Location16994', '$clusterTime': {'clusterTime':
Timestamp(1612190585, 1), 'signature': {'hash': b')\xb6\x15\xfa\x03\x1cv\x0b\xa1\xef\xa5\x0c\x0c\x0c^\xe7\x9d/\xa2\x1f', 'keyId': 6922836924319137794}}}
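The error text says the server that received the command only accepts an $out object with exactly the two fields 'db' and 'coll', while the pipeline above uses the "atlas" form from the Data Lake documentation. A minimal sketch of a check that can be run before calling aggregate, to confirm that $out is the last stage and to see which form it uses. The helper name describe_out_stage and the empty $match stage are hypothetical stand-ins, since the real transformation stages are not shown here. Whether the mismatch comes from the connection used in Python is an assumption, not something confirmed by the output above.

```python
from typing import Any, Dict, List


def describe_out_stage(pipeline: List[Dict[str, Any]]) -> str:
    """Return which form of $out the final pipeline stage uses."""
    if not pipeline or "$out" not in pipeline[-1]:
        raise ValueError("$out must be the last stage of the pipeline")
    target = pipeline[-1]["$out"]
    if isinstance(target, str):
        # Plain collection name, e.g. {"$out": "collection"}
        return "collection-name"
    if isinstance(target, dict) and set(target) == {"db", "coll"}:
        # The two-field form named in the error message
        return "db-coll"
    if isinstance(target, dict) and "atlas" in target:
        # The form shown in the Data Lake documentation linked above
        return "atlas"
    return "unknown"


# Hypothetical placeholder stage; the real transformations are not shown in the post.
example_pipeline = [
    {"$match": {}},
    {"$out": {"atlas": {"clusterName": "Cluster", "db": "source", "coll": "collection"}}},
]

print(describe_out_stage(example_pipeline))
```

If this reports "atlas" while the server still complains about 'db' and 'coll', it may be worth comparing the connection string used by the mongo client that works with the one used by the Python code.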
With the Motor driver it gets even worse, because the driver swallows the output and never reports the error.
Hope I can find a captain