Query for different documents with different projections in pymongo

Hi, I have a function written with pymongo that accesses a collection, retrieves all documents that have a specific field value, and then applies a projection specific to that value.
Right now I am doing that in a loop over every pair of field value and projection.
What I’m trying to find out is whether there is a way to “string” queries like this together into just one call to the collection?

def readRawPlotData():
    Methods = Collection.distinct("header.Method")
    plotMethods = [method for method in Methods if method in constants.getPlotColumnsByMethod(method, keys=True)]
    rawPlotData = []
    for method in plotMethods:
        project = {"_id": 0, "header": 1}
        for plotColumn in constants.getPlotColumnsByMethod(method):
            project["data." + plotColumn] = 1
        methodData = Collection.find({"header.Method": method}, project)
        for data in methodData:
            rawPlotData.append(dumps(data))
    return rawPlotData

Hi @Fredrik_Niva, welcome!

I’d assume that the function constants.getPlotColumnsByMethod() returns an array of the desired values for a method.
If that’s the case, you should be able to use the $in operator in find() to avoid querying the database once per method. For example, the query would be:

db.collection.find(
    {"header.Method":{"$in": ["a", "b", "c"]}}, 
    {"_id":0, "header":1, "data.a":1, "data.b":1, "data.c":1}
)
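
In pymongo, the same call could be assembled roughly as follows. This is only a sketch: columns_by_method is a hypothetical dict standing in for whatever constants.getPlotColumnsByMethod() returns per method, and the helper names are made up for illustration. Note that the projection is the union of all columns, so every method gets the same set of fields back.

```python
def build_find_args(columns_by_method):
    """Return (filter, projection) for one find() covering all methods.

    columns_by_method is a hypothetical mapping of method name to the
    list of data columns wanted for that method.
    """
    query = {"header.Method": {"$in": sorted(columns_by_method)}}
    projection = {"_id": 0, "header": 1}
    for columns in columns_by_method.values():
        for column in columns:
            # Union of all requested columns across every method.
            projection["data." + column] = 1
    return query, projection


def read_raw_plot_data(collection, columns_by_method):
    """Run a single find(); `collection` is assumed to be a pymongo Collection."""
    query, projection = build_find_args(columns_by_method)
    return list(collection.find(query, projection))
```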

If you have further questions, it would help to describe how the result of constants.getPlotColumnsByMethod() relates to method. For example, if it is a fixed constant per method, it may be useful to store those values in the documents in the database.

Regards,
Wan

Hi again,
Sorry for not being clear enough, I will try to clarify!
I’ve attached a snippet from my DB to show the structure.


So constants.getPlotColumnsByMethod() gives an array of column names.

The documents might contain columns “A”, “B”, “C”, “D”, “E”, but say Method 1 wants to retrieve only columns “A”, “B”, “C”, Method 2 wants to retrieve “A”, “C”, “D”, Method 3 wants “A”, “D”, “E”, and so forth.

In short I want to pair a specific projection with each method.

I should also note that I’m new to MongoDB and relatively new to Python as well 🙂
Your help is much appreciated.
Sincerely,
Fredrik

Hi @Fredrik_Niva,

Not a problem, thanks for providing an example document. It wasn’t obvious before that data is an array of documents.

If you’re able to store the columns per method on the document, this will save you a round trip back to the client just to check which method needs which columns.
For example, you could add a field plots that holds the method/projection pairing, as in the example below:

{
 "header": {"Method": "cpt", "ID":"185440-CPT", "Group":0}, 
 "plots": ["QC", "NA"], 
 "data": [
     {"index": 0, "QC":10, "FS":2, "TA":3, "NA":7, "NB":132, "NC":245},
     {"index": 1, "QC":11, "FS":22, "TA":33, "NA":77, "NB":232, "NC":900},
 ]
}
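
If the documents already exist, the plots field could be backfilled from Python along these lines. This is a sketch under assumptions: columns_by_method is a hypothetical dict of method name to column list (for instance built from constants.getPlotColumnsByMethod()), and collection is assumed to be a pymongo Collection.

```python
def add_plots_field(collection, columns_by_method):
    """Store each method's plot columns on its documents as a `plots` array.

    Returns the total number of documents modified.
    """
    modified = 0
    for method, columns in columns_by_method.items():
        # One update per method: every document with that method gets the same list.
        result = collection.update_many(
            {"header.Method": method},
            {"$set": {"plots": list(columns)}},
        )
        modified += result.modified_count
    return modified
```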

Then you can use the MongoDB Aggregation Pipeline to project only the data fields whose names appear in plots. For example:

db.collection.aggregate([
    {"$match":{"header.Method": {"$in": ["cpt", "foobar"]}}}, 
    {"$project":{
        "_id": 0, 
        "header": 1, 
        "data": {
            "$map":{
                "input": "$data",
                "as":"x",
                "in": {
                    "$arrayToObject": {
                        "$filter":{
                            "input": {"$objectToArray":"$$x"}, 
                            "as":"y", 
                            "cond": {"$in": ["$$y.k", "$plots"]}
                        }
                    }
                }
            }
        }
    }}
]) 
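
From pymongo, the pipeline above can be passed to aggregate() as a plain Python list. The sketch below wraps it in a hypothetical helper, build_plot_pipeline, and adds a pure-Python function, project_locally (also made up for illustration), that mimics the $project stage so the expected shape of the result can be checked without a database.

```python
def build_plot_pipeline(methods):
    """Build the aggregation pipeline shown above for the given method names."""
    return [
        {"$match": {"header.Method": {"$in": list(methods)}}},
        {"$project": {
            "_id": 0,
            "header": 1,
            "data": {
                "$map": {
                    "input": "$data",
                    "as": "x",
                    "in": {
                        "$arrayToObject": {
                            "$filter": {
                                "input": {"$objectToArray": "$$x"},
                                "as": "y",
                                # Keep only key/value pairs whose key is listed in plots.
                                "cond": {"$in": ["$$y.k", "$plots"]},
                            }
                        }
                    },
                }
            },
        }},
    ]


def project_locally(document):
    """Pure-Python equivalent of the $project stage, for checking expectations."""
    wanted = set(document["plots"])
    return {
        "header": document["header"],
        "data": [
            {key: value for key, value in row.items() if key in wanted}
            for row in document["data"]
        ],
    }
```

With a pymongo Collection this would be called as list(Collection.aggregate(build_plot_pipeline(["cpt", "foobar"]))). Since only keys listed in plots survive the filter, a field such as index is dropped unless it is added to plots as well.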

If you would like to learn more about MongoDB and Python, I’d recommend enrolling in a free online course at MongoDB University https://university.mongodb.com , specifically M220P: MongoDB for Python Developers.

Regards,
Wan.

Thank you so much!
I actually took that course, but it seems I didn’t quite make the most of it (even though that pipeline is a bit more than the course covered) 😃
But seeing your solution, it mostly makes sense; I didn’t think of adding that array to the document myself!
Thanks again for taking the time.
Sincerely,
Fredrik
