Aggregation for MongoDB Charts with ; separated data

I have a collection that looks like this.

[
      {
        "_id": {
          "$oid": "5fa80c13752bc1ad5a4abe71"
        },
        "Requirements": {
          "Experience": "3 - years",
          "Languages": ";javascript;SQL;C#;Rust",
          "Education": ";Bachelor's Degree in computer science"
        },
        "Date_posted": {
          "$date": {
            "$numberLong": "1604337459493"
          }
        },
        "Job_title": "Junior Programmer",
        "Search_indeed": "programmer - South Africa",
        "company_name": "Growthpoint Properties",
        "Spoken_Languages": "",
        "location": "Sandton, Gauteng",
        "Site_link": "https://za.indeed.com/jobs?q=programmer&l=South Africa&/rc/clk?jk=e8e8f3aecdfaa26e&fccid=d67f4db96a6da922&vjs=3",
        "payment_type": {
          "type": "",
          "amount": {
            "$numberInt": "0"
          }
        }
      },
      {
        "_id": {
          "$oid": "5fa80c0b752bc1ad5a4abe66"
        },
        "Requirements": {
          "Experience": "3 - year",
          "Languages": ";java;SQL;XML;C#",
          "Education": ";Bachelor's Degree"
        },
        "Date_posted": {
          "$date": {
            "$numberLong": "1600535851513"
          }
        },
        "Job_title": "C# Analyst Programmer (intergration)",
        "Search_indeed": "programmer - South Africa",
        "company_name": "definitive recruitment",
        "Spoken_Languages": "",
        "location": "Stellenbosch, Western Cape",
        "Site_link": "https://za.indeed.com/jobs?q=programmer&l=South Africa&/rc/clk?jk=ce457ee55b1c2ba7&fccid=e7b09f8169c7519f&vjs=3",
        "payment_type": {
          "type": "",
          "amount": {
            "$numberInt": "0"
          }
        }
      }

I’m trying to count all the unique Languages in Requirements.Languages
but I have no idea how I would go about doing this.

How would I go about doing this, I’ve looked at $split, but I’m definitely a little lost with syntax.

Hi @Pieter_van_zyl -

You’re on the right track. You should add a Calculated Field with an expression like this:

{ $split: ['$Requirements.Languages', ';'] }

Or alternatively you could do it in the query bar:

[
   { $addFields: { language: { $split: ['$Requirements.Languages', ';'] } } }
]

HTH
Tom

2 Likes

That works wonderfully !

I do have one problem now, sometimes Requirements.Languages is “” but is still being added to the count.

When go to the chart filters and attempt to filter “EMPTY STRING”
The chart stops displaying.

I cant figure out how to use a match with this
You’re on the right track. You should add a Calculated Field with an expression like this:

{ $split: ['$Requirements.Languages', ';'] },{ $match : { calculatedfieldname: /[A-Z]/ } },

Is that even possible in a calculated field ?

Ah - the problem is that the filtering is happening before the array is being unwound. Since every document probably contains the unwanted empty string, it is resulting in all documents being filtered out.

You can address this by manually doing the unwind in the query, e.g:

[
   { $addFields: { language: { $split: ['$Requirements.Languages', ';'] } } },
   { $unwind: '$language' }, 
   { $match: { language: { $ne: '' } } }
]
1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.