Atlas Search Custom Analyzer

We tried to create a custom analyzer in the UI, as shown below:

[
  {
    "baseAnalyzer": "lucene.keyword",
    "charFilters": [],
    "name": "lowerCaseAnalyzer",
    "tokenFilters": [
      {
        "type": "lowercase"
      }
    ],
    "tokenizer": {
      "type": "keyword"
    }
  }
]

We referenced the analyzer in an index, and during the index rebuild it threw the error below:
Your index could not be built: references invalid analyzer “lowerCaseAnalyzer” that has the following error: unrecognized fields [“charFilters”, “tokenFilters”, “tokenizer”]

The analyzer definition accepts only the format below:

{
  "baseAnalyzer": "lucene.keyword",
  "name": "testsearchanalyser"
}

Can anyone share the schema to use for a custom analyzer?

Hi,

We have the same problem and cannot create a custom analyzer at all. We tried what the OP did. It doesn’t surprise us much that it fails, because the fields “charFilters”, “tokenFilters” and “tokenizer” are not defined in the API documentation (https://docs.atlas.mongodb.com/reference/api/fts-analyzers-update-all/). The API lets us submit these fields, but when we then reference the analyzer in the index definition, the index fails to build.

We don’t understand the PUT API, because custom analyzers seem to be defined in the index definition (https://docs.atlas.mongodb.com/reference/atlas-search/analyzers/custom/) when creating a new index. But calling the POST endpoint as described in the documentation returns “Invalid attribute analyzers specified”, and that’s true! The API documentation nowhere mentions the field “analyzers” :sweat_smile:

Here is what we tried:

{
    "collectionName": "myCollection",
    "database": "myDatabase",
    "name": "myIndexName",
    "analyzer": "myAnalyzer",
    "analyzers": [
        {
            "name": "myAnalyzer",
            "charFilters": [],
            "tokenizer": {
                "type": "nGram",
                "minGram": 3,
                "maxGram": 7
            },
            "tokenFilters": []
        }
    ],
    "mappings": {
        "dynamic": false,
        "fields": {
            "label": [
                {
                    "type": "string",
                    "analyzer": "myAnalyzer"
                }
            ]
        }
    }
}    

So, how do we create a custom analyzer?

Hi,
I am having the same trouble. Were you able to get it to work?

Thanks,
Supriya

Hi, same problem here. I want to create analysers alongside my indexes by hitting the API,

e.g.

    {
      name: VALUE_MATCHING,
      mappings: {
        dynamic: false,
        fields: {
          values: {
            type: FieldTypes.DOCUMENT,
            dynamic: false,
            fields: {
              value: {
                type: FieldTypes.STRING,
                analyzer: "englishStemmer",
                searchAnalyzer: "englishStemmer",
              },
            },
          },
        },
      },
      analyzers: [
        {
          name: "englishStemmer",
          tokenizer: {
            type: TokenizerTypes.STANDARD,
          },
          tokenFilters: [
            {
              type: TokenFilterTypes.LOWERCASE,
            },
            {
              type: TokenFilterTypes.SNOWBALL_STEMMING,
              stemmerName: StemmerName.ENGLISH,
            },
          ],
        },
      ],
    }

The PUT endpoint for creating analysers doesn’t seem to support the same depth of customisation.
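
For anyone comparing the snippet above with what the JSON editor expects, here is a minimal Python sketch that resolves the constants to plain strings and prints the resulting JSON. The constant values ("document", "string", "standard", "lowercase", "snowballStemming", "english") and the index name "valueMatching" are assumptions on my part, since the post does not show how they are defined. This only builds the document locally; it does not call any Atlas endpoint.

```python
import json


# Assumed values for the constants used in the post above (not shown there).
class FieldTypes:
    DOCUMENT = "document"
    STRING = "string"


class TokenizerTypes:
    STANDARD = "standard"


class TokenFilterTypes:
    LOWERCASE = "lowercase"
    SNOWBALL_STEMMING = "snowballStemming"


class StemmerName:
    ENGLISH = "english"


VALUE_MATCHING = "valueMatching"  # hypothetical index name


def build_value_matching_index() -> dict:
    """Build the index body from the post with all constants resolved."""
    return {
        "name": VALUE_MATCHING,
        "mappings": {
            "dynamic": False,
            "fields": {
                "values": {
                    "type": FieldTypes.DOCUMENT,
                    "dynamic": False,
                    "fields": {
                        "value": {
                            "type": FieldTypes.STRING,
                            "analyzer": "englishStemmer",
                            "searchAnalyzer": "englishStemmer",
                        },
                    },
                },
            },
        },
        # The custom analyzer is declared next to the mappings that use it.
        "analyzers": [
            {
                "name": "englishStemmer",
                "tokenizer": {"type": TokenizerTypes.STANDARD},
                "tokenFilters": [
                    {"type": TokenFilterTypes.LOWERCASE},
                    {
                        "type": TokenFilterTypes.SNOWBALL_STEMMING,
                        "stemmerName": StemmerName.ENGLISH,
                    },
                ],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_value_matching_index(), indent=2))
```

Printing the resolved JSON makes it easier to diff against whatever the API or UI reports back when it rejects a field.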

Hi,
Has this problem been fixed yet?
On our side, we still have this issue.

It worked when we created the analyzer directly in the index definition.
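
To illustrate what "directly in the index definition" can look like for the OP's case, here is a rough Python sketch that builds such a definition with the "lowerCaseAnalyzer" declared inline (no "baseAnalyzer", just "tokenizer" and "tokenFilters"). The field name "label" is borrowed from an earlier post, and the helper `undeclared_analyzers` is a hypothetical local check, not part of any MongoDB tooling. It also assumes every built-in analyzer name starts with "lucene.".

```python
import json


def build_index_definition(field: str = "label") -> dict:
    """Return an index definition that declares its custom analyzer inline."""
    custom_analyzer = {
        "name": "lowerCaseAnalyzer",
        "charFilters": [],
        "tokenizer": {"type": "keyword"},
        "tokenFilters": [{"type": "lowercase"}],
    }
    return {
        "analyzers": [custom_analyzer],
        "mappings": {
            "dynamic": False,
            "fields": {
                field: {"type": "string", "analyzer": custom_analyzer["name"]},
            },
        },
    }


def referenced_analyzers(node):
    """Yield every analyzer name referenced anywhere inside a mappings tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ("analyzer", "searchAnalyzer") and isinstance(value, str):
                yield value
            else:
                yield from referenced_analyzers(value)
    elif isinstance(node, list):
        for item in node:
            yield from referenced_analyzers(item)


def undeclared_analyzers(definition: dict) -> list:
    """List analyzer names that are used but neither declared nor built in.

    Hypothetical pre-flight check; assumes built-ins are prefixed "lucene.".
    """
    declared = {a["name"] for a in definition.get("analyzers", [])}
    used = set(referenced_analyzers(definition.get("mappings", {})))
    for key in ("analyzer", "searchAnalyzer"):  # index-level defaults
        if isinstance(definition.get(key), str):
            used.add(definition[key])
    return sorted(
        name for name in used
        if name not in declared and not name.startswith("lucene.")
    )


if __name__ == "__main__":
    definition = build_index_definition()
    print(json.dumps(definition, indent=2))
    print("undeclared analyzers:", undeclared_analyzers(definition))
```

Running the check before submitting a definition should catch a mapping that points at an analyzer name missing from the same definition's "analyzers" array. It does not validate the analyzer's own fields.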