MongoDB C# Driver throws System.Text.DecoderFallbackException

This issue is straightforward to reproduce, though I’m not sure why it happens. It is triggered by a Unicode emoji (and probably other Unicode characters) placed after a regular letter in a BsonDocument key, like this:

var bsonDoc = new BsonDocument();
bsonDoc["i😀"] = "invalid";

The driver lets you save a document with this key, but it throws the exception when you query for the document or call ToString on the command, like this:

var settings = MongoClientSettings.FromConnectionString(connectionString);
settings.ClusterConfigurator = builder =>
{
    builder.Subscribe<CommandStartedEvent>(e =>
    {
        var str = e.Command.ToString(); // this throws a System.Text.DecoderFallbackException
    });
};

The full exception is System.Text.DecoderFallbackException : Unable to translate bytes [F0][9F][98] at index -1 from specified code page to Unicode.
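
For what it's worth, the three bytes in the message are the first three of the four-byte UTF-8 encoding of 😀 (F0 9F 98 80), so it looks as if the final byte is being lost somewhere. The following is only a minimal sketch of that idea, not the driver's code: TruncatedEmojiDemo and DecodeThrows are hypothetical names, and the assumption that the key gets truncated is unconfirmed. It shows that a strict UTF8Encoding throws the same exception type when the last byte of the emoji is missing:

```csharp
using System;
using System.Text;

public static class TruncatedEmojiDemo
{
    // Strict UTF-8: throws on invalid bytes instead of substituting U+FFFD.
    private static readonly UTF8Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Returns true if decoding the bytes raises DecoderFallbackException.
    public static bool DecodeThrows(byte[] bytes)
    {
        try
        {
            StrictUtf8.GetString(bytes);
            return false;
        }
        catch (DecoderFallbackException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // "i" followed by U+1F600 (the grinning face emoji).
        byte[] full = StrictUtf8.GetBytes("i\U0001F600");

        // Assumption for illustration only: drop the emoji's last byte.
        byte[] truncated = new byte[full.Length - 1];
        Array.Copy(full, truncated, truncated.Length);

        Console.WriteLine(BitConverter.ToString(full)); // 69-F0-9F-98-80
        Console.WriteLine(DecodeThrows(full));          // False
        Console.WriteLine(DecodeThrows(truncated));     // True
    }
}
```

This only demonstrates that an incomplete emoji sequence produces a DecoderFallbackException; it says nothing about where in the driver the bytes would be cut short.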

Calling ToString on the document before saving it with InsertOneAsync(bsonDoc) succeeds, like this:

var valid = bsonDoc.ToString(); // no exception - shows { "i\ud83d\ude00" : "invalid" }

I was able to reproduce this in the MongoDB.Driver.Examples solution by adding this invalid document to the InsertPrimer test. The exception is thrown either when a query is added to the test to retrieve the document, or when the CommandStartedEvent subscriber is added in PrimerTestFixture and ToString is called on the command. Both scenarios call into System.Text.UTF8Encoding.GetString, which is where the exception originates.

It doesn’t throw the exception when the key is just the Unicode character, like “😀”, or when the letter comes after the character, like “😀i”. It only throws when the key starts with a letter followed by the emoji.

Calling ToString on the document succeeds before saving it, but calling it on the command before the command is executed fails. This suggests that something in the driver is mutating the Unicode key into invalid UTF-8.