MongoDB Kafka Sink Connector

I have been working with Kafka and MongoDB for a few months now. I am running into problems with the Kafka -> MongoDB sink connector, in that it cannot cope with the number of records I’m throwing at it. I’m processing around 100,000 records per second. I don’t expect MongoDB to keep up with that, but I’m only getting about 1,000-2,000 records a second into MongoDB. I’m upserting on the primary key; any tips would be helpful. When we dump data straight in from SQL Server we get much better throughput, so it’s not the server spec. I have increased the batch size and max tasks, to no avail.
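
For reference, my config is roughly along these lines (a trimmed sketch; the connection details, database, and topic names are placeholders, and the id/write-model strategy lines are just one way the primary-key upsert might be wired up):

```properties
# Trimmed sketch of a MongoDB sink connector config (placeholder values).
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
connection.uri=mongodb://<host>:27017
database=<db>
topics=<topic-1>,<topic-2>
# Knobs I have already tried raising:
tasks.max=4
max.batch.size=500
# Upsert keyed on the primary key carried in the message value:
document.id.strategy=com.mongodb.kafka.connect.sink.processor.id.strategy.ProvidedInValueStrategy
writemodel.strategy=com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneDefaultStrategy
```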

Thanks in advance.

Hi @derek_henderson,

Are you doing any post-processing of messages? Are you watching multiple topics with the connector?

The connector only supports a single task, so changing max tasks won’t change the throughput. Did setting a batch size make any difference at all?
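
You can confirm this by checking the connector status through the Kafka Connect REST API (a sketch; the host and connector name are placeholders):

```bash
# Inspect the running connector; the "tasks" array in the response
# shows how many tasks were actually started.
curl -s http://localhost:8083/connectors/mongo-sink/status
```

Even with tasks.max set higher, you should only see a single entry in that tasks array.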

Ross

Hi Ross,

Thanks for the response. I am watching about 50 topics, but I am not doing any post-processing, simply pushing the records straight into collections. Changing the batch size did not make any significant improvement.
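
Given the single-task limit, I may try splitting those 50 topics across several connector instances so each gets its own consumer and write path. A rough sketch against the Connect REST API (the host, connector names, and topic subsets are all placeholders):

```bash
# Hypothetical sketch: register five sink connectors, each subscribed
# to a subset of the ~50 topics (PUT creates or updates a connector).
for i in 1 2 3 4 5; do
  curl -X PUT -H "Content-Type: application/json" \
    "http://localhost:8083/connectors/mongo-sink-$i/config" \
    -d "{
      \"connector.class\": \"com.mongodb.kafka.connect.MongoSinkConnector\",
      \"connection.uri\": \"mongodb://<host>:27017\",
      \"database\": \"<db>\",
      \"topics\": \"<topic-subset-$i>\"
    }"
done
```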

Derek