Environment
Server specs
Processors: 2 x Intel Xeon E5-2640 2.50GHz
Memory: 8GB RDIMM, 1333 MHz (Total 32GB RAM)
Network Card Speed: Broadcom 5720 QP 1Gb Network Daughter Card
Operating System: CoreOS
MongoDB Server Version: 3.6.2 (Docker hosted)
Client specs
Processors: Intel Core i7-4790 CPU @ 3.60GHz (8 CPUs), ~3.1GHz
Memory: 16GB RAM
Network Card: Intel® Ethernet Connection (2) I218-V, 1Gb
Operating System: Windows 7 Enterprise
Average data transfer rate
The average data transfer rate between the client and the server is ~90 MB/s.
Sample code
import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.InsertOneModel;
import org.bson.Document;

import java.util.ArrayList;
import java.util.List;

public static void bulkInsert2() {
    MongoClientSettings settings = MongoClientSettings.builder()
            .applyConnectionString(new ConnectionString("mongodb://10.4.1.3:34568/"))
            .build();
    MongoClient mongoClient = MongoClients.create(settings);

    // Unacknowledged writes (w=0), no journal wait.
    WriteConcern wc = new WriteConcern(0).withJournal(false);
    String databaseName = "test";
    String collectionName = "testCollection";
    System.out.println("Database: " + databaseName);
    System.out.println("Collection: " + collectionName);
    System.out.println("Write concern: " + wc);

    MongoDatabase database = mongoClient.getDatabase(databaseName);
    MongoCollection<Document> collection = database.getCollection(collectionName).withWriteConcern(wc);

    int rows = 1000000;
    int iterations = 5;
    int batchSize = 1000;
    double accTime = 0;

    for (int it = 0; it < iterations; it++) {
        // Start every iteration from an empty database.
        database.drop();
        List<InsertOneModel<Document>> docs = new ArrayList<>();
        int batch = 0;
        long totalTime = 0; // milliseconds spent inside bulkWrite only

        for (int i = 0; i < rows; ++i) {
            String key1 = "7";
            String key2 = "8395829";
            String key3 = "928749";
            String key4 = "9";
            String key5 = "28";
            String key6 = "44923.59";
            String key7 = "0.094";
            String key8 = "0.29";
            String key9 = "e";
            String key10 = "r";
            String key11 = "2020-03-16";
            String key12 = "2020-03-16";
            String key13 = "2020-03-16";
            String key14 = "klajdlfaijdliffna";
            String key15 = "933490";
            String key17 = "paorgpaomrgpoapmgmmpagm";
            Document doc = new Document("key17", key17).append("key12", key12).append("key7", key7)
                    .append("key6", key6).append("key4", key4).append("key10", key10).append("key1", key1)
                    .append("key2", key2).append("key5", key5).append("key13", key13).append("key9", key9)
                    .append("key11", key11).append("key14", key14).append("key15", key15).append("key3", key3)
                    .append("key8", key8);
            docs.add(new InsertOneModel<>(doc));
            batch++;

            // Flush a full batch; document construction is not timed.
            if (batch >= batchSize) {
                long start = System.currentTimeMillis();
                collection.bulkWrite(docs);
                totalTime += System.currentTimeMillis() - start;
                docs.clear();
                batch = 0;
            }
        }

        // Flush the remaining partial batch, if any.
        if (batch > 0) {
            long start = System.currentTimeMillis();
            collection.bulkWrite(docs);
            totalTime += System.currentTimeMillis() - start;
            docs.clear();
        }

        accTime += totalTime;
        System.out.println("Iteration " + it + " - Elapsed: " + (totalTime / 1000.0) + " seconds.");
    }

    System.out.println("Avg: " + ((accTime / 1000.0) / iterations) + " seconds.");
    mongoClient.close();
}
Description
When inserting 1 million documents (see the Sample document section) using bulk write, performance starts to degrade once the batch size exceeds 1000. The following are the execution times of the sample code for different batch sizes.
batch size 1000
Iteration 0 - Elapsed: 6.577 seconds.
Iteration 1 - Elapsed: 6.52 seconds.
Iteration 2 - Elapsed: 6.156 seconds.
Iteration 3 - Elapsed: 6.859 seconds.
Iteration 4 - Elapsed: 6.152 seconds.
Avg: 6.452800000000001 seconds.
batch size 5000
Iteration 0 - Elapsed: 7.112 seconds.
Iteration 1 - Elapsed: 6.662 seconds.
Iteration 2 - Elapsed: 6.457 seconds.
Iteration 3 - Elapsed: 6.551 seconds.
Iteration 4 - Elapsed: 6.211 seconds.
Avg: 6.5986 seconds.
batch size 10000
Iteration 0 - Elapsed: 8.049 seconds.
Iteration 1 - Elapsed: 7.528 seconds.
Iteration 2 - Elapsed: 7.664 seconds.
Iteration 3 - Elapsed: 7.462 seconds.
Iteration 4 - Elapsed: 7.396 seconds.
Avg: 7.6198 seconds.
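As a rough sketch, the timings above can be converted into throughput (documents per second) to make the three batch sizes easier to compare. The class and method names here (`ThroughputSketch`, `meanSeconds`, `docsPerSecond`) are made up for illustration; the numbers are the elapsed times listed above, and the row count is the 1 million used by the sample code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThroughputSketch {
    // Number of documents inserted per iteration in the sample code.
    static final int ROWS = 1_000_000;

    // Arithmetic mean of the per-iteration elapsed times, in seconds.
    static double meanSeconds(double[] runs) {
        double sum = 0;
        for (double r : runs) {
            sum += r;
        }
        return sum / runs.length;
    }

    // Documents inserted per second for a given elapsed time.
    static double docsPerSecond(int rows, double seconds) {
        return rows / seconds;
    }

    public static void main(String[] args) {
        // Batch size -> elapsed seconds per iteration, copied from the results above.
        Map<Integer, double[]> runs = new LinkedHashMap<>();
        runs.put(1000, new double[] {6.577, 6.52, 6.156, 6.859, 6.152});
        runs.put(5000, new double[] {7.112, 6.662, 6.457, 6.551, 6.211});
        runs.put(10000, new double[] {8.049, 7.528, 7.664, 7.462, 7.396});

        for (Map.Entry<Integer, double[]> e : runs.entrySet()) {
            double avg = meanSeconds(e.getValue());
            System.out.printf("batch size %d: avg %.4f s, %.0f docs/s%n",
                    e.getKey(), avg, docsPerSecond(ROWS, avg));
        }
    }
}
```

Going by the averages above, batch size 10000 takes roughly 18% longer than batch size 1000 (7.6198 s versus 6.4528 s).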
Is this the expected outcome in relation to batch sizes? Can someone explain why using larger batch sizes causes performance to degrade in this case?
Sample document
{
    "_id" : ObjectId("5f3c2db34063366c39177e64"),
    "key17" : "paorgpaomrgpoapmgmmpagm",
    "key12" : "2020-03-16",
    "key7" : "0.094",
    "key6" : "44923.59",
    "key4" : "9",
    "key10" : "r",
    "key1" : "7",
    "key2" : "8395829",
    "key5" : "28",
    "key13" : "2020-03-16",
    "key9" : "e",
    "key11" : "2020-03-16",
    "key14" : "klajdlfaijdliffna",
    "key15" : "933490",
    "key3" : "928749",
    "key8" : "0.29"
}
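To relate the insert times to the ~90 MB/s transfer rate, it may help to estimate how many bytes one sample document occupies as BSON. The sketch below computes that size by hand from the BSON layout (4-byte length prefix, one element per field, trailing NUL), without using the driver. `BsonSizeSketch` and its methods are hypothetical helper names, the 6.4528 s figure is the batch size 1000 average, and the estimate assumes the `_id` ObjectId is sent with every document and ignores wire-protocol framing.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class BsonSizeSketch {
    // One BSON string element: type byte, key as C string, int32 length,
    // UTF-8 value bytes, trailing NUL.
    static int stringElementSize(String key, String value) {
        int keyBytes = key.getBytes(StandardCharsets.UTF_8).length + 1;
        int valueBytes = value.getBytes(StandardCharsets.UTF_8).length + 1;
        return 1 + keyBytes + 4 + valueBytes;
    }

    // One BSON ObjectId element: type byte, key as C string, 12 bytes of id.
    static int objectIdElementSize(String key) {
        return 1 + key.getBytes(StandardCharsets.UTF_8).length + 1 + 12;
    }

    // Whole document: int32 total length, the elements, trailing NUL.
    static int documentSize(Map<String, String> stringFields, boolean withObjectId) {
        int size = 4 + 1;
        if (withObjectId) {
            size += objectIdElementSize("_id");
        }
        for (Map.Entry<String, String> e : stringFields.entrySet()) {
            size += stringElementSize(e.getKey(), e.getValue());
        }
        return size;
    }

    public static void main(String[] args) {
        // Fields of the sample document, all stored as strings.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("key17", "paorgpaomrgpoapmgmmpagm");
        doc.put("key12", "2020-03-16");
        doc.put("key7", "0.094");
        doc.put("key6", "44923.59");
        doc.put("key4", "9");
        doc.put("key10", "r");
        doc.put("key1", "7");
        doc.put("key2", "8395829");
        doc.put("key5", "28");
        doc.put("key13", "2020-03-16");
        doc.put("key9", "e");
        doc.put("key11", "2020-03-16");
        doc.put("key14", "klajdlfaijdliffna");
        doc.put("key15", "933490");
        doc.put("key3", "928749");
        doc.put("key8", "0.29");

        int bytesPerDoc = documentSize(doc, true);
        double totalMb = bytesPerDoc * 1_000_000.0 / 1e6; // 1 million docs, MB = 10^6 bytes
        double avgSeconds = 6.4528;                       // batch size 1000 average
        System.out.println("Bytes per document: " + bytesPerDoc);
        System.out.printf("Payload for 1M documents: %.1f MB%n", totalMb);
        System.out.printf("Payload rate at %.4f s: %.1f MB/s%n", avgSeconds, totalMb / avgSeconds);
    }
}
```

Comparing the printed MB/s figure with the measured ~90 MB/s gives a rough idea of how close the inserts come to the network ceiling; it is only an approximation of what the driver actually sends.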