Mongoimport fails with broken pipe when the file is large

Hi,

I’m trying to import a large dataset: a 21.1GB JSON file containing an array of documents. The import gets through roughly 75% of the file (15.9GB) and then fails with “error inserting documents: write tcp a.b.c.d:e: write broken pipe”.

I have tried this three times and it failed the same way each time. The command I’m using is below (connecting to a mongos instance):

mongoimport -d dbname -c collectionName --host hostname --file filename -j 4 --batchSize=200 --jsonArray

Has anybody faced the same issue? Any recommendations?

Thanks

There may be a message related to the error in the server logs, which could provide more detail.
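For example, on a Linux package install you could search the mongos log for connection errors around the time of the failure (the log path below is an assumption; adjust it to your systemLog.path setting, and note that the exact wording of the log messages varies by server version):

grep -iE "socketexception|end connection" /var/log/mongodb/mongos.log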

There is also some related information at https://jira.mongodb.org/browse/TOOLS-379

Hi,

Thanks for the reply… However, if a document is bigger than 16MB, MongoDB skips importing that particular document and a message is printed on the command line… So this looks like something else, I guess… Will post here if I find out the reason…
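For anyone else checking the same thing, here is a rough way to scan a large JSON array file for documents near the 16MB limit (this assumes jq is installed, uses jq’s --stream mode since a 21GB file won’t fit in memory, and the JSON text length is only an approximation of the BSON size):

jq -cn --stream 'fromstream(1|truncate_stream(inputs))' filename | awk 'length($0) > 16*1024*1024 { print "document " NR " is " length($0) " bytes" }'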

Hi @Laksheen_Mendis,

What does mongoimport --version report, and what specific version of the MongoDB server are you importing into? Also, what type of deployment do you have (standalone, replica set, or sharded cluster)?

Finally, how large are your documents on average? (You can check the imported documents via db.collectionName.stats().avgObjSize.) If you have large documents, you may want to try further reducing the --batchSize value.
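For example (the batch size of 50 below is just an illustrative starting point, not a tuned value):

mongoimport -d dbname -c collectionName --host hostname --file filename -j 4 --batchSize=50 --jsonArray

You can check the average document size from the mongo shell with:

mongo hostname/dbname --eval "db.collectionName.stats().avgObjSize"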

Regards,
Stennie
