How to load a CSV file to MongoDB Atlas?

Hello! I want to load a CSV file into MongoDB Atlas.

How can I do this?

I would use the command line utility mongoimport (a connection example for each cluster is available in the Atlas UI, under the cluster's Command Line Tools tab). More detail here: https://docs.mongodb.com/manual/reference/program/mongoimport/

I used this import string from the Command Line Tools tab:

mongoimport --uri mongodb+srv://tudor_15:<PASSWORD>@cluster0.acapz.mongodb.net/<DATABASE> --collection <COLLECTION> --type <FILETYPE> --file <FILENAME> 

Here I filled in my password, database name, collection name, the CSV file type and my file name.

When I run the command in the command line, it throws the following error:

The syntax of the command is incorrect.

How can I get rid of this error and import the CSV?

Can you paste in the exact command line you are trying to use (except the password)?

Sure, this is the line I’m trying to use:

mongoimport --uri mongodb+srv://tudor_15:<PASSWORD>@cluster0.acapz.mongodb.net/<Database1> --collection <Invoice_Train> --type <CSV> --file <invoice_train.csv>

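Remove the < > brackets. They only mark placeholders in the template, and the Windows command prompt treats < and > as redirection operators, which is why it reports a syntax error.

Something like: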

mongoimport --uri mongodb+srv://tudor_15:PASSWORD@cluster0.acapz.mongodb.net/Database1 --collection Invoice_Train --type CSV --file invoice_train.csv
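
If you would rather launch the import from a script, here is a minimal Python sketch of the same command. The helper names (strip_placeholder_brackets, build_mongoimport_args, run_import) are made up for illustration; only the mongoimport flags come from the commands in this thread. Passing the arguments as a list means no shell gets a chance to interpret < or > at all.

```python
import subprocess


def strip_placeholder_brackets(value):
    """Drop the < > that only mark a placeholder in the Atlas template."""
    if value.startswith("<") and value.endswith(">"):
        return value[1:-1]
    return value


def build_mongoimport_args(uri, collection, csv_path, headerline=True):
    """Build the mongoimport argument list for a CSV import."""
    args = [
        "mongoimport",
        "--uri", uri,
        "--collection", strip_placeholder_brackets(collection),
        "--type", "csv",
        "--file", strip_placeholder_brackets(csv_path),
    ]
    if headerline:
        # Use the first line of the CSV as field names.
        args.append("--headerline")
    return args


def run_import(uri, collection, csv_path):
    """Run mongoimport without a shell (assumes mongoimport is on PATH)."""
    return subprocess.run(build_mongoimport_args(uri, collection, csv_path), check=True)
```

Calling run_import("mongodb+srv://tudor_15:PASSWORD@cluster0.acapz.mongodb.net/Database1", "Invoice_Train", "invoice_train.csv") would then start the import, with PASSWORD replaced by the real one.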

I ran this line of code, without the < > brackets, but I got an error again:

The system cannot find the file specified.

This is strange, because in the command line I am in the file’s folder. I also tried the absolute path, but nothing changed.

How could I get rid of this error?

What is your OS?
If it is Windows, check how the file was saved.
It may have a different extension.
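
One way to see the real file names, including an extension that Windows Explorer may be hiding, is a small Python check like the sketch below. The function name find_name_matches is hypothetical, and the folder and name prefix are whatever applies on your machine.

```python
from pathlib import Path


def find_name_matches(folder, prefix):
    """Return the full names of files in `folder` whose name starts with `prefix`.

    The comparison ignores case, so a file saved as
    'invoice_train.csv.txt' or 'Invoice_Train.CSV' still shows up.
    """
    prefix = prefix.lower()
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        if entry.is_file() and entry.name.lower().startswith(prefix)
    )
```

For example, print(find_name_matches(".", "invoice_train")) lists every file in the current folder whose name begins with invoice_train, with its full extension.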

It’s Windows 10. The file is saved as “invoice_train.csv”.

These are the files I use:

Please show the exact file you are trying to load by running the dir command on your system, and show a few lines from the file.

The sample .csv files from your link appear to be in Excel format.

CSV means comma-separated values.

I tried to load a sample .csv file with your command on my Windows machine. It worked fine.

Here is the output of the “dir” command, run on the folder with the files I’m working with:

05/11/2021  01:03 PM    <DIR>          .
05/11/2021  01:03 PM    <DIR>          ..
08/24/2020  12:30 PM         2,253,632 client_test.csv
08/24/2020  12:30 PM         5,986,133 client_train.csv
05/30/2021  12:35 PM       146,577,388 invoice_test.csv
05/24/2021  08:29 PM       344,346,841 invoice_train.csv
05/11/2021  01:04 PM     1,105,521,210 invoice_train.json
08/24/2020  12:30 PM         2,153,008 SampleSubmission (2).csv
               6 File(s)  1,606,838,212 bytes
               2 Dir(s)  224,879,640,576 bytes free

A few lines from the “invoice_train.csv” file:

client_id,invoice_date,tarif_type,counter_number,counter_status,counter_code,reading_remarque,counter_coefficient,consommation_level_1,consommation_level_2,consommation_level_3,consommation_level_4,old_index,new_index,months_number,counter_type
train_Client_0,2014-03-24,11,1335667,0,203,8,1,82,0,0,0,14302,14384,4,ELEC
train_Client_0,2013-03-29,11,1335667,0,203,6,1,1200,184,0,0,12294,13678,4,ELEC
train_Client_0,2015-03-23,11,1335667,0,203,8,1,123,0,0,0,14624,14747,4,ELEC
train_Client_0,2015-07-13,11,1335667,0,207,8,1,102,0,0,0,14747,14849,4,ELEC
train_Client_0,2016-11-17,11,1335667,0,207,9,1,572,0,0,0,15066,15638,12,ELEC
train_Client_0,2017-07-17,11,1335667,0,207,9,1,314,0,0,0,15638,15952,8,ELEC
train_Client_0,2018-12-07,11,1335667,0,207,9,1,541,0,0,0,15952,16493,12,ELEC
train_Client_0,2019-03-19,11,1335667,0,207,9,1,585,0,0,0,16493,17078,8,ELEC
train_Client_0,2011-07-22,11,1335667,0,203,9,1,1200,186,0,0,7770,9156,4,ELEC
train_Client_0,2011-11-22,11,1335667,0,203,6,1,1082,0,0,0,9156,10238,4,ELEC

Yes, my computer also associates this file with Excel, but since it is too big for Excel, I open it with Notepad. Otherwise Excel keeps only about 1 million rows and drops the rest (the file has 4 million rows).

Are you sure you are in the correct directory?

I downloaded the client_test dump and successfully loaded it into my cluster.

The original file is zipped. When you unzip/extract it, it creates a directory named client_test.csv.
Inside that directory you will see the file client_test.csv.

Yes, absolutely.

Can you show me the command line which you use?

You can try it yourself. Create a simple CSV file from 10-15 records of your file and try mongoimport again. It will work.

The issue could be your path or the file itself.

This is what I used:
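
Since the full file is too large to edit comfortably by hand, a short Python sketch like this one could cut the small test file. The name write_sample and the output file name are hypothetical; it copies the header line plus the first few data rows without loading the whole file into memory.

```python
from itertools import islice


def write_sample(src_path, dst_path, rows=10):
    """Copy the header line plus the first `rows` data lines of a CSV.

    Returns the number of lines written (header included).
    """
    written = 0
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in islice(src, rows + 1):
            dst.write(line)
            written += 1
    return written
```

For example, write_sample("invoice_train.csv", "invoice_sample.csv", rows=15) creates a 16-line file that can be passed to mongoimport with --headerline.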

mongoimport --uri mongodb+srv://m001-student:m001-mongodb-basics@cluster0-xxxx.mongodb.net/client --collection clienttest --type CSV --file client_test.csv --headerline


2021-06-05T19:32:36.025+0530 [########################] client.clienttest 2.15MB/2.15MB (100.0%)
2021-06-05T19:32:36.026+0530 imported 58069 documents

I ran this command with my own credentials and got another error:

2021-06-06T21:51:05.980+0300 error connecting to host: could not connect to server: server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: cluster0-shard-00-02.acapz.mongodb.net:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection() : connection(cluster0-shard-00-02.acapz.mongodb.net:27017[-190]) incomplete read of message header: EOF }, { Addr: cluster0-shard-00-00.acapz.mongodb.net:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection() : connection(cluster0-shard-00-00.acapz.mongodb.net:27017[-189]) incomplete read of message header: EOF }, { Addr: cluster0-shard-00-01.acapz.mongodb.net:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection() : connection(cluster0-shard-00-01.acapz.mongodb.net:27017[-188]) incomplete read of message header: EOF }, ] }

More precisely, this is the command line I ran:

mongoimport --uri mongodb+srv://tudor_15:PASSWORD@cluster0.acapz.mongodb.net/Database1 --collection Invoice_Train --type CSV --file invoice_train.csv --headerline

Were you able to connect to your cluster before?
What is the status of your cluster in Atlas? Are there any errors or alerts?
Have you whitelisted your IP?
It could be a temporary network issue or an SSL issue.
Share your user ID and password so we can check, or create another user and share those credentials.

Actually, I haven’t tried yet. I intended to upload the data to MongoDB Atlas and then connect from Google Colaboratory.

I realized recently that I ran this command while on another wifi network, when my IP address was different. Not having updated the IP address in my cluster could have caused this error.

But when I entered my current IP address and ran the command for uploading the data and the command for checking the cluster status, I got this error:

2021-06-07T12:16:27.432+0300 error connecting to host: could not connect to server: connection() : auth error: sasl conversation error: unable to authenticate using mechanism "SCRAM-SHA-1": (AtlasError) bad auth : Authentication failed.

And when I ran these commands with the IP address set to 0.0.0.0/0, I got the same error.

My id: tudor_15
My password: Centurion15

Bad auth means a wrong combination of user ID and password.
The user ID and password passed in the connection string belong to a database user, not to your Atlas account login.
Did you create the user tudor_15 in the database?
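
A related thing to rule out: characters such as @, :, / or % in a user name or password have to be percent-encoded inside a connection URI, otherwise authentication can fail even with the right credentials. The password posted above has no such characters, so this may not be the cause here. As a minimal Python sketch (build_atlas_uri is a made-up helper name, and the sample password in the usage note is invented):

```python
from urllib.parse import quote_plus


def build_atlas_uri(user, password, host, database):
    """Build a mongodb+srv URI with the credentials percent-encoded."""
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}"
    )
```

build_atlas_uri("tudor_15", "p@ss/word", "cluster0.acapz.mongodb.net", "Database1") gives a URI where the password appears as p%40ss%2Fword, which can be passed to --uri.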

You should first check your connectivity to the cluster, then do the load.
How did you get your connection string?
Did you copy it from Atlas, or build it from another sample string?

Please log in to Atlas and check the status of your cluster.
Is it up and running? Is the cluster ID in the string correct?

Please check your steps again.

In the ‘Cluster’ tab, there is a ‘Command Line Tools’ subtab. It contains templates for the main commands. You take the command, paste it into the command prompt, and replace the word PASSWORD with your own password. If you are importing data, you also replace the other parameters in the string with your own.

The user ID ‘tudor_15’ was already filled in in the command line strings under ‘Command Line Tools’.

How can I check the status of the cluster if the command line gives me an error?

I understand your problem. You cannot check it from the shell.
I am asking you to check the cluster status from Atlas.
When you log in to your Atlas account you can see your cluster status. There should be an alerts tab or button.
After you set up your cluster, you would have created a user.
What is your cluster type, Free Tier or paid, and what steps did you follow?
For MongoDB University students, there are steps on how to set up a Sandbox cluster and load data.

I repeat: unless you have a database user, you cannot connect to your cluster or database from the shell, nor can you perform other tasks such as mongoimport.