validate_lab_shard_collection - reporting a wrong error

I imported the complete dataset, and the count query matches the number of imported documents, but the validation script complains with the error “Incorrect number of documents imported”.

I imported 516784 records. I am not able to attach a screenshot, otherwise I would show proof…

I got the same error. The import worked successfully, and the count shows the correct number:
.
.
2019-03-02T03:26:29.074+0000 [#######################.] m103.products 85.2MB/87.9MB (96.8%)
2019-03-02T03:26:31.970+0000 [########################] m103.products 87.9MB/87.9MB (100.0%)
2019-03-02T03:26:31.970+0000 imported 516784 documents

MongoDB Enterprise mongos> db.products.count()
516784

vagrant@m103:/etc$ validate_lab_shard_collection

Incorrect number of documents imported - make sure you import the entire
dataset.

I am having the exact same problem…

I faced the same issue, and so did a few others.
Please search the forum for validate_lab_shard_collection. Some members ran the validator in a loop until it worked.
I spent a lot of time fixing this. I had to repeat the load with the --drop option.

2019-01-26T08:57:50.940+0000 [########################] m103.products 87.9MB/87.9MB (100.0%)
2019-01-26T08:57:50.940+0000 imported 516784 documents
vagrant@m103:/dataset$

vagrant@m103:/dataset$ validate_lab_shard_collection

Incorrect number of documents imported - make sure you import the entire
dataset.

I got it fixed after I dropped the applicationData database, which had the products collection … I had loaded that during some other lab …

Ramachandra_37567, thanks for the info. I ran it again and it worked.

MongoDB Enterprise m103-repl-2:PRIMARY> db.products.count()
298899

MongoDB Enterprise m103-repl:PRIMARY> db.products.count()
217885

vagrant@m103:/etc$ echo 298899 + 217885|bc
516784

I had the same issue. Please check all the mongo background processes.

vagrant@m103:~/exam$ ps -ef | grep mongo
vagrant 5006 1 1 05:03 ? 00:02:54 mongod -f node1.conf
vagrant 5281 1 2 05:06 ? 00:04:57 mongod -f node4.conf
vagrant 5309 1 3 05:06 ? 00:05:44 mongod -f node5.conf
vagrant 5337 1 3 05:06 ? 00:06:15 mongod -f node6.conf
vagrant 5598 1 1 05:08 ? 00:01:42 mongod -f csrs-2.conf
vagrant 5632 1 1 05:08 ? 00:01:43 mongod -f csrs-1.conf
vagrant 5743 1 1 05:11 ? 00:01:46 mongod -f csrs-3.conf
vagrant 5916 1 0 05:12 ? 00:00:34 mongos -f mongos.conf
vagrant 9022 1 1 06:20 ? 00:01:42 mongod -f node2.conf
vagrant 10537 1 1 06:58 ? 00:00:42 mongod -f node3.conf
vagrant 11912 6719 0 07:51 pts/0 00:00:00 grep --color=auto mongo

On my vagrant box, some nodes of the replica set “m103-repl” were terminated during collection sharding, and I was not able to restart them immediately.

In the log file on one node I found the error “[conn23] out of memory.” with a backtrace.

After a while (spent watching all the lessons), however, I was able to restart the nodes (mongod -f node3.conf), and the validation script passed.
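
A quick way to spot a node that has silently died is to compare the process list against the config files you expect to be running. The sketch below is plain JavaScript, not part of the lab tooling: `missingProcesses` is a hypothetical helper, and it assumes every process was started as `mongod -f <file>` or `mongos -f <file>`, as in the listing above.

```javascript
// Hypothetical helper: given `ps -ef` output, list the expected config files
// that no running mongod/mongos process was started with.
function missingProcesses(psOutput, expectedConfigs) {
  const running = new Set();
  for (const line of psOutput.split("\n")) {
    // Matches commands such as "mongod -f node1.conf" or "mongos -f mongos.conf".
    const match = line.match(/\bmongo[ds] -f (\S+)/);
    if (match) running.add(match[1]);
  }
  return expectedConfigs.filter((conf) => !running.has(conf));
}

// A few lines in the same shape as the `ps -ef | grep mongo` output above.
const samplePs = [
  "vagrant 5006 1 1 05:03 ? 00:02:54 mongod -f node1.conf",
  "vagrant 5916 1 0 05:12 ? 00:00:34 mongos -f mongos.conf",
  "vagrant 11912 6719 0 07:51 pts/0 00:00:00 grep --color=auto mongo",
].join("\n");

console.log(missingProcesses(samplePs, ["node1.conf", "node2.conf", "mongos.conf"]));
```

With this sample input, only node2.conf is reported as missing; the grep line itself is ignored because it does not contain a `mongod -f` or `mongos -f` command.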

Hi, @vinu_prabhu_84718!

I solved the validation error with those instructions.
Thanks a lot!

I had the same issue. Completely restarting my VM, and then each replica set's nodes, the config servers, and mongos, solved this “out of memory” issue.

I dropped the m103 db I had previously created in the m103-repl set and was then able to complete the validation properly. The count was off because of the ‘old’ m103 database. Thanks vinu!

Don’t worry about “Incorrect number of documents imported”. It just tells you that you picked a bad shard key. Everything is fine with the correct shard key.

Thanks Veenu. Dropping applicationData (which had a products collection from a previous lab) fixed the issue.

– Correct row count imported:


2019-04-27T15:48:14.551+0000 [######################…] m103.products 82.5MB/87.9MB (93.9%)
2019-04-27T15:48:16.822+0000 [########################] m103.products 87.9MB/87.9MB (100.0%)
2019-04-27T15:48:16.823+0000 imported 516784 documents

… sharding step completed

– validation failing:
vagrant@m103:~$ validate_lab_shard_collection
Incorrect number of documents imported - make sure you import the entire
dataset.

vagrant@m103:/dataset$ mongo --port 26000 --username m103-admin --password m103-pass --authenticationDatabase admin

MongoDB Enterprise mongos> show databases
admin 0.000GB
applicationData 0.029GB
config 0.001GB
m103 0.061GB
test 0.000GB
testDatabase 0.000GB
MongoDB Enterprise mongos> use m103
switched to db m103
MongoDB Enterprise mongos> db.products.count()
734669 <--------- incorrect row count
MongoDB Enterprise mongos> use applicationData
switched to db applicationData
MongoDB Enterprise mongos> show collections
products
MongoDB Enterprise mongos> db.products.count()
516784
MongoDB Enterprise mongos> db
applicationData
MongoDB Enterprise mongos> db.products.drop()
true
MongoDB Enterprise mongos> use m103
switched to db m103
MongoDB Enterprise mongos> show databases
admin 0.000GB
config 0.001GB
m103 0.061GB
test 0.000GB
testDatabase 0.000GB
MongoDB Enterprise mongos> use m103
switched to db m103
MongoDB Enterprise mongos> db.products.count()
516784 <------- correct row count
MongoDB Enterprise mongos>

– Validation worked.

Is the collection count a MongoDB bug (in a sharded environment)?

Thx
~amrita

Hello all, I had the same issue, and the count result changed after running the “sh.status()” command! That’s really weird. Here’s what I did:
I ran the import, sharded the collection with the selected key, and then ran the count:

MongoDB Enterprise mongos> db.products.count()
516784
MongoDB Enterprise mongos> ^C
bye
vagrant@m103:~$ validate_lab_shard_collection

Congratulations, you got it right! Here's your validation code:
xxxxxxxxxxxxxxxxxxxxxxxxxx

Then I went back to mongos, ran “sh.status()”, and ran the count again:

MongoDB Enterprise mongos> db.products.count()
734669
MongoDB Enterprise mongos> ^C
bye
vagrant@m103:~$ validate_lab_shard_collection

Incorrect number of documents imported - make sure you import the entire dataset.

I had already dropped the applicationData db before, so it’s not interfering with the results.

I was getting an “Incorrect number of documents imported” error from the validator before stumbling across this thread. After multiple re-imports, here are my findings.

From my tests, it turns out that even though sh.status() shows the number of chunks between the shards, it’s actually still doing some work in the background, i.e. sharding is not truly complete. I believe that what the sh.status shows is the way it’s going to split the data but it doesn’t truly confirm that all the work to split the chunks is complete. I’m sure that with a much more powerful server, you won’t notice this background process.
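
As a rough sanity check on that reading, the numbers posted earlier in this thread line up. The MongoDB documentation for db.collection.count() notes that on a sharded cluster the result can be inaccurate if orphaned documents exist or a chunk migration is in progress. The sketch below is only arithmetic on counts quoted above; the idea that the surplus is documents counted on two shards at once is an assumption, not something confirmed here.

```javascript
// Counts quoted earlier in this thread.
const importedTotal = 516784; // documents reported by mongoimport
const inflatedCount = 734669; // db.products.count() through mongos while validation failed
const oneShardCount = 217885; // count reported directly by one replica set

// Assumption: the surplus is documents counted on two shards at once.
const surplus = inflatedCount - importedTotal;

console.log(surplus);                   // 217885
console.log(surplus === oneShardCount); // true
```

The surplus matches one shard's count exactly, which is what you would expect if migrated documents had not yet been cleaned up on the shard they came from.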

In summary:

  1. You don’t need to drop applicationData db because it’s not interfering. It just so happens that during the process of dropping this db, the allocation of chunks actually completes. I can’t imagine that there will be a clash of collection names in an enterprise grade database like MongoDB.
  2. You don’t need to run a loop against the validator. Again, it just so happens that during the process of looping, the allocation of chunks actually completes.
  3. You don’t need to drop the products collection in m103 using db.products.drop(). The --drop option in mongoimport takes care of that.

Here are my workings/solution approach:

  1. Connect to both replica sets and run a products count on both:

Replica 1:
mongo --host "m103-repl/m103:27001" -u "m103-admin" -p "m103-pass" --authenticationDatabase "admin"

use m103
db.products.count()

Replica 2:
mongo --host "m103-repl-2/m103:27004" -u "m103-admin" -p "m103-pass" --authenticationDatabase "admin"

use m103
db.products.count()

  2. If you find that one of the replica sets has a count of 516784, the sharding process is not yet complete. You may find that even though one replica set has a count of 516784, the other replica set returns a count of 200K+… the sharding process is still running.
  3. Keep checking the count on both replica sets until the sum of the two counts equals the total number of products.
  4. Finally, run the validator. You may encounter a timeout issue when you run the validator; just re-run it.
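
The check in the steps above can be sketched as follows. This is a minimal sketch in plain JavaScript (the language of the mongo shell), not the validator's actual logic: `shardCountsSettled` is a hypothetical helper name, and the counts are passed in as plain numbers rather than read from a live cluster. In the shell they would come from db.products.count() on each replica set.

```javascript
// Hypothetical helper: true once the per-replica-set counts add up to the
// number of documents that were imported.
function shardCountsSettled(perShardCounts, expectedTotal) {
  const sum = perShardCounts.reduce((acc, n) => acc + n, 0);
  return sum === expectedTotal;
}

// Total reported by mongoimport in this thread.
const EXPECTED_TOTAL = 516784;

// Counts posted earlier for the two replica sets after balancing finished.
console.log(shardCountsSettled([298899, 217885], EXPECTED_TOTAL)); // true

// One replica set still holds everything while the other already has 200K+.
console.log(shardCountsSettled([516784, 217885], EXPECTED_TOTAL)); // false
```

Only when the helper returns true is it worth running the validator.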

These are my thoughts after spending an hour trying to get to the bottom of this.

Curriculum Engineers, please feel free to chime in.

Hi @007_jb,

Yes, this is an issue that we have been seeing for some users, and we are working on a solution for it.

Until then, dropping the products collection and switching between the shard keys is the quickest workaround to get through this.

We’ll definitely get back to you once we have a concrete solution to this problem.

Thanks,
Muskan
Curriculum Support Engineer