Lab - Shard a Collection error

I'm having trouble with this lab.
I created the second replica set, m103-repl-2, and launched the CSRS replica set, but when I try to launch the mongos daemon (mongos -f mongos.conf) it fails and I don't know why.
I restarted the Vagrant machine and re-launched the CSRS and the mongos daemon, but I still can't add the second shard, m103-repl-2.

This is the output in my vagrant shell:

vagrant@m103:~$ mongos -f mongos.conf
2019-07-01T22:04:19.778+0000 W SHARDING [main] Running a sharded cluster with fewer than 3 config servers should only be done for testing purposes and is not recommended for production.
about to fork child process, waiting until server is ready for connections.
forked process: 2619
ERROR: child process failed, exited with error number 48
To see additional information in this output, start without the "--fork" option.
vagrant@m103:~$
And this is my mongos.conf file:

sharding:
  configDB: m103-csrs/192.168.103.100:26001
net:
  port: 26000
  bindIp: "127.0.0.1,192.168.103.100"
security:
  keyFile: /var/mongodb/pki/m103-keyfile
systemLog:
  destination: file
  path: /var/mongodb/db/mongos.log
  logAppend: true
processManagement:
  fork: true

Maybe your setup is not complete.
Three replica set nodes, three config servers, and one mongos should be up and running.

Please check rs.status() for the replica set m103-csrs.
It should contain three nodes: 26001, 26002 and 26003.
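For example, something like this should list all three members and their states (assuming the m103-admin / m103-pass user from the lab has already been created on the CSRS):

mongo --port 26001 -u m103-admin -p m103-pass --authenticationDatabase admin \
  --eval 'rs.status().members.forEach(function(m) { print(m.name, m.stateStr); })'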

Hi @Luciano_82816,

Are you able to resolve this error?

Please let me know, if you have any questions.

Thanks,
Sonali

Okay, so I'm actually having the same issue. This is what I see in my database:

set" : "m103-csrs",
	"date" : ISODate("2019-07-02T13:40:20.263Z"),
	"myState" : 1,
	"term" : NumberLong(1),
	"syncingTo" : "",
	"syncSourceHost" : "",
	"syncSourceId" : -1,
	"configsvr" : true,
	"heartbeatIntervalMillis" : NumberLong(2000),
	"optimes" : {
		"lastCommittedOpTime" : {
			"ts" : Timestamp(1562074812, 2),
			"t" : NumberLong(1)
		},
		"readConcernMajorityOpTime" : {
			"ts" : Timestamp(1562074812, 2),
			"t" : NumberLong(1)
		},
		"appliedOpTime" : {
			"ts" : Timestamp(1562074812, 2),
			"t" : NumberLong(1)
		},
		"durableOpTime" : {
			"ts" : Timestamp(1562074812, 2),
			"t" : NumberLong(1)
		}
	},
	"members" : [
		{
			"_id" : 0,
				"name" : "192.168.103.100:26001",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 64303,
			"optime" : {
				"ts" : Timestamp(1562074812, 2),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2019-07-02T13:40:12Z"),
			"syncingTo" : "",
			"syncSourceHost" : "",
			"syncSourceId" : -1,
			"infoMessage" : "",
			"electionTime" : Timestamp(1562040076, 2),
			"electionDate" : ISODate("2019-07-02T04:01:16Z"),
			"configVersion" : 3,
			"self" : true,
			"lastHeartbeatMessage" : ""
		},
		{
			"_id" : 1,
			"name" : "192.168.103.100:26002",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 34485,
			"optime" : {
				"ts" : Timestamp(1562074812, 2),
				"t" : NumberLong(1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1562074812, 2),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2019-07-02T13:40:12Z"),
			"optimeDurableDate" : ISODate("2019-07-02T13:40:12Z"),
			"lastHeartbeat" : ISODate("2019-07-02T13:40:18.305Z"),
			"lastHeartbeatRecv" : ISODate("2019-07-02T13:40:19.555Z"),
			"pingMs" : NumberLong(0),
			"lastHeartbeatMessage" : "",
			"syncingTo" : "192.168.103.100:26001",
			"syncSourceHost" : "192.168.103.100:26001",
			"syncSourceId" : 0,
			"infoMessage" : "",
			"configVersion" : 3
		},
		{
			"_id" : 2,
			"name" : "192.168.103.100:26003",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 34466,
			"optime" : {
				"ts" : Timestamp(1562074812, 2),
				"t" : NumberLong(1)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1562074812, 2),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2019-07-02T13:40:12Z"),
			"optimeDurableDate" : ISODate("2019-07-02T13:40:12Z"),
			"lastHeartbeat" : ISODate("2019-07-02T13:40:18.305Z"),
			"lastHeartbeatRecv" : ISODate("2019-07-02T13:40:19.751Z"),
			"pingMs" : NumberLong(0),
			"lastHeartbeatMessage" : "",
			"syncingTo" : "192.168.103.100:26001",
			"syncSourceHost" : "192.168.103.100:26001",
			"syncSourceId" : 0,
			"infoMessage" : "",
			"configVersion" : 3
		}
	],
	"ok" : 1,
	"operationTime" : Timestamp(1562074812, 2),
	"$gleStats" : {
		"lastOpTime" : Timestamp(0, 0),
		"electionId" : ObjectId("7fffffff0000000000000001")
	},
	"$clusterTime" : {
		"clusterTime" : Timestamp(1562074812, 2),
		"signature" : {
			"hash" : BinData(0,"R6vydQdBAlRbFS0gMk8Q3qi2vkY="),
			"keyId" : NumberLong("6708911045756321818")

===========================================================

vagrant@m103:/var/mongodb$ ps -ef | grep -i mongod
vagrant  22027     1  0 10:20 ?        00:01:24 mongod -f csrs_3.conf
vagrant  22069     1  0 10:25 ?        00:01:22 mongod -f csrs_1.conf
vagrant  22124     1  0 10:31 ?        00:01:20 mongod -f csrs_2.conf
vagrant  23336     1  0 12:03 ?        00:00:44 mongod -f mongod-repl-2.conf
vagrant  23760     1  0 12:18 ?        00:00:37 mongod -f mongod-repl-3.conf
vagrant  24180     1  0 12:53 ?        00:00:24 mongod -f mongod-repl-1.conf
vagrant  24867 21217  0 13:43 pts/0    00:00:00 grep --color=auto -i mongod

============================================================
mongos error

mongos -f mongos.conf

2019-07-02T13:44:57.670+0000 W SHARDING [main] Running a sharded cluster with fewer than 3 config servers should only be done for testing purposes and is not recommended for production.

about to fork child process, waiting until server is ready for connections.

forked process: 24879

ERROR: child process failed, exited with error number 1

To see additional information in this output, start without the "--fork" option.

vagrant@m103:/var/mongodb$

=====================================================
mongos config
sharding:
  configDB: m103-csrs/192.168.103.100:26001
security:
  keyFile: /var/mongodb/pki/m103-keyfile
net:
  bindIp: localhost,192.168.103.100
  port: 26000
systemLog:
  destination: file
  path: /var/mongodb/db/mongos.log
  logAppend: true
processManagement:
  fork: true

Hi @James_68753,

Please refer to the following posts for this error:

Also, re-check your configuration files for the nodes in the m103-repl and m103-csrs replica sets.

Please let me know, if any solution works for you.

Thanks,
Sonali

Hello. Unfortunately this is not helping with my error. I still get the same error messages.

Replied to you in the other thread.
Try to run without --fork and see what error it gives.

Also, change the configDB line as below. I think the single IP you gave should work, but give it a try:

configDB: m103-csrs/192.168.103.100:26001,192.168.103.100:26002,192.168.103.100:26003
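If it still fails the same way, the mongos log usually shows the actual startup error (path taken from your mongos.conf):

tail -n 50 /var/mongodb/db/mongos.log

Alternatively, temporarily set fork: false in mongos.conf and run mongos -f mongos.conf in the foreground so the error is printed straight to the terminal.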

1 Like

Finally I did it!
I had to stop all the processes and then restart them in this order:
first the three CSRS nodes, then the mongos process and, last, the three nodes of m103-repl-2.
Then I connected to the mongos daemon and added the second shard (m103-repl-2).
No problems with the other tasks of the lab: everything went fine.
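For anyone hitting the same thing, roughly what that sequence looks like (the config file names are only examples, borrowed from the ps output earlier in this thread; substitute your own):

# 1. the three config servers
mongod -f csrs_1.conf
mongod -f csrs_2.conf
mongod -f csrs_3.conf

# 2. the mongos
mongos -f mongos.conf

# 3. the three nodes of m103-repl-2
mongod -f node4_conf
mongod -f node5_conf
mongod -f node6_conf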

2 Likes

@Ramachandra_37567 @Simon_39939

I think I'm just confused about how we create m103-repl-2. "We can now initialize m103-repl-2 as a normal replica set." Where do we set this up? I have nodes 4, 5 and 6 configured and csrs 1, 2, 3 also configured.

@Simon_39939 Also, I'm not sure which db to run the command on.

We can now initialize m103-repl-2 as a normal replica set.

Now exit the mongo shell and connect to mongos. We can add m103-repl-2 as a shard with the following command:

sh.addShard("m103-repl-2/192.168.103.100:27004")
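For reference, connecting to the mongos first looks something like this (port 26000 and the m103-admin user as configured earlier in this thread):

mongo --port 26000 -u m103-admin -p m103-pass --authenticationDatabase admin
# then, inside the mongo shell:
#   sh.addShard("m103-repl-2/192.168.103.100:27004")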

When you say you have nodes 4, 5 and 6 configured, do you mean you’ve got their mongod instances running, or do you mean you’ve also configured them to know that they’re all part of the same replica set?

Try connecting the shell to one of those mongods and see what rs.status() displays. If it shows 3 members, one primary and 2 secondaries, then your replica set is configured and good to go. If it shows only one member then you need to add the other 2 nodes to the replica set.

And repeat the process for one of the CSRS nodes.

Once that’s all done, you can start the mongos (if it’s not already running) and connect the shell to the mongos port, which is the way into a sharded cluster, and then add the replica set consisting of nodes 4, 5 and 6 as a second shard of your cluster.
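As a rough sketch of those checks (27004 comes from the addShard command above; 27005 and 27006 are an assumption, use whatever ports your node 4/5/6 config files specify):

# connect to one of the nodes that should form m103-repl-2
# (add -u/-p if you have already created a user on this replica set)
mongo --port 27004

# inside the shell, check membership; if only one member is listed,
# add the other two from the primary:
#   rs.status()
#   rs.add("192.168.103.100:27005")
#   rs.add("192.168.103.100:27006")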

I mean that I have the processes running:

vagrant@m103:/var/mongodb$ ps -ef | grep -i mongod
vagrant   1915     1  0 10:15 ?        00:02:01 mongod -f node4_conf
vagrant   1948     1  0 10:15 ?        00:02:00 mongod -f node5_conf
vagrant   1977     1  0 10:15 ?        00:02:01 mongod -f node6_conf
vagrant   2839     1  1 12:17 ?        00:04:54 mongod -f mongod-repl-1.conf
vagrant   2893     1  0 12:31 ?        00:03:48 mongod -f csrs_1.conf
vagrant   2986     1  0 12:31 ?        00:03:25 mongod -f csrs_2.conf
vagrant   3075     1  0 12:31 ?        00:03:24 mongod -f csrs_3.conf
vagrant   3532  1512  0 12:53 pts/2    00:00:00 vi mongod-repl-1.conf
vagrant   6835  1827  0 19:16 pts/0    00:00:00 grep --color=auto -i mongod

Here is the other CSRS node. It does show that it is a secondary.

MongoDB Enterprise m103-csrs:SECONDARY> rs.status()
{
	"operationTime" : Timestamp(1562354300, 4),
	"ok" : 0,
	"errmsg" : "there are no users authenticated",
	"code" : 13,
	"codeName" : "Unauthorized",
	"$gleStats" : {
		"lastOpTime" : Timestamp(0, 0),
		"electionId" : ObjectId("000000000000000000000000")
	},
	"$clusterTime" : {
		"clusterTime" : Timestamp(1562354300, 4),
		"signature" : {
			"hash" : BinData(0,"QcvxBtZDv6Htk58dw6u3ljkPU80="),
			"keyId" : NumberLong("6709196905894641690")
		}
	}
}

OK, so you have 3 mongods running with the intention of them being a CSRS replica set, and it looks like you’ve initialised the replica set because you’ve got the m103-csrs as part of your command prompt. But you’re getting an error message “there are no users authenticated” in your rs.status() output, which means you haven’t successfully authenticated yourself to the replica set when running mongo. Without authenticating yourself you’re unlikely to get any more meaningful information from any commands.

Have you created the m103-admin user in that replica set? And have you included the -u and -p options on the mongo command line when connecting to it?
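To be concrete, that means either passing the credentials on the command line or authenticating after connecting (user and password as created in the lab):

mongo --port 26001 -u m103-admin -p m103-pass --authenticationDatabase admin
# or, from an already-open shell:
#   use admin
#   db.auth("m103-admin", "m103-pass")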

I'm not able to for some reason, @Simon_39939, still getting that error. How did you do yours?

@James_68753 what are you still not able to do?

Create the m103-admin user in the CSRS replica set?

Connect the shell to the m103-csrs replica set as the m103-admin user?

Validate that the m103-admin user has been created in the CSRS replica set?

Something else?

I guess creating the user. That portion is not in the instructions. Are you creating the user on port 26001?

@Simon_39939

use admin
db.createUser({
  user: "m103-admin",
  pwd: "m103-pass",
  roles: [
    {role: "root", db: "admin"}
  ]
}) 

right?

2019-07-06T18:43:36.759+0000 E QUERY [thread1] Error: couldn't add user: there are no users authenticated :

_getErrorWithCode@src/mongo/shell/utils.js:25:13

DB.prototype.createUser@src/mongo/shell/db.js:1437:15

@(shell):1:1

MongoDB Enterprise m103-csrs:SECONDARY>

That looks like the correct command to create an admin user, but are you running it on the correct mongod?

This is why I suggest including the command prompt in what you share here, it tells us the context in which you are running that command, and gives us a better chance of working out what’s wrong with it.

Failing to create the admin user in the CSRS replica set is a blocker for the rest of the lab. It looks like you might be unable to create that admin user in the CSRS replica set because another user has already been created (and so the localhost exception doesn’t apply any more). Have you created such a user? If you have and it has sufficient privileges then you might be able to log in as it to create the m103-admin user and continue with the lab, otherwise I suspect your only option may be to delete the CSRS replica set, its data files, and all references to it, and build them again from scratch :frowning:
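If it does come to rebuilding, the rough shape of it would be something like this (the data directory paths are hypothetical; check storage.dbPath in each of your csrs_*.conf files before deleting anything):

# stop the three CSRS mongods first, then wipe their data directories
rm -rf /var/mongodb/db/csrs1/* /var/mongodb/db/csrs2/* /var/mongodb/db/csrs3/*

# start them again, connect to one of them via the localhost exception,
# run rs.initiate(), add the other two members, and create the m103-admin
# user before doing anything else
mongod -f csrs_1.conf
mongod -f csrs_2.conf
mongod -f csrs_3.conf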

I sincerely hope there’s an alternative to such drastic action, time to call in the real experts (I’m just a student after all)…

Yeah, I agree. I was trying to avoid having to do that, but it's the easiest way! I'll keep this thread updated after I try again. Thanks!