Bash script to create a replica set

Hello everybody.

I am working on a MongoDB replica set in AWS. My goal is to configure the replica set at run time with a bash script.

My bash script looks like this:

mongo mongodb://10.0.1.100 --eval "rs.initiate( { _id : 'rs0', members: [{ _id: 0, host: '10.0.1.100:27017' }]})"
mongo mongodb://10.0.1.100 --eval "rs.add( '10.0.2.100:27017' )"
mongo mongodb://10.0.1.100 --eval "rs.add( '10.0.3.100:27017' )"
mongo mongodb://10.0.1.100 --eval "db.isMaster().primary"
mongo mongodb://10.0.1.100 --eval "rs.slaveOk()"

But when I log in to my instance and run rs.status(), I get an error saying that no config could be found.

So I tried a different way. I connected to my fresh mongo instance and, through the mongo command line, entered the config variable like this:

var config={_id:"rs0",members:[{_id:0,host:"10.0.1.100:27017"}, {_id:1,host:"10.0.2.100:27017"}, {_id:2,host:"10.0.3.100:27017"}]};
> rs.initiate(config);

If I then run rs.status(), it works.

I would like to run the same command from a Linux bash script to initiate the config, but I can't find a solution. Any help, please?

Hi @Hamza_El_Aouane,

The script layout you have now should in theory at least initiate the replicaSet on the 10.0.1.100 instance, but I could see it failing to add the other instances as the replicaSet might not be ready when those commands are sent to the mongod instance.

My question here is, if you’re just going to add members right away, why not add those members to the initiate() method call?

rs.initiate({
    _id : 'rs0',
    members: [
        {'_id': 0, 'host': '10.0.1.100:27017' },
        {'_id': 1, 'host': '10.0.2.100:27017' },
        {'_id': 2, 'host': '10.0.3.100:27017' }
    ]
})

This will save you some hassle. If you’re trying to make sure that the instance on 10.0.1.100 is the primary, you can add priority: 2 to the block for that instance, and it will be the primary member as long as it is up and accessible to the other two nodes.
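As a rough sketch of the single-call approach with a priority on the first member, a small shell helper could build the rs.initiate(...) JavaScript from a list of hosts. The function name build_rs_initiate is made up for illustration, and the set name and addresses are just the ones from this thread.

```shell
# Build the JavaScript for one rs.initiate() call from a replica set name
# and a list of host:port members. The first member gets priority: 2 so it
# is preferred as primary. build_rs_initiate is a hypothetical helper name.
build_rs_initiate() (
    set_name="$1"
    shift
    members=""
    id=0
    for host in "$@"; do
        entry="{_id: ${id}, host: \"${host}\""
        if [ "$id" -eq 0 ]; then
            entry="${entry}, priority: 2"
        fi
        entry="${entry}}"
        if [ -n "$members" ]; then
            members="${members}, ${entry}"
        else
            members="${entry}"
        fi
        id=$((id + 1))
    done
    printf 'rs.initiate({_id: "%s", members: [%s]})\n' "$set_name" "$members"
)

# Example (not run here; needs a reachable mongod started with replSetName rs0):
#   mongo --host 10.0.1.100 --port 27017 \
#       --eval "$(build_rs_initiate rs0 10.0.1.100:27017 10.0.2.100:27017 10.0.3.100:27017)"
```

Generating the string keeps the member list in one place, so the same script can be reused if the addresses change.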

I’m not sure what your ultimate plan is with this bash script so it’s hard to tell you the best way about setting it up.

Thank you very much for your reply. You are right.

But I still have the same problem.

If I put the code in my bash script:

#!/bin/bash

sudo systemctl enable mongod
sudo systemctl start mongod

mongo rs.initiate({
    _id : 'rs0',
    members: [
        {'_id': 0, 'host': '10.0.1.100:27017' },
        {'_id': 1, 'host': '10.0.2.100:27017' },
        {'_id': 2, 'host': '10.0.3.100:27017' }
    ]
})

and start my instance, it seems that the mongo console does not execute this code. But if the instance is up and running and I open the mongo console and paste the same code, then it works just fine.

You need to give the mongo service time to start up. Put some type of delay in the script, after starting the service. Also note that the mongo command as you have it will fail. You still need to run the rs.initiate(...) through --eval.
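One possible way to do the delay, instead of a fixed sleep, is to retry a cheap command until mongod answers and only then send the rs.initiate(...) through --eval. This is only a sketch: wait_for and mongod_ready are hypothetical helper names, and the 127.0.0.1:27017 address is an assumption about where the local mongod listens.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# wait_for is a hypothetical helper name.
wait_for() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Only pause when another attempt is still to come.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Succeeds once the local mongod accepts a connection and answers a ping.
# The host and port here are assumptions; adjust them to your instance.
mongod_ready() {
    mongo --host 127.0.0.1 --port 27017 --quiet --eval 'db.adminCommand("ping")'
}

# Example (not run here): try for up to 60 seconds, then initiate.
#   wait_for 30 2 mongod_ready && mongo --host 127.0.0.1 --port 27017 --eval 'rs.initiate(...)'
```

Polling like this means the script continues as soon as the service is up, and fails clearly if it never comes up.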

I have some questions:

  • What are you trying to accomplish with this script?
  • Why run an rs.initiate() via a shell script? You only need to do this once.
  • Why would you want to enable and start the service every time? Shouldn’t the service already be running since enable tells the service to start up after a system reboot?

Knowing what you’re trying to accomplish will allow us to give you more input into how you should go about doing what you’re trying to do. A script like this, as it stands now, doesn’t make a lot of sense to me.

One other thing: if this is to set up test systems, you might want to look at mlaunch, which is part of the larger mtools set of tools and is maintained by @Stennie!

mlaunch makes setting up test clusters/replica sets a breeze. These systems will all run on the same machine so it’s not for production, but for testing having multiple mongod instances on the same host is not a big deal.

Again it all depends on what you’re trying to accomplish on whether this makes sense for you to use or not.

I am a junior DevOps engineer, and I am building (for training) a high-availability, failover infrastructure: 3 app instances serving an app homepage behind autoscaling and a load balancer, and, as the database, a MongoDB replica set across 3 availability zones in AWS.

The point of this training is to spin up all the infrastructure automatically by running Terraform.

Everything else works; I am missing just the MongoDB part.

That's why I need this script: when my instance is spun up, it should have all the configuration in place to work, automating everything without my having to run any command once the infrastructure is up and running.

I tried delaying the script by 2 minutes before executing it, but still nothing. If I SSH into the instance after 3 minutes and run rs.status(), I get:

> rs.status()
{
	"info" : "run rs.initiate(...) if not yet done for the set",
	"ok" : 0,
	"errmsg" : "no replset config has been received",
	"code" : 94
}

When you run your shell script, do you get any errors?

Having this in a shell script works just fine for me to init a three node replica set running locally on my Mac:

mongo --host 127.0.0.1 --port 27017 --eval 'rs.initiate({_id: "testing", members: [{_id: 0, host: "127.0.0.1:27017", priority: 2}, {_id: 1, host: "127.0.0.1:27018"}, {_id: 2, host: "127.0.0.1:27019"}]})'

Make sure to replace the host IP with your machine’s IP and testing with the actual name of your replicaSet (the same value you passed to mongod --replSet). Make sure that all machines have port 27017 open to each other and to the machine running the script (if it’s not one of the machines in the replica set). This should already be the case since you can do it manually inside of the mongo shell, but it is something you will want to check.

You will want to check the MongoDB logs:

grep REPL mongologfile

Replace mongologfile with the path and file name of your log. Any replication operations will have the term REPL in the log line. You should see a lot of these as the replica set initializes and elections are performed.

If none of that is happening, then I’m not sure what’s going on. Without having logs it’s hard to troubleshoot.
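If the log is busy, a narrower filter than a plain grep for REPL may help. The sketch below prints only the REPL lines at warning, error or fatal severity. It assumes the plain-text log layout seen in this thread (timestamp, severity letter, component, then the message), and repl_problems is a made-up helper name.

```shell
# Print only REPL log lines at warning (W), error (E) or fatal (F) severity.
# Assumes the plain-text log layout:
#   <timestamp> <severity> <component> [<context>] <message>
# repl_problems is a hypothetical helper name.
repl_problems() {
    awk '$3 == "REPL" && ($2 == "W" || $2 == "E" || $2 == "F")' "$1"
}

# Example (not run here): pass the path and file name of your own log.
#   repl_problems mongologfile
```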

I honestly have no clue why this is happening. My security groups (for testing purposes) are open to all kinds of traffic from anywhere.

I put:

sleep 60;
mongo --host mongodb://10.0.1.100 --port 27017 --eval 'rs.initiate({_id: "rs0", members: [{_id: 0, host: "10.0.1.100:27017", priority: 2}, {_id: 1, host: "10.0.2.100:27017"}, {_id: 2, host: "10.0.3.100:27017"}]})'

but I am still unable to set up the replica set from bash.

My MongoDB log shows this:

2020-05-12T01:10:51.372+0000 I REPL     [initandlisten] Did not find local voted for document at startup.
2020-05-12T01:10:51.372+0000 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument: Did not find replica set configuration document in local.system.replset

One more thing: in my mongod.conf I am binding to 0.0.0.0. This is my mongod.conf:

# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# network interfaces
net:
  port: <%= @port %>
  bindIp: <%= @bindIp %>


#processManagement:

#security:

#operationProfiling:

#replication:
replication:
  replSetName: rs0

#sharding:

## Enterprise-Only Options:

#auditLog:

#snmp:

where port is 27017 and bindIp is 0.0.0.0.

If I run the same command from the mongo console, I get this output in my MongoDB log:

I REPL     [conn161] replSetInitiate admin command received from client
2020-05-12T01:30:39.128+0000 W NETWORK  [conn161] getaddrinfo("ip-10-0-1-100") failed: Name or service not known
2020-05-12T01:30:39.128+0000 I NETWORK  [conn161] getaddrinfo("ip-10-0-1-100") failed: Name or service not known
2020-05-12T01:30:39.128+0000 E REPL     [conn161] replSet initiate got NodeNotFound: No host described in new configuration 1 for replica set rs0 maps to this node while validating { _id: "rs0", version: 1, members: [ { _id: 0, host: "ip-10-0-1-100:27017" } ] }

OK, the first log file snippet you show is normal. This is what MongoDB will show when you first start mongod with the --replSet option. Shortly after that you should see that MongoDB is waiting for connections on a given port:

2020-05-11T19:44:08.778-0600 I REPL     [initandlisten] Did not find local voted for document at startup.
2020-05-11T19:44:08.778-0600 I REPL     [initandlisten] Did not find local Rollback ID document at startup. Creating one.
2020-05-11T19:44:08.778-0600 I STORAGE  [initandlisten] createCollection: local.system.rollback.id with generated UUID: ec76c67d-5cd1-4900-9c1b-0814ae205bff
2020-05-11T19:44:08.855-0600 I REPL     [initandlisten] Initialized the rollback ID to 1
2020-05-11T19:44:08.856-0600 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument: Did not find replica set configuration document in local.system.replset
2020-05-11T19:44:08.856-0600 I CONTROL  [LogicalSessionCacheReap] Sessions collection is not set up; waiting until next sessions reap interval: Replication has not yet been configured
2020-05-11T19:44:08.856-0600 I CONTROL  [LogicalSessionCacheRefresh] Sessions collection is not set up; waiting until next sessions refresh interval: Replication has not yet been configured
2020-05-11T19:44:08.856-0600 I NETWORK  [initandlisten] waiting for connections on port 27017

In the second log snippet, the getaddrinfo call that MongoDB makes fails to find the host ip-10-0-1-100. Can you ping that host name? If not, you would need to try the fully qualified domain name or add the short host name to your /etc/hosts file with the associated IP address.

I had my /etc/hosts set as:

127.0.0.1 localhost
10.0.1.100
10.0.2.100
10.0.3.100
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

but it still cannot find that host. Should I set the host IP in a different way?

Change that to:

10.0.1.100  ip-10-0-1-100
10.0.2.100  ip-10-0-2-100
10.0.3.100  ip-10-0-3-100

Machines can reach IP addresses directly. What sometimes fails is mapping host names to IP addresses.

This might not fix it either, since I believe you said rs.initiate(...) worked when you originally ran it from the mongo shell.
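Since the provisioning is meant to be automated, the hosts entries could be generated rather than typed by hand. A minimal sketch, assuming the host names always follow the ip-A-B-C-D pattern that shows up in the log (hosts_line is a hypothetical helper name):

```shell
# Print one /etc/hosts line that maps an IPv4 address to the short host
# name pattern seen in the log (10.0.1.100 -> ip-10-0-1-100).
# hosts_line is a hypothetical helper name.
hosts_line() {
    printf '%s  ip-%s\n' "$1" "$(printf '%s' "$1" | tr '.' '-')"
}

# Example (not run here): append entries for all three members.
#   for ip in 10.0.1.100 10.0.2.100 10.0.3.100; do hosts_line "$ip"; done | sudo tee -a /etc/hosts
```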

OK, so changing the IP mapping in the hosts file as you suggested prevented the earlier error. Now I get this:

2020-05-12T02:25:58.694+0000 I REPL     [initandlisten] Did not find local voted for document at startup.
2020-05-12T02:25:58.695+0000 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument: Did not find replica set configuration document in local.system.replset
2020-05-12T02:25:58.695+0000 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/var/lib/mongodb/diagnostic.data'
2020-05-12T02:25:58.695+0000 I NETWORK  [initandlisten] waiting for connections on port 27017
2020-05-12T02:25:58.695+0000 I NETWORK  [HostnameCanonicalizationWorker] Starting hostname canonicalization worker

and nothing else after. Should I change something in the rs.initiate call? When I run rs.status() in mongo, I still get this error:

> rs.status()
{
	"info" : "run rs.initiate(...) if not yet done for the set",
	"ok" : 0,
	"errmsg" : "no replset config has been received",
	"code" : 94
}

I am really sorry to bother you with this, but I have never used MongoDB before and I am still learning.

I have something new. I passed the mapped IP to the shell script, and when I start my instance the replica set is still not set up automatically. The funny thing is that if I open the mongo console and type rs.initiate(), it does initiate the current instance as primary but ignores the other members.

My rs.status() output looks like:

rs0:PRIMARY> rs.status()
{
	"set" : "rs0",
	"date" : ISODate("2020-05-12T04:16:58.695Z"),
	"myState" : 1,
	"term" : NumberLong(1),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"members" : [
		{
			"_id" : 0,
			"name" : "ip-10-0-1-100:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 68,
			"optime" : {
				"ts" : Timestamp(1589257005, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2020-05-12T04:16:45Z"),
			"infoMessage" : "could not find member to sync from",
			"electionTime" : Timestamp(1589257004, 2),
			"electionDate" : ISODate("2020-05-12T04:16:44Z"),
			"configVersion" : 1,
			"self" : true
		}
	],
	"ok" : 1
}

while in my MongoDB log I got:

2020-05-12T04:16:44.282+0000 I COMMAND  [conn17] initiate : no configuration specified. Using a default configuration for the set
2020-05-12T04:16:44.282+0000 I COMMAND  [conn17] created this configuration for initiation : { _id: "rs0", version: 1, members: [ { _id: 0, host: "ip-10-0-1-100:27017" } ] }
2020-05-12T04:16:44.282+0000 I REPL     [conn17] replSetInitiate admin command received from client
2020-05-12T04:16:44.282+0000 I REPL     [conn17] replSetInitiate config object with 1 members parses ok
2020-05-12T04:16:44.282+0000 I REPL     [conn17] ******
2020-05-12T04:16:44.282+0000 I REPL     [conn17] creating replication oplog of size: 990MB...
2020-05-12T04:16:44.288+0000 I STORAGE  [conn17] Starting WiredTigerRecordStoreThread local.oplog.rs
2020-05-12T04:16:44.288+0000 I STORAGE  [conn17] The size storer reports that the oplog contains 0 records totaling to 0 bytes
2020-05-12T04:16:44.288+0000 I STORAGE  [conn17] Scanning the oplog to determine where to place markers for truncation
2020-05-12T04:16:44.320+0000 I REPL     [conn17] ******
2020-05-12T04:16:44.344+0000 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "ip-10-0-1-100:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5eba232c528c300e954ae166') } }
2020-05-12T04:16:44.344+0000 I REPL     [ReplicationExecutor] This node is ip-10-0-1-100:27017 in the config
2020-05-12T04:16:44.344+0000 I REPL     [ReplicationExecutor] transition to STARTUP2
2020-05-12T04:16:44.344+0000 I REPL     [conn17] Starting replication applier threads
2020-05-12T04:16:44.345+0000 I REPL     [ReplicationExecutor] transition to RECOVERING
2020-05-12T04:16:44.346+0000 I REPL     [ReplicationExecutor] transition to SECONDARY
2020-05-12T04:16:44.346+0000 I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
2020-05-12T04:16:44.346+0000 I REPL     [ReplicationExecutor] dry election run succeeded, running for election
2020-05-12T04:16:44.384+0000 I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 1
2020-05-12T04:16:44.384+0000 I REPL     [ReplicationExecutor] transition to PRIMARY
2020-05-12T04:16:45.346+0000 I REPL     [rsSync] transition to primary complete; database writes are now permitted

Any idea why it ignores the rest of the configuration? From the primary I can ping the other MongoDB instances. Thank you again for your time and patience.

The top line in your latest log file makes it sound like the command that was run was rs.initiate() with no document passed in. I’m also not seeing any attempt to communicate with the other two members in the replicaset, so the machine that you’re on doesn’t know about them. Have you tried to manually add the other two instances from the mongo shell of the machine that you ran the rs.initiate(...) command on? If not, try running rs.add('ip-10-0-2-100:27017') to see if it gets added. If it does, then add the third member to make sure it gets added as well.

Once you can manually get all three members added, you can try automating it. For the automation process, recreate the machines or delete the contents of the data directory so you have instances that are not part of a replicaset, and try running the shell script with the rs.initiate(...) command with the members list in it. If that doesn’t work, then you’ll have to troubleshoot what failed. MongoDB should give errors for any failures that it encounters.

At this point, without physically being on the machine and running commands, I can’t say what’s going on. What you say you’re running and what the logs are showing are not aligning.

Since you state that you’re new to MongoDB, I would recommend taking the Basic Cluster Administration course from MongoDB University to help give you a better understanding of how things work. This course is currently running so you should still be able to get in on this run and work through everything.

Unfortunately I’m not sure what else I can do to help at this time.

Look, thank you so much for your help and patience. I appreciate it a lot, and you helped me a lot.
Now at least I know I can add the primary and the secondary DBs from the mongo shell. I just need to figure out why this is not happening from my bash script. This is the last step. It would be easier if I had any log or failure, but my logs are clean. Thanks again.

Hi @Hamza_El_Aouane I’m glad you’re able to at least get it working from the mongo shell. That means that the machines are set up to properly work in a replica set.

The only thing that I can think of is to make sure you’re running the shell script that calls the rs.initiate(...) from one of the machines that are part of the replica set. I would put all three nodes as members in the command instead of doing an rs.initiate() followed by two rs.add() commands, although it should work either way in theory.

Being able to do this manually is the first step in being able to automate it. I would recommend looking at the logs from the manual process to see what is expected to happen as you initiate the replica set and add members. Remember that errors might not show up in the log, if the server never processed the command. You might have errors coming back from the mongo shell.
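To see whatever the mongo shell prints when the script runs unattended, one option is to wrap each command so its output lands in a file you can read after the instance boots. A minimal sketch follows. run_logged and the log path in the example are hypothetical, and the shell's exit status alone may not reflect a failed rs.initiate(), so the captured output is the part to read.

```shell
# Run a command, append its stdout and stderr to a log file, and return
# the command's own exit status. run_logged is a hypothetical helper name.
run_logged() {
    log_file="$1"
    shift
    printf '%s running: %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*" >> "$log_file"
    status=0
    "$@" >> "$log_file" 2>&1 || status=$?
    printf 'exit status: %s\n' "$status" >> "$log_file"
    return "$status"
}

# Example (not run here; the log path is an assumption):
#   run_logged /var/log/rs-init.log mongo --host 10.0.1.100 --port 27017 --eval 'rs.initiate(...)'
```

If the log file never appears at all, the script itself probably never ran, which points at the provisioning step and not at MongoDB.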

Best of luck in getting the automation process finished up, and sorry that I was not able to help get you to the end on this one.

Hi Doug, first things first, let me thank you a lot for your help. You are an absolute legend. You apologized because you couldn't help me solve the problem, but trust me, you did. You put me on the right path to understand what I was doing wrong, and I found my error. Because you helped me make sure my replica set was working properly from the mongo shell, I could troubleshoot my Terraform, and I figured out that in my provisioning file I was calling a template instead of a shell file. Once I changed that, I was able to run all the commands and have the replica set up and running. So thank you very much one more time for your patience and help.

Glad to hear you got the final piece worked out @Hamza_El_Aouane! It was my pleasure to help out where I could.