Hi,
I am currently testing MongoDB for potential use on a live system, but I have encountered an issue with replication.
I have a PSA (Primary-Secondary-Arbiter) setup on MongoDB v4.2.6: 3 servers, each with a 10-core CPU and 20GB RAM. I am currently testing how MongoDB behaves if, for instance, I stop replication manually.
Configuration:
The oplog was configured on each server with a size of 160GB, which is more than I need for the test I am conducting. The secondary's sync target was also set to the primary server with rs.syncFrom("mongo01:27017").
PRIMARY> rs.printReplicationInfo()
configured oplog size: 160000MB
log length start to end: 12799secs (3.56hrs)
oplog first event time: Wed Apr 22 2020 11:21:03 GMT+0000 (UTC)
oplog last event time: Wed Apr 22 2020 14:54:22 GMT+0000 (UTC)
now: Wed Apr 22 2020 14:54:25 GMT+0000 (UTC)
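As a sanity check on the output above, here is a minimal sketch in plain JavaScript (run under Node.js) that recomputes the oplog window from the first and last event times and compares it with the 5-minute outage used in my test. The function and variable names are my own and only illustrative.

```javascript
// Oplog window in seconds, from the first and last event times
// reported by rs.printReplicationInfo() (values copied from the output above).
function oplogWindowSeconds(firstEvent, lastEvent) {
  return (Date.parse(lastEvent) - Date.parse(firstEvent)) / 1000;
}

const windowSecs = oplogWindowSeconds(
  "2020-04-22T11:21:03Z", // oplog first event time
  "2020-04-22T14:54:22Z"  // oplog last event time
);
const outageSecs = 5 * 60; // how long the secondary is kept down in the test

// Window in seconds, window in hours, and whether the outage fits inside it.
console.log(windowSecs, (windowSecs / 3600).toFixed(2), outageSecs < windowSecs);
// prints: 12799 3.56 true
```

So the outage should fit well inside the oplog window, which is why I do not think the oplog is the limiting factor.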
During the test I insert roughly 20K JSON transactions per second. After one minute I stop the replica (the secondary server) by shutting down mongod and let the inserts continue for 5 minutes. I then start the replica again, and these are the 2 issues I have encountered:
- When the replica is turned off, throughput falls to 2K transactions per second, compared to the 20K per second I was achieving while my PSA setup was still fully up and running. I would like to know why this happens, because I cannot see any errors apart from the fact that the secondary server is down. Can you provide any insight on this?
- When the replica is restarted, the TPS remains at 2K per second and the replication lag keeps increasing without ever recovering. I can see that this is not an oplog issue, because I didn't exceed the configured size.
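For clarity on what I mean by replication lag, here is a minimal sketch of how it can be derived from an rs.status()-shaped document by comparing each secondary's optimeDate with the primary's. The helper name and the sample data are made up for illustration (only mongo01 is a real hostname from my setup), and members in any state other than SECONDARY are simply skipped.

```javascript
// Lag of each SECONDARY behind the PRIMARY, in seconds, computed from an
// rs.status()-shaped document (uses members[].name, stateStr, optimeDate).
function replicationLagSeconds(status) {
  const primary = status.members.find(m => m.stateStr === "PRIMARY");
  if (!primary) return {}; // without a primary there is nothing to compare to
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      // Subtracting two Date objects yields milliseconds.
      lag[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  }
  return lag;
}

// Hypothetical sample shaped like rs.status() output, not real measurements.
const sampleStatus = { members: [
  { name: "mongo01:27017", stateStr: "PRIMARY", optimeDate: new Date("2020-04-22T14:54:22Z") },
  { name: "mongo02:27017", stateStr: "SECONDARY", optimeDate: new Date("2020-04-22T14:49:22Z") },
  { name: "mongo03:27017", stateStr: "ARBITER" },
]};

console.log(replicationLagSeconds(sampleStatus)); // one entry: mongo02 is 300 seconds behind
```

In the mongo shell the same function could be called as replicationLagSeconds(rs.status()); in my test this number keeps growing after the restart instead of shrinking.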
I can see a major flaw here: what happens if I have to recover my replica, even after only 5 minutes of downtime? Do I have to recover everything using this method: Replica Set Resync by Copying? That isn't as feasible as having the replica replicate only the required changes, considering this is just 5 minutes of data (600K entries, given that the insert rate dropped to 2K per sec).
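The 600K figure above is simple arithmetic. A minimal sketch (the function name is my own) that also shows what the backlog would have been had the insert rate stayed at 20K per second:

```javascript
// Number of inserts the secondary misses while it is down.
function missedInserts(insertsPerSecond, downtimeMinutes) {
  return insertsPerSecond * downtimeMinutes * 60;
}

const atReducedRate = missedInserts(2000, 5);  // rate observed while the secondary was down
const atFullRate = missedInserts(20000, 5);    // rate observed with the full PSA set running

console.log(atReducedRate, atFullRate);
// prints: 600000 6000000
```

Either way, this is a small amount of data compared with a full resync.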
Is there any way to fix these issues, please? Maybe there is a configuration setting that I am missing.