Sharding question

I have setup a three shards cluster:

  • 3 shard composed of 3 single instance replica set
  • 3 config servers
  • 1 router

My goal is to separate the data into different location based on the “station” field in documents.

I have run the below, but seems all the insert at the end results in documents in only one shard, the primary shard. Not sure what went wrong.

use led

sh.enableSharding("led")

stations = ["yll", "fep", "sas"]
sharded_collections = [ "camera", "station" ]

// Drop the collections (for dev), and recreate later
sharded_collections.forEach(collection => {
    db[collection].drop()
})

// Create indexes
sharded_collections.forEach(collection => {
    db[collection].createIndex({ station: 1, _id: 1})
})

stations.forEach(stationCode => {
    sh.disableBalancing("led." + stationCode)
})

sh.addShardTag("shardyll", "yll")
sh.addShardTag("shardfep", "fep")
sh.addShardTag("shardsas", "sas")

sh.addTagRange(
    "led.station",
    { "station" : "yll", "_id" : MinKey },
    { "station" : "yll", "_id" : MaxKey },
    "yll"
)
sh.addTagRange(
    "led.station",
    { "station" : "fep", "_id" : MinKey },
    { "station" : "fep", "_id" : MaxKey },
    "fep"
)
sh.addTagRange(
    "led.station",
    { "station" : "sas", "_id" : MinKey },
    { "station" : "sas", "_id" : MaxKey },
    "sas"
)
sh.enableBalancing("led.station")

// insert a few docs to test
db.station.insertOne({ "station": "yll", message: "hello1"})
db.station.insertOne({ "station": "fep", message: "hello1"})
db.station.insertOne({ "station": "fep", message: "hello2"})
db.station.insertOne({ "station": "sas", message: "hello1"})
db.station.insertOne({ "station": "sas", message: "hello2"})
db.station.insertOne({ "station": "sas", message: "hello3"})

db.station.getShardDistribution()

Output for sh.status:

use led

sh.enableSharding("led")

stations = ["yll", "fep", "sas"]
sharded_collections = [ "camera", "station" ]

// Drop the collections (for dev), and recreate later
sharded_collections.forEach(collection => {
    db[collection].drop()
})

// Create indexes
sharded_collections.forEach(collection => {
    db[collection].createIndex({ station: 1, _id: 1})
})

stations.forEach(stationCode => {
    sh.disableBalancing("led." + stationCode)
})

sh.addShardTag("shardyll", "yll")
sh.addShardTag("shardfep", "fep")
sh.addShardTag("shardsas", "sas")

sh.addTagRange(
    "led.station",
    { "station" : "yll", "_id" : MinKey },
    { "station" : "yll", "_id" : MaxKey },
    "yll"
)
sh.addTagRange(
    "led.station",
    { "station" : "fep", "_id" : MinKey },
    { "station" : "fep", "_id" : MaxKey },
    "fep"
)
sh.addTagRange(
    "led.station",
    { "station" : "sas", "_id" : MinKey },
    { "station" : "sas", "_id" : MaxKey },
    "sas"
)
sh.enableBalancing("led.station")

// insert a few docs to test
db.station.insertOne({ "station": "yll", message: "hello1"})
db.station.insertOne({ "station": "fep", message: "hello1"})
db.station.insertOne({ "station": "fep", message: "hello2"})
db.station.insertOne({ "station": "sas", message: "hello1"})
db.station.insertOne({ "station": "sas", message: "hello2"})
db.station.insertOne({ "station": "sas", message: "hello3"})

Output for shard distribution

mongos> db.station.getShardDistribution()
Collection led.station is not sharded.

Output for sh.status()

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
  	"_id" : 1,
  	"minCompatibleVersion" : 5,
  	"currentVersion" : 6,
  	"clusterId" : ObjectId("5e718d5beb2d31dadf510508")
  }
  shards:
        {  "_id" : "shardfep",  "host" : "shardfep/shardfep1:27022",  "state" : 1,  "tags" : [ "fep" ] }
        {  "_id" : "shardsas",  "host" : "shardsas/shardsas1:27023",  "state" : 1,  "tags" : [ "sas" ] }
        {  "_id" : "shardyll",  "host" : "shardyll/shardyll1:27021",  "state" : 1,  "tags" : [ "yll" ] }
  active mongoses:
        "3.4.24" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Collections with active migrations: 
                balancer started at Tue Mar 17 2020 19:54:20 GMT-0700 (PDT)
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
        {  "_id" : "led",  "primary" : "shardfep",  "partitioned" : true }

Any insight is much appreciated.

I think you have missed a step in the sharding procedure. The following are the important ones, and to be followed in that order:

(1) Enable sharding on a database
(2) Create index on the shard key field
(3) Shard a collection

I see you have missed the step (3), You have to use the command sh.shardCollection() for that.

Note that you enable sharding on a database, but shard a collection. A sharded database base can have sharded as well as unsharded collections. More details at Deploy a Sharded Cluster - Procedure.

2 Likes