Spin up a Multiprovider 3 Node Cluster on 3 Providers?

Hello Atlas experts,

It seems to be possible to set up a replica set (RS) where each node is on a different provider.

It sounds odd, but you get:

  • Available during partial region outage
  • Available during full region outage
  • Available during a cloud provider outage

whereas a regular 3-node RS only provides:

  • Available during partial region outage

Fun fact: according to the UI it is cheaper, although I’d assume that the TCO, including network traffic, will come out roughly even. Latency will certainly increase; I was told approximately 5 ms when all nodes are in the same region. This could be interesting, but is it a real, functioning setup?

However cool this sounds, the setup raises further questions:

  • the effect on latency, as mentioned above
  • what about VPC peering, and how is that handled?
  • if a secondary gets elected primary, will there be a trade-off or loss in terms of latency / response times?
  • increased data transfer costs, as data is replicated over longer distances between the nodes
  • with a specific write concern (e.g. majority) there will be some impact, since the write needs to go cross-cloud to be acknowledged

Is anyone around who can add experiences, thoughts, etc. to this?

Regards,
Michael

Hi Michael,

Great questions!

You’re exactly right that with this configuration you maintain majority quorum, and hence continuous read and write availability (save for a momentary replica-set-level election), in the event of a full cloud provider outage. Of course, that all assumes you have a cross-cloud resilient application tier, which isn’t trivial (although with K8s it’s becoming more and more reasonable over time; still early days).
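To make the quorum argument concrete, here is a minimal sketch in Python. It is not an Atlas API: the function name, the provider labels, and the node counts are hypothetical, and it only checks whether the surviving electable nodes still form a strict majority when one provider goes down.

```python
def has_majority_after_outage(placement, failed_provider):
    """Return True if the nodes outside failed_provider still form a strict majority.

    placement maps a provider name to its number of electable nodes
    (hypothetical layout, not read from any real cluster).
    """
    total = sum(placement.values())
    surviving = total - placement.get(failed_provider, 0)
    # A primary can only be elected by a strict majority of voting members.
    return 2 * surviving > total


# Hypothetical layouts for illustration.
one_per_provider = {"AWS": 1, "GCP": 1, "Azure": 1}
single_provider = {"AWS": 3}
two_two_one = {"AWS": 2, "GCP": 2, "Azure": 1}

for label, placement in [
    ("1x1x1", one_per_provider),
    ("3 nodes, one provider", single_provider),
    ("2x2x1", two_two_one),
]:
    for provider in placement:
        ok = has_majority_after_outage(placement, provider)
        print(f"{label}: lose {provider} -> majority kept: {ok}")
```

Under these assumptions, any layout where no single provider holds half or more of the nodes keeps its majority through a full provider outage, while the single-provider layout does not.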

You’re definitely right that TCO, once you include cross-cloud data transfer (even with compression over the wire), should not be cheaper than staying in a single cloud provider. You might also consider running 2x2x1 (five nodes: two in each of two providers and one in the third) to ensure that even during maintenance you always have your primary in your preferred provider; that would also increase the cost a bit.

Regarding VPC Peering (or Private Endpoints): when you use these, you can only reach the portion of your cluster inside the same cloud provider. In a sharded cluster you can still read and write (since the mongos routers can route to the rest of the cluster), but in a replica set you’d have a read-only connection if you were peered only to a secondary and couldn’t reach the primary over the network. Options to consider would be to use public IP access lists for cross-provider app tier access, or of course to run a sharded cluster.
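The replica set case can be sketched as a small Python model. This is only an illustration of the reachability rule described above, not driver or Atlas behavior: `connection_mode` and the node list are hypothetical, and the sharded-cluster case (where mongos does the routing) is deliberately not modeled.

```python
def connection_mode(nodes, peered_provider):
    """Classify what an app peered to a single provider can do on a replica set.

    nodes is a list of (provider, role) tuples, where role is
    "primary" or "secondary" (hypothetical layout).
    """
    # Over peering, only nodes in the same provider are reachable.
    reachable_roles = [role for provider, role in nodes if provider == peered_provider]
    if "primary" in reachable_roles:
        return "read-write"
    if reachable_roles:
        return "read-only"  # only secondaries reachable
    return "unreachable"


# Hypothetical 1x1x1 replica set with the primary currently on AWS.
nodes = [("AWS", "primary"), ("GCP", "secondary"), ("Azure", "secondary")]
for provider in ("AWS", "GCP", "Azure"):
    print(provider, "->", connection_mode(nodes, provider))
```

After an election the primary may land on another provider, so which application tier ends up read-only can change over time.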

There would definitely be latency trade-offs here, particularly if you’re using the majority write concern. Writes would then be acknowledged only after reaching two cloud providers, which could expose them to less reliable network latency.
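As a rough way to reason about this, here is a simplified Python model of when a majority write could be acknowledged. It assumes the primary counts toward the majority and that the only delay is the round trip to each secondary; it ignores disk, journaling, and the details of how replication actually works. The function name and all latency figures are hypothetical, not measurements.

```python
def majority_ack_latency_ms(rtt_to_secondaries_ms):
    """Estimate the network wait for a majority write, in milliseconds.

    rtt_to_secondaries_ms: round-trip times from the primary to each
    secondary (hypothetical values). The primary itself counts as one
    member of the majority.
    """
    total_nodes = len(rtt_to_secondaries_ms) + 1
    majority = total_nodes // 2 + 1
    needed_secondaries = majority - 1
    if needed_secondaries == 0:
        return 0.0
    # The write is acknowledged once the fastest `needed_secondaries` have it.
    return sorted(rtt_to_secondaries_ms)[needed_secondaries - 1]


# Hypothetical round-trip times, for illustration only.
same_provider = [1.0, 1.0]            # 3 nodes, one provider
cross_cloud = [5.0, 12.0]             # 1x1x1 across three providers
two_two_one = [1.0, 6.0, 9.0, 15.0]   # 2x2x1, one secondary next to the primary

for label, rtts in [
    ("same provider", same_provider),
    ("1x1x1", cross_cloud),
    ("2x2x1", two_two_one),
]:
    print(f"{label}: ~{majority_ack_latency_ms(rtts)} ms")
```

In this model a 3-node cross-cloud set waits on the nearest other provider, and a 2x2x1 layout still has to reach one node in a second provider, so every majority write crosses a cloud boundary in both cases.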

Cheers
-Andrew
