Docker swarm: High Availability
A Docker swarm is composed of nodes; a node can be either a worker node or a manager node.
Manager nodes: these nodes are the key elements of the swarm. From a manager node you can perform swarm administrative tasks, and the manager nodes also store the swarm state.
Worker nodes: these nodes run the containers.
Note that a manager node can also act as a worker at the same time.
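
As a minimal sketch of what this looks like in practice (hostnames are placeholders and the output is trimmed), the role of each node can be checked from any manager with the standard Docker CLI:

$ docker node ls    # run from a manager node
ID             HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS
abc123def456 * manager1   Ready    Active         Leader
ghi789jkl012   manager2   Ready    Active         Reachable
mno345pqr678   worker1    Ready    Active

Nodes whose MANAGER STATUS shows "Leader" or "Reachable" are managers; an empty MANAGER STATUS means the node is a worker only. The asterisk marks the node you are connected to.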
Docker swarm uses the Raft Consensus Algorithm to manage the global cluster state. Raft makes sure that all the manager nodes in the cluster that manage and schedule tasks store the same consistent state.
Having the same state across the cluster means that, if the leader manager node fails, any other manager node can pick up its tasks and restore the services to a stable state. The state is propagated across the manager nodes by replicating the Raft log.
Raft tolerates up to (N-1)/2 failures and requires a majority of manager nodes, which translates to (N/2)+1 nodes, to agree on values proposed to the cluster. This means that in a cluster of 5 manager nodes, if 3 become unavailable, the cluster stops accepting requests to schedule new tasks. The existing tasks keep running, but the cluster cannot rebalance tasks to cope with additional failures.
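
Applying those two formulas (with integer division) to common manager counts gives the following majorities and tolerated failures:

Managers (N)   Majority (N/2 + 1)   Tolerated failures ((N-1)/2)
1              1                    0
3              2                    1
5              3                    2
7              4                    3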
Things to consider before deploying manager nodes
- There is no limit on the number of manager nodes, but there is a trade-off between performance and fault tolerance. Adding more manager nodes makes the swarm more fault tolerant, but it reduces write performance because more nodes must acknowledge proposals to update the swarm state, which means more network round-trip traffic.
- When creating a swarm you must advertise your IP address to the other manager nodes in the swarm. Managers are meant to be a stable component of the infrastructure, so a static IP address should be used for advertising; if a manager changes its IP address, the rest of the manager nodes will no longer be able to contact it (see the example after this list).
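
As a sketch under placeholder values (the IP address 10.0.0.10 and the join token are not real), this is how the advertise address is set when the swarm is created and how an additional manager joins it:

$ docker swarm init --advertise-addr 10.0.0.10
# print the join command (including the token) that new managers must run
$ docker swarm join-token manager
# on the new manager node, using the token printed above
$ docker swarm join --token <manager-token> 10.0.0.10:2377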
Fault-tolerance design
The number of manager nodes should be odd. Adding one more manager to an odd-sized cluster does not improve fault tolerance (for example, both 3 and 4 managers tolerate the loss of only one node), and with an odd number there is a better chance that a majority of managers remains on the same side if the network splits into two partitions.
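
For example, to grow from one manager to three while keeping the count odd, existing worker nodes can be promoted (the node names below are placeholders); demoting reverses the operation:

$ docker node promote worker1 worker2
# if a manager has to be removed later, demote it first so the manager count stays odd
$ docker node demote manager3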
