Docker swarm — manage services and high availability
Docker Swarm allows you to deploy services distributed across the nodes of the swarm, which gives the service high availability and high performance.
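The rest of the article assumes a swarm is already set up, with a manager called swarm_manager and the workers worker_node2 and worker_node3 (all three nodes can run containers). As a rough sketch of that setup, where the join token and manager IP are placeholders printed by the init command:
swarm_manager$ docker swarm init
worker_node2$ docker swarm join --token <token-from-init-output> <manager-ip>:2377
worker_node3$ docker swarm join --token <token-from-init-output> <manager-ip>:2377
With the swarm in place, starting a single service can be as simple as this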
Create a simple service and cause a failure on the running node
$ docker service create nginx
spesyzvgu8pkh1eidhpjcti7d
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
This command started a new service based on the nginx image.
To see details about the running services, we can run
$ docker service list
ID NAME MODE REPLICAS IMAGE PORTS
spesyzvgu8pk nifty_lehmann replicated 1/1 nginx:latest
Note that the number of replicas is 1, which does not equal the number of nodes that can run containers (three in my setup). This is because by default Docker Swarm creates a single replica, but the service still enjoys high availability: if the node that runs this container fails, the task will be rescheduled on another node. Let's first find out which node it runs on.
$ docker service ps nifty_lehmann
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
md9qwxb69y56 nifty_lehmann.1 nginx:latest worker_node3 Running Running 5 minutes ago
To do this we pass the service name to docker service ps; in my case the task is nifty_lehmann.1, and the swarm host that executes the container is worker_node3.
Let's simulate a real-world scenario and stop Docker on this worker, which is equivalent to a failure of the node.
worker_node3$ sudo systemctl stop docker
We can verify the status of the nodes with docker node ls. Using this command we can see that the status of worker_node3 is “Down”.
$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
ncold3meymr0sl9pd9v71dweb * swarm_manager Ready Active Leader 18.09.5
d913ucvquyjigsl4mnwmcxn4s worker_node2 Ready Active 18.09.5
ch02clvvojp6qosy20i9k9nim worker_node3 Down Active 18.09.5
Now let's see the state of the service from the swarm manager
swarm_manager$ docker service ps nifty_lehmann
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
pqkvpdxtyumm nifty_lehmann.1 nginx:latest swarm_manager Running Running 26 seconds ago
md9qwxb69y56 \_ nifty_lehmann.1 nginx:latest worker_node3 Shutdown Running 46 seconds ago
Here we can see that the container is in Shutdown state on worker_node3 and in Running state on swarm_manager. This happens because, unless configured otherwise, swarm managers act as workers as well.
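If you do not want managers to run tasks for a particular service, one option is a placement constraint that keeps the tasks on worker nodes only. A sketch with a hypothetical service name (we will also see another approach, draining a node, later in the article):
swarm_manager$ docker service create --name nginx-workers-only --constraint node.role==worker nginx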
Let's bring Docker back to life on worker_node3
worker_node3$ sudo systemctl start docker
Verify that the node is back
swarm_manager$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
ncold3meymr0sl9pd9v71dweb * swarm_manager Ready Active Leader 18.09.5
d913ucvquyjigsl4mnwmcxn4s worker_node2 Ready Active 18.09.5
ch02clvvojp6qosy20i9k9nim worker_node3 Ready Active 18.09.5
Let's see what happens to the service
$ docker service ps nifty_lehmann
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
pqkvpdxtyumm nifty_lehmann.1 nginx:latest swarm_manager Running Running 14 minutes ago
md9qwxb69y56 \_ nifty_lehmann.1 nginx:latest worker_node3 Shutdown Shutdown 8 minutes ago
Nothing changes! Even though the node is back up, the container keeps running on the swarm_manager node; Swarm does not automatically rebalance running tasks.
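Swarm only reschedules tasks when something changes. If you do want to redistribute the tasks, one option is to force a rolling update of the service, which restarts its tasks and lets the scheduler place them again:
swarm_manager$ docker service update --force nifty_lehmann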
Delete a service
To delete a running service, enter
swarm_manager$ docker service rm nifty_lehmann
nifty_lehmann
It's very simple! The command returns the name of the removed service.
Run a service with more than one replica
Let's do something more complicated and define the number of replicas of the service, along with some other options
swarm_manager$ docker service create --name nginx --replicas 3 -p 8080:80 nginx
x8cgelqviid3e5l4icybsclg2
overall progress: 3 out of 3 tasks
1/3: running [==================================================>]
2/3: running [==================================================>]
3/3: running [==================================================>]
verify: Service converged
With this command we did the following
- The name of the service is nginx
- The number of replicas is 3
- We publish port 80 of each container on port 8080 of every swarm node
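If you want to confirm these settings after creating the service, docker service inspect shows them in a readable form (I am only sketching the command here, output omitted):
swarm_manager$ docker service inspect --pretty nginx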
Running docker service ps nginx, we can see the following
$ docker service ps nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ek8llokh8ia8 nginx.1 nginx:latest swarm_manager Running Running 7 minutes ago
st654bmojc4u nginx.2 nginx:latest worker_node2 Running Running 7 minutes ago
copaz24xs2zf nginx.3 nginx:latest worker_node3 Running Running 7 minutes ago
Each node has a running container for the nginx service.
How load balancing works
Let's do something very interesting. Run this on any host of your swarm that is a worker node
for i in {1..100}; do curl -s localhost:8080 &> /dev/null; done
Then run the following on a swarm manager node
docker service logs nginx | grep -i http | cut -d " " -f1 | sort -n | uniq -c
No matter how many times you run it, and no matter on which host, you should see the requests divided roughly equally among the nodes that run a container for this service
184 nginx.1.ek8llokh8ia8@swarm_manager
184 nginx.2.st654bmojc4u@worker_node2
182 nginx.3.copaz24xs2zf@worker_node3
This happens because Docker Swarm internally load balances the requests in a round-robin fashion, regardless of which node receives the request. This is great for simple solutions that do not require a very specialized load-balancing scheme.
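If you are curious where this load balancing lives: the service gets a virtual IP on the ingress overlay network, and every node listens on the published port (Swarm's routing mesh). A couple of commands to peek at it; the exact output fields may vary between Docker versions:
swarm_manager$ docker service inspect --format '{{json .Endpoint.VirtualIPs}}' nginx
swarm_manager$ docker network inspect ingress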
Changing the number of replicas
We can increase or decrease the number of running replicas for a service on the fly. To do this, we can enter the following on a swarm manager
$ docker service update --replicas 2 nginx
nginx
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
This reduces the number of nginx replicas from 3 to 2; we can verify this by running
$ docker service ps nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ek8llokh8ia8 nginx.1 nginx:latest worker_node2 Running Running 30 minutes ago
st654bmojc4u nginx.2 nginx:latest swarm_manager Running Running 30 minutes ago
To increase the number of replicas
$ docker service update --replicas 6 nginx
nginx
overall progress: 6 out of 6 tasks
1/6: running [==================================================>]
2/6: running [==================================================>]
3/6: running [==================================================>]
4/6: running [==================================================>]
5/6: running [==================================================>]
6/6: running [==================================================>]
verify: Service converged
$ docker service ps nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ek8llokh8ia8 nginx.1 nginx:latest worker_node2 Running Running 32 minutes ago
st654bmojc4u nginx.2 nginx:latest swarm_manager Running Running 32 minutes ago
xpw6gl4sos46 nginx.3 nginx:latest worker_node3 Running Running 15 seconds ago
9m286xjd6mnn nginx.4 nginx:latest swarm_manager Running Running 15 seconds ago
zkkaxktfozz6 nginx.5 nginx:latest worker_node2 Running Running 15 seconds ago
qxvqa3svz6js nginx.6 nginx:latest worker_node3 Running Running 14 seconds ago
Note that the containers have been distributed evenly across the running nodes.
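As a side note, the replica count can also be changed with docker service scale, which is shorthand for the same update:
swarm_manager$ docker service scale nginx=6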
How to prevent a node from running containers
A good practice is not to use swarm managers to run heavy-load containers, because this can disrupt the consistency of the swarm manager state and cause problems with the quorum of the swarm. To avoid this, you can configure a node not to run containers
$ docker node update --availability drain swarm_manager
swarm_manager
To verify that swarm_manager no longer executes any containers, we run docker service ps nginx
$ docker service ps nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ek8llokh8ia8 nginx.1 nginx:latest worker_node2 Running Running 38 minutes ago
pnvff40fy745 nginx.2 nginx:latest worker_node3 Running Running 56 seconds ago
st654bmojc4u \_ nginx.2 nginx:latest swarm_manager Shutdown Shutdown 57 seconds ago
xpw6gl4sos46 nginx.3 nginx:latest worker_node3 Running Running 6 minutes ago
wu3u0lw42gq9 nginx.4 nginx:latest worker_node2 Running Running 57 seconds ago
9m286xjd6mnn \_ nginx.4 nginx:latest swarm_manager Shutdown Shutdown 57 seconds ago
zkkaxktfozz6 nginx.5 nginx:latest worker_node2 Running Running 6 minutes ago
qxvqa3svz6js nginx.6 nginx:latest worker_node3 Running Running 6 minutes ago
We can see that swarm_manager no longer executes any containers, but the number of replicas is still 6, distributed evenly across the two workers.
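If you later want swarm_manager to accept tasks again, set its availability back to active; keep in mind that, as we saw earlier, existing tasks will not automatically move back to it:
swarm_manager$ docker node update --availability active swarm_manager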
How to automatically use all nodes of a swarm
In case you want to use all nodes of the swarm, there is no need to define the number of replicas; you can instead use the global mode
$ docker service create --name nginx --mode global nginx
This command creates one replica on every node in the swarm that can run containers.
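Global mode can be combined with the options we used earlier. For example, here is a sketch (not taken from a running setup) of a global nginx service published on port 8080 and kept off the manager nodes with a placement constraint:
$ docker service create --name nginx --mode global --constraint node.role==worker -p 8080:80 nginx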
I hope you found my article interesting! :)