07 Jun 2016

Docker Swarm Tutorial with Consul (Service Discovery) and Examples

by Ventz

If you have not used Swarm, skim the non-service-discovery tutorial to get a feel for how it works:
https://blog.vpetkov.net/2015/12/07/docker-swarm-tutorial-and-examples. It’s very easy, and it should give you an idea of how it works within a couple of minutes.

Using Swarm with pre-generated static tokens is useful, but there are many benefits to using a service discovery backend. For example, you can use overlay networks and have common “bridges” that span multiple hosts (https://docs.docker.com/engine/userguide/networking/get-started-overlay/). It also provides service registration and discovery for the Docker containers launched into the Swarm. Now let’s get into how to use Swarm with service discovery, which is what you would use in a scaled-out or production environment.

Again, assuming you have a bunch of servers running docker:
vm01 (10.0.0.101), vm02 (10.0.0.102), vm03 (10.0.0.103), vm04 (10.0.0.104)

Normally, you can run “docker ps” on each host, for example:
ssh vm01 'docker ps'
ssh vm04 'docker ps'

If you enable the API for remote bind on each host you can manage them from a central place:
docker -H tcp://vm01:2375 ps
docker -H tcp://vm04:2375 ps
(Note: the port is optional when it is the default, 2375.)
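
To avoid typing the same command once per host, a small loop can generate the commands. This is only a sketch: the host list is the example set from above, and the `ps_commands` function name is made up for this illustration. It prints the commands rather than running them.

```shell
#!/bin/sh
# Hosts from the example above; adjust to your environment.
HOSTS="vm01 vm02 vm03 vm04"

# Print one "docker -H ... ps" command per host.
# Nothing is executed here; the commands are only written to stdout.
ps_commands() {
    for h in $HOSTS; do
        echo "docker -H tcp://$h:2375 ps"
    done
}

ps_commands
```

Once the output looks right, piping it to a shell (ps_commands | sh) runs the commands against each host in turn.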

But if you want to use all of these docker engines as a cluster, you need Swarm.
Here we will go one step further and use a common service discovery backend (Consul).

Docker Swarm Tutorial with Consul and How-To/Examples

A swarm contains only two components: agents (the workers in the cluster) and manager(s).
We are also going to add consul (the service discovery backend).

First, grab the swarm and the consul images on each docker host:


docker pull swarm
docker pull progrium/consul

Then, make sure the API is enabled for remote bind on each host (NOTE: see below if using a systemd OS):


# (on Ubuntu) cat /etc/default/docker:

DOCKER_OPTS="-H tcp://0.0.0.0:2375 \
--cluster-store=consul://consulIp:8500 \
--cluster-advertise=managerIp:2376 \
-H unix:///var/run/docker.sock --dns 8.8.8.8 --dns 8.8.4.4 ..."

# And then restart:
/etc/init.d/docker restart

# -- OR --

# IF USING a Systemd OS -- Ubuntu 16.04/CentOS7 and up, use this instead:

# (on Ubuntu) cat /etc/systemd/system/docker.service

[Service]
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 \
--cluster-store=consul://consulIp:8500 \
--cluster-advertise=managerIp:2376 \
-H unix:///var/run/docker.sock --dns 8.8.8.8 --dns 8.8.4.4

# And then reload/restart
systemctl daemon-reload
systemctl restart docker

Don’t panic here! It looks complicated, but it’s actually incredibly easy.

The consulIp in “--cluster-store=consul://consulIp:8500” is the docker host that will run the consul service (much like the swarm manager). Since you will map the port to the docker host itself, that’s simply the IP of the docker host (in our case, vm01).

The managerIp in “--cluster-advertise=managerIp:2376” is the docker host that will run the swarm manager service. Since you will map the port to the docker host itself, that’s simply the IP of the docker host (in our case, vm01).
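
To make the two placeholders concrete, here is a rough sketch that fills them in for this tutorial's setup, where vm01 (10.0.0.101) runs both consul and the manager. The variable and function names are invented for the example; the flags themselves are the ones from the config above.

```shell
#!/bin/sh
# Placeholder values for this sketch: vm01 hosts both consul and the manager.
CONSUL_IP="10.0.0.101"
MANAGER_IP="10.0.0.101"

# Assemble the daemon flags used in the config above into one line.
daemon_opts() {
    printf '%s %s %s %s\n' \
        "-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock" \
        "--cluster-store=consul://$CONSUL_IP:8500" \
        "--cluster-advertise=$MANAGER_IP:2376" \
        "--dns 8.8.8.8 --dns 8.8.4.4"
}

# The line you would place in /etc/default/docker on a non-systemd Ubuntu.
printf 'DOCKER_OPTS="%s"\n' "$(daemon_opts)"
```

On a systemd OS the same flags go after “/usr/bin/docker daemon” on the ExecStart line instead.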

To get everything started, go to whatever docker host you pick as the manager (in our case vm01), and create the consul server:


docker run -d -p "8500:8500" -h "consul" progrium/consul -server -bootstrap
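
Before moving on, it can help to confirm that consul is answering on port 8500. The sketch below polls Consul's HTTP status endpoint (/v1/status/leader). It assumes curl is installed on the machine you run it from, and the function names are made up for this example.

```shell
#!/bin/sh
CONSUL_IP="10.0.0.101"   # vm01 in this tutorial

# URL of Consul's HTTP status endpoint, which reports the current leader.
consul_leader_url() {
    echo "http://$1:8500/v1/status/leader"
}

# Poll until Consul answers over HTTP, giving up after about 30 tries.
# Assumes curl is available where this runs.
wait_for_consul() {
    i=0
    while [ "$i" -lt 30 ]; do
        if curl -sf "$(consul_leader_url "$1")" >/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

echo "To check manually: curl $(consul_leader_url "$CONSUL_IP")"
```

For example, running wait_for_consul "$CONSUL_IP" && echo "consul is up" blocks until the server responds or the retries run out.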

Now, on *each* AGENT (including the manager if you want to use it as a worker) run the following, where agentIp is that agent’s own IP:
docker run -d swarm join --advertise agentIp:2375 consul://consulIp:8500/swarm


docker run -d swarm join --advertise 10.0.0.101:2375 consul://10.0.0.101:8500/swarm
docker run -d swarm join --advertise 10.0.0.102:2375 consul://10.0.0.101:8500/swarm
docker run -d swarm join --advertise 10.0.0.103:2375 consul://10.0.0.101:8500/swarm
docker run -d swarm join --advertise 10.0.0.104:2375 consul://10.0.0.101:8500/swarm

You would do this for *each* agent and in our case vm01 is also an agent.
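
Since the remote API is already enabled on every host, the joins can also be issued from one place instead of logging in to each agent. A minimal sketch, assuming the four example hosts and consul on vm01; the `join_commands` name is hypothetical, and the function only prints the commands:

```shell
#!/bin/sh
CONSUL_IP="10.0.0.101"
AGENT_IPS="10.0.0.101 10.0.0.102 10.0.0.103 10.0.0.104"

# Print the join command for each agent, sent through that agent's remote API.
join_commands() {
    for ip in $AGENT_IPS; do
        echo "docker -H tcp://$ip:2375 run -d swarm join --advertise $ip:2375 consul://$CONSUL_IP:8500/swarm"
    done
}

join_commands
```

Each printed line starts the join container on the agent named in its -H flag, so the result should be the same as running the command locally on that host.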

Finally, you need to run a manager service on your chosen manager host (in our case, vm01) to manage the swarm:
docker run -d -p 2376:2375 swarm manage consul://consulIp:8500/swarm


docker run -d -p 2376:2375 swarm manage consul://10.0.0.101:8500/swarm

The idea is that the manager container provides its API on port 2375. We bind that to port 2376 on the host, because on vm01 the Docker daemon itself is already listening on 2375. If your manager is NOT an agent (and its daemon is not bound to TCP port 2375), you can simply bind it on 2375 by doing a “run -d -p 2375:2375 swarm manage consul://…”. In that case, you would NOT run the “swarm join” command on your manager. However, in our case we want all of the hosts to be agents, including the manager.

The last step is to query the cluster:
docker -H tcp://managerIP:2376 info

In our case, we use vm01:


docker -H tcp://vm01:2376 info
or
docker -H tcp://vm01:2376 ps

Again, if your manager is NOT an agent, you would simply run:
“docker -H tcp://managerIP:2375 info” or even “docker -H tcp://managerip”
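
If you query the cluster often, the docker client also reads the DOCKER_HOST environment variable, which saves repeating -H on every command. A small sketch using this tutorial's manager address:

```shell
#!/bin/sh
# Point the docker client at the swarm manager instead of the local daemon.
export DOCKER_HOST="tcp://vm01:2376"

# With DOCKER_HOST set, these behave like the -H forms above:
#   docker info
#   docker ps
echo "docker client now targets $DOCKER_HOST"
```

Run “unset DOCKER_HOST” to go back to talking to the local daemon.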

Don’t forget to start the manager again after a reboot, and to re-run the join on each agent.
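
One way to avoid doing this by hand is Docker's restart policy flag (--restart=always), which asks the daemon to bring a container back up whenever the daemon starts. The sketch below prints the join and manage commands from this tutorial with that flag added. The function names are invented for the example, and it assumes the docker service itself is enabled at boot.

```shell
#!/bin/sh
CONSUL_IP="10.0.0.101"   # vm01 in this tutorial

# Join command for one agent, with a restart policy added.
join_cmd() {
    echo "docker run -d --restart=always swarm join --advertise $1:2375 consul://$CONSUL_IP:8500/swarm"
}

# Manager command, with the same restart policy.
manage_cmd() {
    echo "docker run -d --restart=always -p 2376:2375 swarm manage consul://$CONSUL_IP:8500/swarm"
}

join_cmd 10.0.0.101
manage_cmd
```

The consul container would need the same flag, otherwise the agents and the manager come back after a reboot with no discovery backend to talk to.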

Posted in: Uncategorized ⋅ Tagged: automation, cloud, docker, linux, network

12 Thoughts on “Docker Swarm Tutorial with Consul (Service Discovery) and Examples”

lookman on July 24, 2017 at 8:04 am said:

Hi sir,

Please ignore my previous comment.

I already ran this command successfully:

ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 \
--cluster-store=consul://172.172.172.104:8500 \
--cluster-advertise=172.172.172.104:2376 \
-H unix:///var/run/docker.sock --dns 8.8.8.8 --dns 8.8.4.4

After that I got this error in the docker status output:

level=warning msg="Registering as \"172.172.172.104:2376\" in discovery failed

[root@sw ~]# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since Sen 2017-07-24 07:47:23 EDT; 1h 12min ago
Docs: https://docs.docker.com
Main PID: 31360 (dockerd)
Memory: 74.5M
CGroup: /system.slice/docker.service
├─31360 /usr/bin/dockerd -H tcp://0.0.0.0:2375 --cluster-store=consul://172.172.172.104:8500 --cluster-advertise=172.172.172.104:2376 -H unix:///var/run/docker.sock --dns 8.8….
├─31373 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/cont…
├─31593 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8500 -container-ip 172.17.0.2 -container-port 8500
├─31599 docker-containerd-shim 410b9ccd8458c2782215f8f0ad08294b8cbf3c3b8184cc57c1abde6a19300752 /var/run/docker/libcontainerd/410b9ccd8458c2782215f8f0ad08294b8cbf3c3b8184cc57c…
├─31733 docker-containerd-shim 692b70da1a03dd453350a2ec9b8eda68ab88fffec5174f3574c774cfe8ba7a0c /var/run/docker/libcontainerd/692b70da1a03dd453350a2ec9b8eda68ab88fffec5174f357…
├─31890 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 2376 -container-ip 172.17.0.4 -container-port 2375
└─31896 docker-containerd-shim f25ba779f0aaad2d01bd96a1c3f3da2f16160ce5b9af6d5956416a1b8ea63741 /var/run/docker/libcontainerd/f25ba779f0aaad2d01bd96a1c3f3da2f16160ce5b9af6d595…

Jul 24 07:47:23 [1]: Started Docker Application Container Engine.
Jul 24 07:47:42 [31360]: time="2017-07-24T07:47:42.917802299-04:00" level=error msg="discovery error: Get http://172.172.172.104:8500/v1/kv/docker/…on refused"
Jul 24 07:47:42 dockerd[31360]: time="2017-07-24T07:47:42.917898573-04:00" level=error msg="discovery error: Put http://172.172.172.104:8500/v1/kv/docker/…on refused"
Jul 24 07:47:42 dockerd[31360]: time="2017-07-24T07:47:42.918050458-04:00" level=error msg="discovery error: Unexpected watch error"
Jul 24 07:48:05 [31360]: time="2017-07-24T07:48:05.924421036-04:00" level=error msg="discovery error: Get http://172.172.172.104:8500/v1/kv/docker/…te to host"
Jul 24 07:48:08 [31360]: time="2017-07-24T07:48:08.930381847-04:00" level=error msg="discovery error: Put http://172.172.172.104:8500/v1/kv/docker/…te to host"
Jul 24 07:48:11 [31360]: time="2017-07-24T07:48:11.936205421-04:00" level=error msg="discovery error: Unexpected watch error"
Jul 24 07:48:27 [31360]: time="2017-07-24T07:48:27.975333177-04:00" level=warning msg="Registering as \"172.172.172.104:2376\" in discovery failed:…n sessions"
Jul 24 07:48:42 [31360]: time="2017-07-24T07:48:42.796836356-04:00" level=warning msg="2017/07/24 07:48:42 [WARN] memberlist: Binding to public add…ryption!\n"
Jul 24 07:48:42 [31360]: time="2017-07-24T07:48:42.796966912-04:00" level=info msg="2017/07/24 07:48:42 [INFO] serf: EventMemberJoin: ….172.104\n"
Hint: Some lines were ellipsized, use -l to show in full.
- Ventz on July 24, 2017 at 10:37 am said:
  
  No problem. Glad you figured it out.
  
  By the way – you should look at the new Docker “Swarm Mode” — it fixes a lot of the design issues that the “older style” cluster introduces (with consul and such)
lookman on July 24, 2017 at 5:09 am said:

hi ,

I just followed the steps, but I got an error when I tried to run docker after inserting this config:

[Service]
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 \
--cluster-store=consul://172.172.172.104:8500 \
--cluster-advertise=172.172.172.104:2376 \
-H unix:///var/run/docker.sock --dns 8.8.8.8 --dns 8.8.4.4

# here’s the error

Active: failed (Result: start-limit) since Sen 2017-07-24 05:57:50 EDT; 11s ago
Docs: https://docs.docker.com
Process: 29096 ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 --cluster-store=consul://172.172.172.104:8500 --cluster-advertise=172.172.172.104:2376 -H unix:///var/run/docker.sock --dns 8.8.8.8 --dns 8.8.4.4 (code=exited, status=1/FAILURE)
Main PID: 29096 (code=exited, status=1/FAILURE)

Jul 24 05:57:49 sw.docker.mbiz.co.id systemd[1]: Failed to start Docker Application Container Engine.
Jul 24 05:57:49 sw.docker.mbiz.co.id systemd[1]: Unit docker.service entered failed state.
Jul 24 05:57:49 sw.docker.mbiz.co.id systemd[1]: docker.service failed.
Jul 24 05:57:50 sw.docker.mbiz.co.id systemd[1]: docker.service holdoff time over, scheduling restart.
Jul 24 05:57:50 sw.docker.mbiz.co.id systemd[1]: start request repeated too quickly for docker.service
Jul 24 05:57:50 sw.docker.mbiz.co.id systemd[1]: Failed to start Docker Application Container Engine.
Jul 24 05:57:50 sw.docker.mbiz.co.id systemd[1]: Unit docker.service entered failed state.
Jul 24 05:57:50 sw.docker.mbiz.co.id systemd[1]: docker.service failed.

here’s the location of the config :

/etc/systemd/system/multi-user.target.wants/docker.service
Bernardo Corrêa on November 21, 2016 at 8:30 pm said:

Yes, sure, I got that running in no time, it is really easy. But I thought it wasn’t ready for production use yet. I had a lot of problems with the old swarm and the overlay network. A lot of containers went stale and there was no way to disconnect them from the network, compose sometimes just wouldn’t find the network, it was a real mess.

I’ll give it a shot with swarm mode and leave Consul just for general service discovery and health checks.

Thanks again.
Bernardo Corrêa on November 21, 2016 at 4:13 pm said:

Hi, thanks for the reply,

I could not really see any difference from your post, except when using Docker Hub’s token mode. The docs are not very clear on how to create the swarm using an external discovery backend like consul.

Thanks anyway,
- Ventz on November 21, 2016 at 5:24 pm said:
  
  The new one is completely different and incredibly simple.
  It has discovery (ex: consul) built in, and you no longer have to worry about scaling discovery services or having redundancy — it “just works”.
  Each “engine” can be a manager/master or a regular node. It can be promoted and demoted. It also comes with a bunch of other nice features like auto load balancing, services, etc.
  The only downside I’ve noticed so far is that it really doesn’t make sense to have 2 “nodes” as a manager and non-manager. You really want an odd # of nodes (to deal with contention), and the minimum number is 3.
  
  Here’s an article that compares the two types of docker swarm.
  https://www.infoq.com/news/2016/06/dockercon-docker-swarm
  
  Good luck.
Bernardo Corrêa on November 20, 2016 at 10:50 pm said:

Hi, thanks for the post.

Maybe it is too late to ask, but running docker -H :2376 info shows detailed info about the swarm manager. I found the output strange because I noticed that it has the following info:

Swarm:
NodeID:
Is Manager: false
Node Address:

If I run -H :2376 node ls, I get: Error response from daemon: 404 page not found.

Also, automatic load balancing does not work, and neither do regular swarm commands like node ls or service create/ls/tasks.

What am I missing?

Regards.
- Ventz on November 20, 2016 at 10:54 pm said:
  
  This is for the old “swarm engine”. It was extremely confusing and I kept seeing people wanting “simple tutorials”.
  Now that Docker’s 1.12 “swarm mode” is out, you should use that. Here is their official tutorial: https://docs.docker.com/engine/swarm/swarm-tutorial/

ibolcina on June 9, 2016 at 3:36 pm said:

I agree. Weave is great. I just hope it will be accepted into the community.

Ventz on July 31, 2016 at 1:30 am said:

Found something interesting which is relevant to what we were talking about:
https://github.com/docker/docker/blob/master/docs/userguide/networking/work-with-networks.md

Look at this section: “Network-scoped alias”


While links provide private name resolution that is localized within a container, the network-scoped alias provides a way for a container to be discovered by an alternate name by any other container within the scope of a particular network. Unlike the link alias, which is defined by the consumer of a service, the network-scoped alias is defined by the container that is offering the service to the network.

This looks like the exact solution you want.

The limitation (the docs don’t list it, but I am assuming) is multiple hosts with containers spread across them. However, between overlay networking and the new macvlan driver which came out in Docker 1.12, you can now fix that easily too: either use an overlay on top of swarm over multiple hosts, or have multiple hosts plug into the same VLAN with macvlan and 802.1q.

Ventz on June 9, 2016 at 4:00 am said:

You would have to first expose the DNS ports in Consul (-p 8600:53/udp) and then point your docker container’s DNS server to the consul server’s IP. (Better yet, add “--dns-search service.consul” to your DOCKER_OPTS, since it seems it has recursion enabled and it’s using Google’s servers.)

At this point each docker container you start will register its “--name” into consul. This will be accessible via:
$name.service.consul. If you add that to the DNS search domain, you can simply use the $name at that point.
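
A quick way to test a lookup from any machine is to query Consul's DNS interface directly with dig. This is an untested sketch: it assumes the 8600 port mapping mentioned above, consul on 10.0.0.101, and a container that really is registered under the service name passed in (u1 here is just an example name). The `consul_dig` function is made up for the illustration and only prints the command.

```shell
#!/bin/sh
CONSUL_IP="10.0.0.101"   # host running the consul container
DNS_PORT="8600"          # host port mapped to Consul's DNS port above

# Build a dig command that asks Consul's DNS interface for a service name.
consul_dig() {
    echo "dig @$CONSUL_IP -p $DNS_PORT $1.service.consul +short"
}

consul_dig u1
```

If the printed command returns nothing when you run it, the name is not registered in consul, which points at the registration step rather than at DNS.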

I haven’t tried this myself yet because for me Weave (mentioned this earlier) is simply the perfect solution.
You might need a service like “registrator” (another docker container), which monitors the unix socket and then registers container names/ips/data into consul. Ex:


$ docker run -d \
    --name=registrator \
    --net=host \
    --volume=/var/run/docker.sock:/tmp/docker.sock \
    gliderlabs/registrator:latest \
    consul://localhost:8500

Give it a try and let me know.

I still think using something like Weave is the way to go for the time being. It solves the cross-datacenter/cross-cloud problem, and it gives you a bunch of freebies in the process.

ibolcina on June 8, 2016 at 4:07 am said:

Hi. Thanks for the tutorial.

I’m trying to ping machines by name.

I create a network:

docker -H tcp://10.0.1.101:2376 network create my_swarm_net

Then I start the “u1” and “u2” containers:
docker -H tcp://10.0.1.101:2376 run -ti --name u1 --net my_swarm_net ubuntu bash

docker -H tcp://10.0.1.101:2376 run -ti --name u2 --net my_swarm_net ubuntu bash

They get addresses like “10.0.0.2” and “10.0.0.3”.
Host names are correctly resolved,

but ping (from u2->u1) returns

root@2ef627851fd7:/# ping u1
PING u1 (10.0.0.2) 56(84) bytes of data.
From 2ef627851fd7 (10.0.0.3) icmp_seq=1 Destination Host Unreachable
From 2ef627851fd7 (10.0.0.3) icmp_seq=2 Destination Host Unreachable
