Docker Lab – Networking

by Louis DeLosSantos

As I begin to dig deeper into putting Docker into production, I find myself needing a more elaborate lab. I personally use KVM on an Ubuntu system, but this should apply to any hypervisor that uses Linux bridges for networking.

I wanted end-to-end connectivity from my host system to containers hosted inside a virtual machine. For example, I should be able to create a VM named dockerhost01 from my KVM host machine, start a container inside that VM, and ping the container from the KVM host. With the default routing tables this will not work.

To help represent our configuration, I have a crude hand-drawn diagram. I'm far too lazy to start up my Windows VM just to get Visio going, so I hope this will do it justice.

[Diagram: HostWorkStation (wlan0 192.168.0.104, virbr0 192.168.122.1) -> DockerHostVM (ens3 192.168.122.11, docker0 172.17.0.1) -> container (eth0 172.17.0.2)]

In the diagram you can see we have a workstation with a WLAN adapter at 192.168.0.104. This workstation also has the default virtual bridge that KVM creates. We can think of a virtual bridge as a "switch" which lives on our host; we are, in essence, turning our workstation into a multi-port switch. Each virtual machine we power on literally "plugs" its interface into this virtual switch, named virbr0. I will return to Linux virtual bridges later in the post.
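
If you want to see the concept in isolation, a bridge can be created by hand with iproute2. A minimal sketch (br-test and eth1 are hypothetical names, not part of this lab):

#create a throwaway bridge and "plug" an interface into it
ip link add name br-test type bridge    #create the bridge (our "switch")
ip link set dev br-test up              #bring it up
ip link set dev eth1 master br-test     #eth1 is a placeholder for whatever spare NIC you have
ip link set dev eth1 nomaster           #"unplug" it again
ip link del dev br-test                 #remove the bridge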

Next you can see that we have a KVM VM named DockerHost. This VM has its own Linux virtual bridge, named docker0, which is how containers communicate with their host machine. In the same fashion, our containers "plug" their interfaces directly into docker0.

Now, if this is your first introduction to Linux bridges, you may be slightly confused. I would suggest you play with them a little to see how they work; it's a simple construct, but conceptually it can be a bit confusing. Let's install the bridge-utils package and inspect our switches:
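
The package name is the same on both distros (assuming apt on the Ubuntu workstation and yum on the RHEL/CentOS-flavored VM):

#Ubuntu workstation
apt-get install bridge-utils
#DockerHostVM (assuming a yum-based distro)
yum install bridge-utils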

#Workstation machine
root@ldubuntu:/home/ldelossa# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.5254007de659       yes             virbr0-nic
                                                        vnet0
                                                        vnet1
                                                        vnet2
                                                        vnet3
                                                        vnet4
root@ldubuntu:/home/ldelossa#

What we see here is the default Linux bridge that KVM creates. You can see that each VM I have created on my workstation is a "plugged-in" interface on the bridge. The bridge also has an IP address:

root@ldubuntu:/home/ldelossa# ip addr list
5: virbr0: mtu 1500 qdisc noqueue state UP group default
    link/ether 52:54:00:7d:e6:59 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
6: virbr0-nic: mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 500
    link/ether 52:54:00:7d:e6:59 brd ff:ff:ff:ff:ff:ff
7: vnet0: mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 500
    link/ether fe:54:00:1a:57:13 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe1a:5713/64 scope link
       valid_lft forever preferred_lft forever
8: vnet1: mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 500
    link/ether fe:54:00:61:5d:99 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe61:5d99/64 scope link
       valid_lft forever preferred_lft forever
10: vnet2: mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 500
    link/ether fe:54:00:ba:3c:e3 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:feba:3ce3/64 scope link
       valid_lft forever preferred_lft forever
11: vnet3: mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 500
    link/ether fe:54:00:88:b8:44 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe88:b844/64 scope link
       valid_lft forever preferred_lft forever
12: vnet4: mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 500
    link/ether fe:54:00:63:42:aa brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe63:42aa/64 scope link
       valid_lft forever preferred_lft forever

This IP address is used for NAT. Any node OUTSIDE of our workstation trying to reach the 192.168.122.0/24 network would first need to send packets to our workstation, which then passes them on to the Linux bridge.
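
You can see the NAT side of this on the workstation; libvirt typically installs a MASQUERADE rule for the VM network. A quick check, assuming plain iptables is in use rather than nftables:

#HostWorkStation
iptables -t nat -L POSTROUTING -n -v | grep 192.168.122    #look for a MASQUERADE rule covering 192.168.122.0/24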

And here is where things get confusing if you are not used to Linux bridging. One would think that if we traced the path from our workstation to DockerHostVM, we would see the Linux bridge's interface IP as a hop. However, we must keep in mind that our workstation IS the Linux bridge. There is no hop necessary; we are directly connected to the 192.168.122.0/24 network simply by BEING the bridge.

Okay, if that doesn't make too much sense, don't worry. The same thing can be demonstrated on DockerHostVM; the same concepts are at play, except Docker names its Linux bridge docker0.
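
You can run the same inspection there; a quick look (output omitted here) should show docker0 with the containers plugged into it:

#DockerHostVM
brctl show docker0    #each running container shows up as a veth* interface "plugged" into docker0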

So, looking at our diagram, what is the main goal we need to accomplish? We need to get packets originating from HostWorkStation and destined for the container network (172.17.0.0/16) over to DockerHostVM, and then on to the container. The reply packets must follow the same path in reverse: from the container, up through DockerHostVM, and back to HostWorkStation.

So let's take a look at what we know. We know that both Linux machines are going to need to act as packet-forwarding routers, so let's handle that first and enable packet forwarding on both machines.

Run the following command on both machines:

sysctl -w net.ipv4.ip_forward=1
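
Note that sysctl -w only lasts until the next reboot. To make the setting permanent, something like this should work on both machines (assuming /etc/sysctl.conf is honored, as it is on Ubuntu and CentOS):

#persist across reboots
echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
sysctl -p    #reload the file and confirm the setting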

Okay, cool. So now we have the mechanism that will allow us to forward packets based on our routing table entries. That last statement was a hint about where to go next: let's get an idea of what the routing tables look like on each machine in play.

#HostWorkStation
ldelossa@ldubuntu:~$ ip route
default via 192.168.0.1 dev wlan0 proto static metric 600
192.168.0.0/24 dev wlan0 proto kernel scope link src 192.168.0.104 metric 600
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1

#DockerHostVM
[root@dockerhost01 ~]# ip route
default via 192.168.122.1 dev ens3 proto static metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.11 metric 100

#Container
root@22e95e83a211:/# ip route
default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2

Okay, so let's look at this information from the bottom up. First, the container's routing table. Not much going on here, which is fine. All traffic leaving the container heads the only way it can: out its one interface, which we know is virtually "plugged" into docker0.

Next we take a look at our DockerHostVM routing table. Here we have a little more complexity, but still nothing bad. We know to direct packets heading to the container network (172.17.0.0/16) toward the docker0 bridge. We also know that we are directly connected to 192.168.122.0/24 via our ens3 interface, which is "plugged" into the virbr0 bridge on our HOST machine. So any packets whose destination we aren't sure about get sent up to the Linux bridge on HostWorkStation via the default route.

Now we have our HostWorkStation. This is our point of interest. My workstation's default route sends all unknown packets out my wireless LAN interface, which is appropriate: that's where the internet is, and consequently where we should send any unknown packets. Next we have a route for our VM network (192.168.122.0/24), directing any packets that need to reach our VMs to the virbr0 interface. This is all great and dandy, but what are we missing?

We need to tell our HostWorkStation to send packets destined for our Docker containers somewhere other than out my wireless LAN interface. Right now there is no route for 172.17.0.0/16, so those packets head out wlan0 and die. So where do we need to send them? My first inclination was to send those packets to our bridge interface, 192.168.122.1; however, this is incorrect. We need to remember that our HostWorkStation IS the bridge. The bridge interface is not a separate device with its own routing table; it literally is the Linux machine we are running. Therefore we want to route packets to the next device that knows how to get to our Docker containers: DockerHostVM (192.168.122.11).

Let’s do just that

#HostWorkStation

ip route add 172.17.0.0/16 via 192.168.122.11
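
As a quick sanity check, you can ask the kernel which route it will now use for the container IP:

#HostWorkStation
ip route get 172.17.0.2    #should resolve via 192.168.122.11 dev virbr0, sourced from 192.168.122.1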

Let’s test connectivity

ldelossa@ldubuntu:~$ ping 172.17.0.2

PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.

64 bytes from 172.17.0.2: icmp_seq=1 ttl=63 time=0.815 ms

64 bytes from 172.17.0.2: icmp_seq=2 ttl=63 time=0.466 ms

64 bytes from 172.17.0.2: icmp_seq=3 ttl=63 time=0.550 ms

64 bytes from 172.17.0.2: icmp_seq=4 ttl=63 time=0.539 ms

64 bytes from 172.17.0.2: icmp_seq=5 ttl=63 time=0.396 ms

64 bytes from 172.17.0.2: icmp_seq=6 ttl=63 time=0.452 ms

64 bytes from 172.17.0.2: icmp_seq=7 ttl=63 time=0.554 ms

64 bytes from 172.17.0.2: icmp_seq=8 ttl=63 time=0.568 ms

Very nice!

So, the full picture: how does this work?

#from workstation to container
1) We generate a ping from our workstation toward 172.17.0.2
2) The workstation looks at its routing table and says "Okay, I want to send to 172.17.0.2; no problem, I'll send these packets over to 192.168.122.11"
3) The networking stack then determines where to source this ping from. It sees that it has an interface on the 192.168.122.0/24 network (the bridge) and sends the ping out that interface, over to 192.168.122.11, sourced from 192.168.122.1. (This step is exactly why we do not see 192.168.122.1 as a hop: the networking stack owns the bridge and can source packets from it directly. The traceroute sketch below shows the same thing.)
4) Our DockerHostVM (192.168.122.11) receives this packet with the destination of 172.17.0.2
5) Our DockerHostVM does a routing table lookup, finds its route for 172.17.0.0/16 pointing toward docker0, and sends the packet that way.
6) The packet is delivered via the bridge to the correct container
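
A traceroute makes the "no hop at the bridge" point from step 3 visible (a sketch, assuming traceroute is installed on the workstation):

#HostWorkStation
traceroute -n 172.17.0.2
#expect a single intermediate hop at 192.168.122.11 (DockerHostVM) and no hop at 192.168.122.1;
#this also matches the ttl=63 we saw in the ping replies (one router in the path)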

Let's see this with tcpdump. We will initiate a ping from our HostWorkStation toward 172.17.0.2 and run tcpdump on both DockerHostVM and the container.
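
The captures below were taken with a bare tcpdump, which is why DNS PTR lookups show up mixed in with the pings. For a quieter capture you can skip name resolution and filter to ICMP only:

#DockerHostVM
tcpdump -ni docker0 icmp
#Container
tcpdump -ni eth0 icmp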

#HostWorkStation

ldelossa@ldubuntu:~$ ping 172.17.0.2

PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.

64 bytes from 172.17.0.2: icmp_seq=1 ttl=63 time=0.613 ms

64 bytes from 172.17.0.2: icmp_seq=2 ttl=63 time=0.665 ms

64 bytes from 172.17.0.2: icmp_seq=3 ttl=63 time=0.477 ms

64 bytes from 172.17.0.2: icmp_seq=4 ttl=63 time=0.650 ms

64 bytes from 172.17.0.2: icmp_seq=5 ttl=63 time=0.535 ms

64 bytes from 172.17.0.2: icmp_seq=6 ttl=63 time=0.583 ms

64 bytes from 172.17.0.2: icmp_seq=7 ttl=63 time=0.632 ms

64 bytes from 172.17.0.2: icmp_seq=8 ttl=63 time=0.448 ms

64 bytes from 172.17.0.2: icmp_seq=9 ttl=63 time=0.611 ms

#DockerHostVM

[root@dockerhost01 ~]# tcpdump

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on docker0, link-type EN10MB (Ethernet), capture size 65535 bytes

23:43:47.053790 IP 192.168.122.1 > 172.17.0.2: ICMP echo request, id 24412, seq 1, length 64

23:43:47.053956 IP 172.17.0.2 > 192.168.122.1: ICMP echo reply, id 24412, seq 1, length 64

23:43:47.055546 IP 172.17.0.2.55398 > dns02.kvm.lan.domain: 8571+ PTR? 1.122.168.192.in-addr.arpa. (44)

23:43:47.056461 IP dns02.kvm.lan.domain > 172.17.0.2.55398: 8571 NXDomain* 0/1/0 (99)

23:43:47.056987 IP 172.17.0.2.36237 > dns02.kvm.lan.domain: 25760+ PTR? 3.122.168.192.in-addr.arpa. (44)

23:43:47.058016 IP dns02.kvm.lan.domain > 172.17.0.2.36237: 25760* 1/2/2 PTR dns02.kvm.lan. (137)

23:43:48.052682 IP 192.168.122.1 > 172.17.0.2: ICMP echo request, id 24412, seq 2, length 64

23:43:48.052936 IP 172.17.0.2 > 192.168.122.1: ICMP echo reply, id 24412, seq 2, length 64

23:43:49.051689 IP 192.168.122.1 > 172.17.0.2: ICMP echo request, id 24412, seq 3, length 64

23:43:49.051818 IP 172.17.0.2 > 192.168.122.1: ICMP echo reply, id 24412, seq 3, length 64

23:43:50.050764 IP 192.168.122.1 > 172.17.0.2: ICMP echo request, id 24412, seq 4, length 64

23:43:50.050964 IP 172.17.0.2 > 192.168.122.1: ICMP echo reply, id 24412, seq 4, length 64

23:43:51.050628 IP 192.168.122.1 > 172.17.0.2: ICMP echo request, id 24412, seq 5, length 64

23:43:51.050728 IP 172.17.0.2 > 192.168.122.1: ICMP echo reply, id 24412, seq 5, length 64

23:43:52.050675 IP 192.168.122.1 > 172.17.0.2: ICMP echo request, id 24412, seq 6, length 64

23:43:52.050817 IP 172.17.0.2 > 192.168.122.1: ICMP echo reply, id 24412, seq 6, length 64

#Container

[root@dockerhost01 ~]# docker exec -it nipap-psql01 /bin/bash

root@22e95e83a211:/# tcpdump

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

04:43:47.053812 IP 192.168.122.1 > 22e95e83a211: ICMP echo request, id 24412, seq 1, length 64

04:43:47.053953 IP 22e95e83a211 > 192.168.122.1: ICMP echo reply, id 24412, seq 1, length 64

04:43:47.055538 IP 22e95e83a211.55398 > dns02.kvm.lan.domain: 8571+ PTR? 1.122.168.192.in-addr.arpa. (44)

04:43:47.056467 IP dns02.kvm.lan.domain > 22e95e83a211.55398: 8571 NXDomain* 0/1/0 (99)

04:43:47.056981 IP 22e95e83a211.36237 > dns02.kvm.lan.domain: 25760+ PTR? 3.122.168.192.in-addr.arpa. (44)

04:43:47.058036 IP dns02.kvm.lan.domain > 22e95e83a211.36237: 25760* 1/2/2 PTR dns02.kvm.lan. (137)

04:43:48.052782 IP 192.168.122.1 > 22e95e83a211: ICMP echo request, id 24412, seq 2, length 64

04:43:48.052933 IP 22e95e83a211 > 192.168.122.1: ICMP echo reply, id 24412, seq 2, length 64

04:43:49.051751 IP 192.168.122.1 > 22e95e83a211: ICMP echo request, id 24412, seq 3, length 64

04:43:49.051816 IP 22e95e83a211 > 192.168.122.1: ICMP echo reply, id 24412, seq 3, length 64

04:43:50.050798 IP 192.168.122.1 > 22e95e83a211: ICMP echo request, id 24412, seq 4, length 64

04:43:50.050960 IP 22e95e83a211 > 192.168.122.1: ICMP echo reply, id 24412, seq 4, length 64

04:43:51.050651 IP 192.168.122.1 > 22e95e83a211: ICMP echo request, id 24412, seq 5, length 64

04:43:51.050726 IP 22e95e83a211 > 192.168.122.1: ICMP echo reply, id 24412, seq 5, length 64

04:43:52.050706 IP 192.168.122.1 > 22e95e83a211: ICMP echo request, id 24412, seq 6, length 64

04:43:52.050814 IP 22e95e83a211 > 192.168.122.1: ICMP echo reply, id 24412, seq 6, length 64

We have successfully achieved end-to-end connectivity from our KVM host to Docker containers hosted inside a VM. This now allows us to lab the more exciting aspects of Docker, such as multi-host overlay networking, CI deployments, orchestration, and logging, just to name a few.

I hope this helped anyone looking to perform a similar setup.

Louis DeLosSantos is a Sr. DevOps engineer and an avid contributor to thought and writing in the tech community.