I sometimes use Docker on my MacBook, but I had no idea how containers and the host communicate. To understand it, I played around with Docker for Mac and left my notes here.

Version info

Server: Docker Desktop 4.15.0 (93002)
 Engine:
  Version:          20.10.21
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.7
  Git commit:       3056208
  Built:            Tue Oct 25 18:00:19 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.10
  GitCommit:        770bd0108c32f3fb5c73ae1264f7e503fe7b2661
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Container setup

I deployed two containers, minikube and nginx, because I wanted to learn about k8s infrastructure as well 🙃

% docker ps
CONTAINER ID   IMAGE                                 COMMAND                  CREATED          STATUS          PORTS                                                                                                                        NAMES
7fa649f74462   ade71b98a9cb                          "/docker-entrypoint.…"   28 seconds ago   Up 27 seconds   80/tcp, 0.0.0.0:8080->8080/tcp                                                                                               wonderful_poitras
5094d821aff9   gcr.io/k8s-minikube/kicbase:v0.0.36   "/usr/local/bin/entr…"   4 days ago       Up 5 hours      0.0.0.0:49674->22/tcp, 0.0.0.0:49675->2376/tcp, 0.0.0.0:49672->5000/tcp, 0.0.0.0:49673->8443/tcp, 0.0.0.0:49671->32443/tcp   minikube

These ports were published on the host:

  • minikube: 49674, 49675, 49672, 49673, 49671
  • nginx: 8080
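
The list above can be read straight off the PORTS column of `docker ps`. Here is a minimal Python sketch of that reading, assuming the column keeps the `ip:host_port->container_port/proto` shape shown in the output; the function name `published_ports` is made up for illustration.

```python
import re

# Matches one published mapping such as "0.0.0.0:49674->22/tcp".
_MAPPING = re.compile(r"[\d.]+:(?P<host_port>\d+)->(?P<container_port>\d+)/\w+")


def published_ports(ports_column: str) -> dict:
    """Return {host_port: container_port} for every published mapping.

    Entries without "->" (for example "80/tcp") are exposed by the image but
    not published on the host, so they are skipped.
    """
    mapping = {}
    for match in _MAPPING.finditer(ports_column):
        mapping[int(match["host_port"])] = int(match["container_port"])
    return mapping


# PORTS values copied from the `docker ps` output above.
nginx = published_ports("80/tcp, 0.0.0.0:8080->8080/tcp")
minikube = published_ports(
    "0.0.0.0:49674->22/tcp, 0.0.0.0:49675->2376/tcp, 0.0.0.0:49672->5000/tcp, "
    "0.0.0.0:49673->8443/tcp, 0.0.0.0:49671->32443/tcp"
)
print(sorted(nginx), sorted(minikube))
```

Note that nginx's `80/tcp` entry does not show up in the result: it is only exposed inside the container, not published on macOS.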
Note
Minikube can be deployed with various drivers: docker, hyperkit, and so on. I selected the docker driver for learning here. ref:

I confirmed that these ports were in the LISTEN state on macOS.

% lsof -nP | grep LISTEN 
com.docke  1154 yoshi   21u     IPv6 0x4dda2711431557c7         0t0                 TCP *:49675 (LISTEN)
com.docke  1154 yoshi   25u     IPv6 0x4dda2711431588c7         0t0                 TCP *:49671 (LISTEN)
com.docke  1154 yoshi   57u     IPv6 0x4dda2711431549c7         0t0                 TCP *:49672 (LISTEN)
com.docke  1154 yoshi   59u     IPv6 0x4dda2711431534c7         0t0                 TCP *:49673 (LISTEN)
com.docke  1154 yoshi   91u     IPv6 0x4dda271143153bc7         0t0                 TCP *:49674 (LISTEN)
com.docke  1154 yoshi  119u     IPv6 0x4dda2711431629c7         0t0                 TCP *:8080 (LISTEN)
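
As a cross-check that does not depend on `lsof`, a listening port can also be probed by simply trying to connect to it. This is only a sketch: the helper name `is_listening` is hypothetical, and the port numbers in the usage part are the ones from my environment.

```python
import socket


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False


if __name__ == "__main__":
    # Ports published by the two containers in this post.
    for port in (8080, 49671, 49672, 49673, 49674, 49675):
        print(port, "open" if is_listening(port) else "closed")
```

A successful connect only proves that something accepts connections on the port; it does not tell you which process owns it, which is what `lsof` adds.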

I also saw that the following docker networks had been created.

% docker network ls
NETWORK ID     NAME       DRIVER    SCOPE
8f1d9243e23c   bridge     bridge    local
99cef85b950f   host       host      local
a22db289b316   minikube   bridge    local
50800bebdd2d   none       null      local

bridge is the default docker bridge, backed by the docker0 interface, and the nginx container is connected to it.

% docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "8f1d9243e23c0d5280682fa7321ef6194e952a7e2e7e89ed1e69fb24c74479b4",
        "Created": "2023-01-08T02:51:49.564851454Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "7fa649f7446290f36440b0195b6a3125d9baaa1f24e7910e57385048fc91cff1": {
                "Name": "wonderful_poitras",
                "EndpointID": "da56a56a50027cc085db8067f9ce1dba5bd489cff3e74133a878bf0799438295",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
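
The interesting fields in that JSON are the subnet, the gateway, the bridge interface name and the attached containers. A small Python sketch that pulls them out is below; the embedded JSON is an abbreviated copy of the output above (container ID shortened), and `summarize` is a name I made up.

```python
import ipaddress
import json

# Abbreviated copy of the `docker network inspect bridge` output shown above.
INSPECT_OUTPUT = """
[{"Name": "bridge",
  "IPAM": {"Config": [{"Subnet": "172.17.0.0/16", "Gateway": "172.17.0.1"}]},
  "Containers": {"7fa649f74462": {"Name": "wonderful_poitras",
                                  "IPv4Address": "172.17.0.2/16"}},
  "Options": {"com.docker.network.bridge.name": "docker0"}}]
"""


def summarize(inspect_json: str) -> dict:
    """Extract bridge interface, gateway and container addresses."""
    network = json.loads(inspect_json)[0]
    config = network["IPAM"]["Config"][0]
    subnet = ipaddress.ip_network(config["Subnet"])
    containers = {}
    for info in network["Containers"].values():
        address = ipaddress.ip_interface(info["IPv4Address"]).ip
        # Record the address and whether it lies inside the bridge subnet.
        containers[info["Name"]] = (str(address), address in subnet)
    return {
        "bridge_interface": network["Options"].get("com.docker.network.bridge.name"),
        "gateway": config["Gateway"],
        "containers": containers,
    }


summary = summarize(INSPECT_OUTPUT)
print(summary)
```

User-defined networks such as minikube do not carry the `com.docker.network.bridge.name` option in my output, which is why the sketch uses `.get()` there.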

minikube is the bridge network that the minikube container belongs to.

% docker network inspect minikube
[
    {
        "Name": "minikube",
        "Id": "a22db289b316d78535149a49c879fa8c28cf7c5905cc9a637b2de1a8bfbcaed1",
        "Created": "2023-01-04T05:58:20.3787848Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "192.168.49.0/24",
                    "Gateway": "192.168.49.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "5094d821aff9827358b7bd05a694efaf690f7d093eab2e6134eb1b44c5e63ccb": {
                "Name": "minikube",
                "EndpointID": "8adc610e615c4b23348163bb6656eaa5c5dbd87f35be0c73c6d7195db2424961",
                "MacAddress": "02:42:c0:a8:31:02",
                "IPv4Address": "192.168.49.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "--icc": "",
            "--ip-masq": "",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {
            "created_by.minikube.sigs.k8s.io": "true",
            "name.minikube.sigs.k8s.io": "minikube"
        }
    }
]
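
One detail stands out in both inspect outputs: the MAC addresses look like `02:42` followed by the container's IPv4 address in hex. I am treating this as an observation about this Docker version rather than a guarantee. The sketch below decodes the last four octets; `ipv4_from_docker_mac` is a hypothetical helper name.

```python
import ipaddress


def ipv4_from_docker_mac(mac: str) -> str:
    """Decode the last four octets of a MAC address as an IPv4 address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    # Skip the two-byte 02:42 prefix and read the remaining 4 bytes as IPv4.
    return str(ipaddress.IPv4Address(octets[2:]))


print(ipv4_from_docker_mac("02:42:ac:11:00:02"))  # nginx on the default bridge
print(ipv4_from_docker_mac("02:42:c0:a8:31:02"))  # minikube
```

Both results agree with the `IPv4Address` fields in the JSON (172.17.0.2 and 192.168.49.2).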

So, I wanted to figure out how the docker network is laid out virtually and how it works under virtualization.

I was curious about:

  • How is a port on localhost (macOS) connected to a docker container's IP address and port? More specifically, how is localhost:8080 mapped to 172.17.0.2:8080?
  • How does docker route packets internally? Which Linux technologies make that routing possible under docker's virtualization?

image

Under the hood: networking on Hyperkit

However, as the Docker documentation notes, ifconfig on macOS did not show the docker0 and minikube bridges; these networks are not visible on the Mac. So where can I find the actual network interfaces and their configuration?

Because of the way networking is implemented in Docker Desktop, you cannot see a docker0 interface on the host. This interface is actually within the virtual machine.

ref: https://docs.docker.com/desktop/networking/#there-is-no-docker0-bridge-on-the-host

I remembered that dockerd (the daemon) runs inside a Linux VM (LinuxKit on hyperkit), not directly on macOS.

So, I dove into hyperkit. To get a shell on the Linux VM, I connected to its debug Unix domain socket.

nc -U ~/Library/Containers/com.docker.docker/Data/debug-shell.sock
Tip

You can also use a privileged docker container to get a login shell:

docker run -it --rm --privileged --pid=host justincormack/nsenter1

ref: https://gist.github.com/BretFisher/5e1a0c7bcca4c735e716abf62afad389
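
The `nc -U` command above does nothing more than open a Unix domain stream socket and relay bytes. A rough Python equivalent is sketched below. The helper names are hypothetical, and I am assuming the debug shell reads newline-terminated commands and keeps the connection open, so the reader stops after a quiet period.

```python
import socket


def open_unix_stream(path: str, timeout: float = 5.0) -> socket.socket:
    """Connect to a Unix domain stream socket, like `nc -U <path>` does."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
    except OSError:
        client.close()
        raise
    return client


def run_command(path: str, command: str) -> bytes:
    """Send one command line and return whatever arrives before the timeout."""
    with open_unix_stream(path, timeout=1.0) as client:
        client.sendall(command.encode() + b"\n")
        chunks = []
        try:
            while True:
                data = client.recv(4096)
                if not data:
                    break  # peer closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # a shell stays open, so stop reading after a quiet period
        return b"".join(chunks)
```

With the socket path from this post it would be called as `run_command(os.path.expanduser("~/Library/Containers/com.docker.docker/Data/debug-shell.sock"), "uname -a")`, after `import os`.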

Now I was on the Linux VM where dockerd runs! I verified that dockerd was up and running with ps:

# ps | grep dockerd
 1202 root      0:01 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id docker -address /run/containerd/containerd.sock
 1221 root      0:00 /usr/bin/docker-init /usr/bin/entrypoint.sh
 1544 root      0:00 /usr/bin/trim-after-delete -- /sbin/fstrim /var/lib/docker
 1661 root      0:00 /usr/bin/logwrite -n dockerd /usr/local/bin/dockerd --containerd /var/run/desktop-containerd/containerd.sock --pidfile /run/desktop/docker.pid --swarm-default-advertise-addr=eth0 --host-gateway-ip 192.168.65.2
 1668 root      0:11 /usr/local/bin/dockerd --containerd /var/run/desktop-containerd/containerd.sock --pidfile /run/desktop/docker.pid --swarm-default-advertise-addr=eth0 --host-gateway-ip 192.168.65.2

But I was unable to run the dockerd command on linuxkit… Weird.

/ # dockerd
/bin/sh: dockerd: not found

To find the parent processes of dockerd, I ran pstree on linuxkit.

/ # pstree
init-+-containerd---8*[{containerd}]
     |-3*[containerd-shim---10*[{containerd-shim}]]
     |-containerd-shim-+-allowlist---5*[{allowlist}]
     |                 `-11*[{containerd-shim}]
     |-containerd-shim-+-devenv-server---5*[{devenv-server}]
     |                 `-11*[{containerd-shim}]
     |-containerd-shim-+-dhcpcd
     |                 `-11*[{containerd-shim}]
     |-containerd-shim-+-diagnosticsd-+-5*[sh]
     |                 |              `-9*[{diagnosticsd}]
     |                 `-11*[{containerd-shim}]
     |-containerd-shim-+-dns-forwarder---9*[{dns-forwarder}]
     |                 `-11*[{containerd-shim}]
     |-containerd-shim-+-containerd-shim-+-systemd-+-containerd---10*[{containerd}]
     |                 |                 |         |-5*[containerd-shim-+-pause]
     |                 |                 |         |                    `-10*[{containerd-shim}]]
     |                 |                 |         |-containerd-shim-+-etcd---10*[{etcd}]
     |                 |                 |         |                 `-11*[{containerd-shim}]
     |                 |                 |         |-containerd-shim-+-kube-controller---13*[{kube-controller}]
     |                 |                 |         |                 `-10*[{containerd-shim}]
     |                 |                 |         |-containerd-shim-+-kube-scheduler---8*[{kube-scheduler}]
     |                 |                 |         |                 `-10*[{containerd-shim}]
     |                 |                 |         |-2*[containerd-shim-+-pause]
     |                 |                 |         |                    `-11*[{containerd-shim}]]
     |                 |                 |         |-containerd-shim-+-kube-apiserver---15*[{kube-apiserver}]
     |                 |                 |         |                 `-10*[{containerd-shim}]
     |                 |                 |         |-containerd-shim-+-kube-proxy---7*[{kube-proxy}]
     |                 |                 |         |                 `-10*[{containerd-shim}]
     |                 |                 |         |-containerd-shim-+-coredns---9*[{coredns}]
     |                 |                 |         |                 `-10*[{containerd-shim}]
     |                 |                 |         |-containerd-shim-+-storage-provisi---6*[{storage-provisi}]
     |                 |                 |         |                 `-10*[{containerd-shim}]
     |                 |                 |         |-cri-dockerd---10*[{cri-dockerd}]
     |                 |                 |         |-dbus-daemon
     |                 |                 |         |-dockerd---32*[{dockerd}]
     |                 |                 |         |-kubelet---15*[{kubelet}]
     |                 |                 |         |-sshd
     |                 |                 |         `-systemd-journal
     |                 |                 `-11*[{containerd-shim}]
     |                 |-containerd-shim-+-nginx---4*[nginx]
     |                 |                 `-11*[{containerd-shim}]
     |                 |-containerd-shim-+-sh---pstree
     |                 |                 `-10*[{containerd-shim}]
     |                 |-docker-init---entrypoint.sh---logwrite-+-lifecycle-serve-+-logwrite-+-containerd---8*[{containerd}]
     |                 |                                        |                 |          `-5*[{logwrite}]
     |                 |                                        |                 |-logwrite-+-dockerd---27*[{dockerd}]
     |                 |                                        |                 |          `-5*[{logwrite}]
     |                 |                                        |                 `-12*[{lifecycle-serve}]
     |                 |                                        `-4*[{logwrite}]

Ah, dockerd seems to be running inside a container managed by containerd. That’s why I could not run dockerd directly on linuxkit!

/ # ctr namespace ls
NAME              LABELS 
services.linuxkit 

/ # ctr -n services.linuxkit container ls
CONTAINER            IMAGE    RUNTIME                  
acpid                -        io.containerd.runc.v2    
allowlist            -        io.containerd.runc.v2    
binfmt               -        io.containerd.runc.v2    
devenv-service       -        io.containerd.runc.v2    
dhcpcd               -        io.containerd.runc.v2    
diagnose             -        io.containerd.runc.v2    
dns-forwarder        -        io.containerd.runc.v2    
docker               -        io.containerd.runc.v2    
http-proxy           -        io.containerd.runc.v2    
kmsg                 -        io.containerd.runc.v2    
sntpc                -        io.containerd.runc.v2    
socks                -        io.containerd.runc.v2    
trim-after-delete    -        io.containerd.runc.v2    
volume-contents      -        io.containerd.runc.v2    
vpnkit-forwarder     -        io.containerd.runc.v2

OK, the supporting processes were running as containers on containerd. So, I had to enter the docker container run by containerd, which is where dockerd lives.

/ # ctr --namespace services.linuxkit tasks exec --exec-id dockerd docker bash  
dockerd
time="2023-01-09T04:22:53.812403739Z" level=info msg="Starting up"
time="2023-01-09T04:22:53.814929367Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
time="2023-01-09T04:22:53.815148608Z" level=debug msg="Listener created for HTTP on unix (/var/run/docker.sock)"
time="2023-01-09T04:22:53.816622901Z" level=warning msg="could not change group /run/guest-services/docker.sock to docker: group docker not found"
time="2023-01-09T04:22:53.816727802Z" level=debug msg="Listener created for HTTP on unix (/run/guest-services/docker.sock)"
time="2023-01-09T04:22:53.816744792Z" level=debug msg="Containerd not running, starting daemon managed containerd"
time="2023-01-09T04:22:53.817736235Z" level=info msg="libcontainerd: started new containerd process" pid=62027
time="2023-01-09T04:22:53.817781680Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2023-01-09T04:22:53.817789148Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc

Yes! dockerd was residing in this container.

Docker bridge on Linuxkit

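Back on the Linuxkit VM, I was finally able to find the docker bridges. br-a22db289b316 is the minikube bridge: its name carries the first 12 characters of the minikube network ID (a22db289b316), and it holds the gateway address 192.168.49.1.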

/ # ip addr
7: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:4d:20:dd:fd brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:4dff:fe20:ddfd/64 scope link 
       valid_lft forever preferred_lft forever
8: br-a22db289b316: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:29:ca:ee:4f brd ff:ff:ff:ff:ff:ff
    inet 192.168.49.1/24 brd 192.168.49.255 scope global br-a22db289b316
       valid_lft forever preferred_lft forever
    inet6 fe80::42:29ff:feca:ee4f/64 scope link 
       valid_lft forever preferred_lft forever
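
The `inet6 fe80::…` addresses in this output are not random: they follow from each bridge's MAC address via the modified EUI-64 scheme (flip the universal/local bit, insert `ff:fe` in the middle). A short Python sketch that reproduces them, with `eui64_link_local` as a made-up helper name:

```python
import ipaddress


def eui64_link_local(mac: str) -> str:
    """Build the fe80::/64 address that modified EUI-64 derives from a MAC."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe in the middle to stretch 48 bits to a 64-bit interface ID.
    interface_id = bytes(octets[:3] + [0xFF, 0xFE] + octets[3:])
    prefix = bytes([0xFE, 0x80]) + bytes(6)  # fe80::/64
    return str(ipaddress.IPv6Address(prefix + interface_id))


print(eui64_link_local("02:42:4d:20:dd:fd"))  # docker0
print(eui64_link_local("02:42:29:ca:ee:4f"))  # br-a22db289b316
```

The two results match the link-local addresses that `ip addr` printed for docker0 and br-a22db289b316.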

Also, these veth interfaces connect the bridges to the containers:

  • docker0 -> vethecfe79b@if19 -> nginx
  • br-a22db289b316 (minikube) -> vethc741c48@if15 -> minikube
/ # ip link show
16: vethc741c48@if15: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue master br-a22db289b316 state UP 
    link/ether 5a:65:68:0c:39:9b brd ff:ff:ff:ff:ff:ff
20: vethecfe79b@if19: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue master docker0 state UP 
    link/ether 36:3b:7c:d8:89:93 brd ff:ff:ff:ff:ff:ff
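
In `ip link` output, the `@ifN` suffix is the interface index of the other end of the veth pair (which lives in the container's network namespace), and `master` names the bridge the host end is plugged into. A small parsing sketch, assuming the line layout shown above; `parse_link` is a hypothetical name:

```python
import re

# Header looks like "16: vethc741c48@if15: <FLAGS> mtu 1500 ... master br-... state UP".
_HEADER = re.compile(r"^(\d+): ([^@:\s]+)(?:@if(\d+))?:")
_MASTER = re.compile(r"\bmaster (\S+)")


def parse_link(line: str):
    """Parse one `ip link` header line; return None for continuation lines."""
    header = _HEADER.match(line)
    if header is None:
        return None
    master = _MASTER.search(line)
    return {
        "index": int(header.group(1)),
        "name": header.group(2),
        "peer_index": int(header.group(3)) if header.group(3) else None,
        "bridge": master.group(1) if master else None,
    }


line = ("16: vethc741c48@if15: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> "
        "mtu 1500 qdisc noqueue master br-a22db289b316 state UP")
print(parse_link(line))
```

Running `ip link` inside the container should then show an interface whose own index equals `peer_index`, which is how the two ends can be matched up.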

Linuxkit LISTEN ports

For nginx, port 8080 was in the LISTEN state on the Linux VM as well. The minikube ports exposed on macOS (49674, 49675, 49672, 49673, 49671), however, were mapped to a different port range on the VM (55000 to 55004), bound to 127.0.0.1.

/ # netstat -apnt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.1:55003         0.0.0.0:*               LISTEN      1668/dockerd
tcp        0      0 127.0.0.1:55002         0.0.0.0:*               LISTEN      1668/dockerd
tcp        0      0 127.0.0.1:55001         0.0.0.0:*               LISTEN      1668/dockerd
tcp        0      0 127.0.0.1:55000         0.0.0.0:*               LISTEN      1668/dockerd
tcp        0      0 127.0.0.1:55004         0.0.0.0:*               LISTEN      1668/dockerd
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      1668/dockerd

Routing on Linuxkit

I had identified the macOS port -> Linux VM port mapping, and confirmed that the docker bridges really do exist on the Linux VM. The last mile is the IP address and port translation from the Linux VM ports to the docker containers. This translation is done by iptables. The ports on the Linux VM are held by dockerd, so I went into the dockerd container again.

/ # ctr --namespace services.linuxkit tasks exec --exec-id dockerd docker bash 
iptables -L -t nat

Chain DOCKER (2 references)
target     prot opt source               destination         
DNAT       tcp  --  anywhere             localhost            tcp dpt:55000 to:192.168.49.2:32443
DNAT       tcp  --  anywhere             localhost            tcp dpt:55001 to:192.168.49.2:8443
DNAT       tcp  --  anywhere             localhost            tcp dpt:55002 to:192.168.49.2:5000
DNAT       tcp  --  anywhere             localhost            tcp dpt:55003 to:192.168.49.2:2376
DNAT       tcp  --  anywhere             localhost            tcp dpt:55004 to:192.168.49.2:22
DNAT       tcp  --  anywhere             anywhere             tcp dpt:http-alt to:172.17.0.2:8080
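
Each DNAT rule rewrites "destination port on the VM" into "container IP and port". The sketch below parses rules in the layout printed above. Note that `iptables -L` prints service names instead of numbers (`http-alt` is 8080), so the small `SERVICE_PORTS` table is an assumption of this sketch that covers only the name seen here; `parse_dnat` is a made-up helper.

```python
import re

# Service names that `iptables -L` prints instead of port numbers.
# Only the one appearing in the output above is listed.
SERVICE_PORTS = {"http-alt": 8080}

_RULE = re.compile(r"dpt:(?P<dport>\S+) to:(?P<ip>[\d.]+):(?P<port>\d+)")


def parse_dnat(rule: str):
    """Return (vm_port, container_ip, container_port), or None if not a DNAT rule."""
    if not rule.startswith("DNAT"):
        return None
    match = _RULE.search(rule)
    if match is None:
        return None
    dport = match["dport"]
    vm_port = int(dport) if dport.isdigit() else SERVICE_PORTS[dport]
    return vm_port, match["ip"], int(match["port"])


print(parse_dnat("DNAT  tcp  --  anywhere  localhost  tcp dpt:55004 to:192.168.49.2:22"))
print(parse_dnat("DNAT  tcp  --  anywhere  anywhere  tcp dpt:http-alt to:172.17.0.2:8080"))
```

Passing `-n` to `iptables -L` prints numeric ports and addresses, which would make the service-name lookup unnecessary.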

In older versions of Docker for Mac, slirp-proxy was used for this forwarding, but the current implementation seems to rely on iptables.

/ # cat /proc/version
Linux version 5.15.49-linuxkit (root@buildkitsandbox) (gcc (Alpine 10.2.1_pre1) 10.2.1 20201203, GNU ld (GNU Binutils) 2.35.2) #1 SMP Tue Sep 13 07:51:46 UTC 2022

Conclusion

In summary, Docker for Mac networking looks like the image below:

image

Warning
veth pairs are omitted from the above image. There should be veth pairs between the Linux bridges and the docker containers.
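
To tie the pieces together, the two mappings observed in this post (macOS port -> container port from `docker ps`, and VM port -> container IP:port from iptables) can be joined on the container port. This is an inference on my part, not something I traced packet by packet, and it only works because every container port here is unique. The name `trace` is made up.

```python
# Values copied from the `docker ps` and `iptables -L -t nat` outputs in this post.
PUBLISHED = {8080: 8080, 49674: 22, 49675: 2376, 49672: 5000, 49673: 8443, 49671: 32443}
DNAT = {
    8080: ("172.17.0.2", 8080),
    55004: ("192.168.49.2", 22),
    55003: ("192.168.49.2", 2376),
    55002: ("192.168.49.2", 5000),
    55001: ("192.168.49.2", 8443),
    55000: ("192.168.49.2", 32443),
}


def trace(mac_port: int):
    """Return (vm_port, container_ip, container_port) for a macOS port, or None.

    Assumption: the container port is unique across rules, so it can be used
    as the join key between the two tables.
    """
    container_port = PUBLISHED.get(mac_port)
    for vm_port, (ip, port) in DNAT.items():
        if port == container_port:
            return vm_port, ip, port
    return None


for mac_port in sorted(PUBLISHED):
    print(mac_port, "->", trace(mac_port))
```

For example, macOS port 49674 pairs with VM port 55004 and ends at 192.168.49.2:22, the sshd inside the minikube container.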

Remaining task

I’m still not sure how the host (macOS) communicates with hyperkit for port mapping. I guess the vpnkit-forwarder container on linuxkit is responsible for the communication between the host (macOS) and linuxkit. Maybe someday I will dive into the hyperkit implementation 🧐

Resources