Debugging Asymmetric Routing II
While trying to take my own advice to bind ports to the host and forward, I inadvertently reintroduced asymmetric routing to the network.
This time I was able to fix it without a network redesign since the chaos was isolated to a single box.
Here are my notes on the diagnosis and solution.
Diagnosis
Inexplicable MQTT errors:
- Home Assistant/zigbee2mqtt: z2m: MQTT error: Keepalive timeout
- Mosquitto server: Client XXXX has exceeded timeout, disconnecting
- mosquitto_sub: no errors
- Many Default deny / state violation rule packets on router
Proof
This was easy. I just adjusted the packet capture command from the earlier post, Debugging Asymmetric Routing, to monitor MQTT traffic. It was very obvious that replies were being sent out on the wrong interface, and this was breaking routing.
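For reference, a capture along these lines might look like the sketch below. It assumes tcpdump and the default unencrypted MQTT port 1883, neither of which is stated in this post, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes tcpdump is installed and MQTT runs on TCP port 1883.

# Build the capture filter for a given port (defaults to 1883).
mqtt_filter() {
    printf 'tcp port %s' "${1:-1883}"
}

# Watch MQTT packets on one interface.
# -n: no name resolution, -e: show link-level (MAC) headers.
watch_mqtt() {
    tcpdump -n -e -i "$1" "$(mqtt_filter "$2")"
}

# Usage (as root), one terminal per interface:
#   watch_mqtt br0
#   watch_mqtt br1
```

Running one capture per bridge makes it easy to see a request arrive on one interface while the reply leaves on another.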
We can see why the OS does this by inspecting the routing table:
root@atlas:/home/geoff# ip route show
default via 10.100.0.1 dev br0 proto dhcp src 10.100.254.246 metric 1004
default via 172.16.0.1 dev br1 proto dhcp src 172.16.0.244 metric 1006
default via 10.110.0.1 dev br110 proto dhcp src 10.110.94.63 metric 1010
10.89.0.0/24 dev podman1 proto kernel scope link src 10.89.0.1
10.100.0.0/16 dev br0 proto dhcp scope link src 10.100.254.246 metric 1004
10.110.0.0/16 dev br110 proto dhcp scope link src 10.110.94.63 metric 1010
172.16.0.0/16 dev br1 proto dhcp scope link src 172.16.0.244 metric 1006
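The thing to notice in that table is the three default routes. A small helper can pull them out so they are easier to spot. This is just a sketch, and default_route_devs is a made-up name.

```shell
#!/bin/sh
# Sketch: print the device of every default route found in
# `ip route show` output read from stdin.
default_route_devs() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") print $(i + 1)
    }'
}

# Usage on a live system:
#   ip route show | default_route_devs
```

Fed the table above, this prints br0, br1 and br110 on separate lines. Anything more than one line means the host has more than one way out.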
All of these routes come from dhcpcd, which runs on several interfaces as defined in /etc/network/interfaces (Debian trixie):
...
# os bridge
auto br0
iface br0
bridge-ports enp1s0
use dhcp
# tagged vlan bridges
auto br1
iface br1
bridge-ports enp1s0.1
use dhcp
auto br110
iface br110
bridge-ports enp1s0.110
use dhcp
Solution
Turns out dhcpcd will always add routes for new interfaces, so we need to clean them up ourselves with some little shell script hooks to avoid mayhem, like this:
/etc/dhcpcd.exit-hook
#!/bin/sh
# logs: journalctl -u networking
# if DHCP gives us a new address, we could detect it by comparing $old_ip_address and $new_ip_address
# but this happens basically 0% of the time, so just reboot until we run out of IP addresses
# for now/forever if a changed address breaks things (traefik)
for script in /etc/dhcpcd.exit-hook.d/*.sh ; do
    # skip an unmatched glob or a script that is not executable
    [ -x "$script" ] || continue
    echo "dhcpcd.exit-hook run script: ${script}"
    "$script"
done
/etc/dhcpcd.exit-hook.d/br1_isolate.sh
#!/bin/bash
# only when we actually have an address
if [ "$interface" = "br1" ] && [ -n "$new_ip_address" ]; then
echo "delete routes to avoid asymmetric routing - br1"
ip route del default dev br1
ip route del 172.16.0.0/16 dev br1
fi
/etc/dhcpcd.exit-hook.d/br110_isolate.sh
#!/bin/bash
# only when we actually have an address
if [ "$interface" = "br110" ] && [ -n "$new_ip_address" ]; then
echo "delete routes to avoid asymmetric routing - br110"
ip route del default dev br110
ip route del 10.110.0.0/16 dev br110
fi
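The two scripts differ only in the interface name and subnet, so they could be folded into one. The sketch below is a hypothetical alternative, not what is running on this box: the ISOLATED variable and the isolate_iface function are invented for illustration, and the interface/subnet pairs are copied from the scripts above.

```shell
#!/bin/sh
# Hypothetical single hook replacing the per-interface scripts.
# Each entry is "interface:subnet"; values copied from the scripts above.
ISOLATED="br1:172.16.0.0/16 br110:10.110.0.0/16"

# Delete the default and subnet routes for one interface.
# Does nothing for interfaces that are not listed in ISOLATED.
isolate_iface() {
    for pair in $ISOLATED; do
        if [ "${pair%%:*}" = "$1" ]; then
            echo "delete routes to avoid asymmetric routing - $1"
            ip route del default dev "$1"
            ip route del "${pair#*:}" dev "$1"
        fi
    done
}

# dhcpcd passes $interface and $new_ip_address in the environment;
# only act when we actually have an address.
if [ -n "${new_ip_address:-}" ]; then
    isolate_iface "${interface:-}"
fi
```

Adding another isolated VLAN then means adding one entry to the list rather than copying a script.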
After restarting networking, the routing table looks perfect:
default via 10.100.0.1 dev br0 proto dhcp src 10.100.254.246 metric 1025
10.89.0.0/24 dev podman1 proto kernel scope link src 10.89.0.1
10.100.0.0/16 dev br0 proto dhcp scope link src 10.100.254.246 metric 1025
And MQTT works in Home Assistant again!
Root of the problem
This whole problem was caused by enabling a bunch of interfaces that grant access to different VLANs. As mentioned in the previous debugging post, VLANs are for traffic separation, and if you have something essentially bridging the VLANs as above, it defeats the whole point of the exercise.
So what was I trying to do here? I was trying to expose a bunch of services like nexus and MQTT to the whole network and run prometheus in the management VLAN.
The problem is the main host is not officially on the management VLAN: it has an interface set up with an IP address and that’s it. Deleting the routes so it can’t reach anything effectively blocks me from using docker/podman bridge networking to run prometheus bound to the management VLAN without a lot of extra/brittle work, since podman expects the host to participate in networking.
MACVLAN and IPVLAN cause problems with DHCP, which I already looked at extensively, which leaves… VMs.
In the end, the solution is simple and bulletproof, if a little clunky: just bind the interface into a VM directly, and podman containers can just use the default network. Ironically, this also means the interface no longer needs an IP address on the host and, you guessed it, removing the use dhcp line from /etc/network/interfaces would have also prevented the asymmetric routing problem.
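As a sketch of that last point, the br1 stanza from earlier with its use dhcp line dropped would look something like this. I am assuming the bridge still comes up fine without an address so the VM can attach to it, and br110 would get the same treatment:

```
# tagged vlan bridge, no address on the host
auto br1
iface br1
bridge-ports enp1s0.1
```

With no DHCP client on the bridge, dhcpcd never installs a default or subnet route for it, so there is nothing for the exit hooks to clean up.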
What a weekend.