<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="http://www.declarativesystems.com/feed.xml" rel="self" type="application/atom+xml" /><link href="http://www.declarativesystems.com/" rel="alternate" type="text/html" /><updated>2026-04-11T13:59:23+00:00</updated><id>http://www.declarativesystems.com/feed.xml</id><title type="html">Geoff Williams Blog</title><subtitle>Random tech, projects and fun stuff I don&apos;t want to forget</subtitle><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><entry><title type="html">DIY Router</title><link href="http://www.declarativesystems.com/2026/04/11/diy-router.html" rel="alternate" type="text/html" title="DIY Router" /><published>2026-04-11T00:00:00+00:00</published><updated>2026-04-11T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/04/11/diy-router</id><content type="html" xml:base="http://www.declarativesystems.com/2026/04/11/diy-router.html"><![CDATA[<p>After a year or so trying out <a href="https://opnsense.org/">OPNSense</a> on a <a href="https://shop.zimaspace.com/products/zimaboard-832-2021-special-edition">ZimaBoard</a>, a <a href="https://forum.opnsense.org/index.php?topic=51216.0">bungled upgrade</a> <a href="/2026/03/09/rescuing-broken-kubernetes.html">debacle</a> made me realise it was time for a significant router upgrade.</p>

<p>I looked at a few options, with the requirements:</p>

<ul>
  <li>NVMe storage (no eMMC - it’s slow and wears out)</li>
  <li>Ideally 16GB RAM to support intrusion detection and extra tools</li>
  <li>At least 3 ethernet ports (for CARP)</li>
  <li>Must be easily fixable/replaceable by me, in Australia</li>
  <li>1GbE ethernet is all my switches support, and I’m not planning on upgrading due to cost</li>
</ul>

<h2 id="opnsense-appliance">OPNSense Appliance</h2>

<p><a href="https://shop.opnsense.com">OPNSense ship their own appliances, designed and manufactured in Europe</a>. These are slick units with a choice of either desktop or rack mount form factors.</p>

<p>As much as I wanted to buy one to support the project, the <a href="https://shop.opnsense.com/product/dec697-opnsense-desktop-security-appliance/">basic model</a> I selected was still €678 ($1124AUD) + shipping.</p>

<p>There are a few video reviews of other models on YouTube. The hardware seems solid, but getting parts under warranty in Australia is not going to be a 24 hour turnaround and it’s just SO expensive.</p>

<h2 id="opnsense-appliance-x2-for-ha-with-carp-failover">OPNSense Appliance x2, for HA with CARP failover</h2>

<p>OPNsense lets you do HA for routers with <a href="https://docs.opnsense.org/manual/hacarp.html">HA-CARP</a>. CARP (Common Address Redundancy Protocol) originated in OpenBSD rather than being specific to OPNSense, and it lets a backup router take over if the main one fails. This would also let you do scheduled maintenance without an internet outage.</p>

<p>The drawback of this approach is that you now need 2x routers <em>and</em> 2x uplinks to your ISP. This type of HA protects you <em>only</em> against outages caused by router fault/reboot.</p>

<p>Since I can only get one IP address at a time from my ISP, I abandoned this approach.</p>

<h2 id="protectli-mini-pc">Protectli Mini PC</h2>

<p>The <a href="https://protectli.com/">Protectli</a> mini PCs are quite nice and I almost bought one, but they are quite expensive for what you get and there are some identical looking boxes on AliExpress as well. Since it’s a mini PC, you’re also stuck with whatever you bought in terms of network ports.</p>

<p>Cost with options came to about $500USD ($700AUD) + shipping, high enough that we’re into full PC territory, <em>but</em> we are still getting very basic PC hardware for the price.</p>

<h2 id="build-a-small-pc">Build a small PC!</h2>

<p>In the end, I decided to just build myself a mini ITX PC, recycling some parts I already had and going deluxe with the case and cooler. Is this overkill? Absolutely:</p>

<table>
  <thead>
    <tr>
      <th>Item</th>
      <th>Price</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>AMD Ryzen 5 5500GT</td>
      <td>$175.00</td>
    </tr>
    <tr>
      <td>ASRock - B550M-ITX/ac</td>
      <td>$209.00</td>
    </tr>
    <tr>
      <td>Crucial 500GB NVMe</td>
      <td>$145.00</td>
    </tr>
    <tr>
      <td>Noctua - NH-L9x65</td>
      <td>$130.00</td>
    </tr>
    <tr>
      <td>Fractal Design Terra Jade Mini + CORSAIR SF750 PSU</td>
      <td>$496.00</td>
    </tr>
    <tr>
      <td>16 GB DDR</td>
      <td>$0</td>
    </tr>
    <tr>
      <td>Dual Intel I350-T2 NIC</td>
      <td>$0</td>
    </tr>
  </tbody>
</table>

<p>This setup means I have a quiet, powerful router platform that will last for years to come and handle anything I can throw at it with my current 1GbE network. I can very easily upgrade the networking hardware if I want to, along with anything else. If I get any hardware failures, I can source replacements locally and get them within 24 hours, and I have a seriously cool looking router I built myself.</p>

<h2 id="deployment-architecture">Deployment architecture</h2>

<p>There is enough hardware here to do things like run OPNSense in a VM, and also host other network services. This could let me do things like implement the HA/CARP setup with VMs. I’m pretty confident I could make this work with libvirt, but at this point I took a pause, reached out to my colleagues at Confluent and set up a quick Slack poll on our <code class="language-plaintext highlighter-rouge">#homelab</code> channel.</p>

<p>The result was unanimous: run OPNSense on bare metal.</p>

<p>While this means no VM-based HA or ability to take VM snapshot backups, the gain is a much simpler overall system that is basically just a plain FreeBSD box. Big problems? Plug in a screen and keyboard. The consensus was that an appliance-style deployment is just so much simpler to troubleshoot, and the nice thing about having this DIY setup is that if I ever did want to go the VM route in the future, it’s very easy to do so with no additional hardware purchase needed.</p>

<h2 id="opnsense-install">OPNSense Install</h2>

<p>Installing OPNSense was extremely easy. Just boot the VGA installer from USB and install.</p>

<p>This time, I made sure to fix all the static leases I cared about in the <em>old</em> router before doing a final backup and then restored this onto the <em>new</em> router.</p>

<p>There were only a couple of gotchas in this process:</p>
<ol>
  <li>Had to enable CSM boot/disable secure boot</li>
  <li>Interface devices had changed vs the backup so needed adjustment</li>
</ol>

<p>After these changes, I was up and running in minutes.</p>

<h2 id="verdict">Verdict</h2>

<p>It’s been about a month now and this router PC has been rock solid. It just sits there looking pretty and is almost completely silent thanks to the Noctua cooler. I have not thought more about converting to VMs since everything just works.</p>

<p>If you’re thinking of building your own PC for OPNSense, I highly recommend it, provided you’re comfortable building your own computers.</p>

<p><img src="/assets/img/dream_router_pc.jpg" alt="dream router pc" /></p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[After a year or so trying out OPNSense on a ZimaBoard, a bungled upgrade debacle made me realise it was time for a significant router upgrade.]]></summary></entry><entry><title type="html">Rescuing broken kubernetes</title><link href="http://www.declarativesystems.com/2026/03/09/rescuing-broken-kubernetes.html" rel="alternate" type="text/html" title="Rescuing broken kubernetes" /><published>2026-03-09T00:00:00+00:00</published><updated>2026-03-09T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/03/09/rescuing-broken-kubernetes</id><content type="html" xml:base="http://www.declarativesystems.com/2026/03/09/rescuing-broken-kubernetes.html"><![CDATA[<p>After recovering the network from a failed router upgrade by doing a re-flash recently, I found all my kubernetes HA clusters were toast.</p>

<p>To cut a long story short, this happened because the DHCP leases database was reset and the nodes all got new IP addresses, which meant individual nodes (K3S) could not start up.</p>

<p>There were 3 steps needed to recover the clusters, and I can prevent this from happening again very easily:</p>

<h2 id="solution">Solution</h2>

<h3 id="step-1---switch-back-to-old-ip-address">Step 1 - Switch back to old IP address</h3>

<p>I’m pretty sure my <code class="language-plaintext highlighter-rouge">etcd</code> was completely broken. The easiest way I found to get the old IP address was to just search the logs:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>journalctl | <span class="nb">grep</span> <span class="nt">-i</span> dhcp
</code></pre></div></div>

<p>This worked on all of my Debian Kubernetes nodes.</p>

<p>Note down the IP address, MAC address and hostname, then in your router, create a static lease with the old details. Don’t click apply until you are ready for the next step.</p>
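
<p>If the journal is noisy, a small filter can pull out just the addresses. This is a minimal sketch: the function name <code class="language-plaintext highlighter-rouge">extract_ipv4</code> is made up for illustration, and it assumes the old address shows up as a plain dotted-quad somewhere in the DHCP log lines, which depends on your DHCP client’s log format:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# extract_ipv4 (hypothetical helper): read log lines on stdin and print
# each distinct IPv4-looking string once, sorted.
extract_ipv4() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u
}

# Usage on a node:
#   sudo journalctl | grep -i dhcp | extract_ipv4
</code></pre></div></div>

<p>The output includes every address mentioned in those lines (gateway, DNS servers, subnet), so you still need to pick out the lease address by eye.</p>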

<h3 id="step-2---reboot-all-nodes">Step 2 - Reboot all nodes</h3>

<p>For example, with Ansible:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ansible all <span class="nt">-i</span> inventories/hosts.yml <span class="nt">-b</span> <span class="nt">-m</span> ansible.builtin.reboot
</code></pre></div></div>

<p>Now click <code class="language-plaintext highlighter-rouge">apply</code> on your router while the systems are rebooting.</p>

<h3 id="step-3---test-access-and-update-if-needed">Step 3 - Test access and update if needed</h3>

<p>On a cluster node, see if access is now working:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>kubectl get nodes
</code></pre></div></div>

<p>If this worked, you will now see your cluster nodes as usual instead of an error.</p>

<p>Outside the cluster, you may find your previous credentials now give certificate errors. After checking requests are reaching the right cluster endpoint IP address, there is a good chance the cluster has re-issued itself new certificates. In this case, replace your credentials for <code class="language-plaintext highlighter-rouge">.kube/config</code> with a fresh copy from the node and this should fix things.</p>
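
<p>On K3s, the node’s copy of the credentials normally lives at <code class="language-plaintext highlighter-rouge">/etc/rancher/k3s/k3s.yaml</code> and points at <code class="language-plaintext highlighter-rouge">https://127.0.0.1:6443</code>, so the server line needs changing before the file is any use from another machine. A minimal sketch, assuming those K3s defaults: the function name <code class="language-plaintext highlighter-rouge">point_kubeconfig_at</code> and the address <code class="language-plaintext highlighter-rouge">10.100.0.10</code> are placeholders, not values from my setup:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# point_kubeconfig_at (hypothetical helper): read a kubeconfig on stdin and
# replace the loopback API server address with the given host or IP.
point_kubeconfig_at() {
    sed "s#https://127\.0\.0\.1:6443#https://$1:6443#"
}

# Usage, run on the node; 10.100.0.10 is a placeholder for the address
# your workstation uses to reach the cluster:
#   sudo cat /etc/rancher/k3s/k3s.yaml | point_kubeconfig_at 10.100.0.10
# Then save the output as ~/.kube/config on your workstation.
</code></pre></div></div>

<p>For an HA cluster, the address you substitute is probably whatever endpoint sits in front of the control plane rather than a single node.</p>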

<p>Agent nodes may have similar issues and find themselves permanently evicted. This happened to me and it was easiest to just delete the agent node from the control plane and reinstall the node as a fresh worker.</p>

<h3 id="fixed">Fixed</h3>

<p>After performing these steps, my cluster was operational again but this was NOT a fun exercise.</p>

<p>Thankfully this was a pretty empty lab setup. In the real world, 100% there would have been big disruption, crashed pods and customer impact.</p>

<p>I note that my <a href="https://rook.io/">rook/ceph</a> managed to recover itself on its own once IP addresses were fixed, which is outstanding.</p>

<h2 id="prevention">Prevention</h2>

<p>Basically Kubernetes likes to have static IP addresses, but I like the convenience and flexibility of DHCP, especially when I might want to reconfigure hosts from the router.</p>

<p>Most of my hosts are headless and with only one NIC, so any screw up in static address reconfiguration with say ansible means crawling around with HDMI cables and keyboards to fix.</p>

<p>There is however an easy fix I overlooked in OPNSense: On the far right, there is a <code class="language-plaintext highlighter-rouge">+</code> button that adds a static lease.</p>

<p><img src="/assets/img/opnsense_static_lease.png" alt="add static lease" /></p>

<p>So my new solution is when I’m happy a lab system is up and running nicely, just click the button and add a static lease if Kubernetes is involved.</p>

<p>That way it gets included in backups and if the router needs re-flashing, systems should come back online with the right address soon after the router is back online.</p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[After recovering the network from a failed router upgrade by doing a re-flash recently, I found all my kubernetes HA clusters were toast.]]></summary></entry><entry><title type="html">ClickStack Homelab</title><link href="http://www.declarativesystems.com/2026/03/08/clickstack-homelab.html" rel="alternate" type="text/html" title="ClickStack Homelab" /><published>2026-03-08T00:00:00+00:00</published><updated>2026-03-08T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/03/08/clickstack-homelab</id><content type="html" xml:base="http://www.declarativesystems.com/2026/03/08/clickstack-homelab.html"><![CDATA[<p>I stumbled on <a href="https://clickhouse.com/clickstack">ClickStack</a> a couple of months ago. For a long time I’ve been looking for a logging solution for the homelab. I’ve played with <a href="https://grafana.com/docs/loki/latest/">Loki</a> before but never got around to onboarding all my lab systems.</p>

<p>If you want to know more about deploying and using ClickStack, <a href="https://clickhouse.com/docs/use-cases/observability/clickstack">The Manual</a> is excellent.</p>

<p>Based on this doc, I was able to get up and running with the <code class="language-plaintext highlighter-rouge">docker.io/clickhouse/clickstack-all-in-one</code> container very quickly in docker on my laptop.</p>

<p>The all-in-one image bundles everything you need for a functioning log collector <em>and</em> frontend:</p>
<ul>
  <li><a href="https://clickhouse.com/">ClickHouse database</a></li>
  <li><a href="https://www.hyperdx.io/">HyperDX UI</a></li>
  <li><a href="https://opentelemetry.io/docs/collector/">OTEL collectors</a></li>
  <li>Glue scripts, etc. (<a href="https://hub.docker.com/layers/clickhouse/clickstack-all-in-one/latest/images/sha256-2749db438bb434ce054ea24ec666d11192bf9dc18705d4e965ba636d27df1fef">example Dockerfile</a>)</li>
</ul>

<h2 id="clickstack-into-production">ClickStack into “production”</h2>

<p><strong>Disclaimer: this is for a homelab!</strong></p>

<p>I found a permanent home for ClickStack on my “services” mini PC.</p>

<h3 id="podman-quadlet">Podman Quadlet</h3>

<p>I was able to run ClickStack as a first class homelab service quite easily <a href="/2025/03/12/podman-quadlet-services.html">with Podman Quadlet</a> like I do for some of my other lab services, like this:</p>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/containers/systemd/clickstack-pod.kube</code></strong></p>

<div class="language-systemd highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">[Install]</span>
<span class="nt">WantedBy</span><span class="p">=</span>default.target

<span class="k">[Unit]</span>

<span class="k">[Kube]</span>
<span class="nt">Yaml</span><span class="p">=</span>/etc/containers/systemd/clickstack-pod.yml

<span class="c"># ui/hyperdx</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:8080:8080

<span class="c"># clickhouse/db</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:8123:8123

<span class="c"># otel/grpc</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:4317:4317

<span class="c"># otel/http</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:4318:4318
</code></pre></div></div>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/containers/systemd/clickstack-pod.yml</code></strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: v1
kind: Pod
metadata:
  annotations:
    io.kubernetes.cri-o.ContainerType/app: container
    io.kubernetes.cri-o.TTY/app: "false"
    io.podman.annotations.autoremove/app: "FALSE"
    io.podman.annotations.init/app: "FALSE"
    io.podman.annotations.privileged/app: "FALSE"
    io.podman.annotations.publish-all/app: "FALSE"
  labels:
    app: clickstack
  name: clickstack
spec:
  automountServiceAccountToken: false
  dnsPolicy: None
  containers:
  - image: docker.io/clickhouse/clickstack-all-in-one:2.15.1
    name: app
    env:
    - name: FRONTEND_URL
      value: "https://clickstack.infrastructure.asio"
    ports:
    - containerPort: 8123
      hostPort: 8123
    - containerPort: 8080
      hostPort: 8080      
    - containerPort: 4317
      hostPort: 4317
    - containerPort: 4318
      hostPort: 4318 
    resources: {}
    securityContext:
      capabilities:
        drop:
        - NET_ADMIN
        - CAP_MKNOD
        - CAP_AUDIT_WRITE
    volumeMounts:
    - mountPath: /data/db
      name: clickstack-data-vol
      subPath: db
    - mountPath: /var/lib/clickhouse
      name: clickstack-data-vol
      subPath: clickhouse
    - mountPath: /var/log/clickhouse-server
      name: clickstack-log-vol

  enableServiceLinks: false
  hostname: clickstack
  restartPolicy: Never
  volumes:
  - hostPath:
      path: /data/containers/clickstack/data
      type: Directory
    name: clickstack-data-vol
  - hostPath:
      path: /data/containers/clickstack/log
      type: Directory
    name: clickstack-log-vol
status: {}
</code></pre></div></div>
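
<p>The pod spec declares its <code class="language-plaintext highlighter-rouge">hostPath</code> volumes with <code class="language-plaintext highlighter-rouge">type: Directory</code>, which means the directories have to exist on the host before the pod starts. A small sketch for preparing them and starting the unit: <code class="language-plaintext highlighter-rouge">prepare_clickstack_dirs</code> is a hypothetical helper, and creating the <code class="language-plaintext highlighter-rouge">subPath</code> directories up front is an assumption about what Podman expects rather than something verified here:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# prepare_clickstack_dirs (hypothetical helper): create the host directories
# referenced by the hostPath volumes and subPath mounts in the pod spec.
# The optional argument overrides the root, which is handy for a dry run.
prepare_clickstack_dirs() {
    root="${1:-/data/containers/clickstack}"
    mkdir -p "$root/data/db" "$root/data/clickhouse" "$root/log"
}

# Usage (as root):
#   prepare_clickstack_dirs
#   systemctl daemon-reload
#   systemctl start clickstack-pod.service
</code></pre></div></div>

<p>Quadlet names the generated unit after the file, so <code class="language-plaintext highlighter-rouge">clickstack-pod.kube</code> becomes <code class="language-plaintext highlighter-rouge">clickstack-pod.service</code>.</p>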

<h3 id="traefik-frontend">Traefik frontend</h3>
<p>With <a href="/2026/03/07/traefik-frontend-for-podman.html">Traefik already set up on the host</a>, I only needed a single drop-in:</p>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/traefik/traefik.d/clickstack.infrastructure.asio.yml</code></strong></p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># ansible managed</span>

<span class="c1">#</span>
<span class="c1"># HTTP</span>
<span class="c1">#</span>
<span class="na">http</span><span class="pi">:</span>
  <span class="na">routers</span><span class="pi">:</span>
    <span class="na">clickstack-http</span><span class="pi">:</span>
      <span class="c1"># HostSNI only for TCP - we get the real headers for http routes</span>
      <span class="c1"># as TLS already terminated and http call parsed</span>
      <span class="na">rule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Host(`clickstack.infrastructure.asio`)"</span>

      <span class="na">tls</span><span class="pi">:</span> <span class="pi">{}</span>

      <span class="na">entryPoints</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">infrastructure.https</span>
      <span class="na">service</span><span class="pi">:</span> <span class="s">clickstack-8080</span>

  <span class="na">services</span><span class="pi">:</span>
    <span class="na">clickstack-8080</span><span class="pi">:</span>
      <span class="na">loadBalancer</span><span class="pi">:</span>
        <span class="na">servers</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">url</span><span class="pi">:</span> <span class="s2">"</span><span class="s">http://localhost:8080"</span>

<span class="c1">#</span>
<span class="c1"># TCP</span>
<span class="c1">#</span>
<span class="na">tcp</span><span class="pi">:</span>
  <span class="na">routers</span><span class="pi">:</span>
    <span class="na">otel-grpc</span><span class="pi">:</span>
      <span class="c1"># always match - if not tls, there is no such thing as a header</span>
      <span class="c1"># so there is nothing to check against so just accept any value</span>
      <span class="na">rule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">HostSNI(`*`)"</span>
      <span class="na">entryPoints</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">infrastructure.otel-grpc</span>
      <span class="na">service</span><span class="pi">:</span> <span class="s">clickstack-4317</span>
    <span class="na">otel-http</span><span class="pi">:</span>
      <span class="c1"># always match - if not tls, there is no such thing as a header</span>
      <span class="c1"># so there is nothing to check against so just accept any value</span>
      <span class="na">rule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">HostSNI(`*`)"</span>
      <span class="na">entryPoints</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">infrastructure.otel-http</span>
      <span class="na">service</span><span class="pi">:</span> <span class="s">clickstack-4318</span>


  <span class="na">services</span><span class="pi">:</span>
    <span class="na">clickstack-4317</span><span class="pi">:</span>
      <span class="na">loadBalancer</span><span class="pi">:</span>
        <span class="na">servers</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">address</span><span class="pi">:</span> <span class="s2">"</span><span class="s">localhost:4317"</span>
    <span class="na">clickstack-4318</span><span class="pi">:</span>
      <span class="na">loadBalancer</span><span class="pi">:</span>
        <span class="na">servers</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">address</span><span class="pi">:</span> <span class="s2">"</span><span class="s">localhost:4318"</span>
</code></pre></div></div>

<h2 id="setting-up-clickstack">Setting up ClickStack</h2>

<p>With these items set up, hitting the URL I configured in the browser got me straight to a <em>set password</em> dialog in HyperDX. Done.</p>

<h2 id="client-setup">Client setup</h2>

<p>Now it’s time to send logs to ClickStack. I have a lone Windows machine on the network which is a source of constant pain, so this is the perfect starting point.</p>

<p>To collect logs from Windows:</p>
<ol>
  <li><a href="https://opentelemetry.io/docs/collector/install/binary/windows/">Install the OTEL client for Windows</a> and install it as a Windows service</li>
  <li><a href="https://github.com/open-telemetry/opentelemetry-collector-releases/releases">Install the OTEL contrib files for Windows</a>, which are needed to collect the logs from Event Viewer</li>
  <li>Finally, configure the OTEL client in <code class="language-plaintext highlighter-rouge">C:\Program Files\OpenTelemetry Collector\config.yaml</code>. You will need a token for the ClickStack OTEL collector which you can find in the HyperDX UI by looking for <code class="language-plaintext highlighter-rouge">Your Ingestion API Key</code>:</li>
</ol>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># To limit exposure to denial of service attacks, change the host in endpoints below from 0.0.0.0 to a specific network interface.</span>
<span class="c1"># See https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks</span>


<span class="na">receivers</span><span class="pi">:</span>
  <span class="na">windowseventlog/application</span><span class="pi">:</span>
    <span class="na">channel</span><span class="pi">:</span> <span class="s">application</span>
  <span class="na">windowseventlog/system</span><span class="pi">:</span>
    <span class="na">channel</span><span class="pi">:</span> <span class="s">system</span>


<span class="na">processors</span><span class="pi">:</span>
  <span class="na">batch</span><span class="pi">:</span>
  <span class="na">resource</span><span class="pi">:</span>
    <span class="na">attributes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">key</span><span class="pi">:</span> <span class="s">host.os</span>
        <span class="na">value</span><span class="pi">:</span> <span class="s">windows</span>
        <span class="na">action</span><span class="pi">:</span> <span class="s">insert</span>


<span class="na">exporters</span><span class="pi">:</span>
  <span class="na">otlp/logs</span><span class="pi">:</span>
    <span class="na">endpoint</span><span class="pi">:</span> <span class="s">clickstack.infrastructure.asio:4317</span>
    <span class="na">tls</span><span class="pi">:</span>
      <span class="na">insecure</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">headers</span><span class="pi">:</span>
      <span class="na">authorization</span><span class="pi">:</span> <span class="s">xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx</span>
    <span class="na">compression</span><span class="pi">:</span> <span class="s">gzip</span>

<span class="na">service</span><span class="pi">:</span>
  <span class="na">pipelines</span><span class="pi">:</span>
    <span class="na">logs</span><span class="pi">:</span>
      <span class="na">receivers</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">windowseventlog/application</span>
        <span class="pi">-</span> <span class="s">windowseventlog/system</span>
      <span class="na">processors</span><span class="pi">:</span>
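        <span class="c1"># resource is defined above; it must be listed here to take effect</span>
        <span class="pi">-</span> <span class="s">resource</span>
        <span class="pi">-</span> <span class="s">batch</span>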
      <span class="na">exporters</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">otlp/logs</span>
</code></pre></div></div>

<p>If everything worked, Windows will start sending event logs to the OTEL collector previously exposed, and you will see events landing live in the UI:</p>

<p><img src="/assets/img/clickstack.png" alt="clickstack working" /></p>
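
<p>To test the ingestion path without involving Windows at all, you can hand-craft a single log record and post it to the OTLP/HTTP endpoint. This is a sketch under assumptions: <code class="language-plaintext highlighter-rouge">otlp_log_payload</code> is a hypothetical helper, the URL assumes the Traefik <code class="language-plaintext highlighter-rouge">otel-http</code> entry point listens on port 4318, and the API key is a placeholder:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# otlp_log_payload (hypothetical helper): print a minimal OTLP/JSON logs
# request containing one INFO record.
#   $1 = message (no double quotes or backslashes; nothing is JSON-escaped)
#   $2 = optional timestamp in epoch seconds, defaults to now
otlp_log_payload() {
    now="${2:-$(date +%s)}"
    printf '{"resourceLogs":[{"scopeLogs":[{"logRecords":[{"timeUnixNano":"%s000000000","severityText":"INFO","body":{"stringValue":"%s"}}]}]}]}\n' "$now" "$1"
}

# Usage; replace the placeholder with Your Ingestion API Key:
#   otlp_log_payload "hello from curl" | curl -sS -X POST \
#       -H "Content-Type: application/json" \
#       -H "authorization: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" \
#       --data-binary @- http://clickstack.infrastructure.asio:4318/v1/logs
</code></pre></div></div>

<p>If the record shows up in HyperDX, the server side is fine and any remaining problem is on the client.</p>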

<h2 id="verdict">Verdict</h2>

<p>This is a very slick and easy to use stack. The hardest part of setting this up turned out to be the OTEL client on Windows. I think it’s safe to say if it works on Windows, I’ll be able to collect the rest of the logs I need no problem, although I’m sure Kubernetes logs will be a pain.</p>

<p>For an enterprise grade logging solution, I’d be looking for a more scalable deployment architecture probably running on Kubernetes with a support contract but for my little homelab this is perfect.</p>

<p>Pretty sure both of those asks are available…</p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[I stumbled on ClickStack a couple of months ago. For a long time I’ve been looking for a logging solution for the homelab. I’ve played with Loki before but never got around to onboarding all my lab systems.]]></summary></entry><entry><title type="html">Debugging Asymmetric Routing II</title><link href="http://www.declarativesystems.com/2026/03/07/debugging-asymmetric-routing-ii.html" rel="alternate" type="text/html" title="Debugging Asymmetric Routing II" /><published>2026-03-07T00:00:00+00:00</published><updated>2026-03-07T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/03/07/debugging-asymmetric-routing-ii</id><content type="html" xml:base="http://www.declarativesystems.com/2026/03/07/debugging-asymmetric-routing-ii.html"><![CDATA[<p>While trying to take my own advice to <a href="/2026/02/14/podman-container-dhcp.html">bind ports to host and forward</a>, I inadvertently reintroduced asymmetric routing to the network.</p>

<p>This time I was able to fix it without a network redesign since the chaos was isolated to a single box.</p>

<p>Here’s my notes on diagnosis and solution.</p>

<h2 id="diagnosis">Diagnosis</h2>

<p>Inexplicable MQTT errors:</p>
<ul>
  <li>Home Assistant/zigbee2mqtt: <code class="language-plaintext highlighter-rouge">z2m: MQTT error: Keepalive timeout</code></li>
  <li>Mosquitto server: <code class="language-plaintext highlighter-rouge">Client XXXX has exceeded timeout, disconnecting</code></li>
  <li><code class="language-plaintext highlighter-rouge">mosquitto_sub</code> - no errors</li>
  <li>Many <code class="language-plaintext highlighter-rouge">Default deny / state violation rule</code> packets on router</li>
</ul>

<h2 id="proof">Proof</h2>

<p>This was easy. I just adjusted the tcp command in <a href="/2025/03/13/debugging-asymmetric-routing.html">debugging asymmetric routing</a> to monitor MQTT traffic. It was very obvious that replies were being sent out on the wrong interface and this was breaking routing.</p>

<p>We can see why the OS does this by inspecting the routing table:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@atlas:/home/geoff# ip route show
default via 10.100.0.1 dev br0 proto dhcp src 10.100.254.246 metric 1004 
default via 172.16.0.1 dev br1 proto dhcp src 172.16.0.244 metric 1006 
default via 10.110.0.1 dev br110 proto dhcp src 10.110.94.63 metric 1010 
10.89.0.0/24 dev podman1 proto kernel scope link src 10.89.0.1 
10.100.0.0/16 dev br0 proto dhcp scope link src 10.100.254.246 metric 1004 
10.110.0.0/16 dev br110 proto dhcp scope link src 10.110.94.63 metric 1010 
172.16.0.0/16 dev br1 proto dhcp scope link src 172.16.0.244 metric 1006 
</code></pre></div></div>

<p>And these all come from <code class="language-plaintext highlighter-rouge">dhcpcd</code> running on several interfaces, as defined in <code class="language-plaintext highlighter-rouge">/etc/network/interfaces</code> (Debian trixie):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
# os bridge
auto br0
iface br0
    bridge-ports enp1s0
    use dhcp

# tagged vlan bridges
auto br1
iface br1
    bridge-ports enp1s0.1
    use dhcp

auto br110
iface br110
    bridge-ports enp1s0.110
    use dhcp
</code></pre></div></div>
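
<p>The tell-tale sign in the routing table above is more than one default route. A small sketch for checking this on any host: <code class="language-plaintext highlighter-rouge">default_route_devs</code> is a hypothetical helper that parses <code class="language-plaintext highlighter-rouge">ip route show</code> output:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# default_route_devs (hypothetical helper): read `ip route show` output on
# stdin and print the device of every default route, one per line.
default_route_devs() {
    sed -n 's/^default .*dev \([^ ]*\).*/\1/p'
}

# Usage:
#   ip route show | default_route_devs
</code></pre></div></div>

<p>More than one line of output is not automatically a fault, but on a multi-homed box like this one it is a prompt to check which interface replies will actually leave on.</p>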

<h2 id="solution">Solution</h2>

<p>Turns out <code class="language-plaintext highlighter-rouge">dhcpcd</code> will <em>always</em> add routes for new interfaces, so to avoid mayhem we need to clean them up ourselves with some little shell script hooks, like this:</p>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/dhcpcd.exit-hook</code></strong></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="c"># logs: journalctl -u networking</span>

<span class="c"># if DHCP gives us a new address, we could check with $old_ip_address == $new_ip_address</span>
<span class="c"># but this happens basically 0% of the time so just reboot until run out of IP addresses</span>
<span class="c"># for now/forever if changed address breaks things (traefik)</span>

<span class="c"># run each hook script in turn (the scripts must be executable)</span>
<span class="k">for </span>script <span class="k">in</span> /etc/dhcpcd.exit-hook.d/<span class="k">*</span>.sh <span class="p">;</span> <span class="k">do
    </span><span class="nb">echo</span> <span class="s2">"dhcpcd.exit-hook run script: </span><span class="k">${</span><span class="nv">script</span><span class="k">}</span><span class="s2">"</span>
    <span class="s2">"</span><span class="nv">$script</span><span class="s2">"</span>
<span class="k">done</span>
</code></pre></div></div>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/dhcpcd.exit-hook.d/br1_isolate.sh</code></strong></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>

<span class="c"># only when we actually have an address</span>
<span class="k">if</span> <span class="o">[</span> <span class="s2">"</span><span class="nv">$interface</span><span class="s2">"</span> <span class="o">=</span> <span class="s2">"br1"</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="o">[</span> <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$new_ip_address</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"delete routes to avoid asymmetric routing - br1"</span>
    ip route del default dev br1
    ip route del 172.16.0.0/16 dev br1
<span class="k">fi</span>
</code></pre></div></div>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/dhcpcd.exit-hook.d/br110_isolate.sh</code></strong></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>

<span class="c"># only when we actually have an address</span>
<span class="k">if</span> <span class="o">[</span> <span class="s2">"</span><span class="nv">$interface</span><span class="s2">"</span> <span class="o">=</span> <span class="s2">"br110"</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="o">[</span> <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$new_ip_address</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"delete routes to avoid asymmetric routing - br110"</span>
    ip route del default dev br110
    ip route del 10.110.0.0/16 dev br110
<span class="k">fi</span>
</code></pre></div></div>
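
<p>The two hook scripts differ only in the interface name and subnet, so they could be folded into a single drop-in. The following is a minimal sketch, not the setup above: the <code class="language-plaintext highlighter-rouge">isolate</code> helper and the <code class="language-plaintext highlighter-rouge">isolate.sh</code> file name are made up for illustration, and it assumes dhcpcd passes <code class="language-plaintext highlighter-rouge">$interface</code> and <code class="language-plaintext highlighter-rouge">$new_ip_address</code> to the script exactly as in the scripts above.</p>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/dhcpcd.exit-hook.d/isolate.sh</code></strong> (hypothetical)</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh

# isolate IFACE SUBNET (hypothetical helper): delete the default route and
# the on-link subnet route that DHCP added for IFACE
isolate() {
    echo "delete routes to avoid asymmetric routing - $1"
    ip route del default dev "$1"
    ip route del "$2" dev "$1"
}

# only when we actually have an address
if [ -n "$new_ip_address" ]; then
    case "$interface" in
        br1)   isolate br1   172.16.0.0/16 ;;
        br110) isolate br110 10.110.0.0/16 ;;
    esac
fi
</code></pre></div></div>

<p>Adding another isolated bridge then becomes one extra <code class="language-plaintext highlighter-rouge">case</code> line instead of another copied script.</p>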

<p>After restarting networking, the routing table looks perfect:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default via 10.100.0.1 dev br0 proto dhcp src 10.100.254.246 metric 1025 
10.89.0.0/24 dev podman1 proto kernel scope link src 10.89.0.1 
10.100.0.0/16 dev br0 proto dhcp scope link src 10.100.254.246 metric 1025
</code></pre></div></div>

<p>And MQTT works in Home Assistant again!</p>
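
<p>To avoid eyeballing the table after every change, the check can be scripted. This is a minimal sketch, assuming the bridge names used above; <code class="language-plaintext highlighter-rouge">has_route_on</code> is a hypothetical helper that looks for <code class="language-plaintext highlighter-rouge">dev IFACE</code> in the text printed by <code class="language-plaintext highlighter-rouge">ip route show</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh

# has_route_on ROUTES IFACE: succeed if any line of ROUTES is a route on IFACE.
# The trailing "( |$)" stops br1 from also matching br110.
has_route_on() {
    printf '%s\n' "$1" | grep -Eq " dev $2( |\$)"
}

# usage: warn about any route still attached to an isolated bridge
#   routes=$(ip route show)
#   for iface in br1 br110; do
#       if has_route_on "$routes" "$iface"; then echo "route left on $iface"; fi
#   done
</code></pre></div></div>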

<h2 id="root-of-the-problem">Root of the problem</h2>

<p>This whole problem was caused by enabling a bunch of interfaces that grant access to different VLANs. As mentioned in the previous debugging post, VLANs are for traffic separation, and if something essentially bridges the VLANs as above, it defeats the whole point of the exercise.</p>

<p>So what was I trying to do here? I was trying to expose a bunch of services like nexus and MQTT to the whole network <em>and</em> run prometheus in the <code class="language-plaintext highlighter-rouge">management</code> VLAN.</p>

<p>The problem is that the main host is not officially on the <code class="language-plaintext highlighter-rouge">management</code> VLAN; it has an interface set up with an IP address and that’s it. Deleting the routes so it can’t reach anything effectively blocks me from using docker/podman bridge networking to run prometheus bound to the <code class="language-plaintext highlighter-rouge">management</code> VLAN without a lot of extra/brittle work, since podman expects the host to participate in networking.</p>

<p><code class="language-plaintext highlighter-rouge">MACVLAN</code> and <code class="language-plaintext highlighter-rouge">IPVLAN</code> cause problems with DHCP, which I <a href="/2026/02/14/podman-container-dhcp.html">already looked at extensively</a>. That leaves… VMs.</p>

<p>In the end, the solution is simple and bulletproof, if a little clunky: just bind the interface into a VM directly, and podman containers can use the default network. Ironically, this also means the interface no longer needs an IP address on the host and, you guessed it, removing the <code class="language-plaintext highlighter-rouge">use dhcp</code> from <code class="language-plaintext highlighter-rouge">/etc/network/interfaces</code> would have <em>also</em> prevented the asymmetric routing problem.</p>

<p>What a weekend.</p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[While trying to take my own advice to bind ports to host and forward, I inadvertently reintroduced asymmetric routing to the network.]]></summary></entry><entry><title type="html">traefik frontend for podman</title><link href="http://www.declarativesystems.com/2026/03/07/traefik-frontend-for-podman.html" rel="alternate" type="text/html" title="traefik frontend for podman" /><published>2026-03-07T00:00:00+00:00</published><updated>2026-03-07T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/03/07/traefik-frontend-for-podman</id><content type="html" xml:base="http://www.declarativesystems.com/2026/03/07/traefik-frontend-for-podman.html"><![CDATA[<p>Following on from <a href="/2026/02/14/podman-container-dhcp.html">Podman in-container DHCP networking</a>, I needed a frontend proxy for my podman services.</p>

<p>Instead of tried and true <a href="https://www.haproxy.org/"><code class="language-plaintext highlighter-rouge">haproxy</code></a>, I tried out <a href="https://doc.traefik.io/traefik/"><code class="language-plaintext highlighter-rouge">traefik</code></a>.</p>

<h2 id="why-traefik">Why Traefik?</h2>

<ul>
  <li>Dynamic config sections</li>
  <li>Up-skill in traefik</li>
  <li>Much more flexible routing</li>
  <li>Choose HTTP support or raw TCP, mix and match as you please</li>
  <li>Perfect choice for flexible ingress in homelab or enterprise environments</li>
</ul>

<h2 id="how">How?</h2>

<p>Due to needing to operate across different networks, I chose to run the <code class="language-plaintext highlighter-rouge">traefik</code> binary directly on the host. Since the entrypoints section must be static, I ended up writing a <code class="language-plaintext highlighter-rouge">.j2</code> template for <code class="language-plaintext highlighter-rouge">traefik.yml</code>:</p>

<p><strong><code class="language-plaintext highlighter-rouge">traefik.yml.j2</code></strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
# ansible managed

#accessLog: 
#  format: json

entryPoints:

{% if br110_ip is defined %}
  infrastructure.http:
    address: "{{ br110_ip }}:80"
  infrastructure.https:
    address: "{{ br110_ip }}:443"
    http:
      tls: {}    
  infrastructure.docker:
    address: "{{ br110_ip }}:5000"
  infrastructure.dockers:
    address: "{{ br110_ip }}:5001"
  infrastructure.mqtts:
    address: "{{ br110_ip }}:8883/tcp"
  infrastructure.otel-grpc:
    address: "{{ br110_ip }}:4317"
  infrastructure.otel-http:
    address: "{{ br110_ip }}:4318"    
{% endif %}

{% if default_ip is defined %}
  default.http:
    address: "{{ default_ip }}:80"
  default.https:
    address: "{{ default_ip }}:443"
    http:
      tls: {}
  default.alertmanager:
    address: "{{ default_ip }}:9093"
  default.blackbox:
    address: "{{ default_ip }}:9115"
{% endif %}

providers:
  file:
    directory: /etc/traefik/traefik.d
    watch: true


log:
  level: INFO

</code></pre></div></div>

<p>The template configures itself to listen on specific IP addresses to avoid port clashes and expose services selectively.</p>

<p>The template gets pre-processed by a simple script and jinja2:</p>

<p><strong><code class="language-plaintext highlighter-rouge">/usr/local/bin/configure_traefik.sh</code></strong></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>

<span class="c"># generate traefik.yml by merging IP addresses with traefik template</span>

<span class="k">for </span>iface <span class="k">in</span> <span class="si">$(</span>ip <span class="nt">-o</span> <span class="nt">-4</span> addr show | <span class="nb">awk</span> <span class="s1">'{print $2}'</span> | <span class="nb">grep</span> <span class="s1">'^br'</span><span class="si">)</span><span class="p">;</span> <span class="k">do
    </span><span class="nv">ip</span><span class="o">=</span><span class="si">$(</span>ip <span class="nt">-o</span> <span class="nt">-4</span> addr show <span class="s2">"</span><span class="nv">$iface</span><span class="s2">"</span> | <span class="nb">awk</span> <span class="s1">'{print $4}'</span> | <span class="nb">cut</span> <span class="nt">-d</span>/ <span class="nt">-f1</span><span class="si">)</span>
    <span class="nb">export</span> <span class="s2">"</span><span class="k">${</span><span class="nv">iface</span><span class="k">}</span><span class="s2">_ip=</span><span class="nv">$ip</span><span class="s2">"</span>
<span class="k">done</span>

<span class="c"># wait up to 60 seconds for an IP</span>
<span class="k">for </span>i <span class="k">in</span> <span class="o">{</span>1..60<span class="o">}</span><span class="p">;</span> <span class="k">do
    </span><span class="nb">export </span><span class="nv">default_ip</span><span class="o">=</span><span class="si">$(</span><span class="nb">hostname</span> <span class="nt">-i</span><span class="si">)</span>
    <span class="k">if</span> <span class="o">[</span> <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$default_ip</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
        </span><span class="nb">break
    </span><span class="k">fi
    </span><span class="nb">sleep </span>1
<span class="k">done

if</span> <span class="o">[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="nv">$default_ip</span><span class="s2">"</span> <span class="o">]</span> <span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"no default IP after waiting!"</span> 
    <span class="nb">exit </span>1
<span class="k">fi</span>

<span class="c"># run j2 with all in-scope environment variables available</span>
<span class="nb">echo</span> <span class="s2">"rebuild traefik config file"</span>
j2 <span class="nt">-e</span> <span class="s2">""</span> <span class="nt">-o</span> /etc/traefik/traefik.yml /etc/traefik/traefik.yml.j2
</code></pre></div></div>
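
<p>The exported variable names are what tie the script to the template (<code class="language-plaintext highlighter-rouge">br110_ip</code> and so on), so it can be handy to preview them without regenerating the config. This is a sketch, assuming the one-line-per-address format printed by <code class="language-plaintext highlighter-rouge">ip -o -4 addr show</code>; <code class="language-plaintext highlighter-rouge">bridge_ips</code> is a hypothetical helper and not part of the script above:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh

# bridge_ips: read `ip -o -4 addr show` output on stdin and print
# NAME_ip=ADDRESS for every interface whose name starts with "br".
# Field 2 is the interface name, field 4 is ADDRESS/PREFIX.
bridge_ips() {
    awk '$2 ~ /^br/ { split($4, a, "/"); print $2 "_ip=" a[1] }'
}

# usage:
#   ip -o -4 addr show | bridge_ips
</code></pre></div></div>

<p>Any name printed here is available to the template under the same name, and anything missing falls through the <code class="language-plaintext highlighter-rouge">is defined</code> guards.</p>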

<p>Each time the service is restarted, the script is called due to <code class="language-plaintext highlighter-rouge">ExecStartPre</code> and this generates the final <code class="language-plaintext highlighter-rouge">traefik.yml</code> file:</p>

<p><strong><code class="language-plaintext highlighter-rouge">traefik.service</code></strong></p>

<div class="language-systemd highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">[Unit]</span>
<span class="nt">Description</span><span class="p">=</span>Traefik Reverse Proxy

<span class="c"># online target does not work with dhcpcd</span>
<span class="nt">After</span><span class="p">=</span>network.target

<span class="k">[Service]</span>
<span class="c"># Grant the specific capability we need (ambient set preserves it across execs if any)</span>
<span class="nt">AmbientCapabilities</span><span class="p">=</span>CAP_NET_BIND_SERVICE

<span class="c"># Optional but strongly recommended: restrict to ONLY this capability</span>
<span class="c"># (prevents the process from gaining others accidentally)</span>
<span class="c"># allow bind port &lt; 1024 without root</span>
<span class="nt">CapabilityBoundingSet</span><span class="p">=</span>CAP_NET_BIND_SERVICE

<span class="c"># Good security hygiene: prevent privilege escalation tricks</span>
<span class="nt">NoNewPrivileges</span><span class="p">=</span>yes

<span class="nt">User</span><span class="p">=</span>traefik
<span class="nt">Group</span><span class="p">=</span>traefik

<span class="nt">Type</span><span class="p">=</span>simple
<span class="c"># runs as traefik user</span>
<span class="nt">ExecStartPre</span><span class="p">=</span>/usr/local/bin/configure_traefik.sh
<span class="nt">ExecStart</span><span class="p">=</span>/opt/traefik/traefik --configFile=/etc/traefik/traefik.yml
<span class="nt">Restart</span><span class="p">=</span>always

<span class="k">[Install]</span>
<span class="nt">WantedBy</span><span class="p">=</span>multi-user.target
</code></pre></div></div>

<h2 id="tls-config">TLS config</h2>

<p>TLS can <em>only</em> be configured with a dynamic file placed in the scanned directory configured above (<code class="language-plaintext highlighter-rouge">/etc/traefik/traefik.d</code>):</p>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/traefik/traefik.d/tls.yml</code></strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tls:
  stores:
    default:
      defaultCertificate:
        certFile: /etc/traefik/tls/wildcard.infrastructure.asio.cert.fullchain.pem
        keyFile: /etc/traefik/tls/wildcard.infrastructure.asio.key.pem
</code></pre></div></div>

<h2 id="per-service-drop-ins">Per-service drop-ins</h2>

<p>The final part of the puzzle was to create drop-ins for each required service in the <code class="language-plaintext highlighter-rouge">/etc/traefik/traefik.d/</code> directory. Here’s my most complicated example - <a href="https://www.sonatype.com/products/sonatype-nexus-repository">Nexus</a>:</p>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/traefik/traefik.d/nexus.infrastructure.asio.yml</code></strong></p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># ansible managed</span>

<span class="c1">#</span>
<span class="c1"># HTTP</span>
<span class="c1">#</span>
<span class="na">http</span><span class="pi">:</span>
  <span class="na">routers</span><span class="pi">:</span>
    <span class="na">nexus-http</span><span class="pi">:</span>
      <span class="c1"># HostSNI only for TCP - we get the real headers for http routes</span>
      <span class="c1"># as TLS already terminated and http call parsed</span>
      <span class="na">rule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Host(`nexus.infrastructure.asio`)"</span>


      <span class="na">entryPoints</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">infrastructure.http</span>
      <span class="na">service</span><span class="pi">:</span> <span class="s">nexus-8081</span>
    <span class="na">docker</span><span class="pi">:</span>
      <span class="c1"># HostSNI only for TCP - we get the real headers for http routes</span>
      <span class="c1"># as TLS already terminated and http call parsed</span>
      <span class="na">rule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Host(`nexus.infrastructure.asio`)"</span>


      <span class="na">entryPoints</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">infrastructure.docker</span>
      <span class="na">service</span><span class="pi">:</span> <span class="s">nexus-5000</span>

  <span class="na">services</span><span class="pi">:</span>
    <span class="na">nexus-8081</span><span class="pi">:</span>
      <span class="na">loadBalancer</span><span class="pi">:</span>
        <span class="na">servers</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">url</span><span class="pi">:</span> <span class="s2">"</span><span class="s">http://localhost:8081"</span>
    <span class="na">nexus-5000</span><span class="pi">:</span>
      <span class="na">loadBalancer</span><span class="pi">:</span>
        <span class="na">servers</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">url</span><span class="pi">:</span> <span class="s2">"</span><span class="s">http://localhost:5000"</span>

<span class="c1">#</span>
<span class="c1"># TCP</span>
<span class="c1">#</span>
<span class="na">tcp</span><span class="pi">:</span>
  <span class="na">routers</span><span class="pi">:</span>
    <span class="na">nexus-https</span><span class="pi">:</span>
      <span class="na">rule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">HostSNI(`nexus.infrastructure.asio`)"</span>
      <span class="na">tls</span><span class="pi">:</span>
        <span class="na">passthrough</span><span class="pi">:</span> <span class="no">true</span>
      <span class="na">entryPoints</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">infrastructure.https</span>
      <span class="na">service</span><span class="pi">:</span> <span class="s">nexus-8443</span>
    <span class="na">dockers</span><span class="pi">:</span>
      <span class="na">rule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">HostSNI(`nexus.infrastructure.asio`)"</span>
      <span class="na">tls</span><span class="pi">:</span>
        <span class="na">passthrough</span><span class="pi">:</span> <span class="no">true</span>
      <span class="na">entryPoints</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">infrastructure.dockers</span>
      <span class="na">service</span><span class="pi">:</span> <span class="s">nexus-5001</span>


  <span class="na">services</span><span class="pi">:</span>
    <span class="na">nexus-8443</span><span class="pi">:</span>
      <span class="na">loadBalancer</span><span class="pi">:</span>
        <span class="na">servers</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">address</span><span class="pi">:</span> <span class="s2">"</span><span class="s">localhost:8443"</span>
    <span class="na">nexus-5001</span><span class="pi">:</span>
      <span class="na">loadBalancer</span><span class="pi">:</span>
        <span class="na">servers</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="na">address</span><span class="pi">:</span> <span class="s2">"</span><span class="s">localhost:5001"</span>
</code></pre></div></div>

<p>Of course, I had some ansible create all of these files for me from some basic per-host structured variables :)</p>
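
<p>The ansible itself isn’t shown here, but the shape of the generated files is regular enough to sketch in plain shell. This is an illustration only and not the ansible mentioned above: <code class="language-plaintext highlighter-rouge">http_dropin</code> is a hypothetical helper that prints one HTTP router and its backing service, following the naming used in the Nexus example:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh

# http_dropin NAME HOST ENTRYPOINT PORT (hypothetical helper):
# print a minimal traefik dynamic config with one HTTP router (NAME-http)
# forwarding HOST on ENTRYPOINT to a service on localhost:PORT (NAME-PORT)
http_dropin() {
    printf 'http:\n'
    printf '  routers:\n'
    printf '    %s-http:\n' "$1"
    printf '      rule: "Host(`%s`)"\n' "$2"
    printf '      entryPoints:\n'
    printf '      - %s\n' "$3"
    printf '      service: %s-%s\n' "$1" "$4"
    printf '  services:\n'
    printf '    %s-%s:\n' "$1" "$4"
    printf '      loadBalancer:\n'
    printf '        servers:\n'
    printf '          - url: "http://localhost:%s"\n' "$4"
}

# usage:
#   http_dropin nexus nexus.infrastructure.asio infrastructure.http 8081
</code></pre></div></div>

<p>Redirecting the output into a file under <code class="language-plaintext highlighter-rouge">/etc/traefik/traefik.d/</code> should be enough for traefik to pick it up, since the file provider above is configured with <code class="language-plaintext highlighter-rouge">watch: true</code>.</p>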

<h2 id="verdict">Verdict</h2>

<p>After automating the above for my podman services, I <em>finally</em> have:</p>
<ul>
  <li>Services available in the right VLAN</li>
  <li>No more MACVLAN DHCP chaos</li>
  <li>Correct host/VLAN isolation (see gotchas - below)</li>
  <li>TLS everywhere with wildcard certificate</li>
  <li>Easy, repeatable method for adding new podman services</li>
</ul>

<h2 id="why-not-kubernetes">Why not Kubernetes?</h2>

<p>These are my vital network services that Kubernetes needs itself - my other clusters use this one extensively, so these need to be amongst the most bulletproof services in my home lab and fool-proof containers/VMs are perfect for this.</p>

<h2 id="gotchas">Gotchas</h2>
<p>There were some unique challenges I had with traefik which are not well documented:</p>
<ul>
  <li>If you have any <code class="language-plaintext highlighter-rouge">entrypoints</code> specified, you <em>cannot</em> override them with arguments or environment variables</li>
  <li>You <em>cannot</em> configure your TLS certificates in the static section. They are <em>only</em> processed in the dynamic config files</li>
  <li>Static config files are fully static - no variables, etc</li>
  <li><a href="/2026/03/07/debugging-asymmetric-routing-ii.html">You cannot use this pattern to bridge host-isolated VLANs; use a VM instead</a></li>
  <li>Podman Quadlet services must bind to localhost only, to avoid port collision, like this:</li>
</ul>

<p><strong><code class="language-plaintext highlighter-rouge">/etc/containers/systemd/nexus-pod.kube</code></strong></p>

<div class="language-systemd highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">[Install]</span>
<span class="nt">WantedBy</span><span class="p">=</span>default.target

<span class="k">[Unit]</span>

<span class="k">[Kube]</span>
<span class="nt">Yaml</span><span class="p">=</span>/etc/containers/systemd/nexus-pod.yml

<span class="c"># web UI</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:8081:8081

<span class="c"># h2 database (nothing normally listens, not exposed outside host)</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:1234:1234

<span class="c"># web UI + TLS</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:8443:8443

<span class="c"># docker</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:5000:5000

<span class="c"># docker + TLS</span>
<span class="nt">PublishPort</span><span class="p">=</span>127.0.0.1:5001:5001
</code></pre></div></div>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[Following on from Podman in-container DHCP networking, I needed a frontend proxy for my podman services.]]></summary></entry><entry><title type="html">Podman in-container DHCP networking</title><link href="http://www.declarativesystems.com/2026/02/14/podman-container-dhcp.html" rel="alternate" type="text/html" title="Podman in-container DHCP networking" /><published>2026-02-14T00:00:00+00:00</published><updated>2026-02-14T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/02/14/podman-container-dhcp</id><content type="html" xml:base="http://www.declarativesystems.com/2026/02/14/podman-container-dhcp.html"><![CDATA[<h2 id="tldr">TL;DR</h2>

<p>Don’t do this. Bind ports to host and port-forward.</p>

<h2 id="extended-version">Extended version</h2>

<p>This was a dead end, but I’m documenting it here for clarity in case it helps someone.</p>

<p>I have a multi-homed/multi-VLAN host running <a href="/2025/03/12/podman-quadlet-services.html">podman kube services for the network</a>. So far I’ve been essentially plugging connectors directly into specific VLANs on the network.</p>

<p>Networking has always been a bit flakey for me with these containers. Today, I decided to fix things.</p>

<p>After much research, I ended up finding a pattern that worked better:</p>

<ul>
  <li>Separate podman network with bridge networking and external DHCP server</li>
  <li>In-container dhcp client</li>
</ul>

<p>The in-container DHCP client turns the container into a full network host <em>on a different VLAN to the host</em> as far as the router is concerned, and this has nice side effects like automatic DNS hostnames if you have configured this.</p>

<h3 id="separate-podman-network">Separate podman network</h3>

<p>I <a href="https://docs.ansible.com/projects/ansible/latest/collections/containers/podman/podman_network_module.html">manage my podman networks with ansible</a>. I was able to define a new network bound to my separately set up linux network bridge <code class="language-plaintext highlighter-rouge">br110</code> like this:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">podman bridge vlan br110/infrastructure</span>
  <span class="na">containers.podman.podman_network</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">podman-vlan-infrastructure</span>
    <span class="na">driver</span><span class="pi">:</span> <span class="s">bridge</span>
    <span class="na">opt</span><span class="pi">:</span>
      <span class="na">bridge_name</span><span class="pi">:</span> <span class="s">br110</span>
    <span class="na">ipam_driver</span><span class="pi">:</span> <span class="s">none</span>
    <span class="na">force</span><span class="pi">:</span> <span class="no">true</span>
</code></pre></div></div>

<p>Bridge networking is way more reliable than MACVLAN, which I was previously using.</p>

<p>The network is referenced in a <code class="language-plaintext highlighter-rouge">.kube</code> file like this:</p>

<div class="language-systemd highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">PodmanArgs</span><span class="p">=</span>--mac-address de:3b:ee:01:02:03
<span class="nt">PublishPort</span><span class="p">=</span>1883:1883
<span class="nt">PublishPort</span><span class="p">=</span>8883:8883
<span class="nt">PublishPort</span><span class="p">=</span>9001:9001
<span class="nt">Network</span><span class="p">=</span>podman-vlan-infrastructure
</code></pre></div></div>

<p>The fixed MAC address is so I can issue a static DHCP lease on the router.</p>

<h3 id="in-container-dhcp-client">In-container DHCP client</h3>

<p>There’s a few moving parts to this:</p>

<h4 id="container-image-must-provide-a-dhcp-client">Container image must provide a DHCP client</h4>

<p>Alpine images provide <code class="language-plaintext highlighter-rouge">udhcpc</code>. If the image does not already include a DHCP client, you will need to build a custom image.</p>

<h4 id="writable-etcresolveconf">Writable <code class="language-plaintext highlighter-rouge">/etc/resolv.conf</code></h4>

<p><code class="language-plaintext highlighter-rouge">/etc/resolv.conf</code> inside containers is often a mount point provided by the Podman network stack itself. <code class="language-plaintext highlighter-rouge">udhcpc</code> will fail to rename it with <code class="language-plaintext highlighter-rouge">Resource busy</code> errors. We can fix this by providing our own mount instead.</p>

<p>Add a volume to the pod:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">spec</span><span class="pi">:</span>
  <span class="na">volumes</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">resolv-override</span>
    <span class="na">emptyDir</span><span class="pi">:</span> <span class="pi">{}</span>
</code></pre></div></div>

<p>Mount in the container(s) at the right place:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="na">volumeMounts</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">resolv-override</span>
      <span class="na">mountPath</span><span class="pi">:</span> <span class="s">/etc/resolv.conf</span>
</code></pre></div></div>

<h4 id="security-context">Security context</h4>

<p>Boost privileges as needed in the container(s) to allow DHCP broadcasts and file updates:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="na">securityContext</span><span class="pi">:</span>
      <span class="c1"># if container runs as non-root, boost UID to root to let it</span>
      <span class="c1"># write /etc/resolv.conf (these sit beside capabilities, not under it)</span>
      <span class="c1"># runAsUser: 0</span>
      <span class="c1"># runAsGroup: 0</span>
      <span class="na">capabilities</span><span class="pi">:</span>
        <span class="na">add</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">NET_ADMIN</span>
        <span class="pi">-</span> <span class="s">NET_RAW</span>
</code></pre></div></div>

<h4 id="container-startup">Container startup</h4>

<p>Container(s) need their <code class="language-plaintext highlighter-rouge">command</code> and <code class="language-plaintext highlighter-rouge">args</code> tweaked to run the DHCP client first and then delegate to the normal docker entrypoint with <code class="language-plaintext highlighter-rouge">exec</code>. This needs to be customized for each image you want to deal with, e.g. this one is for mosquitto:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="na">command</span><span class="pi">:</span> 
    <span class="pi">-</span> <span class="s">/bin/sh</span>
    <span class="na">args</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">-c</span>
    <span class="pi">-</span> <span class="pi">|</span>
      <span class="s">udhcpc</span>
      <span class="s">exec sh /docker-entrypoint.sh /usr/sbin/mosquitto -c /mosquitto/config/mosquitto.conf</span>
</code></pre></div></div>

<h4 id="complete-example">Complete example</h4>

<p>My completed definition (this one is for Nexus) looks like this:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Save the output of this file and use kubectl create -f to import</span>
<span class="c1"># it into Kubernetes.</span>
<span class="c1">#</span>
<span class="c1"># Created with podman-4.3.1</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Pod</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">annotations</span><span class="pi">:</span>
    <span class="na">io.kubernetes.cri-o.ContainerType/app</span><span class="pi">:</span> <span class="s">container</span>
    <span class="na">io.kubernetes.cri-o.SandboxID/app</span><span class="pi">:</span> <span class="s">e026b5b478c4489f78b85e875af39f4d439467a7b7abdf469629a9daa312cdf</span>
    <span class="na">io.kubernetes.cri-o.TTY/app</span><span class="pi">:</span> <span class="s2">"</span><span class="s">false"</span>
    <span class="na">io.podman.annotations.autoremove/app</span><span class="pi">:</span> <span class="s2">"</span><span class="s">FALSE"</span>
    <span class="na">io.podman.annotations.init/app</span><span class="pi">:</span> <span class="s2">"</span><span class="s">FALSE"</span>
    <span class="na">io.podman.annotations.privileged/app</span><span class="pi">:</span> <span class="s2">"</span><span class="s">FALSE"</span>
    <span class="na">io.podman.annotations.publish-all/app</span><span class="pi">:</span> <span class="s2">"</span><span class="s">FALSE"</span>
  <span class="na">creationTimestamp</span><span class="pi">:</span> <span class="s2">"</span><span class="s">2025-01-27T10:57:38Z"</span>
  <span class="na">labels</span><span class="pi">:</span>
    <span class="na">app</span><span class="pi">:</span> <span class="s">nexus</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">nexus</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">automountServiceAccountToken</span><span class="pi">:</span> <span class="no">false</span>
  <span class="na">containers</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">image</span><span class="pi">:</span> <span class="s">docker.io/sonatype/nexus3:3.89.1-alpine</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">app</span>
    <span class="na">command</span><span class="pi">:</span> 
    <span class="pi">-</span> <span class="s">/bin/sh</span>
    <span class="na">args</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">-c</span>
    <span class="pi">-</span> <span class="pi">|</span>
      <span class="s">udhcpc</span>
      <span class="s">exec sh /opt/sonatype/nexus/bin/nexus run</span>
    <span class="na">ports</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">containerPort</span><span class="pi">:</span> <span class="m">1234</span>
      <span class="na">hostPort</span><span class="pi">:</span> <span class="m">1234</span>
    <span class="pi">-</span> <span class="na">containerPort</span><span class="pi">:</span> <span class="m">8081</span>
      <span class="na">hostPort</span><span class="pi">:</span> <span class="m">8081</span>
    <span class="pi">-</span> <span class="na">containerPort</span><span class="pi">:</span> <span class="m">8443</span>
      <span class="na">hostPort</span><span class="pi">:</span> <span class="s">8443</span> 
    <span class="na">resources</span><span class="pi">:</span> <span class="pi">{}</span>
    <span class="na">securityContext</span><span class="pi">:</span>
      <span class="na">runAsUser</span><span class="pi">:</span> <span class="m">0</span>
      <span class="na">runAsGroup</span><span class="pi">:</span> <span class="m">0</span>
      <span class="na">capabilities</span><span class="pi">:</span>
        <span class="na">add</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">NET_ADMIN</span>
        <span class="pi">-</span> <span class="s">NET_RAW</span>
        <span class="na">drop</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">CAP_MKNOD</span>
        <span class="pi">-</span> <span class="s">CAP_AUDIT_WRITE</span>
    <span class="na">volumeMounts</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">resolv-override</span>
      <span class="na">mountPath</span><span class="pi">:</span> <span class="s">/etc/resolv.conf</span>
    <span class="pi">-</span> <span class="na">mountPath</span><span class="pi">:</span> <span class="s">/nexus-data</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">nexus-data-vol</span>
    <span class="pi">-</span> <span class="na">mountPath</span><span class="pi">:</span> <span class="s">/nexus-data/etc</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">nexus-config-vol</span>  
    <span class="pi">-</span> <span class="na">mountPath</span><span class="pi">:</span> <span class="s">/nexus-data/etc/tls</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">nexus-tls-vol</span>            
  <span class="na">enableServiceLinks</span><span class="pi">:</span> <span class="no">false</span>
  <span class="na">hostname</span><span class="pi">:</span> <span class="s">nexus</span>
  <span class="na">restartPolicy</span><span class="pi">:</span> <span class="s">Never</span>
  <span class="na">volumes</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">hostPath</span><span class="pi">:</span>
      <span class="na">path</span><span class="pi">:</span> <span class="s">/data/containers/nexus/data</span>
      <span class="na">type</span><span class="pi">:</span> <span class="s">Directory</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">nexus-data-vol</span>    
  <span class="pi">-</span> <span class="na">hostPath</span><span class="pi">:</span>
      <span class="na">path</span><span class="pi">:</span> <span class="s">/data/containers/nexus/config</span>
      <span class="na">type</span><span class="pi">:</span> <span class="s">Directory</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">nexus-config-vol</span>   
  <span class="pi">-</span> <span class="na">hostPath</span><span class="pi">:</span>
      <span class="na">path</span><span class="pi">:</span> <span class="s">/data/containers/nexus/tls</span>
      <span class="na">type</span><span class="pi">:</span> <span class="s">Directory</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">nexus-tls-vol</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">resolv-override</span>
    <span class="na">emptyDir</span><span class="pi">:</span> <span class="pi">{}</span>
<span class="na">status</span><span class="pi">:</span> <span class="pi">{}</span>
</code></pre></div></div>

<h4 id="reboot-and-test">Reboot and test</h4>

<p>The container should appear on the network and DHCP should now be rock solid.</p>

<h2 id="why-not-do-this">Why not do this?</h2>

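<p>While this works, running DHCP clients in containers reduces security (each container needs the extra <code class="language-plaintext highlighter-rouge">NET_ADMIN</code> and <code class="language-plaintext highlighter-rouge">NET_RAW</code> capabilities) and increases complexity. Specifically, the entrypoint needs to be configured differently for each container.</p>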

<p>In the end it turns out to be much simpler to just forward ports from the host and register a DNS alias on the router.</p>

<p>In the past, I’ve held off doing this because having multiple VLANs with active IP addresses on my host was creating <a href="/2025/03/13/debugging-asymmetric-routing.html">asymmetric routing chaos</a>.</p>

<p>I’m confident I can solve these problems now and install a simple reverse proxy instead which should be way simpler to manage - stay tuned.</p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[TL;DR]]></summary></entry><entry><title type="html">ESP32 SD Card strangeness</title><link href="http://www.declarativesystems.com/2026/02/01/esp23-sd-card-strangeness.html" rel="alternate" type="text/html" title="ESP32 SD Card strangeness" /><published>2026-02-01T00:00:00+00:00</published><updated>2026-02-01T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/02/01/esp23-sd-card-strangeness</id><content type="html" xml:base="http://www.declarativesystems.com/2026/02/01/esp23-sd-card-strangeness.html"><![CDATA[<p>This is a new one:</p>

<p>ESP32 with a <code class="language-plaintext highlighter-rouge">vfat</code> SD card. Everything working fine. Take SD card out, copy <code class="language-plaintext highlighter-rouge">.json</code> config file over top of existing file <em>on PC</em>, SD card back in ESP32, power on and WIFI no longer connecting/soft lockup.</p>

<p>Running <code class="language-plaintext highlighter-rouge">fsck</code> on the SD card gives no errors.</p>

<p>Luckily my program is configured to print out the config file on startup for debugging purposes. On a serial console, I see most of the config file being printed out, however, the end of the file is replaced by a continuous stream of smiley face (☺/<code class="language-plaintext highlighter-rouge">0x01</code>) characters:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"lomq32/devices/makerfabs/malo4mos/121": { "name": "tv pumps", "descript☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺
</code></pre></div></div>

<p>After some digging, it turns out on ESP32, the FAT implementation can sometimes read the <em>old/wrong</em> clusters when you copy files over the top of existing ones - and the strange characters I was seeing were basically an overflow/underflow when reading from the card.</p>

<p>The fix?</p>

<p>On PC:</p>
<ol>
  <li>Delete config file from SD card</li>
  <li>Copy the (identical) file from the PC to the SD card again</li>
  <li>Unmount SD card, insert into ESP32, power on</li>
</ol>
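
<p>The steps above can be scripted on a Linux PC. This is only a minimal sketch: the function name <code class="language-plaintext highlighter-rouge">replace_file</code>, the file names and the mount point are assumptions for illustration. It deletes the old file before copying, flushes writes, then compares the copy byte for byte:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># replace_file SRC DEST: delete DEST first, then copy SRC to it,
# mirroring the manual delete-then-copy fix described above.
replace_file() {
    src=$1
    dest=$2
    rm -f -- "$dest" || return 1      # step 1: delete the old file
    cp -- "$src" "$dest" || return 1  # step 2: copy the file again
    sync                              # flush pending writes before unmounting
    cmp -s -- "$src" "$dest"          # succeeds only if the copy is byte-identical
}

# Hypothetical usage (paths are assumptions, adjust to your own mount point):
# replace_file ./config.json /media/sdcard/config.json &amp;&amp; umount /media/sdcard
</code></pre></div></div>

<p>The final comparison only proves the PC reads back what it wrote. It cannot show what the ESP32 will read, so the serial console dump is still the real test.</p>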

<p>The system booted normally and joined WIFI after this.</p>

<p>Strange huh?</p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[This is a new one:]]></summary></entry><entry><title type="html">Home Assistant MDNS (ZeroConf) Network forwarding on OPNsense</title><link href="http://www.declarativesystems.com/2026/01/31/homeassistant-mdns-forwarding.html" rel="alternate" type="text/html" title="Home Assistant MDNS (ZeroConf) Network forwarding on OPNsense" /><published>2026-01-31T00:00:00+00:00</published><updated>2026-01-31T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/01/31/homeassistant-mdns-forwarding</id><content type="html" xml:base="http://www.declarativesystems.com/2026/01/31/homeassistant-mdns-forwarding.html"><![CDATA[<p>If your home network is split up into <a href="https://en.wikipedia.org/wiki/VLAN">VLANs</a> like it should be, you will find that the handy auto-detection in Home Assistant no longer works and you need to type in IP addresses for devices manually. Things like casting video to the TV also stop working. This is because <a href="https://en.wikipedia.org/wiki/Multicast_DNS">mDNS</a> is link-local multicast and can’t cross subnets.</p>

<p>Fixing this for the whole network on an <a href="https://opnsense.org/">OPNsense</a> router while preserving VLAN security is surprisingly simple. On the OPNsense web UI:</p>

<h2 id="step-0-rtfm">Step 0: RTFM</h2>

<ul>
  <li><a href="https://docs.opnsense.org/manual/how-tos/multicast-dns.html">Multicast DNS Proxy</a></li>
</ul>

<h2 id="step-1-install">Step 1: Install</h2>

<p><code class="language-plaintext highlighter-rouge">System</code> -&gt; <code class="language-plaintext highlighter-rouge">Firmware</code> -&gt; <code class="language-plaintext highlighter-rouge">Plugins</code></p>

<p>Select and install <code class="language-plaintext highlighter-rouge">os-mdns-repeater</code>, then reboot.</p>

<h2 id="step-2-configure">Step 2: Configure</h2>

<p><code class="language-plaintext highlighter-rouge">Services</code> -&gt; <code class="language-plaintext highlighter-rouge">mDNS Repeater</code></p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Enable</code></li>
  <li><code class="language-plaintext highlighter-rouge">Listen Interfaces</code> are the networks you want to bridge. Apparently there is a limit of 5, although the UI does not prevent you from selecting more. By bridging, we are effectively making a flat network for mDNS so that Home Assistant can find devices on other VLANs</li>
</ul>

<h2 id="step-3-firewall">Step 3: Firewall</h2>

<p>The last thing to do is add firewall rules to allow mDNS traffic where needed. I added this very slack rule:</p>

<p><img src="/assets/img/mdns_firewall_rule.png" alt="firewall rule" /></p>

<p>This just allows all mDNS traffic to anywhere. You could restrict it further if needed.</p>

<p>Of course, Home Assistant also needs to be able to reach back to the devices it discovers, so this may also require additional rules depending on how your network is set up.</p>

<h2 id="step-4-saveapplyreboot">Step 4: Save/apply/reboot</h2>

<p>After making changes like this, it’s good to reboot the router so you can be sure the settings survive a reboot.</p>

<p>Some devices only send mDNS packets on startup as well, so this is a good time to go around the house rebooting printers etc.</p>

<h2 id="step-5-testenjoy">Step 5: Test/enjoy</h2>

<p>If everything worked, that’s really all there is to it. A few minutes after rebooting devices, I saw things showing up in Home Assistant -&gt; <code class="language-plaintext highlighter-rouge">Settings</code> -&gt; <code class="language-plaintext highlighter-rouge">Devices &amp; Services</code> -&gt; <code class="language-plaintext highlighter-rouge">Discovered</code>:</p>

<p><img src="/assets/img/home_assistant_mdns.png" alt="ha discovery working" /></p>

<p>On my phone, my TV was detected in Prime Video and I was able to watch content, whereas I normally have to join a different WIFI SSID.</p>

<p>Finally, on my Linux desktop, <code class="language-plaintext highlighter-rouge">avahi-browse</code> also finds devices:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>avahi-browse <span class="nt">-a</span>
+ wlp3s0 IPv4 Brother HL-L2460DW                            Web Site             <span class="nb">local</span>
+ wlp3s0 IPv4 Brother HL-L2460DW                            Secure Internet Printer <span class="nb">local</span>
+ wlp3s0 IPv4 Brother HL-L2460DW                            Internet Printer     <span class="nb">local</span>
+ wlp3s0 IPv4 Brother HL-L2460DW                            UNIX Printer         <span class="nb">local</span>
+ wlp3s0 IPv4 Home                                          _home-assistant._tcp <span class="nb">local</span>
...
</code></pre></div></div>

<p>In summary, this makes using Home Assistant way simpler and restores casting to the TV so that a normal human being can use it.</p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[If your home network is split up into VLANs like it should be, you will find that the handy auto-detection in Home Assistant no longer works and you need to type in IP addresses for devices manually. Things like casting video to the TV also stop working. This is because mDNS is link-local multicast and can’t cross subnets.]]></summary></entry><entry><title type="html">Roomba in 2026</title><link href="http://www.declarativesystems.com/2026/01/31/roomba-2026.html" rel="alternate" type="text/html" title="Roomba in 2026" /><published>2026-01-31T00:00:00+00:00</published><updated>2026-01-31T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2026/01/31/roomba-2026</id><content type="html" xml:base="http://www.declarativesystems.com/2026/01/31/roomba-2026.html"><![CDATA[<blockquote>
  <p>Eventually, any serious IT conversation ends up with a discussion about robot vacuum cleaners</p>
</blockquote>

<p>I bought my Roomba 980 10 years ago and it still works fine, although some of the plastic parts have started breaking and it’s loud as hell. Sadly, iRobot (the company behind Roomba) recently went bankrupt - who knows what effect this will have on the Roomba app and privacy, or if the app will even continue to work at all.</p>

<p>With this in mind I searched for a modern replacement with good privacy controls and drew a blank. Most manufacturers have mandated cloud control for their new devices. There’s just no way I’m spending almost $2000 on a new robot that doesn’t respect privacy. I did find two exceptions:</p>

<ol>
  <li><a href="https://valetudo.cloud/">Valetudo</a> - An outstanding project and clearly a labour of love from its author. As much as I wanted to give this a go, I’ve just got too much going on to hunt down a compatible model, crack open the case and then hook up a serial programmer while also being careful to prevent any future firmware updates. If I already had a compatible robot I’d totally install this, but buying into this ecosystem with <em>actively hostile</em> manufacturers is not something I’m willing to pay for.</li>
  <li><a href="https://maticrobots.com">Matic</a> - Promising. It claims to work locally instead of requiring cloud, but not open source, too big to clean under the bed, expensive and requires vacuum bags only available from Matic.</li>
</ol>

<h2 id="fixing-up-the-roomba">Fixing up the Roomba</h2>

<p>I decided to just fix up and secure the Roomba, and integrate it properly with Home Assistant.</p>

<h3 id="firewall-ports">Firewall ports</h3>

<p><a href="https://homesupport.irobot.com/s/article/9025">Roomba documentation</a> reproduced here, for your convenience:</p>

<blockquote>
  <p>Internal Network Traffic</p>

  <ul>
    <li>UDP port 5353/5678 for discovery.</li>
    <li>TCP/HTTPS 443 for data traffic.</li>
    <li>TCP/MQTT 8080/8883 for data traffic.</li>
  </ul>

  <p>Outbound Traffic to the Internet</p>

  <ul>
    <li>UDP/SNTP port 123 for time.</li>
    <li>TCP/HTTPS 443 (/80) for data traffic.</li>
    <li>TCP/MQTT 8080/8883 for data traffic.</li>
    <li>UDP/TCP port 53 for DNS.</li>
  </ul>
</blockquote>

<h3 id="repairs">Repairs</h3>

<p>I made a <a href="https://www.printables.com/@GeoffWilliams/collections/3121012">Roomba 980</a> collection on Printables.com and 3D printed the replacement parts I needed, then ordered a new front wheel for $5 on AliExpress and a bag of spare rollers and filters for $20 on Amazon.</p>

<p>That covers all physical needs and the robot is back to tip-top condition.</p>

<h3 id="network-security-setup">Network Security Setup</h3>

<p>With the decision made that this robot is feature complete and I’m not interested in maps, I decided to cut the robot off from the Internet completely and just allow it access to NTP and DNS.</p>

<p>I have a complex setup spanning managed devices and routers but essentially what’s needed to do this is:</p>

<h4 id="omada-wifi-and-switches">Omada (wifi and switches)</h4>

<ul>
  <li><code class="language-plaintext highlighter-rouge">hell</code> VLAN to completely isolate traffic</li>
  <li>WIFI access point called <code class="language-plaintext highlighter-rouge">hell</code>, mapped to corresponding VLAN</li>
  <li>Modify trunk ports to carry <code class="language-plaintext highlighter-rouge">hell</code> VLAN, and allow Home Assistant to reach <code class="language-plaintext highlighter-rouge">hell</code> on its LAN port</li>
</ul>

<h4 id="opnsense-router">OPNsense (router)</h4>

<ul>
  <li>Interface for <code class="language-plaintext highlighter-rouge">hell</code></li>
  <li>Device mapping for <code class="language-plaintext highlighter-rouge">hell</code></li>
  <li>DHCP server for <code class="language-plaintext highlighter-rouge">hell</code></li>
  <li>Firewall rules for <code class="language-plaintext highlighter-rouge">hell</code></li>
</ul>

<p><img src="/assets/img/opnsense_hell.png" alt="hell firewall rules" /></p>

<p>The disabled rule at the top is to temporarily allow internet access when onboarding the robot with the app.</p>

<h3 id="robotapp">Robot/App</h3>

<p>Temporarily allow internet access in firewall, then either factory reset or change access point to <code class="language-plaintext highlighter-rouge">hell</code> in the app.</p>

<p>When the app says things are working, force close the app so that Home Assistant will be able to connect.</p>

<p>Back in your firewall, assign a static IP address to the robot.</p>

<h3 id="home-assistant">Home Assistant</h3>

<p>Home assistant provides a <a href="https://www.home-assistant.io/integrations/roomba/">Roomba integration</a> that works very nicely.</p>

<p>Even with <a href="/2026/01/31/homeassistant-mdns-forwarding.html">mDNS forwarding now working</a>, the Roomba is not detectable on a separate VLAN because the integration uses the other discovery port (UDP 5678), so just get the robot IP address from the router.</p>

<p>To connect, just allow the auto-detection to fail, then type the <strong>IP ADDRESS</strong> (not the hostname! - does not work) of the Roomba and follow the prompts on screen to get access automatically by pressing some buttons. There’s no need to run funny commands or docker images to get MQTT passwords any more.</p>

<p>If connecting fails, make sure the app is closed and reboot the Roomba. Some reports even suggest starting a vacuuming run as well.</p>

<p>After successfully adding the Roomba, there will be a new panel on the dashboard where you can control the robot. You can <em>also</em> use voice control in the Home Assistant Android app, like this (my robot is called Lucy):</p>

<p><img src="/assets/img/ha_voice_lucy.png" alt="roomba ha app" /></p>

<h3 id="noise">Noise</h3>

<p>Not much I can do about noise except avoid vacuuming at night and either go out or use noise cancelling headphones.</p>

<h3 id="disable-internet-access">Disable internet access</h3>

<p>With all systems working, I <em>disabled</em> the rule that allowed internet access, then connected to the <code class="language-plaintext highlighter-rouge">hell</code> SSID and <strong>tested it</strong> to make sure access was really blocked.</p>
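
<p>A test like that could look something like the sketch below, run from a Linux laptop joined to the <code class="language-plaintext highlighter-rouge">hell</code> SSID. The helper names <code class="language-plaintext highlighter-rouge">expect_ok</code> and <code class="language-plaintext highlighter-rouge">expect_blocked</code>, the <code class="language-plaintext highlighter-rouge">example.com</code> host and the choice of checks are all assumptions for illustration, and it relies on <code class="language-plaintext highlighter-rouge">getent</code>, <code class="language-plaintext highlighter-rouge">curl</code> and <code class="language-plaintext highlighter-rouge">nc</code> being installed:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># expect_ok DESCRIPTION COMMAND...: pass if the command succeeds
expect_ok() {
    desc=$1; shift
    if "$@" &gt;/dev/null 2&gt;&amp;1; then
        echo "PASS: $desc"
    else
        echo "FAIL: $desc"; return 1
    fi
}

# expect_blocked DESCRIPTION COMMAND...: pass if the command fails
expect_blocked() {
    desc=$1; shift
    if "$@" &gt;/dev/null 2&gt;&amp;1; then
        echo "FAIL: $desc"; return 1
    else
        echo "PASS: $desc"
    fi
}

# Hypothetical checks (hosts and timeouts are assumptions):
# expect_ok      "DNS still resolves" getent hosts example.com
# expect_blocked "HTTPS is blocked"   curl --silent --max-time 5 https://example.com
# expect_blocked "MQTT is blocked"    nc -z -w 5 example.com 8883
</code></pre></div></div>

<p>Checking the things that should still work matters as much as checking the blocks, otherwise a laptop that never joined the network at all would look like a perfectly secured VLAN.</p>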

<p>The app will no longer work but the Roomba is now fully secured from future updates and app breakages.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Really the only thing missing from this setup is the cleaning maps. I’m not sure if my firmware is supported by <a href="https://github.com/jeremywillans/ha-rest980-roomba/issues/53">rest980</a> and at this point, I’m happy to let this one feature slide.</p>

<p>I’m confident I can get a few more years out of my Roomba, and perhaps there will be some more consumer-friendly vacuums to buy by then. Otherwise, maintaining a vintage robot may become the modern equivalent of driving a classic car 😂</p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[Eventually, any serious IT conversation ends up with a discussion about robot vacuum cleaners]]></summary></entry><entry><title type="html">3D Printed WIFI QR Codes</title><link href="http://www.declarativesystems.com/2025/12/04/3d-printed-qr-codes.html" rel="alternate" type="text/html" title="3D Printed WIFI QR Codes" /><published>2025-12-04T00:00:00+00:00</published><updated>2025-12-04T00:00:00+00:00</updated><id>http://www.declarativesystems.com/2025/12/04/3d-printed-qr-codes</id><content type="html" xml:base="http://www.declarativesystems.com/2025/12/04/3d-printed-qr-codes.html"><![CDATA[<p>Sometimes you just want to 3D print a QR code without stuffing round with Fusion 360 or whatever. This is useful because a paper QR code stuck to the wall with Blu Tack will eventually soak up the oil and become un-scannable, so it’s just as well to do things properly.</p>

<p>In my case, I want to 3D print a WIFI QR code, mostly from the command line on Linux.</p>

<h2 id="step-1---generate-qr-code-for-wifi">Step 1 - Generate QR code for WIFI</h2>

<p>If you want a scan-to-join WIFI code, there is a special payload to use for your QR code. Replace <code class="language-plaintext highlighter-rouge">THESSID</code> and <code class="language-plaintext highlighter-rouge">THEPASSWORD</code> with the values for your own network:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WIFI:T:WPA2;S:THESSID;P:THEPASSWORD;;
</code></pre></div></div>

<p>Now generate your QR code as an SVG:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># apt install qrencode</span>
qrencode <span class="nt">-t</span> SVG <span class="nt">-o</span> guest_wifi.svg <span class="s2">"WIFI:T:WPA2;S:THESSID;P:THEPASSWORD;;"</span>
</code></pre></div></div>
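
<p>If your SSID or password contains punctuation, the payload needs a little care: as far as I know, the commonly used <code class="language-plaintext highlighter-rouge">WIFI:</code> format expects backslash, semicolon, comma, colon and double quote to be escaped with a backslash. A minimal sketch of building the payload that way follows. The function names <code class="language-plaintext highlighter-rouge">wifi_escape</code> and <code class="language-plaintext highlighter-rouge">wifi_payload</code> are made up for this example and the escaping list is an assumption to check against your phone:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code># wifi_escape STRING: backslash-escape the characters assumed to be special
# in the WIFI: payload (backslash, semicolon, comma, colon, double quote)
wifi_escape() {
    printf '%s' "$1" | sed 's/[\\;,:"]/\\&amp;/g'
}

# wifi_payload SSID PASSWORD: print the scan-to-join payload
wifi_payload() {
    printf 'WIFI:T:WPA2;S:%s;P:%s;;\n' "$(wifi_escape "$1")" "$(wifi_escape "$2")"
}

# Hypothetical usage with the qrencode command from above:
# qrencode -t SVG -o guest_wifi.svg "$(wifi_payload 'THESSID' 'THEPASSWORD')"
</code></pre></div></div>

<p>For a plain alphanumeric SSID and password this prints exactly the same payload as typing it by hand, so it only earns its keep when the password has punctuation in it.</p>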

<h3 id="step-2---make-it-look-pretty">Step 2 - Make it look pretty</h3>

<p>Open Inkscape, import the QR code SVG, add boxes and text to your heart’s content, then export as a new SVG file.</p>

<h3 id="step-3---convert-svg-to-stl">Step 3 - Convert SVG to STL</h3>

<p>I found a handy tool, <a href="https://github.com/positron48/svg2stl">svg2stl</a>, to convert the pretty SVG from Inkscape to an STL file, like this:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./svg2stl-linux <span class="nt">--thickness</span> 1.0 <span class="nt">--pixel_size</span> 0.025 ../guest_wifi_badge.svg
</code></pre></div></div>

<p>It worked perfectly for me, first time!</p>

<h3 id="step-4---slice-it">Step 4 - Slice it</h3>

<p>The STL file generated above is just the black parts of the SVG. It needs to go on a base to print nicely. There’s probably an easier way to do this but the steps I needed in PrusaSlicer were:</p>

<ol>
  <li>Import the STL</li>
  <li>Right click, add part, box</li>
  <li>Scale and translate box to cover the QR code badge</li>
  <li>Position the QR code above the box</li>
  <li>Add a layer change and assign black colour</li>
  <li>Make sure to print it big so the lines are clear - mine was about 12cm x 15cm</li>
  <li>Inspect the slicing result with the layer slider to check everything looks good</li>
</ol>

<p>Don’t forget to test-scan your code from the slicer app with your phone before printing.</p>

<p><strong>Update: after printing, removing the QR code from the base caused all the bits of QR code to turn into projectiles and fly across the room, making a horrendous mess - make sure not to mix plastics, and/or make the base thicker to reduce flexing 😂</strong></p>]]></content><author><name>Geoff Williams</name><email>geoff@declarativesystems.com</email></author><summary type="html"><![CDATA[Sometimes you just want to 3D print a QR code without stuffing round with Fusion 360 or whatever. This is useful because a paper QR code stuck to the wall with Blu Tack will eventually soak up the oil and become un-scannable, so it’s just as well to do things properly.]]></summary></entry></feed>