Wayne Hayes 34422ba2b1 mailbox: give sidecar netns real IPv6 egress; resolve AAAA trap; DNS notes

Add enable_ipv6 + a ULA subnet to tailwart_default so the Stalwart
container (sharing the ts-stalwart netns) gets working IPv6 egress.
Because only egress is needed (inbound arrives via the edge/tailnet),
a ULA + Docker masquerade suffices -- no routable prefix, ndppd, or
host sysctl changes (Docker 29 enables ip6tables by default; host
forwarding was already on). Verified: ping6 + TCP/443 to v6 literals
from inside the netns; zero ENETUNREACH since boot.

LESSONS: mark #8/#9 resolved with the ULA-masquerade recipe, and add
#13 -- Spaceship's DNS API is RRSet-upsert (not zone-replace), so
Stalwart/ACME did not eat custom AAAA records; a vanished AAAA is a
provider-side loss, not Stalwart. Includes the safe read/verify flow
and the "don't publish mail AAAA before edge v6 listeners" caveat.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-11 23:53:28 +01:00

tailwart — lessons learned

Hard-won notes from bringing the mail edge up. Each entry is symptom → cause → fix, ordered roughly by how long it cost. Read this before re-debugging.

1. Postgres startup race ate cert/setting writes

Symptom: TLS certs (manual import and ACME) would validate but never persist — Stalwart kept serving its rcgen self-signed fallback. Logs showed Failed to create tables: error connecting to server on most boots.

Cause: Stalwart shares the ts-stalwart sidecar's netns. Its depends_on only waited for the sidecar's own health (/healthz = "tailscaled up"), which flips green before the tailnet route to Postgres (the-record-prod:5432) is usable. Stalwart started into that gap, failed the DB connect, and any write in that window — including a freshly obtained cert — was silently lost.

Fix: the sidecar healthcheck now also requires Postgres to be reachable (nc -z … 5432), so depends_on: service_healthy can't release Stalwart into the race. See docker-compose.yml. First clean boot after this: zero PG errors, 4 live connections immediately.
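A minimal sketch of the two-stage check described in the fix, written as a shell function. The function name, the healthz URL passed in, and the 3-second timeout are illustrative assumptions; the real healthcheck lives in docker-compose.yml and may differ.

```shell
# Hypothetical sidecar healthcheck: healthy only when tailscaled reports up
# AND the Postgres port answers a TCP connect over the tailnet.
sidecar_healthy() {
  local healthz_url="$1" pg_host="$2" pg_port="${3:-5432}"

  # Stage 1: tailscaled's own health endpoint (where the old check stopped).
  wget -q -O /dev/null "$healthz_url" || return 1

  # Stage 2: the tailnet route to Postgres must actually be usable.
  # The 3-second timeout is an assumed value.
  nc -z -w 3 "$pg_host" "$pg_port" || return 1
}
```

Called as `sidecar_healthy <healthz-url> the-record-prod 5432`, it returns non-zero during the window where tailscaled is up but the route to Postgres is not, which is the gap Stalwart used to start into.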

2. DNS-01 was blocked by a dead Spaceship API key

Symptom: Failed to set DNS RRSet: Unauthorized on every record; no cert issued; no _acme-challenge TXT ever set.

Cause: the cert design is ACME DNS-01 via the Spaceship provider (bundled in caddy/lego). The stored API key was invalid (recovery debris from an earlier config attempt). Note STALWART_ACME_PROVIDER / STALWART_ACME_TOKEN in .env are empty and not even passed through by compose — the provider + secret are entered in the admin UI (stored in the DB), not via env.

Gotcha: secret fields render blank in the Stalwart admin even when set (the S3 secret behaves identically). A blank field is not evidence it's unset.

Fix / how to verify a key directly (egresses the box's WAN IP, same as Stalwart):

curl -i 'https://spaceship.dev/api/v1/dns/records/<domain>?take=5&skip=0' \
  -H 'X-Api-Key: KEY' -H 'X-Api-Secret: SECRET'
# 401 application.unauthorized = bad key/secret or IP-restricted
# 200 = good

A fresh Spaceship key fixed it.
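For repeated checks, the same request can be wrapped so it returns only a verdict. This is a sketch: the function name and argument order are made up here, and it maps only the two status codes noted above (anything else is reported as unexpected).

```shell
# Hypothetical wrapper around the key check above.
# Prints "ok", "unauthorized", or "unexpected:<code>".
spaceship_key_status() {
  local domain="$1" key="$2" secret="$3" code

  # -o /dev/null discards the body; -w prints just the HTTP status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    "https://spaceship.dev/api/v1/dns/records/${domain}?take=5&skip=0" \
    -H "X-Api-Key: ${key}" -H "X-Api-Secret: ${secret}")

  case "$code" in
    200) echo "ok" ;;
    401) echo "unauthorized"; return 1 ;;   # bad key/secret or IP-restricted
    *)   echo "unexpected:${code}"; return 2 ;;
  esac
}
```

Run it from the box itself so the request egresses the same WAN IP that Stalwart uses; an IP-restricted key can look valid from a laptop and still fail from the server.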

3. Stalwart's auto-ban vs PROXY protocol (the "8080 mystery")

Symptom: the edge box could relay mail fine but could not reach Stalwart's :8080 admin — connections accept then immediately close. Looked like "tagged devices rejected, user phone works."

Cause: Stalwart's fail2ban checks the proxied client IP (from the PROXY header) on the mail listeners, but the raw connection IP on the non-proxied admin listener. A banned edge-box IP therefore still relays mail (ban checked against the header IP) while direct →:8080 is dropped (checked against the box IP). Malformed probing of the mail ports re-arms the ban.

Fix: add 100.64.0.0/10 (and the box's WAN IP, which appears as the proxied client when you hit the box's own public hostname) to the fail2ban allow-list. Bans are in-memory — a Stalwart restart flushes them. Don't rapid-poll the mail ports to test.
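When reading ban entries, it helps to know quickly whether an address is already covered by the 100.64.0.0/10 allow-list entry. The helper below is an illustrative sketch (the function name is hypothetical); it only tests membership in that one range and does no validation of the remaining octets.

```shell
# Returns 0 if the dotted-quad IPv4 address is inside 100.64.0.0/10.
in_tailnet_range() {
  local ip="$1" first rest second
  first=${ip%%.*}        # first octet
  rest=${ip#*.}
  second=${rest%%.*}     # second octet

  # A /10 fixes the first octet (100) and the top two bits of the second,
  # so the second octet must fall in 64..127.
  [ "$first" -eq 100 ] && [ "$second" -ge 64 ] && [ "$second" -le 127 ]
}
```

Both sidecar addresses seen in this debugging session (100.112.26.122 and 100.79.87.80) pass this check; the box's WAN IP does not, which is why it needs its own allow-list entry.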

4. The wildcard request required DNS-01 (why HTTP-01 was a dead end)

With "Additional Hostnames" left empty, Stalwart requests a wildcard (*.<domain>). Wildcards can only be issued via DNS-01 — HTTP-01 literally cannot satisfy them. We burned time on an HTTP-01 + Caddy-challenge-forwarding detour before realizing DNS-01 was the intended (and only viable) path. One wildcard cert then covers mail, mta-sts, autoconfig, autodiscover, etc.

5. :443 web endpoints need SNI pass-through, not L7 proxy

MTA-STS / autoconfig / autodiscover serve over :443. You cannot L7 reverse_proxy them through Caddy, because the CAA record pins issuance to Stalwart's ACME account — Caddy can't get its own cert for those names. Stalwart holds the wildcard, so the edge passes TLS through by SNI. See caddy/README.md → "The HTTP side". Needed tcp:443 added to the reverse-proxy → stalwart ACL grant.

6. The sidecar is ephemeral — never hardcode its tailnet IP

ts-stalwart runs with ?ephemeral=true, so its tailnet IP changes on re-registration (an ACL re-sync did this mid-debug: 100.112.26.122 → 100.79.87.80). Everything must use the MagicDNS name stalwart.tail7b1641.ts.net. A hardcoded IP will mysteriously go Network is unreachable.
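One way to catch a hardcoded tailnet address before it bites is to grep configs for IPv4 literals in 100.64.0.0/10. This is a sketch: the function name is hypothetical, and which paths to scan is left to the caller.

```shell
# Prints file:line:content for every IPv4 literal in 100.64.0.0/10
# found in the given files or directories. Exits non-zero if none found.
find_hardcoded_tailnet_ips() {
  # Second octet 64..127 = 6[4-9] | [7-9][0-9] | 1[01][0-9] | 12[0-7].
  # The guards on both sides stop matches inside longer dotted strings.
  grep -rnHE \
    '(^|[^0-9.])100\.(6[4-9]|[7-9][0-9]|1[01][0-9]|12[0-7])\.[0-9]{1,3}\.[0-9]{1,3}([^0-9.]|$)' \
    "$@"
}
```

For example, `find_hardcoded_tailnet_ips docker-compose.yml caddy/` lists each offending line. No output and a non-zero exit status means nothing matched.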

7. Don't trust crt.sh for rate-limit checks

crt.sh was flaky/empty all session. To gauge Let's Encrypt's weekly duplicate-cert limit, use certspotter instead: https://api.certspotter.com/v1/issuances?domain=<d>&include_subdomains=true. Also: LE rate limits are separate per dimension. Failed validations are counted hourly (5/hr/host, the one a retry storm trips); issued duplicates are counted weekly (5/wk). A renewal task hammering every 10 min trips the hourly one; consolidate to a single task.

8. The Stalwart container has no IPv6 — AAAA targets fail before IPv4 is tried

Symptom: Outbound delivery (and relay-to-smarthost) to any host with an AAAA record fails with I/O error: Network is unreachable (os error 101). Hosts that are IPv4-only deliver fine. Pointing a relay at a hostname that has both A and AAAA fails; pointing it at the raw IPv4 works.

Cause: Stalwart shares the ts-stalwart sidecar's netns, which has no global IPv6. When it resolves a dual-stack target it tries the AAAA first, gets ENETUNREACH immediately, and for a relay next-hop it does not fall back to the A record — it just records the v6 failure and backs off. So a single missing address family wedges all mail to dual-stack destinations.

Fix: Either (a) pin the relay/smarthost address to an IPv4 literal (no AAAA to trip on), or (b) give the container real IPv6. Note that relaying over the tailnet sidesteps this entirely — you connect to a tailnet 100.x address, which has no AAAA, so the v6-first trap never triggers.

RESOLVED (2026-06-11) — option (b) is now done. The container has real IPv6 egress; this trap no longer fires. See Lesson 9's fix for how.

9. Configuring IPv6 on the KVM host does NOT give the container IPv6

Symptom: ip -6 addr and ping6 google.com succeed on the KVM host, but Stalwart still dies with os error 101 on AAAA targets, and the box is still a broken IPv6 Tailscale exit node.

Cause: The host's eth0 and the container/sidecar netns are separate network stacks. Adding the provider's /64 to eth0 (ifupdown inet6 static + onlink default route, since the gateway is in a different /64) fixes the host, not the container. Docker doesn't hand IPv6 to containers by default, and the sidecar routes via Tailscale, not eth0.

Fix: Don't assume host IPv6 = container IPv6. Test from inside the container's netns. For mail egress, the IPv4-literal relay (Lesson 8) or the tailnet relay avoids needing container IPv6 at all. Enabling true container IPv6 (Docker IPv6 + routing the /64 in) is a separate, larger task.

RESOLVED (2026-06-11) — the easy way, no /64 routing or ndppd. Because the container only needs IPv6 egress (inbound arrives via the edge/tailnet, never v6), you don't need a routable prefix or NDP proxy at all — just a ULA subnet + masquerade, exactly like Docker does for v4:

# docker-compose.yml
networks:
  default:
    enable_ipv6: true
    ipam:
      config:
        - subnet: fd00:7a17:600d::/64
          gateway: fd00:7a17:600d::1

Docker 29 enables ip6tables by default and masquerades the ULA out the host's global v6, so the sidecar netns (shared by Stalwart via network_mode) gets a working v6 default route with zero host sysctl/daemon changes (host net.ipv6.conf.all.forwarding was already 1 from the static-v6 setup). Verify from inside the netns: ping6 google.com + a TCP connect to a v6 literal on :443. Recreating the network (docker compose down && docker compose up -d) bounces the stack, and the ephemeral sidecar gets a new tailnet IP; MagicDNS covers it (Lesson 6), and the MTA route table rebuilds anyway (Lesson 12). This does not give inbound v6; for that you'd still publish AAAA + make the edge listen on v6 (separate).
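The two-step verification can be scripted roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the container passed in must ship ping6 and nc (plausible for a busybox-based sidecar image, not confirmed here), and the timeouts are arbitrary.

```shell
# Runs both v6 egress checks inside the given container's netns.
# $1 = container name, $2 = an IPv6 literal that answers on TCP/443.
verify_v6_egress() {
  local container="$1" v6_literal="$2"

  # Check 1: ICMPv6 to a dual-stack name (exercises DNS AAAA + routing).
  docker exec "$container" ping6 -c 1 -W 3 google.com >/dev/null || return 1

  # Check 2: a real TCP connect to a v6 literal on :443 (no DNS involved).
  docker exec "$container" nc -z -w 3 "$v6_literal" 443 || return 1
}
```

Because Stalwart shares the sidecar's netns, running this against the sidecar container tests the same network stack Stalwart sends mail from. Running it on the KVM host proves nothing about the container (Lesson 9).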

10. The VPS blocks ALL outbound SMTP ports — relay over the tailnet

Symptom: Direct MX delivery and relay-to-public-host both fail with Connection timed out (os error 110), and the SYN never arrives at the destination. Not just port 25 — 465, 587, even alt-port 2525 all time out.

Cause: The KVM provider blocks outbound SMTP on every port we tried (25, 465, 587, 2525) to prevent spam. Only non-SMTP ports (443, etc.) egress. Confirmed with:

for p in 25 465 587 2525 443; do
  timeout 5 bash -c "exec 3<>/dev/tcp/<dst>/$p" && echo "$p OPEN" || echo "$p blocked"
done
# 443 OPEN, all SMTP ports timeout

Fix: Relay over the tailnet. Tailscale rides WireGuard/DERP (UDP 41641 / 443), so it's immune to SMTP port filtering. Point the relay at the smarthost's tailnet IP (e.g. 100.x:587), not its public address. Long-term: ask the provider to unblock outbound 25/587 for verified use.

11. The sidecar can RECEIVE on the tailnet but can't INITIATE without an ACL grant

Symptom: The relay to <mailbox-tailnet-ip>:587 times out (os error 110), yet the KVM host (same physical machine) can reach that exact IP:port over the tailnet fine. Looks like a routing or transparent-proxy bug.

Cause: The Stalwart container rides the ts-stalwart sidecar — a separate tailnet node (tag:stalwart) from the KVM host. The tailwart ACL block only listed tag:stalwart as a destination ("dst": ["tag:stalwart"]). Tailnet is default-deny, so the sidecar could receive connections but could not initiate the relay back to the mailbox → silent drop → timeout. The KVM host worked because it's a different, permitted identity, which masked the real cause.

Fix: Add an ACL rule granting tag:stalwart as a source:

{ "src": ["tag:stalwart"], "dst": ["tag:mail"], "ip": ["tcp:587"] }

(mailbox is tag:mail). Applies in seconds, no restart. See acl-snippet.hujson.
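Lessons 6 and 8 to 11 all surface as one of two OS error codes, so a first-pass triage of a delivery log line can be sketched as below. The function name and the wording of the hints are illustrative; the mapping simply restates the causes recorded in those lessons.

```shell
# Maps a Stalwart delivery error line to the lessons that explain it.
triage_delivery_error() {
  case "$1" in
    *"os error 101"*)
      # Fails immediately: there is no route at all.
      echo "ENETUNREACH: no IPv6 in the netns (Lessons 8, 9) or a stale hardcoded tailnet IP (Lesson 6)" ;;
    *"os error 110"*)
      # Fails after a wait: packets leave but are silently dropped.
      echo "ETIMEDOUT: provider SMTP port block (Lesson 10) or missing tailnet ACL grant (Lesson 11)" ;;
    *)
      echo "unclassified"; return 1 ;;
  esac
}
```

The distinction worth remembering is timing: error 101 returns instantly, while error 110 only appears after the connect timeout, which is what makes a missing ACL grant look like a routing bug.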

12. Stalwart only rebuilds its MTA route table at container startup

Symptom: You edit an MtaRoute (address, etc.) via API/UI, but delivery keeps using the old value. The datastore shows the new value; live delivery ignores it.

Cause: The routing_strategy map is built once when the process boots. The ReloadSettings action reloads the datastore but does not rebuild the SMTP route map. So route/strategy changes are invisible until restart.

Fix: After any MtaRoute / MtaOutboundStrategy change, docker restart tailwart-stalwart-1. (Side effect: the ephemeral sidecar gets a new tailnet IP each restart — anything addressing it by IP must rediscover it; use the MagicDNS name where possible.)

13. "Did Stalwart eat my custom DNS records?" — no; Spaceship is RRSet-upsert

Symptom: A manually-added record (e.g. an AAAA for the apex/mail) is gone from the zone, and the suspicion is that Stalwart's ACME DNS-01 integration overwrote it on a renewal.

Cause: Almost never Stalwart. Its only DNS-provider writes are _acme-challenge.<name> TXT (the rotating challenge) and _validation-persist TXT (the LE account-pinned persistent-validation record). It does not create or modify A/AAAA/MX/SRV — those you add yourself from its "recommended records" page. And the Spaceship API is RRSet-upsert keyed by (name, type), not a whole-zone replace: a PUT /api/v1/dns/records/{domain} with {"force":true,"items":[…]} only touches the RRSets named in items. Proof: 25 unrelated records coexist untouched through every rotating _acme-challenge write; and adding one apex AAAA left the other 25 exactly intact (25→26).

So a vanished AAAA is far more likely a provider-side loss/rollback (e.g. during a data-center DDoS) or a manual edit — not Stalwart.

How to inspect / verify (read-only), creds in .env:

# -f2- keeps everything after the first '=', in case the value itself contains '='
KEY=$(grep '^SPACESHIP_KEY=' .env | cut -d= -f2-)
SECRET=$(grep '^SPACESHIP_SECRET=' .env | cut -d= -f2-)
curl -s "https://spaceship.dev/api/v1/dns/records/<domain>?take=100&skip=0" \
  -H "X-Api-Key: $KEY" -H "X-Api-Secret: $SECRET" | python3 -m json.tool

To add a record, PUT the same endpoint with a single-item items array — it won't disturb siblings. Snapshot the zone (GET) before any write and diff after; snapshots land in _backup/ (gitignored). Always re-check at the authoritative NS (dig +short AAAA <name> @launch1.spaceship.net), not a cache.
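The snapshot-then-diff habit can be sketched as a small helper. Assumptions: the function name and the timestamped filename scheme are invented here, the zone fits in one page of 100 records, and the write step in between is left out because the item schema is not reproduced in these notes.

```shell
# Saves the current zone (read-only GET) to a timestamped file and prints its path.
# $1 = domain, $2 = API key, $3 = API secret, $4 = output dir (default _backup).
snapshot_zone() {
  local domain="$1" key="$2" secret="$3" outdir="${4:-_backup}" out
  mkdir -p "$outdir" || return 1
  out="${outdir}/${domain}-$(date -u +%Y%m%dT%H%M%SZ).json"

  # -f makes curl fail on HTTP errors, so a 401 never becomes a "snapshot".
  if ! curl -fsS "https://spaceship.dev/api/v1/dns/records/${domain}?take=100&skip=0" \
      -H "X-Api-Key: ${key}" -H "X-Api-Secret: ${secret}" > "$out"; then
    rm -f "$out"
    return 1
  fi
  printf '%s\n' "$out"
}
```

Typical use is `before=$(snapshot_zone <domain> "$KEY" "$SECRET")`, then the write, then `after=$(snapshot_zone <domain> "$KEY" "$SECRET")` and `diff "$before" "$after"`. Filenames have one-second resolution, so two snapshots taken within the same second would overwrite each other.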

Caveat — don't publish mail AAAA before the edge listens on v6. Inbound mail follows MX → mail.<domain>; an AAAA there with no v6 :25 listener on the edge makes senders try v6 and some won't fall back → deferred/bounced mail. An apex AAAA is safe (it doesn't affect MX routing). Do mail AAAA + edge v6 listeners together.
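A guard for this caveat can be sketched as a check against the authoritative NS before any rollout. The function name is hypothetical, and the default nameserver is simply the one used in the dig example above.

```shell
# Returns 0 if the name currently has an AAAA answer at the authoritative NS.
# $1 = hostname to check, $2 = nameserver (defaults to launch1.spaceship.net).
mail_aaaa_published() {
  local name="$1" ns="${2:-launch1.spaceship.net}"

  # +short prints one answer per line; empty output means no AAAA RRSet.
  [ -n "$(dig +short AAAA "$name" "@${ns}")" ]
}
```

If `mail_aaaa_published mail.<domain>` succeeds while the edge has no v6 :25 listener, the zone is in exactly the state this caveat warns about.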
