tailwart/caddy
Wayne Hayes 38ba2eb83d Harden mail edge: PG-race healthcheck gate, :443 SNI fan-out, docs
Fixes the root cause that was silently dropping Stalwart's cert/setting
writes, completes the public HTTPS endpoints, and captures the debugging
knowledge.

- docker-compose.yml: gate the ts-stalwart healthcheck on Postgres
  reachability (nc -z the-record-prod:5432) in addition to tailscaled
  health. Stalwart's depends_on: service_healthy can no longer release it
  into the window where the tailnet route to Postgres isn't up yet — which
  was failing table init and losing in-flight cert writes (-> rcgen).

- caddy/caddy.json + README: add the :443 SNI fan-out. mta-sts /
  autoconfig / autodiscover pass through to stalwart:443 (Stalwart
  terminates TLS with its wildcard cert; no proxy_protocol on :443).
  All other SNIs go to the box's web Caddy on :8443 (https_port 8443).
  L7 reverse_proxy is impossible here: CAA pins issuance to Stalwart's
  ACME account, so Caddy can't obtain its own cert for these names.

- acl-snippet.hujson: grant tcp:443 on reverse-proxy -> stalwart for the
  SNI pass-through.

- config/config.json: track the v0.16 bootstrap (commit-safe; the DB
  secret is an EnvironmentVariable reference, not inline).

- LESSONS.md: symptom -> cause -> fix notes (PG race, DNS-01/Spaceship
  dead key, auto-ban vs PROXY protocol, wildcard-requires-DNS-01, SNI
  pass-through, ephemeral sidecar IP, LE rate-limit checks).

- .gitignore: exclude _backup/ and _validate/ (DB dumps + an inline-secret
  config) and editor swap files. NEVER commit those.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 05:15:34 +01:00
caddy.json Harden mail edge: PG-race healthcheck gate, :443 SNI fan-out, docs 2026-06-11 05:15:34 +01:00
docker-compose.yml Scaffold tailwart: Stalwart mailbox as a Tailscale sidecar 2026-06-03 22:25:38 -04:00
Dockerfile caddy: build via caddyserver.com download URL, not local xcaddy 2026-06-03 22:39:33 -04:00
README.md Harden mail edge: PG-race healthcheck gate, :443 SNI fan-out, docs 2026-06-11 05:15:34 +01:00

tailwart edge — layer-4 mail proxy

A custom Caddy (with the caddy-l4 app) that pipes the public mail ports to the Stalwart sidecar over the tailnet. Pure TCP pass-through with PROXY protocol — Stalwart still terminates all the TLS. Runs anywhere with a public IP that's on the tailnet and tagged tag:reverse-proxy; doesn't need to share a host with the mailbox.

Why layer 4 and not a normal Caddy vhost

Web apps reverse-proxy at layer 7 (route by Host/SNI, Caddy terminates TLS). Mail can't: port 25 has no SNI (STARTTLS comes after connect), and you want one global :25 listener, not per-domain routing. So the edge is a dumb L4 pipe and Stalwart owns the TLS. This is the same stream-style proxying nginx and Caddy can do for any TCP service; they just aren't usually used that way.

Build & run

docker compose up -d --build      # builds the image, runs it

The Dockerfile doesn't compile Caddy — it pulls the prebuilt L4-enabled binary from caddyserver.com/api/download (the house method, see ~/docs/caddy.md "Custom Binary"), dodging the ~1GB-RAM local xcaddy build this VPS can't afford. The build still fails loudly if caddy-l4 isn't in the downloaded binary. To add plugins, append &p=<url-encoded module path> to CADDY_DOWNLOAD in the Dockerfile.

Edit the upstream

caddy.json dials stalwart.tail7b1641.ts.net:<port>. If your STALWART_MAGIC_NAME / TS_TAILNET differ, update the five dial lines. (JSON can't read .env; this is the one spot the MagicDNS name is hardcoded — same trade-off as pgAdmin's servers.json.)
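For orientation, a dial line sits inside a layer4 proxy handler. The fragment below is a minimal sketch of one mail listener, not a copy of this repo's caddy.json: the server name smtp, the choice of port 25, and PROXY protocol version v2 are all assumptions, so check them against the real file before editing.

```json
{
  "apps": {
    "layer4": {
      "servers": {
        "smtp": {
          "listen": [":25"],
          "routes": [
            {
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    { "dial": ["stalwart.tail7b1641.ts.net:25"] }
                  ]
                }
              ]
            }
          ]
        }
      }
    }
  }
}
```

Each exposed mail port would have its own entry of this shape, which is why a hostname change touches several dial lines at once.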

The HTTP side (MTA-STS / autoconfig / autodiscover) — :443 SNI fan-out

Stalwart publishes DNS that points public HTTPS names at this edge: mta-sts., autoconfig., autodiscover.<domain>. They serve the MTA-STS policy and mail-client autoconfig over :443 — so the edge has to handle :443 too, which is where a naive setup collides with a box that already runs a web Caddy.

The fix is not an L7 reverse_proxy (terminate at Caddy). You can't: the domain's CAA record pins issuance to Stalwart's ACME account (accounturi=…), so Caddy can't obtain its own cert for *.<domain>. Stalwart already holds the wildcard. So we pass TLS through to it.

The web server in caddy.json owns :443 and fans out by SNI:

  • mta-sts / autoconfig / autodiscover.<domain> → stalwart:443 (pass-through: Stalwart terminates with its wildcard cert, and there is no PROXY protocol on :443, unlike the mail ports).
  • every other SNI → 127.0.0.1:8443, the box's own web Caddy.
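The fan-out above could be expressed roughly as follows. This is a hedged sketch rather than the shipped config: example.com stands in for the real domain, the server key web is assumed from the wording above, and the fallback route is written here as "any TLS handshake" (an empty tls matcher), which may not be how the actual caddy.json phrases it. Neither route sets proxy_protocol, matching the note that :443 is plain pass-through.

```json
{
  "apps": {
    "layer4": {
      "servers": {
        "web": {
          "listen": [":443"],
          "routes": [
            {
              "match": [
                {
                  "tls": {
                    "sni": [
                      "mta-sts.example.com",
                      "autoconfig.example.com",
                      "autodiscover.example.com"
                    ]
                  }
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "upstreams": [
                    { "dial": ["stalwart.tail7b1641.ts.net:443"] }
                  ]
                }
              ]
            },
            {
              "match": [{ "tls": {} }],
              "handle": [
                {
                  "handler": "proxy",
                  "upstreams": [{ "dial": ["127.0.0.1:8443"] }]
                }
              ]
            }
          ]
        }
      }
    }
  }
}
```

Routes are tried in order, so the three mail names are peeled off first and everything else falls through to the web Caddy.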

For that fallback to exist, move the web Caddy's HTTPS off :443:

{
    https_port 8443    # web vhosts now listen here; the L4 :443 forwards to them
}

your-web-site.example { reverse_proxy  }

HTTP→HTTPS redirects still resolve to :443 correctly. A mail-only edge (no web vhosts on the box) omits the web server entirely — keep just the mail ports above.

Note: tag:reverse-proxy → tag:stalwart must also grant tcp:443 in the Tailscale ACL (see ../acl-snippet.hujson), on top of the mail ports.
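As an illustration of the shape such a rule can take, here is a minimal sketch in Tailscale's acls syntax. It is an assumption, not the contents of ../acl-snippet.hujson: only ports 25 and 443 are listed, and the real snippet may use different ports or the newer grants syntax.

```json
{
  "acls": [
    {
      "action": "accept",
      "src": ["tag:reverse-proxy"],
      "dst": ["tag:stalwart:25,443"]
    }
  ]
}
```

Any further mail ports the edge forwards would need to be added to the same dst port list.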

Prerequisites on the host running this

  • Joined to the tailnet, tagged tag:reverse-proxy (so the ACL lets it reach tag:stalwart).
  • Public firewall open on whichever mail ports you expose (25 at minimum).
  • Nothing else bound to those ports.