tailwart/acl-snippet.hujson
Wayne Hayes 38ba2eb83d Harden mail edge: PG-race healthcheck gate, :443 SNI fan-out, docs
Fixes the root cause that was silently dropping Stalwart's cert/setting
writes, completes the public HTTPS endpoints, and captures the debugging
knowledge.

- docker-compose.yml: gate the ts-stalwart healthcheck on Postgres
  reachability (nc -z the-record-prod:5432) in addition to tailscaled
  health. Stalwart's depends_on: service_healthy can no longer release it
  into the window where the tailnet route to Postgres isn't up yet — which
  was failing table init and losing in-flight cert writes (-> rcgen).

- caddy/caddy.json + README: add the :443 SNI fan-out. mta-sts /
  autoconfig / autodiscover pass through to stalwart:443 (Stalwart
  terminates TLS with its wildcard cert; no proxy_protocol on :443).
  All other SNIs go to the box's web Caddy on :8443 (https_port 8443).
  L7 reverse_proxy is impossible here: CAA pins issuance to Stalwart's
  ACME account, so Caddy can't obtain its own cert for these names.

- acl-snippet.hujson: grant tcp:443 on reverse-proxy -> stalwart for the
  SNI pass-through.

- config/config.json: track the v0.16 bootstrap (commit-safe; the DB
  secret is an EnvironmentVariable reference, not inline).

- LESSONS.md: symptom -> cause -> fix notes (PG race, DNS-01/Spaceship
  dead key, auto-ban vs PROXY protocol, wildcard-requires-DNS-01, SNI
  pass-through, ephemeral sidecar IP, LE rate-limit checks).

- .gitignore: exclude _backup/ and _validate/ (DB dumps + an inline-secret
  config) and editor swap files. NEVER commit those.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 05:15:34 +01:00


// tailwart — merge into your live Tailscale policy (admin console).
// Snippet, not a full policy. Kept here so an upstream pull of any other repo
// can't clobber it.
// 1) tagOwners — add (self-ownership required for auth-key node creation):
// "tag:stalwart": ["autogroup:admin", "tag:stalwart"],
// 2) grants — Stalwart reaches the three shared backends:
{ "src": ["tag:stalwart"], "dst": ["tag:db-postgres"], "ip": ["tcp:5432"] },
{ "src": ["tag:stalwart"], "dst": ["tag:db-redis"], "ip": ["tcp:6379"] },
{ "src": ["tag:stalwart"], "dst": ["tag:garage"], "ip": ["tcp:3900"] },
// 3) grants — the edge proxy (tag:reverse-proxy) reaches the mailbox ports.
// 8080 is the JMAP/admin HTTP tier (fronted by the main L7 Caddy).
// 443 is Stalwart's HTTPS web listener; the edge L4-proxies the public
// mta-sts/autoconfig/autodiscover SNIs to it (Stalwart terminates TLS with
// its wildcard cert). Unlike the mail ports, :443 carries no PROXY protocol.
{
  "src": ["tag:reverse-proxy"],
  "dst": ["tag:stalwart"],
  "ip": ["tcp:25", "tcp:465", "tcp:587", "tcp:143", "tcp:993", "tcp:443", "tcp:8080"],
},
// 4) admin console (not this file): assign tag:stalwart to the same OAuth
// client federatedSocial uses, on the Devices/Core + Keys/AuthKeys scopes.
// If the tag is missing, the node fails at boot with
// 403 "calling actor does not have enough permissions".
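// 5) optional sketch, not part of the original snippet: policy tests that pin
// the grants from steps 2 and 3. Assumptions: your live policy has a top-level
// "tests" array, and tests in your tailnet evaluate grants as well as acls;
// confirm both against the Tailscale policy docs before relying on this.
// Add these entries inside "tests":
{ "src": "tag:stalwart", "accept": ["tag:db-postgres:5432", "tag:db-redis:6379", "tag:garage:3900"] },
{
  "src": "tag:reverse-proxy",
  "accept": [
    "tag:stalwart:25", "tag:stalwart:465", "tag:stalwart:587",
    "tag:stalwart:143", "tag:stalwart:993", "tag:stalwart:443", "tag:stalwart:8080",
  ],
},
// With tests like these in place, a later policy edit that drops one of the
// ports should be rejected when the policy is saved, instead of surfacing as
// a mail outage.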