diff --git a/LESSONS.md b/LESSONS.md
index f2c29a5..368ef76 100644
--- a/LESSONS.md
+++ b/LESSONS.md
@@ -237,12 +237,74 @@ curl -s "https://spaceship.dev/api/v1/dns/records/?take=100&skip=0" \
   -H "X-Api-Key: $KEY" -H "X-Api-Secret: $SECRET" | python3 -m json.tool
 ```
 To add a record, `PUT` the same endpoint with a single-item `items` array — it
-won't disturb siblings. **Snapshot the zone (GET) before any write** and diff
-after; snapshots land in `_backup/` (gitignored). Always re-check at the
-authoritative NS (`dig +short AAAA @launch1.spaceship.net`), not a cache.
+won't disturb siblings of a *different* name/type (but see #14 — for an existing
+RRSet it **appends**, it does not replace). **Snapshot the zone (GET) before any
+write** and diff after; snapshots land in `_backup/` (gitignored). Always
+re-check at the authoritative NS (`dig +short AAAA @launch1.spaceship.net`),
+not a cache.
 
 **Caveat — don't publish `mail` AAAA before the edge listens on v6.** Inbound
 mail follows `MX → mail.`; an `AAAA` there with no v6 `:25` listener on the edge
 makes senders try v6 and some won't fall back → deferred/bounced mail. An
 **apex** `AAAA` is safe (it doesn't affect MX routing). Do `mail` AAAA + edge v6
 listeners together.
+
+## 14. Spaceship `PUT` is an APPEND-by-value, not a replace — it can dupe an RRSet
+
+**Symptom:** "Updating" the SPF record (`PUT` with `force:true` and the new
+value) left the zone with **two** `v=spf1` apex TXT records. Two SPF records is
+an RFC 7208 `permerror` → SPF **fails hard for everyone** — worse than the typo
+you were fixing.
+
+**Cause:** Spaceship keys records by (name, type, **value**). A `PUT` whose value
+differs from the existing record is a *new* record, so `force:true` **adds**
+rather than replacing. (The earlier AAAA/SPF adds looked like clean "upserts"
+only because there was no prior record at that name+type, or the value matched.)
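
Because records are keyed by (name, type, value), a duplicated RRSet shows up in a zone snapshot as two records that share a name and type. The following is a minimal Python sketch of a before/after check, not part of the Spaceship API. It assumes each snapshot has already been loaded as a list of dicts with `type`, `name`, and `value` keys; the helper names and the second SPF value are made up for illustration.

```python
from collections import Counter


def record_key(rec):
    """Key a record by (name, type, value), the identity described above."""
    return (rec["name"], rec["type"], rec["value"])


def diff_snapshots(before, after):
    """Return (removed, added) as sets of record keys."""
    old = {record_key(r) for r in before}
    new = {record_key(r) for r in after}
    return old - new, new - old


def duplicate_spf_names(records):
    """Names carrying more than one TXT record whose first term is v=spf1."""
    counts = Counter(
        r["name"]
        for r in records
        if r["type"] == "TXT"
        and r["value"].strip().lower().split()[:1] == ["v=spf1"]
    )
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical snapshots: a PUT with a changed value appended a second record.
before = [{"type": "TXT", "name": "@", "value": "v=spf1 mx -all"}]
after = before + [
    {"type": "TXT", "name": "@", "value": "v=spf1 mx include:example.net -all"}
]

removed, added = diff_snapshots(before, after)
print("count:", len(before), "->", len(after))
print("REMOVED:", removed)
print("ADDED:", added)
print("duplicate SPF at:", duplicate_spf_names(after))
```

An empty REMOVED set together with a non-empty ADDED set for an intended "update" is the signature of the append behaviour: the old value is still in the zone.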
+
+**Fix / correct pattern for an in-place value change:** `PUT` the new value, then
+**`DELETE` the old one** — and the `DELETE` body is a **bare JSON array**, not
+`{"items":[…]}` (the latter 422s with `Value is "object" but should be "array"`):
+```bash
+curl -s -X DELETE "https://spaceship.dev/api/v1/dns/records/" \
+  -H "X-Api-Key: $KEY" -H "X-Api-Secret: $SECRET" -H 'Content-Type: application/json' \
+  -d '[{"type":"TXT","name":"@","value":"v=spf1 mx -all"}]'
+```
+Always GET-diff before/after (count + REMOVED/ADDED sets) to catch a stray dupe.
+
+## 15. ed25519 DKIM "fails" at Gmail with both ed25519+RSA — it's not your key
+
+**Symptom:** DMARC aggregate reports show, per message, `dkim=pass` for the RSA
+selector but `dkim=fail` for the ed25519 selector (`v1-ed25519-…`), on the *same*
+intact message. Looks like a broken/mismatched ed25519 key.
+
+**Cause:** **Not the key.** Verified cryptographically: the stored ed25519 seed
+derives exactly the published `p=` (and the PKCS#8-v2 blob even embeds that same
+pubkey). seed → pubkey → DNS all agree. It's the **known Stalwart dual-signing
+issue** ([discussion #2727](https://github.com/stalwartlabs/stalwart/discussions/2727)):
+when Stalwart applies *both* an ed25519 and an RSA signature, Gmail/Hotmail
+mishandle the ed25519 one (`fail`, or `neutral (no key)`), while RSA passes. The
+maintainer's own server runs with "ed25519 ignored, RSA passes." RSA carries
+DMARC, so **mail is unaffected** — it's cosmetic, just noisy in reports.
+
+How the key was proven (the seed lives in settings table `s`, PKCS#8 v2):
+```bash
+# 32-byte seed from the OCTET STRING in the stored PKCS#8; wrap as clean v0 DER:
+printf '302e020100300506032b657004220420%s' "$SEED_HEX" | xxd -r -p > /tmp/ed.der
+openssl pkey -inform DER -in /tmp/ed.der -pubout -outform DER | tail -c 32 | base64
+# == the DNS p= value → key is correct
+```
+
+**Fix (proper = RSA-only):** the recommended cure is to stop emitting the ed25519
+signature, not republish anything. Two parts:
+1. **DNS (done 2026-06-12):** removed the `v1-ed25519-20260604._domainkey` TXT —
+   turns the report `fail` into a harmless "no key", DMARC still green via RSA.
+2. **Stalwart (still TODO):** disable the ed25519 **signature** in the admin UI /
+   JMAP signing config so outbound stops carrying it (DB surgery on the serialized
+   signature object is risky — do it through the supported surface). The fallback
+   admin can't mint an API token non-interactively (only `authorization_code` /
+   `device_code` grants; no ROPC), so this needs the web UI or a device-code login.
+
+**Aside discovered here:** outbound is a catch-all smarthost relay to
+`mail.tail7b1641.ts.net` (auth `stalwart-relay@waynehayes.com`), which re-emits
+as `mail.waynehayes.com` (`216.189.156.74` / `2602:ffc5:20::1:6b52`). That relay
+IP is why SPF needed `include:waynehayes.com` (#14 / the SPF fix).
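
For the key check in #15, the seed-wrapping step (the `printf` / `xxd` line) can also be sketched in Python. This is an illustrative equivalent only: the function name is hypothetical and the all-zero seed is a placeholder, not a real key. Deriving the public key from the resulting DER would still go through `openssl pkey` as shown in the bash block.

```python
# Fixed 16-byte PKCS#8 v0 header for an Ed25519 private key:
#   SEQUENCE(46) { INTEGER 0, SEQUENCE { OID 1.3.101.112 },
#                  OCTET STRING(34) { OCTET STRING(32) <seed> } }
PKCS8_V0_ED25519_PREFIX = bytes.fromhex("302e020100300506032b657004220420")


def wrap_seed_as_pkcs8_v0(seed_hex: str) -> bytes:
    """Wrap a 32-byte Ed25519 seed (hex string) as PKCS#8 v0 DER bytes."""
    seed = bytes.fromhex(seed_hex)
    if len(seed) != 32:
        raise ValueError(f"expected a 32-byte seed, got {len(seed)} bytes")
    return PKCS8_V0_ED25519_PREFIX + seed


# Placeholder seed for illustration only; substitute the real hex seed.
der = wrap_seed_as_pkcs8_v0("00" * 32)
print(len(der))  # 48 bytes: 16-byte header + 32-byte seed
```

The length check guards against pasting the whole stored blob instead of just the inner 32-byte OCTET STRING, which would otherwise yield a malformed DER file.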