Overnight Fleet Report — June 13→14, 2026
Everything the fleet found, fixed, shipped, and flagged overnight. Generated Sunday morning.
Executive summary
The night had one trigger and one theme. The trigger: a real failure on Friday where I (Sage) told you I "couldn't" send the TTC blast — when in fact the Brevo key was in 1Password the whole time, just encoded in a way I failed to decode. The theme that followed: turn that single failure into permanent fleet improvements — a skill so it never gets re-derived, a nightly audit so this class of problem gets caught before it bites, and an efficiency pass that shipped the biggest cost win of the week.
Headline outcomes:
- ~10 Claude sessions/day permanently eliminated (shell-only cron type shipped).
- The credential-decode failure root-caused, skill-captured, and now auto-audited nightly.
- 13GB of disk reclaimed (was at 98%, a crash risk) + a prevention cron so it never creeps back.
- 29 upstream PRs triaged; security fix confirmed already in our fork (no exposure); reliability improvements staged and held for your review.
- Theta-wave system scan: 8/10, fleet 9/9 healthy, zero errors overnight.
1. The internal-workings audit (the "self-audit" you asked for)
You asked Friday for nightly audits that catch internal-workings problems before they bite. I built it that night and it ran its first pass at 3am ET. It scans four things:
1. Repeat-tasks with no skill: procedures done 2+ times that should be one-invoke skills.
2. Credential decode gotchas: keys/secrets with non-obvious handling (base64/JWT-wrapped, encoded fields) that an agent could fumble.
3. Memory contradictions / stale notes: facts that disagree with each other or have gone wrong.
4. Capability-gap failures: recent failures rooted in a missing capability that should become a skill or guardrail.
What this audit was built to catch — the Friday failure, dissected:
- Your TTC Brevo REST API key was stored in 1Password inside a field labeled "MCP Key", as base64-encoded JSON of the form `{"api_key":"xkeysib-..."}`.
- I passed the encoded blob straight in as the API key, got "key not found," and wrongly concluded there was no REST key, then bounced the problem back to you instead of solving it.
- Fix: decode the field first (the one step I skipped). After decoding, full access worked, and it was the same method every prior TTC email used.
- Permanent guardrail: this is now a memory rule ("decode/inspect before declaring a capability gap"), and the nightly audit actively hunts for other credentials with the same trap.
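The decode step above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the function name `unwrap_api_key` and the three shapes it handles are assumptions, and the only details taken from the report are the `api_key` JSON field and the `xkeysib-` prefix.

```python
import base64
import binascii
import json


def unwrap_api_key(field_value: str, json_key: str = "api_key") -> str:
    """Return a usable API key from a stored field value.

    Hypothetical helper. Handles a raw key, a JSON object holding the key,
    and a base64-encoded JSON object (the shape that caused Friday's failure).
    """
    text = field_value.strip()

    # Raw key: nothing to decode.
    if text.startswith("xkeysib-"):
        return text

    # Try base64 first; if that fails, fall back to treating the text as JSON.
    candidates = [text]
    try:
        decoded = base64.b64decode(text, validate=True).decode("utf-8")
        candidates.insert(0, decoded)
    except (binascii.Error, UnicodeDecodeError):
        pass

    for candidate in candidates:
        try:
            payload = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(payload, dict) and json_key in payload:
            return payload[json_key]

    # Only now is it fair to report a gap, and the error says what was tried.
    raise ValueError("field is not a raw key, JSON, or base64-encoded JSON")
```

The point of the design is the order of operations: every plausible decoding is attempted before the function reports that no key exists, which mirrors the "decode/inspect before declaring a capability gap" rule.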
2. New skill: ttc-wave-send
A TTC email blast happens 2+ times a week and I was re-deriving it each time (and fumbling it Friday). It's now a skill available to Sage, Scribe, and Mercury. It bakes in the whole proven process:
- Decode the Brevo key (the exact step that failed Friday).
- Build from the real red-banner sent template (not the blue draft snapshots).
- Send in segmented engaged-first waves, all within a single day (the timing you corrected me on).
- Draft → verify recipient count → your go → sendNow. Never single-blast cold.
- Hard naming rule baked in: never the word "intensive."
This morning's blast was its first real-world run — clean.
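The wave planning and the recipient-count check could look something like the sketch below. It assumes each recipient carries a numeric engagement score and that waves start at even intervals inside the send window. `Recipient`, `plan_waves`, and `verify_count` are hypothetical names, and nothing here calls the Brevo API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Recipient:
    email: str
    engagement: float  # hypothetical score; higher means more engaged


def plan_waves(recipients, n_waves, start, window):
    """Split recipients into engaged-first waves spread across one window."""
    if n_waves < 1:
        raise ValueError("n_waves must be at least 1")
    ranked = sorted(recipients, key=lambda r: r.engagement, reverse=True)
    size = -(-len(ranked) // n_waves)  # ceiling division
    gap = window / n_waves  # waves start at even intervals inside the window
    waves = []
    for i in range(n_waves):
        batch = ranked[i * size:(i + 1) * size]
        if batch:
            waves.append((start + i * gap, batch))
    return waves


def verify_count(waves, expected_total):
    """Guard for the 'verify recipient count' step, run before any send."""
    planned = sum(len(batch) for _, batch in waves)
    if planned != expected_total:
        raise ValueError(f"planned {planned} recipients, expected {expected_total}")
    return planned
```

Calling `plan_waves(recipients, 5, start, timedelta(hours=12))` keeps every wave inside a single 12-hour window, and `verify_count` is meant to run before asking for the go-ahead, so a count mismatch stops the send before it starts.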
3. Efficiency improvements (cheapest-capable-layer pass)
Atlas ran the nightly efficiency audit; I approved with overnight-safety gating (no risky daemon deploys while you slept).
| ID | Change | Status | Impact |
|---|---|---|---|
| A1 | Shell-only cron type → 10 prune-inbox crons converted off Claude sessions | SHIPPED (commit b6386171) | ~10 Claude sessions/day eliminated — biggest win of the week |
| A2 | Embedding-quality cycle: Haiku reads the JSON instead of a full Claude session | Dispatched to Librarian | Removes a daily full-Claude session for a deterministic measurement |
| A3 | Scribe heartbeat 1h→3h | Already 3h (no-op) | — |
| A4 | qwen2.5:7b heartbeat prefilter (skip Claude session when inbox not actionable) | Spec only — deploy HELD | Fleet-wide session savings; held for daylight validation (silence-risk) |
A1 alone is the single highest-leverage determinism fix in the fleet — those 10 sessions did zero reasoning, just ran one command.
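As a rough illustration of the shell-only idea, a dispatcher can branch on a job type and run deterministic commands directly. The job schema (`type`, `command`, `prompt`, `timeout`) and the `start_agent_session` callback are assumptions made for this sketch, not the daemon's real interface.

```python
import shlex
import subprocess


def run_job(job, start_agent_session=None):
    """Run one cron job on the cheapest layer that can handle it."""
    if job["type"] == "shell":
        # Deterministic command: run it directly, with no model session.
        result = subprocess.run(
            shlex.split(job["command"]),
            capture_output=True,
            text=True,
            timeout=job.get("timeout", 60),
        )
        return {
            "layer": "shell",
            "ok": result.returncode == 0,
            "output": result.stdout,
        }

    if start_agent_session is None:
        raise RuntimeError("agent job requires a session launcher")
    # Anything that needs reasoning still goes to an agent session.
    return {
        "layer": "agent",
        "ok": True,
        "output": start_agent_session(job["prompt"]),
    }
```

A prune-inbox job would then be declared as something like `{"type": "shell", "command": "..."}`, and it never reaches the agent branch.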
4. Reliability & infrastructure
Disk crisis averted. Apollo flagged the Data volume at 98% (3.9GB free) — a real crash risk (build failures, and disk pressure has corrupted SQLite for us before). Forge safely reclaimed 13GB → 91% / 19GB free:
- The npm cache alone was 7.9GB; the rest came from uv/pip/brew caches, old playwright/puppeteer, rotated logs, app DMGs, duplicate whisper models, and a duplicate PDF.
- Kept all actively-used models; confirmed with Hippocrates before removing his worktree.
- Prevention: added a weekly cache-clean shell cron (no Claude session) so the caches can't push the disk back toward 98%.
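A threshold check that could sit alongside the cache-clean cron might look like this. It uses the standard-library `shutil.disk_usage`; the function names and the 95% default threshold are assumptions for illustration, not values from the fleet config.

```python
import shutil


def percent_used(total_bytes, free_bytes):
    """Percentage of the volume in use, rounded to one decimal place."""
    return round(100 * (total_bytes - free_bytes) / total_bytes, 1)


def disk_alert(path="/", threshold=95.0):
    """Return a warning string when usage reaches the threshold, else None."""
    usage = shutil.disk_usage(path)
    used = percent_used(usage.total, usage.free)
    if used >= threshold:
        return f"{path} at {used}% used, {usage.free / 1e9:.1f} GB free"
    return None
```

Like the cache-clean job itself, this runs without a Claude session, so the check adds no session cost.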
Theta-wave system scan: 8/10 (hold). Fleet 9/9 healthy, 0 errors overnight, restart loop sealed, KB freeze resolved. Atlas also caught that the Librarian embedding cycle had converged on measuring the problem rather than fixing it (15/15 keeps, escalation firing 8+ cycles with no corpus action) — redirected to corpus-action verification.
Upstream PR triage. Librarian flagged 29 high-value PRs from the open-source project. Forge triaged all 29:
- The PTY-injection security fix (#606) is already in our fork, so there is no exposure. (This was the only potentially-urgent one; cleared.)
- 5 reliability improvements (per-agent watchdog threshold, spawn-verify + retry + alert, Claude Code 2.1.x compat, telegram long-poll, MCP-degraded boot alert) cherry-picked to a test branch, 27/27 tests passing.
- One trivial safe fix (#631, a date-format error in our heartbeat logs) merged to main.
- The 5 reliability PRs are STAGED AND HELD for your review: they touch 944 lines of the daemon's spawn/process/telegram core, which is too high a blast radius to merge to the live daemon on a weekend without your eyes. One nod and Forge merges with a daylight watch.
5. Skill-quality proposals (Librarian, for your approval)
Librarian ran a skill-quality experiment (12 invocations, 5 skills, 0 routing errors) and proposed 4 fixes:
1. stats-report: add triggers (the ad-hoc "clinic stats" path has none today).
2. icloud-calendar: add triggers (10 fleet copies, none have triggers).
3. agent-browser: add triggers across all 10 copies (currently relying on description alone).
4. cron-only skills: add a `mode:cron-only` marker to suppress false 0% scorecard scores (birthday-alerts, daily-checkin, zach-fiona-checkin, monarch-weekly).
All proposals-only, nothing auto-applied. Low-risk; recommend approving #4 especially (false 0% scores are noise).
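To show why proposal #4 removes noise, here is a small sketch of a scorecard that skips skills carrying the marker. The skill record shape (`name`, `mode`, `invocations`, `trigger_hits`) is assumed for the example and may not match Librarian's real scorecard.

```python
def scorecard(skills):
    """Trigger-hit rate per skill, skipping skills marked mode: cron-only."""
    scores = {}
    for skill in skills:
        if skill.get("mode") == "cron-only":
            # Never invoked by triggers, so a 0% trigger score would be noise.
            continue
        invocations = skill.get("invocations", 0)
        hits = skill.get("trigger_hits", 0)
        scores[skill["name"]] = round(100 * hits / invocations) if invocations else 0
    return scores
```

With the marker in place, a skill such as birthday-alerts simply drops out of the trigger scorecard instead of showing a misleading 0%.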
6. What's waiting on you (the decision queue)
Revenue / today:
- TTC blast: RUNNING. Full list (~8,692), 5 staggered waves within 12h, in progress.
- Apollo laser launch: window opens tomorrow 6/15; Dr. Saylor owes 5 decisions; Apollo leading her 8am brief as today-or-miss (she self-clears).
One-nod approvals:
- Reliability PR branch: 5 daemon improvements, tested, staged, ready to merge on your go.
- Librarian's 4 skill-quality fixes.
- Statin-alternatives KB sourcing: needs your direction on which sources (health-claim sensitivity, like the vaccine corpus).
- A4 qwen prefilter: validate on one agent in daylight before fleet-wide.
Unblock-the-fleet (credentials):
- TTC Brevo REST key → 1Password (so I'm not decoding it from a mislabeled field).
- A2P 10DLC: your EIN + legal entity name (unblocks ALL fleet SMS; Mercury fully stalled).
- Screen Recording for Ghostty (1 click), Availity 2FA, PX password, Monarch MFA.
Personal:
- Fiona / Maine trip: pick a weekend (first weekend Oct 2026).
Bottom line
One Friday failure became: a skill, a nightly audit that already ran, the week's biggest efficiency win, a disk crisis caught and prevented, and a security verification — all while the fleet stayed 9/9 healthy with zero errors. The two money moves (TTC blast, Apollo laser) are both live. The only things between here and a fully-unblocked fleet are the credentials and one-nod approvals above.