Overnight Fleet Report — June 13→14, 2026

Everything the fleet found, fixed, shipped, and flagged overnight. Generated Sunday morning.


Executive summary

The night had one trigger and one theme. The trigger: a real failure on Friday where I (Sage) told you I "couldn't" send the TTC blast — when in fact the Brevo key was in 1Password the whole time, just encoded in a way I failed to decode. The theme that followed: turn that single failure into permanent fleet improvements — a skill so it never gets re-derived, a nightly audit so this class of problem gets caught before it bites, and an efficiency pass that shipped the biggest cost win of the week.

Headline outcomes:

  • ~10 Claude sessions/day permanently eliminated (shell-only cron type shipped).
  • The credential-decode failure root-caused, skill-captured, and now auto-audited nightly.
  • 13GB of disk reclaimed (was at 98%, a crash risk) + a prevention cron so it never creeps back.
  • 29 upstream PRs triaged; security fix confirmed already in our fork (no exposure); reliability improvements staged and held for your review.
  • Theta-wave system scan: 8/10, fleet 9/9 healthy, zero errors overnight.

1. The internal-workings audit (the "self-audit" you asked for)

You asked Friday for nightly audits that catch internal-workings problems before they bite. I built it that night and it ran its first pass at 3am ET. It scans four things:

  1. Repeat-tasks with no skill — procedures done 2+ times that should be one-invoke skills.
  2. Credential decode gotchas — keys/secrets with non-obvious handling (base64/JWT-wrapped, encoded fields) that an agent could fumble.
  3. Memory contradictions / stale notes — facts that disagree with each other or have gone wrong.
  4. Capability-gap failures — recent failures rooted in a missing capability that should become a skill or guardrail.
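The "credential decode gotchas" check (item 2) could be approximated with a heuristic like the one below. This is only a sketch, not the audit's actual code: the function name classify_secret and the three labels are hypothetical, and the JWT test is a loose pattern match (three dot-separated base64url segments starting with "eyJ"), not a signature check.

```python
import base64
import binascii
import json
import re

# Loose JWT shape: header.payload.signature, header starting with base64url of '{"'.
JWT_RE = re.compile(r"^eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*$")


def classify_secret(value: str) -> str:
    """Return 'jwt', 'base64-json', or 'plain' for a stored secret string."""
    value = value.strip()
    if JWT_RE.match(value):
        return "jwt"
    try:
        # validate=True rejects characters outside the standard base64 alphabet.
        decoded = base64.b64decode(value, validate=True).decode("utf-8")
        parsed = json.loads(decoded)
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return "plain"
    # Only flag values that unwrap to a JSON object; anything else is treated as plain.
    return "base64-json" if isinstance(parsed, dict) else "plain"
```

Anything not classified as 'plain' would be a candidate for a note in the audit output saying the field needs a decode step before use.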

What this audit was built to catch — the Friday failure, dissected:

  • Your TTC Brevo REST API key was stored in 1Password base64-encoded inside a field labeled "MCP Key" as {"api_key":"xkeysib-..."}.
  • I passed the encoded blob straight in as the API key, got "key not found," and wrongly concluded there was no REST key — then bounced the problem back to you instead of solving it.
  • Fix: decode the field first (one step I skipped). After decoding, full access worked — and it was the same method every prior TTC email used.
  • Permanent guardrail: this is now a memory rule ("decode/inspect before declaring a capability gap"), and the nightly audit actively hunts for other credentials with the same trap.

2. New skill: ttc-wave-send

A TTC email blast happens 2+ times a week and I was re-deriving it each time (and fumbling it Friday). It's now a skill available to Sage, Scribe, and Mercury. It bakes in the whole proven process:

  • Decode the Brevo key (the exact step that failed Friday).
  • Build from the real red-banner sent template (not the blue draft snapshots).
  • Send in segmented engaged-first waves, all within a single day (the timing you corrected me on).
  • Draft → verify recipient count → your go → sendNow. Never single-blast cold.
  • Hard naming rule baked in: never the word "intensive."
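The wave-planning and count-verification steps above could look roughly like this. It is a sketch under assumptions, not the skill's real code: the Recipient and Wave types, the numeric engagement score, and the even spacing of waves are all hypothetical, and no Brevo call is made.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Recipient:
    email: str
    engagement: float  # hypothetical score; higher means more engaged


@dataclass
class Wave:
    send_at: datetime
    recipients: list


def plan_waves(recipients, n_waves, start, window):
    """Split recipients into engaged-first waves spread across `window`."""
    if n_waves < 1:
        raise ValueError("n_waves must be at least 1")
    ordered = sorted(recipients, key=lambda r: r.engagement, reverse=True)
    size, extra = divmod(len(ordered), n_waves)
    gap = window / n_waves  # the last wave starts before the window closes
    waves, cursor = [], 0
    for i in range(n_waves):
        count = size + (1 if i < extra else 0)
        waves.append(Wave(start + i * gap, ordered[cursor:cursor + count]))
        cursor += count
    return waves


def verify_count(waves, expected):
    """Gate before sending: the planned total must match the list size."""
    return sum(len(w.recipients) for w in waves) == expected


# Demo: 10 placeholder recipients, 3 waves inside a 12-hour window.
people = [Recipient(f"user{i}@example.com", engagement=i) for i in range(10)]
waves = plan_waves(people, 3, datetime(2026, 6, 14, 8, 0), timedelta(hours=12))
```

Keeping the count check as a separate function makes it easy to require a passing check plus an explicit go before anything is sent.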

This morning's blast was its first real-world run — clean.


3. Efficiency improvements (cheapest-capable-layer pass)

Atlas ran the nightly efficiency audit; I approved with overnight-safety gating (no risky daemon deploys while you slept).

ID | Change | Status | Impact
A1 | Shell-only cron type → 10 prune-inbox crons converted off Claude sessions | SHIPPED (commit b6386171) | ~10 Claude sessions/day eliminated — biggest win of the week
A2 | Embedding-quality cycle: Haiku reads JSON instead of full Claude | Dispatched to Librarian | Removes a daily full-Claude session for a deterministic measurement
A3 | Scribe heartbeat 1h→3h | Already 3h (no-op) | None
A4 | qwen2.5:7b heartbeat prefilter (skip Claude session when inbox not actionable) | Spec only — deploy HELD | Fleet-wide session savings; held for daylight validation (silence-risk)

A1 alone is the single highest-leverage determinism fix in the fleet — those 10 sessions did zero reasoning, just ran one command.
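To illustrate the idea behind A1, here is a minimal sketch of a dispatcher that runs shell-type jobs directly and only falls back to an agent session for everything else. The CronJob type, the "shell"/"agent" kind values, and the start_agent_session callback are hypothetical names for illustration; the shipped implementation in commit b6386171 may be structured differently.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class CronJob:
    name: str
    kind: str     # hypothetical: "shell" runs a command, anything else needs an agent
    command: str  # command line for shell jobs; ignored otherwise


def run_job(job, start_agent_session, timeout_s=300):
    """Run shell jobs as a plain subprocess; hand other jobs to the agent callback."""
    if job.kind == "shell":
        # No model session involved: one deterministic command, one exit code.
        result = subprocess.run(
            shlex.split(job.command),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return result.returncode
    return start_agent_session(job)
```

The exit code is the only thing returned for shell jobs, so a non-zero result can feed whatever alerting the scheduler already has.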


4. Reliability & infrastructure

Disk crisis averted. Apollo flagged the Data volume at 98% (3.9GB free) — a real crash risk (build failures, and disk pressure has corrupted SQLite for us before). Forge safely reclaimed 13GB → 91% / 19GB free:

  • npm cache alone was 7.9GB; plus uv/pip/brew caches, old playwright/puppeteer, rotated logs, app DMGs, duplicate whisper models, a duplicate PDF.
  • Kept all actively-used models; confirmed with Hippocrates before removing his worktree.
  • Prevention: added a weekly cache-clean shell cron (no Claude session) so npm cache can't creep back to 98%.
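A disk-pressure check that such a prevention cron could run before cleaning might look like the sketch below. The 95% threshold and the function names are assumptions for illustration; the actual cron may simply clean on a schedule without checking usage first.

```python
import shutil


def percent_used(total: int, free: int) -> float:
    """Percentage of the volume in use, computed from total and free bytes."""
    return 100 * (total - free) / total


def needs_cleanup(path: str = "/", threshold_pct: float = 95.0) -> bool:
    """True when the volume holding `path` is at or above the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return percent_used(usage.total, usage.free) >= threshold_pct


if __name__ == "__main__":
    print("cleanup needed" if needs_cleanup() else "disk ok")
```

Using total minus free, rather than the reported used figure, counts reserved space as occupied, which errs on the side of cleaning earlier.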

Theta-wave system scan: 8/10 (hold). Fleet 9/9 healthy, 0 errors overnight, restart loop sealed, KB freeze resolved. Atlas also caught that the Librarian embedding cycle had converged on measuring the problem rather than fixing it (15/15 keeps, escalation firing 8+ cycles with no corpus action) — redirected to corpus-action verification.

Upstream PR triage. Librarian flagged 29 high-value PRs from the open-source project. Forge triaged all 29:

  • The PTY-injection security fix (#606) is already in our fork — no exposure. (This was the only potentially-urgent one; cleared.)
  • 5 reliability improvements (per-agent watchdog threshold, spawn-verify + retry + alert, Claude Code 2.1.x compat, telegram long-poll, MCP-degraded boot alert) cherry-picked to a test branch, 27/27 tests passing.
  • One trivial safe fix (#631, a date-format error in our heartbeat logs) merged to main.
  • The 5 reliability PRs are STAGED AND HELD for your review — they touch 944 lines of the daemon's spawn/process/telegram core, too high-blast-radius to merge to the live daemon on a weekend without your eyes. One nod and Forge merges with a daylight watch.

5. Skill-quality proposals (Librarian, for your approval)

Librarian ran a skill-quality experiment (12 invocations, 5 skills, 0 routing errors) and proposed 4 fixes:

  1. stats-report — add triggers (the ad-hoc "clinic stats" path has none today).
  2. icloud-calendar — add triggers (10 fleet copies, none have triggers).
  3. agent-browser — add triggers across all 10 copies (currently relying on description alone).
  4. cron-only skills — add a mode:cron-only marker to suppress false 0% scorecard scores (birthday-alerts, daily-checkin, zach-fiona-checkin, monarch-weekly).

All proposals-only, nothing auto-applied. Low-risk; recommend approving #4 especially (false 0% scores are noise).
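As a rough illustration of proposal #4, a scorecard that honors a cron-only marker could skip those skills instead of scoring them 0%. This is a sketch only: the SkillStats type, its fields, and the routing-success formula are assumptions, since the real scorecard's inputs are not described here.

```python
from dataclasses import dataclass


@dataclass
class SkillStats:
    name: str
    invocations: int            # interactive routing attempts observed
    routed_ok: int              # attempts that reached the right skill
    mode: str = "interactive"   # "cron-only" mirrors the proposed marker


def scorecard(skills):
    """Map skill name to a routing score (percent), omitting cron-only skills."""
    scores = {}
    for s in skills:
        if s.mode == "cron-only":
            continue  # never routed interactively, so a 0% here would be noise
        scores[s.name] = 100 * s.routed_ok / s.invocations if s.invocations else 0.0
    return scores
```

With this shape, a skill such as birthday-alerts simply drops out of the scorecard, while an interactive skill with zero invocations still shows up as 0% and stays visible.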


6. What's waiting on you (the decision queue)

Revenue / today:

  • TTC blast — RUNNING. Full list (~8,692), 5 staggered waves within 12h, in progress.
  • Apollo laser launch — window opens tomorrow 6/15; Dr. Saylor owes 5 decisions; Apollo leading her 8am brief as today-or-miss (she self-clears).

One-nod approvals:

  • Reliability PR branch — 5 daemon improvements, tested, staged, ready to merge on your go.
  • Librarian's 4 skill-quality fixes.
  • Statin-alternatives KB sourcing — needs your direction on which sources (health-claim sensitivity, like the vaccine corpus).
  • A4 qwen prefilter — validate on one agent in daylight before fleet-wide.

Unblock-the-fleet (credentials):

  • TTC Brevo REST key → 1Password (so I'm not decoding it from a mislabeled field).
  • A2P 10DLC: your EIN + legal entity name (unblocks ALL fleet SMS — Mercury fully stalled).
  • Screen Recording for Ghostty (1 click), Availity 2FA, PX password, Monarch MFA.

Personal:

  • Fiona / Maine trip — pick a weekend (first weekend Oct 2026).

Bottom line

One Friday failure became: a skill, a nightly audit that already ran, the week's biggest efficiency win, a disk crisis caught and prevented, and a security verification — all while the fleet stayed 9/9 healthy with zero errors. The two money moves (TTC blast, Apollo laser) are both live. The only things between here and a fully-unblocked fleet are the credentials and one-nod approvals above.