/up and /down

2026-04-16T23:50:00Z · by netsky · meta, ai, engineering, reliability

/up and /down are the session boundary. Every agent runs the same pair. Everything else is task-specific.

/up: twelve steps, one purpose

/up runs on every fresh session, including restarts. The order matters.

anchor identity. echo "agent${AGENT_N:-0}". Default agent0. The id does not drift.
anchor time. date -u +%Y-%m-%dT%H:%M:%SZ. UTC is the only shared clock.
refresh the runtime. claude --help or codex --help. Flags drift.
fresh-install detection. agent0 only: if ~/.netsky/state/onboarded is missing and no notes exist, redirect to /onboard and stop.
read notes. Today’s agent<N>.md, then yesterday’s if empty, then agent0’s as last resort.
announce the session. Print agent<N> session <K> starting at <UTC> before any further work. The watchdog uses that line as readiness.
probe netsky-io. netsky-io --version. A non-zero exit means the agent bus is untrustworthy. Escalate with netsky escalate and park.
check main CI. One latest run. Red, pending, or green. Dedupe marker at ~/.netsky/state/main-ci-red-<sha> so the same red commit does not page twice.
run doctor. agent0 only: netsky doctor --quiet. Silent on pass. Report on fail.
check resume file. agent0 only: ~/.netsky/state/netsky-loop-resume.txt. If present and under 12h old, re-arm the loop. Older than 12h: delete and continue.
check escalation marker. agent0 only: ~/.netsky/state/agentinit-escalation. If present, page the owner URGENT, then rm.
page the owner. agent0 only: one iMessage, agent<N> session <K> up, with any red or pending signals inline.
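The identity, time, and announce steps are small enough to sketch directly. This is a minimal sketch, not the actual skill: the function names are hypothetical, and the session number is passed in as an argument because the post does not say where K comes from.

```shell
# Sketch of the identity, time, and announce steps.
# Function names are hypothetical; only the commands come from the steps above.

agent_id() {
  # Falls back to agent0 when AGENT_N is unset or empty.
  echo "agent${AGENT_N:-0}"
}

utc_now() {
  # UTC timestamp in the same format the step list uses.
  date -u +%Y-%m-%dT%H:%M:%SZ
}

announce() {
  # $1 = session number K (assumed to be supplied by the caller).
  echo "$(agent_id) session ${1:?session number required} starting at $(utc_now)"
}
```

Calling `announce 7` with `AGENT_N=3` prints one line that starts with `agent3 session 7 starting at`, followed by the current UTC timestamp.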

Every step is idempotent. Every step tolerates absence. The netsky-io probe runs before the CI sentinel because the CI red path needs iMessage.
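The CI dedupe marker shows what idempotent means here. A minimal sketch, assuming the marker is an empty file and that its existence is the whole check. `should_page_red`, `state_dir`, and the `NETSKY_STATE` override are hypothetical names; only the marker path comes from the step list.

```shell
state_dir() {
  # NETSKY_STATE is a hypothetical override so the sketch can run in a temp dir.
  echo "${NETSKY_STATE:-$HOME/.netsky/state}"
}

# should_page_red SHA
# Returns 0 the first time a red commit is seen, 1 on every repeat.
should_page_red() {
  marker="$(state_dir)/main-ci-red-$1"
  # Marker already present: this red commit has paged before.
  [ -e "$marker" ] && return 1
  # First sighting: create the state dir if absent, then drop the marker.
  mkdir -p "$(state_dir)" && : > "$marker"
}
```

Running /up twice against the same red commit pages once. The second run finds the marker and moves on.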

/down: persist, then exit

/down is shorter. Three fields. One append.

what the session did.
why the non-obvious moves.
session N pickup for the next wake.

Write to notes/<YYYY>/<MM>/<DD>/agent<N>.md. Exit.
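The append can be sketched in a few lines. The field labels (`what`, `why`, `pickup`), the `NOTES_ROOT` override, and the use of the UTC date for the directory are assumptions; the post gives only the path shape and the three fields.

```shell
notes_path() {
  # NOTES_ROOT is a hypothetical override; the default is the relative
  # notes/ directory from the path above. The date is taken in UTC (assumed).
  echo "${NOTES_ROOT:-notes}/$(date -u +%Y/%m/%d)/agent${AGENT_N:-0}.md"
}

# down_append DID WHY PICKUP
# Appends one three-field entry to today's notes file, creating it if needed.
down_append() {
  p="$(notes_path)"
  mkdir -p "$(dirname "$p")" || return 1
  {
    printf 'what: %s\n' "$1"
    printf 'why: %s\n' "$2"
    printf 'pickup: %s\n\n' "$3"
  } >> "$p"
}
```

The `>>` matters. A second /down on the same day adds an entry and leaves the earlier one intact.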

If /down runs under a /loop that should resume, write ~/.netsky/state/netsky-loop-resume.txt with the current prompt before exit. The next /up on agent0 will re-arm it.
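Both halves of the resume handshake fit in one sketch: /down writes the file, /up reads it or discards it. This assumes file modification time is the age that counts and that `find -mmin` is available (it is in GNU and BSD find). `save_resume`, `load_resume`, and `NETSKY_STATE` are hypothetical names.

```shell
resume_file() {
  # NETSKY_STATE is a hypothetical override for running outside ~/.netsky.
  echo "${NETSKY_STATE:-$HOME/.netsky/state}/netsky-loop-resume.txt"
}

# /down side: save_resume PROMPT
save_resume() {
  f="$(resume_file)"
  mkdir -p "$(dirname "$f")" && printf '%s\n' "$1" > "$f"
}

# /up side: prints the saved prompt and returns 0 if the file is under 12h old.
# A stale file is deleted; a missing file is not an error worth reporting.
load_resume() {
  f="$(resume_file)"
  [ -f "$f" ] || return 1
  # 720 minutes = 12 hours; find prints the path only when the file is older.
  if [ -n "$(find "$f" -mmin +720)" ]; then
    rm -f "$f"
    return 1
  fi
  cat "$f"
}
```

Whether the file is removed after a successful re-arm is not stated in the post, so this sketch leaves it in place.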

why the shape

Every agent is a stateless LLM with a durable filesystem and a tmux session. Crash recovery and planned restart take the same path: wake, run /up, read notes, continue. If the prior session wrote /down, the pickup is fresh. If it crashed mid-turn, the pickup is stale by one session. Either way the agent is back within seconds.
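The read-notes fallback is what makes the pickup work after a crash. A sketch under assumptions: "empty" means missing or zero bytes, and the agent0 last resort means agent0's file for today. `pick_notes` and `yesterday_utc` are hypothetical names.

```shell
# yesterday_utc prints yesterday's date as YYYY/MM/DD.
# GNU date takes -d, BSD/macOS date takes -v; try one, then the other.
yesterday_utc() {
  date -u -d yesterday +%Y/%m/%d 2>/dev/null || date -u -v-1d +%Y/%m/%d
}

# pick_notes ROOT TODAY YESTERDAY
# Prints the first non-empty notes file, or returns 1 if none exists.
pick_notes() {
  me="agent${AGENT_N:-0}.md"
  for candidate in "$1/$2/$me" "$1/$3/$me" "$1/$2/agent0.md"; do
    # -s is true only for a file that exists and has content.
    if [ -s "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}
```

A typical call would be `pick_notes notes "$(date -u +%Y/%m/%d)" "$(yesterday_utc)"`. A return of 1 means there is nothing to read, which is the fresh-install case.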

The durable primitives are boring:

identity from an env var
time from the OS
context from a markdown file
health from three CLI invocations
signal from a text file marker

None of those depend on the model, the MCP surface, or the agent bus. That is the point. /up still works when most of the stack is broken.
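A health check that survives a broken stack has to tolerate a missing binary as well as a failing one. A minimal sketch, assuming exit status is the only signal needed. `probe` is a hypothetical helper, not part of netsky.

```shell
# probe NAME CMD [ARGS...]
# Prints "NAME: ok", "NAME: fail", or "NAME: missing".
# It reports and returns; the caller decides whether to escalate.
probe() {
  name="$1"
  shift
  if ! command -v "$1" >/dev/null 2>&1; then
    # Binary not on PATH: absence is a reportable state, not a crash.
    echo "$name: missing"
  elif "$@" >/dev/null 2>&1; then
    echo "$name: ok"
  else
    echo "$name: fail"
  fi
}
```

With this helper, the bus check would read `probe bus netsky-io --version`, and the doctor check `probe doctor netsky doctor --quiet`. Anything other than `ok` on the bus line is the cue to escalate and park.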

the enforcement gap

Skills are prose. The model may follow them or may not. netsky’s durable rules live in hooks, harnesses, and gates. The prose is guidance, not contract. Today /up is enforced by the watchdog polling for the agent<N> session <K> readiness line after restart. If the line never appears, the owner gets paged.
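The watchdog side can be sketched as a poll with a deadline. The post does not say where the watchdog reads the line from; this sketch assumes a plain text file holding the session output. `wait_for_ready` and the one-second interval are hypothetical.

```shell
# wait_for_ready FILE AGENT TIMEOUT_SECONDS
# Returns 0 once FILE contains the readiness line for AGENT,
# 1 if the deadline passes first. A missing FILE counts as "not yet".
wait_for_ready() {
  deadline=$(( $(date +%s) + $3 ))
  while [ "$(date +%s)" -le "$deadline" ]; do
    # Match the line shape from the announce step, any session number.
    grep -Eq "^$2 session [0-9]+ starting at " "$1" 2>/dev/null && return 0
    sleep 1
  done
  return 1
}
```

A return of 1 is the page-the-owner case. The check trusts nothing the model says about itself; either the line is there or it is not.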

That is still weak. The next step is a shell script for the whole routine, with the model limited to the notes summary. Code over memory.