where the bits flow

Netsky is a viable AI system modeled on Stafford Beer’s Viable System Model (VSM), and its topology is physical: tmux sessions, filesystem inboxes, git branches, SQLite rows, watchdog markers, cross-machine envelopes.

The base prompt names the VSM split directly — S5 policy in prompt layers, S4 intelligence in notes and meta-work, S3 control in agent0, S3* audit in spot checks, S2 coordination in workspaces, S1 operations in clones, agentinfinity as watchdog (src/crates/netsky-prompts/prompts/base.md:20-28).

roles

agent0 is the root orchestrator. It owns meta-work in the in-tree checkout, talks to the owner, dispatches clones, audits output, and harvests landed work (src/crates/netsky-prompts/prompts/base.md:32-38).

Clones are agent1..agentN, bounded workers with the same capability and distinct identity. Their work lives in fresh workspaces/<task>/repo clones on dedicated branches (src/crates/netsky-prompts/prompts/base.md:33-38).

agentinfinity is the watchdog, sitting outside the normal work loop. It keeps agent0 alive, handles planned restart, handles crash recovery, and pages on red signals (src/crates/netsky-prompts/prompts/base.md:34).

netsky-ticker is the 60s driver. Its loop runs netsky watchdog tick, netsky cron tick, netsky loop tick, then records the tick detail (src/crates/netsky-cli/src/cmd/tick.rs:134-164). Launchd is the 120s failsafe (src/crates/netsky-core/src/consts.rs:249-264).

The separation earns its keep when one piece fails: agent0 can be busy, wrong, or dead and clones still push branches; the watchdog revives the root without being inside it; the ticker keeps time without trusting a model session to remember time.

flowchart TD
    owner[owner]
    a0[agent0: S3 control]
    clones[agent1..N: S1 work]
    wd[agentinfinity: watchdog]
    ticker[netsky-ticker]
    fs[filesystem bus]
    git[git branches + main]
    db[meta.db + JSONL]
    iroh[iroh peers]

    owner <-- iMessage --> a0
    a0 <-- envelopes --> fs
    fs <-- envelopes --> clones
    clones --> git
    a0 --> git
    ticker --> wd
    wd --> a0
    wd --> db
    fs --> db
    iroh <-- QUIC/TLS --> fs

edges

The owner edge terminates at agent0 in the root constellation. The base prompt says owner communications go to netsky0 only, and sibling constellations relay down through the root rather than paging the owner directly (src/crates/netsky-prompts/prompts/base.md:40). Operational updates are batched over iMessage, and a send that times out is retried with read-first semantics (src/crates/netsky-prompts/prompts/base.md:16).

The clone edge is a file-backed bus. Inbound envelopes land in ~/.netsky/channels/agent/agent<N>/inbox/, and netsky channel send writes one envelope to a target inbox (src/crates/netsky-prompts/prompts/base.md:42-46). Drain claims by atomic rename from inbox/ to claimed/, emits the wrapper text, then archives in delivered/ (src/crates/netsky-cli/src/cmd/channel.rs:7-18, src/crates/netsky-cli/src/cmd/channel.rs:398-508).
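The claim-by-rename step is what makes the drain safe with more than one consumer. A minimal Python sketch of that pattern follows; the real drain lives in the Rust CLI, and the function name, envelope filenames, and return shape here are assumptions for illustration only.

```python
import os
import tempfile
from pathlib import Path


def drain(channel: Path) -> list[str]:
    """Claim, emit, and archive every envelope currently in inbox/."""
    inbox, claimed, delivered = (channel / d for d in ("inbox", "claimed", "delivered"))
    for d in (inbox, claimed, delivered):
        d.mkdir(parents=True, exist_ok=True)

    emitted = []
    for name in sorted(os.listdir(inbox)):
        try:
            # rename within one filesystem is atomic: exactly one drainer wins
            os.rename(inbox / name, claimed / name)
        except FileNotFoundError:
            continue  # another drainer claimed this envelope first
        emitted.append((claimed / name).read_text())
        # archive only after the body has been emitted
        os.rename(claimed / name, delivered / name)
    return emitted


# Usage with hypothetical envelope names: files drain in filename order.
channel = Path(tempfile.mkdtemp())
(channel / "inbox").mkdir()
(channel / "inbox" / "0001.json").write_text("first")
(channel / "inbox" / "0002.json").write_text("second")
print(drain(channel))
```

A crash between the two renames leaves the envelope in claimed/, which is why that directory works as the bus fallback.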

The runtime edge stays neutral. Runtimes are slots. A Codex-backed agent uses an extra outbox forwarder that rewrites replies into agent0’s inbox, but the bus shape remains the same (src/crates/netsky-cli/src/cmd/channel.rs:144-220).

The landing edge is git, not GitHub pull requests. Netsky work happens on branches in workspaces. agent0 fetches the clone branch, cherry-picks FETCH_HEAD, records landed_sha, and closes the task (src/crates/netsky-prompts/prompts/base.md:10-15, src/crates/netsky-prompts/prompts/base.md:113, src/crates/netsky-cli/src/cmd/task.rs:646-700).

The cross-machine edge is iroh. The source uses QUIC + TLS 1.3, verifies the remote EndpointId against an allowlist before reading payload bytes, then writes the inbound envelope into the same inbox-shaped channel tree (src/crates/netsky-io/src/sources/iroh/mod.rs:1-16, src/crates/netsky-io/src/sources/iroh/mod.rs:384-445, src/crates/netsky-io/src/sources/iroh/mod.rs:540-596).
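The ordering matters here: identity is checked before any payload byte is read. A small Python sketch of that gate, with no iroh involved; `accept`, the peer ids, and the `read_payload` callback are hypothetical stand-ins for the connection handling in the Rust source.

```python
from typing import Callable, Optional


def accept(remote_id: str, allowlist: set[str],
           read_payload: Callable[[], bytes]) -> Optional[bytes]:
    """Return the payload only for allowlisted peers; otherwise read nothing."""
    if remote_id not in allowlist:
        return None  # reject before touching the stream
    return read_payload()


# Usage: count how often the payload is actually read.
reads = []
payload = lambda: reads.append(1) or b"envelope"
print(accept("peer-b", {"peer-a"}, payload), len(reads))
print(accept("peer-a", {"peer-a"}, payload), len(reads))
```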

The observability edge is meta.db plus JSONL fallback. The database records messages, CLI invocations, ticks, workspaces, clone dispatches, harvest events, directives, token usage, and watchdog events (src/crates/netsky-db/README.md:27-44, src/crates/netsky-db/README.md:58-78). Failed writes spool to ~/.netsky/logs/meta-db-errors-<date>.jsonl (src/crates/netsky-db/README.md:5-10, src/crates/netsky-db/src/lib.rs:2560-2583).
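The fallback behaviour can be sketched in a few lines of Python. This is an illustration of the pattern, not the netsky-db code: the two-column `messages` schema, the 0.1s timeout, the ISO date in the filename, and the JSON record shape are all assumptions.

```python
import json
import sqlite3
import tempfile
from datetime import date
from pathlib import Path


def record_event(db_path: Path, log_dir: Path, source: str, body: str) -> bool:
    """Insert one row; on any SQLite failure, spool the event to a dated JSONL file."""
    try:
        conn = sqlite3.connect(db_path, timeout=0.1)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO messages (source, body) VALUES (?, ?)",
                    (source, body),
                )
        finally:
            conn.close()
        return True
    except sqlite3.Error as err:
        log_dir.mkdir(parents=True, exist_ok=True)
        spool = log_dir / f"meta-db-errors-{date.today().isoformat()}.jsonl"
        with spool.open("a") as fh:
            record = {"source": source, "body": body, "error": str(err)}
            fh.write(json.dumps(record) + "\n")
        return False


# Usage: a database without the table forces the fallback path.
work = Path(tempfile.mkdtemp())
print(record_event(work / "empty.db", work / "logs", "agent0", "gate is green"))
```

The caller never sees an exception, so an observability failure cannot take down the command that triggered it.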

netsky channel send agent7 "gate is green. harvest next." --from agent0
netsky task harvest 92
netsky query "SELECT source, COUNT(*) AS n FROM messages GROUP BY source"

loop inventory

| loop | closed by | latency | failure mode |
| --- | --- | --- | --- |
| owner to root | iMessage to agent0, then milestone updates back | human-scale | owner-visible state is summary-only unless promoted to a durable event |
| root to clone | filesystem envelope, ack, branch, final report | seconds to minutes | best-effort ordering across producers can reorder causal intent |
| clone to main | pushed branch, FETCH_HEAD, cherry-pick, landed_sha | minutes | harvest is visible after agent0 acts, not when clone finishes |
| watchdog to root | ticker tick, pane hash, state markers, restart status | 60s nominal | pane-reading still carries some liveness truth |
| cross-machine | iroh allowlist, encrypted envelope, local inbox write | seconds | peer identity is durable, but owner visibility still routes through root |
| observability | meta.db, DataFusion snapshots, JSONL fallback | write-time | lock contention degrades into side logs |

These loops survive because their endpoints are files, branches, or rows, and each has a fallback that outlives the model session — claimed/ for the bus, markers for restart, JSONL for observability, commits for git.

where loops break

meta.db contention is the live weak edge. The stack uses WAL and busy timeouts, and skips schema DDL when user_version already matches (src/crates/netsky-db/README.md:5-10, src/crates/netsky-db/src/lib.rs:2208-2229), which makes contention legal rather than free. Short commands still pay when they synchronously record observability. The --version path returns before record_cli_invocation fires, while normal commands record at process exit (src/crates/netsky-cli/src/main.rs:1-19, src/crates/netsky-cli/src/observability.rs:18-43).
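The three mitigations named above map onto standard SQLite pragmas. A Python sketch of an open path that applies them; the version number, the 5s timeout, and the table definition are assumptions, and the real logic is in netsky-db.

```python
import sqlite3
import tempfile
from pathlib import Path

SCHEMA_VERSION = 1  # hypothetical; the real value lives in netsky-db


def open_db(path: Path) -> tuple[sqlite3.Connection, bool]:
    """Open the database; return (connection, whether schema DDL ran)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")   # readers no longer block the writer
    conn.execute("PRAGMA busy_timeout=5000")  # wait on a lock instead of failing at once
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    if version == SCHEMA_VERSION:
        return conn, False  # fast path: schema is current, skip all DDL
    conn.execute("CREATE TABLE IF NOT EXISTS messages (source TEXT, body TEXT)")
    conn.execute(f"PRAGMA user_version={SCHEMA_VERSION}")
    conn.commit()
    return conn, True


# Usage: only the first open pays for DDL.
db = Path(tempfile.mkdtemp()) / "meta.db"
first, ran_first = open_db(db)
first.close()
second, ran_second = open_db(db)
second.close()
print(ran_first, ran_second)
```

WAL still allows only one writer at a time, so the busy timeout bounds the wait rather than removing it.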

State tombstones accumulate because markers are cheap — the state directory carries hang markers, crashloop counters, restart status, failed-revive counters (src/crates/netsky-prompts/prompts/base.md:83-88, src/crates/netsky-core/src/consts.rs:308-333, src/crates/netsky-core/src/consts.rs:385-396). That’s the right failure posture, but pruning has to live inside the protocol, not outside as housekeeping.

Agent-bus ordering is explicit best-effort. Filenames sort by wall-clock nanoseconds, so clock skew or concurrent producers can deliver envelopes out of causal order; consumers that need stronger ordering embed a sequence number in the body (src/crates/netsky-cli/src/cmd/channel.rs:26-33). Fine for status and briefs, thin for multi-step protocols.
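A consumer that needs causal order can ignore the filename and sort on the embedded sequence number. A Python sketch under assumed conventions: the JSON body, the `seq` field starting at 1, and the filenames are hypothetical, not the netsky envelope format.

```python
import json


def ordered_steps(envelopes: dict[str, str]) -> list[dict]:
    """Return one producer's bodies in seq order; fail on a gap or duplicate."""
    bodies = sorted((json.loads(text) for text in envelopes.values()),
                    key=lambda body: body["seq"])
    expected = list(range(1, len(bodies) + 1))
    if [body["seq"] for body in bodies] != expected:
        raise ValueError("gap or duplicate in sequence; wait or re-request")
    return bodies


# Usage: skewed clocks put step 2 ahead of step 1 in filename order.
skewed = {
    "1700000000000000001-agent0.json": '{"seq": 2, "step": "cherry-pick"}',
    "1700000000000000002-agent0.json": '{"seq": 1, "step": "fetch"}',
}
print([body["step"] for body in ordered_steps(skewed)])
```

The gap check is the part that matters for multi-step protocols: a missing step becomes an error instead of a silent skip.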

Some loops still depend on pane-reading. The watchdog tracks pane hashes for hang detection, writes suspect markers when output stays stable, clears them when it moves (src/crates/netsky-core/src/consts.rs:308-338). Useful signal, not a durable edge.
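The hang check reduces to comparing a hash of the pane between ticks. A Python sketch of that comparison; the marker filenames and the use of SHA-256 are assumptions, and the real thresholds and marker names are in netsky-core.

```python
import hashlib
import tempfile
from pathlib import Path


def check_pane(state_dir: Path, pane_text: str) -> bool:
    """Return True when the pane is suspect: output unchanged since the last check."""
    hash_file = state_dir / "pane-hash"    # hypothetical marker name
    suspect = state_dir / "hang-suspect"   # hypothetical marker name
    digest = hashlib.sha256(pane_text.encode()).hexdigest()
    previous = hash_file.read_text() if hash_file.exists() else None
    hash_file.write_text(digest)
    if digest == previous:
        suspect.touch()  # stable output: leave a marker for the next tick
        return True
    suspect.unlink(missing_ok=True)  # output moved: clear any earlier suspicion
    return False


# Usage: the same pane text on two consecutive ticks raises the marker.
state = Path(tempfile.mkdtemp())
print(check_pane(state, "compiling..."), check_pane(state, "compiling..."))
```

The sketch also shows the weakness: an idle but healthy agent produces the same hash as a hung one.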

three structural changes

First, add one envelope ledger. The current bus archives by directory transition and records some communication events. A ledger would give every envelope one durable lifecycle row: written, claimed, delivered, poisoned, forwarded, acked. The database already has communication and message tables that can carry the shape (src/crates/netsky-db/README.md:29-44, src/crates/netsky-cli/src/cmd/channel.rs:510).
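One possible shape for the ledger, sketched in Python against an in-memory SQLite database. The six states come from the list above; the table name, the column names, and the transition graph are assumptions about a design that does not exist yet.

```python
import sqlite3

# Hypothetical lifecycle graph: which states may follow which.
NEXT = {
    None: {"written"},
    "written": {"claimed"},
    "claimed": {"delivered", "poisoned"},
    "delivered": {"forwarded", "acked"},
    "forwarded": {"acked"},
}


def open_ledger() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE envelope_ledger (envelope_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def advance(conn: sqlite3.Connection, envelope_id: str, new_state: str) -> None:
    """Move one envelope to new_state, rejecting transitions the graph forbids."""
    row = conn.execute(
        "SELECT state FROM envelope_ledger WHERE envelope_id = ?", (envelope_id,)
    ).fetchone()
    current = row[0] if row else None
    if new_state not in NEXT.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new_state}")
    with conn:  # one row per envelope, overwritten as it advances
        conn.execute(
            "INSERT OR REPLACE INTO envelope_ledger (envelope_id, state) VALUES (?, ?)",
            (envelope_id, new_state),
        )


# Usage: walk one envelope through the happy path.
ledger = open_ledger()
for state in ("written", "claimed", "delivered", "acked"):
    advance(ledger, "env-1", state)
print(ledger.execute("SELECT * FROM envelope_ledger").fetchall())
```

A variant that appends one row per transition instead of overwriting would keep the full history at the cost of more writes against an already contended database.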

Second, add an owner-visibility edge that is not pane-derived. Today, the owner sees milestone text and escalations. The system should promote “clone finished”, “harvest started”, “harvest landed”, “watchdog repaired root”, and “gate failed” into owner-visible events. The policy already requires milestone iMessages during orchestration (src/crates/netsky-prompts/prompts/base.md:66). The edge should be driven by durable events, not by whether a pane happens to say the right thing.

Third, emit a harvest-complete bus event. Harvest writes task state and prints harvested task {id} sha={sha} into main (src/crates/netsky-cli/src/cmd/task.rs:674-700), which closes the loop for the database and the human watching the terminal but leaves clones and sibling constellations unaware that the branch is dead and main carries the landed SHA.
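The event could ride the existing bus as one more envelope per interested inbox. A Python sketch of that fan-out; the event body, the filename scheme, and the temp-file-then-rename publish step are assumptions, not current netsky behaviour.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def emit_harvest_complete(channels_root: Path, targets: list[str],
                          task_id: int, sha: str) -> list[Path]:
    """Write one harvest-complete envelope into each target's inbox."""
    body = json.dumps({"kind": "harvest-complete", "task": task_id, "landed_sha": sha})
    written = []
    for target in targets:
        inbox = channels_root / target / "inbox"
        inbox.mkdir(parents=True, exist_ok=True)
        name = f"{time.time_ns()}-harvest.json"  # nanosecond prefix, as on the bus
        staging = inbox.parent / f".{name}.tmp"
        staging.write_text(body)
        # publish by rename so a drainer never sees a half-written file
        os.rename(staging, inbox / name)
        written.append(inbox / name)
    return written


# Usage with hypothetical targets and SHA.
root = Path(tempfile.mkdtemp())
for path in emit_harvest_complete(root, ["agent1", "agent2"], 92, "abc123"):
    print(path.relative_to(root).parts[:2])
```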

what’s next

The control loops have physical surfaces — a brief is an envelope, a clone is a tmux session with a workspace, a landing is a cherry-pick, a restart is a marker plus a detached child plus a later tick. The fragile loops are the ones still reading panes or inferring state from side effects. Promote the key signals to durable edges and the system gets less magical without getting smaller.