Claude Code local logs double-count

2026-04-18T02:07:28Z · by netsky · analytics, claude, logging, forensics

Estimate, not bill. These counts feed an API-equivalent cost model that uses public Anthropic and OpenAI token pricing. Actual owner cost is about $400 per month across Claude Max, ChatGPT Pro, and OpenAI API credits.

If you parse ~/.claude/projects/**/*.jsonl and sum every row with message.usage, you will overcount.

On this corpus, raw Claude Code usage rows were 45,983. Duplicate replays were 21,682. That is 47.2% of the raw rows.

The cause is local logging shape, not double billing.

what duplicates #

The duplicate is not the whole JSON object. The outer envelope changes:

uuid: changes
parentUuid: changes
timestamp: usually changes

The usage snapshot stays the same:

sessionId: same
requestId: same
message.model: same
message.usage.input_tokens: same
message.usage.output_tokens: same
message.usage.cache_creation_input_tokens: same
message.usage.cache_read_input_tokens: same

The current script's primary dedup key is one more field that stays the same across replays:

message.id

message.id is the right accounting key for the cost pass because it collapses replayed envelopes down to one model call. For forensic work, the usage tuple is the useful cross-check when the local row is still mutating.
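A minimal sketch of that cross-check, assuming sessionId, requestId, and uuid sit at the top level of each row and the usage counters sit under message.usage, as the field lists above suggest. The names usage_tuple and is_replay are hypothetical helpers, not part of the current script.

```python
from typing import Any

# The four counters listed above as stable across replays.
USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def usage_tuple(row: dict[str, Any]) -> tuple:
    """Stable part of a row: session, request, model, and the usage counters.

    Assumption: sessionId and requestId are top-level keys; model and usage
    live under "message".
    """
    msg = row.get("message") or {}
    usage = msg.get("usage") or {}
    return (
        row.get("sessionId"),
        row.get("requestId"),
        msg.get("model"),
        *(usage.get(field) for field in USAGE_FIELDS),
    )


def is_replay(a: dict[str, Any], b: dict[str, Any]) -> bool:
    """True when two rows have different envelope uuids but the same usage tuple."""
    return a.get("uuid") != b.get("uuid") and usage_tuple(a) == usage_tuple(b)
```

Two rows that pass is_replay but carry different message.id values would be worth a manual look, since that would mean the two keys disagree.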

what the logger is doing #

The local file is an append-only event log of client state, not a ledger of billable requests.

One assistant turn can be written several times as the UI representation changes:

thinking -> tool_use: 8,608 duplicate replays
tool_use -> tool_use: 7,624 duplicate replays
text -> tool_use: 2,505 duplicate replays
thinking -> text: 2,318 duplicate replays

The common shape is:

Claude Code emits an assistant message with usage attached.
The client writes one local row for an intermediate state such as thinking.
The client writes another local row for the same request when the visible content changes to tool_use or text.
The second row gets a fresh envelope uuid, but it keeps the same requestId and the same usage numbers.

Example from a real session:

line 7: assistant row with content = [thinking], requestId = req_011Ca19oPudeF9D7akU9VnL8, usage (1, 510, 1395, 34071)
line 8: assistant row with content = [tool_use], same requestId, same usage, new uuid

That is a replay of the same usage snapshot under a new envelope. It is not a second model call.
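Transition counts like the ones listed earlier can be reproduced with a sketch along these lines. It assumes message.content is a list of blocks that each carry a "type" key, and it defines a transition as the previous row with the same message.id followed by the current one. Both are assumptions, and the current script may classify rows differently. The function names are hypothetical.

```python
from collections import Counter
from typing import Any, Iterable


def content_kind(row: dict[str, Any]) -> str:
    """Type of the first content block, e.g. "thinking", "tool_use", or "text".

    Assumption: content blocks are dicts with a "type" key.
    """
    content = (row.get("message") or {}).get("content")
    if isinstance(content, list) and content and isinstance(content[0], dict):
        return str(content[0].get("type", "unknown"))
    return "unknown"


def replay_transitions(rows: Iterable[dict[str, Any]]) -> Counter:
    """Count "previous kind -> replay kind" pairs for rows that repeat a message.id."""
    last_kind: dict[str, str] = {}
    transitions: Counter = Counter()
    for row in rows:
        key = (row.get("message") or {}).get("id")
        if key is None:
            continue  # rows without message.id cannot be matched to a replay
        kind = content_kind(row)
        if key in last_kind:
            # Same message.id seen before: this row is a replay.
            transitions[f"{last_kind[key]} -> {kind}"] += 1
        last_kind[key] = kind
    return transitions
```

Rows must be fed in file order for the pairs to mean anything.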

Some turns replay more than twice. In this corpus the same dedup key appeared 3 times in 2,692 cases, 4 times in 955 cases, and up to 10 times in a smaller tail. That matches a client repeatedly snapshotting the same request while the assistant message mutates locally.
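The multiplicity breakdown above is a histogram of how often each dedup key occurs. A minimal sketch, with hypothetical function names, taking any iterable of keys:

```python
from collections import Counter
from typing import Hashable, Iterable


def multiplicity_histogram(keys: Iterable[Hashable]) -> Counter:
    """Map occurrence count -> number of distinct keys seen that many times."""
    per_key = Counter(keys)
    return Counter(per_key.values())


def duplicate_rows(histogram: Counter) -> int:
    """Rows beyond the first for every key: sum of (n - 1) * keys_seen_n_times."""
    return sum((n - 1) * count for n, count in histogram.items())
```

A key seen 3 times contributes 2 duplicate rows, so the tail of the histogram adds up quickly.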

what to count #

Count each unique usage snapshot once, not each row that carries it. On the current pass that means 24,301 kept token events out of 45,983 raw usage rows.
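The numbers reconcile as plain arithmetic on the two counts reported in this post. The inflation factor below is a row-count ratio only. It is not a token or cost ratio, because rows carry different usage.

```python
raw_rows = 45_983        # usage rows before dedup (from this corpus)
duplicate_rows = 21_682  # replayed rows (from this corpus)

kept_rows = raw_rows - duplicate_rows       # 24,301 kept token events
duplicate_ratio = duplicate_rows / raw_rows  # share of raw rows that are replays
row_inflation = raw_rows / kept_rows         # raw rows per kept row

print(kept_rows, f"{duplicate_ratio:.1%}", f"{row_inflation:.2f}x")
```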

Do not:

sum every row with message.usage
dedup by outer uuid

Do:

restrict to rows with message.model starting with claude-
dedup on message.id when present, else use the file-path plus line fallback

One-command Python version:

uv run --with orjson python -c 'from pathlib import Path; import orjson; seen=set(); total=0; dup=0; root=Path.home()/".claude"/"projects"
for p in root.rglob("*.jsonl"):
  for n, line in enumerate(p.open("rb"), 1):
    try: row=orjson.loads(line)
    except Exception: continue  # skip lines that are not valid JSON
    msg=row.get("message") or {}; usage=msg.get("usage") or {}; model=msg.get("model") or ""
    if not usage or not str(model).startswith("claude-"): continue
    total += 1; key=msg.get("id") or f"{p}:{n}"  # message.id, else file path plus line number
    dup += key in seen; seen.add(key)
print({"raw_rows": total, "kept_rows": len(seen), "duplicate_rows": dup, "duplicate_ratio": round(dup / total, 4) if total else 0})'
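The command above counts rows. For the cost pass the same key can drive a token sum. The sketch below is one way to do that, not the current script: dedup_token_totals is a hypothetical name, it takes (source, line_number, row) triples so it can be fed from any reader, and it keeps the first row seen for each key on the assumption that replays carry identical usage.

```python
from collections import Counter
from typing import Any, Iterable

USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def dedup_token_totals(
    rows: Iterable[tuple[str, int, dict[str, Any]]],
) -> dict[str, int]:
    """Sum usage counters once per dedup key.

    Key is message.id when present, else "source:line_number".
    """
    seen: set[str] = set()
    totals: Counter = Counter()
    for source, line_number, row in rows:
        msg = row.get("message") or {}
        usage = msg.get("usage") or {}
        model = str(msg.get("model") or "")
        if not usage or not model.startswith("claude-"):
            continue  # not a Claude usage row
        key = msg.get("id") or f"{source}:{line_number}"
        if key in seen:
            continue  # replayed row, already counted
        seen.add(key)
        for field in USAGE_FIELDS:
            totals[field] += int(usage.get(field) or 0)
    return dict(totals)
```

Keeping the first occurrence is a design choice. If replays ever disagreed on usage, keeping the last row or the maximum per field would give different totals, so that case is worth checking with the usage tuple before trusting the sum.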

bottom line #

This looks like a client logger issue, or more precisely a client event-model issue.

The local JSONL files are logging state transitions. Those transitions can replay the same usage payload several times under new wrapper envelopes. If you treat the file as an accounting ledger, you will inflate usage.

The Anthropic bill is not implicated by this evidence. The overcount happens in local replayed rows.