Claude Code local logs double-count
Estimate, not bill. These counts feed an API-equivalent cost model that uses public Anthropic and OpenAI token pricing. Actual owner cost is about $400 per month across Claude Max, ChatGPT Pro, and OpenAI API credits.
If you parse ~/.claude/projects/**/*.jsonl and sum every row with message.usage, you will overcount.
On this corpus, raw Claude Code usage rows were 45,983. Duplicate replays were 21,682. That is 47.2% of the raw rows.
The cause is local logging shape, not double billing.
what duplicates
The duplicate is not the whole JSON object. The outer envelope changes:
- uuid: changes
- parentUuid: changes
- timestamp: usually changes
The usage snapshot stays the same:
- sessionId: same
- requestId: same
- message.model: same
- message.usage.input_tokens: same
- message.usage.output_tokens: same
- message.usage.cache_creation_input_tokens: same
- message.usage.cache_read_input_tokens: same
The current script collapses these replays with one primary dedup key:
message.id
message.id is the right accounting key for the cost pass because it collapses replayed envelopes down to one model call. For forensic work, the usage tuple is the useful cross-check when the local row is still mutating.
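The two keys can be sketched as a pair of small helpers. This is a minimal illustration, not the script's actual code: the function names accounting_key and usage_tuple are hypothetical, and it assumes sessionId and requestId sit at the top level of each row while model and usage sit under message, as the field lists above suggest.

```python
from typing import Any, Optional

# Usage fields that stay identical across replays of one model call.
USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def accounting_key(row: dict[str, Any]) -> Optional[str]:
    """Primary dedup key: message.id, or None when it is missing."""
    msg = row.get("message") or {}
    return msg.get("id") or None


def usage_tuple(row: dict[str, Any]) -> tuple:
    """Forensic cross-check: the fields that do not change between replays."""
    msg = row.get("message") or {}
    usage = msg.get("usage") or {}
    return (
        row.get("sessionId"),
        row.get("requestId"),
        msg.get("model"),
        *(usage.get(field) for field in USAGE_FIELDS),
    )


# Two replays of one call: only the outer envelope differs (values are made up).
first = {
    "uuid": "envelope-a",
    "sessionId": "session-1",
    "requestId": "req_example",
    "message": {
        "id": "msg_example",
        "model": "claude-example",
        "usage": {"input_tokens": 3, "output_tokens": 40},
    },
}
replay = dict(first, uuid="envelope-b")
print(accounting_key(first) == accounting_key(replay))  # same call
print(usage_tuple(first) == usage_tuple(replay))        # same usage snapshot
```

Deduplicating on uuid would keep both rows here, because uuid is the one field that differs.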
what the logger is doing
The local file is an append-only event log of client state, not a ledger of billable requests.
One assistant turn can be written several times as the UI representation changes:
- thinking -> tool_use: 8,608 duplicate replays
- tool_use -> tool_use: 7,624 duplicate replays
- text -> tool_use: 2,505 duplicate replays
- thinking -> text: 2,318 duplicate replays
The common shape is:
- Claude Code emits an assistant message with usage attached.
- The client writes one local row for an intermediate state such as thinking.
- The client writes another local row for the same request when the visible content changes to tool_use or text.
- The second row gets a fresh envelope uuid, but it keeps the same requestId and the same usage numbers.
Example from a real session:
- line 7: assistant row with content = [thinking], requestId = req_011Ca19oPudeF9D7akU9VnL8, usage = (1, 510, 1395, 34071)
- line 8: assistant row with content = [tool_use], same requestId, same usage, new uuid
That is a replay of the same usage envelope. It is not a second model call.
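One way to tally transitions like thinking -> tool_use is to remember the last content kind seen for each dedup key and count every later row that repeats the key. This is a sketch under assumptions: message.content is taken to be a list of blocks that each carry a type field, a row is labeled by its first block, and the helper names content_kind and replay_transitions are hypothetical. The method behind the counts quoted in this post may differ.

```python
from collections import Counter
from typing import Any, Iterable


def content_kind(row: dict[str, Any]) -> str:
    """Label a row by the type of its first content block (an assumption)."""
    content = (row.get("message") or {}).get("content")
    if isinstance(content, list) and content and isinstance(content[0], dict):
        return str(content[0].get("type", "unknown"))
    return "unknown"


def replay_transitions(rows: Iterable[dict[str, Any]]) -> Counter:
    """Count (previous kind, new kind) pairs for rows that repeat a message.id."""
    last_kind: dict[str, str] = {}
    transitions: Counter = Counter()
    for row in rows:  # rows must be in file order
        key = (row.get("message") or {}).get("id")
        if not key:
            continue
        kind = content_kind(row)
        if key in last_kind:
            # Same model call seen again: this row is a replay.
            transitions[(last_kind[key], kind)] += 1
        last_kind[key] = kind
    return transitions


# Made-up rows: one call written three times as its visible content changes.
rows = [
    {"message": {"id": "msg_example", "content": [{"type": "thinking"}]}},
    {"message": {"id": "msg_example", "content": [{"type": "tool_use"}]}},
    {"message": {"id": "msg_example", "content": [{"type": "tool_use"}]}},
]
for (before, after), count in replay_transitions(rows).items():
    print(f"{before} -> {after}: {count}")
```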
Some turns replay more than twice. In this corpus the same dedup key appeared 3 times in 2,692 cases, 4 times in 955 cases, and up to 10 times in a smaller tail. That matches a client repeatedly snapshotting the same request while the assistant message mutates locally.
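A multiplicity breakdown like this can be produced by counting how often each dedup key appears, then counting how many keys share each frequency. The sketch below assumes the keys have already been extracted from the rows; replay_histogram is a hypothetical name.

```python
from collections import Counter
from typing import Hashable, Iterable


def replay_histogram(keys: Iterable[Hashable]) -> Counter:
    """Map 'times a key appeared' to 'number of keys that appeared that often'."""
    per_key = Counter(keys)           # key -> number of rows carrying it
    return Counter(per_key.values())  # multiplicity -> number of keys


# Made-up keys: one call logged three times, one twice, one once.
keys = ["msg_a", "msg_a", "msg_a", "msg_b", "msg_b", "msg_c"]
for times, cases in sorted(replay_histogram(keys).items()):
    print(f"appeared {times}x: {cases} case(s)")
```

A key that appears n times contributes n - 1 duplicate rows, so the histogram also gives a second route to the duplicate total.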
what to count
Count each unique usage envelope once. On the current pass that means 24,301 kept token events out of 45,983 raw usage rows.
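The figures in this post are consistent with each other, which a few lines of arithmetic confirm. The variable names are illustrative only.

```python
raw_rows = 45_983        # Claude Code rows carrying message.usage
duplicate_rows = 21_682  # rows whose dedup key was already seen

kept_rows = raw_rows - duplicate_rows
duplicate_ratio = duplicate_rows / raw_rows

print(kept_rows)                 # 24301
print(f"{duplicate_ratio:.1%}")  # 47.2%
```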
Do not:
- sum every row with message.usage
- dedup by outer uuid
Do:
- restrict to rows with message.model starting with claude-
- dedup on message.id when present, else use the file-path plus line fallback
Python version, runnable from a shell:

uv run --with orjson python -c '
from pathlib import Path
import orjson

seen, total, dup = set(), 0, 0
root = Path.home() / ".claude" / "projects"
for p in root.rglob("*.jsonl"):
    with p.open("rb") as f:
        for lineno, line in enumerate(f, 1):
            try:
                row = orjson.loads(line)
            except Exception:
                continue  # skip blank or malformed lines
            msg = row.get("message") if isinstance(row, dict) else None
            if not isinstance(msg, dict):
                continue
            usage = msg.get("usage") or {}
            model = msg.get("model") or ""
            if not usage or not str(model).startswith("claude-"):
                continue
            total += 1
            # primary key is message.id; fall back to file path plus line number
            key = msg.get("id") or f"{p}:{lineno}"
            dup += key in seen
            seen.add(key)
print({"raw_rows": total, "kept_rows": len(seen), "duplicate_rows": dup,
       "duplicate_ratio": round(dup / total, 4) if total else 0})
'
bottom line
This looks like a client logger issue, or more precisely a client event-model issue.
The local JSONL files are logging state transitions. Those transitions can replay the same usage payload several times under new wrapper envelopes. If you treat the file as an accounting ledger, you will inflate usage.
The Anthropic bill is not implicated by this evidence. The overcount happens in local replayed rows.