Tracerator

Baseline trace source

Trace scenario

Real Mooncake baseline bundled with Tracerator.

Parameters

Scale 1.0x

Input length multiplier 1.0x

Output length multiplier 1.0x

Reuse bias (sharing) 0.50

Higher → more simulated prefix reuse / cache hits

New sessions 0

Extra independent sessions added

Modeled mix 0.00

Blend in additional modeled requests

ISL distribution

Reshapes input-token buckets for prefill and KV reuse studies

Seed

For reproducible output

KV cache planning

Preview computes policy curves from the generated trace

Model

KV precision

Indexer precision

Tensor parallel

Block size

Warmup fraction

Capacity GiB points

Include draft KV cache
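
One way to picture a policy curve over capacity points is to replay the trace's block references through a simulated cache at each capacity and record the hit ratio. The sketch below assumes an LRU policy, which is a stand-in: the policies Tracerator actually evaluates are not stated here. The names gib_to_blocks, lru_hit_ratio, policy_curve, and the bytes_per_block parameter are hypothetical. Warmup fraction and draft KV cache are not modeled.

```python
from collections import OrderedDict


def gib_to_blocks(capacity_gib, bytes_per_block):
    """Convert a capacity in GiB to a whole number of KV blocks.

    bytes_per_block is an assumed input; in practice it depends on the
    model, KV precision, tensor parallelism, and block size.
    """
    return int(capacity_gib * 2**30 // bytes_per_block)


def lru_hit_ratio(block_stream, capacity_blocks):
    """Replay block IDs through an LRU cache and return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block_id in block_stream:
        total += 1
        if block_id in cache:
            hits += 1
            cache.move_to_end(block_id)  # mark as most recently used
        elif capacity_blocks > 0:
            cache[block_id] = True
            if len(cache) > capacity_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


def policy_curve(block_stream, capacities_in_blocks):
    """Map each capacity (in blocks) to its simulated LRU hit ratio."""
    stream = list(block_stream)  # allow replaying the same stream per capacity
    return {cap: lru_hit_ratio(stream, cap) for cap in capacities_in_blocks}
```

Calling policy_curve with a flattened list of block IDs and a few capacities gives one hit ratio per capacity point, which is the shape of data a capacity curve plots.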

Live estimates

Instant feedback; the backend recomputes exact values on generate

Est. requests

12031

Total lines in trace.jsonl

Est. time span

3537s

Rough trace duration from first to last request

Est. hit ratio

50%

Derived from reuse bias

Est. peak concurrency

Maximum concurrent in-flight requests (scaled)
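
The request count and time span can be checked against a generated trace with a few lines of code. This is a minimal sketch that assumes each record carries a numeric timestamp field in milliseconds, as in the public Mooncake trace format. The field name and unit are assumptions about Tracerator's output, and trace_summary is a hypothetical helper.

```python
def trace_summary(records, ts_field="timestamp"):
    """Return request count and first-to-last span in seconds.

    Assumes ts_field holds milliseconds (an assumption, not confirmed
    by the Tracerator docs).
    """
    timestamps = [rec[ts_field] for rec in records]
    if not timestamps:
        return {"requests": 0, "time_span_s": 0.0}
    span_ms = max(timestamps) - min(timestamps)
    return {"requests": len(timestamps), "time_span_s": span_ms / 1000.0}
```

Peak concurrency is left out because it depends on per-request service time, which a trace of arrivals alone does not contain.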

Estimated ISL distribution

Input length buckets that will be written into trace.jsonl

Est. zip size ≈ 12 MB. trace.jsonl is JSON Lines (one record per line). Files larger than 10–20 MB are normal at higher scales, and most GUI editors (TextEdit, basic JSON viewers) will fail or hang on them. Use head, jq, pandas.read_json(..., lines=True), or VS Code instead. The generated zip includes a README.txt with examples.
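
Because each line is an independent JSON record, a large trace can be inspected by streaming it rather than loading it whole. The sketch below uses only the Python standard library. The helper names head_jsonl and count_records are hypothetical, and the path is whatever location the zip was extracted to.

```python
import itertools
import json
from pathlib import Path


def head_jsonl(path, n=5):
    """Return the first n records of a JSON Lines file without reading the rest."""
    with Path(path).open("r", encoding="utf-8") as fh:
        non_blank = (line for line in fh if line.strip())  # tolerate blank lines
        return [json.loads(line) for line in itertools.islice(non_blank, n)]


def count_records(path):
    """Count non-blank lines, i.e. the number of records in the file."""
    with Path(path).open("r", encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())
```

For example, head_jsonl("trace.jsonl", 5) returns a list of up to five dicts, and count_records("trace.jsonl") can be compared with the estimated request count.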

The zip contains trace.jsonl, manifest.json, isl_distribution.svg/txt, and README.txt.

Manifest + sample

manifest.json

First 5 trace lines (sample)

Generated ISL distribution

Exact bucket split from the generated trace preview
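
A bucket split like this can be recomputed from the trace by binning each request's input length. The sketch below is illustrative only: the bucket edges are hypothetical, not the ones Tracerator uses, and the caller is assumed to have extracted input lengths from the records (for instance from an input_length field, as in the public Mooncake format).

```python
import bisect
from collections import Counter

# Hypothetical bucket edges in tokens; Tracerator's real edges may differ.
BUCKET_EDGES = [1024, 4096, 16384, 65536]


def bucket_label(index, edges):
    """Human-readable label for the bucket at the given index."""
    if index == 0:
        return f"<{edges[0]}"
    if index == len(edges):
        return f">={edges[-1]}"
    return f"{edges[index - 1]}-{edges[index] - 1}"


def isl_distribution(input_lengths, edges=BUCKET_EDGES):
    """Return the fraction of requests falling in each input-length bucket."""
    lengths = list(input_lengths)
    # bisect_right puts a length equal to an edge into the bucket that starts there.
    counts = Counter(bisect.bisect_right(edges, n) for n in lengths)
    total = len(lengths)
    return {
        bucket_label(i, edges): (counts.get(i, 0) / total if total else 0.0)
        for i in range(len(edges) + 1)
    }
```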

KV cache planning

Working set

Unique blocks

Reuse ceiling

Peak memory
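
As a rough illustration of how unique blocks and a reuse ceiling could be derived from a trace, the sketch below counts distinct block IDs against total block references. It assumes each record lists its KV block IDs in a hash_ids field, as in the public Mooncake format. The definition of the ceiling used here (the hit ratio of an unbounded cache) is an assumption, not Tracerator's documented formula, and block_reuse_summary is a hypothetical helper.

```python
def block_reuse_summary(records, blocks_field="hash_ids"):
    """Summarize block-level reuse across a sequence of trace records.

    reuse_ceiling is the fraction of block references that repeat an
    earlier block, i.e. the hit ratio an infinitely large cache would see.
    """
    seen = set()
    total_refs = 0
    for rec in records:
        block_ids = rec.get(blocks_field, [])
        total_refs += len(block_ids)
        seen.update(block_ids)
    unique_blocks = len(seen)
    ceiling = 1.0 - unique_blocks / total_refs if total_refs else 0.0
    return {
        "total_block_refs": total_refs,
        "unique_blocks": unique_blocks,
        "reuse_ceiling": ceiling,
    }
```

Multiplying unique_blocks by a per-block byte size would give an upper bound on the memory needed to hold every block at once. That byte size depends on the model and precision settings and is not computed here.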

Use the zip for your modeling or replay pipeline. The manifest records the exact parameters and derived stats. trace.jsonl is JSON Lines, and large files are normal. Open it programmatically (pandas.read_json(..., lines=True), jq, head) or in a code editor; GUI text editors usually cannot open multi-MB JSONL files.

Validate outputs with AIPerf (bursts + KV prefix sharing): see docs/VALIDATING_WITH_AIPERF.md for full instructions and ./scripts/validate-with-aiperf.sh. Companion stack: https://github.com/discoposse/aiperf-toolkit

The live numbers are client-side approximations that mirror the backend formulas. Generation always uses the authoritative server-side logic seeded for repeatability. Generated and augmented traces are simulations for planning and replay experiments; they preserve selected workload characteristics but do not necessarily represent an exact production workload profile. See README for details and the full parameter contract.