Benchmark measurements pending

No numbers are published on this page because no methodology-compliant run has been completed yet. When results are available, they will appear here with full methodology disclosure.

Tokens per second (sustained)
Pending first compliant run
Joules per token (wall power)
Pending wall-meter measurement
Tokens per watt-hour
Derived from above two metrics
METHODOLOGY

How we measure.

Every published number will meet these requirements. If a number does not come from a compliant run, it will not appear on this page.

Token accounting
Prompt and completion tokens come directly from the inference engine's usage response field. Character or word counts are not used.
Power measurement
Public numbers use whole-node wall power (wall_power_W) from an external clamp meter or smart plug. Apple SMC package power is recorded separately as an internal reference and is clearly labelled as such if shown. A run without a wall meter is marked smc_only and not published publicly.
Fixed conditions
Every run records: hardware model and chip, OS version, runtime version, model identifier and quantization, prompt suite name and version, batch size, warm-up request count, power source, ambient temperature if available, and git SHA of Saturn-Control and Saturn-Node. Any change to these conditions produces a distinct run.
Sustained throughput, not peak
Published figures are mean sustained values over a fixed prompt suite. Peak figures are not published. Standard deviation is reported alongside the mean when the sample size allows it.
What we do not claim
We do not compare against third-party systems unless using identical methodology on identical hardware. We do not extrapolate from one model or workload to another. We do not publish estimated or adjusted numbers.
TOOLING

How measurements are collected.

A benchmark harness (saturn-bench) is in development as a SwiftPM executable. It dispatches each prompt in a fixed suite to Saturn-Control's /v1/chat/completions endpoint, records per-request timing and token usage, and writes a versioned JSON result file. Wall-power sampling is recorded separately by the operator during the run.

RESULT FILE SCHEMA (illustrative — not a real measurement)
{
  "schemaVersion": 1,
  "status": "dry_run",
  "hardware": { "model": "<hardware model>", "chip": "<chip>", "ramGB": null },
  "software": { "modelID": "<model-id-and-quantization>", "saturnControlSHA": "<sha>" },
  "promptSuite": { "name": "<suite-name-and-version>", "promptCount": null },
  "conditions": { "powerSource": "wall_power_W", "batchSize": 1 },
  "results": {
    "tokensPerSecond": null,
    "meanWatts": null,
    "joulesPerToken": null,
    "tokensPerWattHour": null
  },
  "notes": "Pending first compliant run."
}