Benchmarks

Agentic accuracy and token efficiency on ShortcutBench, plus engineering targets for Mog's Rust/WebAssembly compute core.

Agentic Benchmarks

Mog is designed to be operated by AI agents. We measure API quality using ShortcutBench (v25), a benchmark suite that evaluates how accurately and efficiently LLM agents complete spreadsheet tasks through each SDK.

Accuracy on ShortcutBench (v25)

End-to-end accuracy of AI agents completing spreadsheet tasks via each SDK's API surface.

Mog: 0.81
OfficeJS: 0.79
SpreadJS: 0.79

ShortcutBench v25 measures how accurately an LLM agent can complete common spreadsheet operations (formatting, formulas, charts, pivot tables) through each SDK. Higher is better.
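As a rough illustration of how an accuracy score of this kind can be computed, the sketch below assumes accuracy is simply the fraction of tasks the agent completes correctly. ShortcutBench's actual scoring rules are not described here and may differ. The `TaskResult` type, the task names, and the outcomes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    task_id: str   # hypothetical task identifier
    passed: bool   # whether the agent's final workbook met the task's checks


def accuracy(results: list[TaskResult]) -> float:
    """Return the fraction of tasks passed, in the range [0, 1]."""
    if not results:
        raise ValueError("no results to score")
    return sum(r.passed for r in results) / len(results)


# Made-up outcomes for illustration only; these are not benchmark data.
results = [
    TaskResult("format-header", True),
    TaskResult("sum-formula", True),
    TaskResult("bar-chart", False),
    TaskResult("pivot-by-region", True),
]

print(f"{accuracy(results):.2f}")  # prints 0.75 (3 of 4 tasks passed)
```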

Token Efficiency vs Competitors

Average tokens required to complete the same spreadsheet task, normalized to Mog (lower is better).

Mog: 1.0x (baseline)
OfficeJS: 1.9x
SpreadJS: 2.1x

Mog's API requires fewer tokens per task on average. SpreadJS requires ~2.1x the tokens and OfficeJS ~1.9x to accomplish the same operations.
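The normalization above can be sketched as follows, assuming the per-SDK average is taken first and then divided by Mog's average. The helper name `normalize_to_baseline` is hypothetical, and the absolute token counts are placeholders chosen only so that the ratios match the published 1.9x and 2.1x figures; they are not measured values.

```python
def normalize_to_baseline(avg_tokens: dict[str, float], baseline: str) -> dict[str, float]:
    """Divide each SDK's average token count by the baseline SDK's average."""
    base = avg_tokens[baseline]
    if base <= 0:
        raise ValueError("baseline average must be positive")
    return {sdk: tokens / base for sdk, tokens in avg_tokens.items()}


# Placeholder averages (NOT measured data), picked to reproduce the published ratios.
avg_tokens = {"Mog": 1000.0, "OfficeJS": 1900.0, "SpreadJS": 2100.0}

ratios = normalize_to_baseline(avg_tokens, "Mog")
for sdk, ratio in ratios.items():
    print(f"{sdk}: {ratio:.1f}x")
# prints:
# Mog: 1.0x
# OfficeJS: 1.9x
# SpreadJS: 2.1x
```

By construction the baseline always comes out at 1.0x, which is why the chart carries no separate measurement for Mog.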

Performance Targets

The figures below are engineering targets based on the Rust/WASM architecture, not verified measurements. Reproducible benchmark harnesses will be published in the public repository once the measurements are verified.

Recalculation (10K formulas)

Full dependency-graph recalculation of 10,000 formulas

Mog: < 15 ms
Excel: N/A
Google Sheets: N/A

Target based on Rust/WASM compute architecture. Competitor comparisons will be added after independent benchmarking.
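For readers unfamiliar with the term, a full dependency-graph recalculation evaluates every formula after all of the cells it reads. The sketch below shows that general idea using a topological sort from Python's standard library. It is not Mog's implementation (which is written in Rust), and the `recalculate` function and the formula representation are assumptions made purely for illustration.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical representation: each formula cell maps to
# (the cells it reads, a function that computes its value from them).
Formula = tuple[list[str], Callable[..., float]]


def recalculate(inputs: dict[str, float], formulas: dict[str, Formula]) -> dict[str, float]:
    """Evaluate every formula once, in an order where dependencies come first."""
    values = dict(inputs)
    # Map each formula cell to the set of cells it depends on.
    deps = {cell: set(reads) for cell, (reads, _) in formulas.items()}
    # static_order() yields dependencies before dependents and raises
    # graphlib.CycleError if the formulas contain a circular reference.
    for cell in TopologicalSorter(deps).static_order():
        if cell in formulas:  # plain input cells already have values
            reads, fn = formulas[cell]
            values[cell] = fn(*(values[r] for r in reads))
    return values


inputs = {"A1": 2.0, "A2": 3.0}
formulas = {
    "C1": (["B1"], lambda b1: b1 * 10),           # C1 = B1 * 10
    "B1": (["A1", "A2"], lambda a1, a2: a1 + a2),  # B1 = A1 + A2
}

values = recalculate(inputs, formulas)
print(values["B1"], values["C1"])  # prints 5.0 50.0
```

Note that `C1` is declared before `B1` but is still evaluated after it, because the ordering comes from the dependency graph rather than from declaration order.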

XLSX parse (10MB file)

Time to parse and render a 10MB .xlsx file

Mog: < 1.5 s
Excel: N/A
Google Sheets: N/A

Target based on Rust XLSX parser compiled to WASM. Actual numbers will vary by file complexity.

Cold start (WASM load)

Time from page load to interactive spreadsheet

Mog: < 2 s
Excel: N/A
Google Sheets: N/A

Target for initial WASM download, compile, and first render on a median connection.

Methodology

All benchmarks are run on the same hardware in a controlled environment. Browser-based tests (Mog and Google Sheets) use Chromium with a cold profile and no extensions. Excel tests use the latest desktop version of Microsoft Excel for macOS.

Each benchmark is executed 100 times. We report the median value to reduce the effect of outliers. Memory measurements use the browser Performance API for web-based tools and Activity Monitor for desktop Excel.
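A minimal sketch of the "run 100 times, report the median" procedure is shown below. It is not the project's benchmark harness: the function `median_runtime_ms` and the stand-in workload are hypothetical, and a real harness would time the spreadsheet operation under test instead.

```python
import statistics
import time
from typing import Callable


def median_runtime_ms(workload: Callable[[], object], runs: int = 100) -> float:
    """Time `workload` `runs` times and return the median duration in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def stand_in_workload() -> int:
    # Placeholder for the operation being benchmarked.
    return sum(i * i for i in range(10_000))


median_ms = median_runtime_ms(stand_in_workload)
print(f"median over 100 runs: {median_ms:.3f} ms")

# Why the median: a single slow run barely moves it, unlike the mean.
samples_with_outlier = [10, 11, 12, 11, 500]
print(statistics.median(samples_with_outlier))  # prints 11
print(statistics.mean(samples_with_outlier))    # prints 108.8
```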


Rust compute, everywhere

Mog's compute core is written in Rust and compiled either to WebAssembly for browsers or to native bindings for Node.js, so the same engine runs on every platform. Formula evaluation, recalculation, and XLSX parsing all run locally at native or near-native speed, with no server round-trips.