Tests whether free LLMs can correctly solve basic arithmetic across five operations: addition, subtraction, multiplication, division, and
exponentiation.
Examples: 2 + 3 → 5 · 10 - 3 → 7 · 4 × 5 → 20 · 6 / 2 → 3 · 2^3 → 8
A sanity-check benchmark: any calculator solves these problems instantly, but small or quantized free models occasionally stumble on edge cases.
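
As an illustration only, a model's reply for one of these problems could be scored roughly as follows. This is a minimal sketch, not the benchmark's actual grader: the names `OPS`, `expected`, and `is_correct`, the rule of taking the last number in the reply as the answer, and the tolerance `tol` are all assumptions made for the example.

```python
import operator
import re

# Hypothetical mapping from the operator symbols used in the examples
# to Python functions. Not taken from the benchmark's own code.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "×": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}


def expected(a, op, b):
    """Compute the reference answer for `a op b`."""
    return OPS[op](a, b)


def is_correct(model_output, a, op, b, tol=1e-9):
    """Return True if the last number in the model's reply matches the reference.

    Assumption: the final number appearing in the reply is the model's answer.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", model_output)
    if not numbers:
        return False  # no numeric answer found in the reply
    return abs(float(numbers[-1]) - expected(a, op, b)) <= tol


print(is_correct("The answer is 5", 2, "+", 3))  # True
print(is_correct("2^3 = 9", 2, "^", 3))          # False
```

Taking the last number in the reply is a simple heuristic that tolerates replies which restate the problem before answering; a stricter grader might require the reply to contain only the answer.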