Blog – Narev

Tested the 'top of 2025' LLMs on a real task. GPT-3.5 won.

MMLU Pro winners against MMLU Pro losers: how big is the gap on a single, repeatable task? The result is surprising.

October 14, 2025

MMLU scores vary by 13 points for the same model depending on who's measuring. Yet the "top" models differ by just 1%.

October 13, 2025

U.S. Census Bureau data shows AI adoption among large firms has continued to decline after peaking in July 2025, while adoption among the smallest firms keeps growing

October 11, 2025