Welcome to the Narev blog. Find the latest research, product updates, and thoughts on the future of AI economics here.

Tested the 'top of 2025' LLMs on a real task. GPT-3.5 won.

MMLU Pro (Massive Multitask Language Understanding Pro) winners versus MMLU Pro losers: how big is the gap on a single, repeatable task? The result is surprising.

The MMLU Benchmark Reproducibility Problem

MMLU scores vary by 13 points for the same model depending on who measures it. Yet the “top” models differ by just 1%.

AI adoption rate among large firms continues to trend down

U.S. Census Bureau data shows AI adoption among large firms has continued to decline after peaking in July 2025.

Why Narev AI Was Built - Two futures for AI economics

There’s a world where AI gets infinitely cheaper, and another where developers need to choose the best tool for every job. Narev is building for the latter.