AI Leaderboards

Curated benchmarks and leaderboards to compare LLMs across coding, reasoning, speed, cost, and more.

Chatbot Arena

Elo-based ranking built from real user votes in blind A/B comparisons. Widely used as a reference point for overall LLM quality.

Measures: General LLM quality (Elo)

general
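
To make "Elo-based ranking" concrete, here is a minimal sketch of the textbook Elo update applied to a single blind A/B vote. It assumes the conventional 400-point scale and a K-factor of 32; the function names and constants are illustrative only, and the leaderboard's actual rating pipeline may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the updated (rating_a, rating_b) after one vote.

    score_a is 1.0 if the voter preferred A, 0.0 if they preferred B, 0.5 for a tie.
    k (assumed value) controls how far a single vote can move a rating.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated models; the voter prefers A.
print(update_elo(1000.0, 1000.0, 1.0))  # → (1016.0, 984.0)
```

An upset (a lower-rated model winning) moves both ratings more than an expected result does, which is why rankings can settle after enough votes even when each vote is noisy.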

Live model popularity and performance rankings across 100+ models from all major providers.

Measures: Model popularity + performance

general, cost

Directory of applications built on OpenRouter, showing which models power real products.

Measures: App ecosystem

general

Comprehensive LLM benchmark suite covering reasoning, knowledge, and instruction following.

Measures: LLM benchmarks

general, reasoning

Independent quality, speed, and cost analysis of AI models with standardized testing methodology.

Measures: Speed, quality, cost

general, speed, cost