AI Leaderboards
Curated benchmarks and leaderboards to compare LLMs across coding, reasoning, speed, cost, and more.
Chatbot Arena
Elo-based ranking built from real user votes in blind A/B comparisons. Widely treated as the gold standard for comparing LLM quality.
Measures: General LLM quality (Elo)
general
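To make the idea of an Elo-style ranking concrete, here is a minimal sketch of the textbook Elo update applied to a single pairwise vote. It is an illustration only, not the leaderboard's actual implementation: the function names, the K-factor of 32, and the starting rating of 1000 are all assumptions chosen for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A won the vote, 0.0 if B won, and 0.5 for a tie.
    k (assumed value) controls how far a single vote moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two hypothetical models start level; a user prefers model A in a blind vote.
model_a, model_b = update_elo(1000.0, 1000.0, score_a=1.0)
print(model_a, model_b)
```

An upset win against a higher-rated model moves the ratings more than an expected win, which is why a ranking built this way can be driven by many individual votes.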
OpenRouter Rankings
Live model popularity and performance rankings across 100+ models from all major providers.
Measures: Model popularity + performance
general, cost
OpenRouter Apps
Directory of applications built on OpenRouter, showing which models power real products.
Measures: App ecosystem
general
PinchBench
Comprehensive LLM benchmark suite covering reasoning, knowledge, and instruction following.
Measures: LLM benchmarks
general, reasoning
Artificial Analysis
Independent quality, speed, and cost analysis of AI models with standardized testing methodology.
Measures: Speed, quality, cost
general, speed, cost