AI Models

Foundation models powering the AI ecosystem. Compare context windows, pricing, modality support, and real-world capabilities.

Claude Haiku 4.5

anthropic

Fast and cheap for high-volume tasks

200,000 ctx | $0.8/M in | text, image

Claude Opus 4.5

anthropic

80.9% SWE-bench Verified - frontier reasoning

200,000 ctx | $15/M in | text, image

Claude Opus 4.6

anthropic

Deep reasoning and architectural analysis at 1M context

1,000,000 ctx | $15/M in | text, image

Claude Sonnet 4.6

anthropic

Mid-tier model nearly matching flagships at 79.6% SWE-bench

200,000 ctx | $3/M in | text, image

Claude Sonnet 5

anthropic

82.1% on SWE-bench Verified, the highest score in this list

200,000 ctx | $3/M in | text, image

DeepSeek V4

deepseek

Open Weight

Engram architecture - 1M+ context at 50% lower cost

1,000,000 ctx | $0.14/M in | text

Gemini 3 Pro

google

ARC-AGI-2 leader at 77.1%, native multimodal

2,000,000 ctx | $1.25/M in | text, image, audio, video

Gemini 3.1 Pro

google

80.6% SWE-bench Verified, 2M context

2,000,000 ctx | $1.25/M in | text, image, audio, video

GLM-5

zhipu

Open Weight

Leads the open-weight leaderboard with an overall score of 85

128,000 ctx | text

GPT-5

openai

OpenAI flagship with native audio and broad capabilities

128,000 ctx | $5/M in | text, image, audio

GPT-5.3-Codex

openai

Most capable agentic coding model per OpenAI, 25% faster

128,000 ctx | $5/M in | text, image

Grok 4.5

xai

xAI flagship with real-time X data integration

128,000 ctx | text, image

Kimi K2.5

moonshot

Open Weight

Strongest HumanEval at 99.0 - best for code generation

200,000 ctx | text

Llama 4

meta

Open Weight

Broadest ecosystem support and community tooling

128,000 ctx | text, image

Mistral Small 4

mistral

Open Weight

24B parameters; fits on a consumer GPU with quantization

256,000 ctx | $0.1/M in | text

Qwen 3.5 397B

alibaba

Open Weight

Best GPQA Diamond score of any model at 88.4

128,000 ctx | text
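One way to compare entries on context window and price is to treat each card as a small record. The sketch below is illustrative only: the `Model` type and the helper names `input_cost_usd` and `candidates` are hypothetical, the figures are copied from a subset of the cards above, and it assumes "ctx" is the context window in tokens and "$/M in" is the price in USD per million input tokens. Output-token pricing is not listed here, so this estimates input cost only.

```python
from typing import List, NamedTuple, Optional


class Model(NamedTuple):
    name: str
    context_tokens: int
    usd_per_m_input: Optional[float]  # None where the listing shows no price


# Values copied from the listing; only a subset is included for brevity.
MODELS = [
    Model("Claude Haiku 4.5", 200_000, 0.8),
    Model("Claude Opus 4.6", 1_000_000, 15.0),
    Model("Claude Sonnet 4.6", 200_000, 3.0),
    Model("DeepSeek V4", 1_000_000, 0.14),
    Model("Gemini 3 Pro", 2_000_000, 1.25),
    Model("Gemini 3.1 Pro", 2_000_000, 1.25),
    Model("GLM-5", 128_000, None),
    Model("GPT-5", 128_000, 5.0),
]


def input_cost_usd(model: Model, input_tokens: int) -> float:
    """Estimate the input-side cost of one request (output tokens excluded)."""
    if model.usd_per_m_input is None:
        raise ValueError(f"no input price listed for {model.name}")
    return model.usd_per_m_input * input_tokens / 1_000_000


def candidates(prompt_tokens: int) -> List[Model]:
    """Priced models whose context window can hold the prompt, cheapest first."""
    fits = [
        m for m in MODELS
        if m.usd_per_m_input is not None and m.context_tokens >= prompt_tokens
    ]
    return sorted(fits, key=lambda m: m.usd_per_m_input)


if __name__ == "__main__":
    prompt_tokens = 500_000
    for m in candidates(prompt_tokens):
        print(f"{m.name}: ${input_cost_usd(m, prompt_tokens):.2f}")
```

Models without a listed price are skipped rather than treated as free, since a missing price on a card says nothing about actual cost.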