LLM API Pricing Comparison 2026

Compare pricing across 56 models from 11 providers including GPT-5.4, Claude 4.6, Gemini 3.1, DeepSeek, and Grok. All prices per 1 million tokens, updated daily.

| Model | Provider | Input ($/1M) | Output ($/1M) | Notes |
|---|---|---|---|---|
| Mistral Nemo | Mistral | $0.020 | $0.040 | Cheapest Mistral model |
| Nova Micro | Amazon | $0.035 | $0.14 | Cheapest Nova. Text-only |
| Command R7B | Cohere | $0.037 | $0.15 | Cheapest Cohere model |
| Llama 3.1 8B (Groq) | Groq | $0.050 | $0.080 | Ultra-cheap small model |
| Mistral Small 3.2 | Mistral | $0.060 | $0.18 | Latest small. 24B params |
| Nova Lite | Amazon | $0.060 | $0.24 | Budget multimodal. 300K context |
| GPT-4.1 Nano | OpenAI | $0.10 | $0.40 | 1M context. Cheapest OpenAI |
| Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | Ultra-budget. Free tier available |
| Llama 4 Scout (Groq) | Groq | $0.11 | $0.34 | 109B total, 17B active (MoE). LPU inference |
| Llama 4 Scout (Fireworks) | Fireworks AI | $0.12 | $0.35 | |
| Llama 4 Scout (Together) | Together AI | $0.14 | $0.43 | Serverless inference |
| GPT-4o Mini | OpenAI | $0.15 | $0.60 | Legacy budget |
| Command R | Cohere | $0.15 | $0.60 | Budget RAG model |
| Grok 4.1 Fast | xAI | $0.20 | $0.50 | Best value xAI. 2M context at $0.20/$0.50 |
| Grok Code Fast | xAI | $0.20 | $1.50 | Code-optimized |
| Llama 4 Maverick (Groq) | Groq | $0.20 | $0.60 | 400B total, 17B active (MoE). LPU inference |
| Llama 4 Maverick (Fireworks) | Fireworks AI | $0.22 | $0.88 | |
| Gemini 3.1 Flash-Lite Preview | Google | $0.25 | $1.50 | Budget tier. Free tier available |
| Llama 4 Maverick (Together) | Together AI | $0.27 | $0.85 | Serverless inference |
| DeepSeek V3.2 Chat | DeepSeek | $0.28 | $0.42 | Cheapest frontier model. Cache 90% off |
| DeepSeek R1 (V3.2 Thinking) | DeepSeek | $0.28 | $0.42 | Reasoning mode. Same pricing as V3.2 Chat |
| Qwen3 32B (Groq) | Groq | $0.29 | $0.59 | Alibaba Qwen3 on Groq |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | 1M context. Best value with thinking |
| Grok 3 Mini | xAI | $0.30 | $0.50 | Legacy budget reasoning |
| Codestral | Mistral | $0.30 | $0.90 | Code-specialized. 256K context |
| GPT-4.1 Mini | OpenAI | $0.40 | $1.60 | 1M context. Cost-effective |
| Mistral Medium 3 | Mistral | $0.40 | $2.00 | Mid-tier reasoning |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | Current flagship. 262K context |
| o3 Mini | OpenAI | $0.55 | $2.20 | Budget reasoning |
| Llama 3.3 70B (Groq) | Groq | $0.59 | $0.79 | Dense 70B. LPU inference |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | Legacy budget model |
| Nova Pro | Amazon | $0.80 | $3.20 | Mid-tier. 300K context |
| GPT-5.2 | OpenAI | $0.88 | $7.00 | |
| Llama 3.3 70B (Together) | Together AI | $0.88 | $0.88 | Same input/output price |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | Fast and cost-efficient |
| o4 Mini | OpenAI | $1.10 | $4.40 | Latest compact reasoning |
| GPT-5.1 | OpenAI | $1.25 | $10.00 | |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M context. >200K: $2.50/$15 |
| GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | Agentic coding model |
| GPT-5.3 Chat | OpenAI | $1.75 | $14.00 | Chat-optimized |
| o3 | OpenAI | $2.00 | $8.00 | Reasoning. Internal CoT billed as output |
| GPT-4.1 | OpenAI | $2.00 | $8.00 | 1M context. Legacy |
| Gemini 3.1 Pro Preview | Google | $2.00 | $12.00 | Newest. >200K: $4/$18. Batch: $1/$6 |
| Grok 4.20 | xAI | $2.00 | $6.00 | Latest flagship. 2M context |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | Latest flagship. 1M context |
| GPT-4o | OpenAI | $2.50 | $10.00 | Legacy multimodal |
| Nova Premier | Amazon | $2.50 | $12.50 | Most capable Nova. 1M context |
| Command A | Cohere | $2.50 | $10.00 | Latest flagship. 256K context |
| Command R+ | Cohere | $2.50 | $10.00 | Previous flagship. RAG-optimized |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | Balanced. 1M context standard |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | Previous gen. 1M in beta (>200K: $6/$22.50) |
| Grok 4 | xAI | $3.00 | $15.00 | Previous flagship |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | Most capable. 1M context. Fast mode: $30/$150 |
| GPT-5.2 Pro | OpenAI | $10.50 | $84.00 | Previous-gen premium reasoning |
| o3 Pro | OpenAI | $20.00 | $80.00 | Premium reasoning. Hidden CoT billed as output |
| GPT-5.4 Pro | OpenAI | $30.00 | $180.00 | Most advanced reasoning. 2x/1.5x surcharge >272K tokens |

Frequently Asked Questions

What is the cheapest LLM API in 2026?

Mistral Nemo is the cheapest at $0.02/1M input tokens. For frontier-class models, DeepSeek V3.2 at $0.28/$0.42 per 1M tokens is the best value. Amazon Nova Micro ($0.035/$0.14) is the cheapest from a major cloud provider.

How much does GPT-5.4 cost per token?

GPT-5.4 costs $2.50 per 1 million input tokens and $15.00 per 1 million output tokens. The Pro version costs $30.00/$180.00 per 1M tokens for maximum reasoning quality.

How much does Claude 4.6 cost?

Claude Opus 4.6 costs $5.00/1M input and $25.00/1M output tokens with 1M context. Claude Sonnet 4.6 costs $3.00/$15.00. Cached input is 90% cheaper at $0.50 and $0.30 respectively.
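With caching, your effective input rate depends on how much of each prompt is served from cache. A minimal sketch of that blended-rate arithmetic, using the Claude Opus 4.6 prices above (the `hit_rate` value and function name are illustrative, not part of any provider's API):

```python
# Blend the full-price and cached-input rates by cache hit rate.
# Prices are per 1M tokens; hit_rate is the fraction of input tokens
# served from cache (hypothetical value for illustration).
def effective_input_price(base: float, cached: float, hit_rate: float) -> float:
    return hit_rate * cached + (1 - hit_rate) * base

# Claude Opus 4.6: $5.00 base, $0.50 cached (90% off).
# At an 80% cache hit rate: 0.8 * 0.50 + 0.2 * 5.00 = $1.40 per 1M input tokens.
price = effective_input_price(5.00, 0.50, hit_rate=0.8)
```

The same formula applies to any provider with cached-input pricing, such as DeepSeek's 90% cache discount.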

Which LLM has the largest context window?

xAI Grok 4.20 and Grok 4.1 Fast both offer 2 million token context windows. OpenAI GPT-5.4 offers 1.05M tokens. Google Gemini and Anthropic Claude offer 1M tokens.

How do I calculate my LLM API cost?

Compute your per-request cost: average input tokens per request times the input price, plus output tokens times the output price (prices are per 1 million tokens). Then multiply by your daily request count and 30 days for a monthly estimate. Use our calculator for instant comparisons across all providers.
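The formula above can be sketched in a few lines. This is an illustrative helper, not our calculator's implementation; the example plugs in GPT-5.4's $2.50/$15.00 rates from the table and a hypothetical workload:

```python
# Monthly cost estimate: per-request cost * requests/day * days.
# Prices are in dollars per 1M tokens, as listed in the table.
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float,
                 requests_per_day: int, days: int = 30) -> float:
    per_request = (input_tokens / 1_000_000) * input_price \
                + (output_tokens / 1_000_000) * output_price
    return per_request * requests_per_day * days

# Hypothetical workload: 2,000 input + 500 output tokens per request,
# 10,000 requests/day on GPT-5.4 ($2.50 in / $15.00 out).
# Per request: $0.005 + $0.0075 = $0.0125; monthly: $0.0125 * 10,000 * 30 = $3,750.
cost = monthly_cost(2_000, 500, 2.50, 15.00, 10_000)
```

Swapping in another row from the table (say DeepSeek V3.2 at $0.28/$0.42) shows how quickly the same workload's cost changes across providers.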