LLM API Pricing Comparison 2026
Compare pricing across 56 models from 11 providers including GPT-5.4, Claude 4.6, Gemini 3.1, DeepSeek, and Grok. All prices per 1 million tokens, updated daily.
Calculate your actual monthly cost| Provider | |||||
|---|---|---|---|---|---|
| Mistral NemoCheapest Mistral model | Mistral | $0.020 | $0.040 | 131K | Fast |
| Nova MicroCheapest Nova. Text-only | Amazon | $0.035 | $0.14 | 128K | Fast |
| Command R7BCheapest Cohere model | Cohere | $0.037 | $0.15 | 128K | Fast |
| Llama 3.1 8B (Groq)Ultra-cheap small model | Groq | $0.050 | $0.080 | 128K | Fast |
| Mistral Small 3.2Latest small. 24B params | Mistral | $0.060 | $0.18 | 131K | Fast |
| Nova LiteBudget multimodal. 300K context | Amazon | $0.060 | $0.24 | 300K | Fast |
| GPT-4.1 Nano1M context. Cheapest OpenAI | OpenAI | $0.10 | $0.40 | 1M | Fast |
| Gemini 2.5 Flash-LiteUltra-budget. Free tier available | $0.10 | $0.40 | 1.0M | Fast | |
| Llama 4 Scout (Groq)109B total, 17B active (MoE). LPU inference | Groq | $0.11 | $0.34 | 128K | Fast |
| Llama 4 Scout (Fireworks) | Fireworks AI | $0.12 | $0.35 | 128K | Fast |
| Llama 4 Scout (Together)Serverless inference | Together AI | $0.14 | $0.43 | 512K | Fast |
| GPT-4o MiniLegacy budget | OpenAI | $0.15 | $0.60 | 128K | Fast |
| Command RBudget RAG model | Cohere | $0.15 | $0.60 | 128K | Fast |
| Grok 4.1 FastBest value xAI. 2M context at $0.20/$0.50 | xAI | $0.20 | $0.50 | 2M | Fast |
| Grok Code FastCode-optimized | xAI | $0.20 | $1.50 | 256K | Fast |
| Llama 4 Maverick (Groq)400B total, 17B active (MoE). LPU inference | Groq | $0.20 | $0.60 | 128K | Fast |
| Llama 4 Maverick (Fireworks) | Fireworks AI | $0.22 | $0.88 | 128K | Fast |
| Gemini 3.1 Flash-Lite PreviewBudget tier. Free tier available | $0.25 | $1.50 | 1.0M | Fast | |
| Llama 4 Maverick (Together)Serverless inference | Together AI | $0.27 | $0.85 | 256K | Fast |
| DeepSeek V3.2 ChatCheapest frontier model. Cache 90% off | DeepSeek | $0.28 | $0.42 | 128K | Fast |
| DeepSeek R1 (V3.2 Thinking)Reasoning mode. Same pricing as V3 chat | DeepSeek | $0.28 | $0.42 | 128K | Med |
| Qwen3 32B (Groq)Alibaba Qwen3 on Groq | Groq | $0.29 | $0.59 | 131K | Fast |
| Gemini 2.5 Flash1M context. Best value with thinking | $0.30 | $2.50 | 1.0M | Fast | |
| Grok 3 MiniLegacy budget reasoning | xAI | $0.30 | $0.50 | 131.1K | Fast |
| CodestralCode-specialized. 256K context | Mistral | $0.30 | $0.90 | 256K | Fast |
| GPT-4.1 Mini1M context. Cost-effective | OpenAI | $0.40 | $1.60 | 1M | Fast |
| Mistral Medium 3Mid-tier reasoning | Mistral | $0.40 | $2.00 | 131K | Med |
| Mistral Large 3Current flagship. 262K context | Mistral | $0.50 | $1.50 | 262K | Med |
| o3 MiniBudget reasoning | OpenAI | $0.55 | $2.20 | 200K | Fast |
| Llama 3.3 70B (Groq)Dense 70B. LPU inference | Groq | $0.59 | $0.79 | 128K | Fast |
| Claude Haiku 3.5Legacy budget model | Anthropic | $0.80 | $4.00 | 200K | Fast |
| Nova ProMid-tier. 300K context | Amazon | $0.80 | $3.20 | 300K | Med |
| GPT-5.2 | OpenAI | $0.88 | $7.00 | 400K | Med |
| Llama 3.3 70B (Together)Same input/output price | Together AI | $0.88 | $0.88 | 131.1K | Fast |
| Claude Haiku 4.5Fast and cost-efficient | Anthropic | $1.00 | $5.00 | 200K | Fast |
| o4 MiniLatest compact reasoning | OpenAI | $1.10 | $4.40 | 200K | Fast |
| GPT-5.1 | OpenAI | $1.25 | $10.00 | 128K | Med |
| Gemini 2.5 Pro1M context. >200K: $2.50/$15 | $1.25 | $10.00 | 1.0M | Med | |
| GPT-5.3 CodexAgentic coding model | OpenAI | $1.75 | $14.00 | 400K | Med |
| GPT-5.3 ChatChat-optimized | OpenAI | $1.75 | $14.00 | 128K | Med |
| o3Reasoning. Internal CoT billed as output | OpenAI | $2.00 | $8.00 | 200K | Med |
| GPT-4.11M context. Legacy | OpenAI | $2.00 | $8.00 | 1M | Med |
| Gemini 3.1 Pro PreviewNewest. >200K: $4/$18. Batch: $1/$6 | $2.00 | $12.00 | 1.0M | Med | |
| Grok 4.20Latest flagship. 2M context | xAI | $2.00 | $6.00 | 2M | Med |
| GPT-5.4Latest flagship. 1M context | OpenAI | $2.50 | $15.00 | 1.1M | Med |
| GPT-4oLegacy multimodal | OpenAI | $2.50 | $10.00 | 128K | Fast |
| Nova PremierMost capable Nova. 1M context | Amazon | $2.50 | $12.50 | 1M | Med |
| Command ALatest flagship. 256K context | Cohere | $2.50 | $10.00 | 256K | Med |
| Command R+Previous flagship. RAG-optimized | Cohere | $2.50 | $10.00 | 128K | Med |
| Claude Sonnet 4.6Balanced. 1M context standard | Anthropic | $3.00 | $15.00 | 1M | Med |
| Claude Sonnet 4.5Previous gen. 1M in beta (>200K: $6/$22.50) | Anthropic | $3.00 | $15.00 | 1M | Med |
| Grok 4Previous flagship | xAI | $3.00 | $15.00 | 256K | Med |
| Claude Opus 4.6Most capable. 1M context. Fast mode: $30/$150 | Anthropic | $5.00 | $25.00 | 1M | Slow |
| GPT-5.2 ProPrevious-gen premium reasoning | OpenAI | $10.50 | $84.00 | 400K | Slow |
| o3 ProPremium reasoning. Hidden CoT billed as output | OpenAI | $20.00 | $80.00 | 200K | Slow |
| GPT-5.4 ProMost advanced reasoning. 2x/1.5x surcharge >272K tokens | OpenAI | $30.00 | $180.00 | 1.1M | Slow |
Frequently Asked Questions
What is the cheapest LLM API in 2026?
Mistral Nemo is the cheapest at $0.02/1M input tokens. For frontier-class models, DeepSeek V3.2 at $0.28/$0.42 per 1M tokens is the best value. Amazon Nova Micro ($0.035/$0.14) is the cheapest from a major cloud provider.
How much does GPT-5.4 cost per token?
GPT-5.4 costs $2.50 per 1 million input tokens and $15.00 per 1 million output tokens. The Pro version costs $30.00/$180.00 per 1M tokens for maximum reasoning quality.
How much does Claude 4.6 cost?
Claude Opus 4.6 costs $5.00/1M input and $25.00/1M output tokens with 1M context. Claude Sonnet 4.6 costs $3.00/$15.00. Cached input is 90% cheaper at $0.50 and $0.30 respectively.
Which LLM has the largest context window?
xAI Grok 4.20 and Grok 4.1 Fast both offer 2 million token context windows. OpenAI GPT-5.4 offers 1.05M tokens. Google Gemini and Anthropic Claude offer 1M tokens.
How do I calculate my LLM API cost?
Multiply your average input tokens per request by the input price, add your output tokens times the output price, then multiply by your daily request count and 30 days. Use our calculator for instant comparisons across all providers.