LLM API Pricing Comparison 2026

Compare pricing across 56 models from 11 providers including GPT-5.4, Claude 4.6, Gemini 3.1, DeepSeek, and Grok. All prices per 1 million tokens, updated daily.

Calculate your actual monthly cost

56 models

	Provider
Mistral NemoCheapest Mistral model	Mistral	$0.020	$0.040	131K	Fast
Nova MicroCheapest Nova. Text-only	Amazon	$0.035	$0.14	128K	Fast
Command R7BCheapest Cohere model	Cohere	$0.037	$0.15	128K	Fast
Llama 3.1 8B (Groq)Ultra-cheap small model	Groq	$0.050	$0.080	128K	Fast
Mistral Small 3.2Latest small. 24B params	Mistral	$0.060	$0.18	131K	Fast
Nova LiteBudget multimodal. 300K context	Amazon	$0.060	$0.24	300K	Fast
GPT-4.1 Nano1M context. Cheapest OpenAI	OpenAI	$0.10	$0.40	1M	Fast
Gemini 2.5 Flash-LiteUltra-budget. Free tier available	Google	$0.10	$0.40	1.0M	Fast
Llama 4 Scout (Groq)109B total, 17B active (MoE). LPU inference	Groq	$0.11	$0.34	128K	Fast
Llama 4 Scout (Fireworks)	Fireworks AI	$0.12	$0.35	128K	Fast
Llama 4 Scout (Together)Serverless inference	Together AI	$0.14	$0.43	512K	Fast
GPT-4o MiniLegacy budget	OpenAI	$0.15	$0.60	128K	Fast
Command RBudget RAG model	Cohere	$0.15	$0.60	128K	Fast
Grok 4.1 FastBest value xAI. 2M context at $0.20/$0.50	xAI	$0.20	$0.50	2M	Fast
Grok Code FastCode-optimized	xAI	$0.20	$1.50	256K	Fast
Llama 4 Maverick (Groq)400B total, 17B active (MoE). LPU inference	Groq	$0.20	$0.60	128K	Fast
Llama 4 Maverick (Fireworks)	Fireworks AI	$0.22	$0.88	128K	Fast
Gemini 3.1 Flash-Lite PreviewBudget tier. Free tier available	Google	$0.25	$1.50	1.0M	Fast
Llama 4 Maverick (Together)Serverless inference	Together AI	$0.27	$0.85	256K	Fast
DeepSeek V3.2 ChatCheapest frontier model. Cache 90% off	DeepSeek	$0.28	$0.42	128K	Fast
DeepSeek R1 (V3.2 Thinking)Reasoning mode. Same pricing as V3 chat	DeepSeek	$0.28	$0.42	128K	Med
Qwen3 32B (Groq)Alibaba Qwen3 on Groq	Groq	$0.29	$0.59	131K	Fast
Gemini 2.5 Flash1M context. Best value with thinking	Google	$0.30	$2.50	1.0M	Fast
Grok 3 MiniLegacy budget reasoning	xAI	$0.30	$0.50	131.1K	Fast
CodestralCode-specialized. 256K context	Mistral	$0.30	$0.90	256K	Fast
GPT-4.1 Mini1M context. Cost-effective	OpenAI	$0.40	$1.60	1M	Fast
Mistral Medium 3Mid-tier reasoning	Mistral	$0.40	$2.00	131K	Med
Mistral Large 3Current flagship. 262K context	Mistral	$0.50	$1.50	262K	Med
o3 MiniBudget reasoning	OpenAI	$0.55	$2.20	200K	Fast
Llama 3.3 70B (Groq)Dense 70B. LPU inference	Groq	$0.59	$0.79	128K	Fast
Claude Haiku 3.5Legacy budget model	Anthropic	$0.80	$4.00	200K	Fast
Nova ProMid-tier. 300K context	Amazon	$0.80	$3.20	300K	Med
GPT-5.2	OpenAI	$0.88	$7.00	400K	Med
Llama 3.3 70B (Together)Same input/output price	Together AI	$0.88	$0.88	131.1K	Fast
Claude Haiku 4.5Fast and cost-efficient	Anthropic	$1.00	$5.00	200K	Fast
o4 MiniLatest compact reasoning	OpenAI	$1.10	$4.40	200K	Fast
GPT-5.1	OpenAI	$1.25	$10.00	128K	Med
Gemini 2.5 Pro1M context. >200K: $2.50/$15	Google	$1.25	$10.00	1.0M	Med
GPT-5.3 CodexAgentic coding model	OpenAI	$1.75	$14.00	400K	Med
GPT-5.3 ChatChat-optimized	OpenAI	$1.75	$14.00	128K	Med
o3Reasoning. Internal CoT billed as output	OpenAI	$2.00	$8.00	200K	Med
GPT-4.11M context. Legacy	OpenAI	$2.00	$8.00	1M	Med
Gemini 3.1 Pro PreviewNewest. >200K: $4/$18. Batch: $1/$6	Google	$2.00	$12.00	1.0M	Med
Grok 4.20Latest flagship. 2M context	xAI	$2.00	$6.00	2M	Med
GPT-5.4Latest flagship. 1M context	OpenAI	$2.50	$15.00	1.1M	Med
GPT-4oLegacy multimodal	OpenAI	$2.50	$10.00	128K	Fast
Nova PremierMost capable Nova. 1M context	Amazon	$2.50	$12.50	1M	Med
Command ALatest flagship. 256K context	Cohere	$2.50	$10.00	256K	Med
Command R+Previous flagship. RAG-optimized	Cohere	$2.50	$10.00	128K	Med
Claude Sonnet 4.6Balanced. 1M context standard	Anthropic	$3.00	$15.00	1M	Med
Claude Sonnet 4.5Previous gen. 1M in beta (>200K: $6/$22.50)	Anthropic	$3.00	$15.00	1M	Med
Grok 4Previous flagship	xAI	$3.00	$15.00	256K	Med
Claude Opus 4.6Most capable. 1M context. Fast mode: $30/$150	Anthropic	$5.00	$25.00	1M	Slow
GPT-5.2 ProPrevious-gen premium reasoning	OpenAI	$10.50	$84.00	400K	Slow
o3 ProPremium reasoning. Hidden CoT billed as output	OpenAI	$20.00	$80.00	200K	Slow
GPT-5.4 ProMost advanced reasoning. 2x/1.5x surcharge >272K tokens	OpenAI	$30.00	$180.00	1.1M	Slow

Frequently Asked Questions

What is the cheapest LLM API in 2026?

Mistral Nemo is the cheapest at $0.02/1M input tokens. For frontier-class models, DeepSeek V3.2 at $0.28/$0.42 per 1M tokens is the best value. Amazon Nova Micro ($0.035/$0.14) is the cheapest from a major cloud provider.

How much does GPT-5.4 cost per token?

GPT-5.4 costs $2.50 per 1 million input tokens and $15.00 per 1 million output tokens. The Pro version costs $30.00/$180.00 per 1M tokens for maximum reasoning quality.

How much does Claude 4.6 cost?

Claude Opus 4.6 costs $5.00/1M input and $25.00/1M output tokens with 1M context. Claude Sonnet 4.6 costs $3.00/$15.00. Cached input is 90% cheaper at $0.50 and $0.30 respectively.

Which LLM has the largest context window?

xAI Grok 4.20 and Grok 4.1 Fast both offer 2 million token context windows. OpenAI GPT-5.4 offers 1.05M tokens. Google Gemini and Anthropic Claude offer 1M tokens.

How do I calculate my LLM API cost?

Multiply your average input tokens per request by the input price, add your output tokens times the output price, then multiply by your daily request count and 30 days. Use our calculator for instant comparisons across all providers.