Claude API Cost Calculator
Updated February 17, 2026 – includes Sonnet 4.6
Current Claude Model Lineup
As of February 17, 2026, Anthropic has released two new models in the space of two weeks. Claude Sonnet 4.6 is now the default on all Free and Pro plans. Here is the full lineup with pricing and recommended use cases.
Claude Sonnet 4.6
Released February 17, 2026. Now the default model across all Claude plans, including Free and Pro. Scores 79.6% on SWE-bench (within 1.3 points of the Opus 4.5 record) and 72.5% on OSWorld for computer use, within 0.2 points of Opus 4.6 at 40% lower cost. Outperforms Opus 4.6 on office tasks and financial analysis benchmarks. Users prefer it over Sonnet 4.5 in 70% of head-to-head comparisons and over Opus 4.5 in 59%. Its ARC-AGI-2 score jumped 4.3x, from 13.6% to 58.3%. Supports extended thinking, adaptive thinking, and context compaction in beta for long agentic runs. API string: claude-sonnet-4-6
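For reference, a minimal call to this model through the official Python SDK looks roughly like the sketch below. The model string is the one listed above; the prompt text is purely illustrative.

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-6",  # API string from the lineup above
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize our Q4 infrastructure costs in three bullets."}],
)

# usage.input_tokens and usage.output_tokens are the numbers you are billed on
print(response.usage.input_tokens, response.usage.output_tokens)
print(response.content[0].text)
```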
Claude Opus 4.6
Released February 5, 2026. Flagship model with a 1M-token context window in beta and enhanced agentic capabilities. Retains a clear lead on pure reasoning benchmarks like GPQA Diamond and Humanity’s Last Exam, where Sonnet 4.6 does not yet close the gap. Best for complex autonomous workflows and tasks where maximum reasoning depth matters more than cost efficiency.
Claude Opus 4.5
Released November 2025. Still holds the highest SWE-bench score in the lineup at 80.9%. The effort parameter lets you dial reasoning depth per request – at medium effort it matches Sonnet 4.5 while using 76% fewer output tokens. Strong choice for mission-critical coding where even Sonnet 4.6 is not sufficient.
Claude Sonnet 4.5
Released September 2025. Sonnet 4.6 is the recommended upgrade path at the same price. Sonnet 4.5 remains available and fully functional for existing workflows built around it.
Claude Haiku 4.5
Released October 2025. Near-frontier performance at the lowest cost in the active lineup. One-third the price of Sonnet with more than twice the speed. Best for real-time applications, chat assistants, pair programming, and high-volume sub-agent orchestration under an Opus or Sonnet orchestrator.
Claude Opus 4.1 and Opus 4
Previous flagship models. With Opus 4.6 and 4.5 delivering better performance at 66% lower cost, these are now relevant only for workflows that specifically require Opus 4.1 or Opus 4 model behavior. New deployments should use Opus 4.6 or Opus 4.5.
Legacy Haiku Models
Haiku 3 at $0.25/$1.25 remains the absolute lowest-cost option. Haiku 4.5 offers dramatically better capabilities at modest additional cost and is the recommended path for new work.
Cost Optimization Features
Two features reduce API costs significantly when used appropriately. They can be combined. See the official pricing documentation for full detail on each.
Prompt Caching
90% off reads. Cache system prompts, documentation, or any context that repeats across multiple requests. Cache reads cost 10% of the standard input rate for whichever model you are using, and caching reduces latency by up to 85% for long prompts.
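A sketch of what a cached system prompt can look like with the Python SDK, assuming the current cache_control syntax; the documentation string and question are placeholders.

```python
import anthropic

client = anthropic.Anthropic()

docs = "...long, stable product documentation pasted here..."  # placeholder content

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": docs,
            # marks this block as cacheable; reads on later requests bill at the cached rate
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Where is rate limiting documented?"}],
)

# cache_creation_input_tokens on the first call, cache_read_input_tokens on repeats
print(response.usage)
```

Note that very short prompts may fall below a model's minimum cacheable length, in which case the block is simply processed at the standard rate.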
Batch Processing
50% off. Half price for requests that do not need a real-time response. Typical turnaround is under 1 hour, with a maximum of 24 hours. Batch requests do not count against standard API rate limits.
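A minimal sketch of submitting work through the Batches API with the Python SDK; the ticket texts and custom IDs are illustrative.

```python
import anthropic

client = anthropic.Anthropic()

tickets = ["First ticket text", "Second ticket text"]  # placeholder workload

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": f"ticket-{i}",  # used to match results back to inputs
            "params": {
                "model": "claude-sonnet-4-6",
                "max_tokens": 512,
                "messages": [{"role": "user", "content": text}],
            },
        }
        for i, text in enumerate(tickets)
    ]
)

# poll until processing_status is "ended", then fetch results from the batch results endpoint
print(batch.id, batch.processing_status)
```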
Combining both: Batch processing with 1-hour prompt caching can reduce input costs by up to 95% on prompt-heavy workloads. Use 1-hour TTL with batches since processing time typically exceeds the 5-minute window, resulting in better cache hit rates.
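The 95% figure follows from stacking the two discounts, assuming they compound multiplicatively on cached input tokens:

```python
# Back-of-envelope check on the "up to 95%" claim for cached, batched input tokens.
base_input_rate = 3.00 / 1_000_000   # Sonnet 4.6 input, dollars per token
cache_read_multiplier = 0.10         # cache reads cost 10% of the standard input rate
batch_multiplier = 0.50              # batch requests are half price

effective = base_input_rate * cache_read_multiplier * batch_multiplier
print(effective / base_input_rate)   # 0.05, i.e. a 95% reduction
```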
Performance vs. Pricing
Claude Sonnet 4.6 closes the gap between Sonnet and Opus tiers to the point where most workloads no longer require a premium model. On computer use and office tasks it matches or exceeds the flagship.
| Model | SWE-bench | OSWorld | Input (1M) |
|---|---|---|---|
| Claude Sonnet 4.6 NEW | 79.6% | 72.5% | $3.00 |
| Claude Opus 4.6 NEW | – | 72.7% | $5.00 |
| Claude Opus 4.5 | 80.9% | – | $5.00 |
| Claude Sonnet 4.5 | 77.2% | 61.4% | $3.00 |
| Claude Haiku 4.5 | 73.0% | – | $1.00 |
| Claude Opus 4.1 | 74.5% | – | $15.00 |
| Gemini 3 Pro | ~78% | – | $2.00 |
| GPT-5.2 | ~76% | 38.2% | $1.25 |
Computer use gap: Sonnet 4.6’s OSWorld score of 72.5% is within 0.2% of Opus 4.6’s 72.7% – and both leave GPT-5.2 at 38.2% well behind. On office productivity tasks (GDPval-AA), Sonnet 4.6 actually outperforms Opus 4.6.
1M context window: Available in beta on Sonnet 4.6, Opus 4.6, Sonnet 4.5, and Sonnet 4. Requires usage tier 4 or a custom rate limit agreement. Requests exceeding 200K tokens are charged at premium long-context rates.
Opus 4.6 / 4.5 vs. Legacy Opus Models
Opus 4.6 and 4.5 are the direct upgrade paths from Opus 4.1 and Opus 4. Both deliver better performance at 66% lower cost with no breaking changes.
| Metric | Opus 4.5 | Opus 4.1 | Difference |
|---|---|---|---|
| Input cost (1M tokens) | $5.00 | $15.00 | 66% cheaper |
| Output cost (1M tokens) | $25.00 | $75.00 | 66% cheaper |
| SWE-bench Verified | 80.9% | 74.5% | +6.4 points |
| Token efficiency | Up to 65% fewer | Baseline | Significant |
| Effort parameter | Yes | No | New feature |
Example savings: Processing 10M input tokens and 5M output tokens monthly costs $50 + $125 = $175 with Opus 4.5, versus $150 + $375 = $525 with Opus 4.1 – saving $350 per month (67% reduction).
Effort Parameter (Opus 4.5)
Opus 4.5 lets you control reasoning depth per request via an effort parameter. This balances quality against cost and token usage on a per-call basis.
Low Effort
Fastest responses with minimal reasoning depth. Best for simple tasks, quick classification, or high-volume applications where speed matters more than thorough analysis.
Medium Effort
Matches Sonnet 4.5’s best SWE-bench score while using 76% fewer output tokens. The practical default for most production coding tasks.
High Effort
Exceeds Sonnet 4.5 by 4.3 percentage points on SWE-bench while still using 48% fewer output tokens. Use for mission-critical code, complex debugging, and tasks requiring maximum accuracy.
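The exact request shape for the effort parameter is not reproduced here from the official docs; the sketch below passes it through the SDK's extra_body escape hatch as a hypothetical field named effort with assumed values low, medium, and high. Confirm the real field name and values in the API reference before relying on it.

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-5",  # assumed API string for Opus 4.5
    max_tokens=2048,
    messages=[{"role": "user", "content": "Refactor this module and explain the changes."}],
    # hypothetical field name and values; check the API reference for the actual shape
    extra_body={"effort": "medium"},
)
print(response.usage.output_tokens)
```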
Token Costs and Usage Patterns
Claude’s tokenizer produces approximately 33% more tokens than simple word counts. Use 1.33 tokens per word as a planning estimate for English text.
| Content Type | Tokens per 1K Words | Sonnet 4.6 Input Cost |
|---|---|---|
| Natural language | 1,330 | $0.00399 |
| Technical documentation | 1,400 | $0.0042 |
| Source code | 1,500 | $0.0045 |
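As a sanity check on the table, here is a small estimator built from those per-1K-word figures and the $3-per-million Sonnet 4.6 input rate; for exact numbers, count tokens through the API rather than estimating.

```python
# Planning-level estimate only; real token counts come from the API.
TOKENS_PER_1K_WORDS = {
    "natural_language": 1330,
    "technical_docs": 1400,
    "source_code": 1500,
}
SONNET_46_INPUT_PER_MTOK = 3.00  # dollars per million input tokens

def estimate_input_cost(word_count: int, content_type: str = "natural_language") -> float:
    tokens = word_count / 1000 * TOKENS_PER_1K_WORDS[content_type]
    return tokens / 1_000_000 * SONNET_46_INPUT_PER_MTOK

print(f"${estimate_input_cost(1000):.5f}")                 # ~$0.00399 for 1K words of prose
print(f"${estimate_input_cost(1000, 'source_code'):.4f}")  # ~$0.0045 for 1K words of code
```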
Managing Output Costs
Output tokens cost five times as much as input tokens across all models. Request specific output formats (JSON, bullet points) to keep responses concise. Set length constraints when detailed analysis is not required. Use the effort parameter on Opus 4.5 to limit reasoning token usage. Extended thinking can add significant volume on complex tasks, since thinking tokens are billed as output tokens.
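In practice those levers are just request parameters; a sketch with an illustrative prompt:

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=300,  # hard ceiling on billable output tokens for this call
    system="Answer as a JSON object with keys 'verdict' and 'reasons' (at most three short bullets).",
    messages=[{"role": "user", "content": "Should we move nightly report generation to the batch API?"}],
)
print(response.usage.output_tokens)  # stays at or below max_tokens
```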
API Rate Limits and Scaling
Tier progression is automatic based on deposit history and wait periods. The 1M context window (Sonnet 4.6, Opus 4.6, Sonnet 4.5, Sonnet 4) requires tier 4 or a custom agreement.
| Tier | Monthly Limit | Deposit | Rate Limits |
|---|---|---|---|
| Tier 1 | $100 | $5 | 20 RPM, 4K tokens/min |
| Tier 2 | $500 | $40 | 40 RPM, 8K tokens/min |
| Tier 4 | $5,000 | $400 | 200 RPM, 40K tokens/min |
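When running close to a tier's limits, a simple retry-with-backoff wrapper keeps bursts from failing hard. This is a generic pattern, not tied to any specific tier's numbers.

```python
import time
import anthropic

client = anthropic.Anthropic()

def create_with_backoff(max_retries: int = 5, **params):
    """Retry messages.create on rate-limit errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return client.messages.create(**params)
        except anthropic.RateLimitError:
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("rate limit retries exhausted")

msg = create_with_backoff(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{"role": "user", "content": "ping"}],
)
```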
Enterprise: Team plans start at $25-30 per user monthly (5 user minimum). Enterprise contracts start at $50,000 annually with custom limits, dedicated support, and priority access.
Additional Pricing Components
Beyond base token costs, several features carry their own charges.
Code Execution
$0.05 per session-hour with a 5-minute minimum. Each organization receives 50 free hours daily.
Web Search
$10 per 1,000 searches. Does not include the input/output token cost of processing search results.
Tool Use Overhead
System prompt tokens added automatically per request: 313-346 tokens for basic tools, approximately 700 tokens for the text editor tool, and approximately 245 tokens for bash. Opus 4.5 introduced Tool Search to reduce context bloat by 85% in large tool sets.
Extended and Adaptive Thinking
Thinking tokens are billed as output tokens at the model’s standard output rate. Available on Sonnet 4.6, Opus 4.6, Opus 4.5, Haiku 4.5, and select older models. Sonnet 4.6 additionally supports adaptive thinking and context compaction in beta for sustained agentic runs.
Long Context Premium
When a request exceeds 200K input tokens on Sonnet 4.6, Opus 4.6, Sonnet 4.5, or Sonnet 4, all tokens in that request are billed at premium long-context rates – not just the tokens above 200K. Batch and caching discounts still apply on top of long-context pricing.
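The rule is easy to encode; the sketch below takes the long-context rate as an argument because the premium figure varies by model and is not listed on this page.

```python
LONG_CONTEXT_THRESHOLD = 200_000
STANDARD_INPUT_RATE = 3.00  # dollars per million input tokens, Sonnet 4.6 standard pricing

def input_cost(input_tokens: int, premium_rate: float) -> float:
    """premium_rate is the model's long-context $/Mtok figure from the official price list."""
    rate = STANDARD_INPUT_RATE if input_tokens <= LONG_CONTEXT_THRESHOLD else premium_rate
    return input_tokens / 1_000_000 * rate

# A 250K-token request is billed entirely at premium_rate,
# not 200K tokens at the standard rate plus 50K at the premium rate.
```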
Implementation Strategy
A structured approach to model selection and optimization reduces costs significantly without sacrificing output quality.
Development Phase
Start with Claude Haiku 4.5 at $1/$5 for prototyping. Its near-frontier performance enables realistic testing at minimal cost. Set up API integration, error handling, and token usage monitoring before scaling.
Production Deployment
Deploy Haiku 4.5 for high-volume tasks and Sonnet 4.6 or Opus 4.6/4.5 for complex reasoning. Enable prompt caching for repeated context. Route non-urgent workloads to batch processing. Use the effort parameter on Opus 4.5 to control per-request reasoning depth and cost.
Multi-Agent Architecture
Let Opus 4.6 or Sonnet 4.6 orchestrate teams of Haiku 4.5 sub-agents for complex multi-step workflows. Analyze cost per feature and per user. Implement dynamic model selection where task complexity determines which model handles each step.
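A routing sketch under those assumptions: the task labels, complexity tiers, and the Haiku and Opus API strings are illustrative, not taken from this page.

```python
# Orchestrator-side model selection: route each step to the cheapest adequate model.
MODEL_BY_COMPLEXITY = {
    "simple": "claude-haiku-4-5",     # assumed API string: classification, extraction
    "standard": "claude-sonnet-4-6",  # most coding and analysis steps
    "hard": "claude-opus-4-6",        # assumed API string: deep multi-step reasoning
}

def pick_model(step: dict) -> str:
    return MODEL_BY_COMPLEXITY[step.get("complexity", "standard")]

plan = [
    {"name": "triage ticket", "complexity": "simple"},
    {"name": "draft fix", "complexity": "standard"},
    {"name": "review architecture impact", "complexity": "hard"},
]
for step in plan:
    print(step["name"], "->", pick_model(step))
```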