DeepSeek V4 Pricing: Flash vs Pro

DeepSeek is the model people reach for when they want to cut AI coding costs hard, and its pricing is the reason. But “DeepSeek pricing” isn’t one number — V4 Flash and V4 Pro have different rates, and cache-hit input is billed very differently from cache-miss input. This explains how it all fits together so you can estimate what you’ll actually pay.

One thing up front: DeepSeek is pay-per-token only. There’s no flat coding plan like GLM’s or Alibaba’s. That’s a feature for light or bursty use — you pay for exactly what you run.

The model tiers

DeepSeek V4 model tiers (official pricing snapshot, June 2026)

deepseek-v4-flash	$0.0028/M cache-hit input, $0.14/M cache-miss input, $0.28/M output
deepseek-v4-pro	$0.003625/M cache-hit input, $0.435/M cache-miss input, $0.87/M output

The official page also lists two base URLs: https://api.deepseek.com for OpenAI-format tools and https://api.deepseek.com/anthropic for Anthropic-format tools like Claude Code.

In older guides and tools, you will still see deepseek-chat and deepseek-reasoner. DeepSeek says those aliases map to non-thinking and thinking modes of deepseek-v4-flash, and are scheduled for deprecation on July 24, 2026 at 15:59 UTC. New configs should prefer the current V4 model names where the tool supports them.

Cache-hit discounts

DeepSeek bills cached input tokens far cheaper than fresh ones. When your prompts reuse context — the same system prompt, the same files loaded across an agent session — those repeated tokens hit the cache and cost a fraction of the standard input rate.

This matters a lot for coding agents like Claude Code and OpenCode, which resend context constantly. On V4 Flash, cache-hit input is $0.0028 per million tokens versus $0.14 for cache-miss input, so repeated context is where DeepSeek becomes absurdly cheap.

Base URLs and model names

Use the base URL that matches your tool:

# OpenAI-compatible tools such as Cline, OpenCode, Aider, Codex custom providers
https://api.deepseek.com

# Anthropic-compatible tools such as Claude Code
https://api.deepseek.com/anthropic

For new setups, use deepseek-v4-flash as the default model where supported. Use deepseek-v4-pro only for hard tasks where the extra capability is worth the higher output price.

How to estimate your cost

A simple method:

Estimate input and output tokens per task (an agent session might be hundreds of thousands of tokens with context).
Multiply by the per-million rates for your chosen model from the pricing page.
Discount the repeated input for cache hits.
Multiply by tasks per day and days per month.

For most individual developers, real DeepSeek spend is low because agent sessions reuse context. The number to watch is output tokens; output is far more expensive than cache-hit input.

Pro vs Flash: which to use

Everyday coding → deepseek-v4-flash. Cheapest, fast, fine for the bulk of work.
Hard reasoning, tricky bugs, architecture → deepseek-v4-pro.
Old tool configs → replace deepseek-chat / deepseek-reasoner when your tool supports the V4 names.

A good pattern is to default to the cheap tier and escalate only when a task genuinely needs more.

Before you budget

Check current per-million rates on the official pricing page
Use deepseek-v4-flash by default and deepseek-v4-pro for hard tasks
Account for cache-hit discounts on repeated context
Use the OpenAI base URL for OpenAI-compatible tools and /anthropic for Claude Code
Update old deepseek-chat / deepseek-reasoner aliases before the deprecation date

Wrapping up

DeepSeek’s pricing is pay-per-token across V4 Flash and V4 Pro, with cache-hit discounts that make repeated agent context extremely cheap. There’s no coding plan, which suits light and bursty use perfectly. Default to V4 Flash, escalate to V4 Pro when needed, use the right base URL for your tool, and confirm the live numbers on the official page since they move.

To put it to work, see run DeepSeek with Claude Code. To compare against flat-rate options, see coding plans vs pay-per-token.

Frequently asked questions

Does DeepSeek have a coding plan or subscription?

No. DeepSeek is pay-per-token only — there's no flat monthly coding plan like GLM or Alibaba offer. You pay for the input and output tokens you use, which keeps light usage extremely cheap.

What's the difference between V4 Pro and V4 Flash?

V4 Flash is the cheaper everyday model and V4 Pro is the stronger, pricier model for harder reasoning and complex coding. Pick Flash for routine agent work and Pro only when a task needs the extra capability.

How do cache-hit discounts work?

When parts of your prompt repeat (the same system prompt or context across calls), DeepSeek bills those cached input tokens at a steep discount. Agentic coding reuses a lot of context, so cache hits can cut real-world cost noticeably.

Which DeepSeek model should I use for coding?

Use deepseek-v4-flash as the default for everyday coding and deepseek-v4-pro for harder reasoning or complex debugging. Flash is much cheaper; Pro is the escalation path.

How do I estimate my monthly cost?

Multiply your expected input and output tokens by the per-million rates for V4 Flash or V4 Pro, then account for cache-hit input. Coding agents resend context constantly, so cache-hit pricing can be the difference between a tiny bill and a merely cheap one.

Related guides

A WSL terminal running Claude Code on DeepSeek V4 on Windows

AI Coding Tools & Models

How to Run DeepSeek V4 With Claude Code on WSL

Run DeepSeek V4 with Claude Code inside WSL on Windows. Node setup, the Anthropic-compatible endpoint, persistent env vars, model choice, and pricing notes.

MCSA Guru Team Jun 17, 2026 3 min read

Claude Code on Windows configured to run on DeepSeek V4 via environment variables

AI Coding Tools & Models

How to Run DeepSeek V4 With Claude Code on Windows

Run DeepSeek V4 with Claude Code on Windows using its Anthropic-compatible endpoint. API key, environment variables, model choice, pricing, and the common fixes.

MCSA Guru Team Jun 16, 2026 4 min read

Doubao Seed 2.0 Pro compared against DeepSeek V4 for coding

AI Coding Tools & Models

Doubao Seed 2.0 Pro vs DeepSeek V4 for Coding

Doubao Seed 2.0 Pro vs DeepSeek V4 compared for coding: cost, setup ease, ecosystem, and which cheap model to pick for Claude Code, Codex, or OpenCode.

MCSA Guru Team Jul 19, 2026 3 min read

DeepSeek V4 Pricing Explained: V4 Flash vs V4 Pro

The model tiers

DeepSeek V4 model tiers (official pricing snapshot, June 2026)

Cache-hit discounts

Base URLs and model names

How to estimate your cost

Pro vs Flash: which to use

Before you budget

Wrapping up

Frequently asked questions

Sources & further reading

Related guides

How to Run DeepSeek V4 With Claude Code on WSL

How to Run DeepSeek V4 With Claude Code on Windows

Doubao Seed 2.0 Pro vs DeepSeek V4 for Coding

Fixing something right now?

DeepSeek V4 Pricing Explained: V4 Flash vs V4 Pro

The model tiers#

DeepSeek V4 model tiers (official pricing snapshot, June 2026)

Cache-hit discounts#

Base URLs and model names#

How to estimate your cost#

Pro vs Flash: which to use#

Before you budget

Wrapping up#

Frequently asked questions

Sources & further reading

Related guides

How to Run DeepSeek V4 With Claude Code on WSL

How to Run DeepSeek V4 With Claude Code on Windows

Doubao Seed 2.0 Pro vs DeepSeek V4 for Coding

Fixing something right now?

The model tiers

Cache-hit discounts

Base URLs and model names

How to estimate your cost

Pro vs Flash: which to use

Wrapping up