DeepSeek-V4 Preview: Entering the Era of Affordable Million-Token Context

2026-04-24 阅读量：203 AI Tools 0

Today, the preview version of our new series model DeepSeek-V4 is officially launched and open-sourced simultaneously.

DeepSeek-V4 features an ultra-long context of one million tokens, achieving leading performance in Agent capabilities, world knowledge, and reasoning among domestic and open-source models. The model comes in two sizes:

2026-04-24T07:51:16.png

Starting today, visit the official website chat.deepseek.com or the official App to converse with the latest DeepSeek-V4 and explore the new experience of 1M ultra-long context memory. The API service has also been updated — simply change model_name to deepseek-v4-pro or deepseek-v4-flash to call it.

DeepSeek-V4-Pro: Performance rivaling top closed-source models

2026-04-24T07:51:40.png

Significantly improved Agent capabilities: Compared to previous generations, DeepSeek-V4-Pro's Agent capabilities have been markedly enhanced. In Agentic Coding evaluations, V4-Pro has achieved the best level among current open-source models, and also performs excellently in other Agent-related evaluations. Currently, DeepSeek-V4 has become the Agentic Coding model used internally by company employees. According to evaluation feedback, the user experience is better than Sonnet 4.5, and delivery quality is close to Opus 4.6 non‑thinking mode, though there is still a certain gap compared to Opus 4.6 thinking mode.

Rich world knowledge: In world knowledge evaluations, DeepSeek-V4-Pro significantly outperforms other open-source models, falling only slightly behind the top closed-source model Gemini-Pro-3.1.

World-class reasoning performance: In evaluations of mathematics, STEM, and competitive coding, DeepSeek-V4-Pro surpasses all currently published open-source models, achieving outstanding results comparable to the world's top closed-source models.

DeepSeek-V4-Flash: A faster, more economical option

Compared to DeepSeek-V4-Pro, DeepSeek-V4-Flash has slightly less world knowledge, but demonstrates close reasoning capabilities. Due to its smaller parameter count and activation size, V4-Flash provides faster and more cost-effective API services.

In Agent evaluations, DeepSeek-V4-Flash performs on par with DeepSeek-V4-Pro on simple tasks, but still lags behind on highly difficult tasks.

Structural innovation and ultra-high context efficiency

DeepSeek-V4 introduces a brand‑new attention mechanism that compresses along the token dimension. Combined with DSA (DeepSeek Sparse Attention), it achieves world‑leading long‑context capabilities while significantly reducing compute and VRAM requirements compared to traditional methods. From now on, 1M (one million) tokens of context will be the standard for all official DeepSeek services.

DeepSeek-V4 vs DeepSeek-V3.2: changes in computation and memory usage with context length

Specialized optimization for Agent capabilities

DeepSeek-V4 has been adapted and optimized for mainstream Agent products such as Claude Code, OpenClaw, OpenCode, and CodeBuddy, showing improved performance in coding tasks, document generation, and more. The image below shows an example of a slide generated by V4-Pro using a certain Agent framework:

(Swipe up/down or click to enlarge)

API Access

The DeepSeek API now supports both V4-Pro and V4-Flash, compatible with the OpenAI ChatCompletions interface and the Anthropic interface. To access the new models, keep the base_url unchanged and change the model parameter to deepseek-v4-pro or deepseek-v4-flash.

[Image]

Both V4-Pro and V4-Flash have a maximum context length of 1M, and support both non‑thinking mode and thinking mode. The thinking mode supports the reasoning_effort parameter to set thinking intensity (high/max). For complex Agent scenarios, it is recommended to use thinking mode with intensity set to max. Please refer to the API documentation for model invocation and parameter adjustment:

https://api-docs.deepseek.com/guides/thinking_mode

Please note: The two old API model names deepseek-chat and deepseek-reasoner will be deprecated after three months (2026-07-24). During the current transitional period, these two model names point to the non‑thinking mode and thinking mode of deepseek-v4-flash, respectively.

Open‑source weights and local deployment

DeepSeek-V4 model open‑source links:

https://huggingface.co/collections/deepseek-ai/deepseek-v4

https://modelscope.cn/collections/deepseek-ai/DeepSeek-V4

DeepSeek-V4 technical report:

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf

DeepSeek V4, DeepSeek, deepseek-v4-flash, deepseek-v4-pro