DeepSeek-V2

Second-generation MoE model; 236B total parameters (21B active per token); 128K context window; 42.5% lower training cost than DeepSeek 67B

Released in May 2024, DeepSeek-V2 was the second major iteration of DeepSeek's foundation models and is built on a Mixture-of-Experts (MoE) architecture. Of its 236 billion total parameters, only 21 billion are active for any given token, so the compute spent per token is far lower than for a dense model of the same total size. The model supports a 128,000-token context window, which lets it take in much longer inputs than its predecessor. Compared with the earlier dense DeepSeek 67B model, the DeepSeek-V2 technical report cites a 42.5% saving in training cost and up to 5.76 times higher maximum generation throughput, while maintaining competitive performance across benchmarks. This release established the architectural pattern that carried forward into V3 and subsequent versions.
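To make the "total versus active parameters" distinction concrete, here is a minimal sketch of generic top-k expert routing in Python. It is an illustration only, not DeepSeek-V2's actual router: the function name `top_k_route`, the eight router scores, and the choice of `k=2` are all hypothetical. The only figures taken from the listing are the 236B and 21B parameter counts.

```python
import math

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Generic top-k gating for illustration; not DeepSeek-V2's real router.
    """
    # Pick the k experts with the largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Figures from the listing: 236B total parameters, 21B active per token.
TOTAL_PARAMS_B = 236
ACTIVE_PARAMS_B = 21
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Hypothetical router scores for 8 experts; only 2 experts run for this token.
routes = top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 0.9], k=2)

print(f"Active share of parameters: {active_fraction:.1%}")
print("Selected experts and weights:", routes)
```

Because each token is processed by only the selected experts, the per-token cost tracks the active parameter count rather than the total, which is the source of the efficiency gain described above.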

More models from DeepSeek

DeepSeek-V3.1-Terminus

DeepSeek-V3.2-Exp

DeepSeek-V3.2

DeepSeek-V3.1

DeepSeek-V3.2-Speciale

DeepSeek-Math

DeepSeek-Coder-V2

DeepSeek-V2.5

DeepSeek-Coder

DeepSeek-LLM

DeepSeek-MoE

DeepSeek-V3