Sunday, December 28, 2025

DeepSeek-V3 Deep Dive

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures 

DeepSeek-V3 101

RMSNorm is kept in FP32, while the linear layers in the attention and feed-forward blocks run in FP8 mixed precision.
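
A minimal sketch of that precision split, not DeepSeek's actual kernels: the normalization is computed in FP32 regardless of the input dtype, and the linear layer casts its operands to a lower precision before multiplying. NumPy has no FP8 dtype, so float16 stands in for FP8 here; the function names `rms_norm_fp32` and `low_precision_linear` and the `eps` value are assumptions made for illustration.

```python
import numpy as np

def rms_norm_fp32(x, weight, eps=1e-6):
    """RMSNorm computed in float32, whatever dtype the input arrives in."""
    x32 = x.astype(np.float32)
    # Root mean square over the feature (last) axis; eps avoids division by zero.
    rms = np.sqrt(np.mean(x32 * x32, axis=-1, keepdims=True) + eps)
    return (x32 / rms) * weight.astype(np.float32)

def low_precision_linear(x, w, dtype=np.float16):
    """Linear layer with operands rounded to a low-precision dtype.

    float16 is only a stand-in for FP8 (an assumption of this sketch).
    The rounded operands are multiplied and accumulated in float32.
    """
    xq = x.astype(dtype).astype(np.float32)
    wq = w.astype(dtype).astype(np.float32)
    return xq @ wq

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8)).astype(np.float16)   # low-precision activations
gain = np.ones(8, dtype=np.float32)                  # RMSNorm scale
w = rng.standard_normal((8, 4)).astype(np.float32)   # projection weights

normed = rms_norm_fp32(x, gain)          # float32 output
out = low_precision_linear(normed, w)    # float32 result of low-precision operands
```

The point of the split is that normalization involves a reduction and a division that are sensitive to rounding, while the matrix multiplications dominate compute and benefit most from a narrower format.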
