Chinese startup DeepSeek is shedding new light on the economics of the race for supremacy in artificial intelligence. In a publication in the prestigious journal Nature, the company revealed that the cost of training its R1 model, focused on reasoning, was just $294,000.
This amount is a fraction of the figures associated with the US giants, which are rumoured to spend more than $100 million on training a single flagship model.
This information reignites the debate about China’s real position in the global AI competition and challenges the narrative that building foundational models is reserved exclusively for players with almost unlimited budgets.
According to the article’s authors, a cluster of 512 Nvidia H800 GPUs was used to train R1 for 80 hours. The choice of hardware is no accident. The H800 is a variant that Nvidia designed specifically for the Chinese market after the US blocked the export of the more powerful A100 and H100 units in 2022.
However, DeepSeek’s access to advanced chips has raised questions. US officials have suggested that the company may have sourced more powerful H100 chips after the restrictions were imposed.
In a document accompanying the publication, DeepSeek acknowledged that it used legally owned A100 chips in the early, preparatory stages of development, but said that the main R1 training took place on H800 units.
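To put the reported figures in perspective, the short Python sketch below works out the total GPU-hours and the cost per GPU-hour they imply. It assumes that the $294,000 covers exactly the 512-GPU, 80-hour run described above and nothing else; the article does not say how the cost was calculated, so the resulting rate is an arithmetic consequence of that assumption, not a figure reported by DeepSeek.

```python
# Figures as reported in the article.
TRAINING_COST_USD = 294_000
GPU_COUNT = 512
TRAINING_HOURS = 80


def gpu_hours(gpu_count: int, hours: float) -> float:
    """Total GPU-hours consumed by a run on `gpu_count` GPUs for `hours`."""
    return gpu_count * hours


def implied_rate(cost_usd: float, gpu_count: int, hours: float) -> float:
    """Cost per GPU-hour, ASSUMING the cost covers only this run."""
    return cost_usd / gpu_hours(gpu_count, hours)


total_gpu_hours = gpu_hours(GPU_COUNT, TRAINING_HOURS)
rate = implied_rate(TRAINING_COST_USD, GPU_COUNT, TRAINING_HOURS)

print(f"GPU-hours: {total_gpu_hours:,}")            # GPU-hours: 40,960
print(f"Implied cost per GPU-hour: ${rate:.2f}")    # Implied cost per GPU-hour: $7.18
```

Under the same assumption, the rumoured $100 million budgets would be roughly 340 times larger than the reported R1 figure.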
The second important thread is the technique of so-called model distillation, a process in which one AI system learns from data generated by another, more advanced model. This is a method that can significantly reduce cost and development time, but is seen as controversial when done without the consent of the original model creator.
DeepSeek has faced allegations that it deliberately ‘distilled’ OpenAI models. The company has addressed these suggestions in a recent publication, admitting that the training data for one of its models contained a “significant number” of responses generated by OpenAI systems.
However, DeepSeek representatives maintain that this was not a deliberate act, but merely an accidental side effect of indexing publicly available web content.
The disclosure of low costs, coupled with explanations of the hardware and training methods used, is a strategic move by DeepSeek. The startup is demonstrating that it can create competitive solutions much more cheaply. That could in future change the dynamics of the entire AI market, which has so far been dominated by a narrative of gigantic costs and high barriers to entry.