When Chinese startup DeepSeek unveiled its R1 language model, many experts were astonished. The model rivalled the latest designs from OpenAI and Anthropic and cost significantly less to train. That success, however, has proved difficult to repeat – according to reporting by The Information, development of its successor, the R2 model, was halted due to a shortage of Nvidia GPUs.
DeepSeek built its success on hardware at a huge scale – the R1 model was trained on 50,000 Hopper GPUs, including 10,000 H100, 10,000 H800 and 30,000 H20 chips. The H20 – a variant designed specifically for export to China – is particularly difficult to obtain today. Since the US imposed further export restrictions, Chinese companies have struggled to access even cut-down versions of Nvidia’s GPUs. DeepSeek has already committed most of its available capacity to serving demand from local companies and government agencies.
The shortage is affecting not only the plans for R2 but also the current performance of R1. Users are reporting drops in the quality of the model’s output, which may indicate system overload. The company is trapped: without new GPUs it cannot develop the next model, and declining performance is discouraging potential customers.
While Chinese manufacturers such as Huawei offer alternative AI accelerators, their performance still lags behind Nvidia’s chips. To make matters worse, they are not compatible with the popular CUDA ecosystem, further complicating model and infrastructure migration.
For DeepSeek, this is a serious problem. The company had a chance to become the local equivalent of OpenAI, but without continued access to advanced hardware, it may lose momentum. In practice, this confirms a broader problem for the Chinese AI ecosystem – limitations in access to semiconductor technology translate into difficulties in scaling models and services.
In the context of the global AI arms race, the delay of the R2 model demonstrates the importance of supply chains and the dominance of a few hardware suppliers. Even the best-designed model doesn’t stand a chance without the right computing infrastructure.