Nvidia, which has so far positioned itself as a hardware hegemon, is taking ever bolder control of the software layer that binds the world’s supercomputers together. The December acquisition of SchedMD, the company behind the open-source Slurm workload manager, goes far beyond a routine portfolio addition. Questions are growing in Silicon Valley: will the chip market leader remain a neutral custodian of a shared resource, or will it turn Slurm into a fast lane open mainly to its own architecture?
Control over the scheduling of computing power
Slurm is the silent engine of the AI revolution. It is the software that manages resources on 60% of the world’s supercomputers, deciding which jobs run on which processors and when. Without it, training models such as Anthropic’s Claude or Meta’s models efficiently would be almost impossible. Traditionally associated with weather forecasting and nuclear research, Slurm has become a critical link in the commercial AI arms race.
Risk of hidden optimisation
For Nvidia’s competitors, such as AMD and Intel, the acquisition is a wake-up call. Although Nvidia says it will keep the code open, industry experts point to a subtle threat: optimisation asymmetry. If updates that support new features on Nvidia chips arrive sooner or run more efficiently than those for rival hardware, Slurm will de facto become part of a ‘walled garden’.
Nvidia’s 2022 purchase of Bright Computing shows that these concerns are not unfounded. While that software remained cross-platform in theory, market practice suggests that Nvidia’s own ecosystem benefits most from it.
A credibility test for the giant
From a business perspective, Nvidia faces an image dilemma. On the one hand, the company has the capital to breathe new life into the somewhat ossified Slurm codebase, which could accelerate innovation across the industry. On the other hand, any attempt to favour its own technologies, such as InfiniBand networking, could trigger a defensive reaction from the market and a flight of customers towards alternatives, such as those based on Google technologies.
For decision-makers in AI data centres, the coming months will reveal Jensen Huang’s intentions. The first test will be how quickly Slurm, under its new owner, adds support for AMD’s upcoming chips. In the world of high-performance computing, where seconds of latency cost millions of dollars, software neutrality is not just a matter of ethics; it is the foundation of fair competition.
