The cost of AI in 2030 – why agent-based AI deployment will not be cheap

The projected 90% reduction in inference costs is a façade of technological cheapness: in reality it masks an exponential increase in the computing demands of advanced agent-based systems. For modern organisations, this means abandoning any hope of lower IT bills in favour of strategically managing a growing volume of digital intelligence.


According to recent analysis by Gartner, the cost of inference on AI models with a trillion parameters will fall by more than 90% by 2030. From a spreadsheet perspective, this seems like a harbinger of digital abundance, in which raw computing power becomes an almost free commodity. However, a deeper analysis of market mechanisms and the evolution of the technology itself suggests a very different scenario.

Although the per-unit price of tokens, the units of processed data, is falling dramatically, overall business spending on artificial intelligence is likely to continue its upward trend. This phenomenon, known as the cheap token paradox, is now becoming a key focal point for the digital strategy of modern organisations.

Understanding these dynamics requires looking beyond the technology itself, towards the economics of the providers of large language models. The current market landscape resembles a land-grab phase, with major players such as OpenAI, Google and Anthropic operating at, and often below, the break-even point. Investment in infrastructure and research is enormous, and the optimisation of inference costs cited by Gartner is primarily a route to profitability for these players, rather than a mechanism for lowering prices for the end customer. Efficiencies from better chip design and improved model architecture will allow suppliers to shore up their own balance sheets before real savings are fully passed on to the market.

However, the real cost revolution will not take place in the field of simple queries, but in the area of the next generation of agent-based artificial intelligence solutions. So far, interaction with models has largely followed the assistant paradigm: a tool that responds to a specific command and generates a static response. Today, the market is shifting towards autonomous agents, capable of independent planning, using external tools and correcting their own errors in a decision loop. This qualitative shift has powerful financial implications. Every second of an autonomous agent's work, which has to repeatedly 'think through' a task before acting, consumes many times more tokens than a single user prompt. It is estimated that moving from a simple bot to a task-executing agent increases token demand by a factor of five to thirty. As a result, although the price per thousand tokens becomes symbolic, their massive consumption leaves the final bill unchanged or even higher.
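The arithmetic behind this paradox can be sketched in a few lines. The figures below are illustrative placeholders, using only the two ratios the article gives: a 90% drop in per-token price and a 5–30x rise in tokens consumed per task.

```python
# Back-of-the-envelope: a 90% cut in per-token price can be offset by the
# 5-30x jump in token consumption when a simple assistant is replaced by
# an autonomous agent. All dollar figures are hypothetical.

PRICE_TODAY = 10.00              # assumed cost per 1M tokens today, USD
PRICE_2030 = PRICE_TODAY * 0.10  # after the projected 90% reduction

ASSISTANT_TOKENS = 2_000         # tokens for one prompt/response today
AGENT_MULT_LOW, AGENT_MULT_HIGH = 5, 30  # the article's 5-30x range

def task_cost(tokens: int, price_per_million: float) -> float:
    """Cost of one task given its token count and the per-million price."""
    return tokens / 1_000_000 * price_per_million

cost_today = task_cost(ASSISTANT_TOKENS, PRICE_TODAY)
cost_2030_low = task_cost(ASSISTANT_TOKENS * AGENT_MULT_LOW, PRICE_2030)
cost_2030_high = task_cost(ASSISTANT_TOKENS * AGENT_MULT_HIGH, PRICE_2030)

print(f"Assistant task today:     ${cost_today:.4f}")
print(f"Agent task in 2030 (5x):  ${cost_2030_low:.4f}")
print(f"Agent task in 2030 (30x): ${cost_2030_high:.4f}")
```

At the 30x end of the range, the per-task bill triples despite the 90% price cut, which is exactly the "unchanged or increasing" final bill described above.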

Executives face the challenge of redefining the concept of value in IT projects. A strategy based on the search for the cheapest solution may turn out to be a dead end, leading to systems with low business usability. The key to success becomes the so-called barbell strategy, a two-pronged approach.

On the one hand, organisations should aim to maximise the utilisation of lower-cost, smaller-scale models for routine, repetitive tasks where high-precision reasoning is not critical.

On the other hand, the financial resources thus freed up are worth directing to the 'technological frontier' – the most advanced agent models which, although costly, can generate unique added value that competitors relying on off-the-shelf, cost-optimised solutions cannot easily replicate.
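The two-pronged barbell approach amounts, in practice, to a routing policy: routine work goes to a cheap commodity model, everything else to the frontier tier. A minimal sketch follows; the model names, prices and the task-type heuristic are hypothetical placeholders, not real API identifiers.

```python
# Minimal sketch of 'barbell' model routing: cheap tier for routine tasks,
# frontier tier for high-value agent work. Names and prices are invented.

from dataclasses import dataclass

@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_million_tokens: float  # USD, illustrative

COMMODITY = ModelTier("small-commodity-model", 0.50)
FRONTIER = ModelTier("frontier-agent-model", 15.00)

# Task types considered routine enough for the cheap tier (an assumption;
# a real deployment would use a learned or rule-based classifier).
ROUTINE_TASKS = {"summarise", "classify", "extract", "translate"}

def route(task_type: str) -> ModelTier:
    """Send routine work to the cheap tier, everything else to the frontier."""
    return COMMODITY if task_type in ROUTINE_TASKS else FRONTIER

print(route("classify").name)                    # commodity tier
print(route("plan-logistics-autonomously").name) # frontier tier
```

The design point is that the router, not the model price list, is where cost control lives: lowering the commodity tier's share of traffic by even a few percent moves far more budget than renegotiating token rates.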

The development of inference on edge devices and specialised chips will also be an important factor in the cost structure. Moving some computation directly to laptops, phones or local company servers will allow a degree of independence from the cloud giants, but even here there is a hidden cost: the need to upgrade the hardware fleet and maintain the distributed infrastructure. Deciding what to process 'in-house' and what to process in the cloud will become one of the most important operational competencies of modern CIOs.
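The in-house-versus-cloud decision reduces, in its simplest form, to comparing per-token cloud fees against the fixed hardware and maintenance costs of edge inference. A simplified break-even sketch, with all figures as hypothetical assumptions:

```python
# Simplified edge-vs-cloud break-even: cloud cost scales with token volume,
# edge cost is (roughly) fixed hardware amortisation plus maintenance.
# All dollar amounts below are invented for illustration.

def cloud_monthly_cost(tokens_per_month: float, price_per_million: float) -> float:
    return tokens_per_month / 1_000_000 * price_per_million

def edge_monthly_cost(hardware_amortisation: float, maintenance: float) -> float:
    return hardware_amortisation + maintenance

def cheaper_on_edge(tokens_per_month: float, price_per_million: float,
                    hardware_amortisation: float, maintenance: float) -> bool:
    """True when fixed edge costs undercut volume-based cloud fees."""
    return (edge_monthly_cost(hardware_amortisation, maintenance)
            < cloud_monthly_cost(tokens_per_month, price_per_million))

# At an assumed $1 per 1M tokens and $800/month of fixed edge costs,
# edge only pays off above 800M tokens of monthly volume:
print(cheaper_on_edge(500e6, 1.0, 600, 200))    # 800 fixed vs 500 cloud
print(cheaper_on_edge(2_000e6, 1.0, 600, 200))  # 800 fixed vs 2000 cloud
```

The same comparison, run per workload rather than per company, is what the routing competency described above looks like in spreadsheet terms.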

Ultimately, the role of the technology leader is evolving from resource manager to intellectual efficiency strategist. Instead of focusing on negotiating token rates, the focus should be on optimising the return on each unit of computing invested in business processes. Indeed, the cheapness of technology is only an opportunity to increase the complexity of the tasks performed. If a company in 2030 spends as much on artificial intelligence as it does today, but in return receives full autonomy of logistical processes instead of a simple report generator, this will represent a triumph of strategy over pure accounting.

In this context, Gartner’s predictions should not be read as an announcement of budget cuts, but as a signal to prepare organisations for an unprecedented increase in the appetite for data. The future belongs to entities that understand that in the knowledge economy, the most expensive resource is no longer the technology itself, but the ability to scale it properly where it brings real market advantage.

