In 2025, entrepreneurs will launch an avalanche of applications based on artificial intelligence. Finally, generative AI will live up to the hype with a new generation of affordable applications for consumers and businesses. This is not the current consensus opinion. OpenAI, Google, and xAI are locked in an arms race to train the most powerful large language model (LLM) in pursuit of artificial general intelligence, known as AGI, and their gladiatorial battle dominates the mindshare and revenue of the nascent generative-AI ecosystem.
For example, Elon Musk raised $6 billion to launch newcomer xAI and purchased 100,000 Nvidia H100 GPUs, the expensive chips used to process AI workloads, spending more than $3 billion to train his model, Grok. At those prices, only technology tycoons can afford to build these giant LLMs.
The incredible spending by companies like OpenAI, Google, and xAI has created an unbalanced ecosystem that is bottom-heavy and top-light. LLMs trained on these huge GPU farms are also typically very expensive for inference, the process of entering a prompt and generating a response, which is built into every application that uses AI. It’s as if everyone had a 5G smartphone, but data usage were too expensive for anyone to watch a TikTok video or browse social media. As a result, excellent LLMs with high inference costs have made the proliferation of killer applications unaffordable.
This unbalanced ecosystem of ultra-rich tech moguls fighting each other has enriched Nvidia while forcing app developers into a vicious cycle: use a low-cost, low-performance model bound to disappoint users, or face exorbitant inference costs and risk losing money.
In 2025, a new approach will emerge that can change all that. It will draw on what we learned from previous technological revolutions, such as the Intel-and-Windows PC era or the Qualcomm-and-Android mobile era, where Moore’s Law improved PCs and applications year after year, and falling bandwidth costs did the same for mobile phones and applications.
But what about the high cost of inference? A new law for AI inference is just around the corner: the cost of inference has been falling by a factor of ten per year, driven by new AI algorithms, inference technologies, and better chips at lower prices.
As a point of reference, if a third-party developer had used OpenAI’s best models to build AI search in May 2023, the cost would have been approximately $10 per query, while Google’s non-AI search costs about $0.01, a 1,000-fold difference. But by May 2024, the price of OpenAI’s top model had dropped to about $1 per query. With this unprecedented tenfold annual price drop, application developers will be able to use higher-quality, lower-cost models, leading to a proliferation of AI applications over the next two years.
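The arithmetic behind this trend can be sketched in a few lines. The snippet below is a hypothetical projection only: the $10-per-query figure for May 2023, the roughly $1 figure for May 2024, and the $0.01 non-AI search baseline come from the paragraph above, while the assumption that the tenfold annual decline continues beyond 2024 is illustrative, not a forecast from any provider.

```python
# Hypothetical projection of per-query AI inference cost, assuming the
# tenfold annual decline described above continues. Starting point and
# search baseline are the article's figures; future years are illustrative.

AI_COST_2023 = 10.00   # $/query, OpenAI's best model, May 2023
SEARCH_COST = 0.01     # $/query, traditional non-AI search
ANNUAL_DECLINE = 10    # cost divides by this factor each year

def projected_cost(year: int) -> float:
    """Projected dollars per query in May of the given year."""
    return AI_COST_2023 / (ANNUAL_DECLINE ** (year - 2023))

for year in range(2023, 2027):
    cost = projected_cost(year)
    ratio = cost / SEARCH_COST
    print(f"May {year}: ${cost:.4f}/query ({ratio:.0f}x non-AI search)")
```

Under these assumptions, AI inference reaches rough price parity with traditional search within about three years of the 2023 baseline, which is the economic shift the article argues will unlock affordable AI applications.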