
Elon Musk says all human data for AI training is ‘exhausted’


Artificial intelligence companies have run out of data to train their models and have “exhausted” the sum of human knowledge, said Elon Musk.

The world’s richest person suggested that technology companies would have to turn to “synthetic” data – or material created by AI models – to build and fine-tune new systems, a process already under way as the technology develops at a fast pace.

“The accumulated sum of human knowledge has been exhausted in AI training. That happened basically last year,” Musk said in an interview broadcast live on his social media platform, X.

AI models, such as the GPT-4o model that powers the ChatGPT chatbot, are “trained” on a wide range of data taken from the internet, where they in effect learn to spot patterns in that information, allowing them to predict, for example, the next word in a sentence.
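
The idea of learning patterns in order to predict the next word can be shown with a toy sketch. It is not how GPT-4o or any production model works: it simply counts which word follows which in a tiny made-up corpus, and the function names `train_bigram` and `predict_next` are hypothetical, chosen only for this illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up training text, standing in for "data taken from the internet".
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram(corpus)

print(predict_next(model, "the"))  # prints "cat" (follows "the" twice)
print(predict_next(model, "dog"))  # prints "None" (never seen in training)
```

The second call hints at the data problem described in the article: a model of this kind has nothing to say about words it has never seen, so its usefulness is bounded by the text it was trained on.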

Musk said the “only way” to counter the lack of source material to train new models was to move to synthetic data created by AI.

Referring to data depletion, he said, “The only way to supplement that is with synthetic data where… you’ll write an essay or put together a thesis and then you’ll grade yourself and… you’ll go through this self-study process.”

Meta, the owner of Facebook and Instagram, has used synthetic data to refine its larger Llama AI model, while Microsoft has also used AI-created content for its Phi-4 model. Google and OpenAI, the company behind ChatGPT, have also used synthetic data in their AI work.

However, Musk also warned that AI models’ habit of generating “hallucinations” (a term for inaccurate or meaningless results) was a danger to the data synthesis process.

He said in the live-broadcast interview with Mark Penn, president of the Stagwell advertising group, that hallucinations had made the process of using artificial material “challenging” because “how do you know if… the answer is a hallucination or it’s a real answer?”
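
The generate-then-grade loop Musk describes, and the difficulty hallucinations pose for it, can be sketched with a toy example. This is an illustration under strong assumptions, not any company's actual pipeline: the "model" is a hypothetical generator of addition problems that is wrong some of the time, and grading is easy only because arithmetic can be checked exactly. For essays or factual claims no such simple checker exists, which is the challenge he points to.

```python
import random

def generate_candidates(n, error_rate, seed=0):
    """Hypothetical stand-in for a model producing synthetic Q&A pairs.

    Each sample is (a, b, answer); with probability `error_rate` the
    answer is deliberately wrong, simulating a hallucination.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        a, b = rng.randint(1, 99), rng.randint(1, 99)
        answer = a + b
        if rng.random() < error_rate:
            answer += rng.choice([-2, -1, 1, 2])  # simulated hallucination
        samples.append((a, b, answer))
    return samples

def grade(sample):
    """Self-grading step: keep a sample only if its answer checks out."""
    a, b, answer = sample
    return a + b == answer

candidates = generate_candidates(n=1000, error_rate=0.2)
kept = [s for s in candidates if grade(s)]
print(f"kept {len(kept)} of {len(candidates)} synthetic examples")
```

Only the samples that pass `grade` would be fed back into training. The sketch works because the grader is independent of the generator; if the same fallible model both writes and marks the answer, a hallucination can pass unnoticed.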

High-quality data, and control over it, is one of the legal battlegrounds in the rise of AI. OpenAI admitted last year that it would be impossible to create tools like ChatGPT without access to copyrighted material, while creative industries and publishers are demanding compensation for the use of their output in the model training process.
