
OpenAI offers a look inside ChatGPT

ChatGPT developer OpenAI’s approach to building artificial intelligence came under fire this week from former employees who accuse the company of taking unnecessary risks with technology that could become harmful.

Today OpenAI published a new research paper seemingly intended to show that it is serious about combating AI risk by making its models more explainable. In the paper, the company's researchers present a way to look inside the artificial intelligence model that powers ChatGPT. They devised a method to identify how it stores certain concepts, including those that could perhaps cause an AI system to misbehave.

While the research makes OpenAI’s work to keep AI under control more visible, it also highlights recent turmoil at the company. The new research was conducted by the recently disbanded “superalignment” team at OpenAI that was dedicated to studying the long-term risks posed by the technology.

Ilya Sutskever and Jan Leike, both members of the former group who have since left OpenAI, are named as co-authors. Sutskever, the company's co-founder and former chief scientist, was among the board members who voted to fire OpenAI CEO Sam Altman last November, leading to a chaotic few days that culminated in Altman's return as leader.

ChatGPT is powered by a family of large language models called GPT, based on a machine learning approach known as artificial neural networks. These mathematical networks have demonstrated great power in learning useful tasks by analyzing example data, but their operation cannot be easily examined in the way conventional computer programs can. The complex interaction between layers of "neurons" within an artificial neural network makes reverse engineering why a system like ChatGPT came up with a particular response very challenging.

"Unlike most human creations, we don't really understand the inner workings of neural networks," the researchers behind the work write in an accompanying blog post. Some prominent AI researchers believe that more powerful AI models, including ChatGPT, could perhaps be used to design chemical or biological weapons and coordinate cyberattacks. A longer-term concern is that AI models may choose to hide information or act in harmful ways in order to achieve their goals.

The new OpenAI paper describes a technique that pares back the mystery a little, identifying patterns that represent specific concepts inside a machine learning system with the help of an additional machine learning model. The key innovation lies in making the network used to peer inside the system of interest more efficient at identifying those concepts.
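To make the idea concrete, the sketch below shows the general shape of such an additional model, a sparse autoencoder that decomposes a larger model's internal activations into an overcomplete set of sparsely active features, each of which may line up with a human-interpretable concept. This is a minimal illustration of the technique, not OpenAI's code; all names, sizes, and the training loop are assumptions.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        # Encoder maps activations into a much larger feature space.
        self.encoder = nn.Linear(d_model, n_features)
        # Decoder reconstructs the original activation from those features.
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, activations: torch.Tensor):
        # ReLU keeps features non-negative; the L1 penalty in the loss below
        # pushes most of them to zero, so each input activates only a few.
        features = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(features)
        return features, reconstruction

# Hypothetical training step: reconstruct activations faithfully while
# penalizing how many features are active (the sparsity term).
sae = SparseAutoencoder(d_model=768, n_features=16384)
optimizer = torch.optim.Adam(sae.parameters(), lr=1e-4)

activations = torch.randn(32, 768)  # stand-in for activations captured from an LLM
features, reconstruction = sae(activations)
loss = ((reconstruction - activations) ** 2).mean() + 1e-3 * features.abs().mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```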

OpenAI demonstrated the approach by identifying patterns that represent concepts inside GPT-4, one of its largest AI models. The company released code related to the interpretability work, along with a visualization tool that can be used to see how words in different sentences activate concepts, including profanity and erotic content, in GPT-4 and another model. Knowing how a model represents certain concepts could be a step toward being able to dial down those associated with unwanted behavior, to keep an AI system on track. It could also make it possible to tune an AI system to favor certain topics or ideas.
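As a rough illustration of what such a visualization rests on, and continuing the sketch above, a trained autoencoder's features can be scored token by token to see which words in a sentence fire a given concept. Which feature index corresponds to a concept like profanity has to be discovered by inspection; the index and data here are purely hypothetical.

```python
# Score each token's activation against one learned feature of interest.
with torch.no_grad():
    token_activations = torch.randn(12, 768)   # one activation vector per token
    features, _ = sae(token_activations)
    concept_idx = 4242                         # hypothetical feature of interest
    per_token_scores = features[:, concept_idx]
    print(per_token_scores)                    # high scores mark tokens that fire the concept
```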