OpenAI is letting some users try out a new ChatGPT feature that uses its artificial intelligence to operate a web browser to book travel, shop for groceries, hunt for bargains, and perform many other online tasks.
The new tool, called Operator, is an AI agent: it relies on an AI model trained on both text and images to interpret commands and figure out how to use a web browser to execute them. OpenAI claims it has the potential to automate many daily tasks and work errands.
OpenAI’s Operator follows rival launches from Google and Anthropic, both of which have already demonstrated agents capable of using the web. AI agents are widely seen as the next evolutionary stage of AI after chatbots, and many companies have already jumped on the hype train by promoting them. In most cases, these agents have very limited capabilities and simply use a language model to automate things that are normally done with regular software.
“AI is evolving from a tool that could answer your questions to one that is also capable of acting in the world, carrying out complex, multi-step workflows,” says Peter Welinder, vice president of product at OpenAI. “We will see a big impact on people’s productivity, but also on the quality of work that people can do.”
OpenAI admits that giving ChatGPT access to a web browser introduces new risks and says that Operator can sometimes misbehave. It says it has implemented several new safeguards and plans to gradually expand Operator’s capabilities.
Welinder and Yash Kumar, engineering and product lead for OpenAI’s Computer Using Agent, say the plan is to learn from how people use the tool. They acknowledge that the tool could make unwanted reservations or purchases, but add that a lot of work has been done to ensure that it asks before doing something risky. “It will come back to me and ask for confirmations before taking action that could be irreversible,” Kumar says.
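The confirm-before-acting behavior Kumar describes can be pictured with a minimal Python sketch. This is an illustration only, not OpenAI's implementation: the `Action` class, the `irreversible` flag, and the `run_plan` function are hypothetical names invented here, and a real agent would have to decide for itself which steps count as risky.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step in an agent's plan (hypothetical structure)."""
    description: str
    irreversible: bool = False  # assumed flag: purchases, bookings, and so on


def run_plan(actions: List[Action], confirm: Callable[[Action], bool]) -> List[str]:
    """Carry out each step, pausing to ask the user before irreversible ones."""
    log = []
    for action in actions:
        # Only irreversible steps trigger a confirmation request.
        if action.irreversible and not confirm(action):
            log.append(f"skipped: {action.description}")
            continue
        log.append(f"done: {action.description}")
    return log


plan = [
    Action("open the booking site"),
    Action("search for trains"),
    Action("purchase ticket", irreversible=True),
]

# Here the user declines every confirmation request.
print(run_plan(plan, confirm=lambda action: False))
```

In this sketch the browsing steps run without interruption, and the purchase goes ahead only if the `confirm` callback returns `True`.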
OpenAI also released a new “system card” today that outlines issues that could arise with Operator. These include the possibility that it could misinterpret commands or deviate from what the user is asking, be misused by users, or be targeted by cybercriminals.
“It also poses an incredible number of security challenges,” Kumar says. “Because your attack vector area and your risk vector area increase quite significantly.”
Operator will initially be available as a “research preview” for ChatGPT users with a Pro account, which costs a hefty $200 per month. The company says it plans to expand access but is rolling out the tool slowly because it will inevitably make some mistakes along the way.
In several demonstrations, Operator showed the potential for AI to take on a more active role as a web helper. The tool has a remote web browser and a chat window to communicate with a user.
At WIRED’s request, Operator was asked to book an Amtrak train trip from New Haven to Washington, DC. It went to the correct website, correctly entered the information necessary to display the schedule, and then requested further instructions. If a user were logged in to the Amtrak website, or used a browser profile with credit card information stored, Operator could go ahead and book a ticket, although it is designed to ask permission first.
Kumar asked Operator to reserve a table at Beretta, a restaurant in San Francisco. The program went to the OpenTable website, found the right restaurant, and searched for availability before asking what to do next. OpenAI says it has partnered with several popular sites, including OpenTable, to ensure Operator works seamlessly on them.
The new tool is based on OpenAI’s GPT-4o AI model, which can perceive a browser and a web page and converse in written text. The tool incorporates additional training designed to help it understand how to execute online tasks. OpenAI will also make its Computer Using Agent available through its API.