OpenAI's agent that can do work for you is here

OpenAI launched a research preview of its AI agent, Operator, which can perform tasks on the web

Sam Altman at Microsoft Build in Seattle, Washington on May 21, 2024.
Photo: Jason Redmond/AFP (Getty Images)

OpenAI has launched a research preview of its artificial intelligence agent, Operator, which can perform tasks on the web on behalf of users.

Operator uses its own browser, and can interact with a webpage by typing, clicking, and scrolling, OpenAI said. Users can have Operator do tasks such as completing online forms and grocery shopping, according to the startup.

The AI agent is powered by a new OpenAI model called Computer-Using Agent (CUA), which combines vision capabilities from OpenAI’s multimodal GPT-4o model with advanced reasoning developed through reinforcement learning. CUA was trained to interact with graphical user interfaces, or GUIs, including elements such as buttons and text fields on webpages. Because Operator has “reasoning” skills, it can “self-correct” and hand control back to users when it needs help.

The research preview is available only to ChatGPT Pro users in the U.S. for now, OpenAI said, because it has “limitations and will evolve based on user feedback.” One such limitation, the startup said, is “challenges with complex interfaces like creating slideshows or managing calendars.”

The startup plans to roll the AI agent out to other ChatGPT users, and eventually integrate Operator’s capabilities into the chatbot.

Operator was designed “to refuse harmful requests and block disallowed content,” OpenAI said, adding that its moderation systems can issue warnings and revoke access after repeated violations. The AI agent “is trained to ensure that the person using it is always in control and asks for input at critical points,” the startup added.

For example, Operator will prompt the user to take over when it needs to enter sensitive information, such as login credentials or credit card details.
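For readers curious how such a handoff might look in practice, here is a minimal Python sketch. It is purely illustrative and is not OpenAI's implementation: the `Action` type, the `SENSITIVE_FIELDS` labels, and the `run_task` and `ask_user` names are all hypothetical, chosen only to show an agent performing routine steps itself and passing sensitive ones to the person using it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Assumed labels for sensitive inputs; the article mentions logins and card details.
SENSITIVE_FIELDS = {"password", "credit_card"}


@dataclass
class Action:
    kind: str    # "type", "click", or "scroll", the interactions the article names
    target: str  # hypothetical label for the on-page element


def run_task(actions: Iterable[Action], ask_user: Callable[[Action], None]) -> list:
    """Walk through planned actions, handing sensitive typing steps to the user."""
    log = []
    for action in actions:
        if action.kind == "type" and action.target in SENSITIVE_FIELDS:
            # Hand control back instead of entering the value automatically.
            ask_user(action)
            log.append(f"user:{action.target}")
        else:
            log.append(f"agent:{action.kind}:{action.target}")
    return log


# Example: a made-up checkout flow in which the card number is left to the user.
plan = [
    Action("click", "checkout"),
    Action("type", "credit_card"),
    Action("click", "submit"),
]
handoffs = []
log = run_task(plan, ask_user=handoffs.append)
```

In this toy run, the agent handles the two clicks, while the card entry is routed to the `ask_user` callback, which here simply records the request.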

“While Operator is designed with these safeguards, no system is flawless and this is still a research preview,” OpenAI said.