OpenAI launches Operator, its first AI agent that works on the web
OpenAI’s Operator is currently a research preview and is only available to ChatGPT Pro users in the US.
OpenAI has released a new AI agent, Operator, which can navigate the web to perform tasks for the user. “Using its own browser, it can view a webpage and interact with it by typing, clicking and scrolling,” the company explains. It is OpenAI’s first AI agent that can act independently, working essentially as a middleman that completes tasks on the user’s behalf by interacting with websites. For now, Operator is a research preview, which means it has limitations and OpenAI is seeking feedback on it. This version is available now to ChatGPT Pro users in the US.
Operator and its operations
Operator can be tasked with a wide range of repetitive browser activities, such as filling out forms, ordering groceries, or even creating memes. Because it uses the same interfaces and tools that people use every day, it broadens the practical applications of AI, helping individuals save time on routine tasks while creating new opportunities for businesses to connect with customers.
OpenAI plans to expand access to Plus, Team, and Enterprise users and to integrate these capabilities into ChatGPT in the future.
In its blog post, OpenAI says, “Operator is powered by a new model called Computer-Using Agent (CUA). By combining GPT-4o’s vision capabilities with advanced reasoning through reinforcement learning, CUA is trained to interact with graphical user interfaces (GUIs) – the buttons, menus, and text fields that people see on screen.”
Operator can “view” the browser through screenshots and “interact” with it using standard mouse and keyboard actions, allowing it to work on the web without the need for custom API integrations. But does it handle errors and hallucinations any better? OpenAI explains that if Operator encounters challenges or makes mistakes, it can use its reasoning abilities to correct itself. If it reaches an impasse and needs help, it hands control back to the user, keeping the experience collaborative.
Much like using multiple tabs in a browser, users can have Operator run multiple tasks simultaneously by creating new conversations.
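The observe-then-act cycle described above can be pictured with a small simulation. This is only an illustrative sketch, not OpenAI's implementation: `FakeBrowser`, `Action`, `choose_action`, and `run_agent` are hypothetical names, the "screenshot" is a plain dictionary rather than pixels, and the hard-coded policy stands in for the CUA model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", "done", or "handoff"
    target: str = ""   # name of the field or button the action applies to
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser showing a two-field form."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would receive an image; this returns the state directly.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def choose_action(observation: dict, task: dict) -> Action:
    """Toy policy (assumption): fill the first empty field, then submit."""
    for name, value in observation["fields"].items():
        if not value:
            if name not in task:
                # The agent lacks the information it needs: ask the user.
                return Action("handoff", target=name)
            return Action("type", target=name, text=task[name])
    if not observation["submitted"]:
        return Action("click", target="submit")
    return Action("done")


def run_agent(browser: FakeBrowser, task: dict, max_steps: int = 10) -> str:
    """Loop: observe the page, pick one action, apply it, repeat."""
    for _ in range(max_steps):
        action = choose_action(browser.screenshot(), task)
        if action.kind in ("done", "handoff"):
            return action.kind
        browser.apply(action)
    return "handoff"  # step budget exhausted: return control to the user
```

Calling `run_agent(FakeBrowser(), {"name": "Ada", "email": "ada@example.com"})` fills both fields and submits the form, while a task missing the email value ends in a handoff, loosely mirroring the point at which Operator is said to return control to the user.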
o3-mini will come to free users
As OpenAI introduced Operator, its first AI agent, CEO Sam Altman said on X (formerly Twitter) that the company has decided to offer o3-mini in the free tier.
o3-mini marks a significant advancement over its predecessor, o1-mini, adding stronger reasoning capabilities that enable step-by-step logical analysis. The model has yet to see the light of day. According to Altman, it could launch within two weeks.