Google unveils the AI models that will power humanoid robots
Google DeepMind has introduced two new AI models built on Gemini 2.0: Gemini Robotics and Gemini Robotics-ER.

Google announced its first Gemini AI model in 2023. Since then, the company has been working to make its models more advanced: from reasoning models to image generation, Google’s Gemini lineup now covers it all. The company has now announced its next AI models, which aim to power robots and help them perform tasks like humans: Gemini Robotics and Gemini Robotics-ER. In its blog post, Google says, “In order for AI to be useful and helpful to people in the physical world, they have to demonstrate ‘embodied’ reasoning, the humanlike ability to comprehend and react to the world around us, as well as safely take action to get things done.”

Let’s dig deeper into these models.
Gemini Robotics: What is it, and how does it work?
Google explains that Gemini Robotics is “an advanced vision-language-action (VLA) model that was built on Gemini 2.0 with the addition of physical actions as a new output modality for the purpose of directly controlling robots.”
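Google has not released a public developer API for these models, so as a purely illustrative sketch, here is what a vision-language-action interface boils down to in Python: pixels and text in, physical actions out. Every name below (VLAModel, Action, act) is invented for this example and does not come from Google.

import dataclasses

@dataclasses.dataclass
class Action:
    # One low-level command for a bi-arm robot (hypothetical format).
    left_arm_joint_deltas: list
    right_arm_joint_deltas: list
    gripper_closed: bool

class VLAModel:
    # Stand-in for a vision-language-action model such as Gemini Robotics.
    def act(self, camera_image, instruction):
        # A real VLA model decodes the image and instruction into action
        # tokens; this stub only shows the interface: a camera frame and
        # text go in, a short burst of physical actions comes out.
        return [Action([0.0] * 7, [0.0] * 7, gripper_closed=False)]

model = VLAModel()
actions = model.act(camera_image=b"<jpeg bytes>", instruction="open the drawer")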
The new model brings progress in three key areas that Google DeepMind believes are required for useful robots: generality, interactivity and dexterity. With its ability to adapt to new situations, Gemini Robotics is more effective at engaging with people and its surroundings. It is also capable of performing delicate physical tasks, such as folding origami paper or opening a bottle cap.

The model understands and responds to a much wider range of natural-language instructions, adjusting its behaviour based on your input. It continuously monitors its surroundings, detects changes to its environment or its instructions, and adapts its actions accordingly. This level of control, or “steerability”, allows for better collaboration with robot assistants in all kinds of settings, from the home to the workplace.
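As a rough illustration of that closed loop, here is a hypothetical control loop that reuses the VLAModel stub from the sketch above; read_camera, read_user_input and send_to_robot are likewise invented stand-ins for real robot I/O.

import time

def read_camera():
    return b"<jpeg bytes>"   # stub: the latest camera frame

def read_user_input():
    return None              # stub: a new spoken or typed instruction, if any

def send_to_robot(actions):
    pass                     # stub: forward commands to the hardware

def control_loop(model, steps=3, hz=10.0):
    instruction = "pack the lunch box"
    for _ in range(steps):
        # Steerability: a changed instruction takes effect mid-task.
        new_instruction = read_user_input()
        if new_instruction:
            instruction = new_instruction
        # A fresh observation every cycle means a moved or dropped object
        # is noticed and replanned around, as described above.
        send_to_robot(model.act(read_camera(), instruction))
        time.sleep(1.0 / hz)

control_loop(VLAModel())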

Google also said that since robots, just like humans, come in all shapes and sizes, Gemini Robotics is designed to adapt to them. It states, “We trained the model primarily on data from the bi-arm robotic platform, ALOHA 2, but we also demonstrated that it could control a bi-arm platform based on the Franka arms used in many academic labs.”

Gemini Robotics-ER
Alongside Gemini Robotics, Google has also introduced Gemini Robotics-ER. This model improves Gemini’s understanding of the world in the ways robotics requires, especially spatial reasoning, and enables roboticists to integrate it with their existing low-level controllers.

Gemini Robotics-ER significantly enhances Gemini 2.0’s existing capabilities, such as pointing and 3D detection. By combining spatial reasoning with Gemini’s coding abilities, it can put together entirely new capabilities on the fly. For example, when shown a coffee mug, the model can determine an appropriate two-finger grasp for picking it up by the handle and plan a safe trajectory for approaching it.
Gemini Robotics-ER can perform all the steps necessary to control a robot directly, including perception, state estimation, spatial understanding, planning and code generation.
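As a loose sketch of how those stages fit together, here is the coffee-mug example from above written out as plain Python; the function, the coordinates and the returned plan are all invented for illustration, and a real system would emit executable robot code rather than a dictionary.

def plan_mug_pickup(image):
    # 1. Perception: locate the mug and its handle in the camera frame
    #    (stubbed here with fixed coordinates in metres, robot frame).
    handle = (0.42, -0.10, 0.15)

    # 2. State estimation and spatial understanding: choose a two-finger
    #    grasp on either side of the handle.
    grasp_points = [(0.42, -0.09, 0.15), (0.42, -0.11, 0.15)]

    # 3. Planning: a safe approach path that descends onto the handle
    #    without clipping the mug body.
    approach = [(0.42, -0.10, 0.30), (0.42, -0.10, 0.20), handle]

    # 4. Code generation: the model would emit control code for a
    #    low-level controller here; this sketch returns the plan instead.
    return {"grasp_points": grasp_points, "approach": approach}

print(plan_mug_pickup(b"<jpeg bytes>"))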