Google says DeepMind’s Genie 2 can create endless 3D worlds for users
Google’s new AI model generates entire interactive 3D worlds rather than just images, raising the possibility that future games could create their environments in real time.
![Google says DeepMind’s Genie 2 can create endless 3D worlds for users](https://akm-img-a-in.tosshub.com/indiatoday/images/story/202412/genie-2-from-deepmind-053905550-16x9_0.jpg?VersionId=fTGqei1chNXHv32KoC9c9C9kqg0SKdcr&size=690:388)
DeepMind, Google’s AI research division, is keeping itself busy. Its latest release is Genie 2, a new AI model that, according to the company, can create infinite, playable 3D worlds. Genie 2 builds on its predecessor, Genie, a foundation model capable of transforming single images into playable environments; the new version extends that capability to full 3D game environments.
In a recent blog post, DeepMind explained that Genie 2 is its advanced large-scale foundation world model, designed to generate dynamic and realistic 3D environments. Using a single image or text prompt, users can create an interactive virtual world. For example, typing “a warrior in the snow” produces a simulated world featuring a warrior character in a snowy environment. The model lets users perform actions such as jumping, swimming, and interacting with objects, while the environment follows real-world physics and lighting.
According to DeepMind, Genie 2 “can create coherent worlds with different viewpoints, such as first-person and isometric views, for up to a minute, with most lasting 10 to 20 seconds.”
The company explains that this capability stems from training on large-scale video datasets, which enables the model to simulate environments with remarkable detail and coherence.
How does Genie 2 work?
Describing the process, DeepMind explains that it starts with a text or image prompt, which is fed into Imagen 3, another generative model, to produce the corresponding visual representation. Users can then explore and interact with the generated environment using Genie 2. The model operates auto-regressively, generating the video frame by frame based on previous frames and user input. “Genie 2 responds intelligently to actions taken by pressing keys on the keyboard,” explains DeepMind. “For example, our model can detect that arrow keys should move a robot, not trees or clouds.”
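For readers curious what that auto-regressive loop looks like in practice, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in (`generate_initial_frame`, `ToyWorldModel`, the toy frame resolution): Genie 2’s actual model and API are not publicly available, so this only illustrates the described pipeline of prompt-to-first-frame followed by frame-by-frame prediction conditioned on previous frames and user input.

```python
# Minimal sketch of an autoregressive world-model loop, as described in the
# article: a prompt becomes a starting frame, then each new frame is predicted
# from a window of previous frames plus the latest user action.
# All names here are hypothetical stand-ins; Genie 2's API is not public.

import numpy as np

FRAME_SHAPE = (64, 64, 3)  # toy resolution for the sketch


def generate_initial_frame(prompt: str) -> np.ndarray:
    """Stand-in for an image generator such as Imagen 3: prompt -> first frame."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.random(FRAME_SHAPE, dtype=np.float32)


class ToyWorldModel:
    """Hypothetical autoregressive world model: predicts the next frame
    from a window of previous frames and the latest user action."""

    def __init__(self, context_length: int = 16):
        self.context_length = context_length

    def predict_next_frame(self, frames: list[np.ndarray], action: str) -> np.ndarray:
        context = frames[-self.context_length:]           # condition on recent frames
        base = np.mean(context, axis=0)                   # placeholder "prediction"
        nudge = {"left": -0.01, "right": 0.01}.get(action, 0.0)
        return np.clip(base + nudge, 0.0, 1.0)            # apply the action's effect


def run_session(prompt: str, actions: list[str]) -> list[np.ndarray]:
    model = ToyWorldModel()
    frames = [generate_initial_frame(prompt)]             # step 1: prompt -> first frame
    for action in actions:                                # step 2: frame-by-frame rollout
        frames.append(model.predict_next_frame(frames, action))
    return frames


if __name__ == "__main__":
    rollout = run_session("a warrior in the snow", ["right", "right", "jump", "left"])
    print(f"Generated {len(rollout)} frames of shape {rollout[0].shape}")
```

The point the sketch captures is that every new frame depends on both the rolling window of previous frames and the latest keyboard action, which is what allows a model of this kind to route an arrow-key press to the controllable character rather than to the scenery.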
Elaborating further, DeepMind notes that Genie 2’s action-control capabilities ensure user inputs are interpreted accurately and applied to the controllable character rather than to background elements. The model also supports diverse perspectives, including first-person, isometric, and third-person views, letting users navigate and interact with the virtual world in a variety of formats. In addition, Genie 2 comes with long-term memory, which allows it to remember parts of the environment that are no longer in view and render them accurately when they are revisited.
Of course, Genie 2 is not a gaming platform. Instead, DeepMind has positioned it as a creative and research tool. Even so, the technology points towards video games whose characters and virtual worlds could be generated on the fly. The company notes, “Thanks to Genie 2’s out-of-distribution generalization capabilities, concept art and illustrations can be transformed into fully interactive environments.”