Google unveils Veo 2 video AI generator to take on OpenAI’s Sora
Google has introduced Veo 2, a new and improved video generation model, to compete with OpenAI’s Sora. The company claims that the successor to the original Veo model can produce realistic motion and high-quality output at up to 4K resolution, and that it outperforms leading AI video generators. Alongside it, Google announced the latest version of Imagen 3 and a new model, Whisk, for creating a single image from multiple input images. Here’s everything you need to know.
Google launches Veo 2, Imagen 3 and Whisk AI models
Google shared a series of short video clips created using Veo 2, showing that the model can generate hyper-realistic videos of animals and food. The samples also include animated clips of humans. All of the clips are 8 seconds long.
Google said, “Veo 2 outperforms other leading video generation models based on human assessment of its performance.” Although the statement itself does not name any rivals, it is likely aimed at OpenAI’s Sora, which is also a video generator. The company has included a chart in its benchmark results, which claims that people prefer its Veo 2 model over Meta Movie Gen, Kling v1.5, Minimax and Sora Turbo.
The samples shared by Google look great, but motion in some scenes appears slightly inaccurate, and some details are missing in parts of the frame. Google acknowledges this and says that complete stability remains a challenge in complex scenes or scenes with complex motion. Still, the overall quality of the video is quite impressive.
Google DeepMind said, “While Veo 2 demonstrates incredible progress, creating realistic, dynamic or complex video and maintaining complete stability in complex scenes or scenes with complex motion remains a challenge. We continue to develop and refine performance in these areas.”
For the Imagen 3 model, Google claims it can now create brighter and more realistic images with vibrant colors and better color balance and fidelity. The company also claims that the model now generates highly detailed textures and attractive visuals. The new version offers a wider range of styles, including photorealism, impressionism, abstract and anime.
The company also showed off Whisk, a new experimental AI tool from Google Labs. It lets you prompt with images instead of words: you supply multiple images and Whisk creates something new from them. To upload photos, you get 3-4 boxes, which include subject, scene and style. For example, you add your own picture to the Subject box, a mountain scene to the Scene box, and an animated photo to the Style box. After you upload all these photos, Whisk combines them into a new image.
Behind the scenes, a Gemini model automatically writes a detailed caption for each of your images and then feeds those descriptions into Imagen 3. This process allows you to easily remix your themes, visuals, and styles in fun, new ways.
Currently, these tools are not available in India, though users in the US already have access to them. The company is expected to bring them to the Indian market in the near future.