Google Veo2 vs OpenAI Sora: AI models now create text-to-video clips, promising a new future
What does the future of video production look like now that AI tools like Veo2 and Sora can create high-quality, realistic videos? More importantly, which one is better for you? We try to answer some of these questions.
Imagine a vivid dream filled with mythological characters and vast landscapes, and you as the hero in futuristic armor. Now, imagine describing that dream and seeing it transformed into a high-definition video. Seems impossible? Not anymore.
Enter Veo2, Google DeepMind’s latest text-to-video platform, which creates realistic 4K video from simple text prompts. Its capabilities have stunned the Internet with mesmerizing demo videos that have many people in awe – and some video creators worried about their jobs.
But Veo2 is not alone. OpenAI’s Sora is another contender, bringing its own perspective to this exciting new era of AI-generated video creation. So, how do these platforms compare? Which one is better? And what does this mean for the future of video production?
Veo2 vs Sora
Launched in early December this year, Sora is available to ChatGPT Plus users globally. While OpenAI’s video generator has taken the lead by being open to general users, Google’s Veo2 is still in beta testing.
It appears that Google’s Veo2 has the edge over Sora for several reasons:
- 4K video resolution: Veo2 offers video resolution up to 4K, which means better-quality videos. In contrast, Sora offers a maximum resolution of 1080p, which isn’t bad but still falls short of 4K.
- Video duration: Veo2 can generate videos up to 2 minutes long. In comparison, Sora makes short videos of up to 20 seconds.
- Cinematic controls: Veo2 offers virtual camera controls with options to add cinematic movements like pan and tilt. Its accuracy has surprised many users online. You can also adjust the lighting for a particular scene, which strengthens the storytelling. Sora focuses more on style presets and storyboarding. The difference is similar to editing photos on your phone: you can either make each adjustment manually or use the presets the phone offers.
The beautiful, snowy city of Tokyo is bustling. The camera pans across a bustling city street, showing many people enjoying the beautiful snowy weather and shopping at nearby stalls. Beautiful sakura petals are blowing in the wind with snowflakes.
Top: Sora
Bottom: Veo 2 pic.twitter.com/382tLPBYox
– Nick St. Pierre (@nickfloats) 18 December 2024
This video showcases the cinematic difference between the two. The tracking shot of a busy city street generated by Veo2 (bottom) shows better camera angles and lighting than the one from Sora (top).
The camera pans around a large stack of old TVs, all showing different programs – 1950s sci-fi movies, horror movies, news, static, 1970s sitcoms, etc., set inside a large New York museum gallery.
Top: Sora
Bottom: Veo 2 pic.twitter.com/v1AQjJ5qJn
– Nick St. Pierre (@nickfloats) 18 December 2024
Or, as in this post, look at how each model handles a slow zoom-in or a camera pan around a stack of TVs. Despite the prompt asking for the camera to move around the TVs, Sora’s output was a static camera shot, which limits the creator’s vision.
- Realism: Some Veo2 renders posted online show its ability to output photorealistic video. The same holds for physics-based motion accuracy, which makes animations look more natural. This is an area where Sora struggles.
The following video from X user Ruben Hassid shows the differences between the two video engines. There are many inconsistencies in the results produced by Sora, whereas Veo2 delivers more vibrant results.
I tested Sora vs the new Google Veo-2.
I feel like comparing a bike vs a starship: pic.twitter.com/YcHsVcUyn2
– Ruben Hassid (@RubenHssd) 17 December 2024
On paper, Veo2 is the more comprehensive option for video creators, but sadly it is not available to general users. Google DeepMind has made the tool available only to a select few users, with no clarity on when the final version will be launched.
Sora, on the other hand, is available for business use with a ChatGPT Plus subscription, which costs around Rs 1,676 in India.
Room for improvement
Many X users have posted similar renders comparing the two, and most feel that Sora is losing the battle in many departments. Sora is not yet the most refined version of itself, so there is still room for improvement. It is in the early stages of its release, and I believe an improved version of the platform will be released in the coming days, especially given Veo2’s strong performance. Right now, it seems Sora focuses more on speed than on physics and accuracy, a sentiment expressed by an X user.
Sora vs Veo 2:
I spent a few hours running prompts on both models and wanted to share some comparisons.
IMO – Sora has a bias towards more speed, while Veo focuses more on accuracy/physics. And a large percentage of the clips from Veo are usable.
“Man jumping over obstacles” pic.twitter.com/WI9zIaJA64
– Justine Moore (@venturetwins) 18 December 2024
OpenAI recently launched a new version of the platform, called Sora Turbo, but had to pause new signups because of heavy traffic on its servers. This is an issue that other online video platforms will also have to face in the coming days.
Google has set a high bar for OpenAI to clear, but the real challenge is time. With 2024 almost over, OpenAI now needs to focus on 2025 to see how it can beat Google at its own game. It doesn’t matter who wins the crown in the end; ultimately, the users win.
Boon or danger?
Now to address the elephant in the room. It’s likely that these AI tools will eat into the jobs of people who create (shoot or edit) video content for a living. The predicament is similar to the one ChatGPT created for writing jobs.
Gradual improvements in Veo2, Sora and other similar tools will significantly reduce the need for traditional video-production methods. Why would someone pay for camerapersons, equipment, travel and logistics when they can type a prompt on a computer and produce 4K video with specific details regarding lighting, environment and color accuracy?
This will especially suit industries and creators who need a lot of stock footage. It would be more cost-effective to ask Veo2 or Sora to create a stock video than to shoot one from scratch or pay for a stock video platform like Envato, Adobe or iStock.
However, some videos (for commercial use) cannot be created by AI due to copyright issues. Videos of celebrities, interviews with politicians, sports programs and live TV all still require people to work within the ecosystem. That said, you can create videos in a similar vein, like this amazing Game of Thrones trailer created using Veo2.
I just recreated iconic scenes from Game of Thrones using Veo 2 @GoogleDeepMind.
How many can you recognize?
We can finally have the ending we deserve. pic.twitter.com/YKwjVw8ODg
– Ammaar Reshi (@ammaar) 18 December 2024
So, no, these tools will not spell the end of jobs in this sector. In fact, thanks to these tools, more high-quality content will be created, which means more consumption, and this will soon lead to more creators.
Rest assured, the future of video production is about to change, led by tools like Veo2 and Sora. Who knows, you might be able to create a big Hollywood-like project at home without spending a lot of money!