Google Unveils Gemini Omni AI Video Generation Model at I/O 2026
- Published: May 19, 2026
- Harish Prajapat
Google just made its biggest AI video play yet. At Google I/O 2026 on May 19 at Shoreline Amphitheatre in Mountain View, the company introduced Gemini Omni, a new family of AI video generation models that DeepMind describes as able to “create anything from any input, starting with video.”
The first release is Gemini Omni Flash. It started rolling out the same day to the Gemini app, Google Flow, and YouTube Shorts.
Why this matters. Until now, Google’s video stack ran on Veo. Veo 3 launched at I/O 2025 and Veo 3.1 followed in April 2026. Both were very good at turning text into cinematic clips, and both were pretty limited on multimodal input and conversational editing. Gemini Omni is the answer to that gap.
What Gemini Omni Flash actually does
Omni Flash accepts any combination of images, audio, video, and text as inputs. It outputs video grounded in Gemini’s world knowledge across history, science, biology, physics, and cultural context. Think of it as the Nano Banana approach to image editing, but for video.
The big shift is conversational editing. Instead of writing one mega-prompt and praying, you talk to it. Change the lighting. Swap the jacket. Move the camera. Each edit builds on the last while the model holds scene layout and character identity steady across every turn.
Physics got a serious upgrade too. Google says Omni has an intuitive grasp of gravity, kinetic energy, and fluid dynamics, which were the exact weak spots in earlier Veo generations. Water actually behaves like water now (mostly).
One real catch. Audio editing is being held back. Google is letting users create videos with their own voice through a digital avatar, but full speech editing is still in testing while the team works on responsible deployment. So if you wanted to rewrite dialogue in an existing clip on day one, you’re waiting.
Availability and pricing
Gemini Omni Flash is free for YouTube Shorts creators. Google AI Plus, Pro, and Ultra subscribers get it inside the Gemini app and Google Flow worldwide.
Google Flow also picked up a new Flow Agent for creative planning, brainstorming, multi-variation generation, and batch editing, available to all Flow users globally. There’s a bespoke Tools feature too. You describe a custom video workflow or editor in plain language, no code, and you can share it with other Flow users who can remix it.
Flow Music got Omni Flash support for music video creation, section-by-section song editing, cover track generation, and full style transformation, all for Google AI subscribers. Mobile is shipping too. Google Flow launched an Android beta with iOS coming soon. Flow Music landed on iOS first, Android next. Both require users to be 18 or older.
Watermarking and provenance
Every clip created or edited with Omni in Gemini, Flow, or Shorts ships with an imperceptible SynthID watermark and C2PA Content Credentials. You can verify provenance inside the Gemini app today, with Chrome and Search support coming soon.
How it stacks up against Sora, Veo, and Seedance
The 2026 video model field is brutal. OpenAI shut down the consumer Sora 2 app in March 2026 after reportedly burning $8 to $12 million a month, leaving Sora 2 as API-only. ByteDance’s Seedance 2.0 has been topping public benchmarks on raw quality. Kuaishou’s Kling 3.0 is pulling over $20 million a month in China. Runway still owns a chunk of the pro creative workflow market.
Omni’s pitch is the combo. Unified multimodal input, conversational multi-turn editing, and physics simulation in one model. No single competitor currently bundles all three. That’s the bet.
The numbers behind the launch
The Gemini app has crossed 900 million monthly active users across 230 countries and territories and 70 languages, up from 400 million at I/O 2025. That’s 2.25x year over year. Google is processing 3.2 quadrillion tokens per month across its AI services, a 7x jump. And the company is putting $190 billion into AI capex in 2026, which is 6x what it spent in 2022.
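For readers who want to sanity-check those multiples, here is a minimal Python sketch. It uses only the figures quoted above. The variable names are illustrative, and the "implied" baselines are back-of-envelope derivations from the stated multiples, not numbers Google reported.

```python
# Figures as quoted in the article.
mau_2026 = 900e6        # Gemini app monthly active users at I/O 2026
mau_2025 = 400e6        # Gemini app monthly active users at I/O 2025
tokens_now = 3.2e15     # tokens processed per month (3.2 quadrillion)
capex_2026 = 190e9      # planned 2026 AI capex in USD

# Year-over-year growth in monthly active users.
growth = mau_2026 / mau_2025

# Baselines implied by the quoted "7x" and "6x" multiples (derived, not reported).
implied_tokens_before = tokens_now / 7
implied_capex_2022 = capex_2026 / 6

print(f"MAU growth: {growth:.2f}x")
print(f"Implied earlier token volume: {implied_tokens_before / 1e12:.0f} trillion per month")
print(f"Implied 2022 capex: ${implied_capex_2022 / 1e9:.1f} billion")
```

Run as written, this prints a user growth multiple of 2.25x, an implied earlier token volume of roughly 457 trillion per month, and an implied 2022 capex of roughly $31.7 billion.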
For creators who don’t want to wait for the rollout, MagicShot’s AI Video Generator already runs the current top of the market, including Veo 3.1 and Seedance 2.0, in one place. You can also dig into the Seedance 2.0 breakdown if you want to see how the benchmark leader handles real prompts.
Conversational multimodal video editing just became the new baseline. The next 12 months are going to get loud.
Frequently Asked Questions
What is Gemini Omni?
Gemini Omni is Google’s new multimodal video model announced at Google I/O 2026. It can take any mix of text, image, audio, and video as input and generate or edit video output through multi-turn conversation, with physics simulation and scene consistency built in.
Where and when is Gemini Omni Flash available?
Gemini Omni Flash is rolling out to the Gemini app, Google Flow, and YouTube Shorts starting May 19, 2026. It’s free for YouTube Shorts users and included for Google AI Plus, Pro, and Ultra subscribers in Gemini and Flow globally.
How does Gemini Omni compare to its competitors?
Gemini Omni competes with Seedance 2.0 on raw quality, Kling 3.0 Omni on multi-shot audio-visual output, and the API-only Sora 2 on cinematic fidelity. Its differentiator is combining unified multimodal input, conversational editing, and physics simulation in one model, which no single competitor matches today.
