What Gemini Omni, Google's new AI model, can do for creating videos | Guatemala Herald News

The American technology giant Google launched its new artificial intelligence (AI) models this Tuesday: Gemini 3.5 Flash and Gemini Omni.

Gemini Omni is natively multimodal in its inputs: it accepts text, audio, images and video. At launch it generates only video output, with audio and image output to follow soon.

The CEO and co-founder of Google DeepMind, Demis Hassabis, highlighted during the event that this new model is capable of “reaching a new level of understanding of the world, multimodality and editing.”

“Models such as Veo, Nano Banana and Genie (all from Google) are capable of creating extremely realistic videos, images and interactive simulations. Although they are not perfect, they already demonstrate an impressive intuitive capacity. With Omni we have advanced even further. It represents a radical change in the simulation of phenomena such as kinetic energy and gravity,” Hassabis said today during the presentation.

Gemini Omni will replace Veo in the Gemini app. Omni combines the core intelligence of Gemini with advanced generative media capabilities, such as image-to-video conversion and AI video editing.

What you can do with Gemini Omni

Combine text, photos and video into a single video
Create videos from reference photos (up to five)
Edit videos easily

According to the official page, Gemini Omni is available to users over 18 years of age who have a Google AI Plus, Pro or Ultra plan, in all languages and markets where the Gemini app is available.

Some features, such as AI video-to-video editing, may be restricted in some countries. It will also be possible to create an avatar.

With information from EFE
