Google · Video generation

Veo 2

Google Veo 2 on a visual canvas. Fast 720p clips from text or an image, wired into a pipeline you can publish as an API.

At a glance
  • Veo 2 is the first Veo model Google shipped through the Gemini API.
  • 720p only. No native audio. Supports 5, 6, 7, or 8 second clips.
  • Still useful on PlugNode for bulk iteration, low-res variants, and flows where audio is stripped anyway.
  • Swap to Veo 3 or Veo 3.1 by changing the model dropdown when you need 1080p, 4K, or native audio.

What it is

Veo 2 is the first production Veo model Google shipped through the Gemini API. It renders 720p video from a text prompt or a reference image, with clip lengths from 5 to 8 seconds. It does not generate audio; native audio arrived with Veo 3. On PlugNode, Veo 2 is an option on the Video node and fills a specific slot: low-resolution drafts while you iterate on the prompt, bulk variants for A/B tests, and downstream flows that strip audio anyway (muted social previews, thumbnails, GIF conversions). For final production renders where audio or 1080p matters, pick Veo 3 or Veo 3.1 on the same node.

What you can do with it

  • Text-to-video at 720p with variable duration
  • Image-to-video from an uploaded reference
  • 16:9 and 9:16 aspect ratios out of the box
  • Typically cheaper per second than Veo 3, which suits draft iteration
  • Swap to Veo 3 or Veo 3.1 on the same node when you need 1080p or audio
  • Publish the pipeline as a signed API with one click

Where Veo 2 still makes sense

Veo 2 is not the headline model anymore. It stays useful for three reasons.

First, 720p draft iteration. You can chew through 10 prompt variants in the time one Veo 3.1 4K render takes, then promote the winner.

Second, muted surfaces: social previews, GIF conversions, internal QA, or any platform that strips the audio on upload.

Third, legacy flows. If you built a pipeline against Veo 2 and it satisfies a downstream consumer contract, there is no reason to upgrade for its own sake. Model selection is a dropdown on the Video node, so switching is cheap when you do decide to move.

Audio and Veo 2

Veo 2 does not render audio. The MP4 it returns has a silent track. If you want a narrated or ambient-scored clip, add an ElevenLabs Audio node after the Video node and return both streams from Respond to Webhook, or pass them to a downstream mixer.

Many production flows layer TTS on top even with Veo 3's native audio, because ElevenLabs gives you control over voice, stability, and similarity: branded spokespeople, consistent accents, multi-voice podcasts. If your pipeline was going to add ElevenLabs anyway, Veo 2's missing audio is not a blocker.
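
If the flow returns the video and audio as separate files, the downstream mixer can be as small as one ffmpeg call. The sketch below is one way to do it, not part of PlugNode: it assumes ffmpeg is installed on the machine running your orchestrating code, and the file names are placeholders.

```python
import subprocess
from pathlib import Path


def build_mux_command(video: Path, audio: Path, output: Path) -> list[str]:
    """Build an ffmpeg command that lays an audio file over a silent clip."""
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it already exists
        "-i", str(video),    # input 0: the silent Veo 2 clip
        "-i", str(audio),    # input 1: the narration or ambient track
        "-map", "0:v:0",     # take the video stream from input 0
        "-map", "1:a:0",     # take the audio stream from input 1
        "-c:v", "copy",      # keep the video as-is, no re-encode
        "-c:a", "aac",       # encode the audio as AAC for MP4 playback
        "-shortest",         # stop at the end of the shorter input
        str(output),
    ]


def mux(video: Path, audio: Path, output: Path) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_mux_command(video, audio, output), check=True)


if __name__ == "__main__":
    # Placeholder file names; print the command instead of running it.
    cmd = build_mux_command(Path("clip.mp4"), Path("voice.mp3"), Path("final.mp4"))
    print(" ".join(cmd))
```

Copying the video stream keeps the mix step fast, and `-shortest` avoids trailing audio when the narration runs longer than the clip.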

Prompt format and constraints

Veo 2 shares the Gemini video prompt contract with Veo 3 and 3.1: describe subject, motion, camera, lighting, and style in natural language. Reference images lock duration to 8 seconds. 720p is the only resolution option. Aspect ratio toggles between 16:9 and 9:16 at node config time.
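
The constraints above are easy to check in your own code before a request ever reaches the flow. The following is a minimal sketch: `Veo2Request` and its field names are hypothetical and do not mirror PlugNode's or Google's schema. Only the rules themselves (16:9 or 9:16, 5 to 8 seconds, 8 seconds with a reference image) come from this page.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
ALLOWED_DURATIONS = {5, 6, 7, 8}
REFERENCE_IMAGE_DURATION = 8  # a reference image locks the clip to 8 seconds


@dataclass(frozen=True)
class Veo2Request:
    """Hypothetical request shape, used only for this illustration."""
    prompt: str
    aspect_ratio: str = "16:9"
    duration_seconds: int = 8
    reference_image: Optional[str] = None  # path or URL; None for text-to-video


def validate(req: Veo2Request) -> list[str]:
    """Return a list of constraint violations; an empty list means the request fits."""
    problems = []
    if not req.prompt.strip():
        problems.append("prompt is empty")
    if req.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"aspect ratio {req.aspect_ratio!r} is not 16:9 or 9:16")
    if req.duration_seconds not in ALLOWED_DURATIONS:
        problems.append(f"duration {req.duration_seconds}s is outside 5 to 8 seconds")
    elif req.reference_image is not None and req.duration_seconds != REFERENCE_IMAGE_DURATION:
        problems.append("a reference image locks duration to 8 seconds")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues in a single pass.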

Content policy and safety filters are Google's. Flows that hit them surface as errors in PlugNode's execution log, which you can inspect from the run history. Failing fast is better than silently degrading.

Bulk variants with Veo 2

A practical Veo 2 pattern: put the Video node inside a flow whose HTTP Trigger accepts a prompt and a variant count. Generate N clips with slight prompt variations, return an array of MP4 URLs. Your orchestrating code sends them to a ranker (human or model) and promotes the winner.
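
On the orchestrating side, that pattern reduces to building the trigger body and picking a winner from the returned URLs. The sketch below assumes a JSON body with `prompt` and `variant_count` fields and a response shaped like `{"videos": [...]}`. Those names are illustrative guesses, so match them to whatever your HTTP Trigger and Respond to Webhook nodes actually define. The network call itself is left out.

```python
import json
from typing import Callable


def build_trigger_body(prompt: str, variant_count: int) -> str:
    """Serialize the request body for the flow's HTTP Trigger (hypothetical field names)."""
    if variant_count < 1:
        raise ValueError("variant_count must be at least 1")
    return json.dumps({"prompt": prompt, "variant_count": variant_count})


def pick_winner(response_body: str, score: Callable[[str], float]) -> str:
    """Return the highest-scoring clip URL from the flow's response.

    `score` stands in for the ranker: a human rating lookup or a model call.
    """
    urls = json.loads(response_body)["videos"]  # hypothetical response key
    if not urls:
        raise ValueError("flow returned no clips")
    return max(urls, key=score)
```

Keeping the ranker behind a plain callable means the same code works whether scores come from a review spreadsheet or an automated model.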

This is cheaper at Veo 2 rates than at Veo 3 rates, which matters when you are testing 50 headlines, not shipping one final cut. For the final cut, re-run with the winning prompt on Veo 3.1 at 4K. Same flow, one dropdown change.
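
The arithmetic behind that claim is clips times seconds times the per-second rate. The sketch below makes it explicit. The two rates are placeholders chosen only to make the calculation concrete. They are not Google's prices, so substitute the figures from the current Gemini pricing page.

```python
from decimal import Decimal


def render_cost(clips: int, seconds_per_clip: int, rate_per_second: Decimal) -> Decimal:
    """Total cost of a batch: clips x seconds per clip x per-second rate."""
    return clips * seconds_per_clip * rate_per_second


# PLACEHOLDER rates for illustration only; not real Gemini API prices.
PLACEHOLDER_DRAFT_RATE = Decimal("0.10")  # stands in for the Veo 2 rate
PLACEHOLDER_FINAL_RATE = Decimal("0.30")  # stands in for the Veo 3.1 rate

# Draft 50 variants on the cheaper model, then render one final cut.
draft_then_final = (
    render_cost(50, 8, PLACEHOLDER_DRAFT_RATE)
    + render_cost(1, 8, PLACEHOLDER_FINAL_RATE)
)

# Versus rendering all 50 variants on the more expensive model.
all_on_final_model = render_cost(50, 8, PLACEHOLDER_FINAL_RATE)
```

Whatever the real rates are, the gap between the two totals grows with the number of variants, which is why the draft-then-promote split pays off most on large tests.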

The multi-model A/B testing use case shows the fan-out pattern applied to text models; the same shape works for video.

When to upgrade

Three signals that Veo 2 has hit its ceiling in a flow: your downstream platform renders at 1080p or higher and the 720p output looks soft; the creative brief requires native audio and the ElevenLabs layer alone does not cover ambient sound; or a stakeholder asks for 4K.
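
If you track requirements per flow, those signals can be written down as a small check. This is an illustrative sketch, not a PlugNode feature: `FlowRequirements` and its fields are hypothetical names, and the video-extension signal comes from the FAQ on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRequirements:
    """Hypothetical summary of what a flow's consumers need."""
    target_height: int                   # vertical resolution the platform renders at
    needs_native_audio: bool             # audio generated with the video itself
    needs_video_extension: bool = False  # extending an existing clip


def upgrade_reasons(req: FlowRequirements) -> list[str]:
    """List the reasons to leave Veo 2; an empty list means it still fits."""
    reasons = []
    if req.target_height >= 2160:
        reasons.append("4K output requested")
    elif req.target_height >= 1080:
        reasons.append("downstream renders at 1080p or higher")
    if req.needs_native_audio:
        reasons.append("brief requires native audio")
    if req.needs_video_extension:
        reasons.append("video extension is needed (Veo 3.1 only)")
    return reasons
```

An empty result is the "no reason to upgrade" case for a legacy flow. Any non-empty result points at the model dropdown.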

Each of those is a model-dropdown change on PlugNode with no other flow edits. The pipeline keeps its endpoint URL, rate limit, secret, and version history. Only the model executing inside the Video node changes.

Run it on PlugNode

Veo 2 is selectable on the Video node. Pick it for low-res iteration or muted pipelines. Swap to Veo 3 or Veo 3.1 on the same node when 1080p or audio enters the requirements.

Veo 2 is billed by Google at Gemini API rates. PlugNode adds no markup and does not charge for model compute.

Frequently asked questions

Does Veo 2 generate audio?
No. Veo 2 produces silent MP4 output. For narration or ambient sound, chain an ElevenLabs Audio node after the Video node.

What resolutions does Veo 2 support?
720p only. For 1080p or 4K, pick Veo 3 or Veo 3.1 from the same dropdown on the Video node.

Is Veo 2 cheaper than Veo 3?
Typically yes, at Google's published Gemini API rates. Check the current Gemini pricing page for the exact per-second cost.

Can I use an image as input on Veo 2?
Yes. The File Input node accepts an image and the Video node passes it as the reference. A reference image locks the clip to 8 seconds.

When should I upgrade from Veo 2?
When the output needs 1080p or higher, when native audio enters the brief, or when you need video extension (Veo 3.1 only).

Last updated 2026-04-25
