Veo 3.1
Google Veo 3.1 on a visual canvas. Generate 4K video with audio, wire it into a pipeline, publish the whole flow as a signed endpoint.
- Veo 3.1 is Google's text-to-video and image-to-video model with native audio generation.
- Supports 720p, 1080p, and 4K output. 720p runs 4, 6, or 8 seconds; 1080p and 4K lock to 8s.
- PlugNode runs Veo 3.1 as the default Video node. Bring your own Gemini API key, no markup.
- Chain Veo with Gemini Text, Image, and ElevenLabs Audio, then publish the flow behind one URL.
What it is
Veo 3.1 is Google's video generation model, available through the Gemini API. It takes a text prompt, an optional reference image, and an optional video clip for extension, then returns an MP4 with synchronised audio. PlugNode exposes Veo 3.1 as a first-class Video node: pick the model from a dropdown, set the prompt, drop in a reference image, and wire the output into downstream nodes. The same node powers both the browser canvas (live preview) and the server engine (production runs). Every request hits Google's Gemini API directly with your own key. No proxy, no credit system, no markup. Use the HTTP Trigger node to turn the whole flow into a signed webhook your app can POST to.
What you can do with it
- Generate a video from a text prompt in one node
- Start from a reference image for consistent subjects
- Extend an existing clip by feeding a video URL as input
- Produce synchronised dialogue, music, and sound effects
- Return the MP4 URL directly to a webhook caller via Respond to Webhook
- Chain with Gemini Text for scripted scene descriptions
- Layer ElevenLabs voiceover on top for multi-voice narration
What Veo 3.1 is good at
Veo 3.1 is the newest video model in Google's Veo family and adds two things that matter in production pipelines: higher resolution ceilings (1080p and 4K in addition to 720p) and native audio generation. The model takes a prompt and produces footage with dialogue, music, and ambient sound baked in. No separate TTS pass required.
The Fast variant cuts latency at the cost of some fidelity, which is useful when you're iterating on ideas or generating bulk variants. Where Veo 3.1 shines: short product shots, cinematic establishing shots, stylised b-roll for explainers, and single-subject motion from a reference still.
Where it struggles: long-form narrative continuity across shots, precise object placement, and anything requiring frame-level editing. For those, Veo remains one node in a larger pipeline rather than the whole pipeline.
Resolution, duration, and aspect-ratio rules
The Veo 3.1 API imposes a handful of constraints that matter when you're wiring a flow. 720p accepts 4s, 6s, or 8s durations in either 16:9 or 9:16. 1080p locks the clip to 8s and 16:9 only. 4K locks to 8s and 16:9. Passing a reference image also forces the clip to 8s regardless of resolution.
The Video node in PlugNode encodes these rules in the config panel. Changing the resolution updates the duration and aspect-ratio options automatically so you can't save a flow that will fail at execution time.
If you need a 9:16 vertical clip at high resolution for Reels or TikTok, the current answer is 720p. Veo has not shipped 4K vertical at the time of writing.
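The constraint table above is small enough to encode directly. Here is a minimal sketch of the validation the Video node's config panel performs; the rule table mirrors the rules stated above, but the function and field names are illustrative, not PlugNode's actual API.

```python
# Duration/aspect-ratio rules per resolution, as described above.
RULES = {
    "720p":  {"durations": {4, 6, 8}, "aspects": {"16:9", "9:16"}},
    "1080p": {"durations": {8},       "aspects": {"16:9"}},
    "4k":    {"durations": {8},       "aspects": {"16:9"}},
}

def validate(resolution, duration, aspect, has_reference_image=False):
    """Return a list of config errors; an empty list means the combo is valid."""
    rule = RULES.get(resolution)
    if rule is None:
        return [f"unknown resolution: {resolution}"]
    errors = []
    if duration not in rule["durations"]:
        errors.append(f"{resolution} does not support {duration}s clips")
    if aspect not in rule["aspects"]:
        errors.append(f"{resolution} does not support {aspect}")
    # A reference image forces the clip to 8s regardless of resolution.
    if has_reference_image and duration != 8:
        errors.append("a reference image locks duration to 8s")
    return errors
```

Catching these combinations at config time, rather than at execution time, is why the canvas greys out invalid duration and aspect options when you change resolution.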
Native audio and what it includes
Veo 3.1 generates audio alongside the video frames, synchronised to the visual. That audio includes dialogue when the prompt describes speakers, ambient sound that matches the scene (wind, crowds, room tone), and stylised music cues when the prompt specifies them.
It is not a replacement for a mastered soundtrack and it will not match a specific voice from a prompt. For branded voiceover you still want ElevenLabs on top.
A typical pipeline on PlugNode looks like: Gemini Text generates the script, Video (Veo 3.1) renders the shot with native ambient audio, ElevenLabs Audio generates a branded voiceover, and an output node returns both streams. Because the whole thing is one flow, re-running with a tweaked script takes one click instead of four tools.
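The four-node pipeline above can be sketched as an ordered flow definition. The node labels match the canvas names used in this page; the data structure and the `{{…}}` reference syntax are hypothetical, not PlugNode's internal flow format.

```python
# Illustrative flow: script -> video -> voiceover -> response.
flow = [
    {"node": "Gemini Text", "config": {
        "prompt": "Write a 15-word scene description for a desk-lamp product shot"}},
    {"node": "Video", "config": {
        "model": "Veo 3.1", "prompt": "{{gemini_text.output}}"}},
    {"node": "ElevenLabs Audio", "config": {
        "script": "{{gemini_text.output}}", "voice": "brand-voice"}},
    {"node": "Respond to Webhook", "config": {
        "body": {"video_url": "{{video.output}}",
                 "voiceover_url": "{{audio.output}}"}}},
]

def rerun_with_new_script(flow, new_prompt):
    """Tweak only the script prompt; every downstream node stays untouched."""
    flow[0]["config"]["prompt"] = new_prompt
    return flow
```

This is what "re-running with a tweaked script takes one click" means structurally: only the first node's config changes, and the references downstream resolve against the fresh outputs.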
Running Veo 3.1 as an API from PlugNode
This is the wedge. Drop an HTTP Trigger node on the canvas, wire it into a Video node configured for Veo 3.1, and chain a Respond to Webhook node at the end. Hit Publish. You now have a rate-limited, secret-protected HTTP endpoint that accepts a prompt and returns a video URL.
Your product code POSTs to that URL; the flow runs on the server engine; the MP4 lands back as JSON. The endpoint is versioned. Every publish creates a hash-diffed snapshot you can roll back to if a new prompt format breaks downstream.
Secrets rotate from Settings, rate limits are 60 requests per minute per trigger, and the response is either synchronous (pass ?wait=true) or asynchronous with a webhook callback. That is what "publish a flow as an API" means in practice. See the publish-as-API pillar for the full pattern, the Gemini Video integration for the node reference, and the HTTP Trigger integration for trigger-side details.
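From the caller's side, the published flow is just an HTTP endpoint. The sketch below assembles the POST a product backend might send; the URL shape, the `Authorization` header, and the placeholder secret are assumptions for illustration. Only the `?wait=true` synchronous flag and the JSON-in, JSON-out contract come from the description above.

```python
import json

def build_trigger_request(endpoint_url, secret, prompt, wait=True):
    """Assemble the POST a caller sends to a published Veo flow (hypothetical shapes)."""
    url = endpoint_url + ("?wait=true" if wait else "")
    return {
        "method": "POST",
        "url": url,
        "headers": {
            "Authorization": f"Bearer {secret}",  # rotating trigger secret
            "Content-Type": "application/json",
        },
        "body": json.dumps({"prompt": prompt}),
    }

req = build_trigger_request(
    "https://example.plugnode.app/flows/veo-demo",  # placeholder URL
    "trigger-secret-placeholder",
    "A slow dolly shot of rain on a neon street, ambient city sound",
)
```

With `wait=False`, the same request would return immediately and the MP4 URL would arrive later on the webhook callback you registered.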
Where Veo 3.1 fits next to Veo 3 and Veo 2
Veo 2 is 720p only and does not generate audio. It remains cheaper per second and is a sensible default when you need bulk low-res variants or when audio would be stripped anyway. Veo 3 adds 1080p and native audio and is the balanced option. Veo 3.1 adds 4K, longer video extension, and refinements in motion stability at higher resolutions.
In PlugNode, switching between the three is a dropdown change on the Video node. Everything downstream stays the same.
A production pattern that works well: draft with Veo 2 or Veo 3.1 Fast while iterating on prompt and script, then swap to Veo 3.1 at 1080p or 4K for the final publish. No other downstream node needs to change.
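The draft-then-publish pattern amounts to one config switch. A minimal sketch, assuming the dropdown labels shown on this page; the stage-to-model mapping is a convention, not a PlugNode feature:

```python
def pick_video_config(stage):
    """Map a pipeline stage to a Video-node model + resolution."""
    if stage == "draft":
        # Fast variant at 720p: lower latency, good enough for prompt iteration.
        return {"model": "Veo 3.1 Fast", "resolution": "720p", "duration": 6}
    if stage == "final":
        # Full model at 4K; per the rules above, 4K locks the clip to 8s.
        return {"model": "Veo 3.1", "resolution": "4k", "duration": 8}
    raise ValueError(f"unknown stage: {stage}")
```

Because the model is just one field on the Video node, swapping it changes nothing about the trigger, the downstream nodes, or the published endpoint contract.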
Run it on PlugNode
Veo 3.1 is the default model on PlugNode's Video node. Sign in, drop a Video node on the canvas, paste your Gemini API key in Settings, and you're generating. No quota system, no credit markup. Wire it through HTTP Trigger and Respond to Webhook to turn the pipeline into a signed API endpoint your app can call.
You pay Google directly at Gemini API rates for Veo. PlugNode does not add a per-second markup or a credit layer. See the official Gemini pricing page for the current per-second cost by resolution.
Frequently asked questions
- Is Veo 3.1 available through PlugNode today?
- Yes. Veo 3.1 is the default model on PlugNode's Video node. You bring your own Gemini API key in Settings and the canvas runs the model on every flow execution.
- Does Veo 3.1 generate audio?
- Yes. Veo 3.1 produces synchronised audio (dialogue, ambient sound, music cues) inside the same MP4 as the video. For branded voiceover you still layer ElevenLabs on top via the Audio node.
- What is the max resolution and clip length?
- 4K (3840×2160) at 8 seconds in 16:9. 1080p also locks to 8s. 720p supports 4, 6, or 8 seconds in 16:9 or 9:16.
- Can I start Veo 3.1 from a reference image?
- Yes. Attach a reference image to the Video node and the model animates from that still. Passing a reference image locks the duration to 8 seconds regardless of resolution.
- How do I turn a Veo flow into an API my app can call?
- Drop an HTTP Trigger node, wire it through your Video node, and end with Respond to Webhook. Publish the flow. PlugNode gives you a signed endpoint with a rotating secret, 60 req/min rate limiting, and versioned rollback.
- Do I pay PlugNode for Veo generations?
- No. You pay Google at Gemini API rates with your own key. PlugNode does not mark up model costs or run a credit system on top.
Last updated 2026-04-25
Generate your first video ad in 3 minutes.
Free to start. No credit card. Upload a product photo, connect your AI models, click Run.