Publish an AI flow as an API.
Turn any PlugNode canvas into a signed HTTP endpoint. Versioned, rate-limited, rotatable. No deploy step, no infra, no SDK.
- Every PlugNode flow can become a signed HTTP endpoint with one click. No deploy step.
- Each publish creates a hash-diffed version. Roll back in one click if a change breaks a caller.
- Built-in: secret rotation, 60 req/min rate limit per trigger, SSRF protection, run history, webhook callbacks.
- Two response modes: synchronous (pass `?wait=true`) or asynchronous (default 202 plus a callback).
- Mid-flow early reply: the Respond to Webhook node returns a result while the rest of the flow keeps running.
What "publish as API" means
Publishing is the one thing a visual AI canvas usually does not do. Most tools stop at a preview button. PlugNode takes the same flow you ran inside the browser and exposes it as a real HTTP endpoint your backend can POST to. Every published version is immutable, hash-diffed against the previous one, and rollback-ready. The endpoint carries a rotating secret, a rate limit, and SSRF protection by default. This page covers the full mechanism: what publishing means, how each primitive works, how the trade-offs compare to rolling your own, and the patterns teams use to ship production AI pipelines behind one URL.
What ships today
- One-click publish: draft flow becomes a signed `POST` endpoint
- Hash-diffed versions with full history and one-click rollback
- Rotating secret per flow, rotatable from Settings
- 60 requests per minute per trigger, configurable per workspace
- Synchronous (`?wait=true`) or asynchronous (202 + callback) response modes
- Mid-flow early reply via the Respond to Webhook node
- Per-node run capture: inputs, outputs, duration, token counts
- SSRF protection on external URLs that the flow references
Why "publish as API" is the wedge
Every AI canvas tool agrees the pipeline should be visual. The disagreement starts at the output.
Most canvas tools stop at a preview button. You click Run, the flow executes once inside the browser, and that is the product. Everything downstream (putting the flow behind a URL your app can call, handling retries, rotating credentials, rolling back a bad deploy) is your problem to solve with infrastructure PlugNode does not own.
That gap is where the actual production work lives. Wiring nodes is 10% of shipping a media pipeline. The other 90% is the boring infra around it: turning the pipeline into a request your backend can fire, protecting it from abuse, versioning changes so a prompt edit does not brick a paying customer, and getting signal out when a run fails.
PlugNode's wedge is that this 90% ships with the canvas. The publish primitive is first-class, not an afterthought.
What "publish" does at the node level
Publishing a flow is a one-click action on the canvas toolbar. Behind it, three things happen.
First, PlugNode serialises the flow graph (nodes, edges, viewport, per-node config) into JSON and computes a SHA-256 hash. If the hash matches the previously published version, publish is a no-op. If it differs, a new `FlowVersion` row is created in the database with `status: 'published'`, the previous version is archived, and `Flow.publishedVersionId` is updated to point at the new snapshot.
Second, the HTTP Trigger node on the flow is promoted to the public surface. If the flow has no HTTP Trigger, publishing surfaces an error rather than producing a broken endpoint. The trigger node's generated URL is `POST https://plugnode.ai/api/trigger/{secret}/{nodeId}`. The secret is per-flow; the node id is per-trigger inside the flow.
Third, the server engine starts accepting requests on that URL. Each incoming request spawns a run, the run executes the flow on the server (not the browser), and the response is shaped by whichever response-side node the flow ended on: Respond to Webhook for rich custom responses or Output for a standard JSON envelope.
The flow is now a live API. No Docker, no function deploy, no queue you have to provision.
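The hash-and-pointer mechanics can be pictured with a minimal sketch. This is an illustration of the behaviour described above, not PlugNode's actual code: the types, the canonicalisation helper, and the in-memory version list are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Illustrative shapes; the real FlowVersion row carries more columns.
interface FlowVersion { version: number; hash: string; status: "published" | "archived"; snapshot: string }
interface Flow { publishedVersionId: number | null; versions: FlowVersion[] }

// Canonicalise by sorting object keys, so semantically equal graphs hash identically.
function canonicalise(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalise).join(",")}]`;
  if (value && typeof value === "object") {
    const entries = Object.keys(value as object).sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalise((value as any)[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

function publish(flow: Flow, graph: unknown): FlowVersion | null {
  const snapshot = canonicalise(graph);
  const hash = createHash("sha256").update(snapshot).digest("hex");
  const current = flow.versions.find((v) => v.version === flow.publishedVersionId);
  if (current && current.hash === hash) return null; // unchanged graph: publish is a no-op
  if (current) current.status = "archived";          // previous version is archived
  const next: FlowVersion = { version: flow.versions.length + 1, hash, status: "published", snapshot };
  flow.versions.push(next);
  flow.publishedVersionId = next.version;            // pointer flip: callers see the new snapshot
  return next;
}
```

Note the no-op check happens before any row is written, which is why re-publishing an unchanged flow never pollutes the version history.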
Versioning with SHA-256 diffs
Every publish generates a hash-diffed version. The hash is a SHA-256 over the canonicalised flow JSON, and the diff is rendered as a tree of added / removed / changed node and edge entries you can inspect before restoring.
`Flow.currentVersionId` is the draft you are editing on the canvas. `Flow.publishedVersionId` is the one the endpoint is running. Those two can diverge: you can edit freely on the canvas without touching what your callers see, then publish when you are ready. Each publish creates a new row under `@@unique([flowId, version])`; version numbers are per-flow and monotonic.
To roll back, open the version history, pick the target version, click Restore. PlugNode updates publishedVersionId to point at the older snapshot. The endpoint URL and secret do not change. Callers see the behaviour of the older version immediately after the pointer flips.
This is the same mechanism Vercel uses for deployments. The flow graph is the artifact, publishes are deployments, and rollbacks are pointer updates.
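Restore itself is small enough to sketch. The model below is illustrative (the names `PublishedFlow` and `restore` are assumptions, not PlugNode's API); the point is that rollback touches one pointer and nothing else:

```typescript
interface Version { version: number; snapshot: string }
interface PublishedFlow { publishedVersionId: number; versions: Version[] }

// Make an older snapshot live again. The endpoint URL and secret are untouched;
// only the pointer moves, so the flip is effectively instant.
function restore(flow: PublishedFlow, target: number): void {
  if (!flow.versions.some((v) => v.version === target)) {
    throw new Error(`version ${target} does not exist`);
  }
  flow.publishedVersionId = target;
}
```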
Secret rotation
Every published flow has a rotating secret. It lives in `Flow.triggerSecret` and forms part of the public URL. Two rules matter.
First, the secret is the only thing proving the caller is authorised. PlugNode intentionally does not run a separate API-key system on top of the trigger secret. That is one less credential for you to manage.
Second, rotation is cheap. Open the flow, open Settings, hit Rotate. The URL path updates with the new secret, the old secret stops working the moment the rotation commits. Your orchestrating code reads the new URL from the flow settings (or from the dashboard API) and continues.
Real-world use: rotate after a contractor leaves the project, after a log leak, after a major version change when you want to close off old callers. Rotation is idempotent and has no rate limit of its own.
Rate limits and burst behaviour
The default rate limit is 60 requests per minute per published trigger. A request past the limit returns HTTP 429 with a Retry-After header. Limits are per-trigger, so a flow with multiple HTTP Trigger nodes has independent buckets for each one; two different flows in the same workspace also have independent limits.
Why 60 per minute: the limit balances fair use against the reality that most AI model calls cost money and latency scales with concurrency. It is high enough for normal production traffic, low enough to protect you from a runaway upstream retry storm. The limit is configurable per workspace when your traffic pattern justifies a higher ceiling.
The rate limit sits in front of the server engine, so a 429 does not consume a flow run or a model call. Your Gemini, OpenAI, or ElevenLabs keys only get hit once the request reaches the engine.
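The per-trigger bucket behaviour can be sketched as a fixed-window counter. This is a simplification for illustration (the actual limiter's algorithm is not documented here); what it demonstrates is the contract: 60 per minute, independent buckets per trigger, and a Retry-After value on rejection.

```typescript
const LIMIT = 60;          // requests per window, the documented default
const WINDOW_MS = 60_000;  // one minute

interface Bucket { windowStart: number; count: number }
const buckets = new Map<string, Bucket>(); // keyed per trigger, not per flow or workspace

// Returns null when the request is allowed, or a Retry-After value in seconds
// when it should be rejected with HTTP 429.
function check(triggerId: string, now: number): number | null {
  const b = buckets.get(triggerId);
  if (!b || now - b.windowStart >= WINDOW_MS) {
    buckets.set(triggerId, { windowStart: now, count: 1 }); // fresh window
    return null;
  }
  if (b.count < LIMIT) { b.count++; return null; }
  return Math.ceil((b.windowStart + WINDOW_MS - now) / 1000); // time until the window resets
}
```

Because this check runs before the engine, a rejected request never spawns a run, which is exactly why a 429 costs you no model-call spend.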
Synchronous, asynchronous, and mid-flow early reply
Three response patterns, one endpoint.
Async (default). The endpoint returns HTTP 202 with a run id the moment it accepts the payload. The flow runs on the server. You get the result either by polling the run-status API or by registering an on_complete webhook in the flow settings.
Synchronous with `?wait=true`. The endpoint holds the connection open until the flow finishes and returns the full response body. Pick this for short flows: text generation, a single image pass, image resize. Anything under ~30 seconds is a reasonable sync use case.
Mid-flow early reply via the Respond to Webhook node. Drop a Respond to Webhook node anywhere in the flow, not only at the end. When the flow reaches that node, the response is returned to the caller and the run continues. Subsequent nodes keep executing in the background. This is the right pattern when you want to acknowledge a long job fast and produce the expensive output (a video render, a multi-voice podcast) after the caller has already gone.
A common pattern: a mid-flow Respond to Webhook returns { "jobId": "..." } immediately, the flow runs the video generation, and an on_complete webhook fires to tell your backend the asset is ready.
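From the caller's side, sync and async differ only in the URL and in how you collect the result. A hedged sketch of building the request (the URL shape follows the trigger format described earlier; the helper name and types are illustrative):

```typescript
interface TriggerRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// "sync" appends ?wait=true and the caller blocks until the flow finishes;
// "async" (the default) gets a 202 plus a run id, and the result arrives via
// polling or an on_complete webhook.
function buildTriggerRequest(
  secret: string,
  nodeId: string,
  payload: unknown,
  mode: "sync" | "async" = "async",
): TriggerRequest {
  const base = `https://plugnode.ai/api/trigger/${secret}/${nodeId}`;
  return {
    url: mode === "sync" ? `${base}?wait=true` : base,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}
```

In Node 18+ you would feed the result straight into `fetch(req.url, req.init)`; mid-flow early reply needs no caller-side change at all, since the response simply arrives before the run ends.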
SSRF protection and untrusted URLs
Flows reference URLs: file inputs from the payload, reference images for a model, outbound callbacks. All of those pass through an SSRF filter before the server engine makes a request.
The filter blocks private IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), loopback (127.0.0.0/8 and IPv6 ::1), link-local (169.254.0.0/16), and cloud-metadata endpoints (169.254.169.254, metadata.google.internal). It also blocks raw IP literals that skip DNS resolution and catches DNS rebinding by re-resolving at fetch time.
This is enforced on the server engine, not the browser. A malicious caller sending a crafted reference URL does not hit your internal network. The filter runs even on authenticated calls because compromise often comes through authenticated channels.
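The IPv4 side of that check can be sketched as a CIDR blocklist over resolved addresses. The ranges come straight from the list above; everything else (helper names, the omission of IPv6 and DNS re-resolution) is a deliberate simplification for illustration:

```typescript
// CIDR blocklist from the ranges above; 169.254.0.0/16 also covers 169.254.169.254.
const BLOCKED: Array<[string, number]> = [
  ["10.0.0.0", 8], ["172.16.0.0", 12], ["192.168.0.0", 16],
  ["127.0.0.0", 8], ["169.254.0.0", 16],
];

function ipToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => (acc << 8) | Number(octet), 0) >>> 0;
}

// True when a resolved IPv4 address falls in any blocked range.
function isBlockedIPv4(ip: string): boolean {
  const addr = ipToInt(ip);
  return BLOCKED.some(([base, bits]) => {
    const mask = (~0 << (32 - bits)) >>> 0;
    return ((addr & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
  });
}
```

The real filter runs this kind of check at fetch time, after re-resolving DNS, which is what closes the rebinding hole: a hostname that resolved clean at validation time cannot later point at 169.254.169.254.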
Run history and observability
Every run writes a `FlowRun` row with a per-flow sequential `runNumber`. The run's JSON columns capture node inputs, outputs, duration, token counts, and any error traceback scoped to the node that failed.
From the dashboard you can open any run and see: what payload arrived at the trigger, which nodes ran, what each node returned, how long each step took, and where it broke if it broke. No grep through logs, no correlation IDs to chase.
This matters most on failures. When a Veo generation returns a content-policy error, the execution log points to the exact node and the exact prompt. When a downstream integration returns a 500, the log shows the HTTP status and the response body. You fix the right node instead of guessing.
Webhook callbacks and fan-out
Three callback types ship: on_complete, on_error, and on_node_update.
on_complete fires when the run finishes successfully, delivering the final response body and run metadata. on_error fires when the run fails, carrying the error object and the node that raised it. on_node_update fires on every node completion and is useful for streaming progress to your own UI or to a job-queue monitor.
Deliveries are logged inside the run's webhookDeliveries JSON so you can audit what went out. Callbacks use exponential backoff on 5xx responses.
Pair these with the mid-flow early reply pattern and you get a full job-queue feel with no queue to manage: caller gets a fast ack, your backend gets an async on_complete with the result, and the run history carries the evidence.
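The retry behaviour on delivery can be sketched in a few lines. The doc only states that backoff is exponential on 5xx, so the base delay, cap, and the `Delivery` field names below are assumptions, not PlugNode's actual values:

```typescript
// A delivery attempt as it might appear in the run's webhookDeliveries log
// (field names illustrative).
interface Delivery {
  type: "on_complete" | "on_error" | "on_node_update";
  status: number;
  attempt: number;
}

// Delay before retry n (0-indexed), doubling each time up to a cap.
// Base and cap are illustrative; jitter is omitted for clarity.
function backoffMs(retry: number, baseMs = 1000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** retry);
}

// Only server errors are retried; 4xx means the receiver rejected the payload.
function shouldRetry(status: number): boolean {
  return status >= 500 && status < 600;
}
```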
Payload validation at the trigger
Each node in a published flow declares the shape of the inputs it accepts. At the HTTP Trigger boundary, incoming JSON is validated against the declared types (text, image URL, file ID, number, boolean) before the run starts.
A malformed payload returns a 400 with a field-level error. The server engine never executes a run with bad input, which means your Gemini / OpenAI / ElevenLabs keys never get called on garbage data. This is one of the cheapest wins of shipping behind a typed canvas instead of a raw REST proxy.
For richer validation (enums, string length, regex), wire a Text Input node with constraints or pre-validate in a light server step before the model call.
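The boundary check can be pictured as a small type-map validator. The five declared types come from the list above; the function name, the error shape, and the exact string checks are illustrative assumptions:

```typescript
type FieldType = "text" | "image_url" | "file_id" | "number" | "boolean";

interface FieldError { field: string; message: string }

// Validate an incoming JSON body against the trigger's declared shape.
// Returns [] when valid; otherwise field-level errors for a 400 response,
// before any run spawns or any model key gets called.
function validatePayload(
  shape: Record<string, FieldType>,
  body: Record<string, unknown>,
): FieldError[] {
  const errors: FieldError[] = [];
  for (const [field, type] of Object.entries(shape)) {
    const value = body[field];
    if (value === undefined) {
      errors.push({ field, message: "missing" });
      continue;
    }
    const ok =
      type === "number" ? typeof value === "number" :
      type === "boolean" ? typeof value === "boolean" :
      type === "image_url" ? typeof value === "string" && /^https?:\/\//.test(value) :
      typeof value === "string"; // text and file_id are plain strings here
    if (!ok) errors.push({ field, message: `expected ${type}` });
  }
  return errors;
}
```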
Walkthrough: voiceover API in four nodes
This is the full pattern. Goal: a URL your backend POSTs text to, gets back an MP3 URL.
- Drop an HTTP Trigger node on the canvas. Set its expected payload to `{ "text": "string" }`.
- Add a Gemini Text node. System prompt: "Rewrite the input text for natural speech cadence. Remove URLs. Expand abbreviations. Preserve meaning." Connect the trigger's `text` output to the Gemini `prompt_in` input.
- Add an ElevenLabs Audio node. Pick a voice, set stability and similarity, connect the Gemini output to the Audio node's text input.
- Add a Respond to Webhook node. Connect the Audio node's URL output to the response body. Publish.
You now have a production voiceover endpoint that polishes the script before synthesis. Your backend calls it with `{ "text": "..." }`, gets back `{ "audio_url": "..." }`. Rotate the secret when a contractor leaves. Roll back when a prompt change makes the voice sound weird. All from the same flow.
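Calling it from a Node backend is a few lines. The secret and node id below are placeholders you would read from the flow's settings, and the injectable `doFetch` parameter is an illustration choice (pass the global `fetch` in Node 18+); the `audio_url` field matches the response shape above:

```typescript
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ json(): Promise<unknown> }>;

// Placeholder secret and node id; read the real values from your flow's settings.
const ENDPOINT = "https://plugnode.ai/api/trigger/<secret>/<nodeId>?wait=true";

// POST the text, block on ?wait=true, return the MP3 URL from the response.
async function generateVoiceover(text: string, doFetch: FetchLike): Promise<string> {
  const res = await doFetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  const body = (await res.json()) as { audio_url?: unknown };
  if (typeof body.audio_url !== "string") throw new Error("unexpected response shape");
  return body.audio_url;
}
```

For long scripts, drop `?wait=true`, take the 202, and consume the on_complete callback instead; the function body barely changes.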
The AI voiceover generator use case walks through the same flow step by step.
Versus rolling your own
You could build this yourself. Spin up a Fastify server, wire in the Gemini SDK and the ElevenLabs SDK, put it behind a rate limiter, add a secret-rotation endpoint, ship a version-history system, stand up SSRF filtering, add an on_complete webhook emitter. Each piece is known engineering. Together it is two weeks of work that is not your product.
PlugNode is not arguing that shipping an API is hard. It is arguing that shipping the same API as a 4-node flow is faster, the rollback is cleaner, and the contract is visible. A new engineer opening your flow sees the entire pipeline on one screen. No grep, no README.
That is the case for a canvas that publishes. Fal.ai Workflows publishes, but authoring is YAML with no visual editor, endpoints are mutable, and there is no rollback. ComfyUI vs PlugNode covers the other side: a visual canvas that does not publish at all.
Versus Fal.ai Workflows and ComfyUI wrappers
Fal.ai Workflows is the closest adjacent product. It publishes endpoints, but the authoring experience is YAML and the endpoints are mutable (no per-publish versioning, no rollback). Fal.ai vs PlugNode goes deep on the tradeoffs.
ComfyUI third-party wrappers (ComfyDeploy, RunComfy) retrofit an API layer onto a ComfyUI graph. They handle GPU hosting well. Versioning and rollback are bolted on or absent depending on which wrapper you pick. Secret rotation is per-wrapper. SSRF protection varies.
The pattern matters. If your flow is not going to change, any of these can publish. If your flow will change (and production flows always change), hash-diffed versions and one-click rollback are what separate "it ran that one time" from "I can edit this without a pager".
Common production patterns
A few patterns repeat across teams shipping on PlugNode.
Feature-flag via version rollback. Publish the new version, watch the execution log and your own metrics for a few minutes, roll back if the error rate climbs. Cheap feature flagging on a pipeline you do not need to deploy separately.
Mid-flow ack + slow tail. Use Respond to Webhook early to return { "jobId": "..." }, finish the expensive render (video, bulk image variants), fire on_complete. Your caller integrates like a job queue without running one. The product video ads use case uses this exact shape.
Model A/B via parallel fan-out. Two Text nodes (one Gemini, one OpenAI) downstream of the same trigger, both wired into Respond to Webhook. The response contains both outputs plus per-node latency. Your analytics picks the winner over time. See the multi-model A/B testing use case.
Bulk variant generation. Single HTTP Trigger that accepts a brief and a count. Fan out to N Image nodes with slight prompt variations. Respond to Webhook returns an array of URLs. The social media content pipeline use case shows the fanout applied to platform-specific sizes. For e-commerce teams running the same pattern SKU-by-SKU, the Shopify ad variants use case extends it into a full per-platform ad set.
What ships today vs roadmap
Native on PlugNode today: HTTP Trigger, Respond to Webhook, hash-diffed versioning, rollback, secret rotation, rate limits, SSRF protection, run history, on_complete / on_error / on_node_update webhooks, synchronous and asynchronous response modes, mid-flow early reply, per-node payload validation, BYO keys for Gemini (including Veo 3.1 and Nano Banana Pro), OpenAI, and ElevenLabs.
On the roadmap but not shipped: per-caller API keys layered on top of the trigger secret, WebSocket streaming responses, per-flow custom domains, dedicated node support for non-Gemini image and video vendors (Flux, Kling, Sora, Recraft, Ideogram). Non-native vendors are honest gaps, not marketing claims. When they ship, they get their own model page.
Frequently asked questions
- How does publishing a flow work?
- Drop an HTTP Trigger node on the canvas, wire your pipeline, end with Respond to Webhook, and hit Publish. PlugNode assigns the flow a versioned endpoint URL and a rotating secret. Your app POSTs to that URL with the payload the trigger expects and the flow runs on PlugNode's server engine. No deploy step, no Docker, no queue to manage.
- What is the difference between synchronous and asynchronous mode?
- Default: the endpoint returns HTTP 202 immediately and the flow runs in the background. Pass `?wait=true` in the URL and the endpoint holds the connection open until the flow completes and returns the full response. Pick synchronous for short flows (text generation, image resize) and asynchronous for longer ones (video generation, multi-step pipelines).
- What is mid-flow early reply?
- The Respond to Webhook node can return a response to the caller at any point in the flow, not just at the end. The rest of the flow keeps running after the response goes out. Useful for pipelines that need to acknowledge a job fast and continue expensive work (video render, batch image generation) in the background.
- How are secrets rotated?
- Open the flow, open Settings, hit Rotate. The old secret stops working instantly and a new one is issued. Because the secret forms part of the endpoint path, the URL updates with it; the node id and everything else stay the same. Rotation is an idempotent operation you can run as often as you want.
- What happens if I publish a breaking change?
- Every publish is versioned with a SHA-256 diff against the previous version. Open the version history from the flow, pick any past version, click Restore. The endpoint URL stays stable. Rollback is instant because it is a version pointer update, not a rebuild. The caller contract is unchanged while you investigate the break.
- Are there rate limits?
- 60 requests per minute per published trigger by default. Bursts beyond the limit return 429 with a Retry-After header. The limit is configurable per workspace. Multiple triggers on different flows have independent limits.
- What about SSRF and malicious URLs?
- PlugNode blocks flows from fetching private IP ranges, localhost, and cloud-metadata endpoints by default. External URLs that a flow references are validated before the server engine reaches them. This is enforced on both the HTTP Trigger input and any file URLs referenced in the payload.
- How does this differ from Fal.ai Workflows or a ComfyUI export?
- Fal.ai Workflows are YAML-defined and ship as mutable endpoints with no versioning or rollback. ComfyUI has no native publishing. Third-party wrappers retrofit an API layer but do not add hash-diffed version history. PlugNode is the only visual canvas in the image and video category that ships the full publishing stack (versioning, rollback, rotation, rate limits, SSRF, mid-flow reply) as a first-class feature.
Last updated 2026-04-25
Generate your first video ad in 3 minutes.
Free to start. No credit card. Upload a product photo, connect your AI models, click Run.