Runtime guardrails

Most guardrails watch the model’s words. ifivo stops the action.

Runtime guardrails at the action layer. Every tool call, payment, refund, external send, and infra operation is evaluated against your policy before it executes — deterministically, in under 50ms at the median, with a full audit trail of the decision.

The audit replays your real agent logs against our starter policy packs. No signup, auto-deletes in 30 days.

Where guardrails live

Three layers, three jobs. You probably need all three.

“Guardrails” is an overloaded word. Here’s how we think about the stack, where ifivo fits, and honestly — where we don’t.

Model-level guardrails
NeMo Guardrails, Guardrails AI, output validators
What it catches

Toxicity, hallucination patterns, schema violations in the LLM's text output

What it doesn’t

What the agent actually does after the output — the tool call, the payment, the external send

Runtime guardrails (this is us)
ifivo
What it catches

The action itself — every tool call, vendor call, destination, amount, and context lineage, evaluated deterministically before it executes

What it doesn’t

We are not a content filter. We don't tell you whether the model's prose is safe to read; we tell you whether the action it just decided to take is safe to run.

Observability platforms
LangSmith, Langfuse, Arize
What it catches

Traces after the fact — latency, cost, token usage, what prompts were sent

What it doesn’t

They watch. They don't stop anything. By the time you see the trace, the refund is issued and the email is sent.

What a runtime guardrail actually does

Four categories of risk, one deterministic policy engine, no LLM in the critical path.

Spend caps and velocity limits

Every action that moves money is typed and metered. Per-call ceilings, daily budgets, per-vendor caps, per-agent caps. Exceeded limits block or queue for approval.

High-risk operations

Production DB writes, infra provisioning, bulk sends, external API keys, new destinations. A default-deny posture with explicit allowlists, not a suggestion in the system prompt.

Prompt-injection containment

Fifteen deterministic patterns run against every context source. When untrusted input meets an external destination, the action pauses — regardless of what the model believes.

Approval routing and kill switches

Ambiguous actions route to the in-app queue and, optionally, Slack, email, or HMAC-signed webhooks that reach PagerDuty, Opsgenie, Twilio, n8n, Zapier, or custom endpoints. One switch pauses an agent, a vendor, or the whole control plane. Everything audit-logged, immutable.

Prompt-injection stance

Containment, not detection. Here’s why that matters.

Detection alone is a losing race.

Prompt-injection patterns mutate the moment you publish a detector. Attackers paraphrase, translate, encode, or hide the instruction in zero-width characters or HTML comments. A classifier that says “this prompt is safe” is one rewording away from wrong.

Containment is the durable answer.

We detect patterns, but we don’t bet the system on detection. The durable control is at the action layer: regardless of how cleverly the model was fooled, the outgoing call still has to clear policy. An untrusted source that wants to leak data needs a destination, and destinations are observable, enumerable, and governable.

We do
  • Flag 15+ deterministic injection patterns
  • Track context lineage — which sources are untrusted
  • Enumerate destinations and classify external vs internal
  • Block or require approval when untrusted input meets an external destination
We don’t
  • Claim to detect every attack — that’s not physically possible
  • Use an LLM to judge whether a prompt is malicious
  • Block text based on tone or “feel”
  • Filter model output for toxicity or PII

How runtime guardrails differ from the other layers

Side-by-side so you can see where the overlap really is — and where it isn’t.

CapabilityRuntime guardrails
ifivo
Content-filter guardrails
NeMo / Guardrails AI
Observability
LangSmith / Langfuse
Runs before the action executes
Deterministic (no LLM in the critical path)partial
Evaluates tool calls, payments, destinations
Filters LLM text output for toxicity / schema
Human-in-the-loop approval queue
Immutable audit log of every decisionpartial

None of these rows are zero-sum. We expect teams to run a content filter, an observability platform, and ifivo. They’re complementary.

When ifivo is the wrong tool

We’d rather you know now than later. A few honest cases:

  • You only need output moderation

    If your agent just generates text for users to read and never takes actions on their behalf, a content-filter library is simpler and cheaper than a control plane. Pick NeMo Guardrails or Guardrails AI.

  • You need tracing, not blocking

    If your problem is “why did the agent do that?” and not “stop the agent from doing that,” an observability platform like LangSmith or Langfuse fits better.

  • Your agents don’t touch money, infra, or external sends

    Runtime guardrails matter most when wrong actions compound. If the blast radius of a bad decision is small, ship without us and come back when it isn’t.

  • You need a full IAM/policy server

    We are opinionated toward agent actions. For generic policy-as-code across services, pair us with OPA or don’t use us.

See what runtime guardrails would have caught on your real traffic.

Upload the last 30 days of your agent logs. We replay them against curated policy packs and show you every action that would’ve been blocked, paused, or flagged for injection — no signup, report expires in 30 days.