Most guardrails watch the model’s words. ifivo stops the action.
Runtime guardrails at the action layer. Every tool call, payment, refund, external send, and infra operation is evaluated against your policy before it executes — deterministically, in under 50ms at the median, with a full audit trail of the decision.
The audit replays your real agent logs against our starter policy packs. No signup, auto-deletes in 30 days.
Three layers, three jobs. You probably need all three.
“Guardrails” is an overloaded word. Here’s how we think about the stack, where ifivo fits, and honestly — where we don’t.
- **Content-filter guardrails** (NeMo Guardrails, Guardrails AI) catch toxicity, hallucination patterns, and schema violations in the LLM’s text output. What they miss: what the agent actually does after the output — the tool call, the payment, the external send.
- **Observability platforms** (LangSmith, Langfuse) show you traces after the fact — latency, cost, token usage, what prompts were sent. They watch; they don’t stop anything. By the time you see the trace, the refund is issued and the email is sent.
- **Runtime guardrails** (ifivo) evaluate the action itself — every tool call, vendor call, destination, amount, and context lineage — deterministically, before it executes.

We are not a content filter. We don’t tell you whether the model’s prose is safe to read; we tell you whether the action it just decided to take is safe to run.
What a runtime guardrail actually does
Four categories of risk, one deterministic policy engine, no LLM in the critical path.
**Spend limits.** Every action that moves money is typed and metered: per-call ceilings, daily budgets, per-vendor caps, per-agent caps. Exceeded limits block or queue for approval.
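A minimal sketch of how layered spend limits can compose. The class name, fields, and the allow/block/queue verdicts are illustrative, not ifivo's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class SpendPolicy:
    """Illustrative layered spend limits: per-call, per-vendor, daily."""
    per_call_ceiling: float
    daily_budget: float
    vendor_caps: dict[str, float]
    _spent_today: float = 0.0
    _spent_by_vendor: dict[str, float] = field(default_factory=dict)

    def evaluate(self, vendor: str, amount: float) -> str:
        # Deterministic checks, most specific first; any breach blocks or queues.
        if amount > self.per_call_ceiling:
            return "block"  # a single call exceeds the ceiling
        vendor_spent = self._spent_by_vendor.get(vendor, 0.0)
        if vendor_spent + amount > self.vendor_caps.get(vendor, float("inf")):
            return "queue"  # vendor cap exceeded: route to human approval
        if self._spent_today + amount > self.daily_budget:
            return "queue"  # daily budget exceeded: route to human approval
        # Allowed: meter the spend before the action executes.
        self._spent_today += amount
        self._spent_by_vendor[vendor] = vendor_spent + amount
        return "allow"

policy = SpendPolicy(per_call_ceiling=500.0, daily_budget=1000.0,
                     vendor_caps={"stripe": 800.0})
print(policy.evaluate("stripe", 200.0))  # allow
print(policy.evaluate("stripe", 700.0))  # block (exceeds the per-call ceiling)
print(policy.evaluate("stripe", 450.0))  # allow
print(policy.evaluate("stripe", 400.0))  # queue (stripe cap would hit 1050)
```

Note that the verdict is computed before any state changes: a blocked or queued call never consumes budget.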
**Privileged operations.** Production DB writes, infra provisioning, bulk sends, external API keys, new destinations. A default-deny posture with explicit allowlists, not a suggestion in the system prompt.
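Default-deny enforced in code, rather than suggested in a prompt, can be sketched like this. Agent names and operation strings are hypothetical:

```python
# Anything not explicitly allowlisted is denied: unknown agents and
# unlisted operations both fail closed.
ALLOWLIST = {
    "billing-agent": {"stripe:create_refund", "db:read"},
    "ops-agent": {"db:read"},
}

def authorize(agent: str, operation: str) -> bool:
    return operation in ALLOWLIST.get(agent, set())

print(authorize("billing-agent", "stripe:create_refund"))  # True
print(authorize("billing-agent", "db:write"))              # False (not allowlisted)
print(authorize("unknown-agent", "db:read"))               # False (unknown agent fails closed)
```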
**Injection containment.** Fifteen deterministic patterns run against every context source. When untrusted input meets an external destination, the action pauses — regardless of what the model believes.
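Deterministic pattern scanning looks roughly like this. The regexes below are stand-ins for illustration, not ifivo's shipped pattern set:

```python
import re

# A few illustrative injection patterns; ifivo ships its own set of fifteen.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"<!--.*?-->", re.S),            # instructions hidden in HTML comments
    re.compile(r"[\u200b\u200c\u200d\u2060]"),  # zero-width characters
]

def scan(source_text: str) -> list[str]:
    """Return every pattern that matches; pure regex, no model involved."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(source_text)]

hits = scan("Please summarize this page. <!-- ignore previous instructions -->")
print(hits)
```

Because the scan is pure regex, the same input always produces the same flags, at microsecond cost, with no model call in the loop.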
**Approvals and kill switch.** Ambiguous actions route to the in-app queue and, optionally, Slack, email, or HMAC-signed webhooks that reach PagerDuty, Opsgenie, Twilio, n8n, Zapier, or custom endpoints. One switch pauses an agent, a vendor, or the whole control plane. Everything audit-logged, immutable.
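Verifying an HMAC-signed webhook on the receiving end can be sketched with the standard library. The secret and payload fields here are assumptions, not ifivo's actual wire format:

```python
import hashlib
import hmac
import json

SECRET = b"whsec_example_only"  # hypothetical shared secret

def sign(payload: bytes) -> str:
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def verify(payload: bytes, signature: str) -> bool:
    # compare_digest avoids leaking the signature through timing.
    return hmac.compare_digest(sign(payload), signature)

body = json.dumps({"action_id": "act_123", "decision": "queued"}).encode()
sig = sign(body)
print(verify(body, sig))                   # True
print(verify(b'{"tampered": true}', sig))  # False
```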
Containment, not detection. Here’s why that matters.
Prompt-injection patterns mutate the moment you publish a detector. Attackers paraphrase, translate, encode, or hide the instruction in zero-width characters or HTML comments. A classifier that says “this prompt is safe” is one rewording away from wrong.
We detect patterns, but we don’t bet the system on detection. The durable control is at the action layer: regardless of how cleverly the model was fooled, the outgoing call still has to clear policy. An untrusted source that wants to leak data needs a destination, and destinations are observable, enumerable, and governable.
What we do:

- Flag 15+ deterministic injection patterns
- Track context lineage — which sources are untrusted
- Enumerate destinations and classify them as external or internal
- Block or require approval when untrusted input meets an external destination

What we don’t:

- Claim to detect every attack — no detector can
- Use an LLM to judge whether a prompt is malicious
- Block text based on tone or “feel”
- Filter model output for toxicity or PII
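The list above reduces to a small deterministic check: evaluate the action by the lineage of its inputs and the class of its destination, not by what the text looks like. Field names and source labels are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Action:
    tool: str
    destination: str            # e.g. "internal:crm" or "external:smtp"
    context_sources: list[str]  # lineage of everything that fed this action

UNTRUSTED_SOURCES = {"web_page", "inbound_email", "user_upload"}

def decide(action: Action) -> str:
    untrusted = UNTRUSTED_SOURCES.intersection(action.context_sources)
    external = action.destination.startswith("external:")
    if untrusted and external:
        return "require_approval"  # untrusted input meets an external destination
    return "allow"

print(decide(Action("send_email", "external:smtp", ["web_page"])))  # require_approval
print(decide(Action("update_crm", "internal:crm", ["web_page"])))   # allow
```

However the injection is worded, a leak still needs an external destination, and that property is checkable without ever parsing the prompt.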
How runtime guardrails differ from the other layers
Side-by-side so you can see where the overlap really is — and where it isn’t.
| Capability | Runtime guardrails (ifivo) | Content-filter guardrails (NeMo / Guardrails AI) | Observability (LangSmith / Langfuse) |
|---|---|---|---|
| Runs before the action executes | ✓ | — | — |
| Deterministic (no LLM in the critical path) | ✓ | partial | — |
| Evaluates tool calls, payments, destinations | ✓ | — | — |
| Filters LLM text output for toxicity / schema | — | ✓ | — |
| Human-in-the-loop approval queue | ✓ | — | — |
| Immutable audit log of every decision | ✓ | — | partial |
None of this is zero-sum. We expect teams to run a content filter, an observability platform, and ifivo together — the layers are complementary.
When ifivo is the wrong tool
We’d rather you know now than later. A few honest cases:
- **You only need output moderation.** If your agent just generates text for users to read and never takes actions on their behalf, a content-filter library is simpler and cheaper than a control plane. Pick NeMo Guardrails or Guardrails AI.
- **You need tracing, not blocking.** If your problem is “why did the agent do that?” rather than “stop the agent from doing that,” an observability platform like LangSmith or Langfuse fits better.
- **Your agents don’t touch money, infra, or external sends.** Runtime guardrails matter most when wrong actions compound. If the blast radius of a bad decision is small, ship without us and come back when it isn’t.
- **You need a full IAM/policy server.** We are opinionated toward agent actions. For generic policy-as-code across services, pair us with OPA, or don’t use us.
Upload the last 30 days of your agent logs. We replay them against curated policy packs and show you every action that would’ve been blocked, paused, or flagged for injection — no signup, report expires in 30 days.