Guardrails AI: LLM Output Validation and Filtering
Introduction
Large Language Models (LLMs) are incredibly powerful at generating text, but their output is not always reliable. LLMs can produce inaccurate information, leak sensitive data such as PII (Personally Identifiable Information), or fail to conform to expected formats. In production applications, especially customer-facing ones, this can be a serious problem.
Guardrails AI provides a solution for validating, filtering, and structuring LLM outputs. This framework acts as a safety layer between the LLM and end users, ensuring that every response meets predefined quality and safety criteria.
This tutorial walks through Guardrails AI from installation to Guard objects, built-in validators, and Pydantic integration, and ends by building a safe customer-facing chatbot with PII filtering and factual grounding.
Installation
Install Guardrails AI and required dependencies:
pip install guardrails-ai
After installation, configure the Guardrails CLI and install validators from the Guardrails Hub:
guardrails configure
# Install validators from Hub
guardrails hub install hub://guardrails/regex_match
guardrails hub install hub://guardrails/detect_pii
guardrails hub install hub://guardrails/toxic_language
guardrails hub install hub://guardrails/provenance_llm
Make sure your LLM API key is configured:
export OPENAI_API_KEY="sk-your-api-key-here"
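Before making any guarded call, it can help to fail fast when the key is missing. The following is a minimal sketch using only the standard library; the helper name `require_api_key` is hypothetical and is not part of Guardrails AI.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before calling the LLM.")
    return key
```

Calling `require_api_key()` once at application startup surfaces a configuration problem immediately instead of at the first LLM request.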
Guard Object
The Guard is the primary object in Guardrails AI. It acts as a wrapper around LLM calls, adding validation to inputs and/or outputs.
from guardrails import Guard
from guardrails.hub import RegexMatch
import openai

# Create a simple Guard
guard = Guard().use(
    RegexMatch(regex=r"^\d{4}-\d{2}-\d{2}$",
               on_fail="exception")
)

# Use Guard to call LLM
result = guard(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "Give me today's date in YYYY-MM-DD format only, no other text."
    }]
)
print(f"Validated output: {result.validated_output}")
print(f"Validation status: {result.validation_passed}")
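Note that the date regex in this example checks only the shape of the string, so a value such as 2024-13-45 would still pass. Below is a minimal sketch, using only the standard library, of a stricter check that could be applied to the validated output. `is_real_iso_date` is a hypothetical helper, not a Guardrails API.

```python
import re
from datetime import date

# Same shape check as the Guard example: four digits, two digits, two digits.
DATE_SHAPE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def is_real_iso_date(text: str) -> bool:
    """True only if text has the YYYY-MM-DD shape and names a real calendar date."""
    if not DATE_SHAPE.match(text):
        return False
    try:
        date.fromisoformat(text)  # rejects impossible dates such as month 13
    except ValueError:
        return False
    return True
```

A check like this is a reminder that a regex validator constrains format, not meaning. Semantic checks need either additional code or a more specific validator.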
Guard supports several actions when validation fails:
# on_fail options:
# "exception" - Raise an exception
# "filter"    - Remove output that fails validation
# "fix"       - Attempt to fix the output
# "reask"     - Ask the LLM to regenerate
# "noop"      - Continue without action (log only)

guard_with_reask = Guard().use(
    RegexMatch(
        regex=r"^[A-Z][a-z]+$",
        on_fail="reask"  # Ask LLM to try again if it fails
    )
)
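To make the difference between these actions concrete, here is a simplified, library-free sketch of how such a policy could be dispatched. It illustrates the semantics listed above and is not Guardrails' actual implementation. `apply_on_fail`, `fix_fn`, `regenerate`, and `max_reasks` are hypothetical names chosen for this example.

```python
import re
from typing import Callable, Optional


def apply_on_fail(
    output: str,
    pattern: str,
    on_fail: str,
    fix_fn: Optional[Callable[[str], str]] = None,
    regenerate: Optional[Callable[[], str]] = None,
    max_reasks: int = 1,
) -> Optional[str]:
    """Validate output against pattern and apply a simplified on_fail policy."""
    if re.fullmatch(pattern, output):
        return output  # validation passed, nothing to do
    if on_fail == "exception":
        raise ValueError(f"Validation failed for output: {output!r}")
    if on_fail == "filter":
        return None  # drop the failing output
    if on_fail == "fix" and fix_fn is not None:
        return fix_fn(output)  # programmatic repair, no new LLM call
    if on_fail == "reask" and regenerate is not None:
        for _ in range(max_reasks):
            candidate = regenerate()  # stand-in for a new LLM call
            if re.fullmatch(pattern, candidate):
                return candidate
        return None  # still failing after the allowed re-asks
    if on_fail == "noop":
        return output  # pass the failing output through unchanged
    raise ValueError(f"Unsupported policy or missing callback: {on_fail!r}")


# Usage: the first regenerated answer still fails, the second one passes.
attempts = iter(["paris", "Paris"])
print(apply_on_fail("PARIS", r"^[A-Z][a-z]+$", "reask",
                    regenerate=lambda: next(attempts), max_reasks=2))
# prints: Paris
```

The sketch shows why "reask" costs extra LLM calls while "fix" and "filter" do not, which is worth weighing when choosing a policy for latency-sensitive paths.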
Validators
Guardrails AI provides various built-in validators through the Guardrails Hub. Here are some of the most commonly used ones:
Regex Validation
from guardrails import Guard
from guardrails.hub import RegexMatch

# Validate email format
email_guard = Guard().use(
    RegexMatch(
        regex=r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
        on_fail="reask"
    )
)

result = email_guard(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "Provide a valid business email address example."
    }]
)
print(f"Email: {result.validated_output}")
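Before wiring a pattern into a Guard, it is worth checking it offline against known good and bad samples. Here is a minimal sketch with Python's `re` module. The pattern mirrors the one in the example (with `_` allowed in the local part), and the sample addresses are made up for illustration.

```python
import re

EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

# Each sample maps to whether the pattern is expected to accept it.
samples = {
    "sales@example.com": True,
    "first.last+tag@mail.example.org": True,
    "Contact us at sales@example.com": False,  # extra text around the address
    "sales@example": False,                    # no top-level domain
}

for text, expected in samples.items():
    matched = EMAIL_PATTERN.match(text) is not None
    print(f"{text!r}: matched={matched}, expected={expected}")
```

The third sample matters in practice: because the pattern is anchored with `^` and `$`, any explanatory sentence the LLM wraps around the address fails validation and triggers the configured `on_fail` action.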
PII Detection
from guardrails import Guard
from guardrails.hub import DetectPII

# Guard to detect and remove PII
pii_guard = Guard().use(
    DetectPII(
        pii_entities=[
            "EMAIL_ADDRESS",
            "PHONE_NUMBER",
            "PERSON",
            "CREDIT_CARD",
            "IP_ADDRESS",
            "ID"
        ],
        on_fail="fix"  # Automatically mask detected PII
    )
)

result = pii_guard(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "Tell me about our customer John Doe who can be reached at john@email.com or 555-123-4567."
    }]
)
print(f"Safe output: {result.validated_output}")
# Detected PII is replaced with entity placeholders, e.g. "<PERSON> who can be reached at <EMAIL_ADDRESS> or <PHONE_NUMBER>"
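To illustrate what masking does to a string, here is a deliberately simplified, regex-only sketch. It is not how DetectPII works internally: that validator relies on a dedicated detection engine and can recognize entities such as person names, which plain regexes cannot. `mask_pii`, both patterns, and the placeholder style are assumptions made for this example.

```python
import re

# Simplified patterns for illustration only; real-world PII needs broader coverage.
PII_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def mask_pii(text: str) -> str:
    """Replace each match with a placeholder naming the entity type."""
    for entity, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{entity}>", text)
    return text


print(mask_pii("Reach John at john@email.com or 555-123-4567."))
# prints: Reach John at <EMAIL_ADDRESS> or <PHONE_NUMBER>.
```

The name "John" survives in this sketch, which is exactly the gap a model-based detector is meant to close. A regex fallback like this is best treated as a second line of defense, not a replacement.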