Outlines: Structured LLM Generation with Constrained Decoding
One of the biggest challenges when working with Large Language Models (LLMs) is consistently getting structured and valid output. When you need valid JSON, an enum from specific choices, or other specific formats, LLMs often produce output that does not match the required format. Outlines solves this problem with a constrained decoding approach at the token level, guaranteeing that output is always structurally valid.
How Constrained Decoding Works
Before diving into Outlines, it is important to understand the difference between two approaches to structured generation:
Retry-Based Approach (like Instructor)
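Libraries like Instructor use a "generate-then-validate" approach: the model generates freely, the output is validated against the schema, and a failed validation triggers another request to the model.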
The problem: API costs increase, latency is unpredictable, and there is no guarantee of convergence.
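To make the cost of this approach concrete, here is a minimal sketch of a generate-then-validate loop. It is not Instructor's actual implementation: `generate_with_retries` and `fake_model` are hypothetical names, the "model" is a canned list of replies, and validation is reduced to JSON parsing plus a required-keys check.

```python
import json
from typing import Callable, Optional, Set


def generate_with_retries(
    call_model: Callable[[str], str],
    prompt: str,
    required_keys: Set[str],
    max_retries: int = 3,
) -> Optional[dict]:
    """Call the model, validate the reply, and retry on failure."""
    for _ in range(max_retries):
        raw = call_model(prompt)  # one full model call per attempt
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON: pay for another call
        if isinstance(data, dict) and required_keys.issubset(data):
            return data
    return None  # retry budget exhausted without a valid reply


# Hypothetical stand-in for an LLM: two bad replies, then a valid one.
_replies = iter([
    "Sure! Here is the JSON:",
    '{"name": "Ada"',
    '{"name": "Ada", "age": 36}',
])


def fake_model(prompt: str) -> str:
    return next(_replies)


result = generate_with_retries(
    fake_model, "Describe an employee as JSON.", {"name", "age"}
)
print(result)
```

In this run the valid record only arrives on the third call, and with a smaller `max_retries` the function would return `None`, which is the cost and convergence problem in miniature.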
Constrained Decoding Approach (Outlines)
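Outlines uses a fundamentally different approach: the target structure is enforced during generation itself. At each decoding step, tokens that would break the required format are masked out, so the model can only sample from tokens that keep the output valid.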
The result: the generated output always matches the required structure, with no retries, and the masking adds little computational overhead.
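The idea of token-level masking can be illustrated with a toy decoder. This is a simplified sketch, not Outlines' internals: the vocabulary, the `fake_logits` scores, and the prefix check in `allowed_tokens` are all assumptions made for illustration, whereas a real implementation works on the model's tokenizer vocabulary and a compiled description of the target format.

```python
import math
from typing import Dict, List

VOCAB = ["yes", "no", "may", "be", "!", "<eos>"]
CHOICES = ["yes", "no", "maybe"]  # the only outputs we accept


def allowed_tokens(prefix: str) -> List[str]:
    """Return the tokens that keep `prefix` on a path to one of CHOICES."""
    allowed = []
    for tok in VOCAB:
        if tok == "<eos>":
            # Stopping is only legal once a complete choice has been produced.
            if prefix in CHOICES:
                allowed.append(tok)
        elif any(choice.startswith(prefix + tok) for choice in CHOICES):
            allowed.append(tok)
    return allowed


def fake_logits(prefix: str) -> Dict[str, float]:
    """Hypothetical model scores that always prefer the invalid token '!'."""
    return {"yes": 1.0, "no": 0.5, "may": 2.0, "be": 0.1, "!": 5.0, "<eos>": 3.0}


def constrained_greedy_decode() -> str:
    prefix = ""
    while True:
        logits = fake_logits(prefix)
        allowed = set(allowed_tokens(prefix))
        # Mask: disallowed tokens get -inf and can never be selected.
        masked = {t: (s if t in allowed else -math.inf) for t, s in logits.items()}
        best = max(masked, key=masked.get)
        if best == "<eos>":
            return prefix
        prefix += best


print(constrained_greedy_decode())  # prints "maybe"
```

Without the mask, greedy decoding would pick `!` at every step; with it, the decoder can only ever finish on one of the permitted strings, whatever the scores are.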
Installation
Installing Outlines is straightforward:
pip install outlines
To use with local transformers models:
pip install "outlines[transformers]"
For integration with llama.cpp:
pip install "outlines[llamacpp]"
For integration with vLLM:
pip install "outlines[vllm]"
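After installing, a quick way to confirm which version ended up in the environment is to query the package metadata from Python. The helper name `installed_version` is hypothetical; `importlib.metadata` is part of the standard library.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("outlines:", installed_version("outlines") or "not installed")
```

Knowing the version matters because the Outlines API has changed between releases, so examples written for one version may need adjusting for another.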
JSON Generation with Pydantic Models
The most popular feature of Outlines is the ability to generate valid JSON based on Pydantic models.
Basic Example
import outlines
from pydantic import BaseModel, Field
from typing import List, Optional
from enum import Enum

# Define schema with Pydantic
class Address(BaseModel):
    street: str = Field(description="Street name and number")
    city: str = Field(description="City name")
    state: str = Field(description="State or province")
    zipcode: str = Field(description="Postal code")

class Employee(BaseModel):
    name: str = Field(description="Full name of the employee")
    age: int = Field(ge=18, le=65, description="Employee age")
    email: str = Field(description="Email address")
    department: str = Field(description="Work department")
    salary: float = Field(gt=0, description="Monthly salary")
    address: Address
    skills: List[str] = Field(description="List of skills")

# Load model
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

# Create generator with JSON schema
generator = outlines.generate.json(model, Employee)

# Generate structured data
prompt = """Create a fictional employee record for a technology company
in San Francisco. This employee is a senior data engineer."""

result = generator(prompt)

print(type(result))  # the Employee class: result is a Pydantic object, not a raw string
print(result.name)
print(result.department)
print(result.model_dump_json(indent=2))
Nested and Complex Schemas
from pydantic import BaseModel, Field
from typing import List, Optional
from enum import Enum
import outlines

class Priority(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"

class Status(str, Enum):
    TODO = "todo"
    IN_PROGRESS = "in_progress"
    REVIEW = "review"
    DONE = "done"

class SubTask(BaseModel):
    title: str
    completed: bool

class Task(BaseModel):
    id: int
    title: str = Field(max_length=100)
    description: str
    priority: Priority
    status: Status
    assignee: str
    estimated_hours: float = Field(gt=0)
    subtasks: List[SubTask]
    tags: List[str]