DSPy: A Framework for Programmatic LLM Optimization
Manual prompt engineering is tedious and hard to maintain: every time a model changes or the data shifts, prompts need to be rewritten. DSPy (Declarative Self-improving Python) takes a different approach. It is a framework that lets you optimize LLM prompts programmatically, much like training a machine learning model.
In this tutorial, we will learn how to use DSPy to build LLM pipelines that can be automatically optimized, from basic concepts to a complete RAG (Retrieval-Augmented Generation) implementation.
What Is DSPy?
DSPy is a Python framework that changes how we work with LLMs. Instead of writing prompts by hand, you declare what you want to achieve (through signatures and modules), and DSPy automatically optimizes how to achieve it (through optimizers, formerly called teleprompters).
A simple analogy: if manual prompt engineering is like writing assembly code, DSPy is like using a compiler that automatically optimizes your code.
Core concepts of DSPy:
- Signatures: Define the input and output of a task
- Modules: Building blocks that implement prompting strategies
- Optimizers (formerly called teleprompters): Algorithms that automatically optimize prompts
- Metrics: Evaluation functions to measure output quality
- Compilation: The process of running an optimizer over your program, guided by a metric and training examples, to produce an optimized version
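To make the Metrics concept concrete, here is a minimal sketch of an evaluation function. DSPy metrics are plain Python functions that take an example, a prediction, and an optional trace. The `exact_match` name and the `SimpleNamespace` stand-ins (used in place of real DSPy example and prediction objects so the snippet runs without an API key) are assumptions made for illustration.

```python
from types import SimpleNamespace


def exact_match(example, prediction, trace=None):
    """Return True when the predicted answer equals the gold answer,
    ignoring case and surrounding whitespace."""
    gold = example.answer.strip().lower()
    guess = prediction.answer.strip().lower()
    return gold == guess


# Stand-ins for a labeled example and a model prediction (illustrative only).
example = SimpleNamespace(question="What is the capital of France?", answer="Paris")
prediction = SimpleNamespace(answer=" paris ")

print(exact_match(example, prediction))
```

An optimizer would call a function like this on many examples and keep the prompt variants that score best.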
Installation
Install DSPy and its required dependencies:
pip install dspy-ai openai
For additional features:
# For retrieval with ChromaDB
pip install dspy-ai chromadb
# For using local models
pip install dspy-ai transformers torch
Configure your API key:
export OPENAI_API_KEY="sk-your-api-key-here"
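A missing key usually only surfaces at the first model call, so it can help to check for it up front. The sketch below uses a hypothetical helper, `require_env`, that is not part of DSPy; it only reads the environment variable that the OpenAI client expects, `OPENAI_API_KEY`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is missing or empty.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before configuring the LM")
    return value
```

Calling `require_env("OPENAI_API_KEY")` at the top of a script fails fast with a readable message instead of an authentication error later on.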
Basic Setup and Configuration
The first step is configuring the LLM you will use.
import dspy
# Configure with OpenAI
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)
# Or with other models
lm = dspy.LM("anthropic/claude-sonnet-4-20250514")
lm = dspy.LM("openai/gpt-4o")
Signatures: Defining Tasks
Signatures are how DSPy defines the input and output of a task. They are the most fundamental abstraction in DSPy.
Inline Signatures (simple)
# Format: "input -> output"
# Classify sentiment
classify = dspy.Predict("text -> sentiment")
result = classify(text="This product is amazing and high quality!")
print(result.sentiment) # positive
# Question answering
qa = dspy.Predict("question -> answer")
result = qa(question="What is the capital of France?")
print(result.answer) # Paris
# Summarization
summarize = dspy.Predict("document -> summary")
result = summarize(
    document="A long article about AI and its impact..."
)
print(result.summary)
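To see how an inline signature string maps to field names, here is a small illustrative parser. It is not DSPy's own parser, and the name `split_signature` is hypothetical; it only shows the idea that everything left of `->` is an input and everything right of it is an output. The comma-separated form with several inputs is an addition not used in the examples above.

```python
def split_signature(signature: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c' into (['a', 'b'], ['c']).

    Illustrative only: this does not handle type annotations.
    """
    if signature.count("->") != 1:
        raise ValueError(f"expected exactly one '->' in {signature!r}")
    left, right = signature.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    if not inputs or not outputs:
        raise ValueError(f"signature needs at least one input and one output: {signature!r}")
    return inputs, outputs


for sig in ("text -> sentiment", "document -> summary", "context, question -> answer"):
    print(sig, "=>", split_signature(sig))
```

The output field names are the attributes you later read from the result, such as `result.sentiment` or `result.summary`.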
Class-based Signatures (more expressive)
class SentimentAnalysis(dspy.Signature):
    """Analyze sentiment from product review text."""

    text: str = dspy.InputField(desc="Product review text")
    sentiment: str = dspy.OutputField(
        desc="Sentiment: positive, negative, or neutral"
    )
    confidence: float = dspy.OutputField(
        desc="Confidence level between 0.0 and 1.0"
    )
    reasoning: str = dspy.OutputField(
        desc="Reasoning behind the sentiment classification"
    )
# Use the signature
predictor = dspy.Predict(SentimentAnalysis)
result = predictor(
    text="Item arrived on time, quality matches description. "
         "But the packaging was sloppy."
)
print(f"Sentiment: {result.sentiment}")
print(f"Confidence: {result.confidence}")
print(f"Reasoning: {result.reasoning}")
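Field descriptions guide the model but do not guarantee the output stays within the described values, so a light sanity check before using the result can be worthwhile. The sketch below assumes a result object with the same three output fields as `SentimentAnalysis`; the helper `check_sentiment_result` and the `SimpleNamespace` stand-in are hypothetical and not part of DSPy.

```python
from types import SimpleNamespace

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def check_sentiment_result(result) -> list[str]:
    """Return a list of problems found in the result (empty if it looks fine)."""
    problems = []

    sentiment = str(result.sentiment).strip().lower()
    if sentiment not in ALLOWED_SENTIMENTS:
        problems.append(f"unexpected sentiment: {result.sentiment!r}")

    try:
        confidence = float(result.confidence)
    except (TypeError, ValueError):
        problems.append(f"confidence is not a number: {result.confidence!r}")
    else:
        if not 0.0 <= confidence <= 1.0:
            problems.append(f"confidence out of range: {confidence}")

    if not str(result.reasoning).strip():
        problems.append("reasoning is empty")

    return problems


# Stand-in for a prediction returned by the predictor (illustrative values).
sample = SimpleNamespace(sentiment="Neutral", confidence=0.7, reasoning="Mixed review")
print(check_sentiment_result(sample))
```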
Modules: Building Blocks
DSPy provides several built-in modules that implement different prompting strategies.
dspy.Predict
The most basic module: it performs a single LLM call for a given signature.
predictor = dspy.Predict("question -> answer")
result = predictor(question="Explain the concept of machine learning")
print(result.answer)
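The "single LLM call" idea can be sketched with a toy stand-in. This is not how DSPy implements `dspy.Predict` (the real module also handles prompt formatting, demonstrations, and output parsing); `ToyPredict` and `fake_lm` are hypothetical names, and the fake model returns a canned string so the snippet runs offline.

```python
from types import SimpleNamespace


class ToyPredict:
    """Toy stand-in for a single-call module; not DSPy's implementation."""

    def __init__(self, signature, lm):
        # Only supports the simplest form: one input and one output field.
        input_name, output_name = (part.strip() for part in signature.split("->"))
        self.input_name = input_name
        self.output_name = output_name
        self.lm = lm

    def __call__(self, **inputs):
        prompt = f"{self.input_name}: {inputs[self.input_name]}\n{self.output_name}:"
        completion = self.lm(prompt)  # exactly one model call per invocation
        return SimpleNamespace(**{self.output_name: completion.strip()})


calls = []


def fake_lm(prompt):
    """Record the prompt and return a canned completion."""
    calls.append(prompt)
    return " Paris "


qa = ToyPredict("question -> answer", fake_lm)
result = qa(question="What is the capital of France?")
print(result.answer)  # Paris
print(len(calls))     # 1
```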