Complete LlamaIndex Tutorial: Building RAG Applications with LLMs
LlamaIndex is a data framework for building LLM-powered applications. It provides tools to ingest, structure, and query private or domain-specific data, which makes it a good fit for Retrieval-Augmented Generation (RAG) systems.
Why LlamaIndex?
LlamaIndex Advantages:
- Easy data ingestion: Connect to 100+ data sources
- Flexible indexing: Multiple index types for different use cases
- Query engines: Natural language querying over your data
- Agent capabilities: Build autonomous LLM agents
- Production ready: Scalable and observable
Common Use Cases:
- Question answering over documents
- Chatbots with knowledge bases
- Document summarization
- Semantic search
- Data analysis agents
Installation
pip install llama-index
# With OpenAI
pip install llama-index-llms-openai llama-index-embeddings-openai
# With local models
pip install llama-index-llms-ollama llama-index-embeddings-huggingface
# Verify
python -c "import llama_index.core; print(llama_index.core.__version__)"
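To check from Python which of the packages listed above are present, a small sketch using the standard library's `importlib.metadata` may help. The helper name `installed_versions` is hypothetical; the distribution names are the ones used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip commands above.
PACKAGES = [
    "llama-index",
    "llama-index-llms-openai",
    "llama-index-embeddings-openai",
    "llama-index-llms-ollama",
    "llama-index-embeddings-huggingface",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Any package reported as "not installed" only matters if you plan to use that integration; the OpenAI and local-model packages are independent of each other.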
Quick Start
1. Basic RAG Pipeline
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
# Load documents
documents = SimpleDirectoryReader("./data").load_data()
# Create index
index = VectorStoreIndex.from_documents(documents)
# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic of these documents?")
print(response)
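Conceptually, the query step retrieves the chunks most similar to the question and passes them to the LLM as context. The toy sketch below illustrates only that retrieval idea, using word-count vectors and cosine similarity in plain Python. It is not how LlamaIndex is implemented (a real index uses vectors from the configured embedding model), and the names `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (real systems use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


# Hypothetical sample chunks standing in for indexed document text.
chunks = [
    "LlamaIndex builds a vector index over your documents.",
    "Bananas are rich in potassium.",
    "A query engine answers questions using retrieved context.",
]
question = "How does the query engine answer questions?"
context = retrieve(question, chunks, top_k=1)

# The retrieved text would be placed into the LLM prompt as context.
prompt = f"Context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```

Word overlap is a crude stand-in for semantic similarity, but the ranking step has the same shape: score every chunk against the query and keep the best few.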
2. With Custom LLM
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
# Configure settings
Settings.llm = OpenAI(model="gpt-4", temperature=0.1)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
# Load and index
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
# Query with streaming
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("Summarize the key points")
for token in response.response_gen:
    print(token, end="", flush=True)
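When streaming, you often want the full answer as well as the live output, for example to log it or store it. A minimal sketch of that pattern follows. It uses a stand-in generator so it runs without an API key; `fake_response_gen` and `stream_and_collect` are hypothetical names, and the stand-in assumes the real stream yields plain text pieces.

```python
def fake_response_gen():
    """Stand-in for a streaming response: yields text pieces one at a time."""
    for piece in ["RAG ", "combines ", "retrieval ", "and ", "generation."]:
        yield piece


def stream_and_collect(token_gen):
    """Print each piece as it arrives and return the full text at the end."""
    pieces = []
    for token in token_gen:
        print(token, end="", flush=True)
        pieces.append(token)
    print()  # final newline after the stream ends
    return "".join(pieces)


full_text = stream_and_collect(fake_response_gen())
```

With a real streaming query you would pass the response's token generator in place of `fake_response_gen()`. Note that a generator can only be consumed once, so collect the text during the first pass.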
3. With Local Models (Ollama)
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
# Use local models
Settings.llm = Ollama(model="llama2", request_timeout=300.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
# Build RAG
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What does this document discuss?")
print(response)
Data Loading
1. File Readers
from llama_index.core import SimpleDirectoryReader
# Load from directory
documents = SimpleDirectoryReader(
    input_dir="./data",
    recursive=True,
    required_exts=[".pdf", ".txt", ".md"]
).load_data()
# Load specific files
documents = SimpleDirectoryReader(
    input_files=["doc1.pdf", "doc2.txt"]
).load_data()
# With metadata
documents = SimpleDirectoryReader(
    "./data",
    file_metadata=lambda filename: {"source": filename}
).load_data()
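The metadata callback receives a file path and returns a dictionary, so a named function can carry more than the one-key lambda. The sketch below is one possible version; the function name `build_file_metadata` and the metadata keys are illustrative choices, not part of LlamaIndex. It runs against a temporary file so it does not need a `./data` folder.

```python
import os
import tempfile


def build_file_metadata(file_path):
    """Return a metadata dict for one file (keys are illustrative choices)."""
    return {
        "source": file_path,
        "file_name": os.path.basename(file_path),
        "extension": os.path.splitext(file_path)[1].lower(),
        "size_bytes": os.path.getsize(file_path),
    }


# Demonstrate on a temporary file so the sketch runs without a ./data folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "notes.TXT")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("hello")
    sample_metadata = build_file_metadata(sample_path)
    print(sample_metadata)
```

The function could then be passed to the reader in place of the lambda. Fields such as the extension are convenient later if you want to filter or group retrieved chunks by file type.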
2. Web Readers
from llama_index.readers.web import SimpleWebPageReader, BeautifulSoupWebReader
# Simple web reader
reader = SimpleWebPageReader()
documents = reader.load_data(["https://example.com/page1", "https://example.com/page2"])
# BeautifulSoup reader
reader = BeautifulSoupWebReader()
documents = reader.load_data(
    urls=["https://example.com"],