Complete LlamaIndex Tutorial: Building RAG Applications with LLMs

LlamaIndex is a data framework for building LLM-powered applications. It provides tools to ingest, structure, and access private or domain-specific data, which makes it well suited to building Retrieval-Augmented Generation (RAG) systems.

Why LlamaIndex?

LlamaIndex Advantages:

Easy data ingestion: Connect to 100+ data sources
Flexible indexing: Multiple index types for different use cases
Query engines: Natural language querying over your data
Agent capabilities: Build autonomous LLM agents
Production ready: Scalable and observable

Use Cases:

Question answering over documents
Chatbots with knowledge bases
Document summarization
Semantic search
Data analysis agents

Installation

pip install llama-index

# With OpenAI
pip install llama-index-llms-openai llama-index-embeddings-openai

# With local models
pip install llama-index-llms-ollama llama-index-embeddings-huggingface

# Verify
python -c "from importlib.metadata import version; print(version('llama-index'))"
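The check above covers only the base package. As a rough sketch, the same idea can be extended to every package named in the pip commands by using the standard library's `importlib.metadata`. The helper name `installed_versions` is made up for this example and is not part of LlamaIndex.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the pip commands in this section.
PACKAGES = [
    "llama-index",
    "llama-index-llms-openai",
    "llama-index-embeddings-openai",
    "llama-index-llms-ollama",
    "llama-index-embeddings-huggingface",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Any package reported as "not installed" needs the matching pip command before the examples that import it will run.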

Quick Start

1. Basic RAG Pipeline

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
import os

# The OpenAI integrations read the key from this environment variable
os.environ["OPENAI_API_KEY"] = "your-api-key"

# Load documents
documents = SimpleDirectoryReader("./data").load_data()

# Create index
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic of these documents?")
print(response)
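To make the load, index, and query steps less of a black box, here is a toy sketch of the retrieval idea in plain Python. It is not how LlamaIndex is implemented: the bag-of-words "embedding" stands in for a real embedding model, and every function name here is invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """'Index' step: store each text chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, question, top_k=2):
    """'Query' step, part 1: rank chunks by similarity to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """'Query' step, part 2: put the retrieved text into the prompt for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


chunks = [
    "LlamaIndex connects private data to large language models.",
    "Bananas are rich in potassium.",
    "A vector index stores embeddings for semantic search.",
]
question = "How does a vector index support semantic search?"
index = build_index(chunks)
top = retrieve(index, question, top_k=1)
print(build_prompt(question, top))
```

In a real pipeline the embedding model captures meaning rather than word overlap, and the prompt is sent to the LLM, but the shape of the flow is the same: embed, rank, then answer from the retrieved context.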

2. With Custom LLM

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure settings
Settings.llm = OpenAI(model="gpt-4", temperature=0.1)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Load and index
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query with streaming
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("Summarize the key points")

# Print tokens as they arrive
for token in response.response_gen:
    print(token, end="", flush=True)
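Streaming prints tokens as they arrive, but an application often needs the complete answer afterwards as well (for logging or caching, say). The sketch below shows one way to do both. `fake_response_gen` is a hypothetical stand-in for a streaming response so the example runs without an API key, and `consume_stream` is an illustrative helper, not a LlamaIndex function.

```python
def fake_response_gen(text):
    """Hypothetical stand-in for a streaming response: yields text piece by piece."""
    for word in text.split(" "):
        yield word + " "


def consume_stream(token_iter, echo=True):
    """Print tokens as they arrive and return the full text once the stream ends."""
    parts = []
    for token in token_iter:
        if echo:
            print(token, end="", flush=True)
        parts.append(token)
    if echo:
        print()  # finish the line after the last token
    return "".join(parts).strip()


full_text = consume_stream(fake_response_gen("Key points: ingestion, indexing, querying."))
```

With a real streaming query engine, the same helper should accept the response's token generator in place of `fake_response_gen(...)`, since it only assumes an iterable of strings.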

3. With Local Models (Ollama)

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Use local models (the Ollama server must be running with the model pulled)
Settings.llm = Ollama(model="llama2", request_timeout=300.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Build RAG
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What does this document discuss?")
print(response)

Data Loading

1. File Readers

from llama_index.core import SimpleDirectoryReader

# Load from directory, including subdirectories, keeping only the listed extensions
documents = SimpleDirectoryReader(
    input_dir="./data",
    recursive=True,
    required_exts=[".pdf", ".txt", ".md"]
).load_data()

# Load specific files
documents = SimpleDirectoryReader(
    input_files=["doc1.pdf", "doc2.txt"]
).load_data()

# With metadata: the callable receives each file path and returns a dict
documents = SimpleDirectoryReader(
    "./data",
    file_metadata=lambda filename: {"source": filename}
).load_data()
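The metadata lambda above records only the source path. A named function can attach more context to each document. The following is a minimal sketch using only the standard library; the function name `describe_file` and the choice of fields are illustrative assumptions, not a LlamaIndex convention.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe_file(file_path):
    """Return a metadata dict for one file path (field names are illustrative)."""
    path = Path(file_path)
    stat = path.stat()
    return {
        "source": str(path),
        "file_name": path.name,
        "extension": path.suffix.lower(),
        "size_bytes": stat.st_size,
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
    }


# Try it on a throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.TXT"
    sample.write_text("hello")
    meta = describe_file(str(sample))
    print(meta)
```

Because it takes a file path and returns a dict, a function like this could be passed to the reader in place of the lambda, for example `file_metadata=describe_file`.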

2. Web Readers

# Requires: pip install llama-index-readers-web
from llama_index.readers.web import SimpleWebPageReader, BeautifulSoupWebReader

# Simple web reader
reader = SimpleWebPageReader()
documents = reader.load_data(["https://example.com/page1", "https://example.com/page2"])

# BeautifulSoup reader
reader = BeautifulSoupWebReader()
documents = reader.load_data(
    urls=["https://example.com"],
)
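Conceptually, a web reader fetches a page and reduces its HTML to text that can be indexed. The sketch below illustrates only the second half of that, using the standard library's `html.parser` on an inline HTML string. It is a simplified stand-in, not the logic the readers above actually use, and the names `TextExtractor` and `html_to_text` are invented for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if not self._skip_depth and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return " ".join(self._parts)


def html_to_text(html):
    """Strip tags from an HTML string and return the remaining text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Docs</h1><p>Hello <b>RAG</b>.</p></body></html>"
)
print(html_to_text(page))
```

Real pages usually need more care (navigation menus, cookie banners, site-specific layouts), which is the kind of work a dedicated reader takes on.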


Related Articles

Complete Qdrant Tutorial: Vector Database for AI Applications

Complete ChromaDB Tutorial: Simple Vector Database for AI

RAGAS: Evaluation Framework for RAG Pipelines

Milvus Tutorial: Distributed Vector Database for AI
