Semantic Search Engine from Scratch Tutorial: Embeddings and Vector Search


By Ruby Abdullah · tutorial
Tags: Semantic Search, Embeddings, FAISS, Vector Search, Sentence Transformers, FastAPI

Semantic Search Engine from Scratch

Table of Contents

  • Introduction
  • Prerequisites
  • Understanding Semantic Search
  • Text Embeddings with Sentence-Transformers
  • Vector Indexing with FAISS
  • Vector Indexing with Annoy
  • Building the Search Pipeline
  • Filtering and Metadata
  • Reranking for Improved Relevance
  • Hybrid Search: Combining Semantic and Keyword Search
  • Building the API with FastAPI
  • Evaluation Metrics
  • Best Practices
  • Conclusion

    Introduction

    Traditional keyword-based search engines match documents by exact or fuzzy word overlap. Semantic search goes further by comparing the meaning of queries and documents. When a user searches for "how to fix a broken pipe," a semantic search engine can also return results about "plumbing repair" or "pipe leak solutions," even though those phrases never appear in the query.

    This tutorial guides you through building a complete semantic search engine from scratch. You will learn how to generate text embeddings, build vector indices with FAISS and Annoy, implement filtering and reranking, combine semantic and keyword search into a hybrid system, expose everything through a FastAPI REST API, and measure search quality with standard evaluation metrics.


    Prerequisites

    • Python 3.9 or higher
    • Basic understanding of machine learning concepts
    • Familiarity with REST APIs

    pip install sentence-transformers faiss-cpu annoy numpy fastapi uvicorn rank-bm25 scikit-learn pydantic
    


    Understanding Semantic Search

    Semantic search works in three stages:

  • Indexing: Documents are converted into dense vector embeddings and stored in a vector index.
  • Querying: The user query is converted into an embedding using the same model.
  • Retrieval: The vector index finds the documents whose embeddings are closest to the query embedding.

    The key insight is that semantically similar texts produce similar vectors, which enables meaning-based retrieval rather than keyword matching.

    User Query: "affordable electric cars"
        |
        v
    [Embedding Model] -> Query Vector [0.12, -0.45, 0.78, ...]
        |
        v
    [Vector Index] -> Nearest Neighbor Search
        |
        v
    Results:

  • "Budget-friendly EVs for 2025" (similarity: 0.92)
  • "Low-cost electric vehicles comparison" (similarity: 0.89)
  • "Tesla Model 3 pricing guide" (similarity: 0.84)
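    The three stages can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's implementation: the 3-dimensional vectors are hand-picked stand-ins for real embeddings, and the names `cosine_similarity`, `index`, and `search` are hypothetical helpers introduced only for this example. The "vector index" here is a dictionary scanned exhaustively, which is only practical for very small collections.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Stage 1 (indexing): hand-made 3-dimensional "embeddings".
# These values are assumptions for illustration; a real model
# would produce hundreds of dimensions.
index = {
    "Budget-friendly EVs": [0.9, 0.1, 0.0],
    "Low-cost electric vehicles": [0.8, 0.2, 0.1],
    "Sourdough bread recipe": [0.0, 0.1, 0.9],
}

# Stage 2 (querying): stand-in vector for "affordable electric cars".
query_vector = [0.85, 0.15, 0.05]

def search(query_vec, index, k=2):
    """Stage 3 (retrieval): score every document, keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), title)
              for title, vec in index.items()]
    scored.sort(reverse=True)
    return scored[:k]

results = search(query_vector, index)
for score, title in results:
    print(f"{title}: {score:.3f}")
```

    The two EV documents point in nearly the same direction as the query vector and therefore score close to 1, while the unrelated document scores close to 0. A dedicated vector index replaces the exhaustive scan in `search` once the collection grows.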

    Text Embeddings with Sentence-Transformers

    Sentence-Transformers is a Python library that provides pre-trained models for generating high-quality text embeddings.

    Loading and Using Embedding Models

    from sentence_transformers import SentenceTransformer
    import numpy as np

    # Load a pre-trained model.
    # 'all-MiniLM-L6-v2' is a good balance of speed and quality.
    model = SentenceTransformer('all-MiniLM-L6-v2')

    # Generate an embedding for a single text
    text = "Machine learning is a subset of artificial intelligence."
    embedding = model.encode(text)
    print(f"Embedding shape: {embedding.shape}")  # (384,)
    print(f"Embedding dtype: {embedding.dtype}")  # float32

    # Generate embeddings for multiple texts (batched for efficiency)
    documents = [
        "Python is a versatile programming language.",
        "Deep learning uses neural networks with many layers.",
        "FastAPI is a modern web framework for Python.",
        "Natural language processing deals with text understanding.",
        "Docker containers simplify application deployment.",
    ]
    doc_embeddings = model.encode(documents, show_progress_bar=True, batch_size=32)
    print(f"Batch embeddings shape: {doc_embeddings.shape}")  # (5, 384)
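    Once a batch of embeddings exists, ranking documents against a query comes down to a cosine-similarity computation over the rows of the matrix. The sketch below shows one way to do this with NumPy alone. To stay runnable without downloading a model, it uses random vectors of shape (5, 384) as stand-ins for real embeddings; `top_k_cosine`, `fake_doc_embeddings`, and `fake_query` are hypothetical names introduced for this example. In a real pipeline the arrays returned by `model.encode` would take their place.

```python
import numpy as np

def top_k_cosine(query_vec, doc_matrix, k=3):
    """Return (indices, scores) of the k rows most similar to query_vec."""
    # L2-normalize both sides so a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q
    # Sort descending by score and keep the first k positions.
    order = np.argsort(-scores)[:k]
    return order, scores[order]

# Stand-in data (assumption): random vectors shaped like a (5, 384) batch.
rng = np.random.default_rng(0)
fake_doc_embeddings = rng.normal(size=(5, 384)).astype(np.float32)

# A query built as a slightly perturbed copy of document 2,
# so document 2 should come back as the best match.
noise = 0.01 * rng.normal(size=384).astype(np.float32)
fake_query = fake_doc_embeddings[2] + noise

indices, scores = top_k_cosine(fake_query, fake_doc_embeddings, k=3)
print(indices, scores)
```

    Sentence-Transformers' `encode` also accepts `normalize_embeddings=True`, in which case the explicit normalization step above becomes redundant and a plain dot product is enough.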
