Tutorial 13: Apache Kafka for Real-Time ML Pipelines

Table of Contents

Introduction

Prerequisites

Understanding Apache Kafka

Setting Up Kafka Locally

Producers and Consumers with Python

Building a Streaming Feature Pipeline

Real-Time ML Inference with Kafka

Schema Registry for Data Contracts

Kafka Connect for Data Integration

Production Best Practices

Conclusion

Introduction

In modern machine learning systems, the ability to process data and generate predictions in real time is a critical differentiator. Batch inference pipelines that run on hourly or daily schedules simply cannot meet the requirements of fraud detection, recommendation engines, dynamic pricing, or anomaly detection systems that demand sub-second response times.

Apache Kafka is a distributed event streaming platform that enables you to build robust, scalable, real-time data pipelines. When integrated with ML models, Kafka allows you to ingest streaming data, compute features on the fly, serve predictions in real time, and feed results back into downstream systems — all with high throughput and fault tolerance.

This tutorial walks you through the complete journey: from Kafka fundamentals to building a production-grade real-time ML inference pipeline using Python.

Prerequisites

Python 3.9 or higher
Docker and Docker Compose installed
Basic understanding of machine learning concepts
Familiarity with REST APIs and JSON
Install required Python packages:

pip install confluent-kafka fastavro requests scikit-learn numpy pandas fastapi uvicorn
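Before moving on, it can help to confirm that the environment matches the list above. The following is a minimal sketch, not part of the original tutorial: the helper names `missing_modules` and `python_ok` are hypothetical, and the module list assumes the usual import names for the packages installed above (for example, `scikit-learn` is imported as `sklearn`).

```python
import importlib.util
import sys

# Import names can differ from pip names (scikit-learn -> sklearn,
# confluent-kafka -> confluent_kafka).
REQUIRED_MODULES = [
    "confluent_kafka", "fastavro", "requests", "sklearn",
    "numpy", "pandas", "fastapi", "uvicorn",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum=(3, 9)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


print("Python version OK:", python_ok())
print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

If any module is reported missing, rerun the pip command above inside the same virtual environment that runs this script.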

Understanding Apache Kafka

Core Concepts

Apache Kafka operates around several fundamental abstractions:

Topics are named feeds or categories to which records are published. A topic is loosely comparable to a database table or a folder in a filesystem, except that records are appended and read in order rather than updated in place.

Partitions are the ordered, immutable sequences of records that each topic is split into. They enable parallelism: multiple consumers can read from different partitions simultaneously. Ordering is guaranteed only within a partition, not across the whole topic.

Producers are client applications that publish (write) events to Kafka topics.

Consumers are applications that subscribe to (read and process) events from topics. Consumers are organized into consumer groups; within a group, each partition is assigned to exactly one consumer, which balances load across the group's members.

Brokers are the Kafka servers that store data and serve clients. A production cluster typically runs several brokers for redundancy and scalability, though a single broker is enough for local development.

Offsets are sequential IDs assigned to each record within a partition. Consumers track their position by committing offsets. Committing after processing gives at-least-once delivery, and committing before processing gives at-most-once delivery; exactly-once processing additionally requires Kafka's idempotent producer and transactions.
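To make the relationship between keys, partitions, and consumer groups concrete, here is a small simulation that runs without a broker. It is only a sketch under stated assumptions: CRC32 stands in for the client's real key hash, and plain round-robin stands in for the group coordinator's assignment strategy. The function names `partition_for` and `assign_partitions` are hypothetical and are not part of any Kafka client.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition using a stable hash.

    CRC32 is an illustrative stand-in; real clients use their own partitioners.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Distribute partitions round-robin over the consumers of one group.

    Every partition ends up with exactly one owner inside the group.
    """
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# The same key always lands on the same partition, which preserves per-key ordering.
same = partition_for("user-42", 3) == partition_for("user-42", 3)
print("stable mapping:", same)

# Two consumers share three partitions.
print(assign_partitions(3, ["c1", "c2"]))  # {'c1': [0, 2], 'c2': [1]}

# With more consumers than partitions, the extra consumer sits idle.
print(assign_partitions(3, ["c1", "c2", "c3", "c4"]))
```

The last call illustrates a practical sizing rule: the number of partitions caps the useful parallelism of a consumer group, so adding consumers beyond that count does not increase throughput.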

Why Kafka for ML?

| Feature | Benefit for ML |
|---------|---------------|
| High throughput | Handle very high event rates; a well-provisioned cluster can sustain millions of messages per second |
| Durability | Never lose input data, even if the ML service goes down |
| Decoupling | Separate data ingestion from model inference |
| Replayability | Reprocess historical data with updated models |
| Scalability | Add more consumers and partitions as load increases |
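The durability and replayability rows are easier to see with a toy model of a single partition. The sketch below is an in-memory stand-in, not Kafka itself: the `PartitionLog` class, the group names, and the two threshold "models" are all hypothetical and exist only to show how committed offsets let one group catch up after downtime while a second group replays the full history with a newer model.

```python
from typing import Callable


class PartitionLog:
    """Toy stand-in for one partition: an append-only list plus per-group offsets."""

    def __init__(self) -> None:
        self.records: list[dict] = []
        self.committed: dict[str, int] = {}  # group id -> next offset to read

    def append(self, record: dict) -> int:
        """Append a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def consume(self, group: str, model: Callable[[dict], float]) -> list[float]:
        """Score every record from the group's committed offset onward, then commit."""
        start = self.committed.get(group, 0)
        scores = [model(record) for record in self.records[start:]]
        self.committed[group] = len(self.records)
        return scores


def model_v1(tx: dict) -> float:
    """Hypothetical first model: flag amounts above 1000."""
    return 1.0 if tx["amount"] > 1000 else 0.0


def model_v2(tx: dict) -> float:
    """Hypothetical updated model: flag amounts above 500."""
    return 1.0 if tx["amount"] > 500 else 0.0


log = PartitionLog()
for amount in (120, 700, 2500):
    log.append({"amount": amount})

v1_scores = log.consume("fraud-v1", model_v1)   # [0.0, 0.0, 1.0]
log.append({"amount": 900})                     # arrives while the v1 service is down
v1_catchup = log.consume("fraud-v1", model_v1)  # [0.0] (only the unread record)
v2_scores = log.consume("fraud-v2", model_v2)   # [0.0, 1.0, 1.0, 1.0] (full replay)
print(v1_scores, v1_catchup, v2_scores)
```

With a real cluster, the same effect is usually achieved by starting the new model's consumers under a new `group.id` with `auto.offset.reset` set to `earliest`, subject to the topic's retention settings.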

Setting Up Kafka Locally

Create a docker-compose.yml file for a complete Kafka stack:

version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "2181:2181"

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Address that clients on the host machine use to reach the broker
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      # Single broker, so the internal offsets topic cannot be replicated
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
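Once the stack is up (for example with `docker compose up -d`), a quick smoke test can confirm that the broker is reachable from the host. The sketch below makes several assumptions: the broker listens on `localhost:9092` as configured above, topic auto-creation is enabled on the broker, and the topic name `smoke-test` and the helper names `broker_reachable` and `round_trip` are hypothetical. It uses only the basic `Producer` and `Consumer` calls from `confluent-kafka`.

```python
import socket
import uuid


def broker_reachable(host: str = "localhost", port: int = 9092, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    An open port only shows that something is listening, not that the broker is ready.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def round_trip(bootstrap: str = "localhost:9092", topic: str = "smoke-test") -> bytes:
    """Produce one message and read it back. Requires a running broker."""
    # Imported lazily so broker_reachable() works even without the client installed.
    from confluent_kafka import Consumer, Producer

    payload = uuid.uuid4().hex.encode()

    producer = Producer({"bootstrap.servers": bootstrap})
    producer.produce(topic, value=payload)
    producer.flush(10)  # wait up to 10 seconds for delivery

    consumer = Consumer({
        "bootstrap.servers": bootstrap,
        "group.id": f"smoke-{uuid.uuid4().hex}",  # fresh group: no committed offsets
        "auto.offset.reset": "earliest",          # so it starts from the beginning
    })
    consumer.subscribe([topic])
    try:
        for _ in range(30):  # up to 30 polls of 1 second each
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue
            if msg.value() == payload:
                return payload
    finally:
        consumer.close()
    raise TimeoutError("message was not read back within the polling budget")
```

A typical check is to call `broker_reachable()` first and, if it returns `True`, call `round_trip()`. If the TCP check passes but the round trip times out, the advertised listener address is a common place to look.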
