A Complete Feast Tutorial: Feature Store for Machine Learning

By Ruby Abdullah · tutorial
Feast · Feature Store · MLOps · Machine Learning · Python · Data Engineering

Feast (Feature Store) is an open-source feature store that helps ML teams manage, discover, and serve features for machine learning models. It bridges the gap between data engineering and machine learning by providing a consistent way to define, store, and retrieve features.

Why Feast?

Advantages of Feast:
  • Feature consistency: the same feature definitions are used for training and serving
  • Feature sharing: features can be reused across teams
  • Point-in-time correctness: prevents data leakage by joining only feature values that were available at each event's timestamp
  • Low-latency serving: online feature retrieval
  • Feature discovery: a centralized feature catalog
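
Point-in-time correctness is the least intuitive item in the list above, so here is a minimal sketch of the idea in plain pandas. It is not how Feast implements the join internally; the sample values, the `point_in_time_join` helper, and the one-day window are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical feature log: each row is a value observed at event_timestamp.
features = pd.DataFrame({
    "driver_id": [1001, 1001, 1001],
    "event_timestamp": pd.to_datetime(
        ["2024-01-14 09:00", "2024-01-15 09:00", "2024-01-15 11:00"]
    ),
    "conv_rate": [0.10, 0.20, 0.30],
})

# Hypothetical training event observed at 10:00 on 2024-01-15.
labels = pd.DataFrame({
    "driver_id": [1001],
    "event_timestamp": pd.to_datetime(["2024-01-15 10:00"]),
})


def point_in_time_join(labels, features, ttl):
    """Attach to each label row the latest feature row at or before its
    timestamp, looking back no further than ttl."""
    return pd.merge_asof(
        labels.sort_values("event_timestamp"),
        features.sort_values("event_timestamp"),
        on="event_timestamp",
        by="driver_id",
        direction="backward",  # never look into the future
        tolerance=ttl,         # ignore values older than the window
    )


joined = point_in_time_join(labels, features, ttl=pd.Timedelta(days=1))
print(joined)
```

The 11:00 value is newer than the training event, so a naive "latest value" join would leak it into the training set; the backward as-of join picks the 09:00 value instead.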

Use Cases:
  • ML model training dan serving
  • Real-time recommendation systems
  • Fraud detection
  • Personalization engines
  • Feature sharing across teams

Installation

# Basic installation
pip install feast

# With specific providers (the quotes keep shells such as zsh from expanding the brackets)
pip install 'feast[redis]'      # Redis online store
pip install 'feast[gcp]'        # Google Cloud
pip install 'feast[aws]'        # AWS
pip install 'feast[snowflake]'  # Snowflake

# Verify installation
feast version
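
The same check can be done from Python, which is handy inside notebooks or CI scripts. This is a small sketch using only the standard library; the helper name `installed_feast_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_feast_version():
    """Return the installed Feast version string, or None if Feast is missing."""
    try:
        return version("feast")
    except PackageNotFoundError:
        return None


feast_version = installed_feast_version()
print(feast_version or "Feast is not installed in this environment")
```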

Quick Start

1. Initialize Project

# Create a new Feast project
feast init my_feature_repo
cd my_feature_repo/feature_repo

# Project structure (the exact layout can vary between Feast versions):
my_feature_repo/
└── feature_repo/
    ├── __init__.py
    ├── data/
    │   └── driver_stats.parquet
    ├── example_repo.py
    └── feature_store.yaml

2. Feature Store Configuration

# feature_store.yaml
project: my_project
provider: local
registry: data/registry.db
online_store:
  type: sqlite
  path: data/online_store.db
offline_store:
  type: file
entity_key_serialization_version: 2

3. Define Features

# feature_repo/features.py
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource, ValueType
from feast.types import Float32, Int64

# Define the data source: a Parquet file with an event timestamp column
driver_stats_source = FileSource(
    name="driver_stats_source",
    path="data/driver_stats.parquet",
    timestamp_field="event_timestamp",
    created_timestamp_column="created",
)

# Define the entity: the key that features are joined on
driver = Entity(
    name="driver_id",
    value_type=ValueType.INT64,
    description="Driver identifier",
)

# Define the feature view
# (Field replaces the older Feature class when a view is declared with schema=)
driver_stats_fv = FeatureView(
    name="driver_stats",
    entities=[driver],
    ttl=timedelta(days=1),  # how far back to look for a feature value
    schema=[
        Field(name="conv_rate", dtype=Float32),
        Field(name="acc_rate", dtype=Float32),
        Field(name="avg_daily_trips", dtype=Int64),
    ],
    source=driver_stats_source,
    online=True,
    tags={"team": "driver_performance"},
)
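
To make the schema concrete, the sketch below builds a small table with the columns the definitions above expect: the entity key, the two timestamp columns, and the three features. The values are synthetic and the `make_driver_stats` helper is hypothetical; real data would come from your own pipeline.

```python
from datetime import datetime, timedelta

import pandas as pd


def make_driver_stats(driver_ids, start, hours):
    """Build an illustrative table with one row per driver per hour."""
    rows = []
    for hour in range(hours):
        ts = start + timedelta(hours=hour)
        for i, driver_id in enumerate(driver_ids):
            rows.append({
                "event_timestamp": ts,   # matches timestamp_field
                "driver_id": driver_id,  # matches the entity name
                "conv_rate": 0.1 * (i + 1),
                "acc_rate": 0.9 - 0.1 * i,
                "avg_daily_trips": 10 * (i + 1) + hour,
                "created": ts,           # matches created_timestamp_column
            })
    # Cast to the dtypes declared in the feature view schema.
    return pd.DataFrame(rows).astype({
        "driver_id": "int64",
        "conv_rate": "float32",
        "acc_rate": "float32",
        "avg_daily_trips": "int64",
    })


driver_stats = make_driver_stats([1001, 1002, 1003], datetime(2024, 1, 15), hours=3)
print(driver_stats.dtypes)

# To use it as the FileSource input (needs pyarrow or fastparquet installed):
# driver_stats.to_parquet("data/driver_stats.parquet")
```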

4. Apply and Materialize

# Apply feature definitions
feast apply

# Materialize features into the online store, up to the current time (UTC)
feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")

# Or materialize a specific date range
feast materialize 2024-01-01T00:00:00 2024-01-31T00:00:00
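
Conceptually, materialization copies the most recent feature values per entity within the requested time window from the offline store into the online store. The pandas sketch below illustrates that idea only; it is not Feast's implementation, it ignores TTL, and the `materialize_latest` helper and sample rows are assumptions for illustration.

```python
import pandas as pd


def materialize_latest(offline_df, start, end, feature_cols):
    """Return {driver_id: {feature: value}} holding the newest row per driver
    whose event_timestamp falls within [start, end]."""
    in_window = offline_df[
        (offline_df["event_timestamp"] >= start)
        & (offline_df["event_timestamp"] <= end)
    ]
    latest = (
        in_window.sort_values("event_timestamp")
        .drop_duplicates("driver_id", keep="last")
    )
    return latest.set_index("driver_id")[feature_cols].to_dict(orient="index")


# Hypothetical offline data: two drivers, two observations each.
offline = pd.DataFrame({
    "driver_id": [1001, 1001, 1002, 1002],
    "event_timestamp": pd.to_datetime(
        ["2024-01-10", "2024-01-20", "2024-01-05", "2024-02-02"]
    ),
    "conv_rate": [0.1, 0.2, 0.3, 0.4],
})

online = materialize_latest(
    offline,
    start=pd.Timestamp("2024-01-01"),
    end=pd.Timestamp("2024-01-31"),
    feature_cols=["conv_rate"],
)
print(online)
```

Driver 1002's February row falls outside the window, so the "online store" dictionary keeps its January value. This is why materialization has to be re-run (or scheduled) as new data arrives.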

5. Retrieve Features

import pandas as pd
from feast import FeatureStore

# Initialize the feature store (run from the directory containing feature_store.yaml)
store = FeatureStore(repo_path=".")

# Get historical features (for training)
entity_df = pd.DataFrame({
    "driver_id": [1001, 1002, 1003],
    "event_timestamp": pd.to_datetime([
        "2024-01-15 10:00:00",
        "2024-01-15 10:00:00",
        "2024-01-15 10:00:00",
    ]),
})

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_stats:conv_rate",
        "driver_stats:acc_rate",
        "driver_stats:avg_daily_trips",
    ],
).to_df()

print(training_df)
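
For serving, the counterpart of the historical lookup is an online lookup against the materialized values. A minimal sketch, assuming the `driver_stats` feature view from step 3 has been applied and materialized; the helper names `feature_refs` and `get_driver_features_online` are hypothetical, and the store is passed in as a parameter so the function can be exercised without a live repo.

```python
def feature_refs(view_name, feature_names):
    """Build 'view:feature' reference strings, e.g. 'driver_stats:conv_rate'."""
    return [f"{view_name}:{name}" for name in feature_names]


def get_driver_features_online(store, driver_ids):
    """Fetch the latest materialized values for each driver from the online store.

    `store` is expected to be a feast.FeatureStore (or any object exposing a
    compatible get_online_features method).
    """
    response = store.get_online_features(
        features=feature_refs(
            "driver_stats", ["conv_rate", "acc_rate", "avg_daily_trips"]
        ),
        entity_rows=[{"driver_id": driver_id} for driver_id in driver_ids],
    )
    return response.to_dict()


# Usage inside the feature repo, after `feast apply` and a materialize run:
#   from feast import FeatureStore
#   store = FeatureStore(repo_path=".")
#   print(get_driver_features_online(store, [1001, 1002]))
```

Unlike `get_historical_features`, the online call takes no timestamps: it returns whatever values were most recently materialized for each entity key.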
