Complete Vertex AI Feature Store Tutorial: Centralized Feature Management
Vertex AI Feature Store is a centralized repository for organizing, storing, and serving ML features. It enables feature reuse, reduces training-serving skew, and gives teams consistent access to the same features.
Why a Feature Store?
Key benefits:
- Centralized features: a single source of truth
- Feature reuse: share features across models
- Low-latency serving: fast online feature retrieval
- Consistency: the same features for training and serving
- Time travel: point-in-time feature lookups
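To make the point-in-time idea concrete, here is a minimal local sketch using pandas rather than Feature Store itself. It assumes a hypothetical feature history table and a hypothetical label table (the column names `labeltime` and `churned` and the helper `point_in_time_join` are illustrative, not part of any Vertex AI API). For each label row it picks the latest feature value recorded at or before the label time, which is what prevents future data from leaking into training.

```python
import pandas as pd

# Hypothetical feature history: each row is a feature value and the time it was recorded.
feature_history = pd.DataFrame({
    "customerid": ["C001", "C001", "C002"],
    "updatetime": pd.to_datetime(["2024-01-01", "2024-03-01", "2024-02-01"]),
    "monthlycharges": [50.0, 65.0, 80.0],
})

# Hypothetical training labels, each observed at a specific time.
labels = pd.DataFrame({
    "customerid": ["C001", "C002"],
    "labeltime": pd.to_datetime(["2024-02-15", "2024-02-15"]),
    "churned": [0, 1],
})


def point_in_time_join(labels, features):
    """Attach to each label row the latest feature row at or before labeltime."""
    return pd.merge_asof(
        labels.sort_values("labeltime"),      # merge_asof needs sorted keys
        features.sort_values("updatetime"),
        left_on="labeltime",
        right_on="updatetime",
        by="customerid",                      # match rows per customer
        direction="backward",                 # never look into the future
    )


training_frame = point_in_time_join(labels, feature_history)
print(training_frame[["customerid", "labeltime", "monthlycharges", "churned"]])
```

In this sketch customer C001 gets the January value, not the March one, because the March update happened after the label was observed.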
Prerequisites
pip install google-cloud-aiplatform
gcloud auth login
gcloud config set project your-project-id
Setup
1. Initialize Vertex AI
from google.cloud import aiplatform
aiplatform.init(project="your-project-id", location="us-central1")
2. Create a Feature Store
# Create the feature store
featurestore = aiplatform.Featurestore.create(
    featurestore_id="myfeaturestore",
    online_store_fixed_node_count=1,
)
print(f"Feature store created: {featurestore.resource_name}")
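A malformed ID only fails once the create request reaches the service, so a quick local check can save a round trip. The sketch below assumes the ID rules are lowercase letters, digits, and underscores, not starting with a digit, and at most 60 characters; treat those rules as an assumption and confirm them against the current API reference. The helper name `is_valid_resource_id` is hypothetical.

```python
import re

# Assumed rule set (verify against the API reference): [a-z0-9_], first
# character not a digit, at most 60 characters in total.
_ID_PATTERN = re.compile(r"[a-z_][a-z0-9_]{0,59}")


def is_valid_resource_id(resource_id: str) -> bool:
    """Return True if resource_id satisfies the assumed ID rules."""
    return _ID_PATTERN.fullmatch(resource_id) is not None


for candidate in ["myfeaturestore", "My-Featurestore", "1store"]:
    print(candidate, is_valid_resource_id(candidate))
```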
Entity Types
1. Create an Entity Type
# Create the customer entity type
customerentity = featurestore.create_entity_type(
    entity_type_id="customer",
    description="Customer entity for churn prediction",
)
# Create the product entity type
productentity = featurestore.create_entity_type(
    entity_type_id="product",
    description="Product entity for recommendations",
)
2. List Entity Types
entitytypes = featurestore.list_entity_types()
for et in entitytypes:
    # et.name is the entity type ID; the description is read from the underlying resource
    print(f"{et.name}: {et.gca_resource.description}")
Features
1. Create Features
# Create features for the customer entity type
customerentity.create_feature(
    feature_id="age",
    value_type="INT64",
    description="Customer age",
)
customerentity.create_feature(
    feature_id="tenuremonths",
    value_type="INT64",
    description="Months as a customer",
)
customerentity.create_feature(
    feature_id="monthlycharges",
    value_type="DOUBLE",
    description="Monthly charges",
)
customerentity.create_feature(
    feature_id="totalcharges",
    value_type="DOUBLE",
    description="Total charges to date",
)
customerentity.create_feature(
    feature_id="contracttype",
    value_type="STRING",
    description="Contract type",
)
2. Batch Create Features
# Create multiple features in one call. This is an alternative to the
# individual calls above; creating a feature ID that already exists fails.
featuresconfig = {
    "age": {"value_type": "INT64", "description": "Customer age"},
    "tenuremonths": {"value_type": "INT64", "description": "Tenure in months"},
    "monthlycharges": {"value_type": "DOUBLE", "description": "Monthly charges"},
    "totalcharges": {"value_type": "DOUBLE", "description": "Total charges"},
    "contracttype": {"value_type": "STRING", "description": "Contract type"},
}
customerentity.batch_create_features(featuresconfig)
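When the features already exist as columns in a pandas DataFrame, the configuration dictionary can be derived from the column dtypes instead of being typed by hand. The following is a rough sketch under that assumption; `infer_feature_configs` is a hypothetical helper, not an SDK function, and it only covers the scalar value types used in this tutorial (BOOL, INT64, DOUBLE, STRING).

```python
import pandas as pd
from pandas.api import types as pdt


def infer_feature_configs(df, entity_id_column, skip_columns=()):
    """Map DataFrame columns to {"value_type": ...} entries by dtype."""
    configs = {}
    for column in df.columns:
        # The entity ID and any skipped columns (e.g. a timestamp) are not features.
        if column == entity_id_column or column in skip_columns:
            continue
        dtype = df[column].dtype
        if pdt.is_bool_dtype(dtype):
            value_type = "BOOL"
        elif pdt.is_integer_dtype(dtype):
            value_type = "INT64"
        elif pdt.is_float_dtype(dtype):
            value_type = "DOUBLE"
        elif pdt.is_object_dtype(dtype) or pdt.is_string_dtype(dtype):
            value_type = "STRING"
        else:
            raise ValueError(f"No value type mapping for column {column!r} ({dtype})")
        configs[column] = {"value_type": value_type}
    return configs


sample = pd.DataFrame({
    "customerid": ["C001", "C002"],
    "age": [25, 35],
    "monthlycharges": [29.5, 80.0],
    "contracttype": ["monthly", "yearly"],
})
print(infer_feature_configs(sample, entity_id_column="customerid"))
```

The resulting dictionary has the same shape as the hand-written one, so it could be passed to `batch_create_features`; descriptions would still need to be added manually.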
Ingesting Features
1. Ingest from BigQuery
# Ingest feature values from a BigQuery table
customerentity.ingest_from_bq(
    feature_ids=["age", "tenuremonths", "monthlycharges", "totalcharges"],
    feature_time="updatetime",
    bq_source_uri="bq://project.dataset.customerfeatures",
    entity_id_field="customerid",
)
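The call above assumes every feature ID matches a column name in the source table. When the names differ, `ingest_from_bq` accepts a `feature_source_fields` dictionary that maps a feature ID to its source column. The sketch below builds that mapping locally and fails early if a column is missing; the helper `build_feature_source_fields` and the column name `tenure_in_months` are hypothetical, chosen only for illustration.

```python
def build_feature_source_fields(feature_ids, source_columns, renames):
    """Return {feature_id: source_column} for features whose column name differs.

    renames maps a feature ID to its source column when the two are not equal.
    Raises ValueError if any expected source column is absent.
    """
    mapping = {}
    missing = []
    for feature_id in feature_ids:
        column = renames.get(feature_id, feature_id)
        if column not in source_columns:
            missing.append(column)
        elif column != feature_id:
            mapping[feature_id] = column
    if missing:
        raise ValueError(f"Source columns not found: {missing}")
    return mapping


# Hypothetical source schema where tenure is stored under a different name.
source_columns = {"customerid", "updatetime", "age", "tenure_in_months"}
fields = build_feature_source_fields(
    feature_ids=["age", "tenuremonths"],
    source_columns=source_columns,
    renames={"tenuremonths": "tenure_in_months"},
)
print(fields)
```

The returned dictionary only contains the renamed features, so it could be passed as `feature_source_fields=fields` alongside the other ingestion arguments.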
2. Ingest from a DataFrame
import pandas as pd
from datetime import datetime
# Build the feature DataFrame
df = pd.DataFrame({
"customerid": ["C001", "C002", "C003"],
"age": [25, 35, 45],
"tenuremonths": [12, 24, 36],