Complete Vertex AI Tutorial: Unified ML Platform on Google Cloud
Vertex AI is Google Cloud's unified machine learning platform. It brings Google Cloud's ML services together in one place and provides tools for building, deploying, and scaling models with either AutoML or custom training.
Why Vertex AI?
Key Benefits:
- Unified platform: All ML tools in one place
- AutoML: No-code model building
- Custom training: Full control with custom code
- MLOps: Built-in pipelines and monitoring
- Scalable: Enterprise-grade infrastructure
Core Components:
- Datasets
- Training (AutoML and Custom)
- Model Registry
- Endpoints
- Pipelines
- Feature Store
- Experiments
Prerequisites
pip install google-cloud-aiplatform
Authenticate
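gcloud auth login
# The Python SDK reads Application Default Credentials, so set those up as well
gcloud auth application-default login
gcloud config set project your-project-id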
Setup
1. Initialize Vertex AI
from google.cloud import aiplatform

# Set the default project, region, and staging bucket for later SDK calls
aiplatform.init(
    project="your-project-id",
    location="us-central1",
    staging_bucket="gs://your-bucket",
)
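To avoid hard-coding these values, one option is to read them from environment variables and validate them before calling aiplatform.init. The sketch below is only an illustration: the helper build_init_kwargs and the variable names VERTEX_PROJECT, VERTEX_LOCATION, and VERTEX_STAGING_BUCKET are hypothetical and not part of the SDK.

```python
import os


def build_init_kwargs(env=None):
    """Collect and validate settings for aiplatform.init (hypothetical helper)."""
    env = os.environ if env is None else env
    project = env.get("VERTEX_PROJECT", "")
    location = env.get("VERTEX_LOCATION", "us-central1")
    bucket = env.get("VERTEX_STAGING_BUCKET", "")

    if not project:
        raise ValueError("VERTEX_PROJECT is not set")
    if not bucket.startswith("gs://"):
        raise ValueError("staging bucket must be a gs:// URI")

    # Keys match the keyword arguments of aiplatform.init
    return {"project": project, "location": location, "staging_bucket": bucket}
```

With the SDK installed, the result can be passed straight through: aiplatform.init(**build_init_kwargs()).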
2. Enable APIs
gcloud services enable aiplatform.googleapis.com
gcloud services enable compute.googleapis.com
gcloud services enable storage.googleapis.com
Datasets
1. Create Tabular Dataset
from google.cloud import aiplatform

# Create from BigQuery
dataset = aiplatform.TabularDataset.create(
    display_name="customer-churn-dataset",
    bq_source="bq://project.dataset.table",
)

# Alternatively, create from a CSV file in GCS
dataset = aiplatform.TabularDataset.create(
    display_name="customer-churn-dataset",
    gcs_source="gs://bucket/data/train.csv",
)

print(f"Dataset created: {dataset.resource_name}")
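The two calls above differ only in which source argument they pass. A small helper can pick the right keyword from the URI scheme. This is a sketch under assumptions: source_kwargs is a hypothetical name, and it handles only the bq:// and gs:// forms shown above.

```python
def source_kwargs(uri):
    """Map a data URI to the matching TabularDataset.create keyword (hypothetical helper)."""
    if uri.startswith("bq://"):
        # BigQuery sources use the form bq://project.dataset.table
        if len(uri[len("bq://"):].split(".")) < 3:
            raise ValueError("expected bq://project.dataset.table")
        return {"bq_source": uri}
    if uri.startswith("gs://"):
        return {"gcs_source": uri}
    raise ValueError(f"unsupported source URI: {uri}")
```

It could then be used as aiplatform.TabularDataset.create(display_name="customer-churn-dataset", **source_kwargs(uri)).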
2. Create Image Dataset
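# Create image dataset
# Note: gcs_source is expected to be an import file (CSV or JSONL) that lists
# image URIs and labels, so adjust the path below to point at that file.
imagedataset = aiplatform.ImageDataset.create(
    display_name="product-images",
    gcs_source="gs://bucket/images/",
    import_schema_uri=aiplatform.schema.dataset.ioformat.image.single_label_classification,
)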
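Image datasets are imported from a file that lists each image and its label. The sketch below builds such a file as CSV text, assuming the simple two-column layout (image URI, label) for single-label classification. The helper name image_import_csv and the sample paths are hypothetical, and the exact column layout should be checked against the current import-file documentation.

```python
import csv
import io


def image_import_csv(labeled_images):
    """Render (gcs_uri, label) pairs as CSV import rows (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for uri, label in labeled_images:
        if not uri.startswith("gs://"):
            raise ValueError(f"not a GCS URI: {uri}")
        # Assumed row layout: image URI, then its single label
        writer.writerow([uri, label])
    return buf.getvalue()
```

The returned text would be uploaded to Cloud Storage, and that file's gs:// path passed as gcs_source.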
3. Create Text Dataset
# Create text dataset
textdataset = aiplatform.TextDataset.create(
    display_name="sentiment-dataset",
    gcs_source="gs://bucket/text/data.jsonl",
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.single_label_classification,
)
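The data.jsonl file referenced above holds one JSON object per line. As a rough sketch, the helper below writes such lines, assuming each record carries the text in a textContent field and the label in classificationAnnotation.displayName. The helper name text_import_jsonl is hypothetical, and the field names are an assumption to verify against the current text classification import schema.

```python
import json


def text_import_jsonl(labeled_texts):
    """Render (text, label) pairs as JSONL import records (hypothetical helper)."""
    lines = []
    for text, label in labeled_texts:
        record = {
            "textContent": text,  # assumed field name for inline text
            "classificationAnnotation": {"displayName": label},  # assumed label field
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"
```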
AutoML Training
1. AutoML Tabular
# Create AutoML tabular training job
job = aiplatform.AutoMLTabularTrainingJob(
    display_name="churn-automl",
    optimization_prediction_type="classification",
    optimization_objective="maximize-au-roc",
)

# Train model
model = job.run(
    dataset=dataset,
    target_column="churn",
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    budget_milli_node_hours=1000,  # 1000 milli node hours = 1 node hour
    model_display_name="churn-model",
)

print(f"Model trained: {model.resource_name}")
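The three split fractions passed to job.run should add up to 1.0. Checking this locally before submitting a job can save a failed run. The function below is a minimal sketch; check_splits is a hypothetical name, not an SDK call.

```python
import math


def check_splits(training, validation, test):
    """Return the splits as job.run keyword arguments, or raise if they are invalid."""
    fractions = (training, validation, test)
    if any(f < 0 or f > 1 for f in fractions):
        raise ValueError("each split fraction must be between 0 and 1")
    # Compare with a tolerance because sums like 0.8 + 0.1 + 0.1 are not exact in floating point
    if not math.isclose(sum(fractions), 1.0, abs_tol=1e-9):
        raise ValueError(f"split fractions sum to {sum(fractions)}, expected 1.0")
    return {
        "training_fraction_split": training,
        "validation_fraction_split": validation,
        "test_fraction_split": test,
    }
```

The returned dictionary can be unpacked into the training call, for example job.run(dataset=dataset, target_column="churn", **check_splits(0.8, 0.1, 0.1)).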
2. AutoML Image Classification
# Create AutoML image training job
job = aiplatform.AutoMLImageTrainingJob(
    display_name="image-classifier",
    prediction_type="classification",
    multi_label=False,
)

# Train model
model = job.run(
    dataset=imagedataset,
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    budget_milli_node_hours=8000,  # 8000 milli node hours = 8 node hours
    model_display_name="product-classifier",
)
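The training budget is expressed in milli node hours, that is, thousandths of a node hour, so the 1000 used for the tabular job is one node hour and the 8000 used here is eight. A tiny conversion helper keeps that unit explicit; node_hours_to_milli is a hypothetical name used only for illustration.

```python
def node_hours_to_milli(node_hours):
    """Convert node hours to the integer milli node hours used for training budgets."""
    if node_hours <= 0:
        raise ValueError("budget must be positive")
    # 1 node hour = 1000 milli node hours; round to the nearest whole unit
    return int(round(node_hours * 1000))
```

For example, budget_milli_node_hours=node_hours_to_milli(8) is equivalent to the literal 8000 above.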
3. AutoML Text Classification
# Create AutoML text training job
job = aiplatform.AutoMLTextTrainingJob(
    display_name="sentiment-classifier",
    prediction_type="classification",
    multi_label=False,
)

# Train model
model = job.run(
    dataset=textdataset,