Azure ML Pipelines Tutorial: ML Pipeline Automation


By Ruby Abdullah · tutorial
Tags: Azure, Azure ML, Pipelines, MLOps, Automation, CI/CD

Complete Azure ML Pipelines Tutorial: CI/CD for Machine Learning

Azure ML Pipelines let you build reproducible, reusable machine learning workflows. They automate the end-to-end ML lifecycle, from data preparation to model deployment, and support version control and team collaboration.

Why Azure ML Pipelines?

Key Benefits:
  • Reproducibility: Version-controlled workflows
  • Reusability: Modular pipeline components
  • Automation: Scheduled and triggered pipelines
  • Collaboration: Team-based development
  • Integration: Azure DevOps and GitHub Actions

Use Cases:
  • Automated model training
  • Data preprocessing workflows
  • Batch inference pipelines
  • MLOps CI/CD
  • Feature engineering automation

Prerequisites

pip install azure-ai-ml azure-identity

# Azure CLI
az login
az extension add -n ml
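Before moving on, it can help to confirm that both Python packages actually landed in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the Azure SDK.

```python
from importlib import metadata


def missing_packages(required):
    """Return the distribution names from `required` that are not installed."""
    missing = []
    for name in required:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Distribution names exactly as passed to `pip install`.
    gaps = missing_packages(["azure-ai-ml", "azure-identity"])
    print("Missing packages:", gaps or "none")
```

An empty result means both distributions are visible to the interpreter you are running.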

Quick Start

1. Connect to Workspace

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="your-subscription-id",
    resource_group_name="my-resource-group",
    workspace_name="my-ml-workspace",
)
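Hard-coding the subscription, resource group, and workspace works for a quick start, but in CI/CD these values usually come from the environment. The sketch below collects them into keyword arguments for `MLClient`. The environment variable names and the `workspace_kwargs` helper are assumptions made for illustration; only the keyword names mirror the client call above.

```python
import os

# Hypothetical environment variable names; use whatever your team standardizes on.
ENV_TO_KWARG = {
    "AZURE_SUBSCRIPTION_ID": "subscription_id",
    "AZURE_RESOURCE_GROUP": "resource_group_name",
    "AZURE_ML_WORKSPACE": "workspace_name",
}


def workspace_kwargs(environ=None):
    """Build the workspace keyword arguments for MLClient from environment variables."""
    environ = os.environ if environ is None else environ
    missing = [var for var in ENV_TO_KWARG if not environ.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {kwarg: environ[var] for var, kwarg in ENV_TO_KWARG.items()}


if __name__ == "__main__":
    # Placeholder values, passed explicitly so the demo does not depend on the real environment.
    demo_env = {
        "AZURE_SUBSCRIPTION_ID": "your-subscription-id",
        "AZURE_RESOURCE_GROUP": "my-resource-group",
        "AZURE_ML_WORKSPACE": "my-ml-workspace",
    }
    print(workspace_kwargs(demo_env))
```

The returned dictionary can then be unpacked into the client, for example `MLClient(credential=DefaultAzureCredential(), **workspace_kwargs())`.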

2. Simple Pipeline

from azure.ai.ml import dsl, Input, Output
from azure.ai.ml.entities import Pipeline

# Define pipeline
# Note: preprocess_component and train_component must already be defined
# or loaded before this function is called (see "Pipeline Components").
@dsl.pipeline(
    compute="cpu-cluster",
    description="Simple training pipeline",
)
def training_pipeline(training_data):
    # Preprocessing step
    preprocess_step = preprocess_component(
        input_data=training_data
    )

    # Training step: consumes the output of the preprocessing step
    train_step = train_component(
        training_data=preprocess_step.outputs.output_data
    )

    return {
        "model_output": train_step.outputs.model
    }

# Create pipeline
pipeline = training_pipeline(
    training_data=Input(type="uri_file", path="azureml:my-dataset:1")
)

# Submit pipeline
pipeline_job = ml_client.jobs.create_or_update(
    pipeline,
    experiment_name="training-pipeline",
)

print(f"Pipeline submitted: {pipeline_job.name}")
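The pipeline function never states an execution order explicitly. The order follows from the data wiring: the training step consumes the preprocessing step's output, so preprocessing has to finish first. The sketch below illustrates that idea with the standard library's `graphlib` (Python 3.9+). It is an illustration of dependency ordering, not a description of how the Azure ML service is implemented, and the dependency mapping is written by hand rather than read from a real pipeline.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose outputs it consumes.
# Step names mirror the pipeline above; the mapping is hand-written for illustration.
dependencies = {
    "preprocess_step": set(),
    "train_step": {"preprocess_step"},
}


def execution_order(deps):
    """Return one valid run order in which every step follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


print(execution_order(dependencies))  # ['preprocess_step', 'train_step']
```

Steps with no path between them in such a graph have no ordering constraint, which is why independent branches of a pipeline can run in parallel.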

Pipeline Components

1. Create Component from Code

from azure.ai.ml import command, Input, Output
from azure.ai.ml.entities import Component

# Data preprocessing component
preprocess_component = command(
    name="preprocess_data",
    display_name="Preprocess Data",
    description="Clean and prepare data for training",
    inputs={
        "input_data": Input(type="uri_file")
    },
    outputs={
        "output_data": Output(type="uri_folder")
    },
    code="./components/preprocess",
    command="python preprocess.py --input ${{inputs.input_data}} --output ${{outputs.output_data}}",
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
)

# Register component
# command() returns a job node; its .component attribute holds the
# component definition that gets registered in the workspace.
preprocess_component = ml_client.components.create_or_update(
    preprocess_component.component
)

print(f"Component registered: {preprocess_component.name}")
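The `${{inputs.…}}` and `${{outputs.…}}` placeholders in the command string are filled in at run time with the locations of the bound inputs and outputs. To make that substitution concrete, here is a small standard-library sketch. The `render_command` helper and the example paths are hypothetical; the real resolution happens inside Azure ML and is not shown here.

```python
import re

# Matches placeholders such as ${{inputs.input_data}} or ${{outputs.output_data}}.
PLACEHOLDER = re.compile(r"\$\{\{\s*(inputs|outputs)\.(\w+)\s*\}\}")


def render_command(template, inputs, outputs):
    """Replace each placeholder in `template` with its bound value."""
    values = {"inputs": inputs, "outputs": outputs}

    def substitute(match):
        kind, name = match.group(1), match.group(2)
        if name not in values[kind]:
            raise KeyError(f"No value bound for {kind}.{name}")
        return values[kind][name]

    return PLACEHOLDER.sub(substitute, template)


template = (
    "python preprocess.py "
    "--input ${{inputs.input_data}} --output ${{outputs.output_data}}"
)
# Example paths are made up; at run time they would be mounted or downloaded locations.
print(render_command(
    template,
    inputs={"input_data": "/mnt/in/data.csv"},
    outputs={"output_data": "/mnt/out"},
))
# python preprocess.py --input /mnt/in/data.csv --output /mnt/out
```

This is why the names inside the placeholders have to match the keys of the `inputs` and `outputs` dictionaries exactly.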

2. Component Script

# components/preprocess/preprocess.py
import argparse
import os

import pandas as pd


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", type=str, required=True)
    parser.add_argument("--output", type=str, required=True)
    args = parser.parse_args()

    # Load data
    df = pd.read_csv(args.input)

    # Preprocess: drop rows with missing values, then exact duplicates
    df = df.dropna()
    df = df.drop_duplicates()

    # Normalize numeric columns (z-score: subtract the mean, divide by the std)
    numeric_cols = df.select_dtypes(include=['number']).columns
    df[numeric_cols] = (df[numeric_cols] - df[numeric_cols].mean()) / df[numeric_cols].std()

    # Save output
    os.makedirs(args.output, exist_ok=True)
    df.to_csv(os.path.join(args.output, "processed_data.csv"), index=False)

    print(f"Processed {len(df)} rows")


if __name__ == "__main__":
    main()
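The normalization step in the script is a z-score: each value has the column mean subtracted and is divided by the column's standard deviation. As a worked example, the sketch below reproduces that arithmetic on a plain list with the standard library. It uses the sample standard deviation, which is what pandas' `std()` computes by default. The `zscore` helper and its zero-variance guard are additions for illustration and are not part of the script above, where a constant column would come out as NaN.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize values using the sample standard deviation (ddof=1)."""
    # stdev() needs at least two values and raises StatisticsError otherwise.
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # A constant column has no spread; return zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


print(zscore([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

For `[1.0, 2.0, 3.0]` the mean is 2.0 and the sample standard deviation is 1.0, so the values map to -1.0, 0.0, and 1.0.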
