Complete GitHub Actions Tutorial for ML CI/CD
GitHub Actions lets you automate machine learning workflows, from testing through deployment. In this tutorial, we'll build a CI/CD pipeline for ML projects that covers data validation, model training, testing, and deployment.
Why CI/CD for ML?
Challenges in ML projects:
- Reproducibility: Ensuring consistent results across environments
- Testing: Validating data, models, and code
- Automation: Reducing manual work and human error
- Collaboration: Standardizing workflows across teams
- Monitoring: Tracking performance and detecting regressions
What GitHub Actions adds:
- Automated testing on every push/PR
- Scheduled retraining
- Model validation gates
- Automated deployment
- Integration with cloud services
GitHub Actions Basics
1. Workflow File Structure
# .github/workflows/ml-pipeline.yml
name: ML Pipeline

# Triggers
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 0 * * 0'  # Weekly on Sunday at 00:00 UTC
  workflow_dispatch:      # Manual trigger

# Environment variables
env:
  PYTHON_VERSION: '3.10'
  MODEL_NAME: 'my-model'

# Jobs
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.PYTHON_VERSION }}
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run tests
        run: pytest tests/
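A workflow-level env entry can be read in two ways inside a job, and the difference is easy to miss. The following is a minimal sketch, not part of the pipeline above: the workflow name, the job name, and the echoed messages are illustrative assumptions.

```yaml
# Sketch: two ways to read a workflow-level env variable inside a job.
name: Env Demo
on: workflow_dispatch

env:
  MODEL_NAME: 'my-model'

jobs:
  show-env:
    runs-on: ubuntu-latest
    steps:
      # The runner exports env entries to the step's shell,
      # so the script can read them as ordinary shell variables.
      - name: Read via shell variable
        run: echo "Training model: $MODEL_NAME"
      # Expressions are expanded by GitHub Actions before the step runs,
      # so the env context can be combined with other contexts.
      - name: Read via expression
        run: echo "Artifact name: ${{ env.MODEL_NAME }}-${{ github.sha }}"
```

The shell form is usually the safer default inside run scripts; the expression form is needed in places that are not shell scripts, such as a step's with block.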
2. Workflow Triggers
on:
  # Push to specific branches, only when these paths change
  push:
    branches: [main]
    paths:
      - 'src/**'
      - 'tests/**'
      - 'requirements.txt'

  # Pull requests
  pull_request:
    branches: [main]

  # Scheduled runs
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC

  # Manual trigger with inputs
  workflow_dispatch:
    inputs:
      environment:
        description: 'Deployment environment'
        required: true
        default: 'staging'
        type: choice
        options:
          - staging
          - production
      retrain:
        description: 'Force retrain model'
        required: false
        type: boolean
        default: false
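Manual inputs only matter once a job reads them. The sketch below shows one way to consume the two inputs through the inputs context. The script paths scripts/train.py and scripts/deploy.py, the --env flag, and the TARGET_ENV variable name are hypothetical placeholders, not part of this tutorial's project.

```yaml
# Sketch: consuming workflow_dispatch inputs in a job.
name: Manual Deploy
on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Deployment environment'
        required: true
        default: 'staging'
        type: choice
        options: [staging, production]
      retrain:
        description: 'Force retrain model'
        required: false
        type: boolean
        default: false

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      # The inputs context keeps boolean inputs as booleans,
      # so this step runs only when the box was ticked.
      - name: Retrain model
        if: ${{ inputs.retrain }}
        run: python scripts/train.py  # hypothetical script
      # Passing the input through env avoids interpolating
      # user-supplied text directly into the shell command.
      - name: Deploy
        env:
          TARGET_ENV: ${{ inputs.environment }}
        run: python scripts/deploy.py --env "$TARGET_ENV"  # hypothetical script and flag
```

In a workflow that also runs on push or schedule, the inputs are empty for those events, so a fallback such as `${{ inputs.environment || 'staging' }}` may be needed.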
ML Testing Pipeline
1. Code Quality and Unit Tests
name: Code Quality & Tests

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Install linters
        run: |
          pip install flake8 black isort mypy
      - name: Run flake8
        run: flake8 src/ tests/
      - name: Check black formatting
        run: black --check src/ tests/
      - name: Check import sorting
        run: isort --check-only src/ tests/
      - name: Run mypy
        run: mypy src/

  test:
    runs-on: ubuntu-latest
    needs: lint  # Run tests only after linting passes
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Cache pip
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run unit tests
        run: pytest tests/unit/ -v --cov=src --cov-report=xml
      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage.xml
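The explicit actions/cache step above can often be replaced by the caching built into actions/setup-python. The following is a sketch of that alternative, assuming dependencies are pinned in requirements.txt as in the rest of this tutorial; the workflow name is illustrative.

```yaml
# Sketch: pip caching through setup-python instead of a separate cache step.
name: Tests (built-in pip cache)
on:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
          cache: 'pip'                              # cache pip's download cache
          cache-dependency-path: requirements.txt   # file hashed to build the cache key
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run unit tests
        run: pytest tests/unit/ -v
```

Either approach caches downloaded packages rather than the installed environment, so pip install still runs on every job; it simply avoids re-downloading unchanged packages.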