Complete Guide to MediaPipe: Computer Vision Made Easy

MediaPipe is an open-source framework from Google for building multimodal machine learning pipelines. It provides ready-to-use solutions for computer vision tasks such as face detection, hand tracking, and pose estimation.

In this tutorial, we'll learn MediaPipe from basics to practical implementation for various real-time applications.

Why MediaPipe?

MediaPipe Advantages:

Cross-Platform: Runs on Android, iOS, Web, Desktop (Python/C++)

Real-Time Performance: Optimized for real-time inference

Pre-trained Models: Ready-to-use models with high accuracy

Easy to Use: Simple and intuitive API

Production Ready: Used in Google products such as Google Meet

Open Source: Free and customizable

Available Solutions:

| Solution | Description |
|----------|-------------|
| Face Detection | Detect faces in images/video |
| Face Mesh | 468 3D face landmarks |
| Hand Tracking | 21 hand landmarks |
| Pose Estimation | 33 body landmarks |
| Holistic | Combination of face, hands, and pose |
| Object Detection | General object detection |
| Image Segmentation | Selfie/background segmentation |
| Gesture Recognition | Hand gesture recognition |

Installation

Install MediaPipe

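# Install with pip
pip install mediapipe

# Install OpenCV (usually pulled in already as a MediaPipe dependency)
pip install opencv-python

# GPU variant (optional); verify that this package is published
# for your platform before relying on it
pip install mediapipe-gpu

# Install a specific version
pip install mediapipe==0.10.9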

Verify Installation

import mediapipe as mp
import cv2

# Both packages expose their version string as __version__
print(f"MediaPipe version: {mp.__version__}")
print(f"OpenCV version: {cv2.__version__}")
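If an import fails, it can help to check what is installed without importing the packages at all. The following is a minimal sketch that reads package metadata with the standard library only. The helper name `installed_version` is hypothetical and not part of MediaPipe, and the list of distribution names is an assumption (OpenCV may be installed as `opencv-python` or `opencv-contrib-python`).

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report on the distributions this tutorial relies on (names are assumptions)
for name in ("mediapipe", "opencv-python", "opencv-contrib-python"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

A result of `not installed` for one of the two OpenCV names is expected when the other one is present.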

MediaPipe Structure

import mediapipe as mp

# Solutions - Pre-built ML pipelines
mp.solutions.face_detection
mp.solutions.face_mesh
mp.solutions.hands
mp.solutions.pose
mp.solutions.holistic
mp.solutions.objectron
mp.solutions.selfie_segmentation

# Drawing utilities
mp.solutions.drawing_utils
mp.solutions.drawing_styles

Face Detection

Basic Face Detection

import cv2
import mediapipe as mp

# Initialize
mp_face_detection = mp.solutions.face_detection
mp_drawing = mp.solutions.drawing_utils

def detect_faces_image(image_path):
    """Detect faces in a static image."""
    # Read image (cv2.imread returns None if the file cannot be read)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # MediaPipe expects RGB, while OpenCV loads images as BGR
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Initialize face detection
    with mp_face_detection.FaceDetection(
        model_selection=1,  # 0: short-range (2m), 1: full-range (5m)
        min_detection_confidence=0.5
    ) as face_detection:
        # Process image
        results = face_detection.process(image_rgb)

        # Draw detections
        if results.detections:
            for detection in results.detections:
                mp_drawing.draw_detection(image, detection)

                # Get bounding box (normalized by image width and height)
                bbox = detection.location_data.relative_bounding_box
                h, w, _ = image.shape
                x = int(bbox.xmin * w)
                y = int(bbox.ymin * h)
                width = int(bbox.width * w)
                height = int(bbox.height * h)

                # Get confidence score
                score = detection.score[0]
                print(f"Face detected: confidence={score:.2f}, bbox=({x}, {y}, {width}, {height})")

    cv2.imshow('Face Detection', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

# Run
detect_faces_image('photo.jpg')
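The relative bounding box is expressed as fractions of the image width and height, and a box for a face near the image border may extend past the frame. The following is a minimal sketch of the conversion to pixels with clamping, in plain Python. The helper name `relative_bbox_to_pixels`, the use of `round`, and the clamping behavior are illustrative choices, not part of MediaPipe.

```python
def relative_bbox_to_pixels(xmin, ymin, width, height, image_w, image_h):
    """Convert a relative box to a pixel box clamped to the image.

    Returns (x, y, w, h) in pixels.
    """
    # Scale both corners from relative to pixel coordinates
    x1 = round(xmin * image_w)
    y1 = round(ymin * image_h)
    x2 = round((xmin + width) * image_w)
    y2 = round((ymin + height) * image_h)

    # Clamp both corners so the box never leaves the image
    x1 = max(0, min(x1, image_w))
    y1 = max(0, min(y1, image_h))
    x2 = max(0, min(x2, image_w))
    y2 = max(0, min(y2, image_h))

    return x1, y1, x2 - x1, y2 - y1


# A box fully inside a 640x480 image
print(relative_bbox_to_pixels(0.25, 0.5, 0.5, 0.25, 640, 480))
# A box that sticks out past the left and bottom edges
print(relative_bbox_to_pixels(-0.1, 0.9, 0.3, 0.2, 640, 480))
```

Clamping matters when the box is used to crop the image, since negative or out-of-range indices would otherwise produce an empty or wrong crop.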

Real-Time Face Detection (Webcam)

import cv2
import mediapipe as mp

def real_time_face_detection():
    """Real-time face detection from webcam."""
    mp_face_detection = mp.solutions.face_detection
    mp_drawing = mp.solutions.drawing_utils

    # Open webcam
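The listing above ends at the webcam comment. What follows is a minimal sketch of how such a loop is commonly written with OpenCV and the `mp.solutions.face_detection` API; it is not the original author's code. The function name `run_webcam_face_detection`, camera index 0, the mirrored preview, the short-range model, and the `q` key to quit are all assumptions.

```python
def run_webcam_face_detection(camera_index=0):
    """Show webcam frames with face detections drawn until 'q' is pressed."""
    # Imported inside the function so that defining it does not require
    # the camera stack; both packages must be installed to actually run it.
    import cv2
    import mediapipe as mp

    mp_face_detection = mp.solutions.face_detection
    mp_drawing = mp.solutions.drawing_utils

    # Open webcam (index 0 is assumed to be the default camera)
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError(f"Could not open camera {camera_index}")

    try:
        with mp_face_detection.FaceDetection(
            model_selection=0,  # short-range model, assumed to suit a webcam
            min_detection_confidence=0.5,
        ) as face_detection:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break  # camera disconnected or stream ended

                # Mirror the frame for a selfie-style preview (assumption)
                frame = cv2.flip(frame, 1)

                # MediaPipe expects RGB, while OpenCV delivers BGR
                rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                results = face_detection.process(rgb)

                # Draw every detected face on the BGR frame
                if results.detections:
                    for detection in results.detections:
                        mp_drawing.draw_detection(frame, detection)

                cv2.imshow("Face Detection", frame)

                # Quit when 'q' is pressed
                if cv2.waitKey(1) & 0xFF == ord("q"):
                    break
    finally:
        # Always release the camera and close the preview window
        cap.release()
        cv2.destroyAllWindows()
```

Calling `run_webcam_face_detection()` from a script on a machine with a camera opens a preview window. The `try`/`finally` block keeps the camera from staying locked if an error occurs mid-loop.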

