Complete Guide to MediaPipe: Computer Vision Made Easy

By Ruby Abdullah · tutorial
Tags: Python, MediaPipe, Computer Vision, Machine Learning, OpenCV

MediaPipe is an open-source framework from Google for building multimodal machine learning pipelines. It provides ready-to-use solutions for common computer vision tasks such as face detection, hand tracking, pose estimation, and segmentation.

In this tutorial, we'll learn MediaPipe from basics to practical implementation for various real-time applications.

Why MediaPipe?

MediaPipe Advantages:

  • Cross-Platform: Runs on Android, iOS, Web, Desktop (Python/C++)
  • Real-Time Performance: Optimized for real-time inference
  • Pre-trained Models: Ready-to-use models with high accuracy
  • Easy to Use: Simple and intuitive API
  • Production Ready: Used in Google products such as Google Meet
  • Open Source: Free and customizable

Available Solutions:

    | Solution | Description |
    |----------|-------------|
    | Face Detection | Detect faces in images/video |
    | Face Mesh | 468 3D face landmarks |
    | Hand Tracking | 21 hand landmarks |
    | Pose Estimation | 33 body landmarks |
    | Holistic | Combination of face, hands, and pose |
    | Object Detection | General object detection |
    | Image Segmentation | Selfie/background segmentation |
    | Gesture Recognition | Hand gesture recognition |

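Installation

Install MediaPipe

    # Install with pip
    pip install mediapipe

    # Install OpenCV (usually already installed as a MediaPipe dependency)
    pip install opencv-python

    # GPU version (optional). Check on PyPI that this package exists and
    # suits your platform; the rest of this tutorial uses the standard
    # mediapipe package.
    pip install mediapipe-gpu

    # Specific version
    pip install mediapipe==0.10.9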

Verify Installation

    import mediapipe as mp
    import cv2

    print(f"MediaPipe version: {mp.__version__}")
    print(f"OpenCV version: {cv2.__version__}")
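
If importing the libraries fails, it can help to check what pip actually installed before debugging further. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, and the list of distribution names is an assumption (OpenCV's `cv2` module can come from more than one pip package).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# cv2 can be provided by more than one distribution, so check the common ones.
for name in ("mediapipe", "opencv-python", "opencv-contrib-python"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed'}")
```

This does not import the packages themselves, so it also works in an environment where the import raises an error.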

MediaPipe Structure

    import mediapipe as mp

    # Solutions - pre-built ML pipelines
    mp.solutions.face_detection
    mp.solutions.face_mesh
    mp.solutions.hands
    mp.solutions.pose
    mp.solutions.holistic
    mp.solutions.objectron
    mp.solutions.selfie_segmentation

    # Drawing utilities
    mp.solutions.drawing_utils
    mp.solutions.drawing_styles

Face Detection

Basic Face Detection

    import cv2
    import mediapipe as mp

    # Initialize
    mp_face_detection = mp.solutions.face_detection
    mp_drawing = mp.solutions.drawing_utils

    def detect_faces_image(image_path):
        """Detect faces in a static image."""
        # Read image (cv2.imread returns None if the file cannot be read)
        image = cv2.imread(image_path)
        if image is None:
            raise FileNotFoundError(f"Could not read image: {image_path}")
        # MediaPipe expects RGB, while OpenCV loads images as BGR
        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # Initialize face detection
        with mp_face_detection.FaceDetection(
            model_selection=1,  # 0: short-range (2 m), 1: full-range (5 m)
            min_detection_confidence=0.5
        ) as face_detection:
            # Process image
            results = face_detection.process(image_rgb)

            # Draw detections
            if results.detections:
                for detection in results.detections:
                    mp_drawing.draw_detection(image, detection)

                    # Get bounding box (relative coordinates scaled to pixels)
                    bbox = detection.location_data.relative_bounding_box
                    h, w, _ = image.shape
                    x = int(bbox.xmin * w)
                    y = int(bbox.ymin * h)
                    width = int(bbox.width * w)
                    height = int(bbox.height * h)

                    # Get confidence score
                    score = detection.score[0]
                    print(f"Face detected: confidence={score:.2f}, bbox=({x}, {y}, {width}, {height})")

        cv2.imshow('Face Detection', image)
        cv2.waitKey(0)
        cv2.destroyAllWindows()

    # Run
    detect_faces_image('photo.jpg')
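
The bounding box returned by the detector is expressed relative to the image size, and for faces near the border the values may fall slightly outside the 0 to 1 range. If the box is used to crop the image, clamping it to the image bounds avoids invalid slices. The helper below is a minimal sketch of that conversion; `relative_bbox_to_pixels` is a hypothetical name, not part of MediaPipe.

```python
def relative_bbox_to_pixels(xmin, ymin, width, height, image_width, image_height):
    """Convert a normalized box to integer pixel (x, y, w, h), clamped to the image."""
    # Top-left corner, never left of or above the image.
    x1 = max(0, int(xmin * image_width))
    y1 = max(0, int(ymin * image_height))
    # Bottom-right corner, never past the right or bottom edge.
    x2 = min(image_width, int((xmin + width) * image_width))
    y2 = min(image_height, int((ymin + height) * image_height))
    # Width and height are recomputed from the clamped corners.
    return x1, y1, max(0, x2 - x1), max(0, y2 - y1)


# Assumed usage with a MediaPipe detection (h, w taken from image.shape):
#   bbox = detection.location_data.relative_bounding_box
#   x, y, bw, bh = relative_bbox_to_pixels(
#       bbox.xmin, bbox.ymin, bbox.width, bbox.height, w, h)
#   face_crop = image[y:y + bh, x:x + bw]
```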

Real-Time Face Detection (Webcam)

    import cv2
    import mediapipe as mp

    def real_time_face_detection():
        """Real-time face detection from webcam."""
        mp_face_detection = mp.solutions.face_detection
        mp_drawing = mp.solutions.drawing_utils

        # Open webcam
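
A real-time loop of this kind typically reads a frame, runs detection on it, draws the result, and checks for a quit key. The sketch below shows one possible way to structure that; it is not the author's original code. `run_capture_loop` and `main` are hypothetical names, the short-range model (`model_selection=0`) and camera index 0 are assumptions, and the frame loop is kept separate from the MediaPipe calls so it can be exercised without a camera.

```python
def run_capture_loop(capture, process_frame, should_stop=lambda: False):
    """Read frames until the source ends or should_stop() returns True.

    `capture` only needs isOpened(), read() and release(), which
    cv2.VideoCapture provides. Returns the number of frames processed.
    """
    frames = 0
    try:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:  # camera unavailable or stream ended
                break
            process_frame(frame)
            frames += 1
            if should_stop():
                break
    finally:
        capture.release()
    return frames


def main():
    # Imported here so run_capture_loop stays usable without a camera stack.
    import cv2
    import mediapipe as mp

    mp_face_detection = mp.solutions.face_detection
    mp_drawing = mp.solutions.drawing_utils

    with mp_face_detection.FaceDetection(
        model_selection=0, min_detection_confidence=0.5
    ) as face_detection:

        def process_frame(frame):
            # MediaPipe expects RGB, while OpenCV delivers BGR frames.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            results = face_detection.process(rgb)
            for detection in results.detections or []:
                mp_drawing.draw_detection(frame, detection)
            cv2.imshow("Face Detection", frame)

        # Stop when the user presses "q".
        run_capture_loop(
            cv2.VideoCapture(0),
            process_frame,
            should_stop=lambda: cv2.waitKey(1) & 0xFF == ord("q"),
        )
    cv2.destroyAllWindows()
```

Calling `main()` opens the default webcam. Because the capture object is passed in, the same loop also accepts a `cv2.VideoCapture` opened on a video file.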
