The Complete Guide to MediaPipe: Computer Vision Made Easy
MediaPipe is an open-source framework from Google for building multimodal machine learning pipelines. It provides ready-to-use solutions for computer vision tasks such as face detection, hand tracking, and pose estimation, among others.
In this tutorial, we will learn MediaPipe from the basics through to practical implementations for a range of real-time applications.
Why MediaPipe?
Advantages of MediaPipe:
Available Solutions:
| Solution | Description |
|----------|-------------|
| Face Detection | Detects faces in images/video |
| Face Mesh | 468 3D face landmarks |
| Hand Tracking | 21 landmarks per hand |
| Pose Estimation | 33 body landmarks |
| Holistic | Combination of face, hands, and pose |
| Object Detection | General object detection |
| Image Segmentation | Selfie/background segmentation |
| Gesture Recognition | Hand gesture recognition |
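To make the landmark counts in the table concrete, the short sketch below adds them up for the Holistic case. It assumes one face mesh, one pose, and two hands are all detected, and that each landmark carries three coordinates (x, y, z). The names `LANDMARK_COUNTS`, `holistic_landmark_total`, and `feature_vector_length` are illustrative and are not part of the MediaPipe API.

```python
# Landmark counts taken from the table above.
LANDMARK_COUNTS = {"face_mesh": 468, "hand": 21, "pose": 33}


def holistic_landmark_total(num_hands=2):
    """Total landmarks when the face, the pose, and `num_hands` hands are detected."""
    return (
        LANDMARK_COUNTS["face_mesh"]
        + LANDMARK_COUNTS["pose"]
        + num_hands * LANDMARK_COUNTS["hand"]
    )


def feature_vector_length(num_landmarks, coords_per_landmark=3):
    """Length of a flat vector that stores every coordinate of every landmark."""
    return num_landmarks * coords_per_landmark


total = holistic_landmark_total()
print(total, feature_vector_length(total))  # 543 1629
```

In practice a hand or the face may go undetected in a given frame, so downstream code should not assume the full count is always present.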
Installation
Install MediaPipe

    # Install with pip
    pip install mediapipe

    # Install OpenCV (usually already included)
    pip install opencv-python

    # GPU build (optional); verify that this package exists on PyPI for your platform
    pip install mediapipe-gpu

    # Pin a specific version
    pip install mediapipe==0.10.9

Verifying the Installation

    import mediapipe as mp
    import cv2

    # Both packages expose their version as the __version__ attribute
    print(f"MediaPipe version: {mp.__version__}")
    print(f"OpenCV version: {cv2.__version__}")

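If an import fails outright, it can help to check what is installed without importing the packages at all. The sketch below uses the standard library's `importlib.metadata` for that. The helper names `installed_version` and `report_versions` are hypothetical, and the distribution names in the list are assumptions: OpenCV in particular may be installed under a different name, such as `opencv-contrib-python`.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def report_versions(package_names):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in package_names}


# Distribution names are assumptions; adjust them to your environment.
packages = ["mediapipe", "opencv-python", "opencv-contrib-python"]
for name, version in report_versions(packages).items():
    print(f"{name}: {version or 'not installed'}")
```

This reads package metadata only, so it still works when the package itself is broken or fails to load a native library.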
MediaPipe Structure

    import mediapipe as mp

    # Solutions: pre-built ML pipelines
    mp.solutions.face_detection
    mp.solutions.face_mesh
    mp.solutions.hands
    mp.solutions.pose
    mp.solutions.holistic
    mp.solutions.objectron
    mp.solutions.selfie_segmentation

    # Drawing utilities
    mp.solutions.drawing_utils
    mp.solutions.drawing_styles

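Which of these modules is present can depend on the installed MediaPipe version, so a quick availability check may save some debugging. The sketch below is one way to do it, under the assumption that each solution is exposed as an attribute of a namespace object such as `mp.solutions`. The helper `missing_solutions` is hypothetical, and the example uses a stand-in object so that it runs without MediaPipe installed.

```python
from types import SimpleNamespace

# Attribute names listed in this section.
EXPECTED_SOLUTIONS = [
    "face_detection", "face_mesh", "hands", "pose",
    "holistic", "objectron", "selfie_segmentation",
    "drawing_utils", "drawing_styles",
]


def missing_solutions(solutions_namespace, expected=EXPECTED_SOLUTIONS):
    """Return the expected attribute names that the namespace does not expose."""
    return [name for name in expected if not hasattr(solutions_namespace, name)]


# Stand-in for mp.solutions so the sketch runs without MediaPipe installed.
fake_solutions = SimpleNamespace(face_detection=object(), hands=object())
print(missing_solutions(fake_solutions, ["face_detection", "hands", "pose"]))  # ['pose']
```

With MediaPipe installed, the same check would be `missing_solutions(mp.solutions)` after `import mediapipe as mp`.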
Face Detection
Basic Face Detection
    import cv2
    import mediapipe as mp

    # Initialize
    mp_face_detection = mp.solutions.face_detection
    mp_drawing = mp.solutions.drawing_utils

    def detect_faces_image(image_path):
        """Detect faces in a static image."""
        # Read image (cv2.imread returns None if the file cannot be read)
        image = cv2.imread(image_path)
        if image is None:
            raise FileNotFoundError(f"Could not read image: {image_path}")
        # OpenCV loads images as BGR; MediaPipe expects RGB
        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # Initialize face detection
        with mp_face_detection.FaceDetection(
            model_selection=1,  # 0: short-range (2m), 1: full-range (5m)
            min_detection_confidence=0.5
        ) as face_detection:
            # Process image
            results = face_detection.process(image_rgb)

            # Draw detections
            if results.detections:
                for detection in results.detections:
                    mp_drawing.draw_detection(image, detection)

                    # Get bounding box (relative coordinates, scaled to pixels)
                    bbox = detection.location_data.relative_bounding_box
                    h, w, _ = image.shape
                    x = int(bbox.xmin * w)
                    y = int(bbox.ymin * h)
                    width = int(bbox.width * w)
                    height = int(bbox.height * h)

                    # Get confidence score
                    score = detection.score[0]
                    print(f"Face detected: confidence={score:.2f}, bbox=({x}, {y}, {width}, {height})")

        cv2.imshow('Face Detection', image)
        cv2.waitKey(0)
        cv2.destroyAllWindows()

    # Run
    detect_faces_image('photo.jpg')

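The bounding-box step above multiplies relative coordinates by the image size. If the resulting box is used to crop the image, it is worth clamping it to the frame first. The sketch below assumes that a detection near the image border can yield relative values slightly outside the range 0 to 1. The function `relative_bbox_to_pixels` is a hypothetical helper, not part of MediaPipe.

```python
def relative_bbox_to_pixels(xmin, ymin, width, height, image_w, image_h):
    """Convert a normalized box to a clamped (x, y, w, h) tuple in pixels."""
    # Scale the two normalized corners to pixel coordinates.
    x1 = int(xmin * image_w)
    y1 = int(ymin * image_h)
    x2 = int((xmin + width) * image_w)
    y2 = int((ymin + height) * image_h)

    # Clamp both corners to the image so that slicing stays in range.
    x1 = max(0, min(x1, image_w))
    y1 = max(0, min(y1, image_h))
    x2 = max(0, min(x2, image_w))
    y2 = max(0, min(y2, image_h))

    return x1, y1, x2 - x1, y2 - y1


# A box fully inside a 640x480 image.
print(relative_bbox_to_pixels(0.25, 0.5, 0.5, 0.25, 640, 480))  # (160, 240, 320, 120)
# A box that starts left of the image is cut off at x = 0.
print(relative_bbox_to_pixels(-0.125, 0.0, 0.5, 0.5, 64, 64))  # (0, 0, 24, 32)
```

Inside the detection loop, the call would take `bbox.xmin`, `bbox.ymin`, `bbox.width`, and `bbox.height` together with `w` and `h`, and the result can then be used as `image[y:y + height, x:x + width]`.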
Real-Time Face Detection (Webcam)
    import cv2
    import mediapipe as mp

    def realtime_face_detection():
        """Real-time face detection from webcam."""
        mp_face_detection = mp.solutions.face_detection
        mp_drawing = mp.solutions.drawing_utils