Complete Guide to Finetuning EasyOCR for Custom Datasets

By Ruby Abdullah · tutorial
Tags: EasyOCR, OCR, Computer Vision, Deep Learning, Finetuning

EasyOCR is a powerful open-source OCR (Optical Character Recognition) library that supports over 80 languages. However, for specific use cases such as custom fonts, historical documents, handwriting, or unique document formats, finetuning the EasyOCR model can significantly improve accuracy.

This tutorial walks through EasyOCR finetuning end to end, from environment setup to model evaluation.

Prerequisites

Before starting, ensure you have:

  • Python 3.8 or 3.9 (the PyTorch 2.0.1 build installed below requires Python 3.8 or newer)
  • GPU with CUDA support (highly recommended, minimum 8GB VRAM)
  • At least 16GB system RAM
  • Image dataset with ground truth labels (minimum 1000 samples for good results)
  • Minimum 10GB disk space for models and dataset
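
The Python version and disk space items in this list can be checked with a few lines of standard-library code. This is a minimal sketch rather than part of EasyOCR: the function name check_prerequisites and the two threshold constants are made up for illustration, and the GPU and RAM requirements are not covered here.

```python
import shutil
import sys

# Thresholds taken from the prerequisites list (assumed values, adjust as needed).
MIN_PYTHON = (3, 8)      # recommended minimum Python version
MIN_FREE_DISK_GB = 10    # space for models and dataset


def check_prerequisites(path="."):
    """Return a dict mapping each checked prerequisite to True (met) or False."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "python_version": sys.version_info[:2] >= MIN_PYTHON,
        "disk_space": free_gb >= MIN_FREE_DISK_GB,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'NOT MET'}")
```

Run it from the drive where the dataset will live, since disk usage is measured for the given path.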

Installation and Environment Setup

1. Clone EasyOCR Repository

# Clone repository
git clone https://github.com/JaidedAI/EasyOCR.git
cd EasyOCR

# Check out a stable release (optional)
git checkout v1.7.0

2. Setup Virtual Environment

# Create virtual environment
python -m venv easyocrenv

# Activate
# For Linux/Mac:
source easyocrenv/bin/activate

# For Windows:
easyocrenv\Scripts\activate

3. Install Dependencies

# Install PyTorch with CUDA support
# Adjust CUDA version to match your system
pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118

# Install EasyOCR requirements
pip install -r requirements.txt

# Install additional training dependencies
pip install tensorboard
pip install lmdb
pip install pillow
pip install opencv-python
pip install albumentations
pip install python-Levenshtein

4. Verify Installation

import torch

# Report the installed PyTorch build and whether it can see a CUDA device
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")

if torch.cuda.is_available():
    print(f"GPU name: {torch.cuda.get_device_name(0)}")

Dataset Preparation

The dataset is the most important component in finetuning. Dataset quality and quantity significantly affect the final results.

1. Dataset Folder Structure

Create a folder structure like this:

dataset/
├── raw/
│   ├── train/
│   │   ├── img001.jpg
│   │   ├── img002.jpg
│   │   └── ...
│   ├── validation/
│   │   ├── img001.jpg
│   │   └── ...
│   └── test/
│       ├── img001.jpg
│       └── ...
├── labels/
│   ├── train_labels.txt
│   ├── val_labels.txt
│   └── test_labels.txt
└── lmdb/
    ├── train/
    └── validation/
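
The empty folders in this layout can be created in one step instead of by hand. The sketch below uses only pathlib. The helper name create_dataset_layout and the SUBDIRS list are illustrative, and the image and label files still have to be copied in afterwards.

```python
from pathlib import Path

# Sub-folders from the tree above; images and label files are added later.
SUBDIRS = [
    "raw/train",
    "raw/validation",
    "raw/test",
    "labels",
    "lmdb/train",
    "lmdb/validation",
]


def create_dataset_layout(root="dataset"):
    """Create the dataset folder tree under `root` and return the root path."""
    root = Path(root)
    for sub in SUBDIRS:
        # exist_ok=True makes the call safe to repeat on an existing tree
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


if __name__ == "__main__":
    print(f"Created layout under: {create_dataset_layout()}")
```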

2. Label File Format

Label files use TSV (Tab-Separated Values) format. For example, train_labels.txt:

img001.jpg	Hello World
img002.jpg	Invoice #12345
img003.jpg	Total: $1,250.00
img004.jpg	PT Rubythalib Data Konsulta

Each line contains:

  • Image filename
  • Tab character (\t)
  • Ground truth text
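
A malformed label line, such as one using spaces instead of a tab, is easy to miss by eye, so it helps to parse the file once before training. The following is a rough sketch of such a check and not an EasyOCR utility: read_labels and missing_images are hypothetical helper names, and the file is assumed to be UTF-8 encoded.

```python
from pathlib import Path


def read_labels(label_path):
    """Parse a TSV label file into a list of (filename, text) pairs."""
    pairs = []
    with open(label_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            # Split on the first tab only, so the text itself may contain tabs
            filename, sep, text = line.partition("\t")
            if not sep or not filename or not text:
                raise ValueError(
                    f"{label_path}:{line_no}: expected 'filename<TAB>text'"
                )
            pairs.append((filename, text))
    return pairs


def missing_images(pairs, image_dir):
    """Return the labelled filenames that do not exist in image_dir."""
    image_dir = Path(image_dir)
    return [name for name, _ in pairs if not (image_dir / name).is_file()]
```

Calling read_labels on each label file and then missing_images against the matching image folder catches both formatting errors and labels that point at absent images.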

Tips for a Quality Dataset:
  • Images must be clear and readable
  • Minimum resolution 64x256 pixels
  • Varied fonts, sizes, and styles
  • Various lighting conditions
  • Include realistic noise and distortion
  • Balanced character distribution
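
The last tip, balanced character distribution, can be checked by counting how often each character appears across the ground truth texts. The sketch below is one possible way to do that. The function names are illustrative, and the min_count threshold of 5 is an arbitrary assumption, not a value recommended by EasyOCR.

```python
from collections import Counter


def character_distribution(texts):
    """Count occurrences of each non-whitespace character across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(ch for ch in text if not ch.isspace())
    return counts


def rare_characters(counts, min_count=5):
    """Return the characters seen fewer than min_count times, sorted."""
    return sorted(ch for ch, n in counts.items() if n < min_count)


if __name__ == "__main__":
    sample = ["Hello World", "Invoice #12345", "Total: $1,250.00"]
    counts = character_distribution(sample)
    print(counts.most_common(5))
    print("Rare:", "".join(rare_characters(counts)))
```

Characters flagged as rare are candidates for collecting or synthesizing more samples before training.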

3. Dataset Preprocessing Script

Create the file prepare_dataset.py:

import os
import cv2
import numpy as np
from PIL import Image
from pathlib import Path


def preprocess_image(image_path, output_path):
    """
    Image preprocessing for OCR training
    """
    # Read image
    img = cv2.imread(str(image_path))
    if img is None:
        print(f"Error reading {image_path}")
        return False

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Noise reduction (h=10, template window 7, search window 21)
    denoised = cv2.fastNlMeansDenoising(gray, None, 10, 7, 21)
