Complete Guide to Finetuning EasyOCR for Custom Datasets

EasyOCR is a powerful open-source OCR (Optical Character Recognition) library that supports over 80 languages. However, for specific use cases such as custom fonts, historical documents, handwriting, or unique document formats, finetuning the EasyOCR model can significantly improve accuracy.

This tutorial walks through EasyOCR finetuning end to end, from environment setup through model evaluation.

Prerequisites

Before starting, ensure you have:

Python 3.7+ (Python 3.8 or 3.9 recommended)
GPU with CUDA support (highly recommended, minimum 8GB VRAM)
At least 16GB system RAM
Image dataset with ground truth labels (minimum 1000 samples for good results)
Minimum 10GB disk space for models and dataset
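
Some of the requirements above can be checked up front. The following is a minimal sketch using only the standard library; the function name check_prerequisites and its parameters are hypothetical, and the thresholds simply mirror the list above. It does not check the GPU or VRAM, since that needs PyTorch, which is installed in a later step.

```python
import shutil
import sys


def check_prerequisites(path=".", min_python=(3, 7), min_free_gb=10):
    """Report whether the Python version and free disk space meet the minimums."""
    # Free space on the drive that holds `path`, converted from bytes to GiB.
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    return {
        "python_ok": sys.version_info >= min_python,
        "disk_ok": free_gb >= min_free_gb,
        "free_gb": round(free_gb, 1),
    }
```

Calling check_prerequisites() from the folder where the dataset will live returns a small dictionary; a False value for python_ok or disk_ok means that requirement is not met.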

Installation and Environment Setup

1. Clone EasyOCR Repository

# Clone repository
git clone https://github.com/JaidedAI/EasyOCR.git
cd EasyOCR

# Check out a stable release (optional)
git checkout v1.7.0

2. Setup Virtual Environment

# Create virtual environment
python -m venv easyocrenv

# Activate
# For Linux/Mac:
source easyocrenv/bin/activate
# For Windows:
easyocrenv\Scripts\activate

3. Install Dependencies

# Install PyTorch with CUDA support
# Adjust CUDA version to match your system
pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118

# Install EasyOCR requirements
pip install -r requirements.txt

# Install additional training dependencies
pip install tensorboard
pip install lmdb
pip install pillow
pip install opencv-python
pip install albumentations
pip install python-Levenshtein

4. Verify Installation

import torch

# Report the installed PyTorch build and whether it can see a CUDA device
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")

if torch.cuda.is_available():
    print(f"GPU name: {torch.cuda.get_device_name(0)}")

Dataset Preparation

The dataset is the most important component in finetuning. Dataset quality and quantity significantly affect the final results.

1. Dataset Folder Structure

Create a folder structure like this:

dataset/
├── raw/
│   ├── train/
│   │   ├── img001.jpg
│   │   ├── img002.jpg
│   │   └── ...
│   ├── validation/
│   │   ├── img001.jpg
│   │   └── ...
│   └── test/
│       ├── img001.jpg
│       └── ...
├── labels/
│   ├── trainlabels.txt
│   ├── vallabels.txt
│   └── testlabels.txt
└── lmdb/
    ├── train/
    └── validation/
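
The empty folders in this tree can be created with a few lines of Python instead of by hand. This is a minimal sketch; the helper name create_dataset_layout and the DATASET_DIRS list are hypothetical, and the folder names are copied from the tree above.

```python
from pathlib import Path

# Sub-directories taken from the tree above, relative to the dataset root.
DATASET_DIRS = [
    "raw/train",
    "raw/validation",
    "raw/test",
    "labels",
    "lmdb/train",
    "lmdb/validation",
]


def create_dataset_layout(root="dataset"):
    """Create every folder of the layout (if missing) and return their paths."""
    root = Path(root)
    paths = []
    for rel in DATASET_DIRS:
        path = root / rel
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```

Running create_dataset_layout() in the project folder creates dataset/ and its sub-folders; running it again leaves existing files untouched.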

2. Label File Format

Label files use TSV (Tab-Separated Values) format:
trainlabels.txt:

img001.jpg	Hello World
img002.jpg	Invoice #12345
img003.jpg	Total: $1,250.00
img004.jpg	PT Rubythalib Data Konsulta

Each line contains:

Image filename
Tab character (\t)
Ground truth text
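
A label file in this format can be validated before training. The sketch below assumes UTF-8 files with one filename, one tab, and the text per line; the function names read_labels and find_missing_images are hypothetical helpers, not part of EasyOCR.

```python
from pathlib import Path


def read_labels(label_path):
    """Parse a tab-separated label file into (filename, text) pairs."""
    pairs = []
    with open(label_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line.strip():
                continue  # skip blank lines
            # Split on the first tab only, so the text itself may contain tabs.
            filename, sep, text = line.partition("\t")
            if not sep or not filename or not text:
                raise ValueError(
                    f"{label_path}:{line_number}: expected 'filename<TAB>text'"
                )
            pairs.append((filename, text))
    return pairs


def find_missing_images(pairs, image_dir):
    """Return label filenames that have no matching file in image_dir."""
    image_dir = Path(image_dir)
    return [name for name, _ in pairs if not (image_dir / name).is_file()]
```

For example, read_labels("dataset/labels/trainlabels.txt") followed by find_missing_images(pairs, "dataset/raw/train") lists labels whose image is absent, which is a common cause of training failures.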

Tips for a quality dataset:

Images must be clear and readable
Minimum resolution 64x256 pixels
Varied fonts, sizes, and styles
Various lighting conditions
Include realistic noise and distortion
Balanced character distribution
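
One way to check the last tip is to count how often each character appears in the ground truth text. This is a rough sketch, not a definitive measure of balance; the function names and the min_count threshold of 5 are assumptions chosen for illustration.

```python
from collections import Counter


def character_distribution(texts):
    """Count how often each non-space character appears across all label texts."""
    counts = Counter()
    for text in texts:
        counts.update(ch for ch in text if not ch.isspace())
    return counts


def rare_characters(counts, min_count=5):
    """Return (character, count) pairs seen fewer than min_count times, rarest first."""
    rare = [(ch, n) for ch, n in counts.items() if n < min_count]
    return sorted(rare, key=lambda item: (item[1], item[0]))
```

Characters reported by rare_characters are candidates for collecting more samples, since a recognizer sees too few examples of them to learn them reliably.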

3. Dataset Preprocessing Script

Create the file prepare_dataset.py:

import os
import cv2
import numpy as np
from PIL import Image
from pathlib import Path

def preprocess_image(image_path, output_path):
    """
    Image preprocessing for OCR training
    """
    # Read image (cv2.imread returns None if the file is missing or unreadable)
    img = cv2.imread(str(image_path))

    if img is None:
        print(f"Error reading {image_path}")
        return False

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Noise reduction (h=10, templateWindowSize=7, searchWindowSize=21)
    denoised = cv2.fastNlMeansDenoising(gray, None, 10, 7, 21)
