Training PaddleOCR for Dot-Matrix Expiry Date Detection Using Mobile Cameras for a Large Retail Chain

A custom PaddleOCR model trained to accurately detect dot-matrix expiry dates from product packaging using mobile cameras, built for a major retail chain to automate QC operations.

Client

A large retail chain managing thousands of daily product entries across supermarkets, warehouses, and distribution centers.

A major operational challenge:
Reading expiry dates printed in dot-matrix fonts on packets, bottles, cartons, and flexible packaging.

These dates are:

  • Hard to read

  • Faint or low contrast

  • Curved or distorted on packaging

  • Inconsistent in font style & placement

Manual entry was slow, error-prone, and expensive.

The client wanted an AI-powered camera scanning solution using mobile devices to:

  • Detect expiry date

  • Identify manufacture date

  • Scan products quickly

  • Reduce human errors


Project Overview

We built and trained a custom PaddleOCR model capable of:

  • Detecting dot-matrix style text

  • Recognizing expiry/manufacture dates

  • Running on mobile phones

  • Handling real-world camera conditions

  • Providing high accuracy across SKUs

The final solution delivered:

  • > 92% detection accuracy

  • Real-time inference on mobile devices

  • Automatic cleaning of low-light/distorted images


Key Challenges

1. Dot-Matrix Fonts Are Hard To Read

Expiry dates printed with dot-matrix technology often suffer from:

  • Broken character strokes

  • Pixelated edges

  • Tiny font size

  • Low contrast on colored packaging

2. Camera Conditions in Retail

Images were captured with:

  • Low light

  • Glare from plastic wraps

  • Motion blur

  • Different angles

  • Curved surfaces

3. Inconsistent Date Formats

Examples:

  • 12/04/24

  • 12-APR-2024

  • EXP 12/04/2024

  • MFG: 120424

The OCR pipeline needed to parse all of these formats reliably.
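To illustrate what handling these formats involves, here is a minimal sketch that normalizes the four example strings above into a single date value. It is not the project's actual code: the format list, the prefix list, and the day-first reading of "12/04/24" and "120424" are assumptions made for this example, and month-name parsing assumes an English locale.

```python
from datetime import date, datetime
from typing import Optional

# Assumed day-first formats, chosen to cover the examples in this section.
DATE_FORMATS = ["%d/%m/%y", "%d/%m/%Y", "%d-%b-%Y", "%d-%m-%Y", "%d%m%y"]
# Hypothetical label prefixes that are stripped before parsing.
PREFIXES = ("EXP", "MFG", "BEST BEFORE", "USE BY")


def normalize_date(text: str) -> Optional[date]:
    """Return a date for a printed date string, or None if no format matches."""
    cleaned = text.strip().upper()
    for prefix in PREFIXES:
        if cleaned.startswith(prefix):
            # Drop the label and any separator characters that follow it.
            cleaned = cleaned[len(prefix):].lstrip(" :.")
            break
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None
```

With these assumptions, all four example strings resolve to 12 April 2024, and unrecognized text returns None so the caller can decide what to do next.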

4. Need for Fast On-Device Inference

The retail team wanted:

  • Live scanning

  • Instant result

  • No cloud dependency


Our Solution

1. Custom OCR Dataset Creation (30,000+ Images)

We built a specialized dataset:

✔ Real product images

Captured from client stores & warehouses.

✔ Synthetic dot-matrix generation

We created a generator for:

  • Random dot patterns

  • Variable dot spacing

  • Random distortions

  • Different date layouts

  • Curved surface warp simulation
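As a rough illustration of the idea behind such a generator, the sketch below renders a date string from a 5x7 dot font with configurable dot spacing and random dot dropout. The glyph table, function name, and parameters are hypothetical and written for this example; it uses plain Python lists rather than an image library, and it does not cover distortions or surface warping.

```python
import random
from typing import List, Optional

# Hypothetical 5x7 glyphs for digits and "/" ("1" marks a printed dot).
FONT = {
    "0": ["01110", "10001", "10011", "10101", "11001", "10001", "01110"],
    "1": ["00100", "01100", "00100", "00100", "00100", "00100", "01110"],
    "2": ["01110", "10001", "00001", "00010", "00100", "01000", "11111"],
    "3": ["11110", "00001", "00001", "01110", "00001", "00001", "11110"],
    "4": ["00010", "00110", "01010", "10010", "11111", "00010", "00010"],
    "5": ["11111", "10000", "11110", "00001", "00001", "10001", "01110"],
    "6": ["00110", "01000", "10000", "11110", "10001", "10001", "01110"],
    "7": ["11111", "00001", "00010", "00100", "01000", "01000", "01000"],
    "8": ["01110", "10001", "10001", "01110", "10001", "10001", "01110"],
    "9": ["01110", "10001", "10001", "01111", "00001", "00010", "01100"],
    "/": ["00001", "00001", "00010", "00100", "01000", "10000", "10000"],
}
GLYPH_W, GLYPH_H, GAP = 5, 7, 1  # glyph size and inter-character gap, in dots


def render_dot_matrix(text: str, pitch: int = 3, dropout: float = 0.0,
                      seed: Optional[int] = None) -> List[List[int]]:
    """Render text as a grayscale grid (0 = background, 255 = dot).

    pitch is the distance between dot centres in pixels; dropout is the
    probability that a dot is missing, which mimics broken strokes.
    """
    rng = random.Random(seed)
    cols = len(text) * (GLYPH_W + GAP) - GAP
    img = [[0] * (cols * pitch) for _ in range(GLYPH_H * pitch)]
    for i, ch in enumerate(text):
        for r, row in enumerate(FONT[ch]):
            for c, bit in enumerate(row):
                if bit == "1" and rng.random() >= dropout:
                    x = (i * (GLYPH_W + GAP) + c) * pitch
                    img[r * pitch][x] = 255
    return img
```

Calling render_dot_matrix("12/04/24", pitch=3, dropout=0.1, seed=7) yields a 21 x 141 grid with roughly one dot in ten missing; the ground-truth label is simply the input string, which is what makes synthetic data cheap to annotate.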

✔ Preprocessing augmentation

  • Blurring

  • Brightness variations

  • Noise

  • Glare simulation

  • Perspective transformations
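A dependency-free sketch of three of these augmentations (blur, brightness, noise) is shown below, operating on a grayscale image stored as a list of rows. The function name and parameters are assumptions for this example; a real pipeline would more likely use an image library, and glare and perspective transformations are not covered here.

```python
import random
from typing import List, Optional


def augment(img: List[List[int]], brightness: float = 1.0,
            noise_sigma: float = 0.0, blur: bool = False,
            seed: Optional[int] = None) -> List[List[int]]:
    """Apply optional blur, brightness scaling and Gaussian noise.

    img holds grayscale values in 0..255; the result is clipped to that range.
    """
    rng = random.Random(seed)
    width = len(img[0])
    out = [[float(v) for v in row] for row in img]
    if blur:
        # 3-tap horizontal box blur, a crude stand-in for motion blur.
        out = [
            [(row[max(x - 1, 0)] + row[x] + row[min(x + 1, width - 1)]) / 3.0
             for x in range(width)]
            for row in out
        ]
    return [
        [min(255, max(0, round(v * brightness + rng.gauss(0.0, noise_sigma))))
         for v in row]
        for row in out
    ]
```

Passing a different seed per sample gives a different noise pattern each time, so one source image can produce many training variants.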

This created a robust dataset for PaddleOCR training.


2. Training PaddleOCR with Custom Dot-Matrix Font Recognition

We fine-tuned:

  • PaddleOCR Text Detection (DBNet)

  • Text Recognition Model (CRNN-based)

Enhancements included:

✔ Modified backbone

To handle faint strokes and pixelated shapes.

✔ Character-level augmentation

To train the model on broken/incomplete characters.

✔ Date-pattern post-processing

Regex + heuristics to auto-correct characters misread in digit positions, such as:

  • I → 1

  • O → 0

  • B → 8
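One way such a correction step could work is sketched below, assuming the target is a numeric date, so look-alike letters are replaced by digits only inside tokens that already look numeric. The confusion table and the token rule are assumptions for this example, not the project's actual heuristics.

```python
import re

# Assumed confusion table: letters that OCR may return in place of digits.
LETTER_TO_DIGIT = str.maketrans({"I": "1", "O": "0", "B": "8"})

# A token is treated as numeric-like when it contains only digits,
# the look-alike letters above, and common date separators.
NUMERIC_LIKE = re.compile(r"^[0-9IOB/\-.]+$")


def fix_numeric_token(token: str) -> str:
    """Replace look-alike letters with digits in a numeric-looking token."""
    upper = token.upper()
    # Require at least one real digit so words such as "BOB" are left alone.
    if NUMERIC_LIKE.match(upper) and any(ch.isdigit() for ch in upper):
        return upper.translate(LETTER_TO_DIGIT)
    return token


def fix_line(line: str) -> str:
    """Apply the correction to every whitespace-separated token in a line."""
    return " ".join(fix_numeric_token(t) for t in line.split())
```

Restricting the substitution to numeric-looking tokens keeps labels such as EXP and month names such as APR untouched, since they contain letters outside the confusion table.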


3. Expiry Date Detection Logic

We combined OCR with:

✔ Regex pattern extraction

To detect:

  • DD/MM/YY

  • MM/YYYY

  • EXP 12-04-2024

  • BEST BEFORE, USE BY patterns
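The pattern extraction described above can be sketched as a single regular expression that captures an optional label followed by a date. The label vocabulary, the accepted date shapes, and the role names are assumptions for this example.

```python
import re
from typing import List, Tuple

# Assumed label vocabulary; real packaging may use other wording.
LABEL = r"(?P<label>EXP(?:IRY)?|USE\s*BY|BEST\s*BEFORE|MFG|MFD)"
# Date shapes, tried in order: DD/MM/YY(YY), DD-MON-YYYY, MM/YYYY, DDMMYY.
# The lookarounds stop the pattern from matching inside longer digit runs
# such as barcodes.
DATE = (
    r"(?P<date>(?<!\d)(?:"
    r"\d{1,2}[/\-.]\d{1,2}[/\-.]\d{2,4}"
    r"|\d{1,2}[/\-.][A-Z]{3}[/\-.]\d{2,4}"
    r"|\d{1,2}[/\-.]\d{4}"
    r"|\d{6}"
    r")(?!\d))"
)
PATTERN = re.compile(r"(?:" + LABEL + r"\s*[:.]?\s*)?" + DATE)


def extract_dates(ocr_text: str) -> List[Tuple[str, str]]:
    """Return (role, date_string) pairs found in recognized text.

    role is "expiry", "manufacture", or "unknown" when no label precedes
    the date.
    """
    results = []
    for match in PATTERN.finditer(ocr_text.upper()):
        label = match.group("label")
        if label is None:
            role = "unknown"
        elif label.startswith("MF"):
            role = "manufacture"
        else:
            role = "expiry"
        results.append((role, match.group("date")))
    return results
```

For example, extract_dates("MFG: 120424 EXP 12-04-2024") returns one manufacture date and one expiry date, while an unlabeled "12/04/24" is returned with the role "unknown" and can be left for a later rule or a user prompt.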

✔ Confidence scoring

Filtering low-confidence recognition cases.

✔ Fallback mechanism

If the expiry date is ambiguous → prompt the user to rescan.
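The confidence filter and rescan fallback can be sketched as a small decision function. The data classes, the 0.85 threshold, and the rule that surviving candidates must agree on one value are all assumptions for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    text: str          # extracted date string
    confidence: float  # recognition score in [0, 1]


@dataclass
class ScanResult:
    status: str                 # "ok" or "rescan"
    text: Optional[str] = None  # accepted date string when status is "ok"


def decide(candidates: List[Candidate], min_conf: float = 0.85) -> ScanResult:
    """Accept a date only if the confident candidates agree on one value."""
    confident = [c for c in candidates if c.confidence >= min_conf]
    distinct = {c.text for c in confident}
    if len(distinct) == 1:
        return ScanResult("ok", distinct.pop())
    # Nothing confident enough, or conflicting readings: ask for a rescan.
    return ScanResult("rescan")
```

Treating conflicting high-confidence readings the same as no reading keeps the failure mode conservative: the operator sees a rescan prompt rather than a possibly wrong date.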


4. Real-Time Inference on Mobile

Using:

  • Paddle Lite

  • Model quantization

  • FP16 optimization

We achieved:

  • < 80 ms inference time

  • Smooth scanning experience

  • Offline capability (no internet needed)


Architecture Diagram (Text Version)

Camera Capture → Preprocessing → PaddleOCR Detection → PaddleOCR Recognition → Date Format Extraction → Confidence Scoring → Result Display

Results & Impact

🔍 Over 92% Accurate Expiry Date Detection

Even with low-quality packaging prints.

Fast Real-Time Scanning

Inference under 100 ms per frame.

📉 Reduced Manual Labor

Retail staff saved significant time previously spent reading and entering dates.

📦 Supports Thousands of SKUs

Model performed consistently across products:

  • Beverages

  • Packaged foods

  • Cosmetics

  • Medicine boxes

📱 Works Directly on Mobile Devices

No cloud dependency → instant response.


Conclusion

By creating a custom PaddleOCR-based dot-matrix detection model, we delivered a highly accurate, mobile-friendly expiry date recognition system for a major retail chain.

This solution:

  • Automates quality checks

  • Reduces human error

  • Speeds up product handling

  • Works in real-world retail environments

It now powers part of the retailer’s digital inventory and product validation workflow.

Written by

Abhi
