Client
A large retail chain managing thousands of daily product entries across supermarkets, warehouses, and distribution centers.
A major operational challenge was reading expiry dates printed in dot-matrix fonts on packets, bottles, cartons, and flexible packaging.
These dates are:
- Hard to read
- Faint or low contrast
- Curved or distorted on packaging
- Inconsistent in font style & placement
Manual entry was slow, error-prone, and expensive.
The client wanted an AI-powered camera scanning solution using mobile devices to:
- Detect expiry date
- Identify manufacture date
- Scan products quickly
- Reduce human errors
Project Overview
We built and trained a custom PaddleOCR model capable of:
- Detecting dot-matrix style text
- Recognizing expiry/manufacture dates
- Running on mobile phones
- Handling real-world camera conditions
- Providing high accuracy across SKUs
The final solution delivered:
- > 92% detection accuracy
- Real-time inference on mobile devices
- Automatic cleaning of low-light/distorted images
Key Challenges
1. Dot-Matrix Fonts Are Hard To Read
Expiry dates printed with dot-matrix technology often suffer from:
- Broken character strokes
- Pixelated edges
- Tiny font size
- Low contrast on colored packaging
2. Camera Conditions in Retail
Images are captured with:
- Low light
- Glare from plastic wraps
- Motion blur
- Different angles
- Curved surfaces
3. Inconsistent Date Formats
Examples:
- 12/04/24
- 12-APR-2024
- EXP 12/04/2024
- MFG: 120424
OCR needed to understand multiple formats reliably.
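To make the format problem concrete, here is a minimal sketch of how the example strings above could be normalized into one date value. It is an illustration, not the production logic: the function name `normalize_date`, the list of candidate formats, and the day-first reading of `12/04/24` are assumptions, and month abbreviations are parsed with the default (English) locale.

```python
import re
from datetime import date, datetime
from typing import Optional

# Candidate layouts, tried in order. Day-first ordering is an assumption
# based on the examples in the text (12/04/24 read as 12 April 2024).
CANDIDATE_FORMATS = (
    "%d/%m/%Y", "%d/%m/%y",
    "%d-%m-%Y", "%d-%m-%y",
    "%d-%b-%Y", "%d-%b-%y",   # 12-APR-2024 (locale-dependent month names)
    "%d%m%y", "%d%m%Y",       # 120424
)

# Optional leading label such as "EXP", "MFG:" or "BEST BEFORE".
LABEL = re.compile(r"^\s*(EXP|MFG|BEST BEFORE|USE BY)\s*[:.]?\s*", re.IGNORECASE)


def normalize_date(text: str) -> Optional[date]:
    """Return the date encoded in `text`, or None if no format matches."""
    cleaned = LABEL.sub("", text.strip())
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # try the next layout
    return None


if __name__ == "__main__":
    for raw in ("12/04/24", "12-APR-2024", "EXP 12/04/2024", "MFG: 120424"):
        print(raw, "->", normalize_date(raw))
```

All four sample strings resolve to the same calendar date under these assumptions. A month-first market would need a different format list.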
4. Need for Fast On-Device Inference
The retail team wanted:
- Live scanning
- Instant result
- No cloud dependency
Our Solution
1. Custom OCR Dataset Creation (30,000+ Images)
We built a specialized dataset:
✔ Real product images
Captured from client stores & warehouses.
✔ Synthetic dot-matrix generation
We created a generator for:
- Random dot patterns
- Variable dot spacing
- Random distortions
- Different date layouts
- Curved surface warp simulation
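A stripped-down sketch of such a generator is shown below, using only the Python standard library. It covers three of the listed ideas (variable dot spacing, random missing dots, and small positional jitter) and leaves out layout variation and surface warping. The 5x7 glyph table, the function names, and all parameter names are illustrative assumptions, not the generator used in the project.

```python
import random

# 5x7 glyphs, one string per glyph with rows separated by spaces ('#' = dot).
# Only the characters needed for numeric dates are defined here.
_RAW_GLYPHS = {
    "0": ".###. #...# #..## #.#.# ##..# #...# .###.",
    "1": "..#.. .##.. ..#.. ..#.. ..#.. ..#.. .###.",
    "2": ".###. #...# ....# ...#. ..#.. .#... #####",
    "3": "##### ...#. ..#.. ...#. ....# #...# .###.",
    "4": "...#. ..##. .#.#. #..#. ##### ...#. ...#.",
    "5": "##### #.... ####. ....# ....# #...# .###.",
    "6": "..##. .#... #.... ####. #...# #...# .###.",
    "7": "##### ....# ...#. ..#.. .#... .#... .#...",
    "8": ".###. #...# #...# .###. #...# #...# .###.",
    "9": ".###. #...# #...# .#### ....# ...#. .##..",
    "/": "..... ....# ...#. ..#.. .#... #.... .....",
}
GLYPHS = {ch: rows.split() for ch, rows in _RAW_GLYPHS.items()}


def render_dot_matrix(text, spacing=2, dropout=0.0, jitter=0, rng=None):
    """Render `text` as a 2D list of 0/1 pixels (1 = printed dot).

    spacing: pixel distance between neighbouring dots
    dropout: probability that a dot is missing (broken strokes)
    jitter:  maximum random offset of each dot, in pixels
    """
    rng = rng or random.Random()
    cell_w = 5 * spacing + spacing              # glyph width plus a one-dot gap
    height = 7 * spacing + 2 * jitter
    width = len(text) * cell_w + 2 * jitter
    image = [[0] * width for _ in range(height)]
    for i, ch in enumerate(text):
        for r, row in enumerate(GLYPHS[ch]):
            for c, cell in enumerate(row):
                if cell != "#" or rng.random() < dropout:
                    continue                    # blank cell or dropped dot
                y = r * spacing + jitter + rng.randint(-jitter, jitter)
                x = i * cell_w + c * spacing + jitter + rng.randint(-jitter, jitter)
                image[y][x] = 1
    return image


def to_ascii(image):
    """Text preview of a rendered image."""
    return "\n".join("".join("o" if p else " " for p in row) for row in image)


if __name__ == "__main__":
    rng = random.Random(7)
    print(to_ascii(render_dot_matrix("12/04/24", dropout=0.15, jitter=1, rng=rng)))
```

Passing a seeded `random.Random` keeps generated samples reproducible, which helps when a synthetic batch needs to be regenerated.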
✔ Preprocessing augmentation
- Blurring
- Brightness variations
- Noise
- Glare simulation
- Perspective transformations
This created a robust dataset for PaddleOCR training.
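Three of the augmentations listed above (blurring, brightness variation, and noise) can be sketched on a plain grayscale image stored as a list of rows. This is a simplified standard-library illustration. A real pipeline would more likely use an image library, and the function names and parameter ranges here are assumptions chosen for the example.

```python
import random


def _clamp(value):
    """Keep a pixel value inside the 8-bit range."""
    return min(255, max(0, round(value)))


def adjust_brightness(image, factor):
    """Scale every pixel; factor < 1 darkens, factor > 1 brightens."""
    return [[_clamp(p * factor) for p in row] for row in image]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return [[_clamp(p + rng.gauss(0, sigma)) for p in row] for row in image]


def box_blur(image):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(round(sum(window) / len(window)))
        out.append(row)
    return out


def augment(image, rng):
    """Apply a random combination of the three effects (ranges are illustrative)."""
    out = adjust_brightness(image, rng.uniform(0.5, 1.3))
    if rng.random() < 0.5:
        out = box_blur(out)
    return add_gaussian_noise(out, sigma=rng.uniform(0, 12), rng=rng)


if __name__ == "__main__":
    sample = [[0, 255, 0], [255, 255, 255], [0, 255, 0]]
    print(augment(sample, random.Random(1)))
```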
2. Training PaddleOCR with Custom Dot-Matrix Font Recognition
We fine-tuned:
- PaddleOCR Text Detection (DBNet)
- Text Recognition Model (CRNN-based)
Enhancements included:
✔ Modified backbone
To handle faint strokes and pixelated shapes.
✔ Character-level augmentation
To train the model on broken/incomplete characters.
✔ Date-pattern post-processing
Regex and heuristics to auto-correct look-alike letters that are misread in place of digits inside date fields, such as:
- I → 1
- O → 0
- B → 8
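One way to apply this kind of correction safely is to restrict it to tokens that already contain a digit, so that labels and month names such as "OCT" or "BEST BEFORE" are left alone. The sketch below assumes that rule. The names `fix_lookalikes` and `NUMERIC_TOKEN` are hypothetical, and only the three confusions named above are handled.

```python
import re

# Letters that a recognizer may emit where a digit was printed.
LETTER_TO_DIGIT = str.maketrans({"I": "1", "O": "0", "B": "8"})

# A token made only of digits and confusable letters that contains
# at least one real digit (the lookahead enforces the digit).
NUMERIC_TOKEN = re.compile(r"\b(?=[IOB]*\d)[0-9IOB]+\b")


def fix_lookalikes(text: str) -> str:
    """Replace I/O/B with 1/0/8, but only inside numeric-looking tokens."""
    return NUMERIC_TOKEN.sub(lambda m: m.group().translate(LETTER_TO_DIGIT), text)


if __name__ == "__main__":
    for raw in ("EXP I2/O4/2B", "12-OCT-2O24", "BEST BEFORE 1B/O1/25"):
        print(raw, "->", fix_lookalikes(raw))
```

The trade-off of the digit requirement is that a token misread entirely as letters (for example "IO" for "10") is not corrected. Catching that case would need context from the surrounding date pattern.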
3. Expiry Date Detection Logic
We combined OCR with:
✔ Regex pattern extraction
To detect:
- DD/MM/YY
- MM/YYYY
- EXP 12-04-2024
- BEST BEFORE, USE BY patterns
✔ Confidence scoring
Filtering low-confidence recognition cases.
✔ Fallback mechanism
If the expiry date is ambiguous, the user is prompted to rescan.
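The three pieces (pattern extraction, confidence filtering, and the rescan fallback) might fit together as in the sketch below. It is a simplified illustration under stated assumptions: the `OcrLine` type, the 0.80 threshold, and the rule that only dates preceded by an expiry label are accepted are all hypothetical choices for the example, not the deployed logic.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class OcrLine:
    text: str
    confidence: float  # recognizer score in [0, 1]


# DD/MM/YY(YY), DD-MON-YY(YY), or MM/YYYY.
DATE = (
    r"(\d{1,2}[/-]\d{1,2}[/-]\d{2,4}"
    r"|\d{1,2}[/-][A-Z]{3}[/-]\d{2,4}"
    r"|\d{1,2}[/-]\d{4})"
)
EXPIRY = re.compile(
    r"(?:EXP(?:IRY)?|BEST BEFORE|USE BY)\s*[:.]?\s*" + DATE, re.IGNORECASE
)

MIN_CONFIDENCE = 0.80  # hypothetical threshold


def find_expiry(lines: Iterable[OcrLine]) -> Optional[str]:
    """Return the single labelled expiry date, or None to request a rescan."""
    candidates = {
        match.group(1)
        for line in lines
        if line.confidence >= MIN_CONFIDENCE      # drop uncertain reads
        for match in EXPIRY.finditer(line.text)
    }
    if len(candidates) == 1:
        return candidates.pop()
    return None  # nothing found, or conflicting dates: ask the user to rescan


if __name__ == "__main__":
    frame = [OcrLine("MFG 01/04/24", 0.95), OcrLine("EXP 12-04-2024", 0.91)]
    print(find_expiry(frame) or "Please rescan")
```

Returning `None` for both the "no date" and "conflicting dates" cases keeps the caller simple, because either outcome leads to the same rescan prompt.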
4. Real-Time Inference on Mobile
Using:
- Paddle Lite
- Model quantization
- FP16 optimization
We achieved:
- < 80 ms inference time
- Smooth scanning experience
- Offline capability (no internet needed)
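For readers unfamiliar with quantization, the sketch below shows the basic arithmetic of symmetric 8-bit quantization on a small list of weights. It only illustrates the idea of trading precision for smaller, faster models. It does not reproduce Paddle Lite's tooling, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    # Each restored value differs from the original by at most half a step.
    for w, r in zip(weights, restored):
        print(f"{w:+.4f} -> {r:+.4f}")
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per value.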
Results & Impact
🔍 92% Accurate Expiry Date Detection
Even with low-quality packaging prints.
⚡ Fast Real-Time Scanning
Inference under 100 ms per frame.
📉 Reduced Manual Labor
Retail staff saved significant time previously spent reading and entering dates.
📦 Supports Thousands of SKUs
Model performed consistently across products:
- Beverages
- Packaged foods
- Cosmetics
- Medicine boxes
📱 Works Directly on Mobile Devices
No cloud dependency → instant response.
Conclusion
By creating a custom PaddleOCR-based dot-matrix detection model, we delivered a highly accurate, mobile-friendly expiry date recognition system for a major retail chain.
This solution:
- Automates quality checks
- Reduces human error
- Speeds up product handling
- Works in real-world retail environments
It now powers part of the retailer’s digital inventory and product validation workflow.

Written by
Abhi




