7.7.26. Image Classification Example
This example demonstrates how to use Tiny ML Tensorlab to train an image classifier and compile it for a microcontroller.
7.7.26.1. Overview
Task: Image classification (multi-class)
Application: Visual quality inspection, simple object recognition
Model: MobileNet-based architectures
Input: RGB or grayscale images
7.7.26.2. When to Use Image Classification
Image classification is useful for:
Visual quality inspection
Simple object recognition
Scene classification
Presence/absence detection
7.7.26.3. Running the Example
Linux:
cd tinyml-modelzoo
./run_tinyml_modelzoo.sh examples/image_classification/config.yaml
Windows:
cd tinyml-modelzoo
run_tinyml_modelzoo.bat examples\image_classification\config.yaml
7.7.26.4. Configuration
common:
  target_module: 'image'
  task_type: 'generic_image_classification'
  target_device: 'F28P55'
dataset:
  dataset_name: 'image_classification_example'
  input_data_path: '/path/to/image/dataset'
data_processing_feature_extraction:
  image_size: [96, 96]  # Width x Height
  channels: 3  # RGB (3) or Grayscale (1)
training:
  model_name: 'MobileNetV2_Small'
  training_epochs: 50
  batch_size: 32
testing:
  enable: True
compilation:
  enable: True
7.7.26.5. Dataset Format
Image datasets use a folder-per-class structure:
my_image_dataset/
├── annotations.yaml
└── classes/
    ├── class_a/
    │   ├── image_001.jpg
    │   ├── image_002.png
    │   └── ...
    ├── class_b/
    │   ├── image_001.jpg
    │   └── ...
    └── class_c/
        └── ...
annotations.yaml:
name: my_image_dataset
description: Custom image classification dataset
task_type: image_classification
Supported formats: JPEG, PNG, BMP
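Before training, it can help to confirm that the dataset folder matches the layout above and that the classes are reasonably balanced. The following is a minimal sketch, not part of the toolkit: the helper name `count_images_per_class` is hypothetical, and the extension list is an assumption derived from the supported formats listed above.

```python
from pathlib import Path

# Assumed file extensions for the formats listed as supported (JPEG, PNG, BMP).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(dataset_root):
    """Return {class_name: image_count} for a folder-per-class dataset.

    Expects the layout <dataset_root>/classes/<class_name>/<image files>.
    """
    classes_dir = Path(dataset_root) / "classes"
    counts = {}
    for class_dir in sorted(classes_dir.iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class("my_image_dataset")` returns one entry per class folder. A class with a count of zero, or with far fewer images than the others, is worth fixing before training.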
7.7.26.6. Image Size Considerations
Smaller images give faster inference and use less memory, but preserve less detail:
| Size | Memory | Inference Time | Detail |
|---|---|---|---|
| 32x32 | Very Low | Fastest | Low |
| 64x64 | Low | Fast | Moderate |
| 96x96 | Moderate | Moderate | Good |
| 128x128 | Higher | Slower | High |
Recommendation: Start with 64x64 or 96x96 for most applications.
data_processing_feature_extraction:
  image_size: [64, 64]  # Start small, increase if needed
  channels: 3
7.7.26.7. Available Models
| Model | Parameters | Description |
|---|---|---|
|  | ~10k | Minimal, simple tasks |
|  | ~50k | Good balance |
|  | ~100k | Complex classification |
|  | ~20k | Simple custom architecture |
Note: Image models are typically larger than time series models due to 2D spatial processing requirements.
7.7.26.8. Expected Results
Training complete.
Float32 Model:
  Accuracy: 95%+
  F1-Score: 0.94
Quantized Model:
  Accuracy: 92%+
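The log above reports accuracy and F1-score. How the toolkit averages F1 across classes is not stated here, so the sketch below assumes macro averaging (the unweighted mean of per-class F1). The function names are illustrative only and do not come from the toolkit.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Computing both metrics on the same held-out test set for the float32 and the quantized model makes the quantization cost directly comparable.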
7.7.26.9. Grayscale vs RGB
Grayscale (1 channel):
Smaller model input
Faster inference
Good when color is not important
data_processing_feature_extraction:
  channels: 1
RGB (3 channels):
Full color information
Larger model input
Needed when color matters for classification
data_processing_feature_extraction:
  channels: 3
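To judge whether color matters for a dataset, it can be useful to convert a few samples to grayscale and inspect them. The sketch below uses the common ITU-R BT.601 luma weights. Whether the toolkit uses the same weights when `channels: 1` is set is not stated here, so treat this as an illustration rather than the toolkit's conversion.

```python
def rgb_to_grayscale(image):
    """Convert rows of (R, G, B) tuples (0-255) to rows of single 0-255 values.

    Uses the ITU-R BT.601 luma weights: 0.299 R + 0.587 G + 0.114 B.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


# A 1x3 image: pure red, pure green, pure blue.
sample = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
gray = rgb_to_grayscale(sample)
```

The grayscale result holds one value per pixel instead of three, which is where the smaller model input comes from.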
7.7.26.10. Data Augmentation
Image augmentation improves model robustness:
data_processing_feature_extraction:
  augmentation:
    horizontal_flip: True
    vertical_flip: False
    rotation: 15  # degrees
    brightness: 0.2
    contrast: 0.2
    zoom: 0.1
Common augmentations:
Flip: For symmetric objects
Rotation: When orientation varies
Brightness/Contrast: For lighting variation
Zoom/Crop: For scale variation
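As a rough illustration of what two of these augmentations do to the pixel data, here is a minimal pure-Python sketch for a grayscale image stored as rows of 0-255 values. It is not the toolkit's implementation. In particular, reading `brightness: 0.2` as a scale factor drawn uniformly from [0.8, 1.2] is an assumption.

```python
import random


def horizontal_flip(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def adjust_brightness(image, factor):
    """Scale every pixel by factor and clamp the result to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def augment(image, rng, flip_prob=0.5, brightness=0.2):
    """Apply a random flip and a random brightness change to one image."""
    out = image
    if rng.random() < flip_prob:
        out = horizontal_flip(out)
    # Assumed interpretation: factor sampled from [1 - brightness, 1 + brightness].
    factor = rng.uniform(1.0 - brightness, 1.0 + brightness)
    return adjust_brightness(out, factor)


rng = random.Random(0)  # fixed seed so the example is repeatable
augmented = augment([[10, 20, 30], [40, 50, 60]], rng)
```

Augmentation is normally applied only to training images, with fresh random parameters each epoch, so the test set stays representative of real inputs.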
7.7.26.11. Practical Applications
Visual Quality Inspection:
Detect defects in manufactured parts:
common:
  target_module: 'image'
  task_type: 'generic_image_classification'
  target_device: 'F28P55'
dataset:
  dataset_name: 'defect_inspection_dataset'
data_processing_feature_extraction:
  image_size: [96, 96]
  channels: 1  # Grayscale for surface defects
training:
  model_name: 'MobileNetV2_Small'
Classes: good, scratch, dent, contamination
Presence Detection:
Detect if an object is present:
common:
  task_type: 'generic_image_classification'
dataset:
  dataset_name: 'presence_detection_dataset'
  # classes: present, absent
data_processing_feature_extraction:
  image_size: [64, 64]
  channels: 3
training:
  model_name: 'MobileNetV2_Tiny'
Scene Classification:
Classify environmental conditions:
common:
  task_type: 'generic_image_classification'
dataset:
  dataset_name: 'scene_dataset'
  # classes: indoor, outdoor, low_light, etc.
data_processing_feature_extraction:
  image_size: [96, 96]
  channels: 3
training:
  model_name: 'MobileNetV2_Medium'
7.7.26.12. Memory Constraints
Image classification requires more memory than time series classification:
Memory Budget:
  Input buffer: W × H × C × 4 bytes (float32)
    Example: 96 × 96 × 3 × 4 = 110,592 bytes ≈ 110 KB
  Model weights: Varies by model
    Example: MobileNetV2_Small ≈ 200 KB
  Total: Plan for 300-500 KB for image models
Optimization Tips:
Use smaller image size
Use grayscale if possible
Choose quantized models (int8)
Select device with sufficient memory
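The input-buffer formula above can be used to compare the effect of the first three tips. The sketch below is an illustration only: it assumes a quantized model accepts one byte per input value, which is not confirmed here, and it does not account for weights or intermediate activations.

```python
def input_buffer_bytes(width, height, channels, bytes_per_value=4):
    """Size of one input frame in bytes (4 bytes per value for float32)."""
    return width * height * channels * bytes_per_value


# (label, width, height, channels, bytes per value)
OPTIONS = [
    ("96x96 RGB, float32", 96, 96, 3, 4),
    ("96x96 grayscale, float32", 96, 96, 1, 4),
    ("64x64 RGB, float32", 64, 64, 3, 4),
    ("64x64 grayscale, int8 (assumed 1 byte per value)", 64, 64, 1, 1),
]

if __name__ == "__main__":
    for label, w, h, c, b in OPTIONS:
        print(f"{label}: {input_buffer_bytes(w, h, c, b)} bytes")
```

Switching from RGB to grayscale divides the input buffer by three, and halving the image side length divides it by four, so the two tips compound.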
7.7.26.13. Inference Performance
Image inference is slower than time series inference:
| Device | Image Size | Model | Latency |
|---|---|---|---|
| F28P55 (NPU) | 64x64 | Small | ~5 ms |
| F28P55 (NPU) | 96x96 | Small | ~15 ms |
| F28P55 (CPU) | 64x64 | Small | ~50 ms |
Note: Actual performance depends on model architecture.
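A latency figure translates into an upper bound on frame rate. The sketch below shows the arithmetic. The `overhead_ms` parameter is a hypothetical placeholder for capture and preprocessing time, which the latency figures above do not state.

```python
def max_frames_per_second(latency_ms, overhead_ms=0.0):
    """Upper bound on frame rate when each frame costs latency_ms + overhead_ms."""
    return 1000.0 / (latency_ms + overhead_ms)


# Approximate inference latencies quoted in this section (milliseconds).
LATENCIES_MS = {
    "F28P55 (NPU), 64x64": 5,
    "F28P55 (NPU), 96x96": 15,
    "F28P55 (CPU), 64x64": 50,
}

if __name__ == "__main__":
    for name, latency in LATENCIES_MS.items():
        print(f"{name}: at most {max_frames_per_second(latency):.1f} fps")
```

Because these bounds ignore everything except inference, the achievable rate on a real device will be lower.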
7.7.26.14. Transfer Learning
For better results with limited data, use pretrained models:
training:
  model_name: 'MobileNetV2_Small'
  pretrained: True  # Start from ImageNet weights
  freeze_backbone: False  # Fine-tune entire model
  training_epochs: 30
Transfer learning helps when:
You have limited training images
Your classes are similar to common objects
You want faster convergence
7.7.26.15. Camera Integration
For device deployment, consider:
Camera Interface:
DCMI/CSI for high-speed capture
GPIO for simple cameras
Frame buffer management
Frame Rate:
Typical: 1-10 fps for classification
Higher rates need faster inference
Preprocessing:
Resize on device or camera
Convert color space if needed
Normalize pixel values
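The resize and normalize steps can be sketched in a few lines. This is a pure-Python illustration for a grayscale frame stored as rows of 0-255 values, using nearest-neighbor sampling. On-device code would typically be written in C, and scaling to the range [0, 1] is an assumption: the deployed model must receive the same normalization that was used during training.

```python
def resize_nearest(image, new_width, new_height):
    """Downscale or upscale by picking the nearest source pixel for each output pixel."""
    old_height = len(image)
    old_width = len(image[0])
    return [
        [
            image[y * old_height // new_height][x * old_width // new_width]
            for x in range(new_width)
        ]
        for y in range(new_height)
    ]


def normalize(image):
    """Map 0..255 pixel values to 0.0..1.0 (assumed normalization scheme)."""
    return [[p / 255.0 for p in row] for row in image]


frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
small = resize_nearest(frame, 2, 2)
model_input = normalize(small)
```

Nearest-neighbor sampling is cheap but can alias on fine textures. If the camera can output the target resolution directly, resizing there avoids the cost entirely.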
7.7.26.16. Troubleshooting
Low accuracy:
Increase image size
Use larger model
Add more training data
Apply appropriate augmentation
Out of memory:
Reduce image size
Use grayscale
Choose smaller model
Check device memory specs
Slow inference:
Use NPU-compatible model
Reduce image size
Optimize model architecture
Overfitting (train >> test accuracy):
Add more augmentation
Reduce model complexity
Increase training data
7.7.26.17. Limitations
Image classification on MCUs has limitations:
Resolution: Limited to small images (32-128 pixels per side)
Complexity: Cannot match server-side accuracy
Memory: Large images exhaust RAM
Speed: Real-time video difficult
Best suited for:
Simple binary/few-class problems
Controlled lighting conditions
Fixed camera position
Non-safety-critical applications
7.7.26.18. Next Steps
Review Image Classification details
Learn about data preparation in Classification Dataset Format
Deploy to device: NPU Device Deployment