8. Advanced Features
Tiny ML Tensorlab includes several advanced features to help you build more accurate and efficient models. This section covers these capabilities in detail.
Contents
- 8.1. Neural Architecture Search
- 8.1.1. Overview
- 8.1.2. When to Use NAS
- 8.1.3. Code Flow
- 8.1.4. Configuration
- 8.1.5. Model Size Presets
- 8.1.6. Usage
- 8.1.7. Running NAS
- 8.1.8. Example: Full NAS Configuration
- 8.1.9. Tips
- 8.1.10. Best Practices
- 8.1.11. Search Algorithm
- 8.1.12. Search Space
- 8.1.13. NAS Framework Internals
- 8.1.14. References
- 8.1.15. Next Steps
- 8.2. Quantization
- 8.2.1. Overview
- 8.2.2. Configuration Parameters
- 8.2.3. Quantization Modes
- 8.2.4. Quantization Methods
- 8.2.5. Bit Widths
- 8.2.6. NPU Quantization Requirements
- 8.2.7. Output Files
- 8.2.8. Accuracy Comparison
- 8.2.9. Troubleshooting Accuracy Loss
- 8.2.10. Best Practices
- 8.2.11. Example: Full Quantization Workflow
- 8.2.12. Memory Savings
- 8.2.13. Performance Impact
- 8.2.14. Quantization Wrapper Architecture
- 8.2.15. NPU Hardware Constraints
- 8.2.16. Using Quantization Wrappers Directly
- 8.2.17. Wrapper API Reference
- 8.2.18. Next Steps
- 8.3. Standalone Quantization Examples
- 8.4. Automatic Mixed Precision Quantization
- 8.5. Feature Extraction
- 8.5.1. Overview
- 8.5.2. Feature Extraction Pipeline
- 8.5.3. Configuration Parameters
- 8.5.4. Preset System
- 8.5.5. Available Presets
- 8.5.6. Data Processing Transforms
- 8.5.7. Feature Extraction Transforms
- 8.5.8. Custom Feature Extraction
- 8.5.9. Multi-Channel Data
- 8.5.10. Forecasting Configuration
- 8.5.11. Data Augmentation
- 8.5.12. Choosing the Right Preset
- 8.5.13. Performance Impact
- 8.5.14. On-Device Feature Extraction
- 8.5.15. Example Configurations
- 8.5.16. Stacking Modes
- 8.5.17. Gain Variation Augmentation
- 8.5.18. Q15 Fixed-Point Transforms
- 8.5.19. Frame Offset (Overlap Control)
- 8.5.20. Analysis Bandwidth
- 8.5.21. Feature Extraction Only Mode
- 8.5.22. Evaluating Feature Extraction Quality
- 8.5.23. Best Practices
- 8.5.24. Next Steps
- 8.6. Goodness of Fit
- 8.6.1. Overview
- 8.6.2. Enabling GoF Test
- 8.6.3. Running the Test
- 8.6.4. Output Files
- 8.6.5. Understanding the Visualizations
- 8.6.6. Interpreting Results
- 8.6.7. 8-Plot Analysis
- 8.6.8. Common Patterns
- 8.6.9. Actionable Insights
- 8.6.10. Frame Size Sweeping
- 8.6.11. Multi-Cluster Analysis
- 8.6.12. Example: Motor Fault GoF Analysis
- 8.6.13. GoF Without Training
- 8.6.14. Comparing Feature Extraction
- 8.6.15. Best Practices
- 8.6.16. Limitations
- 8.6.17. Next Steps
- 8.7. Post-Training Analysis
- 8.7.1. Overview
- 8.7.2. Enabling Analysis
- 8.7.3. Output Files
- 8.7.4. Confusion Matrix
- 8.7.5. ROC Curves
- 8.7.6. Class Score Histograms
- 8.7.7. FPR/TPR Thresholds
- 8.7.8. Classification Report
- 8.7.9. Error Analysis
- 8.7.10. Quantized vs Float Comparison
- 8.7.11. File-Level Classification Summary
- 8.7.12. Regression Analysis
- 8.7.13. Anomaly Detection Analysis
- 8.7.14. Custom Analysis Scripts
- 8.7.15. Generating Reports
- 8.7.16. Example: Complete Analysis Configuration
- 8.7.17. Best Practices
- 8.7.18. Troubleshooting Low Accuracy
- 8.7.19. Next Steps
- 8.8. On-Device Training (ODT)
- 8.8.1. Overview
- 8.8.2. Use Cases
- 8.8.3. Architecture: Frozen + Trainable Split
- 8.8.4. Supported Task Types
- 8.8.5. Workflow
- 8.8.6. Configuration
- 8.8.7. Memory Considerations
- 8.8.8. Limitations
- 8.8.9. Best Practices
- 8.8.10. Examples
- 8.8.11. Related Features
- 8.8.12. FAQ
- 8.8.13. Troubleshooting
- 8.8.14. Further Reading
- 8.9. On-Device Training — Advanced API & Config
- 8.9.1. Library Architecture
- 8.9.2. ModelContext_t — Central Training State
- 8.9.3. Memory Layout
- 8.9.4. Core API
- 8.9.5. Trainable Model Configuration
- 8.9.6. Custom Training Loop Example
- 8.9.7. Adding a New Layer Type
- 8.9.8. Batch Size Optimization
- 8.9.9. Logging System
- 8.9.10. API Quick Reference
- 8.9.11. Related Documentation
8.10. Feature Overview
Neural Architecture Search (NAS)
Automatically discover optimal neural network architectures for your dataset. NAS can optimize for memory usage or computational efficiency.
- Preset sizes: s, m, l, xl, xxl
- Optimization modes: Memory or Compute
- GPU recommended for practical use
Quantization
Reduce model size and improve inference speed through quantization:
- QAT (Quantization-Aware Training) - Best accuracy
- PTQ (Post-Training Quantization) - Faster, no retraining
- Weight bit-widths: 2-bit, 4-bit, 8-bit
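To make the bit-width trade-off concrete, the sketch below applies a generic symmetric, per-tensor uniform quantizer to a small list of weights at 8, 4, and 2 bits and reports the rounding error. It is an illustration only: `quantize_symmetric` and `mean_abs_error` are hypothetical helpers, not part of Tiny ML Tensorlab, and the toolchain's own quantizer may use a different scheme.

```python
def quantize_symmetric(weights, bits):
    """Quantize floats to signed integers of the given width, then dequantize.

    Generic symmetric per-tensor scheme, assumed here for illustration;
    it is not the toolchain's actual quantizer.
    """
    qmax = 2 ** (bits - 1) - 1      # 127 for 8-bit, 7 for 4-bit, 1 for 2-bit
    qmin = -(2 ** (bits - 1))
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    dequantized = [v * scale for v in q]
    return q, dequantized


def mean_abs_error(a, b):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


weights = [0.82, -0.41, 0.05, -0.97, 0.33, 0.6, -0.12, 0.0]
errors = {}
for bits in (8, 4, 2):
    q, deq = quantize_symmetric(weights, bits)
    errors[bits] = mean_abs_error(weights, deq)
    print(f"{bits}-bit: ints={q} mean_abs_error={errors[bits]:.4f}")
```

On this toy input the error grows as the bit width shrinks, which is the accuracy cost that QAT tries to recover by training with quantization in the loop.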
Automatic Mixed Precision Quantization
Fully automatic, Hessian-aware per-layer bit width assignment using a greedy algorithm. Enabled by setting auto_quantization: True:
- Estimates per-layer sensitivity via Hessian eigenvalue (power iteration)
- Greedy assignment from {2, 4, 8, 32} bit widths, maximising accuracy per bit
- Automatic average bit width selection via binary search calibration
- Fixes regression tasks where uniform 8-bit QAT fails
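The two core ideas above, sensitivity from a Hessian eigenvalue and greedy bit assignment, can be sketched in a few lines. This is a toy under stated assumptions, not the toolchain's implementation: each layer's Hessian is a made-up 2x2 matrix, the cost proxy `quant_cost` (sensitivity times `4 ** -bits`) is an assumption, the average-bit budget is passed in by hand rather than found by binary search, and every function name is hypothetical.

```python
import math

BIT_CHOICES = [2, 4, 8, 32]


def top_eigenvalue(matrix, iters=100):
    """Estimate the dominant eigenvalue of a symmetric matrix by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        # Rayleigh quotient v^T A v for the unit-length vector v.
        eigenvalue = sum(
            v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)
        )
    return eigenvalue


def quant_cost(sensitivity, bits):
    # Assumed proxy: quantization noise energy shrinks about 4x per extra bit.
    return sensitivity * 4.0 ** (-bits)


def greedy_assign(sensitivities, sizes, avg_bits_budget):
    """Start every layer at the lowest width, then repeatedly upgrade the layer
    with the largest cost reduction per extra stored bit, within the budget."""
    assignment = {name: BIT_CHOICES[0] for name in sensitivities}
    budget = avg_bits_budget * sum(sizes.values())
    used = sum(assignment[n] * sizes[n] for n in assignment)
    while True:
        best = None
        for name, bits in assignment.items():
            idx = BIT_CHOICES.index(bits)
            if idx + 1 == len(BIT_CHOICES):
                continue  # already at the widest choice
            new_bits = BIT_CHOICES[idx + 1]
            extra = (new_bits - bits) * sizes[name]
            if used + extra > budget:
                continue  # upgrade would exceed the bit budget
            gain = quant_cost(sensitivities[name], bits) - quant_cost(
                sensitivities[name], new_bits
            )
            score = gain / extra
            if best is None or score > best[0]:
                best = (score, name, new_bits, extra)
        if best is None:
            return assignment
        _, name, new_bits, extra = best
        assignment[name] = new_bits
        used += extra


# Made-up per-layer Hessians and parameter counts, for illustration only.
hessians = {
    "conv1": [[4.0, 1.0], [1.0, 3.0]],
    "conv2": [[0.5, 0.1], [0.1, 0.2]],
    "fc": [[9.0, 2.0], [2.0, 6.0]],
}
sizes = {"conv1": 100, "conv2": 400, "fc": 50}
sensitivities = {name: top_eigenvalue(h) for name, h in hessians.items()}
assignment = greedy_assign(sensitivities, sizes, avg_bits_budget=6)
print(assignment)
```

In this toy, the large but insensitive layer ends up with fewer bits than the small, sensitive ones, which is the behaviour a sensitivity-aware scheme is meant to produce.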
Standalone Quantization Examples
Runnable Python examples demonstrating direct use of quantization wrappers:
- FMNIST, Audio KWS, Motor Fault, MNIST, Torque Regression
- QAT and PTQ workflows with 2/4/8-bit quantization
- ONNX export and inference validation
Feature Extraction
Transform raw time-series data into meaningful features:
- FFT (Fast Fourier Transform)
- Binning and normalization
- Haar and Hadamard wavelets
- Logarithmic scaling
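As a rough picture of how these transforms chain together, the sketch below turns one frame of a synthetic signal into a short feature vector: FFT magnitude, binning, logarithmic scaling, then min-max normalization. The ordering, the frame size, the bin count, and all function names are assumptions made for this example; the actual presets and transform names are described in the Feature Extraction section. A naive DFT is used so the example needs only the standard library.

```python
import cmath
import math


def fft_magnitude(frame):
    """Magnitudes of the first N/2 DFT bins (naive O(N^2) DFT, fine for a demo)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def bin_features(values, num_bins):
    """Average adjacent values into num_bins equal-width bins."""
    width = len(values) // num_bins
    return [sum(values[i * width:(i + 1) * width]) / width for i in range(num_bins)]


def log_scale(values, eps=1e-6):
    """Compress dynamic range; eps avoids log(0) on empty bins."""
    return [math.log10(v + eps) for v in values]


def normalize(values):
    """Min-max normalize to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


frame_size = 64
# Synthetic frame: a sine wave that completes 5 cycles per frame (DFT bin 5).
frame = [math.sin(2 * math.pi * 5 * t / frame_size) for t in range(frame_size)]

features = normalize(log_scale(bin_features(fft_magnitude(frame), num_bins=8)))
print([round(f, 3) for f in features])
```

The 64-sample frame is reduced to 8 values, with the energy concentrated in the bin that covers the sine frequency. Shrinking the input this way is what lets a small model run on a microcontroller.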
Goodness of Fit Test
Evaluate whether your dataset is suitable for classification before training. Uses PCA and t-SNE visualization to assess class separability.
Post-Training Analysis
Understand model performance with:
- ROC curves for classification
- Confusion matrices
- FPR/TPR threshold analysis
- PCA visualization of feature-extracted data
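For readers new to these metrics, the sketch below computes a confusion matrix and the false-positive and true-positive rates at a few score thresholds for a toy binary problem. The labels, scores, and function names are invented for illustration; the toolchain generates its own reports and plots, described in the Post-Training Analysis section.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def fpr_tpr(y_true, scores, threshold):
    """Binary FPR and TPR when predicting positive for score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return fp / neg, tp / pos


# Toy data: four negative and four positive samples with model scores.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.45, 0.7, 0.4, 0.6, 0.8, 0.9]

# Each threshold gives one point on the ROC curve.
roc_points = [(thr,) + fpr_tpr(y_true, scores, thr) for thr in (0.2, 0.5, 0.75)]
for thr, fpr, tpr in roc_points:
    print(f"threshold={thr:.2f} FPR={fpr:.2f} TPR={tpr:.2f}")

y_pred = [1 if s >= 0.5 else 0 for s in scores]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(cm)
```

Raising the threshold lowers both rates, so threshold selection is a trade-off between false alarms and missed detections.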
On-Device Training (ODT)
Enable models to continue training directly on microcontrollers:
- Deploy a frozen backbone with a trainable head
- Adapt to local data and environment drift
- Reduce re-deployment costs
- Support for classification, regression, and anomaly detection tasks
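The frozen-backbone idea can be sketched in plain Python: a fixed feature extractor whose weights never change, followed by a small linear head that is updated with SGD on locally collected samples. Everything here is hypothetical (the weights, the data, the learning rate, and the function names); the on-device library itself is a C API covered in the On-Device Training sections.

```python
# Backbone weights are fixed at deployment time and never updated on device.
FROZEN_W = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]


def backbone(x):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def head(features, weights, bias):
    """Trainable linear head producing a single regression output."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def train_head(samples, weights, bias, lr=0.05, epochs=200):
    """Plain SGD on squared error; only the head parameters are updated."""
    for _ in range(epochs):
        for x, target in samples:
            feats = backbone(x)
            err = head(feats, weights, bias) - target
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias


def mse(samples, weights, bias):
    """Mean squared error of the backbone + head on the given samples."""
    return sum((head(backbone(x), weights, bias) - t) ** 2 for x, t in samples) / len(samples)


# Made-up "local" data collected after deployment: (input, target) pairs.
local_data = [([1.0, 0.0], 0.4), ([0.0, 1.0], 1.0), ([1.0, 1.0], 1.3), ([0.5, 0.5], 0.7)]

w0, b0 = [0.0, 0.0, 0.0], 0.0
before = mse(local_data, w0, b0)
w1, b1 = train_head(local_data, w0, b0)
after = mse(local_data, w1, b1)
print(f"MSE before={before:.4f} after={after:.4f}")
```

Because only the head's three weights and one bias are trained, the gradient and parameter storage stay small. That small footprint is the main reason this split suits memory-constrained devices.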