8.3. Feature Extraction
Feature extraction transforms raw sensor data into a representation that helps the neural network learn patterns more effectively.
8.3.1. Overview
Why use feature extraction?
Reduced input size: Compress long time series
Better patterns: Transform to domain where patterns are clearer
Faster inference: Smaller inputs mean faster models
Domain knowledge: Incorporate signal processing expertise
8.3.2. Feature Extraction Pipeline
Raw data flows through two stages: data processing transforms, then feature extraction transforms:
Raw Signal → Data Processing        → Feature Extraction   → Model Input
             (data_proc_transforms)   (feat_ext_transform)
             e.g. SimpleWindow,       e.g. FFT_FE, BINNING,
             Downsample               ABS, LOG_DB, CONCAT
8.3.3. Configuration Parameters
The data_processing_feature_extraction section supports the following
parameters. There are two usage modes: using a preset name, or defining
a custom pipeline.
Core Parameters:
| Option | Description |
|---|---|
| feature_extraction_name | Preset name. |
| data_proc_transforms | List of data processing transforms applied before feature extraction. Common values: SimpleWindow, Downsample. |
| feat_ext_transform | List of feature extraction transforms applied in order. Common values include: FFT_FE, BINNING, ABS, LOG_DB, CONCAT. |
| variables | Number of input channels/variables. Supports three formats: an integer (select first N columns), a list of column indices, or a list of column names. |
| frame_size | Number of samples per frame. |
| feature_size_per_frame | Number of output features per frame after transform. |
| num_frame_concat | Number of frames to concatenate. |
| stride_size | Stride between frames as a fraction. |
Signal Processing Parameters:
| Option | Description |
|---|---|
| sampling_rate | Original sampling rate of the input signal. Used with the Downsample transform. |
| new_sr | Target sampling rate after downsampling. Used with the Downsample transform. |
| scale | Scaling factor applied to input data. |
| offset | Offset added to input data. |
| frame_skip | Number of frames to skip between selected frames. |
| normalize_bin | Enable bin normalization. |
| stacking | Feature stacking mode. |
|  | Minimum frequency bin index to include. |
|  | Fraction of bandwidth to analyse. |
Logarithmic Transform Parameters:
| Option | Description |
|---|---|
| log_mul | Multiplier for logarithmic scaling. |
| log_base | Base for logarithm. |
| log_threshold | Minimum threshold to avoid log(0). |
Fixed-Point (Q15) Parameters:
| Option | Description |
|---|---|
| q15_scale_factor | Scale factor for Q15 fixed-point quantization. |
Data Augmentation and Testing:
| Option | Description |
|---|---|
| gain_variations | Dictionary mapping class names to gain variations. |
| gof_test | Run Goodness of Fit test on extracted features. |
Output Control:
| Option | Description |
|---|---|
|  | Store extracted feature data to disk. |
|  | Use neural network for feature extraction. |
Forecasting-Specific Parameters:
| Option | Description |
|---|---|
| forecast_horizon | Number of future timesteps to predict. |
| target_variables | List of column indices or names for the target variable(s) to forecast. |
8.3.4. Preset System
Tiny ML Tensorlab provides predefined feature extraction presets. When using
a preset, simply specify the feature_extraction_name and variables:
data_processing_feature_extraction:
  feature_extraction_name: 'Generic_1024Input_FFTBIN_64Feature_8Frame'
  variables: 1
Preset Naming Convention:
Generic_<InputSize>Input_<Transform>_<Features>Feature_<Frames>Frame
Example: Generic_1024Input_FFTBIN_64Feature_8Frame
- Input: 1024 samples
- Transform: FFT with binning
- Features: 64 frequency bins
- Frames: 8 temporal frames
- Total: 64 x 8 = 512 features to model
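The naming convention above can be decoded mechanically. The following is a minimal sketch, not part of Tiny ML Tensorlab: the helper name `parse_generic_preset` and the regular expression are assumptions made for illustration, and the pattern only covers names that follow the Generic_* convention exactly.

```python
import re

# Pattern for Generic_<InputSize>Input_<Transform>_<Features>Feature_<Frames>Frame.
# Assumption: the transform token contains only letters and digits.
PRESET_PATTERN = re.compile(
    r"^Generic_(?P<input_size>\d+)Input_(?P<transform>[A-Za-z0-9]+)_"
    r"(?P<features>\d+)Feature_(?P<frames>\d+)Frame$"
)

def parse_generic_preset(name):
    """Split a Generic_* preset name into its parts (hypothetical helper)."""
    match = PRESET_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a Generic_* preset name: {name!r}")
    parts = {
        "input_size": int(match["input_size"]),
        "transform": match["transform"],
        "features_per_frame": int(match["features"]),
        "frames": int(match["frames"]),
    }
    # Total model input size: features per frame times number of frames.
    parts["total_features"] = parts["features_per_frame"] * parts["frames"]
    return parts

info = parse_generic_preset("Generic_1024Input_FFTBIN_64Feature_8Frame")
print(info["total_features"])  # 512
```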
8.3.5. Available Presets
FFT-Based Presets:
Best for frequency-domain patterns (vibration, arc faults):
| Preset | Features | Use Case |
|---|---|---|
|  | 512 | General purpose |
|  | 256 | Smaller input |
|  | 256 | Full spectrum |
Raw Time-Domain Presets:
Best for waveform shape patterns:
| Preset | Features | Use Case |
|---|---|---|
|  | 512 | Full waveform |
|  | 256 | Shorter window |
|  | 128 | Compact input |
Application-Specific Presets:
| Preset | Application |
|---|---|
|  | Motor fault (3-axis) |
|  | Arc fault detection |
|  | PIR detection |
8.3.6. Data Processing Transforms
The data_proc_transforms parameter specifies preprocessing steps applied
to raw data before feature extraction:
SimpleWindow
Segments continuous data into fixed-size windows:
data_processing_feature_extraction:
  data_proc_transforms: ['SimpleWindow']
  frame_size: 256
  stride_size: 0.01
  variables: 1
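To make the windowing step concrete, here is a minimal sketch of fixed-size framing. It is not the toolkit's implementation: the function `simple_window` is hypothetical, and it assumes that stride_size is a fraction of frame_size, so the hop between frame starts is frame_size x stride_size samples.

```python
def simple_window(samples, frame_size, stride_size):
    """Split samples into fixed-size frames (sketch, not the toolkit's code).

    Assumes stride_size is a fraction of frame_size, so the hop between
    frame starts is frame_size * stride_size samples (at least 1).
    """
    hop = max(1, int(round(frame_size * stride_size)))
    last_start = len(samples) - frame_size
    # Frames that would run past the end of the signal are dropped.
    return [samples[start:start + frame_size]
            for start in range(0, last_start + 1, hop)]

signal = list(range(10))
frames = simple_window(signal, frame_size=4, stride_size=0.5)
print(frames)  # [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Under this assumption, a small stride_size such as 0.01 produces heavily overlapping frames, which increases the number of training examples drawn from the same recording.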
Downsample
Reduces the sampling rate of input data:
data_processing_feature_extraction:
  data_proc_transforms: ['Downsample', 'SimpleWindow']
  sampling_rate: 313000
  new_sr: 3130
  frame_size: 256
  stride_size: 0.01
  variables: 1
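With sampling_rate: 313000 and new_sr: 3130, the reduction factor is 100. The sketch below shows the simplest form of this step, keeping every Nth sample. It is an illustration only: the function `downsample` is hypothetical, and whether the toolkit applies an anti-aliasing filter before decimating is not stated in this section.

```python
def downsample(samples, sampling_rate, new_sr):
    """Keep every Nth sample, N = sampling_rate / new_sr (naive decimation sketch)."""
    if new_sr <= 0 or sampling_rate % new_sr != 0:
        raise ValueError("this sketch needs sampling_rate to be an integer multiple of new_sr")
    factor = sampling_rate // new_sr
    return samples[::factor]

factor = 313000 // 3130
print(factor)  # 100
print(downsample(list(range(10)), sampling_rate=100, new_sr=50))  # [0, 2, 4, 6, 8]
```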
Multiple transforms can be chained in order:
data_processing_feature_extraction:
  data_proc_transforms:
    - SimpleWindow
    - Downsample
  frame_size: 256
  sampling_rate: 100
  new_sr: 1
  variables: 1
8.3.7. Feature Extraction Transforms
The feat_ext_transform parameter defines the feature extraction pipeline
as an ordered list of transforms. Each step processes the output of the
previous step.
Common Transform Steps:
| Transform | Description |
|---|---|
| FFT_FE | Compute FFT (Fast Fourier Transform) |
| FFT_POS_HALF | Keep only positive frequency half of FFT |
| WINDOWING | Apply windowing function |
| BINNING | Group frequency bins to reduce feature count |
| NORMALIZE | Normalize features |
| ABS | Take absolute value |
| LOG_DB | Convert to logarithmic (dB) scale |
| DC_REMOVE | Remove DC component |
| CONCAT | Concatenate frames into final feature vector |
| FFT_Q15 | Fixed-point Q15 FFT (for MCU deployment) |
| Q15_SCALE | Q15 scaling |
| Q15_MAG | Q15 magnitude computation |
| BIN_Q15 | Q15 binning |
|  | ECG-specific normalization |
Example: FFT with Binning Pipeline:
data_processing_feature_extraction:
  feat_ext_transform: ['FFT_FE', 'FFT_POS_HALF', 'DC_REMOVE', 'ABS', 'BINNING', 'LOG_DB', 'CONCAT']
  frame_size: 1024
  feature_size_per_frame: 64
  num_frame_concat: 4
  variables: 1
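The following pure-Python sketch walks through the same sequence of steps on a tiny frame, to show how each transform feeds the next. It is not the toolkit's implementation. Several details are assumptions: DC removal is modelled as zeroing bin 0, binning as the mean magnitude of each group of adjacent bins, and the logarithm as log_mul x log_base(max(x, log_threshold)). A naive DFT stands in for the FFT to keep the example dependency-free, and the sketch expects the positive-half length to be divisible by feature_size_per_frame.

```python
import cmath
import math

def dft(frame):
    """Naive O(n^2) DFT; stands in for FFT_FE in this sketch."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]

def extract_frame_features(frame, feature_size_per_frame,
                           log_mul=20.0, log_base=10.0, log_threshold=1e-100):
    spectrum = dft(frame)                        # FFT_FE
    half = spectrum[: len(frame) // 2]           # FFT_POS_HALF
    half[0] = 0j                                 # DC_REMOVE (assumed: zero bin 0)
    mags = [abs(c) for c in half]                # ABS
    group = len(mags) // feature_size_per_frame  # BINNING (assumed: mean per group)
    bins = [sum(mags[i * group:(i + 1) * group]) / group
            for i in range(feature_size_per_frame)]
    # LOG_DB: clamp to log_threshold so log(0) never occurs.
    return [log_mul * math.log(max(b, log_threshold), log_base) for b in bins]

def extract_features(frames, feature_size_per_frame):
    """CONCAT: join per-frame features into one flat vector."""
    features = []
    for frame in frames:
        features.extend(extract_frame_features(frame, feature_size_per_frame))
    return features

# A 16-sample tone at DFT bin 5, repeated over two frames.
tone = [math.sin(2 * math.pi * 5 * t / 16) for t in range(16)]
features = extract_features([tone, tone], feature_size_per_frame=4)
print(len(features))                      # 8 = 4 features x 2 frames
print(features.index(max(features[:4])))  # 2: bin 5 falls in the group covering bins 4-5
```

With the configuration above (frame_size: 1024, feature_size_per_frame: 64), the positive half has 512 bins, so under these assumptions each feature would summarize 8 adjacent bins.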
Example: FFT without Binning Pipeline:
data_processing_feature_extraction:
  feat_ext_transform: ['FFT_FE', 'FFT_POS_HALF', 'DC_REMOVE', 'ABS', 'LOG_DB', 'CONCAT']
  frame_size: 256
  feature_size_per_frame: 128
  num_frame_concat: 1
  variables: 6
Example: Fixed-Point Q15 Pipeline (for MCU deployment):
data_processing_feature_extraction:
  feat_ext_transform: ['FFT_Q15', 'Q15_SCALE', 'Q15_MAG', 'DC_REMOVE', 'BIN_Q15', 'CONCAT']
  frame_size: 256
  feature_size_per_frame: 16
  num_frame_concat: 8
  q15_scale_factor: 5
  normalize_bin: True
  variables: 1
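For readers unfamiliar with the Q15 format used by this pipeline: Q15 represents a value in [-1.0, 1.0) as a signed 16-bit integer equal to the value multiplied by 2^15. The sketch below shows only that generic conversion. The helper names are hypothetical, and it does not model how the toolkit applies q15_scale_factor, which this section does not describe.

```python
Q15_ONE = 1 << 15  # 32768: Q15 stores value * 2**15 in a signed 16-bit integer

def float_to_q15(value):
    """Quantize a float in [-1.0, 1.0) to Q15, saturating out-of-range inputs."""
    scaled = int(round(value * Q15_ONE))
    return max(-Q15_ONE, min(Q15_ONE - 1, scaled))

def q15_to_float(q):
    """Convert a Q15 integer back to a float."""
    return q / Q15_ONE

print(float_to_q15(0.5))    # 16384
print(float_to_q15(1.0))    # 32767 (saturated)
print(float_to_q15(-1.0))   # -32768
print(q15_to_float(16384))  # 0.5
```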
8.3.8. Custom Feature Extraction
For advanced use cases, use a Custom_* feature extraction name and specify
the transform pipeline manually:
data_processing_feature_extraction:
  data_proc_transforms: ['SimpleWindow']
  feature_extraction_name: 'Custom_Default'
  feat_ext_transform: ['FFT_FE', 'FFT_POS_HALF', 'WINDOWING', 'BINNING', 'NORMALIZE', 'ABS', 'LOG_DB', 'CONCAT']
  frame_size: 32
  feature_size_per_frame: 8
  num_frame_concat: 8
  variables: 5
You can also configure additional parameters for fine-grained control:
data_processing_feature_extraction:
  data_proc_transforms: []
  feature_extraction_name: 'Custom_MotorFault'
  feat_ext_transform: ['FFT_FE', 'FFT_POS_HALF', 'DC_REMOVE', 'ABS', 'BINNING', 'LOG_DB', 'CONCAT']
  frame_size: 1024
  feature_size_per_frame: 64
  num_frame_concat: 4
  normalize_bin: 1
  stacking: '1D'
  offset: 0
  scale: 1
  frame_skip: 1
  log_mul: 20
  log_base: 10
  log_threshold: 1e-100
  variables: 3
8.3.9. Multi-Channel Data
For sensors with multiple axes (e.g., 3-axis accelerometer), set
variables to the number of channels:
data_processing_feature_extraction:
  feature_extraction_name: 'Input256_FFTBIN_16Feature_8Frame_3InputChannel_removeDC_2D1'
  variables: 3
The variables parameter supports three formats:
- Integer: Select first N columns (e.g., variables: 3)
- List of indices: Select specific columns (e.g., variables: [0, 2, 4])
- List of names: Select columns by name (e.g., variables: ['accel_x', 'accel_y', 'accel_z'])
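The three formats can be resolved to column indices as in the sketch below. The function `select_columns` and the column names in the example dataset are hypothetical and exist only to illustrate the selection rules described above.

```python
def select_columns(variables, column_names):
    """Resolve a `variables` value to column indices (hypothetical helper)."""
    if isinstance(variables, int):
        return list(range(variables))            # integer: first N columns
    if all(isinstance(v, int) for v in variables):
        return list(variables)                   # list of indices: use as given
    # list of names: look each name up in the dataset header
    return [column_names.index(name) for name in variables]

columns = ['time', 'accel_x', 'accel_y', 'accel_z', 'temp']  # assumed header
print(select_columns(3, columns))                                  # [0, 1, 2]
print(select_columns([0, 2, 4], columns))                          # [0, 2, 4]
print(select_columns(['accel_x', 'accel_y', 'accel_z'], columns))  # [1, 2, 3]
```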
8.3.10. Forecasting Configuration
Forecasting tasks require specific additional parameters:
data_processing_feature_extraction:
  data_proc_transforms:
    - SimpleWindow
  frame_size: 32
  stride_size: 0.1
  forecast_horizon: 2
  variables: 1
  target_variables:
    - 0
Note
SimpleWindow must be specified in data_proc_transforms for
forecasting tasks.
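The sketch below illustrates one plausible way windowed inputs and forecast targets relate to each other. It is an assumption, not the toolkit's code: each input is taken to be frame_size consecutive rows, and each target the target columns of the next forecast_horizon rows. The function `make_forecast_pairs` is hypothetical, and its `hop` argument is expressed in samples rather than as a stride_size fraction.

```python
def make_forecast_pairs(rows, frame_size, forecast_horizon, target_variables, hop=1):
    """Build (input window, future targets) pairs from a list of rows.

    Assumes each input is frame_size consecutive rows and each target holds the
    target columns of the next forecast_horizon rows. hop is in samples.
    """
    pairs = []
    last_start = len(rows) - frame_size - forecast_horizon
    for start in range(0, last_start + 1, hop):
        window = rows[start:start + frame_size]
        future = rows[start + frame_size:start + frame_size + forecast_horizon]
        targets = [[row[c] for c in target_variables] for row in future]
        pairs.append((window, targets))
    return pairs

rows = [[t] for t in range(6)]  # one variable, values 0..5
pairs = make_forecast_pairs(rows, frame_size=3, forecast_horizon=2, target_variables=[0])
print(pairs[0])    # ([[0], [1], [2]], [[3], [4]])
print(len(pairs))  # 2
```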
8.3.11. Data Augmentation
Use gain_variations to augment training data by varying the signal gain, configured per class:
data_processing_feature_extraction:
  data_proc_transforms:
    - Downsample
    - SimpleWindow
  gain_variations:
    arc: [0.9, 1.1]
    normal: [0.8, 1.2]
  sampling_rate: 313000
  new_sr: 3130
  frame_size: 256
  stride_size: 0.01
  variables: 1
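As a rough illustration of gain augmentation, the sketch below scales a labelled example by a random gain. It rests on an assumption that is not confirmed by this section: each gain_variations entry is read as a [min_gain, max_gain] range from which a gain is drawn uniformly. The function `apply_gain_variation` is hypothetical.

```python
import random

def apply_gain_variation(samples, label, gain_variations, rng):
    """Scale one labelled example by a random gain (sketch under assumptions).

    Assumes each gain_variations entry is a [min_gain, max_gain] range; classes
    without an entry are returned unchanged.
    """
    if label not in gain_variations:
        return list(samples)
    low, high = gain_variations[label]
    gain = rng.uniform(low, high)
    return [gain * x for x in samples]

gain_variations = {'arc': [0.9, 1.1], 'normal': [0.8, 1.2]}
rng = random.Random(0)  # seeded so runs are repeatable
augmented = apply_gain_variation([1.0, -2.0, 0.5], 'arc', gain_variations, rng)
print(augmented)
```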
8.3.12. Choosing the Right Preset
Decision Tree:
Is the pattern in frequency content?
|-- Yes --> Use FFT-based preset
| |-- Need full spectrum? --> FFT_FullBandwidth
| |-- Reduce features? --> FFTBIN
|-- No --> Use RAW preset
|-- Need temporal context? --> Multi-frame
|-- Single snapshot? --> 1Frame
Common Choices by Application:
| Application | Recommended Preset |
|---|---|
| Arc fault detection |  |
| Motor bearing fault |  |
| ECG classification |  |
| Vibration anomaly |  |
| Simple waveforms |  |
| PIR detection |  |
8.3.13. Performance Impact
Feature extraction affects model size and speed:
| Features | Model Input | Model Size | Inference Time |
|---|---|---|---|
| 128 | Small | Smaller | Faster |
| 256 | Medium | Medium | Medium |
| 512 | Large | Larger | Slower |
Trade-off:
More features = more information = potentially better accuracy
Fewer features = faster inference = fits smaller devices
8.3.14. On-Device Feature Extraction
Feature extraction runs on the MCU before inference. The compilation process generates C code for the feature extraction pipeline configured in your YAML.
Memory Usage:
Feature extraction buffers add to memory requirements:
Input buffer: frame_size x variables x sizeof(data_type)
FFT buffer: frame_size x sizeof(data_type)
Output buffer: feature_size_per_frame x num_frame_concat x sizeof(data_type)
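The three formulas above can be evaluated directly for a given configuration. The helper below is a hypothetical convenience, and the 4-byte element size in the example is an assumption (for instance, 32-bit floats); it only adds up the buffers listed here, not any other memory the deployed model needs.

```python
def feature_buffer_bytes(frame_size, variables, feature_size_per_frame,
                         num_frame_concat, sizeof_data_type):
    """Evaluate the three buffer-size formulas; returns sizes in bytes."""
    sizes = {
        "input": frame_size * variables * sizeof_data_type,
        "fft": frame_size * sizeof_data_type,
        "output": feature_size_per_frame * num_frame_concat * sizeof_data_type,
    }
    sizes["total"] = sum(sizes.values())
    return sizes

# 1024-sample frames, 1 variable, 64 features x 8 frames, 4-byte values (assumed)
print(feature_buffer_bytes(1024, 1, 64, 8, 4))
# {'input': 4096, 'fft': 4096, 'output': 2048, 'total': 10240}
```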
8.3.15. Example Configurations
Arc Fault Classification (using preset):
data_processing_feature_extraction:
  feature_extraction_name: 'FFT1024Input_256Feature_1Frame_Full_Bandwidth'
  variables: 1
Motor Bearing Fault (using preset with override):
data_processing_feature_extraction:
  feature_extraction_name: 'Input256_FFTBIN_16Feature_8Frame_3InputChannel_removeDC_2D1'
  variables: 3
  feature_size_per_frame: 4
Anomaly Detection with Downsampling:
data_processing_feature_extraction:
  data_proc_transforms:
    - SimpleWindow
    - Downsample
  frame_size: 1024
  sampling_rate: 100
  new_sr: 1
  variables: 1
Regression with Simple Windowing:
data_processing_feature_extraction:
  data_proc_transforms:
    - SimpleWindow
  frame_size: 512
  stride_size: 0.1
  variables: 6
Forecasting (PMSM Rotor Temperature):
data_processing_feature_extraction:
  data_proc_transforms:
    - SimpleWindow
  frame_size: 3
  stride_size: 0.4
  forecast_horizon: 1
  variables: 6
  target_variables:
    - 5
Goodness of Fit Testing:
Enable the gof_test parameter to run Goodness of Fit analysis on
extracted features:
data_processing_feature_extraction:
  feature_extraction_name: 'Input256_FFTBIN_16Feature_8Frame_3InputChannel_removeDC_2D1'
  gof_test: True
  variables: 3
PCA Visualization of Extracted Features:
PCA (Principal Component Analysis) helps visualize how well the extracted features separate your classes. Well-separated clusters indicate good feature extraction.
[Figure: PCA visualization of extracted features on training data]
[Figure: PCA visualization of extracted features on validation data]
Interpreting PCA plots:
Tight clusters: Features represent the class well
Well-separated clusters: Good class separability
Overlapping clusters: May need different feature extraction
Scattered points: High variance, potentially noisy data
8.3.16. Best Practices
Match to signal characteristics: FFT for periodic, raw for transient
Start with standard presets: Customize only if needed
Consider device constraints: Fewer features for smaller devices
Test multiple options: Compare accuracy with different presets
Use domain knowledge: Understand what patterns you’re looking for
8.3.17. Next Steps
See Goodness of Fit to analyze dataset quality
Learn about Quantization for model compression
Explore Time Series Classification for classification