8.5. Goodness of Fit
The Goodness of Fit (GoF) test helps you analyze dataset quality and class separability before investing time in model training.
8.5.1. Overview
GoF testing answers:
Are my classes separable in feature space?
Is my feature extraction appropriate?
Will a neural network be able to learn the patterns?
Which classes might be confused?
Running GoF tests before training saves time by identifying data or feature extraction problems early.
8.5.2. Enabling GoF Test
Add the GoF section to your configuration:
common:
  task_type: 'generic_timeseries_classification'
  target_device: 'F28P55'
dataset:
  dataset_name: 'your_dataset'
data_processing_feature_extraction:
  feature_extraction_name: 'Generic_1024Input_FFTBIN_64Feature_8Frame'
  gof_test: True
  frame_size: 256
training:
  enable: True  # Can set to False for GoF-only analysis
8.5.3. Running the Test
Linux/macOS:

cd tinyml-modelzoo
./run_tinyml_modelzoo.sh examples/your_example/config.yaml

Windows:

cd tinyml-modelzoo
run_tinyml_modelzoo.bat examples\your_example\config.yaml
8.5.4. Output Files
GoF test generates analysis files:
.../gof_test/
├── gof_pca_2d.png # PCA visualization
├── gof_tsne_2d.png # t-SNE visualization
├── gof_lda_2d.png # LDA visualization
├── class_separability.csv # Quantitative metrics
├── confusion_potential.csv # Likely confusion pairs
└── feature_importance.csv # Important features
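The CSV outputs lend themselves to quick scripted triage. Below is a minimal sketch, assuming the column names shown in the `class_separability.csv` example later in this section; it flags class pairs whose separability score falls below a chosen threshold.

```python
import csv
import io

# Hypothetical contents mirroring the class_separability.csv format
# shown in this section (column names are taken from that example).
sample = """class_pair,separability_score,overlap_percentage
A-B,0.95,2.3%
A-C,0.82,8.5%
B-C,0.99,0.1%
"""

def flag_weak_pairs(csv_text, threshold=0.9):
    """Return class pairs whose separability score falls below threshold."""
    weak = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row["separability_score"]) < threshold:
            weak.append(row["class_pair"])
    return weak

print(flag_weak_pairs(sample))  # pairs worth inspecting before training
```

In practice you would read the real file with `open(...)` instead of the inline string; the threshold of 0.9 matches the "excellent separability" cutoff described below.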
8.5.5. Understanding the Visualizations
PCA Plot (gof_pca_2d.png)
Principal Component Analysis projection:
Example GoF plots (figures): GoF analysis for arc fault detection at frame sizes 256 and 1024, and for motor bearing fault detection.
 PC2
  ^
  |  * * *      Class A
  | * * * *
  |
  |  + + +      Class B
  | + + + +
  +-------------------> PC1
Well-separated clusters = Good separability
Overlapping clusters = Potential confusion
Scattered points = High variance, harder to classify
t-SNE Plot (gof_tsne_2d.png)
Non-linear dimensionality reduction:
Better at revealing complex cluster structures
Preserves local neighborhoods
May show separability that PCA misses
LDA Plot (gof_lda_2d.png)
Linear Discriminant Analysis:
Maximizes class separation
Shows best linear separation achievable
Most relevant for linear-like classifiers
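The three projections above can be reproduced with scikit-learn. This is a sketch on synthetic stand-in features, not the tool's internal implementation; the class count, frame count, and feature width are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Synthetic stand-in for extracted features: 3 classes, 30 frames each,
# 64 features per frame (loosely mirroring a 64-feature extraction).
X = np.vstack([rng.normal(loc=c * 3.0, scale=1.0, size=(30, 64)) for c in range(3)])
y = np.repeat([0, 1, 2], 30)

pca_2d = PCA(n_components=2).fit_transform(X)
tsne_2d = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(X)
# LDA yields at most n_classes - 1 components, so 2D works for 3+ classes.
lda_2d = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

for name, emb in [("PCA", pca_2d), ("t-SNE", tsne_2d), ("LDA", lda_2d)]:
    print(name, emb.shape)  # each is (90, 2), ready to scatter-plot by class
```

Scatter-plotting each embedding colored by `y` gives plots analogous to `gof_pca_2d.png`, `gof_tsne_2d.png`, and `gof_lda_2d.png`.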
8.5.6. Interpreting Results
Class Separability Score:
class_separability.csv:
class_pair,separability_score,overlap_percentage
A-B,0.95,2.3%
A-C,0.82,8.5%
B-C,0.99,0.1%
Score > 0.9: Excellent separability
Score 0.7-0.9: Good separability
Score 0.5-0.7: Moderate (may need better features)
Score < 0.5: Poor (investigate data or features)
Confusion Potential:
confusion_potential.csv:
class_1,class_2,potential_confusion
A,C,high
B,D,low
Identifies which classes are most likely to be confused.
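The exact metric behind `separability_score` is not documented here; a common proxy with the same 0-to-1 behavior is a Fisher-style ratio (between-class distance relative to within-class spread), sketched below on synthetic clouds. Treat the function name and mapping as illustrative, not the tool's definition.

```python
import numpy as np

def fisher_separability(a, b):
    """Fisher-style separability proxy for two feature clouds:
    squared distance between class means over the summed within-class
    variance, mapped through x/(1+x) so the result lands in [0, 1)."""
    d2 = np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2)
    spread = a.var(axis=0).sum() + b.var(axis=0).sum()
    ratio = d2 / spread
    return ratio / (1.0 + ratio)

rng = np.random.default_rng(1)
well_separated = fisher_separability(rng.normal(0, 1, (50, 8)),
                                     rng.normal(5, 1, (50, 8)))
overlapping = fisher_separability(rng.normal(0, 1, (50, 8)),
                                  rng.normal(0.5, 1, (50, 8)))
print(round(well_separated, 2), round(overlapping, 2))
```

Well-separated clouds score near 1 and heavily overlapping ones near 0, matching the interpretation bands above.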
8.5.7. 8-Plot Analysis
GoF generates 8 different visualizations by combining three processing stages: 2 Transforms x 2 Scalings x 2 Dimensionality Reductions = 8 plots.
Transforms
FFT + Abs + Log: Converts the time-series data from the time domain into the frequency domain using a Fast Fourier Transform. Only the positive half of the symmetric FFT output is retained. The absolute value is taken, followed by a log transform to compress large magnitudes.
Wavelet Transform (WT): Analyzes the data in both the time and frequency domains simultaneously. This makes it especially effective at capturing localized events in time-series data, such as sudden spikes or anomalies.
Scalings
Standard Scaler (Z-score normalization): Standardizes the data by subtracting the mean and dividing by the standard deviation, producing features with mean=0 and standard deviation=1.
Min-Max Scaler: Scales each feature to the [0, 1] range. Useful when you want to preserve the relative distances between data points.
Dimensionality Reduction
PCA (Principal Component Analysis): A linear method that projects the data into fewer dimensions while preserving as much variance as possible.
t-SNE (t-Distributed Stochastic Neighbor Embedding): A non-linear method that maps high-dimensional data into 2D by preserving local neighborhood relationships. Especially good at revealing cluster structures.
The 8 Combinations
Plot 1: FFT+Abs+Log + Standard Scaler + PCA
Plot 2: FFT+Abs+Log + Standard Scaler + t-SNE
Plot 3: FFT+Abs+Log + MinMax Scaler + PCA
Plot 4: FFT+Abs+Log + MinMax Scaler + t-SNE
Plot 5: WT + Standard Scaler + PCA
Plot 6: WT + Standard Scaler + t-SNE
Plot 7: WT + MinMax Scaler + PCA
Plot 8: WT + MinMax Scaler + t-SNE
Important
Not all 8 plots need to show separable clusters. Each plot represents a different method of analyzing the time-series data. If any one of the 8 plots shows separable clusters, it is a strong sign that the dataset is suitable for classification.
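The 2 x 2 x 2 grid can be sketched compactly. This is not the tool's code: a one-level Haar transform stands in for its wavelet transform, and scikit-learn supplies the scalers and reducers.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

def fft_abs_log(frames):
    # Keep the positive half of the symmetric FFT, then abs + log compression.
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def haar_level1(frames):
    # One-level Haar DWT (approximation + detail coefficients) as a simple
    # stand-in for the tool's wavelet transform.
    a = (frames[:, 0::2] + frames[:, 1::2]) / np.sqrt(2)
    d = (frames[:, 0::2] - frames[:, 1::2]) / np.sqrt(2)
    return np.hstack([a, d])

rng = np.random.default_rng(0)
t = np.arange(128) / 128.0
frames = np.vstack([
    np.sin(2 * np.pi * 5 * t) + 0.1 * rng.normal(size=(40, 128)),   # class A
    np.sin(2 * np.pi * 12 * t) + 0.1 * rng.normal(size=(40, 128)),  # class B
])

embeddings = {}
for t_name, transform in [("FFT+Abs+Log", fft_abs_log), ("WT", haar_level1)]:
    feats = transform(frames)
    for s_name, scaler in [("Standard", StandardScaler()), ("MinMax", MinMaxScaler())]:
        scaled = scaler.fit_transform(feats)
        for r_name, reducer in [("PCA", PCA(n_components=2)),
                                ("t-SNE", TSNE(n_components=2, perplexity=20,
                                               random_state=0))]:
            embeddings[(t_name, s_name, r_name)] = reducer.fit_transform(scaled)

print(len(embeddings))  # 8 two-dimensional embeddings, one per plot
```

Scatter-plotting each of the eight embeddings by class reproduces the eight-panel layout described above.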
8.5.8. Common Patterns
Good Dataset:
- Tight, well-separated clusters
- Consistent within-class variance
- Clear boundaries between classes
Problematic Dataset:
- Overlapping clusters
- Outliers far from clusters
- One class scattered, others tight
Feature Extraction Issue:
- All classes overlap completely
- No structure visible
- Random-looking scatter
8.5.9. Actionable Insights
If classes overlap:
Try different feature extraction:
data_processing_feature_extraction:
  # Try FFT instead of raw
  feature_extraction_name: 'Generic_1024Input_FFTBIN_64Feature_8Frame'
Increase feature count:
data_processing_feature_extraction:
  feature_extraction_name: 'Generic_512Input_RAW_512Feature_1Frame'
Review data labeling for errors
If one class is scattered:
Check for mislabeled samples
Consider splitting into sub-classes
Collect more training data for that class
If all classes overlap:
Feature extraction may be inappropriate
Data might not contain discriminative information
Consider domain expertise for better features
8.5.10. Frame Size Sweeping
Sometimes the default frame_size does not capture enough of the signal to produce
meaningful GoF plots. In such cases, sweeping across multiple frame sizes can reveal
the right setting.
Arc Fault Classification Example
The Arc Fault dataset (two classes: Arc and Normal) demonstrates this clearly. Starting with a small frame size and progressively increasing it:
frame_size=256 --> Poor: clusters lack purity, no class separation
frame_size=512 --> Still poor: significant overlap persists
frame_size=1024 --> Improving: WT-based plots (5-8) begin showing less overlap
frame_size=2048 --> Better: continued improvement in WT plots
frame_size=4096 --> Good: clear, well-separated clusters visible in Plot 7 (WT + MinMax Scaler + PCA)
Why Larger Frame Sizes Help
The Arc Fault dataset has a high sampling frequency, meaning many data points are recorded per second. With a small frame size, each frame captures only a tiny slice of the signal and misses the broader pattern. Increasing the frame size allows more data points per frame, revealing the true structure of the data.
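The documentation does not state the Arc Fault dataset's sampling rate, but assuming a hypothetical rate (say 1 MHz) makes the trade-off concrete: frame duration and FFT frequency resolution both scale with frame size.

```python
FS_HZ = 1_000_000  # hypothetical sampling rate; the actual dataset rate is not given here

for frame_size in (256, 512, 1024, 2048, 4096):
    duration_ms = 1000.0 * frame_size / FS_HZ  # time span captured by one frame
    resolution_hz = FS_HZ / frame_size         # FFT frequency-bin width
    print(f"frame_size={frame_size:5d}: {duration_ms:6.3f} ms per frame, "
          f"{resolution_hz:8.1f} Hz per FFT bin")
```

At this assumed rate, a 256-sample frame spans only a quarter of a millisecond, while 4096 samples capture sixteen times as much signal with sixteen times finer frequency resolution.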
Recommendation: Start with a frame_size that matches the intended model input
size. If plots are inconclusive, try 2x and 4x larger frame sizes. The frame size
that produces the clearest separation in the GoF plots is a good indicator of the
minimum signal length needed for reliable classification.
# Sweep example: run GoF at multiple frame sizes
data_processing_feature_extraction:
  gof_test: True
  frame_size: 1024  # Try 256, 512, 1024, 2048, 4096
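One way to automate the sweep is to generate one config per frame size and run the tool on each. The template and file names below are illustrative, not part of the tool.

```python
import pathlib

# Illustrative config fragment; section names follow the examples in this chapter.
TEMPLATE = """\
data_processing_feature_extraction:
  gof_test: True
  frame_size: {frame_size}
"""

out_dir = pathlib.Path("gof_sweep")
out_dir.mkdir(exist_ok=True)

configs = []
for frame_size in (256, 512, 1024, 2048, 4096):
    path = out_dir / f"config_fs{frame_size}.yaml"
    path.write_text(TEMPLATE.format(frame_size=frame_size))
    configs.append(path)

# Each generated file would then be passed to the run script:
#   ./run_tinyml_modelzoo.sh gof_sweep/config_fs256.yaml  (and so on)
print([p.name for p in configs])
```

A real sweep config would of course carry the full `common`, `dataset`, and feature-extraction sections shown earlier, not just this fragment.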
8.5.11. Multi-Cluster Analysis
When a single class appears as multiple separate clusters in the GoF plots, it does not necessarily mean the data is bad. Multiple clusters per class can arise from real variations within the data collection process.
Motor Fault Four-Class Example: Sampling Frequency
Running GoF on the Motor Fault Four-Class Dataset (frame_size=256) revealed that
each class formed roughly 10 separate clusters. Investigation showed that the
dataset contained samples collected at 10 different sampling frequencies (10 Hz to
100 Hz). Filtering the dataset to a single frequency (40 Hz) reduced the cluster
count from 10 per class to 4 clusters – one per actual class. The sampling
frequency variation had introduced unwanted structure.
Motor Fault Six-Class Example: Equipment Variation
The Motor Fault Six-Class Dataset showed a similar multi-cluster pattern. However, filtering to a single sampling frequency (40 Hz) still left roughly 3 clusters per class. A deeper look revealed that the data had been collected from 3 different motors. Filtering to a single motor finally produced exactly 6 clusters, matching the actual number of classes.
Common Causes of Multi-Cluster Patterns
Different sampling frequencies in the dataset
Different environmental conditions during data collection
Different equipment or sensors used across collection sessions
Varying operational states within the same nominal class
What To Do
Examine metadata (frequency, sensor ID, conditions) for systematic variation.
Filter or stratify the data by the suspected variable and re-run GoF.
If filtering resolves the multi-cluster pattern, consider whether the model should be trained on the full mixed dataset or on a controlled subset.
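The filter-and-re-run step amounts to boolean masking on per-sample metadata. The array names below are hypothetical; they mimic the sampling-frequency and motor-ID variables from the motor-fault investigations above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample metadata, mimicking the motor-fault examples:
# samples recorded at several sampling frequencies and from several motors.
n = 300
features = rng.normal(size=(n, 16))
labels = rng.integers(0, 4, size=n)
sample_rate_hz = rng.choice([10, 20, 40, 80, 100], size=n)
motor_id = rng.choice([1, 2, 3], size=n)

# Stratify by the suspected variable(s), then re-run GoF on the subset.
mask = (sample_rate_hz == 40) & (motor_id == 1)
X_subset, y_subset = features[mask], labels[mask]

print(mask.sum(), "of", n, "samples kept for the single-rate, single-motor re-run")
```

If the cluster count per class drops to one on the subset, the metadata variable, not the labeling, explains the multi-cluster pattern.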
8.5.12. Example: Motor Fault GoF Analysis
common:
  task_type: 'generic_timeseries_classification'
  target_device: 'F28P55'
dataset:
  dataset_name: 'motor_fault_classification_dsk'
data_processing_feature_extraction:
  feature_extraction_name: 'Input256_FFTBIN_16Feature_8Frame_3InputChannel_removeDC_2D1'
  variables: 3
  gof_test: True
  frame_size: 256
training:
  enable: False  # GoF only, skip training
Expected Good Results:
6 fault classes showing clear separation:
- Normal: tight cluster, well separated
- Contaminated: distinct from normal
- Erosion: some overlap with flaking (similar faults)
- Flaking: some overlap with erosion
- No Lubrication: well separated
- Localized Fault: distinct signature
8.5.13. GoF Without Training
Run GoF analysis only (no model training):
data_processing_feature_extraction:
  gof_test: True
training:
  enable: False
testing:
  enable: False
compilation:
  enable: False
This is useful for:
Rapid dataset evaluation
Feature extraction comparison
Data quality assessment
8.5.14. Comparing Feature Extraction
Run GoF with different feature extraction to compare:
Configuration 1:
data_processing_feature_extraction:
  feature_extraction_name: 'Generic_1024Input_FFTBIN_64Feature_8Frame'
  gof_test: True
Configuration 2:
data_processing_feature_extraction:
  feature_extraction_name: 'Generic_512Input_RAW_512Feature_1Frame'
  gof_test: True
Compare the visualizations to see which gives better separability.
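Beyond eyeballing the plots, the comparison can be made quantitative. The sketch below uses a Fisher-style ratio (an illustrative proxy, not the tool's metric) on synthetic two-class signals: with random phase, raw time-domain samples overlap, while FFT magnitudes keep a class-specific frequency peak.

```python
import numpy as np

def fisher_score(a, b):
    """Between-class distance squared over within-class variance (proxy metric)."""
    d2 = np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2)
    return d2 / (a.var(axis=0).sum() + b.var(axis=0).sum())

rng = np.random.default_rng(0)
t = np.arange(256) / 256.0

def make_class(freq, n=60):
    # Random phase per frame: time-domain samples overlap between classes,
    # while the FFT magnitude keeps a class-specific peak.
    phases = rng.uniform(0, 2 * np.pi, size=(n, 1))
    return np.sin(2 * np.pi * freq * t + phases) + 0.1 * rng.normal(size=(n, 256))

raw_a, raw_b = make_class(5), make_class(12)
fft_a, fft_b = (np.abs(np.fft.rfft(x, axis=1)) for x in (raw_a, raw_b))

print(f"raw: {fisher_score(raw_a, raw_b):.3f}  fft: {fisher_score(fft_a, fft_b):.3f}")
```

The FFT representation scores far higher here, mirroring how an FFTBIN feature extraction can reveal separability that RAW features hide.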
8.5.15. Best Practices
Always run GoF first: Before long training runs
Compare multiple feature extractions: Find the best approach
Investigate overlapping classes: May need more/different data
Use domain knowledge: Understand why classes separate (or don’t)
Document findings: GoF results inform model expectations
8.5.16. Limitations
GoF relies mostly on linear projections (PCA, LDA); neural networks can learn non-linear boundaries
Good GoF doesn’t guarantee good model accuracy
Poor GoF may still yield acceptable models with enough complexity
2D projections can hide separability in higher dimensions
Use GoF as a guide, not a definitive answer.
8.5.17. Next Steps
Learn about Feature Extraction options
See Post-Training Analysis for model evaluation
Proceed to training if GoF looks good