7.7.24. Arc Fault Anomaly Detection Example

This example demonstrates using autoencoder-based anomaly detection to identify DC arc fault patterns in current waveform data.

7.7.24.1. Overview

Task: Anomaly detection (binary: normal vs anomaly)
Application: DC arc fault detection
Model: Autoencoder architecture (AD_2k_NPU)
Training: Uses only normal data
Detection: High reconstruction error indicates anomaly
Dataset: DC arc fault current waveforms (DSK variant)

This example uses the same DC arc fault dataset as the classification example, but approaches it as an anomaly detection problem. The autoencoder learns to reconstruct normal current waveforms and flags arc fault patterns as anomalies based on high reconstruction error.
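
The decision rule described above can be sketched in a few lines of plain Python. This is an illustration only, not the ModelMaker implementation: `toy_reconstruct` is a hypothetical stand-in for a trained autoencoder, and the function names, sample frames, and threshold value are assumptions chosen for the demonstration.

```python
def mse(x, x_hat):
    """Mean squared error between an input frame and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def is_anomaly(frame, reconstruct, threshold):
    """Flag a frame when its reconstruction error exceeds the threshold."""
    return mse(frame, reconstruct(frame)) > threshold


# Hypothetical stand-in for a trained autoencoder: it can only reproduce
# values inside the range it "saw" during training (here, [-1, 1]).
def toy_reconstruct(frame):
    return [max(-1.0, min(1.0, v)) for v in frame]


normal_frame = [0.2, -0.5, 0.9, 0.0]   # stays inside the learned range
arc_frame = [0.2, -3.0, 4.0, 0.0]      # spikes outside the learned range

print(is_anomaly(normal_frame, toy_reconstruct, threshold=0.5))  # False
print(is_anomaly(arc_frame, toy_reconstruct, threshold=0.5))     # True
```

The normal frame is reproduced exactly (error 0), while the clipped spikes in the second frame give an error of 3.25, well above the toy threshold.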

7.7.24.2. Running the Example

Linux:

cd tinyml-modelzoo

# DSK dataset variant
./run_tinyml_modelzoo.sh examples/dc_arc_fault/config_anomaly_detection_dsk.yaml

# DSI dataset variant
./run_tinyml_modelzoo.sh examples/dc_arc_fault/config_anomaly_detection_dsi.yaml

Windows:

cd tinyml-modelzoo

REM DSK dataset variant
run_tinyml_modelzoo.bat examples\dc_arc_fault\config_anomaly_detection_dsk.yaml

REM DSI dataset variant
run_tinyml_modelzoo.bat examples\dc_arc_fault\config_anomaly_detection_dsi.yaml

7.7.24.3. Configuration

common:
  task_type: 'generic_timeseries_anomalydetection'
  target_device: 'F28P55'

dataset:
  dataset_name: 'dc_arc_fault_example_dsk'
  input_data_path: 'https://software-dl.ti.com/C2000/esd/mcu_ai/01_03_00/datasets/dc_arc_fault_example_dsk.zip'

data_processing_feature_extraction:
  feature_extraction_name: 'FFT1024Input_256Feature_1Frame_Full_Bandwidth'
  variables: 1

training:
  model_name: 'AD_2k_NPU'
  training_epochs: 200
  batch_size: 256

testing:
  enable: True

compilation:
  enable: True

7.7.24.4. Dataset Format

The dataset follows the anomaly detection folder structure:

dc_arc_fault_example_dsk/
└── classes/
    ├── Normal/           # Normal current waveforms (training data)
    │   ├── file1.csv
    │   └── ...
    └── Anomaly/          # Arc fault waveforms (test-only data)
        ├── file1.csv
        └── ...

The model trains only on “Normal” class data. Anomaly data is used exclusively for testing and threshold evaluation.

See Anomaly Detection Dataset Format for full dataset format details.
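
Before starting a long training run, it can help to confirm that a local copy of the dataset matches the layout above. The following is a minimal sketch, not part of the toolchain: the helper names `count_csv_files` and `check_layout` are hypothetical, and the requirement that both folders contain at least one `.csv` file is an assumption based on the structure shown.

```python
from pathlib import Path


def count_csv_files(dataset_root):
    """Return {class_name: number_of_csv_files} for each folder under classes/."""
    classes_dir = Path(dataset_root) / "classes"
    return {
        d.name: sum(1 for _ in d.glob("*.csv"))
        for d in sorted(classes_dir.iterdir())
        if d.is_dir()
    }


def check_layout(dataset_root):
    """Raise ValueError if the Normal or Anomaly folder is missing or empty."""
    counts = count_csv_files(dataset_root)
    for name in ("Normal", "Anomaly"):
        if counts.get(name, 0) == 0:
            raise ValueError(f"classes/{name}/ is missing or has no .csv files")
    return counts
```

For example, `check_layout("dc_arc_fault_example_dsk")` would return the number of files per class for an extracted copy of the dataset, or raise an error naming the folder that is missing or empty.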

7.7.24.5. Available Models

Anomaly detection models use an autoencoder architecture:

Model	Parameters	Description
`AD_500_NPU`	~500	Minimal, simple patterns
`AD_1k`	~1,000	Compact autoencoder
`AD_2k_NPU`	~2,000	Balanced (used in this example)
`AD_4k`	~4,000	Complex patterns
`AD_6k_NPU`	~6,000	Depthwise separable encoder
`AD_8k_NPU`	~8,000	High complexity
`AD_Linear`	Varies	Deep linear autoencoder
`Ondevice_Trainable_AD_Linear`	Varies	On-device trainable variant

Model Selection:

Simple, repetitive patterns: AD_500_NPU or AD_1k
Moderate complexity: AD_2k_NPU or AD_4k
Complex, variable patterns: AD_6k_NPU or AD_8k_NPU

7.7.24.6. Expected Results

After training, you should see output similar to:

INFO: Best epoch: 39
INFO: MSE: 1.773

INFO: Reconstruction Error Statistics:
INFO: Normal training data - Mean: 1.662490, Std: 1.968127
INFO: Anomaly test data - Mean: 141.985321, Std: 112.756683
INFO: Normal test data - Mean: 2.849831, Std: 1.343052

INFO: Threshold for K = 4.5: 10.519060
INFO: False positive rate: 0.00%
INFO: Anomaly detection rate (recall): 100.00%

Key indicators of good training:

Large gap between normal mean error and anomaly mean error
Low false positive rate
High recall (anomaly detection rate)
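
To make the last two indicators concrete, the sketch below shows how a false positive rate and a recall can be derived from per-sample reconstruction errors and a threshold. It is an assumption about the usual definitions, not the tool's own code: the function name is hypothetical, the strict `>` comparison is a choice made here, and the error values are illustrative rather than taken from the example dataset.

```python
def detection_metrics(normal_errors, anomaly_errors, threshold):
    """Return (false_positive_rate, recall) in percent for a given threshold."""
    # A normal sample above the threshold is a false alarm.
    false_pos = sum(e > threshold for e in normal_errors)
    # An anomaly sample above the threshold is a correct detection.
    true_pos = sum(e > threshold for e in anomaly_errors)
    fpr = 100.0 * false_pos / len(normal_errors)
    recall = 100.0 * true_pos / len(anomaly_errors)
    return fpr, recall


# Illustrative values only, not taken from the example dataset.
normal_errors = [1.2, 2.8, 3.1, 0.9]
anomaly_errors = [45.0, 140.0, 260.0]

fpr, recall = detection_metrics(normal_errors, anomaly_errors, threshold=10.5)
print(f"False positive rate: {fpr:.2f}%")            # 0.00%
print(f"Anomaly detection rate (recall): {recall:.2f}%")  # 100.00%
```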

7.7.24.7. Threshold Selection

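The threshold on the reconstruction error determines how sensitive detection is. It is computed from the reconstruction error statistics of the normal training data: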

threshold = mean_train + k * std_train

Lower k: More anomalies detected, but more false alarms
Higher k: Fewer false alarms, but may miss subtle anomalies
Typical starting point: k=3 (covers ~99.7% of normal data if the reconstruction errors are approximately normally distributed)
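
As a worked example, the formula can be applied to the training statistics printed in the Expected Results log. The mean and standard deviation below are copied from that log; the function name `threshold_for` is only an illustrative helper.

```python
# Statistics copied from the "Normal training data" line of the log.
mean_train = 1.662490
std_train = 1.968127


def threshold_for(k, mean=mean_train, std=std_train):
    """Apply threshold = mean_train + k * std_train."""
    return mean + k * std


for k in (0.0, 1.0, 3.0, 4.5):
    print(f"k = {k}: threshold = {threshold_for(k):.3f}")
```

For k = 4.5 this gives approximately 10.519, which agrees with the "Threshold for K = 4.5" line in the log.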

Refer to the threshold_performance.csv output file to select the optimal k value for your application. It contains precision, recall, F1, and false positive rate for each k value from 0 to 4.5.

7.7.24.8. Interpreting Outputs

After training, ModelMaker generates the following analysis outputs in the post_training_analysis/ folder:

Reconstruction Error Histogram (reconstruction_error_histogram.png):

Shows the distribution of reconstruction errors for normal vs. anomaly data:

Separated distributions = good detection capability
Overlapping distributions = may need different features or model

Threshold Performance CSV (threshold_performance.csv):

Contains detection metrics for each k value:

k_value,threshold,accuracy,precision,recall,f1_score,false_positive_rate,...
0.0,1.662,98.65,98.65,100.0,99.32,83.54,...
1.0,3.631,99.71,99.70,100.0,99.85,18.13,...
...
4.5,10.519,100.0,100.0,100.0,100.0,0.0,...

Use this file to select the threshold that best balances precision and recall for your deployment requirements.
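
One possible way to automate this choice is sketched below. It is a suggestion, not part of ModelMaker: the selection rule (highest recall subject to a maximum false positive rate) and the helper name `pick_k` are assumptions, and the sample data contains only the three rows shown above with the elided trailing columns dropped.

```python
import csv
import io

# The three rows shown above, with the elided trailing columns dropped.
SAMPLE_CSV = """\
k_value,threshold,accuracy,precision,recall,f1_score,false_positive_rate
0.0,1.662,98.65,98.65,100.0,99.32,83.54
1.0,3.631,99.71,99.70,100.0,99.85,18.13
4.5,10.519,100.0,100.0,100.0,100.0,0.0
"""


def pick_k(csv_file, max_fpr):
    """Return the row with the highest recall whose FPR is <= max_fpr (percent).

    Ties on recall are broken in favour of the lower false positive rate.
    Returns None if no row satisfies the constraint.
    """
    rows = [
        row for row in csv.DictReader(csv_file)
        if float(row["false_positive_rate"]) <= max_fpr
    ]
    if not rows:
        return None
    return max(
        rows,
        key=lambda r: (float(r["recall"]), -float(r["false_positive_rate"])),
    )


best = pick_k(io.StringIO(SAMPLE_CSV), max_fpr=1.0)
print(best["k_value"], best["threshold"])  # 4.5 10.519
```

To use it on a real run, open the generated threshold_performance.csv (with `newline=""`, as the `csv` module recommends) and pass the file object to `pick_k` in place of the `io.StringIO` sample.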

7.7.24.9. Advanced Configuration

Adjust Model Size:

Smaller models compress the input more aggressively and may miss subtle anomalies; larger models can represent more complex normal patterns:

training:
  model_name: 'AD_500_NPU'   # Smaller, simpler patterns
  # model_name: 'AD_8k_NPU'  # Larger, complex patterns

Feature Engineering:

The choice of feature extraction determines which kinds of anomalies are visible to the model:

data_processing_feature_extraction:
  # FFT bins capture frequency-domain anomalies
  feature_extraction_name: 'Generic_1024Input_FFTBIN_64Feature_8Frame'

  # Raw samples capture time-domain waveform anomalies
  # feature_extraction_name: 'Generic_512Input_RAW_512Feature_1Frame'

7.7.24.10. Practical Applications

Vibration Monitoring:

common:
  task_type: 'generic_timeseries_anomalydetection'
  target_device: 'F28P55'

dataset:
  dataset_name: 'vibration_normal_only_dsk'

training:
  model_name: 'AD_4k'

Current Waveform Monitoring:

common:
  task_type: 'generic_timeseries_anomalydetection'

data_processing_feature_extraction:
  feature_extraction_name: 'FFT1024Input_256Feature_1Frame_Full_Bandwidth'
  variables: 1

training:
  model_name: 'AD_2k_NPU'

7.7.24.11. Comparison with Classification

Aspect	Classification	Anomaly Detection
Training data	All classes needed	Only normal data
Unknown faults	Cannot detect	Can detect (as anomaly)
Fault identification	Identifies fault type	Only detects abnormality
When to use	Known, labeled faults	Unknown or unlabeled faults

7.7.24.12. Troubleshooting

High false positive rate:

Threshold too low – increase k value
Normal data not representative of all operating conditions
Need more diverse normal training data

Missing anomalies:

Threshold too high – decrease k value
Model too simple – use a larger model (see Available Models)
Feature extraction missing relevant patterns

Unstable reconstruction error:

Increase training epochs
Try a different learning rate
Check for data preprocessing issues

7.7.24.13. Next Steps

Review the Anomaly Detection guide for threshold theory
Learn about Feature Extraction
Deploy to device: NPU Device Deployment