11.2. Compilation Only

If you have a pre-trained model (ONNX format) from another framework, you can use Tiny ML Tensorlab to compile it for TI microcontrollers.

11.2.1. Overview

The compilation-only workflow:

  1. Train your model externally (PyTorch, TensorFlow, etc.)

  2. Export to ONNX format

  3. Use Tiny ML Tensorlab to compile for target device

  4. Deploy compiled model to MCU

This is useful when:

  • You have existing models from other frameworks

  • You prefer a different training environment

  • You need specific model architectures not in ModelZoo

11.2.2. ONNX Model Requirements

Your ONNX model must meet these requirements:

Supported Operations:

  • Convolution (Conv, ConvTranspose)

  • Pooling (MaxPool, AveragePool, GlobalAveragePool)

  • Fully Connected (Gemm, MatMul)

  • Activation (ReLU, Sigmoid, Tanh, Softmax)

  • Normalization (BatchNormalization)

  • Arithmetic (Add, Sub, Mul, Div)

  • Reshape (Reshape, Flatten, Squeeze, Unsqueeze)

Data Types:

  • Float32 (will be quantized)

  • Int8 (already quantized)

Input/Output:

  • Single input tensor

  • Single output tensor (or multiple for specific tasks)
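Before handing a model to the tool, it can be worth pre-checking its graph against the supported-operation list above. The sketch below is a hypothetical helper, not part of Tiny ML Tensorlab; note that the set uses the ONNX spellings of the op names (e.g. ONNX spells ReLU as "Relu"):

```python
# Hypothetical pre-check against the supported-operation list above.
SUPPORTED_OPS = {
    "Conv", "ConvTranspose",
    "MaxPool", "AveragePool", "GlobalAveragePool",
    "Gemm", "MatMul",
    "Relu", "Sigmoid", "Tanh", "Softmax",
    "BatchNormalization",
    "Add", "Sub", "Mul", "Div",
    "Reshape", "Flatten", "Squeeze", "Unsqueeze",
}

def unsupported_ops(op_types):
    """Return the op types that are not in the supported set, sorted."""
    return sorted(set(op_types) - SUPPORTED_OPS)

# With the onnx package installed you would pass the real graph:
#   unsupported_ops(n.op_type for n in onnx.load("my_model.onnx").graph.node)
print(unsupported_ops(["Conv", "Relu", "MyCustomOp"]))  # ['MyCustomOp']
```

An empty result does not guarantee compilation succeeds (shapes and attributes matter too), but a non-empty one tells you early which layers to replace.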

11.2.3. Compilation Configuration

Create a YAML configuration for compilation. The key points are disabling dataset loading and training, and providing the path to your ONNX model under the compilation section:

common:
  target_module: 'timeseries'
  task_type: 'generic_timeseries_classification'
  target_device: 'F28P55'
  run_name: '{date-time}/{model_name}'

dataset:
  enable: False          # Important: disable dataset loading
  dataset_name: my_model # Can be anything, used for directory naming

feature_extraction:
  feature_extraction_name: None

training:
  enable: False          # Important: disable training
  model_name: 'a'       # Can be anything, used for directory naming

compilation:
  enable: True
  model_path: "/path/to/your/model.onnx"  # Path to the model to compile

The most important line is model_path under compilation, which tells the tool where to find your pre-trained ONNX model. Both dataset and training must have enable: False since you are only compiling.
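These flags are easy to get wrong, so a quick programmatic sanity check before running can save a failed run. The helper below is a hypothetical sketch that operates on the parsed YAML as a plain dict (load the real file with yaml.safe_load):

```python
def check_byom_config(cfg):
    """Return a list of problems with a compile-only config dict (hypothetical helper)."""
    problems = []
    if cfg.get("dataset", {}).get("enable") is not False:
        problems.append("dataset.enable must be False for compile-only runs")
    if cfg.get("training", {}).get("enable") is not False:
        problems.append("training.enable must be False for compile-only runs")
    comp = cfg.get("compilation", {})
    if comp.get("enable") is not True:
        problems.append("compilation.enable must be True")
    if not comp.get("model_path"):
        problems.append("compilation.model_path is missing")
    return problems

# Mirroring the example config above:
cfg = {
    "dataset": {"enable": False},
    "training": {"enable": False},
    "compilation": {"enable": True, "model_path": "/path/to/your/model.onnx"},
}
print(check_byom_config(cfg))  # []
```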

11.2.4. Running Compilation

cd tinyml-modelmaker
python tinyml_modelmaker/run_tinyml_modelmaker.py byom_config.yaml

11.2.5. Model Formats

The compilation workflow accepts models in ONNX or TFLite format. Provide the path to your model file using model_path in the compilation section of the config.
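Since model_path is the only pointer to your model, a simple extension check can catch a wrong path before a full run. A minimal sketch (the model_format helper is hypothetical):

```python
from pathlib import Path

# The two formats the compilation workflow accepts
ACCEPTED_SUFFIXES = {".onnx", ".tflite"}

def model_format(model_path):
    """Classify a model file by extension, raising for anything else (hypothetical helper)."""
    suffix = Path(model_path).suffix.lower()
    if suffix not in ACCEPTED_SUFFIXES:
        raise ValueError(f"Unsupported model format: {suffix or '(none)'}")
    return suffix.lstrip(".")

print(model_format("my_model.onnx"))    # onnx
print(model_format("my_model.tflite"))  # tflite
```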

11.2.6. Output Artifacts

After compilation:

.../byom_output/
├── mod.a                    # Compiled library
├── mod.h                    # Interface header
├── model_config.h           # Configuration
└── compilation_log.txt      # Compilation details

11.2.7. NPU Compilation

For NPU devices, your ONNX model must follow NPU constraints:

Channel Requirements:

  • All intermediate channels must be multiples of 4

  • First layer input channels = 1

Kernel Constraints:

  • Convolution kernel height ≤ 7

  • MaxPool kernel ≤ 4x4

Verification:

import onnx

model = onnx.load('your_model.onnx')

# Walk the graph and flag kernel sizes that break the NPU constraints
for node in model.graph.node:
    if node.op_type == 'Conv':
        for attr in node.attribute:
            if attr.name == 'kernel_shape':
                kernel_h = attr.ints[0]
                if kernel_h > 7:
                    print(f"Warning: Conv kernel height {kernel_h} > 7")
    elif node.op_type == 'MaxPool':
        for attr in node.attribute:
            if attr.name == 'kernel_shape':
                if any(k > 4 for k in attr.ints):
                    print(f"Warning: MaxPool kernel {list(attr.ints)} > 4x4")
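The channel rule can be checked the same way. For a Conv node, the output channel count is the first dimension of its weight initializer; the helper below is a hypothetical sketch that works on a plain list of channel counts:

```python
def channel_violations(channels):
    """Return the channel counts that break the multiple-of-4 NPU rule (hypothetical helper)."""
    return [c for c in channels if c % 4 != 0]

# With the onnx package installed, collect Conv output channels from the
# weight initializers (dims[0] of the weight tensor), e.g.:
#   weights = {init.name: init for init in model.graph.initializer}
#   channels = [weights[n.input[1]].dims[0]
#               for n in model.graph.node if n.op_type == "Conv"]
print(channel_violations([4, 8, 5]))  # [5]
```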

If your model doesn’t meet NPU constraints, you have options:

  1. Modify and retrain the model

  2. Target a non-NPU device

  3. Accept CPU-only inference on NPU device

11.2.8. Example: External PyTorch Model

Step 1: Train in PyTorch

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 4, kernel_size=(5, 1), padding=(2, 0))
        self.bn1 = nn.BatchNorm2d(4)
        self.pool = nn.MaxPool2d((2, 1))
        self.conv2 = nn.Conv2d(4, 8, kernel_size=(3, 1), padding=(1, 0))
        self.bn2 = nn.BatchNorm2d(8)
        self.fc = nn.Linear(8 * 128, 3)

    def forward(self, x):
        x = torch.relu(self.bn1(self.conv1(x)))
        x = self.pool(x)
        x = torch.relu(self.bn2(self.conv2(x)))
        x = self.pool(x)
        x = x.view(x.size(0), -1)
        return self.fc(x)

# Train your model
model = MyModel()
# ... training code ...
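The elided training code could look like the following minimal sketch, using synthetic data purely for illustration. An equivalent nn.Sequential stands in for MyModel so the snippet is self-contained; substitute your real dataset, loop, and hyperparameters:

```python
import torch
import torch.nn as nn

# Equivalent of MyModel above, written as a Sequential for brevity
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=(5, 1), padding=(2, 0)), nn.BatchNorm2d(4), nn.ReLU(),
    nn.MaxPool2d((2, 1)),
    nn.Conv2d(4, 8, kernel_size=(3, 1), padding=(1, 0)), nn.BatchNorm2d(8), nn.ReLU(),
    nn.MaxPool2d((2, 1)),
    nn.Flatten(),
    nn.Linear(8 * 128, 3),
)

# Synthetic stand-in data: 32 windows of 512 samples, 3 classes
x = torch.randn(32, 1, 512, 1)
y = torch.randint(0, 3, (32,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```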

Step 2: Export to ONNX

# Export
dummy_input = torch.randn(1, 1, 512, 1)
torch.onnx.export(
    model,
    dummy_input,
    'my_model.onnx',
    input_names=['input'],
    output_names=['output'],
    opset_version=11
)

Step 3: Configure and compile

# byom_config.yaml
common:
  target_module: 'timeseries'
  task_type: 'generic_timeseries_classification'
  target_device: 'F28P55'
  run_name: '{date-time}/{model_name}'

dataset:
  enable: False
  dataset_name: my_pytorch_model

feature_extraction:
  feature_extraction_name: None

training:
  enable: False
  model_name: 'my_model'

compilation:
  enable: True
  model_path: 'my_model.onnx'

Step 4: Run compilation

python tinyml_modelmaker/run_tinyml_modelmaker.py byom_config.yaml

11.2.9. Example: TensorFlow Model

Step 1: Export from TensorFlow to ONNX

import tensorflow as tf
import tf2onnx

# Your trained TF model
tf_model = tf.keras.models.load_model('my_tf_model.h5')

# Convert to ONNX
spec = (tf.TensorSpec((1, 1, 512, 1), tf.float32, name="input"),)
model_proto, _ = tf2onnx.convert.from_keras(
    tf_model,
    input_signature=spec,
    opset=11
)

# Save
with open('tf_model.onnx', 'wb') as f:
    f.write(model_proto.SerializeToString())

Step 2: Continue with compilation as above

11.2.10. Troubleshooting

Unsupported Operation:

Error: Operation 'MyCustomOp' not supported

Solution: Replace the operation with supported ones or simplify the model.

Shape Mismatch:

Error: Input shape mismatch

Solution: Verify that the model at model_path has the expected input shape.

NPU Constraint Violation:

Error: Channel count 5 not multiple of 4

Solution: Modify model architecture to meet NPU requirements.

11.2.11. Best Practices

  1. Verify ONNX model first: Use onnxruntime to test

  2. Match training preprocessing: Calibration data should match inference

  3. Test on representative data: Ensure accuracy after quantization

  4. Start with non-NPU: Debug on CPU target first

  5. Compare outputs: Validate compiled model matches original

11.2.12. Next Steps