2. Compiling Models

TI TVM compiles a model into a deployable artifact through three stages:

ONNX model
    → Import into TVM
    → Partition
          TIDL-supported layers  →  TIDL artifacts
          Unsupported layers     →  C7x™ NPU generated code
    → deploy_lib.so + deploy_graph.json + deploy_param.params

The same model is compiled twice in sequence — first for x86 (host emulation) and then for AArch64 (EVM). The TIDL partitioning and import steps run only once; the results are reused for the EVM build via the REUSE_TIDL_ARTIFACTS environment variable.

2.1. Supported Model Formats

TI TVM accepts models in ONNX format, which is the format validated by TI. If your model is in a different framework, export it to ONNX first.

2.2. Calibration Data

TIDL runs inference with quantized fixed-point values. During compilation, calibration data is used to estimate each layer’s dynamic range and compute scaling factors. You only need to provide calibration data for the whole model; TI TVM automatically extracts the tensor values at TIDL subgraph boundaries.
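
As a toy illustration of what calibration computes, a symmetric 8-bit scheme derives one scale factor from the observed dynamic range. This sketch is illustrative only and is not TIDL's actual calibration algorithm:

```python
import numpy as np

# Toy symmetric 8-bit quantization: the scale factor comes from the
# dynamic range observed on calibration data (illustrative only).
rng = np.random.default_rng(0)
activations = rng.standard_normal(1000).astype(np.float32) * 3.0

scale = np.abs(activations).max() / 127.0
quantized = np.clip(np.round(activations / scale), -127, 127).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

# For in-range values the round-trip error is bounded by half a step
max_err = np.abs(activations - dequantized).max()
print(f"scale={scale:.4f}, max abs error={max_err:.4f}")
```

A calibration set whose range is not representative yields a scale that either clips real inputs or wastes precision, which is where quantization accuracy loss comes from.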

Provide a small representative set of input frames (typically 2–10) as a .npz file:

import numpy as np

# "preprocess" and "calib_images" are placeholders for your own
# preprocessing function and collection of calibration images.
# Each key is an input name; the value stacks N frames into shape [N, ...].
calib_dict = {
    "input": np.stack([preprocess(img) for img in calib_images])
}
np.savez_compressed("calibration.npz", **calib_dict)

Choose calibration inputs that represent the distribution of real inference inputs — poor calibration data leads to accuracy loss after quantization.
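
Before compiling, a quick sanity check of the archive catches shape and dtype mistakes early. Random stand-in frames are used here in place of real preprocessed images:

```python
import numpy as np

# Stand-in for real preprocessed frames: 5 random 3x224x224 inputs
calib_dict = {"input": np.random.rand(5, 3, 224, 224).astype(np.float32)}
np.savez_compressed("calibration.npz", **calib_dict)

# Reload and verify before handing the file to the compiler
data = np.load("calibration.npz")
for name in data.files:
    frames = data[name]
    print(f"{name}: {frames.shape[0]} frames, per-frame shape "
          f"{frames.shape[1:]}, dtype {frames.dtype}")
```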

2.3. Compilation Interfaces

TI TVM provides two equivalent interfaces for compilation. Both produce identical artifacts and accept the same options — the choice is a matter of preference:

  • TVMC — command-line interface; options passed via a YAML config file

  • Python API — compile_model function; options passed as a Python dict

2.3.1. Compiling with TVMC

Prepare a YAML config file with the TIDL compile options (see Compilation Options for the full list):

# config.yaml
compile_options:
  "tensor_bits": 8
  "advanced_options:calibration_frames": 5
  "advanced_options:calibration_iterations": 5
  "advanced_options:c7x_codegen": 1

Then compile the model:

python -m tvm.driver.tvmc compile model.onnx \
  --target tidl \
  --tidl-config config.yaml \
  --tidl-calibration-input calibration.npz \
  --output ./artifacts/

2.3.1.1. TVMC options

--target
    Must be set to tidl to enable TI TVM compilation.

--tidl-config
    Path to a YAML file containing TIDL compile options.

--tidl-calibration-input
    Path to a .npz file containing calibration frames. Required when TIDL
    offload is enabled.

--c7x-codegen (default: 0)
    Set to 1 to compile TVM-generated code for C7x™ NPU execution, so that
    unsupported layers run on the C7x™ NPU instead of Arm. Overrides the
    value in --tidl-config if both are set.

--enable-tidl-offload (default: 1)
    Set to 0 to disable TIDL offload and run the entire model via TVM code
    generation only.

--compile-for-device (default: 1)
    Set to 1 to cross-compile for AArch64 (EVM). Set to 0 to compile for x86
    host execution only.

--output (default: ./model-artifacts)
    Directory where compiled artifacts are written.

2.3.2. Compiling with the Python API

The compile_model function accepts the same options as the TVMC YAML config file, passed as a Python dict via delegate_options:

Example — compile for both host and EVM:

import os
import numpy as np
from tvm.contrib.tidl.compile import compile_model

delegate_options = {
    "artifacts_folder":                    "./artifacts",
    "tensor_bits":                         8,
    "advanced_options:c7x_codegen":        1,
    "advanced_options:calibration_frames": 5,
}

calibration_input = [{"input": frame} for frame in calib_frames]

# Compile for x86 host first
compile_model(
    platform=os.environ["SOC"],
    compile_for_device=False,
    enable_tidl_offload=True,
    delegate_options=delegate_options,
    calibration_input_list=calibration_input,
    model_path="model.onnx",
    input_shape_dict=[{"input": (1, 3, 224, 224)}],
)

# Reuse TIDL artifacts from the host build for the EVM build
os.environ["REUSE_TIDL_ARTIFACTS"] = "1"

compile_model(
    platform=os.environ["SOC"],
    compile_for_device=True,
    enable_tidl_offload=True,
    delegate_options=delegate_options,
    calibration_input_list=calibration_input,
    model_path="model.onnx",
    input_shape_dict=[{"input": (1, 3, 224, 224)}],
)

2.4. Compilation Options

The following options control TIDL compilation behaviour. They are passed via the compile_options section in the TVMC YAML config file, or as keys in the delegate_options dict when using the Python API.

tensor_bits (default: 8)
    Quantization precision: 8, 16, or 32 bits.

advanced_options:c7x_codegen (default: 0)
    Set to 1 to compile TVM-generated code for C7x™ NPU execution, so that
    unsupported layers run on the C7x™ NPU instead of Arm.

advanced_options:calibration_frames (default: 2)
    Number of calibration frames used for quantization.

advanced_options:calibration_iterations (default: 5)
    Number of calibration iterations per frame.

Note

For a complete, production-ready example of both compilation interfaces, see tvmrt_wrapper.py in the edgeai-tidl-tools repository.

2.5. Compilation Artifacts

After a successful compilation, three files are written to artifacts_folder:

  • deploy_lib.so — fat binary containing TIDL subgraphs and C7x™ NPU code

  • deploy_graph.json — execution graph describing nodes and data flow

  • deploy_param.params — model weights

These three files are everything needed to run inference on the EVM.
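
A minimal pre-deployment check that a compile run actually produced all three files. The artifact names come from this section; the directory path and helper name are illustrative:

```python
import os

# The three deployable artifacts named in this section
EXPECTED = ("deploy_lib.so", "deploy_graph.json", "deploy_param.params")

def check_artifacts(artifacts_folder: str) -> list:
    """Return the expected artifacts missing from artifacts_folder."""
    return [name for name in EXPECTED
            if not os.path.isfile(os.path.join(artifacts_folder, name))]

missing = check_artifacts("./artifacts")
if missing:
    print("incomplete compilation, missing:", ", ".join(missing))
else:
    print("all deployable artifacts present")
```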

Hint

During development you can NFS-mount the host compilation directory on the EVM, avoiding the need to copy artifacts after each recompile.

2.5.1. Intermediate artifacts

TI TVM also writes intermediate files to artifacts_folder/tempDir/ that are useful for understanding and debugging compilation.

Relay graphs — snapshots of the model at each compilation stage:

  • relay_graph.orig.txt — original graph from the TVM frontend

  • relay_graph.prepared.txt — after pre-TIDL transformations

  • relay_graph.annotated.txt — annotated for TIDL offload

  • relay_graph.partitioned.txt — after partitioning; check this to see which layers are offloaded to TIDL and which are not

  • relay_graph.import.txt — used for TIDL import

  • relay_graph.optimized.txt — final optimized graph for code generation

  • relay_graph.wrapper.txt — Arm-side wrapper graph for C7x™ NPU dispatch

TIDL artifacts:

  • relay.gv.svg — graphical view of the whole network with TIDL subgraph boundaries highlighted

  • subgraph<n>_net.bin.html — detailed view of each TIDL subgraph

Generated C7x™ NPU code (when c7x_codegen=1):

  • model_<n>.c — TVM-generated C7x™ NPU code for each subgraph and non-TIDL layer

2.6. Running Unsupported Layers on Arm

Setting --c7x-codegen 0 maps TIDL-unsupported layers to the Arm core instead of the C7x™ NPU. This is useful when the C7000 CGT toolchain is not available or for quick bring-up without the full toolchain.

python -m tvm.driver.tvmc compile model.onnx \
  --target tidl \
  --c7x-codegen 0 \
  --tidl-calibration-input calibration.npz \
  --output ./artifacts/

Note

Arm fallback requires ARM64_GCC_PATH to be set and increases inference latency due to data transfers between Arm and the C7x™ NPU. Use c7x_codegen=1 for production deployments.

2.7. Debugging Compilation

Set the TIDL_RELAY_IMPORT_DEBUG environment variable to get verbose output during compilation.

TIDL_RELAY_IMPORT_DEBUG=1
    Per-node TIDL support status, Relay-to-TIDL node conversion, subgraph
    summary, and calibration progress.

TIDL_RELAY_IMPORT_DEBUG=2 or 3
    All output from level 1, plus detailed TIDL subgraph import information.

Example output at level 1:

export TIDL_RELAY_IMPORT_DEBUG=1
python -m tvm.driver.tvmc compile model.onnx --target tidl ...
RelayImportDebug: In TIDL_relayImportNode:
RelayImportDebug: node name: 185, op name: tidl.conv2d, num_args: 2
RelayImportDebug:   args[0] dims: [1, 512, 7, 7]
RelayImportDebug:   args[1] dims: [512, 512, 3, 3]