2. Compilation Explained
This section explains TI TVM compilation in more detail. An introduction to this topic is provided in Compiling Models.
2.1. Environment Setup
Please refer to the setup script provided as part of TI edgeai-tidl-tools for environment setup.
2.2. Frontends
The TI fork of TVM accepts machine learning models exported to the ONNX format. TI has not validated other formats.
As the first step of compilation, the ONNX model is imported into TVM’s internal representation, Relay IR. The layers that TIDL supports are then
mapped to the TIDL format internally. The compile_model API explained in getting-started/compilation.rst accepts either
a converted Relay module with its parameters or an ONNX model directly, in which case it converts the model to Relay IR internally.
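The manual conversion path can be sketched as follows. This is a minimal sketch that assumes the upstream TVM ONNX frontend (relay.frontend.from_onnx) and the onnx package are available in the environment; the function name import_onnx_to_relay and the input name used in the usage note are hypothetical, and the exact arguments that compile_model expects are not shown here.

```python
from pathlib import Path
from typing import Dict, Sequence


def import_onnx_to_relay(onnx_path: str, input_shapes: Dict[str, Sequence[int]]):
    """Load an ONNX file and convert it to a Relay module plus parameters.

    input_shapes maps each model input name to its static shape.
    """
    path = Path(onnx_path)
    if not path.is_file():
        raise FileNotFoundError(f"ONNX model not found: {path}")

    # Imported lazily so this module can be loaded without onnx/tvm installed.
    import onnx
    from tvm import relay

    onnx_model = onnx.load(str(path))
    shape_dict = {name: tuple(shape) for name, shape in input_shapes.items()}
    # from_onnx returns the Relay IRModule and a dict of weight tensors.
    mod, params = relay.frontend.from_onnx(onnx_model, shape=shape_dict)
    return mod, params
```

A call might look like import_onnx_to_relay("model.onnx", {"input": (1, 3, 224, 224)}), where both the file name and the input name depend on your model. Passing the ONNX model directly to compile_model avoids this step.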
2.3. Calibration Data
After partitioning layers into subgraphs that can be offloaded to TIDL, these subgraphs need to be imported to TIDL. Because TIDL runs inference with quantized fixed-point values, the TIDL import process requires calibration data so that each layer’s dynamic range can be estimated and the scaling factor for converting between floating point and fixed point can be computed.
You need to provide calibration data only at the inputs of the whole model, not for each subgraph. The TVM+TIDL compilation flow automatically obtains the corresponding tensor values at the TIDL subgraph boundaries and feeds those values into the TIDL import process for calibration. The calibration data you provide should represent typical input for the model.
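To illustrate why representative calibration data matters, the sketch below estimates a dynamic range from a few samples and derives a scaling factor from it. It assumes plain min/max range estimation and symmetric signed 8-bit quantization; this is a generic technique chosen for illustration, not a description of the algorithm TIDL actually uses, and all function names are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def dynamic_range(samples: Iterable[Sequence[float]]) -> Tuple[float, float]:
    """Return the (min, max) observed across all calibration samples."""
    lo, hi = float("inf"), float("-inf")
    for sample in samples:
        for value in sample:
            lo = min(lo, value)
            hi = max(hi, value)
    if lo > hi:
        raise ValueError("no calibration values were provided")
    return lo, hi


def symmetric_scale(lo: float, hi: float, bits: int = 8) -> float:
    """Scale that maps the largest observed magnitude onto the signed range."""
    max_abs = max(abs(lo), abs(hi))
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return qmax / max_abs if max_abs > 0 else 1.0


def quantize(value: float, scale: float, bits: int = 8) -> int:
    """Convert a float to fixed point, clamping to the representable range."""
    qmax = 2 ** (bits - 1) - 1
    return max(-qmax - 1, min(qmax, round(value * scale)))


# Two made-up calibration samples for a single tensor.
calibration = [[-0.5, 0.25, 1.0], [0.75, -2.0, 0.1]]
lo, hi = dynamic_range(calibration)
scale = symmetric_scale(lo, hi)
```

With this scheme, any inference-time value whose magnitude exceeds the calibrated range is clamped, which is why calibration inputs that are not typical for the model can degrade accuracy.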
2.4. Artifacts
2.4.1. Deployable module
After a successful TVM+TIDL compilation, the deployable module consists of three files that are saved in
the <artifacts_folder>:
.json: A JSON file describing the compiled graph through information about nodes, allocation, etc.
.so: The shared library containing code to run nodes in the compiled graph. This is a fat binary. Imported TIDL subgraph artifacts and generated C7x code are embedded in this fat binary.
.params: Contains weights associated with the nodes in the compiled graph.
At inference time, the TVM runtime reads these three files and creates a runtime instance to run inference.
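Before copying a deployable module to the EVM, it can be useful to confirm that all three files are present. The sketch below checks only the file extensions listed above, because the exact file names are not stated here; the helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Extensions of the three files that make up the deployable module.
REQUIRED_SUFFIXES = (".json", ".so", ".params")


def find_deployable_files(artifacts_folder: str) -> Dict[str, List[Path]]:
    """Group the files directly inside artifacts_folder by required extension."""
    folder = Path(artifacts_folder)
    return {
        suffix: sorted(
            p for p in folder.iterdir() if p.is_file() and p.suffix == suffix
        )
        for suffix in REQUIRED_SUFFIXES
    }


def missing_suffixes(artifacts_folder: str) -> List[str]:
    """Return the required extensions for which no file was found."""
    found = find_deployable_files(artifacts_folder)
    return [suffix for suffix, paths in found.items() if not paths]
```

An empty list from missing_suffixes suggests the module is complete; it does not verify that the files belong to the same compilation run.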
Hint
During development, you may export the x86_64 Linux filesystem where you run compilation, and mount the filesystem on your EVM so that you do not need to copy the deployable module.
For deploying onto your EVM, the deployable module is all that is needed. During compilation, TI TVM saves intermediate results in the <artifacts_folder>/tempDir directory. These results may
help you understand and debug the compilation. The following subsections describe some of the intermediate artifacts.
2.4.2. Relay graphs
relay_graph.orig.txt: The original Relay graph from the TVM frontend.
relay_graph.prepared.txt: The Relay graph after transformations that prepare for TIDL offload.
relay_graph.annotated.txt: The Relay graph annotated for TIDL offload.
relay_graph.partitioned.txt: The Relay graph partitioned for TIDL offload.
relay_graph.import.txt: The Relay graph used to import into TIDL.
relay_graph.boundary.txt: The Relay graph used to obtain calibration data at TIDL subgraph boundaries.
relay_graph.optimized.txt: The optimized Relay graph for code generation.
relay_graph.wrapper.txt: The wrapper Relay graph on the Arm side for dispatching the graph to C7x.
For example, you may examine relay_graph.import.txt to see how many TIDL subgraphs are
created and which layers are not offloaded to TIDL.
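When debugging, a quick way to see how far compilation progressed is to check which of these graph dumps exist. The sketch below looks for the file names listed above under <artifacts_folder>/tempDir, in the order they are listed; the helper name is hypothetical, and a missing file may simply mean that the corresponding stage does not apply to your configuration.

```python
from pathlib import Path
from typing import List, Tuple

# Stage names taken from the relay_graph.<stage>.txt files listed above.
RELAY_GRAPH_STAGES = (
    "orig",
    "prepared",
    "annotated",
    "partitioned",
    "import",
    "boundary",
    "optimized",
    "wrapper",
)


def relay_graph_status(artifacts_folder: str) -> List[Tuple[str, bool]]:
    """Report, per graph dump, whether it exists in the tempDir directory."""
    temp_dir = Path(artifacts_folder) / "tempDir"
    names = (f"relay_graph.{stage}.txt" for stage in RELAY_GRAPH_STAGES)
    return [(name, (temp_dir / name).is_file()) for name in names]
```

Printing the result, for example with `for name, ok in relay_graph_status("artifacts"): print(name, ok)`, gives a one-line-per-file overview; the folder name "artifacts" is only a placeholder.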
2.4.3. Imported TIDL artifacts
TIDL subgraphs are imported into TIDL artifacts in TIDL-specific formats. These artifacts are embedded into
the .so fat binary in the deployable module. The TVM runtime retrieves TIDL artifacts
and invokes the TIDL runtime at inference time.
relay.gv.svg: A graphical view of the whole network and where the TIDL subgraphs are located.
subgraph<n>_net.bin.html: A graphical view of TIDL subgraphs.
2.4.4. Generated C7x code
When advanced_options:c7x_codegen is set to 1 in the compilation script, TI TVM generates C7x code for layers
not offloaded to TIDL. This C7x code is compiled and embedded into the .so fat binary in
the deployable module. The TVM runtime retrieves the C7x code and dispatches it to the C7x
for execution.
model_<n>.c: Contains generated code to run either a TIDL subgraph or non-TIDL layers.
2.5. Debugging Compilation
TI TVM uses the TIDL_RELAY_IMPORT_DEBUG environment variable to help debug the TVM+TIDL compilation flow. The available settings are as follows.
2.5.1. TIDL_RELAY_IMPORT_DEBUG=1
When set to 1, verbose output at the terminal provides more information about the TIDL import. This includes whether a node is supported by TIDL, Relay node to TIDL node conversion, imported TIDL subgraphs, optimized TIDL subgraphs, and calibration processes. For example:
RelayImportDebug: In TIDL_relayImportNode:
RelayImportDebug: node name: 185, op name: tidl.conv2d, num_args: 2
RelayImportDebug: args[0] dims: [1, 512, 7, 7]
RelayImportDebug: args[1] dims: [512, 512, 3, 3]
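If you capture this output to a file, a small parser can summarize which nodes were imported and with what tensor shapes. The sketch below infers the line format solely from the sample above, so the regular expressions are an assumption and may need adjusting for other versions or other message types; the function name is hypothetical.

```python
import re
from typing import Dict, List

# Patterns inferred from the sample output; not an official log format.
NODE_RE = re.compile(
    r"RelayImportDebug: node name: (\S+), op name: (\S+), num_args: (\d+)"
)
ARGS_RE = re.compile(r"RelayImportDebug: args\[(\d+)\] dims: \[([^\]]*)\]")


def parse_import_debug(log_text: str) -> List[Dict]:
    """Collect node records and attach each args[...] line to the latest node."""
    nodes: List[Dict] = []
    for line in log_text.splitlines():
        node = NODE_RE.search(line)
        if node:
            nodes.append(
                {
                    "name": node.group(1),
                    "op": node.group(2),
                    "num_args": int(node.group(3)),
                    "arg_dims": {},
                }
            )
            continue
        args = ARGS_RE.search(line)
        if args and nodes:
            dims = [int(d) for d in args.group(2).split(",") if d.strip()]
            nodes[-1]["arg_dims"][int(args.group(1))] = dims
    return nodes


sample = """RelayImportDebug: In TIDL_relayImportNode:
RelayImportDebug: node name: 185, op name: tidl.conv2d, num_args: 2
RelayImportDebug: args[0] dims: [1, 512, 7, 7]
RelayImportDebug: args[1] dims: [512, 512, 3, 3]
"""
nodes = parse_import_debug(sample)
```

Lines that match neither pattern, such as the "In TIDL_relayImportNode:" header, are ignored.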
2.5.2. TIDL_RELAY_IMPORT_DEBUG=2, 3
When set to 2 or 3, the output includes everything provided at setting 1, plus more verbose information about importing TIDL subgraphs.
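The variable must be set in the environment of the process that runs compilation. The sketch below builds such an environment and launches a compilation script with it; the helper names and the script path are hypothetical, and only the variable name and its values 1 to 3 come from this section.

```python
import os
import subprocess
import sys
from typing import Dict, Optional


def debug_env(level: int, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with TIDL_RELAY_IMPORT_DEBUG set."""
    if level not in (1, 2, 3):
        raise ValueError("TIDL_RELAY_IMPORT_DEBUG level must be 1, 2, or 3")
    env = dict(os.environ if base is None else base)
    env["TIDL_RELAY_IMPORT_DEBUG"] = str(level)
    return env


def run_compilation(script: str, level: int) -> int:
    """Run a compilation script (path supplied by the caller) with debug output.

    Returns the exit code of the child process.
    """
    result = subprocess.run(
        [sys.executable, script], env=debug_env(level), check=False
    )
    return result.returncode
```

From a shell, the equivalent is to prefix the command with the assignment, for example `TIDL_RELAY_IMPORT_DEBUG=2 python3 <your_compile_script>`.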