4. Recommended Development Flow
This section describes the recommended development flow for using TVM to compile a model and run inference with it.
4.1. Step 1: Model Selection
You may have already developed and trained a model. If you are instead using a model downloaded from a public source, we recommend looking at TI EdgeAI ModelZoo first. TI EdgeAI ModelZoo contains models that have been tweaked and optimized for inference speed on TI SoCs.
4.2. Step 2: Compile with c7x_codegen=0
Adapt the example compilation scripts in TI edgeai-tidl-tools for use with your model. Additional examples are provided in the TVM Git repository.
See the Compilation Explained section for more about the compilation process and compiled artifacts.
When troubleshooting and optimizing, check the following:
Have all layers been offloaded to TIDL?
If not, which layers are not offloaded?
How many TIDL subgraphs are there?
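The three questions above can be answered mechanically once you know where each layer runs. The following is a minimal sketch, not part of TVM or TIDL: the function name `summarize_offload` and the `(layer_name, target)` input format are hypothetical, and you would need to fill the list yourself from the compilation output. It also assumes a linear model, so each run of consecutive TIDL layers counts as one subgraph.

```python
from collections import Counter


def summarize_offload(assignments):
    """Summarize where each layer runs.

    `assignments` is a hypothetical list of (layer_name, target) pairs in
    execution order, where target is "tidl" or "arm". Consecutive "tidl"
    layers are counted as one subgraph (a simplification for linear models).
    """
    counts = Counter(target for _, target in assignments)
    not_offloaded = [name for name, target in assignments if target != "tidl"]

    subgraphs = 0
    previous = None
    for _, target in assignments:
        # A new subgraph starts whenever we enter a run of TIDL layers.
        if target == "tidl" and previous != "tidl":
            subgraphs += 1
        previous = target

    return {
        "total_layers": len(assignments),
        "offloaded": counts["tidl"],
        "not_offloaded": not_offloaded,
        "tidl_subgraphs": subgraphs,
    }


# Illustrative data only; layer names are made up.
example = [
    ("conv1", "tidl"),
    ("relu1", "tidl"),
    ("custom_op", "arm"),
    ("conv2", "tidl"),
    ("softmax", "tidl"),
]
report = summarize_offload(example)
print(report)
```

In this made-up example, one unsupported layer in the middle of the model splits the TIDL portion into two subgraphs, which is the situation the checklist is meant to expose.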
4.3. Step 3: Inference and Performance Profiling
Adapt the example inference scripts in TI edgeai-tidl-tools for your model. Additional examples are provided in the TVM Git repository.
First, get the compiled model artifacts (TVM deployable module) to run on the EVM. Then check to make sure the inference results match the expected outputs for the given inputs.
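One way to check that results match is to compare the EVM output against a reference output element by element, and to confirm that the top-scoring class agrees. The sketch below uses only the Python standard library. The helper names `outputs_match` and `top1` are hypothetical, the sample values are illustrative, and the tolerances are assumptions that you should choose for your own model (quantized models typically need looser tolerances than floating-point ones).

```python
import math


def outputs_match(actual, expected, rel_tol=1e-2, abs_tol=1e-5):
    """Return (ok, max_abs_diff) for two flat sequences of floats.

    The default tolerances are assumptions, not recommended values.
    """
    if len(actual) != len(expected):
        return False, math.inf
    pairs = list(zip(actual, expected))
    max_abs_diff = max((abs(a - e) for a, e in pairs), default=0.0)
    ok = all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol) for a, e in pairs
    )
    return ok, max_abs_diff


def top1(scores):
    """Index of the highest score, e.g. the predicted class."""
    return max(range(len(scores)), key=scores.__getitem__)


# Illustrative values only: a reference output and an output from the EVM.
reference = [0.10, 0.70, 0.20]
evm_output = [0.1001, 0.6998, 0.2001]

ok, max_diff = outputs_match(evm_output, reference)
same_class = top1(evm_output) == top1(reference)
print(ok, same_class)
```

For classification models, agreement on the top class is often the more meaningful check; the element-wise comparison helps catch larger numerical drift.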
After the model is running correctly on the EVM, use the performance profiling method described in the Inference Explained section to see if performance matches expectations.
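As a quick complement to the profiling method in the Inference Explained section (and not a replacement for it), end-to-end latency can be estimated with a simple wall-clock harness. This is a generic sketch: `time_inference` is a hypothetical helper, and `fake_inference` is a stand-in for whatever function runs one inference with your deployed module.

```python
import statistics
import time


def time_inference(run_once, warmup=3, repeats=20):
    """Time a zero-argument callable and return latency statistics in ms."""
    # Warm-up runs are excluded so one-time setup costs do not skew results.
    for _ in range(warmup):
        run_once()

    samples_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "runs": repeats,
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
    }


def fake_inference():
    """Placeholder workload; replace with a call that runs your model once."""
    sum(i * i for i in range(10_000))


stats = time_inference(fake_inference)
print(stats)
```

Wall-clock timing includes Arm-side overhead and data transfers, so it shows total latency but not where the time is spent.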
4.4. Step 4: Compile with c7x_codegen=1
If the model contains layers that TIDL does not support, you can also try running them on the C7x. Running these layers on the C7x can help avoid the overhead of switching between TIDL subgraphs on the C7x and layers on Arm. It can also lead to better performance, using either TVM auto-generated C7x code or user-written C7x code for the TIDL-unsupported layers.
4.5. Step 5: Inference and Performance Profiling
Once the model is compiled successfully with c7x_codegen=1, run it on the EVM and check to make sure the inference results still match the expected outputs for the given inputs.
After the model is running correctly on the EVM, use the performance profiling method described in the Inference Explained section to see if performance matches expectations.
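At this point you have results for both the c7x_codegen=0 and c7x_codegen=1 builds, and it can help to compare them side by side. The sketch below is one possible decision rule, not a TI recommendation: `pick_build`, the result dictionary layout, and the 5% minimum gain are all assumptions, and the latency numbers are illustrative rather than measurements.

```python
def pick_build(results, min_gain=0.05):
    """Choose between builds keyed by their c7x_codegen value (0 or 1).

    `results` maps each value to {"outputs_ok": bool, "median_ms": float}.
    A build with incorrect outputs is never chosen. The c7x_codegen=1 build
    is preferred only if it is faster by at least `min_gain` (an assumption).
    """
    valid = {k: v for k, v in results.items() if v["outputs_ok"]}
    if not valid:
        return None
    if 0 in valid and 1 in valid:
        speedup = valid[0]["median_ms"] / valid[1]["median_ms"]
        return 1 if speedup >= 1.0 + min_gain else 0
    # Only one build produced correct outputs.
    return next(iter(valid))


# Illustrative numbers only.
results = {
    0: {"outputs_ok": True, "median_ms": 12.0},
    1: {"outputs_ok": True, "median_ms": 9.0},
}
choice = pick_build(results)
print(choice)
```

Correctness is checked before speed so that a faster build with mismatched outputs is never selected.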
4.6. Step 6: Performance Tuning
If the performance of the TIDL-unsupported layers does not match expectations, try the following:
Work around issues by rewriting Relay IR code.
Optimize the C7x code (either TVM-generated or user-written).
See the Extending TVM section for examples. Feedback on the TI E2E forum is welcome (see the Support section).