11. Release Notes
Software Manifest
TI_MCU_2.1.1
Supported devices
F28P55x and C28x core based devices
F29H85x and C29x core based devices
AM13x and Arm Cortex-M33 core based devices
AM26x and Arm Cortex-R5 core based devices
MSPM0G5187 and Arm Cortex-M0+ core based devices
MSPM33C321A and Arm Cortex-M33 core based devices
CC2745 device and Arm Cortex-M33 core based devices
Added a model memory usage summary and layer offloading summary to compiler output.
Optimized NPU setup time for the F28P55x device, resulting in improved overall performance.
Updated the “skip_normalize” option to extract input normalization scale values as floats instead of integers.
Resolved defects
CODEGEN-14807: TI MCU NNC crashes when the NHWC-only padding optimization is applied to an NCHW data layout
CODEGEN-14838: TI MCU NNC crashes when a nonexistent attribute is accessed on a node
CODEGEN-14981: TVM ONNX frontend fails with ONNX 1.20
TI_MCU_2.1.0
Supported devices
F28P55x and C28x core based devices
F29H85x and C29x core based devices
CC2745 device and Arm Cortex-M33 core based devices
Supported all layer configs with 8-bit activations and 8-/4-/2-bit weights that can be offloaded to TI-NPU
Supported all layer configs with 8-bit activations and 8-bit weights that can be accelerated using the M33 Custom Datapath Extension (CDE).
Improved quantized (8-bit) model inference performance on C29x core. Note that floating-point models typically run faster than their quantized counterparts on C29x core, despite requiring a larger memory footprint.
TI_MCU_2.0.0
Supported devices
Expanded layer configs that can be supported on TI-NPU
Added an option to compress TI-NPU layer data
Optimized floating point inference on C29x core
Supported quantized models in QDQ format with integer inference code on CPU
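In QDQ format, quantization is expressed explicitly in the model graph as QuantizeLinear/DequantizeLinear node pairs around tensors. As an illustrative sketch of what those two ONNX operators compute (plain NumPy, not TI tooling), for signed 8-bit per-tensor quantization:

```python
import numpy as np

def quantize_linear(x, scale, zero_point):
    # ONNX QuantizeLinear: saturate(round(x / scale) + zero_point) to int8.
    # np.round uses round-half-to-even, matching the ONNX spec.
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_linear(q, scale, zero_point):
    # ONNX DequantizeLinear: (q - zero_point) * scale, back to float.
    return (q.astype(np.int32) - zero_point) * scale

# Round-trip a small tensor through the Q/DQ pair.
x = np.array([0.0, 0.5, -0.25], dtype=np.float32)
q = quantize_linear(x, scale=0.25, zero_point=0)
xr = dequantize_linear(q, scale=0.25, zero_point=0)
```

A compiler consuming a QDQ model reads the scale and zero-point from these node pairs to emit integer inference code, rather than relying on out-of-band quantization metadata.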
TI_MCU_1.3.0
Windows and Linux support
Multiple model support in the same application
Generic padding support
Optimized performance for motor fault classification models and arc fault detection models
With TI-NPU acceleration, these models saw a 1.2x to 1.5x speedup over the TI_MCU_1.2.0 release
With software-only execution on C28, these models saw a 3.9x to 4.8x speedup over the TI_MCU_1.2.0 release
Added example and documentation on how to apply TI-NPU quantization to an existing PyTorch training script
Simplified command line options
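The TI-NPU quantization flow itself is covered by the toolkit's own example and documentation; purely as a generic illustration of the pattern of retrofitting quantization-aware training onto an existing PyTorch script (using PyTorch's eager-mode torch.ao.quantization API, not TI's), the change amounts to wrapping the model and converting after training. The model and training loop below are hypothetical stand-ins:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (
    QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat, convert,
)

class SmallNet(nn.Module):
    """Stand-in for an existing float model, with quant/dequant stubs added."""
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # quantizes the float input
        self.conv = nn.Conv2d(1, 4, 3)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(4 * 6 * 6, 2)
        self.dequant = DeQuantStub()  # dequantizes the output

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        x = self.fc(torch.flatten(x, 1))
        return self.dequant(x)

# Pick a quantized backend available in this PyTorch build.
backend = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"
torch.backends.quantized.engine = backend

model = SmallNet()
model.train()
model.qconfig = get_default_qat_qconfig(backend)
prepare_qat(model, inplace=True)  # inserts fake-quant observers

# The existing training loop runs unchanged on the prepared model.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(2):
    x = torch.randn(8, 1, 8, 8)
    loss = model(x).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

model.eval()
quantized = convert(model)  # produces an int8 model
out = quantized(torch.randn(1, 1, 8, 8))
```

The key property is that the training loop itself is untouched; only model construction (stubs, qconfig, prepare) and post-training conversion are added.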
TI_MCU_1.2.0
Supported devices
F28P55x and C28x core based devices
Supported models
Motor fault classification
Arc fault detection