TI Deep Learning Product User Guide
TIDL-RT: Meta Architectures Support

Introduction

TIDL-RT supports various base feature extractors / backbone networks such as ResNets, MobileNets, EfficientNets, ShuffleNets, VGG, DenseNet, etc. In addition to these backbone networks, TIDL-RT also supports the following post processing meta architectures for object detection:

  • Single Shot Detection (SSD)
  • You Only Look Once (YOLO) V3 and V5 Architecture
  • RetinaNet Architecture

A. Single Shot Detection (SSD)

A.1 Caffe

TIDL-RT supports SSD networks and post processing layers as defined in the Caffe-SSD implementation by the original SSD authors. To provide this information to TIDL-RT via the import configuration file:

  • Set metaArchType = 0 (TIDL_metaArchCaffeJacinto)
  • Set inputNetFile and inputParamsFile to point to the Caffe prototxt and caffemodel files containing the post processing information
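
  A minimal import configuration fragment for this case might look as follows. The file paths here are placeholders for illustration only, not shipped models:

      # Hypothetical paths - replace with your own Caffe-SSD model files
      metaArchType    = 0     # TIDL_metaArchCaffeJacinto
      inputNetFile    = ../caffe/my_ssd_deploy.prototxt
      inputParamsFile = ../caffe/my_ssd.caffemodel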

The following models and configurations can be used as reference:

  • JdetNet512x512:
    • Model Link
    • Import config file : ti_dl/test/testvecs/config/import/public/caffe/tidl_import_jdetNet.txt
  • Pelee Pascal VOC 304x304:
    • Model Link
    • Import config file : ti_dl/test/testvecs/config/import/public/caffe/tidl_import_peeleNet.txt

A.2 TensorFlow/TFLite

TIDL-RT supports SSD post processing as defined in the TensorFlow Object Detection API. To provide this information to TIDL-RT via the import configuration file:

  • Set metaArchType = 1 (TIDL_metaArchTFSSD)
  • List all the box and class prediction heads in outDataNamesList
  • Set metaLayersNamesList to point to the corresponding pipeline config file
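
  The three settings above could be combined in an import configuration fragment like the one below. The head tensor names and the pipeline config path are hypothetical examples; use the actual box/class prediction tensors of your model:

      metaArchType        = 1     # TIDL_metaArchTFSSD
      # Hypothetical head names - list every box and class prediction head of your model
      outDataNamesList    = "BoxPredictor_0/BoxEncodingPredictor/BiasAdd, BoxPredictor_0/ClassPredictor/BiasAdd"
      metaLayersNamesList = ../tensorflow/my_ssd_pipeline.config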

The following model and configuration can be used as reference:

  • ssd_mobilenet_v2:

    • Model Link
    • Import Config file : ti_dl/test/testvecs/config/import/public/tensorflow/tidl_import_mobileNetv2_ssd.txt
    • Pipeline Config file : ti_dl/test/testvecs/config/import/public/mobilenet_ssd_pipeline.config

    Note: If the user is not using the TensorFlow Object Detection API but is using SSD post processing as defined by the original author, we recommend the method described in the next section.

A.3 ONNX

TIDL-RT supports SSD post processing in the ONNX model format. To enable this, TIDL-RT defines a protocol buffer format for providing the SSD post processing information as defined by the original SSD author. The protocol buffer definition is available in the following file:

├── ti_dl                             # Base Directory
│   ├── utils
│   │   ├── tidlMetaArch/tidl_meta_arch.proto

To provide this information to TIDL-RT via the import configuration file:

  • Set metaArchType = 3 (TIDL_metaArchTIDLSSD)
  • List the tensor names of the box and class prediction heads, as in the original model, in the prototxt file. An example is given in the "Example TIDL Proto File for Custom SSD network" section below
  • Set metaLayersNamesList to point to the prototxt file
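
  A minimal import configuration fragment for this flow might look as follows. The prototxt path is a placeholder; the file itself must follow the tidl_meta_arch.proto definition shown above:

      metaArchType        = 3     # TIDL_metaArchTIDLSSD
      # Hypothetical path - prototxt follows ti_dl/utils/tidlMetaArch/tidl_meta_arch.proto
      metaLayersNamesList = ../onnx/my_ssd_metaarch.prototxt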

The TIDL-RT model import tool will then construct the complete network with the Flatten, Concatenate and OD post processing layers. This mechanism has been validated with models trained in PyTorch and exported to ONNX; the object detection demo in the SDK uses this flow. The following model and configuration can be used as reference:

  • MLPerf ssd-resnet34:
    • Model link
    • Import Config file: ti_dl/test/testvecs/config/import/public/onnx/tidl_import_mlperf_resnet34_ssd.txt

B. YOLO Architecture

TIDL-RT supports the YOLO architecture (V3 and V5) for object detection post processing. This architecture takes the post processing information in the same form as described in the ONNX-SSD section. To provide this information to TIDL-RT via the import configuration file:

  • Set metaArchType = 4 (TIDL_metaArchTIDLYolo) for the V3 architecture or metaArchType = 6 (TIDL_metaArchTIDLYoloV5) for the V5 architecture
  • List the tensor names of the box and class prediction heads, as in the original model, in the prototxt file
  • Set metaLayersNamesList to point to the prototxt file
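
  For a YOLOv3 model, a minimal import configuration fragment might look as follows (the prototxt path is a hypothetical placeholder):

      # Use metaArchType = 6 (TIDL_metaArchTIDLYoloV5) for a YOLOv5 model
      metaArchType        = 4     # TIDL_metaArchTIDLYolo (YOLOv3)
      metaLayersNamesList = ../onnx/my_yolo_metaarch.prototxt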

The following import configuration files can be used as reference:

  • YoloV3 model:
    • Model link
    • Import Config file: ti_dl/test/testvecs/config/import/public/onnx/tidl_import_yolo3.txt
    • Protocol Buffer file: ti_dl/test/testvecs/config/import/public/onnx/tidl_import_yolo3_metaarch.prototxt

C. RetinaNet Architecture

TIDL-RT supports the RetinaNet architecture for object detection post processing. This architecture takes the post processing information in the same form as described in the ONNX-SSD section. To provide this information to TIDL-RT via the import configuration file:

  • Set metaArchType = 5 (TIDL_metaArchTIDLRetinaNet)
  • List the tensor names of the box and class prediction heads, as in the original model, in the prototxt file
  • Set metaLayersNamesList to point to the prototxt file
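
  A minimal import configuration fragment for RetinaNet might look as follows (the prototxt path is a hypothetical placeholder):

      metaArchType        = 5     # TIDL_metaArchTIDLRetinaNet
      metaLayersNamesList = ../onnx/my_retinanet_metaarch.prototxt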

Performance of the SSD Post Processing Layer

  • The optimized implementation of SSD post processing (box decoding, score computation, non-maximum suppression) targets generic models (any number of classes, prior boxes, heads, etc.).
  • Currently only the 8-bit Caffe-SSD / TIDL-SSD post processing is optimized. The TensorFlow Object Detection API post processing and the 16-bit versions are provided for feature completeness and are not optimized.
  • It is recommended to write an optimized version of the post processing for a given configuration. If the number of classes and prior boxes are known upfront, this post processing can be optimized well on the C66x DSP or A72, leaving the C7x-MMA free for the compute-heavy layers (convolutions, pooling, etc.)
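
As a reference for such a hand-written implementation, the box decoding step can be sketched as below. This is a minimal Python sketch of the standard Caffe-SSD CENTER_SIZE decode (matching the code_type and variance fields used in the example proto file at the end of this document), not TIDL source code:

```python
import math

def decode_center_size(prior, loc, variance=(0.1, 0.1, 0.2, 0.2)):
    """Decode one SSD box prediction (code_type CENTER_SIZE).

    prior: (xmin, ymin, xmax, ymax) of the prior box
    loc:   (tx, ty, tw, th) raw outputs of the box prediction head
    Returns the decoded (xmin, ymin, xmax, ymax).
    """
    # Prior box width/height and center
    pw = prior[2] - prior[0]
    ph = prior[3] - prior[1]
    pcx = prior[0] + 0.5 * pw
    pcy = prior[1] + 0.5 * ph
    # Center offsets and size scales are weighted by the variance
    # values from prior_box_param
    cx = pcx + loc[0] * variance[0] * pw
    cy = pcy + loc[1] * variance[1] * ph
    w = pw * math.exp(loc[2] * variance[2])
    h = ph * math.exp(loc[3] * variance[3])
    # Convert center/size back to corner form
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)
```

With all-zero offsets the decoded box equals the prior box, which is a quick sanity check for an optimized reimplementation.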

Example TIDL Proto File for Custom SSD network

In the example below, for box_input: "376", "376" is the output tensor name of the convolution layer with the box/location prediction. in_width and in_height are the base image resolution; these must match the width and height parameters set in the import config file. All other parameters are as defined by the original Caffe-SSD implementation.

name: "TIAD SSD ARCH"
caffe_ssd {
  name: "ssd_post_proc"
  box_input: "376"
  box_input: "380"
  box_input: "384"
  box_input: "388"
  box_input: "392"
  box_input: "396"
  class_input: "378"
  class_input: "382"
  class_input: "386"
  class_input: "390"
  class_input: "394"
  class_input: "398"
  output: "psd_bboxes"
  in_width: 768
  in_height: 384
  prior_box_param {
    min_size: 46.1
    max_size: 113.7
    aspect_ratio: 3.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
    step: 16
  }
  prior_box_param {
    min_size: 113.7
    max_size: 181.2
    aspect_ratio: 3.0
    aspect_ratio: 5.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
    step: 32
  }
  prior_box_param {
    min_size: 181.2
    max_size: 248.8
    aspect_ratio: 3.0
    aspect_ratio: 5.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
    step: 64
  }
  prior_box_param {
    min_size: 248.8
    max_size: 316.4
    aspect_ratio: 3.0
    aspect_ratio: 5.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
    step: 128
  }
  prior_box_param {
    min_size: 316.4
    max_size: 384.0
    aspect_ratio: 3.0
    aspect_ratio: 5.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
    step_w: 256
    step_h: 192
  }
  prior_box_param {
    min_size: 384.0
    max_size: 768.0
    aspect_ratio: 3.0
    aspect_ratio: 5.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
    step: 384
  }
  detection_output_param {
    num_classes: 4
    share_location: true
    background_label_id: 0
    nms_param {
      nms_threshold: 0.60
      top_k: 100
    }
    code_type: CENTER_SIZE
    keep_top_k: 100
    confidence_threshold: 0.5
  }
}