6. Layer Configurations Accelerated by Arm Cortex-M33 CDE
Layers in neural network models trained for the NPU can be mapped to the Cortex-M33 CDE instruction support for machine-learning acceleration. This page lists the layer types and configurations that are accelerated by the M33 CDE. Layer types and configurations not listed here can still run on the Arm Cortex-M33, but they are not accelerated by the CDE.
6.1. Overview of Layer Types
Generic Convolution layer (GCONV; not depth-wise, not point-wise; input feature map channel count is a multiple of 4)
Depth-Wise Convolution layer (DWCONV)
Point-Wise Convolution layer (PWCONV)
Point-Wise Convolution with Residual input layer (PWCONVRES)
Transposed Convolution layer (TCONV)
Fully-Connected layer (FC)
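For intuition, the convolution categories above can be distinguished from a layer's kernel shape and grouping. The following is an illustrative sketch, not part of the CDE specification; the function name and parameters are assumptions made for this example:

```python
# Illustrative only (not an Arm API): classify a 2-D convolution layer
# into the categories listed above, based on kernel shape, grouping,
# and whether it is a transposed convolution.
def classify_conv(kH, kW, iC, groups=1, transposed=False):
    """Return the layer category name for a 2-D convolution."""
    if transposed:
        return "TCONV"
    if groups == iC:            # one filter per input channel
        return "DWCONV"
    if kH == 1 and kW == 1:     # 1x1 kernel
        return "PWCONV"
    return "GCONV"

print(classify_conv(3, 3, 64))             # GCONV
print(classify_conv(3, 3, 64, groups=64))  # DWCONV
print(classify_conv(1, 1, 64))             # PWCONV
```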
6.2. Terminology and Notation
On the NPU, a layer computation takes an input feature map, computes with weights such as a convolution kernel, and produces an output feature map.
In the tables below, the column headings indicate the following:
ifmap: input feature map, also known as input tensor
ofmap: output feature map, also known as output tensor
kernel: convolution weights matrix, also known as filter
iB, iH, iW, iC: input feature map bit-width, height, width, channels
oB, oH, oW, oC: output feature map bit-width, height, width, channels
kB, kH, kW: kernel (or pool) bit-width, height, width
sH, sW: stride on height, width
pL, pR, pT, pB: padding of the input feature map on the left, right, top, bottom. Padding of the input feature map is generally supported. A non-zero value in a layer configuration row means that the padding is handled by the layer implementation on the NPU; a zero value means that padding (if any) is handled separately, outside of the layer implementation.
In the tables below, the values in the rows below the headings indicate the following:
any: any positive integer value
m4: multiples of 4, e.g., 4, 8, 12, …
m5: multiples of 5, e.g., 5, 10, 15, …
m8b16: multiples of 8, beginning at 16 (inclusive), e.g., 16, 24, 32, …
m1b69e72: multiples of 1, beginning at 69 and ending at 72 (inclusive on both ends)
NA: not applicable
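For readers who want to validate shapes programmatically, the value notations above can be written as simple predicates. A minimal Python sketch; the function names are illustrative, not part of any Arm API:

```python
# Minimal sketch (not part of the specification): the value notations
# above expressed as Python predicates.
def is_any(v):
    return isinstance(v, int) and v >= 1   # "any": any positive integer

def is_m4(v):
    return is_any(v) and v % 4 == 0        # "m4": multiples of 4

def is_m5(v):
    return is_any(v) and v % 5 == 0        # "m5": multiples of 5

def is_m8b16(v):
    return v >= 16 and v % 8 == 0          # "m8b16": multiples of 8, from 16

def is_m1b69e72(v):
    return 69 <= v <= 72                   # "m1b69e72": 69..72 inclusive
```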
6.3. GCONV

| iB | oB | kB | kH | kW | sH | sW | iH | iW | iC | oH | oW | oC | pL | pR | pT | pB | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 8 | 8 | any | 1 | any | 1 | any | any | any | any | any | m4 | 0 | 0 | 0 | 0 |  |
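As an illustration, the single GCONV row above can be encoded as a support check. `gconv_supported` is a hypothetical helper written for this page, not an Arm API:

```python
# Hypothetical helper (not an Arm API): check a layer configuration
# against the GCONV row above. iH, iW, iC, oH, and oW are "any", so
# they are omitted; padding must be zero (handled outside the layer).
def gconv_supported(iB, oB, kB, kH, kW, sH, sW, oC, pL=0, pR=0, pT=0, pB=0):
    return (iB == oB == kB == 8             # 8-bit ifmap, ofmap, kernel
            and kW == 1 and sW == 1         # kW and sW fixed at 1
            and kH >= 1 and sH >= 1         # "any" positive value
            and oC > 0 and oC % 4 == 0      # oC must be m4
            and pL == pR == pT == pB == 0)  # padding handled outside

print(gconv_supported(8, 8, 8, kH=3, kW=1, sH=1, sW=1, oC=16))  # True
print(gconv_supported(8, 8, 8, kH=3, kW=3, sH=1, sW=1, oC=16))  # False
```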
6.4. DWCONV

| iB | oB | kB | kH | kW | sH | sW | iH | iW | iC | oH | oW | oC | pL | pR | pT | pB | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 8 | 8 | any | any | any | any | any | any | m4 | any | any | m4 | 0 | 0 | 0 | 0 |  |
6.5. PWCONV

| iB | oB | kB | kH | kW | sH | sW | iH | iW | iC | oH | oW | oC | pL | pR | pT | pB | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 8 | 8 | 1 | 1 | 1 | 1 | any | any | m4 | any | any | m4 | 0 | 0 | 0 | 0 |  |
6.6. PWCONVRES

| iB | oB | kB | kH | kW | sH | sW | iH | iW | iC | oH | oW | oC | pL | pR | pT | pB | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 8 | 8 | 1 | 1 | 1 | 1 | any | any | m4 | any | any | m4 | 0 | 0 | 0 | 0 |  |
6.7. TCONV

| iB | oB | kB | kH | kW | sH | sW | iH | iW | iC | oH | oW | oC | pL | pR | pT | pB | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 8 | 8 | any | any | kH | kW | any | any | m4 | any | any | m4 | 0 | 0 | 0 | 0 |  |
6.8. FC

| iB | oB | kB | kH | kW | sH | sW | iH | iW | iC | oH | oW | oC | pL | pR | pT | pB | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 8 | 8 | NA | NA | NA | NA | NA | NA | any | NA | NA | any | NA | NA | NA | NA |  |
6.9. AVGPOOL (non-global)

Non-global AVGPOOL layers are converted to DWCONV layers during compilation; refer to the DWCONV table for the supported configurations.
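The conversion works because an average pool is equivalent to a depth-wise convolution whose kernel weights are all 1/(kH*kW). A pure-Python sketch for a single channel, written for intuition only (this is not the NPU or compiler implementation):

```python
# Sketch: non-global average pooling expressed as a depth-wise
# convolution with uniform weights 1/(kH*kW), shown for one channel,
# stride (sH, sW), no padding. Illustrative only.
def avgpool_as_dwconv(x, kH, kW, sH, sW):
    iH, iW = len(x), len(x[0])
    w = 1.0 / (kH * kW)                  # uniform kernel weight
    oH = (iH - kH) // sH + 1
    oW = (iW - kW) // sW + 1
    return [[sum(x[r * sH + i][c * sW + j] * w
                 for i in range(kH) for j in range(kW))
             for c in range(oW)]
            for r in range(oH)]

print(avgpool_as_dwconv([[1, 2], [3, 4]], 2, 2, 1, 1))  # [[2.5]]
```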