6. Layer Configurations Accelerated by Arm Cortex-M33 CDE

Layers in neural network models trained for the NPU can be mapped to the Arm Cortex-M33 CDE instructions for machine-learning acceleration. This page lists the layer types and configurations that are accelerated by the M33 CDE. Layer types and configurations not listed here can still run on the Arm Cortex-M33, but they will not be accelerated by the CDE.

6.1. Overview of Layer Types

  • Generic Convolution layer (GCONV; not depth-wise, not point-wise; the input feature map channel count is a multiple of 4)

  • Depth-Wise Convolution layer (DWCONV)

  • Point-Wise Convolution layer (PWCONV)

  • Point-Wise Convolution with Residual input layer (PWCONVRES)

  • Transposed Convolution layer (TCONV)

  • Fully-Connected layer (FC)

6.2. Terminology and Notation

A layer computation takes an input feature map, applies weights such as a convolution kernel, and produces an output feature map.

[Figure: convolution terminology]

In the tables below, the column headings indicate the following:

  • ifmap: input feature map, also known as input tensor

  • ofmap: output feature map, also known as output tensor

  • kernel: convolution weights matrix, also known as filter

  • iB, iH, iW, iC: input feature map bit-width, height, width, channels

  • oB, oH, oW, oC: output feature map bit-width, height, width, channels

  • kB, kH, kW: kernel (or pool) bit-width, height, width

  • sH, sW: stride on height, width

  • pL, pR, pT, pB: padding of the input feature map on the left, right, top, and bottom. In general, padding of the input feature map is supported. Non-zero values in a layer configuration row mean that the specified padding is handled within the accelerated layer implementation; otherwise, padding (if any) is handled separately, outside of the layer implementation.

In the tables below, the values in the rows beneath the headings indicate the following (a short sketch illustrating this shorthand follows the list):

  • any: any positive integer value

  • m4: multiples of 4, e.g., 4, 8, 12, …

  • m5: multiples of 5, e.g., 5, 10, 15, …

  • m8b16: multiples of 8, beginning at 16 (inclusive), e.g., 16, 24, 32, …

  • m1b69e72: multiples of 1, beginning at 69 and ending at 72 (inclusive on both ends), i.e., 69, 70, 71, or 72

  • NA: not applicable
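
As an aside, the shorthand above can be read mechanically. The Python sketch below is a hypothetical helper (not part of any tool shipped for the NPU or the CDE support) that checks a dimension value against one of these tokens:

```python
import re


def satisfies(value: int, constraint: str) -> bool:
    """Check a dimension value against the shorthand used in the tables below.

    Hypothetical helper, for illustration only. Tokens handled:
      'any'     -> any positive integer
      'NA'      -> not applicable (never matches a concrete value)
      'mK'      -> multiple of K, e.g. 'm4'
      'mKbB'    -> multiple of K, beginning at B inclusive, e.g. 'm8b16'
      'mKbBeE'  -> multiple of K between B and E inclusive, e.g. 'm1b69e72'
    """
    if constraint == "any":
        return value > 0
    if constraint == "NA":
        return False
    match = re.fullmatch(r"m(\d+)(?:b(\d+))?(?:e(\d+))?", constraint)
    if match is None:
        raise ValueError(f"unknown constraint token: {constraint}")
    step = int(match.group(1))
    begin = int(match.group(2)) if match.group(2) else step
    end = int(match.group(3)) if match.group(3) else None
    if value % step != 0 or value < begin:
        return False
    return end is None or value <= end


# Reading the notation above:
assert satisfies(12, "m4")           # 4, 8, 12, ...
assert satisfies(24, "m8b16")        # 16, 24, 32, ...
assert not satisfies(8, "m8b16")     # multiple of 8, but below 16
assert satisfies(70, "m1b69e72")     # 69 through 72 inclusive
```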

6.3. GCONV

| iB | oB | kB | kH  | kW | sH  | sW | iH  | iW  | iC  | oH  | oW  | oC | pL | pR | pT | pB | comment |
|----|----|----|-----|----|-----|----|-----|-----|-----|-----|-----|----|----|----|----|----|---------|
| 8  | 8  | 8  | any | 1  | any | 1  | any | any | any | any | any | m4 | 0  | 0  | 0  | 0  |         |
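
For illustration, a convolution matching this row might be written as the Keras sketch below. TensorFlow is assumed; the concrete sizes are placeholders chosen to satisfy the constraints, and the 8-bit widths would come from int8 quantization of the deployed model, which is not shown here.

```python
import tensorflow as tf

# Generic convolution candidate (per the GCONV row): kW = 1, sW = 1,
# output channels a multiple of 4, no padding. Sizes are placeholders.
x = tf.keras.Input(shape=(32, 32, 8))           # iH, iW, iC illustrative
y = tf.keras.layers.Conv2D(filters=8,           # oC: multiple of 4
                           kernel_size=(3, 1),  # kH: any, kW = 1
                           strides=(2, 1),      # sH: any, sW = 1
                           padding="valid")(x)  # pL = pR = pT = pB = 0
```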

6.4. DWCONV

| iB | oB | kB | kH  | kW  | sH  | sW  | iH  | iW  | iC | oH  | oW  | oC | pL | pR | pT | pB | comment |
|----|----|----|-----|-----|-----|-----|-----|-----|----|-----|-----|----|----|----|----|----|---------|
| 8  | 8  | 8  | any | any | any | any | any | any | m4 | any | any | m4 | 0  | 0  | 0  | 0  |         |
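
A Keras sketch of a depth-wise convolution that fits this row (placeholder sizes; int8 quantization of the deployed model is assumed but not shown):

```python
import tensorflow as tf

# Depth-wise convolution candidate (per the DWCONV row): input channels a
# multiple of 4, any kernel size and stride, no padding.
x = tf.keras.Input(shape=(32, 32, 8))                    # iC = 8, multiple of 4
y = tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3),  # kH, kW: any
                                    strides=(2, 2),      # sH, sW: any
                                    padding="valid")(x)  # zero padding
```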

6.5. PWCONV

| iB | oB | kB | kH | kW | sH | sW | iH  | iW  | iC | oH  | oW  | oC | pL | pR | pT | pB | comment |
|----|----|----|----|----|----|----|-----|-----|----|-----|-----|----|----|----|----|----|---------|
| 8  | 8  | 8  | 1  | 1  | 1  | 1  | any | any | m4 | any | any | m4 | 0  | 0  | 0  | 0  |         |
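
Similarly, a point-wise convolution fitting this row could look like the following sketch (placeholder sizes, int8 quantization not shown):

```python
import tensorflow as tf

# Point-wise (1x1) convolution candidate (per the PWCONV row): kernel and
# stride fixed at 1, channel counts multiples of 4, no padding.
x = tf.keras.Input(shape=(32, 32, 8))                      # iC = 8, multiple of 4
y = tf.keras.layers.Conv2D(filters=16, kernel_size=1,      # oC = 16, multiple of 4
                           strides=1, padding="valid")(x)
```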

6.6. PWCONVRES

| iB | oB | kB | kH | kW | sH | sW | iH  | iW  | iC | oH  | oW  | oC | pL | pR | pT | pB | comment |
|----|----|----|----|----|----|----|-----|-----|----|-----|-----|----|----|----|----|----|---------|
| 8  | 8  | 8  | 1  | 1  | 1  | 1  | any | any | m4 | any | any | m4 | 0  | 0  | 0  | 0  |         |
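
A sketch of a point-wise convolution whose output is added to a residual input, as this row describes (placeholder sizes, int8 quantization not shown):

```python
import tensorflow as tf

# Point-wise convolution with residual input (per the PWCONVRES row): the 1x1
# convolution result is added to a second tensor of the same shape.
x = tf.keras.Input(shape=(32, 32, 16))                     # iC multiple of 4
residual = tf.keras.Input(shape=(32, 32, 16))              # residual input
y = tf.keras.layers.Conv2D(filters=16, kernel_size=1,      # oC multiple of 4
                           strides=1, padding="valid")(x)
out = tf.keras.layers.Add()([y, residual])                 # residual addition
```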

6.7. TCONV

| iB | oB | kB | kH  | kW  | sH | sW | iH  | iW  | iC | oH  | oW  | oC | pL | pR | pT | pB | comment |
|----|----|----|-----|-----|----|----|-----|-----|----|-----|-----|----|----|----|----|----|---------|
| 8  | 8  | 8  | any | any | kH | kW | any | any | m4 | any | any | m4 | 0  | 0  | 0  | 0  |         |
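
A sketch of a transposed convolution fitting this row; note that the stride equals the kernel size (sH = kH, sW = kW). Sizes are placeholders and int8 quantization is not shown:

```python
import tensorflow as tf

# Transposed convolution candidate (per the TCONV row): stride equal to the
# kernel size, channel counts multiples of 4, no padding.
x = tf.keras.Input(shape=(16, 16, 8))                   # iC multiple of 4
y = tf.keras.layers.Conv2DTranspose(filters=8,          # oC multiple of 4
                                    kernel_size=(2, 2),
                                    strides=(2, 2),     # sH = kH, sW = kW
                                    padding="valid")(x)
```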

6.8. FC

| iB | oB | kB | kH | kW | sH | sW | iH | iW | iC  | oH | oW | oC  | pL | pR | pT | pB | comment |
|----|----|----|----|----|----|----|----|----|-----|----|----|-----|----|----|----|----|---------|
| 8  | 8  | 8  | NA | NA | NA | NA | NA | NA | any | NA | NA | any | NA | NA | NA | NA |         |
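
A sketch of a fully-connected layer fitting this row (placeholder sizes, int8 quantization not shown); only the channel dimensions apply, and the spatial, stride, and padding parameters are not applicable:

```python
import tensorflow as tf

# Fully-connected candidate (per the FC row): iC and oC can be any size.
x = tf.keras.Input(shape=(64,))         # iC: any
y = tf.keras.layers.Dense(units=10)(x)  # oC: any
```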

6.9. AVGPOOL (non-global)

Non-global AVGPOOL layers are converted to DWCONV layers during compilation. Refer to the DWCONV section for the supported configurations.
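
For intuition about this rewrite, the sketch below (an illustration under simple assumptions, not the compiler's actual transformation) shows that a kH x kW average pool produces the same result as a depth-wise convolution whose weights are all 1 / (kH * kW):

```python
import numpy as np
import tensorflow as tf

# A non-global average pool equals a depth-wise convolution with uniform
# weights 1 / (kH * kW), using the same window size and stride.
kH, kW, C = 2, 2, 4
x = tf.random.uniform((1, 8, 8, C))

pooled = tf.keras.layers.AveragePooling2D(pool_size=(kH, kW))(x)

dw = tf.keras.layers.DepthwiseConv2D(kernel_size=(kH, kW), strides=(kH, kW),
                                     use_bias=False)
dw.build(x.shape)
dw.set_weights([np.full((kH, kW, C, 1), 1.0 / (kH * kW), dtype=np.float32)])
as_dwconv = dw(x)

assert np.allclose(pooled.numpy(), as_dwconv.numpy(), atol=1e-6)
```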