Detailed Description

Structure containing the initialization parameters for the CNN convolution computation.

Definition at line 79 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

Data Fields
int8_t	funcStyle
	Variant of the function; refer to MMALIB_FUNCTION_STYLE

int32_t	No
	Number of output feature maps

int32_t	inChOffset
	Offset of the input feature maps in the B matrix. This is a power of 2 for a circular buffer and at least 64-byte aligned for a linear buffer

int32_t	validColsIn
	Valid columns of input feature maps in the B matrix for one kernel call, for non-strided convolution

int32_t	validColsPerRowIn
	Valid columns in a row of input feature maps for one kernel call, for strided convolution. validColsPerRowIn = inWidth + pad; the pad is on the left

int32_t	validRowsIn
	Valid input rows of input feature maps for one kernel call, for strided convolution. Ip = Lc + Pi/2

int32_t	inputPitchPerRow
	Valid pitch for each input row of input feature maps for strided convolution, in units of bytes. Lr + Pi for the entire feature map, but can be as few as Fr rows

int32_t	outputPitchPerRow
	Valid output pitch for each output row of output feature maps for strided convolution, in units of bytes

int32_t	inWidth
	Width of each row of the input feature map, in units of data type

int32_t	pad
	Pad of each row of the input feature map. Specify the pad on one side only; for the strided flow the pad is on the left

int32_t	maxHeight
	Height of the input feature map, in units of data type

int32_t	subMChannels
	Number of output channels per kernel call

int32_t	numGroupsPerKernel
	Number of groups per kernel call. A value > 1 enables processing when No <= MMA size for non-strided kernels; the default value is 1

int32_t	shift
	Shift parameter for output precision

int32_t	Fr
	coefficient rows (height)

int32_t	Fc
	coefficient columns (width)

int32_t	strideX
	stride of columns

int32_t	strideY
	stride of rows

int32_t	dilationX
	dilation of coefficients of columns

int32_t	dilationY
	dilation of coefficients of rows

int32_t	bias
	Bias value in the B matrix, with the same data type as the B matrix

uint8_t	activationType
	Activation for the output: RELU, SAT, or none

uint8_t	mode
	Mode of the input feature map in the B matrix: circular or linear

uint8_t	weightReorderFlag
	MChannel < mma width and NiFrFc < mma width for non strided cases blocks processing > 2, subMChannels % mma width != 0, NiFrFc % mma width != 0

int32_t	numBiasVals
	Number of elements used for the bias (cols in weights, rows in feature maps)

Field Documentation

◆ funcStyle

int8_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::funcStyle

Variant of the function; refer to MMALIB_FUNCTION_STYLE

Definition at line 82 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ No

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::No

Number of output feature maps

Definition at line 84 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ inChOffset

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::inChOffset

Offset of the input feature maps in the B matrix. This is a power of 2 for a circular buffer and at least 64-byte aligned for a linear buffer.

Definition at line 87 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.
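
The inChOffset constraint described above can be checked before initialization. The following is a minimal sketch, not part of MMALIB: the helper names are hypothetical, and it assumes inChOffset is expressed in bytes, which this page does not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an MMALIB function): true if v is a power of 2. */
static bool is_power_of_two(int32_t v)
{
    return v > 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper: checks the documented inChOffset rule.
 * Assumption: inChOffset is given in bytes.
 *   circular buffer -> offset must be a power of 2
 *   linear buffer   -> offset must be a positive multiple of 64 bytes */
static bool in_ch_offset_is_valid(int32_t inChOffset, bool circular)
{
    assert(inChOffset >= 0);
    if (circular) {
        return is_power_of_two(inChOffset);
    }
    return inChOffset > 0 && (inChOffset % 64) == 0;
}
```

Under these assumptions, an offset of 4096 passes for both buffer modes, while 4160 (65 * 64) passes only for a linear buffer.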

◆ validColsIn

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::validColsIn

Valid columns of input feature maps in the B matrix for one kernel call, for non-strided convolution.

This is in units of data type.
validColsIn is initialized once and remains the same for all calls to the kernel for a given layer, except for the last call.
The input feature maps are flattened such that rows and columns are represented in a single dimension.
For the pad, the rows are padded by a minimum of (Fc - 1)/2, where Fc is the filter coefficient width.
The data is optimized such that only one side of the pad is included between two rows. This reduces the computation.
Lc = input width, Lr = input height, Fc = filter width, Fr = filter height
Pi = input pad = (Fc - 1)*dilationX
The minimum number of validColsIn is (Lc + (Fc - 1)/2)*(Fr - 1) + Fc.
validColsIn for the entire feature map will be Ip = (Lc + Pi/2)*(Lr + Pi) + Pi/2.
In case the next layer has pad Po > Pi, validColsIn for the entire feature map will be Ip' = (Lc + Po/2)*(Lr + Po) + Po/2.

Definition at line 108 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.
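
The validColsIn formulas above can be written out as plain arithmetic. This is an illustrative sketch only: the function names are hypothetical and not part of MMALIB, and it assumes the divisions in the formulas are integer divisions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not MMALIB functions).
 * Assumption: all divisions are integer divisions. */

/* Pi = (Fc - 1) * dilationX */
static int32_t input_pad(int32_t Fc, int32_t dilationX)
{
    assert(Fc >= 1 && dilationX >= 1);
    return (Fc - 1) * dilationX;
}

/* Minimum validColsIn = (Lc + (Fc - 1)/2) * (Fr - 1) + Fc */
static int32_t min_valid_cols_in(int32_t Lc, int32_t Fr, int32_t Fc)
{
    assert(Lc >= 1 && Fr >= 1 && Fc >= 1);
    return (Lc + (Fc - 1) / 2) * (Fr - 1) + Fc;
}

/* validColsIn for the entire feature map = (Lc + P/2) * (Lr + P) + P/2.
 * Pass P = Pi, or P = Po when the next layer has a larger pad Po > Pi. */
static int32_t full_valid_cols_in(int32_t Lc, int32_t Lr, int32_t P)
{
    assert(Lc >= 1 && Lr >= 1 && P >= 0);
    return (Lc + P / 2) * (Lr + P) + P / 2;
}
```

For example, with Lc = Lr = 16, Fc = Fr = 3 and dilationX = 1, these give Pi = 2, a minimum validColsIn of 37, and Ip = 307; with a next-layer pad Po = 4, Ip' = 362.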

◆ validColsPerRowIn

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::validColsPerRowIn

Valid columns in a row of input feature maps for one kernel call, for strided convolution. validColsPerRowIn = inWidth + pad. The pad is on the left.

Definition at line 112 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ validRowsIn

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::validRowsIn

Valid input rows of input feature maps for one kernel call, for strided convolution. Ip = Lc + Pi/2

Definition at line 116 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ inputPitchPerRow

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::inputPitchPerRow

Valid pitch for each input row of input feature maps for strided convolution, in units of bytes. Lr + Pi for the entire feature map, but can be as few as Fr rows.

Definition at line 120 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ outputPitchPerRow

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::outputPitchPerRow

Valid output pitch for each output row of output feature maps for strided convolution, in units of bytes.

Definition at line 123 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ inWidth

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::inWidth

Width of each row of input feature map in units of data type

Definition at line 125 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ pad

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::pad

Pad of each row of the input feature map. Specify the pad on one side only; for the strided flow the pad is on the left.

Definition at line 128 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ maxHeight

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::maxHeight

Height of the input feature map in units of data type

Definition at line 130 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ subMChannels

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::subMChannels

Number of output channels per kernel call.

When the filter coefficients do not fit in L2 memory but the feature maps do, the kernel is called multiple times with the same subMChannels in each kernel call:
a. subMChannels can be more than the MMA size for non-strided convolution.
b. subMChannels should be less than or equal to the MMA size for strided convolution.
When the filter coefficients fit in L2 memory but the feature maps do not, the kernel is called multiple times for the entire feature map, and subMChannels is the same as the number of rows in the feature map buffer structure.
MMA size per kernel call: 64 for 8-bit and 32 for 16-bit.

Definition at line 144 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.
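
The subMChannels limits above can be expressed as a small pre-check. This is a hedged sketch: the helper names are hypothetical and not part of MMALIB, the MMA sizes (64 for 8-bit, 32 for 16-bit) are taken from the description above, and whether a call counts as strided is left to the caller to decide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an MMALIB function): MMA size for a data width,
 * using the values quoted above (64 for 8-bit, 32 for 16-bit). */
static int32_t mma_size_for_bits(int32_t bits)
{
    assert(bits == 8 || bits == 16);
    return (bits == 8) ? 64 : 32;
}

/* Hypothetical helper: checks the documented subMChannels rule.
 *   strided convolution     -> subMChannels <= MMA size
 *   non-strided convolution -> subMChannels may exceed the MMA size */
static bool sub_m_channels_is_valid(int32_t subMChannels, int32_t bits,
                                    bool strided)
{
    if (subMChannels <= 0) {
        return false;
    }
    if (strided) {
        return subMChannels <= mma_size_for_bits(bits);
    }
    return true;
}
```

Under these assumptions, subMChannels = 64 is accepted for a strided 8-bit call but rejected for a strided 16-bit call, while a non-strided call accepts values above the MMA size.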