MMALIB User Guide
MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs Struct Reference

Detailed Description

Structure containing the initialization parameters for the CNN convolution computation.

Definition at line 79 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

Data Fields

int8_t funcStyle
 Variant of the function; refer to MMALIB_FUNCTION_STYLE.
 
int32_t No
 Number of output feature maps
 
int32_t inChOffset
 Offset of the input feature maps in the B matrix. This must be a power of 2 for a circular buffer and at least 64-byte aligned for a linear buffer.
 
int32_t validColsIn
 Valid columns of input feature maps in the B matrix for one call of processing of the kernel for non-strided convolution.
 
int32_t validColsPerRowIn
 Valid columns in a row of input feature maps for one call of processing of the kernel for strided convolution. This is validColsPerRowIn = inWidth + pad. The pad is on the left.
 
int32_t validRowsIn
 Valid input rows of input feature maps for one call of processing of the kernel for strided convolution. Ip = Lc + Pi/2
 
int32_t inputPitchPerRow
 Valid pitch for each input row of the input feature maps for strided convolution. This is in units of bytes. Lr + Pi for the entire feature map, but can be Fr rows at minimum.
 
int32_t outputPitchPerRow
 Valid output pitch for each output row of the output feature maps for strided convolution. This is in units of bytes.
 
int32_t inWidth
 Width of each row of input feature map in units of data type
 
int32_t pad
 Pad of each row of the input feature map. Specify the pad on one side only; for the strided flow, the pad is on the left.
 
int32_t maxHeight
 Height of the input feature map in units of data type
 
int32_t subMChannels
 Number of output channels per kernel call.
 
int32_t numGroupsPerKernel
 Number of groups per kernel call. A value > 1 enables processing when No <= MMA size for non-strided kernels. The default value is 1.
 
int32_t shift
 Shift parameter for output precision
 
int32_t Fr
 coefficient rows (height)
 
int32_t Fc
 coefficient columns (width)
 
int32_t strideX
 stride of columns
 
int32_t strideY
 stride of rows
 
int32_t dilationX
 dilation of coefficients of columns
 
int32_t dilationY
 dilation of coefficients of rows
 
int32_t bias
 Bias value in the B matrix; same data type as the B matrix.
 
uint8_t activationType
 Activation applied to the output: RELU, SAT, or none.
 
uint8_t mode
 Mode of the input feature map in the B matrix: circular or linear.
 
uint8_t weightReorderFlag
 MChannel < MMA width and NiFrFc < MMA width for non-strided cases; blocks processing > 2; subMChannels % MMA width != 0; NiFrFc % MMA width != 0.
 
int32_t numBiasVals
 Number of elements used for the bias (cols in weights, rows in feature maps)
 

Field Documentation

◆ funcStyle

int8_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::funcStyle

Variant of the function; refer to MMALIB_FUNCTION_STYLE.

Definition at line 82 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ No

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::No

Number of output feature maps

Definition at line 84 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ inChOffset

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::inChOffset

Offset of the input feature maps in the B matrix. This must be a power of 2 for a circular buffer and at least 64-byte aligned for a linear buffer.

Definition at line 87 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.
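
The inChOffset constraint above can be checked before initialization. The following is a minimal sketch, not part of MMALIB: the helper names are hypothetical, and it assumes the offset is given in bytes and that the caller already knows whether the buffer is circular or linear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an MMALIB function): true if v is a positive
 * power of 2. */
bool is_power_of_two(int32_t v)
{
    return v > 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper (not an MMALIB function): checks the documented
 * inChOffset rule. Assumption: offsetBytes is expressed in bytes.
 *  - circular buffer: offset must be a power of 2
 *  - linear buffer:   offset must be a multiple of 64 bytes */
bool in_ch_offset_is_valid(int32_t offsetBytes, bool circular)
{
    if (offsetBytes <= 0) {
        return false;
    }
    if (circular) {
        return is_power_of_two(offsetBytes);
    }
    return (offsetBytes % 64) == 0;
}
```

Under these assumptions, an offset of 4096 is accepted for either buffer mode, while 4160 (65 * 64) is accepted only for a linear buffer.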

◆ validColsIn

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::validColsIn

Valid columns of input feature maps in the B matrix for one call of processing of the kernel for non-strided convolution.

  • This is in units of data type.
  • validColsIn is set at initialization and remains the same for all calls to the kernel for a given layer, except for the last call.
  • The input feature maps are flattened such that rows and columns are represented in a single dimension.
  • For padding, the rows are padded by a minimum of (Fc - 1)/2, where Fc is the filter coefficient width.
  • The data is laid out such that only one side of the pad is included between two rows. This reduces the computation.
  • Notation: Lc = input width, Lr = input height, Fc = filter width, Fr = filter height.
  • Pi = input pad = (Fc - 1)*DilationX
  • The minimum validColsIn is (Lc + (Fc - 1)/2)*(Fr - 1) + Fc.
  • validColsIn for the entire feature map is Ip = (Lc + Pi/2)*(Lr + Pi) + Pi/2.
  • If the next layer has pad Po > Pi, validColsIn for the entire feature map is Ip' = (Lc + Po/2)*(Lr + Po) + Po/2.

Definition at line 108 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.
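
The validColsIn formulas above can be evaluated with a few small helpers. This is an illustrative sketch only: the function names are hypothetical and not part of MMALIB, and it assumes that the divisions by 2 in the formulas are integer divisions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not MMALIB functions) that evaluate the formulas
 * from the validColsIn description. Assumption: "/2" is integer division. */

/* Input pad: Pi = (Fc - 1) * DilationX. */
int32_t input_pad(int32_t Fc, int32_t dilationX)
{
    assert(Fc >= 1 && dilationX >= 1);
    return (Fc - 1) * dilationX;
}

/* Minimum validColsIn: (Lc + (Fc - 1)/2) * (Fr - 1) + Fc. */
int32_t min_valid_cols_in(int32_t Lc, int32_t Fr, int32_t Fc)
{
    assert(Lc >= 1 && Fr >= 1 && Fc >= 1);
    return (Lc + (Fc - 1) / 2) * (Fr - 1) + Fc;
}

/* validColsIn for the entire feature map: (Lc + P/2) * (Lr + P) + P/2.
 * Pass P = Pi for the current layer, or P = Po when the next layer has a
 * larger pad (Po > Pi). */
int32_t full_valid_cols_in(int32_t Lc, int32_t Lr, int32_t P)
{
    assert(Lc >= 1 && Lr >= 1 && P >= 0);
    return (Lc + P / 2) * (Lr + P) + P / 2;
}
```

For example, with an 8x8 input (Lc = Lr = 8), a 3x3 filter (Fc = Fr = 3), and DilationX = 1, these helpers give Pi = 2, a minimum validColsIn of 21, and 91 for the entire feature map. With a next-layer pad of Po = 4, the value for the entire feature map becomes 122.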

◆ validColsPerRowIn

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::validColsPerRowIn

Valid columns in a row of input feature maps for one call of processing of the kernel for strided convolution. This is validColsPerRowIn = inWidth + pad. The pad is on the left.

Definition at line 112 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ validRowsIn

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::validRowsIn

Valid input rows of input feature maps for one call of processing of the kernel for strided convolution. Ip = Lc + Pi/2

Definition at line 116 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ inputPitchPerRow

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::inputPitchPerRow

Valid pitch for each input row of the input feature maps for strided convolution. This is in units of bytes. Lr + Pi for the entire feature map, but can be Fr rows at minimum.

Definition at line 120 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ outputPitchPerRow

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::outputPitchPerRow

Valid output pitch for each output row of the output feature maps for strided convolution. This is in units of bytes.

Definition at line 123 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ inWidth

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::inWidth

Width of each row of input feature map in units of data type

Definition at line 125 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ pad

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::pad

Pad of each row of the input feature map. Specify the pad on one side only; for the strided flow, the pad is on the left.

Definition at line 128 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ maxHeight

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::maxHeight

Height of the input feature map in units of data type

Definition at line 130 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ subMChannels

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::subMChannels

Number of output channels per kernel call.

  1. When the filter coefficients do not fit in L2 memory but the feature maps do, the kernel is called multiple times with the same subMChannels in each kernel call.
     a. subMChannels can be more than MMA SIZE for non-strided convolution.
     b. subMChannels should be less than or equal to MMA SIZE for strided convolution.
  2. When the filter coefficients fit in L2 memory but the feature maps do not, the kernel is called multiple times for the entire feature map, and subMChannels is the same as the number of rows in the feature map buffer structure.

MMA SIZE (64 for 8-bit and 32 for 16-bit) per kernel call.

Definition at line 144 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.
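
The subMChannels limits above can be expressed as a simple pre-check. This is a sketch under stated assumptions, not MMALIB code: the function names are hypothetical, the element width is passed in bits, and the caller is assumed to know whether the strided or non-strided flow applies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an MMALIB function): MMA SIZE as documented,
 * 64 for 8-bit data and 32 for 16-bit data. Returns 0 for any other
 * width, since the description does not cover it. */
int32_t mma_size_for_bits(int32_t elemBits)
{
    switch (elemBits) {
    case 8:
        return 64;
    case 16:
        return 32;
    default:
        return 0;
    }
}

/* Hypothetical helper (not an MMALIB function): checks subMChannels
 * against the documented limits.
 *  - strided convolution:     subMChannels <= MMA SIZE
 *  - non-strided convolution: subMChannels may exceed MMA SIZE */
bool sub_m_channels_ok(int32_t subMChannels, int32_t elemBits, bool strided)
{
    int32_t mmaSize = mma_size_for_bits(elemBits);

    if (mmaSize == 0 || subMChannels <= 0) {
        return false;
    }
    if (strided) {
        return subMChannels <= mmaSize;
    }
    return true;
}
```

With 8-bit data, for instance, a strided call with subMChannels = 64 passes this check and 65 does not, while a non-strided call with subMChannels = 128 passes.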

◆ numGroupsPerKernel

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::numGroupsPerKernel

Number of groups per kernel call. A value > 1 enables processing when No <= MMA size for non-strided kernels. The default value is 1.

Definition at line 147 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ shift

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::shift

Shift parameter for output precision

Definition at line 149 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ Fr

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::Fr

coefficient rows (height)

Definition at line 151 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ Fc

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::Fc

coefficient columns (width)

Definition at line 153 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ strideX

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::strideX

stride of columns

Definition at line 155 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ strideY

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::strideY

stride of rows

Definition at line 157 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ dilationX

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::dilationX

dilation of coefficients of columns

Definition at line 159 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ dilationY

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::dilationY

dilation of coefficients of rows

Definition at line 161 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ bias

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::bias

Bias value in the B matrix; same data type as the B matrix.

Definition at line 163 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ activationType

uint8_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::activationType

Activation applied to the output: RELU, SAT, or none.

Definition at line 165 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ mode

uint8_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::mode

Mode of the input feature map in the B matrix: circular or linear.

Definition at line 168 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ weightReorderFlag

uint8_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::weightReorderFlag

MChannel < MMA width and NiFrFc < MMA width for non-strided cases; blocks processing > 2; subMChannels % MMA width != 0; NiFrFc % MMA width != 0.

Definition at line 172 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.

◆ numBiasVals

int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs::numBiasVals

Number of elements used for the bias (cols in weights, rows in feature maps)

Definition at line 175 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.