MMALIB User Guide
MMALIB_CNN_convolveBias_row_ixX_ixX_oxX

Introduction

Kernel for computing dense CNN convolution with row based processing.

The input feature maps are passed to the kernel with all rows placed next to each other. All input feature maps, each with the same number of pixels, are fed into the kernel buffer. Processing can start at a given column of a row, specified by the col parameter, and the kernel then starts processing from that intermediate location.

Input buffer for strided and non-strided convolution
Starting location with col parameter

The filter coefficient buffer for each output feature map is laid out linearly with Ni*Fr*Fc values. For dilated kernels, only the coefficients are stored; the zero values implied by dilation are not.

Filter coefficients buffer
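
The linear layout described above can be illustrated with a small indexing sketch. The helper names below are hypothetical and not part of MMALIB, and the ordering of the Ni, Fr and Fc dimensions inside the buffer (input channel outermost, then filter row, then filter column) is an assumption, since the text only states that each output feature map holds Ni*Fr*Fc values. The last helper uses the standard relation for the input span covered by a dilated filter, to show that the stored coefficient count does not grow with dilation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an MMALIB function.
 * Assumed ordering: input channel (ni) outermost, then filter row (fr),
 * then filter column (fc). */
size_t coeff_index(int32_t Ni, int32_t Fr, int32_t Fc,
                   int32_t ni, int32_t fr, int32_t fc)
{
    assert(ni >= 0 && ni < Ni);
    assert(fr >= 0 && fr < Fr);
    assert(fc >= 0 && fc < Fc);
    return ((size_t)ni * (size_t)Fr + (size_t)fr) * (size_t)Fc + (size_t)fc;
}

/* Number of coefficients stored per output feature map: Ni*Fr*Fc.
 * Dilation does not change this count because the implied zeros
 * are not stored. */
size_t coeffs_per_output_map(int32_t Ni, int32_t Fr, int32_t Fc)
{
    return (size_t)Ni * (size_t)Fr * (size_t)Fc;
}

/* Input span covered by a filter dimension of size F with the given
 * dilation: (F - 1) * dilation + 1. */
int32_t dilated_extent(int32_t F, int32_t dilation)
{
    return (F - 1) * dilation + 1;
}
```

For example, a 3x3 filter with dilation 2 covers a 5x5 input window but, under this reading, still contributes only 9 coefficients per input channel.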

The kernel requires multiple handles for a given set of feature maps, which are prestored in L1D memory. For a given CNN layer these handles are fixed. There are three categories of these handles.

Examples of parameters for the different handles for 3x3 stride 1 convolution

Strided convolution

Examples of parameters for the different handles for 3x3 stride 2 convolution

Data Structures

struct  MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs
 Structure containing the initialization parameters for the CNN convolution computation. More...
 
struct  MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecInArgs
 Structure containing the parameters for input to the execute phase of CNN convolution computation. These parameters will not exist in J7AM; they are kept for J7ES compatibility. More...
 
struct  MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecOutArgs
 Structure containing the parameters for output from the execute phase of CNN convolution computation. More...
 

Functions

int32_t MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_getHandleSize (MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *pKerInitArgs)
 This is a query function to calculate the size of internal handle. More...
 
MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_init (MMALIB_kernelHandle handle, const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams2D_t *src2_addr, const MMALIB_bufParams1D_t *src3_addr, const MMALIB_bufParams3D_t *dst_addr, const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *pKerInitArgs)
 This function call is required to initialize the handle. Most one-time operations are performed in this function and their results are stored in the handle. More...
 
MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_init_checkParams (MMALIB_kernelHandle handle, const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams2D_t *src2_addr, const MMALIB_bufParams1D_t *src3_addr, const MMALIB_bufParams3D_t *dst_addr, const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *pKerInitArgs)
 This function call is required to initialize the handle. Most one-time operations are performed in this function and their results are stored in the handle. More...
 
MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_exec (MMALIB_kernelHandle handle, const void *src0, const void *src1, const void *src2, const void *src3, const uint8_t *src4, void *dst, const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs, MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecOutArgs *pKerOutArgs)
 This function is the main compute function, and performs the convolution primitive (conv + ReLU) for CNN on the row based data arrangement. It is called multiple times. More...
 
MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_exec_checkParams (MMALIB_kernelHandle handle, const void *src0, const void *src1, const void *src2, const void *src3, const uint8_t *src4, void *dst, const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs)
 This function checks the parameters and should be called before kernel execution. It can be called once. More...
 
void MMA_CNNLIB_convolveBias_ixX_ixX_oxX_perfEst (const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams3D_t *dst_addr, MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *pKerInitArgs, const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs, MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecOutArgs *pKerOutArgs, int32_t iterN, uint64_t *archCycles, uint64_t *estCycles)
 This function generates a performance estimate for MMALIB kernels. More...
 

Enumerations

enum  MMALIB_CNN_CONVOLVEBIAS_ROW_IXX_IXX_OXX_STATUS_NAME { MMALIB_CNN_CONVOLVEBIAS_ROW_IXX_IXX_OXX_ERR_SMALL_K = MMALIB_ERROR_MAX , MMALIB_CNN_CONVOLVEBIAS_ROW_IXX_IXX_OXX_ERR_MAX }
 Enum to define the error codes. More...
 

Enumeration Type Documentation

◆ MMALIB_CNN_CONVOLVEBIAS_ROW_IXX_IXX_OXX_STATUS_NAME

Enum to define the error codes.

Enumerator
MMALIB_CNN_CONVOLVEBIAS_ROW_IXX_IXX_OXX_ERR_SMALL_K: Error case because k < Ni*Fr*Fc
MMALIB_CNN_CONVOLVEBIAS_ROW_IXX_IXX_OXX_ERR_MAX

Definition at line 88 of file MMALIB_CNN_convolveBias_row_ixX_ixX_oxX.h.

Function Documentation

◆ MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_getHandleSize()

int32_t MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_getHandleSize ( MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *  pKerInitArgs)

This is a query function to calculate the size of internal handle.

Parameters
[in]pKerInitArgs: Pointer to structure holding init parameters
Returns
Size of the buffer in bytes
Remarks
Application is expected to allocate buffer of the requested size and provide it as input to other functions requiring it.

◆ MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_init()

MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_init ( MMALIB_kernelHandle  handle,
const MMALIB_bufParams2D_t *  src0_addr,
const MMALIB_bufParams2D_t *  src1_addr,
const MMALIB_bufParams2D_t *  src2_addr,
const MMALIB_bufParams1D_t *  src3_addr,
const MMALIB_bufParams3D_t *  dst_addr,
const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *  pKerInitArgs 
)

This function call is required to initialize the handle. Most one-time operations are performed in this function and their results are stored in the handle.

Parameters
[in]handle: Active handle to the kernel
[in]src0_addr: Pointer to structure containing dimensional information of src0 weights/coefficients
[in]src1_addr: Pointer to structure containing dimensional information of src1 feature maps
[in]src2_addr: Pointer to structure containing dimensional information of src2 bias
[in]src3_addr: Pointer to structure containing dimensional information of src3 scale values
[out]dst_addr: Pointer to structure containing dimensional information of dst feature maps
[in]pKerInitArgs: Pointer to structure holding init parameters
Returns
Status of success or Error with Error Codes
Remarks
Application is expected to provide a valid handle

◆ MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_init_checkParams()

MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_init_checkParams ( MMALIB_kernelHandle  handle,
const MMALIB_bufParams2D_t *  src0_addr,
const MMALIB_bufParams2D_t *  src1_addr,
const MMALIB_bufParams2D_t *  src2_addr,
const MMALIB_bufParams1D_t *  src3_addr,
const MMALIB_bufParams3D_t *  dst_addr,
const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *  pKerInitArgs 
)

This function call is required to initialize the handle. Most one-time operations are performed in this function and their results are stored in the handle.

Parameters
[in]handle: Active handle to the kernel
[in]src0_addr: Pointer to structure containing dimensional information of src0 weights/coefficients
[in]src1_addr: Pointer to structure containing dimensional information of src1 input feature maps
[in]src2_addr: Pointer to structure containing dimensional information of src2 bias
[in]src3_addr: Pointer to structure containing dimensional information of src3 scale values
[out]dst_addr: Pointer to structure containing dimensional information of dst output feature maps
[in]pKerInitArgs: Pointer to structure holding init parameters
Returns
Status of success or Error with Error Codes
Remarks
Application is expected to provide a valid handle

◆ MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_exec()

MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_exec ( MMALIB_kernelHandle  handle,
const void *  src0,
const void *  src1,
const void *  src2,
const void *  src3,
const uint8_t *  src4,
void *  dst,
const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecInArgs *  pKerInArgs,
MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecOutArgs *  pKerOutArgs 
)

This function is the main compute function, and performs the convolution primitive (conv + ReLU) for CNN on the row based data arrangement. It is called multiple times.

The flow and the expectations of this function are as follows

  • Performs both strided and non-strided CNN convolution
  • Function generates partial or full output feature maps with multiple calls by the application
  • Function creates at least three output blocks when KBlocks is less than 3
  • Function creates at least one output block when KBlocks is greater than or equal to 3, except for 1x1 stride 2 convolution, which requires KBlocks greater than 3
  • Function expects all input and weight data to be available for one block of output
  • One output block has 64 output feature maps and 64 columns for 8 bit
  • One output block has 64 output feature maps and 64 columns for 16 bit
  • Function handles output feature map counts that are not a multiple of 64 for 8 bit or 32 for 16 bit without requiring extra memory
  • Function computes the bias from a constant value in the B matrix and variable values in the A matrix, each 8 bit or 16 bit depending on precision. Example: Bias = (A0 + A1 + A2 + ....)*B.
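
The bias relation in the last item, Bias = (A0 + A1 + A2 + ....)*B, can be checked numerically with the sketch below for the 8 bit case. Both helpers are hypothetical and not part of MMALIB. In particular, split_bias shows only one possible way to choose the A values for a given constant B (greedy fill with saturation to the int8 range); how the A and B values are actually derived for the kernel is not described in this section.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: evaluates Bias = (A0 + A1 + ...) * B. */
int64_t bias_from_factors(const int8_t *a, size_t count, int8_t b)
{
    int64_t sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += a[i];
    }
    return sum * (int64_t)b;
}

/* Hypothetical helper: fills a[0..count-1] so that (A0 + A1 + ...) * b
 * approximates `bias`. Each A value is saturated to the int8 range.
 * Returns the residual bias - (A0 + A1 + ...) * b, which is non-zero
 * when bias is not divisible by b or when `count` values are too few. */
int64_t split_bias(int32_t bias, int8_t b, int8_t *a, size_t count)
{
    /* Target value for the sum A0 + A1 + ... */
    int64_t remaining = (b != 0) ? (int64_t)bias / (int64_t)b : 0;

    for (size_t i = 0; i < count; ++i) {
        int64_t v = remaining;
        if (v > INT8_MAX) v = INT8_MAX;
        if (v < INT8_MIN) v = INT8_MIN;
        a[i] = (int8_t)v;
        remaining -= v;
    }
    return (int64_t)bias - bias_from_factors(a, count, b);
}
```

With B = 4 and four A values, a bias of 1000 splits into A = {127, 123, 0, 0}, since (127 + 123) * 4 = 1000.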
Parameters
[in]handle: Active handle to the kernel
[in]src0[]: Pointer to buffer holding convolution weights/coefficients
[in]src1[]: Pointer to buffer holding input feature map
[in]src2[]: Pointer to buffer holding the bias
[in]src3[]: Pointer to buffer holding the scale values
[in]src4[]: Pointer to buffer holding the shift values
[out]dst[]: Pointer to buffer holding output feature map
[in]pKerInArgs: Pointer to structure holding input Arguments
[out]pKerOutArgs: Pointer to structure holding output Arguments
Returns
Status of success or Error with Error Codes
Assumptions:
  • I/O buffer pointers are assumed to be not aliased.
Performance Considerations:
  • For best performance, the following parameter settings are recommended:
    • Set widths equal to strides
    • Align all pointers to 64 byte boundaries
    • Set all stride values to a multiple of 64 for 8 bit and 32 for 16 bit
    • Set all width values to a multiple of 64 for 8 bit and 32 for 16 bit
    • Set output feature maps to be 64 for 8 bit and 32 for 16 bit
    • Train bias values to fit in the B matrix rows, up to the point where the B matrix becomes a multiple of the SIMD width
Remarks
Application is expected to call the checkParams function before this function, since doing so avoids checking the parameters on every invocation.
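
The alignment and multiple-of-64/32 recommendations listed under Performance Considerations can be applied with a few small helpers. This is a minimal sketch with hypothetical names, not MMALIB functions. The text does not state whether widths and strides are counted in elements or bytes, so the helpers operate on whatever unit the caller uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: recommended multiple for widths and strides,
 * 64 for 8 bit data and 32 for 16 bit data. */
uint32_t recommended_multiple(uint32_t bits_per_element)
{
    return (bits_per_element == 16u) ? 32u : 64u;
}

/* Rounds `value` up to the next multiple of `multiple` (multiple > 0). */
uint32_t round_up(uint32_t value, uint32_t multiple)
{
    return ((value + multiple - 1u) / multiple) * multiple;
}

/* Returns true when the pointer sits on a 64 byte boundary. */
bool is_aligned_64(const void *ptr)
{
    return ((uintptr_t)ptr % 64u) == 0u;
}
```

For instance, a width of 100 for 8 bit data would be padded to round_up(100, recommended_multiple(8)), which is 128.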

◆ MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_exec_checkParams()

MMALIB_STATUS MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_exec_checkParams ( MMALIB_kernelHandle  handle,
const void *  src0,
const void *  src1,
const void *  src2,
const void *  src3,
const uint8_t *  src4,
void *  dst,
const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecInArgs *  pKerInArgs 
)

This function checks the parameters and should be called before kernel execution. It can be called once.

Parameters
[in]handle: Active handle to the kernel
[in]src0[]: Pointer to buffer holding convolution weights/coefficients
[in]src1[]: Pointer to buffer holding input feature map
[in]src2[]: Pointer to buffer holding bias
[in]src3[]: Pointer to buffer holding scale
[in]src4[]: Pointer to buffer holding shift
[out]dst[]: Pointer to buffer holding output feature map
[in]pKerInArgs: Pointer to structure holding input Arguments
Returns
Status of success or Error with Error Codes
Remarks
None

◆ MMA_CNNLIB_convolveBias_ixX_ixX_oxX_perfEst()

void MMA_CNNLIB_convolveBias_ixX_ixX_oxX_perfEst ( const MMALIB_bufParams2D_t *  src0_addr,
const MMALIB_bufParams2D_t *  src1_addr,
const MMALIB_bufParams3D_t *  dst_addr,
MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_InitArgs *  pKerInitArgs,
const MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecInArgs *  pKerInArgs,
MMALIB_CNN_convolveBias_row_ixX_ixX_oxX_ExecOutArgs *  pKerOutArgs,
int32_t  iterN,
uint64_t *  archCycles,
uint64_t *  estCycles 
)

This function generates a performance estimate for MMALIB kernels.

Parameters
[in]src0_addr: Pointer to structure containing dimensional information of src0 weights/coefficients
[in]src1_addr: Pointer to structure containing dimensional information of src1 input feature maps
[out]dst_addr: Pointer to structure containing dimensional information of dst output feature maps
[in]pKerInitArgs: Pointer to structure holding init parameters
[in]pKerInArgs: Pointer to structure holding input Arguments
[in]pKerOutArgs: Pointer to structure holding output Arguments
[in]iterN: number of subMBlocks iterations
[out]archCycles: pointer to store architecture cycles
[out]estCycles: pointer to store estimated kernel cycles
Remarks
None