MMALIB User Guide
Kernel for computing dense CNN convolution with row based processing and matrix multiplication.
Data Structures | |
| struct | MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs |
| Structure containing the parameters initialization of CNN convolution computation. More... | |
| struct | MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs |
| Structure containing the parameters for input to the execute phase of CNN convolution computation. More... | |
| struct | MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args |
| This structure holds all the input parameters for reordering CNN filter weights for row convolution kernel. More... | |
| struct | MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs |
| Structure containing the parameters for output from the execute phase of CNN convolution computation. More... | |
Functions | |
| int32_t | MMALIB_CNN_convolve_row_ixX_ixX_oxX_getHandleSize (MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs) |
| This is a query function to calculate the size of internal handle. More... | |
| MMALIB_STATUS | MMALIB_CNN_convolve_row_ixX_ixX_oxX_init (MMALIB_kernelHandle handle, const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams3D_t *dst_addr, const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs) |
| This function call is required to initialize the handle. Most of the one-time operations are performed in this function, and the results are stored in the handle. More... | |
| MMALIB_STATUS | MMALIB_CNN_convolve_row_ixX_ixX_oxX_init_checkParams (MMALIB_kernelHandle handle, const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams3D_t *dst_addr, const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs) |
| This function call is required to initialize the handle. Most of the one-time operations are performed in this function, and the results are stored in the handle. More... | |
| MMALIB_STATUS | MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec (MMALIB_kernelHandle handle, const void *src0, const void *src1, void *dst, const MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs, MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs *pKerOutArgs) |
| This function is the main compute function, and performs the convolution primitive (conv + ReLU) for CNN on the row based data arrangement. It is called multiple times. More... | |
| MMALIB_STATUS | MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec_checkParams (MMALIB_kernelHandle handle, const void *src0, const void *src1, void *dst, const MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs) |
| This function checks the parameters and should be called before kernel execution. It can be called once. More... | |
| int32_t | MMALIB_CNN_generateFillSeamPredicateRegisters (MMALIB_kernelHandle handle, int32_t inputWidth, int32_t pad, int32_t inputHeight, int32_t mmaWidth, int32_t MChannels, int32_t subMChannels) |
| This function generates the predicate registers once per layer. Predicate buffers are created to identify where to insert pad in the output generated between consecutive rows. The pad inserted is either the same as the current layer or used for the next layer. More... | |
| int32_t | MMALIB_CNN_seamPredicateRegistersSize (int32_t inputWidth, int32_t pad, int32_t inputHeight, int32_t mmaWidth, int32_t MChannels, int32_t subMChannels) |
| This function provides total bytes needed for seam insertion buffer. More... | |
| int32_t | MMALIB_CNN_convolve_row_reorderWeights (const void *restrict pWeights, void *restrict pReorderWeights, void *restrict pBias, const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src2_addr, MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args *reorderWeights, const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs) |
| This function reorders the weights for M < 1 and K < 1 cases. More... | |
| int32_t | MMALIB_CNN_convolve_row_reorderWeightsFlag (MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args *reorderWeights) |
| This function returns the flag for M < 1 and K < 1 cases. More... | |
| int32_t | MMALIB_CNN_convolve_row_reorderWeightsBufferSize (MMALIB_bufParams2D_t *src0_addr, MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args *reorderWeights, const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs) |
| This function returns the buffer size for M < 1 and K < 1 cases. More... | |
| int32_t | MMALIB_CNN_seamPredicateRegistersSizeDefault () |
| This function generates the predicate registers once per layer. More... | |
| void | MMA_CNNLIB_convolveBiasReLUCompute_ixX_ixX_oxX_perfEst (const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams3D_t *dst_addr, MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs, const MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs, MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs *pKerOutArgs, int32_t iterN, uint64_t *archCycles, uint64_t *estCycles) |
| This function generates the performance of MMALIB kernels. More... | |
Enumerations | |
| enum | MMALIB_CNN_CONVOLVE_ROW_IXX_IXX_OXX_STATUS_NAME { MMALIB_CNN_CONVOLVE_ROW_IXX_IXX_OXX_ERR_SMALL_K = MMALIB_ERROR_MAX, MMALIB_CNN_CONVOLVE_ROW_IXX_IXX_OXX_ERR_MAX } |
| Enum to define the error codes. More... | |
Enum to define the error codes.
| Enumerator | |
|---|---|
| MMALIB_CNN_CONVOLVE_ROW_IXX_IXX_OXX_ERR_SMALL_K | Error case because k < Ni*Fr*Fc |
| MMALIB_CNN_CONVOLVE_ROW_IXX_IXX_OXX_ERR_MAX | |
Definition at line 68 of file MMALIB_CNN_convolve_row_ixX_ixX_oxX.h.
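The first kernel-specific code is assigned the value `MMALIB_ERROR_MAX`, so these codes start where the generic MMALIB status range ends. A minimal sketch of that pattern follows. The `DEMO_*` names and their numeric values are hypothetical stand-ins, and reading `Ni`, `Fr`, `Fc` as input channels, filter rows, and filter columns is an assumption not stated on this page.

```c
#include <stdint.h>

/* Stand-in for the generic MMALIB status codes. The real values live in the
 * MMALIB headers; the values here are hypothetical. */
typedef enum {
   DEMO_SUCCESS     = 0,
   DEMO_ERR_FAILURE = 1,
   DEMO_ERROR_MAX   = 2 /* plays the role of MMALIB_ERROR_MAX */
} DEMO_STATUS;

/* Kernel-specific codes begin at the end of the generic range, mirroring
 * ..._ERR_SMALL_K = MMALIB_ERROR_MAX in the enumeration above. */
typedef enum {
   DEMO_CONV_ERR_SMALL_K = DEMO_ERROR_MAX,
   DEMO_CONV_ERR_MAX
} DEMO_CONV_STATUS;

/* Hypothetical helper: reports the small-K error when k < Ni*Fr*Fc.
 * Assumes Ni = input channels, Fr = filter rows, Fc = filter columns. */
static int32_t demo_check_k(int32_t k, int32_t Ni, int32_t Fr, int32_t Fc)
{
   int64_t needed = (int64_t)Ni * Fr * Fc; /* widen to avoid overflow */
   return (k < needed) ? (int32_t)DEMO_CONV_ERR_SMALL_K
                       : (int32_t)DEMO_SUCCESS;
}
```

Because the two ranges do not overlap, a caller can tell a generic failure from a kernel-specific one by comparing the returned value against `MMALIB_ERROR_MAX`.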
| int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_getHandleSize | ( | MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs * | pKerInitArgs | ) |
This is a query function to calculate the size of internal handle.
| [in] | pKerInitArgs | : Pointer to structure holding init parameters |
| MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_init | ( | MMALIB_kernelHandle | handle, |
| const MMALIB_bufParams2D_t * | src0_addr, | ||
| const MMALIB_bufParams2D_t * | src1_addr, | ||
| const MMALIB_bufParams3D_t * | dst_addr, | ||
| const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs * | pKerInitArgs | ||
| ) |
This function call is required to initialize the handle. Most of the one-time operations are performed in this function, and the results are stored in the handle.
| [in] | handle | : Active handle to the kernel |
| [in] | src0_addr | : Pointer to structure containing dimensional information of src0 weights/coefficients |
| [in] | src1_addr | : Pointer to structure containing dimensional information of src1 feature maps |
| [out] | dst_addr | : Pointer to structure containing dimensional information of dst feature maps |
| [in] | pKerInitArgs | : Pointer to structure holding init parameters |
| MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_init_checkParams | ( | MMALIB_kernelHandle | handle, |
| const MMALIB_bufParams2D_t * | src0_addr, | ||
| const MMALIB_bufParams2D_t * | src1_addr, | ||
| const MMALIB_bufParams3D_t * | dst_addr, | ||
| const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs * | pKerInitArgs | ||
| ) |
This function call is required to initialize the handle. Most of the one-time operations are performed in this function, and the results are stored in the handle.
| [in] | handle | : Active handle to the kernel |
| [in] | src0_addr | : Pointer to structure containing dimensional information of src0 weights/coefficients |
| [in] | src1_addr | : Pointer to structure containing dimensional information of src1 input feature maps |
| [out] | dst_addr | : Pointer to structure containing dimensional information of dst output feature maps |
| [in] | pKerInitArgs | : Pointer to structure holding init parameters |
| MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec | ( | MMALIB_kernelHandle | handle, |
| const void * | src0, | ||
| const void * | src1, | ||
| void * | dst, | ||
| const MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs * | pKerInArgs, | ||
| MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs * | pKerOutArgs | ||
| ) |
This function is the main compute function, and performs the convolution primitive (conv + ReLU) for CNN on the row based data arrangement. It is called multiple times.
The flow and the expectations of this function are as follows
| [in] | handle | : Active handle to the kernel |
| [in] | src0[] | : Pointer to buffer holding convolution weights/coefficients |
| [in] | src1[] | : Pointer to buffer holding input feature map |
| [out] | dst[] | : Pointer to buffer holding output feature map |
| [in] | pKerInArgs | : Pointer to structure holding input Arguments |
| [out] | pKerOutArgs | : Pointer to structure holding output Arguments |
| MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec_checkParams | ( | MMALIB_kernelHandle | handle, |
| const void * | src0, | ||
| const void * | src1, | ||
| void * | dst, | ||
| const MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs * | pKerInArgs | ||
| ) |
This function checks the parameters and should be called before kernel execution. It can be called once.
| [in] | handle | : Active handle to the kernel |
| [in] | src0[] | : Pointer to buffer holding convolution weights/coefficients |
| [in] | src1[] | : Pointer to buffer holding input feature map |
| [out] | dst[] | : Pointer to buffer holding output feature map |
| [in] | pKerInArgs | : Pointer to structure holding input Arguments |
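The five functions documented above suggest a call order: query the handle size, check and run init, check the exec parameters once, then call exec repeatedly. The sketch below illustrates that order only. All type definitions and function bodies are stand-in stubs so the example compiles without MMALIB. The success value `0`, the use of `malloc` for the handle, and the short aliases are assumptions, and the real structures carry fields not shown on this page.

```c
#include <stdint.h>
#include <stdlib.h>

/* ---- Stand-ins so the sketch compiles without MMALIB. In real code,
 * include the MMALIB headers instead of this section. ---- */
typedef int32_t MMALIB_STATUS;
typedef void   *MMALIB_kernelHandle;
typedef struct { int32_t placeholder; } MMALIB_bufParams2D_t;
typedef struct { int32_t placeholder; } MMALIB_bufParams3D_t;
typedef struct { int32_t placeholder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs;
typedef struct { int32_t placeholder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs;
typedef struct { int32_t placeholder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs;
#define DEMO_OK 0 /* assumed success value; check the real MMALIB_STATUS enum */

/* Short aliases, local to this sketch. */
typedef MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs    InitArgs;
typedef MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs  ExecInArgs;
typedef MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs ExecOutArgs;
typedef MMALIB_bufParams2D_t B2D;
typedef MMALIB_bufParams3D_t B3D;

/* Stub bodies: they only validate pointers and do no convolution. */
int32_t MMALIB_CNN_convolve_row_ixX_ixX_oxX_getHandleSize(InitArgs *a)
{ (void)a; return 64; }

MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_init_checkParams(
   MMALIB_kernelHandle h, const B2D *s0, const B2D *s1, const B3D *d,
   const InitArgs *a)
{ return (h && s0 && s1 && d && a) ? DEMO_OK : 1; }

MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_init(
   MMALIB_kernelHandle h, const B2D *s0, const B2D *s1, const B3D *d,
   const InitArgs *a)
{ return (h && s0 && s1 && d && a) ? DEMO_OK : 1; }

MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec_checkParams(
   MMALIB_kernelHandle h, const void *src0, const void *src1, void *dst,
   const ExecInArgs *in)
{ return (h && src0 && src1 && dst && in) ? DEMO_OK : 1; }

MMALIB_STATUS MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec(
   MMALIB_kernelHandle h, const void *src0, const void *src1, void *dst,
   const ExecInArgs *in, ExecOutArgs *out)
{ return (h && src0 && src1 && dst && in && out) ? DEMO_OK : 1; }

/* Hypothetical driver showing the call order for one layer. */
MMALIB_STATUS run_layer(const void *weights, const void *featIn,
                        void *featOut, int32_t numExecCalls)
{
   InitArgs    initArgs = {0};
   ExecInArgs  inArgs   = {0};
   ExecOutArgs outArgs  = {0};
   B2D src0_addr = {0}, src1_addr = {0};
   B3D dst_addr  = {0};
   MMALIB_STATUS status;
   int32_t i;

   /* 1. Query the handle size and allocate the handle. */
   int32_t handleSize =
      MMALIB_CNN_convolve_row_ixX_ixX_oxX_getHandleSize(&initArgs);
   if (handleSize <= 0) return 1;
   MMALIB_kernelHandle handle = malloc((size_t)handleSize);
   if (handle == NULL) return 1;

   /* 2. Validate, then perform the one-time initialization. */
   status = MMALIB_CNN_convolve_row_ixX_ixX_oxX_init_checkParams(
      handle, &src0_addr, &src1_addr, &dst_addr, &initArgs);
   if (status == DEMO_OK)
      status = MMALIB_CNN_convolve_row_ixX_ixX_oxX_init(
         handle, &src0_addr, &src1_addr, &dst_addr, &initArgs);

   /* 3. Check the exec parameters once, before the first exec call. */
   if (status == DEMO_OK)
      status = MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec_checkParams(
         handle, weights, featIn, featOut, &inArgs);

   /* 4. exec is called multiple times; stop at the first failure. */
   for (i = 0; status == DEMO_OK && i < numExecCalls; i++)
      status = MMALIB_CNN_convolve_row_ixX_ixX_oxX_exec(
         handle, weights, featIn, featOut, &inArgs, &outArgs);

   free(handle);
   return status;
}
```

A real integration would likely place the handle and buffers in memory suited to the target device rather than the general heap; that choice is outside what this page describes.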
| int32_t MMALIB_CNN_generateFillSeamPredicateRegisters | ( | MMALIB_kernelHandle | handle, |
| int32_t | inputWidth, | ||
| int32_t | pad, | ||
| int32_t | inputHeight, | ||
| int32_t | mmaWidth, | ||
| int32_t | MChannels, | ||
| int32_t | subMChannels | ||
| ) |
This function generates the predicate registers once per layer. Predicate buffers are created to identify where to insert pad in the output generated between consecutive rows. The pad inserted is either the same as the current layer or used for the next layer.
| [in] | handle | : Active handle to the kernel |
| [in] | inputWidth | : Width of Feature map |
| [in] | pad | : Pad between rows |
| [in] | inputHeight | : Maximum height of feature map |
| [in] | mmaWidth | : MMA width |
| [in] | MChannels | : Number of output channels |
| [in] | subMChannels | : Number of output channels per kernel call |
| int32_t MMALIB_CNN_seamPredicateRegistersSize | ( | int32_t | inputWidth, |
| int32_t | pad, | ||
| int32_t | inputHeight, | ||
| int32_t | mmaWidth, | ||
| int32_t | MChannels, | ||
| int32_t | subMChannels | ||
| ) |
This function provides total bytes needed for seam insertion buffer.
| [in] | inputWidth | : Width of Feature map |
| [in] | pad | : Pad between rows |
| [in] | inputHeight | : Maximum height of feature map |
| [in] | mmaWidth | : MMA width |
| [in] | MChannels | : Number of output channels |
| [in] | subMChannels | : Number of output channels per kernel call |
| int32_t MMALIB_CNN_convolve_row_reorderWeights | ( | const void *restrict | pWeights, |
| void *restrict | pReorderWeights, | ||
| void *restrict | pBias, | ||
| const MMALIB_bufParams2D_t * | src0_addr, | ||
| const MMALIB_bufParams2D_t * | src2_addr, | ||
| MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args * | reorderWeights, | ||
| const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs * | pKerInitArgs | ||
| ) |
This function reorders the weights for M < 1 and K < 1 cases.
| [in] | pWeights[] | : Pointer to buffer holding convolution weights/coefficients |
| [out] | pReorderWeights[] | : Pointer to buffer holding the reordered convolution weights/coefficients |
| [in] | pBias[] | : Pointer to buffer holding Bias value |
| [in] | src0_addr | : Pointer to structure containing dimensional information of src0 weights/coefficients |
| [in] | src2_addr | : Pointer to structure containing dimensional information of src2 bias |
| [in] | reorderWeights | : Pointer to structure holding the reorderWeights parameters |
| [in] | pKerInitArgs | : Pointer to structure holding init parameters |
| int32_t MMALIB_CNN_convolve_row_reorderWeightsFlag | ( | MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args * | reorderWeights | ) |
This function returns the flag for M < 1 and K < 1 cases.
| [in] | reorderWeights | : Pointer to structure holding the reorderWeights parameters |
| int32_t MMALIB_CNN_convolve_row_reorderWeightsBufferSize | ( | MMALIB_bufParams2D_t * | src0_addr, |
| MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args * | reorderWeights, | ||
| const MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs * | pKerInitArgs | ||
| ) |
This function returns the buffer size for M < 1 and K < 1 cases.
| [in] | src0_addr | : Pointer to structure containing dimensional information of src0 weights/coefficients |
| [in] | reorderWeights | : Pointer to structure holding the reorderWeights parameters |
| [in] | pKerInitArgs | : Pointer to structure holding init parameters |
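The three reorder functions above appear to fit together as a flag query, a size query, and the reorder itself. The sketch below shows one plausible way to chain them. The stub bodies, the `needReorder` field, the fixed buffer size, and the interpretation of a nonzero flag as "reordering is needed" are all assumptions made so the example compiles without MMALIB.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* ---- Stand-ins so the sketch compiles without MMALIB. ---- */
typedef struct { int32_t placeholder; } MMALIB_bufParams2D_t;
typedef struct { int32_t placeholder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs;
/* needReorder is a hypothetical field; the real structure is not shown here. */
typedef struct { int32_t needReorder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args;

typedef MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs           InitArgs;
typedef MMALIB_CNN_convolve_row_ixX_ixX_oxX_reorderWeights_Args ReorderArgs;

#define DEMO_WEIGHT_BYTES 16 /* made-up size used by the stubs */

int32_t MMALIB_CNN_convolve_row_reorderWeightsFlag(ReorderArgs *reorderWeights)
{ return reorderWeights->needReorder; }

int32_t MMALIB_CNN_convolve_row_reorderWeightsBufferSize(
   MMALIB_bufParams2D_t *src0_addr, ReorderArgs *reorderWeights,
   const InitArgs *pKerInitArgs)
{ (void)src0_addr; (void)reorderWeights; (void)pKerInitArgs;
  return DEMO_WEIGHT_BYTES; }

/* Stub: copies the weights unchanged instead of reordering them. */
int32_t MMALIB_CNN_convolve_row_reorderWeights(
   const void *restrict pWeights, void *restrict pReorderWeights,
   void *restrict pBias, const MMALIB_bufParams2D_t *src0_addr,
   const MMALIB_bufParams2D_t *src2_addr, ReorderArgs *reorderWeights,
   const InitArgs *pKerInitArgs)
{ (void)pBias; (void)src0_addr; (void)src2_addr;
  (void)reorderWeights; (void)pKerInitArgs;
  memcpy(pReorderWeights, pWeights, DEMO_WEIGHT_BYTES);
  return 0; }

/* Hypothetical helper. Returns the buffer the kernel should read weights
 * from: the original buffer when the flag is zero, otherwise a newly
 * allocated reordered copy. *ownedOut receives the buffer the caller must
 * free, or NULL when nothing was allocated. Returns NULL on failure. */
const void *prepare_weights(const void *pWeights, void *pBias,
                            int32_t needReorder, void **ownedOut)
{
   MMALIB_bufParams2D_t src0_addr = {0}, src2_addr = {0};
   InitArgs    initArgs = {0};
   ReorderArgs rw       = {0};
   rw.needReorder = needReorder;
   *ownedOut = NULL;

   /* Assumption: a zero flag means the weights can be used as they are. */
   if (MMALIB_CNN_convolve_row_reorderWeightsFlag(&rw) == 0)
      return pWeights;

   int32_t bytes = MMALIB_CNN_convolve_row_reorderWeightsBufferSize(
      &src0_addr, &rw, &initArgs);
   if (bytes <= 0) return NULL;

   void *buf = malloc((size_t)bytes);
   if (buf == NULL) return NULL;

   /* The meaning of the return value is not documented on this page,
    * so it is ignored here. */
   (void)MMALIB_CNN_convolve_row_reorderWeights(
      pWeights, buf, pBias, &src0_addr, &src2_addr, &rw, &initArgs);

   *ownedOut = buf;
   return buf;
}
```

Since weights are fixed per layer, this preparation would normally run once ahead of the exec calls rather than on every call.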
| int32_t MMALIB_CNN_seamPredicateRegistersSizeDefault | ( | ) |
This function generates the predicate registers once per layer.
| void MMA_CNNLIB_convolveBiasReLUCompute_ixX_ixX_oxX_perfEst | ( | const MMALIB_bufParams2D_t * | src0_addr, |
| const MMALIB_bufParams2D_t * | src1_addr, | ||
| const MMALIB_bufParams3D_t * | dst_addr, | ||
| MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs * | pKerInitArgs, | ||
| const MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs * | pKerInArgs, | ||
| MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs * | pKerOutArgs, | ||
| int32_t | iterN, | ||
| uint64_t * | archCycles, | ||
| uint64_t * | estCycles | ||
| ) |
This function generates the performance of MMALIB kernels.
| [in] | src0_addr | : Pointer to structure containing dimensional information of src0 weights/coefficients |
| [in] | src1_addr | : Pointer to structure containing dimensional information of src1 input feature maps |
| [out] | dst_addr | : Pointer to structure containing dimensional information of dst output feature maps |
| [in] | pKerInitArgs | : Pointer to structure holding init parameters |
| [in] | pKerInArgs | : Pointer to structure holding input Arguments |
| [in] | pKerOutArgs | : Pointer to structure holding output Arguments |
| [in] | iterN | : number of subMBlocks iterations |
| [out] | archCycles | : pointer to store architecture cycles |
| [out] | estCycles | : pointer to store estimated kernel cycles |
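The performance estimator returns its results through the two `uint64_t` output pointers. A minimal sketch of calling it and collecting both values follows. The type definitions and the function body are stand-in stubs, the cycle numbers are made up, and `DemoPerf` and `estimate_layer` are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* ---- Stand-ins so the sketch compiles without MMALIB. ---- */
typedef struct { int32_t placeholder; } MMALIB_bufParams2D_t;
typedef struct { int32_t placeholder; } MMALIB_bufParams3D_t;
typedef struct { int32_t placeholder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs;
typedef struct { int32_t placeholder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs;
typedef struct { int32_t placeholder; } MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs;

/* Stub with the documented signature. The numbers are invented so the
 * sketch has something to return; they say nothing about real hardware. */
void MMA_CNNLIB_convolveBiasReLUCompute_ixX_ixX_oxX_perfEst(
   const MMALIB_bufParams2D_t *src0_addr,
   const MMALIB_bufParams2D_t *src1_addr,
   const MMALIB_bufParams3D_t *dst_addr,
   MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs,
   const MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs,
   MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs *pKerOutArgs,
   int32_t iterN, uint64_t *archCycles, uint64_t *estCycles)
{
   (void)src0_addr; (void)src1_addr; (void)dst_addr;
   (void)pKerInitArgs; (void)pKerInArgs; (void)pKerOutArgs;
   *archCycles = 1000u * (uint64_t)iterN;
   *estCycles  = 1200u * (uint64_t)iterN;
}

/* Hypothetical container for the two outputs. */
typedef struct {
   uint64_t archCycles;
   uint64_t estCycles;
} DemoPerf;

/* Hypothetical wrapper: runs the estimator for iterN subMBlocks iterations
 * and returns both cycle counts by value. */
DemoPerf estimate_layer(int32_t iterN)
{
   MMALIB_bufParams2D_t src0_addr = {0}, src1_addr = {0};
   MMALIB_bufParams3D_t dst_addr  = {0};
   MMALIB_CNN_convolve_row_ixX_ixX_oxX_InitArgs    initArgs = {0};
   MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecInArgs  inArgs   = {0};
   MMALIB_CNN_convolve_row_ixX_ixX_oxX_ExecOutArgs outArgs  = {0};
   DemoPerf perf = {0, 0};

   MMA_CNNLIB_convolveBiasReLUCompute_ixX_ixX_oxX_perfEst(
      &src0_addr, &src1_addr, &dst_addr, &initArgs, &inArgs, &outArgs,
      iterN, &perf.archCycles, &perf.estCycles);
   return perf;
}
```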