MMALIB User Guide
MMALIB_CNN_deconvolve_row_ixX_ixX_oxX

Introduction

Kernel for computing dense CNN deconvolution with row-based processing and matrix-matrix multiplication.

Data Structures

struct  MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs
 Structure containing the initialization parameters of the CNN deconvolution computation.
 
struct  MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecInArgs
 Structure containing the parameters for input to the execute phase of CNN deconvolution computation.
 
struct  MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecOutArgs
 

Functions

int32_t MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_getHandleSize (MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs)
 This is a query function to return the size of internal handle.
 
MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_init (MMALIB_kernelHandle handle, const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams2D_t *dst_addr, const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs)
 This function call is required to initialize the handle. Most one-time operations are performed in this function and their results are stored in the handle.
 
MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_init_checkParams (MMALIB_kernelHandle handle, const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams2D_t *dst_addr, const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *pKerInitArgs)
 This function checks the parameters passed to the init function and should be called before the handle is initialized.
 
MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_exec (MMALIB_kernelHandle handle, const void *src0, const void *src1, void *dst, const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs, MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecOutArgs *pKerOutArgs)
 This function is the main compute function and performs the deconvolution primitive (conv + ReLU) for CNN on the row-based data arrangement. It is typically called multiple times.
 
MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_exec_checkParams (MMALIB_kernelHandle handle, const void *src0, const void *src1, const void *dst, const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs)
 This function checks the parameters and should be called before kernel execution. It can be called once.
 
MMALIB_STATUS MMALIB_CNN_deconvolve_row_4x4Stride2PreProcessParameters (uint32_t kDim, uint32_t numInChannels, uint32_t pitchA, uint32_t numOutChannels, uint32_t numGroups, const uint32_t mmaSize, const void *restrict src, void *restrict dst)
 This is a pre-processing function that reshapes the parameter buffer from \( N_o \times N_i \times F_r \times F_c \) to \( 4 \times N_o \times N_i \times \frac{F_r}{2} \times \frac{F_c}{2} \). The kernel expects the parameter tensor in this shape so that it can perform a \( 4 \times 4 \) stride-2 deconvolution as four \( 2 \times 2 \) stride-1 convolutions.
 
MMALIB_STATUS MMALIB_CNN_deconvolve_row_2x2Stride2PreProcessParameters (uint32_t kDim, uint32_t numInChannels, uint32_t pitchA, uint32_t numOutChannels, uint32_t numGroups, const uint32_t mmaSize, const void *restrict src, void *restrict dst)
 This is a pre-processing function that reshapes the parameter buffer from \( N_o \times N_i \times F_r \times F_c \) to \( 4 \times N_o \times N_i \times \frac{F_r}{2} \times \frac{F_c}{2} \). The kernel expects the parameter tensor in this shape so that it can perform a \( 2 \times 2 \) stride-2 deconvolution as four \( 1 \times 1 \) stride-1 convolutions.
 
MMALIB_STATUS MMALIB_CNN_deconvolve_row_8x8Stride2PreProcessParameters (uint32_t kDim, uint32_t numInChannels, uint32_t pitchA, uint32_t numOutChannels, uint32_t numGroups, const uint32_t mmaSize, const void *restrict src, void *restrict dst)
 This is a pre-processing function that reshapes the parameter buffer from \( N_o \times N_i \times F_r \times F_c \) to \( 4 \times N_o \times N_i \times \frac{F_r}{2} \times \frac{F_c}{2} \). The kernel expects the parameter tensor in this shape so that it can perform an \( 8 \times 8 \) stride-2 deconvolution as four \( 4 \times 4 \) stride-1 convolutions.
 
void MMALIB_CNN_deconvolveBiasReLUCompute_ixX_ixX_oxX_perfEst (const MMALIB_bufParams2D_t *src0_addr, const MMALIB_bufParams2D_t *src1_addr, const MMALIB_bufParams2D_t *dst_addr, const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *kerInitArgs, const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecInArgs *pKerInArgs, MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecOutArgs *pKerOutArgs, int32_t iterN, uint64_t *archCycles, uint64_t *estCycles)
 This function estimates the cycles consumed for the kernel execution.
 

Enumerations

enum  MMALIB_CNN_DECONVOLVE_ROW_IXX_IXX_OXX_STATUS_NAME { MMALIB_CNN_DECONVOLVE_ROW_IXX_IXX_OXX_ERR_SMALL_K , MMALIB_CNN_DECONVOLVE_ROW_IXX_IXX_OXX_ERR_MAX }
 Enumeration for different Error codes for MMALIB_CNN_DECONVOLVE_ROW Kernel.
 

Enumeration Type Documentation

◆ MMALIB_CNN_DECONVOLVE_ROW_IXX_IXX_OXX_STATUS_NAME

Enumeration for different Error codes for MMALIB_CNN_DECONVOLVE_ROW Kernel.

Enumerator
MMALIB_CNN_DECONVOLVE_ROW_IXX_IXX_OXX_ERR_SMALL_K: Error case because k < Ni*Fr*Fc
MMALIB_CNN_DECONVOLVE_ROW_IXX_IXX_OXX_ERR_MAX

Definition at line 162 of file MMALIB_CNN_deconvolve_row_ixX_ixX_oxX.h.
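
As a rough illustration of the condition behind MMALIB_CNN_DECONVOLVE_ROW_IXX_IXX_OXX_ERR_SMALL_K, the check below compares k against Ni*Fr*Fc. This is a sketch only: the helper name k_dim_covers_filter is hypothetical and not part of MMALIB, and it assumes that k is the inner (K) dimension supplied for the weight matrix, which the description above does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not an MMALIB function.
 * Returns 1 when k is large enough to hold Ni*Fr*Fc weights for one output
 * channel, 0 when k < Ni*Fr*Fc (the situation the SMALL_K code describes). */
int k_dim_covers_filter(uint32_t k, uint32_t ni, uint32_t fr, uint32_t fc)
{
    /* With filter dimensions limited to 16 bits, the 64-bit product below
     * cannot overflow. */
    assert(fr <= UINT16_MAX && fc <= UINT16_MAX);

    uint64_t needed = (uint64_t)ni * fr * fc;
    return (uint64_t)k >= needed;
}
```

For example, with Ni = 4 and a 4 x 4 filter, Ni*Fr*Fc is 64, so k = 64 passes the check and k = 63 does not.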

Function Documentation

◆ MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_getHandleSize()

int32_t MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_getHandleSize ( MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *  pKerInitArgs)

This is a query function to return the size of internal handle.

Parameters
[in]pKerInitArgs: Pointer to structure holding init parameters
Returns
Size of the buffer in bytes
Remarks
The application is expected to allocate a buffer of the requested size and pass it to the init and exec function calls.

◆ MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_init()

MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_init ( MMALIB_kernelHandle  handle,
const MMALIB_bufParams2D_t *  src0_addr,
const MMALIB_bufParams2D_t *  src1_addr,
const MMALIB_bufParams2D_t *  dst_addr,
const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *  pKerInitArgs 
)

This function call is required to initialize the handle. Most one-time operations are performed in this function and their results are stored in the handle.

Parameters
[in]handle: Active handle to the kernel
[in]src0_addr: Pointer to structure containing dimensional information of src0
[in]src1_addr: Pointer to structure containing dimensional information of src1
[out]dst_addr: Pointer to structure containing dimensional information of dst
[in]pKerInitArgs: Pointer to structure holding init parameters
Returns
Status of success or Error with Error Codes. Refer to MMALIB_STATUS
Remarks
The application is expected to provide a valid handle.

◆ MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_init_checkParams()

MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_init_checkParams ( MMALIB_kernelHandle  handle,
const MMALIB_bufParams2D_t *  src0_addr,
const MMALIB_bufParams2D_t *  src1_addr,
const MMALIB_bufParams2D_t *  dst_addr,
const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *  pKerInitArgs 
)

This function checks the parameters passed to the init function and should be called before the handle is initialized.

Parameters
[in]handle: Active handle to the kernel
[in]src0_addr: Pointer to structure containing dimensional information of src0 weights/coefficients
[in]src1_addr: Pointer to structure containing dimensional information of src1 input feature maps
[out]dst_addr: Pointer to structure containing dimensional information of dst output feature maps
[in]pKerInitArgs: Pointer to structure holding init parameters
Returns
Status of success or Error with Error Codes
Remarks
The application is expected to provide a valid handle.

◆ MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_exec()

MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_exec ( MMALIB_kernelHandle  handle,
const void *  src0,
const void *  src1,
void *  dst,
const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecInArgs *  pKerInArgs,
MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecOutArgs *  pKerOutArgs 
)

This function is the main compute function and performs the deconvolution primitive (conv + ReLU) for CNN on the row-based data arrangement. It is typically called multiple times.

Parameters
[in]handle: Active handle to the kernel
[in]src0[]: Pointer to buffer holding convolution weights [ A matrix]
[in]src1[]: Pointer to buffer holding input feature map [ B matrix]
[out]dst[]: Pointer to buffer holding partial output feature map [ C matrix]
[in]pKerInArgs: Pointer to structure holding input Arguments
[out]pKerOutArgs: Pointer to structure holding output Arguments
Returns
Status of success or Error with Error Codes. Refer to MMALIB_STATUS
Assumptions:
  • I/O buffer pointers are assumed to be not aliased.
Performance Considerations:
  • For best performance, the following parameter settings are recommended:
    • Set widths equal to strides
    • Align all pointers to 8-byte boundaries
    • Set all stride values to a multiple of 8
    • Set all width values to a multiple of 16
Remarks
The application is expected to call the checkParams function before this function. The exec function does not re-check its parameters on each invocation, which keeps the per-call overhead low.
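
The performance recommendations above can be checked in application code before buffers are handed to the kernel. The sketch below is one way to do that. The function perf_hints and the HINT_* flags are hypothetical names, not part of MMALIB, and the sketch assumes that width and stride are expressed in the same unit, which the list above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags, one per recommendation that is not met. */
enum {
    HINT_WIDTH_NE_STRIDE  = 1 << 0, /* width differs from stride        */
    HINT_PTR_UNALIGNED    = 1 << 1, /* pointer is not 8-byte aligned    */
    HINT_STRIDE_NOT_MULT8 = 1 << 2, /* stride is not a multiple of 8    */
    HINT_WIDTH_NOT_MULT16 = 1 << 3  /* width is not a multiple of 16    */
};

/* Hypothetical helper: returns a bitmask of unmet recommendations,
 * or 0 when the buffer follows all four of them. */
unsigned perf_hints(const void *ptr, uint32_t width, uint32_t stride)
{
    unsigned hints = 0u;

    assert(ptr != NULL);

    if (width != stride) {
        hints |= HINT_WIDTH_NE_STRIDE;
    }
    if ((uintptr_t)ptr % 8u != 0u) {
        hints |= HINT_PTR_UNALIGNED;
    }
    if (stride % 8u != 0u) {
        hints |= HINT_STRIDE_NOT_MULT8;
    }
    if (width % 16u != 0u) {
        hints |= HINT_WIDTH_NOT_MULT16;
    }
    return hints;
}
```

A nonzero result does not mean the call will fail; the list describes settings recommended for best performance, not requirements.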

◆ MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_exec_checkParams()

MMALIB_STATUS MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_exec_checkParams ( MMALIB_kernelHandle  handle,
const void *  src0,
const void *  src1,
const void *  dst,
const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecInArgs *  pKerInArgs 
)

This function checks the parameters and should be called before kernel execution. It can be called once.

Parameters
[in]handle: Active handle to the kernel
[in]src0[]: Pointer to buffer holding convolution weights [ A matrix]
[in]src1[]: Pointer to buffer holding input feature map data [ B matrix]
[out]dst[]: Pointer to buffer holding output feature map data [ C matrix]
[in]pKerInArgs: Pointer to structure holding input Arguments
Returns
Status of success or Error with Error Codes. Refer to MMALIB_STATUS
Remarks
None

◆ MMALIB_CNN_deconvolve_row_4x4Stride2PreProcessParameters()

MMALIB_STATUS MMALIB_CNN_deconvolve_row_4x4Stride2PreProcessParameters ( uint32_t  kDim,
uint32_t  numInChannels,
uint32_t  pitchA,
uint32_t  numOutChannels,
uint32_t  numGroups,
const uint32_t  mmaSize,
const void *restrict  src,
void *restrict  dst 
)

This is a pre-processing function that reshapes the parameter buffer from \( N_o \times N_i \times F_r \times F_c \) to \( 4 \times N_o \times N_i \times \frac{F_r}{2} \times \frac{F_c}{2} \). The kernel expects the parameter tensor in this shape so that it can perform a \( 4 \times 4 \) stride-2 deconvolution as four \( 2 \times 2 \) stride-1 convolutions.

Parameters
[in]kDim: Length of parameter buffer
[in]numInChannels: Number of input channels in parameter tensor
[in]pitchA: Pitch of parameter buffer
[in]numOutChannels: Number of output channels in parameter tensor
[in]numGroups: Number of groups in parameter tensor
[in]mmaSize: MMA width
[in]src: Pointer to buffer with parameter tensor
[out]dst: Pointer to buffer with reshaped parameter tensor
Returns
Status of success or Error with Error Codes. Refer to MMALIB_STATUS
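
To see why a \( 4 \times 4 \) stride-2 deconvolution can be computed as four \( 2 \times 2 \) stride-1 convolutions, the sketch below works through a single-channel case. It splits the filter by row and column parity into four sub-filters, runs one stride-1 convolution per sub-filter, and interleaves the results. This illustrates the general identity only: it ignores channels, groups and pitch, and the memory layout that the pre-processing function actually writes is not described here. All function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum { F = 4, H = 3, W = 3,
       OUT_H = 2 * (H - 1) + F, OUT_W = 2 * (W - 1) + F };

static_assert(F % 2 == 0, "filter size must be even to split by parity");

/* Reference: stride-2 deconvolution, each input pixel scatters a copy of w. */
void deconv_stride2_ref(int x[H][W], int w[F][F], int y[OUT_H][OUT_W])
{
    memset(y, 0, sizeof(int) * OUT_H * OUT_W);
    for (int i = 0; i < H; i++)
        for (int j = 0; j < W; j++)
            for (int r = 0; r < F; r++)
                for (int c = 0; c < F; c++)
                    y[2 * i + r][2 * j + c] += x[i][j] * w[r][c];
}

/* Sub-filter (a, b) collects the taps with row parity a and column parity b. */
void split_kernel(int w[F][F], int k[2][2][F / 2][F / 2])
{
    for (int a = 0; a < 2; a++)
        for (int b = 0; b < 2; b++)
            for (int m = 0; m < F / 2; m++)
                for (int n = 0; n < F / 2; n++)
                    k[a][b][m][n] = w[a + 2 * m][b + 2 * n];
}

/* Four stride-1 convolutions; result (a, b) fills output pixels (2p+a, 2q+b). */
void deconv_via_four_convs(int x[H][W], int k[2][2][F / 2][F / 2],
                           int y[OUT_H][OUT_W])
{
    for (int a = 0; a < 2; a++)
        for (int b = 0; b < 2; b++)
            for (int p = 0; p < OUT_H / 2; p++)
                for (int q = 0; q < OUT_W / 2; q++) {
                    int acc = 0;
                    for (int m = 0; m < F / 2; m++)
                        for (int n = 0; n < F / 2; n++) {
                            int i = p - m, j = q - n;
                            if (i >= 0 && i < H && j >= 0 && j < W)
                                acc += x[i][j] * k[a][b][m][n];
                        }
                    y[2 * p + a][2 * q + b] = acc;
                }
}

/* Returns 1 when both methods agree on every output pixel, 0 otherwise. */
int deconv_decomposition_matches(void)
{
    int x[H][W], w[F][F], k[2][2][F / 2][F / 2];
    int y_ref[OUT_H][OUT_W], y_four[OUT_H][OUT_W];

    for (int i = 0; i < H; i++)
        for (int j = 0; j < W; j++)
            x[i][j] = 1 + i * W + j;
    for (int r = 0; r < F; r++)
        for (int c = 0; c < F; c++)
            w[r][c] = 10 * (r + 1) + c;

    deconv_stride2_ref(x, w, y_ref);
    split_kernel(w, k);
    deconv_via_four_convs(x, k, y_four);
    return memcmp(y_ref, y_four, sizeof y_ref) == 0;
}
```

The identity follows from writing an output row index as \( 2p + a \) with \( a \in \{0, 1\} \): only filter rows \( a, a + 2, \dots \) can contribute to that output row, and the same holds for columns.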

◆ MMALIB_CNN_deconvolve_row_2x2Stride2PreProcessParameters()

MMALIB_STATUS MMALIB_CNN_deconvolve_row_2x2Stride2PreProcessParameters ( uint32_t  kDim,
uint32_t  numInChannels,
uint32_t  pitchA,
uint32_t  numOutChannels,
uint32_t  numGroups,
const uint32_t  mmaSize,
const void *restrict  src,
void *restrict  dst 
)

This is a pre-processing function that reshapes the parameter buffer from \( N_o \times N_i \times F_r \times F_c \) to \( 4 \times N_o \times N_i \times \frac{F_r}{2} \times \frac{F_c}{2} \). The kernel expects the parameter tensor in this shape so that it can perform a \( 2 \times 2 \) stride-2 deconvolution as four \( 1 \times 1 \) stride-1 convolutions.

Parameters
[in]kDim: Length of parameter buffer
[in]numInChannels: Number of input channels in parameter tensor
[in]pitchA: Pitch of parameter buffer
[in]numOutChannels: Number of output channels in parameter tensor
[in]numGroups: Number of groups in parameter tensor
[in]mmaSize: MMA width
[in]src: Pointer to buffer with parameter tensor
[out]dst: Pointer to buffer with reshaped parameter tensor
Returns
Status of success or Error with Error Codes. Refer to MMALIB_STATUS

◆ MMALIB_CNN_deconvolve_row_8x8Stride2PreProcessParameters()

MMALIB_STATUS MMALIB_CNN_deconvolve_row_8x8Stride2PreProcessParameters ( uint32_t  kDim,
uint32_t  numInChannels,
uint32_t  pitchA,
uint32_t  numOutChannels,
uint32_t  numGroups,
const uint32_t  mmaSize,
const void *restrict  src,
void *restrict  dst 
)

This is a pre-processing function that reshapes the parameter buffer from \( N_o \times N_i \times F_r \times F_c \) to \( 4 \times N_o \times N_i \times \frac{F_r}{2} \times \frac{F_c}{2} \). The kernel expects the parameter tensor in this shape so that it can perform an \( 8 \times 8 \) stride-2 deconvolution as four \( 4 \times 4 \) stride-1 convolutions.

Parameters
[in]kDim: Length of parameter buffer
[in]numInChannels: Number of input channels in parameter tensor
[in]pitchA: Pitch of parameter buffer
[in]numOutChannels: Number of output channels in parameter tensor
[in]numGroups: Number of groups in parameter tensor
[in]mmaSize: MMA width
[in]src: Pointer to buffer with parameter tensor
[out]dst: Pointer to buffer with reshaped parameter tensor
Returns
Status of success or Error with Error Codes. Refer to MMALIB_STATUS
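
The reshape from \( N_o \times N_i \times F_r \times F_c \) to \( 4 \times N_o \times N_i \times \frac{F_r}{2} \times \frac{F_c}{2} \) keeps the number of logical weights unchanged, since \( 4 \cdot \frac{F_r}{2} \cdot \frac{F_c}{2} = F_r F_c \). The sketch below does this shape bookkeeping for any even filter size. The type reshaped_shape_t and both functions are hypothetical and not part of MMALIB, and the count ignores any padding that pitchA or mmaSize may require in the real destination buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the five reshaped dimensions. */
typedef struct {
    uint32_t dims[5]; /* {4, No, Ni, Fr/2, Fc/2} */
} reshaped_shape_t;

/* Fills *out with the reshaped dimensions.
 * Returns 0 on success, -1 if out is NULL or a filter dimension is odd. */
int reshaped_param_shape(uint32_t no, uint32_t ni, uint32_t fr, uint32_t fc,
                         reshaped_shape_t *out)
{
    if (out == NULL || fr % 2u != 0u || fc % 2u != 0u) {
        return -1;
    }
    out->dims[0] = 4u;
    out->dims[1] = no;
    out->dims[2] = ni;
    out->dims[3] = fr / 2u;
    out->dims[4] = fc / 2u;
    return 0;
}

/* Number of logical weights described by the shape (no padding included). */
uint64_t shape_elems(const reshaped_shape_t *s)
{
    uint64_t n = 1u;

    assert(s != NULL);
    for (size_t i = 0; i < 5; i++) {
        n *= s->dims[i];
    }
    return n;
}
```

For an \( 8 \times 8 \) filter with 16 output and 32 input channels, the reshaped dimensions are {4, 16, 32, 4, 4}, which is the same 32768 weights as the original 16 x 32 x 8 x 8 tensor.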

◆ MMALIB_CNN_deconvolveBiasReLUCompute_ixX_ixX_oxX_perfEst()

void MMALIB_CNN_deconvolveBiasReLUCompute_ixX_ixX_oxX_perfEst ( const MMALIB_bufParams2D_t *  src0_addr,
const MMALIB_bufParams2D_t *  src1_addr,
const MMALIB_bufParams2D_t *  dst_addr,
const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_InitArgs *  kerInitArgs,
const MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecInArgs *  pKerInArgs,
MMALIB_CNN_deconvolve_row_ixX_ixX_oxX_ExecOutArgs *  pKerOutArgs,
int32_t  iterN,
uint64_t *  archCycles,
uint64_t *  estCycles 
)

This function estimates the cycles consumed for the kernel execution.

Parameters
[in]src0_addr: Pointer to the structure containing dimensional information of src0
[in]src1_addr: Pointer to the structure containing dimensional information of src1
[out]dst_addr: Pointer to the structure containing dimensional information of dst
[in]kerInitArgs: Pointer to structure holding init parameters
[in]pKerInArgs: Pointer to structure holding input arguments
[in]pKerOutArgs: Pointer to structure holding output arguments
[in]iterN: Number of subMBlocks iterations
[out]archCycles: Cycles estimated for the compute, startup and teardown
[out]estCycles: Cycles estimated for the compute, startup, teardown and any associated overhead
Remarks
None