MMALIB User Guide
MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX_processWeights.h File Reference


Data Structures

struct  MMALIB_CNN_fullyConnectedBias_processWeights_Args
 This structure holds all the input parameters for reordering CNN filter weights.
 

Functions

MMALIB_STATUS MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX_reorderWeights (int32_t mmaSize, int32_t NiBias, int32_t No, uint32_t strideOut, const void *restrict pWeights, void *restrict pReorderWeights)
 This function re-orders the input weights (kernel matrix). Reordering is required to alleviate potential bank conflicts that arise when the kernel matrix is accessed in L2 via SE1 while data is transferred from MSMC to L2 via DMA. The current re-ordering scheme results in a bank-access pattern of {0,0,1,1,2,2,3,3,0,0,1,1,2,2,3,3, ...} for SE1.
 
int32_t MMALIB_CNN_fullyConnectedBias_processWeights_getMemorySize (const MMALIB_CNN_fullyConnectedBias_processWeights_Args *pArgs, const void *restrict pWeights)
 This function returns the amount of memory that needs to be allocated for reordered kernel coefficients needed to support MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX.
 
MMALIB_STATUS MMALIB_CNN_fullyConnectedBias_processWeights_reorder (const MMALIB_CNN_fullyConnectedBias_processWeights_Args *pArgs, const void *restrict pWeights, void *restrict pReordered_Weights)
 This function takes a set of weights and reorders them for use in computing convolve row flow convolution.
 

Function Documentation

◆ MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX_reorderWeights()

MMALIB_STATUS MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX_reorderWeights ( int32_t  mmaSize,
int32_t  NiBias,
int32_t  No,
uint32_t  strideOut,
const void *restrict  pWeights,
void *restrict  pReorderWeights 
)

This function re-orders the input weights (kernel matrix). Reordering is required to alleviate potential bank conflicts that arise when the kernel matrix is accessed in L2 via SE1 while data is transferred from MSMC to L2 via DMA. The current re-ordering scheme results in a bank-access pattern of {0,0,1,1,2,2,3,3,0,0,1,1,2,2,3,3, ...} for SE1.
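
The bank sequence quoted above can be reproduced with a small index calculation. The following is an illustrative sketch only: the bank count of 4 and the run length of 2 are read off the quoted pattern, and the `demo_` names are hypothetical helpers that are not part of MMALIB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions read off the quoted pattern, not taken from MMALIB sources. */
#define DEMO_NUM_BANKS         4u /* banks 0..3 appear in the pattern      */
#define DEMO_ACCESSES_PER_BANK 2u /* each bank index appears twice in a row */

/* Bank touched by the accessIndex-th consecutive SE1 access in the pattern. */
static int32_t demo_bankForAccess(uint32_t accessIndex)
{
    return (int32_t)((accessIndex / DEMO_ACCESSES_PER_BANK) % DEMO_NUM_BANKS);
}

/* Fill banks[0..count-1] with the first `count` entries of the pattern. */
static void demo_fillBankPattern(int32_t *banks, size_t count)
{
    assert(banks != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        banks[i] = demo_bankForAccess((uint32_t)i);
    }
}
```

Calling `demo_fillBankPattern` with a 16-element array yields the 16 entries listed in the description; the sketch says nothing about how the library lays out the weights to achieve that pattern.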

Parameters
[in]mmaSize: Size of MMA for given test case
  • 64 for 8-bit datatype
  • 32 for 16-bit datatype
[in]NiBias: Ni (number of input channels) + Bias columns
[in]No: Number of output channels
[in]strideOut: Stride of matrix after re-ordering
[in]pWeights: Pointer for input weights
[in]pReorderWeights: Pointer for output weights after re-ordering
Returns
Status value indicating success or failure. Refer to MMALIB_STATUS.
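
As a convenience, the mmaSize values listed under Parameters can be selected from the element width. This is a minimal sketch assuming only the two cases documented above (8-bit and 16-bit); the `demo_` helpers are hypothetical and are not provided by MMALIB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map element width in bits to the documented mmaSize.
 * Returns -1 for widths that the parameter description does not list. */
static int32_t demo_mmaSizeForBitWidth(int32_t bitsPerElement)
{
    switch (bitsPerElement) {
    case 8:
        return 64; /* 8-bit datatype  */
    case 16:
        return 32; /* 16-bit datatype */
    default:
        return -1; /* not documented in this section */
    }
}

/* Same lookup for callers that treat an unlisted width as a programming error. */
static int32_t demo_requireMmaSize(int32_t bitsPerElement)
{
    int32_t mmaSize = demo_mmaSizeForBitWidth(bitsPerElement);
    assert(mmaSize > 0);
    return mmaSize;
}
```

The returned value would be passed as the first argument of MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX_reorderWeights.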

◆ MMALIB_CNN_fullyConnectedBias_processWeights_getMemorySize()

int32_t MMALIB_CNN_fullyConnectedBias_processWeights_getMemorySize ( const MMALIB_CNN_fullyConnectedBias_processWeights_Args *  pArgs,
const void *restrict  pWeights 
)

This function returns the amount of memory that needs to be allocated for reordered kernel coefficients needed to support MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX.

Parameters
[in]pArgs: Pointer to the structure containing the required dimensional information
[in]pWeights: Pointer to weights array in natural order
Returns
Number of bytes required to store the reordered kernel coefficients
Remarks
The application is expected to allocate this amount of memory for the kernel coefficients.

◆ MMALIB_CNN_fullyConnectedBias_processWeights_reorder()

MMALIB_STATUS MMALIB_CNN_fullyConnectedBias_processWeights_reorder ( const MMALIB_CNN_fullyConnectedBias_processWeights_Args *  pArgs,
const void *restrict  pWeights,
void *restrict  pReordered_Weights 
)

This function takes a set of weights and reorders them for use in computing convolve row flow convolution.

The function can receive the kernel weights a priori.

Parameters
[in]pArgs: Pointer to argument structure containing necessary parameters for reordering weights
[in]pWeights[]: Pointer to buffer holding naturally ordered convolution weights
[out]pReordered_Weights[]: Pointer to buffer holding the reordered weights output
Returns
Status value indicating success or an error code. Refer to MMALIB_STATUS.
Performance Considerations:
  • This function may either be called during the processing flow, or offline whenever the weights are known.
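
The intended call order (query the size, allocate, reorder once, then reuse the buffer) can be sketched as follows. So that the example compiles without MMALIB, it uses stand-ins: every `demo_` type and function is hypothetical, the `numWeightBytes` field is an assumed placeholder for the real argument structure's contents, the stand-in reorder is a plain copy rather than the library's actual layout, and treating a non-positive size as failure is an assumption. In real code, the stand-ins would be replaced by MMALIB_CNN_fullyConnectedBias_processWeights_getMemorySize and MMALIB_CNN_fullyConnectedBias_processWeights_reorder.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the argument structure; the field is assumed. */
typedef struct {
    int32_t numWeightBytes;
} demo_processWeights_Args;

/* Hypothetical stand-in for MMALIB_STATUS. */
typedef enum { DEMO_SUCCESS = 0, DEMO_ERR_FAILURE = 1 } demo_STATUS;

/* Stand-in for ..._processWeights_getMemorySize(). */
static int32_t demo_getMemorySize(const demo_processWeights_Args *pArgs,
                                  const void *restrict pWeights)
{
    (void)pWeights;
    return pArgs->numWeightBytes; /* the real library may require more */
}

/* Stand-in for ..._processWeights_reorder(): identity copy, NOT the real layout. */
static demo_STATUS demo_reorder(const demo_processWeights_Args *pArgs,
                                const void *restrict pWeights,
                                void *restrict pReordered_Weights)
{
    memcpy(pReordered_Weights, pWeights, (size_t)pArgs->numWeightBytes);
    return DEMO_SUCCESS;
}

/* Query, allocate, and reorder once. The caller owns the returned buffer and
 * releases it with free(). Returns NULL on any failure. */
static void *demo_prepareWeights(const demo_processWeights_Args *pArgs,
                                 const void *pWeights, int32_t *pSizeOut)
{
    assert(pArgs != NULL && pWeights != NULL && pSizeOut != NULL);

    int32_t size = demo_getMemorySize(pArgs, pWeights);
    if (size <= 0) {
        return NULL; /* assumption: non-positive size signals an error */
    }
    void *reordered = malloc((size_t)size);
    if (reordered == NULL) {
        return NULL;
    }
    if (demo_reorder(pArgs, pWeights, reordered) != DEMO_SUCCESS) {
        free(reordered);
        return NULL;
    }
    *pSizeOut = size;
    return reordered;
}
```

Because the weights are usually fixed after training, a helper of this shape could run once at start-up or offline, with the resulting buffer reused for every subsequent inference.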