MMALIB User Guide
Convolutional Neural Networks (CNN) kernels

Introduction

This module provides kernels that implement the core computations of convolutional neural networks.

It also provides kernel performance estimation functions, which estimate the compute cycles of those kernels.

Sub Modules

 MMALIB_CNN_convolveBias_row_ixX_ixX_oxX
 Kernel for computing dense CNN convolution with row-based processing.
 
 MMALIB_CNN_convolve_col_smallNo_highPrecision
 NOTE: This API is now a wrapper to MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost with the lutValues input argument set to NULL. It is recommended to call MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost directly.
 
 MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost
 Kernel for computing CNN-style 2D convolution using column-major data ordering on the input and output feature maps. This approach is faster when the filter grouping is chosen such that Ni=No=1 or such that NiFrFc < MMA_SIZE; otherwise, use the regular convolution kernel MMALIB_CNN_convolve_row_ixX_ixX_oxX. This kernel is also referred to as depth-wise convolution.
 
 MMALIB_CNN_convolve_col_smallNo_ixX_ixX_oxX
 Kernel for computing CNN-style 2D convolution using column-major data ordering on the input and output feature maps. This approach is faster when the filter grouping is chosen such that Ni=No=1 or such that NiFrFc < MMA_SIZE; otherwise, use the regular convolution kernel MMALIB_CNN_convolve_row_ixX_ixX_oxX. This kernel is also referred to as depth-wise convolution.
 
 MMALIB_CNN_convolve_row_ixX_ixX_oxX
 Kernel for computing dense CNN convolution with row-based processing and matrix multiplication.
 
 MMALIB_CNN_deconvolveBias_row_ixX_ixX_oxX
 Kernel for computing dense CNN deconvolution with row-based processing and matrix-matrix multiplication.
 
 MMALIB_CNN_deconvolve_row_ixX_ixX_oxX
 Kernel for computing dense CNN deconvolution with row-based processing and matrix-matrix multiplication.
 
 MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX
 Kernel that computes a fully connected layer with bias: \( Y^T = X^T \times H^T + B^T \).
 
 MMALIB_CNN_fullyConnected_ixX_ixX_oxX
 Kernel that computes a fully connected layer without bias: \( Y^T = X^T \times H^T \).
 
 MMALIB_CNN_pixelShuffle_row_ixX_ixX_oxX
 Kernel for computing dense CNN convolution with row-based processing and matrix multiplication, followed by column interleaving. The result is a partial output that is already in the final form of the pixel shuffle operator.
 
 MMALIB_CNN_tensor_convert_ixX_oxX
 Kernel for converting tensors of various datatypes and formats.
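
The fully connected formula listed above, \( Y^T = X^T \times H^T + B^T \), can be illustrated with a plain scalar reference. The sketch below is not part of MMALIB and does not use the MMALIB API: the function name fc_reference, the row-major layout, the shapes (X^T is n x k, H^T is k x m, B^T and Y^T are n x m), and the choice of 8-bit inputs with 32-bit accumulation are all assumptions made for illustration. It also omits any output scaling or saturation that the actual kernels may apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scalar reference for Y^T = X^T x H^T + B^T.
 * Illustrative only: hypothetical name and signature, not the MMALIB API.
 *
 * Assumed shapes, all stored row-major:
 *   xT : n x k   (one input vector per row)
 *   hT : k x m   (transposed weight matrix)
 *   bT : n x m   (bias), or NULL for the no-bias form Y^T = X^T x H^T
 *   yT : n x m   (output, 32-bit accumulators, no scaling or saturation)
 */
void fc_reference(const int8_t *xT, const int8_t *hT, const int32_t *bT,
                  int32_t *yT, size_t n, size_t k, size_t m)
{
    assert(xT != NULL && hT != NULL && yT != NULL);

    for (size_t r = 0; r < n; ++r) {
        for (size_t c = 0; c < m; ++c) {
            /* Start from the bias term if one was supplied. */
            int32_t acc = (bT != NULL) ? bT[r * m + c] : 0;

            /* Dot product of row r of X^T with column c of H^T. */
            for (size_t i = 0; i < k; ++i) {
                acc += (int32_t)xT[r * k + i] * (int32_t)hT[i * m + c];
            }
            yT[r * m + c] = acc;
        }
    }
}
```

Passing NULL for bT gives the no-bias form that corresponds to the formula listed for MMALIB_CNN_fullyConnected_ixX_ixX_oxX. A reference of this kind is mainly useful for checking kernel output on small test matrices.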