Introduction

This module provides kernels that implement the core computations of convolutional neural networks (CNNs).

It also provides kernel performance estimation functions that estimate the compute cycles of those kernels.

The CNN module is partitioned into:
- Row Convolution: the CNN convolution operation is performed by processing continuous rows of the input feature map. This is targeted at dense convolution.
- Column Convolution: the CNN convolution operation is performed on partial rows, with all columns processed. This is targeted at depthwise convolution.
- Fully connected layer
- CNN Deconvolution operation: a post-processing operation on the convolution output

Supported data types:
- Input feature map data: 8-bit and 16-bit, signed and unsigned
- Coefficient data: 8-bit and 16-bit, signed
- Output feature map data: 8-bit and 16-bit, signed and unsigned

Sub Modules
	MMALIB_CNN_convolveBias_row_ixX_ixX_oxX
	Kernel for computing dense CNN convolution with row-based processing.

	MMALIB_CNN_convolve_col_smallNo_highPrecision
	NOTE: This API is now a wrapper to MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost with the lutValues input argument set to NULL. It is recommended to call MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost directly.

	MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost
	Kernel for computing CNN-style 2D convolution using column-major data ordering on the input and output feature maps. This approach is faster when filter grouping is chosen such that Ni = No = 1, or such that Ni * Fr * Fc < MMA_SIZE; otherwise, use the regular convolution method MMALIB_CNN_convolve_row_ixX_ixX_oxX. This kernel is also referred to as depth-wise convolution.

	MMALIB_CNN_convolve_col_smallNo_ixX_ixX_oxX
	Kernel for computing CNN-style 2D convolution using column-major data ordering on the input and output feature maps. This approach is faster when filter grouping is chosen such that Ni = No = 1, or such that Ni * Fr * Fc < MMA_SIZE; otherwise, use the regular convolution method MMALIB_CNN_convolve_row_ixX_ixX_oxX. This kernel is also referred to as depth-wise convolution.
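
The choice between the column-based and row-based convolution kernels described above can be written as a small predicate. This is only a sketch and not part of MMALIB: the function name `prefer_column_convolution` and its parameters are hypothetical, `NiFrFc` is read as the product Ni * Fr * Fc (input channels per group times filter rows times filter columns), and the MMA size is passed in by the caller because its value depends on the data type and device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an MMALIB API.
 * Returns true when the column-based (depth-wise style) kernel is expected
 * to be the faster choice according to the rule stated in the text:
 *   Ni == No == 1, or Ni * Fr * Fc < MMA_SIZE.
 * Ni, No: input/output channels per filter group (assumed meaning).
 * Fr, Fc: filter rows and columns (assumed meaning).
 * mmaSize: MMA_SIZE for the data type in use, supplied by the caller.
 */
bool prefer_column_convolution(int32_t Ni, int32_t No,
                               int32_t Fr, int32_t Fc,
                               int32_t mmaSize)
{
    assert(Ni > 0 && No > 0 && Fr > 0 && Fc > 0 && mmaSize > 0);

    if (Ni == 1 && No == 1) {
        return true;
    }

    /* Use a 64-bit product so large channel counts cannot overflow. */
    int64_t product = (int64_t)Ni * (int64_t)Fr * (int64_t)Fc;
    return product < (int64_t)mmaSize;
}
```

For example, assuming an MMA size of 64, a 3x3 filter with Ni = No = 4 gives 4 * 3 * 3 = 36, so the predicate selects the column-based kernel; with Ni = No = 16 the product is 144 and the row-based kernel would be used instead.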

	MMALIB_CNN_convolve_row_ixX_ixX_oxX
	Kernel for computing dense CNN convolution with row-based processing and matrix multiplication.

	MMALIB_CNN_deconvolveBias_row_ixX_ixX_oxX
	Kernel for computing dense CNN deconvolution with row-based processing and matrix-matrix multiplication.

	MMALIB_CNN_deconvolve_row_ixX_ixX_oxX
	Kernel for computing dense CNN deconvolution with row-based processing and matrix-matrix multiplication.

	MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX
	Kernel that computes a fully connected layer with bias: \( Y^T = X^T \times H^T + B^T \).

	MMALIB_CNN_fullyConnected_ixX_ixX_oxX
	Kernel that computes a fully connected layer without bias: \( Y^T = X^T \times H^T \).
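
The two fully connected formulas above can be illustrated with a plain, non-optimized reference loop. This is a sketch for checking the arithmetic, not the MMALIB implementation: the function name `fully_connected_ref`, the matrix shapes (X^T is M x K, H^T is K x N, B^T and Y^T are M x N, all row-major), and the use of 32-bit accumulators without output scaling or saturation are all assumptions made here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference model, not an MMALIB API.
 * Computes Y^T = X^T * H^T + B^T with row-major matrices:
 *   xT: M x K, hT: K x N, bT: M x N (may be NULL), yT: M x N.
 * Passing bT == NULL gives the bias-free form Y^T = X^T * H^T.
 * Results stay in 32-bit accumulators; the scaling and saturation a real
 * 8/16-bit output path would need are intentionally left out.
 */
void fully_connected_ref(const int8_t *xT, const int8_t *hT,
                         const int32_t *bT, int32_t *yT,
                         size_t M, size_t K, size_t N)
{
    assert(xT != NULL && hT != NULL && yT != NULL);

    for (size_t m = 0; m < M; ++m) {
        for (size_t n = 0; n < N; ++n) {
            /* Start from the bias term when one is supplied. */
            int32_t acc = (bT != NULL) ? bT[m * N + n] : 0;
            for (size_t k = 0; k < K; ++k) {
                acc += (int32_t)xT[m * K + k] * (int32_t)hT[k * N + n];
            }
            yT[m * N + n] = acc;
        }
    }
}
```

A per-output-feature bias is the special case in which every row of B^T holds the same N values.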

	MMALIB_CNN_pixelShuffle_row_ixX_ixX_oxX
	Kernel for computing dense CNN convolution with row-based processing and matrix multiplication, followed by column interleaving. The result is a partial output of the final form produced by the pixel shuffle operator.
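
For orientation, the complete pixel shuffle operator that this kernel contributes to can be sketched as follows. This is the commonly used depth-to-space definition, written as a plain reference loop; it is not the kernel's actual output layout, which the text describes only as a partial result. The function name `pixel_shuffle_ref`, the channel-major (C, H, W) layout, and the upscale factor `r` are assumptions made here for illustration. The `w * r + j` term in the output index is the column interleaving step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference model, not an MMALIB API.
 * Rearranges an input of shape (C*r*r, H, W) into an output of shape
 * (C, H*r, W*r), both stored channel-major and row-major:
 *   out[c][h*r + i][w*r + j] = in[c*r*r + i*r + j][h][w]
 */
void pixel_shuffle_ref(const int8_t *in, int8_t *out,
                       size_t C, size_t H, size_t W, size_t r)
{
    assert(in != NULL && out != NULL && r > 0);

    const size_t Ho = H * r;
    const size_t Wo = W * r;

    for (size_t c = 0; c < C; ++c) {
        for (size_t i = 0; i < r; ++i) {
            for (size_t j = 0; j < r; ++j) {
                const size_t cin = c * r * r + i * r + j;
                for (size_t h = 0; h < H; ++h) {
                    for (size_t w = 0; w < W; ++w) {
                        /* Row interleave via h*r + i, column interleave via w*r + j. */
                        out[(c * Ho + h * r + i) * Wo + (w * r + j)] =
                            in[(cin * H + h) * W + w];
                    }
                }
            }
        }
    }
}
```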

	MMALIB_CNN_tensor_convert_ixX_oxX
	Kernel for converting tensors of various datatypes and formats.
