MMALIB User Guide
This module consists of kernels that implement the core computations of convolutional neural networks, together with kernel performance estimation functions that estimate the compute cycles of those kernels.
Sub Modules | |
| MMALIB_CNN_convolveBias_row_ixX_ixX_oxX | |
| Kernel for computing dense CNN convolution with row-based processing. | |
| MMALIB_CNN_convolve_col_smallNo_highPrecision | |
| NOTE: This API is now a wrapper to MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost with the lutValues input argument set to NULL. It is recommended to call MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost directly. | |
| MMALIB_CNN_convolve_col_smallNo_highPrecision_pointwisePost | |
| Kernel for computing CNN-style 2D convolution using column-major data ordering on the input and output feature maps. This approach is faster when filter grouping is chosen such that Ni = No = 1 or such that Ni × Fr × Fc < MMA_SIZE; otherwise, use the regular convolution kernel MMALIB_CNN_convolve_row_ixX_ixX_oxX. This kernel is also referred to as depth-wise convolution. | |
| MMALIB_CNN_convolve_col_smallNo_ixX_ixX_oxX | |
| Kernel for computing CNN-style 2D convolution using column-major data ordering on the input and output feature maps. This approach is faster when filter grouping is chosen such that Ni = No = 1 or such that Ni × Fr × Fc < MMA_SIZE; otherwise, use the regular convolution kernel MMALIB_CNN_convolve_row_ixX_ixX_oxX. This kernel is also referred to as depth-wise convolution. | |
| MMALIB_CNN_convolve_row_ixX_ixX_oxX | |
| Kernel for computing dense CNN convolution with row-based processing and matrix multiplication. | |
| MMALIB_CNN_deconvolveBias_row_ixX_ixX_oxX | |
| Kernel for computing dense CNN deconvolution with row-based processing and matrix-matrix multiplication. | |
| MMALIB_CNN_deconvolve_row_ixX_ixX_oxX | |
| Kernel for computing dense CNN deconvolution with row-based processing and matrix-matrix multiplication. | |
| MMALIB_CNN_fullyConnectedBias_ixX_ixX_oxX | |
| Kernel providing the compute functionality of a fully connected layer: \( Y^T = X^T \times H^T + B^T \). | |
| MMALIB_CNN_fullyConnected_ixX_ixX_oxX | |
| Kernel providing the compute functionality of a fully connected layer: \( Y^T = X^T \times H^T \). | |
| MMALIB_CNN_pixelShuffle_row_ixX_ixX_oxX | |
| Kernel for computing dense CNN convolution with row-based processing and matrix multiplication, followed by column interleaving, which produces a partial output in the final form of the pixel shuffle operator. | |
| MMALIB_CNN_tensor_convert_ixX_oxX | |
| Kernel for converting tensors of various datatypes and formats. | |
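
The fully connected entries in the table above are described by \( Y^T = X^T \times H^T + B^T \). As a rough illustration of that arithmetic only, the sketch below evaluates the formula with plain loops in C. It is not the MMALIB API: the function name `fully_connected_ref`, the row-major layout, the `int32_t` element type, and the choice to store \( B^T \) with the same shape as \( Y^T \) are all assumptions made for this example, and no quantization, shifting, or saturation is modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference helper, NOT part of MMALIB.
 * Computes yT = xT * hT + bT with row-major storage (an assumption):
 *   xT: M x K   (one input vector per row)
 *   hT: K x N   (transposed weights)
 *   bT: M x N   (bias, same shape as yT; pass NULL for the no-bias form)
 *   yT: M x N   (one output vector per row)
 */
static void fully_connected_ref(const int32_t *xT, const int32_t *hT,
                                const int32_t *bT, int32_t *yT,
                                size_t M, size_t K, size_t N)
{
    for (size_t m = 0; m < M; ++m) {
        for (size_t n = 0; n < N; ++n) {
            /* Start from the bias term, or zero when no bias is given. */
            int32_t acc = (bT != NULL) ? bT[m * N + n] : 0;
            /* Dot product of row m of xT with column n of hT. */
            for (size_t k = 0; k < K; ++k) {
                acc += xT[m * K + k] * hT[k * N + n];
            }
            yT[m * N + n] = acc;
        }
    }
}
```

Passing `NULL` for `bT` gives the bias-free form \( Y^T = X^T \times H^T \), which mirrors the split between the two fully connected entries in the table. A reference like this can be handy for checking results on small inputs, under the stated layout assumptions.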