Detailed Description

PRELIMINARY APU driver interface

WARNING: These APIs are PRELIMINARY and subject to change in the next few months.

Overview

The APU driver allows you to interact with the Algorithm Processing Unit (APU), a linear algebra accelerator peripheral using 64-bit complex numbers. Complex numbers are represented by the float complex type from complex.h, where each number is made up of two 32-bit floats. The APU works with complex numbers in either Cartesian or Polar format. In Cartesian format, the lower 32 bits are the real part and the upper 32 bits are the imaginary part. In Polar format, the lower 32 bits are the absolute value (magnitude) and the upper 32 bits are the angle, represented as pi radians. As long as all arguments to an APU function use the same format, the result will also be in that format. Mixing formats produces wrong results. The driver provides basic data types for vectors and matrices constructed from float complex, as well as many common linear algebra operations on these data types.
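The element layout described above can be illustrated on a host machine with plain C. This is a minimal sketch, not driver code: splitWords(), makeCartesian() and makePolar() are hypothetical helpers, and makePolar() assumes that "represented as pi radians" means the angle is stored as a multiple of pi (so 90 degrees is stored as 0.5). That reading is an assumption and should be checked against the device documentation.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* Hypothetical helper: read the lower and upper 32-bit words of one element. */
static void splitWords(float complex z, float *lower, float *upper)
{
    const unsigned char *bytes = (const unsigned char *)&z;

    /* One element is 64 bits: two 32-bit floats stored back to back. */
    assert(sizeof(z) == 2 * sizeof(float));
    memcpy(lower, bytes, sizeof(float));
    memcpy(upper, bytes + sizeof(float), sizeof(float));
}

/* Cartesian element: lower word = real part, upper word = imaginary part. */
static float complex makeCartesian(float re, float im)
{
    return re + im * I;
}

/* Polar element: lower word = magnitude, upper word = angle.
 * ASSUMPTION: the angle is stored as a multiple of pi (180 degrees -> 1.0). */
static float complex makePolar(float magnitude, float angleDegrees)
{
    return magnitude + (angleDegrees / 180.0f) * I;
}
```

Both helpers return the same C type, which is why the format cannot be detected from the data itself and must be kept consistent by the caller.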

Data management

All the vector/matrix operations can accept input and output pointers that are inside or outside APU RAM. If all pointers provided to an operation are inside APU RAM, the APU will operate in scratchpad mode. This means the driver will assume that, given the current pointers, the input is already in APU memory and that input and result will not overlap each other. Therefore, no data will be copied, and the function output will be placed inside APU memory. This is the most efficient way to utilize the APU and is ideal for algorithms with multiple operations that feed into each other, such as MUSIC (https://en.wikipedia.org/wiki/MUSIC_(algorithm)). When chaining together operations, make sure as many as possible use vectors and matrices that are already in APU memory, to prevent unnecessary copying and overhead.

If any of the pointers are outside APU memory, the driver will copy input data to the start of its memory, place the result immediately following, and then copy the output back to the provided pointer. This may overwrite data that was already in this location.
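The choice between scratchpad mode and copy mode described above comes down to an address-range check on every pointer passed to an operation. The following is a sketch of that decision only. ApuWindow, isInsideApu() and useScratchpad() are hypothetical names introduced for illustration, and the driver's actual check is not shown in this document.

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the APU RAM address window. On a real device
 * this would be derived from APULPF3_MEM_BASE and the APU RAM size. */
typedef struct
{
    uintptr_t base;
    size_t sizeBytes;
} ApuWindow;

/* True if ptr points into the APU RAM window. */
static bool isInsideApu(const ApuWindow *win, const void *ptr)
{
    uintptr_t addr = (uintptr_t)ptr;
    return addr >= win->base && (addr - win->base) < win->sizeBytes;
}

/* Scratchpad mode applies only when every pointer is inside APU RAM.
 * A single outside pointer means the data has to be copied. */
static bool useScratchpad(const ApuWindow *win,
                          const void *inA,
                          const void *inB,
                          const void *result)
{
    return isInsideApu(win, inA) && isInsideApu(win, inB) && isInsideApu(win, result);
}
```

On a host machine the window can be pointed at an ordinary array to exercise the logic, since the check itself never dereferences the pointers.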

Warning: The APU memory has an access limitation that must be respected so as not to cause a bus fault. Software must not perform back-to-back read or write instructions to APU memory in any combination (RR, WW, WR, RW); there must be some other instruction in between. For safety, load data into APU memory using any of the APU operations, APULPF3_loadArgMirrored() or APULPF3_loadTriangular(). Copying data back from APU memory is handled automatically by the driver, and happens in an interrupt when the result pointer is outside APU memory.

The primary purpose of this driver is executing the MUSIC algorithm for distance estimation in Bluetooth Channel Sounding. An implementation of MUSIC using the APU can be found in the apu_music example.

Usage

This section will cover driver usage.

Synopsis

APULPF3_init();
float complex *apuMem = (float complex *)APULPF3_MEM_BASE;
float complex argA[10];
float complex argB[10];
APULPF3_ComplexVector vecA = {.data = argA, .size = 10};
APULPF3_ComplexVector vecB = {.data = argB, .size = 10};
APULPF3_ComplexVector resultVec = {.data = apuMem, .size = 10};
// Get control of APU
APULPF3_startOperationSequence();
// Perform element-wise product (no conjugation), placing the result
// in resultVec, which is inside APU memory
APULPF3_vectorMult(&vecA, &vecB, false, &resultVec);
// Perform non-conjugated dot product inside of APU memory, which
// reduces overhead.
APULPF3_dotProduct(&resultVec, &resultVec, false, apuMem);
// Give up control of APU
APULPF3_stopOperationSequence();

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <complex.h>
#include <ti/drivers/dpl/HwiP.h>
#include <ti/drivers/dpl/SemaphoreP.h>
#include <ti/devices/DeviceFamily.h>
#include <DeviceFamily_constructPath(inc/hw_memmap.h)>

[Figure: include dependency graph for APULPF3.h]

Data Structures
struct	APULPF3_ComplexVector
	APULPF3 Vector Struct.

struct	APULPF3_ComplexMatrix
	APULPF3 Matrix Struct.

struct	APULPF3_ComplexTriangleMatrix
	APULPF3 Upper Triangle Matrix Struct.

struct	APULPF3_Config
	APULPF3 Global configuration.

struct	APULPF3_HWAttrs
	APULPF3 Hardware attributes.

Macros
#define	APULPF3_STATUS_SUCCESS 0
	Successful status code.

#define	APULPF3_STATUS_ERROR 1
	Generic error status code.

#define	APULPF3_STATUS_RESOURCE_UNAVAILABLE 2
	An error status code returned if the hardware or software resource is currently unavailable.

#define	APULPF3_RESULT_INPLACE 0
	APU operation is in-place, overwriting the input.

#define	APULPF3_MEM_BASE VCERAM_DATA0_BASE
	Start of APU RAM.

#define	APULPF3_MEM_SIZE_MIRRORED APURAM_DATA0_SIZE
	Size of APU RAM in mirrored mode.

Enumerations
enum	APULPF3_OperationMode { APULPF3_OperationMode_MIRRORED }
	Define the APU memory operation modes, which are the ways the APU expects data to be stored in its memory.

enum	APULPF3_SchedulingMode { APULPF3_SchedulingMode_PIPELINED }
	Define the APU memory scheduling modes, which are the ways APU memory operations are scheduled and pipelined.

Functions
void	APULPF3_init (void)
	APU init function.

void	APULPF3_startOperationSequence ()
	APU function to prepare the start of an operation chain.

void	APULPF3_stopOperationSequence ()
	APU function to finish an operation chain.

int_fast16_t	APULPF3_dotProduct (APULPF3_ComplexVector vecA, APULPF3_ComplexVector vecB, bool conjugate, float complex *result)
	APU function for calculating the dot product of two vectors, with the option to perform the complex conjugate on the second vector first.

int_fast16_t	APULPF3_vectorMult (APULPF3_ComplexVector vecA, APULPF3_ComplexVector vecB, bool conjugate, APULPF3_ComplexVector *result)
	APU function for calculating the element-wise product of two vectors, with the option to perform the complex conjugate on the second vector first.

int_fast16_t	APULPF3_vectorSum (APULPF3_ComplexVector vecA, APULPF3_ComplexVector vecB, APULPF3_ComplexVector *result)
	APU function for calculating the element-wise sum of two vectors.

int_fast16_t	APULPF3_cartesianToPolarVector (APULPF3_ComplexVector vec, APULPF3_ComplexVector result)
	APU function for converting a complex vector in Cartesian format to Polar format.

int_fast16_t	APULPF3_polarToCartesianVector (APULPF3_ComplexVector vec, float complex temp, APULPF3_ComplexVector *result)
	APU function for converting a complex vector in Polar format to Cartesian format.

int_fast16_t	APULPF3_sortVector (APULPF3_ComplexVector vec, APULPF3_ComplexVector result)
	APU function for sorting the real parts of a complex vector in descending order. This function ignores the imaginary part of each element and makes no guarantees about its contents after the operation is complete.

int_fast16_t	APULPF3_covMatrixSpatialSmoothing (APULPF3_ComplexVector vec, uint16_t covMatrixSize, bool fbAveraging, APULPF3_ComplexTriangleMatrix result)
	APU function for covariance matrix computation using spatial smoothing and optionally forward-backward averaging.

int_fast16_t	APULPF3_computeFFT (APULPF3_ComplexVector vec, bool inverse, APULPF3_ComplexVector result)
	APU function for computing the Discrete Fourier Transform (DFT) of a complex vector using the Fast Fourier Transform (FFT) algorithm. Optionally, the inverse DFT can be computed. Combines two APU operations: first configuring the APU for a Fourier transform, then actually computing it.

int_fast16_t	APULPF3_matrixMult (APULPF3_ComplexMatrix matA, APULPF3_ComplexMatrix matB, APULPF3_ComplexMatrix *result)
	APU function for multiplying two matrices. The number of columns in the first matrix must be equal to the number of rows in the second matrix.

int_fast16_t	APULPF3_matrixSum (APULPF3_ComplexMatrix matA, APULPF3_ComplexMatrix matB, APULPF3_ComplexMatrix *result)
	APU function for adding two matrices. The matrices must be exactly the same size.

int_fast16_t	APULPF3_matrixScalarSum (APULPF3_ComplexMatrix mat, float complex scalar, APULPF3_ComplexMatrix *result)
	APU function for adding a scalar to each of a matrix's elements.

int_fast16_t	APULPF3_matrixScalarMult (APULPF3_ComplexMatrix mat, float complex scalar, APULPF3_ComplexMatrix *result)
	APU function for multiplying each of a matrix's elements by a scalar.

int_fast16_t	APULPF3_matrixNorm (APULPF3_ComplexMatrix mat, float complex result)
	Compute the Frobenius norm of a matrix.

int_fast16_t	APULPF3_jacobiEVD (APULPF3_ComplexTriangleMatrix mat, uint16_t maxIter, float stopThreshold, APULPF3_ComplexVector result)
	APU function to compute the Jacobi eigenvalue decomposition (EVD) of a triangular Hermitian matrix. The result is a diagonal matrix of eigenvalues, stored in triangular form and sorted in descending order, which replaces the original matrix, plus a full matrix whose columns are the eigenvectors. The eigenvector matrix can be placed anywhere in APU memory except where the input matrix is placed. When not using scratchpad mode, the eigenvector matrix will be placed next to the eigenvalue matrix.

int_fast16_t	APULPF3_gaussJordanElim (APULPF3_ComplexMatrix mat, float zeroThreshold, APULPF3_ComplexMatrix result)
	Reduce the input matrix A[MxN] to reduced row echelon form using Gauss-Jordan elimination.

int_fast16_t	APULPF3_unitCircle (uint16_t numPoints, uint16_t constant, uint16_t phase, bool conjugate, APULPF3_ComplexVector *result)
	Generate points evenly distributed on a unit circle. The APU generates the unit circle as follows: exp(-j2pi(kM+phase)/1024 * (-1)^(conjugate))

void	APULPF3_prepareResult (uint16_t resultSize, uint16_t inputSize, complex float *resultBuffer)
	Configure the APU pointers for temporary (in APU memory) and final results for an APU operation. This function is intended to be used inside APU operations, such as dot product.

uint16_t	APULPF3_prepareVectors (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB)
	Configure the APU for vector operation inputs. If not in scratchpad mode, the vectors are loaded one after the other into the start of APU memory.

uint16_t	APULPF3_prepareMatrices (APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB)
	Configure the APU for matrix operation inputs. If not in scratchpad mode, the matrices are loaded one after the other into the start of APU memory.

void *	APULPF3_loadTriangular (APULPF3_ComplexMatrix *mat, uint16_t offset)
	Loads the upper triangular part of a full matrix into APU memory.

void *	APULPF3_loadArgMirrored (uint16_t argSize, uint16_t offset, float complex *src)
	Load operation arguments into APU memory, assuming the APU is in mirrored mode. This means memory is viewed as one block that can fit 1024 complex numbers.
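
The unit-circle expression listed for APULPF3_unitCircle() in the function list above can be written out in full. This is one reading of that one-line formula, not a statement taken from the driver source: k is assumed to be the point index running from 0 to numPoints - 1, M is assumed to be the constant argument, and c is assumed to be 1 when conjugate is true and 0 otherwise.

```latex
z_k = \exp\!\left( (-1)^{c} \cdot \frac{-j\,2\pi\,(kM + \mathrm{phase})}{1024} \right),
\qquad k = 0, 1, \dots, \mathrm{numPoints} - 1
```

Under these assumptions, numPoints = 4, constant = 256, phase = 0 and conjugate = false give z_0 = 1, z_1 = -j, z_2 = -1 and z_3 = j. With conjugate = true the sign of the exponent flips and the sequence becomes 1, j, -1, -j.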

Macro Definition Documentation

§ APULPF3_RESULT_INPLACE

#define APULPF3_RESULT_INPLACE 0

APU operation is in-place, overwriting the input.

§ APULPF3_MEM_BASE

#define APULPF3_MEM_BASE VCERAM_DATA0_BASE

Start of APU RAM.

§ APULPF3_MEM_SIZE_MIRRORED

#define APULPF3_MEM_SIZE_MIRRORED APURAM_DATA0_SIZE

Size of APU RAM in mirrored mode.

Enumeration Type Documentation

§ APULPF3_OperationMode

enum APULPF3_OperationMode

Define the APU memory operation modes, which are the ways the APU expects data to be stored in its memory.

Enumerator
APULPF3_OperationMode_MIRRORED	In mirrored mode, elements are stored sequentially in APU memory.

§ APULPF3_SchedulingMode

enum APULPF3_SchedulingMode

Define the APU memory scheduling modes, which are the ways APU memory operations are scheduled and pipelined.

Enumerator
APULPF3_SchedulingMode_PIPELINED	In pipelined mode, read operations proceed per clock cycle, and only the incoming samples will regulate the data flow.

Function Documentation

§ APULPF3_init()

void APULPF3_init ( void )

APU init function.

This function initializes the APU internal state. It sets power dependencies, loads the necessary firmware, configures the APU, and sets up interrupts and semaphores.

§ APULPF3_startOperationSequence()

void APULPF3_startOperationSequence ( )

APU function to prepare the start of an operation chain.

This function disables standby and acquires exclusive access to the APU. It should be called before any sequence of APU operations, wrapping them together with APULPF3_stopOperationSequence(). APU operations should not be called outside of a pair of APULPF3_startOperationSequence() and APULPF3_stopOperationSequence() calls.

Precondition: APULPF3_init() has been called.

§ APULPF3_stopOperationSequence()

void APULPF3_stopOperationSequence ( )

APU function to finish an operation chain.

This function enables standby and releases exclusive access to the APU. It should be called after any sequence of APU operations, wrapping them together with APULPF3_startOperationSequence(). APU operations should not be called outside of a pair of APULPF3_startOperationSequence() and APULPF3_stopOperationSequence() calls.

Precondition: APULPF3_init() and APULPF3_startOperationSequence() have been called.

§ APULPF3_prepareResult()

void APULPF3_prepareResult	(	uint16_t	resultSize,
		uint16_t	inputSize,
		complex float *	resultBuffer
	)

Configure the APU pointers for temporary (in APU memory) and final results for an APU operation. This function is intended to be used inside APU operations, such as dot product.

Parameters

[in]	resultSize	the size of an operation result
[in]	inputSize	the size of an operation input
[out]	resultBuffer	a pointer to where the operation result will be placed. In scratchpad mode, this is a location in APU memory. Otherwise, usually when an external buffer is supplied, the APU is configured to place the result at the start of APU memory before it is copied to the buffer.

Returns: A status code indicating whether the APU operation was a success.

Return values

APULPF3_STATUS_SUCCESS The call was successful.

§ APULPF3_prepareVectors()

uint16_t APULPF3_prepareVectors	(	APULPF3_ComplexVector *	vecA,
		APULPF3_ComplexVector *	vecB
	)

Configure the APU for vector operation inputs. If not in scratchpad mode, the vectors are loaded one after the other into the start of APU memory.

Parameters

[in]	vecA	a pointer to an input vector
[in]	vecB	a pointer to an input vector, potentially the same as `vecA`

Returns: The total input size. If the input vectors are the same, this is the vector's size, otherwise it is the sum of the vector sizes.
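
The return-value rule above can be sketched in a few lines of host-side C. SketchVector is a stand-in that only mirrors the .data and .size fields used in the Synopsis, and totalInputSize() is a hypothetical helper. Treating two vectors as "the same" when their data pointers are equal is an assumption, since the driver's own comparison is not shown here.

```c
#include <assert.h>
#include <complex.h>
#include <stdint.h>

/* Stand-in for APULPF3_ComplexVector, limited to the fields used in the
 * Synopsis (.data and .size). Not the driver's actual definition. */
typedef struct
{
    float complex *data;
    uint16_t size;
} SketchVector;

/* Total number of complex elements that make up the operation input.
 * ASSUMPTION: vectors count as "the same" when they share a data pointer. */
static uint16_t totalInputSize(const SketchVector *vecA, const SketchVector *vecB)
{
    if (vecA->data == vecB->data)
    {
        /* Same vector passed twice: only one copy is needed. */
        return vecA->size;
    }
    /* Distinct vectors are placed one after the other. */
    return (uint16_t)(vecA->size + vecB->size);
}
```

For a vector of 10 elements passed as both arguments this yields 10, while a 10-element and a 5-element vector together yield 15.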

§ APULPF3_prepareMatrices()

uint16_t APULPF3_prepareMatrices	(	APULPF3_ComplexMatrix *	matA,
		APULPF3_ComplexMatrix *	matB
	)

Configure the APU for matrix operation inputs. If not in scratchpad mode, the matrices are loaded one after the other into the start of APU memory.

Parameters

[in]	matA	a pointer to an input matrix
[in]	matB	a pointer to an input matrix, potentially the same as `matA`

Returns: The total input size. If the input matrices are the same, this is the matrix's size, otherwise it is the sum of the matrix sizes.

§ APULPF3_loadTriangular()

void* APULPF3_loadTriangular	(	APULPF3_ComplexMatrix *	mat,
		uint16_t	offset
	)

Loads the upper triangular part of a full matrix into APU memory.

Parameters

[in]	mat	a pointer to a source matrix
[in]	offset	offset into APU memory to load to

Returns: Pointer in APU memory where the argument is stored.
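
To make "the upper triangular part of a full matrix" concrete, the sketch below packs the upper triangle of an n x n matrix into a contiguous buffer on the host. packUpperTriangle() is a hypothetical helper, and both the row-major input and the row-by-row packing order are assumptions: this section does not state the storage order the APU expects, so the ordering here is for illustration only.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the upper triangle (diagonal included) of an n x n matrix into
 * 'packed' and return the number of elements written, n * (n + 1) / 2.
 * ASSUMPTIONS: 'full' is row-major and the triangle is packed row by row. */
static size_t packUpperTriangle(const float complex *full, uint16_t n, float complex *packed)
{
    size_t count = 0;

    for (uint16_t row = 0; row < n; row++)
    {
        /* Start at the diagonal and skip everything below it. */
        for (uint16_t col = row; col < n; col++)
        {
            packed[count++] = full[(size_t)row * n + col];
        }
    }
    return count;
}
```

For a 3 x 3 matrix holding 1 to 9 in row-major order, the packed buffer holds the six values 1, 2, 3, 5, 6, 9. Storing only the triangle is sufficient for a Hermitian matrix, because the lower triangle is the conjugate of the upper one.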

§ APULPF3_loadArgMirrored()

void* APULPF3_loadArgMirrored	(	uint16_t	argSize,
		uint16_t	offset,
		float complex *	src
	)

Load operation arguments into APU memory, assuming the APU is in mirrored mode. This means memory is viewed as one block that can fit 1024 complex numbers.

Parameters

[in]	argSize	how many complex numbers to load
[in]	offset	offset into APU memory to load to
[in]	src	data source pointer

Returns: Pointer in APU memory where the argument is stored.
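
Since mirrored mode treats APU memory as one block of 1024 complex numbers, a caller may want to check that an argument fits before loading it. The sketch below shows such a check. fitsInMirroredMemory(), mirroredCapacityBytes() and SKETCH_MIRRORED_CAPACITY are hypothetical names, and the check assumes that offset is counted in complex elements, which this section does not state.

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Capacity in complex elements, taken from the description above. */
#define SKETCH_MIRRORED_CAPACITY 1024u

/* True if argSize elements starting at 'offset' stay inside the block.
 * ASSUMPTION: 'offset' is measured in complex elements, not bytes. */
static bool fitsInMirroredMemory(uint16_t argSize, uint16_t offset)
{
    /* Written to avoid overflow in offset + argSize. */
    return argSize <= SKETCH_MIRRORED_CAPACITY &&
           offset <= SKETCH_MIRRORED_CAPACITY - argSize;
}

/* The same capacity expressed in bytes: 1024 elements of 64 bits each. */
static size_t mirroredCapacityBytes(void)
{
    return SKETCH_MIRRORED_CAPACITY * sizeof(float complex);
}
```

With two 32-bit floats per element, the block works out to 8192 bytes.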
