PRELIMINARY APU driver interface
WARNING These APIs are PRELIMINARY, and subject to change in the next few months.
The APU driver allows you to interact with the Algorithm Processing Unit (APU), a linear algebra accelerator peripheral that operates on 64-bit complex numbers. Complex numbers are represented by the float complex type from complex.h, where each number is made up of two 32-bit floats. The APU works with complex numbers in either Cartesian or Polar format. In Cartesian format, the lower 32 bits are the real part and the upper 32 bits are the imaginary part. In Polar format, the lower 32 bits are the absolute value and the upper 32 bits are the angle, represented as pi radians. As long as all arguments to an APU function use the same format, the result will also be in that format. Mixing formats produces wrong results. The driver provides basic data types for vectors and matrices constructed from float complex, as well as many common linear algebra operations on these data types.
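The Cartesian layout described above can be checked on a host machine with a minimal sketch. It relies only on the C standard guarantee that float complex is laid out like an array of two floats with the real part first. The helper name split_cartesian is hypothetical and not part of the driver, and the Polar format is deliberately left out.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* A float complex must occupy 64 bits: two 32-bit floats. */
static_assert(sizeof(float complex) == 2 * sizeof(float),
              "float complex must be two 32-bit floats");

/* Hypothetical helper (not a driver function): split a Cartesian value
 * into its two 32-bit halves by copying the raw object representation. */
void split_cartesian(float complex z, float *re, float *im)
{
    float parts[2];
    memcpy(parts, &z, sizeof parts);
    *re = parts[0]; /* first float in memory: real part */
    *im = parts[1]; /* second float in memory: imaginary part */
}
```

On a little-endian target, the first float in memory corresponds to the lower 32 bits of the 64-bit value, which matches the description above.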
All the vector/matrix operations can accept input and output pointers that are inside or outside APU RAM. If all pointers provided to an operation are inside APU RAM, the APU will operate in scratchpad mode. This means the driver will assume that, given the current pointers, the input is already in APU memory and that input and result will not overlap each other. Therefore, no data will be copied, and the function output will be placed inside APU memory. This is the most efficient way to utilize the APU and is ideal for algorithms with multiple operations that feed into each other, such as MUSIC (https://en.wikipedia.org/wiki/MUSIC_(algorithm)). When chaining together operations, make sure as many as possible use vectors and matrices that are already in APU memory, to prevent unnecessary copying and overhead.
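The rule that scratchpad mode applies only when every pointer lies inside APU RAM can be illustrated with a small host-side sketch. The functions in_region and use_scratchpad are hypothetical and do not show how the driver itself makes the decision. The region base and size are passed in as parameters, where on a device they would presumably come from APULPF3_MEM_BASE and APULPF3_MEM_SIZE_MIRRORED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if ptr points into [base, base + size). */
bool in_region(const void *ptr, uintptr_t base, size_t size)
{
    uintptr_t p = (uintptr_t)ptr;
    return p >= base && (p - base) < size;
}

/* Hypothetical helper: scratchpad mode is assumed to apply only when
 * every operand pointer lies inside the region. */
bool use_scratchpad(const void *const ptrs[], size_t count,
                    uintptr_t base, size_t size)
{
    assert(ptrs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!in_region(ptrs[i], base, size)) {
            return false; /* one outside pointer forces the copy path */
        }
    }
    return true;
}
```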
If any of the pointers are outside APU memory, the driver will copy input data to the start of its memory, place the result immediately following, and then copy the output back to the provided pointer. This may overwrite data that was already in this location.
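The copy-in, compute, copy-out flow for pointers outside APU memory can be simulated on a host as follows. This is only a sketch: sim_ram stands in for APU RAM, sim_vector_sum is a hypothetical function, and the capacity of 1024 complex numbers is taken from the mirrored-mode description elsewhere on this page. The real driver may lay out or size things differently.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <string.h>

#define SIM_RAM_LEN 1024 /* assumed capacity, in complex numbers */

float complex sim_ram[SIM_RAM_LEN]; /* stand-in for APU RAM */

/* Hypothetical element-wise sum using the non-scratchpad flow.
 * Returns 0 on success, 1 if the operands do not fit. */
int sim_vector_sum(const float complex *a, const float complex *b,
                   size_t n, float complex *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    if (n > SIM_RAM_LEN / 3) {
        return 1; /* two inputs plus the result must fit */
    }
    memcpy(&sim_ram[0], a, n * sizeof *a); /* input A at start of memory */
    memcpy(&sim_ram[n], b, n * sizeof *b); /* input B right after A */
    float complex *res = &sim_ram[2 * n];  /* result follows the inputs */
    for (size_t i = 0; i < n; i++) {
        res[i] = sim_ram[i] + sim_ram[n + i];
    }
    memcpy(out, res, n * sizeof *out);     /* copy back to the caller */
    return 0;
}
```

Anything previously stored at the start of sim_ram is lost after the call, which mirrors the overwrite caveat above.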
The primary purpose of this driver is executing the MUSIC algorithm for distance estimation in Bluetooth Channel Sounding. An implementation of MUSIC using the APU can be found in the apu_music example.
This section will cover driver usage.
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <complex.h>
#include <ti/drivers/dpl/HwiP.h>
#include <ti/drivers/dpl/SemaphoreP.h>
#include <ti/devices/DeviceFamily.h>
#include <DeviceFamily_constructPath(inc/hw_memmap.h)>
Data Structures | |
struct | APULPF3_ComplexVector |
APULPF3 Vector Struct. | |
struct | APULPF3_ComplexMatrix |
APULPF3 Matrix Struct. | |
struct | APULPF3_ComplexTriangleMatrix |
APULPF3 Upper Triangle Matrix Struct. | |
struct | APULPF3_Config |
APULPF3 Global configuration. | |
struct | APULPF3_HWAttrs |
APULPF3 Hardware attributes. | |
Macros | |
#define | APULPF3_STATUS_SUCCESS 0 |
Successful status code. | |
#define | APULPF3_STATUS_ERROR 1 |
Generic error status code. | |
#define | APULPF3_STATUS_RESOURCE_UNAVAILABLE 2 |
An error status code returned if the hardware or software resource is currently unavailable. | |
#define | APULPF3_RESULT_INPLACE 0 |
APU operation is in-place, overwriting the input. | |
#define | APULPF3_MEM_BASE VCERAM_DATA0_BASE |
Start of APU RAM. | |
#define | APULPF3_MEM_SIZE_MIRRORED APURAM_DATA0_SIZE |
Size of APU RAM in mirrored mode. | |
Enumerations | |
enum | APULPF3_OperationMode { APULPF3_OperationMode_MIRRORED } |
Define the APU memory operation modes, which are the ways the APU expects data to be stored in its memory. | |
enum | APULPF3_SchedulingMode { APULPF3_SchedulingMode_PIPELINED } |
Define the APU memory scheduling modes, which are the ways APU memory operations are scheduled and pipelined. | |
Functions | |
void | APULPF3_init (void) |
APU init function. | |
void | APULPF3_startOperationSequence () |
APU function to prepare the start of an operation chain. | |
void | APULPF3_stopOperationSequence () |
APU function to finish an operation chain. | |
int_fast16_t | APULPF3_dotProduct (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, bool conjugate, float complex *result) |
APU function for calculating the dot product of two vectors, with the option to perform the complex conjugate on the second vector first. | |
int_fast16_t | APULPF3_vectorMult (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, bool conjugate, APULPF3_ComplexVector *result) |
APU function for calculating the element-wise product of two vectors, with the option to perform the complex conjugate on the second vector first. | |
int_fast16_t | APULPF3_vectorSum (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, APULPF3_ComplexVector *result) |
APU function for calculating the element-wise sum of two vectors. | |
int_fast16_t | APULPF3_cartesianToPolarVector (APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result) |
APU function for converting a complex vector in Cartesian format to Polar format. | |
int_fast16_t | APULPF3_polarToCartesianVector (APULPF3_ComplexVector *vec, float complex *temp, APULPF3_ComplexVector *result) |
APU function for converting a complex vector in Polar format to Cartesian format. | |
int_fast16_t | APULPF3_sortVector (APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result) |
APU function for sorting the real parts of a complex vector in descending order. This function ignores the imaginary part of each element and makes no guarantees about its contents after the operation is complete. | |
int_fast16_t | APULPF3_covMatrixSpatialSmoothing (APULPF3_ComplexVector *vec, uint16_t covMatrixSize, bool fbAveraging, APULPF3_ComplexTriangleMatrix *result) |
APU function for covariance matrix computation using spatial smoothing and optionally forward-backward averaging. | |
int_fast16_t | APULPF3_computeFFT (APULPF3_ComplexVector *vec, bool inverse, APULPF3_ComplexVector *result) |
APU function for computing the Discrete Fourier Transform (DFT) of a complex vector using the Fast Fourier Transform (FFT) algorithm. Optionally, the inverse DFT can be computed. Combines two APU operations: first configuring the APU for a Fourier transform, then actually computing it. | |
int_fast16_t | APULPF3_matrixMult (APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB, APULPF3_ComplexMatrix *result) |
APU function for multiplying two matrices. The number of columns in the first matrix must be equal to the number of rows in the second matrix. | |
int_fast16_t | APULPF3_matrixSum (APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB, APULPF3_ComplexMatrix *result) |
APU function for adding two matrices. The matrices must be exactly the same size. | |
int_fast16_t | APULPF3_matrixScalarSum (APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result) |
APU function for adding a scalar to each element of a matrix. | |
int_fast16_t | APULPF3_matrixScalarMult (APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result) |
APU function for multiplying each element of a matrix by a scalar. | |
int_fast16_t | APULPF3_matrixNorm (APULPF3_ComplexMatrix *mat, float complex *result) |
Compute the Frobenius norm of a matrix. | |
int_fast16_t | APULPF3_jacobiEVD (APULPF3_ComplexTriangleMatrix *mat, uint16_t maxIter, float stopThreshold, APULPF3_ComplexVector *result) |
APU function to compute the Jacobi Eigen-Decomposition (EVD) of a triangular Hermitian matrix. This results in a triangular, diagonal matrix of eigenvalues sorted in descending order replacing the original matrix, and a full matrix whose columns are the eigenvectors. The eigenvector matrix can be placed anywhere in APU memory except where the input matrix is placed. When not using scratchpad mode, the eigenvector matrix will be placed next to the eigenvalue matrix. | |
int_fast16_t | APULPF3_gaussJordanElim (APULPF3_ComplexMatrix *mat, float zeroThreshold, APULPF3_ComplexMatrix *result) |
Reduce the input matrix A[MxN] to reduced echelon form using Gauss-Jordan Elimination. | |
int_fast16_t | APULPF3_unitCircle (uint16_t numPoints, uint16_t constant, uint16_t phase, bool conjugate, APULPF3_ComplexVector *result) |
Generate points evenly distributed on a unit circle. The APU generates the unit circle as follows: exp(-j*2*pi*(k*M+phase)/1024 * (-1)^(conjugate)) | |
void | APULPF3_prepareResult (uint16_t resultSize, uint16_t inputSize, complex float *resultBuffer) |
Configure the APU pointers for temporary (in APU memory) and final results for an APU operation. This function is intended to be used inside APU operations, such as dot product. | |
uint16_t | APULPF3_prepareVectors (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB) |
Configure the APU for vector operation inputs. If not in scratchpad mode, the vectors are loaded one after the other into the start of APU memory. | |
uint16_t | APULPF3_prepareMatrices (APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB) |
Configure the APU for matrix operation inputs. If not in scratchpad mode, the matrices are loaded one after the other into the start of APU memory. | |
void * | APULPF3_loadTriangular (APULPF3_ComplexMatrix *mat, uint16_t offset) |
Loads the upper triangular part of a full matrix into APU memory. | |
void * | APULPF3_loadArgMirrored (uint16_t argSize, uint16_t offset, float complex *src) |
Load operation arguments into APU memory, assuming the APU is in mirrored mode. This means memory is viewed as one block that can fit 1024 complex numbers. | |
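As a host-side cross-check for APULPF3_dotProduct, the following reference sketch shows what "conjugate the second vector first" means arithmetically. The function ref_dot_product is hypothetical, runs on the CPU rather than the APU, and takes plain arrays because the fields of APULPF3_ComplexVector are not listed on this page.

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU reference: sum of a[i] * b[i], or a[i] * conj(b[i])
 * when conjugate is true. Not a driver function. */
float complex ref_dot_product(const float complex *a, const float complex *b,
                              size_t n, bool conjugate)
{
    assert(a != NULL && b != NULL);
    float complex acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float complex bi = conjugate ? conjf(b[i]) : b[i];
        acc += a[i] * bi;
    }
    return acc;
}
```

Comparing APU output against a reference like this on small inputs is one way to confirm that both operands were supplied in the same (Cartesian) format.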
#define APULPF3_RESULT_INPLACE 0
APU operation is in-place, overwriting the input.
#define APULPF3_MEM_BASE VCERAM_DATA0_BASE
Start of APU RAM.
#define APULPF3_MEM_SIZE_MIRRORED APURAM_DATA0_SIZE
Size of APU RAM in mirrored mode.
void APULPF3_init(void)
APU init function.
This function initializes the APU internal state. It sets power dependencies, loads the necessary firmware, configures the APU, and sets up interrupts and semaphores.
void APULPF3_startOperationSequence()
APU function to prepare the start of an operation chain.
This function disables standby and acquires exclusive access to the APU. It should be called before any sequence of APU operations, wrapping them together with APULPF3_stopOperationSequence(). APU operations should not be called outside of a pair of APULPF3_startOperationSequence() and APULPF3_stopOperationSequence() functions.
void APULPF3_stopOperationSequence()
APU function to finish an operation chain.
This function enables standby and releases exclusive access to the APU. It should be called after any sequence of APU operations, wrapping them together with APULPF3_startOperationSequence(). APU operations should not be called outside of a pair of APULPF3_startOperationSequence() and APULPF3_stopOperationSequence() functions.
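The start/stop bracketing can be captured in a small wrapper so that the stop call is never forgotten. This is a sketch only: run_operation_sequence and ApuChainFn are hypothetical names, and the fake_* functions are host-side stand-ins that let the example run without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: a chain of APU operations returning a status. */
typedef int_fast16_t (*ApuChainFn)(void *ctx);

/* Hypothetical wrapper: always pairs the start and stop calls around the chain. */
int_fast16_t run_operation_sequence(void (*start)(void), void (*stop)(void),
                                    ApuChainFn chain, void *ctx)
{
    assert(start != NULL && stop != NULL && chain != NULL);
    start();
    int_fast16_t status = chain(ctx);
    stop(); /* runs even when the chain reports an error */
    return status;
}

/* Host-side stand-ins so the sketch runs without an APU. */
int sequence_depth = 0; /* 1 while inside a start/stop pair */
int chain_calls = 0;

void fake_start(void) { sequence_depth++; }
void fake_stop(void) { sequence_depth--; }

int_fast16_t fake_chain(void *ctx)
{
    (void)ctx;
    chain_calls++;
    return (sequence_depth == 1) ? 0 : 1; /* 0 only if properly bracketed */
}
```

On a device, the intent would be to pass APULPF3_startOperationSequence and APULPF3_stopOperationSequence as the first two arguments, with a chain function that issues the actual APU operations and returns their status.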
void APULPF3_prepareResult(uint16_t resultSize, uint16_t inputSize, complex float *resultBuffer)
Configure the APU pointers for temporary (in APU memory) and final results for an APU operation. This function is intended to be used inside APU operations, such as dot product.
[in] | resultSize | the size of an operation result |
[in] | inputSize | the size of an operation input |
[out] | resultBuffer | a pointer to where the operation result will be placed in APU memory. If scratchpad mode is not used (typically because an external buffer was supplied), the result is configured to be placed at the start of APU memory before it is copied to the buffer. |
APULPF3_STATUS_SUCCESS | The call was successful. |
uint16_t APULPF3_prepareVectors(APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB)
Configure the APU for vector operation inputs. If not in scratchpad mode, the vectors are loaded one after the other into the start of APU memory.
[in] | vecA | an input vector |
[in] | vecB | an input vector, potentially the same as vecA |
uint16_t APULPF3_prepareMatrices(APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB)
Configure the APU for matrix operation inputs. If not in scratchpad mode, the matrices are loaded one after the other into the start of APU memory.
[in] | matA | a pointer to an input matrix |
[in] | matB | a pointer to an input matrix, potentially the same as matA |
void *APULPF3_loadTriangular(APULPF3_ComplexMatrix *mat, uint16_t offset)
Loads the upper triangular part of a full matrix into APU memory.
[in] | mat | a pointer to a source matrix |
[in] | offset | offset into APU memory to load to |
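To make "upper triangular part of a full matrix" concrete, the sketch below packs the upper triangle of an n x n matrix into a flat array. Both the row-major input and the row-by-row packing order are assumptions made for illustration, since the layout the APU actually expects is not stated here. The function names are hypothetical.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Elements in the upper triangle (diagonal included) of an n x n matrix. */
size_t upper_triangle_count(size_t n)
{
    return n * (n + 1) / 2;
}

/* Hypothetical packing: copy the upper triangle of a row-major n x n
 * matrix into packed, row by row. Returns the number of elements written. */
size_t pack_upper_triangle(const float complex *full, size_t n,
                           float complex *packed)
{
    assert(full != NULL && packed != NULL);
    size_t k = 0;
    for (size_t r = 0; r < n; r++) {
        for (size_t c = r; c < n; c++) {
            packed[k++] = full[r * n + c];
        }
    }
    return k;
}
```

For a Hermitian matrix the lower triangle is the conjugate transpose of the upper one, so storing only n(n+1)/2 elements loses no information.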
void *APULPF3_loadArgMirrored(uint16_t argSize, uint16_t offset, float complex *src)
Load operation arguments into APU memory, assuming the APU is in mirrored mode. This means memory is viewed as one block that can fit 1024 complex numbers.
[in] | argSize | how many complex numbers to load |
[in] | offset | offset into APU memory to load to |
[in] | src | data source pointer |
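The idea of one block holding 1024 complex numbers can be modelled on a host as below. This sketch assumes that offset is counted in complex numbers and that an out-of-range load should be rejected. Neither point is confirmed by the text above, and sim_load_arg_mirrored is a hypothetical stand-in rather than the driver function.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_MIRRORED_CAPACITY 1024u /* complex numbers, per the text above */

float complex sim_mirrored_ram[SIM_MIRRORED_CAPACITY]; /* stand-in for APU RAM */

/* Hypothetical model: copy argSize complex numbers from src to the given
 * offset. Returns the destination, or NULL if the load would not fit. */
void *sim_load_arg_mirrored(uint16_t argSize, uint16_t offset,
                            const float complex *src)
{
    assert(src != NULL);
    if ((uint32_t)offset + argSize > SIM_MIRRORED_CAPACITY) {
        return NULL; /* assumed behavior; the driver may differ */
    }
    memcpy(&sim_mirrored_ram[offset], src, argSize * sizeof *src);
    return &sim_mirrored_ram[offset];
}
```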