APULPF3.h File Reference

Detailed Description

PRELIMINARY APU driver interface


WARNING: These APIs are PRELIMINARY and subject to change in the next few months.

Overview

The APU driver allows you to interact with the Algorithm Processing Unit (APU), a linear algebra accelerator peripheral using 64-bit complex numbers. Complex numbers are represented by the float complex type from complex.h, where each number is made up of two 32-bit floats. The APU works with complex numbers in either Cartesian or Polar formats. In the Cartesian representation, the lower 32 bits are the real part and the upper 32 bits are the imaginary part. In Polar format, the lower 32 bits are the absolute part and the upper 32 bits are the angle, represented as pi radians. As long as arguments to the APU functions use the same format, the result will also be in that format. Mixing formats produces wrong results. The driver provides basic data types for vectors and matrices constructed from float complex, as well as many common linear algebra operations on these data types.
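The element layout described above can be checked on a host. The following is a minimal sketch, not driver code: the helper names elementSlots and makePolar are hypothetical, and the angle is stored exactly as the caller supplies it because the unit the hardware expects is not modelled here. C guarantees that a float complex is laid out as two floats with the real part first; that this first float is also the "lower 32 bits" assumes a little-endian target.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* One APU element is 64 bits wide: two 32-bit floats. */
_Static_assert(sizeof(float complex) == 2 * sizeof(float),
               "float complex must consist of two floats");

/* Copy the two 32-bit slots of an element into slots[0] (first slot)
 * and slots[1] (second slot). Hypothetical helper for illustration. */
static void elementSlots(float complex z, float slots[2])
{
    memcpy(slots, &z, sizeof z);
}

/* Build a Polar-format element: magnitude in the first slot, angle in
 * the second slot. The angle unit is whatever the caller provides
 * (assumption: the caller already uses the unit the APU expects). */
static float complex makePolar(float magnitude, float angle)
{
    float slots[2] = {magnitude, angle};
    float complex z;

    assert(magnitude >= 0.0f);
    memcpy(&z, slots, sizeof z);
    return z;
}
```

For a Cartesian value such as 3 + 4j, elementSlots yields 3 in the first slot and 4 in the second. A value built with makePolar uses the same two slots, which is why the two formats cannot be told apart by inspection and must not be mixed.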

Data management

All the vector/matrix operations can accept input and output pointers that are inside or outside APU RAM. If all pointers provided to an operation are inside APU RAM, the APU will operate in scratchpad mode. This means the driver will assume that, given the current pointers, the input is already in APU memory and that the input and the result do not overlap each other. Therefore, no data will be copied, and the function output will be placed inside APU memory. This is the most efficient way to utilize the APU and is ideal for algorithms with multiple operations that feed into each other, such as MUSIC (https://en.wikipedia.org/wiki/MUSIC_(algorithm)). When chaining together operations, make sure as many as possible use vectors and matrices that are already in APU memory, to prevent unnecessary copying and overhead.

If any of the pointers are outside APU memory, the driver will copy input data to the start of its memory, place the result immediately following, and then copy the output back to the provided pointer. This may overwrite data that was already in this location.
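The rule for choosing between scratchpad mode and copy mode can be sketched as a pointer-range check. This is an illustrative host-side model, not the driver's implementation: the ApuWindow type and the helpers inWindow and isScratchpad are hypothetical, and the size of the window in bytes is an assumption the caller must supply (on target the base would be APULPF3_MEM_BASE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the APU RAM address window. */
typedef struct {
    uintptr_t base;      /* first byte of APU RAM */
    size_t    sizeBytes; /* window length in bytes (assumed known) */
} ApuWindow;

/* True if p points inside the window. */
static bool inWindow(const ApuWindow *w, const void *p)
{
    uintptr_t addr = (uintptr_t)p;

    assert(w != NULL);
    return addr >= w->base && (addr - w->base) < w->sizeBytes;
}

/* Scratchpad mode applies only when every pointer handed to an
 * operation is inside APU RAM; a single outside pointer means the
 * data is copied in and the result is copied back out. */
static bool isScratchpad(const ApuWindow *w, const void *inA,
                         const void *inB, const void *result)
{
    return inWindow(w, inA) && inWindow(w, inB) && inWindow(w, result);
}
```

A check like this can be useful in application code to confirm that a chain of operations stays in scratchpad mode before relying on data that sits at the start of APU memory.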

Warning
The APU memory has an access limitation that must be respected so as not to cause a bus fault. Software must not perform any combination of back-to-back read or write instructions to APU memory (RR/WW/WR/RW). There must be some other instruction in between. For safety, load data into APU memory using any of the APU operations, APULPF3_loadArgMirrored(), or APULPF3_loadTriangular(). Copying data back from APU memory is handled automatically by the driver, and happens in an interrupt when the result pointer is outside APU memory.

The primary purpose of this driver is executing the MUSIC algorithm for distance estimation in Bluetooth Channel Sounding. An implementation of MUSIC using the APU can be found in the apu_music example.

Usage

This section will cover driver usage.

Synopsis

float complex *apuMem = (float complex *)APULPF3_MEM_BASE;
float complex argA[10];
float complex argB[10];
APULPF3_ComplexVector vecA = {.data = argA, .size = 10};
APULPF3_ComplexVector vecB = {.data = argB, .size = 10};
APULPF3_ComplexVector resultVec = {.data = apuMem, .size = 10};
// Get control of APU
APULPF3_startOperationSequence();
// Perform element-wise product (no conjugation), placing the result
// in resultVec, which is inside APU memory
APULPF3_vectorMult(&vecA, &vecB, false, &resultVec);
// Perform non-conjugated dot product inside of APU memory, which
// reduces overhead.
APULPF3_dotProduct(&resultVec, &resultVec, false, apuMem);
// Give up control of APU
APULPF3_stopOperationSequence();
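
When bringing up code like the synopsis, it can help to compare APU results against a plain-C reference computed on the CPU. The sketch below is such a reference under stated assumptions: refVectorMult and refDotProduct are hypothetical helper names, they follow the textbook definitions (optionally conjugating the second operand first), and they say nothing about the APU's rounding behaviour.

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>

/* Reference element-wise product: out[i] = a[i] * b[i], or
 * a[i] * conj(b[i]) when conjugate is true. */
static void refVectorMult(const float complex *a, const float complex *b,
                          size_t n, bool conjugate, float complex *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        float complex bi = conjugate ? conjf(b[i]) : b[i];
        out[i] = a[i] * bi;
    }
}

/* Reference dot product: sum over i of a[i] * b[i], or of
 * a[i] * conj(b[i]) when conjugate is true. */
static float complex refDotProduct(const float complex *a,
                                   const float complex *b,
                                   size_t n, bool conjugate)
{
    float complex acc = 0.0f;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        float complex bi = conjugate ? conjf(b[i]) : b[i];
        acc += a[i] * bi;
    }
    return acc;
}
```

Comparing against such a reference with a small tolerance, rather than exact equality, is the safer choice when the values come from hardware.
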
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <complex.h>
#include <ti/drivers/dpl/HwiP.h>
#include <ti/drivers/dpl/SemaphoreP.h>
#include <ti/devices/DeviceFamily.h>
#include <DeviceFamily_constructPath(inc/hw_memmap.h)>

Data Structures

struct  APULPF3_ComplexVector
 APULPF3 Vector Struct.

struct  APULPF3_ComplexMatrix
 APULPF3 Matrix Struct.

struct  APULPF3_ComplexTriangleMatrix
 APULPF3 Upper Triangle Matrix Struct.

struct  APULPF3_Config
 APULPF3 Global configuration.

struct  APULPF3_HWAttrs
 APULPF3 Hardware attributes.
 

Macros

#define APULPF3_STATUS_SUCCESS   0
 Successful status code.

#define APULPF3_STATUS_ERROR   1
 Generic error status code.

#define APULPF3_STATUS_RESOURCE_UNAVAILABLE   2
 An error status code returned if the hardware or software resource is currently unavailable.

#define APULPF3_RESULT_INPLACE   0
 APU operation is in-place, overwriting the input.

#define APULPF3_MEM_BASE   VCERAM_DATA0_BASE
 Start of APU RAM.

#define APULPF3_MEM_SIZE_MIRRORED   APURAM_DATA0_SIZE
 Size of APU RAM in mirrored mode.
 

Enumerations

enum  APULPF3_OperationMode { APULPF3_OperationMode_MIRRORED }
 Define the APU memory operation modes, which are the ways the APU expects data to be stored in its memory.

enum  APULPF3_SchedulingMode { APULPF3_SchedulingMode_PIPELINED }
 Define the APU memory scheduling modes, which are the ways APU memory operations are scheduled and pipelined.
 

Functions

void APULPF3_init (void)
 APU init function.

void APULPF3_startOperationSequence ()
 APU function to prepare the start of an operation chain.

void APULPF3_stopOperationSequence ()
 APU function to finish an operation chain.

int_fast16_t APULPF3_dotProduct (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, bool conjugate, float complex *result)
 APU function for calculating the dot product of two vectors, with the option to perform the complex conjugate on the second vector first.

int_fast16_t APULPF3_vectorMult (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, bool conjugate, APULPF3_ComplexVector *result)
 APU function for calculating the element-wise product of two vectors, with the option to perform the complex conjugate on the second vector first.

int_fast16_t APULPF3_vectorSum (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, APULPF3_ComplexVector *result)
 APU function for calculating the element-wise sum of two vectors.

int_fast16_t APULPF3_cartesianToPolarVector (APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result)
 APU function for converting a complex vector in Cartesian format to Polar format.

int_fast16_t APULPF3_polarToCartesianVector (APULPF3_ComplexVector *vec, float complex *temp, APULPF3_ComplexVector *result)
 APU function for converting a complex vector in Polar format to Cartesian format.

int_fast16_t APULPF3_sortVector (APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result)
 APU function for sorting the real parts of a complex vector in descending order. This function ignores the imaginary part of each element and makes no guarantees about its contents after the operation is complete.

int_fast16_t APULPF3_covMatrixSpatialSmoothing (APULPF3_ComplexVector *vec, uint16_t covMatrixSize, bool fbAveraging, APULPF3_ComplexTriangleMatrix *result)
 APU function for covariance matrix computation using spatial smoothing and optionally forward-backward averaging.

int_fast16_t APULPF3_computeFFT (APULPF3_ComplexVector *vec, bool inverse, APULPF3_ComplexVector *result)
 APU function for computing the Discrete Fourier Transform (DFT) of a complex vector using the Fast Fourier Transform (FFT) algorithm. Optionally, the inverse DFT can be computed. Combines two APU operations: first configuring the APU for a Fourier transform, then actually computing it.
 
int_fast16_t APULPF3_matrixMult (APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB, APULPF3_ComplexMatrix *result)
 APU function for multiplying two matrices. The number of columns in the first matrix must be equal to the number of rows in the second matrix.

int_fast16_t APULPF3_matrixSum (APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB, APULPF3_ComplexMatrix *result)
 APU function for adding two matrices. The matrices must have exactly the same dimensions.

int_fast16_t APULPF3_matrixScalarSum (APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result)
 APU function for adding a scalar to each element of a matrix.

int_fast16_t APULPF3_matrixScalarMult (APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result)
 APU function for multiplying each element of a matrix by a scalar.

int_fast16_t APULPF3_matrixNorm (APULPF3_ComplexMatrix *mat, float complex *result)
 Compute the Frobenius norm of a matrix.

int_fast16_t APULPF3_jacobiEVD (APULPF3_ComplexTriangleMatrix *mat, uint16_t maxIter, float stopThreshold, APULPF3_ComplexVector *result)
 APU function to compute the Jacobi Eigen-Decomposition (EVD) of a triangular Hermitian matrix. This results in a triangular, diagonal matrix of eigenvalues sorted in descending order replacing the original matrix, and a full matrix, where the eigenvectors are the columns. The eigenvector matrix can be placed anywhere in APU memory except for where the input matrix is placed. When not using scratchpad mode, the eigenvector matrix will be placed next to the eigenvalue matrix.
 
int_fast16_t APULPF3_gaussJordanElim (APULPF3_ComplexMatrix *mat, float zeroThreshold, APULPF3_ComplexMatrix *result)
 Reduce the input matrix A[MxN] to reduced echelon form using Gauss-Jordan Elimination.

int_fast16_t APULPF3_unitCircle (uint16_t numPoints, uint16_t constant, uint16_t phase, bool conjugate, APULPF3_ComplexVector *result)
 Generate points evenly distributed on a unit circle. The APU generates a unit circle as follows: exp(-j*2*pi*(k*M+phase)/1024 * (-1)^(conjugate))

void APULPF3_prepareResult (uint16_t resultSize, uint16_t inputSize, complex float *resultBuffer)
 Configure the APU pointers for temporary (in APU memory) and final results for an APU operation. This function is intended to be used inside APU operations, such as dot product.

uint16_t APULPF3_prepareVectors (APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB)
 Configure the APU for vector operation inputs. If not in scratchpad mode, the vectors are loaded one after the other into the start of APU memory.

uint16_t APULPF3_prepareMatrices (APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB)
 Configure the APU for matrix operation inputs. If not in scratchpad mode, the matrices are loaded one after the other into the start of APU memory.

void * APULPF3_loadTriangular (APULPF3_ComplexMatrix *mat, uint16_t offset)
 Loads the upper triangular part of a full matrix into APU memory.

void * APULPF3_loadArgMirrored (uint16_t argSize, uint16_t offset, float complex *src)
 Load operation arguments into APU memory, assuming the APU is in mirrored mode. This means memory is viewed as one block that can fit 1024 complex numbers.
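
The formula given for APULPF3_unitCircle() in the list above, exp(-j*2*pi*(k*M+phase)/1024 * (-1)^(conjugate)), can be broken into an integer step on a 1024-point circle and a sign. The sketch below models only that arithmetic on a host. It assumes that M is the constant parameter and that k is the zero-based index of the generated point; neither is stated explicitly above. The helper names unitCircleStep and unitCircleAngle are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UNIT_CIRCLE_STEPS 1024u /* denominator in the formula */

/* Position of point k on the 1024-step circle. Reducing modulo 1024
 * does not change the point because exp() is periodic in 2*pi. */
static uint16_t unitCircleStep(uint16_t k, uint16_t constant, uint16_t phase)
{
    uint32_t raw = (uint32_t)k * constant + phase;

    return (uint16_t)(raw % UNIT_CIRCLE_STEPS);
}

/* Angle in radians of the point at the given step: the exponent of
 * the formula divided by j. The conjugate flag flips the sign. */
static float unitCircleAngle(uint16_t step, bool conjugate)
{
    const float twoPi = 6.28318530718f;
    float angle;

    assert(step < UNIT_CIRCLE_STEPS);
    angle = -twoPi * (float)step / (float)UNIT_CIRCLE_STEPS;
    return conjugate ? -angle : angle;
}
```

Under these assumptions, constant = 256 produces four distinct points a quarter turn apart, and point k = 4 lands back on the starting point.
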
 

Macro Definition Documentation

§ APULPF3_RESULT_INPLACE

#define APULPF3_RESULT_INPLACE   0

APU operation is in-place, overwriting the input.

§ APULPF3_MEM_BASE

#define APULPF3_MEM_BASE   VCERAM_DATA0_BASE

Start of APU RAM.

§ APULPF3_MEM_SIZE_MIRRORED

#define APULPF3_MEM_SIZE_MIRRORED   APURAM_DATA0_SIZE

Size of APU RAM in mirrored mode.

Enumeration Type Documentation

§ APULPF3_OperationMode

Define the APU memory operation modes, which are the ways the APU expects data to be stored in its memory.

Enumerator
APULPF3_OperationMode_MIRRORED 

In mirrored mode elements are stored sequentially in APU memory.

§ APULPF3_SchedulingMode

Define the APU memory scheduling modes, which are the ways APU memory operations are scheduled and pipelined.

Enumerator
APULPF3_SchedulingMode_PIPELINED 

In pipelined mode, read operations proceed per clock cycle, and only the incoming samples will regulate the data flow.

Function Documentation

§ APULPF3_init()

void APULPF3_init ( void  )

APU init function.

This function initializes the APU internal state. It sets power dependencies, loads the necessary firmware, configures the APU, and sets up interrupts and semaphores.

§ APULPF3_startOperationSequence()

void APULPF3_startOperationSequence ( )

APU function to prepare the start of an operation chain.

This function disables standby and acquires exclusive access to the APU. It should be called before any sequence of APU operations, wrapping them together with APULPF3_stopOperationSequence(). APU operations should not be called outside of a pair of APULPF3_startOperationSequence() and APULPF3_stopOperationSequence() functions.

Precondition
APULPF3_init() has been called.

§ APULPF3_stopOperationSequence()

void APULPF3_stopOperationSequence ( )

APU function to finish an operation chain.

This function enables standby and releases exclusive access to the APU. It should be called after any sequence of APU operations, wrapping them together with APULPF3_startOperationSequence(). APU operations should not be called outside of a pair of APULPF3_startOperationSequence() and APULPF3_stopOperationSequence() functions.

Precondition
APULPF3_init() and APULPF3_startOperationSequence() have been called.

§ APULPF3_prepareResult()

void APULPF3_prepareResult ( uint16_t  resultSize,
uint16_t  inputSize,
complex float *  resultBuffer 
)

Configure the APU pointers for temporary (in APU memory) and final results for an APU operation. This function is intended to be used inside APU operations, such as dot product.

Parameters
[in]   resultSize     the size of an operation result
[in]   inputSize      the size of an operation input
[out]  resultBuffer   a pointer to where the operation result will be placed in APU memory. When not using scratchpad mode, which is usually the case when an external buffer is supplied, configure this to place the result at the start of APU memory before it is copied to the buffer.
Returns
A status code indicating whether the APU operation was a success.
Return values
APULPF3_STATUS_SUCCESS   The call was successful.

§ APULPF3_prepareVectors()

uint16_t APULPF3_prepareVectors ( APULPF3_ComplexVector *  vecA,
APULPF3_ComplexVector *  vecB 
)

Configure the APU for vector operation inputs. If not in scratchpad mode, the vectors are loaded one after the other into the start of APU memory.

Parameters
[in]   vecA   an input vector
[in]   vecB   an input vector, potentially the same as vecA
Returns
The total input size. If the input vectors are the same, this is the vector's size, otherwise it is the sum of the vector sizes.
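
The return-value rule above can be expressed as a small host-side model. This is a sketch under assumptions, not the driver's code: SketchVector is a hypothetical stand-in that mirrors the .data and .size members used in the synopsis, and "the same" is interpreted here as both arguments referring to the same data buffer with the same size.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for APULPF3_ComplexVector. */
typedef struct {
    float complex *data;
    uint16_t       size;
} SketchVector;

/* Total number of input elements for a two-vector operation: one
 * vector's size if both arguments refer to the same vector,
 * otherwise the sum of both sizes. */
static uint16_t totalInputSize(const SketchVector *vecA,
                               const SketchVector *vecB)
{
    assert(vecA != NULL && vecB != NULL);
    if (vecA->data == vecB->data && vecA->size == vecB->size) {
        return vecA->size;
    }
    return (uint16_t)(vecA->size + vecB->size);
}
```

Passing the same vector twice, as the synopsis does for the dot product, therefore needs room for only one copy of the input.
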

§ APULPF3_prepareMatrices()

uint16_t APULPF3_prepareMatrices ( APULPF3_ComplexMatrix *  matA,
APULPF3_ComplexMatrix *  matB 
)

Configure the APU for matrix operation inputs. If not in scratchpad mode, the matrices are loaded one after the other into the start of APU memory.

Parameters
[in]   matA   a pointer to an input matrix
[in]   matB   a pointer to an input matrix, potentially the same as matA
Returns
The total input size. If the input matrices are the same, this is the matrix's size, otherwise it is the sum of the matrix sizes.

§ APULPF3_loadTriangular()

void* APULPF3_loadTriangular ( APULPF3_ComplexMatrix *  mat,
uint16_t  offset 
)

Loads the upper triangular part of a full matrix into APU memory.

Parameters
[in]   mat      a pointer to a source matrix
[in]   offset   offset into APU memory to load to
Returns
Pointer in APU memory where the argument is stored.
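
To illustrate what loading the upper triangular part of a full matrix involves, the sketch below packs the upper triangle of an n x n matrix into a contiguous array on a host. It is not the driver's implementation: packUpperTriangle and triangularCount are hypothetical names, and both the row-major input and the row-by-row packing order are assumptions, since the order the APU expects is not stated here.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Number of elements in the upper triangle (diagonal included)
 * of an n x n matrix. */
static size_t triangularCount(size_t n)
{
    return n * (n + 1) / 2;
}

/* Copy the upper triangle of a row-major n x n matrix into packed[],
 * row by row (assumed order). Returns the number of elements written. */
static size_t packUpperTriangle(const float complex *full, size_t n,
                                float complex *packed)
{
    size_t k = 0;

    assert(full != NULL && packed != NULL);
    for (size_t row = 0; row < n; row++) {
        for (size_t col = row; col < n; col++) {
            packed[k++] = full[row * n + col];
        }
    }
    return k;
}
```

For a Hermitian matrix, such as the covariance matrices used in MUSIC, the lower triangle is the conjugate of the upper one, so storing only n(n+1)/2 elements loses no information.
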

§ APULPF3_loadArgMirrored()

void* APULPF3_loadArgMirrored ( uint16_t  argSize,
uint16_t  offset,
float complex *  src 
)

Load operation arguments into APU memory, assuming the APU is in mirrored mode. This means memory is viewed as one block that can fit 1024 complex numbers.

Parameters
[in]   argSize   how many complex numbers to load
[in]   offset    offset into APU memory to load to
[in]   src       data source pointer
Returns
Pointer in APU memory where the argument is stored.
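
Because mirrored mode views APU memory as one block of 1024 complex numbers, a caller may want to confirm that an argument fits before loading it. The sketch below models that check and the load itself on a host array. It rests on assumptions: the names fitsMirrored and modelLoadArg are hypothetical, the offset is taken to be counted in complex elements, and the plain memcpy stands in for the real transfer, which on target must respect the access restriction on APU memory.

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MIRRORED_CAPACITY 1024u /* complex numbers per block */

/* True if argSize elements starting at offset stay inside the block. */
static bool fitsMirrored(uint16_t argSize, uint16_t offset)
{
    return (uint32_t)argSize + (uint32_t)offset <= SKETCH_MIRRORED_CAPACITY;
}

/* Host model of a mirrored-mode load into ram[], an array of
 * SKETCH_MIRRORED_CAPACITY elements standing in for APU memory.
 * Returns where the argument was stored, or NULL if it does not fit. */
static float complex *modelLoadArg(float complex *ram, uint16_t argSize,
                                   uint16_t offset, const float complex *src)
{
    assert(ram != NULL && src != NULL);
    if (!fitsMirrored(argSize, offset)) {
        return NULL;
    }
    memcpy(&ram[offset], src, (size_t)argSize * sizeof(float complex));
    return &ram[offset];
}
```
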
© Copyright 1995-2024, Texas Instruments Incorporated. All rights reserved.