3. Profiling
Profiling is used to focus optimization efforts on functions that account for a majority of the runtime.
There are different approaches to profiling:
Code Composer Studio™ (CCS) Profile clock feature
CPUTimer
CPUTimer and Function entry/exit hooks
Toggle GPIO pin
3.1. CCS Profile Clock
The Code Composer Studio Profile Clock feature can be used to count the number of cycles from one breakpoint to the next. This is a quick way to determine the cycles taken by an arbitrary region of code.
3.2. CPUTimer
Using the CPUTimer is a programmatic approach to determining the number of cycles between any two points in the code. For example, Listing 3.3 uses the CPUTimer to measure the number of cycles taken by a loop.
#ifndef _CYCLE_COUNTER_H_
#define _CYCLE_COUNTER_H_

#include <stdint.h>

void CycleCounter_Init(void);
uint32_t CycleCounter_Read(void);
void CycleCounter_Stop(void);

#endif
#include "common/inc/cycle_counter.h"
#include "driverlib/cputimer.h"

#define CPUTIMER_MAX_PERIOD (0xFFFFFFFFUL)

void CycleCounter_Init(void)
{
    CPUTimer_clearOverflowFlag(CPUTIMER1_BASE);
    CPUTimer_setPeriod(CPUTIMER1_BASE, CPUTIMER_MAX_PERIOD);
    CPUTimer_setPreScaler(CPUTIMER1_BASE, 0UL);
    CPUTimer_reloadTimerCounter(CPUTIMER1_BASE);
    CPUTimer_stopTimer(CPUTIMER1_BASE);
    CPUTimer_startTimer(CPUTIMER1_BASE);
}

void CycleCounter_Stop(void)
{
    CPUTimer_stopTimer(CPUTIMER1_BASE);
}

// With higher levels of optimization, it’s possible that the compiler moves
// application code before the first call to CycleCounter_Read() or after the
// second call to CycleCounter_Read(). This can result in the reported cycle
// count being lower than the actual cycle count.
// Disabling inlining of CycleCounter_Read prevents this from occurring.
#pragma FUNC_CANNOT_INLINE(CycleCounter_Read)
uint32_t CycleCounter_Read(void)
{
    return CPUTIMER_MAX_PERIOD - CPUTimer_getTimerCount(CPUTIMER1_BASE);
}
Note
For simplicity, this implementation does not handle timer overflow.
For details on the DriverLib CPU Timer functions, refer to the DriverLib User’s Guide for the device. For example, the User’s Guide for F28004x is available at
<C2000Ware install dir>/device_support/f28004x/docs/F28004x_DriverLib_Users_Guide.pdf
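The overflow caveat in the note can be made more concrete. Assuming the timer reloads to the full 32-bit period, so that readings wrap modulo 2^32, the difference of two readings taken in unsigned 32-bit arithmetic stays correct across a single wraparound. The sketch below illustrates this; CycleCounter_Elapsed is a hypothetical helper name and is not part of DriverLib or the listings in this section.

```c
#include <stdint.h>

// Hypothetical helper (not part of DriverLib): cycles between two readings
// of a free-running 32-bit counter. Unsigned subtraction wraps modulo 2^32,
// so the result stays correct if the counter wraps once between readings.
uint32_t CycleCounter_Elapsed(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

Intervals of 2^32 cycles or more cannot be recovered this way; they would need the timer overflow flag or a wider software counter.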
CycleCounter_Init();

// Measure the cost of one counter read so it can be subtracted later
uint32_t start = CycleCounter_Read();
uint32_t overhead = CycleCounter_Read() - start;

start = CycleCounter_Read();

int i;
for (i = 0; i < 100; i++)
    asm("\tNOP;");

uint32_t time = CycleCounter_Read() - start - overhead;

// The value is unsigned, so print it with %lu and a matching cast
printf("Cycles: %lu\n", (unsigned long)time);
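The read, calibrate, measure pattern used with the CPUTimer can be wrapped in a small helper. The following is a minimal sketch, not part of the original listings: measure_cycles, CounterReadFn and WorkFn are hypothetical names, and the sim_ functions are a made-up simulated counter so the helper can be tried off-target.

```c
#include <stdint.h>

typedef uint32_t (*CounterReadFn)(void);
typedef void (*WorkFn)(void);

// Hypothetical helper: returns the cycles spent in work(), with the cost of
// one counter read subtracted, following the calibrate-then-measure steps.
uint32_t measure_cycles(CounterReadFn read_counter, WorkFn work)
{
    // Two back-to-back reads give the cost of a single read
    uint32_t start = read_counter();
    uint32_t overhead = read_counter() - start;

    start = read_counter();
    work();
    return read_counter() - start - overhead;
}

// Simulated counter for illustration only: every read costs 3 "cycles"
// and the workload costs 100. These numbers are invented.
static uint32_t sim_now = 0;

uint32_t sim_read(void)
{
    sim_now += 3;
    return sim_now;
}

void sim_work(void)
{
    sim_now += 100;
}
```

On the target, CycleCounter_Read could be passed as the read function. Because work() is called through a pointer, the cost of that call is included in the result.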
3.3. CPUTimer with Function Entry/Exit Hooks
The CPUTimer can be combined with the Function Entry/Exit Hooks feature available in the compiler to generate a quick profiler.
When the Entry/Exit hooks feature is enabled using the --entry_hook and --exit_hook options, the compiler inserts a call to an entry hook on entry to each function in the program and a call to an exit hook on exit from each function. For details, refer to Section 2.14, Enabling Entry Hook and Exit Hook Functions, in the TMS320C28x Optimizing C/C++ Compiler User’s Guide.
Listing 3.4 illustrates using the entry and exit hooks along with the CPUTimer to implement a simple profiler.
#include "common/inc/cycle_counter.h"
#include "common/inc/profiling_hooks.h"
#include <stdio.h>

#define MAX_ENTRIES (64)

// Indicate if the timestamp is associated with function entry or exit
typedef enum { PD_ENTRY=0, PD_EXIT } PD_Mode;

// Struct for data associated with a single timestamp
typedef struct {
    uint32_t function_address;
    uint32_t timestamp;
    PD_Mode mode;
} ProfileData;

// Array to store profile data
ProfileData table[MAX_ENTRIES];
int index = 0;

// Entry hook function used to record cycle count on entry into function
void __entry_hook(void (*addr)())
{
    if (index >= MAX_ENTRIES) return;
    table[index].function_address = (uint32_t)addr;
    table[index].mode = PD_ENTRY;
    // Read the counter last so the bookkeeping above is not charged to the function
    table[index].timestamp = CycleCounter_Read();
    index++;
}

// Exit hook function used to record cycle count on exit from function
void __exit_hook(void (*addr)())
{
    if (index >= MAX_ENTRIES) return;
    // Read the counter first so the bookkeeping below is not charged to the function
    table[index].timestamp = CycleCounter_Read();
    table[index].function_address = (uint32_t)addr;
    table[index].mode = PD_EXIT;
    index++;
}
Files with functions that need to be profiled are compiled using the --entry_hook --entry_parm=address --exit_hook --exit_parm=address options.
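Conceptually, building a file with these options behaves as if each function called the entry hook with its own address at the top and the exit hook at the bottom. The sketch below writes those calls by hand to show the resulting order of events. It is an illustration only: sketch_entry_hook, sketch_exit_hook, leaf and caller are hypothetical names, and the code the compiler actually generates may differ.

```c
#include <stddef.h>

#define LOG_SIZE (8)

typedef void (*FuncAddr)(void);

// Simple event log standing in for the profile table
static FuncAddr hook_log_addr[LOG_SIZE];
static char     hook_log_kind[LOG_SIZE];   // 'E' for entry, 'X' for exit
static size_t   hook_log_count = 0;

static void record(char kind, FuncAddr addr)
{
    if (hook_log_count >= LOG_SIZE) return;
    hook_log_kind[hook_log_count] = kind;
    hook_log_addr[hook_log_count] = addr;
    hook_log_count++;
}

// Hypothetical stand-ins for the entry and exit hooks
void sketch_entry_hook(FuncAddr addr) { record('E', addr); }
void sketch_exit_hook(FuncAddr addr)  { record('X', addr); }

// Hand-written equivalent of an instrumented leaf function
void leaf(void)
{
    sketch_entry_hook(&leaf);
    /* original function body would run here */
    sketch_exit_hook(&leaf);
}

// Hand-written equivalent of an instrumented function that calls leaf()
void caller(void)
{
    sketch_entry_hook(&caller);
    leaf();
    sketch_exit_hook(&caller);
}
```

Calling caller() once logs four events in nested order: entry to caller, entry to leaf, exit from leaf, exit from caller.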
When an application is built with the entry/exit hooks in Listing 3.4, the table is populated with profile data. For example, the code snippet in Listing 3.5 results in the following table, where each row gives the function address, the mode (0 for entry, 1 for exit), and the timestamp:
// Prototypes so that main() and foo() can call functions defined below
void foo(void);
void bar(void);

int main()
{
    ProfileData_init();
    foo();
    ProfileData_print();
    return 0;
}

void foo(void)
{
    int i;
    for (i=0; i < 100; i++)
        asm("\tNOP;");
    bar();
    bar();
}

void bar(void)
{
    int i;
    for (i=0; i < 100; i++)
        asm("\tNOP;");
}
0x00a647, 0, 25
0x00a637, 0, 562
0x00a637, 1, 1090
0x00a637, 0, 1134
0x00a637, 1, 1662
0x00a647, 1, 1697
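The raw records can be turned into per-call cycle counts by matching each exit record with the most recent unmatched entry record. The following is a minimal sketch of that post-processing step. ProfileData_pair and example_table are hypothetical names; the struct and enum are repeated from the hook listing only so that the block is self-contained, and the data is copied from the table above.

```c
#include <stdint.h>

typedef enum { PD_ENTRY = 0, PD_EXIT } PD_Mode;

typedef struct {
    uint32_t function_address;
    uint32_t timestamp;
    PD_Mode  mode;
} ProfileData;

#define MAX_DEPTH (16)

// Records copied from the table above: address, timestamp, mode
static const ProfileData example_table[6] = {
    { 0x00a647UL,   25UL, PD_ENTRY },
    { 0x00a637UL,  562UL, PD_ENTRY },
    { 0x00a637UL, 1090UL, PD_EXIT  },
    { 0x00a637UL, 1134UL, PD_ENTRY },
    { 0x00a637UL, 1662UL, PD_EXIT  },
    { 0x00a647UL, 1697UL, PD_EXIT  },
};

// Hypothetical helper: pairs entry and exit records and stores the
// inclusive cycle count (callees included) of each completed call.
// Returns the number of completed calls, or -1 if the records are
// not properly nested or nesting exceeds MAX_DEPTH.
int ProfileData_pair(const ProfileData *records, int count,
                     uint32_t *addresses, uint32_t *durations)
{
    int stack[MAX_DEPTH];   // indices of entry records awaiting an exit
    int depth = 0;
    int calls = 0;
    int i;

    for (i = 0; i < count; i++) {
        if (records[i].mode == PD_ENTRY) {
            if (depth >= MAX_DEPTH) return -1;
            stack[depth++] = i;
        } else {
            const ProfileData *entry;
            if (depth == 0) return -1;
            entry = &records[stack[--depth]];
            if (entry->function_address != records[i].function_address)
                return -1;
            addresses[calls] = entry->function_address;
            durations[calls] = records[i].timestamp - entry->timestamp;
            calls++;
        }
    }
    return calls;
}
```

Applied to the six records above, this yields 528 cycles for each of the two calls at 0x00a637 and 1672 cycles for the call at 0x00a647, which includes the time spent in its callees.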