4.7. Loop unrolling

Loop unrolling is an optimization technique for improving performance. A small loop is expanded so that the body of one iteration is replicated several times within the loop body, and the loop executes correspondingly fewer passes. The number of times an iteration is replicated is known as the unroll factor.
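To make the unroll factor concrete, the following is a minimal sketch of a loop written in rolled form and then unrolled by hand with a factor of 4. The function names (scale_rolled, scale_unrolled4) and the scaling operation are hypothetical, chosen only for illustration; they do not reproduce the listings referenced in this section. The cleanup loop is an assumption needed when the trip count is not a multiple of 4.

```c
#include <stddef.h>

/* Rolled form: one loop-closing branch is taken per element. */
void scale_rolled(float *dst, const float *src, float k, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        dst[i] = k * src[i];
    }
}

/* Hand-unrolled form with an unroll factor of 4 (illustrative only). */
void scale_unrolled4(float *dst, const float *src, float k, size_t n)
{
    size_t i = 0;
    size_t n4 = n - (n % 4);   /* largest multiple of 4 not exceeding n */

    /* Main loop: four iterations of work per loop-closing branch. */
    for (; i < n4; i += 4) {
        dst[i]     = k * src[i];
        dst[i + 1] = k * src[i + 1];
        dst[i + 2] = k * src[i + 2];
        dst[i + 3] = k * src[i + 3];
    }

    /* Cleanup loop: handles the 0 to 3 elements left over. */
    for (; i < n; i++) {
        dst[i] = k * src[i];
    }
}
```

Both functions produce the same results. In practice the compiler or the UNROLL pragma normally performs this transformation, so writing it by hand is rarely necessary.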

4.7.1. Benefits

Reduce branch overhead - This is especially significant for small loops. For example, the loop in Listing 4.10 is unrolled by a factor of 4. The assembly in Listing 4.11 shows that the branch overhead is reduced by a factor of 4: one BANZ is executed for every 4 iterations, versus one BANZ per iteration without unrolling.

There are additional benefits on C28x CPUs with FPU support:

Generate RPTB for small loops - loop unrolling increases the number of instructions in the loop body and enables the compiler to meet the minimum block size requirements for the RPTB instruction.
Improved floating-point performance - loop unrolling can improve performance by providing the compiler more instructions to schedule across the unrolled iterations. This reduces the number of NOPs generated and also provides the compiler with a greater opportunity to generate parallel instructions.

Note

Loop unrolling increases code size because the compiler replicates the loop body. #pragma UNROLL(1) can be used to prevent the compiler from unrolling a loop.

4.7.2. Performing unrolling

There are two ways in which loop unrolling can be performed:

The compiler can automatically unroll the loop. Listing 4.12 is an example of a loop that is unrolled 4 times by the compiler.
The UNROLL pragma can be used to indicate to the compiler that the loop is a candidate for unrolling. Refer to UNROLL for details.
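As a hedged illustration of the pragma route, the sketch below places the UNROLL pragma immediately before a loop. The function names (sum_unroll_hint, sum_no_unroll) and the summation are hypothetical. The pragma marks the loop as a candidate; whether and how the compiler actually unrolls it depends on the compiler version and options, so the generated assembly should be checked. A compiler that does not recognize the pragma (for example, a host compiler) ignores it, possibly with a warning, and the code behaves identically.

```c
#include <stdint.h>

/* Hypothetical example: sums the first n elements of buf. */
int32_t sum_unroll_hint(const int16_t *buf, uint16_t n)
{
    int32_t acc = 0;
    uint16_t i;

    /* Request an unroll factor of 4 for the loop that follows. */
    #pragma UNROLL(4)
    for (i = 0; i < n; i++) {
        acc += buf[i];
    }
    return acc;
}

/* Same loop, with unrolling suppressed to keep code size down. */
int32_t sum_no_unroll(const int16_t *buf, uint16_t n)
{
    int32_t acc = 0;
    uint16_t i;

    /* An unroll factor of 1 asks the compiler not to unroll this loop. */
    #pragma UNROLL(1)
    for (i = 0; i < n; i++) {
        acc += buf[i];
    }
    return acc;
}
```

The two functions return the same value for the same input; only the code the compiler may generate for the loop differs.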