4.3. Inlining

Inlining is the process of inserting code for a function at the point of call. Benefits:

  • Saves the overhead of a function call.

  • Allows the optimizer to optimize the function in the context of the surrounding code.

When an inline function is called, a copy of the C/C++ source code for the function is inserted at the point of the call. Inline function expansion can speed up execution by eliminating function call overhead. This is particularly beneficial for very small functions that are called frequently, or for larger functions that are called only once or twice. Function inlining involves a tradeoff between execution speed and code size, because the code is duplicated at each call site.

Table 4.5 lists cycle counts for executing the function sequence in Listing 4.9 without and with inlining enabled.

In Listing 4.9, foo1 calls foo2, which calls foo3, which in turn calls foo4. Declaring the callees static allows the compiler to remove their function bodies after inlining. This reduces code size because no separate out-of-line copy of each function has to be kept.

Table 4.5 Cycle counts for Listing 4.9 without and with inlining

Description | Cycles on F28004x
With --opt_level=3 and inlining disabled using --auto_inline=0 | 58
With --opt_level=3 (inlining is enabled by default at this optimization level). The reduction in cycles is due to the elimination of call instructions and additional optimization opportunities from inlining. | 19

Listing 4.9 Function call sequence used to illustrate benefits of inlining
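/* Forward declarations so that each callee is declared before its first use. */
static float foo2(float f1, float f2);
static float foo3(float f1, float f2);
static float foo4(float f1, float f2);

float foo1(float f1, float f2)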
{
    return f1 * f2 + foo2(f1, f2);
}

static float foo2(float f1, float f2)
{
    return f1 * 2.0f - foo3(f1, f2);
}

static float foo3(float f1, float f2)
{
    return f2 * 4.0f - foo4(f1, f2);
}

static float foo4(float f1, float f2)
{
    return f1 * (f2 - f1);
}
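
As a rough illustration of what inlining makes possible for Listing 4.9, the sketch below collapses the call chain by hand into a single expression. The name foo1_inlined is hypothetical and exists only for comparison; the code the compiler actually generates may differ, and the forward declarations are added here so that the fragment compiles on its own.

```c
/* Call chain from Listing 4.9. The forward declarations are an addition
 * so that this fragment is self-contained. */
static float foo2(float f1, float f2);
static float foo3(float f1, float f2);
static float foo4(float f1, float f2);

float foo1(float f1, float f2)
{
    return f1 * f2 + foo2(f1, f2);
}

static float foo2(float f1, float f2)
{
    return f1 * 2.0f - foo3(f1, f2);
}

static float foo3(float f1, float f2)
{
    return f2 * 4.0f - foo4(f1, f2);
}

static float foo4(float f1, float f2)
{
    return f1 * (f2 - f1);
}

/* Hypothetical hand-inlined equivalent of foo1: each call is replaced by
 * the body of its callee, leaving one expression and no call instructions. */
float foo1_inlined(float f1, float f2)
{
    return f1 * f2 + (f1 * 2.0f - (f2 * 4.0f - f1 * (f2 - f1)));
}
```

With the whole computation visible in one function, the optimizer is free to schedule and simplify the arithmetic as a unit, which is the second benefit listed at the start of this section.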

There are different approaches to controlling the scope of inlining to manage the tradeoff between execution speed and code size:

  • If the project is compiled with --opt_level=3 (-O3) or higher:

    -O3 has the side effect of enabling inlining across all the files in the project, which can result in a significant code size increase. Use --auto_inline=[size] with --opt_level=3 to limit the size of the functions that are inlined. If required, inlining can be disabled at -O3 using --auto_inline=0 or -oi0. Refer to the TMS320C28x Optimizing C/C++ Compiler User’s Guide, Section 3.5, “Automatic Inline Expansion (--auto_inline Option)” for details.

  • If the project is compiled with --opt_level=1 or --opt_level=2:

    Use static inline on specific functions that would benefit from being inlined into call sites.

  • To enforce inlining irrespective of optimization level, use either the attribute always_inline or the pragma FUNC_ALWAYS_INLINE.

Refer to the TMS320C28x Optimizing C/C++ Compiler User’s Guide, Section 2.11, “Using Inline Function Expansion” for details.
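
The source-level approaches above can be sketched as follows. This is a minimal example, not taken from the compiler documentation: the function names scale_sample, clamp_unit and process_sample are hypothetical, and the attribute is written in its GCC-style form. The pragma form is shown only in a comment because its exact syntax should be checked against the compiler user's guide.

```c
/* Small, frequently called helper marked static inline so that it is a
 * candidate for inlining at --opt_level=1 or --opt_level=2.
 * (scale_sample is a hypothetical name.) */
static inline float scale_sample(float x, float gain)
{
    return x * gain;
}

/* Forced inlining irrespective of optimization level, using the
 * always_inline attribute in GCC-style syntax.
 * Assumed pragma alternative in C, placed before the definition:
 *     #pragma FUNC_ALWAYS_INLINE(clamp_unit)
 * (clamp_unit is a hypothetical name.) */
static inline float clamp_unit(float x) __attribute__((always_inline));

static inline float clamp_unit(float x)
{
    if (x > 1.0f) {
        return 1.0f;
    }
    if (x < -1.0f) {
        return -1.0f;
    }
    return x;
}

/* Caller: both helpers can be expanded here, so no call overhead
 * needs to remain on this path. */
float process_sample(float x, float gain)
{
    return clamp_unit(scale_sample(x, gain));
}
```

Keeping the helpers static also lets the compiler discard their out-of-line bodies once every call has been inlined, as noted for Listing 4.9.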