TI Arm Clang Compiler Tools - 2.1.0.LTS Release Notes

Introduction

Version 2.1.0.LTS of the TI Arm Clang Compiler Tools, also known as the tiarmclang compiler, is derived from the open source LLVM/Clang source code base and the LLVM Compiler Infrastructure source base that can be found on GitHub (github.com).

The tiarmclang compiler can be used to compile and link C/C++ and assembly source files to build static executable application files that can be loaded and run on an Arm Cortex processor (m0, m0plus, m3, m4, m33, r4, and r5). Please see the Device Support section below for further information about which compiler options to use when building an application for a particular Arm Cortex processor configuration.

Long-Term Support Release

This is a Long-Term Support (LTS) release.

For definitions and explanations of STS, LTS, and the versioning number scheme, please see SDTO Compiler Version Numbers.

Documentation

The TI Arm Clang Compiler Tools User’s Guide is now available online at the following URL:

Since the tiarmclang compiler is derived from the LLVM project’s Clang compiler source base, much of the generic Clang online documentation is also applicable to the tiarmclang compiler. The latest version of the generic Clang documentation can be found here:

TI E2E Community - Where to Get Help

Post compiler related questions to the TI E2E design community forum and select the TI device being used.

The following is the top-level webpage for all of TI’s Code Generation Tools.

If submitting a defect report, please attach a scaled-down test case with command-line options and the compiler version number to allow us to reproduce the issue easily.

Defect Tracking Database

Compiler defect reports can be tracked at the new Development Tools bug database, SIR. SIR is a JIRA-based view into all public tools defects. The old SDOWP tracking database will be retired.

A my.ti.com account is required to access this page. To find an issue in SIR, log in and enter your defect ID in the search box at the top right. Alternatively, from the red navigation bar at the top, select “Issues” then “Search for Issues”.

To find an old SDOWP issue, place the SDOWP ID in the search box and use double quotes around the SDOWP ID.

What’s New

Support for Modified Condition/Decision Coverage (MC/DC)

The tiarmclang 2.1.0.LTS release provides support for Modified Condition/Decision Coverage (MC/DC) on top of the existing Source-Based Code Coverage framework. MC/DC is an ISO 26262 functional safety requirement for ASIL-D that applies to decisions made up of compound boolean expressions: it must be shown that each condition in a decision independently affects the outcome of that decision. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding all other conditions fixed.
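To make the independence criterion concrete, here is a minimal sketch in plain C. The decision `(a && b) || c` and the helper names are hypothetical and chosen only for illustration; the code does not use any tiarmclang coverage API. It searches for an "independence pair" for each condition, that is, two input vectors that differ only in that condition and produce different outcomes.

```c
#include <stdbool.h>

/* Hypothetical decision with three conditions: (a && b) || c. */
static bool decision(bool a, bool b, bool c)
{
    return (a && b) || c;
}

/*
 * Returns true if condition `index` (0 = a, 1 = b, 2 = c) can be shown to
 * affect the outcome independently: there is some setting of the other two
 * conditions for which flipping only this condition flips the decision.
 */
static bool has_independence_pair(int index)
{
    for (int v = 0; v < 8; v++) {
        bool in[3] = { (v & 1) != 0, (v & 2) != 0, (v & 4) != 0 };
        bool before = decision(in[0], in[1], in[2]);

        in[index] = !in[index];   /* vary just this one condition */
        bool after = decision(in[0], in[1], in[2]);

        if (before != after) {
            return true;
        }
    }
    return false;
}
```

For this decision, the vectors (a=1, b=1, c=0) and (a=0, b=1, c=0) form an independence pair for `a`: only `a` changes, and the outcome changes from true to false. A test suite reaches MC/DC for the decision when it contains such a pair for every condition.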

For further information about support for Code Coverage, including details about how to use MC/DC support, please see: Code Coverage

Reduction of Code Coverage Instrumentation Footprint

The tiarmclang 2.1.0.LTS release provides two new code coverage related options that can help reduce the size of the instrumentation code and data that is added to an application build to enable computation and visualization of code coverage information.

Reduce Size of Profile Counter: -fprofile-counter-size=[64|32]

By default, the compiler-generated profile counters that annotate an application when code coverage is enabled are 64 bits wide. The

```
-fprofile-counter-size=32
```

option instructs the compiler to use 32-bit integer values to record the execution count associated with a basic block (a sequence of executable code that can potentially be the destination of a call or branch) where applicable.
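As a rough illustration of the trade-off, the following sketch estimates the counter data size for the two widths. The helper names and the counter count used in the note below are hypothetical. The estimate covers only the counters themselves, not any other coverage data, and it assumes every counter uses the selected width, which the "where applicable" qualifier above does not guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes of counter data for `num_counters` counters of the given width. */
static size_t counter_data_bytes(size_t num_counters, unsigned counter_bits)
{
    return num_counters * (counter_bits / 8u);
}

/* Largest execution count that a counter of the given width can hold. */
static uint64_t counter_max_value(unsigned counter_bits)
{
    return (counter_bits == 32u) ? (uint64_t)UINT32_MAX : UINT64_MAX;
}
```

With a hypothetical 1000 counters, the estimate is 8000 bytes at 64 bits and 4000 bytes at 32 bits. The price is range: a 32-bit counter can represent counts only up to 4294967295.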

Limit Generation of Code Coverage Information to Functions

Normally when compiler generated code coverage is enabled in tiarmclang, the compiler will annotate an application with execution counters for basic blocks. The

```
-ffunction-coverage-only
```

option can be used to reduce the code coverage instrumentation footprint by limiting compiler generated code coverage information to function entry execution counts.
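The difference between the two modes can be pictured with a small function that has several basic blocks. The function below is hypothetical, and the comments only indicate in general terms where execution counts would be recorded; the exact placement of counters is decided by the compiler.

```c
/* Hypothetical function with several basic blocks. */
int clamp_to_byte(int value)
{
    /* Function entry: an execution count is recorded here in both modes. */
    if (value < 0) {
        /* Separate basic block: counted by default, but not when
         * -ffunction-coverage-only is used. */
        return 0;
    }
    if (value > 255) {
        /* Another basic block, with the same caveat as above. */
        return 255;
    }
    return value;
}
```

With function-only coverage, the report can still show whether `clamp_to_byte` was ever called, but not which of its paths were taken.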

Link-Time Optimization (LTO)

Beginning with the tiarmclang 2.0.0.STS release, support for whole-program optimization via link-time optimization (LTO), which performs inter-module optimizations at link time, is available.

The -flto Option Turns on LTO

The LTO feature can be enabled using the -flto option on the tiarmclang command-line.

Building an LTO-Enabled Application with tiarmclang from the Command-Line Interface

If compiling and linking from a single tiarmclang command, the -flto option can be inserted among the other compiler options. A typical tiarmclang command-line that turns on the LTO feature will look like this:

```
%> tiarmclang -mcpu=cortex-m4 -Oz -flto hello.c -o hello.out \
    -Wl,-llnk.cmd,-mhello.map
``` 

If compiling and linking in separate steps, the -flto option should be specified on both the tiarmclang compilation and linking commands, like so:

``` 
%> tiarmclang -mcpu=cortex-m4 -Oz -flto -c hello.c
%> tiarmclang -mcpu=cortex-m4 -Oz -flto hello.o -o hello.out \
    -Wl,-llnk.cmd,-mhello.map
```

Building an LTO-Enabled Application with tiarmclang in a Code Composer Studio Project

A tiarmclang Code Composer Studio (CCS) project that has been imported into or created in a workspace can be built with LTO enabled by checking the “Select Link-Time Optimization (LTO) (-flto)” box in the Build->Arm Compiler->Optimization tab in the project build settings dialog pop-up window. This capability is available starting in CCS 12.0.

For example, given a simple “Hello World!” CCS project as the project in focus in a workspace, you can click on Project->Build Settings to bring up the Properties pop-up window. Assuming that the “TI Clang v2.1.0.LTS” compiler has been selected in the General->Compiler version box and other settings besides -flto have been accounted for, then:

  1. Click on Build->Arm Compiler->Optimization
  2. Click on the check-box beside “Select Link-Time Optimization (LTO) (-flto)”
  3. Click on the “Apply and Close” button
  4. Build your project

The -flto option will be used in both the compile and link steps of the project build.

If you are using a version of CCS prior to 12.0, then you can still enable LTO for the build of your application. Assuming the same “Hello World!” CCS project with all other settings accounted for, you can enable LTO by inserting the -flto option into both the Build->Arm Compiler and Build->Arm Linker tabs in the Project->Build Settings pop-up window as follows:

  1. Click on Build->Arm Compiler and edit the Command-line pattern contents as follows:

    Before:  ${command} ${flags} ${output_flag}${output} ${inputs}
    After:   ${command} -flto ${flags} ${output_flag}${output} ${inputs}

  2. Similarly, for the link-step, click on Build->Arm Linker and edit the Command-line pattern contents as follows:

    Before:  ${command} ${flags} ${output_flag}${output} ${inputs}
    After:   ${command} -flto ${flags} ${output_flag}${output} ${inputs}

  3. Click on the “Apply and Close” button
  4. Build your project

LTO Development Flow

There are essentially two steps to employing link-time inter-module optimizations in the build of a given application.

  1. Compile as much C/C++ source code as possible with the -flto option.

    Compiling a C/C++ source file with the -flto option instructs the compiler to embed an intermediate representation (IR) in the object file that it produces. This applies equally to object files contained in libraries. In fact, all of the runtime libraries that are shipped with the tiarmclang toolchain are built with the -flto option, so that an object file from a runtime library can participate in LTO during the link step if LTO is turned on. An object file with embedded IR will be interpreted as a normal object file if LTO is not turned on during the link step.

  2. Turn on the LTO feature during the link of your application

    As explained in the above section, LTO can be turned on by specifying the -flto option on the tiarmclang command during compilation and linking.

    When LTO is turned on during the link, the linker will:

    a. Extract the embedded IR content from each input object file that contains embedded IR to create a source IR module. This also applies to object files that are pulled in from object libraries to resolve references to undefined symbols.

    b. Link the source IR modules together into a combined IR module.

    c. Present the combined IR module to the compiler to “re-compile” the program with inter-module optimizations enabled.

    d. Link the resulting object file from the “re-compile” with all other input object files that do not contain embedded IR to produce the linked output file.

Benefits of Using LTO - Enabling Inter-Module Optimizations

Let’s consider a simple example application to demonstrate just one of the potential benefits of using LTO to enable inter-module optimization …

Consider a series of source files in which many of the same string constants are referenced repeatedly and across multiple source files.
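The source files used in this example are not reproduced in these notes. As a minimal sketch of the pattern being described, the listing below shows two functions that would live in different source files but return the same string literal. The function names and the string are hypothetical, and the two "modules" are shown in a single listing only for brevity.

```c
#include <string.h>

/* Would live in one source file of the application. */
const char *module_a_status(void)
{
    return "operation completed successfully";
}

/* Would live in a different source file; the literal text is identical. */
const char *module_b_status(void)
{
    return "operation completed successfully";
}

/* Compares the string contents, not the addresses of the two literals. */
int statuses_match(void)
{
    return strcmp(module_a_status(), module_b_status()) == 0;
}
```

When the two functions are compiled into separate object files, each object file carries its own copy of the literal. Merging the copies requires a view across modules, which is what LTO provides.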

If we compile and link without LTO turned on:

```
%> tiarmclang -mcpu=cortex-m4 -Oz constant_merge_test.c \
       ic_s10.c ic_s20.c ic_s30.c ic_s40.c s10.c s20.c s30.c s40.c \
       -o no_lto.out -Wl,-llnk.cmd,-mno_lto.map
```

The map file reveals that the size of the .rodata section where all of the string constants are defined is reasonably large:

no_lto.map:

```
...
SEGMENT ALLOCATION MAP

run origin  load origin   length   init length attrs members
----------  ----------- ---------- ----------- ----- -------
00000020    00000020    00007a4c   00007a4c    r-x
  00000020    00000020    00004ad2   00004ad2    r-- .rodata
  ...
...
```

But if we then compile with LTO enabled:

```
%> tiarmclang -mcpu=cortex-m4 -flto -Oz constant_merge_test.c \
       ic_s10.c ic_s20.c ic_s30.c ic_s40.c s10.c s20.c s30.c s40.c \
       -o with_lto.out -Wl,-llnk.cmd,-mwith_lto.map
```

Then the map file shows that the .rodata section is significantly smaller in the LTO-enabled build:

```
...
SEGMENT ALLOCATION MAP

run origin  load origin   length   init length attrs members
----------  ----------- ---------- ----------- ----- -------
00000020    00000020    00005b84   00005b84    r-x
  ...
  00004530    00004530    00001674   00001674    r-- .rodata
...
```

The use of LTO in this example enables the compiler to perform an inter-module constant merging optimization that results in a savings of 0x4ad2 - 0x1674 = 0x345e (13406) bytes in the .rodata section. Note that in this example, the savings in the size of the .rodata section is offset somewhat by increased code size in other sections like .text. The net savings is 0x7a4c - 0x5b84 = 0x1ec8 (7880) bytes.
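The arithmetic above can be checked with a few lines of C. The lengths are copied from the two map file excerpts; the enumerator and function names are hypothetical. The last helper derives how much the rest of the segment grew, on the assumption that the difference between the .rodata savings and the net segment savings is entirely due to growth elsewhere in the same segment.

```c
#include <stdint.h>

/* Lengths taken from the no_lto.map and with_lto.map excerpts. */
enum {
    RODATA_NO_LTO    = 0x4ad2,
    RODATA_WITH_LTO  = 0x1674,
    SEGMENT_NO_LTO   = 0x7a4c,
    SEGMENT_WITH_LTO = 0x5b84
};

/* Bytes saved in the .rodata section alone. */
static uint32_t rodata_savings(void)
{
    return RODATA_NO_LTO - RODATA_WITH_LTO;
}

/* Bytes saved across the whole segment. */
static uint32_t net_savings(void)
{
    return SEGMENT_NO_LTO - SEGMENT_WITH_LTO;
}

/* Growth in the rest of the segment that offsets the .rodata savings. */
static uint32_t growth_elsewhere(void)
{
    return rodata_savings() - net_savings();
}
```

Under that assumption, the rest of the segment grew by 13406 - 7880 = 5526 bytes in the LTO-enabled build.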

Improved Compiler Generated Debug Information to Enable Use of CCS Stack Usage View

The tiarmclang 2.1.0.LTS compiler will emit estimated stack usage debug information for all functions, including functions defined in runtime libraries, to enable the use of the Stack Usage View in Code Composer Studio (CCS). Additionally, functions in the runtime libraries that are sourced in assembly language have been annotated with assembly directives to supply estimated stack usage information for those functions.

Recently Fixed Issues

CODEGEN-6288 : tiarmclang optimizer removes empty loops that don’t have side effects

In tiarmclang releases prior to 1.2.1.STS, the optimizer would remove an empty while loop that contained no side effects. If a function contained only such a loop, then the optimizer would remove references to the function from other functions in the same compilation unit even if the function were annotated with an optnone function attribute.

In tiarmclang releases starting with 1.2.1.STS, you can now mark a function containing an empty loop with no side effects with an optnone function attribute and references to the function will not be removed.
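A minimal sketch of the optnone approach follows. The function names are hypothetical, and the `NO_OPT` macro is an illustrative convenience that lets the same source build with compilers that do not recognize the Clang attribute.

```c
/* optnone is a Clang function attribute; fall back to nothing elsewhere. */
#if defined(__clang__)
#define NO_OPT __attribute__((optnone))
#else
#define NO_OPT
#endif

/* Hypothetical function containing only an empty loop with no side effects. */
NO_OPT void wait_forever(void)
{
    while (1) {
    }
}

/* Hypothetical caller in the same compilation unit. */
int check_status(int status)
{
    if (status < 0) {
        wait_forever();   /* per the note above, this reference is kept */
    }
    return status;
}
```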

Alternatively, you can specify an asm() statement inside the body of the empty loop to create a side effect that will prevent the loop from being removed. For example:

```
while (1) {
  __asm(" ");
}
```

Host Support / Dependencies

The following host-specific versions of the 2.1.0.LTS tiarmclang compiler are available:

Device Support

The tiarmclang compiler supports development of applications that are to be loaded and run on one of the following Arm Cortex processor variants:

| Arm Processor Variant | Options |
|---|---|
| Cortex-M0 | -mcpu=cortex-m0 |
| Cortex-M0+ | -mcpu=cortex-m0plus |
| Cortex-M3 | -mcpu=cortex-m3 |
| Cortex-M4 without FPv4SPD16 | -mcpu=cortex-m4 -mfloat-abi=soft |
| Cortex-M4 with FPv4SPD16 | -mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16 |
| Cortex-M33 without FPv5SPD16 | -mcpu=cortex-m33 -mfloat-abi=soft |
| Cortex-M33 with FPv5SPD16 | -mcpu=cortex-m33 -mfloat-abi=hard -mfpu=fpv5-sp-d16 |
| Cortex-R4 (Thumb) without VFPv3D16 | -mcpu=cortex-r4 -mthumb -mfloat-abi=soft |
| Cortex-R4 (Thumb) with VFPv3D16 | -mcpu=cortex-r4 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16 |
| Cortex-R4 without VFPv3D16 | -mcpu=cortex-r4 -mfloat-abi=soft |
| Cortex-R4 with VFPv3D16 | -mcpu=cortex-r4 -mfloat-abi=hard -mfpu=vfpv3-d16 |
| Cortex-R5 (Thumb) without VFPv3D16 | -mcpu=cortex-r5 -mthumb -mfloat-abi=soft |
| Cortex-R5 (Thumb) with VFPv3D16 | -mcpu=cortex-r5 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16 |
| Cortex-R5 without VFPv3D16 | -mcpu=cortex-r5 -mfloat-abi=soft |
| Cortex-R5 with VFPv3D16 | -mcpu=cortex-r5 -mfloat-abi=hard -mfpu=vfpv3-d16 |

Resolved Defects

| ID | Summary |
|---|---|
| CODEGEN-9997 | tiarmclang: LTO behaves differently than non-LTO with regards to how zero-initialized variables are defined |
| CODEGEN-9850 | Unresolved reference to runtime library function when that function is referenced from asm statement |
| CODEGEN-9838 | Update tiarmclang documentation to explain that C++ library does not support features related to threads and concurrency |
| CODEGEN-9834 | Compiler ignores __attribute__((used)) |
| CODEGEN-9779 | Function local static array allocated to .data section, and not .bss |
| CODEGEN-9669 | TI Arm Clang mismatch between source code and debugger view with function subsections |
| CODEGEN-9092 | tiarmclang mistakenly documents support for -fpic position independent code |
| CODEGEN-8914 | _enable_IRQ in ti_compatibility.h only supports Cortex-M devices |
| CODEGEN-8899 | tiarmlnk generates cinit record for tiny .init_array section |
| CODEGEN-8887 | Compiler does not support linking code that uses C++ exceptions |
| CODEGEN-8639 | tiarmar.exe is denied permission to create an archive file on Windows 7 |
| CODEGEN-8533 | Use of virtual functions causes many RTS print functions to be linked into the program |
| CODEGEN-8471 | Hex utility, when splitting a section as required by the bootloader, ignores the section alignment for the second part of the split |
| CODEGEN-8255 | tiarmclang: zero-initialized static and global variables are being defined in .bss |
| CODEGEN-6288 | tiarmclang: optimizer removes empty loops that don't have side effects |

Known Defects

The up-to-date known defects in v2.1.0.LTS can be found here (dynamically generated):

| ID | Summary |
|---|---|
| CODEGEN-9415 | Compiler inappropriately generates non-empty ARM.exidx sections |
| CODEGEN-9398 | Use of -save-temps causes warning diagnostic to not be emitted |
| CODEGEN-8955 | math.h does not work with -std=c89 or c90 or gnu90 |
| CODEGEN-8216 | Code coverage symbols not defined when profile counter section manually placed |