TI Arm Clang Compiler Tools - 2.1.0.LTS Release Notes
Introduction
Version 2.1.0.LTS of the TI Arm Clang Compiler Tools, also known as the tiarmclang compiler, is derived from the open-source LLVM/Clang and LLVM Compiler Infrastructure source bases, which are hosted on GitHub (github.com).
The tiarmclang compiler can be used to compile and link C/C++ and assembly source files to build static executable application files that can be loaded and run on an Arm Cortex processor (m0, m0plus, m3, m4, m33, r4, and r5). Please see the Device Support section below for further information about which compiler options to use when building an application for a particular Arm Cortex processor configuration.
Long-Term Support Release
This is a Long-Term Support (LTS) release.
For definitions and explanations of STS, LTS, and the versioning number scheme, please see SDTO Compiler Version Numbers.
Documentation
The TI Arm Clang Compiler Tools User’s Guide is now available online at the following URL:
Since the tiarmclang compiler is derived from the LLVM project’s Clang compiler source base, much of the generic Clang online documentation is also applicable to the tiarmclang compiler. The latest version of the generic Clang documentation can be found here:
TI E2E Community - Where to Get Help
Post compiler related questions to the TI E2E design community forum and select the TI device being used.
The following is the top-level webpage for all of TI’s Code Generation Tools.
If submitting a defect report, please attach a scaled-down test case with command-line options and the compiler version number to allow us to reproduce the issue easily.
Defect Tracking Database
Compiler defect reports can be tracked at the new Development Tools bug database, SIR. SIR is a JIRA-based view into all public tools defects. The old SDOWP tracking database will be retired.
A my.ti.com account is required to access this page. To find an issue in SIR, enter your defect id in the top right search box once logged in. Alternatively, from the top red navigation bar, select “Issues” then “Search for Issues”.
To find an old SDOWP issue, place the SDOWP ID in the search box and use double quotes around the SDOWP ID.
What’s New
Support for Modified Condition/Decision Coverage (MC/DC)
The tiarmclang 2.1.0.LTS release provides support for Modified Condition/Decision Coverage (MC/DC) on top of the existing Source-Based Code Coverage framework. MC/DC is an ISO 26262 functional safety requirement for ASIL-D: for a decision made up of a compound boolean expression, testing must show that each condition in the decision independently affects the decision's outcome. A condition is shown to affect the outcome independently by varying just that condition while holding all other conditions fixed.
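As a minimal sketch of the independence requirement, the C fragment below uses a hypothetical three-condition decision, a && (b || c), and a small helper that checks whether two test vectors differ in exactly one condition and produce different outcomes. The function names, the test_vector type, and the chosen vectors are illustrative assumptions and are not part of the tiarmclang tooling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CONDITIONS 3

/* Hypothetical decision with three conditions: a && (b || c). */
bool decision(bool a, bool b, bool c) {
    return a && (b || c);
}

/* One test vector: a value for each of the three conditions. */
typedef struct {
    bool cond[NUM_CONDITIONS];
} test_vector;

bool evaluate(const test_vector *v) {
    return decision(v->cond[0], v->cond[1], v->cond[2]);
}

/*
 * Returns true when x and y form an independence pair for the condition
 * at position idx: only that condition differs between the two vectors,
 * and the decision outcome differs as well.
 */
bool is_independence_pair(const test_vector *x, const test_vector *y,
                          size_t idx) {
    assert(idx < NUM_CONDITIONS);
    for (size_t i = 0; i < NUM_CONDITIONS; ++i) {
        bool differs = (x->cond[i] != y->cond[i]);
        if (differs != (i == idx)) {
            return false; /* a different set of conditions was varied */
        }
    }
    return evaluate(x) != evaluate(y);
}

/* Four vectors that together supply one independence pair per condition. */
const test_vector mcdc_vectors[4] = {
    {{true,  true,  false}},  /* outcome: true  */
    {{false, true,  false}},  /* outcome: false */
    {{true,  false, false}},  /* outcome: false */
    {{true,  false, true }},  /* outcome: true  */
};
```

In this sketch, vectors 0 and 1 form the pair for a, vectors 0 and 2 the pair for b, and vectors 2 and 3 the pair for c, so four test vectors are enough to show independence for all three conditions.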
For further information about support for Code Coverage, including details about how to use MC/DC support, please see: Code Coverage
Reduction of Code Coverage Instrumentation Footprint
The tiarmclang 2.1.0.LTS release provides two new code coverage related options that can help reduce the size of the instrumentation code and data that is added to an application build to enable computation and visualization of code coverage information.
Reduce Size of Profile Counter: -fprofile-counter-size=[64|32]
The default size of the compiler-generated profile counters that annotate an application when code coverage is enabled is 64 bits. The -fprofile-counter-size=32 option instructs the compiler to use 32-bit integer values, where applicable, to record the execution count associated with a basic block (a sequence of executable code that can potentially be the destination of a call or branch).
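To get a rough feel for the trade-off, the following C sketch estimates the storage needed for the counters alone at each width. The helper name and the counter count used in the usage note are hypothetical, and the estimate ignores any other instrumentation code or data that a coverage-enabled build contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes of storage for num_counters counters of the given width.
 * Only the two widths accepted by -fprofile-counter-size are allowed.
 */
size_t counter_storage_bytes(size_t num_counters, unsigned counter_bits) {
    assert(counter_bits == 32u || counter_bits == 64u);
    return num_counters * (counter_bits / 8u);
}

/* Largest execution count that a counter of each width can represent. */
const uint64_t max_count_64 = UINT64_MAX;
const uint64_t max_count_32 = UINT32_MAX;
```

For a hypothetical application with 1000 counters, counter_storage_bytes(1000, 64) gives 8000 bytes and counter_storage_bytes(1000, 32) gives 4000 bytes. The narrower counter can represent at most 4294967295 executions; these release notes do not say what happens beyond that count, so 32-bit counters are best suited to code that is not expected to execute that often in one run.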
Limit Generation of Code Coverage Information to Functions
Normally, when compiler-generated code coverage is enabled in tiarmclang, the compiler will annotate an application with execution counters for basic blocks. The -ffunction-coverage-only option can be used to reduce the code coverage instrumentation footprint by limiting compiler-generated code coverage information to function entry execution counts.
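The difference in granularity can be pictured with a hand-written model. The C sketch below is not compiler output and does not reflect how tiarmclang actually lays out its counters; the counter variables and the sign_of function are hypothetical stand-ins that only illustrate "one counter per function entry" versus "counters for individual blocks".

```c
#include <stdint.h>

/* Hand-written stand-ins for compiler-generated counters (illustrative). */
uint64_t entry_counter;       /* function-entry count                  */
uint64_t block_counters[3];   /* one count per path through sign_of()  */

/* Hypothetical function with three distinct paths. */
int sign_of(int x) {
    entry_counter++;              /* the only count kept in the
                                     function-entry-only model         */
    if (x < 0) {
        block_counters[0]++;      /* finer-grained: negative path      */
        return -1;
    }
    if (x > 0) {
        block_counters[1]++;      /* finer-grained: positive path      */
        return 1;
    }
    block_counters[2]++;          /* finer-grained: zero path          */
    return 0;
}
```

In this model, function-entry-only coverage needs a single counter for sign_of and can report that the function ran, while the finer-grained model needs additional counters but can also report which paths were never taken.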
Inter-Module Optimizations via Link-Time Optimization (LTO)
Beginning with the tiarmclang 2.0.0.STS release, support for whole-program optimization via link-time inter-module optimizations is available.
The -flto Option Turns on LTO
The LTO feature can be enabled using the -flto option on the tiarmclang command-line.
Building an LTO-Enabled Application with tiarmclang from the Command-Line Interface
- Compiling and Linking an Application from a Single tiarmclang Command
If compiling and linking from a single tiarmclang command, the -flto option can be inserted among the other compiler options. A typical tiarmclang command-line that turns on the LTO feature will look like this:
```
%> tiarmclang -mcpu=cortex-m4 -Oz -flto hello.c -o hello.out \
-Wl,-llnk.cmd,-mhello.map
```
- Compiling and Linking an Application in Separate Steps with tiarmclang
If compiling and linking in separate steps, the -flto option should be specified on both the tiarmclang compilation and linking commands, like so:
```
%> tiarmclang -mcpu=cortex-m4 -Oz -flto -c hello.c
%> tiarmclang -mcpu=cortex-m4 -Oz -flto hello.o -o hello.out \
-Wl,-llnk.cmd,-mhello.map
```
Building an LTO-Enabled Application with tiarmclang in a Code Composer Studio Project
A tiarmclang Code Composer Studio (CCS) project that has been imported into or created in a workspace can be built with LTO enabled by checking the “Select Link-Time Optimization (LTO) (-flto)” box in the Build->Arm Compiler->Optimization tab in the project build settings dialog pop-up window. This capability is available starting in CCS 12.0.
For example, given a simple “Hello World!” CCS project as the project in focus in a workspace, you can click on Project->Build Settings to bring up the Properties pop-up window. Assuming that the “TI Clang v2.1.0.LTS” compiler has been selected in the General->Compiler version box and other settings besides -flto have been accounted for, then:
- Click on Build->Arm Compiler->Optimization
- Click on the check-box beside “Select Link-Time Optimization (LTO) (-flto)”
- Click on the “Apply and Close” button
- Build your project
The -flto option will be used in both the compile and link steps of the project build.
If you are using a version of CCS prior to 12.0, then you can still enable LTO for the build of your application. Assuming the same “Hello World!” CCS project with all other settings accounted for, you can enable LTO by inserting the -flto option into both the Build->Arm Compiler and Build->Arm Linker tabs in the Project->Build Settings pop-up window as follows:
- Click on Build->Arm Compiler and edit the Command-line pattern contents as follows:
  Before: ${command} ${flags} ${output_flag}${output} ${inputs}
  After: ${command} -flto ${flags} ${output_flag}${output} ${inputs}
- Similarly, for the link step, click on Build->Arm Linker and edit the Command-line pattern contents as follows:
  Before: ${command} ${flags} ${output_flag}${output} ${inputs}
  After: ${command} -flto ${flags} ${output_flag}${output} ${inputs}
- Click on the “Apply and Close” button
- Build your project
LTO Development Flow
There are essentially two steps to employing link-time inter-module optimizations in the build of a given application.
1. Compile as much C/C++ source code as possible with the -flto option.
Compiling a C/C++ source file with the -flto option instructs the compiler to embed an intermediate representation (IR) in the object file that it generates. This includes any object files contained in libraries. In fact, all of the runtime libraries that are shipped with the tiarmclang toolchain are built with the -flto option. This allows a given object file from a runtime library to participate in LTO during the link step if LTO is turned on. An object file with embedded IR will be interpreted as a normal object file if LTO is not turned on during the link step.
2. Turn on the LTO feature during the link of your application.
As explained in the above section, LTO can be turned on by specifying the -flto option on the tiarmclang command during compilation and linking.
When LTO is turned on during the link, the linker will:
a. Extract the embedded IR content from each input object file that contains embedded IR to create a source IR module. This also applies to object files that are pulled in from object libraries to resolve references to undefined symbols.
b. Link the source IR modules together into a combined IR module.
c. Present the combined IR module to the compiler to “re-compile” the program with inter-module optimizations enabled.
d. Link the object file that results from the “re-compile” with all other input object files that do not contain embedded IR to produce the linked output file.
Benefits of Using LTO - Enabling Inter-Module Optimizations
Let’s consider a simple example application to demonstrate just one of the potential benefits of using LTO to enable inter-module optimization …
Consider a series of source files in which many of the same string constants are referenced repeatedly and across multiple source files.
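The contents of the source files used in this example are not reproduced in these release notes. As a purely hypothetical sketch of the pattern, the C fragment below shows two functions that would live in separate source files (named module_a.c and module_b.c here only for illustration) and that reference identical string literals. It is shown as a single block for brevity, so a compiler may already share the literals within this one translation unit; in a real multi-file build, each separately compiled object file can carry its own copy.

```c
#include <string.h>

/* --- hypothetical file: module_a.c --- */
const char *module_a_status(int ok) {
    return ok ? "operation completed" : "operation failed";
}

/* --- hypothetical file: module_b.c --- */
const char *module_b_status(int ok) {
    /* Same literal text as in module_a.c. */
    return ok ? "operation completed" : "operation failed";
}

/* Returns 1 when both modules report identical text for the same input. */
int same_status_text(int ok) {
    return strcmp(module_a_status(ok), module_b_status(ok)) == 0;
}
```

Because the two modules return text that compares equal, a single stored copy of each literal could serve both, which is the kind of opportunity that inter-module constant merging looks for.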
If we compile and link without LTO turned on:
```
%> tiarmclang -mcpu=cortex-m4 -Oz constant_merge_test.c \
ic_s10.c ic_s20.c ic_s30.c ic_s40.c s10.c s20.c s30.c s40.c \
-o no_lto.out -Wl,-llnk.cmd,-mno_lto.map
```
The map file reveals that the size of the .rodata section where all of the string constants are defined is reasonably large:
no_lto.map:
```
...
SEGMENT ALLOCATION MAP
run origin load origin length init length attrs members
---------- ----------- ---------- ----------- ----- -------
00000020 00000020 00007a4c 00007a4c r-x
00000020 00000020 00004ad2 00004ad2 r-- .rodata
...
...
```
But if we then compile with LTO enabled:
```
%> tiarmclang -mcpu=cortex-m4 -flto -Oz constant_merge_test.c \
ic_s10.c ic_s20.c ic_s30.c ic_s40.c s10.c s20.c s30.c s40.c \
-o with_lto.out -Wl,-llnk.cmd,-mwith_lto.map
```
Then the map file shows that the .rodata section is significantly smaller in the LTO-enabled build:
```
...
SEGMENT ALLOCATION MAP
run origin load origin length init length attrs members
---------- ----------- ---------- ----------- ----- -------
00000020 00000020 00005b84 00005b84 r-x
...
00004530 00004530 00001674 00001674 r-- .rodata
...
```
The use of LTO in this example enables the compiler to perform an inter-module constant merging optimization that saves 0x4ad2 - 0x1674 = 0x345e (13406) bytes in the .rodata section. Note that in this example, the savings in the .rodata section are offset somewhat by increased code size in other sections like .text. The net savings are 0x7a4c - 0x5b84 = 0x1ec8 (7880) bytes.
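The arithmetic can be checked with a few lines of C. The lengths below are copied from the two map file excerpts; the enum and function names are chosen here only for illustration. The last function derives how much the rest of the segment grew, assuming that everything in the segment other than .rodata accounts for the difference.

```c
#include <stdint.h>

/* Lengths taken from the two map file excerpts. */
enum {
    RODATA_NO_LTO    = 0x4ad2,  /* .rodata length without LTO   */
    RODATA_WITH_LTO  = 0x1674,  /* .rodata length with LTO      */
    SEGMENT_NO_LTO   = 0x7a4c,  /* segment length without LTO   */
    SEGMENT_WITH_LTO = 0x5b84   /* segment length with LTO      */
};

/* Bytes saved in .rodata by constant merging. */
uint32_t rodata_savings(void) {
    return RODATA_NO_LTO - RODATA_WITH_LTO;
}

/* Bytes saved across the whole segment. */
uint32_t net_savings(void) {
    return SEGMENT_NO_LTO - SEGMENT_WITH_LTO;
}

/* Growth in the rest of the segment that offsets the .rodata savings. */
uint32_t growth_elsewhere(void) {
    return rodata_savings() - net_savings();
}
```

Under that assumption, the rest of the segment grew by 13406 - 7880 = 5526 bytes in the LTO-enabled build.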
Improved Compiler Generated Debug Information to Enable Use of CCS Stack Usage View
The tiarmclang 2.1.0.LTS compiler will emit estimated stack usage debug information for all functions, including functions defined in runtime libraries, to enable the use of the Stack Usage View in Code Composer Studio (CCS). Additionally, functions in the runtime libraries that are sourced in assembly language have been annotated with assembly directives to supply estimated stack usage information for those functions.
Recently Fixed Issues
CODEGEN-6288 : tiarmclang optimizer removes empty loops that don’t have side effects
In tiarmclang releases prior to 1.2.1.STS, the optimizer would remove an empty while loop that contained no side effects. If a function contained only such a loop, then the optimizer would remove references to the function from other functions in the same compilation unit even if the function were annotated with an optnone function attribute.
In tiarmclang releases starting with 1.2.1.STS, you can now mark a function containing an empty loop with no side effects with an optnone function attribute and references to the function will not be removed.
Alternatively, you can specify an asm() statement inside the body of the empty loop to create a side effect that will prevent the loop from being removed. For example:
```
while (1) {
    __asm(" ");
}
```
Host Support / Dependencies
The following host-specific versions of the 2.1.0.LTS tiarmclang compiler are available:
- Linux: Ubuntu, RHEL 7
- Windows: 7, 8, 10
- Mac: OSX
Device Support
The tiarmclang compiler supports development of applications that are to be loaded and run on one of the following Arm Cortex processor variants:
ARM Processor Variant | Options |
---|---|
Cortex-M0 | “-mcpu=cortex-m0” |
Cortex-M0+ | “-mcpu=cortex-m0plus” |
Cortex-M3 | “-mcpu=cortex-m3” |
Cortex-M4 without FPv4SPD16 | “-mcpu=cortex-m4 -mfloat-abi=soft” |
Cortex-M4 with FPv4SPD16 | “-mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16” |
Cortex-M33 without FPv5SPD16 | “-mcpu=cortex-m33 -mfloat-abi=soft” |
Cortex-M33 with FPv5SPD16 | “-mcpu=cortex-m33 -mfloat-abi=hard -mfpu=fpv5-sp-d16” |
Cortex-R4 (Thumb) without VFPv3D16 | “-mcpu=cortex-r4 -mthumb -mfloat-abi=soft” |
Cortex-R4 (Thumb) with VFPv3D16 | “-mcpu=cortex-r4 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16” |
Cortex-R4 without VFPv3D16 | “-mcpu=cortex-r4 -mfloat-abi=soft” |
Cortex-R4 with VFPv3D16 | “-mcpu=cortex-r4 -mfloat-abi=hard -mfpu=vfpv3-d16” |
Cortex-R5 (Thumb) without VFPv3D16 | “-mcpu=cortex-r5 -mthumb -mfloat-abi=soft” |
Cortex-R5 (Thumb) with VFPv3D16 | “-mcpu=cortex-r5 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16” |
Cortex-R5 without VFPv3D16 | “-mcpu=cortex-r5 -mfloat-abi=soft” |
Cortex-R5 with VFPv3D16 | “-mcpu=cortex-r5 -mfloat-abi=hard -mfpu=vfpv3-d16” |
Resolved Defects
ID | Summary |
---|---|
CODEGEN-9997 | tiarmclang: LTO behaves differently than non-LTO with regards to how zero-initialized variables are defined |
CODEGEN-9850 | Unresolved reference to runtime library function when that function is referenced from asm statement |
CODEGEN-9838 | Update tiarmclang documentation to explain that C++ library does not support features related to threads and concurrency |
CODEGEN-9834 | Compiler ignores __attribute__((used)) |
CODEGEN-9779 | Function local static array allocated to .data section, and not .bss |
CODEGEN-9669 | TI Arm Clang mismatch between source code and debugger view with function subsections |
CODEGEN-9092 | tiarmclang mistakenly documents support for -fpic position independent code |
CODEGEN-8914 | _enable_IRQ in ti_compatibility.h only supports Cortex-M devices |
CODEGEN-8899 | tiarmlnk generates cinit record for tiny .init_array section |
CODEGEN-8887 | Compiler does not support linking code that uses C++ exceptions |
CODEGEN-8639 | tiarmar.exe is denied permission to create an archive file on Windows 7 |
CODEGEN-8533 | Use of virtual functions causes many RTS print functions to be linked into the program |
CODEGEN-8471 | Hex utility, when splitting a section as required by the bootloader, ignores the section alignment for the second part of the split |
CODEGEN-8255 | tiarmclang: zero-initialized static and global variables are being defined in .bss |
CODEGEN-6288 | tiarmclang: optimizer removes empty loops that don't have side effects |
Known Defects
The up-to-date known defects in v2.1.0.LTS can be found here (dynamically generated):
ID | Summary |
---|---|
CODEGEN-9415 | Compiler inappropriately generates non-empty ARM.exidx sections |
CODEGEN-9398 | Use of -save-temps causes warning diagnostic to not be emitted |
CODEGEN-8955 | math.h does not work with -std=c89 or c90 or gnu90 |
CODEGEN-8216 | Code coverage symbols not defined when profile counter section manually placed |