TI Arm Clang Compiler Tools - 2.1.3.LTS Release Notes
Table of Contents
- Introduction
- Long-Term Support Release
- Documentation
- TI E2E Community - Where to Get Help
- Defect Tracking Database
- What’s New in TI Arm Clang Compiler Tools 2.1.3.LTS
- What’s New in TI Arm Clang Compiler Tools 2.1.1.LTS
- What’s New in TI Arm Clang Compiler Tools 2.1.0.LTS
- Recently Fixed Issues
- CODEGEN-10591 : Optimization of Logical NOT on a condition yields incorrect Modified Condition/Decision Coverage (MC/DC) tracking data
- CODEGEN-10444 : Enabling Code Coverage in combination with Link-Time Optimization (LTO) results in missing profile data section
- CODEGEN-10229 : tiarmclang compiler-generated debug information can cause Code Composer Studio (CCS) to crash
- CODEGEN-10000 : Mixing ARM and THUMB mode object files when linking a Cortex-R4 or -R5 application with Link-Time Optimization (LTO) enabled can cause a run-time hardware exception (attempt to execute an ARM mode instruction while in THUMB mode, or vice versa)
- CODEGEN-9415 : tiarmclang compiler inappropriately generates non-empty ARM.exidx sections
- CODEGEN-8216 : Code coverage symbols not defined when profile counter section is manually placed in target memory
- CODEGEN-6288 : tiarmclang optimizer removes empty loops that don’t have side effects
- Host Support / Dependencies
- Device Support
- Resolved Defects
- Known Defects
Introduction
Version 2.1.3.LTS of the TI Arm Clang Compiler Tools, also known as the tiarmclang compiler, is derived from the open source LLVM/Clang source code base and the LLVM Compiler Infrastructure source base that can be found on GitHub (github.com).
The tiarmclang compiler can be used to compile and link C/C++ and assembly source files to build static executable application files that can be loaded and run on an Arm Cortex processor (m0, m0plus, m3, m4, m33, r4, and r5). Please see the Device Support section below for further information about which compiler options to use when building an application for a particular Arm Cortex processor configuration.
Long-Term Support Release
This is a Long-Term Support (LTS) release.
Version 2.1.3.LTS of the TI Arm Clang Compiler Tools is the third maintenance release for the 2.1.0.LTS release series. It contains all features provided by the 2.1.0.LTS release plus bug fixes that have been implemented since the 2.1.0.LTS release was made available.
For definitions and explanations of STS, LTS, and the version numbering scheme, please see SDTO Compiler Version Numbers.
Documentation
The TI Arm Clang Compiler Tools User’s Guide is now available online at the following URL:
Since the tiarmclang compiler is derived from the LLVM project’s Clang compiler source base, much of the generic Clang online documentation is also applicable to the tiarmclang compiler. The latest version of the generic Clang documentation can be found here:
TI E2E Community - Where to Get Help
Post compiler related questions to the TI E2E design community forum and select the TI device being used.
The following is the top-level webpage for all of TI’s Code Generation Tools.
If submitting a defect report, please attach a scaled-down test case with command-line options and the compiler version number to allow us to reproduce the issue easily.
Defect Tracking Database
Compiler defect reports can be tracked at the new Development Tools bug database, SIR. SIR is a JIRA-based view into all public tools defects. The old SDOWP tracking database will be retired.
A my.ti.com account is required to access this page. To find an issue in SIR, enter your defect id in the top right search box once logged in. Alternatively, from the top red navigation bar, select “Issues” then “Search for Issues”.
To find an old SDOWP issue, place the SDOWP ID in the search box and use double quotes around the SDOWP ID.
What’s New in TI Arm Clang Compiler Tools 2.1.3.LTS
Support for ACLE coprocessor intrinsics (CODEGEN-10425)
In the 2.1.3.LTS release, support for the ACLE coprocessor intrinsics, as described in Arm C Language Extensions, has been added.
This includes support for defining the __ARM_FEATURE_COPROC pre-defined macro symbol for applicable TI Arm processor variants. If the __ARM_FEATURE_COPROC pre-defined macro symbol is defined at compile-time, its value will determine which ACLE coprocessor intrinsics are available, as detailed here …
If (__ARM_FEATURE_COPROC & 1) != 0, then the following ACLE coprocessor intrinsics are available:
- void __arm_cdp (int coproc, unsigned opc1, int CRd, int CRn, int CRm, unsigned opc2);
- void __arm_ldc (int coproc, int CRd, const void *p);
- void __arm_ldcl (int coproc, int CRd, const void *p);
- void __arm_stc (int coproc, int CRd, void *p);
- void __arm_stcl (int coproc, int CRd, void *p);
- void __arm_mcr (int coproc, unsigned opc1, uint32_t value, int CRn, int CRm, unsigned opc2);
- uint32_t __arm_mrc (int coproc, unsigned opc1, int CRn, int CRm, unsigned opc2);
If (__ARM_FEATURE_COPROC & 2) != 0, then the following ACLE coprocessor intrinsics are available:
- void __arm_cdp2 (int coproc, unsigned opc1, int CRd, int CRn, int CRm, unsigned opc2);
- void __arm_ldc2 (int coproc, int CRd, const void *p);
- void __arm_ldc2l (int coproc, int CRd, const void *p);
- void __arm_stc2 (int coproc, int CRd, void *p);
- void __arm_stc2l (int coproc, int CRd, void *p);
- void __arm_mcr2 (int coproc, unsigned opc1, uint32_t value, int CRn, int CRm, unsigned opc2);
- uint32_t __arm_mrc2 (int coproc, unsigned opc1, int CRn, int CRm, unsigned opc2);
If (__ARM_FEATURE_COPROC & 4) != 0, then the following ACLE coprocessor intrinsics are available:
- void __arm_mcrr (int coproc, unsigned opc1, uint64_t value, int CRm);
- uint64_t __arm_mrrc (int coproc, unsigned opc1, int CRm);
If (__ARM_FEATURE_COPROC & 8) != 0, then the following ACLE coprocessor intrinsics are available:
- void __arm_mcrr2 (int coproc, unsigned opc1, uint64_t value, int CRm);
- uint64_t __arm_mrrc2 (int coproc, unsigned opc1, int CRm);
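As a minimal sketch of how the feature macro might be used to guard a coprocessor access, consider the following. It assumes that the intrinsics are declared in the arm_acle.h header and that __arm_mrc returns the 32-bit value read, as the ACLE specification describes. The function names coproc_has_mcr_mrc and read_cp15_c0 are hypothetical, the choice of coprocessor 15 with CRn=c0, CRm=c0, opc2=0 (assumed here to be the Main ID Register on a Cortex-R4 or -R5) is purely illustrative, and the fallback branch exists only so that the file still compiles for a target that does not define the macro.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_COPROC)
#include <arm_acle.h>
#endif

/* Hypothetical helper: returns 1 when bit 0 of __ARM_FEATURE_COPROC is set,
   meaning the __arm_cdp/__arm_ldc/__arm_stc/__arm_mcr/__arm_mrc family
   is available, and 0 otherwise. */
int coproc_has_mcr_mrc(void)
{
#if defined(__ARM_FEATURE_COPROC) && ((__ARM_FEATURE_COPROC & 1) != 0)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: reads coprocessor 15 with opc1=0, CRn=c0, CRm=c0,
   opc2=0. The coprocessor number, opcodes and register numbers must be
   compile-time constants. This access normally requires a privileged
   mode, so call it only from code that runs with that privilege. */
uint32_t read_cp15_c0(void)
{
#if defined(__ARM_FEATURE_COPROC) && ((__ARM_FEATURE_COPROC & 1) != 0)
    return __arm_mrc(15, 0, 0, 0, 0);
#else
    return 0u; /* assumption: placeholder value when the intrinsic is absent */
#endif
}
```

Testing the macro at compile time, rather than assuming a particular -mcpu setting, keeps the same source usable across processor variants that expose different subsets of the intrinsics.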
What’s New in TI Arm Clang Compiler Tools 2.1.1.LTS
Support for new tiarmhex options: --boot_align_sect and --boot_block_size=size
NOTE:
The recommended use of the new tiarmhex options, “--boot_align_sect” and “--boot_block_size”, applies only when using the tiarmhex utility to generate an Arm hex boot file that will be consumed by the C28 on-chip bootloader per the C28 FLASH API.
There are two new options available in the tiarmhex utility that help resolve two issues (CODEGEN-9831 and CODEGEN-8471):
- --boot_align_sect causes tiarmhex to adjust the default boot record size limit based on section alignment (for alignment > 1). For example, if an output section has ALIGN(16), then the boot record size limit is adjusted from the default value of 0xFFFE to 0xFFF0.
- --boot_block_size=size overrides the hex utility’s default boot block size. (The Arm default is 0xFFFF, which is not supported by the C28 FLASH API.)
If you are using the tiarmhex utility to generate an Arm hex boot file for use by the C28 FLASH API, the above options should be included in the tiarmhex command-line:
%> tiarmhex <current boot options> --boot_align_sect --boot_block_size=0xFFFE
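The ALIGN(16) example above is consistent with rounding the size limit down to a multiple of the section alignment. The sketch below shows that arithmetic only. The rounding rule is inferred from the single example in these notes and is an assumption, not a description of how tiarmhex is implemented, and the function name adjusted_boot_record_limit is hypothetical.

```c
#include <stdint.h>

/* Round a boot record size limit down to a multiple of the section
   alignment. Assumes the alignment is a power of two. An alignment of
   0 or 1 leaves the limit unchanged, matching the "alignment > 1"
   condition stated for --boot_align_sect. */
uint32_t adjusted_boot_record_limit(uint32_t limit, uint32_t alignment)
{
    if (alignment <= 1u) {
        return limit;
    }
    return limit & ~(alignment - 1u);
}
```

With a limit of 0xFFFE and an alignment of 16, the function returns 0xFFF0, which matches the value quoted for an ALIGN(16) output section.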
What’s New in TI Arm Clang Compiler Tools 2.1.0.LTS
Support for Modified Condition/Decision Coverage (MC/DC)
The tiarmclang 2.1.0.LTS release provides support for Modified Condition/Decision Coverage (MC/DC) on top of the existing Source-Based Code Coverage framework. MC/DC is an ISO 26262 functional safety requirement for ASIL-D. For each decision that is a compound boolean expression, it requires showing that each condition in the decision independently affects the decision’s outcome. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding all other conditions fixed.
For further information about support for Code Coverage, including details about how to use MC/DC support, please see: Code Coverage
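To make the independence requirement concrete, here is a small sketch using a hypothetical three-condition decision. The function name and the chosen test vectors are illustrative assumptions and do not come from the tiarmclang documentation.

```c
#include <stdbool.h>

/* Hypothetical decision with three conditions: a, b and c. */
bool decision(bool a, bool b, bool c)
{
    return a && (b || c);
}

/*
 * One set of test vectors that demonstrates independence for each
 * condition (T = true, F = false):
 *
 *   a: (T,T,F) -> T  versus  (F,T,F) -> F   only a changes
 *   b: (T,T,F) -> T  versus  (T,F,F) -> F   only b changes
 *   c: (T,F,T) -> T  versus  (T,F,F) -> F   only c changes
 *
 * Four distinct vectors (TTF, FTF, TFF, TFT) cover all three conditions.
 */
```

Each pair differs in exactly one condition and produces a different outcome, which is the evidence MC/DC asks for.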
Reduction of Code Coverage Instrumentation Footprint
The tiarmclang 2.1.0.LTS release provides two new code coverage related options that can help reduce the size of the instrumentation code and data that is added to an application build to enable computation and visualization of code coverage information.
Reduce Size of Profile Counter: -fprofile-counter-size=[64|32]
The default size for the compiler-generated profile counters that annotate an application when code coverage is enabled is 64 bits. The -fprofile-counter-size=32 option instructs the compiler to use 32-bit integer values to record the execution count associated with a basic block (a sequence of executable code that can potentially be the destination of a call or branch) where applicable.
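As a rough illustration of the storage trade-off, the sketch below estimates the bytes taken by the counters alone for a given counter width. It is a back-of-the-envelope calculation under stated assumptions: the function name is hypothetical, every counter is assumed to use the selected width, and other coverage metadata is not counted.

```c
#include <stddef.h>

/* Estimate the bytes of storage used by the profile counters alone.
   counter_bits is expected to be 64 or 32, the two values accepted by
   -fprofile-counter-size. Other coverage data is not included. */
size_t profile_counter_bytes(size_t num_counters, unsigned counter_bits)
{
    return num_counters * (size_t)(counter_bits / 8u);
}
```

For a hypothetical application with 1000 counters, this gives 8000 bytes at the default width and 4000 bytes with 32-bit counters. The narrower counter can represent execution counts only up to 4294967295.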
Limit Generation of Code Coverage Information to Functions
Normally when compiler generated code coverage is enabled in tiarmclang, the compiler will annotate an application with execution counters for basic blocks. The -ffunction-coverage-only option can be used to reduce the code coverage instrumentation footprint by limiting compiler generated code coverage information to function entry execution counts.
Inter-Module Optimizations via Link-Time Optimization (LTO)
Beginning with the tiarmclang 2.0.0.STS release, support for whole-program optimization via link-time inter-module optimizations is available.
The -flto Option Turns on LTO
The LTO feature can be enabled using the -flto option on the tiarmclang command-line.
Building an LTO-Enabled Application with tiarmclang from the Command-Line Interface
- Compiling and Linking an Application from a Single tiarmclang Command
If compiling and linking from a single tiarmclang command, the -flto option can be inserted among the other compiler options. A typical tiarmclang command-line that turns on the LTO feature will look like this:
%> tiarmclang -mcpu=cortex-m4 -Oz -flto hello.c -o hello.out -Wl,-llnk.cmd,-mhello.map
- Compiling and Linking an Application in Separate Steps with tiarmclang
If compiling and linking in separate steps, the -flto option should be specified on both the tiarmclang compilation and linking commands, like so:
%> tiarmclang -mcpu=cortex-m4 -Oz -flto -c hello.c
%> tiarmclang -mcpu=cortex-m4 -Oz -flto hello.o -o hello.out -Wl,-llnk.cmd,-mhello.map
Building an LTO-Enabled Application with tiarmclang in a Code Composer Studio Project
A tiarmclang Code Composer Studio (CCS) project that has been imported into or created in a workspace can be built with LTO enabled by checking the “Select Link-Time Optimization (LTO) (-flto)” box in the Build->Arm Compiler->Optimization tab in the project build settings dialog pop-up window. This capability is available starting in CCS 12.0.
For example, given a simple “Hello World!” CCS project as the project in focus in a workspace, you can click on Project->Build Settings to bring up the Properties pop-up window. Assuming that the “TI Clang v2.1.0.LTS” compiler has been selected in the General->Compiler version box and other settings besides -flto have been accounted for, then:
- Click on Build->Arm Compiler->Optimization
- Click on the check-box beside “Select Link-Time Optimization (LTO) (-flto)”
- Click on the “Apply and Close” button
- Build your project
The -flto option will be used in both the compile and link steps of the project build.
If you are using a version of CCS prior to 12.0, then you can still enable LTO for the build of your application. Assuming the same “Hello World!” CCS project with all other settings accounted for, you can enable LTO by inserting the -flto option into both the Build->Arm Compiler and Build->Arm Linker tabs in the Project->Build Settings pop-up window as follows:
- Click on Build->Arm Compiler and edit the Command-line pattern contents as follows:
Before: ${command} ${flags} ${output_flag}${output} ${inputs}
After: ${command} -flto ${flags} ${output_flag}${output} ${inputs}
- Similarly, for the link-step, click on Build->Arm Linker and edit the Command-line pattern contents as follows:
Before: ${command} ${flags} ${output_flag}${output} ${inputs}
After: ${command} -flto ${flags} ${output_flag}${output} ${inputs}
- Click on the “Apply and Close” button
- Build your project
LTO Development Flow
There are essentially two steps to employing link-time inter-module optimizations in the build of a given application.
Compile as much C/C++ source code as possible with the -flto option.
Compiling a C/C++ source file with the -flto option instructs the compiler to embed an intermediate representation (IR) in the object file that it produces. This also applies to object files contained in libraries. In fact, all of the runtime libraries that are shipped with the tiarmclang toolchain are built with the -flto option, which allows an object file from a runtime library to participate in LTO during the link step if LTO is turned on. An object file with embedded IR is treated as a normal object file if LTO is not turned on during the link step.
Turn on the LTO feature during the link of your application
As explained in the above section, LTO can be turned on by specifying the -flto option on the tiarmclang command during compilation and linking.
When LTO is turned on during the link, the linker will:
a. Extract the embedded IR content from each input object file that contains embedded IR to create a source IR module. This also applies to object files that are pulled in from object libraries to resolve references to undefined symbols.
b. Link the source IR modules together into a combined IR module.
c. Present the combined IR module to the compiler to “re-compile” the program with inter-module optimizations enabled.
d. Link the resulting object file from the “re-compile” with all other input object files that do not contain embedded IR to produce the linked output file.
Benefits of Using LTO - Enabling Inter-Module Optimizations
Let’s consider a simple example application to demonstrate just one of the potential benefits of using LTO to enable inter-module optimization …
Consider a series of source files in which many of the same string constants are referenced repeatedly and across multiple source files.
If we compile and link without LTO turned on:
%> tiarmclang -mcpu=cortex-m4 -Oz constant_merge_test.c \
ic_s10.c ic_s20.c ic_s30.c ic_s40.c s10.c s20.c s30.c s40.c \
-o no_lto.out -Wl,-llnk.cmd,-mno_lto.map
The map file reveals that the size of the .rodata section where all of the string constants are defined is reasonably large:
no_lto.map:
...
SEGMENT ALLOCATION MAP
run origin load origin length init length attrs members
---------- ----------- ---------- ----------- ----- -------
00000020 00000020 00007a4c 00007a4c r-x
00000020 00000020 00004ad2 00004ad2 r-- .rodata
...
...
But if we then compile with LTO enabled:
%> tiarmclang -mcpu=cortex-m4 -flto -Oz constant_merge_test.c \
ic_s10.c ic_s20.c ic_s30.c ic_s40.c s10.c s20.c s30.c s40.c \
-o with_lto.out -Wl,-llnk.cmd,-mwith_lto.map
Then the map file shows that the .rodata section is significantly smaller in the LTO enabled build:
...
SEGMENT ALLOCATION MAP
run origin load origin length init length attrs members
---------- ----------- ---------- ----------- ----- -------
00000020 00000020 00005b84 00005b84 r-x
...
00004530 00004530 00001674 00001674 r-- .rodata
...
The use of LTO in this example enables the compiler to perform an inter-module constant merging optimization that saves 0x4ad2 - 0x1674 = 0x345e (13406) bytes in the .rodata section. Note that in this example, the savings in the .rodata section are offset somewhat by increased code size in other sections like .text. The net savings, taken from the length of the containing segment in each map file, is 0x7a4c - 0x5b84 = 0x1ec8 (7880) bytes.
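The arithmetic behind these figures can be checked with a short sketch. The lengths are copied from the two map file excerpts above. The constant and function names are hypothetical.

```c
#include <stdint.h>

/* Lengths taken from the no_lto.map and LTO-enabled map excerpts. */
enum {
    RODATA_NO_LTO    = 0x4ad2, /* .rodata length without LTO  */
    RODATA_WITH_LTO  = 0x1674, /* .rodata length with LTO     */
    SEGMENT_NO_LTO   = 0x7a4c, /* segment length without LTO  */
    SEGMENT_WITH_LTO = 0x5b84  /* segment length with LTO     */
};

/* Bytes saved in the .rodata section by constant merging. */
uint32_t rodata_savings(void)
{
    return (uint32_t)(RODATA_NO_LTO - RODATA_WITH_LTO);
}

/* Bytes saved across the whole segment. */
uint32_t net_savings(void)
{
    return (uint32_t)(SEGMENT_NO_LTO - SEGMENT_WITH_LTO);
}

/* Growth in the rest of the segment that offsets the .rodata savings. */
uint32_t growth_elsewhere(void)
{
    return rodata_savings() - net_savings();
}
```

The functions return 13406, 7880 and 5526 respectively, so roughly 5.5 KB of the .rodata reduction is given back as growth in the rest of the segment.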
Improved Compiler Generated Debug Information to Enable Use of CCS Stack Usage View
The tiarmclang 2.1.0.LTS compiler will emit estimated stack usage debug information for all functions, including functions defined in runtime libraries, to enable the use of the Stack Usage View in Code Composer Studio (CCS). Additionally, functions in the runtime libraries that are sourced in assembly language have been annotated with assembly directives to supply estimated stack usage information for those functions.
Recently Fixed Issues
CODEGEN-10591 : Optimization of Logical NOT on a condition yields incorrect Modified Condition/Decision Coverage (MC/DC) tracking data
The tiarmclang code generator will sometimes flip the sense of a negated condition used by a branch to elide an instruction. When this happens, the MC/DC instrumentation records the wrong sense of the condition, yielding incorrect tracking data. This results in a post-processing failure where tiarmcov is not able to match an executed test vector to its list of known test vectors.
In the tiarmclang 2.1.3.LTS release, this issue is no longer present. The relevant optimization is avoided on leaf-level conditions when instrumenting for MC/DC.
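For orientation, the following sketch shows the general shape of source code this issue concerns: a decision in which a logical NOT is applied to a leaf condition. The function is hypothetical, and whether any particular expression triggered the defect depended on choices made by the code generator, so this is not a confirmed reproducer.

```c
#include <stdbool.h>

/* Hypothetical decision with a logical NOT applied to a leaf condition. */
bool may_transmit(bool busy, bool enabled)
{
    if (!busy && enabled) {
        return true;
    }
    return false;
}

/*
 * Test vectors that show each condition acting independently:
 *
 *   busy:    (F,T) -> true  versus  (T,T) -> false
 *   enabled: (F,T) -> true  versus  (F,F) -> false
 */
```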
CODEGEN-10444 : Enabling Code Coverage in combination with Link-Time Optimization (LTO) results in missing profile data section
In tiarmclang 2.1.x.LTS releases prior to 2.1.3.LTS, when Link-Time Optimization (LTO) is used in combination with Code Coverage, the __llvm_prf_data dependencies that tiarmclang adds to the __llvm_prf_cnts counter section are effectively erased. As such, after LTO, the __llvm_prf_data is no longer present, and downstream code coverage tools fail.
This issue can be avoided in the prior 2.1.x.LTS releases by not enabling LTO during the link step of an application build.
The issue has been fixed in release 2.1.3.LTS.
CODEGEN-10229 : tiarmclang compiler-generated debug information can cause Code Composer Studio (CCS) to crash
In the tiarmclang releases prior to the 2.1.1.LTS release, the tiarmclang compiler would generate self-referencing debug information in some cases. The issue has been fixed in the tiarmclang 2.1.1.LTS release.
This issue can be avoided in tiarmclang releases prior to 2.1.1.LTS by taking the following actions when launching the debugger:
- Launch the debugger
- Connect to target
- Insert “symbol_loader=1” in the “Expressions View”
- Load symbols/program
CODEGEN-10000 : Mixing ARM and THUMB mode object files when linking a Cortex-R4 or -R5 application with Link-Time Optimization (LTO) enabled can cause a run-time hardware exception (attempt to execute an ARM mode instruction while in THUMB mode, or vice versa)
In tiarmclang releases prior to the 2.1.1.LTS release, the tiarmclang linker, when run with LTO enabled, can incorrectly resolve a call from an ARM mode function to another ARM mode function (or a THUMB mode function call to another THUMB function) with a BLX opcode. This causes the processor to switch instruction modes during the call and attempt to execute an ARM mode instruction as a THUMB mode instruction, or vice versa. This issue has been fixed in the 2.1.1.LTS release. The issue does not exist in the 1.3.0.LTS or 1.3.1.LTS releases since the LTO feature was not available until the 2.1.0.LTS release.
The issue can be avoided by building a Cortex-R4 or -R5 application completely in ARM mode. That is, avoid mixing compiler generated ARM mode object files with THUMB mode object files during the link step of your application build.
CODEGEN-9415 : tiarmclang compiler inappropriately generates non-empty ARM.exidx sections
In tiarmclang releases prior to 2.1.3.LTS, when building a C++ application, the tiarmclang compiler generates ARM.exidx sections even though exception handling is not enabled.
The tiarmclang 2.1.3.LTS compiler will not generate ARM.exidx sections when exceptions are not enabled.
CODEGEN-8216 : Code coverage symbols not defined when profile counter section is manually placed in target memory
In tiarmclang releases prior to 2.1.3.LTS, if the __llvm_prf_cnts section’s placement is explicitly dictated by a linker command file’s SECTIONS directive, then the __start___llvm_prf_cnts and __stop___llvm_prf_cnts symbols will not be created.
This problem can be avoided by not specifying explicit placement of the __llvm_prf_cnts section in a SECTIONS directive.
This issue has been resolved in the tiarmclang 2.1.3.LTS release.
CODEGEN-6288 : tiarmclang optimizer removes empty loops that don’t have side effects
In tiarmclang releases prior to 1.2.1.STS, the optimizer would remove an empty while loop that contained no side effects. If a function contained only such a loop, then the optimizer would remove references to the function from other functions in the same compilation unit even if the function were annotated with an optnone function attribute.
In tiarmclang releases starting with 1.2.1.STS, you can now mark a function containing an empty loop with no side effects with an optnone function attribute and references to the function will not be removed.
Alternatively, you can specify an asm() statement inside the body of the empty loop to create a side effect that will prevent the loop from being removed. For example:
while (1) {
__asm(" ");
}
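The optnone approach might look like the following sketch. The function names and the NO_OPT macro are hypothetical, and the macro expands to nothing for compilers other than Clang so that the file remains portable.

```c
#if defined(__clang__)
#define NO_OPT __attribute__((optnone))
#else
#define NO_OPT /* optnone is a Clang attribute, so this is a no-op elsewhere */
#endif

/* Hypothetical function whose entire body is an empty loop with no
   side effects. It never returns. */
NO_OPT void spin_forever(void)
{
    while (1) {
    }
}

/* Hypothetical caller in the same compilation unit. The reference to
   spin_forever is the kind of call the text above says is preserved. */
void fatal_error_handler(void)
{
    spin_forever();
}
```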
Host Support / Dependencies
The following host-specific versions of the 2.1.3.LTS tiarmclang compiler are available:
- Linux: Ubuntu, RHEL 7
- Windows: 7, 8, 10
- Mac: OSX
Device Support
The tiarmclang compiler supports development of applications that are to be loaded and run on one of the following Arm Cortex processor variants:
ARM Processor Variant | Options |
---|---|
Cortex-M0 | “-mcpu=cortex-m0” |
Cortex-M0+ | “-mcpu=cortex-m0plus” |
Cortex-M3 | “-mcpu=cortex-m3” |
Cortex-M4 without FPv4SPD16 | “-mcpu=cortex-m4 -mfloat-abi=soft” |
Cortex-M4 with FPv4SPD16 | “-mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16” |
Cortex-M33 without FPv5SPD16 | “-mcpu=cortex-m33 -mfloat-abi=soft” |
Cortex-M33 with FPv5SPD16 | “-mcpu=cortex-m33 -mfloat-abi=hard -mfpu=fpv5-sp-d16” |
Cortex-R4 (Thumb) without VFPv3D16 | “-mcpu=cortex-r4 -mthumb -mfloat-abi=soft” |
Cortex-R4 (Thumb) with VFPv3D16 | “-mcpu=cortex-r4 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16” |
Cortex-R4 without VFPv3D16 | “-mcpu=cortex-r4 -mfloat-abi=soft” |
Cortex-R4 with VFPv3D16 | “-mcpu=cortex-r4 -mfloat-abi=hard -mfpu=vfpv3-d16” |
Cortex-R5 (Thumb) without VFPv3D16 | “-mcpu=cortex-r5 -mthumb -mfloat-abi=soft” |
Cortex-R5 (Thumb) with VFPv3D16 | “-mcpu=cortex-r5 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16” |
Cortex-R5 without VFPv3D16 | “-mcpu=cortex-r5 -mfloat-abi=soft” |
Cortex-R5 with VFPv3D16 | “-mcpu=cortex-r5 -mfloat-abi=hard -mfpu=vfpv3-d16” |
Resolved Defects
ID | Summary |
---|---|
CODEGEN-10591 | Optimization of Logical NOT on condition yields incorrect MC/DC test vector tracking |
CODEGEN-10444 | Enabling Code Coverage and LTO results in missing profile data section |
CODEGEN-10383 | Document ‘#pragma clang section bss’ to be used for uninitialized variables |
CODEGEN-10229 | Crash can occur when loading symbols due to self-referencing DIE |
CODEGEN-10067 | LTO: linker should include undefined symbols that are referenced from a static function in the IR symbol table that is passed to the LTO recompile |
CODEGEN-10000 | LTO: Compiling a source file with cortex-r4/r5 with -mthumb and linking with ARM mode cortex-r4/r5 runtime libraries improperly resolves an R_ARM_CALL relocation |
CODEGEN-9997 | tiarmclang: LTO behaves differently than non-LTO with regards to how zero-initialized variables are defined |
CODEGEN-9850 | Unresolved reference to runtime library function when that function is referenced from asm statement |
CODEGEN-9838 | Update tiarmclang documentation to explain that C++ library does not support features related to threads and concurrency |
CODEGEN-9834 | Compiler ignores __attribute__((used)) |
CODEGEN-9831 | ARM hextool boot table max block size limitation (when used with C28 on-chip boot loader) |
CODEGEN-9779 | Function local static array allocated to .data section, and not .bss |
CODEGEN-9415 | Compiler inappropriately generates non-empty ARM.exidx sections |
CODEGEN-9092 | tiarmclang mistakenly documents support for -fpic position independent code |
CODEGEN-8914 | _enable_IRQ in ti_compatibility.h only supports Cortex-M devices |
CODEGEN-8899 | tiarmlnk generates cinit record for tiny .init_array section |
CODEGEN-8887 | Compiler does not support linking code that uses C++ exceptions |
CODEGEN-8639 | tiarmar.exe is denied permission to create an archive file on Windows 7 |
CODEGEN-8533 | Use of virtual functions causes many RTS print functions to be linked into the program |
CODEGEN-8471 | Hex utility, when splitting a section as required by the bootloader, ignores the section alignment for the second part of the split |
CODEGEN-8255 | tiarmclang: zero-initialized static and global variables are being defined in .bss |
CODEGEN-8216 | Code coverage symbols not defined when profile counter section manually placed |
CODEGEN-7745 | Disassembler armobjdump -d generates incorrect output |
CODEGEN-6288 | tiarmclang: optimizer removes empty loops that don't have side effects |
Known Defects
The up-to-date known defects in v2.1.3.LTS can be found here (dynamically generated):