TI Arm Clang Compiler Tools - 3.0.0.STS Release Notes

Introduction

Version 3.0.0.STS of the TI Arm Clang Compiler Tools, also known as the tiarmclang compiler, is derived from the open source LLVM/Clang source code base and the LLVM Compiler Infrastructure source base, both of which can be found on GitHub (github.com).

The tiarmclang compiler can be used to compile and link C/C++ and assembly source files to build static executable application files that can be loaded and run on an Arm Cortex processor (m0, m0plus, m3, m4, m33, r4, and r5). Please see the Device Support section below for further information about which compiler options to use when building an application for a particular Arm Cortex processor configuration.

Short-Term Support Release

This is a Short-Term Support (STS) release.

For definitions and explanations of STS, LTS, and the versioning number scheme, please see SDTO Compiler Version Numbers.

Documentation

The TI Arm Clang Compiler Tools User’s Guide is now available online at the following URL:

Since the tiarmclang compiler is derived from the LLVM project’s Clang compiler source base, much of the generic Clang online documentation is also applicable to the tiarmclang compiler. The latest version of the generic Clang documentation can be found here:

TI E2E Community - Where to Get Help

Post compiler related questions to the TI E2E design community forum and select the TI device being used.

The following is the top-level webpage for all of TI’s Code Generation Tools.

If submitting a defect report, please attach a scaled-down test case with command-line options and the compiler version number to allow us to reproduce the issue easily.

Defect Tracking Database

Compiler defect reports can be tracked at the new Development Tools bug database, SIR. SIR is a JIRA-based view into all public tools defects. The old SDOWP tracking database will be retired.

A my.ti.com account is required to access this page. To find an issue in SIR, log in and enter your defect ID in the search box at the top right. Alternatively, from the top red navigation bar, select “Issues” then “Search for Issues”.

To find an old SDOWP issue, place the SDOWP ID in the search box and use double quotes around the SDOWP ID.

What’s New

Support for C++ Exceptions (-fexceptions)

By default, C++ exceptions are disabled.

Beginning with version 3.0.0.STS of the tiarmclang compiler tools, the -fexceptions compiler option can be specified when the compiler is invoked to enable support for C++ exceptions.

If the -fexceptions compiler option is used to compile an application’s source code, then the linker will be instructed to link with runtime support libraries that support C++ exceptions during the link-step of an application build.

Example

Consider the following simple example of utilizing C++ exceptions:

```
#include <iostream>

int main() {
  int age;

  std::cout << "How old are you? ";
  std::cin >> age;
  try {
    if (age >= 18) {
      std::cout << "Please proceed to an open booth to vote ... Thank you!\n";
    }
    else {
      throw (age);
    }
  }
  catch (int input_age) {
    std::cout << "I'd like to help you, but you're too young to vote (" << input_age << ")\n";
  }

  return 0;
}
```

Compile the above C++ source file as follows:

```
%> tiarmclang -mcpu=cortex-m4 -fexceptions check_age.cpp -o check_age.out -Wl,-llnk.cmd
```

When loaded and run, the above application will generate the following output:

```
How old are you? 21
Please proceed to an open booth to vote ... Thank you!

How old are you? 5
I'd like to help you, but you're too young to vote (5)
```

Enable Compiler Generation of Execute-Only Code for Cortex-M0/M0+ Functions (-mexecute-only)

The -mexecute-only compiler option can be used in version 3.0.0.STS of the tiarmclang compiler tools to generate “execute-only” code for Cortex-M0/M0+. Use of the -mexecute-only compiler option on the compiler invocation will prevent constant data from being embedded in the code section that the compiler generates for a function.

When an application’s source files are compiled with the -mexecute-only option, the linker will be instructed to link with execute-only versions of the runtime support libraries during the link-step of an application build.

Example

Consider a simple example with a function that contains a switch statement:

```
#include <stdio.h>

void mySwitch(int n) {

  switch (n) {
    case 1:
      printf("Input value is 1\n");
      break;
    case 2:
      printf("Input value is 2\n");
      break;
    case 3:
      printf("Input value is 3\n");
      break;
    default:
      printf("Invalid input\n");
      break;
  }
}
```

Compile the above C source file as follows, using the -S option to emit compiler-generated assembly:

```
%> tiarmclang -mcpu=cortex-m0plus -mexecute-only -S ex_switch.c
```

The compiler-generated assembly file contains the following:

```
%> cat ex_switch.s
...
        .section        .text.mySwitch,"axy",%progbits,unique,0
        .globl  mySwitch
        .p2align        1
        .code   16                              @ @mySwitch
        .thumb_func
mySwitch:
...
.LBB0_5:                                @ %sw.bb3
        movs    r0, :upper8_15:.L.str.2
        lsls    r0, r0, #8
        adds    r0, :upper0_7:.L.str.2
        lsls    r0, r0, #8
        adds    r0, :lower8_15:.L.str.2
        lsls    r0, r0, #8
        adds    r0, :lower0_7:.L.str.2
        bl      printf
...
        .section        .rodata.str1.1,"aMS",%progbits,1
.L.str:
        .asciz  "Input value is 1\n"
        .size   .L.str, 18
...
.L.str.2:
        .asciz  "Input value is 3\n"
        .size   .L.str.2, 18
...
```

Note that in the above compiler-generated code, the address where each string constant resides in the .rodata.str1.1 section is assembled directly in a register from immediate operands. This requires four 8-bit immediate operations (one movs and three adds, separated by left shifts) to place each part of the address into the register before the call to printf can be made.

Compare this to the code that is generated when execute-only is disabled:

```
%> tiarmclang -mcpu=cortex-m0plus -S ex_switch.c
%> cat ex_switch.s
...
        .section        .text.mySwitch,"ax",%progbits
        .globl  mySwitch
        .p2align        1
        .code   16                              @ @mySwitch
        .thumb_func
mySwitch:
...
.LBB0_5:                                @ %sw.bb3
        ldr     r0, .LCPI0_0
        bl      printf
...
        pop     {r7, pc}
        .p2align        2
.LCPI0_0:
        .long   .L.str.2
...
        .section        .rodata.str1.1,"aMS",%progbits,1
.L.str:
        .asciz  "Input value is 1\n"
        .size   .L.str, 18
...
.L.str.2:
        .asciz  "Input value is 3\n"
        .size   .L.str.2, 18
...
```

Observe that in the non-execute-only code, the address of the string constant is stored in a table of constants (a literal pool) that is included in the .text section containing the definition of the mySwitch function. This allows the address to be loaded with a single PC-relative load.

While the non-execute-only compiler-generated code is smaller and more efficient, the execute-only code may reside in special execute-only memory. This can be useful when code security is a concern in your application.
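
The address construction in the execute-only listing can be modeled on a host machine. The following is a minimal C++ sketch, not compiler output: the function name build_address and the sample address 0x20001234 are hypothetical, and each argument stands in for the byte that the corresponding :upper8_15:, :upper0_7:, :lower8_15:, or :lower0_7: operator would supply for a symbol's address.

```cpp
#include <cstdint>

// Host-side model of the movs/lsls/adds sequence in the execute-only listing.
// build_address is a hypothetical name used only for illustration.
std::uint32_t build_address(std::uint8_t upper8_15, std::uint8_t upper0_7,
                            std::uint8_t lower8_15, std::uint8_t lower0_7) {
  std::uint32_t r0 = upper8_15;  // movs r0, :upper8_15:sym
  r0 <<= 8;                      // lsls r0, r0, #8
  r0 += upper0_7;                // adds r0, :upper0_7:sym
  r0 <<= 8;                      // lsls r0, r0, #8
  r0 += lower8_15;               // adds r0, :lower8_15:sym
  r0 <<= 8;                      // lsls r0, r0, #8
  r0 += lower0_7;                // adds r0, :lower0_7:sym
  return r0;                     // full 32-bit address, no data read from .text
}
```

For the sample address, build_address(0x20, 0x00, 0x12, 0x34) returns 0x20001234. The model uses seven operations where the non-execute-only code uses a single PC-relative ldr, which illustrates the size and speed cost noted above.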

Enable Linker Generation of an XML Function Hash Table for OpTI-Flash/OpTI-SHARE (--gen_xml_func_hash)

In the Sitara OpTI-Flash multicore context, it is desirable to identify functions that are common to multiple executables so that users can factor these functions out and place them in shared memory, reducing the size of each individual executable. This is also known as “OpTI-SHARE”. To identify common functions in a meaningful way (function name and size alone are not enough), the tiarmclang version 3.0.0.STS linker can now generate an MD5 hash based on the function’s raw data prior to relocation and emit it within a table of function symbols in the linker-generated XML link info file.

The linker will also generate a list of referenced data sections from each global function uniquely identified by their object component IDs. Common read-only data sections can also be allocated in shared memory. However, writes to read-write data sections from common code must be managed through hardware address translation available on the device (aka “RAT”). These referenced data section lists can also be used in conjunction with “Smart Placement” where fast data access from frequently executed functions is desired.

When linking an application, the aforementioned table and referenced section lists will be generated when the “--xml_link_info” option is used in conjunction with “--gen_xml_func_hash”. The “--xml_link_info” option can be given a file name to use for the output.

Example

The generated table is designated by a “func_symbol_table” XML tag, with each global function represented by a “symbol” tag. The associated MD5 hash is indicated by a “value” tag, and the referenced data section lists are indicated by “refd_ro_sections” and “refd_rw_sections” tags for read-only (constant) data and read-write data, respectively. For example:

```
<func_symbol_table>
  <symbol>
    <name>func0</name>
    <sectname>.text.main</sectname>
    <value>b6e5b51736000aef4da6e8afb91846e4</value>
  </symbol>
  <symbol>
    <name>func1</name>
    <sectname>.text.foo</sectname>
    <value>b1b9d95dd364df1b53f4e8c571ddaf68</value>
  </symbol>
  <symbol>
    <name>func2</name>
    <refd_ro_sections>
      <object_component_ref idref="oc-92"/>
      <object_component_ref idref="oc-99"/>
    </refd_ro_sections>
    <refd_rw_sections>
      <object_component_ref idref="oc-94"/>
      <object_component_ref idref="oc-96"/>
      <object_component_ref idref="oc-97"/>
      <object_component_ref idref="oc-98"/>
    </refd_rw_sections>
  </symbol>
</func_symbol_table>
```
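
Once the name and hash pairs have been extracted from the link info file of each executable, candidates for shared memory can be found by comparing the tables. The following is a minimal C++ sketch of that comparison only. It assumes the XML has already been parsed by some other means, and the names HashTable and common_functions are hypothetical; they are not part of the tiarmclang tools.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical container: maps a function name (<name> tag) to the MD5 string
// taken from its <value> tag in one executable's func_symbol_table.
using HashTable = std::map<std::string, std::string>;

// Returns the names of functions that appear in both tables with identical
// hashes, in alphabetical order (std::map iterates in key order).
std::vector<std::string> common_functions(const HashTable& a,
                                          const HashTable& b) {
  std::vector<std::string> shared;
  for (const auto& entry : a) {
    auto match = b.find(entry.first);
    if (match != b.end() && match->second == entry.second) {
      shared.push_back(entry.first);
    }
  }
  return shared;
}
```

This sketch requires both the name and the hash to match. A function that has the same name in two executables but a different hash is treated as a different function and is not reported.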

Enable Use of Custom Datapath Extension (CDE) Intrinsics on Cortex-M33

Support for using Custom Datapath Extension (CDE) intrinsics in source code to be compiled for the TI Arm Cortex-M33 processor has been added to version 3.0.0.STS of the tiarmclang compiler tools.

CDE Intrinsics

The CDE intrinsics are defined in the arm_cde.h header file that is included in the tiarmclang compiler tools installation.

The arm_cde.h header file must be included in any compilation unit that references a CDE intrinsic. For example,

```
#include <arm_cde.h>

void foo(void) {
  ...
  uint32_t my_u32 = __arm_cx2a(10, 20, 30, 40);
  ...
}
```

The available CDE intrinsics include the following:

Each of the CDE intrinsics is defined in arm_cde.h as a static inline function and implemented via a compiler runtime built-in function that is defined in the relevant version of the libclang_rt.builtins.a runtime library, which is included in the tiarmclang 3.0.0.STS compiler tools installation.

Specify -march Option to Enable Use of CDE Intrinsics on Cortex-M33

To enable the use of CDE intrinsics during a compilation of a source file, one of the following -march compiler options must be specified when the compiler is invoked:

Cortex-M4 and Cortex-R5 Performance Improvements

The 3.0.0.STS version of the tiarmclang compiler tools is capable of generating slightly higher-performance Cortex-M4 and Cortex-R5 code than the tiarmclang 2.1.3.LTS release due to the following improvements:

tiarmclang 2.1.3.LTS v. tiarmclang 3.0.0.STS Benchmark Scores

The Coremark and Dhrystone performance benchmarks were built and run with both the tiarmclang 2.1.3.LTS and tiarmclang 3.0.0.STS releases. The following tables provide a sense of the performance improvements that can be anticipated when moving an application build from the 2.1.3.LTS compiler tools to the 3.0.0.STS compiler tools.

Cortex-M4:

Benchmark 2.1.3.LTS Score 3.0.0.STS Score
Coremark (inlining off) 2.42 2.69
Coremark (inlining on) 3.12 3.51
Dhrystone (inlining off) 1.00 1.13
Dhrystone (inlining on) 1.25 1.56

Cortex-R5:

Benchmark 2.1.3.LTS Score 3.0.0.STS Score
Coremark (inlining off) 2.87 2.87
Coremark (inlining on) 3.58 3.60
Dhrystone (inlining off) 1.25 1.55
Dhrystone (inlining on) 1.61 2.05
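
The relative change between the two releases can be computed from any row of these tables. The following is a minimal C++ sketch of that arithmetic. The function name percent_gain is hypothetical, and the sketch assumes that a higher score indicates better performance, as the surrounding text implies.

```cpp
// Relative change of new_score over old_score, expressed in percent.
// percent_gain is a hypothetical helper name used only for illustration.
double percent_gain(double old_score, double new_score) {
  return (new_score - old_score) / old_score * 100.0;
}
```

For example, the Cortex-M4 Dhrystone (inlining on) row gives percent_gain(1.25, 1.56), which is approximately 24.8, while the Cortex-R5 Coremark (inlining off) row gives percent_gain(2.87, 2.87), which is 0.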

Host Support / Dependencies

The following host-specific versions of the 3.0.0.STS tiarmclang compiler tools are available:

Device Support

The tiarmclang compiler tools support development of applications that are to be loaded and run on one of the following Arm Cortex processor and runtime environment configurations:

Cortex-M0:

Runtime Environment Configuration Options
Cortex-M0 “-mcpu=cortex-m0”
exceptions on “-mcpu=cortex-m0 -fexceptions”
execute-only on “-mcpu=cortex-m0 -mexecute-only”
execute-only on, exceptions on “-mcpu=cortex-m0 -mexecute-only -fexceptions”

Cortex-M0+:

Runtime Environment Configuration Options
Cortex-M0+ “-mcpu=cortex-m0plus”
exceptions on “-mcpu=cortex-m0plus -fexceptions”
execute-only on “-mcpu=cortex-m0plus -mexecute-only”
execute-only on, exceptions on “-mcpu=cortex-m0plus -mexecute-only -fexceptions”

Cortex-M3:

Runtime Environment Configuration Options
Cortex-M3 “-mcpu=cortex-m3”
exceptions on “-mcpu=cortex-m3 -fexceptions”

Cortex-M4:

Runtime Environment Configuration Options
Cortex-M4 (FPv4SPD16 on by default) “-mcpu=cortex-m4”
FPv4SPD16 on “-mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16”
FPv4SPD16 on, exceptions on “-mcpu=cortex-m4 -fexceptions”
FPv4SPD16 on, exceptions on “-mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16 -fexceptions”
FPv4SPD16 off “-mcpu=cortex-m4 -mfloat-abi=soft”
FPv4SPD16 off, exceptions on “-mcpu=cortex-m4 -mfloat-abi=soft -fexceptions”

Cortex-M33:

Runtime Environment Configuration Options
Cortex-M33 (FPv5SPD16 on by default) “-mcpu=cortex-m33”
FPv5SPD16 on “-mcpu=cortex-m33 -mfloat-abi=hard -mfpu=fpv5-sp-d16”
FPv5SPD16 on, exceptions on “-mcpu=cortex-m33 -fexceptions”
FPv5SPD16 on, exceptions on “-mcpu=cortex-m33 -mfloat-abi=hard -mfpu=fpv5-sp-d16 -fexceptions”
FPv5SPD16 off “-mcpu=cortex-m33 -mfloat-abi=soft”
FPv5SPD16 off, exceptions on “-mcpu=cortex-m33 -mfloat-abi=soft -fexceptions”
CDE CP0 on, FPv5SPD16 on “-mcpu=cortex-m33 -march=armv8.1-m.main+cdecp0”
CDE CP0 on, FPv5SPD16 on “-mcpu=cortex-m33 -march=thumbv81-m.main+cdecp0”
CDE CP0 on, FPv5SPD16 on “-mcpu=cortex-m33 -march=armv8.1-m.main+cdecp0 -mfloat-abi=hard -mfpu=fpv5-sp-d16”
CDE CP0 on, FPv5SPD16 on “-mcpu=cortex-m33 -mfloat-abi=hard -march=thumbv81-m.main+cdecp0 -mfpu=fpv5-sp-d16”
CDE CP0 on, FPv5SPD16 on, exceptions on “-mcpu=cortex-m33 -march=armv8.1-m.main+cdecp0 -fexceptions”
CDE CP0 on, FPv5SPD16 on, exceptions on “-mcpu=cortex-m33 -march=thumbv81-m.main+cdecp0 -fexceptions”
CDE CP0 on, FPv5SPD16 on, exceptions on “-mcpu=cortex-m33 -march=armv8.1-m.main+cdecp0 -mfloat-abi=hard -mfpu=fpv5-sp-d16 -fexceptions”
CDE CP0 on, FPv5SPD16 on, exceptions on “-mcpu=cortex-m33 -march=thumbv81-m.main+cdecp0 -mfloat-abi=hard -mfpu=fpv5-sp-d16 -fexceptions”
CDE CP0 on, FPv5SPD16 off “-mcpu=cortex-m33 -march=armv8.1-m.main+cdecp0 -mfloat-abi=soft”
CDE CP0 on, FPv5SPD16 off “-mcpu=cortex-m33 -march=thumbv81-m.main+cdecp0 -mfloat-abi=soft”
CDE CP0 on, FPv5SPD16 off, exceptions on “-mcpu=cortex-m33 -march=armv8.1-m.main+cdecp0 -mfloat-abi=soft -fexceptions”
CDE CP0 on, FPv5SPD16 off, exceptions on “-mcpu=cortex-m33 -march=thumbv81-m.main+cdecp0 -mfloat-abi=soft -fexceptions”

Please Note:

Cortex-R4:

Runtime Environment Configuration Options
Cortex-R4 (default Arm mode, VFPv3D16 off) “-mcpu=cortex-r4”
Arm mode, VFPv3D16 off “-mcpu=cortex-r4 -mfloat-abi=soft”
Arm mode, VFPv3D16 off, exceptions on “-mcpu=cortex-r4 -fexceptions”
Arm mode, VFPv3D16 off, exceptions on “-mcpu=cortex-r4 -mfloat-abi=soft -fexceptions”
Arm mode, VFPv3D16 on “-mcpu=cortex-r4 -mfloat-abi=hard -mfpu=vfpv3-d16”
Arm mode, VFPv3D16 on, exceptions on “-mcpu=cortex-r4 -mfloat-abi=hard -mfpu=vfpv3-d16 -fexceptions”
Thumb mode, VFPv3D16 off “-mcpu=cortex-r4 -mthumb”
Thumb mode, VFPv3D16 off “-mcpu=cortex-r4 -mthumb -mfloat-abi=soft”
Thumb mode, VFPv3D16 off, exceptions on “-mcpu=cortex-r4 -mthumb -fexceptions”
Thumb mode, VFPv3D16 off, exceptions on “-mcpu=cortex-r4 -mthumb -mfloat-abi=soft -fexceptions”
Thumb mode, VFPv3D16 on “-mcpu=cortex-r4 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16”
Thumb mode, VFPv3D16 on, exceptions on “-mcpu=cortex-r4 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16 -fexceptions”

Cortex-R5:

Runtime Environment Configuration Options
Cortex-R5 (default Arm mode, VFPv3D16 on) “-mcpu=cortex-r5”
Arm mode, VFPv3D16 on “-mcpu=cortex-r5 -mfloat-abi=hard -mfpu=vfpv3-d16”
Arm mode, VFPv3D16 on, exceptions on “-mcpu=cortex-r5 -fexceptions”
Arm mode, VFPv3D16 on, exceptions on “-mcpu=cortex-r5 -mfloat-abi=hard -mfpu=vfpv3-d16 -fexceptions”
Arm mode, VFPv3D16 off “-mcpu=cortex-r5 -mfloat-abi=soft”
Arm mode, VFPv3D16 off, exceptions on “-mcpu=cortex-r5 -mfloat-abi=soft -fexceptions”
Thumb mode, VFPv3D16 on “-mcpu=cortex-r5 -mthumb”
Thumb mode, VFPv3D16 on “-mcpu=cortex-r5 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16”
Thumb mode, VFPv3D16 on, exceptions on “-mcpu=cortex-r5 -mthumb -fexceptions”
Thumb mode, VFPv3D16 on, exceptions on “-mcpu=cortex-r5 -mthumb -mfloat-abi=hard -mfpu=vfpv3-d16 -fexceptions”
Thumb mode, VFPv3D16 off “-mcpu=cortex-r5 -mthumb -mfloat-abi=soft”
Thumb mode, VFPv3D16 off, exceptions on “-mcpu=cortex-r5 -mthumb -mfloat-abi=soft -fexceptions”

Resolved Defects

ID Summary
CODEGEN-10591 Optimization of Logical NOT on condition yields incorrect MC/DC test vector tracking
CODEGEN-10444 Enabling Code Coverage and LTO results in missing profile data section
CODEGEN-10383 Document ‘#pragma clang section bss’ to be used for uninitialized variables
CODEGEN-10251 Initialization of array of structures is mistakenly filled with 0
CODEGEN-10229 Crash can occur when loading symbols due to self-referencing DIE
CODEGEN-10067 LTO: linker should include undefined symbols that are referenced from a static function in the IR symbol table that is passed to the LTO recompile
CODEGEN-10000 LTO: Compiling a source file with cortex-r4/r5 with -mthumb and linking with ARM mode cortex-r4/r5 runtime libraries improperly resolves an R_ARM_CALL relocation
CODEGEN-9997 tiarmclang: LTO behaves differently than non-LTO with regards to how zero-initialized variables are defined
CODEGEN-9850 Unresolved reference to runtime library function when that function is referenced from asm statement
CODEGEN-9838 Update tiarmclang documentation to explain that C++ library does not support features related to threads and concurrency
CODEGEN-9834 Compiler ignores __attribute__((used))
CODEGEN-9779 Function local static array allocated to .data section, and not .bss
CODEGEN-9669 TI Arm Clang mismatch between source code and debugger view with function subsections
CODEGEN-9415 Compiler inappropriately generates non-empty ARM.exidx sections
CODEGEN-9092 tiarmclang mistakenly documents support for -fpic position independent code
CODEGEN-8914 _enable_IRQ in ti_compatibility.h only supports Cortex-M devices
CODEGEN-8899 tiarmlnk generates cinit record for tiny .init_array section
CODEGEN-8887 Compiler does not support linking code that uses C++ exceptions
CODEGEN-8639 tiarmar.exe is denied permission to create an archive file on Windows 7
CODEGEN-8533 Use of virtual functions causes many RTS print functions to be linked into the program
CODEGEN-8471 Hex utility, when splitting a section as required by the bootloader, ignores the section alignment for the second part of the split
CODEGEN-8255 tiarmclang: zero-initialized static and global variables are being defined in .bss
CODEGEN-8216 Code coverage symbols not defined when profile counter section manually placed
CODEGEN-6288 tiarmclang: optimizer removes empty loops that don't have side effects

Known Defects

The up-to-date known defects in v3.0.0.STS can be found here (dynamically generated):

known defects in v3.0.0.STS
