Readme for C29 Clang Code Generation Tools 2.2.0.LTS
Introduction
Version 2.2.0.LTS of the C29 Clang Compiler Tools, also known as the c29clang compiler, is derived from the open source LLVM/Clang source code base and the LLVM Compiler Infrastructure source base, both of which are available on GitHub (github.com).
The c29clang compiler can be used to compile and link C/C++ and assembly source files to build static executable application files that can be loaded and run on a C29 processor.
Long-Term Support Release
This is a Long-Term Support (LTS) release.
For definitions and explanations of STS, LTS, and the versioning number scheme, please see SDTO Compiler Version Numbers.
Documentation
The C29 Clang Compiler Tools User’s Guide is now available online at the following URL:
TI E2E Community - Where to Get Help
Post compiler-related questions to the TI E2E design community forum and select the TI device being used.
The following is the top-level webpage for all of TI’s Code Generation Tools.
If submitting a defect report, please attach a scaled-down test case with command-line options and the compiler version number to allow us to reproduce the issue easily.
Defect Tracking Database
Compiler defect reports can be tracked at the new Development Tools bug database, SIR. SIR is a JIRA-based view into all public tools defects.
A my.ti.com account is required to access this page. To find an issue in SIR, log in and enter your defect ID in the search box at the top right. Alternatively, from the red navigation bar at the top, select “Issues” then “Search for Issues”.
New Features Added in the 2.2.0.LTS Release
Link-time Optimization (-flto)
The c29clang compiler toolchain now supports link-time optimization (LTO). When ‘-flto’ is passed as an option to a link command, the linker identifies other object files (including those in libraries) that were compiled with ‘-flto’ and generates a single, combined code module. The compiler uses this combined view to further optimize the final program. For example, consider two source files:
// file1.c
extern int return_value();
int add_value(int x) { return x + return_value(); }
// file2.c
int return_value() { return 5; }
With c29clang file1.c file2.c -O3, the compiler cannot see the definition of ‘return_value’ in ‘file2.c’ while optimizing ‘file1.c’, so the generated code for add_value contains a comparatively slow CALL sequence.
With c29clang file1.c file2.c -flto -O3, the compiler behaves the same as before until the link step. At that point, the toolchain extracts representations of the code from file1.c and file2.c and generates a new translation unit that looks similar to:
// files.c
extern int return_value();
int add_value(int x) { return x + return_value(); }
int return_value() { return 5; }
The combined translation unit is then optimized once more. Now that the compiler can see the definition of return_value, it can replace the call to it in add_value with the literal 5, removing the CALL sequence entirely.
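As a rough illustration of the end result, the combined module can be thought of as reduced to something equivalent to the following. This is a conceptual sketch only; it is not actual compiler output, and the code generated by the toolchain may differ.

```c
/* Conceptual equivalent of the combined module after link-time
 * optimization. Illustrative only, not actual compiler output. */
int return_value(void) { return 5; }

/* The call to return_value() has been folded into the constant 5,
 * so add_value no longer needs a CALL sequence. */
int add_value(int x) { return x + 5; }
```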
All objects must be built with -flto to be included in the optimization strategy. In particular, it’s important that libraries built or imported into your project have been built with -flto. Objects not built with -flto will not be added to the combined module.
Link-time optimization interacts in complex ways with hardware security configurations because it involves the movement of code and data accesses across what may be a secure boundary containing protected calls and returns. Because of this, a link-time optimized project with security enabled may be slower than a link-time optimized project without security.
Since the last LTS release, a number of issues related to link-time optimization have been resolved. See the Resolved Defects section at the end of this document to check whether issues you have experienced are fixed in 2.2.0.LTS.
Software Floating Point Support
In some interrupt use cases, saving and restoring floating-point registers adds unnecessary overhead in code size and load/store cycles. Previous attempts to avoid this overhead ran into problems because the compiler assumed that floating-point registers were still available. By default, the hardware supports floating-point registers and operations, which conflicts with software that does not want to use them.
For the 2.2.0.LTS release, the repurposed -mfpu=none option can be used to fully disable floating-point registers. Code compiled with -mfpu=none is not compatible with code compiled with -mfpu=f32 or -mfpu=f64, and attempting to link such code together results in a build error. Floating-point operations in code compiled with -mfpu=none are guaranteed not to use floating-point registers, but they incur a major performance and code size penalty, similar to using 64-bit floating-point operations with -mfpu=f32.
Hardware Security (C29 SSU)
Previous releases supported security guides in linker command files using the SECURE_GROUP syntax. In 2.2.0.LTS, this syntax has changed as follows:
- SECURE_GROUP should only be applied to MEMORY ranges, and will apply to all output SECTIONS that may be placed into them
- SECURE_GROUP is still valid on output SECTIONS, but is deprecated and may be removed at a future date
- SECURE_GROUPs require a name, the ‘call group’
  - Two SECURE_GROUPs with the same name can freely contain calls into one another
- A SECURE_GROUP may optionally specify either the PUBLIC or PRIVATE attribute
  - PRIVATE is the default if neither attribute is specified
  - If the SECURE_GROUP of a callee is PRIVATE and has a different call group than the caller, the linker generates an error
  - If the SECURE_GROUP of a callee is PUBLIC and has a different call group than the caller, the linker generates a pair of functions called a trampoline and landing pad. These functions implement the missing and required protected CALL and RET sequences
- SECURE_GROUPs can optionally declare lists of resources that code in that group can/will read from or write to
  - This is done with the READS and WRITES SECURE_GROUP attributes
  - Resource names may be any informative string, but generally are identified as MEMORY ranges that will contain the data being accessed
  - Two code sections are compatible if every placement option of the caller has access to resources accessed by the callee in any placement configuration, making them amenable to inter-procedural optimization during link-time optimization
Note: These hardware security guides have no effect on the hardware configuration and serve only as extra information for the linker. The CPU must be set up for security manually or with the SysCfg tool, which additionally generates these guides for you.
Example:
MEMORY {
    FLASH (X): origin=0x20 length=0x1000
        SECURE_GROUP(FLASH_GROUP, PUBLIC, READS=(R1, R2), WRITES=(W1))
    R1 (R): origin=0x1000 length=0x1000
    R2 (R): origin=0x2000 length=0x1000
    W1 (W): origin=0x3000 length=0x1000
}
SECTIONS {
    .text: {} > FLASH // Inherits the FLASH_GROUP SECURE_GROUP, reading from ranges R1 and R2 and
                      // writing to range W1
}
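The call-group rules listed in this section can be summarized as a small decision function. The following is a minimal sketch in C that models only the documented outcomes (direct call, trampoline and landing pad, or linker error). The type and function names (`secure_group`, `resolve_call`, and the enumerators) are hypothetical and exist only for illustration; they are not part of the linker or of any TI API, and the linker's real implementation is not described in this document.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model types, for illustration only. */
typedef enum { GROUP_PRIVATE, GROUP_PUBLIC } group_visibility;
typedef enum { CALL_DIRECT, CALL_TRAMPOLINE, CALL_ERROR } call_resolution;

typedef struct {
    const char *call_group;      /* the SECURE_GROUP name */
    group_visibility visibility; /* PRIVATE is the documented default */
} secure_group;

/* Models how a call from caller to callee is resolved under the
 * rules above. */
call_resolution resolve_call(const secure_group *caller,
                             const secure_group *callee)
{
    assert(caller != NULL && callee != NULL);

    /* Same call group: calls are freely allowed. */
    if (strcmp(caller->call_group, callee->call_group) == 0)
        return CALL_DIRECT;

    /* Different call group, PUBLIC callee: a trampoline and landing
     * pad supply the protected CALL and RET sequences. */
    if (callee->visibility == GROUP_PUBLIC)
        return CALL_TRAMPOLINE;

    /* Different call group, PRIVATE callee: the linker reports an error. */
    return CALL_ERROR;
}
```

This sketch deliberately ignores the READS and WRITES resource lists, which affect compatibility for inter-procedural optimization rather than whether a call is permitted.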
Interlinking ‘-mfpu=f64’ Code With ‘-mfpu=f32’ Libraries
In previous releases, attempting to link common ‘-mfpu=f32’ libraries into an otherwise ‘-mfpu=f64’ project failed in the linker. In 2.2.0.LTS, linking f32 libraries into an f64 project is allowed. Linking f64 libraries into an f32 project will still result in a linker error.
Fast Versions of C Library Functions
The following 32-bit floating point C library functions will be replaced with much faster instruction sequences when ‘-ffast-math’ is enabled. For users not using ‘-ffast-math’, an associated builtin function may be available to generate the sequence.
- Inverse square root (1/sqrtf()) - There is no builtin for this operation
- sinf - __builtin_c29_sinf
- cosf - __builtin_c29_cosf
- fmodf - __builtin_c29_fast_fmodf
- roundf - __builtin_c29_fast_roundf
- floorf - __builtin_c29_fast_floorf
- truncf - __builtin_c29_fast_truncf
- ceilf - __builtin_c29_fast_ceilf
- atan2f - __builtin_c29_fast_atan2f
- acosf - __builtin_c29_fast_acosf
- asinf - __builtin_c29_fast_asinf
- logf - __builtin_c29_fast_logf
- atanf - __builtin_c29_fast_atanf
Note: These sequences may not be as precise as the standard C library function.
Out-of-the-box Performance and Code Size Improvements
- Improves performance of loops with structures similar to strcmp or linked list search.
- Improves identification of the saturation idiom which generates the MINMAXF instruction.
- Eliminates excessive PAD instructions in a number of internal helper functions, improving code size.
- Inlines integer division sequences at -O3, avoiding call overhead.
- More intelligently chooses the base and offset of addresses so built-in address mode scaling saves explicit shift instructions.
- Generates fewer superfluous sign/zero extension instructions.
- Prefers rematerializing constants over storing and reloading them.
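As an illustration of the saturation idiom mentioned above, the following sketch shows a common way to clamp a floating-point value to a range. The function name `saturate` is hypothetical, and whether the compiler maps this exact form to the MINMAXF instruction is an assumption; the set of recognized patterns is not listed in this document.

```c
/* Clamp x to the range [lo, hi]. Written as a plain
 * compare-and-select pattern; assumes lo <= hi. */
static inline float saturate(float x, float lo, float hi)
{
    if (x < lo)
        return lo;
    if (x > hi)
        return hi;
    return x;
}
```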
Host Support / Dependencies
The following host-specific versions of the 2.2.0.LTS c29clang compiler tools are available:
- Linux: Ubuntu 18, RHEL 8
- Windows: 7, 8, 10
- Mac: OSX
Device Support
The c29clang compiler tools support development of applications that are to be loaded and run on one of the following processor and runtime environment configurations:
| Runtime Environment Configuration | Options |
|---|---|
| C29 (default) | “-mcpu=c29.c0” |
| Floating Point 32-bit hardware ops, 64-bit emulation (default) | “-mfpu=f32” |
| Floating Point 32-bit and 64-bit hardware ops | “-mfpu=f64” |
| Floating Point 32-bit emulation, 64-bit emulation | “-mfpu=none” |
Resolved Defects
| ID | Summary |
|---|---|
| CODEGEN-15021 | Incorrect RETD delay count |
| CODEGEN-14896 | Compiler hang/crash when loading/storing to addresses that are sums and include a left shift operation |
| CODEGEN-14888 | Incorrect results for asinf, acosf, and roundf RTS functions when compiled with -ffast-math |
| CODEGEN-14886 | Run placement fails for section “.TI.bound:<variable>.<k>” |
| CODEGEN-14664 | Compiler generates infinite loop in memset with -O1 or higher and -fno-inline-functions |
| CODEGEN-14522 | c29clang-tidy reports “unknown target CPU” |
| CODEGEN-14179 | Compiler allocates wrong register for asm statement |
Known Defects
The up-to-date known defects in v2.2.0.LTS can be found here (dynamically generated):
End of File